100+ datasets found
  1. Traces captured by visiting the top 1500 website

    • kaggle.com
    zip
    Updated Aug 25, 2021
    DNS_dataset (2021). Traces captured by visiting the top 1500 website [Dataset]. https://www.kaggle.com/jacksontang16/traces-captured-by-visiting-the-top-1500-website
    Available download formats: zip (5852806 bytes)
    Dataset updated
    Aug 25, 2021
    Authors
    DNS_dataset
    Description

    Dataset

    This dataset was created by DNS_dataset

    Contents

  2. Most visited websites by hierachycal categories

    • kaggle.com
    Updated Sep 18, 2020
    Natanael de Souza Figueiredo (2020). Most visited websites by hierachycal categories [Dataset]. https://www.kaggle.com/natanael127/most-visited-websites-by-hierachycal-categories/code
    Dataset updated
    Sep 18, 2020
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Natanael de Souza Figueiredo
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Context

    Alexa Internet was founded in April 1996 by Brewster Kahle and Bruce Gilliat. The company's name was chosen in homage to the Library of Alexandria of Ptolemaic Egypt, drawing a parallel between the largest repository of knowledge in the ancient world and the potential of the Internet to become a similar store of knowledge. (from Wikipedia)

    The categories list was due to be discontinued by September 17, 2020, so I wanted to preserve it: https://support.alexa.com/hc/en-us/articles/360051913314

    This dataset was generated with this Python script (V2.0): https://github.com/natanael127/dump-alexa-ranking

    Content

    The sites are grouped into 17 macro categories, and the resulting tree has more than 360,000 nodes. Subjects are well organized, and each has its own ranking of most-accessed domains. So even the keys of a single sub-dictionary may make a good small dataset.

    Acknowledgements

    Thank you to my friend André (https://github.com/andrerclaudio) for helping me with Google Colaboratory tips and with the computational power to collect the data before our deadline.

    Inspiration

    The Alexa ranking was inspired by the Library of Alexandria. In the modern world, it may be a good starting point for AI to learn about many subjects.

  3. (Dataset) The most visited health websites in the world

    • narcis.nl
    • data.mendeley.com
    Updated Jan 11, 2021
    Acosta-Vargas, P (via Mendeley Data) (2021). (Dataset) The most visited health websites in the world [Dataset]. http://doi.org/10.17632/n468trh5my.1
    Dataset updated
    Jan 11, 2021
    Dataset provided by
    Data Archiving and Networked Services (DANS)
    Authors
    Acosta-Vargas, P (via Mendeley Data)
    Description

    Evaluation of the most visited health websites in the world

  4. Alexa Domains Dataset

    • paperswithcode.com
    • opendatalab.com
    Updated Feb 1, 2001
    Isaac Corley; Jonathan Lwowski; Justin Hoffman (2001). Alexa Domains Dataset [Dataset]. https://paperswithcode.com/dataset/gagan-bhatia
    Dataset updated
    Feb 1, 2001
    Authors
    Isaac Corley; Jonathan Lwowski; Justin Hoffman
    Description

    This dataset is composed of the URLs of the top 1 million websites. The domains are ranked using the Alexa traffic ranking, which is determined from a combination of the browsing behavior of users on the website, the number of unique visitors, and the number of pageviews. In more detail, unique visitors are the number of unique users who visit a website on a given day, and pageviews are the total number of user URL requests for the website. However, multiple requests for the same website on the same day are counted as a single pageview. The website with the highest combination of unique visitors and pageviews is ranked the highest.
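    The counting rule described above can be illustrated with a small sketch. This is not Alexa's actual implementation; the log format `(day, site, user, url)` and the function name `daily_metrics` are hypothetical, and the sketch assumes that "counted as a single pageview" means repeat requests by the same user for the same URL on the same day are counted once.

```python
from collections import defaultdict


def daily_metrics(requests):
    """Return {(day, site): (unique_visitors, pageviews)}.

    `requests` is an iterable of (day, site, user, url) tuples.
    This log format is an assumption made for illustration only.
    """
    visitors = defaultdict(set)   # distinct users per (day, site)
    pageviews = defaultdict(set)  # distinct (user, url) pairs per (day, site)
    for day, site, user, url in requests:
        visitors[(day, site)].add(user)
        # A set absorbs repeat requests for the same URL by the same user.
        pageviews[(day, site)].add((user, url))
    return {key: (len(visitors[key]), len(pageviews[key])) for key in visitors}


# Synthetic sample log, not taken from the dataset.
sample = [
    ("2021-01-01", "example.com", "u1", "/"),
    ("2021-01-01", "example.com", "u1", "/"),       # repeat request, counted once
    ("2021-01-01", "example.com", "u1", "/about"),
    ("2021-01-01", "example.com", "u2", "/"),
]
metrics = daily_metrics(sample)
print(metrics[("2021-01-01", "example.com")])
```

    How the two counts are combined into a single rank is not specified in the description, so the sketch stops at the per-day counts.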

  5. Colombia: most visited websites 2024, by unique visitors

    • statista.com
    • ai-chatbox.pro
    Updated Jun 4, 2025
    Statista (2025). Colombia: most visited websites 2024, by unique visitors [Dataset]. https://www.statista.com/statistics/1409003/most-visited-websites-unique-visitors-colombia/
    Dataset updated
    Jun 4, 2025
    Dataset authored and provided by
    Statista (http://statista.com/)
    Time period covered
    Nov 2024
    Area covered
    Colombia
    Description

    In November 2024, Google.com was the leading website in Colombia by unique visits, with around 52.9 million single accesses to the URL during that month. YouTube.com came in second with approximately 30.9 million unique monthly visits. Facebook ranked third with 24.2 million unique monthly visits.

  6. Top 50 Pages By Pageviews on Austintexas.gov -

    • data.austintexas.gov
    • gimi9.com
    • +1more
    application/rdfxml +5
    Updated Dec 6, 2023
    City of Austin, Texas - data.austintexas.gov (2023). Top 50 Pages By Pageviews on Austintexas.gov - [Dataset]. https://data.austintexas.gov/City-Government/Top-50-Pages-By-Pageviews-on-Austintexas-gov-/8yfa-b3bq
    Available download formats: csv, xml, application/rdfxml, application/rssxml, json, tsv
    Dataset updated
    Dec 6, 2023
    Dataset authored and provided by
    City of Austin, Texas - data.austintexas.gov
    License

    U.S. Government Works: https://www.usa.gov/government-works
    License information was derived automatically

    Description

    This data, exported from Google Analytics, shows the 50 most popular pages on Austintexas.gov based on the following metrics. Views: the total number of times the page was viewed; repeated views of a single page are counted. Bounce Rate: the percentage of single-page visits (i.e., visits in which the person left the site from the entrance page without interacting with the page).
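    As a rough illustration of the two metrics defined above, the sketch below computes them from a list of visits. The session structure (one list of page paths per visit) and the function names are hypothetical, and the bounce rate here is computed site-wide as a simplification; it is not a reproduction of how Google Analytics derives its per-page figures.

```python
def page_views(sessions, page):
    """Total views of `page`, counting repeated views within a visit."""
    return sum(pages.count(page) for pages in sessions)


def bounce_rate(sessions):
    """Percentage of visits that viewed exactly one page."""
    if not sessions:
        return 0.0
    bounces = sum(1 for pages in sessions if len(pages) == 1)
    return 100.0 * bounces / len(sessions)


# Synthetic visits for illustration; each inner list is one visit.
sessions = [
    ["/"],                   # single-page visit (a bounce)
    ["/", "/jobs"],
    ["/jobs"],               # single-page visit (a bounce)
    ["/", "/", "/parks"],    # the home page viewed twice in one visit
]
print(page_views(sessions, "/"), bounce_rate(sessions))
```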

    *Note: On July 1, 2023, standard Universal Analytics properties will stop processing data.

  7. ‘Popular Website Traffic Over Time ’ analyzed by Analyst-2

    • analyst-2.ai
    Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com), ‘Popular Website Traffic Over Time ’ analyzed by Analyst-2 [Dataset]. https://analyst-2.ai/analysis/kaggle-popular-website-traffic-over-time-62e4/62549059/?iid=003-357&v=presentation
    Dataset authored and provided by
    Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com)
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Analysis of ‘Popular Website Traffic Over Time ’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://www.kaggle.com/yamqwe/popular-website-traffice on 13 February 2022.

    --- Dataset description provided by original source is as follows ---

    About this dataset

    Background

    Have you ever been in a conversation where the question comes up: who uses Bing? The question comes up occasionally because people wonder whether these sites get any views. In this research study, we explore traffic for many popular websites.

    Methodology

    The data collected originates from SimilarWeb.com.

    Source

    For the analysis and study, go to The Concept Center

    This dataset was created by Chase Willden and contains around 0 samples along with 1/1/2017, Social Media, technical information and other features such as: - 12/1/2016 - 3/1/2017 - and more.

    How to use this dataset

    • Analyze 11/1/2016 in relation to 2/1/2017
    • Study the influence of 4/1/2017 on 1/1/2017

    Acknowledgements

    If you use this dataset in your research, please credit Chase Willden


    --- Original source retains full ownership of the source dataset ---

  8. Website Fingerprinting Dataset of Browsing Network Traffic for Desktop and...

    • ieee-dataport.org
    Updated Oct 21, 2024
    Mohamad Amar Irsyad Mohd Aminuddin (2024). Website Fingerprinting Dataset of Browsing Network Traffic for Desktop and Mobile Webpages [Dataset]. https://ieee-dataport.org/documents/website-fingerprinting-dataset-browsing-network-traffic-desktop-and-mobile-webpages
    Dataset updated
    Oct 21, 2024
    Authors
    Mohamad Amar Irsyad Mohd Aminuddin
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This is a dataset of Tor cell files extracted from browsing simulations using Tor Browser. The simulations cover both desktop and mobile webpages. The data was collected with the WFP-Collector tool (https://github.com/irsyadpage/WFP-Collector). All the necessary configuration to perform the simulation is detailed in the tool repository. The webpage URLs are the first 100 websites listed at https://dataforseo.com/free-seo-stats/top-1000-websites. Each webpage URL is visited 90 times in each of the desktop and mobile browsing modes.

  9. Top syndicated pages from CDC.gov by weekly page views

    • data.virginia.gov
    • healthdata.gov
    • +4more
    csv, json, rdf, xsl
    Updated Aug 11, 2023
    Centers for Disease Control and Prevention (2023). Top syndicated pages from CDC.gov by weekly page views [Dataset]. https://data.virginia.gov/dataset/top-syndicated-pages-from-cdc-gov-by-weekly-page-views
    Available download formats: csv, xsl, rdf, json
    Dataset updated
    Aug 11, 2023
    Dataset provided by
    Centers for Disease Control and Prevention (http://www.cdc.gov/)
    Description

    The CDC Content Syndication site at https://tools.cdc.gov/syndication/ allows you to import content from CDC websites directly into your own website or application. These services are provided free of charge from CDC. The data shown in this table represent the weekly top page views from CDC.gov offered by syndication.

  10. Top 15 websites with highest PageRank.

    • plos.figshare.com
    xls
    Updated Jun 2, 2023
    Peiteng Shi; Xiaohan Huang; Jun Wang; Jiang Zhang; Su Deng; Yahui Wu (2023). Top 15 websites with highest PageRank. [Dataset]. http://doi.org/10.1371/journal.pone.0136243.t003
    Available download formats: xls
    Dataset updated
    Jun 2, 2023
    Dataset provided by
    PLOS ONE
    Authors
    Peiteng Shi; Xiaohan Huang; Jun Wang; Jiang Zhang; Su Deng; Yahui Wu
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The numbers in the parentheses are the ranking orders according to the focus indicators. Top 15 websites with highest PageRank.

  11. What social Media People like the most and why?

    • kaggle.com
    Updated Feb 17, 2023
    Nina Luquez (2023). What social Media People like the most and why? [Dataset]. https://www.kaggle.com/ninaluquez/what-social-media-people-like-the-most-and-why/discussion
    Dataset updated
    Feb 17, 2023
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Nina Luquez
    Description

    Dataset

    This dataset was created by Nina Luquez

    Contents

  12. Most popular websites in the Netherlands 2015

    • datacatalogue.cessda.eu
    • ssh.datastations.nl
    Updated Jul 4, 2023
    M. Kleppe; H. Bijleveld (2023). Most popular websites in the Netherlands 2015 [Dataset]. http://doi.org/10.17026/dans-x6h-6qqt
    Dataset updated
    Jul 4, 2023
    Dataset provided by
    Vrije Universiteit Amsterdam
    Authors
    M. Kleppe; H. Bijleveld
    Area covered
    Netherlands
    Description

    This dataset contains a list of 3654 Dutch websites that we considered the most popular websites in 2015. This list served as whitelist for the Newstracker Research project in which we monitored the online web behaviour of a group of respondents.

    The research project 'The Newstracker' was a subproject of the NWO-funded project 'The New News Consumer: A User-Based Innovation Project to Meet Paradigmatic Change in News Use and Media Habits'.

    For the Newstracker project we aimed to understand the web behaviour of a group of respondents. We created custom-built software to monitor their web browsing behaviour on their laptops and desktops (the code is available in open access at https://github.com/NITechLabs/NewsTracker). For reasons of scale and privacy we created a whitelist of the websites that were the most popular in 2015. We compiled this list manually using data from DDMM, Alexa and our own research. The dataset consists of 5 columns:
    - the URL
    - the type of website: we created a list of website types, and each website was manually labeled with 1 category
    - Nieuws-regio: when the category was 'News', we subdivided these websites by regional focus: International, National or Local
    - Nieuws-onderwerp: each website under the category News was further subdivided by type of news website. For this we created our own list of news categories and manually coded each website
    - Bron: for each website we noted which source we used to find it.

    The full description of the research design of the Newstracker including the set-up of this whitelist is included in the following article: Kleppe, M., Otte, M. (in print), 'Analysing & understanding news consumption patterns by tracking online user behaviour with a multimodal research design', Digital Scholarship in the Humanities, doi 10.1093/llc/fqx030.

  13. 1k_Website_Screenshots_and_Metadata

    • huggingface.co
    Updated Apr 13, 2023
    Silatus (2023). 1k_Website_Screenshots_and_Metadata [Dataset]. https://huggingface.co/datasets/silatus/1k_Website_Screenshots_and_Metadata
    Dataset updated
    Apr 13, 2023
    Dataset authored and provided by
    Silatus
    License

    Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0): https://creativecommons.org/licenses/by-nc-sa/4.0/
    License information was derived automatically

    Description

    Dataset Card for 1000 Website Screenshots with Metadata

      Dataset Summary
    

    Silatus is sharing, for free, a segment of a dataset that we are using to train a generative AI model for text-to-mockup conversions. This dataset was collected in December 2022 and early January 2023, so it contains very recent data from 1,000 of the world's most popular websites. You can get our larger 10,000 website dataset for free at: https://silatus.com/datasets This dataset includes: High-res… See the full description on the dataset page: https://huggingface.co/datasets/silatus/1k_Website_Screenshots_and_Metadata.

  14. Open Data BR Site Analytics - Top 10 Assets Viewed or Downloaded

    • data.brla.gov
    application/rdfxml +5
    Updated Jun 28, 2025
    (2025). Open Data BR Site Analytics - Top 10 Assets Viewed or Downloaded [Dataset]. https://data.brla.gov/dataset/Open-Data-BR-Site-Analytics-Top-10-Assets-Viewed-o/ie4p-gccw
    Available download formats: tsv, application/rssxml, json, csv, application/rdfxml, xml
    Dataset updated
    Jun 28, 2025
    Description

    This dataset provides detail on how all assets on a domain are being used (e.g. views, downloads, API reads).

    User activity is provided by date, asset uid, asset type, asset name, access type and user segment. Please see Site Analytics: Asset Access for more detail about these fields.

    The dataset will reflect new Asset Access records within a day of when they occur.

  15. Top 100 HHS Websites [RAW]

    • healthdata.gov
    • data.virginia.gov
    application/rdfxml +5
    Updated Apr 26, 2024
    (2024). Top 100 HHS Websites [RAW] [Dataset]. https://healthdata.gov/dataset/Top-100-HHS-Websites-RAW-/xs6e-ics5
    Available download formats: tsv, xml, application/rssxml, csv, json, application/rdfxml
    Dataset updated
    Apr 26, 2024
    Description

    This page serves as the backing dataset for the Top 100 HHS Websites, sorted by total page views. Please refer to the story page here for more information: https://healthdata.gov/stories/s/Top-100-HHS-Websites/d84g-3yzd

  16. Dataset of stocks from Top Ships

    • workwithdata.com
    Updated Apr 11, 2025
    Work With Data (2025). Dataset of stocks from Top Ships [Dataset]. https://www.workwithdata.com/datasets/stocks?f=1&fcol0=company&fop0=%3D&fval0=Top+Ships
    Dataset updated
    Apr 11, 2025
    Dataset authored and provided by
    Work With Data
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset is about stocks. It has 1 row and is filtered where the company is Top Ships. It features 8 columns including stock name, company, exchange, and exchange symbol.

  17. Dataset of stocks from Top Spring International

    • workwithdata.com
    Updated Apr 11, 2025
    Work With Data (2025). Dataset of stocks from Top Spring International [Dataset]. https://www.workwithdata.com/datasets/stocks?f=1&fcol0=company&fop0=%3D&fval0=Top+Spring+International
    Dataset updated
    Apr 11, 2025
    Dataset authored and provided by
    Work With Data
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset is about stocks. It has 1 row and is filtered where the company is Top Spring International. It features 8 columns including stock name, company, exchange, and exchange symbol.

  18. Website Top Page Views

    • data.sugarlandtx.gov
    csv
    Updated Jun 7, 2025
    Communications and Community Engagement (2025). Website Top Page Views [Dataset]. https://data.sugarlandtx.gov/dataset/website-top-page-views
    Available download formats: csv
    Dataset updated
    Jun 7, 2025
    Dataset authored and provided by
    Communications and Community Engagement
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    An instance of a user visiting a particular page on a website.

  19. Corporate Website — Analytics — Top 100 search terms

    • data.qld.gov.au
    • researchdata.edu.au
    html
    Updated Jul 12, 2025
    Brisbane City Council (2025). Corporate Website — Analytics — Top 100 search terms [Dataset]. https://www.data.qld.gov.au/dataset/corporate-website-analytics-top-100-search-terms
    Available download formats: html
    Dataset updated
    Jul 12, 2025
    Dataset authored and provided by
    Brisbane City Council
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset is available on Brisbane City Council’s open data website – data.brisbane.qld.gov.au. The site provides additional features for viewing and interacting with the data and for downloading the data in various formats.

    Monthly analytics reports for the Brisbane City Council website

    Information regarding the sessions for Brisbane City Council website during the month including search terms used.

  20. ‘Fortune 1000’ analyzed by Analyst-2

    • analyst-2.ai
    Updated Nov 13, 2021
    Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com) (2021). ‘Fortune 1000’ analyzed by Analyst-2 [Dataset]. https://analyst-2.ai/analysis/kaggle-fortune-1000-03c3/b2a55ac6/?iid=026-666&v=presentation
    Dataset updated
    Nov 13, 2021
    Dataset authored and provided by
    Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com)
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Analysis of ‘Fortune 1000’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://www.kaggle.com/winston56/fortune-500-data-2021 on 13 November 2021.

    --- Dataset description provided by original source is as follows ---

    Context

    Every year Fortune, an American Business Magazine, publishes the Fortune 500, which ranks the top 500 corporations by revenue. This dataset includes the entire Fortune 1000, as opposed to just the top 500.

    Content

    The Fortune 1000 dataset is from the Fortune website, collected by the processes outlined in this notebook. It contains U.S. company data for the year 2021. The dataset is 1000 rows and 18 columns.

    Features

    • Company - values are the name of the company
    • Rank - The 2021 rank established by Fortune (1-1000)
    • Rank Change - The change in the rank from 2020 to 2021. There is only a rank change listed if the company is currently in the top 500 and was previously in the top 500.
    • Revenue - Revenue of each company in millions. This is the criterion used to rank each company.
    • Profit - Profit of each company in millions.
    • Num. of Employees - The number of employees each company employs.
    • Sector - The sector of the market the company operates in.
    • City - The city where the company's headquarters is located.
    • State - The state where the company's headquarters is located
    • Newcomer - Indicates whether or not the company is new to the top Fortune 500 ("yes" or "no"). No value will be listed for companies outside of the top 500.
    • CEO Founder - Indicates whether the CEO of the company is also the founder ("yes" or "no").
    • CEO Woman - Indicates whether the CEO of the company is a woman ("yes" or "no").
    • Profitable - Indicates whether the company is profitable or not ("yes" or "no").
    • Prev. Rank - The 2020 rank of the company, as established by Fortune. There will only be previous rank data for the top 500 companies.
    • CEO - The name of the CEO of the company
    • Website - The url of the company website
    • Ticker - The stock ticker symbol of public companies. Some rows will have empty values because the company is a private corporation.
    • Market Cap - The market cap (or value) of the company in millions. Some rows will have empty values because the company is private. Market valuations were determined on January 20, 2021.

    Inspiration

    This dataset is made for exploring the top corporations in the U.S. It can help answer questions such as: What percentage of companies have women CEOs? How many companies are newcomers? What percentage of companies have CEOs who were also founders? What role does profitability play in ranking?
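    A minimal sketch of how the first three questions might be answered with Python's standard library follows. It assumes the CSV headers match the feature names listed above ("Newcomer", "CEO Founder", "CEO Woman") and that values are "yes", "no" or empty; the actual file may use different headers. The rows are synthetic placeholders, not data from the Fortune list, and the helper name `share_yes` is hypothetical.

```python
import csv
import io


def share_yes(rows, column):
    """Percentage of non-empty values in `column` that equal "yes"."""
    answered = [r[column].strip().lower() for r in rows if r[column].strip()]
    if not answered:
        return 0.0
    return 100.0 * answered.count("yes") / len(answered)


# Synthetic placeholder rows; replace with the downloaded CSV file.
SAMPLE = """Company,Rank,Newcomer,CEO Founder,CEO Woman
Company A,1,no,no,no
Company B,2,yes,yes,no
Company C,3,no,no,yes
Company D,501,,no,no
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE)))

women_ceo_pct = share_yes(rows, "CEO Woman")
founder_ceo_pct = share_yes(rows, "CEO Founder")
# Newcomer is empty for companies outside the top 500, so count "yes" directly.
newcomers = sum(1 for r in rows if r["Newcomer"].strip().lower() == "yes")

print(women_ceo_pct, founder_ceo_pct, newcomers)
```

    To run this against the real file, replace `io.StringIO(SAMPLE)` with an opened file handle and check the header names first.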

    --- Original source retains full ownership of the source dataset ---

Traces captured by visiting the top 1500 website

Traffic captured by visiting the top 1500 most visited sites ranked by Alexa
