27 datasets found
  1. Kaggle Wikipedia Web Traffic Daily Dataset (without Missing Values)

    • data.niaid.nih.gov
    • zenodo.org
    Updated Apr 1, 2021
    Cite
    Godahewa, Rakshitha; Bergmeir, Christoph; Webb, Geoff; Hyndman, Rob; Montero-Manso, Pablo (2021). Kaggle Wikipedia Web Traffic Daily Dataset (without Missing Values) [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_3892918
    Explore at:
    Dataset updated
    Apr 1, 2021
    Dataset provided by
    Lecturer at Monash University
    Professor at Monash University
    Lecturer at University of Sydney
    PhD Student at Monash University
    Authors
    Godahewa, Rakshitha; Bergmeir, Christoph; Webb, Geoff; Hyndman, Rob; Montero-Manso, Pablo
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset was used in the Kaggle Wikipedia Web Traffic forecasting competition. It contains 145063 daily time series representing the number of hits or web traffic for a set of Wikipedia pages from 2015-07-01 to 2017-09-10.

    The original dataset contains missing values. They have been simply replaced by zeros.

  2. Total global visitor traffic to Wikipedia.org 2024

    • statista.com
    Updated Aug 20, 2025
    Cite
    Statista (2025). Total global visitor traffic to Wikipedia.org 2024 [Dataset]. https://www.statista.com/statistics/1259907/wikipedia-website-traffic/
    Explore at:
    Dataset updated
    Aug 20, 2025
    Dataset authored and provided by
    Statista (http://statista.com/)
    Time period covered
    Oct 2023 - Mar 2024
    Area covered
    Worldwide
    Description

    In March 2024, close to 4.4 billion unique global visitors had visited Wikipedia.org, slightly down from 4.4 billion visitors since August of the same year. Wikipedia is a free online encyclopedia with articles generated by volunteers worldwide. The platform is hosted by the Wikimedia Foundation.

  3. Average Annual Daily Traffic (AADT)

    • caliper.com
    cdf, dwg, dxf, gdb +9
    Updated Sep 23, 2025
    Cite
    Caliper Corporation (2025). Average Annual Daily Traffic (AADT) [Dataset]. https://www.caliper.com/mapping-software-data/aadt-traffic-count-data.htm
    Explore at:
    Available download formats: postgresql, postgis, sdo, geojson, shp, cdf, kml, kmz, dxf, dwg, ntf, sql server mssql, gdb
    Dataset updated
    Sep 23, 2025
    Dataset authored and provided by
    Caliper Corporation (http://www.caliper.com/)
    License

    https://www.caliper.com/license/maptitude-license-agreement.htm

    Time period covered
    2025
    Area covered
    United States
    Description

    Average Annual Daily Traffic data for use with GIS mapping software, databases, and web applications are from Caliper Corporation and contain data on the total volume of vehicle traffic on a highway or road for a year divided by 365 days.
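
    The definition above (total yearly vehicle volume divided by 365 days) can be sketched in a few lines. This is a minimal illustration, not Caliper's method: the function name and the example volume are hypothetical.

```python
def aadt(total_annual_volume, days_in_year=365):
    """Average Annual Daily Traffic: yearly vehicle count divided by days in the year."""
    if days_in_year <= 0:
        raise ValueError("days_in_year must be positive")
    return total_annual_volume / days_in_year


# Hypothetical road segment that carried 3,650,000 vehicles in one year.
example_aadt = aadt(3_650_000)
print(example_aadt)  # 10000.0
```

    The divisor is kept as a parameter only so that a different day count can be passed in; the description itself specifies 365.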

  4. Wikipedia Web Traffic 2018-19

    • kaggle.com
    zip
    Updated Apr 12, 2021
    Cite
    san_bt (2021). Wikipedia Web Traffic 2018-19 [Dataset]. https://www.kaggle.com/datasets/sandeshbhat/wikipedia-web-traffic-201819/code
    Explore at:
    Available download formats: zip (74931432 bytes)
    Dataset updated
    Apr 12, 2021
    Authors
    san_bt
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Context

    • Time series: A time series is a set of observations recorded at regular intervals of time. Time series analysis is useful in many fields, such as stock market prediction and weather forecasting. It accounts for the fact that data points taken over time may have an internal structure (such as autocorrelation, trend, or seasonal variation).

    • Web traffic: The amount of data sent and received by visitors to a website. Sites monitor incoming and outgoing traffic to see which parts or pages of the site are popular and whether there are any apparent trends, such as one specific page being viewed mostly by people in a particular country.
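
    As a rough illustration of the internal structure mentioned above, the sketch below computes a sample autocorrelation at a given lag. The page-view numbers are made up, and the weekly pattern is an assumption chosen to make the effect visible; none of this comes from the dataset itself.

```python
def autocorrelation(series, lag):
    """Sample autocorrelation of a sequence at the given positive lag."""
    n = len(series)
    if not 0 < lag < n:
        raise ValueError("lag must be between 1 and len(series) - 1")
    mean = sum(series) / n
    variance = sum((x - mean) ** 2 for x in series)
    if variance == 0:
        return 0.0
    covariance = sum(
        (series[t] - mean) * (series[t + lag] - mean) for t in range(n - lag)
    )
    return covariance / variance


# Hypothetical daily page views with a weekly dip on the last two days.
weekly_pattern = [100, 110, 105, 108, 102, 60, 55] * 8
lag7 = autocorrelation(weekly_pattern, 7)
lag3 = autocorrelation(weekly_pattern, 3)
print(lag7 > lag3)  # True: the series repeats every 7 days
```

    A high value at lag 7 relative to other lags is one simple hint of weekly seasonality in daily traffic data.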

    Content

    Contains Page Views for 60k Wikipedia articles in 8 different languages taken on a daily basis for 2 years.

    [Image: DSLC, https://i.ibb.co/h1JCgpY/DSLC.png]

    A Data Science Life Cycle can be used to create a project. Forecasting can be done for any interval, provided sufficient data is available. Refer to the GitHub link in the tasks to view the forecast done using ARIMA and Prophet. Feel free to contribute further. Several other models, including neural networks, can be used to improve the results substantially.

    Acknowledgements

    Credits:
    1. Wikipedia
    2. Google

  5. AKDOT_AADT_Unofficial

    • kaggle.com
    zip
    Updated Aug 11, 2020
    Cite
    ErikJamesMason (2020). AKDOT_AADT_Unofficial [Dataset]. https://www.kaggle.com/datasets/erikjamesmason/akdot-aadt-unofficial
    Explore at:
    Available download formats: zip (4555565 bytes)
    Dataset updated
    Aug 11, 2020
    Authors
    ErikJamesMason
    Description

    AADT for Alaska

    Annual Average Daily Traffic (AADT) is a standard way to summarize traffic volumes along roadway segments, capturing traffic throughout the year in a single value. References: Wikipedia (AADT); FHWA PDF (Traffic Data Pocket Guide).

    Data

    The data comprises annual statistics for the years 2002-2019, obtained through a particular vendor. There were 3 different state changes in systems (migrations from legacy systems), one of which is visible around 2013. The 2006 AADT is missing entirely from the original database and is therefore missing from this dataset.

    Ambitions

    Traffic data does not always get the attention that other industries and services receive, yet transportation deeply impacts all other industries, services, commercial activity, and even livelihood and well-being. The hope is that some progress can be made on intelligently predicting traffic, as traffic is related to everything from atmosphere and air quality to logistics, pavement, temperature, recreation, etc.

  6. Kaggle Wikipedia Web Traffic Weekly Dataset

    • zenodo.org
    • dataon.kisti.re.kr
    • +1more
    zip
    Updated Apr 1, 2021
    Cite
    Rakshitha Godahewa; Christoph Bergmeir; Geoff Webb (2021). Kaggle Wikipedia Web Traffic Weekly Dataset [Dataset]. http://doi.org/10.5281/zenodo.3898338
    Explore at:
    Available download formats: zip
    Dataset updated
    Apr 1, 2021
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Rakshitha Godahewa; Christoph Bergmeir; Geoff Webb
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This is the aggregated version of the daily dataset used in the Kaggle Wikipedia Web Traffic forecasting competition. It contains 145063 time series representing the number of hits or web traffic for a set of Wikipedia pages from 2015-07-01 to 2017-09-05, after aggregating them to weekly frequency.

    The original dataset contains missing values. They have been simply replaced by zeros before aggregation.
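
    The two steps described here (zero-filling missing values, then aggregating to weekly) could look roughly like the sketch below. The description does not say how weeks are aligned or whether daily values are summed, so summing blocks of 7 consecutive days and dropping a trailing partial week are assumptions, as are the function names and the sample data.

```python
def fill_missing_with_zero(series):
    """Replace missing observations (None) with 0."""
    return [0 if value is None else value for value in series]


def aggregate_weekly(daily, week_length=7):
    """Sum consecutive blocks of `week_length` days (assumed aggregation rule).

    A trailing block shorter than a full week is dropped.
    """
    full_weeks = len(daily) // week_length
    return [
        sum(daily[i * week_length:(i + 1) * week_length]) for i in range(full_weeks)
    ]


# Hypothetical daily hits for one page; None marks a missing value.
daily_hits = [12, None, 7, 9, None, 4, 10, 3, 5, None, 8, 6, 2, 1, 20]
weekly_hits = aggregate_weekly(fill_missing_with_zero(daily_hits))
print(weekly_hits)  # [42, 25]
```

    Filling before aggregating matters: a missing day then contributes 0 to its week instead of making the whole weekly value undefined.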

  7. Wikipedia: most viewed articles in 2024

    • statista.com
    Updated Dec 3, 2024
    Cite
    Statista (2024). Wikipedia: most viewed articles in 2024 [Dataset]. https://www.statista.com/statistics/1358978/wikipedia-most-viewed-articles-by-number-of-views/
    Explore at:
    Dataset updated
    Dec 3, 2024
    Dataset authored and provided by
    Statista (http://statista.com/)
    Time period covered
    2024
    Area covered
    Worldwide
    Description

    The most viewed English-language article on Wikipedia in 2024 was Deaths in 2024, with a total of 44.4 million views. Political topics also dominated the list, with articles related to the 2024 U.S. presidential election and key political figures like Kamala Harris and Donald Trump ranking among the top ten most viewed pages.

    Wikipedia's language diversity

    As of December 2024, the English Wikipedia subdomain contained approximately 6.91 million articles, making it the largest in terms of content and registered active users. Interestingly, the Cebuano language ranked second with around 6.11 million entries, although many of these articles are reportedly generated by bots. German and French followed as the next most populous European language subdomains, each with over 18,000 active users. Compared to the rest of the internet, as of January 2024, English was the primary language for over 52 percent of websites worldwide, far outpacing Spanish at 5.5 percent and German at 4.8 percent.

    Global traffic to Wikipedia.org

    Hosted by the Wikimedia Foundation, Wikipedia.org saw around 4.4 billion unique global visits in March 2024, a slight decrease from 4.6 billion visitors in January. In addition, as of January 2024, Wikipedia ranked amongst the top ten websites with the most referring subnets worldwide.

  8. A meta analysis of Wikipedia's coronavirus sources during the COVID-19...

    • live.european-language-grid.eu
    • data.niaid.nih.gov
    txt
    Updated Sep 8, 2022
    Cite
    (2022). A meta analysis of Wikipedia's coronavirus sources during the COVID-19 pandemic [Dataset]. https://live.european-language-grid.eu/catalogue/corpus/7806
    Explore at:
    Available download formats: txt
    Dataset updated
    Sep 8, 2022
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    At the height of the coronavirus pandemic, on the last day of March 2020, Wikipedia in all languages broke a record for most traffic in a single day. Since the outbreak of the Covid-19 pandemic at the start of January, tens if not hundreds of millions of people have come to Wikipedia to read - and in some cases also contribute - knowledge, information and data about the virus to an ever-growing pool of articles. Our study focuses on the scientific backbone behind the content people across the world read: which sources informed Wikipedia’s coronavirus content, and how the scientific research in this field was represented on Wikipedia. Using citations as a readout, we try to map how COVID-19 related research was used in Wikipedia and analyse what happened to it before and during the pandemic. Understanding how scientific and medical information was integrated into Wikipedia, and which sources informed the Covid-19 content, is key to understanding the digital knowledge echosphere during the pandemic.

    To delimit the corpus of Wikipedia articles containing Digital Object Identifiers (DOIs), we applied two different strategies. First, we scraped every Wikipedia page from the COVID-19 Wikipedia project (about 3000 pages) and filtered them to keep only pages containing DOI citations. For our second strategy, we searched EuroPMC for Covid-19, SARS-CoV2, SARS-nCoV19 (30’000 scientific papers, reviews and preprints), selected scientific papers from 2019 onwards, and compared them to the citations extracted from the English Wikipedia dump of May 2020 (2’000’000 DOIs). This search led to 231 Wikipedia articles containing at least one citation from the EuroPMC search or belonging to the Wikipedia COVID-19 project pages containing DOIs. Next, from our corpus of 231 Wikipedia articles we extracted DOIs, PMIDs, ISBNs, websites and URLs using a set of regular expressions. Subsequently, we computed several statistics for each Wikipedia article and retrieved Altmetric, CrossRef and EuroPMC information for each DOI. Finally, our method produced tables of annotated citations and of information extracted from each Wikipedia article, such as books, websites and newspapers.

    Files used as input and extracted information on Wikipedia's COVID-19 sources are presented in this archive. See the WikiCitationHistoRy GitHub repository for the R code and other bash/python utility scripts related to this project.
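
    The regular-expression extraction step might look something like the sketch below. The patterns, the function name, and the sample wikitext are assumptions for illustration only; the authors' actual expressions live in their repository and may differ.

```python
import re

# Assumed patterns, not the ones used by the study's authors.
DOI_PATTERN = re.compile(r'\b10\.\d{4,9}/[^\s|}\]<>"]+')
PMID_PATTERN = re.compile(r"pmid\s*=\s*(\d+)", re.IGNORECASE)


def extract_identifiers(wikitext):
    """Return the DOIs and PMIDs found in a piece of wikitext."""
    return {
        "dois": DOI_PATTERN.findall(wikitext),
        "pmids": PMID_PATTERN.findall(wikitext),
    }


# Hypothetical citation templates in wikitext form.
sample = (
    "{{cite journal |doi=10.1000/xyz123 |pmid=12345678}} "
    "{{cite journal |doi = 10.1016/S0140-6736(20)30183-5}}"
)
found = extract_identifiers(sample)
print(found["dois"])
print(found["pmids"])
```

    The DOI pattern stops at whitespace and at template delimiters such as `|` and `}`, which keeps trailing wikitext markup out of the match.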

  9. Sepsis information-seeking behaviors via Wikipedia between 2015 and 2018: A...

    • plos.figshare.com
    • figshare.com
    xlsx
    Updated Jun 5, 2023
    Cite
    Craig S. Jabaley; Robert F. Groff; Theresa J. Barnes; Mark E. Caridi-Scheible; James M. Blum; Vikas N. O’Reilly-Shah (2023). Sepsis information-seeking behaviors via Wikipedia between 2015 and 2018: A mixed methods retrospective observational study [Dataset]. http://doi.org/10.1371/journal.pone.0221596
    Explore at:
    Available download formats: xlsx
    Dataset updated
    Jun 5, 2023
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Craig S. Jabaley; Robert F. Groff; Theresa J. Barnes; Mark E. Caridi-Scheible; James M. Blum; Vikas N. O’Reilly-Shah
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Raising public awareness of sepsis, a potentially life-threatening dysregulated host response to infection, to hasten its recognition has become a major focus of physicians, investigators, and both non-governmental and governmental agencies. While the internet is a common means by which to seek out healthcare information, little is understood about patterns and drivers of these behaviors. We sought to examine traffic to Wikipedia, a popular and publicly available online encyclopedia, to better understand how, when, and why users access information about sepsis. Utilizing pageview traffic data for all available language localizations of the sepsis and septic shock pages between July 1, 2015 and June 30, 2018, significantly outlying daily pageview totals were identified using a seasonal hybrid extreme studentized deviate approach. Consecutive outlying days were aggregated, and a qualitative analysis was undertaken of print and online news media coverage to identify potential correlates. Traffic patterns were further characterized using paired referrer to resource (i.e. clickstream) data, which were available for a temporal subset of the pageviews. Of the 20,557,055 pageviews across 65 linguistic localizations, 47 of the 1,096 total daily pageview counts were identified as upward outliers. After aggregating sequential outlying days, 25 epochs were examined. Qualitative analysis identified at least one major news media correlate for each, which were typically related to high-profile deaths from sepsis and, less commonly, awareness promotion efforts. Clickstream analysis suggests that most sepsis and septic shock Wikipedia pageviews originate from external referrals, namely search engines. Owing to its granular and publicly available traffic data, Wikipedia holds promise as a means by which to better understand global drivers of online sepsis information seeking. 
Further characterization of user engagement with this information may help to elucidate means by which to optimize the visibility, content, and delivery of awareness promotion efforts.
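
    The aggregation of consecutive outlying days into epochs can be sketched as below. This covers only the grouping step, not the seasonal hybrid extreme studentized deviate detection itself; the function name and the dates are hypothetical.

```python
from datetime import date, timedelta


def group_consecutive_days(outlier_days):
    """Group dates into runs of consecutive calendar days (epochs)."""
    epochs = []
    for day in sorted(set(outlier_days)):
        if epochs and day - epochs[-1][-1] == timedelta(days=1):
            epochs[-1].append(day)  # extends the current epoch
        else:
            epochs.append([day])  # starts a new epoch
    return epochs


# Hypothetical outlier days, not taken from the study.
days = [
    date(2016, 6, 4), date(2016, 6, 5), date(2016, 6, 6),
    date(2017, 1, 10),
    date(2017, 3, 1), date(2017, 3, 2),
]
epochs = group_consecutive_days(days)
print([(epoch[0].isoformat(), len(epoch)) for epoch in epochs])
# [('2016-06-04', 3), ('2017-01-10', 1), ('2017-03-01', 2)]
```

    In the study, this kind of grouping reduced 47 outlying days to 25 epochs, each of which was then matched against news coverage.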

  10. 1.6 million UK traffic accidents

    • kaggle.com
    zip
    Updated Sep 17, 2017
    Cite
    Dave Fisher-Hickey (2017). 1.6 million UK traffic accidents [Dataset]. https://www.kaggle.com/forums/f/6458/1-6-million-uk-traffic-accidents
    Explore at:
    Available download formats: zip (138556435 bytes)
    Dataset updated
    Sep 17, 2017
    Authors
    Dave Fisher-Hickey
    License

    http://opendatacommons.org/licenses/dbcl/1.0/

    Area covered
    United Kingdom
    Description

    Context

    The UK government amassed traffic data from 2000 to 2016, recording over 1.6 million accidents in the process and making this one of the most comprehensive traffic data sets out there. It's a huge picture of a country undergoing change.

    Note that all the contained accident data comes from police reports, so this data does not include minor incidents.

    Content

    ukTrafficAADF.csv tracks how much traffic there was on all major roads in the given time period (2000 through 2016). AADF, the core statistic included in this file, stands for "Average Annual Daily Flow", and is a measure of how active a road segment is, based on how many vehicle trips traverse it. The AADT page on Wikipedia is a good reference on the subject.

    Accidents data is split across three CSV files: accidents_2005_to_2007.csv, accidents_2009_to_2011.csv, and accidents_2012_to_2014.csv. These three files together constitute 1.6 million traffic accidents. The total time period is 2005 through 2014, but 2008 is missing.

    A data dictionary for the raw dataset at large is available from the UK Department of Transport website here. For descriptions of individual columns, see the column metadata.

    Acknowledgements

    The license for this dataset is the Open Government Licence used by all data on data.gov.uk (here). The raw datasets are available from the UK Department of Transport website here.

    Inspiration

    • How has changing traffic flow impacted accidents?
    • Can we predict accident rates over time? What might improve accident rates?
    • Plot interactive maps of changing trends, e.g. How has London changed for cyclists? Busiest roads in the nation?
    • Which areas never change and why? Identify infrastructure needs, failings and successes.
    • How have Rural and Urban areas differed (see RoadCategory)? How about the differences between England, Scotland, and Wales?
    • The UK government also likes to look at miles driven. You can compute this by multiplying the AADF by the corresponding length of road (link length) and by the number of days in the year. What does this tell you about UK roads?
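
    The miles-driven calculation from the last bullet can be sketched as follows. The function name and the example figures are hypothetical, and the link length is assumed to be expressed in miles.

```python
def annual_vehicle_miles(aadf, link_length_miles, days_in_year=365):
    """Vehicle miles per year: AADF x link length x number of days in the year."""
    return aadf * link_length_miles * days_in_year


# Hypothetical road link: 12,000 vehicles per day over 2.5 miles.
miles_driven = annual_vehicle_miles(12_000, 2.5)
print(miles_driven)  # 10950000.0
```

    Passing 366 as `days_in_year` covers leap years; summing the result over all links gives a total for a region or road category.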

  11. Highway Traffic Analysis

    • icorridor-mto-on-ca.hub.arcgis.com
    Updated Jun 10, 2021
    Cite
    Authoritative_iCorridor_mto_on_ca (2021). Highway Traffic Analysis [Dataset]. https://icorridor-mto-on-ca.hub.arcgis.com/items/0b8bb8198eb44285991a6751e0c51eb8
    Explore at:
    Dataset updated
    Jun 10, 2021
    Dataset authored and provided by
    Authoritative_iCorridor_mto_on_ca
    Description

    Waze maps provide information about specific routes to assist motorists in avoiding traffic jams. Waze provides information about traffic jams and events that affect road conditions, either from drivers using Waze or from external sources. MTO has partnered with Waze through their Connected Citizen Program to publish Waze reported information on the Ontario 511 website. MTO iCorridor leverages such traffic jams to generate summaries of delay, duration, speed, and length by week, month, time of day, and day of week. The objective is to determine the pattern of congestion on all provincial highway corridors. The delay analysis comprised the following:

    • Estimation of corridor delay;
    • Estimation of duration and length of congestion; and
    • Identification of true peak period.

    "It must be noted that a “no congestion scenario” does not necessarily imply that there is no traffic on a specific road. Even when congestion is reduced to zero there may still be vehicles driving on the road. Waze creates “jam lines” that indicate continuous portions of streets where speed has slowed. Waze data provides the exact geographic location, length, speed, and time delay for these jam lines compared to the time it would normally take to traverse the jam line by car. A categorization for the severity of the jam is also provided. The jam data is composed of jam lines (which can change over time) measured at different time intervals. Given the crowd-sourced nature of the data, it cannot be determined if fluctuations in jam line activity are due to actual changes in traffic conditions or due to fluctuations in the number of active Wazers. Evidence from on-the-ground measures supports the notion that changes in jam activity are generally due to actual changes in traffic conditions"

    Element | Value | Description
    pubDate | Time | Publication date.
    linqmap:type | String | TRAFFIC_JAM.
    georss:line | List of longitude and latitude coordinates | Traffic jam line string (supplied when available).
    linqmap:speed | Float | Current average speed on jammed segments in meter/second.
    linqmap:length | Integer | Jam length in meters.
    linqmap:delay | Integer | Delay of jam compared to free flow speed, in seconds (in case of block, 1).
    linqmap:street | String | Street name (as is written in database, no canonical form) (supplied when available).
    linqmap:city | String | City and state name [City, State] in case both are available, [State] if not associated with a city (supplied when available).
    linqmap:country | String | Available on EU (world) server (see two letters codes in https://en.wikipedia.org/wiki/ISO_3166-1).
    linqmap:roadType | Integer | Road type (see road types table in the appendix).
    linqmap:startNode | String | Nearest Junction/street/city to jam start (supplied when available).
    linqmap:endNode | String | Nearest Junction/street/city to jam end (supplied when available).
    linqmap:level | 0-5 | Traffic congestion level (0 = free flow, 5 = blocked).
    linqmap:uuid | String | Unique jam identifier.
    linqmap:turnLine | Coordinates | A set of coordinates of a turn, only when the jam is in a turn (supplied when available).
    linqmap:turnType | String | What kind of turn it is: left, right, exit R or L, continue straight, or NONE (no info) (supplied when available).
    linqmap:blockingAlertUuid | String | If the jam is connected to a block (see alerts).

    Element | Value | Description
    pubDate | Time | Publication date.
    georss:point | Coordinates | Location per report (Lat long).
    linqmap:uuid | String | Unique system ID.
    linqmap:magvar | Integer (0-359) | Event direction (Driver heading at report time. 0 degrees at North, according to the driver's device).
    linqmap:type | See alert type table | Event type.
    linqmap:subtype | See alert subtypes table | Event subtype depends on parameter.
    linqmap:reportDescription | String | Report description (supplied when available).
    linqmap:street | String | Street name (as is written in database, no canonical form, may be null).
    linqmap:city | String | City and state name [City, State] in case both are available, [State] if not associated with a city (supplied when available).
    linqmap:country | String | See two letters codes.
    linqmap:roadType | Integer | Road type (see road types table in the appendix).
    linqmap:reportRating | Integer | User rank between 1-6 (6 = high ranked user).
    linqmap:jamUuid | String | If the alert is connected to a jam: jam ID.
    linqmap:Reliability (new) | 0-10 | How reliable is the report, 10 being most reliable. Based on reporter level and user respon-

    Reference from (https://ops.fhwa.dot.gov/publications/fhwahop18084/ch2.htm)
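
    A summary of jam delay by day of week, one of the aggregations mentioned in the description, could be sketched as below. The record layout is hypothetical: keys drop the `linqmap:` prefix, `pubDate` is assumed to be an ISO 8601 string (the real feed format may differ), and the values are invented. This is not MTO iCorridor's actual procedure.

```python
from collections import defaultdict
from datetime import datetime

WEEKDAYS = ("Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday")

# Hypothetical jam records: speed in meter/second, length in meters, delay in seconds.
jams = [
    {"pubDate": "2021-06-07T08:15:00", "speed": 5.0, "length": 1200, "delay": 180},
    {"pubDate": "2021-06-08T17:40:00", "speed": 8.0, "length": 800, "delay": 90},
    {"pubDate": "2021-06-14T08:05:00", "speed": 4.0, "length": 1500, "delay": 240},
]


def mean_delay_by_weekday(records):
    """Average jam delay in seconds, grouped by day of week."""
    totals = defaultdict(lambda: [0, 0])  # weekday -> [sum of delays, record count]
    for record in records:
        day = WEEKDAYS[datetime.fromisoformat(record["pubDate"]).weekday()]
        totals[day][0] += record["delay"]
        totals[day][1] += 1
    return {day: total / count for day, (total, count) in totals.items()}


def mps_to_kmh(speed_mps):
    """Convert meter/second (the unit of the speed field) to km/h."""
    return speed_mps * 3.6


summary = mean_delay_by_weekday(jams)
print(summary)  # {'Monday': 210.0, 'Tuesday': 90.0}
```

    The same grouping key could be swapped for hour of day, ISO week, or month to produce the other summaries listed in the description.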

  12. Index figures traffic density

    • cbs.nl
    xml
    Updated Jan 5, 2012
    Cite
    Centraal Bureau voor de Statistiek (2012). Index figures traffic density [Dataset]. https://www.cbs.nl/en-gb/figures/detail/37674ENG
    Explore at:
    Available download formats: xml
    Dataset updated
    Jan 5, 2012
    Dataset authored and provided by
    Centraal Bureau voor de Statistiek
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    The Netherlands
    Description

    Index figures traffic density by most important road categories and by type of day (weekday/working day) and region 1994 - 2005; January 1998 - December 2005 Changed on January 05 2012. Frequency: Discontinued.

  13. Traffic simulation of Ingolstadt in SUMO

    • catalog.savenow.de
    html, url
    Updated Nov 14, 2023
    Cite
    AUDI AG (2023). Traffic simulation of Ingolstadt in SUMO [Dataset]. https://catalog.savenow.de/dataset/traffic-simulation-of-ingolstadt-in-sumo
    Explore at:
    Available download formats: url, html
    Dataset updated
    Nov 14, 2023
    Dataset provided by
    Audi (http://audi.com/)
    Area covered
    Ingolstadt
    Description

    This simulation was developed to create a realistic multimodal environment for traffic research. It is created in the SAVe:, SAVeNoW and KIVI research projects.

    The 24h simulation currently contains routes for passenger vehicles, heavy-vehicle traffic as well as bicycles. Both vehicle demand as well as traffic light settings are created from data for Wednesday the 16.09.2020 to create a weekday traffic situation mostly uninfluenced by covid-restrictions. However, it is possible to create a 24-hour calibration for each day since August 2019.

    The multimodal simulation also contains trips of public transport (PT) vehicles created from GTFS data provided by the local transport authority. For the GTFS format, see: General Transit Feed Specification (GTFS) - Wikipedia

    The simulation serves as the digital twin of traffic in Ingolstadt. It is used for the scenario analysis and the virtual test field.

    The simulation might be used for various simulative studies. It could serve as a test bed for autonomous driving functions as well as a simulative environment for the exploration of mobility services.

    [Image: https://raw.githubusercontent.com/TUM-VT/sumo_ingolstadt/main/docs/simulation_view.png]

  14. Bard Statistics By Users, Usage, Traffic and Facts (2025)

    • sci-tech-today.com
    Updated Oct 14, 2025
    Cite
    Sci-Tech Today (2025). Bard Statistics By Users, Usage, Traffic and Facts (2025) [Dataset]. https://www.sci-tech-today.com/stats/bard-statistics/
    Explore at:
    Dataset updated
    Oct 14, 2025
    Dataset authored and provided by
    Sci-Tech Today
    License

    https://www.sci-tech-today.com/privacy-policy

    Time period covered
    2022 - 2032
    Area covered
    Global
    Description

    Introduction

    Bard Statistics: As a tech journalist and content creator, I have been tracking the evolution of Google Bard (now Gemini) since day one. Its quick turnaround from a controversial launch to a major industry strength has been one of the most exciting stories in tech.

    Today, I'm providing an exclusive analysis based on over two years of data, incorporating the fresh 2025 usage metrics and demographic insights. I'm here to show you exactly what this growth means for your AI roadmap. Without further ado, let’s get started.

  15. COVID-19 Pandemic Wikipedia Readership

    • figshare.com
    txt
    Updated May 31, 2023
    Cite
    Isaac Johnson; Leila Zia; Joseph Allemandou; Marcel Ruiz Forns; Nuria Ruiz; Fabian Kaelin (2023). COVID-19 Pandemic Wikipedia Readership [Dataset]. http://doi.org/10.6084/m9.figshare.14548032.v3
    Explore at:
    Available download formats: txt
    Dataset updated
    May 31, 2023
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Isaac Johnson; Leila Zia; Joseph Allemandou; Marcel Ruiz Forns; Nuria Ruiz; Fabian Kaelin
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    This data release includes two Wikipedia datasets on readership of the project during the early COVID-19 pandemic period. The first dataset contains COVID-19 article page views by country; the second contains one-hop navigation where one of the two pages is COVID-19 related. The data covers roughly the first six months of the pandemic, from January 1st 2020 to June 30th 2020. For more background on the pandemic in those months, see English Wikipedia's Timeline of the COVID-19 pandemic.

    Wikipedia articles are considered COVID-19 related according to the methodology described here; the list of COVID-19 articles used for the released datasets is available in covid_articles.tsv. For simplicity and transparency, the same list of articles from 20 April 2020 was used for the entire dataset, though in practice new COVID-19-relevant articles were constantly being created as the pandemic evolved.

    Privacy considerations

    While this data is considered valuable for the insight that it can provide about information-seeking behaviors around the pandemic in its early months across diverse geographies, care must be taken to not inadvertently reveal information about the behavior of individual Wikipedia readers. We put in place a number of filters to release as much data as we can while minimizing the risk to readers.

    The Wikimedia Foundation started to release most viewed articles by country from Jan 2021. At the beginning of the COVID-19 pandemic, an exemption was made to store reader data about the pandemic with additional privacy protections:
    - exclude the page views from users engaged in an edit session
    - exclude reader data from specific countries (with a few exceptions)
    - the aggregated statistics are based on 50% of reader sessions that involve a pageview to a COVID-19-related article (see covid_pages.tsv). As a control, a 1% random sample of reader sessions that have no pageviews to COVID-19-related articles was kept. In aggregate, we make sure this 1% non-COVID-19 sample and 50% COVID-19 sample represent less than 10% of pageviews for a country for that day. The randomization and filters occur on a daily cadence with all timestamps in UTC.
    - exclude power users, i.e. userhashes with greater than 500 pageviews in a day. This doubles as another form of likely bot removal, protects very heavy users of the project, and also in theory would help reduce the chance of a single user heavily skewing the data.
    - exclude readership from users of the iOS and Android Wikipedia apps.

    In effect, the view counts in this dataset represent comparable trends rather than the total amount of traffic from a given country. For more background on per-country readership data, and the COVID-19 privacy protections in particular, see this Phabricator.

    To further minimize privacy risks, a k-anonymity threshold of 100 was applied to the aggregated counts. For example, a page needs to be viewed at least 100 times in a given country and week in order to be included in the dataset. In addition, the view counts are floored to a multiple of 100.

    Datasets

    The datasets published in this release are derived from a reader session dataset generated by the code in this notebook with the filtering described above. The raw reader session data itself will not be publicly available due to privacy considerations. The datasets described below are similar to the pageviews and clickstream data that the Wikimedia Foundation publishes already, with the addition of the country-specific counts.

    COVID-19 pageviews

    The file covid_pageviews.tsv contains:
    - pageview counts for COVID-19 related pages, aggregated by week and country
    - k-anonymity threshold of 100
    - example: In the 13th week of 2020 (23 March - 29 March 2020), the page 'Pandémie_de_Covid-19_en_Italie' on French Wikipedia was visited 11700 times from readers in Belgium
    - as a control bucket, we include pageview counts to all pages aggregated by week and country. Due to privacy considerations during the collection of the data, the control bucket was sampled at ~1% of all view traffic. The view counts for the control title are thus proportional to the total number of pageviews to all pages.

    The file is ~8 MB and contains ~134000 data points across the 27 weeks, 108 countries, and 168 projects.

    Covid reader session bigrams

    The file covid_session_bigrams.tsv contains:
    - number of occurrences of visits to pages A -> B, where either A or B is a COVID-19 related article. Note that the bigrams are tuples (from, to) of articles viewed in succession. The underlying mechanism can be clicking on a link in an article, but it may also have been a new search or reading both articles based on links from third source articles. In contrast, the clickstream data is based on referral information only
    - aggregated by month and country
    - k-anonymity threshold of 100
    - example: In March of 2020, there were 1000 occurrences of readers accessing the page es.wikipedia/SARS-CoV-2 followed by es.wikipedia/Orthocoronavirinae from Chile

    The file is ~10 MB and contains ~90000 bigrams across the 6 months, 96 countries, and 56 projects.

    Contact

    Please reach out to research-feedback@wikimedia.org for any questions.
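
    The k-anonymity threshold and the flooring step described for this release can be illustrated with a minimal Python sketch. This is not the release's actual pipeline: the function name `release_counts`, the `(country, week, page)` key layout, and the sample values are assumptions made for illustration. Only the threshold of 100 and the flooring to a multiple of 100 come from the description.

```python
from collections import Counter

K_THRESHOLD = 100  # k-anonymity threshold stated in the dataset description


def release_counts(views, k=K_THRESHOLD):
    """Aggregate raw pageviews and apply the two privacy steps.

    `views` is an iterable of hypothetical (country, week, page) tuples,
    one tuple per pageview. Groups with fewer than `k` views are dropped,
    and the remaining counts are floored to a multiple of `k`.
    """
    raw = Counter(views)
    return {key: (n // k) * k for key, n in raw.items() if n >= k}


if __name__ == "__main__":
    # Made-up sample: one popular page and one page below the threshold.
    sample = (
        [("BE", "2020-W13", "Pandémie_de_Covid-19_en_Italie")] * 11742
        + [("BE", "2020-W13", "Some_rare_page")] * 99
    )
    print(release_counts(sample))
```

    In this sketch the page with 99 views is dropped entirely, while 11742 views are reported as 11700, so small differences between released counts carry no information.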

  16. Traffic Sign Data set

    • kaggle.com
    zip
    Updated Feb 11, 2020
    Cite
    Shivam Singhal (2020). Traffic Sign Data set [Dataset]. https://www.kaggle.com/shivamsinghal1012/traffic-sign-data-set
    Explore at:
    Available download formats: zip (410290 bytes)
    Dataset updated
    Feb 11, 2020
    Authors
    Shivam Singhal
    Description

    Context

    This dataset contains information about the traffic signs used in day-to-day life and can be used for computer vision and self-driving car projects. It covers more than 82 different types of traffic sign.

    Content

    This dataset contains information about all the traffic signs used.

    Acknowledgements

    This dataset was scraped from Wikipedia.

  17. Traffic Flow Data In Ho Chi Minh City, Viet Nam

    • hub.tumidata.org
    • kaggle.com
    url, zip
    Updated Jun 4, 2024
    Cite
    TUMI (2024). Traffic Flow Data In Ho Chi Minh City, Viet Nam [Dataset]. https://hub.tumidata.org/dataset/traffic_flow_data_in_ho_chi_minh_city_viet_nam_hochiminhcity
    Explore at:
    Available download formats: url, zip (8379823)
    Dataset updated
    Jun 4, 2024
    Dataset provided by
    Tumi Inc. (http://www.tumi.com/)
    Area covered
    Vietnam, Ho Chi Minh City
    Description

    Traffic Flow Data In Ho Chi Minh City, Viet Nam
    This dataset falls under the category Traffic Generating Parameters.
    It contains the following data: Traffic flow
    This dataset was scouted on 2022-02-10 as part of a data sourcing project conducted by TUMI. License information might be outdated: Check original source for current licensing. The data can be accessed using the following URL / API Endpoint: https://www.kaggle.com/thanhnguyen2612/traffic-flow-data-in-ho-chi-minh-city-viet-nam

  18. ChatGPT-4 Statistics By Traffic, Visitor Engagement And Country

    • sci-tech-today.com
    Updated Nov 13, 2025
    Cite
    Sci-Tech Today (2025). ChatGPT-4 Statistics By Traffic, Visitor Engagement And Country [Dataset]. https://www.sci-tech-today.com/stats/chatgpt-4-statistics/
    Explore at:
    Dataset updated
    Nov 13, 2025
    Dataset authored and provided by
    Sci-Tech Today
    License

    https://www.sci-tech-today.com/privacy-policy

    Time period covered
    2022 - 2032
    Area covered
    Global
    Description

    Introduction

    ChatGPT-4 Statistics: In 2024, ChatGPT-4 has seen a notable surge in user engagement, processing millions of queries daily. Its high accuracy and reliability have made it a popular choice for businesses and individuals. Over 70% of users report high satisfaction, reflecting the model's effectiveness across various applications, from customer service to content creation. ChatGPT-4 excels at interpreting and generating human-like text, thanks to continuous updates that enhance its ability to handle complex queries.

    Developed by OpenAI, ChatGPT stands for "Chat Generative Pre-trained Transformer." This advanced model surpasses GPT-3.5 by offering improved accuracy, better context handling, and even image understanding. These features highlight ChatGPT-4's transformative role in AI-driven communication.

  19. Leading websites worldwide 2025, by monthly visits

    • statista.com
    • boostndoto.org
    Updated Oct 29, 2025
    Cite
    Statista (2025). Leading websites worldwide 2025, by monthly visits [Dataset]. https://www.statista.com/statistics/1201880/most-visited-websites-worldwide/
    Explore at:
    Dataset updated
    Oct 29, 2025
    Dataset authored and provided by
    Statista (http://statista.com/)
    Time period covered
    Aug 2025
    Area covered
    Worldwide
    Description

    In August 2025, Google.com was the most visited website worldwide, with an average of 98.2 billion monthly visits. The platform has maintained its leading position since June 2010, when it surpassed Yahoo to take first place. YouTube ranked second during the same period, recording over 48 billion monthly visits.

    The internet leaders: search, social, and e-commerce

    Social networks, search engines, and e-commerce websites shape the online experience as we know it. While Google leads the global online search market by far, YouTube and Facebook have become the world’s most popular websites for user-generated content, solidifying Alphabet’s and Meta’s leadership over the online landscape. Meanwhile, websites such as Amazon and eBay generate millions in profits from the sale and distribution of goods, making the e-market sector an integral part of the global retail scene.

    What is next for online content?

    Powering social media and websites like Reddit and Wikipedia, user-generated content keeps moving the internet’s engines. However, the rise of generative artificial intelligence will bring significant changes to how online content is produced and handled. ChatGPT is already transforming how online search is performed, and news of Google's 2024 deal for licensing Reddit content to train large language models (LLMs) signals that the internet is likely to go through a new revolution. While AI's impact on the online market might bring both opportunities and challenges, effective content management will remain crucial for profitability on the web.

  20. TomTom Intermediate Traffic Service

    • data.norge.no
    • transportdata.be
    dtd_xml
    Updated Jul 30, 2025
    Cite
    Tomtom sales b.v. norway branch nuf (2025). TomTom Intermediate Traffic Service [Dataset]. https://data.norge.no/en/datasets/0a86d898-70f7-3dd7-b099-732fc45e220a/tomtom-intermediate-traffic-service
    Explore at:
    Available download formats: dtd_xml
    Dataset updated
    Jul 30, 2025
    Dataset provided by
    TomTom (http://www.tomtom.com/)
    Description

    We developed TomTom Intermediate Traffic to deliver detailed, real-time traffic content to business customers who integrate it into their own applications. Target customers for TomTom Intermediate Traffic include automotive OEMs, web and application developers, and governments. We deliver bulk traffic flow information that provides a comprehensive view of the entire road network. TomTom delivered our first live traffic product in 2007 and our experience has taught us how to continue delivering the best traffic products in the market. Our real-time traffic products are created by merging multiple data sources, including anonymized measurement data from over 650 million GPS-enabled devices. Using highly granular data, gathered on nearly every stretch of road, we can calculate travel times and speeds for virtually any day or time. We focus on our travel information so our customers can focus on their own business objectives.
