29 datasets found
  1. Wikipedia Pageviews Fields

    • windsor.ai
    json
    Updated Jun 1, 2024
    Cite
    Windsor.ai (2024). Wikipedia Pageviews Fields [Dataset]. https://windsor.ai/data-field/wikipedia_pageviews/
    Available download formats: json
    Dataset updated
    Jun 1, 2024
    Dataset provided by
    Windsor.ai
    Variables measured
    Today, Source, top.day, top.year, top.month, top.access, Data Source, top.project, top.articles, per-article.agent, and 6 more
    Description

    Auto-generated structured data of Wikipedia Pageviews from table Fields

  2. English Wikipedia pageviews by second

    • figshare.com
    • huggingface.co
    • +1 more
    application/gzip
    Updated Jan 19, 2016
    Cite
    Os Keyes (2016). English Wikipedia pageviews by second [Dataset]. http://doi.org/10.6084/m9.figshare.1394684.v1
    Available download formats: application/gzip
    Dataset updated
    Jan 19, 2016
    Dataset provided by
    figshare
    Authors
    Os Keyes
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This file contains a count of pageviews to the English-language Wikipedia from 2015-03-16T00:00:00 to 2015-04-25T15:59:59, grouped by timestamp (down to a one-second resolution level) and site (mobile or desktop). The smallest number of events in a group is 645; because of this, we are confident there should not be privacy implications of releasing this data.

  3. Google Trends and Wikipedia Page Views

    • zenodo.org
    • explore.openaire.eu
    application/gzip
    Updated Jan 24, 2020
    Cite
    Mitsuo Yoshida (2020). Google Trends and Wikipedia Page Views [Dataset]. http://doi.org/10.5281/zenodo.14539
    Available download formats: application/gzip
    Dataset updated
    Jan 24, 2020
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Mitsuo Yoshida
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    Abstract (our paper)

    The frequency of a web search keyword generally reflects the degree of public interest in a particular subject matter. Search logs are therefore useful resources for trend analysis. However, access to search logs is typically restricted to search engine providers. In this paper, we investigate whether search frequency can be estimated from a different resource such as Wikipedia page views of open data. We found frequently searched keywords to have remarkably high correlations with Wikipedia page views. This suggests that Wikipedia page views can be an effective tool for determining popular global web search trends.

    Data

    personal-name.txt.gz:
    The first column is the Wikipedia article id, the second column is the search keyword, the third column is the Wikipedia article title, and the fourth column is the total of page views from 2008 to 2014.

    personal-name_data_google-trends.txt.gz, personal-name_data_wikipedia.txt.gz:
    The first column is the period to be collected, the second column is the source (Google or Wikipedia), the third column is the Wikipedia article id, the fourth column is the search keyword, the fifth column is the date, and the sixth column is the value of search trend or page view.
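
    The column layout above can be read with a short parser. The following is a minimal sketch for the six-column files, assuming they are tab-separated UTF-8 text inside the gzip archive (the description states neither the delimiter nor the encoding). `TrendRow`, `read_trend_rows`, and `read_trend_file` are hypothetical names introduced for illustration.

```python
import csv
import gzip
from typing import Iterable, Iterator, NamedTuple


class TrendRow(NamedTuple):
    period: str      # column 1: period to be collected
    source: str      # column 2: "Google" or "Wikipedia"
    article_id: str  # column 3: Wikipedia article id
    keyword: str     # column 4: search keyword
    date: str        # column 5: date, kept as text (format not documented here)
    value: float     # column 6: search-trend value or page-view count


def read_trend_rows(lines: Iterable[str]) -> Iterator[TrendRow]:
    """Yield one TrendRow per six-column line (tab delimiter is an assumption)."""
    reader = csv.reader(lines, delimiter="\t", quoting=csv.QUOTE_NONE)
    for fields in reader:
        if len(fields) != 6:
            continue  # skip blank or malformed lines
        period, source, article_id, keyword, date, value = fields
        yield TrendRow(period, source, article_id, keyword, date, float(value))


def read_trend_file(path: str) -> Iterator[TrendRow]:
    """Open a gzip-compressed file as UTF-8 text and parse it line by line."""
    with gzip.open(path, "rt", encoding="utf-8", newline="") as handle:
        yield from read_trend_rows(handle)
```

    Keeping the date as text avoids guessing a date format that the description does not specify.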

    Publication

    This data set was created for our study. If you make use of this data set, please cite:
    Mitsuo Yoshida, Yuki Arase, Takaaki Tsunoda, Mikio Yamamoto. Wikipedia Page View Reflects Web Search Trend. Proceedings of the 2015 ACM Web Science Conference (WebSci '15). no.65, pp.1-2, 2015.
    http://dx.doi.org/10.1145/2786451.2786495
    http://arxiv.org/abs/1509.02218 (author-created version)

    Note

    The raw data of Wikipedia page views is available in the following page.
    http://dumps.wikimedia.org/other/pagecounts-raw/

  4. Wikipedia English: number of page views 2023, by country

    • statista.com
    Updated Dec 13, 2023
    Cite
    Statista (2023). Wikipedia English: number of page views 2023, by country [Dataset]. https://www.statista.com/statistics/1428253/wikipedia-english-page-views-country/
    Dataset updated
    Dec 13, 2023
    Dataset authored and provided by
    Statista (http://statista.com/)
    Time period covered
    Nov 2023
    Area covered
    Worldwide
    Description

    In November 2023, the English version of Wikipedia received over 3 billion page views originating from the United States across all platforms. The United Kingdom generated the second-most page views for the subdomain, with 809.9 million views, followed by India, with 773.2 million views.

  5. Yearly pageviews of English Wikipedia articles with potential links to green...

    • data.niaid.nih.gov
    Updated Nov 16, 2020
    Cite
    Leva, Federico (2020). Yearly pageviews of English Wikipedia articles with potential links to green open access scholarly articles [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_3783467
    Dataset updated
    Nov 16, 2020
    Dataset authored and provided by
    Leva, Federico
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    Number of visits in 2019 for a sample of 23462 English Wikipedia articles which contain references to academic sources which have a green open access copy available but not yet used. The consultation statistics were retrieved from the Wikimedia pageviews API using the Python client (script also included). The sample was selected among articles which in April 2020 had at least one citation of an academic paper (using the "cite journal" template) for which OAbot (through Unpaywall data) had found a green open access URL to add (gratis open access, not necessarily libre open access). Data shows that the top 1 % most visited articles received 30 % of the visits: over 500 million in the year, corresponding to 1 million potential citation link clicks to distribute across all references assuming a 0.2 % click-through rate per Piccardi et al. (2020).
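
    The click estimate in the description follows from simple arithmetic, sketched below. The figures are the ones quoted above; "over 500 million" is treated as a lower bound, and the implied total for the whole sample is an inference from the 30 % share rather than a number reported by the dataset. The function names are hypothetical.

```python
def potential_clicks(visits: int, ctr_per_mille: int) -> int:
    """Expected citation-link clicks for a click-through rate given in per mille."""
    return visits * ctr_per_mille // 1000


def implied_total_visits(top_visits: int, top_share_percent: int) -> int:
    """Total visits implied when `top_visits` make up `top_share_percent` % of all visits."""
    return top_visits * 100 // top_share_percent


top_visits = 500_000_000  # visits to the top 1 % of articles (lower bound)
clicks = potential_clicks(top_visits, ctr_per_mille=2)          # 0.2 % = 2 per mille
total = implied_total_visits(top_visits, top_share_percent=30)  # inferred, not reported
print(clicks, total)
```

    Integer arithmetic is used so that the result is exact: 500 million visits at 0.2 % gives 1 million potential clicks.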

  6. Wikipedia Page Views of Japanese Comic

    • zenodo.org
    • data.niaid.nih.gov
    application/gzip, bin
    Updated Jan 24, 2020
    Cite
    Mitsuo Yoshida (2020). Wikipedia Page Views of Japanese Comic [Dataset]. http://doi.org/10.5281/zenodo.60886
    Available download formats: application/gzip, bin
    Dataset updated
    Jan 24, 2020
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Mitsuo Yoshida
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    Abstract (our paper)

    This paper investigates the page view and interlanguage link at Wikipedia for Japanese comic analysis. This paper is based on a preliminary investigation, and obtained three results, but the analysis is insufficient to use the results for a market research immediately. I am looking for research collaborators in order to conduct a more detailed analysis.

    Data

    Publication

    This data set was created for our study. If you make use of this data set, please cite:
    Mitsuo Yoshida. Preliminary Investigation for Japanese Comic Analysis using Wikipedia. Proceedings of the Fifth Asian Conference on Information Systems (ACIS 2016). pp.229-230, 2016.

  7. wikipedia-20250620

    • huggingface.co
    Updated Jul 3, 2025
    + more versions
    Cite
    NeuML (2025). wikipedia-20250620 [Dataset]. https://huggingface.co/datasets/NeuML/wikipedia-20250620
    Dataset updated
    Jul 3, 2025
    Dataset authored and provided by
    NeuML
    License

    Attribution-ShareAlike 3.0 (CC BY-SA 3.0): https://creativecommons.org/licenses/by-sa/3.0/
    License information was derived automatically

    Description

    Dataset Card for Wikipedia English June 2025

    Dataset created using this repo with a June 2025 Wikipedia snapshot. This repo also has a precomputed pageviews database. This database has the aggregated number of views for each page in Wikipedia. This file is built using the Wikipedia Pageview complete dumps

  8. wikipedia-20240901

    • huggingface.co
    Updated Sep 1, 2024
    + more versions
    Cite
    NeuML (2024). wikipedia-20240901 [Dataset]. https://huggingface.co/datasets/NeuML/wikipedia-20240901
    Dataset updated
    Sep 1, 2024
    Dataset authored and provided by
    NeuML
    License

    Attribution-ShareAlike 3.0 (CC BY-SA 3.0): https://creativecommons.org/licenses/by-sa/3.0/
    License information was derived automatically

    Description

    Dataset Card for Wikipedia English September 2024

    Dataset created using this repo with a September 2024 Wikipedia snapshot. This repo also has a precomputed pageviews database. This database has the aggregated number of views for each page in Wikipedia. This file is built using the Wikipedia Pageview complete dumps

  9. Wikipedia Web Traffic 2018-19

    • kaggle.com
    Updated Apr 12, 2021
    Cite
    san_bt (2021). Wikipedia Web Traffic 2018-19 [Dataset]. https://www.kaggle.com/datasets/sandeshbhat/wikipedia-web-traffic-201819/versions/1
    Dataset updated
    Apr 12, 2021
    Dataset provided by
    Kaggle
    Authors
    san_bt
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Context

    • Time series: A time series is a set of observations recorded at regular intervals of time. Time series can be beneficial in many fields, such as stock market prediction and weather forecasting. Time series analysis accounts for the fact that data points taken over time may have an internal structure (such as autocorrelation, trend, or seasonal variation).

    • Web traffic: The amount of data sent and received by visitors to a website. Sites monitor the incoming and outgoing traffic to see which parts or pages of their site are popular and whether there are any apparent trends, such as one specific page being viewed mostly by people in a particular country.
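
    The internal structure mentioned for time series, autocorrelation in particular, can be measured directly. Below is a minimal sketch of the standard sample autocorrelation at a given lag; the function name and the synthetic weekly series are illustrative and are not taken from the dataset.

```python
from typing import Sequence


def autocorrelation(series: Sequence[float], lag: int) -> float:
    """Sample autocorrelation of `series` at `lag` (biased estimator)."""
    n = len(series)
    if not 0 < lag < n:
        raise ValueError("lag must be between 1 and len(series) - 1")
    mean = sum(series) / n
    # Total squared deviation from the mean, used to normalise the result.
    denominator = sum((x - mean) ** 2 for x in series)
    if denominator == 0:
        raise ValueError("autocorrelation is undefined for a constant series")
    # Covariance between the series and a copy of itself shifted by `lag`.
    numerator = sum(
        (series[t] - mean) * (series[t + lag] - mean) for t in range(n - lag)
    )
    return numerator / denominator


# Synthetic daily page views with a repeating weekly pattern (four weeks).
weekly_views = [1, 2, 3, 4, 5, 6, 7] * 4
print(autocorrelation(weekly_views, lag=7))
```

    A high value at lag 7 on daily data is one sign of weekly seasonality, which forecasting models such as ARIMA or Prophet would then need to account for.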

    Content

    Contains Page Views for 60k Wikipedia articles in 8 different languages taken on a daily basis for 2 years.


    A Data Science Life Cycle can be used to create a project. Forecasting can be done for any interval provided a sufficient dataset is available. Refer to the GitHub link in the tasks to view the forecast done using ARIMA and Prophet. Feel free to contribute further. Several other models, including neural networks, can be used to improve the results.

    Acknowledgements

    Credits :
    1. Wikipedia 2. Google

  10. COVID-19 Pandemic Wikipedia Readership

    • figshare.com
    txt
    Updated May 31, 2023
    Cite
    Isaac Johnson; Leila Zia; Joseph Allemandou; Marcel Ruiz Forns; Nuria Ruiz; Fabian Kaelin (2023). COVID-19 Pandemic Wikipedia Readership [Dataset]. http://doi.org/10.6084/m9.figshare.14548032.v3
    Available download formats: txt
    Dataset updated
    May 31, 2023
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Isaac Johnson; Leila Zia; Joseph Allemandou; Marcel Ruiz Forns; Nuria Ruiz; Fabian Kaelin
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    This data release includes two Wikipedia datasets related to the readership of the project during the early COVID-19 pandemic period. The first dataset is COVID-19 article page views by country; the second is one-hop navigation where one of the two pages is COVID-19 related. The data covers roughly the first six months of the pandemic, specifically from January 1st 2020 to June 30th 2020. For more background on the pandemic in those months, see English Wikipedia's Timeline of the COVID-19 pandemic.

    Wikipedia articles are considered COVID-19 related according to the methodology described here; the list of COVID-19 articles used for the released datasets is available in covid_articles.tsv. For simplicity and transparency, the same list of articles from 20 April 2020 was used for the entire dataset, though in practice new COVID-19-relevant articles were constantly being created as the pandemic evolved.

    Privacy considerations

    While this data is considered valuable for the insight that it can provide about information-seeking behaviors around the pandemic in its early months across diverse geographies, care must be taken to not inadvertently reveal information about the behavior of individual Wikipedia readers. We put in place a number of filters to release as much data as we can while minimizing the risk to readers.

    The Wikimedia Foundation started to release most viewed articles by country from Jan 2021. At the beginning of the COVID-19 pandemic an exemption was made to store reader data about the pandemic with additional privacy protections:

    • exclude the page views from users engaged in an edit session
    • exclude reader data from specific countries (with a few exceptions)
    • the aggregated statistics are based on 50% of reader sessions that involve a pageview to a COVID-19-related article (see covid_pages.tsv). As a control, a 1% random sample of reader sessions that have no pageviews to COVID-19-related articles was kept. In aggregate, we make sure this 1% non-COVID-19 sample and 50% COVID-19 sample represent less than 10% of pageviews for a country for that day. The randomization and filters occur on a daily cadence with all timestamps in UTC.
    • exclude power users, i.e. userhashes with greater than 500 pageviews in a day. This doubles as another form of likely bot removal, protects very heavy users of the project, and also in theory would help reduce the chance of a single user heavily skewing the data.
    • exclude readership from users of the iOS and Android Wikipedia apps.

    In effect, the view counts in this dataset represent comparable trends rather than the total amount of traffic from a given country. For more background on per-country readership data, and the COVID-19 privacy protections in particular, see this phabricator.

    To further minimize privacy risks, a k-anonymity threshold of 100 was applied to the aggregated counts. For example, a page needs to be viewed at least 100 times in a given country and week in order to be included in the dataset. In addition, the view counts are floored to a multiple of 100.

    Datasets

    The datasets published in this release are derived from a reader session dataset generated by the code in this notebook with the filtering described above. The raw reader session data itself will not be publicly available due to privacy considerations. The datasets described below are similar to the pageviews and clickstream data that the Wikimedia Foundation publishes already, with the addition of the country-specific counts.

    COVID-19 pageviews

    The file covid_pageviews.tsv contains:

    • pageview counts for COVID-19 related pages, aggregated by week and country
    • k-anonymity threshold of 100
    • example: In the 13th week of 2020 (23 March - 29 March 2020), the page 'Pandémie_de_Covid-19_en_Italie' on French Wikipedia was visited 11700 times from readers in Belgium
    • as a control bucket, we include pageview counts to all pages aggregated by week and country. Due to privacy considerations during the collection of the data, the control bucket was sampled at ~1% of all view traffic. The view counts for the control title are thus proportional to the total number of pageviews to all pages.

    The file is ~8 MB and contains ~134000 data points across the 27 weeks, 108 countries, and 168 projects.

    Covid reader session bigrams

    The file covid_session_bigrams.tsv contains:

    • number of occurrences of visits to pages A -> B, where either A or B is a COVID-19 related article. Note that the bigrams are tuples (from, to) of articles viewed in succession; the underlying mechanism can be clicking on a link in an article, but it may also have been a new search or reading both articles based on links from third source articles. In contrast, the clickstream data is based on referral information only
    • aggregated by month and country
    • k-anonymity threshold of 100
    • example: In March of 2020, there were 1000 occurrences of readers accessing the page es.wikipedia/SARS-CoV-2 followed by es.wikipedia/Orthocoronavirinae from Chile

    The file is ~10 MB and contains ~90000 bigrams across the 6 months, 96 countries, and 56 projects.

    Contact

    Please reach out to research-feedback@wikimedia.org for any questions.
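
    The k-anonymity threshold and the flooring rule described in this entry can be illustrated with a small sketch. It assumes the two steps are applied per aggregated count, in that order, and that both use the value 100 stated in the description; `release_count` is a hypothetical helper, not code from the release.

```python
from typing import Optional

K_THRESHOLD = 100  # k-anonymity threshold and flooring step, both 100 per the description


def release_count(raw_count: int, k: int = K_THRESHOLD) -> Optional[int]:
    """Return the publishable count, or None if the row would be suppressed."""
    if raw_count < k:
        return None  # fewer than k views: the row is left out of the dataset
    return (raw_count // k) * k  # floor to a multiple of k


for raw in (99, 100, 11789):  # hypothetical raw counts
    print(raw, release_count(raw))
```

    Under these assumptions, a hypothetical raw count of 11789 would be published as 11700, and a count of 99 would be omitted entirely.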

  11. Wikimedia user agents

    • data.wu.ac.at
    tsv
    Updated Mar 6, 2015
    Cite
    Wikimedia (2015). Wikimedia user agents [Dataset]. https://data.wu.ac.at/schema/datahub_io/YTYxNDBmYjItMjE2Ni00ZDQ4LThmZmQtOGUyMTQ5MTA2NDUz
    Available download formats: tsv
    Dataset updated
    Mar 6, 2015
    Dataset provided by
    Wikimedia
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    A dataset of parsed reader and editor browser agents from the Wikimedia web properties. The intent behind releasing the parsed agents is to make it easier for Wikimedia developers to understand how to best test their software for the group they're targeting.

    The actual data collection and anonymisation process varied between readers and editors. For readers, a 1:1000 sampled log of pageviews in February 2014 was taken. Any user agent that had more than 500 (in other words, 500,000) requests in a 24-hour period, from no fewer than 500/500,000 distinct IP addresses, was extracted, along with a count of how many times the agent appeared. For editors, a 90 day sample (December 2014 - February 2015) of user agents was taken globally; any user agent used by >= 50 distinct users was extracted, along with a count of the associated number of edits.
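
    The relationship between sampled and estimated counts in the reader log can be sketched as follows. This assumes the thresholds of 500 apply to counts observed in the 1:1000 sample within one 24-hour period, so that each sampled request stands for roughly 1,000 actual requests; all names below are hypothetical.

```python
SAMPLING_RATE = 1000  # the reader log keeps 1 request in 1000


def estimated_requests(sampled_requests: int) -> int:
    """Scale a sampled request count up to an estimate of actual requests."""
    return sampled_requests * SAMPLING_RATE


def reader_agent_qualifies(sampled_requests: int, sampled_distinct_ips: int) -> bool:
    """More than 500 sampled requests from no fewer than 500 distinct IPs in 24 hours."""
    return sampled_requests > 500 and sampled_distinct_ips >= 500


def editor_agent_qualifies(distinct_users: int) -> bool:
    """Editor agents are kept when used by at least 50 distinct users."""
    return distinct_users >= 50


print(estimated_requests(500))           # sampled threshold expressed as actual requests
print(reader_agent_qualifies(501, 500))  # just above the request threshold
```

    The function for readers takes counts that are already aggregated per user agent and per 24-hour window; how that aggregation was done is not described in the entry.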

  12. Selection of English Wikipedia pages (CNs) regarding topics with a direct...

    • plos.figshare.com
    xls
    Updated Jun 1, 2023
    Cite
    Mirko Kämpf; Eric Tessenow; Dror Y. Kenett; Jan W. Kantelhardt (2023). Selection of English Wikipedia pages (CNs) regarding topics with a direct relation to the emerging Hadoop (Big Data) market. [Dataset]. http://doi.org/10.1371/journal.pone.0141892.t001
    Available download formats: xls
    Dataset updated
    Jun 1, 2023
    Dataset provided by
    PLOS ONE
    Authors
    Mirko Kämpf; Eric Tessenow; Dror Y. Kenett; Jan W. Kantelhardt
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Apache Hadoop is the central software project, beside Apache SOLR, and Apache Lucene (SW, software). Companies which offer Hadoop distributions and Hadoop based solutions are the central companies in the scope of the study (HV, hardware vendors). Other companies started very early with Hadoop related projects as early adopters (EA). Global players (GP) are affected by this emerging market, its opportunities and the new competitors (NC). Some new but highly relevant companies like Talend or LucidWorks have been selected because of their obvious commitment to the open source ideas. Widely adopted technologies with a relation to the selected research topic are represented by the group TEC.

  13. Sepsis information-seeking behaviors via Wikipedia between 2015 and 2018: A...

    • plos.figshare.com
    • figshare.com
    xlsx
    Updated Jun 5, 2023
    Cite
    Craig S. Jabaley; Robert F. Groff; Theresa J. Barnes; Mark E. Caridi-Scheible; James M. Blum; Vikas N. O’Reilly-Shah (2023). Sepsis information-seeking behaviors via Wikipedia between 2015 and 2018: A mixed methods retrospective observational study [Dataset]. http://doi.org/10.1371/journal.pone.0221596
    Available download formats: xlsx
    Dataset updated
    Jun 5, 2023
    Dataset provided by
    PLOS ONE
    Authors
    Craig S. Jabaley; Robert F. Groff; Theresa J. Barnes; Mark E. Caridi-Scheible; James M. Blum; Vikas N. O’Reilly-Shah
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Raising public awareness of sepsis, a potentially life-threatening dysregulated host response to infection, to hasten its recognition has become a major focus of physicians, investigators, and both non-governmental and governmental agencies. While the internet is a common means by which to seek out healthcare information, little is understood about patterns and drivers of these behaviors. We sought to examine traffic to Wikipedia, a popular and publicly available online encyclopedia, to better understand how, when, and why users access information about sepsis. Utilizing pageview traffic data for all available language localizations of the sepsis and septic shock pages between July 1, 2015 and June 30, 2018, significantly outlying daily pageview totals were identified using a seasonal hybrid extreme studentized deviate approach. Consecutive outlying days were aggregated, and a qualitative analysis was undertaken of print and online news media coverage to identify potential correlates. Traffic patterns were further characterized using paired referrer to resource (i.e. clickstream) data, which were available for a temporal subset of the pageviews. Of the 20,557,055 pageviews across 65 linguistic localizations, 47 of the 1,096 total daily pageview counts were identified as upward outliers. After aggregating sequential outlying days, 25 epochs were examined. Qualitative analysis identified at least one major news media correlate for each, which were typically related to high-profile deaths from sepsis and, less commonly, awareness promotion efforts. Clickstream analysis suggests that most sepsis and septic shock Wikipedia pageviews originate from external referrals, namely search engines. Owing to its granular and publicly available traffic data, Wikipedia holds promise as a means by which to better understand global drivers of online sepsis information seeking. 
Further characterization of user engagement with this information may help to elucidate means by which to optimize the visibility, content, and delivery of awareness promotion efforts.
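
    The step in which consecutive outlying days are aggregated into epochs can be sketched as below. This covers only the grouping step, under the assumption that an epoch is a maximal run of calendar-adjacent days; it does not implement the seasonal hybrid extreme studentized deviate test itself. The function name and the example dates are hypothetical.

```python
from datetime import date, timedelta
from typing import Iterable, List, Tuple


def group_consecutive_days(days: Iterable[date]) -> List[Tuple[date, date]]:
    """Merge outlier days into (first_day, last_day) epochs of adjacent dates."""
    epochs: List[Tuple[date, date]] = []
    for day in sorted(set(days)):
        if epochs and day - epochs[-1][1] == timedelta(days=1):
            # The day directly follows the current epoch, so extend that epoch.
            epochs[-1] = (epochs[-1][0], day)
        else:
            # A gap of more than one day starts a new epoch.
            epochs.append((day, day))
    return epochs


outlier_days = [date(2016, 1, 3), date(2016, 1, 1), date(2016, 1, 2), date(2016, 1, 10)]
for start, end in group_consecutive_days(outlier_days):
    print(start.isoformat(), end.isoformat())
```

    Grouping in this way is how a larger number of outlying days (47 in the study) can collapse into fewer epochs (25).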

  14. Data from: The impact of news exposure on collective attention in the United...

    • zenodo.org
    • data.niaid.nih.gov
    application/gzip, csv +1
    Updated Mar 2, 2020
    Cite
    Michele Tizzoni; André Panisson; Daniela Paolotti; Ciro Cattuto (2020). The impact of news exposure on collective attention in the United States during the 2016 Zika epidemic [Dataset]. http://doi.org/10.5281/zenodo.3603916
    Available download formats: zip, csv, application/gzip
    Dataset updated
    Mar 2, 2020
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Michele Tizzoni; André Panisson; Daniela Paolotti; Ciro Cattuto
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    United States
    Description

    This repository contains the data of the study "The impact of news exposure on collective attention in the United States during the 2016 Zika epidemic".

    Epidemiological data

    The folder zika_USA_weekly_cases_2016.zip contains weekly ZIKV incidence counts reported by the US Centers for Disease Control and Prevention in 2016, by state. Data were extracted from reports made publicly available by the CDC at: https://zenodo.org/record/584136#.Xk07-RNKjOQ

    Web news data

    The file news_GDELT_data.csv.gz contains all news items extracted from the GDELT platform (https://www.gdeltproject.org/) matching TAX_DISEASE_ZIKA as a Theme, and United_States as a Location in the GDELT platform.

    TV closed captions

    The file zika_TV_mentions_dataframe.csv contains all the TV news items of 2016 matching the word "Zika" in the TV News Archive https://archive.org/details/tv

    Wikipedia pageview counts

    Dataset 1: wikipedia_dataset1_zika_daily_pageview_usa.csv

    Content of each line of the dataset: day, pageview_count

    The dataset contains the daily number of pageview counts of 128 different Wikipedia pages related to the Zika virus (aggregated and summed to total) originated in the United States, from January 1st to December 31st, 2016.

    Dataset 2: wikipedia_dataset2_zika_daily_pageview_bystate.zip

    Content of each line of the dataset: day, pageview_count, state

    The dataset contains the daily number of pageview counts of 128 different Wikipedia pages related to the Zika virus (aggregated and summed to total) originated in the United States, disaggregated by state, from January 1st to December 31st, 2016.

    Dataset 3: wikipedia_dataset3_zika_pagecount_by_city.csv

    Content of each line of the dataset: US_city, pageview_count_Zika,pageview_count_total

    The dataset contains the total number of pageview counts of 128 different Wikipedia pages related to the Zika virus (pageview_count_Zika) originated in 788 cities (US_city) of the United States with a population larger than 40,000 in 2016. The dataset also contains the total number of pageview counts to all Wikipedia pages (all Wikipedia projects, pageview_count_total) originated in 788 cities (US_city) of the United States with a population larger than 40,000 in 2016.
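
    Because Dataset 3 pairs Zika-related counts with total counts per city, a per-city share of attention can be derived from it. The sketch below assumes a comma-separated file with the three columns in the stated order and integer counts, and it skips any row (such as a possible header) whose count fields are not integers; `zika_share_by_city` and the sample rows are hypothetical.

```python
import csv
from typing import Dict, Iterable


def zika_share_by_city(lines: Iterable[str]) -> Dict[str, float]:
    """Map each city to pageview_count_Zika divided by pageview_count_total."""
    shares: Dict[str, float] = {}
    for row in csv.reader(lines):
        if len(row) != 3:
            continue  # not a US_city, pageview_count_Zika, pageview_count_total row
        city, zika, total = (field.strip() for field in row)
        if not (zika.isdigit() and total.isdigit()) or int(total) == 0:
            continue  # header row, non-integer value, or zero total
        shares[city] = int(zika) / int(total)
    return shares


sample_lines = [
    "US_city,pageview_count_Zika,pageview_count_total",
    "Springfield,50,10000",  # made-up values for illustration only
]
print(zika_share_by_city(sample_lines))
```

    Normalising by the total count makes cities of different sizes comparable, which a raw Zika pageview count would not.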

  15. Replication Data for: Click, click boom: Using Wikipedia data to predict...

    • search.dataone.org
    • dataverse.harvard.edu
    Updated Nov 8, 2023
    Cite
    Oswald, Christian; Ohrenhofer, Daniel (2023). Replication Data for: Click, click boom: Using Wikipedia data to predict changes in battle-related deaths [Dataset]. http://doi.org/10.7910/DVN/W4BAN2
    Explore at:
    Dataset updated
    Nov 8, 2023
    Dataset provided by
    Harvard Dataverse
    Authors
    Oswald, Christian; Ohrenhofer, Daniel
    Description

    Data and methods development are key to improve our ability to forecast conflict. Relatively recent data sources such as mobile phone and social media data or images have received widespread attention in conflict research. Oftentimes these do not cover substantial parts of the globe or they are difficult to obtain and manipulate, which makes regular updating challenging. The sometimes vast amounts of data can also be computationally and financially costly. The data source we propose instead is cheap, readily and openly available, and updated in real time, and it provides global coverage: Wikipedia. We argue that the number of country page views can be considered a measure of interest or salience, whereas the number of page changes can be considered a measure of controversy between competing political views. We expect these predictors to be particularly successful in capturing tensions before a conflict escalates. We test our argument by predicting changes in battle-related deaths in Africa on the country-month level. We find evidence that country page views do increase predictive performance while page changes do not. Contrary to our expectation, our model seems to capture long-term trends better than sharp short-term changes.

  16. Pageviews of pages with at least one DOI citation and the referrals from DOI...

    • plos.figshare.com
    xls
    Updated Jun 1, 2023
    Cite
    Lauren A. Maggio; John M. Willinsky; Ryan M. Steinberg; Daniel Mietchen; Joseph L. Wass; Ting Dong (2023). Pageviews of pages with at least one DOI citation and the referrals from DOI citations during August 2016. [Dataset]. http://doi.org/10.1371/journal.pone.0190046.t006
    Available download formats: xls
    Dataset updated
    Jun 1, 2023
    Dataset provided by
    PLOS ONE
    Authors
    Lauren A. Maggio; John M. Willinsky; Ryan M. Steinberg; Daniel Mietchen; Joseph L. Wass; Ting Dong
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Pageviews of pages with at least one DOI citation and the referrals from DOI citations during August 2016.

  17. Replication Data for: "Using party press releases and Wikipedia page view...

    • dataone.org
    Updated Nov 8, 2023
    Cite
    Debus, Marc; Christopher Florczak (2023). Replication Data for: "Using party press releases and Wikipedia page view data to analyse developments and determinants of parties’ issue prevalence: Evidence for the right-wing populist ‘Alternative for Germany’" [Dataset]. http://doi.org/10.7910/DVN/1XGQF2
    Explore at:
    Dataset updated
    Nov 8, 2023
    Dataset provided by
    Harvard Dataverse
    Authors
    Debus, Marc; Christopher Florczak
    Description

    This data replicates the findings of the manuscript 'Using party press releases and Wikipedia page view data to analyse developments and determinants of parties’ issue prevalence: Evidence for the right-wing populist ‘Alternative for Germany’'

  18. WikiRank 05.2019 - quality, popularity and AI for Wikipedia articles

    • figshare.com
    bz2
    Updated May 30, 2023
    Cite
    Wiki Rank (2023). WikiRank 05.2019 - quality, popularity and AI for Wikipedia articles [Dataset]. http://doi.org/10.6084/m9.figshare.8231273.v2
    Explore at:
    Available download formats: bz2
    Dataset updated
    May 30, 2023
    Dataset provided by
    figshare
    Authors
    Wiki Rank
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    This dataset includes a list of over 39 million Wikipedia articles in 55 languages with quality scores by WikiRank (https://wikirank.net). Quality scores of articles are based on Wikipedia dumps from May 2019. Popularity and Authors' Interest are based on activity in April 2019.

    License
    All files included in this dataset are released under CC0: https://creativecommons.org/publicdomain/zero/1.0/

    Format
    • page_id -- The identifier of the Wikipedia article (int), e.g. 4519301
    • page_name -- The title of the Wikipedia article (utf-8), e.g. General relativity
    • wikirank_quality -- quality score for the Wikipedia article on a scale of 0-100 (as of May 1, 2019)
    • poularity -- median of the daily number of page views of the Wikipedia article during April 2019
    • authors_interest -- number of authors of the Wikipedia article during April 2019

  19. A season for all things: Phenological imprints in Wikipedia usage and their...

    • plos.figshare.com
    docx
    Updated May 31, 2023
    Cite
    John C. Mittermeier; Uri Roll; Thomas J. Matthews; Richard Grenyer (2023). A season for all things: Phenological imprints in Wikipedia usage and their relevance to conservation [Dataset]. http://doi.org/10.1371/journal.pbio.3000146
    Explore at:
    Available download formats: docx
    Dataset updated
    May 31, 2023
    Dataset provided by
    PLOS Biology
    Authors
    John C. Mittermeier; Uri Roll; Thomas J. Matthews; Richard Grenyer
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Phenology plays an important role in many human–nature interactions, but these seasonal patterns are often overlooked in conservation. Here, we provide the first broad exploration of seasonal patterns of interest in nature across many species and cultures. Using data from Wikipedia, a large online encyclopedia, we analyzed 2.33 billion pageviews to articles for 31,751 species across 245 languages. We show that seasonality plays an important role in how and when people interact with plants and animals online. In total, over 25% of species in our data set exhibited a seasonal pattern in at least one of their language-edition pages, and seasonality is significantly more prevalent in pages for plants and animals than it is in a random selection of Wikipedia articles. Pageview seasonality varies across taxonomic clades in ways that reflect observable patterns in phenology, with groups such as insects and flowering plants having higher seasonality than mammals. Differences between Wikipedia language editions are significant; pages in languages spoken at higher latitudes exhibit greater seasonality overall, and species seldom show the same pattern across multiple language editions. These results have relevance to conservation policy formulation and to improving our understanding of what drives human interest in biodiversity.

  20. Outlying Wikipedia sepsis and septic shock epochs with potential media...

    • plos.figshare.com
    xls
    Updated Jun 1, 2023
    Cite
    Craig S. Jabaley; Robert F. Groff; Theresa J. Barnes; Mark E. Caridi-Scheible; James M. Blum; Vikas N. O’Reilly-Shah (2023). Outlying Wikipedia sepsis and septic shock epochs with potential media correlates (2015 to 2018). [Dataset]. http://doi.org/10.1371/journal.pone.0221596.t001
    Explore at:
    Available download formats: xls
    Dataset updated
    Jun 1, 2023
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Craig S. Jabaley; Robert F. Groff; Theresa J. Barnes; Mark E. Caridi-Scheible; James M. Blum; Vikas N. O’Reilly-Shah
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Outlying Wikipedia sepsis and septic shock epochs with potential media correlates (2015 to 2018).
