100+ datasets found
  1. wikipedia.org Traffic Analytics Data

    • analytics.explodingtopics.com
    Updated Jun 1, 2025
    Cite
    (2025). wikipedia.org Traffic Analytics Data [Dataset]. https://analytics.explodingtopics.com/website/wikipedia.org
    Explore at:
    Dataset updated
    Jun 1, 2025
    Variables measured
    Global Rank, Monthly Visits, Authority Score, US Country Rank, Online Services Category Rank
    Description

    Traffic analytics, rankings, and competitive metrics for wikipedia.org as of June 2025

  2. wikipedia-analysis

    • huggingface.co
    Updated Jul 24, 2025
    Cite
    Omar Kamali (2025). wikipedia-analysis [Dataset]. https://huggingface.co/datasets/omarkamali/wikipedia-analysis
    Explore at:
    Dataset updated
    Jul 24, 2025
    Authors
    Omar Kamali
    Description

    The omarkamali/wikipedia-analysis dataset, hosted on Hugging Face and contributed by the HF Datasets community.

  3. Selection of English Wikipedia pages (CNs) regarding topics with a direct...

    • plos.figshare.com
    xls
    Updated Jun 1, 2023
    Cite
    Mirko Kämpf; Eric Tessenow; Dror Y. Kenett; Jan W. Kantelhardt (2023). Selection of English Wikipedia pages (CNs) regarding topics with a direct relation to the emerging Hadoop (Big Data) market. [Dataset]. http://doi.org/10.1371/journal.pone.0141892.t001
    Explore at:
    xls (available download formats)
    Dataset updated
    Jun 1, 2023
    Dataset provided by
    PLOS ONE
    Authors
    Mirko Kämpf; Eric Tessenow; Dror Y. Kenett; Jan W. Kantelhardt
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Apache Hadoop is the central software project, alongside Apache Solr and Apache Lucene (SW, software). Companies that offer Hadoop distributions and Hadoop-based solutions are the central companies in the scope of the study (HV, hardware vendors). Other companies started very early with Hadoop-related projects as early adopters (EA). Global players (GP) are affected by this emerging market, its opportunities and the new competitors (NC). Some new but highly relevant companies, like Talend or LucidWorks, were selected because of their obvious commitment to open-source ideas. Widely adopted technologies related to the selected research topic are represented by the group TEC.

  4. A meta analysis of Wikipedia's coronavirus sources during the COVID-19...

    • live.european-language-grid.eu
    txt
    Updated Sep 8, 2022
    Cite
    (2022). A meta analysis of Wikipedia's coronavirus sources during the COVID-19 pandemic [Dataset]. https://live.european-language-grid.eu/catalogue/corpus/7806
    Explore at:
    txt (available download formats)
    Dataset updated
    Sep 8, 2022
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    At the height of the coronavirus pandemic, on the last day of March 2020, Wikipedia in all languages broke a record for the most traffic in a single day. Since the outbreak of the COVID-19 pandemic at the start of January, tens if not hundreds of millions of people have come to Wikipedia to read, and in some cases also contribute, knowledge, information and data about the virus in an ever-growing pool of articles. Our study focuses on the scientific backbone behind the content people across the world read: which sources informed Wikipedia's coronavirus content, and how the scientific research in this field was represented on Wikipedia. Using citations as a readout, we try to map how COVID-19-related research was used in Wikipedia and analyse what happened to it before and during the pandemic. Understanding how scientific and medical information was integrated into Wikipedia, and which sources informed the COVID-19 content, is key to understanding the digital knowledge echosphere during the pandemic.

    To delimit the corpus of Wikipedia articles containing Digital Object Identifiers (DOIs), we applied two different strategies. First, we scraped every Wikipedia page from the COVID-19 Wikipedia project (about 3,000 pages) and filtered them to keep only pages containing DOI citations. For our second strategy, we searched EuroPMC for Covid-19, SARS-CoV2 and SARS-nCoV19 (30,000 scientific papers, reviews and preprints), selected the scientific papers from 2019 onwards, and compared them to the citations extracted from the English Wikipedia dump of May 2020 (2,000,000 DOIs). This search led to 231 Wikipedia articles containing at least one citation from the EuroPMC search or belonging to the Wikipedia COVID-19 project pages containing DOIs. Next, from our corpus of 231 Wikipedia articles we extracted DOIs, PMIDs, ISBNs, websites and URLs using a set of regular expressions. Subsequently, we computed several statistics for each Wikipedia article and retrieved Altmetric, CrossRef and EuroPMC information for each DOI. Finally, our method produced tables of annotated citations and information extracted from each Wikipedia article, such as books, websites and newspapers.

    Files used as input and extracted information on Wikipedia's COVID-19 sources are presented in this archive. See the WikiCitationHistoRy GitHub repository for the R code and the other bash/Python utility scripts related to this project.
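    The identifier-extraction step lends itself to a short illustration. Below is a minimal Python sketch of the kind of regular expressions such a pipeline might use; the patterns and function are illustrative assumptions, not the study's actual expressions (those live in the WikiCitationHistoRy repository).

    import re

    # Illustrative patterns only; not the study's actual regular expressions.
    DOI_RE = re.compile(r'10\.\d{4,9}/[^\s"<>]+')
    PMID_RE = re.compile(r'pmid\s*=\s*(\d+)', re.IGNORECASE)
    ISBN_RE = re.compile(r'isbn\s*=\s*([\d\-Xx]{10,17})', re.IGNORECASE)

    def extract_identifiers(wikitext: str) -> dict:
        """Pull DOI, PMID and ISBN identifiers out of raw wikitext."""
        return {
            "doi": DOI_RE.findall(wikitext),
            "pmid": PMID_RE.findall(wikitext),
            "isbn": ISBN_RE.findall(wikitext),
        }

    sample = '{{Cite journal | doi = 10.1371/journal.pone.0141892 | pmid = 26528998}}'
    print(extract_identifiers(sample))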

  5. Wikipedia Knowledge Graph dataset

    • zenodo.org
    • produccioncientifica.ugr.es
    • +1more
    pdf, tsv
    Updated Jul 17, 2024
    Cite
    Wenceslao Arroyo-Machado; Daniel Torres-Salinas; Rodrigo Costas (2024). Wikipedia Knowledge Graph dataset [Dataset]. http://doi.org/10.5281/zenodo.6346900
    Explore at:
    tsv, pdf (available download formats)
    Dataset updated
    Jul 17, 2024
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Wenceslao Arroyo-Machado; Daniel Torres-Salinas; Rodrigo Costas
    License

    CC0 1.0 Universal Public Domain Dedication, https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    Wikipedia is the largest and most-read free online encyclopedia in existence. As such, Wikipedia offers a large amount of data on all of its own contents and the interactions around them, as well as different types of open data sources. This makes Wikipedia a unique data source that can be analyzed with quantitative data science techniques. However, the enormous amount of data makes it difficult to get an overview, and many of the analytical possibilities that Wikipedia offers remain unknown. In order to reduce the complexity of identifying and collecting data on Wikipedia and to expand its analytical potential, after collecting data from various sources and processing them, we have generated a dedicated Wikipedia Knowledge Graph aimed at facilitating the analysis and contextualization of the activity and relations of Wikipedia pages, in this case limited to the English edition. We share this Knowledge Graph dataset openly, aiming to be useful for a wide range of researchers, such as informetricians, sociologists or data scientists.

    There are a total of 9 files, all in tsv format, built under a relational structure. The main file, which acts as the core of the dataset, is the page file; around it there are 4 files with different entities related to the Wikipedia pages (the category, url, pub and page_property files) and 4 other files that act as "intermediate tables", making it possible to connect the pages both with those entities and with each other (the page_category, page_url, page_pub and page_link files).
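    Given that relational layout, the intermediate tables can be joined back to the core page file in the usual star-schema way. A minimal pandas sketch, using hypothetical key-column names (the real schema is documented in Dataset_summary):

    import pandas as pd

    # Hypothetical column names; consult Dataset_summary for the real schema.
    pages = pd.read_csv("page.tsv", sep="\t")
    categories = pd.read_csv("category.tsv", sep="\t")
    page_category = pd.read_csv("page_category.tsv", sep="\t")

    # Resolve the many-to-many page<->category relation through the
    # intermediate table.
    page_with_cats = (
        page_category
        .merge(pages, on="page_id")
        .merge(categories, on="category_id")
    )
    print(page_with_cats.head())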

    The document Dataset_summary includes a detailed description of the dataset.

    Thanks to Nees Jan van Eck and the Centre for Science and Technology Studies (CWTS) for the valuable comments and suggestions.

  6. Use of data analytics application for M&A in the U.S. 2018

    • statista.com
    Updated Jul 9, 2025
    Cite
    Statista (2025). Use of data analytics application for M&A in the U.S. 2018 [Dataset]. https://www.statista.com/statistics/943056/use-of-data-analytics-application-for-ma-usa/
    Explore at:
    Dataset updated
    Jul 9, 2025
    Dataset authored and provided by
    Statista (http://statista.com/)
    Time period covered
    2018
    Area covered
    United States
    Description

    This statistic presents the use of data analytics applications for M&A in the United States in 2018. At that time, ** percent of executives surveyed considered data analytics to be a core component of their M&A analysis.

  7. wikipedia.org Website Traffic, Ranking, Analytics [July 2025]

    • semrush.com
    • stb2.digiseotools.com
    Updated Aug 12, 2025
    Cite
    Semrush (2025). wikipedia.org Website Traffic, Ranking, Analytics [July 2025] [Dataset]. https://www.semrush.com/website/wikipedia.org/overview/
    Explore at:
    Dataset updated
    Aug 12, 2025
    Dataset authored and provided by
    Semrush (https://fr.semrush.com/)
    License

    https://www.semrush.com/company/legal/terms-of-service/

    Time period covered
    Aug 12, 2025
    Area covered
    Worldwide
    Variables measured
    visits, backlinks, bounceRate, pagesPerVisit, authorityScore, organicKeywords, avgVisitDuration, referringDomains, trafficByCountry, paidSearchTraffic, and 3 more
    Measurement technique
    Semrush Traffic Analytics; Click-stream data
    Description

    wikipedia.org is ranked #7 in US with 4.76B Traffic. Categories: Newspapers, Online Services. Learn more about website traffic, market share, and more!

  8. Data analytics tools in use by organizations in the United States 2015-2017

    • statista.com
    Updated Dec 1, 2015
    Cite
    Statista (2015). Data analytics tools in use by organizations in the United States 2015-2017 [Dataset]. https://www.statista.com/statistics/500119/united-states-survey-use-data-analytics-tools/
    Explore at:
    Dataset updated
    Dec 1, 2015
    Dataset authored and provided by
    Statista (http://statista.com/)
    Time period covered
    2015
    Area covered
    United States
    Description

    The statistic shows the analytics tools currently in use by business organizations in the United States, as well as the analytics tools respondents believe they will be using in two years, according to a 2015 survey conducted by the Harvard Business Review Analytics Service. As of 2015, ** percent of respondents believed they were going to use predictive analytics for data analysis in two years' time.

  9. machine_learning_wikipedia

    • kaggle.com
    Updated Apr 8, 2025
    Cite
    Will Learn (2025). machine_learning_wikipedia [Dataset]. https://www.kaggle.com/datasets/willlearn1/machine-learning-wikipedia/code
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Apr 8, 2025
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Will Learn
    License

    Attribution-ShareAlike 3.0 (CC BY-SA 3.0), https://creativecommons.org/licenses/by-sa/3.0/
    License information was derived automatically

    Description

    This is a basic web scrape of the Wikipedia entry on 'machine learning'. I've used it to break the text into chunks in my program, which are used to provide 'context' for generative AI to demonstrate RAG.
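    The chunking step the author describes is easy to sketch. A minimal Python example, assuming a plain-text scrape and a fixed-size character window with overlap (the file name and chunking parameters here are hypothetical, not taken from the dataset):

    def chunk_text(text: str, size: int = 500, overlap: int = 50) -> list:
        """Split scraped text into overlapping chunks for RAG retrieval."""
        chunks = []
        step = size - overlap
        for start in range(0, len(text), step):
            chunk = text[start:start + size]
            if chunk.strip():
                chunks.append(chunk)
        return chunks

    # Hypothetical file name for the scraped article text.
    with open("machine_learning_wikipedia.txt", encoding="utf-8") as f:
        article = f.read()
    chunks = chunk_text(article)
    print(f"{len(chunks)} chunks; first chunk starts: {chunks[0][:80]}...")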

  10. Top uses of data and analytics within companies worldwide 2018

    • statista.com
    Updated Jul 1, 2025
    Cite
    Statista (2025). Top uses of data and analytics within companies worldwide 2018 [Dataset]. https://www.statista.com/statistics/893798/worldwide-data-analytics-top-uses-companies/
    Explore at:
    Dataset updated
    Jul 1, 2025
    Dataset authored and provided by
    Statista (http://statista.com/)
    Time period covered
    2018
    Area covered
    Worldwide
    Description

    This statistic shows the ways that companies are using data and analytics worldwide as of 2018. Around ** percent of respondents stated that one of the top uses of data and analytics in their company was as a driver of strategy and change.

  11. wikipedia.com Website Traffic, Ranking, Analytics [July 2025]

    • stb2.digiseotools.com
    • semrush.com
    Updated Aug 12, 2025
    Cite
    Semrush (2025). wikipedia.com Website Traffic, Ranking, Analytics [July 2025] [Dataset]. https://stb2.digiseotools.com/website/wikipedia.com/overview/
    Explore at:
    Dataset updated
    Aug 12, 2025
    Dataset authored and provided by
    Semrush (https://fr.semrush.com/)
    License

    https://sem1.theseowheel.com/company/legal/terms-of-service/

    Time period covered
    Aug 12, 2025
    Area covered
    Worldwide
    Variables measured
    visits, backlinks, bounceRate, pagesPerVisit, authorityScore, organicKeywords, avgVisitDuration, referringDomains, trafficByCountry, paidSearchTraffic, and 3 more
    Measurement technique
    Semrush Traffic Analytics; Click-stream data
    Description

    wikipedia.com is ranked #11826 in US with 1.53M Traffic. Categories: Online Services. Learn more about website traffic, market share, and more!

  12. Usage areas for big data analytics in companies worldwide 2017

    • statista.com
    Updated Jul 11, 2025
    Cite
    Statista (2025). Usage areas for big data analytics in companies worldwide 2017 [Dataset]. https://www.statista.com/statistics/933409/worldwide-big-data-analytics-usage-areas/
    Explore at:
    Dataset updated
    Jul 11, 2025
    Dataset authored and provided by
    Statista (http://statista.com/)
    Time period covered
    2017
    Area covered
    Worldwide
    Description

    The statistic shows the areas in which companies use or plan to use big data analytics worldwide as of 2017. A quarter of respondents stated that their company currently uses big data analytics for marketing, with another ** percent stating that their business planned to use big data for marketing within the next 12 months.

  13. nn.wikipedia.org Website Traffic, Ranking, Analytics [July 2025]

    • semrush.com
    Updated Aug 12, 2025
    + more versions
    Cite
    Semrush (2025). nn.wikipedia.org Website Traffic, Ranking, Analytics [July 2025] [Dataset]. https://www.semrush.com/website/nn.wikipedia.org/overview/
    Explore at:
    Dataset updated
    Aug 12, 2025
    Dataset authored and provided by
    Semrush (https://fr.semrush.com/)
    License

    https://www.semrush.com/company/legal/terms-of-service/

    Time period covered
    Aug 12, 2025
    Area covered
    Worldwide
    Variables measured
    visits, backlinks, bounceRate, pagesPerVisit, authorityScore, organicKeywords, avgVisitDuration, referringDomains, trafficByCountry, paidSearchTraffic, and 3 more
    Measurement technique
    Semrush Traffic Analytics; Click-stream data
    Description

    nn.wikipedia.org is ranked #7 in US with 59.3K Traffic. Learn more about website traffic, market share, and more!

  14. Replication Data for: Measuring Wikipedia Article Quality in One Dimension...

    • search.dataone.org
    • dataverse.harvard.edu
    Updated Sep 25, 2024
    Cite
    TeBlunthuis, Nathan (2024). Replication Data for: Measuring Wikipedia Article Quality in One Dimension by Extending ORES with Ordinal Regression [Dataset]. http://doi.org/10.7910/DVN/U5V0G1
    Explore at:
    Dataset updated
    Sep 25, 2024
    Dataset provided by
    Harvard Dataverse
    Authors
    TeBlunthuis, Nathan
    Description

    This dataset provides code, data, and instructions for replicating the analysis of "Measuring Wikipedia Article Quality in One Dimension by Extending ORES with Ordinal Regression", published in OpenSym 2021 (link to come). The paper introduces a method for transforming scores from the ORES quality models into a single-dimensional measure of quality, amenable to statistical analysis and well calibrated to a dataset. The purpose is to improve the validity of research into article quality through more precise measurement. The code and data for replicating the paper are found in this Dataverse repository. If you wish to use the method on a new dataset, you should obtain the actively maintained version of the code from this git repository. If you attempt to replicate part of this repository, please let me know via an email to nathante@uw.edu.

    Replicating the Analysis from the OpenSym Paper

    This project analyzes a sample of articles with quality labels from the English Wikipedia XML dumps from March 2020. Copies of the dumps are not provided in this dataset; they can be obtained via https://dumps.wikimedia.org/. Everything else you need to replicate the project (other than a sufficiently powerful computer) should be available here. The project is organized into stages, and the prerequisite data files are provided at each stage, so you do not need to rerun the entire pipeline from the beginning, which is not easily done without a high-performance computer. If you start replicating at an intermediate stage, this should overwrite the inputs to the downstream stages, which should make it easier to verify a partial replication. To help manage the size of the dataverse, all code files are included in code.tar.gz. Extracting this with tar xzvf code.tar.gz is the first step.

    Getting Set Up

    You need a version of R >= 4.0 and a version of Python >= 3.7.8. You also need a bash shell, tar, gzip, and make installed, as they should be on any Unix system. To install brms you need a working C++ compiler; if you run into trouble, see the instructions for installing Rstan. The datasets were built on CentOS 7, except for the ORES scoring, which was done on Ubuntu 18.04.5, and the building, which was done on Debian 9. The RemembR and pyRembr projects provide simple tools for saving intermediate variables when building papers with LaTeX. First, extract the articlequality.tar.gz, RemembR.tar.gz and pyRembr.tar.gz archives. Then, install the following:

    Python Packages

    Running the following steps in a new Python virtual environment is strongly recommended. Run pip3 install -r requirements.txt to install the Python dependencies. Then navigate into the pyRembr directory and run python3 setup.py install.

    R Packages

    Run Rscript install_requirements.R to install the necessary R libraries. If you run into trouble installing brms, see the instructions for installing Rstan above.

    Drawing a Sample of Labeled Articles

    I provide steps and intermediate data files for replicating the sampling of labeled articles. The steps in this section are quite computationally intensive; those only interested in replicating the models and analyses should skip this section.

    Extracting Metadata from Wikipedia Dumps

    Metadata from the Wikipedia dumps is required for calibrating models to the revision and article levels of analysis. You can use the wikiq Python script from the mediawiki dump tools git repository to extract metadata from the XML dumps as TSV files. The version of wikiq that was used is provided here. Running wikiq on a full dump of English Wikipedia in a reasonable amount of time requires considerable computing resources. For this project, wikiq was run on Hyak, a high-performance computer at the University of Washington. The code for doing so is highly specific to Hyak; for transparency, and in case it helps others using similar academic computers, this code is included in WikiqRunning.tar.gz. A copy of the wikiq output is included in this dataset in the multi-part archive enwiki202003-wikiq.tar.gz. To extract this archive, download all the parts and then run cat enwiki202003-wikiq.tar.gz* > enwiki202003-wikiq.tar.gz && tar xvzf enwiki202003-wikiq.tar.gz.

    Obtaining Quality Labels for Articles

    We obtain up-to-date labels for each article using the articlequality Python package included in articlequality.tar.gz. The XML dumps are also the input to this step, and while it does not require a great deal of memory, a powerful computer (we used 28 cores) is helpful so that it completes in a reasonable amount of time. extract_quality_labels.sh runs the command to extract the labels from the XML dumps. The resulting files have the format data/enwiki-20200301-pages-meta-history*.xml-p*.7z_article_labelings.json and are included in this dataset in the archive enwiki202003-article_labelings-json.tar.gz.

    Taking a Sample of Quality Labels

    I used Apache Spark to merge the metadata from wikiq with the quality labels and to draw a sample of articles where each quality class is equally represented. To...
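    To make the paper's core idea concrete: ORES emits a probability for each ordinal quality class (Stub through FA on English Wikipedia), and the method collapses these into a single dimension. The Python sketch below shows a naive equally-spaced weighting purely as an illustration; the paper's contribution is precisely to calibrate such weights with ordinal regression rather than assume them.

    # Illustrative only: equally spaced class weights, not the paper's
    # calibrated ordinal-regression procedure.
    ORES_CLASSES = ["Stub", "Start", "C", "B", "GA", "FA"]

    def one_dim_quality(probs: dict) -> float:
        """Collapse ORES class probabilities into one quality score."""
        return sum(i * probs[c] for i, c in enumerate(ORES_CLASSES))

    example = {"Stub": 0.05, "Start": 0.15, "C": 0.40,
               "B": 0.25, "GA": 0.10, "FA": 0.05}
    print(round(one_dim_quality(example), 2))  # 2.35 on a 0-5 scale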

  15. Polish Wikipedia articles with "Cite web" templates linking to celebrity...

    • figshare.com
    txt
    Updated Dec 9, 2018
    Cite
    Krzysztof Jasiutowicz (2018). Polish Wikipedia articles with "Cite web" templates linking to celebrity gossip blogs and websites [Dataset]. http://doi.org/10.6084/m9.figshare.7441154.v1
    Explore at:
    txt (available download formats)
    Dataset updated
    Dec 9, 2018
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Krzysztof Jasiutowicz
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Polish Wikipedia articles with "Cite web" templates linking to celebrity gossip blogs and websites.

  16. Wikipedia Web Traffic 2018-19

    • kaggle.com
    Updated Apr 12, 2021
    Cite
    san_bt (2021). Wikipedia Web Traffic 2018-19 [Dataset]. https://www.kaggle.com/datasets/sandeshbhat/wikipedia-web-traffic-201819/versions/1
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Apr 12, 2021
    Dataset provided by
    Kaggle
    Authors
    san_bt
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Context

    • Time series: a set of observations recorded at regular intervals of time. Time series analysis can be beneficial in many fields, such as stock-market prediction and weather forecasting, and accounts for the fact that data points taken over time may have an internal structure (such as autocorrelation, trend or seasonal variation) that should be modelled.

    • Web traffic: the amount of data sent and received by visitors to a website. Sites monitor incoming and outgoing traffic to see which parts or pages of their site are popular and whether there are any apparent trends, such as one specific page being viewed mostly by people in a particular country.

    Content

    Contains Page Views for 60k Wikipedia articles in 8 different languages taken on a daily basis for 2 years.

    [Figure: Data Science Life Cycle (DSLC)]

    A Data Science Life Cycle can be used to structure a project. Forecasting can be done for any interval, provided a sufficient dataset is available. Refer to the GitHub link in the tasks to view the forecast done using ARIMA and Prophet. Feel free to contribute; several other models, including neural networks, could be used to improve the results considerably.
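    As a pointer to what such a forecast looks like in code, here is a minimal ARIMA sketch using statsmodels, assuming a daily page-view series for a single article; the file name, column names and the order (7, 1, 1) are illustrative assumptions, not values tuned to this dataset:

    import pandas as pd
    from statsmodels.tsa.arima.model import ARIMA

    # Hypothetical file/column names; adapt to the actual CSV layout.
    views = pd.read_csv("wikipedia_web_traffic.csv", parse_dates=["date"])
    series = views.set_index("date")["page_views"].asfreq("D")

    model = ARIMA(series, order=(7, 1, 1))  # AR lags spanning a week, one difference
    fit = model.fit()
    print(fit.forecast(steps=30))           # 30-day-ahead forecast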

    Acknowledgements

    Credits:
    1. Wikipedia
    2. Google

  17. Data from: Robust clustering of languages across Wikipedia growth

    • data.niaid.nih.gov
    • datadryad.org
    zip
    Updated Sep 19, 2017
    Cite
    Kristina Ban; Matjaž Perc; Zoran Levnajić (2017). Robust clustering of languages across Wikipedia growth [Dataset]. http://doi.org/10.5061/dryad.sk0q2
    Explore at:
    zip (available download formats)
    Dataset updated
    Sep 19, 2017
    Dataset provided by
    University of Maribor
    Faculty of Information Studies, Ljubljanska cesta 31A, 8000 Novo Mesto, Slovenia
    Authors
    Kristina Ban; Matjaž Perc; Zoran Levnajić
    License

    https://spdx.org/licenses/CC0-1.0.html

    Description

    Wikipedia is the largest existing knowledge repository, and it is growing through genuine crowdsourcing support. While the English Wikipedia is the most extensive and most researched edition, with over 5 million articles, comparatively little is known about the behaviour and growth of the remaining 283 smaller Wikipedias, the smallest of which, Afar, has only one article. Here, we use a subset of these data, consisting of 14,962 different articles, each of which exists in 26 different languages, from Arabic to Ukrainian. We study the growth of the Wikipedias in these languages over a time span of 15 years. We show that, while an average article follows a random path from one language to another, there exist six well-defined clusters of Wikipedias that share common growth patterns. The make-up of these clusters is remarkably robust against the method used for their determination, as we verify via four different clustering methods. Interestingly, the identified Wikipedia clusters have little correlation with language families and groups. Rather, the growth of Wikipedia across different languages is governed by different factors, ranging from similarities in culture to information literacy.
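    As an illustration of the kind of analysis this dataset supports (not the authors' exact procedure, and using stand-in data), one can normalize each language's growth curve and cluster the curves, for example with k-means:

    import numpy as np
    from sklearn.cluster import KMeans

    # Stand-in data: 26 languages x 180 monthly cumulative article counts;
    # the real curves come from the archived dataset.
    rng = np.random.default_rng(0)
    growth = np.cumsum(rng.random((26, 180)), axis=1)

    # Normalize each curve so clustering compares shape rather than size.
    X = (growth - growth.mean(axis=1, keepdims=True)) / growth.std(axis=1, keepdims=True)
    labels = KMeans(n_clusters=6, n_init=10, random_state=0).fit_predict(X)
    print(labels)  # cluster assignment per language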

  18. Leading methods of data analytics application in M&A in the U.S. 2018

    • statista.com
    Updated Jul 9, 2025
    Cite
    Statista (2025). Leading methods of data analytics application in M&A in the U.S. 2018 [Dataset]. https://www.statista.com/statistics/943048/methods-of-data-analytics-application-in-manda-usa/
    Explore at:
    Dataset updated
    Jul 9, 2025
    Dataset authored and provided by
    Statista (http://statista.com/)
    Time period covered
    2018
    Area covered
    United States
    Description

    This statistic presents the leading methods of data analytics application in the mergers and acquisitions sector in the United States in 2018. At that time, ** percent of executives surveyed were using data analytics on customers and markets.

  19. wikipedia.nl Website Traffic, Ranking, Analytics [July 2025]

    • stb2.digiseotools.com
    Updated Aug 12, 2025
    Cite
    Semrush (2025). wikipedia.nl Website Traffic, Ranking, Analytics [July 2025] [Dataset]. https://stb2.digiseotools.com/website/wikipedia.nl/overview/
    Explore at:
    Dataset updated
    Aug 12, 2025
    Dataset authored and provided by
    Semrush (https://fr.semrush.com/)
    License

    https://sem1.theseowheel.com/company/legal/terms-of-service/

    Time period covered
    Aug 12, 2025
    Area covered
    Worldwide
    Variables measured
    visits, backlinks, bounceRate, pagesPerVisit, authorityScore, organicKeywords, avgVisitDuration, referringDomains, trafficByCountry, paidSearchTraffic, and 3 more
    Measurement technique
    Semrush Traffic Analytics; Click-stream data
    Description

    wikipedia.nl is ranked #6981 in NL with 105.52K Traffic. Learn more about website traffic, market share, and more!

  20. Citations with identifiers in Wikipedia

    • figshare.com
    application/gzip
    Updated May 30, 2023
    Cite
    Aaron Halfaker; Bahodir Mansurov; Miriam Redi; Dario Taraborelli (2023). Citations with identifiers in Wikipedia [Dataset]. http://doi.org/10.6084/m9.figshare.1299540.v1
    Explore at:
    application/gzip (available download formats)
    Dataset updated
    May 30, 2023
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Aaron Halfaker; Bahodir Mansurov; Miriam Redi; Dario Taraborelli
    License

    CC0 1.0 Universal Public Domain Dedication, https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    This dataset includes a list of citations with identifiers extracted from the most recent version of Wikipedia across all language editions. The data was parsed from the Wikipedia content dumps published on March 1, 2018.

    License
    All files included in this dataset are released under CC0: https://creativecommons.org/publicdomain/zero/1.0/

    Projects
    Previous versions of this dataset ("Scholarly citations in Wikipedia") were limited to the English language edition. The current version includes one dataset for each of the 298 language editions that Wikipedia supports as of March 2018. Projects are identified by their ISO 639-1/639-2 language code, per https://meta.wikimedia.org/wiki/List_of_Wikipedias.

    Identifiers
    • PubMed IDs (pmid) and PubMedCentral IDs (pmcid)
    • Digital Object Identifiers (doi)
    • International Standard Book Numbers (isbn)
    • arXiv IDs (arxiv)

    Format
    Each row in the dataset represents a citation as a (Wikipedia article, cited source) pair. Metadata about when the citation was first added is included.
    • page_id -- The identifier of the Wikipedia article (int), e.g. 1325125
    • page_title -- The title of the Wikipedia article (utf-8), e.g. Club cell
    • rev_id -- The Wikipedia revision where the citation was first added (int), e.g. 282470030
    • timestamp -- The timestamp of the revision where the citation was first added (ISO 8601 datetime), e.g. 2009-04-08T01:52:20Z
    • type -- The type of identifier, e.g. pmid
    • id -- The id of the cited source (utf-8), e.g. 18179694

    Source code
    https://github.com/halfak/Extract-scholarly-article-citations-from-Wikipedia (MIT licensed)
    A copy of this dataset is also available at https://analytics.wikimedia.org/datasets/archive/public-datasets/all/mwrefs/

    Notes
    Citation identifiers are extracted as-is from Wikipedia article content. Our spot-checking suggests that 98% of identifiers resolve.
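    Given the documented row format, loading and filtering one language edition is straightforward with pandas. A minimal sketch, assuming headerless TSV files named by language code (check the archive for the actual file naming and whether a header row is present):

    import pandas as pd

    # Column names as documented above; drop names= if a header row exists.
    cites = pd.read_csv(
        "enwiki.tsv", sep="\t",
        names=["page_id", "page_title", "rev_id", "timestamp", "type", "id"],
    )
    dois = cites[cites["type"] == "doi"]
    print(dois.groupby("page_title").size().sort_values(ascending=False).head())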
