4 datasets found
  1. A meta analysis of Wikipedia's coronavirus sources during the COVID-19 pandemic

    • live.european-language-grid.eu
    • zenodo.org
    txt
    Updated Sep 8, 2022
    Cite
    (2022). A meta analysis of Wikipedia's coronavirus sources during the COVID-19 pandemic [Dataset]. https://live.european-language-grid.eu/catalogue/corpus/7806
    Available download formats
    txt
    Dataset updated
    Sep 8, 2022
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    At the height of the coronavirus pandemic, on the last day of March 2020, Wikipedia in all languages broke its record for the most traffic in a single day. Since the outbreak of the COVID-19 pandemic at the start of January, tens if not hundreds of millions of people have come to Wikipedia to read, and in some cases also contribute, knowledge, information and data about the virus in an ever-growing pool of articles. Our study focuses on the scientific backbone behind the content people across the world read: which sources informed Wikipedia's coronavirus content, and how was the scientific research in this field represented on Wikipedia? Using citations as a readout, we try to map how COVID-19 related research was used in Wikipedia and analyse what happened to it before and during the pandemic. Understanding how scientific and medical information was integrated into Wikipedia, and which sources informed the COVID-19 content, is key to understanding the digital knowledge ecosphere during the pandemic.

    To delimit the corpus of Wikipedia articles containing Digital Object Identifiers (DOIs), we applied two different strategies. First, we scraped every Wikipedia page from the COVID-19 Wikipedia project (about 3,000 pages) and filtered them to keep only pages containing DOI citations. For our second strategy, we ran a EuroPMC search on COVID-19, SARS-CoV-2 and SARS-nCoV19 (about 30,000 scientific papers, reviews and preprints), selected the scientific papers from 2019 onwards, and compared them to the citations extracted from the English Wikipedia dump of May 2020 (about 2,000,000 DOIs). This search yielded 231 Wikipedia articles that contained at least one citation from the EuroPMC search or belonged to the Wikipedia COVID-19 project pages containing DOIs. Next, from this corpus of 231 Wikipedia articles we extracted DOIs, PMIDs, ISBNs, websites and URLs using a set of regular expressions. Subsequently, we computed several statistics for each Wikipedia article and retrieved Altmetric, CrossRef and EuroPMC information for each DOI. Finally, our method produced annotated citation tables and the information extracted from each Wikipedia article, such as books, websites and newspapers.

    Files used as input and the information extracted on Wikipedia's COVID-19 sources are presented in this archive. See the WikiCitationHistoRy GitHub repository for the R code and the other bash/Python script utilities related to this project.
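    The extraction step described above pulls DOIs, PMIDs and URLs out of Wikipedia article source with regular expressions; the authors' actual code lives in the WikiCitationHistoRy repository. The following is only a minimal Python sketch of that idea, with illustrative patterns and a made-up citation string, not the project's own implementation:

    import re

    # Illustrative patterns only; the WikiCitationHistoRy repository defines its own.
    DOI_RE = re.compile(r'\b10\.\d{4,9}/[^\s|}\]"<>]+')
    PMID_RE = re.compile(r'\bpmid\s*=\s*(\d{1,9})', re.IGNORECASE)
    URL_RE = re.compile(r'https?://[^\s|}\]"<>]+')

    def extract_identifiers(wikitext):
        """Collect DOI, PMID and URL strings found in raw Wikipedia article source."""
        return {
            "dois": sorted(set(DOI_RE.findall(wikitext))),
            "pmids": sorted(set(PMID_RE.findall(wikitext))),
            "urls": sorted(set(URL_RE.findall(wikitext))),
        }

    # Hypothetical citation template, not taken from the dataset itself.
    sample = '{{cite journal |doi=10.1000/exampledoi |pmid=12345678 |url=https://example.org/paper}}'
    print(extract_identifiers(sample))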

  2. Wikipedia Articles Dataset

    • opendatabay.com
    Updated May 25, 2025
    Cite
    Bright Data (2025). Wikipedia Articles Dataset [Dataset]. https://www.opendatabay.com/data/premium/b6292674-e94d-4a7e-93c0-00cf1474ffdd
    Dataset updated
    May 25, 2025
    Dataset authored and provided by
    Bright Data
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Area covered
    Data Science and Analytics
    Description

    Access a wealth of information, including article titles, raw text, images, and structured references. Popular use cases include knowledge extraction, trend analysis, and content development.

    Use our Wikipedia Articles dataset to access a vast collection of articles across a wide range of topics, from history and science to culture and current events. This dataset offers structured data on articles, categories, and revision histories, enabling deep analysis into trends, knowledge gaps, and content development.

    Tailored for researchers, data scientists, and content strategists, this dataset allows for in-depth exploration of article evolution, topic popularity, and interlinking patterns. Whether you are studying public knowledge trends, performing sentiment analysis, or developing content strategies, the Wikipedia Articles dataset provides a rich resource to understand how information is shared and consumed globally.

    Dataset Features
    - url: Direct URL to the original Wikipedia article.
    - title: The title or name of the Wikipedia article.
    - table_of_contents: A list or structure outlining the article's sections and hierarchy.
    - raw_text: Unprocessed full text content of the article.
    - cataloged_text: Cleaned and structured version of the article’s content, optimized for analysis.
    - images: Links or data on images embedded in the article.
    - see_also: Related articles linked under the “See Also” section.
    - references: Sources cited in the article for credibility.
    - external_links: Links to external websites or resources mentioned in the article.
    - categories: Tags or groupings classifying the article by topic or domain.
    - timestamp: Last edit date or revision time of the article snapshot.

    Distribution
    - Data Volume: 11 columns and 2.19M rows
    - Format: CSV
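    As a minimal sketch of consuming the distribution above (column names taken from the Dataset Features list; the file name is hypothetical and the actual delivered layout may differ):

    import pandas as pd

    # Columns as listed under Dataset Features; the delivered CSV may differ.
    EXPECTED_COLUMNS = [
        "url", "title", "table_of_contents", "raw_text", "cataloged_text",
        "images", "see_also", "references", "external_links", "categories", "timestamp",
    ]

    # Hypothetical file name for a delivered snapshot.
    df = pd.read_csv("wikipedia_articles_snapshot.csv")

    missing = set(EXPECTED_COLUMNS) - set(df.columns)
    print(f"Rows: {len(df):,}; missing expected columns: {missing or 'none'}")
    print(df[["title", "timestamp"]].head())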

    Usage
    This dataset supports a wide range of applications:
    - Knowledge Extraction: Identify key entities, relationships, or events from Wikipedia content.
    - Content Strategy & SEO: Discover trending topics and content gaps.
    - Machine Learning: Train NLP models (e.g., summarisation, classification, QA systems).
    - Historical Trend Analysis: Study how public interest in topics changes over time.
    - Link Graph Modeling: Understand how information is interconnected.
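    For the Link Graph Modeling use case, a hedged sketch (reusing the hypothetical snapshot file above and assuming see_also is stored as a JSON-encoded list of titles, which the feature list does not guarantee):

    import json

    import networkx as nx
    import pandas as pd

    # Reuse the hypothetical snapshot file from the previous sketch.
    df = pd.read_csv("wikipedia_articles_snapshot.csv", usecols=["title", "see_also"])

    graph = nx.DiGraph()
    for title, see_also in zip(df["title"], df["see_also"].fillna("[]")):
        # Each "See Also" entry becomes a directed edge from the article to the related article.
        for target in json.loads(see_also):
            graph.add_edge(title, target)

    print(f"{graph.number_of_nodes():,} articles, {graph.number_of_edges():,} see-also links")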

    Coverage
    - Geographic Coverage: Global (multi-language Wikipedia versions also available)
    - Time Range: Continuous updates; snapshots available from early 2000s to present.

    License

    CUSTOM

    Please review the respective licenses below:

    1. Data Provider's License

    Who Can Use It
    - Data Scientists: For training or testing NLP and information retrieval systems.
    - Researchers: For computational linguistics, social science, or digital humanities.
    - Businesses: To enhance AI-powered content tools or customer insight platforms.
    - Educators/Students: For building projects, conducting research, or studying knowledge systems.

    Suggested Dataset Names
    1. Wikipedia Corpus+
    2. Wikipedia Stream Dataset
    3. Wikipedia Knowledge Bank
    4. Open Wikipedia Dataset

    Pricing

    Based on Delivery frequency

    ~Up to $0.0025 per record. Min order $250

    Approximately 283 new records are added each month. Approximately 1.12M records are updated each month. Get the complete dataset each delivery, including all records. Retrieve only the data you need with the flexibility to set Smart Updates.
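    A quick arithmetic reading of the pricing above (figures from this listing only):

    # Minimum order implied by the listed pricing: $250 minimum at up to $0.0025 per record.
    min_order_usd = 250
    max_price_per_record = 0.0025
    print(f"Records covered by the minimum order: {min_order_usd / max_price_per_record:,.0f}")  # 100,000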

    • Monthly

    New snapshot each month, 12 snapshots/year Paid monthly

    • Quarterly

    New snapshot each quarter, 4 snapshots/year Paid quarterly

    • Bi-annual

    New snapshot every 6 months, 2 snapshots/year Paid twice-a-year

    • One-time purchase

    New snapshot one-time delivery Paid once

  3. Enterprise Wiki Software Report

    • datainsightsmarket.com
    doc, pdf, ppt
    Updated May 1, 2025
    Cite
    Data Insights Market (2025). Enterprise Wiki Software Report [Dataset]. https://www.datainsightsmarket.com/reports/enterprise-wiki-software-1437994
    Available download formats
    doc, pdf, ppt
    Dataset updated
    May 1, 2025
    Dataset authored and provided by
    Data Insights Market
    License

    https://www.datainsightsmarket.com/privacy-policy

    Time period covered
    2025 - 2033
    Area covered
    Global
    Variables measured
    Market Size
    Description

    The Enterprise Wiki Software market is experiencing robust growth, driven by the increasing need for efficient knowledge management and collaboration within organizations of all sizes. The market, valued at approximately $2.5 billion in 2025, is projected to exhibit a Compound Annual Growth Rate (CAGR) of 15% from 2025 to 2033, reaching an estimated market value of $8 billion by 2033.

    This expansion is fueled by several key factors. Firstly, the rise of remote and hybrid work models necessitates improved internal communication and knowledge sharing, making enterprise wiki software a crucial tool for maintaining productivity and organizational alignment. Secondly, the increasing adoption of cloud-based solutions offers scalability, accessibility, and cost-effectiveness, further boosting market growth. Furthermore, the integration of advanced features like AI-powered search, version control, and robust security measures enhances the overall value proposition for businesses. While initial implementation costs and the need for comprehensive training can act as restraints, the long-term benefits in terms of improved employee productivity, reduced operational costs, and enhanced knowledge accessibility are compelling organizations to overcome these challenges.

    The market segmentation reveals significant demand from both large enterprises and SMEs, with cloud-based solutions gaining traction due to their flexibility and affordability. North America currently holds the largest market share, followed by Europe and Asia Pacific. However, emerging economies in Asia Pacific are poised for significant growth in the coming years due to increasing digital adoption and the expansion of businesses in this region. The competitive landscape is dynamic, with established players like Atlassian and Zoho competing with emerging niche players offering specialized features. Success in this market depends on delivering user-friendly interfaces, robust security features, seamless integration with existing enterprise systems, and a strong focus on customer support and ongoing product development. The market is expected to witness further consolidation as companies strive for market leadership through strategic partnerships, acquisitions, and product innovation.
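    As a quick check that the headline figures above are mutually consistent (numbers taken from the report summary; rounding assumed):

    # $2.5B base in 2025 compounded at the stated 15% CAGR over the 8 years to 2033.
    base_2025_busd = 2.5
    cagr = 0.15
    years = 2033 - 2025

    projected_2033_busd = base_2025_busd * (1 + cagr) ** years
    print(f"Projected 2033 market size: ${projected_2033_busd:.2f}B")  # ~$7.65B, close to the cited ~$8B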

  4. Total Ecommerce Software by Category

    • aftership.com
    Updated Dec 5, 2023
    Cite
    AfterShip (2023). Total Ecommerce Software by Category [Dataset]. https://www.aftership.com/ecommerce/statistics
    Dataset updated
    Dec 5, 2023
    Dataset authored and provided by
    AfterShip (https://www.aftership.com/)
    License

    Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
    License information was derived automatically

    Description

    The largest share of software solutions is found in the Parcel Tracking sector, which encompasses 322 tools, or 12.44% of the total. This leading position indicates the sector's reliance on specialized eCommerce software. In the CMS category, 295 software solutions are in use, accounting for 11.40% of the overall software distribution, highlighting the importance of digital tools in this area. The Analytics sector also shows significant software usage, with 198 solutions making up 7.65% of the total, reflecting the sector's adoption of technology for eCommerce activities. This distribution underscores the varying degrees of technology integration across categories, revealing how each sector leverages eCommerce software to enhance its operations and customer engagement.
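    The counts and percentages above imply an overall tool count; a quick consistency check (figures from this description only):

    # Back out the implied total from the Parcel Tracking figures, then re-check the other shares.
    parcel_tracking_tools = 322          # stated as 12.44% of all tools
    implied_total = parcel_tracking_tools / 0.1244
    print(f"Implied total tools: {implied_total:,.0f}")       # ~2,589

    for category, count in [("CMS", 295), ("Analytics", 198)]:
        print(f"{category}: {count / implied_total:.2%}")     # ~11.40% and ~7.65%, matching the text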

