100+ datasets found
  1. COVID-19 Open Research Dataset

    • datacatalog.library.wayne.edu
    Updated Mar 31, 2020
    Cite
    Allen Institute for Artificial Intelligence (2020). COVID-19 Open Research Dataset [Dataset]. https://datacatalog.library.wayne.edu/dataset/covid-19-open-research-dataset
    Dataset updated
    Mar 31, 2020
    Dataset provided by
    Allen Institute for Artificial Intelligence
    Description

    The COVID-19 Open Research Dataset is an extensive machine-readable resource of over 45,000 scholarly articles, including over 33,000 with full text, about COVID-19 and the coronavirus family of viruses for use by the global research community. This dataset is intended to mobilize researchers to apply recent advances in natural language processing to generate new insights in support of the fight against this infectious disease.

    The dataset is updated weekly and contains all COVID-19 and coronavirus-related research (e.g., SARS, MERS) from the following sources: PubMed's PMC open access corpus (using this query: COVID-19 and coronavirus research), additional COVID-19 research articles from a corpus maintained by the World Health Organization (WHO), and bioRxiv and medRxiv pre-prints (using this query: COVID-19 and coronavirus research). Also available is a comprehensive metadata file of 44,000 coronavirus and COVID-19 research articles with links to PubMed, Microsoft Academic, and the WHO COVID-19 database of publications (includes articles without open access full text).

  2. Data from: cord19

    • huggingface.co
    Updated Apr 15, 2020
    Cite
    Ai2 (2020). cord19 [Dataset]. https://huggingface.co/datasets/allenai/cord19
    Dataset updated
    Apr 15, 2020
    Dataset provided by
    Allen Institute for AI (http://allenai.org/)
    Authors
    Ai2
    License

    Attribution-NoDerivs 4.0 (CC BY-ND 4.0): https://creativecommons.org/licenses/by-nd/4.0/
    License information was derived automatically

    Description

    The Covid-19 Open Research Dataset (CORD-19) is a growing resource of scientific papers on Covid-19 and related historical coronavirus research. CORD-19 is designed to facilitate the development of text mining and information retrieval systems over its rich collection of metadata and structured full text papers. Since its release, CORD-19 has been downloaded over 75K times and has served as the basis of many Covid-19 text mining and discovery systems.

    The dataset itself does not define a specific task, but a Kaggle challenge defines 17 open research questions to be solved with the dataset: https://www.kaggle.com/allen-institute-for-ai/CORD-19-research-challenge/tasks

  3. Dataset for the DIssemination of REgistered COVID-19 Clinical Trials...

    • maia-sh.github.io
    csv
    Updated Jun 30, 2020
    Cite
    Maia Salholz-Hillel; Nicholas J. DeVito; Peter Grabitz (2020). Dataset for the DIssemination of REgistered COVID-19 Clinical Trials (DIRECCT) Study [Dataset]. https://maia-sh.github.io/direcct-data/
    Available download formats: csv
    Dataset updated
    Jun 30, 2020
    Dataset provided by
    The DataLab, Nuffield Department of Primary Care Health Sciences, University of Oxford, Oxford, United Kingdom
    QUEST Center for Transforming Biomedical Research, Berlin Institute of Health (BIH) at Charité – Universitätsmedizin Berlin, Berlin, Germany
    Authors
    Maia Salholz-Hillel; Nicholas J. DeVito; Peter Grabitz
    Time period covered
    Jan 1, 2020 - Jun 30, 2020
    Variables measured
    id, doi, trn, url, pmid, n_trn, phase, source, cord_id, is_dupe, and 45 more
    Dataset funded by
    German Bundesministerium für Bildung und Forschung (BMBF)
    Description

    The DIRECCT study is a multi-phase, living examination of clinical trial results dissemination throughout the COVID-19 pandemic. This dataset contains trials, registrations, and results from Phase 1 of the project, examining trials completed during the first six months of the pandemic (i.e., through 30 June 2020). This dataset is provided as a relational database of three CSVs which can be joined on the id column. Data was collected using a combination of automated and manual strategies; automated searches were performed on 30 June 2020, and manual searches were performed between 21 October 2020 and 18 January 2021. Data sources for trials and registrations include the World Health Organization (WHO) International Clinical Trials Registry Platform (ICTRP) list of registered COVID-19 studies, individual clinical trial registries, and the COVID-19 TrialsTracker (https://covid19.trialstracker.net/). Data sources for results include COVID-19 Open Research Dataset Challenge (CORD-19), PubMed, EuropePMC, Google Scholar, and Google. Additional information on the project is available at the project's OSF page: http://doi.org/10.17605/osf.io/5f8j2
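
The description says the three CSVs join on the id column. A minimal sketch of such a join follows, using only the Python standard library. The in-memory CSV text, the chosen columns (phase, trn, doi) and all values are made-up placeholders for illustration; the real files may be named and laid out differently.

```python
import csv
import io
from collections import defaultdict

# Made-up stand-ins for the trials, registrations, and results CSVs.
# Column choices and every value below are illustrative assumptions.
TRIALS_CSV = "id,phase\ntri001,2\ntri002,3\n"
REGISTRATIONS_CSV = (
    "id,trn\n"
    "tri001,EXAMPLE-TRN-A\n"
    "tri001,EXAMPLE-TRN-B\n"
    "tri002,EXAMPLE-TRN-C\n"
)
RESULTS_CSV = "id,doi\ntri001,10.0000/example.1\n"


def read_rows(text):
    """Parse CSV text into a list of dicts keyed by the header row."""
    return list(csv.DictReader(io.StringIO(text)))


def group_by_id(rows):
    """Index rows by their id value; one id may map to several rows."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[row["id"]].append(row)
    return grouped


def join_on_id(trials, registrations, results):
    """Attach every registration and result row to its trial via id."""
    regs = group_by_id(registrations)
    res = group_by_id(results)
    joined = []
    for trial in trials:
        trial_id = trial["id"]
        joined.append({
            "id": trial_id,
            "phase": trial["phase"],
            "trns": [r["trn"] for r in regs.get(trial_id, [])],
            "dois": [r["doi"] for r in res.get(trial_id, [])],
        })
    return joined


joined = join_on_id(
    read_rows(TRIALS_CSV),
    read_rows(REGISTRATIONS_CSV),
    read_rows(RESULTS_CSV),
)
for record in joined:
    print(record)
```

Grouping by id before joining keeps one record per trial even when a trial has several registrations or no results yet.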

  4. Toolkit and Curated Archive for COVID-19 Research Challenge Dataset

    • datasets.ai
    • catalog.data.gov
    Updated Mar 11, 2021
    Cite
    National Institute of Standards and Technology (2021). Toolkit and Curated Archive for COVID-19 Research Challenge Dataset [Dataset]. https://datasets.ai/datasets/toolkit-and-curated-archive-for-covid-19-research-challenge-dataset-18091
    Dataset updated
    Mar 11, 2021
    Dataset authored and provided by
    National Institute of Standards and Technology (http://www.nist.gov/)
    Description

    This GitHub repository contains a downloadable snapshot of the National Institute of Standards and Technology's COVID-19 Data Repository, curated from the COVID-19 Open Research Dataset (CORD-19) provided by the Allen Institute for AI. The COVID-19 Data Repository provides searchable CORD-19 data and metadata, including full text extracted from the original CORD-19 JavaScript Object Notation (JSON) files. It is built using the Configurable Data Curation System (CDCS) developed at NIST.

  5. Data from: COVID-19++: A Citation-Aware Covid-19 Dataset for the Analysis of...

    • zenodo.org
    zip
    Updated Sep 27, 2021
    Cite
    Lukas Galke; Lisa Langnickel; Gavin Lüdemann; Tetyana Melnychuk; Eva Seidlmayer; Konrad U. Förstner; Carsten Schultz; Klaus Tochtermann (2021). COVID-19++: A Citation-Aware Covid-19 Dataset for the Analysis of Research Dynamics [Dataset]. http://doi.org/10.5281/zenodo.5531084
    Available download formats: zip
    Dataset updated
    Sep 27, 2021
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Lukas Galke; Lisa Langnickel; Gavin Lüdemann; Tetyana Melnychuk; Eva Seidlmayer; Konrad U. Förstner; Carsten Schultz; Klaus Tochtermann
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    COVID-19++ is a citation-aware COVID-19 dataset for the analysis of research dynamics. In addition to primary COVID-19 related articles and preprints from 2020, it includes citations and the metadata of first-order cited work. All publications are annotated with MeSH terms, either from the ground truth, or via ConceptMapper, if no ground truth was available.

    The data is organized in CSV files:

    - Paper metadata (paper_id, publdate, title, data_source): paper.csv
    - Annotation data, mapping paper_id to MeSH terms: annotation.csv
    - Authorship data, mapping paper_id to author, optionally with ORCID: authorship.csv
    - Paired DOIs of citing and cited papers: references.csv

    The data_source column in the paper metadata takes the value KE (for metadata from ZB MED KE), PP (for preprints), or CR (for cited resources from CrossRef).
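
To illustrate the paper metadata layout, here is a small sketch that counts papers by their data source code. It assumes paper.csv has a header row with exactly the four column names listed above; the sample rows are invented placeholders, not records from the dataset.

```python
import csv
import io
from collections import Counter

# Meaning of each source code, as given in the dataset description.
SOURCE_LABELS = {
    "KE": "metadata from ZB MED KE",
    "PP": "preprint",
    "CR": "cited resource from CrossRef",
}

# Invented rows standing in for paper.csv; the header layout is an assumption.
SAMPLE_PAPER_CSV = """paper_id,publdate,title,data_source
p1,2020-03-01,Example title A,KE
p2,2020-04-15,Example title B,PP
p3,2019-11-20,Example title C,CR
p4,2020-05-02,Example title D,PP
"""


def count_by_source(csv_text):
    """Count rows per data_source value."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return Counter(row["data_source"] for row in reader)


counts = count_by_source(SAMPLE_PAPER_CSV)
for code, n in sorted(counts.items()):
    print(f"{code} ({SOURCE_LABELS.get(code, 'unknown')}): {n}")
```

For the real file, the same function could be fed the text of paper.csv read from disk.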

    This work was supported by BMBF within the programme "Quantitative Wissenschaftsforschung" under grant numbers 01PU17013A, 01PU17013B, 01PU17013C.

  6. CORD-19 fastText Vectors

    • kaggle.com
    zip
    Updated Jul 7, 2020
    Cite
    David Mezzetti (2020). CORD-19 fastText Vectors [Dataset]. https://www.kaggle.com/datasets/davidmezzetti/cord19-fasttext-vectors
    Available download formats: zip (1718387142 bytes)
    Dataset updated
    Jul 7, 2020
    Authors
    David Mezzetti
    License

    Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
    License information was derived automatically

    Description

    Content

    300-dimensional fastText vectors built against the COVID-19 Open Research Dataset (CORD-19) with minCount=3.

    Only alphanumeric strings were processed; each string had to contain at least 1 alpha character and be at least 2 characters long.

    The following stop words were also not processed:

    • a, an, and, are, as, at, be, but, by, for, if, in, into, is, it, no, not, of, on, or, such, that, the, their, then, there, these, they, this, to, was, will, with
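
The token rules above can be sketched as a small filter. This is an illustration, not the author's actual preprocessing code: the function name keep_token is hypothetical, matching stop words case-insensitively is an assumption, and Python's str.isalnum also accepts non-ASCII letters and digits, which the original pipeline may not have done.

```python
# Stop words copied from the list in the dataset description.
STOP_WORDS = frozenset(
    "a an and are as at be but by for if in into is it no not of on or such "
    "that the their then there these they this to was will with".split()
)


def keep_token(token: str) -> bool:
    """Return True if a token passes the rules stated in the description."""
    return (
        len(token) >= 2                          # at least 2 characters long
        and token.isalnum()                      # alphanumeric characters only
        and any(ch.isalpha() for ch in token)    # at least 1 alpha character
        and token.lower() not in STOP_WORDS      # assumption: case-insensitive
    )


sample = "The SARS-CoV-2 spike protein binds ACE2 in 2020".split()
kept = [token for token in sample if keep_token(token)]
print(kept)
```

With whitespace splitting, a hyphenated token such as "SARS-CoV-2" is rejected as non-alphanumeric; how the original tokenizer split such strings is not stated in the description.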

    Acknowledgements

    Banner Photo by martinsanchez on Unsplash

  7. COVID-19 Open Research Dataset (CORD-19)

    • scidm.nchc.org.tw
    Updated Oct 10, 2020
    Cite
    (2020). COVID-19 Open Research Dataset (CORD-19) [Dataset]. https://scidm.nchc.org.tw/dataset/covid-19-open-research-dataset-cord-19
    Dataset updated
    Oct 10, 2020
    Description

    A Free, Open Resource for the Global Research Community

    In response to the COVID-19 pandemic, the Allen Institute for AI has partnered with leading research groups to prepare and distribute the COVID-19 Open Research Dataset (CORD-19), a free resource of over 29,000 scholarly articles, including over 13,000 with full text, about COVID-19 and the coronavirus family of viruses for use by the global research community. This dataset is intended to mobilize researchers to apply recent advances in natural language processing to generate new insights in support of the fight against this infectious disease. The corpus will be updated weekly as new research is published in peer-reviewed publications and archival services like bioRxiv, medRxiv, and others.

    - Commercial use subset (includes PMC content): 9000 papers, 186Mb
    - Non-commercial use subset (includes PMC content): 1973 papers, 36Mb

  8. COVID-19 Open Research Dataset

    • scicrunch.org
    • rrid.site
    • +2 more
    Updated Aug 11, 2024
    Cite
    (2024). COVID-19 Open Research Dataset [Dataset]. http://identifiers.org/RRID:SCR_018336
    Dataset updated
    Aug 11, 2024
    Description

    A collection of scholarly articles about COVID-19 and the coronavirus family of viruses for use by the global research community. The dataset is updated on a weekly basis.

  9. COVID-19 Research Datasets: Meta-regression Analysis

    • figshare.manchester.ac.uk
    xlsx
    Updated May 30, 2023
    Cite
    David Schultz; Ling Tan (2023). COVID-19 Research Datasets: Meta-regression Analysis [Dataset]. http://doi.org/10.48420/16908415.v1
    Available download formats: xlsx
    Dataset updated
    May 30, 2023
    Dataset provided by
    University of Manchester
    Authors
    David Schultz; Ling Tan
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The datasets from this article:

    Tan, L., and D. M. Schultz, 2022: How is COVID-19 affected by weather? Meta-regression of 158 studies and recommendations for best practices in future research. Wea. Climate Soc., 14, 237–255, https://doi.org/10.1175/WCAS-D-21-0132.1.

  10. Impact on research during COVID-19 pandemic India 2020

    • statista.com
    Updated Jan 26, 2026
    Cite
    Statista (2026). Impact on research during COVID-19 pandemic India 2020 [Dataset]. https://www.statista.com/statistics/1290659/india-impact-on-research-during-covid-19-pandemic/
    Dataset updated
    Jan 26, 2026
    Dataset authored and provided by
    Statista (http://statista.com/)
    Time period covered
    2020
    Area covered
    India
    Description

    According to ** percent of the faculty, research funding in the South Asian country of India had decreased during the COVID-19 pandemic in 2020. About ** percent of the research faculty stated that international research tie-ups had also declined during the pandemic.

  11. CORD-19 Research papers Title embeddings

    • kaggle.com
    zip
    Updated Mar 29, 2020
    Cite
    Narasimha Prasanna HN (2020). CORD-19 Research papers Title embeddings [Dataset]. https://www.kaggle.com/narasimha1997/cord-19-title-embeddings
    Available download formats: zip (131990061 bytes)
    Dataset updated
    Mar 29, 2020
    Authors
    Narasimha Prasanna HN
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Context

    AllenNLP recently released the Open COVID-19 research dataset. This dataset contains 44k research papers related to coronavirus and other diseases. It poses many challenges for data scientists who explore the raw data to provide insights about the virus. BERT and other state-of-the-art models can be used to extract meaning from this raw data. Having word vector embeddings from BERT can ease many NLP tasks on the original data.

    Content

    The dataset is a single npz file that contains the title embeddings of 40k research papers. You can load it as a NumPy array and use it to perform NLP tasks such as semantic search, topic modelling, clustering, and QA.
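
As a rough sketch of the loading and semantic-search steps, the example below writes a small random npz file and then runs a cosine-similarity lookup over it. It needs the third-party NumPy package. The file name, the array key inside the archive, and the 5 x 8 shape are all placeholders; the real archive's key names and embedding dimension are not stated in the description.

```python
import os
import tempfile

import numpy as np


def cosine_top_k(embeddings, query, k=3):
    """Return (row index, cosine similarity) for the k rows closest to query."""
    norms = np.linalg.norm(embeddings, axis=1) * np.linalg.norm(query)
    scores = embeddings @ query / np.maximum(norms, 1e-12)
    order = np.argsort(-scores)[:k]
    return [(int(i), float(scores[i])) for i in order]


# Random stand-in for the real title embeddings (placeholder shape and key).
rng = np.random.default_rng(0)
fake_embeddings = rng.normal(size=(5, 8)).astype(np.float32)

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "title_embeddings.npz")  # hypothetical name
    np.savez(path, embeddings=fake_embeddings)

    # np.load returns an archive object for npz files; .files lists its keys.
    with np.load(path) as archive:
        key = archive.files[0]
        embeddings = archive[key]

# Use one stored title vector as the query; it should match itself first.
top = cosine_top_k(embeddings, embeddings[2], k=2)
print(key, embeddings.shape, top)
```

In practice the query vector would come from encoding a search phrase with the same model that produced the title embeddings.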

    Acknowledgements

    To planet Earth

  12. cord_19_inverness_all_v7

    • kaggle.com
    zip
    Updated Apr 15, 2020
    Cite
    Maciej Obarski (2020). cord_19_inverness_all_v7 [Dataset]. https://www.kaggle.com/mobarski/cord-19-inverness-all-v7
    Available download formats: zip (1848797460 bytes)
    Dataset updated
    Apr 15, 2020
    Authors
    Maciej Obarski
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Dataset

    This dataset was created by Maciej Obarski

    Released under Attribution 4.0 International (CC BY 4.0)


  13. Data for: Open COVID Trials (OCT) Project

    • zenodo.org
    • data.niaid.nih.gov
    • +2 more
    csv, txt
    Updated Aug 29, 2022
    Cite
    John Borghi; Cheyenne Payne; Lily Ren; Amanda Woodward; Connie Wong; Christopher Stave (2022). Data for: Open COVID Trials (OCT) Project [Dataset]. http://doi.org/10.5061/dryad.mkkwh7137
    Available download formats: txt, csv
    Dataset updated
    Aug 29, 2022
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    John Borghi; Cheyenne Payne; Lily Ren; Amanda Woodward; Connie Wong; Christopher Stave
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    The COVID-19 pandemic has brought substantial attention to the systems used to communicate biomedical research. In particular, the need to rapidly and credibly communicate research findings has led many stakeholders to encourage researchers to adopt open science practices such as posting preprints and sharing data. To examine the degree to which this has led to the actual adoption of such practices, we examined the "openness" of a sample of 539 published papers describing the results of randomized controlled trials testing interventions to prevent or treat COVID-19. The majority (56%) of the papers in this sample were free to read at the time of our investigation and 23.56% were preceded by preprints. However, there is no guarantee that the papers without an open license will be available without a subscription in the future, and only 49.61% of the preprints we identified were linked to the subsequent peer-reviewed version. Of the 331 papers in our sample with statements identifying if (and how) related datasets were available, only a paucity indicated that data was available in a repository that facilitates rapid verification and reuse. Our results demonstrate that, while progress has been made, there is still a significant mismatch between aspiration and actual practice in the adoption of open science in an important area of the COVID-19 literature.

  14. Supplementary material: COVID-19 clinical trials: who is likely to...

    • becaris.figshare.com
    docx
    Updated Jul 24, 2024
    Cite
    Kimberly A. Fisher; Mara M. Epstein; Ngoc Nguyen; Hassan Fouayzi; Sybil Crawford; Benjamin P. Linas; Kathleen M. Mazor (2024). Supplementary material: COVID-19 clinical trials: who is likely to participate and why? [Dataset]. http://doi.org/10.6084/m9.figshare.26363029.v1
    Available download formats: docx
    Dataset updated
    Jul 24, 2024
    Dataset provided by
    Becaris
    Authors
    Kimberly A. Fisher; Mara M. Epstein; Ngoc Nguyen; Hassan Fouayzi; Sybil Crawford; Benjamin P. Linas; Kathleen M. Mazor
    License

    Attribution-NonCommercial-NoDerivs 4.0 (CC BY-NC-ND 4.0): https://creativecommons.org/licenses/by-nc-nd/4.0/
    License information was derived automatically

    Description

    These are peer-reviewed supplementary materials for the article 'COVID-19 clinical trials: who is likely to participate and why?' published in the Journal of Comparative Effectiveness Research.

    - Appendix 1: Research Participation Survey
    - Appendix 2: Statistical Analyses and Results
    - Supplemental Table 1: Logistic Regression Models Predicting Intent to Participate in Hypothetical Research Study

    Aim: To identify factors associated with willingness to participate in a COVID-19 clinical trial and reasons for and against participating.

    Materials & methods: We surveyed Massachusetts (MA, USA) residents online using the Dynata survey platform and via phone using random digit dialing between October and November 2021. Respondents were asked to imagine they were hospitalized with COVID-19 and invited to participate in a treatment trial. We assessed willingness to participate by asking, “Which way are you leaning” and why. We used multivariate logistic regression to model factors associated with leaning toward participation. Open-ended responses were analyzed using conventional content analysis.

    Results: Of 1071 respondents, 65.6% leaned toward participating. Multivariable analyses revealed college education (OR: 1.59; 95% CI: 1.11, 2.27), trust in the healthcare system (OR: 1.32; 95% CI: 1.10, 1.58) and relying on doctors (OR: 1.77; 95% CI: 1.45, 2.17) and family or friends (OR: 1.31; 95% CI: 1.11, 1.54) to make health decisions were significantly associated with leaning toward participating. Respondents with lower health literacy (OR: 0.57; 95% CI: 0.36, 0.91) and who identify as Black (OR: 0.40; 95% CI: 0.24, 0.68), Hispanic (OR: 0.61; 95% CI: 0.38, 0.98), or Republican (OR: 0.61; 95% CI: 0.38, 0.97) were significantly less likely to lean toward participating. Common reasons for participating included helping others, benefitting oneself and deeming the study low risk. Common reasons for leaning against were deeming the study high risk, disliking experimental treatments and not wanting to be a guinea pig.

    Conclusion: Our finding that vulnerable individuals and those with lower levels of trust in the healthcare system are less likely to be receptive to participating in a COVID-19 clinical trial highlights that work is needed to achieve a healthcare system that provides confidence to historically disadvantaged groups that their participation in research will benefit their community.

  15. Datasheet1_Mobility data shows effectiveness of control strategies for...

    • frontiersin.figshare.com
    pdf
    Updated Mar 7, 2024
    Cite
    Yuval Berman; Shannon D. Algar; David M. Walker; Michael Small (2024). Datasheet1_Mobility data shows effectiveness of control strategies for COVID-19 in remote, sparse and diffuse populations.pdf [Dataset]. http://doi.org/10.3389/fepid.2023.1201810.s001
    Available download formats: pdf
    Dataset updated
    Mar 7, 2024
    Dataset provided by
    Frontiers Media (http://www.frontiersin.org/)
    Authors
    Yuval Berman; Shannon D. Algar; David M. Walker; Michael Small
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Data that is collected at the individual-level from mobile phones is typically aggregated to the population-level for privacy reasons. If we are interested in answering questions regarding the mean, or working with groups appropriately modeled by a continuum, then this data is immediately informative. However, coupling such data regarding a population to a model that requires information at the individual-level raises a number of complexities. This is the case if we aim to characterize human mobility and simulate the spatial and geographical spread of a disease by dealing in discrete, absolute numbers. In this work, we highlight the hurdles faced and outline how they can be overcome to effectively leverage the specific dataset: Google COVID-19 Aggregated Mobility Research Dataset (GAMRD). Using a case study of Western Australia, which has many sparsely populated regions with incomplete data, we firstly demonstrate how to overcome these challenges to approximate absolute flow of people around a transport network from the aggregated data. Overlaying this evolving mobility network with a compartmental model for disease that incorporated vaccination status we run simulations and draw meaningful conclusions about the spread of COVID-19 throughout the state without de-anonymizing the data. We can see that towns in the Pilbara region are highly vulnerable to an outbreak originating in Perth. Further, we show that regional restrictions on travel are not enough to stop the spread of the virus from reaching regional Western Australia. The methods explained in this paper can be therefore used to analyze disease outbreaks in similarly sparse populations. We demonstrate that using this data appropriately can be used to inform public health policies and have an impact in pandemic responses.

  16. Sharing research data and findings relevant to the novel coronavirus...

    • zenodo.org
    • resodate.org
    Updated Jun 21, 2023
    Cite
    Eleanor Cox; Lucia Loffreda; Andrea Chiarelli (2023). Sharing research data and findings relevant to the novel coronavirus (COVID-19) outbreak - Survey responses [Dataset]. http://doi.org/10.5281/zenodo.6620689
    Dataset updated
    Jun 21, 2023
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Eleanor Cox; Lucia Loffreda; Andrea Chiarelli
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The spreadsheets in the present dataset (CSV format) include the anonymised responses to our online survey of signatories of the Joint Statement on open research and data sharing. Responses have been split into quantitative responses (i.e., closed survey questions) and qualitative responses (i.e., free text survey questions).

    This data has been used to inform our final report, which is available in our Zenodo Project Community.

  17. COVID-19 clinical studies in Mexico 2025, by phase

    • statista.com
    Updated Nov 29, 2025
    Cite
    Statista (2025). COVID-19 clinical studies in Mexico 2025, by phase [Dataset]. https://www.statista.com/statistics/1203597/mexico-covid-19-clinical-trials-phase/
    Dataset updated
    Nov 29, 2025
    Dataset authored and provided by
    Statista (http://statista.com/)
    Time period covered
    Feb 13, 2025
    Area covered
    Mexico
    Description

    As of February 2025, a total of ** clinical studies targeting COVID-19 in Mexico were in phase *. Meanwhile, ***** COVID-19 clinical trials were in early phase * in the North American country. As of June 3, 2022, there were over ***** drugs and vaccines in development targeting the coronavirus disease (COVID-19) worldwide.

  18. Data from: Analysis of shared research data in Spanish scientific papers...

    • zenodo.org
    Updated Sep 29, 2022
    Cite
    Roxana Cerda-Cosme; Eva Méndez (2022). Analysis of shared research data in Spanish scientific papers about COVID-19: a first approach [Dataset]. http://doi.org/10.5281/zenodo.5711105
    Dataset updated
    Sep 29, 2022
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Roxana Cerda-Cosme; Eva Méndez
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Introduction: During the coronavirus pandemic, the way science is done and shared changed, which motivates meta-research to help understand science communication in crises and improve its effectiveness. Objective: To study how many Spanish scientific papers on COVID-19 published during 2020 share their research data. Methodology: Qualitative and descriptive study applying nine attributes: (1) availability, (2) accessibility, (3) format, (4) licensing, (5) linkage, (6) funding, (7) editorial policy, (8) content and (9) statistics. Results: We analyzed 1340 papers, of which 1173 (87.5%) did not have research data. The remaining 12.5% comprise 2.1% that share their data in repositories, 5% that share their data upon request, 0.2% that do not have permission to share their data, and 5.2% that share their data as supplementary material. Conclusions: Only a small percentage of papers share their research data, which points to researchers' poor knowledge of how to properly share research data and of what research data is.

  19. Data from: A large-scale COVID-19 Twitter chatter dataset for open...

    • zenodo.org
    • explore.openaire.eu
    application/gzip, tsv +1
    Updated Apr 17, 2023
    Cite
    Juan M. Banda; Ramya Tekumalla; Guanyu Wang; Jingyuan Yu; Tuo Liu; Yuning Ding; Katya Artemova; Elena Tutubalina; Gerardo Chowell (2023). A large-scale COVID-19 Twitter chatter dataset for open scientific research - an international collaboration [Dataset]. http://doi.org/10.5281/zenodo.3941254
    Available download formats: application/gzip, tsv, zip
    Dataset updated
    Apr 17, 2023
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Juan M. Banda; Ramya Tekumalla; Guanyu Wang; Jingyuan Yu; Tuo Liu; Yuning Ding; Katya Artemova; Elena Tutubalina; Gerardo Chowell
    Description

    NEW in Version 18: Besides our regular update, we have now included the tweet identifiers and their respective tweet location place country code for the clean version of the dataset. These are found in the clean_place_country.tar.gz file; each file is identified by the two-character ISO country code as the file suffix.

    Due to the relevance of the COVID-19 global pandemic, we are releasing our dataset of tweets acquired from the Twitter Stream related to COVID-19 chatter. Since our first release we have received additional data from our new collaborators, allowing this resource to grow to its current size. Dedicated data gathering started from March 11th yielding over 4 million tweets a day. We have added additional data provided by our new collaborators from January 27th to March 27th, to provide extra longitudinal coverage. Version 10 added ~1.5 million tweets in the Russian language collected between January 1st and May 8th, graciously provided to us by Katya Artemova (NRU HSE) and Elena Tutubalina (KFU). From version 12 we have included daily hashtags, mentions and emojis and their frequencies in the respective zip files. From version 14 we have included the tweet identifiers and their respective language for the clean version of the dataset. These are found in the clean_languages.tar.gz file; each file is identified by the two-character language code as the file suffix.

    The data collected from the stream captures all languages, but the most prevalent are English, Spanish, and French. We release all tweets and retweets in the full_dataset.tsv file (490,385,226 unique tweets), and a cleaned version with no retweets in the full_dataset-clean.tsv file (120,722,431 unique tweets). There are several practical reasons for us to keep the retweets; tracing important tweets and their dissemination is one of them. For NLP tasks we provide the top 1000 frequent terms in frequent_terms.csv, the top 1000 bigrams in frequent_bigrams.csv, and the top 1000 trigrams in frequent_trigrams.csv. Some general statistics per day are included for both datasets in the statistics-full_dataset.tsv and statistics-full_dataset-clean.tsv files. For more statistics and some visualizations visit: http://www.panacealab.org/covid19/

    More details can be found (and will be updated faster) at https://github.com/thepanacealab/covid19_twitter and in our pre-print about the dataset (https://arxiv.org/abs/2004.03688).

    As always, the tweets distributed here are only tweet identifiers (with date and time added), because Twitter's terms and conditions allow re-distribution of Twitter data ONLY for research purposes. They need to be hydrated to be used.
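
Because the files carry only tweet identifiers with a date and time, a typical first step before hydration is to read the IDs and group them into batches. The sketch below shows one way to do that with the standard library. The header names (tweet_id, date, time), the sample rows, and the batch size are assumptions for illustration; the real header and the batch size accepted by a given hydration tool should be checked before use.

```python
import csv
import io
from itertools import islice

# Made-up rows; the header names are assumed, not taken from the real file.
SAMPLE_TSV = (
    "tweet_id\tdate\ttime\n"
    "1000000000000000001\t2020-03-11\t00:00:01\n"
    "1000000000000000002\t2020-03-11\t00:00:02\n"
    "1000000000000000003\t2020-03-11\t00:00:05\n"
    "1000000000000000004\t2020-03-12\t10:15:00\n"
    "1000000000000000005\t2020-03-12\t10:15:30\n"
)


def read_tweet_ids(tsv_text):
    """Extract the identifier column, keeping IDs as strings."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    return [row["tweet_id"] for row in reader]


def batched(items, size):
    """Yield consecutive lists of at most `size` items."""
    iterator = iter(items)
    while True:
        chunk = list(islice(iterator, size))
        if not chunk:
            return
        yield chunk


ids = read_tweet_ids(SAMPLE_TSV)
batches = list(batched(ids, 2))  # placeholder batch size
for batch in batches:
    print(batch)
```

Keeping the identifiers as strings avoids any loss of precision if they are later passed through tools that treat large integers as floating-point numbers.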

  20. CORD-19 QA

    • kaggle.com
    zip
    Updated Jun 18, 2020
    Cite
    David Mezzetti (2020). CORD-19 QA [Dataset]. https://www.kaggle.com/davidmezzetti/cord19-qa
    Available download formats: zip (7461393 bytes)
    Dataset updated
    Jun 18, 2020
    Authors
    David Mezzetti
    License

    Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
    License information was derived automatically

    Description

    Content

    This dataset contains files to assist in building question-answer models for the CORD-19 dataset.

    Files included:

    - cord19.txt: Line-by-line export of CORD-19 data with a focus on high quality, study design detected articles.
    - cord19-qa.csv: CSV rows of question, context, answer combinations for the CORD-19 dataset
    - cord19-qa.json: SQuAD 2.0 formatted question, context, answer combinations
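
For readers unfamiliar with the SQuAD 2.0 layout, the sketch below flattens a SQuAD 2.0 style document into question, context, answer rows, similar in spirit to the CSV described above. The sample record is invented for illustration and is not taken from cord19-qa.json; only the nesting (data, paragraphs, qas, answers, is_impossible) follows the standard SQuAD 2.0 format.

```python
import json

# Invented record in SQuAD 2.0 layout; not real CORD-19 content.
SAMPLE_JSON = """
{
  "version": "v2.0",
  "data": [
    {
      "title": "example-article",
      "paragraphs": [
        {
          "context": "The example study enrolled 120 patients.",
          "qas": [
            {
              "id": "q1",
              "question": "How many patients were enrolled?",
              "is_impossible": false,
              "answers": [{"text": "120", "answer_start": 27}]
            },
            {
              "id": "q2",
              "question": "Which country hosted the study?",
              "is_impossible": true,
              "answers": []
            }
          ]
        }
      ]
    }
  ]
}
"""


def flatten_squad(dataset):
    """Return (question, context, answer) rows; answer is None if unanswerable."""
    rows = []
    for article in dataset["data"]:
        for paragraph in article["paragraphs"]:
            context = paragraph["context"]
            for qa in paragraph["qas"]:
                if qa.get("is_impossible", False):
                    rows.append((qa["question"], context, None))
                    continue
                for answer in qa["answers"]:
                    rows.append((qa["question"], context, answer["text"]))
    return rows


dataset = json.loads(SAMPLE_JSON)
rows = flatten_squad(dataset)
for question, context, answer in rows:
    print(question, "->", answer)
```

Unanswerable questions are kept with a None answer so that a model trained on the rows can also learn when the context does not contain an answer.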

    Transformer models

    Transformer models fine-tuned for language modeling, SQuAD 2.0 and this dataset can be used within HuggingFace Transformers.

    Acknowledgements

    Banner Photo by Jeremy Thomas on Unsplash
