100+ datasets found
  1. US Clinical Trials Data Package

    • johnsnowlabs.com
    csv
    Updated Jan 20, 2021
    Cite
    John Snow Labs (2021). US Clinical Trials Data Package [Dataset]. https://www.johnsnowlabs.com/marketplace/us-clinical-trials-data-package/
    Dataset updated
    Jan 20, 2021
    Dataset authored and provided by
    John Snow Labs
    Area covered
    United States
    Description

    This data package contains datasets on clinical trials conducted in the United States. Diseases covered include cervical cancer, diabetes, acute respiratory infection, and stress. The package also includes a clinical trials registry and results database.

  2. TREC 2022 Clinical Trials Dataset

    • catalog.data.gov
    • s.cnmilf.com
    Updated Sep 11, 2024
    Cite
    National Institute of Standards and Technology (2024). TREC 2022 Clinical Trials Dataset [Dataset]. https://catalog.data.gov/dataset/trec-2022-clinical-trials-dataset
    Dataset updated
    Sep 11, 2024
    Dataset provided by
    National Institute of Standards and Technology (http://www.nist.gov/)
    Description

    The goal of the Clinical Trials track is to focus research on the clinical trials matching problem: given a free text summary of a patient health record, find suitable clinical trials for that patient.

  3. Data (i.e., evidence) about evidence based medicine

    • figshare.com
    • search.datacite.org
    png
    Updated May 30, 2023
    Cite
    Jorge H Ramirez (2023). Data (i.e., evidence) about evidence based medicine [Dataset]. http://doi.org/10.6084/m9.figshare.1093997.v24
    Dataset updated
    May 30, 2023
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Jorge H Ramirez
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Update (December 7, 2014): Evidence-based medicine (EBM) is not working for many reasons, for example:

    1. Incorrect in their foundations (paradox): hierarchical levels of evidence are supported by opinions (i.e., lowest strength of evidence according to EBM) instead of real data collected from different types of study designs (i.e., evidence). http://dx.doi.org/10.6084/m9.figshare.1122534

    2. The effect of criminal practices by pharmaceutical companies is only possible because of the complicity of others: healthcare systems, professional associations, governmental and academic institutions. Pharmaceutical companies also corrupt at the personal level: politicians and political parties are on their payroll, and medical professionals are seduced by different types of gifts in exchange for prescriptions (i.e., bribery), which very likely results in patients not receiving the proper treatment for their disease. Many times there is no such thing: healthy persons not needing pharmacological treatments of any kind are constantly misdiagnosed and treated with unnecessary drugs. Some medical professionals are converted into a K.O.L., which is only a puppet appearing on stage to spread lies to their peers: a person supposedly trained to improve the well-being of others who now deceives on behalf of pharmaceutical companies. Probably the saddest thing is that many honest doctors are being misled by these lies created by the rules of pharmaceutical marketing instead of scientific, medical, and ethical principles. Interpretation of EBM in this context was not anticipated by their creators.

    “The main reason we take so many drugs is that drug companies don’t sell drugs, they sell lies about drugs.” (Peter C. Gøtzsche)

    “doctors and their organisations should recognise that it is unethical to receive money that has been earned in part through crimes that have harmed those people whose interests doctors are expected to take care of. Many crimes would be impossible to carry out if doctors weren’t willing to participate in them.” (Peter C Gøtzsche, The BMJ, 2012, Big pharma often commits corporate crime, and this must be stopped.)

    Pending (Colombia): Health Promoter Entities (in Spanish: EPS, Empresas Promotoras de Salud).

    1. Misinterpretations

    New technologies or concepts are difficult to understand in the beginning, regardless of their simplicity; we need to get used to new tools aimed at improving our professional practice. Probably the best explanation is in these videos (credits to Antonio Villafaina for sharing these videos with me).
    English: https://www.youtube.com/watch?v=pQHX-SjgQvQ&w=420&h=315
    Spanish: https://www.youtube.com/watch?v=DApozQBrlhU&w=420&h=315

    -----------------------

    Hypothesis: hierarchical levels of evidence based medicine are wrong

    Dear Editor,

    I have data to support the hypothesis described in the title of this letter. Before rejecting the null hypothesis I would like to ask the following open question: Could you support with data that hierarchical levels of evidence based medicine are correct? (1,2)

    Additional explanation to this question:
    - Only respond to this question attaching publicly available raw data.
    - Be aware that more than a question this is a challenge: I have data (i.e., evidence) which is contrary to classic (i.e., McMaster) or current (i.e., Oxford) hierarchical levels of evidence based medicine. An important part of this data (but not all) is publicly available.

    References
    1. Ramirez, Jorge H (2014): The EBM challenge. figshare. http://dx.doi.org/10.6084/m9.figshare.1135873
    2. The EBM Challenge Day 1: No Answers.

    Competing interests: I endorse the principles of open data in human biomedical research.

    Read this letter on The BMJ (August 13, 2014): http://www.bmj.com/content/348/bmj.g3725/rr/762595
    Re: Greenhalgh T, et al. Evidence based medicine: a movement in crisis? BMJ 2014; 348: g3725.

    Fileset contents

    Raw data: Excel archive: Raw data, interactive figures, and PubMed search terms. Google Spreadsheet is also available (URL below the article description).
    Figure 1. Unadjusted (Fig 1A) and adjusted (Fig 1B) PubMed publication trends (01/01/1992 to 30/06/2014).
    Figure 2. Adjusted PubMed publication trends (07/01/2008 to 29/06/2014).
    Figure 3. Google search trends: Jan 2004 to Jun 2014 / 1-week periods.
    Figure 4. PubMed publication trends (1962-2013): systematic reviews and meta-analysis, clinical trials, and observational studies.
    Figure 5. Ramirez, Jorge H (2014): Infographics: Unpublished US phase 3 clinical trials (2002-2014) completed before Jan 2011 = 50.8%. figshare. http://dx.doi.org/10.6084/m9.figshare.1121675
    Raw data: "13377 studies found for: Completed | Interventional Studies | Phase 3 | received from 01/01/2002 to 01/01/2014 | Worldwide". This database complies with the terms and conditions of ClinicalTrials.gov: http://clinicaltrials.gov/ct2/about-site/terms-conditions
    Supplementary Figures (S1-S6). PubMed publication delay in the indexation processes does not explain the descending trends in the scientific output of evidence-based medicine.

    Acknowledgments

    I would like to acknowledge the following persons for providing valuable concepts in data visualization and infographics:
    1. Maria Fernanda Ramírez. Professor of graphic design. Universidad del Valle. Cali, Colombia.
    2. Lorena Franco. Graphic design student. Universidad del Valle. Cali, Colombia.

    Related articles by this author (Jorge H. Ramírez)
    1. Ramirez JH. Lack of transparency in clinical trials: a call for action. Colomb Med (Cali) 2013;44(4):243-6. URL: http://www.ncbi.nlm.nih.gov/pubmed/24892242
    2. Ramirez JH. Re: Evidence based medicine is broken (17 June 2014). http://www.bmj.com/node/759181
    3. Ramirez JH. Re: Global rules for global health: why we need an independent, impartial WHO (19 June 2014). http://www.bmj.com/node/759151
    4. Ramirez JH. PubMed publication trends (1992 to 2014): evidence based medicine and clinical practice guidelines (04 July 2014). http://www.bmj.com/content/348/bmj.g3725/rr/759895

    Recommended articles
    1. Greenhalgh Trisha, Howick Jeremy, Maskrey Neal. Evidence based medicine: a movement in crisis? BMJ 2014;348:g3725
    2. Spence Des. Evidence based medicine is broken. BMJ 2014;348:g22
    3. Schünemann Holger J, Oxman Andrew D, Brozek Jan, Glasziou Paul, Jaeschke Roman, Vist Gunn E et al. Grading quality of evidence and strength of recommendations for diagnostic tests and strategies. BMJ 2008;336:1106
    4. Lau Joseph, Ioannidis John P A, Terrin Norma, Schmid Christopher H, Olkin Ingram. The case of the misleading funnel plot. BMJ 2006;333:597
    5. Moynihan R, Henry D, Moons KGM (2014) Using Evidence to Combat Overdiagnosis and Overtreatment: Evaluating Treatments, Tests, and Disease Definitions in the Time of Too Much. PLoS Med 11(7): e1001655. doi:10.1371/journal.pmed.1001655
    6. Katz D. A holistic view of evidence based medicine. http://thehealthcareblog.com/blog/2014/05/02/a-holistic-view-of-evidence-based-medicine/
  4. Data from: Sharing of clinical trial data and results reporting practices...

    • data.niaid.nih.gov
    • zenodo.org
    • +1 more
    zip
    Updated Jul 25, 2019
    Cite
    Jennifer Miller; Joseph S. Ross; Marc Wilenzick; Michelle M. Mello (2019). Sharing of clinical trial data and results reporting practices among large pharmaceutical companies: cross sectional descriptive study and pilot of a tool to improve company practices [Dataset]. http://doi.org/10.5061/dryad.k81584t
    Dataset updated
    Jul 25, 2019
    Authors
    Jennifer Miller; Joseph S. Ross; Marc Wilenzick; Michelle M. Mello
    License

    https://spdx.org/licenses/CC0-1.0.html

    Description

    Objectives: To develop and pilot a tool to measure and improve pharmaceutical companies’ clinical trial data sharing policies and practices.

    Design: Cross sectional descriptive analysis.

    Setting: Large pharmaceutical companies with novel drugs approved by the US Food and Drug Administration in 2015.

    Data sources: Data sharing measures were adapted from 10 prominent data sharing guidelines from expert bodies and refined through a multi-stakeholder deliberative process engaging patients, industry, academics, regulators, and others. Data sharing practices and policies were assessed using data from ClinicalTrials.gov, Drugs@FDA, corporate websites, data sharing platforms and registries (eg, the Yale Open Data Access (YODA) Project and Clinical Study Data Request (CSDR)), and personal communication with drug companies.

    Main outcome measures: Company level, multicomponent measure of accessibility of participant level clinical trial data (eg, analysis ready dataset and metadata); drug and trial level measures of registration, results reporting, and publication; company level overall transparency rankings; and feasibility of the measures and ranking tool to improve company data sharing policies and practices.

    Results: Only 25% of large pharmaceutical companies fully met the data sharing measure. The median company data sharing score was 63% (interquartile range 58-85%). Given feedback and a chance to improve their policies to meet this measure, three companies made amendments, raising the percentage of companies in full compliance to 33% and the median company data sharing score to 80% (73-100%). The most common reasons companies did not initially satisfy the data sharing measure were failure to share data by the specified deadline (75%) and failure to report the number and outcome of their data requests. Across new drug applications, a median of 100% (interquartile range 91-100%) of trials in patients were registered, 65% (36-96%) reported results, 45% (30-84%) were published, and 95% (69-100%) were publicly available in some form by six months after FDA drug approval. When examining results on the drug level, less than half (42%) of reviewed drugs had results for all their new drug applications trials in patients publicly available in some form by six months after FDA approval.

    Conclusions: It was feasible to develop a tool to measure data sharing policies and practices among large companies and have an impact in improving company practices. Among large companies, 25% made participant level trial data accessible to external investigators for new drug approvals in accordance with the current study’s measures; this proportion improved to 33% after applying the ranking tool. Other measures of trial transparency were higher. Some companies, however, have substantial room for improvement on transparency and data sharing of clinical trials.

  5. 2019 Clinical Trial Data Sharing Survey Results - Data

    • wellcome.figshare.com
    xlsx
    Updated Jan 14, 2020
    Cite
    Georgina Humphreys; George Merriott; Rachel Knowles; Ben Pierson; Paola Quattroni (2020). 2019 Clinical Trial Data Sharing Survey Results - Data [Dataset]. http://doi.org/10.6084/m9.figshare.11603295.v1
    Dataset updated
    Jan 14, 2020
    Dataset provided by
    Wellcome Trust (https://wellcome.org/)
    Authors
    Georgina Humphreys; George Merriott; Rachel Knowles; Ben Pierson; Paola Quattroni
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The full anonymised dataset from our recent survey into the attitudes towards clinical trial data sharing. The invitation to participate was distributed to clinical trialists funded by the Wellcome Trust, Bill and Melinda Gates Foundation, Cancer Research UK, and UK Medical Research Council.

  6. National Database for Clinical Trials Related to Mental Illness (NDCT)

    • catalog.data.gov
    • healthdata.gov
    • +2 more
    Updated Jul 16, 2025
    Cite
    National Institutes of Health (NIH) (2025). National Database for Clinical Trials Related to Mental Illness (NDCT) [Dataset]. https://catalog.data.gov/dataset/national-database-for-clinical-trials-related-to-mental-illness-ndct
    Dataset updated
    Jul 16, 2025
    Dataset provided by
    National Institutes of Health (NIH)
    Description

    The National Database for Clinical Trials Related to Mental Illness (NDCT) is an extensible informatics platform for relevant data at all levels of biological and behavioral organization (molecules, genes, neural tissue, behavioral, social and environmental interactions) and for all data types (text, numeric, image, time series, etc.) related to clinical trials funded by the National Institute of Mental Health. Sharing data, associated tools, methodologies and results, rather than just summaries or interpretations, accelerates research progress. Community-wide sharing requires common data definitions and standards, as well as comprehensive and coherent informatics approaches for the sharing of de-identified human subject research data. Built on the National Database for Autism Research (NDAR) informatics platform, NDCT provides a comprehensive data sharing platform for NIMH grantees supporting clinical trials.

  7. Dataset of books series that contain Statistical design and analysis of...

    • workwithdata.com
    Updated Nov 25, 2024
    Cite
    Work With Data (2024). Dataset of books series that contain Statistical design and analysis of clinical trials : principles and methods [Dataset]. https://www.workwithdata.com/datasets/book-series?f=1&fcol0=j0-book&fop0=%3D&fval0=Statistical+design+and+analysis+of+clinical+trials+%3A+principles+and+methods&j=1&j0=books
    Dataset updated
    Nov 25, 2024
    Dataset authored and provided by
    Work With Data
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset is about book series. It has 1 row and is filtered to the book Statistical design and analysis of clinical trials : principles and methods. It features 10 columns, including number of authors, number of books, earliest publication date, and latest publication date.

  8. Clinical Trial Data Visualization Market Report | Global Forecast From 2025...

    • dataintelo.com
    csv, pdf, pptx
    Updated Jan 7, 2025
    Cite
    Dataintelo (2025). Clinical Trial Data Visualization Market Report | Global Forecast From 2025 To 2033 [Dataset]. https://dataintelo.com/report/global-clinical-trial-data-visualization-market
    Dataset updated
    Jan 7, 2025
    Dataset authored and provided by
    Dataintelo
    License

    https://dataintelo.com/privacy-and-policy

    Time period covered
    2024 - 2032
    Area covered
    Global
    Description

    Clinical Trial Data Visualization Market Outlook


    The global clinical trial data visualization market size is projected to grow from USD 0.75 billion in 2023 to USD 2.62 billion by 2032, reflecting a compound annual growth rate (CAGR) of 15.2% during the forecast period. This growth is driven by the increasing complexity of clinical trials, the need for enhanced data transparency, and the rising adoption of digital tools in the healthcare sector.
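As a rough check on the growth figures quoted above, the compound annual growth rate follows from CAGR = (end / start)^(1 / years) - 1. The sketch below is illustrative only: the function name `cagr` is hypothetical, and the number of compounding years is an assumption because the listing quotes a 2023 base value alongside a 2024 - 2032 coverage period. The implied rate is sensitive to that choice and need not reproduce the quoted 15.2% exactly.

```python
def cagr(start_value: float, end_value: float, years: float) -> float:
    """Compound annual growth rate: (end / start) ** (1 / years) - 1."""
    if start_value <= 0 or years <= 0:
        raise ValueError("start_value and years must be positive")
    return (end_value / start_value) ** (1.0 / years) - 1.0


# Market sizes quoted in the listing, in USD billions.
START, END = 0.75, 2.62

# Assumption: the period length is either 9 years (counted from the 2023
# base value) or 8 years (counted from the 2024 start of the period covered).
for years in (8, 9):
    print(f"{years} years: implied CAGR = {cagr(START, END, years):.1%}")
```

Comparing the two printed rates with the quoted figure gives a quick sense of which base year the report is likely to have used.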



    One of the key drivers for the growth of the clinical trial data visualization market is the escalating complexity and volume of data generated during clinical trials. The pharmaceutical and biotechnology sectors are witnessing a surge in clinical trials, which demand sophisticated data management and visualization tools to make sense of the vast amounts of data collected. These tools enable researchers to identify patterns, trends, and outliers more efficiently, thereby accelerating the decision-making process and improving clinical trial outcomes.



    Another significant factor contributing to market growth is the increasing emphasis on data transparency and regulatory compliance. Regulatory bodies, such as the FDA and EMA, are mandating greater transparency in clinical trial data to ensure patient safety and data integrity. Data visualization tools facilitate the clear presentation of complex data, making it easier for regulatory bodies and stakeholders to review and approve clinical trial processes. This ensures that clinical trials are conducted in a more transparent and compliant manner, thus driving the adoption of these tools.



    The advent of advanced technologies, such as artificial intelligence (AI) and machine learning (ML), is also playing a crucial role in the growth of the clinical trial data visualization market. These technologies are being increasingly integrated into data visualization tools to enhance their capabilities. AI and ML algorithms can analyze large datasets quickly and provide insights that were previously unattainable. This not only improves the efficiency of clinical trials but also enhances the accuracy and reliability of the data being presented.



    As the clinical trial data visualization market continues to expand, the importance of Clinical Trial Data Security becomes increasingly paramount. With the vast amounts of data generated during trials, ensuring the confidentiality, integrity, and availability of this data is critical. Organizations must implement robust security measures to protect sensitive information from unauthorized access and breaches. This involves not only securing the data itself but also safeguarding the systems and networks that store and process this information. As regulatory bodies tighten their data protection requirements, companies are investing in advanced security technologies and practices to comply with these standards and maintain trust with stakeholders. The focus on Clinical Trial Data Security is not just about compliance; it is about ensuring the reliability and credibility of clinical trial outcomes, which ultimately impacts patient safety and the development of new therapies.



    Regionally, North America is expected to dominate the clinical trial data visualization market due to the presence of a large number of pharmaceutical and biotechnology companies, a well-established healthcare infrastructure, and a strong focus on research and development. Europe is also expected to witness significant growth, driven by the increasing adoption of digital technologies in clinical trials and supportive regulatory frameworks. The Asia Pacific region is poised to grow at the fastest rate, fueled by the expanding pharmaceutical industry, growing investments in healthcare technology, and an increasing number of clinical trials being conducted in countries like China and India.



    Component Analysis


    The clinical trial data visualization market is segmented into software and services based on components. The software segment is expected to hold the largest market share during the forecast period. This can be attributed to the increasing demand for advanced software solutions that offer real-time data analysis and visualization capabilities. These software tools are designed to handle large volumes of data and provide intuitive visual representations that facilitate better understanding and decision-making.



    Furthermore, the integration of AI and ML technologies into data visualization software is enhancing their capabilities, makin

  9. Data from: Availability of Study Protocols for Randomized Trials Published...

    • datacatalog.hshsl.umaryland.edu
    Updated Mar 27, 2024
    Cite
    Peter Doshi; O'Mareen Spence; Kyungwan Hong; Richie Onwuchekwa Uba (2024). Availability of Study Protocols for Randomized Trials Published in High-Impact Medical Journals: A Cross-Sectional Analysis [Dataset]. http://doi.org/10.5281/zenodo.1344634
    Dataset updated
    Mar 27, 2024
    Dataset provided by
    HS/HSL
    Authors
    Peter Doshi; O'Mareen Spence; Kyungwan Hong; Richie Onwuchekwa Uba
    Description

    To improve reporting transparency and research integrity, some journals have begun publishing study protocols and statistical analysis plans alongside trial publications. To determine the overall availability and characteristics of protocols and statistical analysis plans this study reviewed all randomized clinical trials (RCT) published in 2016 in the following 5 general medicine journals: Annals of Internal Medicine, BMJ, JAMA, Lancet, and NEJM. Characteristics of RCTs were extracted from the publication and clinical trial registry. A detailed assessment of protocols and statistical analysis plans was conducted in a 20% random sample of trials. Dataset contains extraction sheets (as SAS data files), code to calculate the values in the tables in the manuscript, and a supplemental file with additional notes on methods used in the study.

  10. Final Dataset for the DIssemination of REgistered COVID-19 Clinical Trials...

    • data.niaid.nih.gov
    • zenodo.org
    Updated Jul 11, 2024
    + more versions
    Cite
    DeVito, Nicholas J. (2024). Final Dataset for the DIssemination of REgistered COVID-19 Clinical Trials (DIRECCT) Study [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_8181414
    Dataset updated
    Jul 11, 2024
    Dataset provided by
    Salholz-Hillel, Maia
    Schult, Tjada A.
    Grabitz, Peter
    Carlisle, Benjamin Gregory
    Goldacre, Ben
    Pugh-Jones, Molly
    Hildebrand, Nicole
    Schwietering, Johannes
    Strech, Daniel
    DeVito, Nicholas J.
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The DIRECCT study is a multi-phase examination of clinical trial results dissemination during the COVID-19 pandemic.

    Interim data for trials completed during the first six months of the pandemic (i.e., 1 January 2020 – 30 June 2020) was previously deposited at https://doi.org/10.5281/zenodo.4669936. This data deposit comprises the results of searches for trials completed during the first 18 months of the pandemic (i.e., 1 January 2020 – 30 June 2021). The data structure for the final phase of the project is not identical to that of the interim data, as it is substantially more complex. The data include data tables (CSVs) that can be treated as relational and joined on the id or trn columns. See datamodel.png for an overview of the data.

    Details on data sources and methods for the creation and analysis of this dataset are available in a detailed protocol (Version 3.1, 19 July 2023): https://osf.io/w8t7r

    Note: This repository will be updated with additional information including a codebook and archives of raw data.

    Additional information on the project is available at the project's OSF page: https://doi.org/10.17605/osf.io/5f8j2.
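
The DIRECCT description says its CSV tables can be treated as relational and joined on the id or trn columns. A minimal sketch of such a join, using only the Python standard library, might look like the following. The table contents, the `registry` and `pub_type` columns, and the helper names `read_rows` and `left_join` are all hypothetical; only the `id` and `trn` key names come from the description, and the deposit's datamodel.png should be consulted for the real structure.

```python
import csv
import io

# Hypothetical stand-ins for two CSV tables from the deposit. Only the `id`
# and `trn` key columns are named in the description; the rest is assumed.
TRIALS_CSV = """id,trn,registry
t1,NCT00000001,ClinicalTrials.gov
t2,NCT00000002,ClinicalTrials.gov
"""
RESULTS_CSV = """id,trn,pub_type
t1,NCT00000001,preprint
t1,NCT00000001,journal
"""


def read_rows(text):
    """Parse CSV text into a list of dict rows."""
    return list(csv.DictReader(io.StringIO(text)))


def left_join(left, right, key):
    """Left-join two lists of dict rows on `key` (one output row per match)."""
    index = {}
    for row in right:
        index.setdefault(row[key], []).append(row)
    joined = []
    for row in left:
        # Rows with no match are kept once, without the right-hand columns.
        for match in index.get(row[key], [None]):
            merged = dict(row)
            if match is not None:
                merged.update(match)
            joined.append(merged)
    return joined


trials = read_rows(TRIALS_CSV)
results = read_rows(RESULTS_CSV)
joined = left_join(trials, results, "trn")
for row in joined:
    print(row)
```

A left join keeps trials that have no matching result row, which is usually what a dissemination analysis needs when counting unreported trials.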

  11. Data from "Obstacles to the Reuse of Study Metadata in ClinicalTrials.gov"

    • figshare.com
    zip
    Updated Jun 1, 2023
    Cite
    Laura Miron; Rafael Gonçalves; Mark A. Musen (2023). Data from "Obstacles to the Reuse of Study Metadata in ClinicalTrials.gov" [Dataset]. http://doi.org/10.6084/m9.figshare.12743939.v2
    Dataset updated
    Jun 1, 2023
    Dataset provided by
    figshare
    Authors
    Laura Miron; Rafael Gonçalves; Mark A. Musen
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This fileset provides supporting data and corpora for the empirical study described in: Laura Miron, Rafael S. Goncalves and Mark A. Musen. Obstacles to the Reuse of Metadata in ClinicalTrials.gov

    Description of files

    Original data files:
    - AllPublicXml.zip contains the set of all public XML records in ClinicalTrials.gov (protocols and summary results information), on which all remaining analyses are based. Set contains 302,091 records downloaded on April 3, 2019.
    - public.xsd is the XML schema downloaded from ClinicalTrials.gov on April 3, 2019, used to validate records in AllPublicXML.

    BioPortal API Query Results:
    - condition_matches.csv contains the results of querying the BioPortal API for all ontology terms that are an 'exact match' to each condition string scraped from the ClinicalTrials.gov XML. Columns={filename, condition, url, bioportal term, cuis, tuis}.
    - intervention_matches.csv contains BioPortal API query results for all interventions scraped from the ClinicalTrials.gov XML. Columns={filename, intervention, url, bioportal term, cuis, tuis}.

    Data Element Definitions:
    - supplementary_table_1.xlsx Mapping of element names, element types, and whether elements are required in ClinicalTrials.gov data dictionaries, the ClinicalTrials.gov XML schema declaration for records (public.XSD), the Protocol Registration System (PRS), FDAAA801, and the WHO required data elements for clinical trial registrations.

    Column and value definitions:
    - CT.gov Data Dictionary Section: Section heading for a group of data elements in the ClinicalTrials.gov data dictionary (https://prsinfo.clinicaltrials.gov/definitions.html)
    - CT.gov Data Dictionary Element Name: Name of an element/field according to the ClinicalTrials.gov data dictionaries (https://prsinfo.clinicaltrials.gov/definitions.html) and (https://prsinfo.clinicaltrials.gov/expanded_access_definitions.html)
    - CT.gov Data Dictionary Element Type: "Data" if the element is a field for which the user provides a value, "Group Heading" if the element is a group heading for several sub-fields, but is not in itself associated with a user-provided value.
    - Required for CT.gov for Interventional Records: "Required" if the element is required for interventional records according to the data dictionary, "CR" if the element is conditionally required, "Jan 2017" if the element is required for studies starting on or after January 18, 2017, the effective date of the FDAAA801 Final Rule, "-" if the element is not applicable to interventional records (only observational or expanded access)
    - Required for CT.gov for Observational Records: "Required" if the element is required for observational records according to the data dictionary, "CR" if the element is conditionally required, "Jan 2017" if the element is required for studies starting on or after January 18, 2017, the effective date of the FDAAA801 Final Rule, "-" if the element is not applicable to observational records (only interventional or expanded access)
    - Required in CT.gov for Expanded Access Records?: "Required" if the element is required for expanded access records according to the data dictionary, "CR" if the element is conditionally required, "Jan 2017" if the element is required for studies starting on or after January 18, 2017, the effective date of the FDAAA801 Final Rule, "-" if the element is not applicable to expanded access records (only interventional or observational)
    - CT.gov XSD Element Definition: abbreviated xpath to the corresponding element in the ClinicalTrials.gov XSD (public.XSD). The full xpath includes 'clinical_study/' as a prefix to every element. (There is a single top-level element called "clinical_study" for all other elements.)
    - Required in XSD?: "Yes" if the element is required according to public.XSD, "No" if the element is optional, "-" if the element is not made public or included in the XSD
    - Type in XSD: "text" if the XSD type was "xs:string" or "textblock", name of enum given if type was enum, "integer" if type was "xs:integer" or "xs:integer" extended with the "type" attribute, "struct" if the type was a struct defined in the XSD
    - PRS Element Name: Name of the corresponding entry field in the PRS system
    - PRS Entry Type: Entry type in the PRS system. This column contains some free text explanations/observations
    - FDAAA801 Final Rule Field Name: Name of the corresponding required field in the FDAAA801 Final Rule (https://www.federalregister.gov/documents/2016/09/21/2016-22129/clinical-trials-registration-and-results-information-submission). This column contains many empty values where elements in ClinicalTrials.gov do not correspond to a field required by the FDA
    - WHO Field Name: Name of the corresponding field required by the WHO Trial Registration Data Set (v 1.3.1) (https://prsinfo.clinicaltrials.gov/trainTrainer/WHO-ICMJE-ClinTrialsgov-Cross-Ref.pdf)

    Analytical Results:
    - EC_human_review.csv contains the results of a manual review of a random sample of eligibility criteria from 400 CT.gov records. Table gives filename, criteria, and whether manual review determined the criteria to contain criteria for "multiple subgroups" of participants.
    - completeness.xlsx contains counts and percentages of interventional records missing fields required by FDAAA801 and its Final Rule.
    - industry_completeness.xlsx contains percentages of interventional records missing required fields, broken up by agency class of trial's lead sponsor ("NIH", "US Fed", "Industry", or "Other"), and before and after the effective date of the Final Rule.
    - location_completeness.xlsx contains percentages of interventional records missing required fields, broken up by whether the record listed at least one location in the United States or only international locations (excluding trials with no listed location), and before and after the effective date of the Final Rule.

    Intermediate Results:
    - cache.zip contains pickle and csv files of pandas dataframes with values scraped from the XML records in AllPublicXML. Downloading these files greatly speeds up running analysis steps from jupyter notebooks in our github repository.
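
The completeness files described above count records that are missing required fields. A minimal sketch of that kind of check on a single record, using the standard-library XML parser, is shown below. It is not the authors' code: the sample record, the `REQUIRED_PATHS` list, and the `missing_fields` helper are hypothetical, and the only detail taken from the description is that every xpath is relative to a single top-level `clinical_study` element.

```python
import xml.etree.ElementTree as ET

# Hypothetical record; element names other than the `clinical_study` root
# are illustrative assumptions, not taken from public.xsd.
RECORD = """<clinical_study>
  <brief_title>Example trial</brief_title>
  <overall_status>Completed</overall_status>
  <condition>Diabetes</condition>
</clinical_study>"""

# Hypothetical list of required elements, written as xpaths relative to
# the root, mirroring the abbreviated-xpath convention in the description.
REQUIRED_PATHS = ["brief_title", "overall_status", "primary_outcome/measure"]


def missing_fields(xml_text, required_paths):
    """Return the required paths that are absent or empty in one record."""
    root = ET.fromstring(xml_text)
    if root.tag != "clinical_study":
        raise ValueError(f"unexpected root element: {root.tag}")
    missing = []
    for path in required_paths:
        element = root.find(path)
        if element is None or not (element.text or "").strip():
            missing.append(path)
    return missing


missing = missing_fields(RECORD, REQUIRED_PATHS)
print(missing)
```

Running the same function over every file in an archive and tallying the results per path would give counts and percentages of the kind the completeness spreadsheets report.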

  12. Clean data from survey of statisticians on Adverse Event analysis practices...

    • figshare.com
    bin
    Updated May 31, 2023
    Cite
    Rachel Phillips; Victoria Cornelius (2023). Clean data from survey of statisticians on Adverse Event analysis practices in RCTs [Dataset]. http://doi.org/10.6084/m9.figshare.12436574.v1
    Explore at:
    Available download formats: bin
    Dataset updated
    May 31, 2023
    Dataset provided by
    figshare
    Figshare (http://figshare.com/)
    Authors
    Rachel Phillips; Victoria Cornelius
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    Dataset (Stata v15.1) containing responses from a survey of UK Clinical Research Collaboration registered clinical trial units (CTUs) and industry statisticians from both pharmaceuticals and clinical research organisations (http://dx.doi.org/10.1136/bmjopen-2020-036875). Data are de-identified. The dataset contains descriptive variables describing participants' experience, as well as responses to questions on current adverse event analysis practices, awareness of specialist methods for adverse event analysis, and the priorities, concerns and barriers participants experience when analysing adverse event data.

  13. Data cleaning using unstructured data

    • zenodo.org
    zip
    Updated Jul 30, 2024
    Cite
    Rihem Nasfi; Antoon Bronselaer (2024). Data cleaning using unstructured data [Dataset]. http://doi.org/10.5281/zenodo.13135983
    Explore at:
    Available download formats: zip
    Dataset updated
    Jul 30, 2024
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Rihem Nasfi; Antoon Bronselaer
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    In this project, we work on repairing three datasets:

    • Trials design: This dataset was obtained from the European Union Drug Regulating Authorities Clinical Trials Database (EudraCT) register, and the ground truth was created from external registries. In the dataset, multiple countries, identified by the attribute country_protocol_code, conduct the same clinical trial, which is identified by eudract_number. Each clinical trial has a title that can help find informative details about the design of the trial.
    • Trials population: This dataset delineates the demographic origins of participants in clinical trials primarily conducted across European countries. The dataset includes structured attributes indicating whether the trial pertains to a specific gender, age group or healthy volunteers. Each of these categories is labeled '1' or '0', denoting respectively whether it is included in the trial or not. It is important to note that the population category should remain consistent across all countries conducting the same clinical trial, identified by a eudract_number. The ground truth samples in the dataset were established by aligning information about the trial populations provided by external registries, specifically the CT.gov database and the German Trials database. Additionally, the dataset comprises unstructured attributes, such as inclusion, that describe the inclusion criteria for trial participants.
    • Allergens: This dataset contains information about products and their allergens. The data was collected from the German version of 'Alnatura' (access date: 24 November 2020); 'Open Food Facts', a free database of food products from around the world; and the websites 'Migipedia', 'Piccantino', and 'Das Ist Drin'. There may be overlapping products across these websites. Each product in the dataset is identified by a unique code. Samples with the same code represent the same product but are extracted from a different source. Each allergen is indicated by '2' if present, '1' if there are traces of it, and '0' if it is absent from a product. The dataset also includes information on ingredients in the products. Overall, the dataset comprises categorical structured data describing the presence, trace, or absence of specific allergens, and unstructured text describing ingredients.
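    The consistency rule stated for the trials population dataset (population categories should agree across all countries running the same eudract_number) can be checked with a short script. This is a minimal sketch, not the authors' repair method: eudract_number and country_protocol_code come from the description above, while the attribute names `gender` and `healthy_volunteers` and all sample values are hypothetical.

```python
from collections import defaultdict

def inconsistent_trials(rows, key="eudract_number",
                        attributes=("gender", "healthy_volunteers")):
    """Return {trial_id: {attribute: set_of_values}} for trials whose
    per-country rows disagree on at least one population attribute."""
    seen = defaultdict(lambda: defaultdict(set))
    for row in rows:
        for attribute in attributes:
            seen[row[key]][attribute].add(row[attribute])
    conflicts = {}
    for trial_id, per_attribute in seen.items():
        bad = {a: values for a, values in per_attribute.items() if len(values) > 1}
        if bad:
            conflicts[trial_id] = bad
    return conflicts

# Hypothetical rows: trial T1 is reported by two countries that disagree.
rows = [
    {"eudract_number": "T1", "country_protocol_code": "DE",
     "gender": "1", "healthy_volunteers": "0"},
    {"eudract_number": "T1", "country_protocol_code": "FR",
     "gender": "1", "healthy_volunteers": "1"},
    {"eudract_number": "T2", "country_protocol_code": "BE",
     "gender": "0", "healthy_volunteers": "0"},
]
conflicts = inconsistent_trials(rows)
print(conflicts)
```

    Rows flagged this way are the candidates for repair; deciding which value is correct still requires the ground truth or the unstructured text.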

    N.B.: Each '.zip' file contains a set of 5 '.csv' files which are part of the aforementioned datasets:

    • "{dataset_name}_train.csv": samples used for the ML-model training. (e.g. "allergens_train.csv")
    • "{dataset_name}_test.csv": samples used to test the ML-model performance. (e.g. "allergens_test.csv")
    • "{dataset_name}_golden_standard.csv": samples that represent the ground truth of the test samples. (e.g. "allergens_golden_standard.csv")
    • "{dataset_name}_parker_train.csv": samples repaired using Parker Engine, used for the ML-model training. (e.g. "allergens_parker_train.csv")
    • "{dataset_name}_parker_test.csv": samples repaired using Parker Engine, used to test the ML-model performance. (e.g. "allergens_parker_test.csv")
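
    The naming convention above lends itself to a quick check that an archive holds all five files. The sketch below is an assumption-laden illustration: it takes the fifth file to be named as in the example given ("allergens_parker_test.csv"), and it builds a small in-memory archive with placeholder contents instead of reading the real download.

```python
import io
import zipfile

SUFFIXES = ("train", "test", "golden_standard", "parker_train", "parker_test")

def expected_files(dataset_name):
    """List the five CSV names implied by the naming convention."""
    return [f"{dataset_name}_{suffix}.csv" for suffix in SUFFIXES]

def missing_from_archive(archive, dataset_name):
    """Return the expected file names that are absent from an open ZipFile."""
    present = {name.rsplit("/", 1)[-1] for name in archive.namelist()}
    return [name for name in expected_files(dataset_name) if name not in present]

# Build a placeholder archive that deliberately lacks the last file.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    for name in expected_files("allergens")[:4]:
        archive.writestr(name, "placeholder\n")

with zipfile.ZipFile(buffer) as archive:
    missing = missing_from_archive(archive, "allergens")
print(missing)
```

    For a real download, `zipfile.ZipFile(path_to_zip)` would replace the in-memory buffer.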
  14. Clinical Trials

    • kaggle.com
    Updated Nov 25, 2022
    Cite
    The Devastator (2022). Clinical Trials [Dataset]. https://www.kaggle.com/datasets/thedevastator/a-quick-overview-of-clinical-trials/discussion
    Explore at:
    Dataset updated
    Nov 25, 2022
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    The Devastator
    Description

    Clinical Trials

    Clinical trials over the years along with start / end dates, outcome and more.

    By Aero Data Lab [source]

    About this dataset

    This dataset contains information on clinical trials conducted by sponsors. Each row represents a clinical trial, and the columns represent various attributes of the trial, such as the National Clinical Trial Number, the sponsor of the trial, the title of the trial, and so on.

    The purpose of this dataset is to provide a bird's-eye view of the clinical trial landscape. By understanding which sponsors are conducting which trials and for what conditions, we can get a better sense of where research is headed and what new treatments may be on the horizon.

    How to use the dataset

    • NCT is a unique identifier for clinical trials. It stands for National Clinical Trial Number.
    • Sponsor is the organization that is funding the clinical trial.
    • Title is the name of the clinical trial.
    • Summary is a brief summary of the clinical trial.
    • Start Year is the year that the clinical trial started.
    • Start Month is the month that the clinical trial started.
    • Phase is the stage of development of the investigational drug or device, which can be one of four types: I, II, III, or IV.
    • Enrollment is the number of participants in the clinical trial.
    • Status is the status of enrollment in the study, which can be Recruiting; Not yet recruiting; Active, not recruiting; Completed; Suspended; or Terminated.
    • Condition indicates what medical condition(s) are being studied in this particular NCT record.

    Research Ideas

    • Identify patterns in clinical trials to improve the development process
    • Understand how different sponsors fund clinical trials

    Acknowledgements

    By Aero Data Lab [source]

    License

    License: Dataset copyright by authors

    You are free to:
    • Share - copy and redistribute the material in any medium or format for any purpose, even commercially.
    • Adapt - remix, transform, and build upon the material for any purpose, even commercially.

    You must:
    • Give appropriate credit - Provide a link to the license, and indicate if changes were made.
    • ShareAlike - You must distribute your contributions under the same license as the original.
    • Keep intact - all notices that refer to this license, including copyright notices.

    Columns

    File: AERO-BirdsEye-Data.csv

    | Column name | Description                                                      |
    |:------------|:-----------------------------------------------------------------|
    | NCT         | National Clinical Trial number. (String)                         |
    | Sponsor     | Name of the sponsor conducting the clinical trial. (String)      |
    | Title       | Title of the clinical trial. (String)                            |
    | Summary     | Brief summary of the clinical trial. (String)                    |
    | Start_Year  | Year the clinical trial started. (Integer)                       |
    | Start_Month | Month the clinical trial started. (String)                       |
    | Phase       | Phase of the clinical trial. (String)                            |
    | Enrollment  | Number of participants enrolled in the clinical trial. (Integer) |
    | Status      | Status of the clinical trial. (String)                           |
    | Condition   | Condition being tested in the clinical trial. (String)           |
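
    A minimal sketch of reading a file with these columns and summarising trials by sponsor, using only the standard library. The column names come from the table above; the three sample rows, including the NCT values and sponsor names, are invented placeholders and not real trial data.

```python
import csv
import io
from collections import Counter

# Hypothetical rows that follow the documented column layout.
SAMPLE = """NCT,Sponsor,Title,Summary,Start_Year,Start_Month,Phase,Enrollment,Status,Condition
NCT00000001,Sponsor A,Example trial 1,Placeholder,2015,3,Phase 2,120,Completed,Condition X
NCT00000002,Sponsor B,Example trial 2,Placeholder,2016,7,Phase 3,300,Recruiting,Condition Y
NCT00000003,Sponsor A,Example trial 3,Placeholder,2016,11,Phase 1,40,Completed,Condition X
"""

def load_trials(handle):
    """Read trial rows, converting the two Integer columns to int."""
    trials = []
    for row in csv.DictReader(handle):
        row["Start_Year"] = int(row["Start_Year"])
        row["Enrollment"] = int(row["Enrollment"])
        trials.append(row)
    return trials

trials = load_trials(io.StringIO(SAMPLE))

# Number of trials and total enrollment per sponsor.
trials_per_sponsor = Counter(t["Sponsor"] for t in trials)
enrollment_per_sponsor = Counter()
for t in trials:
    enrollment_per_sponsor[t["Sponsor"]] += t["Enrollment"]

print(trials_per_sponsor.most_common())
print(dict(enrollment_per_sponsor))
```

    With the real file, `open("AERO-BirdsEye-Data.csv", newline="", encoding="utf-8")` would take the place of the in-memory sample; rows with empty Integer cells would need extra handling that this sketch omits.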

    Acknowledgements

    If you use this dataset in your research, please credit Aero Data Lab [source].

  15. Data from: Clinical Research: A Globalized Network

    • figshare.com
    txt
    Updated Jun 1, 2023
    Cite
    Trevor Richter (2023). Clinical Research: A Globalized Network [Dataset]. http://doi.org/10.6084/m9.figshare.1246725.v3
    Explore at:
    Available download formats: txt
    Dataset updated
    Jun 1, 2023
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Trevor Richter
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    These files relate to data extracted from ClinicalTrials.gov. In the database file, data for individual clinical trials are included, and attributes include study identifier, study type, trial dates, interventions, sample size, countries in which the study was conducted, etc. The Edges file contains geographic data derived from the clinical trials data set that can be used to generate networks to illustrate geographic connectivity through clinical research, using open access software such as Gephi. The Gephi file includes networks for all countries worldwide, as well as regional networks for each major geographic region. The figures are network diagrams generated by Gephi showing geographic connectivity among individual countries through common participation in multinational clinical trials. The thickness of the connecting lines (edges) reflects the strength of a connection.
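
    One way to derive a weighted edge list of this kind is to count, for every pair of countries, the trials in which both took part. The sketch below assumes that definition of connection strength; the description does not say how the Edges file was actually computed, and the trial identifiers and country lists are hypothetical.

```python
from collections import Counter
from itertools import combinations

def country_edges(trial_countries):
    """Map each unordered country pair to the number of trials they share."""
    edges = Counter()
    for countries in trial_countries.values():
        # sorted(set(...)) removes repeats and gives each pair a stable order.
        for a, b in combinations(sorted(set(countries)), 2):
            edges[(a, b)] += 1
    return edges

# Hypothetical trials and participating countries.
trial_countries = {
    "trial-1": ["Germany", "France", "Spain"],
    "trial-2": ["France", "Germany"],
    "trial-3": ["Spain"],
}
edges = country_edges(trial_countries)
for (a, b), weight in sorted(edges.items()):
    print(a, b, weight)
```

    Single-country trials contribute no edges, which matches the focus on multinational trials.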

  16. Big Data Analytics for Clinical Research Market Research Report 2033

    • growthmarketreports.com
    csv, pdf, pptx
    Updated Jun 30, 2025
    Cite
    Growth Market Reports (2025). Big Data Analytics for Clinical Research Market Research Report 2033 [Dataset]. https://growthmarketreports.com/report/big-data-analytics-for-clinical-research-market-global-industry-analysis
    Explore at:
    Available download formats: pdf, csv, pptx
    Dataset updated
    Jun 30, 2025
    Dataset authored and provided by
    Growth Market Reports
    Time period covered
    2024 - 2032
    Area covered
    Global
    Description

    Big Data Analytics for Clinical Research Market Outlook

    As per our latest research, the Big Data Analytics for Clinical Research market size reached USD 7.45 billion globally in 2024, reflecting a robust adoption pace driven by the increasing digitization of healthcare and clinical trial processes. The market is forecasted to grow at a CAGR of 17.2% from 2025 to 2033, reaching an estimated USD 25.54 billion by 2033. This significant growth is primarily attributed to the rising need for real-time data-driven decision-making, the proliferation of electronic health records (EHRs), and the growing emphasis on precision medicine and personalized healthcare solutions. The industry is experiencing rapid technological advancements, making big data analytics a cornerstone in transforming clinical research methodologies and outcomes.
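
    For readers who want to relate the quoted growth rate to the quoted start and end values, the compound annual growth rate is defined as (end / start)^(1/n) - 1 over n years. The sketch below applies that standard formula to the figures in the paragraph above. The number of compounding periods is an assumption: the text does not say whether the report counts from the 2024 base value (9 periods) or from 2025 (8 periods), so both are shown.

```python
def cagr(start_value, end_value, years):
    """Compound annual growth rate over the given number of years."""
    return (end_value / start_value) ** (1.0 / years) - 1.0

def project(start_value, rate, years):
    """Value after compounding start_value at a fixed annual rate."""
    return start_value * (1.0 + rate) ** years

START_2024 = 7.45   # USD billion, as quoted
END_2033 = 25.54    # USD billion, as quoted
QUOTED_RATE = 0.172

for periods in (9, 8):  # assumption: 2024 base year vs. 2025 base year
    implied = cagr(START_2024, END_2033, periods)
    projected = project(START_2024, QUOTED_RATE, periods)
    print(f"{periods} periods: implied CAGR {implied:.2%}, "
          f"projection at quoted rate {projected:.2f}")
```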

    Several key growth factors are propelling the expansion of the Big Data Analytics for Clinical Research market. One of the primary drivers is the exponential increase in clinical data volumes from diverse sources, including EHRs, wearable devices, genomics, and imaging. Healthcare providers and research organizations are leveraging big data analytics to extract actionable insights from these massive datasets, accelerating drug discovery, optimizing clinical trial design, and improving patient outcomes. The integration of artificial intelligence (AI) and machine learning (ML) algorithms with big data platforms has further enhanced the ability to identify patterns, predict patient responses, and streamline the entire research process. These technological advancements are reducing the time and cost associated with clinical research, making it more efficient and effective.

    Another significant factor fueling market growth is the increasing collaboration between pharmaceutical & biotechnology companies and technology firms. These partnerships are fostering the development of advanced analytics solutions tailored specifically for clinical research applications. The demand for real-world evidence (RWE) and real-time patient monitoring is rising, particularly in the context of post-market surveillance and regulatory compliance. Big data analytics is enabling stakeholders to gain deeper insights into patient populations, treatment efficacy, and adverse event patterns, thereby supporting evidence-based decision-making. Furthermore, the shift towards decentralized and virtual clinical trials is creating new opportunities for leveraging big data to monitor patient engagement, adherence, and safety remotely.

    The regulatory landscape is also evolving to accommodate the growing use of big data analytics in clinical research. Regulatory agencies such as the FDA and EMA are increasingly recognizing the value of data-driven approaches for enhancing the reliability and transparency of clinical trials. This has led to the establishment of guidelines and frameworks that encourage the adoption of big data technologies while ensuring data privacy and security. However, the implementation of stringent data protection regulations, such as GDPR and HIPAA, poses challenges related to data integration, interoperability, and compliance. Despite these challenges, the overall outlook for the Big Data Analytics for Clinical Research market remains highly positive, with sustained investments in digital health infrastructure and analytics capabilities.

    From a regional perspective, North America currently dominates the Big Data Analytics for Clinical Research market, accounting for the largest share due to its advanced healthcare infrastructure, high adoption of digital technologies, and strong presence of leading pharmaceutical companies. Europe follows closely, driven by increasing government initiatives to promote health data interoperability and research collaborations. The Asia Pacific region is emerging as a high-growth market, supported by expanding healthcare IT investments, rising clinical trial activities, and growing awareness of data-driven healthcare solutions. Latin America and the Middle East & Africa are also witnessing gradual adoption, albeit at a slower pace, due to infrastructural and regulatory challenges. Overall, the global market is poised for substantial growth across all major regions over the forecast period.

  17. Clinical Trials Database (CTD)

    • open.canada.ca
    html, json, xml
    Updated Dec 9, 2024
    + more versions
    Cite
    Health Canada (2024). Clinical Trials Database (CTD) [Dataset]. https://open.canada.ca/data/en/dataset/d6fe4b32-2eaf-4ac0-9e35-b3841f25e3a7
    Explore at:
    Available download formats: xml, json, html
    Dataset updated
    Dec 9, 2024
    Dataset provided by
    Health Canada (http://www.hc-sc.gc.ca/)
    License

    Open Government Licence - Canada 2.0 (https://open.canada.ca/en/open-government-licence-canada)
    License information was derived automatically

    Description

    Health Canada's Clinical Trials Database is a listing of information about phase I, II and III clinical trials in patients. The database is managed by Health Canada and provides a source of information about Canadian clinical trials involving human pharmaceutical and biological drugs. Additional information on Health Canada’s CTD is available at: https://www.canada.ca/en/health-canada/services/drugs-health-products/drug-products/health-canada-clinical-trials-database/frequently-asked-questions.html

  18. Patient-relevance of outcome measures in breast cancer clinical trials -...

    • datarepository.eur.nl
    pdf
    Updated Mar 27, 2025
    Cite
    Diana Delnoij; Jasmijn Plooij (2025). Patient-relevance of outcome measures in breast cancer clinical trials - Data Files [Dataset]. http://doi.org/10.25397/eur.28314236.v1
    Explore at:
    Available download formats: pdf
    Dataset updated
    Mar 27, 2025
    Dataset provided by
    Erasmus University Rotterdam (EUR)
    Authors
    Diana Delnoij; Jasmijn Plooij
    License

    Attribution-ShareAlike 4.0 (CC BY-SA 4.0) (https://creativecommons.org/licenses/by-sa/4.0/)
    License information was derived automatically

    Description

    The primary research question for which these data were used was: ‘How patient-relevant are outcomes measured in clinical trials for breast cancer drugs?’. Subquestions were: 1. Which treatment outcomes are relevant for breast cancer patients? 2. Which outcome measures are used in clinical trials for breast cancer drugs? 3. How much overlap is there between patient-relevant outcomes and outcomes measured in clinical trials? The dataset was used to answer subquestion 2. Data were obtained by searching ClinicalTrials.gov for trials conducted between January 2014 and March 2024 inclusive. Further inclusion criteria were that studies had to be phase III trials and had to focus on breast cancer, adults (18-64 years old) and drugs. Interventions focusing on lifestyle changes, Chinese medicine, anaesthesia, surgery and diagnostic methods were excluded. Ultimately, 264 trials were included and 45 excluded. To determine the outcome measures used, the study plan of every included trial was reviewed and recorded on the data sheet.
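
    The selection criteria described above can be written as a single predicate over trial records. This is only a sketch of the stated rules, not the authors' procedure: every field name (`start_date`, `phase`, `condition`, `age_group`, `intervention_type`, `focus`) and both sample records are hypothetical, and using the trial start date for the January 2014 to March 2024 window is an assumption.

```python
from datetime import date

WINDOW_START = date(2014, 1, 1)
WINDOW_END = date(2024, 3, 31)
EXCLUDED_FOCUS = {
    "lifestyle changes", "Chinese medicine", "anaesthesia",
    "surgery", "diagnostic methods",
}

def is_included(trial):
    """Apply the stated inclusion and exclusion criteria to one trial record."""
    return (
        WINDOW_START <= trial["start_date"] <= WINDOW_END
        and trial["phase"] == "Phase 3"
        and trial["condition"] == "breast cancer"
        and trial["age_group"] == "adult"
        and trial["intervention_type"] == "drug"
        and trial["focus"] not in EXCLUDED_FOCUS
    )

# Hypothetical records: the second fails on both date and excluded focus.
trials = [
    {"id": "example-1", "start_date": date(2019, 5, 1), "phase": "Phase 3",
     "condition": "breast cancer", "age_group": "adult",
     "intervention_type": "drug", "focus": "drug efficacy"},
    {"id": "example-2", "start_date": date(2012, 2, 1), "phase": "Phase 3",
     "condition": "breast cancer", "age_group": "adult",
     "intervention_type": "drug", "focus": "surgery"},
]
included = [t["id"] for t in trials if is_included(t)]
excluded = [t["id"] for t in trials if not is_included(t)]
print(included, excluded)
```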

  19. NIDA Data Share

    • neuinfo.org
    • dknet.org
    • +1more
    Updated Jan 29, 2022
    Cite
    (2022). NIDA Data Share [Dataset]. http://identifiers.org/RRID:SCR_002002
    Explore at:
    Dataset updated
    Jan 29, 2022
    Description

    Website that allows data from completed clinical trials to be distributed to investigators and the public. Researchers can download de-identified data from completed NIDA clinical trial studies to conduct analyses that improve the quality of drug abuse treatment. Incorporates data from the Division of Therapeutics and Medical Consequences and the Center for Clinical Trials Network.

  20. Raw Data for IntoValue Dataset

    • zenodo.org
    zip
    Updated Feb 2, 2023
    Cite
    Maia Salholz-Hillel; Delwen Franzen; Benjamin Gregory Carlisle; Nico Riedel (2023). Raw Data for IntoValue Dataset [Dataset]. http://doi.org/10.5281/zenodo.7590083
    Explore at:
    Available download formats: zip
    Dataset updated
    Feb 2, 2023
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Maia Salholz-Hillel; Delwen Franzen; Benjamin Gregory Carlisle; Nico Riedel
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    This data deposit includes large raw data used for the "IntoValue" dataset, which underlies several projects at the QUEST Center for Responsible Research in the Berlin Institute of Health (BIH) @ Charité. An initial version of the IntoValue dataset is available in Zenodo: https://doi.org/10.5281/zenodo.5141342. Based on this initial version, the dataset is actively developed and maintained in GitHub: https://github.com/maia-sh/intovalue-data. This Zenodo deposit serves to store large raw data files for individual trials, which are used in that GitHub repository. These data are deposited for computational reproducibility and documentation; they are not intended to be used for additional projects and do not reflect the most current/accurate data available from each source.

    This deposit contains raw data from the following sources:

    PubMed (pubmed.zip): PubMed XML files are provided courtesy of the U.S. National Library of Medicine and were accessed via the Entrez Programming Utilities (E-utilities) API. The files were downloaded on 2021-08-15 and do not reflect the most current/accurate data available from NLM. The following scripts were used to download and create these files: get-pubmed.R; download-pubmed.R.

    German Clinical Trials Registry (DRKS) (drks.zip): DRKS does not provide an API and was web-scraped on 2022-11-01. The following scripts were used to download and create these XML files: get-drks.R; drks-functions.R.

    ClinicalTrials.gov (ctgov.zip): ClinicalTrials.gov was accessed via the Clinical Trials Transformation Initiative (CTTI) Aggregate Content of ClinicalTrials.gov (AACT) via its PostgreSQL database API. The API was queried and CSV files were generated on 2022-11-01. The following scripts were used to download and create these files: get-process-aact.R.

    ClinicalTrials.gov 2018 (ctgov_2018.zip): Additional trial data for 2018. ClinicalTrials.gov was accessed via the Clinical Trials Transformation Initiative (CTTI) Aggregate Content of ClinicalTrials.gov (AACT) via its PostgreSQL database API. The API was queried and CSV files were generated on 2022-11-01. The following scripts were used to download and create these files: get-process-aact.R.
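
    To illustrate the kind of E-utilities request behind the PubMed XML files listed above, the sketch below builds an EFetch URL for a batch of PubMed IDs. It is not the authors' code, which is written in R (get-pubmed.R; download-pubmed.R); the two PMIDs are placeholders, and the sketch only constructs the URL without sending a request.

```python
from urllib.parse import urlencode

EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"

def efetch_url(pmids, api_key=None):
    """Build an EFetch URL that requests PubMed records as XML."""
    params = {
        "db": "pubmed",
        "id": ",".join(str(pmid) for pmid in pmids),
        "retmode": "xml",
    }
    if api_key:
        # An NCBI API key is optional; it raises the allowed request rate.
        params["api_key"] = api_key
    return f"{EUTILS_BASE}/efetch.fcgi?{urlencode(params)}"

# Placeholder PMIDs, for illustration only.
url = efetch_url([12345678, 23456789])
print(url)
```

    Fetching the URL (for example with `urllib.request.urlopen`) would return the XML for those records as of the request date, which is why the deposit notes that its files reflect the state on 2021-08-15.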

Cite
John Snow Labs (2021). US Clinical Trials Data Package [Dataset]. https://www.johnsnowlabs.com/marketplace/us-clinical-trials-data-package/

US Clinical Trials Data Package

The Registry And Results Database;ClinicalTrials.gov Database;Clinical Studies Database;US Clinical Trials Of Human Participants Database;Development Of A Clinical Prediction Model

Explore at:
Available download formats: csv
Dataset updated
Jan 20, 2021
Dataset authored and provided by
John Snow Labs
Area covered
United States
Description

This data package contains datasets on clinical trials conducted in the United States. Diseases include cervical cancer, diabetes, acute respiratory infection as well as stress. This data package also includes clinical trials registry and results database.
