100+ datasets found
  1. US Clinical Trials Data Package

    • johnsnowlabs.com
    csv
    Updated Jan 20, 2021
    Cite
    John Snow Labs (2021). US Clinical Trials Data Package [Dataset]. https://www.johnsnowlabs.com/marketplace/us-clinical-trials-data-package/
    Dataset authored and provided by
    John Snow Labs
    Area covered
    United States
    Description

    This data package contains datasets on clinical trials conducted in the United States. Conditions covered include cervical cancer, diabetes, acute respiratory infection, and stress. The package also includes a clinical trials registry and results database.

  2. TREC 2022 Clinical Trials Dataset

    • catalog.data.gov
    • s.cnmilf.com
    Updated Sep 11, 2024
    Cite
    National Institute of Standards and Technology (2024). TREC 2022 Clinical Trials Dataset [Dataset]. https://catalog.data.gov/dataset/trec-2022-clinical-trials-dataset
    Dataset provided by
    National Institute of Standards and Technology (http://www.nist.gov/)
    Description

    The goal of the Clinical Trials track is to focus research on the clinical trials matching problem: given a free text summary of a patient health record, find suitable clinical trials for that patient.

  3. 2019 Clinical Trial Data Sharing Survey Results - Data

    • wellcome.figshare.com
    xlsx
    Updated Jan 14, 2020
    Cite
    Georgina Humphreys; George Merriott; Rachel Knowles; Ben Pierson; Paola Quattroni (2020). 2019 Clinical Trial Data Sharing Survey Results - Data [Dataset]. http://doi.org/10.6084/m9.figshare.11603295.v1
    Dataset provided by
    Wellcome Trust (https://wellcome.org/)
    Authors
    Georgina Humphreys; George Merriott; Rachel Knowles; Ben Pierson; Paola Quattroni
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The full anonymised dataset from our recent survey into the attitudes towards clinical trial data sharing. The invitation to participate was distributed to clinical trialists funded by the Wellcome Trust, Bill and Melinda Gates Foundation, Cancer Research UK, and UK Medical Research Council.

  4. Dataset of books series that contain Statistical design and analysis of...

    • workwithdata.com
    Updated Nov 25, 2024
    Cite
    Work With Data (2024). Dataset of books series that contain Statistical design and analysis of clinical trials : principles and methods [Dataset]. https://www.workwithdata.com/datasets/book-series?f=1&fcol0=j0-book&fop0=%3D&fval0=Statistical+design+and+analysis+of+clinical+trials+%3A+principles+and+methods&j=1&j0=books
    Dataset authored and provided by
    Work With Data
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset is about book series. It has 1 row and is filtered to the book Statistical design and analysis of clinical trials : principles and methods. It features 10 columns, including number of authors, number of books, earliest publication date, and latest publication date.

  5. Clean data from survey of statisticians on Adverse Event analysis practices...

    • figshare.com
    bin
    Updated May 31, 2023
    Cite
    Rachel Phillips; Victoria Cornelius (2023). Clean data from survey of statisticians on Adverse Event analysis practices in RCTs [Dataset]. http://doi.org/10.6084/m9.figshare.12436574.v1
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Rachel Phillips; Victoria Cornelius
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Dataset (Stata v15.1) containing responses from a survey of UK Clinical Research Collaboration registered clinical trial units (CTUs) and industry statisticians from both pharmaceutical companies and clinical research organisations (http://dx.doi.org/10.1136/bmjopen-2020-036875). Data are de-identified. The dataset contains descriptive variables describing participants' experience, as well as responses to questions on current adverse event analysis practices, awareness of specialist methods for adverse event analysis, and the priorities, concerns and barriers participants experience when analysing adverse event data.

  6. Interim Dataset for the DIssemination of REgistered COVID-19 Clinical Trials...

    • data.niaid.nih.gov
    Updated Jul 25, 2023
    Cite
    Pugh-Jones, Molly (2023). Interim Dataset for the DIssemination of REgistered COVID-19 Clinical Trials (DIRECCT) Study [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_4669936
    Dataset provided by
    Pugh-Jones, Molly
    DeVito, Nicholas J.
    Strech, Daniel
    Grabitz, Peter
    Salholz-Hillel, Maia
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The DIRECCT study is a multi-phase, living examination of clinical trial results dissemination throughout the COVID-19 pandemic.

    Currently the data for Phase 1 are available. Phase 1 of the project examined trials completed during the first six months of the pandemic (i.e., through 30 June 2020). Data were collected using a combination of automated and manual strategies; automated searches were performed on 30 June 2020, and manual searches were performed between 21 October 2020 and 18 January 2021.

    The data for the study are split into three datatables: trials, registrations, and results. The three datatables can be treated as relational and joined on the id column. Variables are documented in the data dictionary.
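
    As a rough illustration of the relational layout described above, the sketch below joins three tiny in-memory tables on a shared id key using only the Python standard library. The rows and the non-id column names (title, registry, pub_type) are hypothetical placeholders; the real variables are those listed in the study's data dictionary.

```python
from collections import defaultdict

# Hypothetical miniature versions of the three datatables. Only the shared
# "id" column comes from the description; every other field is made up.
trials = [
    {"id": "t1", "title": "Trial A"},
    {"id": "t2", "title": "Trial B"},
]
registrations = [
    {"id": "t1", "registry": "ClinicalTrials.gov"},
    {"id": "t1", "registry": "EudraCT"},
    {"id": "t2", "registry": "ISRCTN"},
]
results = [
    {"id": "t1", "pub_type": "preprint"},
]


def index_by_id(rows):
    """Group rows by their id value (one trial may have several rows)."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[row["id"]].append(row)
    return grouped


def join_on_id(trial_rows, registration_rows, result_rows):
    """Left-join registrations and results onto trials via the id column."""
    regs_by_id = index_by_id(registration_rows)
    results_by_id = index_by_id(result_rows)
    joined_rows = []
    for trial in trial_rows:
        joined_rows.append({
            **trial,
            "registrations": regs_by_id.get(trial["id"], []),
            "results": results_by_id.get(trial["id"], []),
        })
    return joined_rows


joined = join_on_id(trials, registrations, results)
for row in joined:
    print(row["id"], len(row["registrations"]), len(row["results"]))
```

    Keeping registrations and results as nested lists avoids duplicating trial rows when one trial has several registrations; a flat join would be an equally reasonable choice.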

    Data sources for trials and registrations include the International Clinical Trials Registry Platform (ICTRP) list of registered COVID-19 studies, and individual clinical trial registries; data from these sources were curated and cleaned through the COVID-19 TrialsTracker project (https://covid19.trialstracker.net/). Some of the trial data included in the dataset are provisional and have not been systematically quality controlled (e.g., data on interventions); this is noted in the data dictionary when applicable. Data sources for results include information on trial results located from our automated and manual searches in the COVID-19 Open Research Dataset (CORD-19), PubMed, EuropePMC, Google Scholar, Google, and registries.

    Additional information on the project is available at the project's OSF page: https://doi.org/10.17605/osf.io/5f8j2.

  7. Clinical Trials

    • kaggle.com
    Updated Nov 25, 2022
    Cite
    The Devastator (2022). Clinical Trials [Dataset]. https://www.kaggle.com/datasets/thedevastator/a-quick-overview-of-clinical-trials/discussion
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    The Devastator
    Description

    Clinical Trials

    Clinical trials over the years along with start / end dates, outcome and more.

    By Aero Data Lab [source]

    About this dataset

    This dataset contains information on clinical trials conducted by sponsors. Each row represents a clinical trial, and the columns represent various attributes of the trial, such as the National Clinical Trial Number, the sponsor of the trial, the title of the trial, and so on.

    The purpose of this dataset is to provide a bird's-eye view of the clinical trial landscape. By understanding which sponsors are conducting which trials and for what conditions, we can get a better sense of where research is headed and what new treatments may be on the horizon.

    How to use the dataset

    • NCT is a unique identifier for clinical trials. It stands for National Clinical Trial Number.
    • Sponsor is the organization that is funding the clinical trial.
    • Title is the name of the clinical trial.
    • Summary is a brief summary of the clinical trial.
    • Start Year is the year that the clinical trial started.
    • Start Month is the month that the clinical trial started.
    • Phase is the stage of development of the investigational drug or device, which can be one of four types: I, II, III, or IV.
    • Enrollment is the number of participants in the clinical trial.
    • Status is the status of enrollment in the study, which can be "Recruiting", "Not yet recruiting", "Active, not recruiting", "Completed", "Suspended", or "Terminated".
    • Condition indicates the medical condition(s) being studied in this particular NCT record.

    Research Ideas

    • Identify patterns in clinical trials to improve the development process
    • Understand how different sponsors fund clinical trials

    Acknowledgements

    By Aero Data Lab [source]

    License

    License: Dataset copyright by authors

    You are free to:
    • Share: copy and redistribute the material in any medium or format for any purpose, even commercially.
    • Adapt: remix, transform, and build upon the material for any purpose, even commercially.

    You must:
    • Give appropriate credit: provide a link to the license, and indicate if changes were made.
    • ShareAlike: distribute your contributions under the same license as the original.
    • Keep intact all notices that refer to this license, including copyright notices.

    Columns

    File: AERO-BirdsEye-Data.csv

    | Column name | Description                                                      |
    |:------------|:-----------------------------------------------------------------|
    | NCT         | National Clinical Trial number. (String)                         |
    | Sponsor     | Name of the sponsor conducting the clinical trial. (String)      |
    | Title       | Title of the clinical trial. (String)                            |
    | Summary     | Brief summary of the clinical trial. (String)                    |
    | Start_Year  | Year the clinical trial started. (Integer)                       |
    | Start_Month | Month the clinical trial started. (String)                       |
    | Phase       | Phase of the clinical trial. (String)                            |
    | Enrollment  | Number of participants enrolled in the clinical trial. (Integer) |
    | Status      | Status of the clinical trial. (String)                           |
    | Condition   | Condition being tested in the clinical trial. (String)           |
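
    To make the column layout concrete, here is a minimal sketch that parses rows with these column names and tallies trials per sponsor and enrollment per phase. It uses only the standard library and an in-memory sample; the three rows, including the sponsor names and the "Phase 2"/"Phase 3" spelling, are invented for illustration and may not match how values are encoded in the actual file.

```python
import csv
import io
from collections import Counter

# Made-up rows that reuse the documented column names. A real run would open
# AERO-BirdsEye-Data.csv instead of this in-memory sample.
SAMPLE_CSV = """NCT,Sponsor,Title,Summary,Start_Year,Start_Month,Phase,Enrollment,Status,Condition
NCT00000001,Sponsor A,Example trial one,Placeholder summary,2015,March,Phase 2,120,Completed,Diabetes
NCT00000002,Sponsor B,Example trial two,Placeholder summary,2018,July,Phase 3,300,Recruiting,Asthma
NCT00000003,Sponsor A,Example trial three,Placeholder summary,2019,May,Phase 3,80,Completed,Diabetes
"""


def load_trials(handle):
    """Read trial rows and convert the two integer columns."""
    rows = []
    for row in csv.DictReader(handle):
        row["Start_Year"] = int(row["Start_Year"])
        row["Enrollment"] = int(row["Enrollment"])
        rows.append(row)
    return rows


trial_rows = load_trials(io.StringIO(SAMPLE_CSV))

# Number of trials per sponsor.
trials_per_sponsor = Counter(row["Sponsor"] for row in trial_rows)

# Total enrollment per phase.
enrollment_per_phase = Counter()
for row in trial_rows:
    enrollment_per_phase[row["Phase"]] += row["Enrollment"]

print(trials_per_sponsor.most_common())
print(dict(enrollment_per_phase))
```

    If the real file has blank Start_Year or Enrollment cells, the int() conversions would need a guard; the sketch assumes every cell is populated.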

    Acknowledgements

    If you use this dataset in your research, please credit Aero Data Lab.

  8. Clinical Trial Data Visualization Market Report | Global Forecast From 2025...

    • dataintelo.com
    csv, pdf, pptx
    Updated Jan 7, 2025
    Cite
    Dataintelo (2025). Clinical Trial Data Visualization Market Report | Global Forecast From 2025 To 2033 [Dataset]. https://dataintelo.com/report/global-clinical-trial-data-visualization-market
    Dataset authored and provided by
    Dataintelo
    License

    https://dataintelo.com/privacy-and-policy

    Time period covered
    2024 - 2032
    Area covered
    Global
    Description

    Clinical Trial Data Visualization Market Outlook


    The global clinical trial data visualization market size is projected to grow from USD 0.75 billion in 2023 to USD 2.62 billion by 2032, reflecting a compound annual growth rate (CAGR) of 15.2% during the forecast period. This growth is driven by the increasing complexity of clinical trials, the need for enhanced data transparency, and the rising adoption of digital tools in the healthcare sector.



    One of the key drivers for the growth of the clinical trial data visualization market is the escalating complexity and volume of data generated during clinical trials. The pharmaceutical and biotechnology sectors are witnessing a surge in clinical trials, which demand sophisticated data management and visualization tools to make sense of the vast amounts of data collected. These tools enable researchers to identify patterns, trends, and outliers more efficiently, thereby accelerating the decision-making process and improving clinical trial outcomes.



    Another significant factor contributing to market growth is the increasing emphasis on data transparency and regulatory compliance. Regulatory bodies, such as the FDA and EMA, are mandating greater transparency in clinical trial data to ensure patient safety and data integrity. Data visualization tools facilitate the clear presentation of complex data, making it easier for regulatory bodies and stakeholders to review and approve clinical trial processes. This ensures that clinical trials are conducted in a more transparent and compliant manner, thus driving the adoption of these tools.



    The advent of advanced technologies, such as artificial intelligence (AI) and machine learning (ML), is also playing a crucial role in the growth of the clinical trial data visualization market. These technologies are being increasingly integrated into data visualization tools to enhance their capabilities. AI and ML algorithms can analyze large datasets quickly and provide insights that were previously unattainable. This not only improves the efficiency of clinical trials but also enhances the accuracy and reliability of the data being presented.



    As the clinical trial data visualization market continues to expand, the importance of Clinical Trial Data Security becomes increasingly paramount. With the vast amounts of data generated during trials, ensuring the confidentiality, integrity, and availability of this data is critical. Organizations must implement robust security measures to protect sensitive information from unauthorized access and breaches. This involves not only securing the data itself but also safeguarding the systems and networks that store and process this information. As regulatory bodies tighten their data protection requirements, companies are investing in advanced security technologies and practices to comply with these standards and maintain trust with stakeholders. The focus on Clinical Trial Data Security is not just about compliance; it is about ensuring the reliability and credibility of clinical trial outcomes, which ultimately impacts patient safety and the development of new therapies.



    Regionally, North America is expected to dominate the clinical trial data visualization market due to the presence of a large number of pharmaceutical and biotechnology companies, a well-established healthcare infrastructure, and a strong focus on research and development. Europe is also expected to witness significant growth, driven by the increasing adoption of digital technologies in clinical trials and supportive regulatory frameworks. The Asia Pacific region is poised to grow at the fastest rate, fueled by the expanding pharmaceutical industry, growing investments in healthcare technology, and an increasing number of clinical trials being conducted in countries like China and India.



    Component Analysis


    The clinical trial data visualization market is segmented into software and services based on components. The software segment is expected to hold the largest market share during the forecast period. This can be attributed to the increasing demand for advanced software solutions that offer real-time data analysis and visualization capabilities. These software tools are designed to handle large volumes of data and provide intuitive visual representations that facilitate better understanding and decision-making.



    Furthermore, the integration of AI and ML technologies into data visualization software is enhancing their capabilities, makin

  9. CK4Gen, High Utility Synthetic Survival Datasets

    • figshare.com
    zip
    Updated Nov 5, 2024
    Cite
    Nicholas Kuo (2024). CK4Gen, High Utility Synthetic Survival Datasets [Dataset]. http://doi.org/10.6084/m9.figshare.27611388.v1
    Dataset provided by
    figshare
    Authors
    Nicholas Kuo
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Overview:

    This repository provides high-utility synthetic survival datasets generated using the CK4Gen framework, optimised to retain critical clinical characteristics for use in research and educational settings. Each dataset is based on a carefully curated ground truth dataset, processed with standardised variable definitions and analytical approaches, ensuring a consistent baseline for survival analysis.

    Description:

    The repository includes synthetic versions of four widely utilised and publicly accessible survival analysis datasets, each anchored in foundational studies and aligned with established ground truth variations to support robust clinical research and training.

    • GBSG2: Based on Schumacher et al. [1]. The study evaluated the effects of hormonal treatment and chemotherapy duration in node-positive breast cancer patients, tracking recurrence-free and overall survival among 686 women over a median of 5 years. Our synthetic version is derived from a variation of the GBSG2 dataset available in the lifelines package [2], formatted to match the descriptions in Sauerbrei et al. [3], which we treat as the ground truth.
    • ACTG320: Based on Hammer et al. [4]. The study investigates the impact of adding the protease inhibitor indinavir to a standard two-drug regimen for HIV-1 treatment. The original clinical trial involved 1,151 patients with prior zidovudine exposure and low CD4 cell counts, tracking outcomes over a median follow-up of 38 weeks. Our synthetic dataset is derived from a variation of the ACTG320 dataset available in the sksurv package [5], which we treat as the ground truth dataset.
    • WHAS500: Based on Goldberg et al. [6]. The study follows 500 patients to investigate survival rates following acute myocardial infarction (MI), capturing a range of factors influencing MI incidence and outcomes. Our synthetic data replicates a ground truth variation from the sksurv package, which we treat as the ground truth dataset.
    • FLChain: Based on Dispenzieri et al. [7]. The study assesses the prognostic relevance of serum immunoglobulin free light chains (FLCs) for overall survival in a large cohort of 15,859 participants. Our synthetic version is based on a variation available in the sksurv package, which we treat as the ground truth dataset.

    Notes:

    Please find an in-depth discussion of these datasets, as well as their generation process, in our paper: https://arxiv.org/abs/2410.16872

    Kuo, et al. "CK4Gen: A Knowledge Distillation Framework for Generating High-Utility Synthetic Survival Datasets in Healthcare." arXiv preprint arXiv:2410.16872 (2024).

    References:

    [1] Schumacher, et al. "Randomized 2 x 2 trial evaluating hormonal treatment and the duration of chemotherapy in node-positive breast cancer patients. German breast cancer study group.", Journal of Clinical Oncology, 1994.
    [2] Davidson-Pilon. "lifelines: Survival Analysis in Python", Journal of Open Source Software, 2019.
    [3] Sauerbrei, et al. "Modelling the effects of standard prognostic factors in node-positive breast cancer", British Journal of Cancer, 1999.
    [4] Hammer, et al. "A controlled trial of two nucleoside analogues plus indinavir in persons with human immunodeficiency virus infection and CD4 cell counts of 200 per cubic millimeter or less", New England Journal of Medicine, 1997.
    [5] Pölsterl. "scikit-survival: A library for time-to-event analysis built on top of scikit-learn", Journal of Machine Learning Research, 2020.
    [6] Goldberg, et al. "Incidence and case fatality rates of acute myocardial infarction (1975–1984): the Worcester heart attack study", American Heart Journal, 1988.
    [7] Dispenzieri, et al. "Use of nonclonal serum immunoglobulin free light chains to predict overall survival in the general population", Mayo Clinic Proceedings, 2012.

  10. Clinical Data Management System Market Report | Global Forecast From 2025 To...

    • dataintelo.com
    csv, pdf, pptx
    Updated Jan 7, 2025
    Cite
    Dataintelo (2025). Clinical Data Management System Market Report | Global Forecast From 2025 To 2033 [Dataset]. https://dataintelo.com/report/global-clinical-data-management-system-market
    Dataset authored and provided by
    Dataintelo
    License

    https://dataintelo.com/privacy-and-policy

    Time period covered
    2024 - 2032
    Area covered
    Global
    Description

    Clinical Data Management System Market Outlook



    The global clinical data management system market size is projected to reach approximately USD 2.8 billion by 2032, up from USD 1.1 billion in 2023, reflecting a robust compound annual growth rate (CAGR) of around 11%. This significant growth is primarily driven by the increasing complexity of clinical trials and the need for efficient data management solutions across various sectors.
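
    As a quick sanity check on the growth figure quoted above, the standard compound annual growth rate formula can be applied to the two endpoint values. The sketch assumes nine compounding years (2023 to 2032); the report does not state its base year, so this is an assumption rather than the publisher's method.

```python
def cagr(start_value, end_value, years):
    """Compound annual growth rate: (end / start) ** (1 / years) - 1."""
    return (end_value / start_value) ** (1 / years) - 1


# Endpoint values in USD billions, taken from the paragraph above.
# The nine-year span (2023 to 2032) is an assumption.
rate = cagr(1.1, 2.8, 2032 - 2023)
print(f"{rate:.1%}")  # prints 10.9%
```

    Under that assumption the result is consistent with the "around 11%" stated in the description.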



    One of the primary growth factors for the clinical data management system market is the exponential increase in the volume and complexity of clinical trial data, necessitating advanced data management systems. The proliferation of personalized medicine and precision healthcare has led to an increase in the data points collected during clinical trials, making traditional methods of data management obsolete. Advanced clinical data management systems facilitate the efficient handling, storage, and analysis of this data, ensuring compliance with regulatory standards and enhancing the overall efficiency of clinical trials.



    Another pivotal growth driver is the substantial increase in research and development (R&D) activities within the pharmaceutical and biotechnology sectors. Companies are heavily investing in R&D to develop new drugs and therapies, leading to a surge in the number of clinical trials conducted globally. This surge has created a burgeoning demand for innovative and robust clinical data management solutions that can streamline trial processes and ensure data integrity. Furthermore, the growing trend of outsourcing clinical trials to contract research organizations (CROs) has amplified the need for standardized data management processes.



    The adoption of cloud-based solutions is also significantly contributing to market growth. Cloud-based clinical data management systems offer numerous advantages over traditional on-premises solutions, including scalability, cost-efficiency, and real-time data access. These benefits are particularly appealing to small and medium-sized enterprises (SMEs) and academic research institutes, which often operate with limited budgets. The increased reliance on remote monitoring and decentralized trials, accelerated by the COVID-19 pandemic, is further propelling the adoption of cloud-based solutions in the clinical data management system market.



    The increasing complexity of clinical trials and the need for efficient data management have led to the growing adoption of Clinical Trial Management Software. This software plays a pivotal role in streamlining the management of clinical trials by providing tools for planning, tracking, and managing clinical trial data. With features such as study planning, budget management, and regulatory compliance tracking, Clinical Trial Management Software enhances the efficiency of clinical trials and ensures the integrity of data. As the demand for more sophisticated data management solutions rises, the integration of such software becomes crucial for organizations aiming to optimize their clinical trial processes and outcomes.



    Regionally, North America dominates the clinical data management system market, driven by a well-established healthcare infrastructure, significant R&D investments, and the presence of major pharmaceutical and biotechnology companies. However, the Asia Pacific region is expected to witness the highest growth rate during the forecast period. The rising prevalence of chronic diseases, increasing clinical trial activities, and favorable government initiatives are fostering market growth in this region. The growing outsourcing of clinical trials to countries like India and China, due to cost advantages and a skilled workforce, is also a critical regional growth driver.



    Component Analysis



    The clinical data management system market is segmented into software and services, each playing a crucial role in the overall ecosystem. Software solutions dominate the market due to their ability to streamline data collection, processing, and analysis. These solutions offer various functionalities, including electronic data capture (EDC), clinical trial management systems (CTMS), and clinical data repositories. The increasing adoption of advanced analytics and artificial intelligence (AI) within these software solutions is further enhancing their capability to manage and interpret complex data sets, driving their demand.



    Services, on the other hand, encompass a wide range of offer

  11. Dataset: Clinical Trials

    • figshare.com
    application/x-gzip
    Updated Jan 31, 2023
    Cite
    SN SciGraph Team; Michele Pasin (2023). Dataset: Clinical Trials [Dataset]. http://doi.org/10.6084/m9.figshare.7376477.v4
    Dataset provided by
    SN SciGraph
    Authors
    SN SciGraph Team; Michele Pasin
    License

    Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
    License information was derived automatically

    Description

    The Clinical Trials dataset includes information about Clinical Trials mentioned in Springer Nature publications.

    See also: https://scigraph.springernature.com/explorer/datasets/data_at_a_glance/

    NOTE: this dataset was made available by Dimensions.ai.

    A clinical trial record usually includes information about dates, sponsor organizations, subjects, external identifiers and abstract when available.

    Version info:
    • http://scigraph.downloads.uberresearch.com/archives/current/TIMESTAMP.txt
    • http://scigraph.downloads.uberresearch.com/archives/current/LICENSE.txt

  12. Data from "Obstacles to the Reuse of Study Metadata in ClinicalTrials.gov"

    • figshare.com
    zip
    Updated Jun 1, 2023
    Cite
    Laura Miron; Rafael Gonçalves; Mark A. Musen (2023). Data from "Obstacles to the Reuse of Study Metadata in ClinicalTrials.gov" [Dataset]. http://doi.org/10.6084/m9.figshare.12743939.v2
    Dataset provided by
    figshare
    Authors
    Laura Miron; Rafael Gonçalves; Mark A. Musen
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This fileset provides supporting data and corpora for the empirical study described in: Laura Miron, Rafael S. Goncalves and Mark A. Musen. Obstacles to the Reuse of Metadata in ClinicalTrials.govDescription of filesOriginal data files:- AllPublicXml.zip contains the set of all public XML records in ClinicalTrials.gov (protocols and summary results information), on which all remaining analyses are based. Set contains 302,091 records downloaded on April 3, 2019.- public.xsd is the XML schema downloaded from ClinicalTrials.gov on April 3, 2019, used to validate records in AllPublicXML.BioPortal API Query Results- condition_matches.csv contains the results of querying the BioPortal API for all ontology terms that are an 'exact match' to each condition string scraped from the ClinicalTrials.gov XML. Columns={filename, condition, url, bioportal term, cuis, tuis}. - intervention_matches.csv contains BioPortal API query results for all interventions scraped from the ClinicalTrials.gov XML. 
Columns={filename, intervention, url, bioportal term, cuis, tuis}.

    Data Element Definitions:

    - supplementary_table_1.xlsx: Mapping of element names, element types, and whether elements are required in ClinicalTrials.gov data dictionaries, the ClinicalTrials.gov XML schema declaration for records (public.XSD), the Protocol Registration System (PRS), FDAAA801, and the WHO required data elements for clinical trial registrations.

    Column and value definitions:

    - CT.gov Data Dictionary Section: Section heading for a group of data elements in the ClinicalTrials.gov data dictionary (https://prsinfo.clinicaltrials.gov/definitions.html)
    - CT.gov Data Dictionary Element Name: Name of an element/field according to the ClinicalTrials.gov data dictionaries (https://prsinfo.clinicaltrials.gov/definitions.html) and (https://prsinfo.clinicaltrials.gov/expanded_access_definitions.html)
    - CT.gov Data Dictionary Element Type: "Data" if the element is a field for which the user provides a value, "Group Heading" if the element is a group heading for several sub-fields but is not in itself associated with a user-provided value.
    - Required for CT.gov for Interventional Records: "Required" if the element is required for interventional records according to the data dictionary, "CR" if the element is conditionally required, "Jan 2017" if the element is required for studies starting on or after January 18, 2017, the effective date of the FDAAA801 Final Rule, "-" if the element is not applicable to interventional records (only observational or expanded access)
    - Required for CT.gov for Observational Records: "Required" if the element is required for observational records according to the data dictionary, "CR" if the element is conditionally required, "Jan 2017" if the element is required for studies starting on or after January 18, 2017, the effective date of the FDAAA801 Final Rule, "-" if the element is not applicable to observational records (only interventional or expanded access)
    - Required in CT.gov for Expanded Access Records?: "Required" if the element is required for expanded access records according to the data dictionary, "CR" if the element is conditionally required, "Jan 2017" if the element is required for studies starting on or after January 18, 2017, the effective date of the FDAAA801 Final Rule, "-" if the element is not applicable to expanded access records (only interventional or observational)
    - CT.gov XSD Element Definition: Abbreviated xpath to the corresponding element in the ClinicalTrials.gov XSD (public.XSD). The full xpath includes 'clinical_study/' as a prefix to every element. (There is a single top-level element called "clinical_study" that contains all other elements.)
    - Required in XSD?: "Yes" if the element is required according to public.XSD, "No" if the element is optional, "-" if the element is not made public or included in the XSD
    - Type in XSD: "text" if the XSD type was "xs:string" or "textblock", name of enum given if type was enum, "integer" if type was "xs:integer" or "xs:integer" extended with the "type" attribute, "struct" if the type was a struct defined in the XSD
    - PRS Element Name: Name of the corresponding entry field in the PRS system
    - PRS Entry Type: Entry type in the PRS system. This column contains some free text explanations/observations
    - FDAAA801 Final Rule Field Name: Name of the corresponding required field in the FDAAA801 Final Rule (https://www.federalregister.gov/documents/2016/09/21/2016-22129/clinical-trials-registration-and-results-information-submission). This column contains many empty values where elements in ClinicalTrials.gov do not correspond to a field required by the FDA
    - WHO Field Name: Name of the corresponding field required by the WHO Trial Registration Data Set (v 1.3.1) (https://prsinfo.clinicaltrials.gov/trainTrainer/WHO-ICMJE-ClinTrialsgov-Cross-Ref.pdf)

    Analytical Results:

    - EC_human_review.csv contains the results of a manual review of a random sample of eligibility criteria from 400 CT.gov records. The table gives filename, criteria, and whether manual review determined the criteria to contain criteria for "multiple subgroups" of participants.
    - completeness.xlsx contains counts and percentages of interventional records missing fields required by FDAAA801 and its Final Rule.
    - industry_completeness.xlsx contains percentages of interventional records missing required fields, broken down by agency class of the trial's lead sponsor ("NIH", "US Fed", "Industry", or "Other"), and before and after the effective date of the Final Rule.
    - location_completeness.xlsx contains percentages of interventional records missing required fields, broken down by whether the record listed at least one location in the United States or only international locations (excluding trials with no listed location), and before and after the effective date of the Final Rule.

    Intermediate Results:

    - cache.zip contains pickle and csv files of pandas dataframes with values scraped from the XML records in AllPublicXML. Downloading these files greatly speeds up running analysis steps from Jupyter notebooks in our GitHub repository.
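The relationship between the abbreviated xpaths in the "CT.gov XSD Element Definition" column and the full xpaths can be illustrated with a minimal sketch. The XML record below is made up for illustration, and the helper names `full_xpath` and `find_values` are hypothetical; they are not taken from the dataset authors' repository.

```python
import xml.etree.ElementTree as ET

ROOT_TAG = "clinical_study"  # single top-level element named in the description


def full_xpath(abbreviated: str) -> str:
    """Prepend the 'clinical_study/' prefix to an abbreviated xpath."""
    return f"{ROOT_TAG}/{abbreviated}"


def find_values(root: ET.Element, abbreviated: str) -> list:
    """Return the text of every element matching an abbreviated xpath.

    ElementTree evaluates paths relative to the root element, so the
    abbreviated form (without 'clinical_study/') is passed directly.
    """
    return [el.text for el in root.findall(abbreviated)]


# Made-up record for illustration only; real records contain many more fields.
SAMPLE_RECORD = """
<clinical_study>
  <id_info>
    <nct_id>NCT00000000</nct_id>
  </id_info>
  <eligibility>
    <criteria>
      <textblock>Inclusion: adults aged 18 or older</textblock>
    </criteria>
  </eligibility>
</clinical_study>
"""

root = ET.fromstring(SAMPLE_RECORD)
print(full_xpath("id_info/nct_id"))
print(find_values(root, "id_info/nct_id"))
```

Because the root of a parsed record is already the `clinical_study` element, the abbreviated path is the one to pass to `findall`; the prefixed form is only needed when a tool expects paths from the document level.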

  13. Data from: Sharing of clinical trial data and results reporting practices...

    • data.niaid.nih.gov
    • zenodo.org
    • +1 more
    zip
    Updated Jul 25, 2019
    Cite
    Jennifer Miller; Joseph S. Ross; Marc Wilenzick; Michelle M. Mello (2019). Sharing of clinical trial data and results reporting practices among large pharmaceutical companies: cross sectional descriptive study and pilot of a tool to improve company practices [Dataset]. http://doi.org/10.5061/dryad.k81584t
    Available download formats: zip
    Dataset updated
    Jul 25, 2019
    Authors
    Jennifer Miller; Joseph S. Ross; Marc Wilenzick; Michelle M. Mello
    License

    https://spdx.org/licenses/CC0-1.0.html

    Description

    Objectives: To develop and pilot a tool to measure and improve pharmaceutical companies’ clinical trial data sharing policies and practices.

    Design: Cross sectional descriptive analysis.

    Setting: Large pharmaceutical companies with novel drugs approved by the US Food and Drug Administration in 2015.

    Data sources: Data sharing measures were adapted from 10 prominent data sharing guidelines from expert bodies and refined through a multi-stakeholder deliberative process engaging patients, industry, academics, regulators, and others. Data sharing practices and policies were assessed using data from ClinicalTrials.gov, Drugs@FDA, corporate websites, data sharing platforms and registries (eg, the Yale Open Data Access (YODA) Project and Clinical Study Data Request (CSDR)), and personal communication with drug companies.

    Main outcome measures: Company level, multicomponent measure of accessibility of participant level clinical trial data (eg, analysis ready dataset and metadata); drug and trial level measures of registration, results reporting, and publication; company level overall transparency rankings; and feasibility of the measures and ranking tool to improve company data sharing policies and practices.

    Results: Only 25% of large pharmaceutical companies fully met the data sharing measure. The median company data sharing score was 63% (interquartile range 58-85%). Given feedback and a chance to improve their policies to meet this measure, three companies made amendments, raising the percentage of companies in full compliance to 33% and the median company data sharing score to 80% (73-100%). The most common reasons companies did not initially satisfy the data sharing measure were failure to share data by the specified deadline (75%) and failure to report the number and outcome of their data requests. Across new drug applications, a median of 100% (interquartile range 91-100%) of trials in patients were registered, 65% (36-96%) reported results, 45% (30-84%) were published, and 95% (69-100%) were publicly available in some form by six months after FDA drug approval. When examining results on the drug level, less than half (42%) of reviewed drugs had results for all their new drug applications trials in patients publicly available in some form by six months after FDA approval.

    Conclusions: It was feasible to develop a tool to measure data sharing policies and practices among large companies and have an impact in improving company practices. Among large companies, 25% made participant level trial data accessible to external investigators for new drug approvals in accordance with the current study’s measures; this proportion improved to 33% after applying the ranking tool. Other measures of trial transparency were higher. Some companies, however, have substantial room for improvement on transparency and data sharing of clinical trials.

  14. EXTRACT-NOAC randomized clinical trial Dataset

    • data.niaid.nih.gov
    • zenodo.org
    Updated Feb 21, 2022
    Cite
    Ockerman, Anna (2022). EXTRACT-NOAC randomized clinical trial Dataset [Dataset]. https://data.niaid.nih.gov/resources?id=ZENODO_4621030
    Dataset updated
    Feb 21, 2022
    Dataset provided by
    Verhamme, Peter
    Ockerman, Anna
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This database concerns a randomized clinical trial, called the EXTRACT-NOAC trial. The title of the trial is 'Tranexamic acid and bleeding after dental extraction in patients treated with non-vitamin K oral anticoagulants'. The aim of the trial was to evaluate the efficacy of tranexamic acid mouthwash to reduce bleeding after dental extraction in patients on non-vitamin K oral anticoagulants.

  15. Data from: Compliance with mandatory reporting of clinical trial results on...

    • dataone.org
    • data.niaid.nih.gov
    • +3 more
    Updated Apr 14, 2025
    Cite
    Andrew P. Prayle; Matthew N. Hurley; Alan R. Smyth (2025). Compliance with mandatory reporting of clinical trial results on ClinicalTrials.gov: cross sectional study [Dataset]. http://doi.org/10.5061/dryad.j512f21p
    Dataset updated
    Apr 14, 2025
    Dataset provided by
    Dryad Digital Repository
    Authors
    Andrew P. Prayle; Matthew N. Hurley; Alan R. Smyth
    Time period covered
    Jan 1, 2012
    Description

    OBJECTIVE: To examine compliance with mandatory reporting of summary clinical trial results (within one year of completion of trial) on ClinicalTrials.gov for studies that fall under the recent Food and Drug Administration Amendments Act (FDAAA) legislation. DESIGN: Registry based study of clinical trial summaries. DATA SOURCES: ClinicalTrials.gov, searched on 19 January 2011, with cross referencing with Drugs@FDA to determine for which trials mandatory reporting was required within one year. SELECTION CRITERIA: Studies registered on ClinicalTrials.gov with US sites which completed between 1 January and 31 December 2009. MAIN OUTCOME MEASURE: Proportion of trials for which results had been reported. RESULTS: The ClinicalTrials.gov registry contained 83,579 entries for interventional trials, of which 5642 were completed within the timescale of interest. We identified trials as falling within the mandatory reporting rules if they were covered by the FDAAA (trials of a drug, device, or bio...

  16. Dataset for: Feasibility study to improve clinical trial transparency with...

    • data.niaid.nih.gov
    • zenodo.org
    Updated Jan 8, 2024
    Cite
    Salholz-Hillel, Maia (2024). Dataset for: Feasibility study to improve clinical trial transparency with individualized report cards at a large university medical center [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_10467053
    Dataset updated
    Jan 8, 2024
    Dataset provided by
    Salholz-Hillel, Maia
    Franzen, Delwen
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This deposit contains data associated with a feasibility study evaluating the use of individualized report cards to improve trial transparency at the Charité - Universitätsmedizin Berlin. It primarily includes large raw data files and other files compiled by, or used in, the project code repository: https://github.com/quest-bih/tv-ct-transparency/. These data are deposited for documentation and computational reproducibility; they do not reflect the most current or accurate data available from each source.

    The deposit contains:

    Survey data (survey-data.csv): Participant responses to an anonymous survey conducted to assess the usefulness of the report cards and infosheet. The survey was administered in LimeSurvey and hosted on a server at the QUEST Center for Responsible Research at the Berlin Institute of Health at Charité – Universitätsmedizin Berlin. Any information that could potentially identify participants, such as IP addresses and free-text fields (e.g., corrections, comments), was removed. This file serves as input for the analysis of the survey data.

  17. Replication Data for: Identifying unreported links between...

    • search.dataone.org
    Updated Nov 12, 2023
    Cite
    Liu, Shifeng (2023). Replication Data for: Identifying unreported links between ClinicalTrials.gov trial registrations and their published results [Dataset]. http://doi.org/10.7910/DVN/MEROWG
    Dataset updated
    Nov 12, 2023
    Dataset provided by
    Harvard Dataverse
    Authors
    Liu, Shifeng
    Description

    This is the dataset we used for training and evaluation in the paper "Identifying unreported links between ClinicalTrials.gov trial registrations and their published results". The dataset was collected on 29 September 2020. The corresponding code and the structure of the dataset can be found at https://github.com/evidence-surveillance/unreported_link_identidication. We highly recommend building your own datasets with more specific criteria for specific applications rather than directly applying this dataset. This dataset includes extracted clinical trial registrations, extracted PubMed articles' titles and abstracts, the automatically extracted links between clinical trials and PubMed articles (already shuffled), manually curated links, and the transformed data with corresponding vectorizers. It contains 27,280 automatically constructed links (used for training and evaluation) and 90 manually curated links (used for evaluation only). The first 27,280 links are the automatically constructed links and the last 90 links are the manually curated links.
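The ordering described above (27,280 automatically constructed links first, 90 manually curated links last) suggests a simple positional split. The following is a minimal sketch, assuming the links have already been loaded into a Python list in their stored order; `split_links` is a hypothetical helper and not part of the dataset's code repository, and how the links are loaded depends on the file layout documented there.

```python
N_AUTOMATIC = 27_280  # automatically constructed links (training and evaluation)
N_MANUAL = 90         # manually curated links (evaluation only)


def split_links(links, n_automatic=N_AUTOMATIC, n_manual=N_MANUAL):
    """Split an ordered list of links into (automatic, manual) parts.

    Assumes the automatically constructed links come first and the
    manually curated links come last, as stated in the description.
    """
    if len(links) != n_automatic + n_manual:
        raise ValueError(
            f"expected {n_automatic + n_manual} links, got {len(links)}"
        )
    return links[:n_automatic], links[n_automatic:]


# Placeholder data standing in for the real links.
placeholder_links = list(range(N_AUTOMATIC + N_MANUAL))
automatic, manual = split_links(placeholder_links)
print(len(automatic), len(manual))
```

The length check is there so that a differently filtered or truncated copy of the data fails loudly instead of silently mixing the manually curated links into the training portion.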

  18. Data from: A Cluster Randomized Controlled Trial of the Safe Public Spaces...

    • catalog.data.gov
    • icpsr.umich.edu
    Updated Mar 12, 2025
    + more versions
    Cite
    National Institute of Justice (2025). A Cluster Randomized Controlled Trial of the Safe Public Spaces in Schools Program, New York City, 2016-2018 [Dataset]. https://catalog.data.gov/dataset/a-cluster-randomized-controlled-trial-of-the-safe-public-spaces-in-schools-program-ne-2016-f67d7
    Dataset updated
    Mar 12, 2025
    Dataset provided by
    National Institute of Justice
    Area covered
    New York
    Description

    This study tests the efficacy of an intervention, Safe Public Spaces (SPS), focused on improving the safety of public spaces in schools, such as hallways, cafeterias, and stairwells. Twenty-four schools with middle grades in a large urban area were recruited for participation and were pair-matched and then assigned to either treatment or control. The study comprises four components: an implementation evaluation, a cost study, an impact study, and a community crime study.

    Community crime study: The community crime study used juvenile arrest data from the NYPD (New York Police Department). The data can be found at https://data.cityofnewyork.us/Public-Safety/NYPD-Arrests-Data-Historic-/8h9b-rp9u. Data include all arrests for juvenile crime during the life of the intervention. The 12 matched schools were identified and geo-mapped using Quantum GIS (QGIS) 3.8 software. Block groups in the 2010 US Census in which the schools reside, and neighboring block groups, were mapped into micro-areas. This resulted in 12 experimental school blocks and 11 control blocks in which the schools reside (two of the control schools existed in the same census block group). Additionally, neighboring blocks were geo-mapped into 70 experimental and 77 control adjacent block groups (see map). Finally, juvenile arrests were mapped into experimental and control areas. Using the ARIMA time-series method in the Stata 15 statistical software package, arrest data were analyzed to compare the change in juvenile arrests in the experimental and control sites.

    Cost study: For the cost study, information from the implementing organization (Engaging Schools) was combined with data from phone conversations and follow-up communications with staff in school sites to populate a Resource Cost Model. The Resource Cost Model Excel file will be provided for archiving. This file contains details on the staff time and materials allocated to the intervention, as well as the NYC prices in 2018 US dollars associated with each element. Prices were gathered from multiple sources, including actual NYC DOE data on salaries for position types for which these data were available and district salary schedules for the other staff types. Census data were used to calculate benefits.

    Impact evaluation: The impact evaluation was conducted using data from the Research Alliance for New York City Schools. Among the core functions of the Research Alliance is maintaining a unique archive of longitudinal data on NYC schools to support ongoing research. The Research Alliance builds and maintains an archive of longitudinal data about NYC schools. Their agreement with the New York City Department of Education (NYC DOE) outlines the data they receive, the process they use to obtain it, and the security measures to keep it safe.

    Implementation study: The implementation study comprises the baseline survey and observation data. Interview transcripts are not archived.

  19. Data - Predicting Clinical Trial Outcomes Using Drug Bioactivities

    • ieee-dataport.org
    Updated Dec 27, 2021
    Cite
    Prashanth Athri (2021). Data - Predicting Clinical Trial Outcomes Using Drug Bioactivities [Dataset]. https://ieee-dataport.org/documents/data-predicting-clinical-trial-outcomes-using-drug-bioactivities
    Dataset updated
    Dec 27, 2021
    Authors
    Prashanth Athri
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    duration of development

  20. Data from: Patient Consent to Publication and Data Sharing in Industry and...

    • datacatalog.hshsl.umaryland.edu
    Updated Mar 27, 2024
    Cite
    O'Mareen Spence; Richie Onwuchekwa Uba; Seongbin Shin; Peter Doshi (2024). Patient Consent to Publication and Data Sharing in Industry and NIH-Funded Clinical Trials [Dataset]. http://doi.org/10.5281/zenodo.1231072
    Dataset updated
    Mar 27, 2024
    Dataset provided by
    HS/HSL
    Authors
    O'Mareen Spence; Richie Onwuchekwa Uba; Seongbin Shin; Peter Doshi
    Time period covered
    Jan 1, 1983 - Dec 31, 2013
    Description

    Clinical trial participants are often motivated by the altruistic assumption that study results will contribute to medical knowledge. Additionally, the sharing of research data is rapidly developing into an ethical standard. An evaluation of 144 blank (sample) informed consent forms (ICFs) was undertaken to determine the extent to which clinical trial participants were apprised of researchers’ intent to publish results and share de-identified data, and of the overall benefit to medical knowledge. This dataset consists of 98 ICFs from industry-funded trials from the European Medicines Agency (EMA) and 46 ICFs from publicly-funded trials listed in the National Heart, Lung and Blood Institute (NHLBI) Biologic Specimen and Data Repository Information Coordinating Center (BioLINCC). The documents were reviewed to identify and extract stated or implied language on the following 5 aspects of each study: publication of results, sharing of de-identified data, data ownership, confidentiality of identifiable data, and whether the trial will produce knowledge that offers public benefit. Results indicate that investigators rarely disclose intent to share de-identified data or a commitment to publish. All ICFs are available via 2 zip files, one for the industry-funded trials and the other for the trials in BioLINCC. Also included is the study extraction sheet.

US Clinical Trials Data Package

The Registry And Results Database; ClinicalTrials.gov Database; Clinical Studies Database; US Clinical Trials Of Human Participants Database; Development Of A Clinical Prediction Model

