100+ datasets found

US Clinical Trials Data Package
johnsnowlabs.com
csv
Updated Jan 20, 2021
Cite
John Snow Labs (2021). US Clinical Trials Data Package [Dataset]. https://www.johnsnowlabs.com/marketplace/us-clinical-trials-data-package/
Available download formats: csv
Dataset updated
Jan 20, 2021
Dataset authored and provided by
John Snow Labs
Area covered
United States
Description
This data package contains datasets on clinical trials conducted in the United States. Diseases covered include cervical cancer, diabetes, acute respiratory infection, and stress. The package also includes a clinical trials registry and results database.
TREC 2022 Clinical Trials Dataset
catalog.data.gov
s.cnmilf.com
Updated Sep 11, 2024
Cite
National Institute of Standards and Technology (2024). TREC 2022 Clinical Trials Dataset [Dataset]. https://catalog.data.gov/dataset/trec-2022-clinical-trials-dataset
Dataset updated
Sep 11, 2024
Dataset provided by
National Institute of Standards and Technology (http://www.nist.gov/)
Description
The goal of the Clinical Trials track is to focus research on the clinical trials matching problem: given a free text summary of a patient health record, find suitable clinical trials for that patient.
Data (i.e., evidence) about evidence based medicine
figshare.com
search.datacite.org
png
Updated May 30, 2023
Cite
Jorge H Ramirez (2023). Data (i.e., evidence) about evidence based medicine [Dataset]. http://doi.org/10.6084/m9.figshare.1093997.v24
Available download formats: png
Unique identifier
https://doi.org/10.6084/m9.figshare.1093997.v24
Dataset updated
May 30, 2023
Dataset provided by
Figshare (http://figshare.com/)
Authors
Jorge H Ramirez
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Update (December 7, 2014): Evidence-based medicine (EBM) is not working for many reasons, for example:

1. Its foundations are incorrect (a paradox): the hierarchical levels of evidence are supported by opinions (i.e., the lowest strength of evidence according to EBM) instead of real data collected from different types of study designs (i.e., evidence). http://dx.doi.org/10.6084/m9.figshare.1122534

2. The effect of criminal practices by pharmaceutical companies is only possible because of the complicity of others: healthcare systems, professional associations, and governmental and academic institutions. Pharmaceutical companies also corrupt at the personal level: politicians and political parties are on their payroll, and medical professionals are seduced by different types of gifts in exchange for prescriptions (i.e., bribery). This very likely results in patients not receiving the proper treatment for their disease; many times there is no disease at all, and healthy persons who do not need pharmacological treatment of any kind are constantly misdiagnosed and treated with unnecessary drugs. Some medical professionals are converted into K.O.L.s (key opinion leaders), puppets who appear on stage to spread lies to their peers: persons supposedly trained to improve the well-being of others who now deceive on behalf of pharmaceutical companies. Probably the saddest thing is that many honest doctors are being misled by these lies, which are created by the rules of pharmaceutical marketing instead of scientific, medical, and ethical principles. The interpretation of EBM in this context was not anticipated by its creators.

“The main reason we take so many drugs is that drug companies don’t sell drugs, they sell lies about drugs.” (Peter C. Gøtzsche)

“doctors and their organisations should recognise that it is unethical to receive money that has been earned in part through crimes that have harmed those people whose interests doctors are expected to take care of. Many crimes would be impossible to carry out if doctors weren’t willing to participate in them.” (Peter C. Gøtzsche, The BMJ, 2012, "Big pharma often commits corporate crime, and this must be stopped")

Pending (Colombia): Health Promoter Entities (in Spanish: EPS, Empresas Promotoras de Salud).

Misinterpretations

New technologies or concepts are difficult to understand in the beginning, however simple they are; we need to get used to new tools aimed at improving our professional practice. Probably the best explanation is in these videos (credits to Antonio Villafaina for sharing them with me).

English: https://www.youtube.com/watch?v=pQHX-SjgQvQ&w=420&h=315
Spanish: https://www.youtube.com/watch?v=DApozQBrlhU&w=420&h=315

Hypothesis: hierarchical levels of evidence based medicine are wrong

Dear Editor,

I have data to support the hypothesis described in the title of this letter. Before rejecting the null hypothesis I would like to ask the following open question: Could you support with data that hierarchical levels of evidence based medicine are correct? (1,2)

Additional explanation to this question:
- Only respond to this question attaching publicly available raw data.
- Be aware that more than a question this is a challenge: I have data (i.e., evidence) which is contrary to classic (i.e., McMaster) or current (i.e., Oxford) hierarchical levels of evidence based medicine. An important part of this data (but not all) is publicly available.

References

Ramirez, Jorge H (2014): The EBM challenge. figshare. http://dx.doi.org/10.6084/m9.figshare.1135873

The EBM Challenge Day 1: No Answers. Competing interests: I endorse the principles of open data in human biomedical research. Read this letter on The BMJ (August 13, 2014): http://www.bmj.com/content/348/bmj.g3725/rr/762595 Re: Greenhalgh T, et al. Evidence based medicine: a movement in crisis? BMJ 2014; 348: g3725.

Fileset contents

Raw data: Excel archive with raw data, interactive figures, and PubMed search terms. A Google Spreadsheet is also available (URL below the article description).

Figure 1. Unadjusted (Fig 1A) and adjusted (Fig 1B) PubMed publication trends (01/01/1992 to 30/06/2014).
Figure 2. Adjusted PubMed publication trends (07/01/2008 to 29/06/2014).
Figure 3. Google search trends: Jan 2004 to Jun 2014 / 1-week periods.
Figure 4. PubMed publication trends (1962-2013): systematic reviews and meta-analysis, clinical trials, and observational studies.
Figure 5. Ramirez, Jorge H (2014): Infographics: Unpublished US phase 3 clinical trials (2002-2014) completed before Jan 2011 = 50.8%. figshare. http://dx.doi.org/10.6084/m9.figshare.1121675 Raw data: "13377 studies found for: Completed | Interventional Studies | Phase 3 | received from 01/01/2002 to 01/01/2014 | Worldwide". This database complies with the terms and conditions of ClinicalTrials.gov: http://clinicaltrials.gov/ct2/about-site/terms-conditions

Supplementary Figures (S1-S6). PubMed publication delay in the indexation processes does not explain the descending trends in the scientific output of evidence-based medicine.

Acknowledgments

I would like to acknowledge the following persons for providing valuable concepts in data visualization and infographics:

Maria Fernanda Ramírez. Professor of graphic design. Universidad del Valle. Cali, Colombia.

Lorena Franco. Graphic design student. Universidad del Valle. Cali, Colombia.

Related articles by this author (Jorge H. Ramírez)

Ramirez JH. Lack of transparency in clinical trials: a call for action. Colomb Med (Cali) 2013;44(4):243-6. URL: http://www.ncbi.nlm.nih.gov/pubmed/24892242

Ramirez JH. Re: Evidence based medicine is broken (17 June 2014). http://www.bmj.com/node/759181

Ramirez JH. Re: Global rules for global health: why we need an independent, impartial WHO (19 June 2014). http://www.bmj.com/node/759151

Ramirez JH. PubMed publication trends (1992 to 2014): evidence based medicine and clinical practice guidelines (04 July 2014). http://www.bmj.com/content/348/bmj.g3725/rr/759895

Recommended articles

Greenhalgh Trisha, Howick Jeremy, Maskrey Neal. Evidence based medicine: a movement in crisis? BMJ 2014; 348:g3725

Spence Des. Evidence based medicine is broken. BMJ 2014; 348:g22

Schünemann Holger J, Oxman Andrew D, Brozek Jan, Glasziou Paul, Jaeschke Roman, Vist Gunn E et al. Grading quality of evidence and strength of recommendations for diagnostic tests and strategies. BMJ 2008; 336:1106

Lau Joseph, Ioannidis John P A, Terrin Norma, Schmid Christopher H, Olkin Ingram. The case of the misleading funnel plot. BMJ 2006; 333:597

Moynihan R, Henry D, Moons KGM (2014) Using Evidence to Combat Overdiagnosis and Overtreatment: Evaluating Treatments, Tests, and Disease Definitions in the Time of Too Much. PLoS Med 11(7): e1001655. doi:10.1371/journal.pmed.1001655

Katz D. A holistic view of evidence based medicine. http://thehealthcareblog.com/blog/2014/05/02/a-holistic-view-of-evidence-based-medicine/
Data from: Sharing of clinical trial data and results reporting practices...
data.niaid.nih.gov
zenodo.org
+1 more
zip
Updated Jul 25, 2019
Cite
Jennifer Miller; Joseph S. Ross; Marc Wilenzick; Michelle M. Mello (2019). Sharing of clinical trial data and results reporting practices among large pharmaceutical companies: cross sectional descriptive study and pilot of a tool to improve company practices [Dataset]. http://doi.org/10.5061/dryad.k81584t
Available download formats: zip
Unique identifier
https://doi.org/10.5061/dryad.k81584t
Dataset updated
Jul 25, 2019
Authors
Jennifer Miller; Joseph S. Ross; Marc Wilenzick; Michelle M. Mello
License
https://spdx.org/licenses/CC0-1.0.html
Description
Objectives: To develop and pilot a tool to measure and improve pharmaceutical companies’ clinical trial data sharing policies and practices.

Design: Cross sectional descriptive analysis.

Setting: Large pharmaceutical companies with novel drugs approved by the US Food and Drug Administration in 2015.

Data sources: Data sharing measures were adapted from 10 prominent data sharing guidelines from expert bodies and refined through a multi-stakeholder deliberative process engaging patients, industry, academics, regulators, and others. Data sharing practices and policies were assessed using data from ClinicalTrials.gov, Drugs@FDA, corporate websites, data sharing platforms and registries (eg, the Yale Open Data Access (YODA) Project and Clinical Study Data Request (CSDR)), and personal communication with drug companies.

Main outcome measures: Company level, multicomponent measure of accessibility of participant level clinical trial data (eg, analysis ready dataset and metadata); drug and trial level measures of registration, results reporting, and publication; company level overall transparency rankings; and feasibility of the measures and ranking tool to improve company data sharing policies and practices.

Results: Only 25% of large pharmaceutical companies fully met the data sharing measure. The median company data sharing score was 63% (interquartile range 58-85%). Given feedback and a chance to improve their policies to meet this measure, three companies made amendments, raising the percentage of companies in full compliance to 33% and the median company data sharing score to 80% (73-100%). The most common reasons companies did not initially satisfy the data sharing measure were failure to share data by the specified deadline (75%) and failure to report the number and outcome of their data requests. Across new drug applications, a median of 100% (interquartile range 91-100%) of trials in patients were registered, 65% (36-96%) reported results, 45% (30-84%) were published, and 95% (69-100%) were publicly available in some form by six months after FDA drug approval. When examining results on the drug level, less than half (42%) of reviewed drugs had results for all their new drug applications trials in patients publicly available in some form by six months after FDA approval.

Conclusions: It was feasible to develop a tool to measure data sharing policies and practices among large companies and have an impact in improving company practices. Among large companies, 25% made participant level trial data accessible to external investigators for new drug approvals in accordance with the current study’s measures; this proportion improved to 33% after applying the ranking tool. Other measures of trial transparency were higher. Some companies, however, have substantial room for improvement on transparency and data sharing of clinical trials.
2019 Clinical Trial Data Sharing Survey Results - Data
wellcome.figshare.com
xlsx
Updated Jan 14, 2020
Cite
Georgina Humphreys; George Merriott; Rachel Knowles; Ben Pierson; Paola Quattroni (2020). 2019 Clinical Trial Data Sharing Survey Results - Data [Dataset]. http://doi.org/10.6084/m9.figshare.11603295.v1
Available download formats: xlsx
Unique identifier
https://doi.org/10.6084/m9.figshare.11603295.v1
Dataset updated
Jan 14, 2020
Dataset provided by
Wellcome Trust (https://wellcome.org/)
Authors
Georgina Humphreys; George Merriott; Rachel Knowles; Ben Pierson; Paola Quattroni
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
The full anonymised dataset from our recent survey into the attitudes towards clinical trial data sharing. The invitation to participate was distributed to clinical trialists funded by the Wellcome Trust, Bill and Melinda Gates Foundation, Cancer Research UK, and UK Medical Research Council.
National Database for Clinical Trials Related to Mental Illness (NDCT)
catalog.data.gov
healthdata.gov
+2 more
Updated Jul 16, 2025
Cite
National Institutes of Health (NIH) (2025). National Database for Clinical Trials Related to Mental Illness (NDCT) [Dataset]. https://catalog.data.gov/dataset/national-database-for-clinical-trials-related-to-mental-illness-ndct
Dataset updated
Jul 16, 2025
Dataset provided by
National Institutes of Health (NIH)
Description
The National Database for Clinical Trials Related to Mental Illness (NDCT) is an extensible informatics platform for relevant data at all levels of biological and behavioral organization (molecules, genes, neural tissue, behavioral, social and environmental interactions) and for all data types (text, numeric, image, time series, etc.) related to clinical trials funded by the National Institute of Mental Health. Sharing data, associated tools, methodologies and results, rather than just summaries or interpretations, accelerates research progress. Community-wide sharing requires common data definitions and standards, as well as comprehensive and coherent informatics approaches for the sharing of de-identified human subject research data. Built on the National Database for Autism Research (NDAR) informatics platform, NDCT provides a comprehensive data sharing platform for NIMH grantees supporting clinical trials.
Dataset of books series that contain Statistical design and analysis of...
workwithdata.com
Updated Nov 25, 2024
Cite
Work With Data (2024). Dataset of books series that contain Statistical design and analysis of clinical trials : principles and methods [Dataset]. https://www.workwithdata.com/datasets/book-series?f=1&fcol0=j0-book&fop0=%3D&fval0=Statistical+design+and+analysis+of+clinical+trials+%3A+principles+and+methods&j=1&j0=books
Dataset updated
Nov 25, 2024
Dataset authored and provided by
Work With Data
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This dataset is about book series. It has 1 row and is filtered to the book Statistical design and analysis of clinical trials : principles and methods. It features 10 columns, including number of authors, number of books, earliest publication date, and latest publication date.
Clinical Trial Data Visualization Market Report | Global Forecast From 2025...
dataintelo.com
csv, pdf, pptx
Updated Jan 7, 2025
Cite
Dataintelo (2025). Clinical Trial Data Visualization Market Report | Global Forecast From 2025 To 2033 [Dataset]. https://dataintelo.com/report/global-clinical-trial-data-visualization-market
Available download formats: csv, pdf, pptx
Dataset updated
Jan 7, 2025
Dataset authored and provided by
Dataintelo
License
https://dataintelo.com/privacy-and-policy
Time period covered
2024 - 2032
Area covered
Global
Description
Clinical Trial Data Visualization Market Outlook

The global clinical trial data visualization market size is projected to grow from USD 0.75 billion in 2023 to USD 2.62 billion by 2032, reflecting a compound annual growth rate (CAGR) of 15.2% during the forecast period. This growth is driven by the increasing complexity of clinical trials, the need for enhanced data transparency, and the rising adoption of digital tools in the healthcare sector.
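
A compound annual growth rate follows from the start value, the end value, and the number of compounding years. As a rough check on figures like those above, here is a minimal Python sketch. The function name `cagr` is illustrative, and treating 2023 to 2032 as nine compounding periods is an assumption; the report may use a different base year, so the result need not equal the quoted 15.2%.

```python
def cagr(start_value: float, end_value: float, years: int) -> float:
    """Compound annual growth rate: (end / start) ** (1 / years) - 1."""
    if start_value <= 0 or years <= 0:
        raise ValueError("start_value and years must be positive")
    return (end_value / start_value) ** (1 / years) - 1


# Figures from the report summary: USD 0.75 billion (2023) to USD 2.62 billion (2032).
# Assumption: nine annual compounding periods between 2023 and 2032.
implied_rate = cagr(0.75, 2.62, 2032 - 2023)
print(f"Implied CAGR: {implied_rate:.1%}")
```

Changing the `years` argument shows how sensitive the implied rate is to the choice of base year.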

One of the key drivers for the growth of the clinical trial data visualization market is the escalating complexity and volume of data generated during clinical trials. The pharmaceutical and biotechnology sectors are witnessing a surge in clinical trials, which demand sophisticated data management and visualization tools to make sense of the vast amounts of data collected. These tools enable researchers to identify patterns, trends, and outliers more efficiently, thereby accelerating the decision-making process and improving clinical trial outcomes.

Another significant factor contributing to market growth is the increasing emphasis on data transparency and regulatory compliance. Regulatory bodies, such as the FDA and EMA, are mandating greater transparency in clinical trial data to ensure patient safety and data integrity. Data visualization tools facilitate the clear presentation of complex data, making it easier for regulatory bodies and stakeholders to review and approve clinical trial processes. This ensures that clinical trials are conducted in a more transparent and compliant manner, thus driving the adoption of these tools.

The advent of advanced technologies, such as artificial intelligence (AI) and machine learning (ML), is also playing a crucial role in the growth of the clinical trial data visualization market. These technologies are being increasingly integrated into data visualization tools to enhance their capabilities. AI and ML algorithms can analyze large datasets quickly and provide insights that were previously unattainable. This not only improves the efficiency of clinical trials but also enhances the accuracy and reliability of the data being presented.

As the clinical trial data visualization market continues to expand, the importance of Clinical Trial Data Security becomes increasingly paramount. With the vast amounts of data generated during trials, ensuring the confidentiality, integrity, and availability of this data is critical. Organizations must implement robust security measures to protect sensitive information from unauthorized access and breaches. This involves not only securing the data itself but also safeguarding the systems and networks that store and process this information. As regulatory bodies tighten their data protection requirements, companies are investing in advanced security technologies and practices to comply with these standards and maintain trust with stakeholders. The focus on Clinical Trial Data Security is not just about compliance; it is about ensuring the reliability and credibility of clinical trial outcomes, which ultimately impacts patient safety and the development of new therapies.

Regionally, North America is expected to dominate the clinical trial data visualization market due to the presence of a large number of pharmaceutical and biotechnology companies, a well-established healthcare infrastructure, and a strong focus on research and development. Europe is also expected to witness significant growth, driven by the increasing adoption of digital technologies in clinical trials and supportive regulatory frameworks. The Asia Pacific region is poised to grow at the fastest rate, fueled by the expanding pharmaceutical industry, growing investments in healthcare technology, and an increasing number of clinical trials being conducted in countries like China and India.

Component Analysis

The clinical trial data visualization market is segmented into software and services based on components. The software segment is expected to hold the largest market share during the forecast period. This can be attributed to the increasing demand for advanced software solutions that offer real-time data analysis and visualization capabilities. These software tools are designed to handle large volumes of data and provide intuitive visual representations that facilitate better understanding and decision-making.

Furthermore, the integration of AI and ML technologies into data visualization software is enhancing their capabilities, makin
Data from: Availability of Study Protocols for Randomized Trials Published...
datacatalog.hshsl.umaryland.edu
Updated Mar 27, 2024
Cite
Peter Doshi; O'Mareen Spence; Kyungwan Hong; Richie Onwuchekwa Uba (2024). Availability of Study Protocols for Randomized Trials Published in High-Impact Medical Journals: A Cross-Sectional Analysis [Dataset]. http://doi.org/10.5281/zenodo.1344634
Unique identifier
https://doi.org/10.5281/zenodo.1344634
Dataset updated
Mar 27, 2024
Dataset provided by
HS/HSL
Authors
Peter Doshi; O'Mareen Spence; Kyungwan Hong; Richie Onwuchekwa Uba
Description
To improve reporting transparency and research integrity, some journals have begun publishing study protocols and statistical analysis plans alongside trial publications. To determine the overall availability and characteristics of protocols and statistical analysis plans, this study reviewed all randomized clinical trials (RCTs) published in 2016 in the following five general medicine journals: Annals of Internal Medicine, BMJ, JAMA, Lancet, and NEJM. Characteristics of RCTs were extracted from the publication and clinical trial registry. A detailed assessment of protocols and statistical analysis plans was conducted in a 20% random sample of trials. The dataset contains extraction sheets (as SAS data files), code to calculate the values in the tables in the manuscript, and a supplemental file with additional notes on methods used in the study.
Final Dataset for the DIssemination of REgistered COVID-19 Clinical Trials...
data.niaid.nih.gov
zenodo.org
Updated Jul 11, 2024
Cite
DeVito, Nicholas J. (2024). Final Dataset for the DIssemination of REgistered COVID-19 Clinical Trials (DIRECCT) Study [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_8181414
Dataset updated
Jul 11, 2024
Dataset provided by
Salholz-Hillel, Maia
Schult, Tjada A.
Grabitz, Peter
Carlisle, Benjamin Gregory
Goldacre, Ben
Pugh-Jones, Molly
Hildebrand, Nicole
Schwietering, Johannes
Strech, Daniel
DeVito, Nicholas J.
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
The DIRECCT study is a multi-phase examination of clinical trial results dissemination during the COVID-19 pandemic.

Interim data for trials completed during the first six months of the pandemic (i.e., 1 January 2020 – 30 June 2020) were previously deposited at https://doi.org/10.5281/zenodo.4669936. This data deposit comprises the results of searches for trials completed during the first 18 months of the pandemic (i.e., 1 January 2020 – 30 June 2021). The data structure for the final phase of the project is not identical to that of the interim data, as it was substantially more complex. The data include data tables (CSVs) that can be treated as relational and joined on the id or trn columns. See datamodel.png for an overview of the data.
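
Because the tables share key columns, they can be combined with an ordinary join. The following is a minimal sketch using only the Python standard library. The two inline tables, their non-key columns (`registry`, `has_publication`), and all sample values are hypothetical stand-ins, not the deposit's actual files; only the `id` and `trn` key column names are taken from the description above.

```python
import csv
import io

# Hypothetical stand-ins for two of the deposit's CSV tables.
TRIALS_CSV = """id,trn,registry
t1,NCT00000001,ClinicalTrials.gov
t2,NCT00000002,ClinicalTrials.gov
"""
RESULTS_CSV = """id,has_publication
t1,TRUE
t3,FALSE
"""


def read_rows(text):
    """Parse CSV text into a list of row dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


def inner_join(left, right, key):
    """Inner join two lists of row dicts on a shared key column."""
    index = {}
    for row in right:
        index.setdefault(row[key], []).append(row)
    joined = []
    for row in left:
        for match in index.get(row[key], []):
            joined.append({**row, **match})
    return joined


joined_rows = inner_join(read_rows(TRIALS_CSV), read_rows(RESULTS_CSV), key="id")
for row in joined_rows:
    print(row)
```

Rows whose key appears in only one table are dropped by the inner join; the same helper could be called with `key="trn"` for tables that share that column instead.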

Details on data sources and methods for the creation and analysis of this dataset are available in a detailed protocol (Version 3.1, 19 July 2023): https://osf.io/w8t7r

Note: This repository will be updated with additional information including a codebook and archives of raw data.

Additional information on the project is available at the project's OSF page: https://doi.org/10.17605/osf.io/5f8j2.
Data from "Obstacles to the Reuse of Study Metadata in ClinicalTrials.gov"
figshare.com
zip
Updated Jun 1, 2023
Cite
Laura Miron; Rafael Gonçalves; Mark A. Musen (2023). Data from "Obstacles to the Reuse of Study Metadata in ClinicalTrials.gov" [Dataset]. http://doi.org/10.6084/m9.figshare.12743939.v2
Available download formats: zip
Unique identifier
https://doi.org/10.6084/m9.figshare.12743939.v2
Dataset updated
Jun 1, 2023
Dataset provided by
figshare
Authors
Laura Miron; Rafael Gonçalves; Mark A. Musen
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This fileset provides supporting data and corpora for the empirical study described in: Laura Miron, Rafael S. Goncalves and Mark A. Musen. Obstacles to the Reuse of Metadata in ClinicalTrials.gov

Description of files

Original data files:
- AllPublicXml.zip contains the set of all public XML records in ClinicalTrials.gov (protocols and summary results information), on which all remaining analyses are based. The set contains 302,091 records downloaded on April 3, 2019.
- public.xsd is the XML schema downloaded from ClinicalTrials.gov on April 3, 2019, used to validate records in AllPublicXML.

BioPortal API Query Results:
- condition_matches.csv contains the results of querying the BioPortal API for all ontology terms that are an 'exact match' to each condition string scraped from the ClinicalTrials.gov XML. Columns={filename, condition, url, bioportal term, cuis, tuis}.
- intervention_matches.csv contains BioPortal API query results for all interventions scraped from the ClinicalTrials.gov XML. Columns={filename, intervention, url, bioportal term, cuis, tuis}.

Data Element Definitions:
- supplementary_table_1.xlsx: Mapping of element names, element types, and whether elements are required in ClinicalTrials.gov data dictionaries, the ClinicalTrials.gov XML schema declaration for records (public.XSD), the Protocol Registration System (PRS), FDAAA801, and the WHO required data elements for clinical trial registrations.

Column and value definitions:
- CT.gov Data Dictionary Section: Section heading for a group of data elements in the ClinicalTrials.gov data dictionary (https://prsinfo.clinicaltrials.gov/definitions.html)
- CT.gov Data Dictionary Element Name: Name of an element/field according to the ClinicalTrials.gov data dictionaries (https://prsinfo.clinicaltrials.gov/definitions.html) and (https://prsinfo.clinicaltrials.gov/expanded_access_definitions.html)
- CT.gov Data Dictionary Element Type: "Data" if the element is a field for which the user provides a value, "Group Heading" if the element is a group heading for several sub-fields, but is not in itself associated with a user-provided value.
- Required for CT.gov for Interventional Records: "Required" if the element is required for interventional records according to the data dictionary, "CR" if the element is conditionally required, "Jan 2017" if the element is required for studies starting on or after January 18, 2017, the effective date of the FDAAA801 Final Rule, "-" if the element is not applicable to interventional records (only observational or expanded access)
- Required for CT.gov for Observational Records: "Required" if the element is required for observational records according to the data dictionary, "CR" if the element is conditionally required, "Jan 2017" if the element is required for studies starting on or after January 18, 2017, the effective date of the FDAAA801 Final Rule, "-" if the element is not applicable to observational records (only interventional or expanded access)
- Required in CT.gov for Expanded Access Records?: "Required" if the element is required for expanded access records according to the data dictionary, "CR" if the element is conditionally required, "Jan 2017" if the element is required for studies starting on or after January 18, 2017, the effective date of the FDAAA801 Final Rule, "-" if the element is not applicable to expanded access records (only interventional or observational)
- CT.gov XSD Element Definition: abbreviated xpath to the corresponding element in the ClinicalTrials.gov XSD (public.XSD). The full xpath includes 'clinical_study/' as a prefix to every element. (There is a single top-level element called "clinical_study" for all other elements.)
- Required in XSD?: "Yes" if the element is required according to public.XSD, "No" if the element is optional, "-" if the element is not made public or included in the XSD
- Type in XSD: "text" if the XSD type was "xs:string" or "textblock", name of enum given if type was enum, "integer" if type was "xs:integer" or "xs:integer" extended with the "type" attribute, "struct" if the type was a struct defined in the XSD
- PRS Element Name: Name of the corresponding entry field in the PRS system
- PRS Entry Type: Entry type in the PRS system. This column contains some free text explanations/observations
- FDAAA801 Final Rule FIeld Name: Name of the corresponding required field in the FDAAA801 Final Rule (https://www.federalregister.gov/documents/2016/09/21/2016-22129/clinical-trials-registration-and-results-information-submission). This column contains many empty values where elements in ClinicalTrials.gov do not correspond to a field required by the FDA
- WHO Field Name: Name of the corresponding field required by the WHO Trial Registration Data Set (v 1.3.1) (https://prsinfo.clinicaltrials.gov/trainTrainer/WHO-ICMJE-ClinTrialsgov-Cross-Ref.pdf)

Analytical Results:
- EC_human_review.csv contains the results of a manual review of a random sample of eligibility criteria from 400 CT.gov records. The table gives filename, criteria, and whether manual review determined the criteria to contain criteria for "multiple subgroups" of participants.
- completeness.xlsx contains counts and percentages of interventional records missing fields required by FDAAA801 and its Final Rule.
- industry_completeness.xlsx contains percentages of interventional records missing required fields, broken up by agency class of the trial's lead sponsor ("NIH", "US Fed", "Industry", or "Other"), and before and after the effective date of the Final Rule.
- location_completeness.xlsx contains percentages of interventional records missing required fields, broken up by whether the record listed at least one location in the United States or only international locations (excluding trials with no listed location), and before and after the effective date of the Final Rule.

Intermediate Results:
- cache.zip contains pickle and csv files of pandas dataframes with values scraped from the XML records in AllPublicXML. Downloading these files greatly speeds up running analysis steps from Jupyter notebooks in our GitHub repository.
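
The relationship between the abbreviated and full xpaths described for the XSD column can be illustrated with a short Python sketch. The sample record and its child elements (`brief_title`, `condition`) are hypothetical and heavily simplified, chosen only for illustration; only the top-level `clinical_study` element and the prefix rule come from the description above.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily simplified record; real records contain many more elements.
SAMPLE_RECORD = """<clinical_study>
  <brief_title>Example trial</brief_title>
  <condition>Diabetes</condition>
  <condition>Hypertension</condition>
</clinical_study>"""

ROOT_ELEMENT = "clinical_study"


def full_xpath(abbreviated: str) -> str:
    """Prefix an abbreviated xpath with the single top-level element."""
    return f"{ROOT_ELEMENT}/{abbreviated}"


def find_values(record_xml: str, abbreviated: str) -> list[str]:
    """Return the text of every element matching an abbreviated xpath."""
    root = ET.fromstring(record_xml)
    if root.tag != ROOT_ELEMENT:
        raise ValueError(f"unexpected root element: {root.tag}")
    # ElementTree paths are evaluated relative to the root element,
    # so the abbreviated form is what gets passed to findall().
    return [(el.text or "").strip() for el in root.findall(abbreviated)]


conditions = find_values(SAMPLE_RECORD, "condition")
print(full_xpath("condition"), conditions)
```

The abbreviated form works directly here because the parsed root already is the `clinical_study` element; tools that evaluate paths from the document level would need the full form instead.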
Clean data from survey of statisticians on Adverse Event analysis practices...
figshare.com
bin
Updated May 31, 2023
Cite
Rachel Phillips; Victoria Cornelius (2023). Clean data from survey of statisticians on Adverse Event analysis practices in RCTs [Dataset]. http://doi.org/10.6084/m9.figshare.12436574.v1
Available download formats: bin
Unique identifier
https://doi.org/10.6084/m9.figshare.12436574.v1
Dataset updated
May 31, 2023
Dataset provided by
Figshare (http://figshare.com/)
Authors
Rachel Phillips; Victoria Cornelius
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Dataset (Stata v15.1) containing responses from a survey of UK Clinical Research Collaboration registered clinical trial units (CTUs) and industry statisticians from both pharmaceutical companies and clinical research organisations (http://dx.doi.org/10.1136/bmjopen-2020-036875). Data are de-identified. The dataset contains descriptive variables on participants' experience, as well as responses to questions on current adverse event analysis practices, awareness of specialist methods for adverse event analysis, and the priorities, concerns and barriers participants experience when analysing adverse event data.
Data cleaning using unstructured data
zenodo.org
zip
Updated Jul 30, 2024
Cite
Rihem Nasfi; Antoon Bronselaer (2024). Data cleaning using unstructured data [Dataset]. http://doi.org/10.5281/zenodo.13135983
Available download formats: zip
Unique identifier
https://doi.org/10.5281/zenodo.13135983
Dataset updated
Jul 30, 2024
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Rihem Nasfi; Antoon Bronselaer
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
In this project, we work on repairing three datasets:

Trials design: This dataset was obtained from the European Union Drug Regulating Authorities Clinical Trials Database (EudraCT) register, and the ground truth was created from external registries. In the dataset, multiple countries, identified by the attribute country_protocol_code, conduct the same clinical trial, which is identified by eudract_number. Each clinical trial has a title that can help find informative details about the design of the trial.

Trials population: This dataset describes the populations of participants in clinical trials conducted primarily across European countries. It includes structured attributes indicating whether the trial pertains to a specific gender, age group or healthy volunteers. Each of these categories is labeled ('1') or ('0'), denoting whether or not it is included in the trial. It is important to note that the population category should remain consistent across all countries conducting the same clinical trial, identified by a eudract_number. The ground truth samples in the dataset were established by aligning information about the trial populations provided by external registries, specifically the CT.gov database and the German Trials database. Additionally, the dataset comprises unstructured attributes that describe the inclusion criteria for trial participants, such as inclusion.

Allergens: This dataset contains information about products and their allergens. The data was collected from the German version of 'Alnatura' (access date: 24 November 2020), from 'Open Food Facts', a free database of food products from around the world, and from the websites 'Migipedia', 'Piccantino', and 'Das Ist Drin'. There may be overlapping products across these websites. Each product in the dataset is identified by a unique code. Samples with the same code represent the same product but are extracted from a different source. The allergens are indicated by ('2') if present, ('1') if there are traces, and ('0') if absent in a product. The dataset also includes information on ingredients in the products. Overall, the dataset comprises categorical structured data describing the presence, trace, or absence of specific allergens, and unstructured text describing ingredients.
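
The consistency requirement described for the Trials population dataset (every country running the same eudract_number should report the same population flags) can be checked mechanically. The following is a minimal sketch and not the authors' repair method. Only `eudract_number` and `country_protocol_code` are names taken from the description; the flag columns `female`, `male` and `healthy_volunteers` and the sample rows are hypothetical placeholders.

```python
from collections import defaultdict

# Hypothetical flag column names; the real dataset may use different ones.
FLAGS = ("female", "male", "healthy_volunteers")


def inconsistent_trials(rows, key="eudract_number", flags=FLAGS):
    """Return {trial_id: {flag: sorted distinct values}} for every trial
    whose flag values differ between its per-country rows."""
    seen = defaultdict(lambda: defaultdict(set))
    for row in rows:
        for flag in flags:
            seen[row[key]][flag].add(row[flag])

    conflicts = {}
    for trial, per_flag in seen.items():
        bad = {f: sorted(v) for f, v in per_flag.items() if len(v) > 1}
        if bad:
            conflicts[trial] = bad
    return conflicts


# Made-up rows for illustration only (not taken from the dataset).
SAMPLE_ROWS = [
    {"eudract_number": "TRIAL-A", "country_protocol_code": "X1",
     "female": "1", "male": "1", "healthy_volunteers": "0"},
    {"eudract_number": "TRIAL-A", "country_protocol_code": "X2",
     "female": "1", "male": "0", "healthy_volunteers": "0"},
    {"eudract_number": "TRIAL-B", "country_protocol_code": "X1",
     "female": "1", "male": "1", "healthy_volunteers": "1"},
]

if __name__ == "__main__":
    print(inconsistent_trials(SAMPLE_ROWS))
```

Rows flagged this way are candidates for repair; which value is correct still has to come from the ground truth or from the unstructured text.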

N.B.: Each '.zip' file contains a set of 5 '.csv' files which are part of the aforementioned datasets:

"{dataset_name}_train.csv": samples used for the ML-model training. (e.g "allergens_train.csv")

"{dataset_name}_test.csv": samples used to test the ML-model performance. (e.g "allergens_test.csv")

"{dataset_name}_golden_standard.csv": samples represent the ground truth of the test samples. (e.g "allergens_golden_standard.csv")

"{dataset_name}_parker_train.csv": samples repaired using Parker Engine used for the ML-model training. (e.g "allergens_parker_train.csv")

"{dataset_name}_parker_test.csv": samples repaired using Parker Engine used to test the ML-model performance. (e.g "allergens_parker_test.csv")
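
The naming scheme above can be expressed as a small helper that lists the five expected files for a dataset and reports which are missing after unzipping. This is a sketch under the assumption that the fifth file follows the `_parker_test.csv` pattern shown in its example; the function names and the `root` directory are illustrative.

```python
from pathlib import Path

# The five file suffixes listed in the dataset description.
SPLITS = ("train", "test", "golden_standard", "parker_train", "parker_test")


def split_paths(dataset_name, root="."):
    """Map each split name to its expected CSV path under `root`."""
    return {
        split: Path(root) / f"{dataset_name}_{split}.csv"
        for split in SPLITS
    }


def missing_splits(dataset_name, root="."):
    """Return the split names whose CSV file is not present on disk."""
    return [
        split
        for split, path in split_paths(dataset_name, root).items()
        if not path.is_file()
    ]


if __name__ == "__main__":
    for split, path in split_paths("allergens").items():
        print(f"{split:16s} {path}")
```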
Clinical Trials
kaggle.com
Updated Nov 25, 2022
Cite
The Devastator (2022). Clinical Trials [Dataset]. https://www.kaggle.com/datasets/thedevastator/a-quick-overview-of-clinical-trials/discussion
Available download formats: Croissant (a format for machine-learning datasets; see mlcommons.org/croissant)
Dataset updated
Nov 25, 2022
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
The Devastator
Description
Clinical Trials

Clinical trials over the years along with start / end dates, outcome and more.

By Aero Data Lab [source]

About this dataset

This dataset contains information on clinical trials conducted by sponsors. Each row represents a clinical trial, and the columns represent various attributes of the trial, such as the National Clinical Trial Number, the sponsor of the trial, the title of the trial, and so on.

The purpose of this dataset is to provide a bird's-eye view of the clinical trial landscape. By understanding which sponsors are conducting which trials and for what conditions, we can get a better sense of where research is headed and what new treatments may be on the horizon.

How to use the dataset

NCT is a unique identifier for clinical trials. It stands for National Clinical Trial Number.

Sponsor is the organization that is funding the clinical trial.

Title is the name of the clinical trial.

Summary is a brief summary of the clinical trial.

Start Year is the year that the clinical trial started.

Start Month is the month that the clinical trial started.

Phase is the stage of development of the investigational drug or device, which can be one of four types: I, II, III, or IV.

Enrollment is the number of participants in the clinical trial.

Status is the status of enrollment in the study, which can be "Recruiting", "Not yet recruiting", "Active, not recruiting", "Completed", "Suspended", or "Terminated".

Condition indicates what medical condition(s) are being studied in this particular NCT record.

Research Ideas

Identify patterns in clinical trials to improve the development process

Understand how different sponsors fund clinical trials

Acknowledgements

By Aero Data Lab [source]

License

License: Dataset copyright by authors

You are free to:
- Share: copy and redistribute the material in any medium or format for any purpose, even commercially.
- Adapt: remix, transform, and build upon the material for any purpose, even commercially.

You must:
- Give appropriate credit: provide a link to the license, and indicate if changes were made.
- ShareAlike: distribute your contributions under the same license as the original.
- Keep intact: all notices that refer to this license, including copyright notices.

Columns

File: AERO-BirdsEye-Data.csv

| Column name | Description |
|:------------|:------------|
| NCT | National Clinical Trial number. (String) |
| Sponsor | Name of the sponsor conducting the clinical trial. (String) |
| Title | Title of the clinical trial. (String) |
| Summary | Brief summary of the clinical trial. (String) |
| Start_Year | Year the clinical trial started. (Integer) |
| Start_Month | Month the clinical trial started. (String) |
| Phase | Phase of the clinical trial. (String) |
| Enrollment | Number of participants enrolled in the clinical trial. (Integer) |
| Status | Status of the clinical trial. (String) |
| Condition | Condition being tested in the clinical trial. (String) |
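
As a rough illustration of how the columns listed for AERO-BirdsEye-Data.csv might be used, the sketch below counts trials per sponsor with the Python standard library. It assumes the file is a comma-separated CSV whose header matches the column names above; the sample rows are made-up placeholders and not records from the dataset.

```python
import csv
import io
from collections import Counter

# Column names as listed in the dataset description.
COLUMNS = [
    "NCT", "Sponsor", "Title", "Summary", "Start_Year",
    "Start_Month", "Phase", "Enrollment", "Status", "Condition",
]


def trials_per_sponsor(csv_file):
    """Count rows per Sponsor in an open CSV file object."""
    reader = csv.DictReader(csv_file)
    missing = set(COLUMNS) - set(reader.fieldnames or [])
    if missing:
        raise ValueError(f"missing expected columns: {sorted(missing)}")
    counts = Counter()
    for row in reader:
        counts[row["Sponsor"]] += 1
    return counts


# Placeholder rows for illustration only (not real trial records).
SAMPLE = (
    "NCT,Sponsor,Title,Summary,Start_Year,Start_Month,Phase,Enrollment,Status,Condition\n"
    "PLACEHOLDER-1,Sponsor A,Title 1,Summary 1,2010,Jan,Phase 1,10,Completed,Condition X\n"
    "PLACEHOLDER-2,Sponsor A,Title 2,Summary 2,2011,Feb,Phase 2,20,Recruiting,Condition Y\n"
    "PLACEHOLDER-3,Sponsor B,Title 3,Summary 3,2012,Mar,Phase 3,30,Completed,Condition X\n"
)

if __name__ == "__main__":
    print(trials_per_sponsor(io.StringIO(SAMPLE)).most_common())
```

For the real file, pass an open handle instead of the sample, for example `open("AERO-BirdsEye-Data.csv", newline="", encoding="utf-8")`; the encoding is an assumption.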

Acknowledgements

If you use this dataset in your research, please credit Aero Data Lab [source]
Data from: Clinical Research: A Globalized Network
figshare.com
txt
Updated Jun 1, 2023
Cite
Trevor Richter (2023). Clinical Research: A Globalized Network [Dataset]. http://doi.org/10.6084/m9.figshare.1246725.v3
Available download formats: txt
Unique identifier
https://doi.org/10.6084/m9.figshare.1246725.v3
Dataset updated
Jun 1, 2023
Dataset provided by
Figshare (http://figshare.com/)
Authors
Trevor Richter
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
These files relate to data extracted from ClinicalTrials.gov. In the database file, data for individual clinical trials are included, and attributes include study identifier, study type, trial dates, interventions, sample size, countries in which the study was conducted, etc. The Edges file contains geographic data derived from the clinical trials data set that can be used to generate networks to illustrate geographic connectivity through clinical research, using open access software such as Gephi. The Gephi file includes networks for all countries worldwide, as well as regional networks for each major geographic region. The figures are network diagrams generated by Gephi showing geographic connectivity among individual countries through common participation in multinational clinical trials. The thickness of the connecting lines (edges) reflects the strength of a connection.
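
To make the idea of an edges file concrete, the sketch below derives weighted country-to-country edges from a mapping of trials to the countries they were run in. It is an illustration only: the description does not say how connection strength was computed, so counting shared trials as the edge weight is an assumption, and the trial identifiers are placeholders. The `Source,Target,Weight` header follows the edge-table layout that Gephi's spreadsheet import accepts.

```python
import csv
import io
from collections import Counter
from itertools import combinations


def country_edges(trial_countries):
    """Count, for each unordered country pair, the trials both took part in."""
    edges = Counter()
    for countries in trial_countries.values():
        # Deduplicate and sort so each pair is always stored as (a, b), a < b.
        for a, b in combinations(sorted(set(countries)), 2):
            edges[(a, b)] += 1
    return edges


def edges_to_csv(edges):
    """Serialise edges as CSV text with Source, Target, Weight columns."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["Source", "Target", "Weight"])
    for (a, b), weight in sorted(edges.items()):
        writer.writerow([a, b, weight])
    return buffer.getvalue()


# Placeholder input: trial id -> participating countries.
SAMPLE_TRIALS = {
    "T1": ["Canada", "France", "Germany"],
    "T2": ["France", "Germany"],
    "T3": ["Canada"],  # single-country trials contribute no edges
}

if __name__ == "__main__":
    print(edges_to_csv(country_edges(SAMPLE_TRIALS)))
```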
Big Data Analytics for Clinical Research Market Research Report 2033
growthmarketreports.com
csv, pdf, pptx
Updated Jun 30, 2025
Cite
Growth Market Reports (2025). Big Data Analytics for Clinical Research Market Research Report 2033 [Dataset]. https://growthmarketreports.com/report/big-data-analytics-for-clinical-research-market-global-industry-analysis
Available download formats: pdf, csv, pptx
Dataset updated
Jun 30, 2025
Dataset authored and provided by
Growth Market Reports
Time period covered
2024 - 2032
Area covered
Global
Description
Big Data Analytics for Clinical Research Market Outlook

As per our latest research, the Big Data Analytics for Clinical Research market size reached USD 7.45 billion globally in 2024, reflecting a robust adoption pace driven by the increasing digitization of healthcare and clinical trial processes. The market is forecasted to grow at a CAGR of 17.2% from 2025 to 2033, reaching an estimated USD 25.54 billion by 2033. This significant growth is primarily attributed to the rising need for real-time data-driven decision-making, the proliferation of electronic health records (EHRs), and the growing emphasis on precision medicine and personalized healthcare solutions. The industry is experiencing rapid technological advancements, making big data analytics a cornerstone in transforming clinical research methodologies and outcomes.
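
Readers who want to cross-check a quoted growth rate against its start and end values can use the standard compound annual growth rate formula, CAGR = (end / start)^(1 / years) - 1. The sketch below is generic arithmetic, not the report's methodology; the number of compounding years depends on which year is taken as the base, which the text does not state, so it is left as a parameter.

```python
def cagr(start_value, end_value, years):
    """Compound annual growth rate implied by two values `years` apart."""
    if start_value <= 0 or years <= 0:
        raise ValueError("start_value and years must be positive")
    return (end_value / start_value) ** (1 / years) - 1


def project(start_value, rate, years):
    """Value reached after compounding `rate` annually for `years` years."""
    return start_value * (1 + rate) ** years


if __name__ == "__main__":
    # Figures quoted in the text; the choice of 8 or 9 compounding
    # years is an assumption about the base year.
    for years in (8, 9):
        print(years, f"{cagr(7.45, 25.54, years):.1%}")
```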

Several key growth factors are propelling the expansion of the Big Data Analytics for Clinical Research market. One of the primary drivers is the exponential increase in clinical data volumes from diverse sources, including EHRs, wearable devices, genomics, and imaging. Healthcare providers and research organizations are leveraging big data analytics to extract actionable insights from these massive datasets, accelerating drug discovery, optimizing clinical trial design, and improving patient outcomes. The integration of artificial intelligence (AI) and machine learning (ML) algorithms with big data platforms has further enhanced the ability to identify patterns, predict patient responses, and streamline the entire research process. These technological advancements are reducing the time and cost associated with clinical research, making it more efficient and effective.

Another significant factor fueling market growth is the increasing collaboration between pharmaceutical & biotechnology companies and technology firms. These partnerships are fostering the development of advanced analytics solutions tailored specifically for clinical research applications. The demand for real-world evidence (RWE) and real-time patient monitoring is rising, particularly in the context of post-market surveillance and regulatory compliance. Big data analytics is enabling stakeholders to gain deeper insights into patient populations, treatment efficacy, and adverse event patterns, thereby supporting evidence-based decision-making. Furthermore, the shift towards decentralized and virtual clinical trials is creating new opportunities for leveraging big data to monitor patient engagement, adherence, and safety remotely.

The regulatory landscape is also evolving to accommodate the growing use of big data analytics in clinical research. Regulatory agencies such as the FDA and EMA are increasingly recognizing the value of data-driven approaches for enhancing the reliability and transparency of clinical trials. This has led to the establishment of guidelines and frameworks that encourage the adoption of big data technologies while ensuring data privacy and security. However, the implementation of stringent data protection regulations, such as GDPR and HIPAA, poses challenges related to data integration, interoperability, and compliance. Despite these challenges, the overall outlook for the Big Data Analytics for Clinical Research market remains highly positive, with sustained investments in digital health infrastructure and analytics capabilities.

From a regional perspective, North America currently dominates the Big Data Analytics for Clinical Research market, accounting for the largest share due to its advanced healthcare infrastructure, high adoption of digital technologies, and strong presence of leading pharmaceutical companies. Europe follows closely, driven by increasing government initiatives to promote health data interoperability and research collaborations. The Asia Pacific region is emerging as a high-growth market, supported by expanding healthcare IT investments, rising clinical trial activities, and growing awareness of data-driven healthcare solutions. Latin America and the Middle East & Africa are also witnessing gradual adoption, albeit at a slower pace, due to infrastructural and regulatory challenges. Overall, the global market is poised for substantial growth across all major regions over the forecast period.

Clinical Trials Database (CTD)
open.canada.ca
html, json, xml
Updated Dec 9, 2024
Cite
Health Canada (2024). Clinical Trials Database (CTD) [Dataset]. https://open.canada.ca/data/en/dataset/d6fe4b32-2eaf-4ac0-9e35-b3841f25e3a7
Available download formats: xml, json, html
Dataset updated
Dec 9, 2024
Dataset provided by
Health Canada (http://www.hc-sc.gc.ca/)
License
Open Government Licence - Canada 2.0: https://open.canada.ca/en/open-government-licence-canada
License information was derived automatically
Description
Health Canada's Clinical Trials Database is a listing of information about phase I, II and III clinical trials in patients. The database is managed by Health Canada and provides a source of information about Canadian clinical trials involving human pharmaceutical and biological drugs. Additional information on Health Canada’s CTD is available at: https://www.canada.ca/en/health-canada/services/drugs-health-products/drug-products/health-canada-clinical-trials-database/frequently-asked-questions.html
Patient-relevance of outcome measures in breast cancer clinical trials -...
datarepository.eur.nl
pdf
Updated Mar 27, 2025
Cite
Diana Delnoij; Jasmijn Plooij (2025). Patient-relevance of outcome measures in breast cancer clinical trials - Data Files [Dataset]. http://doi.org/10.25397/eur.28314236.v1
Available download formats: pdf
Unique identifier
https://doi.org/10.25397/eur.28314236.v1
Dataset updated
Mar 27, 2025
Dataset provided by
Erasmus University Rotterdam (EUR)
Authors
Diana Delnoij; Jasmijn Plooij
License
Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
Description
The primary research question for which these data were used was: ‘How patient-relevant are outcomes measured in clinical trials for breast cancer drugs?’ Subquestions were: 1. Which treatment outcomes are relevant for breast cancer patients? 2. Which outcome measures are used in clinical trials for breast cancer drugs? 3. How much overlap is there between patient-relevant outcomes and outcomes measured in clinical trials? The dataset has been used to answer subquestion 2. Data were obtained by searching ClinicalTrials.gov for trials conducted between January 2014 and March 2024 inclusive. Further inclusion criteria were that studies had to be phase III trials and had to focus on breast cancer, adults (18-64 years old) and drugs. Interventions focusing on lifestyle changes, Chinese medicine, anaesthesia, surgery and diagnostic methods were excluded. Ultimately, 264 trials were included and 45 excluded. To determine the outcome measures used, the study plan of every included trial was reviewed and recorded on the data sheet.
NIDA Data Share
neuinfo.org
dknet.org
Updated Jan 29, 2022
Cite
(2022). NIDA Data Share [Dataset]. http://identifiers.org/RRID:SCR_002002
Unique identifier
https://identifiers.org/RRID:SCR_002002 https://identifiers.org/RRID:SCR_002002/resolver?q=&i=rrid
Dataset updated
Jan 29, 2022
Description
Website that allows data from completed clinical trials to be distributed to investigators and the public. Researchers can download de-identified data from completed NIDA clinical trial studies to conduct analyses that improve the quality of drug abuse treatment. Incorporates data from the Division of Therapeutics and Medical Consequences and the Center for Clinical Trials Network.
Raw Data for IntoValue Dataset
zenodo.org
zip
Updated Feb 2, 2023
Cite
Maia Salholz-Hillel; Delwen Franzen; Benjamin Gregory Carlisle; Nico Riedel (2023). Raw Data for IntoValue Dataset [Dataset]. http://doi.org/10.5281/zenodo.7590083
Available download formats: zip
Unique identifier
https://doi.org/10.5281/zenodo.7590083
Dataset updated
Feb 2, 2023
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Maia Salholz-Hillel; Delwen Franzen; Benjamin Gregory Carlisle; Nico Riedel
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This data deposit includes large raw data used for the "IntoValue" dataset, which underlies several projects at the QUEST Center for Responsible Research in the Berlin Institute of Health (BIH) @ Charité. An initial version of the IntoValue dataset is available in Zenodo: https://doi.org/10.5281/zenodo.5141342. Based on this initial version, the dataset is actively developed and maintained in GitHub: https://github.com/maia-sh/intovalue-data. This Zenodo deposit stores large raw data files for individual trials that are used in that GitHub repository. These data are deposited for computational reproducibility and documentation; they are not intended to be used for additional projects and do not reflect the most current/accurate data available from each source.

This deposit contains raw data from the following sources:

PubMed (pubmed.zip): PubMed XML files are provided courtesy of the U.S. National Library of Medicine and were accessed via the Entrez Programming Utilities (E-utilities) API. The files were downloaded on 2021-08-15 and do not reflect the most current/accurate data available from NLM. The following scripts were used to download and create these files: get-pubmed.R; download-pubmed.R.

German Clinical Trials Registry (DRKS) (drks.zip): DRKS does not provide an API and was web-scraped on 2022-11-01. The following scripts were used to download and create these XML files: get-drks.R; drks-functions.R.

ClinicalTrials.gov (ctgov.zip): ClinicalTrials.gov was accessed via the Clinical Trials Transformation Initiative (CTTI) Aggregate Analysis of ClinicalTrials.gov (AACT) via its PostgreSQL database API. The API was queried and CSV files were generated on 2022-11-01. The following scripts were used to download and create these files: get-process-aact.R.

ClinicalTrials.gov 2018 (ctgov_2018.zip): Additional trial data for 2018. ClinicalTrials.gov was accessed via the Clinical Trials Transformation Initiative (CTTI) Aggregate Analysis of ClinicalTrials.gov (AACT) via its PostgreSQL database API. The API was queried and CSV files were generated on 2022-11-01. The following scripts were used to download and create these files: get-process-aact.R.
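
For orientation, a PubMed record request through E-utilities is an HTTP GET against the `efetch` endpoint. The sketch below only builds such a URL and makes no network call. It is written in Python for illustration and is not a port of the R scripts named above; the PMIDs are placeholders.

```python
from urllib.parse import urlencode

# Public E-utilities endpoint for fetching full records.
EFETCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def efetch_url(pmids, retmode="xml"):
    """Build an efetch URL requesting the given PubMed IDs."""
    ids = ",".join(str(pmid) for pmid in pmids)
    if not ids:
        raise ValueError("at least one PMID is required")
    query = urlencode(
        {"db": "pubmed", "id": ids, "retmode": retmode},
        safe=",",  # keep the comma-separated id list readable
    )
    return f"{EFETCH}?{query}"


if __name__ == "__main__":
    # Placeholder identifiers, for illustration only.
    print(efetch_url([1, 2]))
```

Bulk downloads should follow NLM's usage guidelines on request rates.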
