91 datasets found

Data from: The natural history of ALS: Baseline characteristics from a...
tandf.figshare.com
docx
Updated Feb 9, 2024
Cite
Alex Berger; Matteo Locatelli; Ximena Arcila-Londono; Ghazala Hayat; Nicholas Olney; James Wymer; Kelly Gwathmey; Christian Lunetta; Terry Heiman-Patterson; Senda Ajroud-Driss; Eric A. Macklin; Marie-Abèle Bind; Kimberly Goslin; Tamela Stuchiner; Lauren Brown; Tracy Bazan; Tyler Regan; Ashley Adamo; Valerie Ferment; Carly Schroeder; Megan Somers; Georgios Manousakis; Kenneth Faulconer; Ervin Sinani; Julia Mirochnick; Hong Yu; Alexander V. Sherman; David Walk (2024). The natural history of ALS: Baseline characteristics from a multicenter clinical cohort [Dataset]. http://doi.org/10.6084/m9.figshare.23701648.v1
Available download formats: docx
Unique identifier
https://doi.org/10.6084/m9.figshare.23701648.v1
Dataset updated
Feb 9, 2024
Dataset provided by
Taylor & Francis
Authors
Alex Berger; Matteo Locatelli; Ximena Arcila-Londono; Ghazala Hayat; Nicholas Olney; James Wymer; Kelly Gwathmey; Christian Lunetta; Terry Heiman-Patterson; Senda Ajroud-Driss; Eric A. Macklin; Marie-Abèle Bind; Kimberly Goslin; Tamela Stuchiner; Lauren Brown; Tracy Bazan; Tyler Regan; Ashley Adamo; Valerie Ferment; Carly Schroeder; Megan Somers; Georgios Manousakis; Kenneth Faulconer; Ervin Sinani; Julia Mirochnick; Hong Yu; Alexander V. Sherman; David Walk
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Amyotrophic lateral sclerosis (ALS) is a rare disease with urgent need for improved treatment. Despite the acceleration of research in recent years, there is a need to understand the full natural history of the disease. As only 40% of people living with ALS are eligible for typical clinical trials, clinical trial datasets may not generalize to the full ALS population. While biomarker and cohort studies have more generous inclusion criteria, these too may not represent the full range of phenotypes, particularly if the burden for participation is high. To permit a complete understanding of the heterogeneity of ALS, comprehensive data on the full range of people with ALS is needed. The ALS Natural History Consortium (ALS NHC) consists of nine ALS clinics and was created to build a comprehensive dataset reflective of the ALS population. At each clinic, most patients are asked to participate and about 95% do. After obtaining consent, a minimum dataset is abstracted from each participant’s electronic health record. Participant burden is therefore minimal. Data on 1925 ALS patients were submitted as of 9 December 2022. ALS NHC participants were more heterogeneous relative to anonymized clinical trial data from the Pooled Resource Open-Access ALS Clinical Trials (PRO-ACT) database. The ALS NHC includes ALS patients of older age of onset and a broader distribution of El Escorial categories than the PRO-ACT database. ALS NHC participants had a higher diversity of diagnostic and demographic data compared to ALS clinical trial participants.
Key Messages
What is already known on this topic: Current knowledge of the natural history of ALS derives largely from regional and national registries that have broad representation of the population of people living with ALS but do not always collect covariates and clinical outcomes. Clinical studies with rich datasets of participant characteristics and validated clinical outcomes have stricter inclusion and exclusion criteria that may not be generalizable to the full ALS population.
What this study adds: To bridge this gap, we collected baseline characteristics for a sample of the population of people living with ALS seen at a consortium of ALS clinics that collect extensive, pre-specified participant-level data, including validated outcome measures.
How this study might affect research, practice, or policy: A clinic-based longitudinal dataset can improve our understanding of the natural history of ALS and can be used to inform the design and analysis of clinical trials and health economics studies, to help the prediction of clinical course, to find matched controls for open label extension trials and expanded access protocols, and to document real-world evidence of the impact of novel treatments and changes in care practice.
Data cleaning using unstructured data
zenodo.org
zip
Updated Jul 30, 2024
Cite
Rihem Nasfi; Antoon Bronselaer (2024). Data cleaning using unstructured data [Dataset]. http://doi.org/10.5281/zenodo.13135983
Available download formats: zip
Unique identifier
https://doi.org/10.5281/zenodo.13135983
Dataset updated
Jul 30, 2024
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Rihem Nasfi; Antoon Bronselaer
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
In this project, we work on repairing three datasets:

Trials design: This dataset was obtained from the European Union Drug Regulating Authorities Clinical Trials Database (EudraCT) register, and the ground truth was created from external registries. In the dataset, multiple countries, identified by the attribute country_protocol_code, conduct the same clinical trial, which is identified by eudract_number. Each clinical trial has a title that can help find informative details about the design of the trial.

Trials population: This dataset delineates the demographic origins of participants in clinical trials primarily conducted across European countries. The dataset includes structured attributes indicating whether the trial pertains to a specific gender, age group or healthy volunteers. Each of these categories is labeled '1' or '0', denoting whether it is included in the trial or not. It is important to note that the population category should remain consistent across all countries conducting the same clinical trial, identified by a eudract_number. The ground truth samples in the dataset were established by aligning information about the trial populations provided by external registries, specifically the CT.gov database and the German Trials database. Additionally, the dataset comprises other unstructured attributes that categorize the inclusion criteria for trial participants, such as inclusion.

Allergens: This dataset contains information about products and their allergens. The data was collected from the German version of 'Alnatura' (access date: 24 November 2020), from 'Open Food Facts', a free database of food products from around the world, and from the websites 'Migipedia', 'Piccantino', and 'Das Ist Drin'. There may be overlapping products across these websites. Each product in the dataset is identified by a unique code. Samples with the same code represent the same product but are extracted from a different source. An allergen is indicated by '2' if it is present, '1' if there are traces of it, and '0' if it is absent in a product. The dataset also includes information on ingredients in the products. Overall, the dataset comprises categorical structured data describing the presence, trace, or absence of specific allergens, and unstructured text describing ingredients.
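
The consistency requirement stated for the Trials population dataset (one population category per eudract_number, whatever the country) can be checked mechanically. The sketch below is illustrative only: it assumes the rows are already loaded as dictionaries, and the column name healthy_volunteers and the trial numbers are hypothetical placeholders, not names confirmed by the dataset.

```python
from collections import defaultdict


def inconsistent_trials(rows, attribute):
    """Return eudract_number values whose `attribute` differs across country rows."""
    seen = defaultdict(set)
    for row in rows:
        seen[row["eudract_number"]].add(row[attribute])
    # A trial is flagged when more than one distinct value was recorded for it.
    return sorted(trial for trial, values in seen.items() if len(values) > 1)


# Hypothetical rows: trial T-001 is labeled differently in DE and FR.
rows = [
    {"eudract_number": "T-001", "country_protocol_code": "DE", "healthy_volunteers": "1"},
    {"eudract_number": "T-001", "country_protocol_code": "FR", "healthy_volunteers": "0"},
    {"eudract_number": "T-002", "country_protocol_code": "DE", "healthy_volunteers": "0"},
    {"eudract_number": "T-002", "country_protocol_code": "IT", "healthy_volunteers": "0"},
]
print(inconsistent_trials(rows, "healthy_volunteers"))  # ['T-001']
```

Flagged trials are candidates for repair; deciding which value is correct still requires the ground truth or the unstructured text.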

N.B.: Each '.zip' file contains a set of 5 '.csv' files which are part of the aforementioned datasets:

"{dataset_name}_train.csv": samples used for the ML-model training. (e.g "allergens_train.csv")

"{dataset_name}_test.csv": samples used to test the ML-model performance. (e.g "allergens_test.csv")

"{dataset_name}_golden_standard.csv": samples represent the ground truth of the test samples. (e.g "allergens_golden_standard.csv")

"{dataset_name}_parker_train.csv": samples repaired using Parker Engine used for the ML-model training. (e.g "allergens_parker_train.csv")

"{dataset_name}_parker_test.csv": samples repaired using Parker Engine used to test the ML-model performance. (e.g "allergens_parker_test.csv")
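
The naming convention above can be turned into a small loader. This is a minimal sketch using only the Python standard library; it follows the example file names given in the list (including "allergens_parker_test.csv") and assumes, without confirmation from the source, that the CSV files sit at the root of each archive and are UTF-8 encoded. The function names are hypothetical.

```python
import csv
import io
import zipfile

# Suffixes taken from the five file names listed in the dataset description.
SUFFIXES = ("train", "test", "golden_standard", "parker_train", "parker_test")


def expected_files(dataset_name):
    """Build the five expected CSV names for one dataset, e.g. 'allergens'."""
    return [f"{dataset_name}_{suffix}.csv" for suffix in SUFFIXES]


def read_split(zip_path, dataset_name, suffix):
    """Read one CSV from the archive as a list of dict rows.

    Assumption: the CSV is stored at the archive root and is UTF-8 encoded.
    """
    with zipfile.ZipFile(zip_path) as archive:
        with archive.open(f"{dataset_name}_{suffix}.csv") as handle:
            text = io.TextIOWrapper(handle, encoding="utf-8", newline="")
            return list(csv.DictReader(text))


print(expected_files("allergens"))
```

Comparing the rows of the test split against the golden standard split is then a matter of reading both with read_split and aligning them on the product or trial identifier.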
Randomized controlled clinical trials with tagged information regarding the...
data.niaid.nih.gov
search.dataone.org
+1 more
zip
Updated Sep 16, 2024
Cite
Paul Windisch; Daniel R. Zwahlen (2024). Randomized controlled clinical trials with tagged information regarding the number of participants [Dataset]. http://doi.org/10.5061/dryad.g1jwstr0b
Available download formats: zip
Unique identifier
https://doi.org/10.5061/dryad.g1jwstr0b
Dataset updated
Sep 16, 2024
Dataset provided by
Kantonsspital Winterthur
Authors
Paul Windisch; Daniel R. Zwahlen
License
https://spdx.org/licenses/CC0-1.0.html
Description
Background: Extracting the sample size from randomized controlled trials (RCTs) remains a challenge to developing better search functionalities or automating systematic reviews. Most current approaches rely on the sample size being explicitly mentioned in the abstract.
Data collection: A random sample of 996 randomized controlled trials (RCTs) from seven major journals (British Medical Journal, JAMA, JAMA Oncology, Journal of Clinical Oncology, Lancet, Lancet Oncology, New England Journal of Medicine) published between 2010 and 2022 were labeled. To do so, abstracts were retrieved as a txt file from PubMed and parsed using regular expressions (i.e., expressions that match certain patterns in text). For each trial, the number of people who were randomized was retrieved by looking at the abstract, followed by the full publication if the number could not be determined with certainty from the abstract. In addition, six different entities were tagged in each abstract, independent of whether the information was presented using words or integers. If the number of people who were randomized was explicitly stated (e.g., using the words “randomly,” “randomized,” etc.), this was tagged as “RANDOMIZED_TOTAL.” If the number of people who were analyzed was presented, this was tagged as “ANALYSIS_TOTAL”. If the number of people who completed the trial or a certain follow-up period was presented, this was tagged as “COMPLETION_TOTAL”. If the number of people who were part of the trial without being more specific was presented, this was tagged as “GENERAL_TOTAL”. If the number of people who were assigned to an arm of the trial was presented, this was tagged as “ARM”. Lastly, if the number of patients who were assigned to an arm was presented in the context of how many patients experienced an event, this was tagged as “ARM_EVENT”. If the abstract did not contain the aforementioned entities, the manuscript was added to the dataset without any tags.
Data properties: Each trial is a row in the csv file. For a detailed description, please have a look at the enclosed Readme file.
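
To illustrate the kind of explicit statement the “RANDOMIZED_TOTAL” tag covers, here is a naive regular-expression baseline. It is not the authors' labeling procedure; the pattern and the function name are assumptions made for this sketch. It only handles integers written as digits that appear shortly before a randomization keyword in the same sentence, so counts written as words or placed after the keyword are missed.

```python
import re

# Hypothetical pattern: an integer (optionally with thousands separators),
# then up to 60 characters containing no period and no digit, then a keyword
# such as "randomly", "randomized" or "randomised".
RANDOMIZED = re.compile(
    r"\b(\d{1,3}(?:,\d{3})+|\d+)\b[^.\d]{0,60}?\brandom(?:ly|i[sz]ed)\b",
    re.IGNORECASE,
)


def find_randomized_total(abstract):
    """Return the first integer explicitly tied to randomization, or None."""
    match = RANDOMIZED.search(abstract)
    if match is None:
        return None
    return int(match.group(1).replace(",", ""))


print(find_randomized_total("A total of 1,234 patients were randomly assigned."))  # 1234
print(find_randomized_total("Between 2010 and 2012, 300 participants were randomised."))  # 300
print(find_randomized_total("We enrolled patients with cancer."))  # None
```

A labeled dataset such as this one could serve to measure how often a baseline like this fails, which is the gap the description points at.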
Data (i.e., evidence) about evidence based medicine
figshare.com
search.datacite.org
png
Updated May 30, 2023
Cite
Jorge H Ramirez (2023). Data (i.e., evidence) about evidence based medicine [Dataset]. http://doi.org/10.6084/m9.figshare.1093997.v24
Available download formats: png
Unique identifier
https://doi.org/10.6084/m9.figshare.1093997.v24
Dataset updated
May 30, 2023
Dataset provided by
Figshare (http://figshare.com/)
Authors
Jorge H Ramirez
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Update (December 7, 2014): Evidence-based medicine (EBM) is not working for many reasons, for example:

1. Incorrect in their foundations (paradox): hierarchical levels of evidence are supported by opinions (i.e., lowest strength of evidence according to EBM) instead of real data collected from different types of study designs (i.e., evidence). http://dx.doi.org/10.6084/m9.figshare.1122534

2. The effect of criminal practices by pharmaceutical companies is only possible because of the complicity of others: healthcare systems, professional associations, governmental and academic institutions. Pharmaceutical companies also corrupt at the personal level, politicians and political parties are on their payroll, medical professionals seduced by different types of gifts in exchange of prescriptions (i.e., bribery) which very likely results in patients not receiving the proper treatment for their disease, many times there is no such thing: healthy persons not needing pharmacological treatments of any kind are constantly misdiagnosed and treated with unnecessary drugs. Some medical professionals are converted in K.O.L. which is only a puppet appearing on stage to spread lies to their peers, a person supposedly trained to improve the well-being of others, now deceives on behalf of pharmaceutical companies. Probably the saddest thing is that many honest doctors are being misled by these lies created by the rules of pharmaceutical marketing instead of scientific, medical, and ethical principles. Interpretation of EBM in this context was not anticipated by their creators.

“The main reason we take so many drugs is that drug companies don’t sell drugs, they sell lies about drugs.” (Peter C. Gøtzsche)

“doctors and their organisations should recognise that it is unethical to receive money that has been earned in part through crimes that have harmed those people whose interests doctors are expected to take care of. Many crimes would be impossible to carry out if doctors weren’t willing to participate in them.” (Peter C Gøtzsche, The BMJ, 2012, Big pharma often commits corporate crime, and this must be stopped.)

Pending (Colombia): Health Promoter Entities (in Spanish: EPS, Empresas Promotoras de Salud).

Misinterpretations

New technologies or concepts are difficult to understand in the beginning, it doesn’t matter their simplicity, we need to get used to new tools aimed to improve our professional practice. Probably the best explanation is here in these videos (credits to Antonio Villafaina for sharing these videos with me).
English: https://www.youtube.com/watch?v=pQHX-SjgQvQ&w=420&h=315
Spanish: https://www.youtube.com/watch?v=DApozQBrlhU&w=420&h=315

Hypothesis: hierarchical levels of evidence based medicine are wrong

Dear Editor, I have data to support the hypothesis described in the title of this letter. Before rejecting the null hypothesis I would like to ask the following open question: Could you support with data that hierarchical levels of evidence based medicine are correct? (1,2)

Additional explanation to this question:
- Only respond to this question attaching publicly available raw data.
- Be aware that more than a question this is a challenge: I have data (i.e., evidence) which is contrary to classic (i.e., McMaster) or current (i.e., Oxford) hierarchical levels of evidence based medicine. An important part of this data (but not all) is publicly available.

References

Ramirez, Jorge H (2014): The EBM challenge. figshare. http://dx.doi.org/10.6084/m9.figshare.1135873

The EBM Challenge Day 1: No Answers.

Competing interests: I endorse the principles of open data in human biomedical research.

Read this letter on The BMJ (August 13, 2014): http://www.bmj.com/content/348/bmj.g3725/rr/762595
Re: Greenhalgh T, et al. Evidence based medicine: a movement in crisis? BMJ 2014; 348: g3725.

Fileset contents

Raw data: Excel archive: Raw data, interactive figures, and PubMed search terms. Google Spreadsheet is also available (URL below the article description).
Figure 1. Unadjusted (Fig 1A) and adjusted (Fig 1B) PubMed publication trends (01/01/1992 to 30/06/2014).
Figure 2. Adjusted PubMed publication trends (07/01/2008 to 29/06/2014).
Figure 3. Google search trends: Jan 2004 to Jun 2014 / 1-week periods.
Figure 4. PubMed publication trends (1962-2013) systematic reviews and meta-analysis, clinical trials, and observational studies.
Figure 5. Ramirez, Jorge H (2014): Infographics: Unpublished US phase 3 clinical trials (2002-2014) completed before Jan 2011 = 50.8%. figshare. http://dx.doi.org/10.6084/m9.figshare.1121675
Raw data: "13377 studies found for: Completed | Interventional Studies | Phase 3 | received from 01/01/2002 to 01/01/2014 | Worldwide". This database complies with the terms and conditions of ClinicalTrials.gov: http://clinicaltrials.gov/ct2/about-site/terms-conditions
Supplementary Figures (S1-S6). PubMed publication delay in the indexation processes does not explain the descending trends in the scientific output of evidence-based medicine.

Acknowledgments

I would like to acknowledge the following persons for providing valuable concepts in data visualization and infographics:

Maria Fernanda Ramírez. Professor of graphic design. Universidad del Valle. Cali, Colombia.

Lorena Franco. Graphic design student. Universidad del Valle. Cali, Colombia.

Related articles by this author (Jorge H. Ramírez)

Ramirez JH. Lack of transparency in clinical trials: a call for action. Colomb Med (Cali) 2013;44(4):243-6. URL: http://www.ncbi.nlm.nih.gov/pubmed/24892242

Ramirez JH. Re: Evidence based medicine is broken (17 June 2014). http://www.bmj.com/node/759181

Ramirez JH. Re: Global rules for global health: why we need an independent, impartial WHO (19 June 2014). http://www.bmj.com/node/759151

Ramirez JH. PubMed publication trends (1992 to 2014): evidence based medicine and clinical practice guidelines (04 July 2014). http://www.bmj.com/content/348/bmj.g3725/rr/759895

Recommended articles

Greenhalgh Trisha, Howick Jeremy, Maskrey Neal. Evidence based medicine: a movement in crisis? BMJ 2014; 348:g3725

Spence Des. Evidence based medicine is broken. BMJ 2014; 348:g22

Schünemann Holger J, Oxman Andrew D, Brozek Jan, Glasziou Paul, Jaeschke Roman, Vist Gunn E et al. Grading quality of evidence and strength of recommendations for diagnostic tests and strategies. BMJ 2008; 336:1106

Lau Joseph, Ioannidis John P A, Terrin Norma, Schmid Christopher H, Olkin Ingram. The case of the misleading funnel plot. BMJ 2006; 333:597

Moynihan R, Henry D, Moons KGM (2014) Using Evidence to Combat Overdiagnosis and Overtreatment: Evaluating Treatments, Tests, and Disease Definitions in the Time of Too Much. PLoS Med 11(7): e1001655. doi:10.1371/journal.pmed.1001655

Katz D. A-holistic view of evidence based medicine. http://thehealthcareblog.com/blog/2014/05/02/a-holistic-view-of-evidence-based-medicine/
CK4Gen, High Utility Synthetic Survival Datasets
figshare.com
zip
Updated Nov 5, 2024
Cite
Nicholas Kuo (2024). CK4Gen, High Utility Synthetic Survival Datasets [Dataset]. http://doi.org/10.6084/m9.figshare.27611388.v1
Available download formats: zip
Unique identifier
https://doi.org/10.6084/m9.figshare.27611388.v1
Dataset updated
Nov 5, 2024
Dataset provided by
figshare
Authors
Nicholas Kuo
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Overview:
This repository provides high-utility synthetic survival datasets generated using the CK4Gen framework, optimised to retain critical clinical characteristics for use in research and educational settings. Each dataset is based on a carefully curated ground truth dataset, processed with standardised variable definitions and analytical approaches, ensuring a consistent baseline for survival analysis.

Description:
The repository includes synthetic versions of four widely utilised and publicly accessible survival analysis datasets, each anchored in foundational studies and aligned with established ground truth variations to support robust clinical research and training.

GBSG2: Based on Schumacher et al. [1]. The study evaluated the effects of hormonal treatment and chemotherapy duration in node-positive breast cancer patients, tracking recurrence-free and overall survival among 686 women over a median of 5 years. Our synthetic version is derived from a variation of the GBSG2 dataset available in the lifelines package [2], formatted to match the descriptions in Sauerbrei et al. [3], which we treat as the ground truth.

ACTG320: Based on Hammer et al. [4]. The study investigates the impact of adding the protease inhibitor indinavir to a standard two-drug regimen for HIV-1 treatment. The original clinical trial involved 1,151 patients with prior zidovudine exposure and low CD4 cell counts, tracking outcomes over a median follow-up of 38 weeks. Our synthetic dataset is derived from a variation of the ACTG320 dataset available in the sksurv package [5], which we treat as the ground truth dataset.

WHAS500: Based on Goldberg et al. [6]. The study follows 500 patients to investigate survival rates following acute myocardial infarction (MI), capturing a range of factors influencing MI incidence and outcomes. Our synthetic data replicates a ground truth variation from the sksurv package, which we treat as the ground truth dataset.

FLChain: Based on Dispenzieri et al. [7]. The study assesses the prognostic relevance of serum immunoglobulin free light chains (FLCs) for overall survival in a large cohort of 15,859 participants. Our synthetic version is based on a variation available in the sksurv package, which we treat as the ground truth dataset.

Notes:
Please find an in-depth discussion on these datasets, as well as their generation process, in the link below, to our paper:
https://arxiv.org/abs/2410.16872
Kuo, et al. "CK4Gen: A Knowledge Distillation Framework for Generating High-Utility Synthetic Survival Datasets in Healthcare." arXiv preprint arXiv:2410.16872 (2024).

References:
[1]: Schumacher, et al. “Randomized 2 x 2 trial evaluating hormonal treatment and the duration of chemotherapy in node-positive breast cancer patients. German breast cancer study group.”, Journal of Clinical Oncology, 1994.
[2]: Davidson-Pilon “lifelines: Survival Analysis in Python”, Journal of Open Source Software, 2019.
[3]: Sauerbrei, et al. “Modelling the effects of standard prognostic factors in node-positive breast cancer”, British Journal of Cancer, 1999.
[4]: Hammer, et al. “A controlled trial of two nucleoside analogues plus indinavir in persons with human immunodeficiency virus infection and cd4 cell counts of 200 per cubic millimeter or less”, New England Journal of Medicine, 1997.
[5]: Pölsterl “scikit-survival: A library for time-to-event analysis built on top of scikit-learn”, Journal of Machine Learning Research, 2020.
[6]: Goldberg, et al. “Incidence and case fatality rates of acute myocardial infarction (1975–1984): the Worcester heart attack study”, American Heart Journal, 1988.
[7]: Dispenzieri, et al. “Use of nonclonal serum immunoglobulin free light chains to predict overall survival in the general population”, in Mayo Clinic Proceedings, 2012.
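
One common way to compare a synthetic survival dataset with its ground truth is to estimate a Kaplan-Meier curve on each and inspect the difference. The following is a minimal standard-library sketch of that estimator, run on made-up toy values; it is not part of the CK4Gen framework, and in practice the lifelines or sksurv packages cited above would normally be used instead.

```python
from collections import Counter


def kaplan_meier(durations, events):
    """Return [(time, survival_probability)] at each distinct event time.

    durations: observed follow-up time per subject.
    events: 1 if the event occurred at that time, 0 if the subject was censored.
    """
    deaths = Counter(t for t, e in zip(durations, events) if e)
    exits = Counter(durations)  # every subject leaves the risk set at its own time
    at_risk = len(durations)
    survival = 1.0
    curve = []
    for t in sorted(exits):
        d = deaths.get(t, 0)
        if d:
            # Product-limit step: multiply by the fraction surviving past t.
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        at_risk -= exits[t]
    return curve


# Toy data only: five subjects, two of them censored (event flag 0).
toy_curve = kaplan_meier([1, 2, 2, 3, 4], [1, 1, 0, 1, 0])
for time, prob in toy_curve:
    print(time, round(prob, 3))
```

Running the same function on a synthetic dataset and on its ground truth, using the same duration and event columns, gives two curves that can be compared time point by time point.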
Dataset from A Randomised, Placebo Controlled, Ascending, Repeat Dose Study...
data.niaid.nih.gov
Updated Dec 4, 2024
Cite
GSK Clinical Trials (2024). Dataset from A Randomised, Placebo Controlled, Ascending, Repeat Dose Study in Healthy Volunteers Investigating Safety, Tolerability, Pharmacokinetics and Pharmacodynamics of GSK356278 [Dataset]. http://doi.org/10.25934/00001137
Unique identifier
https://doi.org/10.25934/00001137
Dataset updated
Dec 4, 2024
Dataset provided by
GSK plc (http://gsk.com/)
Authors
GSK Clinical Trials
Area covered
Netherlands
Variables measured
Nausea, Retching, Vomiting, 12 Lead Ecg, Adverse Event, Laboratory Test, Echocardiography, Pharmacokinetics, Electrocardiogram, Vital Signs Measurement, and 3 more
Description
The study drug, GSK356278, is a possible new medicine for the treatment of Huntington's disease. Huntington's disease, which is often called HD, is caused by a faulty gene that is passed down through families. HD causes damage to nerve cells in the brain which causes them to waste away. As the damage progresses patients develop symptoms that affect every aspect of life. HD reduces people's ability to walk, talk, think, communicate and causes uncontrolled movements. GSK356278 may slow down the progression of damage to nerve cells in people with HD and help with their ability to think. GSK356278 was well tolerated when it was given as a single dose to healthy people. In this study we want to see what effects, both good and bad, GSK356278 has in people when it is taken every day. During the study we will look at about 3 different doses of GSK356278 in about 36 healthy people. The study will also look at how GSK356278 tablets behave in the body after they are swallowed (this is called pharmacokinetics). The study will also look at effects of GSK356278 on the body (this is called pharmacodynamics). The study will help to design future clinical studies with GSK356278.
Africa Centre for Health and Population Studies
rrid.site
dknet.org
+1 more
Updated Jun 28, 2025
Cite
(2025). Africa Centre for Health and Population Studies [Dataset]. http://identifiers.org/RRID:SCR_008964
Unique identifier
https://identifiers.org/RRID:SCR_008964
Dataset updated
Jun 28, 2025
Description
Longitudinal datasets of demographic, social, medical and economic information from a rural demographic in northern KwaZulu-Natal, South Africa, where HIV prevalence is extremely high. The data may be filtered by demographics, years, or individual questionnaires. The datasets may be used by other researchers, but the Africa Centre requests that anyone downloading the data notify them. The datasets are provided in three formats: Stata 11 .dta; tables in an MS-Access .accdb database; and worksheets in an MS-Excel .xlsx workbook. Datasets are generated approximately every six months and contain information spanning the whole period of surveillance from 1/1/2000 to present.
‘District Attorney Trials’ analyzed by Analyst-2
analyst-2.ai
Cite
Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com), ‘District Attorney Trials’ analyzed by Analyst-2 [Dataset]. https://analyst-2.ai/analysis/data-gov-district-attorney-trials-5026/e5d8a100/?iid=009-172&v=presentation
Dataset authored and provided by
Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com)
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Analysis of ‘District Attorney Trials’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://catalog.data.gov/dataset/bf4e8477-c11c-42f7-80c8-cf12a7062938 on 11 February 2022.

--- Dataset description provided by original source is as follows ---

A. SUMMARY

Please note that the "Data Last Updated" date on this page denotes the most recent time the open data portal automation process ran and does not reflect the most recent update to the data in this dataset. To confirm the completeness of this dataset please contact the District Attorney's office at districtattorney@sfgov.org.

This dataset contains information on SFDA trial outcomes since 2014. It includes information on all jury trials except for cases that have been sealed or expunged due to record clearance protocols.

Jury trials are resource-intensive, and often the most public part of the criminal process. The vast majority of cases do not go to trial and resolve after plea negotiations between the prosecutor and the defense attorney. If the case cannot be resolved through plea negotiations, it may go to trial before a judge or a jury. A jury is made up of 12 people from the community, and often a few alternate jurors. Most trials are jury trials, and both the prosecution and the defense will have an opportunity to excuse some jurors before a final jury is sworn. The selected jury will hear the evidence and will be tasked with determining whether the charges have been proven beyond a reasonable doubt.

More information about the trial process and this dataset can be found under the Trials tab on the DA Stat page.

Disclaimer: The San Francisco District Attorney's Office does not guarantee the accuracy, completeness, or timeliness of the information as the data is subject to change as modifications and updates are completed.

B. HOW THE DATASET IS CREATED

At the conclusion of a trial, relevant data is manually entered into the District Attorney Office's case management system. Trial data reports are pulled from this system on a semi-regular basis, cleaned, anonymized, and added to Open Data.

C. UPDATE PROCESS

The District Attorney's Office strives to update this dataset every month. However, the creation of this dataset requires a manual pull from the Office's case management system and is dependent on staff availability. The Open Data portal automation process will run on the 1st of every month regardless of whether an update to this dataset has been made.

D. HOW TO USE THIS DATASET

Please review the Trials tab on the DA Stat page for more information about this dataset.

--- Original source retains full ownership of the source dataset ---

The NIMH Healthy Research Volunteer Dataset

openneuro.org

Updated Feb 18, 2025

Cite

Allison C. Nugent; Adam G Thomas; Margaret Mahoney; Alison Gibbons; Jarrod Smith; Antoinette Charles; Jacob S Shaw; Jeffrey D Stout; Anna M Namyst; Arshitha Basavaraj; Eric Earl; Dustin Moraczewski; Emily Guinee; Michael Liu; Travis Riddle; Joseph Snow; Shruti Japee; Morgan Andrews; Adriana Pavletic; Stephen Sinclair; Vinai Roopchansingh; Peter A Bandettini; Joyce Chung (2025). The NIMH Healthy Research Volunteer Dataset [Dataset]. http://doi.org/10.18112/openneuro.ds005752.v2.1.0

Unique identifier

https://doi.org/10.18112/openneuro.ds005752.v2.1.0

Dataset updated

Feb 18, 2025

Dataset provided by

OpenNeuro (https://openneuro.org/)

Authors

License

CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically

Description

The National Institute of Mental Health (NIMH) Research Volunteer (RV) Data Set

A comprehensive dataset characterizing healthy research volunteers in terms of clinical assessments, mood-related psychometrics, cognitive function neuropsychological tests, structural and functional magnetic resonance imaging (MRI), along with diffusion tensor imaging (DTI), and a comprehensive magnetoencephalography battery (MEG).

In addition, blood samples are currently banked for future genetic analysis. All data collected in this protocol are broadly shared in the OpenNeuro repository, in the Brain Imaging Data Structure (BIDS) format. In addition, task paradigms and basic pre-processing scripts are shared on GitHub. This dataset is unprecedented in its depth of characterization of a healthy population and will allow a wide array of investigations into normal cognition and mood regulation.

This dataset is licensed under the Creative Commons Zero (CC0) v1.0 License.

Release Notes

Release v2.0.0

This release includes data collected between 2020-06-03 (cut-off date for v1.0.0) and 2024-04-01. Notable changes in this release:

769 new participants have been added along with re-evaluation data for 15 participants. Total unique participants count is now 1859.
visit and age_at_visit columns added to phenotype files to distinguish between visits and intervals between them.
Follow-up online survey data included.
Replaced Beck Anxiety Inventory (BAI) and Beck Depression Inventory-II (BDI-II) with General Anxiety Disorder-7 (GAD7) and Patient Health Questionnaire 9 (PHQ9) surveys, respectively.
Discontinued the Perceived Health rating survey.
Added Brief Trauma Questionnaire (BTQ) and Big Five personality survey to online screening questionnaires.
MRI:
- Replaced ADNI-3 resting state sequence with a multi-echo sequence with higher spatial resolution.
- Replaced field map scans with a shorter reversed-blipped EPI scan.
MEG:
- Some participants have 6-minute empty room data instead of the shorter duration empty room acquisition.

See the CHANGES file for complete version-wise changelog.
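
As a rough illustration of how the visit and age_at_visit columns could be used to separate repeat visits, here is a minimal Python sketch. The participant_id key is the usual BIDS identifier column; the sample rows, the integer visit coding, and both helper function names are assumptions made for this example, not taken from the dataset.

```python
import csv
import io
from collections import defaultdict

# Hypothetical excerpt of a phenotype file such as audit.tsv.
# visit and age_at_visit are the columns named in the release notes;
# every value below is invented for illustration.
SAMPLE_TSV = (
    "participant_id\tvisit\tage_at_visit\n"
    "sub-0001\t1\t31.2\n"
    "sub-0002\t1\t45.0\n"
    "sub-0001\t2\t33.7\n"
)


def visits_by_participant(tsv_text):
    """Group (visit, age_at_visit) pairs by participant, sorted by visit."""
    grouped = defaultdict(list)
    for row in csv.DictReader(io.StringIO(tsv_text), delimiter="\t"):
        grouped[row["participant_id"]].append(
            (int(row["visit"]), float(row["age_at_visit"]))
        )
    return {pid: sorted(rows) for pid, rows in grouped.items()}


def intervals_between_visits(rows):
    """Difference in age_at_visit between consecutive visits."""
    return [round(later[1] - earlier[1], 2) for earlier, later in zip(rows, rows[1:])]


visits = visits_by_participant(SAMPLE_TSV)
for pid, rows in visits.items():
    print(pid, rows, intervals_between_visits(rows))
```

With real files, the sample text would be replaced by the contents of a phenotype TSV; the column coding should be checked against the dataset's own data dictionary first.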

Participant Eligibility

To be eligible for the study, participants need to be medically healthy adults over 18 years of age with the ability to read, speak and understand English. All participants provided electronic informed consent for online pre-screening, and written informed consent for all other procedures. Participants with a history of mental illness or suicidal or self-injury thoughts or behavior are excluded. Additional exclusion criteria include current illicit drug use, abnormal medical exam, and less than an 8th grade education or IQ below 70. Current NIMH employees, or first degree relatives of NIMH employees are prohibited from participating. Study participants are recruited through direct mailings, bulletin boards and listservs, outreach exhibits, print advertisements, and electronic media.

Clinical Measures

All potential volunteers visit the study website, check a box indicating consent, and fill out preliminary screening questionnaires. The questionnaires include basic demographics, the World Health Organization Disability Assessment Schedule 2.0 (WHODAS 2.0), the DSM-5 Self-Rated Level 1 Cross-Cutting Symptom Measure, the DSM-5 Level 2 Cross-Cutting Symptom Measure - Substance Use, the Alcohol Use Disorders Identification Test (AUDIT), the Edinburgh Handedness Inventory, and a brief clinical history checklist. The WHODAS 2.0 is a 15 item questionnaire that assesses overall general health and disability, with 14 items distributed over 6 domains: cognition, mobility, self-care, “getting along”, life activities, and participation. The DSM-5 Level 1 cross-cutting measure uses 23 items to assess symptoms across diagnoses, although an item regarding self-injurious behavior was removed from the online self-report version. The DSM-5 Level 2 cross-cutting measure is adapted from the NIDA ASSIST measure, and contains 15 items to assess use of both illicit drugs and prescription drugs without a doctor’s prescription. The AUDIT is a 10 item screening assessment used to detect harmful levels of alcohol consumption, and the Edinburgh Handedness Inventory is a systematic assessment of handedness. These online results do not contain any personally identifiable information (PII). At the conclusion of the questionnaires, participants are prompted to send an email to the study team. These results are reviewed by the study team, who determines if the participant is appropriate for an in-person interview.

Participants who meet all inclusion criteria are scheduled for an in-person screening visit to determine if there are any further exclusions to participation. At this visit, participants receive a History and Physical exam, Structured Clinical Interview for DSM-5 Disorders (SCID-5), the Beck Depression Inventory-II (BDI-II), Beck Anxiety Inventory (BAI), and the Kaufman Brief Intelligence Test, Second Edition (KBIT-2). The purpose of these cognitive and psychometric tests is two-fold. First, these measures are designed to provide a sensitive test of psychopathology. Second, they provide a comprehensive picture of cognitive functioning, including mood regulation. The SCID-5 is a structured interview, administered by a clinician, that establishes the absence of any DSM-5 axis I disorder. The KBIT-2 is a brief (20 minute) assessment of intellectual functioning administered by a trained examiner. There are three subtests, including verbal knowledge, riddles, and matrices.

Biological and physiological measures

Biological and physiological measures are acquired, including blood pressure, pulse, weight, height, and BMI. Blood and urine samples are taken and a complete blood count, acute care panel, hepatic panel, thyroid stimulating hormone, viral markers (HCV, HBV, HIV), c-reactive protein, creatine kinase, urine drug screen and urine pregnancy tests are performed. In addition, three additional tubes of blood samples are collected and banked for future analysis, including genetic testing.

Imaging Studies

Participants were given the option to enroll in optional magnetic resonance imaging (MRI) and magnetoencephalography (MEG) studies.

MRI

On the same visit as the MRI scan, participants are administered a subset of tasks from the NIH Toolbox Cognition Battery. The four tasks assess attention and executive functioning (Flanker Inhibitory Control and Attention Task), executive functioning (Dimensional Change Card Sort Task), episodic memory (Picture Sequence Memory Task), and working memory (List Sorting Working Memory Task). The MRI protocol used was initially based on the ADNI-3 basic protocol, but was later modified to include portions of the ABCD protocol in the following manner:

The T1 scan from ADNI3 was replaced by the T1 scan from the ABCD protocol.
The Axial T2 2D FLAIR acquisition from ADNI2 was added, and fat saturation turned on.
Fat saturation was turned on for the pCASL acquisition.
The high-resolution in-plane hippocampal 2D T2 scan was removed, and replaced with the whole brain 3D T2 scan from the ABCD protocol (which is resolution and bandwidth matched to the T1 scan).
The slice-select gradient reversal method was turned on for DTI acquisition, and reconstruction interpolation turned off.
Scans for distortion correction were added (reversed-blip scans for DTI and resting state scans).
The 3D FLAIR sequence was made optional, and replaced by one where the prescription and other acquisition parameters provide resolution and geometric correspondence between the T1 and T2 scans.

MEG

The optional MEG studies were added to the protocol approximately one year after the study was initiated, thus there are relatively fewer MEG recordings in comparison to the MRI dataset. MEG studies are performed on a 275 channel CTF MEG system. The position of the head was localized at the beginning and end of the recording using three fiducial coils. These coils were placed 1.5 cm above the nasion, and at each ear, 1.5 cm from the tragus on a line between the tragus and the outer canthus of the eye. For some participants, photographs were taken of the three coils and used to mark the points on the T1 weighted structural MRI scan for co-registration. For the remainder of the participants, a BrainSight neuro-navigation unit was used to coregister the MRI, anatomical fiducials, and localizer coils directly prior to MEG data acquisition.

Specific Survey and Test Data within Data Set

NOTE: In release 2.0 of the dataset, two measures, the Brief Trauma Questionnaire (BTQ) and the Big Five personality survey, were added to the online screening questionnaires. Also, for the in-person screening visit, the Beck Anxiety Inventory (BAI) and Beck Depression Inventory-II (BDI-II) were replaced with the General Anxiety Disorder-7 (GAD7) and Patient Health Questionnaire 9 (PHQ9) surveys, respectively. The Perceived Health rating survey was discontinued.

1. Preliminary Online Screening Questionnaires

Survey or Test	BIDS TSV Name
Alcohol Use Disorders Identification Test (AUDIT)	audit.tsv
Brief Trauma Questionnaire (BTQ)	btq.tsv
Big-Five Personality	big_five_personality.tsv
Demographics	demographics.tsv
Drug Use Questionnaire

autism prevalence studies
catalog.data.gov
data.virginia.gov
Updated Nov 10, 2020
Cite
Centers for Disease Control and Prevention (2020). autism prevalence studies [Dataset]. https://catalog.data.gov/dataset/autism-prevalence-studies
Dataset updated
Nov 10, 2020
Dataset provided by
Centers for Disease Control and Prevention (http://www.cdc.gov/)
Description
This data table provides a collection of information from peer-reviewed autism prevalence studies. Information reported from each study includes the autism prevalence estimate and additional study characteristics (e.g., case ascertainment and criteria). A PubMed search was conducted to identify studies published at any time through September 2020 using the search terms: autism (title/abstract) OR autistic (title/abstract) AND prevalence (title/abstract). Data were abstracted and included if the study fulfilled the following criteria:
• The study was published in English;
• The study produced at least one autism prevalence estimate; and
• The study was population-based (any age range) within a defined geographic area.
National Child Development Study: Social Participation and Identity,...
beta.ukdataservice.ac.uk
Updated 2023
Cite
J. Elliott; M. Savage; A. Miles; S. Parsons (2023). National Child Development Study: Social Participation and Identity, 2007-2010 [Dataset]. http://doi.org/10.5255/ukda-sn-6691-3
Unique identifier
https://doi.org/10.5255/ukda-sn-6691-3
Dataset updated
2023
Dataset provided by
UK Data Service (https://ukdataservice.ac.uk/)
datacite
Authors
J. Elliott; M. Savage; A. Miles; S. Parsons
Description
The National Child Development Study (NCDS) is a continuing longitudinal study that seeks to follow the lives of all those living in Great Britain who were born in one particular week in 1958. The aim of the study is to improve understanding of the factors affecting human development over the whole lifespan.

The NCDS has its origins in the Perinatal Mortality Survey (PMS) (the original PMS study is held at the UK Data Archive under SN 2137). This study was sponsored by the National Birthday Trust Fund and designed to examine the social and obstetric factors associated with stillbirth and death in early infancy among the 17,000 children born in England, Scotland and Wales in that one week. Selected data from the PMS form NCDS sweep 0, held alongside NCDS sweeps 1-3, under SN 5565.

Survey and Biomeasures Data (GN 33004):
To date there have been nine attempts to trace all members of the birth cohort in order to monitor their physical, educational and social development. The first three sweeps were carried out by the National Children's Bureau, in 1965, when respondents were aged 7, in 1969, aged 11, and in 1974, aged 16 (these sweeps form NCDS1-3, held together with NCDS0 under SN 5565). The fourth sweep, also carried out by the National Children's Bureau, was conducted in 1981, when respondents were aged 23 (held under SN 5566). In 1985 the NCDS moved to the Social Statistics Research Unit (SSRU) - now known as the Centre for Longitudinal Studies (CLS). The fifth sweep was carried out in 1991, when respondents were aged 33 (held under SN 5567). For the sixth sweep, conducted in 1999-2000, when respondents were aged 42 (NCDS6, held under SN 5578), fieldwork was combined with the 1999-2000 wave of the 1970 Birth Cohort Study (BCS70), which was also conducted by CLS (and held under GN 33229). The seventh sweep was conducted in 2004-2005 when the respondents were aged 46 (held under SN 5579), the eighth sweep was conducted in 2008-2009 when respondents were aged 50 (held under SN 6137) and the ninth sweep was conducted in 2013 when respondents were aged 55 (held under SN 7669).
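
The sweep schedule above can be kept as a small lookup table. The following Python sketch is illustrative only: it records each sweep's year, stated age, and study number as given in the text, and checks that each age equals the fieldwork year minus the 1958 birth year. For sweeps with fieldwork spanning two years, the year that matches the stated age was chosen, which is an assumption of this example, as is the helper name `sweep_for_age`.

```python
BIRTH_YEAR = 1958

# (sweep, fieldwork year used here, stated age, study number from the text)
SWEEPS = [
    (1, 1965, 7, "SN 5565"),
    (2, 1969, 11, "SN 5565"),
    (3, 1974, 16, "SN 5565"),
    (4, 1981, 23, "SN 5566"),
    (5, 1991, 33, "SN 5567"),
    (6, 2000, 42, "SN 5578"),  # fieldwork 1999-2000
    (7, 2004, 46, "SN 5579"),  # fieldwork 2004-2005
    (8, 2008, 50, "SN 6137"),  # fieldwork 2008-2009
    (9, 2013, 55, "SN 7669"),
]


def sweep_for_age(age):
    """Return (sweep, study number) for a stated respondent age, or None."""
    for sweep, _year, stated_age, study_number in SWEEPS:
        if stated_age == age:
            return sweep, study_number
    return None


# Every stated age should equal fieldwork year minus birth year.
ages_consistent = all(year - BIRTH_YEAR == age for _s, year, age, _sn in SWEEPS)
print(ages_consistent, sweep_for_age(50))
```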

Four separate datasets covering responses to NCDS over all sweeps are available. National Child Development Deaths Dataset: Special Licence Access (SN 7717) covers deaths; National Child Development Study Response and Outcomes Dataset (SN 5560) covers all other responses and outcomes; National Child Development Study: Partnership Histories (SN 6940) includes data on live-in relationships; and National Child Development Study: Activity Histories (SN 6942) covers work and non-work activities. Users are advised to order these studies alongside the other waves of NCDS.

From 2002-2004, a Biomedical Survey was completed and is available under End User Licence (EUL) (SN 8731) and Special Licence (SL) (SN 5594). Proteomics analyses of blood samples are available under SL SN 9254.

Linked Geographical Data (GN 33497):
A number of geographical variables are available, under more restrictive access conditions, which can be linked to the NCDS EUL and SL access studies.

Linked Administrative Data (GN 33396):
A number of linked administrative datasets are available, under more restrictive access conditions, which can be linked to the NCDS EUL and SL access studies. These include a Deaths dataset (SN 7717) available under SL and the Linked Health Administrative Datasets (SN 8697) available under Secure Access.

Additional Sub-Studies (GN 33562):
In addition to the main NCDS sweeps, further studies have also been conducted on a range of subjects such as parent migration, unemployment, behavioural studies and respondent essays. The full list of NCDS studies available from the UK Data Service can be found on the NCDS series access data webpage.

How to access genetic and/or bio-medical sample data from a range of longitudinal surveys:
For information on how to access biomedical data from NCDS that are not held at the UKDS, see the CLS Genetic data and biological samples webpage.

Further information about the full NCDS series can be found on the Centre for Longitudinal Studies website.
The Social Participation and Identity project combined quantitative longitudinal data with a qualitative investigation of a sub-sample of the NCDS cohort when they were aged 50, presented here as a mixed-methods data collection containing both qualitative and quantitative data. This was the first attempt to interview members of a national, longitudinal cohort study in depth, with the possibility of linking such biographical narratives to structured survey data collected throughout the life course. Interviews were conducted with a sub-sample of 220 NCDS cohort members resident in Great Britain (England, Scotland and Wales). The interviews were organised into six main sections focussing on: 1) Neighbourhood and belonging; 2) Leisure activities and social participation; 3) Personal communities; 4) Life histories; 5) Identity; 6) Reflections on being part of the NCDS.

Further information:
details of the qualitative NCDS project (and the rest of the NCDS series) can be found on the Centre for Longitudinal Studies website and Social Participation and Identity ESRC award webpage.
A list of the UK Data Archive's NCDS holdings, both quantitative and qualitative, can be found on the NCDS key data page.
For the first and second editions of the study (2011 and 2012), the interview transcripts, interviewer observation summaries, gender identity diagrams and life trajectory diagrams for all participants were made available. For the third edition (July 2013), 179 essays collected from the subproject participants at the time of the NCDS2 wave (conducted 1969) were added to the study. See documentation for further details. (Users should note that an additional sample of transcribed essays from a wider set of NCDS2 participants is available from the Archive under SN 8313.)
Public Health Portfolio dataset
nihr.opendatasoft.com
nihr.aws-ec2-eu-central-1.opendatasoft.com
csv, excel, json
Updated May 29, 2025
Cite
(2025). Public Health Portfolio dataset [Dataset]. https://nihr.opendatasoft.com/explore/dataset/phof-datase/
Explore at:
Available download formats: excel, json, csv
Dataset updated
May 29, 2025
License
Open Government Licence 3.0 (http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/)
License information was derived automatically
Description
The NIHR is one of the main funders of public health research in the UK. Public health research falls within the remit of a range of NIHR Research Programmes, NIHR Centres of Excellence and Facilities, plus the NIHR Academy. NIHR awards from all NIHR Research Programmes and the NIHR Academy that were funded between January 2006 and the present extraction date are eligible for inclusion in this dataset. Agreed inclusion/exclusion criteria are used to categorise awards as public health awards (see below). Following inclusion in the dataset, public health awards are second level coded to one of the four Public Health Outcomes Framework domains. These domains are: (1) wider determinants; (2) health improvement; (3) health protection; (4) healthcare and premature mortality.

More information on the Public Health Outcomes Framework domains can be found here.

This dataset is updated quarterly to include new NIHR awards categorised as public health awards. Please note that for those Public Health Research Programme projects showing an Award Budget of £0.00, the project is undertaken by an on-call team (for example, PHIRST, Public Health Review Team, or Knowledge Mobilisation Team) as part of an ongoing programme of work.

Inclusion criteria

The NIHR Public Health Overview project team worked with colleagues across NIHR public health research to define the inclusion criteria for NIHR public health research awards. NIHR awards are categorised as public health awards if they are determined to be ‘investigations of interventions in, or studies of, populations that are anticipated to have an effect on health or on health inequity at a population level.’ This definition of public health is intentionally broad to capture the wide range of NIHR public health awards across prevention, health improvement, health protection, and healthcare services (both within and outside of NHS settings). This dataset does not reflect the NIHR’s total investment in public health research. The intention is to showcase a subset of the wider NIHR public health portfolio. This dataset includes NIHR awards categorised as public health awards from NIHR Research Programmes and the NIHR Academy. This dataset does not currently include public health awards or projects funded by any of the three NIHR Research Schools or any of the NIHR Centres of Excellence and Facilities. Therefore, awards from the NIHR Schools for Public Health, Primary Care and Social Care, NIHR Public Health Policy Research Unit and the NIHR Health Protection Research Units do not feature in this curated portfolio.

Disclaimers

Users of this dataset should acknowledge the broad definition of public health that has been used to develop the inclusion criteria for this dataset. This caveat applies to all data within the dataset irrespective of the funding NIHR Research Programme or NIHR Academy award.

Please note that this dataset is currently subject to a limited data quality review. We are working to improve our data collection methodologies. Please also note that some awards may also appear in other NIHR curated datasets.

Further information

Further information on the individual awards shown in the dataset can be found on the NIHR’s Funding & Awards website here. Further information on individual NIHR Research Programme’s decision making processes for funding health and social care research can be found here.

Further information on NIHR’s investment in public health research can be found as follows: NIHR School for Public Health here. NIHR Public Health Policy Research Unit here. NIHR Health Protection Research Units here. NIHR Public Health Research Programme Health Determinants Research Collaborations (HDRC) here. NIHR Public Health Research Programme Public Health Intervention Responsive Studies Teams (PHIRST) here.
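
To illustrate the second level coding and the £0.00 budget note, the sketch below counts awards per Public Health Outcomes Framework domain and lists zero-budget awards from a CSV export. The column names (award_id, phof_domain, award_budget), the numeric domain codes, and the sample rows are all hypothetical; the real export may label these fields differently or store the domain as text.

```python
import csv
import io
from collections import Counter

# The four domains named in the description; using 1-4 as codes is an assumption.
PHOF_DOMAINS = {
    1: "wider determinants",
    2: "health improvement",
    3: "health protection",
    4: "healthcare and premature mortality",
}

# Invented rows standing in for a CSV download of the dataset.
SAMPLE_CSV = (
    "award_id,phof_domain,award_budget\n"
    "A-001,2,125000.00\n"
    "A-002,1,0.00\n"
    "A-003,2,98000.50\n"
)


def summarise_awards(csv_text):
    """Count awards per domain and collect awards with a zero budget."""
    counts = Counter()
    zero_budget = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        counts[PHOF_DOMAINS[int(row["phof_domain"])]] += 1
        if float(row["award_budget"]) == 0.0:
            # Per the description, these are run by an on-call team.
            zero_budget.append(row["award_id"])
    return counts, zero_budget


domain_counts, on_call_awards = summarise_awards(SAMPLE_CSV)
print(dict(domain_counts), on_call_awards)
```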
Data from: Population Assessment of Tobacco and Health (PATH) Study [United...
icpsr.umich.edu
Updated Jun 27, 2025
Cite
Inter-university Consortium for Political and Social Research [distributor] (2025). Population Assessment of Tobacco and Health (PATH) Study [United States] Restricted-Use Files [Dataset]. http://doi.org/10.3886/ICPSR36231.v42
Unique identifier
https://doi.org/10.3886/ICPSR36231.v42
Dataset updated
Jun 27, 2025
Dataset provided by
Inter-university Consortium for Political and Social Research (https://www.icpsr.umich.edu/web/pages/)
License
https://www.icpsr.umich.edu/web/ICPSR/studies/36231/terms
Area covered
United States
Description
The PATH Study was launched in 2011 to inform the Food and Drug Administration's regulatory activities under the Family Smoking Prevention and Tobacco Control Act (TCA). The PATH Study is a collaboration between the National Institute on Drug Abuse (NIDA), National Institutes of Health (NIH), and the Center for Tobacco Products (CTP), Food and Drug Administration (FDA). The study sampled over 150,000 mailing addresses across the United States to create a national sample of people who use or do not use tobacco. 45,971 adults and youth constitute the first (baseline) wave, Wave 1, of data collected by this longitudinal cohort study. These 45,971 adults and youth along with 7,207 "shadow youth" (youth ages 9 to 11 sampled at Wave 1) make up the 53,178 participants that constitute the Wave 1 Cohort. Respondents are asked to complete an interview at each follow-up wave. Youth who turn 18 by the current wave of data collection are considered "aged-up adults" and are invited to complete the Adult Interview. Additionally, "shadow youth" are considered "aged-up youth" upon turning 12 years old, when they are asked to complete an interview after parental consent. At Wave 4, a probability sample of 14,098 adults, youth, and shadow youth ages 10 to 11 was selected from the civilian, noninstitutionalized population (CNP) at the time of Wave 4. This sample was recruited from residential addresses not selected for Wave 1 in the same sampled Primary Sampling Unit (PSU)s and segments using similar within-household sampling procedures. This "replenishment sample" was combined for estimation and analysis purposes with Wave 4 adult and youth respondents from the Wave 1 Cohort who were in the CNP at the time of Wave 4. This combined set of Wave 4 participants, 52,731 participants in total, forms the Wave 4 Cohort. At Wave 7, a probability sample of 14,863 adults, youth, and shadow youth ages 9 to 11 was selected from the CNP at the time of Wave 7. 
This sample was recruited from residential addresses not selected for Wave 1 or Wave 4 in the same sampled PSUs and segments using similar within-household sampling procedures. This "second replenishment sample" was combined for estimation and analysis purposes with the Wave 7 adult and youth respondents from the Wave 4 Cohorts who were at least age 15 and in the CNP at the time of Wave 7. This combined set of Wave 7 participants, 46,169 participants in total, forms the Wave 7 Cohort. Please refer to the Restricted-Use Files User Guide that provides further details about children designated as "shadow youth" and the formation of the Wave 1, Wave 4, and Wave 7 Cohorts. Dataset 0002 (DS0002) contains the data from the State Design Data. This file contains 7 variables and 82,139 cases. The state identifier in the State Design file reflects the participant's state of residence at the time of selection and recruitment for the PATH Study. Dataset 1011 (DS1011) contains the data from the Wave 1 Adult Questionnaire. This data file contains 2,021 variables and 32,320 cases. Each of the cases represents a single, completed interview. Dataset 1012 (DS1012) contains the data from the Wave 1 Youth and Parent Questionnaire. This file contains 1,431 variables and 13,651 cases. Dataset 1411 (DS1411) contains the Wave 1 State Identifier data for Adults and has 5 variables and 32,320 cases. Dataset 1412 (DS1412) contains the Wave 1 State Identifier data for Youth (and Parents) and has 5 variables and 13,651 cases. The same 5 variables are in each State Identifier dataset, including PERSONID for linking the State Identifier to the questionnaire and biomarker data and 3 variables designating the state (state Federal Information Processing System (FIPS), state abbreviation, and full name of the state). The State Identifier values in these datasets represent participants' state of residence at the time of Wave 1, which is also their state of residence at the time of recruitment. 
Dataset 1611 (DS1611) contains the Tobacco Universal Product Code (UPC) data from Wave 1. This data file contains 32 variables and 8,601 cases. This file contains UPC values on the packages of tobacco products used or in the possession of adult respondents at the time of Wave 1. The UPC values can be used to identify and validate the specific products used by respondents and augment the analyses of the characteristics of tobacco products used
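
The description notes that PERSONID links the State Identifier files to the questionnaire data. A minimal Python sketch of such a link follows, together with a check that the quoted Wave 1 counts add up. Only PERSONID and the case counts come from the description; the other column names (STATE_ABBR, EVER_SMOKED), the sample rows, and the helper `attach_state` are hypothetical.

```python
# Counts quoted in the description.
WAVE1_ADULT_CASES = 32_320
WAVE1_YOUTH_CASES = 13_651
WAVE1_SHADOW_YOUTH = 7_207

wave1_respondents = WAVE1_ADULT_CASES + WAVE1_YOUTH_CASES  # adults and youth
wave1_cohort = wave1_respondents + WAVE1_SHADOW_YOUTH      # plus shadow youth

# Invented rows; real files are restricted-use and have many more variables.
questionnaire_rows = [
    {"PERSONID": "P0001", "EVER_SMOKED": "yes"},
    {"PERSONID": "P0002", "EVER_SMOKED": "no"},
]
state_rows = [
    {"PERSONID": "P0001", "STATE_ABBR": "OH"},
]


def attach_state(questionnaire, states):
    """Left-join the state abbreviation onto questionnaire rows by PERSONID."""
    state_by_person = {row["PERSONID"]: row for row in states}
    merged = []
    for row in questionnaire:
        state = state_by_person.get(row["PERSONID"], {})
        merged.append({**row, "STATE_ABBR": state.get("STATE_ABBR")})
    return merged


linked = attach_state(questionnaire_rows, state_rows)
print(wave1_respondents, wave1_cohort, linked)
```

Respondents with no matching state row keep a STATE_ABBR of None rather than being dropped, so the questionnaire row count is preserved.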
Longitudinal ALS EEG Dataset for Motor Imagery Studies
rdr.ucl.ac.uk
bin
Updated Jan 24, 2025
Cite
Rishan Patel; Dai Jiang; Barney Bryson; Tom Carlson; Andreas Demosthenous; Andrew Geronimo (2025). Longitudinal ALS EEG Dataset for Motor Imagery Studies [Dataset]. http://doi.org/10.5522/04/28156016.v1
Explore at:
Available download formats: bin
Unique identifier
https://doi.org/10.5522/04/28156016.v1
Dataset updated
Jan 24, 2025
Dataset provided by
University College London
Authors
Rishan Patel; Dai Jiang; Barney Bryson; Tom Carlson; Andreas Demosthenous; Andrew Geronimo
License
Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
License information was derived automatically
Description
This dataset comprises EEG recordings from eight ALS patients aged between 45.5 and 74 years. Patients exhibited revised ALS Functional Rating Scale (ALSFRS-R) scores ranging from 0 to 46, with time since symptom onset (TSSO) varying between 12 and 113 months. Notably, no disease progression was reported during the study period, ensuring stability in clinical conditions. The participants were recruited from the Penn State Hershey Medical Center ALS Clinic and had confirmed ALS diagnoses without significant dementia. This rigorous selection criterion ensured the validity and reliability of the dataset for motor imagery analysis in an ALS population.

The EEG data were collected using 19 electrodes placed according to the international 10-20 system (FP1, FP2, F7, F3, FZ, F4, F8, T7, C3, CZ, C4, T8, P7, P3, PZ, P4, P8, O1, O2), with signals referenced to linked earlobes and a ground electrode at FPz. Additionally, three electrooculogram (EOG) electrodes were employed to facilitate artifact removal, maintaining impedance levels below 10 kΩ throughout data acquisition. The data were amplified using two g.USBamp systems (g.tec GmbH) and recorded via the BCI2000 software suite, with supplementary preprocessing in MATLAB. All experimental procedures adhered strictly to Penn State University’s IRB protocol PRAMSO40647EP, ensuring ethical compliance.

Each participant underwent four brain-computer interface (BCI) sessions conducted over a period of 1 to 2 months. Each session consisted of four runs, with 10 trials per class (left hand, right hand, and rest) for a total of 40 trials per session. The sessions began with a calibration run to initialize the system, followed by feedback runs during which participants controlled a cursor's movement through motor imagery, specifically imagined grasping movements. The study design, focused on motor imagery (MI), generated a total of 160 trials per participant over two months.

This dataset holds significance in studying the longitudinal dynamics of motor imagery decoding in ALS patients. To ensure reproducibility of our findings and to promote advancements in the field, we have received explicit permission from Prof. Geronimo of Penn State University to distribute this dataset in the processed format for research purposes. The original publication of this collection can be found below.

How to use this dataset: This dataset is structured in MATLAB as a collection of subject-specific structs, where each subject is represented as a single struct. Each struct contains three fields:
L: Trials corresponding to Left Motor Imagery.
R: Trials corresponding to Right Motor Imagery.
Re: Trials corresponding to Rest state.
Each field contains an array of trials, where each trial is a matrix with rows as timestamps and columns as channels.

Primary Collection: Geronimo A, Simmons Z, Schiff SJ. Performance predictors of brain-computer interfaces in patients with amyotrophic lateral sclerosis. Journal of neural engineering 2016 13. 10.1088/1741-2560/13/2/026002.

All code for any publications with this data has been made publicly available at the following links:
https://github.com/rishannp/Auto-Adaptive-FBCSP
https://github.com/rishannp/Motor-Imagery---Graph-Attention-Network
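
The struct layout described above (fields L, R, and Re, each holding trials stored as timestamp-by-channel matrices) can be mirrored in Python as a dictionary of lists. The sketch below is only a stand-in for data loaded from the MATLAB files: the trial lengths are invented, and the assumption that each trial has exactly 19 columns (one per EEG electrode) is not confirmed by the description.

```python
N_CHANNELS = 19  # assumption: one column per EEG electrode listed in the description


def make_trial(n_samples, n_channels=N_CHANNELS):
    """Build a placeholder trial: rows are timestamps, columns are channels."""
    return [[0.0] * n_channels for _ in range(n_samples)]


# Stand-in for one subject's struct; real trials would come from the dataset files.
subject = {
    "L": [make_trial(512), make_trial(480)],   # left motor imagery
    "R": [make_trial(512)],                    # right motor imagery
    "Re": [make_trial(256), make_trial(256)],  # rest
}


def summarise_subject(struct):
    """Report the trial count and distinct (timestamps, channels) shapes per field."""
    summary = {}
    for label, trials in struct.items():
        shapes = sorted({(len(trial), len(trial[0])) for trial in trials})
        summary[label] = {"n_trials": len(trials), "shapes": shapes}
    return summary


summary = summarise_subject(subject)

# Trial total stated in the description: 4 sessions of 40 trials each.
trials_per_participant = 4 * 40
print(summary, trials_per_participant)
```

Checking the shapes per field before analysis is a cheap way to spot trials of unequal length, which matter for any method that stacks trials into a single array.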
Understanding Society: Calendar Year Dataset, 2022
beta.ukdataservice.ac.uk
Updated 2024
Cite
Institute For Social University Of Essex (2024). Understanding Society: Calendar Year Dataset, 2022 [Dataset]. http://doi.org/10.5255/ukda-sn-9333-1
Unique identifier
https://doi.org/10.5255/ukda-sn-9333-1
Dataset updated
2024
Dataset provided by
UK Data Service (https://ukdataservice.ac.uk/)
datacite
Authors
Institute For Social University Of Essex
Description
Understanding Society (the UK Household Longitudinal Study), which began in 2009, is conducted by the Institute for Social and Economic Research (ISER) at the University of Essex and the survey research organisations Verian Group (formerly Kantar Public) and NatCen. It builds on and incorporates the British Household Panel Survey (BHPS), which began in 1991.

The Understanding Society: Calendar Year Dataset, 2022, is designed for analysts to conduct cross-sectional analysis for the 2022 calendar year. The Calendar Year datasets combine data collected in a specific year from across multiple waves and these are released as separate calendar year studies, with appropriate analysis weights, starting with the 2020 Calendar Year dataset. Each subsequent year, an additional yearly study is released.

The Calendar Year data is designed to enable timely cross-sectional analysis of individuals and households in a calendar year. Such analysis can, however, only involve variables that are collected in every wave (excluding rotating content, which is only collected in some of the waves). Due to overlapping fieldwork, the data files combine data collected in the three waves that make up a calendar year. Analysis cannot be restricted to data collected in one wave during a calendar year, as this subset will not be representative of the population. Further details and guidance on this study can be found in the document 9333_main_survey_calendar_year_user_guide_2022.

These calendar year datasets should be used for cross-sectional analysis only. For those interested in longitudinal analyses using Understanding Society please access the main survey datasets: End User Licence version or Special Licence version.

Understanding Society, the UK Household Longitudinal Study, started in 2009 with a general population sample (GPS) of around 26,000 private households in the UK and an ethnic minority boost sample (EMBS) of 4,000 households. All members of these responding households and their descendants became part of the core sample who were eligible to be interviewed every year. Anyone who joined these households after this initial wave was also interviewed as long as they lived with these core sample members to provide the household context. At each annual interview, some basic demographic information is collected about every household member, information about the household is collected from one household member, all household members aged 16 and over are eligible for adult interviews, household members aged 10-15 are eligible for youth interviews, and some information is collected about children aged 0-9 from their parents or guardians. From 1991 until 2008/9 a similar survey, the British Household Panel Survey (BHPS), was fielded. The surviving members of this survey sample were incorporated into Understanding Society in 2010. In 2015, an immigrant and ethnic minority boost sample (IEMBS) of around 2,500 households was added. In 2022, a GPS boost sample (GPS2) of around 5,700 households was added. To know more about the sample design, following rules, interview modes, incentives, consent, and questionnaire content, please see the study overview and user guide.

Co-funders

In addition to the Economic and Social Research Council, co-funders for the study included the Department of Work and Pensions, the Department for Education, the Department for Transport, the Department of Culture, Media and Sport, the Department for Community and Local Government, the Department of Health, the Scottish Government, the Welsh Assembly Government, the Northern Ireland Executive, the Department of Environment and Rural Affairs, and the Food Standards Agency.

End User Licence and Special Licence versions:

There are two versions of the Calendar Year 2022 data. One is available under the standard End User Licence (EUL) agreement (SN 9333), and the other is a Special Licence (SL) version (SN 9334). The SL version contains month and year of birth variables instead of just age, more detailed country and occupation coding for a number of variables, and various income variables that have not been top-coded (see document 9333_eul_vs_sl_variable_differences for more details). Users are advised first to obtain the standard EUL version of the data to see if they are sufficient for their research requirements. The SL data have more restrictive access conditions; prospective users of the SL version will need to complete an extra application form and demonstrate to the data owners exactly why they need access to the additional variables in order to get permission to use that version. The main longitudinal versions of the Understanding Society study may be found under SNs 6614 (EUL) and 6931 (SL).

Low- and Medium-level geographical identifiers produced for the mainstage longitudinal dataset can be used with this Calendar Year 2022 dataset, subject to SL access conditions. See the User Guide for further details.

Suitable data analysis software

These data are provided by the depositor in Stata format. Users are strongly advised to analyse them in Stata. Transfer to other formats may result in unforeseen issues. Stata SE or MP software is needed to analyse the larger files, which contain about 1,800 variables.
Public Life Data - People Moving
data.seattle.gov
cos-data.seattle.gov
+2more
application/rdfxml +5
Updated Feb 15, 2023
Cite
Seattle Department of Transportation (2023). Public Life Data - People Moving [Dataset]. https://data.seattle.gov/Community-and-Culture/Public-Life-Data-People-Moving/7rx6-5pgd
Explore at:
Available download formats: tsv, csv, application/rdfxml, application/rssxml, json, xml
Dataset updated
Feb 15, 2023
Dataset authored and provided by
Seattle Department of Transportation http://www.seattle.gov/transportation/
License
U.S. Government Works https://www.usa.gov/government-works
License information was derived automatically
Description
Provides data on people moving through space, including total number observed, gender breakdown, group size, and age groups.

The City of Seattle Department of Transportation (SDOT) is providing data from the public life studies it has conducted since 2017. These studies consist of measuring the number of people using public space and the types of activities present on select sidewalks across the city, as well as several parks and plazas. The data set is continually updated as SDOT and other parties conduct public life studies using Gehl Institute’s Public Life Data Protocol.

This dataset consists of four component spreadsheets and a GeoJSON file, which provide public life data as well as information about the study design and study locations:

1 Public Life Study: provides details on the different studies that have been conducted, including project information. https://data.seattle.gov/Transportation/Public-Life-Data-Study/7qru-sdcp

2 Public Life Location: provides details on the sites selected for each study, including various attributes to allow for comparison across sites. https://data.seattle.gov/Transportation/Public-Life-Data-Locations/fg6z-cn3y

3 Public Life People Moving: provides data on people moving through space, including total number observed, gender breakdown, group size, and age groups.

4 Public Life People Staying: provides data on people staying still in the space, including total number observed, demographic data, group size, postures, and activities. https://data.seattle.gov/Transportation/Public-Life-Data-People-Staying/5mzj-4rtf

5 Public Life Geography: A GeoJSON file with polygons of every location studied. https://data.seattle.gov/Transportation/Public-Life-Data-Geography/v4q3-5hvp

Please download and refer to the Public Life metadata document - in the attachment section below - for comprehensive information about all of the Public Life datasets.
Time Diary Study (CAPS-DIARY module)
dataverse-staging.rdmc.unc.edu
datasearch.gesis.org
Updated May 18, 2009
Cite
UNC Dataverse (2009). Time Diary Study (CAPS-DIARY module) [Dataset]. https://dataverse-staging.rdmc.unc.edu/dataset.xhtml?persistentId=hdl:1902.29/CAPS-DIARY
Explore at:
tsv(68411), application/x-sas-transport(237840), application/x-spss-por(75276), application/x-sas-transport(242160), application/x-spss-por(75850), application/x-sas-transport(240000), txt(70468), application/x-spss-por(74374), application/x-spss-por(77572), tsv(65433), txt(452140), txt(91461), application/x-sas-transport(1613120), application/x-spss-por(75358), txt(135850), txt(237380), application/x-spss-por(392206), txt(219960), txt(223730), txt(243880), application/x-sas-transport(945520), txt(437710), txt(447330), application/x-sas-transport(235680), txt(239720), tsv(65759), tsv(66745), txt(134420), txt(198510), txt(231010), application/x-spss-por(75522), text/x-sas-syntax(14192), tsv(66377), application/x-spss-por(75686), txt(218140), txt(247000), txt(229190), txt(456950), tsv(67095), txt(209820), txt(29480), txt(234130), text/x-sas-syntax(14213), tsv(67582), txt(223990), txt(227110), txt(432900), application/x-spss-por(74702), application/x-spss-por(76506), txt(248950), application/x-spss-por(75768), txt(132990), text/x-sas-syntax(14212), tsv(66338), tsv(65479), txt(442520), txt(133120), txt(220870), text/x-sas-syntax(14200), tsv(515401), txt(130390), txt(222560), txt(217100), txt(246350), tsv(66085), txt(461760), application/x-spss-por(76260), tsv(66939), txt(235560), txt(229450), txt(72104), tsv(66400), txt(211510), txt(226850), application/x-spss-por(492492), txt(205790), txt(210210), tsv(66217), tsv(66157), txt(234390), application/x-spss-por(75112), application/x-spss-por(75932), txt(224770), application/x-spss-por(74784), tsv(66192), txt(131560), txt(230100), txt(219050), tsv(382593), txt(213980), tsv(66604), txt(140140)Available download formats
Dataset updated
May 18, 2009
Dataset provided by
UNC Dataverse
License
https://dataverse-staging.rdmc.unc.edu/api/datasets/:persistentId/versions/1.0/customlicense?persistentId=hdl:1902.29/CAPS-DIARY
Description
The purpose of this project is to determine how college students distribute their activities in time (with a particular focus on academic and athletic activities) and to examine the factors that influence such distributions.

Each respondent (R) reported once about each of the seven days of the week and an additional time about either Saturday or Sunday. Rs were told the week before they were to report which day was assigned and were given a report form to complete during that day. They entered the information from that form when they returned the next week.

The activity codes included were: 0: Sleeping. 1: Attending classes. 2: Studying or preparing classroom assignments. 3: Working at a job (including CAPS). 4: Cooking, home chores, laundry, grocery shopping. 5: Errands, non-grocery shopping, gardening, animal care. 6: Eating. 7: Bathing, getting dressed, etc. 8: Sports, exercising, other physical activities. 9: Playing competitive games (cards, darts, videogames, frisbee, chess, Trivial Pursuit, etc.). 10: Participating in UNC-sponsored organizations (student government, band, sorority, etc.). 11: Listening to the radio. 12: Watching TV. 13: Reading for pleasure (not studying or reading for class). 14: Going to a movie. 15: Attending a cultural event (such as a play, concert, or museum). 16: Attending a sports event as a spectator. 17: Partying. 18: Religious activities. 19: Conversation. 20: Travel. 21: Resting. 22: Doing other things.

DIARY1-8: These datasets contain a matrix of activities by times for a particular day. Included are time period, activity code (see above), number of friends present, and number of others present. (Rs were allowed to report doing two activities at once. In these cases they were also asked to report the percentage of time during the affected time period which was allocated to the first of the two activities listed.) THE DIARY DATASETS ARE STORED IN RAW FORM. SUMMARY FILES, CALLED TIMEREP, CONTAIN MOST SUMMARY INFORMATION WHICH MIGHT BE USED IN ANALYSES. THE DIARY DATASETS CAN BE LISTED TO ALLOW UNIQUE CODING OF THE ORIGINAL DATA.

TIMEREP: The TIMEREP dataset is a summary file which gives the amount of time spent on each activity during each of the eight reporting periods and also includes more detailed information about many of the activities from follow-up questions which were asked if the respondent reported having engaged in certain activities. Data from additional questions asked of every respondent after each diary entry are also included: contact with family members, number of alcoholic drinks consumed during the 24 hour period reported on, number of friends and others present while drinking, number of cigarettes smoked on day reported about, and number of classes skipped on day reported about. Follow-up questions include detail about kind of physical activity or sports participation, kind of university organization, kind of radio program listened to and place of listening, kind of TV program watched and place of watching, kind of reading material read and topic, alcohol consumed while partying and place of partying, conversation topics, kind of travel, and activities included in the 'other' category.

Special processing is required to put the dataset into SAS format. See spec for details.
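As a rough illustration of how the activity codes and the two-activities-at-once rule described above might be applied, the following Python sketch builds a lookup table from the listed codes and totals minutes per activity. The record layout (period length in minutes, first code, optional second code, percentage allocated to the first activity) and the names `ACTIVITY_CODES` and `summarize_minutes` are assumptions made for this example; the actual DIARY file layout is defined in the study's own documentation.

```python
# Lookup table transcribed from the activity codes listed in the description.
ACTIVITY_CODES = {
    0: "Sleeping",
    1: "Attending classes",
    2: "Studying or preparing classroom assignments",
    3: "Working at a job",
    4: "Cooking, home chores, laundry, grocery shopping",
    5: "Errands, non-grocery shopping, gardening, animal care",
    6: "Eating",
    7: "Bathing, getting dressed, etc.",
    8: "Sports, exercising, other physical activities",
    9: "Playing competitive games",
    10: "Participating in UNC-sponsored organizations",
    11: "Listening to the radio",
    12: "Watching TV",
    13: "Reading for pleasure",
    14: "Going to a movie",
    15: "Attending a cultural event",
    16: "Attending a sports event as a spectator",
    17: "Partying",
    18: "Religious activities",
    19: "Conversation",
    20: "Travel",
    21: "Resting",
    22: "Doing other things",
}


def summarize_minutes(entries):
    """Total minutes per activity code.

    Each entry is a hypothetical tuple:
    (period_minutes, first_code, second_code_or_None, pct_first),
    where pct_first is the percentage of the period allocated to the
    first of two simultaneous activities.
    """
    totals = {}
    for minutes, first, second, pct_first in entries:
        if second is None:
            # Only one activity reported: it gets the whole period.
            share_first = float(minutes)
        else:
            # Two activities reported: split the period by the stated percentage.
            share_first = minutes * pct_first / 100
            totals[second] = totals.get(second, 0.0) + (minutes - share_first)
        totals[first] = totals.get(first, 0.0) + share_first
    return totals
```

For instance, a 30-minute period of studying (code 2) with the radio on (code 11), with 60% allocated to studying, would contribute 18 minutes to code 2 and 12 minutes to code 11 under this scheme.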
Spinal Cord Images - Spine MRI Dataset
kaggle.com
Updated Feb 21, 2024
Cite
Training Data (2024). Spinal Cord Images - Spine MRI Dataset [Dataset]. https://www.kaggle.com/datasets/trainingdatapro/spinal-cord-dataset/code
Explore at:
Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
Dataset updated
Feb 21, 2024
Dataset provided by
Kaggle http://kaggle.com/
Authors
Training Data
License
Attribution-NonCommercial-NoDerivs 4.0 (CC BY-NC-ND 4.0) https://creativecommons.org/licenses/by-nc-nd/4.0/
License information was derived automatically
Description
Spine MRI Dataset, Fracture Detection, Anomaly Detection & Segmentation

The dataset consists of .dcm files containing MRI scans of the spine of a person with several dystrophic changes, such as osteochondrosis, spondyloarthrosis, hemangioma, smoothed physiological lordosis, osteophytes and aggravated defects. The images are labeled by doctors and accompanied by a report in PDF format.

The dataset includes 9 studies made from different angles, which provide a comprehensive understanding of several dystrophic changes and are useful in training spine anomaly classification algorithms. Each scan includes detailed imaging of the spine, including the vertebrae, discs, nerves, and surrounding tissues.

MRI study angles in the dataset

https://www.googleapis.com/download/storage/v1/b/kaggle-user-content/o/inbox%2F12421376%2F62acce9c1d60720bdd396e036718f406%2FFrame%2084.png?generation=1708543957118470&alt=media

💴 For Commercial Usage: Full version of the dataset includes 20,000 spine studies of people with different conditions, leave a request on TrainingData to buy the dataset

Types of diseases and conditions in the full dataset:

Degeneration of discs

Osteophytes

Osteochondrosis

Hemangioma

Disk extrusion

Spondylitis

AND MANY OTHER CONDITIONS

https://www.googleapis.com/download/storage/v1/b/kaggle-user-content/o/inbox%2F12421376%2Fd2f21b9ac7dc26a3554e4647db47df57%2F3.gif?generation=1708543677763656&alt=media

Researchers and healthcare professionals can use this dataset to study spinal conditions and disorders, such as herniated discs, spinal stenosis, scoliosis, and fractures. The dataset can also be used to develop and evaluate new imaging techniques, computer algorithms for image analysis, and artificial intelligence models for automated diagnosis.

OTHER MEDICAL SPINE MRI DATASETS:

Spine MRI Dataset -Anomaly Detection

Spine Segmentation Dataset - MRI Images

Spinal Vertebrae Segmentation Dataset

Spine Magnetic Resonance Imaging Dataset

💴 Buy the Dataset: This is just an example of the data. Leave a request on https://trainingdata.pro/datasets to discuss your requirements, learn about the price and buy the dataset

Content

The dataset includes:

ST000001: includes subfolders with 9 studies. Each study includes MRI-scans in .dcm and .jpg formats,

DICOMDIR: includes information about the patient's condition and links to access files,

Spine_MRI_2.pdf: includes medical report, provided by the radiologist,

.csv file: includes id of the studies and the number of files

Medical reports include the following data:

Patient's demographic information,

Description of the case,

Preliminary diagnosis,

Recommendations on the further actions

All patients consented to the publication of data

Medical data might be collected in accordance with your requirements.

TrainingData provides high-quality data annotation tailored to your needs

keywords: visual, label, positive, negative, symptoms, clinically, sensory, varicella, syndrome, predictors, diagnosed, rsna cervical, image train, segmentations meta, spine train, mri spine scans, spinal imaging, radiology dataset, neuroimaging, medical imaging data, image segmentation, lumbar spine mri, thoracic spine mri, cervical spine mri, spine anatomy, spinal cord mri, orthopedic imaging, radiologist dataset, mri scan analysis, spine mri dataset, machine learning medical imaging, spinal abnormalities, image classification, neural network spine scans, mri data analysis, deep learning medical imaging, mri image processing, spine tumor detection, spine injury diagnosis, mri image segmentation, spine mri classification, artificial intelligence in radiology, spine abnormalities detection, spine pathology analysis, mri feature extraction, tomography, cloud
National Child Development Study: Linked Health Administrative Datasets...
beta.ukdataservice.ac.uk
datacatalogue.cessda.eu
Updated 2025
Cite
UCL Institute of Education University College London (2025). National Child Development Study: Linked Health Administrative Datasets (Hospital Episode Statistics), England, 1997-2023: Secure Access [Dataset]. http://doi.org/10.5255/ukda-sn-8697-3
Explore at:
Unique identifier
https://doi.org/10.5255/ukda-sn-8697-3
Dataset updated
2025
Dataset provided by
UK Data Service https://ukdataservice.ac.uk/
datacite
Authors
UCL Institute of Education University College London
Area covered
England
Description
The National Child Development Study (NCDS) is a continuing longitudinal study that seeks to follow the lives of all those living in Great Britain who were born in one particular week in 1958. The aim of the study is to improve understanding of the factors affecting human development over the whole lifespan.

The NCDS has its origins in the Perinatal Mortality Survey (PMS) (the original PMS study is held at the UK Data Archive under SN 2137). This study was sponsored by the National Birthday Trust Fund and designed to examine the social and obstetric factors associated with stillbirth and death in early infancy among the 17,000 children born in England, Scotland and Wales in that one week. Selected data from the PMS form NCDS sweep 0, held alongside NCDS sweeps 1-3, under SN 5565.

Survey and Biomeasures Data (GN 33004):
To date there have been ten attempts to trace all members of the birth cohort in order to monitor their physical, educational and social development. The first three sweeps were carried out by the National Children's Bureau, in 1965, when respondents were aged 7, in 1969, aged 11, and in 1974, aged 16 (these sweeps form NCDS1-3, held together with NCDS0 under SN 5565). The fourth sweep, also carried out by the National Children's Bureau, was conducted in 1981, when respondents were aged 23 (held under SN 5566). In 1985 the NCDS moved to the Social Statistics Research Unit (SSRU) - now known as the Centre for Longitudinal Studies (CLS). The fifth sweep was carried out in 1991, when respondents were aged 33 (held under SN 5567). For the sixth sweep, conducted in 1999-2000, when respondents were aged 42 (NCDS6, held under SN 5578), fieldwork was combined with the 1999-2000 wave of the 1970 Birth Cohort Study (BCS70), which was also conducted by CLS (and held under GN 33229). The seventh sweep was conducted in 2004-2005 when the respondents were aged 46 (held under SN 5579), the eighth sweep was conducted in 2008-2009 when respondents were aged 50 (held under SN 6137), the ninth sweep was conducted in 2013 when respondents were aged 55 (held under SN 7669), and the tenth sweep was conducted in 2020-24 when the respondents were aged 60-64 (held under SN 9412).

A Secure Access version of the NCDS is available under SN 9413, containing detailed sensitive variables not available under Safeguarded access (currently only sweep 10 data). Variables include uncommon health conditions (including age at diagnosis), full employment codes and income/finance details, and specific life circumstances (e.g. pregnancy details, year/age of emigration from GB).

Four separate datasets covering responses to NCDS over all sweeps are available. National Child Development Deaths Dataset: Special Licence Access (SN 7717) covers deaths; National Child Development Study Response and Outcomes Dataset (SN 5560) covers all other responses and outcomes; National Child Development Study: Partnership Histories (SN 6940) includes data on live-in relationships; and National Child Development Study: Activity Histories (SN 6942) covers work and non-work activities. Users are advised to order these studies alongside the other waves of NCDS.

From 2002-2004, a Biomedical Survey was completed and is available under End User Licence (EUL) (SN 8731) and Special Licence (SL) (SN 5594). Proteomics analyses of blood samples are available under SL SN 9254.

Linked Geographical Data (GN 33497):
A number of geographical variables are available, under more restrictive access conditions, which can be linked to the NCDS EUL and SL access studies.

Linked Administrative Data (GN 33396):
A number of linked administrative datasets are available, under more restrictive access conditions, which can be linked to the NCDS EUL and SL access studies. These include a Deaths dataset (SN 7717) available under SL and the Linked Health Administrative Datasets (SN 8697) available under Secure Access.

Multi-omics Data and Risk Scores Data (GN 33592)
Proteomics analyses were run on the blood samples collected from NCDS participants in 2002-2004 and are available under SL SN 9254. Metabolomics analyses were conducted on respondents of sweep 10 and are available under SL SN 9411.

Additional Sub-Studies (GN 33562):
In addition to the main NCDS sweeps, further studies have also been conducted on a range of subjects such as parent migration, unemployment, behavioural studies and respondent essays. The full list of NCDS studies available from the UK Data Service can be found on the NCDS series access data webpage.

How to access genetic and/or bio-medical sample data from a range of longitudinal surveys:
For information on how to access biomedical data from NCDS that are not held at the UKDS, see the CLS Genetic data and biological samples webpage.

Further information about the full NCDS series can be found on the Centre for Longitudinal Studies website.

The National Child Development Study: Linked Health Administrative Datasets (Hospital Episode Statistics), England, 1997-2023: Secure Access includes data files from the NHS Digital HES database for those cohort members who provided consent to health data linkage in the Age 50 sweep. The HES database contains information about all hospital admissions in England. The following linked HES data are available:
1) Accident and Emergency (A&E)
The A&E dataset details each attendance to an Accident and Emergency care facility in England, between 01-04-2007 and 31-03-2020 (inclusive). It includes major A&E departments, single speciality A&E departments, minor injury units and walk-in centres in England.

2) Admitted Patient Care (APC)
The APC data summarises episodes of care for admitted patients, where the episode occurred between 01-04-1997 and 31-03-2023 (inclusive).

3) Critical Care (CC)
The CC dataset covers records of critical care activity between 01-04-2009 and 31-03-2023 (inclusive).

4) Out Patient (OP)
The OP dataset lists the outpatient appointments between 01-04-2003 and 31-03-2023 (inclusive).

5) Emergency Care Dataset (ECDS)
The ECDS lists the emergency care appointments between 01-04-2020 and 31-03-2023 (inclusive).

6) Consent data
The consents dataset describes consent to linkage, and is current at the time of deposit.
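The coverage windows listed above can be captured in a small lookup, which may help when checking which linked HES files could contain a given event date. This is only a sketch: the dictionary `HES_COVERAGE` and the helper `datasets_covering` are hypothetical names introduced here, and the dates are read as day-month-year (1 April to 31 March), which is an assumption consistent with the financial-year framing of the study description.

```python
from datetime import date

# Coverage windows (inclusive) transcribed from the list above.
# Dates are assumed to be in day-month-year order.
HES_COVERAGE = {
    "A&E": (date(2007, 4, 1), date(2020, 3, 31)),
    "APC": (date(1997, 4, 1), date(2023, 3, 31)),
    "CC": (date(2009, 4, 1), date(2023, 3, 31)),
    "OP": (date(2003, 4, 1), date(2023, 3, 31)),
    "ECDS": (date(2020, 4, 1), date(2023, 3, 31)),
}


def datasets_covering(day):
    """Return the names of the HES datasets whose window includes `day`."""
    return [
        name
        for name, (start, end) in HES_COVERAGE.items()
        if start <= day <= end
    ]
```

Under these windows, an event dated 1 June 2021 would fall inside the APC, CC, OP and ECDS files but not the A&E file, which ends on 31 March 2020.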

CLS/ NHS Digital Sub-licence agreement
NHS Digital has given CLS permission for onward sharing of the NCDS/HES dataset via the UKDS Secure Lab. In order to ensure data minimisation, NHS Digital requires that researchers only access the HES variables needed for their approved research project. Therefore, the HES linked data provided by the UKDS to approved researchers will be subject to sub-setting of variables. The researcher will need to request a specific sub-set of variables from the NCDS/HES data dictionary, which will subsequently be made available within their UKDS Secure Account. Once the researcher has finished their research, the UKDS will delete the tailored dataset for that specific project. Any party wishing to access the data deposited at the UK Data Service will be required to enter into a Licence agreement with CLS (UCL), in addition to the agreements signed with the UKDS, provided in the application pack.
CLS Hospital Episode Statistics data access update July 2025
From March 2027, HES data linked to all four CLS studies will no longer be available via the UK Data Service. For projects ending before March 2027, users should continue to apply via UKDS. However, if access to a wider range of linked Longitudinal Population Studies data is needed, UKLLC might be more suitable. For projects ending after March 2027, users must apply via UKLLC.
Latest edition information
For the third edition (April 2025), the data have been updated to include linked data for the financial years 2017-2022. In addition, a new dataset for Emergency Care (ECDS) episodes has been added, along with a dataset detailing the consent for linkage. Furthermore, the study documentation has also been updated.
NIHR Midlands ARC Dataset: Outcomes from out-of-hospital cardiac arrest
healthdatagateway.org
unknown
Updated Oct 31, 2024
Cite
This publication uses data from PIONEER, an ethically approved database and analytical environment (East Midlands Derby Research Ethics 20/EM/0158) (2024). NIHR Midlands ARC Dataset: Outcomes from out-of-hospital cardiac arrest [Dataset]. https://healthdatagateway.org/en/dataset/935
Explore at:
Available download formats: unknown
Dataset updated
Oct 31, 2024
Dataset authored and provided by
This publication uses data from PIONEER, an ethically approved database and analytical environment (East Midlands Derby Research Ethics 20/EM/0158)
License
https://www.pioneerdatahub.co.uk/data/data-request-process/
Description
Resuscitation to Recovery is the national framework to improve care of people with out-of-hospital cardiac arrest (OHCA). Despite this, survival rates continue to be around 10%. Recently, an OHCA care pathway has been developed by the British Cardiovascular Interventional Society, aiming to reduce unwarranted variation in interventional cardiovascular practice for OHCA. However, little research has tracked the care OHCA patients receive along the whole pathway.

To support a better understanding of OHCA care pathways, PIONEER, working with the NIHR Midlands Applied Research Collaboration and West Midlands Ambulance Service, has curated a highly granular dataset of 1588 OHCA events. The data includes demography, comorbidities, initial presentation, serial physiology, assessments, treatment provided both before and after West Midlands Ambulance Service arrival, onward hospital investigations, management and outcomes, including future healthcare use. The current dataset includes OHCA from 2018 to 2022 but can be expanded to assess other timelines of interest.

Geography: The West Midlands (WM) has a population of 6 million & includes a diverse ethnic & socio-economic mix. UHB is one of the largest NHS Trusts in England, providing direct acute services & specialist care across four hospital sites, with 2.2 million patient episodes per year, 2750 beds & > 120 ITU bed capacity. UHB runs a fully electronic healthcare record (EHR) (PICS; Birmingham Systems), a shared primary & secondary care record (Your Care Connected) & a patient portal “My Health”.

Data set availability: Data access is available via the PIONEER Hub for projects which will benefit the public or patients. This can be by developing a new understanding of disease, by providing insights into how to improve care, or by developing new models, tools, treatments, or care processes. Data access can be provided to NHS, academic, commercial, policy and third sector organisations. Applications from SMEs are welcome. There is a single data access process, with public oversight provided by our public review committee, the Data Trust Committee. Contact pioneer@uhb.nhs.uk or visit www.pioneerdatahub.co.uk for more details.

Available supplementary data: Matched controls; ambulance and community data. Unstructured data (images). We can provide the dataset in OMOP and other common data models and can build synthetic data to meet bespoke requirements.

Available supplementary support: Analytics, model build, validation & refinement; A.I. support. Data partner support for ETL (extract, transform & load) processes. Bespoke and “off the shelf” Trusted Research Environment (TRE) build and run. Consultancy with clinical, patient & end-user and purchaser access/ support. Support for regulatory requirements. Cohort discovery. Data-driven trials and “fast screen” services to assess population size.

Alex Berger; Matteo Locatelli; Ximena Arcila-Londono; Ghazala Hayat; Nicholas Olney; James Wymer; Kelly Gwathmey; Christian Lunetta; Terry Heiman-Patterson; Senda Ajroud-Driss; Eric A. Macklin; Marie-Abèle Bind; Kimberly Goslin; Tamela Stuchiner; Lauren Brown; Tracy Bazan; Tyler Regan; Ashley Adamo; Valerie Ferment; Carly Schroeder; Megan Somers; Georgios Manousakis; Kenneth Faulconer; Ervin Sinani; Julia Mirochnick; Hong Yu; Alexander V. Sherman; David Walk (2024). The natural history of ALS: Baseline characteristics from a multicenter clinical cohort [Dataset]. http://doi.org/10.6084/m9.figshare.23701648.v1

Data from: The natural history of ALS: Baseline characteristics from a multicenter clinical cohort

Explore at:

Available download formats: docx

Unique identifier

https://doi.org/10.6084/m9.figshare.23701648.v1

Dataset updated

Feb 9, 2024

Dataset provided by

Taylor & Francis

Authors

License

Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically

Description

Amyotrophic lateral sclerosis (ALS) is a rare disease with urgent need for improved treatment. Despite the acceleration of research in recent years, there is a need to understand the full natural history of the disease. As only 40% of people living with ALS are eligible for typical clinical trials, clinical trial datasets may not generalize to the full ALS population. While biomarker and cohort studies have more generous inclusion criteria, these too may not represent the full range of phenotypes, particularly if the burden for participation is high. To permit a complete understanding of the heterogeneity of ALS, comprehensive data on the full range of people with ALS is needed. The ALS Natural History Consortium (ALS NHC) consists of nine ALS clinics and was created to build a comprehensive dataset reflective of the ALS population. At each clinic, most patients are asked to participate and about 95% do. After obtaining consent, a minimum dataset is abstracted from each participant’s electronic health record. Participant burden is therefore minimal. Data on 1925 ALS patients were submitted as of 9 December 2022. ALS NHC participants were more heterogeneous relative to anonymized clinical trial data from the Pooled Resource Open-Access ALS Clinical Trials (PRO-ACT) database. The ALS NHC includes ALS patients of older age of onset and a broader distribution of El Escorial categories, than the PRO-ACT database. ALS NHC participants had a higher diversity of diagnostic and demographic data compared to ALS clinical trial participants.

Key Messages

What is already known on this topic: Current knowledge of the natural history of ALS derives largely from regional and national registries that have broad representation of the population of people living with ALS but do not always collect covariates and clinical outcomes. Clinical studies with rich datasets of participant characteristics and validated clinical outcomes have stricter inclusion and exclusion criteria that may not be generalizable to the full ALS population.

What this study adds: To bridge this gap, we collected baseline characteristics for a sample of the population of people living with ALS seen at a consortium of ALS clinics that collect extensive, pre-specified participant-level data, including validated outcome measures.

How this study might affect research, practice, or policy: A clinic-based longitudinal dataset can improve our understanding of the natural history of ALS and can be used to inform the design and analysis of clinical trials and health economics studies, to help the prediction of clinical course, to find matched controls for open label extension trials and expanded access protocols, and to document real-world evidence of the impact of novel treatments and changes in care practice.

Data from: The natural history of ALS: Baseline characteristics from a...

Data cleaning using unstructured data

Randomized controlled clinical trials with tagged information regarding the...

Data (i.e., evidence) about evidence based medicine

CK4Gen, High Utility Synthetic Survival Datasets

Dataset from A Randomised, Placebo Controlled, Ascending, Repeat Dose Study...

Africa Centre for Health and Population Studies

‘District Attorney Trials’ analyzed by Analyst-2

The NIMH Healthy Research Volunteer Dataset

The National Institute of Mental Health (NIMH) Research Volunteer (RV) Data Set

Release Notes

Release v2.0.0

Participant Eligibility

Clinical Measures

Biological and physiological measures

Imaging Studies

MRI

MEG

Specific Survey and Test Data within Data Set

1. Preliminary Online Screening Questionnaires

autism prevalence studies

National Child Development Study: Social Participation and Identity,...

Public Health Portfolio dataset

Data from: Population Assessment of Tobacco and Health (PATH) Study [United...

Longitudinal ALS EEG Dataset for Motor Imagery Studies

Understanding Society: Calendar Year Dataset, 2022

Public Life Data - People Moving

Time Diary Study (CAPS-DIARY module)

Spinal Cord Images - Spine MRI Dataset

Spine MRI Dataset, Fracture Detection, Anomaly Detection & Segmentation

MRI study angles in the dataset

💴 For Commercial Usage: Full version of the dataset includes 20,000 spine studies of people with different conditions, leave a request on TrainingData to buy the dataset

Types of diseases and conditions in the full dataset:

OTHER MEDICAL SPINE MRI DATASETS:

💴 Buy the Dataset: This is just an example of the data. Leave a request on https://trainingdata.pro/datasets to discuss your requirements, learn about the price and buy the dataset

Content

The dataset includes:

Medical reports include the following data:

Medical data might be collected in accordance with your requirements.

TrainingData provides high-quality data annotation tailored to your needs

National Child Development Study: Linked Health Administrative Datasets...

NIHR Midlands ARC Dataset: Outcomes from out-of-hospital cardiac arrest

Data from: The natural history of ALS: Baseline characteristics from a multicenter clinical cohort