https://www.icpsr.umich.edu/web/ICPSR/studies/38464/terms
This is a collection of 2,888 clinical MRIs of patients admitted to a National Stroke Center over ten years with a clinical diagnosis of acute or early subacute stroke. The collection includes diverse MRI modalities and protocols. The infarct core was manually defined on the diffusion-weighted images; the images are provided in native subject space and in standard space (MNI), in Neuroimaging Informatics Technology Initiative (NIfTI) format. The data format and organization follow Brain Imaging Data Structure (BIDS) guidelines. The collection includes diverse metadata, comprising demographic information, a basic clinical profile (NIH Stroke Scale/Score (NIHSS), hospitalization duration, blood pressure at admission, BMI, and associated health conditions), and an expert description of the acute lesion. This resource provides high-quality, large-scale, human-supervised knowledge to feed artificial intelligence models and enable further development of tools to automate several tasks that currently rely on human labor, such as lesion segmentation, labeling, calculation of disease-relevant scores, and lesion-based studies relating function to lesion frequency maps. The dataset is divided into folders of 60-70 subjects. Each folder contains the "raw data" (multimodal MRIs, in native space), "DWI-mask" (manually defined lesion masks, brain masks, and 3D DWI, b0, and recalculated ADC), "DWI-MNI-IntensityNormalized" (DWI and lesion masks in MNI coordinates), and "phenotype" (individual ".tsv" files with metadata of each subject). The "templates" folder contains image averages and lesion frequency maps. The "documentation" folder contains comprehensive data documentation, the phenotypes of the whole dataset, and the data dictionary.
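As a rough illustration of working with the per-subject ".tsv" phenotype files, the sketch below reads a tab-separated table with the Python standard library. The column names (`participant_id`, `age`, `nihss`) and the sample rows are hypothetical; the real names are defined in the dataset's data dictionary. Treating `n/a` as a missing value follows the general BIDS convention for tabular files.

```python
import csv
import io

# Hypothetical phenotype content; real column names come from the data dictionary.
SAMPLE_TSV = (
    "participant_id\tage\tnihss\n"
    "sub-0001\t67\t4\n"
    "sub-0002\t54\tn/a\n"
)


def read_phenotype(handle):
    """Read a tab-separated phenotype table into a list of dicts.

    Cells equal to 'n/a' (the BIDS marker for missing values) become None.
    """
    rows = []
    for row in csv.DictReader(handle, delimiter="\t"):
        rows.append({key: (None if value == "n/a" else value) for key, value in row.items()})
    return rows


records = read_phenotype(io.StringIO(SAMPLE_TSV))
# Keep only the subjects that have a recorded (hypothetical) NIHSS value.
scored = [int(r["nihss"]) for r in records if r["nihss"] is not None]
print(len(records), scored)
```

For a real file, `open(path, newline="")` can be passed in place of the in-memory `io.StringIO` handle.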
Database and associated software tools providing access to clinical and research data on stroke, including deidentified patient data. Data types include imaging (e.g. CT, MRI, PET), clinical demographic data, genetic data, and simulation perfusion data for verifying deconvolution algorithms used in bolus-tracking perfusion-weighted imaging (PWI). Also available are programs for performing deconvolution of bolus-tracking PWI and DTI tractography, and an automated program for etiologic classification of ischemic stroke, the Causative Classification System for Ischemic Stroke (CCS).
THIS RESOURCE IS NO LONGER IN SERVICE. Documented on January 28, 2025. The Stroke Patient Recovery Research Database (SPReD) initiative creates the infrastructure needed for the collection of a wide range of data related to stroke risk factors and to stroke recovery. It also promotes the analysis and management of large brain and vessel images. A major goal is to create a comprehensive electronic database, the Stroke Patient Recovery Research Database (SPReD), and populate it with patient data, including demographic, biomarker, genetic, proteomic, and imaging data. SPReD will enable us to combine descriptions of our stroke patients from multiple, geographically distributed projects. We will do this in a uniform fashion in order to enhance our ability to document rates of recovery; to study the effects of vascular risk factors and inflammatory biomarkers; and to use these data to improve patients' physical and cognitive recovery through innovative intervention programs. This comprehensive database will provide an integrated repository of data with which our researchers will investigate and test original ideas, ultimately leading to knowledge that can be applied clinically to benefit stroke survivors.
The International Stroke Trial (IST) dataset includes data on 19,435 patients and 112 variables. For each randomized patient, data were extracted on the variables assessed at randomization, at the early outcome point, and at 6 months. This dataset provides a source of primary data and is available for public use for the conduct of secondary analyses and in the planning of future trials, particularly in older patients and in resource-poor settings, given the age distribution of the dataset.
Open Data Commons Attribution License (ODC-By) v1.0: https://www.opendatacommons.org/licenses/by/1.0/
License information was derived automatically
The International Stroke Trial (IST) was one of the biggest randomised trials in acute stroke. Methods: Available data on variables assessed at randomisation, at the early outcome point (14 days after randomisation or prior discharge) and at 6 months were extracted and made publicly available. Results and Conclusions: The IST provides an excellent, easy-to-use source of primary data for the sample size calculations and preliminary analyses necessary for planning a good-quality trial.
Associated publications:
* The erratum paper explains the difference between this version, i.e. version 2, and the previous version: Sandercock, P.A.G., Niewada, M., Czlonkowska, A. et al. Erratum to: The International Stroke Trial database. Trials 13, 24 (2012). https://doi.org/10.1186/1745-6215-13-24 .
* Main results paper for the trial: International Stroke Trial Collaborative Group 1997, 'The International Stroke Trial (IST): A randomised trial of aspirin, subcutaneous heparin, both, or neither among 19 435 patients with acute ischaemic stroke', The Lancet, vol. 349, no. 9065, pp. 1569-1581. https://doi.org/10.1016/S0140-6736(97)04011-7 .
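To make the mention of sample size calculations concrete, here is a minimal sketch of the standard normal-approximation formula for comparing two proportions. It is not taken from the IST publications, and the event rates in the example call are hypothetical placeholders rather than IST results; in practice the control-arm rate would be estimated from the dataset.

```python
import math
from statistics import NormalDist


def n_per_arm(p_control, p_treatment, alpha=0.05, power=0.80):
    """Patients needed per arm to detect p_control vs p_treatment.

    Uses the unpooled normal approximation for a two-sided test:
    n = (z_{1-alpha/2} + z_{power})^2 * [p1(1-p1) + p2(1-p2)] / (p1 - p2)^2
    """
    if p_control == p_treatment:
        raise ValueError("the two proportions must differ")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    variance = p_control * (1 - p_control) + p_treatment * (1 - p_treatment)
    return math.ceil((z_alpha + z_beta) ** 2 * variance / (p_control - p_treatment) ** 2)


# Hypothetical outcome rates (NOT IST figures): 60% vs 55% poor outcome.
small_effect = n_per_arm(0.60, 0.55)
large_effect = n_per_arm(0.60, 0.50)
print(small_effect, large_effect)
```

A smaller expected treatment effect drives the required sample size up sharply, which is why a realistic control-arm rate from primary data matters when planning a trial.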
2019 to 2021, 3-year average. Rates are age-standardized. County rates are spatially smoothed. The data can be viewed by sex and race/ethnicity. Data source: National Vital Statistics System. Additional data, maps, and methodology can be viewed on the Interactive Atlas of Heart Disease and Stroke https://www.cdc.gov/heart-disease-stroke-atlas/about/index.html
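For readers unfamiliar with age standardization, the sketch below shows the direct method: each age group's rate is weighted by that group's share of a standard population. The age groups, counts, and weights are invented for illustration; the description above does not state which standard population is used, and spatial smoothing is a separate step not shown here.

```python
def age_standardized_rate(deaths, population, std_weights, per=100_000):
    """Directly age-standardized rate per `per` population.

    deaths, population: counts per age group in the area of interest.
    std_weights: standard-population weights per age group (normalized here).
    """
    if not (len(deaths) == len(population) == len(std_weights)):
        raise ValueError("all inputs must have one entry per age group")
    total_weight = sum(std_weights)
    weighted = sum(
        (w / total_weight) * (d / n)
        for d, n, w in zip(deaths, population, std_weights)
    )
    return per * weighted


# Hypothetical county with three age groups (young, middle, old).
deaths = [2, 10, 50]
population = [50_000, 30_000, 10_000]
std_weights = [0.5, 0.3, 0.2]  # illustrative standard-population shares

standardized = age_standardized_rate(deaths, population, std_weights)
crude = 100_000 * sum(deaths) / sum(population)
print(round(standardized, 1), round(crude, 1))
```

In this made-up county the crude rate is lower than the standardized rate because its population is younger than the illustrative standard.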
The data were extracted from ED records, as well as laboratory analyses and health professional notes taken during the hospital stay of each patient starting from admission to the hospital until discharge. Additionally, clinic records were accessed to determine if patients came for a follow-up visit. Initially, we selected the charts of all patients discharged with a diagnosis of stroke (ischemic and hemorrhagic) or transient ischemic attack (TIA) from the Crouse Hospital ED from September 2015 to January 2021. This provided a starting pool of 4634 records to review. We then focused on a set of 1863 more complete records that included incidents from January 2019 to January 2021. This group was further reduced to those which were associated with home addresses in the six counties surrounding Crouse Hospital (Onondaga, Oneida, Oswego, Madison, Cayuga and Cortland). The final group of incidents used for our study numbered 1731.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This data set consists of electroencephalography (EEG) data from 50 participants (Subject1 – Subject50) with acute ischemic stroke, aged between 30 and 77 years. The participants included 39 males and 11 females. The time after stroke ranged from 1 day to 30 days. 22 participants had right hemisphere hemiplegia and 28 participants had left hemisphere hemiplegia. All participants were originally right-handed. Each participant sat in front of a computer screen with an arm resting on a pillow on their lap or on a table and carried out the instructions given on the screen. At the start of each trial, a picture with a text description, which cycled between the left and right hand, was presented for 2 s. We asked the participants to focus their mind on the instructed hand motor imagery; at the same time, a video of ipsilateral hand movement was displayed on the computer screen for 4 s. This was followed by a 2 s break.
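The trial structure described above (2 s cue, 4 s motor imagery, 2 s break) can be turned into sample indices for epoching. This is only a sketch: the sampling rate of 500 Hz and the assumption that trials run back to back from sample 0 are hypothetical and should be checked against the dataset's own documentation.

```python
CUE_S = 2       # instruction picture shown
IMAGERY_S = 4   # motor imagery with hand-movement video
BREAK_S = 2     # rest
TRIAL_S = CUE_S + IMAGERY_S + BREAK_S  # 8 s per trial


def imagery_window(trial_index, sampling_rate_hz):
    """Return (start, stop) sample indices of the imagery period of a trial.

    Assumes trials are contiguous and the first trial starts at sample 0.
    The stop index is exclusive, as in Python slicing.
    """
    start_s = trial_index * TRIAL_S + CUE_S
    stop_s = start_s + IMAGERY_S
    return start_s * sampling_rate_hz, stop_s * sampling_rate_hz


# Hypothetical sampling rate of 500 Hz.
first = imagery_window(0, 500)
second = imagery_window(1, 500)
print(first, second)
```

With a signal array of shape (channels, samples), the imagery epoch of a trial would then be `signal[:, start:stop]`.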
https://github.com/MIT-LCP/license-and-dua/tree/master/drafts
The National Institutes of Health Stroke Scale (NIHSS) is a 15-item neurologic examination stroke scale. It quantifies the physical manifestations of neurological deficits and provides crucial support for clinical decision making and early-stage emergency triage. NIHSS scores stored in the free text of Electronic Health Records (EHRs) often lack standardization, and the way they are expressed depends heavily on the habits of individual clinicians. This can limit the reusability of the data.
There is benefit in developing robust algorithms to extract NIHSS scores from the free-text of EHRs. We developed a dataset for NIHSS score identification, a task defined as the extraction of scale items and corresponding scores from discharge summaries. Discharge summaries of stroke cases in the Medical Information Mart for Intensive Care III (MIMIC-III) database were used to create an annotated NIHSS corpus.
Each discharge summary was manually annotated for the presence of NIHSS scores by two annotators with backgrounds in medical informatics. Annotations include all scale items (e.g. “4. Facial Palsy”), the corresponding score “measurement”, and their relation “has value”. The dataset is intended to support academic and industrial research in the field of medical natural language processing (NLP).
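To illustrate the extraction task (scale item, score, and their "has value" relation), here is a deliberately naive rule-based sketch. It is not the annotation procedure or any model from the dataset's authors; the regular expression and the sample note are assumptions made for illustration, and real discharge summaries vary far more than this pattern can handle.

```python
import re

# Hypothetical pattern: an item number (optionally with a letter, e.g. "1a"),
# an item name, a separator, and a single-digit score.
ITEM_PATTERN = re.compile(
    r"(?P<item>\d{1,2}[a-c]?\.\s*[A-Za-z][A-Za-z /]*?)\s*[:=-]\s*(?P<score>\d)\b"
)


def extract_nihss(text):
    """Return (item, score) pairs found in free text, in order of appearance."""
    return [
        (match.group("item").strip(), int(match.group("score")))
        for match in ITEM_PATTERN.finditer(text)
    ]


# Invented example note, not taken from MIMIC-III.
note = "NIHSS: 1a. Level of Consciousness: 0, 4. Facial Palsy: 2, 10. Dysarthria - 1"
pairs = extract_nihss(note)
total = sum(score for _, score in pairs)
print(pairs, total)
```

Each returned pair corresponds to one "has value" relation between a scale item and its measurement; a learned model trained on the annotated corpus would be expected to cope with phrasing this pattern misses.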
Background and Purpose: Stroke, increasingly referred to as a "brain attack", is one of the leading causes of death and the leading cause of adult disability in the United States. It has recently been estimated that there were three quarters of a million strokes in the United States in 1995. The aim of this study was to replicate the 1995 estimate and examine whether there was an increase from 1995 to 1996 by using a large administrative claims database representative of all 1996 US inpatient discharges. Methods: We used the Nationwide Inpatient Sample of the Healthcare Cost and Utilization Project, release 5, which contains ≈20 percent of all 1996 US inpatient discharges. We identified stroke patients by using the International Classification of Diseases, 9th Revision, Clinical Modification (ICD-9-CM) codes from 430 to 438, and we compared the 1996 database with that of 1995. Results: There were 712,000 occurrences of stroke with hospitalization (95% CI 688,000 to 737,000) and an estimated 71,000 occurrences of stroke without hospitalization. This totaled 783,000 occurrences of stroke in 1996, compared to 750,000 in 1995. The overall rate for occurrence of total stroke (first-ever and recurrent) was 269 per 100,000 population (age- and sex-adjusted to 1996 US population). Conclusions: We estimate that there were 783,000 first-ever or recurrent strokes in the United States during 1996, compared to the figure of 750,000 in 1995. This study replicates and confirms the previous annual estimates of approximately three quarters of a million total strokes. This slight increase is likely due to the aging of the population and the population gain in the US from 1995 to 1996.
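The case definition and the headline arithmetic in this abstract can be sketched in a few lines. The helper below is an assumption about how one might select ICD-9-CM categories 430 to 438 from code strings, not the study's actual procedure; the counts are the ones reported above.

```python
def is_stroke_code(icd9_code):
    """True if the three-character ICD-9-CM category lies in 430-438.

    Works for dotted ("434.91") and undotted ("43491") codes; V and E
    codes are rejected because their category is not purely numeric.
    """
    category = icd9_code.strip()[:3]
    return category.isdigit() and 430 <= int(category) <= 438


# Figures reported in the abstract.
hospitalized = 712_000
non_hospitalized = 71_000
total_1996 = hospitalized + non_hospitalized
total_1995 = 750_000
increase_pct = 100 * (total_1996 - total_1995) / total_1995

print(total_1996, round(increase_pct, 1))
print([code for code in ["434.91", "410.0", "V12.54", "43100"] if is_stroke_code(code)])
```

The relative change between the two annual estimates works out to a few percent, consistent with the abstract's description of a slight increase.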
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
This dataset was created by josué Parra Rosales
Released under Apache 2.0
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Introduction: The Lithuanian Stroke Database (StrokeLT) aims to automate data collection and key performance indicator (KPI) monitoring across all stroke-ready hospitals, addressing the limitations of manual processes and facilitating evidence-based improvements in stroke care nationwide. This publication outlines the selection process and target values of the KPIs designed to standardise and enhance stroke care quality in Lithuania. Study population: The database will include all adult patients diagnosed with stroke or transient ischemic attack (TIA), admitted to Lithuanian stroke-ready hospitals, encompassing approximately 9,582 annual stroke and 1,899 TIA admissions based on 2023 data. The database will ensure comprehensive national coverage by integrating data from stroke centres via a centralised electronic health record system. Main variables: A total of 53 KPIs were selected through a multi-stage Delphi process involving national experts and guided by international standards. These KPIs include 44 process metrics, such as timeliness metrics, early rehabilitation, and availability of secondary prevention, as well as 8 outcome metrics, including functional recovery, completion of a patient feedback survey and mortality. This framework enables comprehensive monitoring across all stages of patient care, as well as incorporating valuable patient feedback. Conclusion: The Lithuanian Stroke Database establishes a standardised automated framework for monitoring stroke care using 53 KPIs, selected through a multi-stage Delphi process involving all relevant stakeholders.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset provides a collection of patient health records, including key medical conditions, lifestyle habits, and biometric indicators related to stroke occurrence. It includes details on age, gender, BMI, average glucose levels, hypertension, heart disease, diabetes status, smoking status, and socioeconomic status (SES).
This dataset can be used for predictive modeling, healthcare analytics, and medical research to identify patterns in stroke risk factors. It is particularly useful for machine learning models focused on stroke prediction and cardiovascular health analysis.
https://www.icpsr.umich.edu/web/ICPSR/studies/37122/terms
To access this data collection, please click on the Restricted Data button above. You will need to download and complete the data use agreement and then email it to icpsr-addep@umich.edu. The instructions are in the form. This study was conducted at the Medical University of South Carolina over the span of one year to delineate the cause/effect relationship between neural output and the biomechanical functions being executed in walking in post-stroke patients. Kinematic, kinetic, and electromyography (EMG) data were collected from 27 post-stroke subjects and from 17 healthy control subjects. Each subject walked on a treadmill at their self-selected walking speed in addition to a randomized block design of four steady-state mobility capability tasks: walking at maximum speed, and walking at self-selected speed with maximum cadence, maximum step length, and maximum step height.
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
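📌 Overview This dataset has been carefully curated to support research in stroke risk prediction, helping develop models that estimate whether an individual is at risk of stroke (binary) and the individual's stroke risk as a percentage.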
It is designed for machine learning and deep learning applications in medical AI and predictive healthcare. The dataset is balanced, ensuring that 50% of the records belong to individuals at risk and 50% belong to those not at risk.
📜 Dataset Generation Process The dataset was constructed based on medical literature, expert consultations, and statistical modeling. The feature distributions and relationships were inspired by real-world clinical observations, ensuring medical validity.
📖 Medical References & Sources The dataset structure is based on established risk factors documented in leading medical textbooks, research papers, and guidelines from health organizations. Key references include:
- American Stroke Association (ASA): Guidelines on stroke risk factors and early warning symptoms.
- Mayo Clinic & Cleveland Clinic: Medical literature on cardiovascular diseases and stroke risk factors.
- "**Harrison’s Principles of Internal Medicine**" (20th Edition): Provides in-depth insights into stroke etiology and risk factors.
- "**Stroke Prevention, Treatment, and Rehabilitation**" (2021, Oxford University Press): A comprehensive guide on stroke mechanisms and preventive strategies.
- "**The Stroke Book**" (Cambridge Medicine, 2nd Edition): Clinical insights into the symptoms and early predictors of stroke.
- World Health Organization (WHO) Reports on Stroke Risk and Prevention.
🔬 Features of the Dataset Each record represents an individual’s medical condition, symptoms, and risk assessment. The dataset includes the following features:
1️⃣ Symptoms (Primary Predictors) The presence of these symptoms significantly influences stroke risk. These features are binary (1 = symptom present, 0 = absent).
2️⃣ Target Variables (Predicted Outcomes)
- At Risk (Binary) → 1 if the person is at risk of stroke, 0 otherwise.
- Stroke Risk (%) → The estimated probability of stroke occurrence, ranging from 0 to 100.
3️⃣ Demographic Feature - Age → A key risk factor, as stroke prevalence increases with age.
⚡**Why This Dataset is Accurate and Useful?**
Diverse Risk Factors Considered:
Cardiovascular symptoms like chest pain, irregular heartbeat, high blood pressure.
Neurological symptoms such as dizziness, fatigue, and anxiety.
Sleep-related issues like snoring and sleep apnea, which are linked to increased stroke risk.
Scalability and ML Suitability:
Ideal for both classification and regression tasks.
Can be used with deep learning (TensorFlow, PyTorch), ML models (XGBoost, Random Forest, SVM), and explainable AI techniques.
📂 Dataset Usage & Applications This dataset can be used in various healthcare AI applications, including:
✅ Predictive Analytics – Early stroke detection and prevention.
✅ Healthcare Chatbots – Real-time risk assessment and patient guidance.
✅ Medical Research – Identifying key stroke indicators from patient symptoms.
✅ Explainable AI (XAI) in Medicine – Understanding how AI makes stroke predictions.
This dataset contains risk-adjusted 30-day mortality and 30-day readmission rates, quality ratings, and number of deaths / readmissions and cases for ischemic stroke treated in California hospitals. This dataset does not include ischemic stroke treated in outpatient settings.
2015 to 2017, 3-year average. Rates are age-standardized. County rates are spatially smoothed. The data can be viewed by sex and race/ethnicity. Data source: National Vital Statistics System. Additional data, maps, and methodology can be viewed on the Interactive Atlas of Heart Disease and Stroke. http://www.cdc.gov/dhdsp/maps/atlas
https://opensource.org/licenses/NCSA
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Introduction: The objective of this study was to evaluate the complementarity of the French national health database (Système national des données de Santé, SNDS) and the Dijon Stroke Registry for the epidemiology of stroke patients with anticoagulated atrial fibrillation (AF). Methods: The SNDS collects healthcare prescriptions and procedures reimbursed by the French national health insurance for almost all of the 66 million individuals living in France. A previously published algorithm was used to identify AF newly treated with oral anticoagulants. The Dijon Stroke Registry is a population-based study that has covered the residents of the city of Dijon since 1985 and records all stroke cases in the area. We compared the proportions of stroke patients with anticoagulated AF in the city of Dijon identified in SNDS databases to those registered in the Dijon Stroke Registry. Results: For the period 2013–2017 in the city of Dijon, 1,146 strokes were identified in the SNDS and 1,188 in the registry. The proportion of strokes with anticoagulated AF was 13.4% in the SNDS and 20.3% in the Dijon Stroke Registry. Very similar characteristics were found between patients identified through the 2 databases. The overall prevalence of AF in stroke patients could be estimated only in the Dijon Stroke Registry and was 30.4% for the study period. Discussion/Conclusion: Although administrative health databases can be a useful tool to study the epidemiology of anticoagulated AF in stroke patients, population-based stroke registries such as the Dijon Stroke Registry remain essential to fully study the epidemiology of strokes with anticoagulated AF.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Background: Conventional stroke registries contain alphanumeric text-based data on the clinical status of stroke patients, but this format captures imaging data in a very limited form. There is a need for a new type of stroke registry to capture both text- and image-based data. Methods and Results: We designed a next-generation stroke registry containing quantitative magnetic resonance imaging (MRI) data, ‘DUIH_SRegI’, developed a supporting software package, ‘Image_QNA’, and performed experiments to assess the feasibility and utility of the system. Image_QNA enabled the mapping of stroke-related lesions on MR onto a standard brain template and the storage of this extracted imaging data in a visual database. Interuser and intrauser variability of the lesion mapping procedure was low. We compared the results from the semiautomatic lesion registration using Image_QNA with automatic lesion registration using SPM5 (Statistical Parametric Mapping version 5), a well-regarded standard neuroscience software package, in terms of lesion location, size and shape, and found Image_QNA to be superior. We assessed the clinical usefulness of an image-based registry by studying 47 consecutive patients with first-ever lacunar infarcts in the corona radiata. We used the enriched dataset, comprising both image-based and alphanumeric databases, to show that diffusion MR lesions overlapped in a more posterolateral brain location for patients with high NIH Stroke Scale scores (≥4) than for patients with low scores (≤3). In April 2009, we launched the first prospective image-based acute (≤1 week) stroke registry at our institution. The registered data include high signal intensity ischemic lesions on diffusion, T2-weighted, or fluid attenuation inversion recovery MRIs, and low signal intensity hemorrhagic lesions on gradient-echo MRIs.
An interim analysis at 6 months showed that the time requirement for the lesion registration (183 consecutive patients, 3,226 MR slices with visible stroke-related lesions) was acceptable at about 1 h of labor per patient by a trained assistant with physician oversight. Conclusions: We have developed a novel image-based stroke registry, with database functions that allow the formulation and testing of intuitive, image-based hypotheses in a manner not easily achievable with conventional alphanumeric stroke registries.