100+ datasets found

MIMIC-III Clinical Database
physionet.org
Updated Sep 4, 2016
Cite
Alistair Johnson; Tom Pollard; Roger Mark (2016). MIMIC-III Clinical Database [Dataset]. http://doi.org/10.13026/C2XW26
Explore at:
Unique identifier
https://doi.org/10.13026/C2XW26
Dataset updated
Sep 4, 2016
Authors
Alistair Johnson; Tom Pollard; Roger Mark
License
https://github.com/MIT-LCP/license-and-dua/tree/master/drafts
Description
MIMIC-III is a large, freely-available database comprising deidentified health-related data associated with over forty thousand patients who stayed in critical care units of the Beth Israel Deaconess Medical Center between 2001 and 2012. The database includes information such as demographics, vital sign measurements made at the bedside (~1 data point per hour), laboratory test results, procedures, medications, caregiver notes, imaging reports, and mortality (including post-hospital discharge). MIMIC supports a diverse range of analytic studies spanning epidemiology, clinical decision-rule improvement, and electronic tool development. It is notable for three factors: it is freely available to researchers worldwide; it encompasses a diverse and very large population of ICU patients; and it contains highly granular data, including vital signs, laboratory results, and medications.
MIMIC-IV
physionet.org
Updated Oct 11, 2024
Cite
Alistair Johnson; Lucas Bulgarelli; Tom Pollard; Brian Gow; Benjamin Moody; Steven Horng; Leo Anthony Celi; Roger Mark (2024). MIMIC-IV [Dataset]. http://doi.org/10.13026/kpb9-mt58
Explore at:
Unique identifier
https://doi.org/10.13026/kpb9-mt58
Dataset updated
Oct 11, 2024
Authors
Alistair Johnson; Lucas Bulgarelli; Tom Pollard; Brian Gow; Benjamin Moody; Steven Horng; Leo Anthony Celi; Roger Mark
License
https://github.com/MIT-LCP/license-and-dua/tree/master/drafts
Description
Retrospectively collected medical data has the opportunity to improve patient care through knowledge discovery and algorithm development. Broad reuse of medical data is desirable for the greatest public good, but data sharing must be done in a manner which protects patient privacy. Here we present Medical Information Mart for Intensive Care (MIMIC)-IV, a large deidentified dataset of patients admitted to the emergency department or an intensive care unit at the Beth Israel Deaconess Medical Center in Boston, MA. MIMIC-IV contains data for over 65,000 patients admitted to an ICU and over 200,000 patients admitted to the emergency department. MIMIC-IV incorporates contemporary data and adopts a modular approach to data organization, highlighting data provenance and facilitating both individual and combined use of disparate data sources. MIMIC-IV is intended to carry on the success of MIMIC-III and support a broad set of applications within healthcare.
MIMIC-IV v2.2 Dataset
paperswithcode.com
Updated Feb 24, 2025
Cite
(2025). MIMIC-IV v2.2 Dataset [Dataset]. https://paperswithcode.com/dataset/mimic-iv-v2-2
Explore at:
Dataset updated
Feb 24, 2025
Description
Retrospectively collected medical data has the opportunity to improve patient care through knowledge discovery and algorithm development. Broad reuse of medical data is desirable for the greatest public good, but data sharing must be done in a manner which protects patient privacy. The Medical Information Mart for Intensive Care (MIMIC)-III database provided critical care data for over 40,000 patients admitted to intensive care units at the Beth Israel Deaconess Medical Center (BIDMC). Importantly, MIMIC-III was deidentified, and patient identifiers were removed according to the Health Insurance Portability and Accountability Act (HIPAA) Safe Harbor provision. MIMIC-III has been integral in driving large amounts of research in clinical informatics, epidemiology, and machine learning. Here we present MIMIC-IV, an update to MIMIC-III, which incorporates contemporary data and improves on numerous aspects of MIMIC-III. MIMIC-IV adopts a modular approach to data organization, highlighting data provenance and facilitating both individual and combined use of disparate data sources. MIMIC-IV is intended to carry on the success of MIMIC-III and support a broad set of applications within healthcare.
mimic-iii-clinical-database-demo-1.4
kaggle.com
Updated Apr 1, 2025
Cite
Montassar bellah (2025). mimic-iii-clinical-database-demo-1.4 [Dataset]. https://www.kaggle.com/datasets/montassarba/mimic-iii-clinical-database-demo-1-4
Explore at:
Dataset updated
Apr 1, 2025
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Montassar bellah
License
MIT License (https://opensource.org/licenses/MIT)
License information was derived automatically
Description
Abstract MIMIC-III is a large, freely-available database comprising deidentified health-related data associated with over 40,000 patients who stayed in critical care units of the Beth Israel Deaconess Medical Center between 2001 and 2012 [1]. The MIMIC-III Clinical Database is available on PhysioNet (doi: 10.13026/C2XW26). Though deidentified, MIMIC-III contains detailed information regarding the care of real patients, and as such requires credentialing before access. To allow researchers to ascertain whether the database is suitable for their work, we have manually curated a demo subset, which contains information for 100 patients also present in the MIMIC-III Clinical Database. Notably, the demo dataset does not include free-text notes.

Background In recent years there has been a concerted move towards the adoption of digital health record systems in hospitals. Despite this advance, interoperability of digital systems remains an open issue, leading to challenges in data integration. As a result, the potential that hospital data offers in terms of understanding and improving care is yet to be fully realized.

MIMIC-III integrates deidentified, comprehensive clinical data of patients admitted to the Beth Israel Deaconess Medical Center in Boston, Massachusetts, and makes it widely accessible to researchers internationally under a data use agreement. The open nature of the data allows clinical studies to be reproduced and improved in ways that would not otherwise be possible.

The MIMIC-III database was populated with data that had been acquired during routine hospital care, so there was no associated burden on caregivers and no interference with their workflow. For more information on the collection of the data, see the MIMIC-III Clinical Database page.

Methods The demo dataset contains all intensive care unit (ICU) stays for 100 patients. These patients were selected randomly from the subset of patients in the dataset who eventually die. Consequently, all patients will have a date of death (DOD). However, patients do not necessarily die during an individual hospital admission or ICU stay.

This project was approved by the Institutional Review Boards of Beth Israel Deaconess Medical Center (Boston, MA) and the Massachusetts Institute of Technology (Cambridge, MA). Requirement for individual patient consent was waived because the project did not impact clinical care and all protected health information was deidentified.

Data Description MIMIC-III is a relational database consisting of 26 tables. For a detailed description of the database structure, see the MIMIC-III Clinical Database page. The demo shares an identical schema, except all rows in the NOTEEVENTS table have been removed.

The data files are distributed in comma separated value (CSV) format following the RFC 4180 standard. Notably, string fields which contain commas, newlines, and/or double quotes are encapsulated by double quotes ("). Actual double quotes in the data are escaped using an additional double quote. For example, the string she said "the patient was notified at 6pm" would be stored in the CSV as "she said ""the patient was notified at 6pm""". More detail is provided on the RFC 4180 description page: https://tools.ietf.org/html/rfc4180
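The quoting rule described above can be reproduced with Python's standard `csv` module. This is a minimal sketch rather than the tooling used to produce the MIMIC files; the variable names are illustrative only.

```python
import csv
import io

# The example string from the description, containing literal double quotes.
text = 'she said "the patient was notified at 6pm"'

# Write a single-field row. QUOTE_MINIMAL only wraps fields that need it
# (those containing commas, newlines, or double quotes), and embedded
# double quotes are escaped by doubling them.
buffer = io.StringIO()
writer = csv.writer(buffer, quoting=csv.QUOTE_MINIMAL, lineterminator="\r\n")
writer.writerow([text])
encoded = buffer.getvalue()

# Reading the row back restores the original string.
decoded = next(csv.reader(io.StringIO(encoded)))[0]

print(encoded.rstrip("\r\n"))
print(decoded == text)
```

Reading the files with a CSV parser rather than splitting on commas avoids corrupting fields that contain quoted commas or newlines.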

Usage Notes The MIMIC-III demo provides researchers with an opportunity to review the structure and content of MIMIC-III before deciding whether or not to carry out an analysis on the full dataset.

CSV files can be opened natively using any text editor or spreadsheet program. However, some tables are large, and it may be preferable to navigate the data stored in a relational database. One alternative is to create an SQLite database using the CSV files. SQLite is a lightweight database format which stores all constituent tables in a single file, and SQLite databases interoperate well with a number of software tools.
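One possible way to build such an SQLite file is sketched below using only the Python standard library. The function names, the choice to store every column as TEXT, and the use of lower-cased file names as table names are assumptions made for illustration; they are not part of the MIMIC distribution.

```python
import csv
import sqlite3
from pathlib import Path


def load_csv_into_sqlite(conn, csv_path, table_name):
    """Create table_name from one CSV file, storing every column as TEXT."""
    with open(csv_path, newline="", encoding="utf-8") as handle:
        reader = csv.reader(handle)
        header = next(reader)  # first row holds the column names
        columns = ", ".join(f'"{name}" TEXT' for name in header)
        placeholders = ", ".join("?" for _ in header)
        conn.execute(f'CREATE TABLE IF NOT EXISTS "{table_name}" ({columns})')
        # The remaining rows are streamed straight from the CSV reader.
        conn.executemany(
            f'INSERT INTO "{table_name}" VALUES ({placeholders})', reader
        )
    conn.commit()


def build_database(csv_dir, db_path):
    """Load every *.csv file in csv_dir into the SQLite database at db_path."""
    conn = sqlite3.connect(db_path)
    for csv_path in sorted(Path(csv_dir).glob("*.csv")):
        load_csv_into_sqlite(conn, csv_path, csv_path.stem.lower())
    return conn
```

A call such as `build_database("path/to/demo/csvs", "mimic_demo.db")` (both paths hypothetical) would yield a single file that SQLite-compatible tools can open. Typed columns and indexes are left out to keep the sketch short.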

DB Browser for SQLite is a high quality, visual, open source tool to create, design, and edit database files compatible with SQLite. We have found this tool to be useful for navigating SQLite files. Information regarding installation of the software and creation of the database can be found online: https://sqlitebrowser.org/

Release Notes Release notes for the demo follow the release notes for the MIMIC-III database.

Acknowledgements This research and development was supported by grants NIH-R01-EB017205, NIH-R01-EB001659, and NIH-R01-GM104987 from the National Institutes of Health. The authors would also like to thank Philips Healthcare and staff at the Beth Israel Deaconess Medical Center, Boston, for supporting database development, and Ken Pierce for providing ongoing support for the MIMIC research community.

Conflicts of Interest The authors declare no competing financial interests.

References Johnson, A. E. W., Pollard, T. J., Shen, L., Lehman, L. H., Feng, M., Ghassemi, M., Mo...
MIMIC-IV-ED Dataset
paperswithcode.com
Updated Mar 26, 2022
Cite
(2022). MIMIC-IV-ED Dataset [Dataset]. https://paperswithcode.com/dataset/mimic-iv-ed
Explore at:
Dataset updated
Mar 26, 2022
Description
MIMIC-IV-ED is a large, freely available database of emergency department (ED) admissions at the Beth Israel Deaconess Medical Center between 2011 and 2019. As of MIMIC-ED v1.0, the database contains 448,972 ED stays. Vital signs, triage information, medication reconciliation, medication administration, and discharge diagnoses are available. All data are deidentified to comply with the Health Insurance Portability and Accountability Act (HIPAA) Safe Harbor provision. MIMIC-ED is intended to support a diverse range of education initiatives and research studies.
mimic-iv-clinical-database-demo-2.2
kaggle.com
Updated Apr 1, 2025
Cite
Montassar bellah (2025). mimic-iv-clinical-database-demo-2.2 [Dataset]. https://www.kaggle.com/datasets/montassarba/mimic-iv-clinical-database-demo-2-2/data
Explore at:
Dataset updated
Apr 1, 2025
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Montassar bellah
License
MIT License (https://opensource.org/licenses/MIT)
License information was derived automatically
Description
Abstract The Medical Information Mart for Intensive Care (MIMIC)-IV database comprises deidentified electronic health records for patients admitted to the Beth Israel Deaconess Medical Center. Access to MIMIC-IV is limited to credentialed users. Here, we have provided an openly-available demo of MIMIC-IV containing a subset of 100 patients. The dataset includes similar content to MIMIC-IV, but excludes free-text clinical notes. The demo may be useful for running workshops and for assessing whether MIMIC-IV is appropriate for a study before making an access request.

Background The increasing adoption of digital electronic health records has led to the existence of large datasets that could be used to carry out important research across many areas of medicine. Research progress has been limited, however, due to limitations in the way that the datasets are curated and made available for research. The MIMIC datasets allow credentialed researchers around the world unprecedented access to real world clinical data, helping to reduce the barriers to conducting important medical research. The public availability of the data allows studies to be reproduced and collaboratively improved in ways that would not otherwise be possible.

Methods First, the set of individuals to include in the demo was chosen. Each person in MIMIC-IV is assigned a unique subject_id. As the subject_id is randomly generated, ordering by subject_id results in a random subset of individuals. We only considered individuals with an anchor_year_group value of 2011 - 2013 or 2014 - 2016 to ensure overlap with MIMIC-CXR v2.0.0. The first 100 subject_id who satisfied the anchor_year_group criteria were selected for the demo dataset.
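The selection rule described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the function name and the list-of-dicts input are hypothetical, and only the column names `subject_id` and `anchor_year_group` and the two group labels come from the description.

```python
# Group labels quoted in the description above.
ALLOWED_GROUPS = {"2011 - 2013", "2014 - 2016"}


def select_demo_subjects(patients, n=100):
    """Return the first n subject_id values, in ascending order, among
    patients whose anchor_year_group is one of the allowed groups."""
    eligible = [
        row["subject_id"]
        for row in patients
        if row["anchor_year_group"] in ALLOWED_GROUPS
    ]
    # subject_id is described as randomly generated, so taking the smallest
    # values acts as a random sample of the eligible individuals.
    return sorted(eligible)[:n]
```

For example, given patients with IDs 30, 10, 20, and 40 where only ID 10 falls outside the allowed groups, `select_demo_subjects(patients, n=2)` returns the two smallest eligible IDs.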

All tables from MIMIC-IV were included in the demo dataset. Tables containing patient information, such as emar or labevents, were filtered using the list of selected subject_id. Tables which do not contain patient level information were included in their entirety (e.g. d_items or d_labitems). Note that all tables which do not contain patient level information are prefixed with the characters 'd_'.
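A minimal sketch of this filtering step is shown below. It relies on the convention stated above that tables without patient-level information carry the `d_` prefix; the function name and the in-memory row format are assumptions made for illustration.

```python
def filter_table(table_name, rows, demo_subject_ids):
    """Keep dictionary tables whole; restrict other tables to demo patients.

    rows is an iterable of dicts, and demo_subject_ids is a set of the
    selected subject_id values.
    """
    if table_name.startswith("d_"):
        # Dictionary tables (e.g. d_items, d_labitems) hold no patient-level
        # rows, so they are copied in their entirety.
        return list(rows)
    return [row for row in rows if row["subject_id"] in demo_subject_ids]
```

Passing the IDs as a set keeps each membership check constant-time, which matters for large tables such as labevents.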

Deidentification was performed following the same approach as the MIMIC-IV database. Protected health information (PHI) as listed in the HIPAA Safe Harbor provision was removed. Patient identifiers were replaced using a random cipher, resulting in deidentified integer identifiers for patients, hospitalizations, and ICU stays. Stringent rules were applied to structured columns based on the data type. Dates were shifted consistently using a random integer removing seasonality, day of the week, and year information. Text fields were filtered by manually curated allow and block lists, as well as context-specific regular expressions. For example, columns containing dose values were filtered to only contain numeric values. If necessary, a free-text deidentification algorithm was applied to remove PHI from free-text. Results of this algorithm were manually reviewed and verified to remove identified PHI.
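The idea of shifting dates consistently can be illustrated with a per-patient offset, as in the sketch below. This is not the MIMIC deidentification code: the function names, the use of whole-day offsets, and the offset range are hypothetical choices made only to show that a single offset per patient preserves the intervals between that patient's events.

```python
import random
from datetime import datetime, timedelta


def make_offsets(subject_ids, seed=None, max_days=36500):
    """Draw one random whole-day offset per patient (hypothetical range)."""
    rng = random.Random(seed)
    return {
        sid: timedelta(days=rng.randint(1, max_days)) for sid in subject_ids
    }


def shift(timestamp, subject_id, offsets):
    """Apply the patient's own offset to one timestamp."""
    return timestamp + offsets[subject_id]


offsets = make_offsets([1, 2], seed=0)
admit = datetime(2015, 1, 1, 8, 0)
discharge = datetime(2015, 1, 3, 20, 0)

# The shifted dates differ from the originals, but the length of stay
# for the same patient is unchanged.
shifted_stay = shift(discharge, 1, offsets) - shift(admit, 1, offsets)
print(shifted_stay == discharge - admit)
```

Because every timestamp for a patient moves by the same amount, durations and event ordering within a patient remain usable for analysis, while absolute dates cannot be compared across patients.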

Data Description MIMIC-IV is a relational database consisting of 26 tables. For a detailed description of the database structure, see the MIMIC-IV Clinical Database page [1] or the MIMIC-IV online documentation [2]. The demo shares an identical schema and structure to the equivalent version of MIMIC-IV.

Data files are distributed in comma separated value (CSV) format following the RFC 4180 standard [3]. The dataset is also made available on Google BigQuery. Instructions for accessing the dataset on BigQuery are provided in the online MIMIC-IV documentation, under the cloud page [2].

An additional file is included: demo_subject_id.csv. This is a list of the subject_id used to filter MIMIC-IV to the demo subset.

Usage Notes The MIMIC-IV demo provides researchers with the opportunity to better understand MIMIC-IV data.

CSV files can be opened natively using any text editor or spreadsheet program. However, as some tables are large, it may be preferable to navigate the data via a relational database. We suggest either working with the data in Google BigQuery (see the "Files" section for access details) or creating an SQLite database using the CSV files. SQLite is a lightweight database format which stores all constituent tables in a single file, and SQLite databases interoperate well with a number of software tools.

Code is made available for use with MIMIC-IV on the MIMIC-IV code repository [4]. Code provided includes derivation of clinical concepts, tutorials, and reproducible analyses.

Release Notes Release notes for the demo follow the release notes for the MIMIC-IV database.

Ethics This project was approved by the Institutional Review Boards of Beth Israel Deaconess Medical Center (Boston, MA) and the Massachusetts Institute of Technology (Cambridge, MA). Requirement for individual patient consent was waived because the pr...
MIMIC dataset for Anomaly detection
ieee-dataport.org
Updated Jan 31, 2021
Cite
Prarthi Jain (2021). MIMIC dataset for Anomaly detection [Dataset]. https://ieee-dataport.org/documents/mimic-dataset-anomaly-detection
Explore at:
Dataset updated
Jan 31, 2021
Authors
Prarthi Jain
License
Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
License information was derived automatically
Description
The dataset is part of the MIMIC database and specifically uses the data corresponding to two patients with IDs 221 and 230.
MIMIC-II Clinical Database
physionet.org
Updated Apr 24, 2011
Cite
Mohammed Saeed; Mauricio Villarroel; Andrew Reisner; Gari Clifford; Li-wei Lehman; George Moody; Thomas Heldt; Tin Kyaw; Benjamin Moody; Roger Mark (2011). MIMIC-II Clinical Database [Dataset]. http://doi.org/10.13026/fxn0-mk84
Explore at:
Unique identifier
https://doi.org/10.13026/fxn0-mk84
Dataset updated
Apr 24, 2011
Authors
Mohammed Saeed; Mauricio Villarroel; Andrew Reisner; Gari Clifford; Li-wei Lehman; George Moody; Thomas Heldt; Tin Kyaw; Benjamin Moody; Roger Mark
License
https://github.com/MIT-LCP/license-and-dua/tree/master/drafts
Description
MIMIC-II documents a diverse and large population of intensive care unit patient stays and contains comprehensive and detailed clinical data, including physiological waveforms and minute-by-minute trends for a subset of records. It establishes a unique public-access resource for critical care research, supporting a diverse range of analytic studies spanning epidemiology, clinical decision-rule development, and electronic tool development. The MIMIC-II Clinical Database, although de-identified, still contains detailed information regarding the clinical care of patients, and must be treated with appropriate care and respect.
MIMIC PERform Datasets
data.niaid.nih.gov
explore.openaire.eu
Updated Aug 8, 2022
Cite
Peter H Charlton (2022). MIMIC PERform Datasets [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_6807402
Explore at:
Dataset updated
Aug 8, 2022
Dataset authored and provided by
Peter H Charlton
License
Open Database License (ODbL) v1.0 (https://www.opendatacommons.org/licenses/odbl/1.0/)
License information was derived automatically
Description
Overview

The MIMIC PERform datasets contain physiological signals recorded from critically-ill patients during routine clinical care. Specifically, the datasets contain the following signals:

electrocardiogram (ECG)

photoplethysmogram (PPG)

impedance pneumography (imp), also known as respiratory (resp)

The datasets were extracted from the MIMIC III Waveform Database. Further details of the datasets are provided in the documentation accompanying the ppg-beats project, which is available at: https://ppg-beats.readthedocs.io/en/latest/ .

Datasets

The following datasets are available:

MIMIC PERform AF Dataset: Recordings from 35 critically-ill adults during routine clinical care, categorised as either AF (atrial fibrillation, 19 subjects) or non-AF (16 subjects).

Matlab format (AF subjects, non-AF subjects)

WFDB format (AF subjects, non-AF subjects)

CSV format (AF subjects, non-AF subjects)

MIMIC PERform Training Dataset: Recordings from 200 patients during routine clinical care, who are categorised as either adults (100 subjects) or neonates (100 subjects).

Matlab format (all data, adults, neonates)

WFDB format (all data, adults, neonates)

CSV format (all data, adults, neonates)

MIMIC PERform Testing Dataset: Recordings from 200 patients during routine clinical care, who are categorised as either adults (100 subjects) or neonates (100 subjects).

Matlab format (all data, adults, neonates)

WFDB format (all data, adults, neonates)

CSV format (all data, adults, neonates)

Citation

When using these datasets, please cite the following publication:

Charlton PH et al. Detecting beats in the photoplethysmogram: benchmarking open-source algorithms. Physiological Measurement 2022. DOI: 10.1088/1361-6579/ac826d

Acknowledgments

Each dataset is accompanied by a licence which acknowledges the source(s) of the data - please see the individual licenses for these acknowledgements.
Clinical Admission Notes from MIMIC-III Dataset
paperswithcode.com
opendatalab.com
Updated Feb 7, 2021
Cite
Betty van Aken; Jens-Michalis Papaioannou; Manuel Mayrdorfer; Klemens Budde; Felix A. Gers; Alexander Löser (2021). Clinical Admission Notes from MIMIC-III Dataset [Dataset]. https://paperswithcode.com/dataset/hospital-admission-notes-from-mimic-iii
Explore at:
Dataset updated
Feb 7, 2021
Authors
Betty van Aken; Jens-Michalis Papaioannou; Manuel Mayrdorfer; Klemens Budde; Felix A. Gers; Alexander Löser
Description
This dataset is created from MIMIC-III (Medical Information Mart for Intensive Care III) and contains simulated patient admission notes. The clinical notes contain information about a patient at admission time to the ICU and are labelled for four outcome prediction tasks: Diagnoses at discharge, procedures performed, in-hospital mortality and length-of-stay.

To obtain the data one first has to gain access to the MIMIC-III dataset and then run the scripts introduced in the linked repository.
Data from: Assessing the use of HL7 FHIR for implementing the FAIR guiding...
search.dataone.org
data.niaid.nih.gov
Updated Jan 18, 2024
Cite
Philip van Damme; Matthias Löbe; Nirupama Benis; Nicolette de Keizer; Ronald Cornet (2024). Assessing the use of HL7 FHIR for implementing the FAIR guiding principles: A case study of the MIMIC-IV emergency department module [Dataset]. http://doi.org/10.5061/dryad.1jwstqk10
Explore at:
Unique identifier
https://doi.org/10.5061/dryad.1jwstqk10
Dataset updated
Jan 18, 2024
Dataset provided by
Dryad Digital Repository
Authors
Philip van Damme; Matthias Löbe; Nirupama Benis; Nicolette de Keizer; Ronald Cornet
Time period covered
Jan 1, 2023
Description
Objective To assess the use of Health Level Seven Fast Healthcare Interoperability Resources (FHIR®) for implementing the Findable, Accessible, Interoperable, and Reusable guiding principles for scientific data (FAIR). Additionally, present a list of FAIR implementation choices for supporting future FAIR implementations that use FHIR.

Material and Methods A case study was conducted on the Medical Information Mart for Intensive Care-IV Emergency Department dataset (MIMIC-ED), a deidentified clinical dataset converted into FHIR. The FAIRness of this dataset was assessed using a set of common FAIR assessment indicators.

Results The FHIR distribution of MIMIC-ED, comprising an implementation guide and demo data, was more FAIR compared to the non-FHIR distribution. The FAIRness score increased from 60 to 82 out of 95 points, a relative improvement of 37%. The most notable improvements were observed in interoperability, with a score increase from 5 to 19 out of 19 points, and reusability, wit...

The authors of the paper collected the dataset.

Microsoft Word (.docx files) or Microsoft Excel (.csv files) (Open-source alternatives: LibreOffice, OpenOffice). The data files (.csv) can also be opened using any text editor, R, etc.

FAIR Indicator Scores and Qualitative Comments

This dataset belongs as supplementary material to the paper entitled "Assessing the Use of HL7 FHIR for Implementing the FAIR Guiding Principles: A Case Study of the MIMIC-IV Emergency Department Module".
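The 37% relative improvement quoted in the Results above follows from the two overall scores (60 and 82 out of 95). The short check below is a sketch of that arithmetic; the helper name is illustrative, and only the scores come from the description.

```python
def relative_improvement(before, after):
    """Percentage change of `after` relative to `before`."""
    return (after - before) / before * 100


MAX_SCORE = 95

# Overall FAIRness scores reported for the two distributions.
overall = relative_improvement(60, 82)

# Each score as a share of the maximum attainable points.
share_before = 60 / MAX_SCORE * 100
share_after = 82 / MAX_SCORE * 100

print(f"relative improvement: {overall:.1f}%")
print(f"share of maximum: {share_before:.1f}% -> {share_after:.1f}%")
```

The unrounded figure is (82 - 60) / 60, which rounds to the 37% stated in the abstract.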

Description of the data and file structure

This dataset describes the indicator scores and qualitative comments of the FAIR data assessment of the Medical Information Mart for Intensive Care (MIMIC)-IV Emergency Department Module. Two distributions of the Emergency Department module were assessed, the PhysioNet distribution and the Fast Healthcare Interoperability Resources (FHIR) distribution. This dataset consists of two files: (1) PhysioNet.csv containing the data of the PhysioNet distribution; and (2) FHIR.csv containing the data of the FHIR distribution. Both files share the same structure and fields.

Indicator ID: an ID corresponding to the IDs listed in Table 1 of the paper, which refer to a Research Data Alliance FAIR ...
Data from: MIMIC-III and eICU-CRD: Feature Representation by FIDDLE...
physionet.org
Updated Apr 28, 2021
Cite
Shengpu Tang; Parmida Davarmanesh; Yanmeng Song; Danai Koutra; Michael Sjoding; Jenna Wiens (2021). MIMIC-III and eICU-CRD: Feature Representation by FIDDLE Preprocessing [Dataset]. http://doi.org/10.13026/2qtg-k467
Explore at:
Unique identifier
https://doi.org/10.13026/2qtg-k467
Dataset updated
Apr 28, 2021
Authors
Shengpu Tang; Parmida Davarmanesh; Yanmeng Song; Danai Koutra; Michael Sjoding; Jenna Wiens
License
https://github.com/MIT-LCP/license-and-dua/tree/master/drafts
Description
This is a preprocessed dataset derived from patient records in MIMIC-III and eICU, two large-scale electronic health record (EHR) databases. It contains features and labels for 5 prediction tasks involving 3 adverse outcomes (prediction times listed in parentheses): in-hospital mortality (48h), acute respiratory failure (4h and 12h), and shock (4h and 12h). We extracted comprehensive, high-dimensional feature representations (up to ~8,000 features) using FIDDLE (FlexIble Data-Driven pipeLinE), an open-source preprocessing pipeline for structured clinical data. These 5 prediction tasks were designed in consultation with a critical care physician for their clinical importance, and were used as part of the proof-of-concept experiments in the original paper to demonstrate FIDDLE's utility in aiding the feature engineering step of machine learning model development. The intent of this release is to share preprocessed MIMIC-III and eICU datasets used in the experiments to support and enable reproducible machine learning research on EHR data.
MRI Tissue Mimics Data
catalog.data.gov
data.nist.gov
Updated May 15, 2024
Cite
National Institute of Standards and Technology (2024). MRI Tissue Mimics Data [Dataset]. https://catalog.data.gov/dataset/mri-tissue-mimics-data-19457
Explore at:
Dataset updated
May 15, 2024
Dataset provided by
National Institute of Standards and Technology (http://www.nist.gov/)
Description
Database of MRI quantitative measurements gathered from literature and experimental studies, for tissues and synthetic materials. Additionally, a code base is provided to aid in finding MRI tissue relaxation times for a target field strength, and to provide functionality to solve for tissue mimic composition given target tissue relaxation times.
Multiparameter Intelligent Monitoring in Intensive Care II (MIMIC-II)
catalog.data.gov
healthdata.gov
Updated Jul 26, 2023
Cite
National Institutes of Health (NIH) (2023). Multiparameter Intelligent Monitoring in Intensive Care II (MIMIC-II) [Dataset]. https://catalog.data.gov/dataset/multiparameter-intelligent-monitoring-in-intensive-care-ii-mimic-ii
Explore at:
Dataset updated
Jul 26, 2023
Dataset provided by
National Institutes of Health (NIH)
Description
The objective of this Bioengineering Research Partnership is to focus the resources of a powerful interdisciplinary team from academia (MIT), industry (Philips Medical Systems) and clinical medicine (Beth Israel Deaconess Medical Center) to develop and evaluate advanced ICU patient monitoring systems that will substantially improve the efficiency, accuracy and timeliness of clinical decision making in intensive care.
Mimicking Clinical Trials with Synthetic Acute Myeloid Leukemia Patients...
data.niaid.nih.gov
Updated Mar 25, 2024
Cite
Baldus, Claudia D. (2024). Mimicking Clinical Trials with Synthetic Acute Myeloid Leukemia Patients Using Generative Artificial Intelligence [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_8334264
Explore at:
Dataset updated
Mar 25, 2024
Dataset provided by
Schetelig, Johannes
Röllig, Christoph
Hanoun, Maher
Kaufmann, Martin
Eckardt, Jan-Niklas
Wolfien, Markus
Burchert, Andreas
Sedlmayr, Martin
Platzbecker, Uwe
Hahn, Waldemar
Serve, Hubert
Schäfer-Eckart, Kerstin
Baldus, Claudia D.
Stasik, Sebastian
Müller-Tidow, Carsten
Thiede, Christian
Schliemann, Christoph
Middeke, Jan Moritz
Bornhäuser, Martin
License
Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
License information was derived automatically
Description
We used two different methodologies of generative artificial intelligence, CTAB-GAN+ and normalizing flows (NFlow), to synthesize patient data based on 1606 patients with acute myeloid leukemia that were treated within four multicenter clinical trials. The resulting data set consists of 1606 synthetic patients for each of the models.

This dataset is associated with our publication "Mimicking clinical trials with synthetic acute myeloid leukemia patients using generative artificial intelligence" by Eckardt et al., npj Digital Medicine, 2024 (https://doi.org/10.1038/s41746-024-01076-x). If you use this dataset, please cite our paper.

Data Dictionary

NAME | LABEL | TYPE | CODELIST
AGE | age | num | in years
AMLSTAT | AML status | char | de novo, sAML, tAML
ASXL1 | mutation indicator, NGS | num | 0 = 'no mutation', 1 = 'mutation'
ATRX | mutation indicator, NGS | num | 0 = 'no mutation', 1 = 'mutation'
BCOR | mutation indicator, NGS | num | 0 = 'no mutation', 1 = 'mutation'
BCORL1 | mutation indicator, NGS | num | 0 = 'no mutation', 1 = 'mutation'
BRAF | mutation indicator, NGS | num | 0 = 'no mutation', 1 = 'mutation'
CALR | mutation indicator, NGS | num | 0 = 'no mutation', 1 = 'mutation'
CBL | mutation indicator, NGS | num | 0 = 'no mutation', 1 = 'mutation'
CBLB | mutation indicator, NGS | num | 0 = 'no mutation', 1 = 'mutation'
CDKN2A | mutation indicator, NGS | num | 0 = 'no mutation', 1 = 'mutation'
CEBPA | CEBPA mutation | char | 0 = 'no mutation', 1 = 'mutation'
CGCX | complex cytogenetic karyotype | char | 0 'No', 1 'Yes'
CGNK | cytogenetic normal karyotype | char | 0 'No', 1 'Yes'
CR1 | first complete remission | char | 0 = 'not achieved', 1 = 'achieved'
CSF3R | mutation indicator, NGS | num | 0 = 'no mutation', 1 = 'mutation'
CUX1 | mutation indicator, NGS | num | 0 = 'no mutation', 1 = 'mutation'
DNMT3A | mutation indicator, NGS | num | 0 = 'no mutation', 1 = 'mutation'
EFSSTAT | status variable for EFSTM | num | 0 'censored', 1 'event'
EFSTM | event free survival time | num | in months
ETV6 | mutation indicator, NGS | num | 0 = 'no mutation', 1 = 'mutation'
EXAML | extramedullary AML | char | 0 'No', 1 'Yes'
EZH2 | mutation indicator, NGS | num | 0 = 'no mutation', 1 = 'mutation'
FBXW7 | mutation indicator, NGS | num | 0 = 'no mutation', 1 = 'mutation'
FLT3I | FLT3-ITD mutation status | char | 0 = 'no mutation', 1 = 'mutation'
FLT3T | FLT3-TKD mutation status | char | 0 = 'no mutation', 1 = 'mutation'
GATA2 | mutation indicator, NGS | num | 0 = 'no mutation', 1 = 'mutation'
GNAS | mutation indicator, NGS | num | 0 = 'no mutation', 1 = 'mutation'
HB | hemoglobin | num | in mmol/l
HRAS | mutation indicator, NGS | num | 0 = 'no mutation', 1 = 'mutation'
IDH1 | IDH1 mutation status | char | 0 = 'no mutation', 1 = 'mutation'
IDH2 | IDH2 mutation status | char | 0 = 'no mutation', 1 = 'mutation'
IKZF1 | mutation indicator, NGS | num | 0 = 'no mutation', 1 = 'mutation'
JAK2 | Jak2 Mutation | char | 0 = 'no mutation', 1 = 'mutation'
KDM6A | mutation indicator, NGS | num | 0 = 'no mutation', 1 = 'mutation'
KIT | mutation indicator, NGS | num | 0 = 'no mutation', 1 = 'mutation'
KRAS | mutation indicator, NGS | num | 0 = 'no mutation', 1 = 'mutation'
MPL | mutation indicator, NGS | num | 0 = 'no mutation', 1 = 'mutation'
MYD88 | mutation indicator, NGS | num | 0 = 'no mutation', 1 = 'mutation'
NOTCH1 | mutation indicator, NGS | num | 0 = 'no mutation', 1 = 'mutation'
NPM1 | NPM1 mutation status | char | 0 = 'no mutation', 1 = 'mutation'
NRAS | mutation indicator, NGS | num | 0 = 'no mutation', 1 = 'mutation'
OSSTAT | status variable for OSTM | num | 0 'censored', 1 'event'
OSTM | overall survival time | num | in months
PDGFRA | mutation indicator, NGS | num | 0 = 'no mutation', 1 = 'mutation'
PHF6 | mutation indicator, NGS | num | 0 = 'no mutation', 1 = 'mutation'
PLT | platelet count | num | in 10⁶/l
PTEN | mutation indicator, NGS | num | 0 = 'no mutation', 1 = 'mutation'
PTPN11 | mutation indicator, NGS | num | 0 = 'no mutation', 1 = 'mutation'
RAD21 | mutation indicator, NGS | num | 0 = 'no mutation', 1 = 'mutation'
RUNX1 | mutation indicator, NGS | num | 0 = 'no mutation', 1 = 'mutation'
SETBP1 | mutation indicator, NGS | num | 0 = 'no mutation', 1 = 'mutation'
SEX | sex | char | f 'female', m 'male'
SF3B1 | mutation indicator, NGS | num | 0 = 'no mutation', 1 = 'mutation'
SMC1A | mutation indicator, NGS | num | 0 = 'no mutation', 1 = 'mutation'
SMC3 | mutation indicator, NGS | num | 0 = 'no mutation', 1 = 'mutation'
SRSF2 | mutation indicator, NGS | num | 0 = 'no mutation', 1 = 'mutation'
STAG2 | mutation indicator, NGS | num | 0 = 'no mutation', 1 = 'mutation'
SUBJID | subject identifier | char |
TET2 | mutation indicator, NGS | num | 0 = 'no mutation', 1 = 'mutation'
TP53 | mutation indicator, NGS | num | 0 = 'no mutation', 1 = 'mutation'
U2AF1 | mutation indicator, NGS | num | 0 = 'no mutation', 1 = 'mutation'
WBC | white blood count | num | in 10⁶/l
WT1 | mutation indicator, NGS | num | 0 = 'no mutation', 1 = 'mutation'
ZRSR2 | mutation indicator, NGS | num | 0 = 'no mutation', 1 = 'mutation'
inv16_t16.16 | mutation indicator, cytogenetics | num | 0 = 'no mutation', 1 = 'mutation'
t8.21 | mutation indicator, cytogenetics | num | 0 = 'no mutation', 1 = 'mutation'
t.6.9..p23.q34. | mutation indicator, cytogenetics | num | 0 = 'no mutation', 1 = 'mutation'
inv.3..q21.q26.2. | mutation indicator, cytogenetics | num | 0 = 'no mutation', 1 = 'mutation'
minus.5 | mutation indicator, cytogenetics | num | 0 = 'no mutation', 1 = 'mutation'
del.5q. | mutation indicator, cytogenetics | num | 0 = 'no mutation', 1 = 'mutation'
t.9.22..q34.q11. | mutation indicator, cytogenetics | num | 0 = 'no mutation', 1 = 'mutation'
minus.7 | mutation indicator, cytogenetics | num | 0 = 'no mutation', 1 = 'mutation'
minus.17 | mutation indicator, cytogenetics | num | 0 = 'no mutation', 1 = 'mutation'
t.v.11..v.q23. | mutation indicator, cytogenetics | num | 0 = 'no mutation', 1 = 'mutation'
abn.17p. | mutation indicator, cytogenetics | num | 0 = 'no mutation', 1 = 'mutation'
t.9.11..p21.23.q23. | mutation indicator, cytogenetics | num | 0 = 'no mutation', 1 = 'mutation'
t.3.5. | mutation indicator, cytogenetics | num | 0 = 'no mutation', 1 = 'mutation'
t.6.11. | mutation indicator, cytogenetics | num | 0 = 'no mutation', 1 = 'mutation'
t.10.11. | mutation indicator, cytogenetics | num | 0 = 'no mutation', 1 = 'mutation'
t.11.19..q23.p13. | mutation indicator, cytogenetics | num | 0 = 'no mutation', 1 = 'mutation'
del.7q. | mutation indicator, cytogenetics | num | 0 = 'no mutation', 1 = 'mutation'
del.9q. | mutation indicator, cytogenetics | num | 0 = 'no mutation', 1 = 'mutation'
trisomy 8 | mutation indicator, cytogenetics | num | 0 = 'no mutation', 1 = 'mutation'
trisomy 21 | mutation indicator, cytogenetics | num | 0 = 'no mutation', 1 = 'mutation'
minus.Y | mutation indicator, cytogenetics | num | 0 = 'no mutation', 1 = 'mutation'
minus.X | mutation indicator, cytogenetics | num | 0 = 'no mutation', 1 = 'mutation'
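
To illustrate how the survival columns in the dictionary fit together, here is a minimal sketch that pairs OSTM (time in months) with OSSTAT (0 = censored, 1 = event) and computes a Kaplan-Meier curve. The choice of Kaplan-Meier is an assumption made for illustration, not necessarily the analysis used in the associated paper, and the four rows are hypothetical values, not records from the dataset. The same function would apply to EFSTM and EFSSTAT.

```python
from collections import Counter


def kaplan_meier(times, events):
    """Return [(time, survival_probability)] at each distinct event time.

    times  -- follow-up times (e.g. OSTM, in months)
    events -- status flags (e.g. OSSTAT: 0 = censored, 1 = event)
    """
    deaths = Counter(t for t, e in zip(times, events) if e == 1)
    exits = Counter(times)  # every subject leaves the risk set at their own time
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(exits):
        d = deaths.get(t, 0)
        if d:
            survival *= 1 - d / at_risk
            curve.append((t, survival))
        at_risk -= exits[t]  # remove both events and censored subjects
    return curve


# Hypothetical rows for illustration only: (OSTM, OSSTAT)
rows = [(6.0, 1), (12.0, 0), (18.0, 1), (24.0, 0)]
ostm = [r[0] for r in rows]
osstat = [r[1] for r in rows]
curve = kaplan_meier(ostm, osstat)
print(curve)
```

Censored subjects (OSSTAT = 0) never lower the survival estimate directly; they only shrink the risk set for later event times.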
MIMIC_III_IPI - Discharge Summaries from MIMIC-III with Indirect Personal...
zenodo.org
Updated Mar 19, 2025
Cite
Ibrahim Baroud; Lisa Raithel; Sebastian Möller; Roland Roller (2025). MIMIC_III_IPI - Discharge Summaries from MIMIC-III with Indirect Personal Identifiers Annotations [Dataset]. http://doi.org/10.5281/zenodo.15044596
Explore at:
Unique identifier
https://doi.org/10.5281/zenodo.15044596
Dataset updated
Mar 19, 2025
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Ibrahim Baroud; Lisa Raithel; Sebastian Möller; Roland Roller
License
Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
Description
MIMIC_III_IPI - Discharge Summaries from Medical Information Mart for Intensive Care-III with Indirect Personal Identifiers Annotations

The discharge summaries we use for demonstrating our Indirect Personal Identifiers (IPI) schema are randomly sampled from the Medical Information Mart for Intensive Care (MIMIC-III) dataset. MIMIC-III comprises health-related data from over 40,000 patients who stayed in critical care units of the Beth Israel Deaconess Medical Center between 2001 and 2012. Among other types of data, such as patient demographics, the database also includes various types of textual data, such as diagnostic reports and discharge summaries. We chose discharge summaries for our study, since these are richer in information than other notes in MIMIC-III. Details:

Johnson, A., Pollard, T., & Mark, R. (2016). MIMIC-III Clinical Database (version 1.4). PhysioNet. https://doi.org/10.13026/C2XW26.

Johnson, A. E., Pollard, T. J., Shen, L., Lehman, L. W., Feng, M., Ghassemi, M., Moody, B., Szolovits, P., Celi, L. A., & Mark, R. G. (2016). MIMIC-III, a freely accessible critical care database. Scientific data, 3, 160035. https://doi.org/10.1038/sdata.2016.35

These are the discharge summaries from MIMIC-III with Indirect Personal Identifiers annotations, released as supplementary material for the paper accepted at the PrivateNLP workshop at NAACL 2025. A preprint is available:

Baroud, I., Raithel, L., Möller, S., & Roller, R. (2025). Beyond De-Identification: A Structured Approach for Defining and Detecting Indirect Identifiers in Medical Texts. arXiv preprint arXiv:2502.13342.

This repository contains the annotations in a CSV file and the annotation guidelines document. Inspecting the exact annotation texts requires access to the MIMIC-III Clinical Database (see https://physionet.org/content/mimiciii/1.4/). Each row in the CSV file has an ID together with a list of the IPI-annotated spans, each in the format {"start": ,"end": ,"label": }. The ID in the ipi_annotations.csv table corresponds to the ROW_ID in the MIMIC-III NOTEEVENTS.csv table. It can be used to merge the two tables, inspect the original documents, and reconstruct the annotations from the offsets.
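
The merge-and-reconstruct step described above can be sketched as follows. This is a minimal illustration under stated assumptions: the span list is assumed to be stored as a JSON array, the "end" offset is assumed to be exclusive, and the note text, the ID 101, and the label "FAMILY" are hypothetical stand-ins rather than real MIMIC-III content or labels from the IPI schema. In practice the two dictionaries would be built from ipi_annotations.csv and NOTEEVENTS.csv.

```python
import json


def extract_spans(note_text, spans_json):
    """Return (label, text) pairs for each annotated span.

    Assumes spans_json is a JSON array of {"start", "end", "label"} objects
    and that "end" is an exclusive character offset into note_text.
    """
    spans = json.loads(spans_json)
    return [(s["label"], note_text[s["start"]:s["end"]]) for s in spans]


# Hypothetical stand-in for NOTEEVENTS.csv: ROW_ID -> note text
notes_by_row_id = {101: "Patient lives with her two sisters in Boston."}

# Hypothetical stand-in for ipi_annotations.csv: ID -> serialized span list
annotations_by_id = {101: '[{"start": 19, "end": 34, "label": "FAMILY"}]'}

# Merge on the shared identifier and slice the text with the offsets.
reconstructed = {
    row_id: extract_spans(notes_by_row_id[row_id], spans_json)
    for row_id, spans_json in annotations_by_id.items()
    if row_id in notes_by_row_id
}
print(reconstructed)
```

If the offsets turn out to be inclusive, or relative to a preprocessed version of the text, the slice would need adjusting; the annotation guidelines are the place to confirm this.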

Please note that only authenticated users can request access to review and download the annotations and guidelines. If you encounter any issues, feel free to reach out to the contact person.
SQL code.
plos.figshare.com
7z
Updated Jun 21, 2023
Cite
Dengao Li; Jian Fu; Jumin Zhao; Junnan Qin; Lihui Zhang (2023). SQL code. [Dataset]. http://doi.org/10.1371/journal.pone.0276835.s001
Explore at:
Available download formats: 7z
Unique identifier
https://doi.org/10.1371/journal.pone.0276835.s001
Dataset updated
Jun 21, 2023
Dataset provided by
PLOS ONE
Authors
Dengao Li; Jian Fu; Jumin Zhao; Junnan Qin; Lihui Zhang
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
SQL code for extracting data from MIMIC-III. (7Z)
MIMIC-IV Clinical Text Data
kaggle.com
Updated Mar 13, 2025
Cite
Antu Saha (2025). MIMIC-IV Clinical Text Data [Dataset]. https://www.kaggle.com/datasets/antusaha182352543/mimic-iv-clinical-text-data
Explore at:
Croissant (a format for machine-learning datasets; see mlcommons.org/croissant)
Dataset updated
Mar 13, 2025
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Antu Saha
License
https://creativecommons.org/publicdomain/zero/1.0/
Description
Dataset

This dataset was created by Antu Saha

Released under CC0: Public Domain

Data from: MIMIC-IV-ECG: Diagnostic Electrocardiogram Matched Subset
registry.opendata.aws
physionet.org
Updated Dec 19, 2024
+ more versions
Cite
PhysioNet (2024). MIMIC-IV-ECG: Diagnostic Electrocardiogram Matched Subset [Dataset]. https://registry.opendata.aws/mimic-iv-ecg/
Explore at:
Dataset updated
Dec 19, 2024
Dataset provided by
PhysioNet (https://physionet.org/)
Description
The MIMIC-IV-ECG module contains approximately 800,000 diagnostic electrocardiograms across nearly 160,000 unique patients. These diagnostic ECGs use 12 leads and are 10 seconds in length. They are sampled at 500 Hz. This subset contains all of the ECGs for patients who appear in the MIMIC-IV Clinical Database. When a cardiologist report is available for a given ECG, we provide the needed information to link the waveform to the report. The patients in MIMIC-IV-ECG have been matched against the MIMIC-IV Clinical Database, making it possible to link to information across the MIMIC-IV modules.
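
As a quick sanity check on the figures above, the recording parameters imply a fixed number of samples per ECG. The sketch below only does the arithmetic from the stated values (12 leads, 10 seconds, 500 Hz); the variable names are illustrative and say nothing about how the files are actually laid out on disk.

```python
# Recording parameters as stated in the dataset description.
SAMPLING_RATE_HZ = 500
DURATION_S = 10
N_LEADS = 12

# Samples recorded on each lead, and across all leads, for one ECG.
samples_per_lead = SAMPLING_RATE_HZ * DURATION_S
samples_per_ecg = samples_per_lead * N_LEADS

print(samples_per_lead, samples_per_ecg)
```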
Data from: PDD Graph: Bridging Electronic Medical Records and Biomedical...
springernature.figshare.com
txt
Updated May 31, 2023
Cite
Meng Wang; Jiaheng Zhang; Jun Liu; Wei Hu; Sen Wang; Xue Li; Wenqiang Liu (2023). PDD Graph: Bridging Electronic Medical Records and Biomedical Knowledge Graphs via Entity Linking [Dataset]. http://doi.org/10.6084/m9.figshare.5242138
Explore at:
Available download formats: txt
Unique identifier
https://doi.org/10.6084/m9.figshare.5242138
Dataset updated
May 31, 2023
Dataset provided by
Figshare (http://figshare.com/)
Authors
Meng Wang; Jiaheng Zhang; Jun Liu; Wei Hu; Sen Wang; Xue Li; Wenqiang Liu
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Patient-drug-disease (PDD) Graph dataset, utilising electronic medical records (EMRs) and biomedical knowledge graphs. The novel framework to construct the PDD graph is described in the associated publication.

PDD is an RDF graph consisting of PDD facts, where a PDD fact is represented by an RDF triple to indicate that a patient takes a drug or a patient is diagnosed with a disease. For instance, (pdd:274671, pdd:diagnosed, sepsis).

Data files are in .nt N-Triples format, a line-based syntax for an RDF graph. These can be opened with any freely available text editor.

diagnose_icd_information.nt contains RDF triples mapping patients to diagnoses. For example: (pdd:18740, pdd:diagnosed, icd99592), where pdd:18740 is a patient entity, and icd99592 is the ICD-9 code of sepsis.

drug_patients.nt contains RDF triples mapping patients to drugs. For example: (pdd:18740, pdd:prescribed, aspirin), where pdd:18740 is a patient entity, and aspirin is the drug's name.

Background: Electronic medical records contain multi-format electronic medical data that consist of an abundance of medical knowledge. Faced with patients' symptoms, experienced caregivers make the right medical decisions based on their professional knowledge, which accurately grasps relationships between symptoms, diagnoses and corresponding treatments. In the associated paper, we aim to capture these relationships by constructing a large and high-quality heterogeneous graph linking patients, diseases, and drugs (PDD) in EMRs. Specifically, we propose a novel framework to extract important medical entities from MIMIC-III (Medical Information Mart for Intensive Care III) and automatically link them with the existing biomedical knowledge graphs, including ICD-9 ontology and DrugBank.

The PDD graph presented in this paper is accessible on the Web via the SPARQL endpoint as well as in .nt format in this repository, and provides a pathway for medical discovery and applications, such as effective treatment recommendations.

De-identification: It is necessary to mention that MIMIC-III contains clinical information of patients. Although the protected health information was de-identified, researchers who seek to use more clinical data should complete an online training course and then apply for permission to download the complete MIMIC-III dataset: https://mimic.physionet.org/
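
To make the PDD fact structure concrete, the following sketch parses facts written in the simplified (subject, predicate, object) notation used in this description and groups them by patient. It is an illustration only: the actual .nt files follow N-Triples syntax, which differs from this shorthand, so a dedicated RDF parser would be the safer choice for the real data. The function name parse_fact is hypothetical.

```python
from collections import defaultdict


def parse_fact(text):
    """Split a '(subject, predicate, object)' string into a 3-tuple.

    Handles only the shorthand notation shown in the description,
    not real N-Triples syntax.
    """
    inner = text.strip().strip("()")
    subject, predicate, obj = (part.strip() for part in inner.split(","))
    return subject, predicate, obj


# The three example facts quoted in the description.
facts = [
    "(pdd:18740, pdd:diagnosed, icd99592)",
    "(pdd:18740, pdd:prescribed, aspirin)",
    "(pdd:274671, pdd:diagnosed, sepsis)",
]

# Group (predicate, object) pairs under each patient entity.
by_patient = defaultdict(list)
for fact in facts:
    subject, predicate, obj = parse_fact(fact)
    by_patient[subject].append((predicate, obj))

print(dict(by_patient))
```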