100+ datasets found

MIMIC-III Clinical Database
physionet.org
Updated Sep 4, 2016
Cite
Alistair Johnson; Tom Pollard; Roger Mark (2016). MIMIC-III Clinical Database [Dataset]. http://doi.org/10.13026/C2XW26
Unique identifier
https://doi.org/10.13026/C2XW26
Dataset updated
Sep 4, 2016
Authors
Alistair Johnson; Tom Pollard; Roger Mark
License
https://github.com/MIT-LCP/license-and-dua/tree/master/drafts
Description
MIMIC-III is a large, freely-available database comprising deidentified health-related data associated with over forty thousand patients who stayed in critical care units of the Beth Israel Deaconess Medical Center between 2001 and 2012. The database includes information such as demographics, vital sign measurements made at the bedside (~1 data point per hour), laboratory test results, procedures, medications, caregiver notes, imaging reports, and mortality (including post-hospital discharge). MIMIC supports a diverse range of analytic studies spanning epidemiology, clinical decision-rule improvement, and electronic tool development. It is notable for three factors: it is freely available to researchers worldwide; it encompasses a diverse and very large population of ICU patients; and it contains highly granular data, including vital signs, laboratory results, and medications.
MIMIC-III Dataset
paperswithcode.com
opendatalab.com
Updated Apr 20, 2022
Cite
Alistair E.W. Johnson; Tom J. Pollard; Lu Shen; Li-wei H. Lehman; Mengling Feng; Mohammad Ghassemi; Benjamin Moody; Peter Szolovits; Leo Anthony Celi; Roger G. Mark (2022). MIMIC-III Dataset [Dataset]. https://paperswithcode.com/dataset/mimic-iii
Dataset updated
Apr 20, 2022
Authors
Alistair E.W. Johnson; Tom J. Pollard; Lu Shen; Li-wei H. Lehman; Mengling Feng; Mohammad Ghassemi; Benjamin Moody; Peter Szolovits; Leo Anthony Celi; Roger G. Mark
Description
The Medical Information Mart for Intensive Care III (MIMIC-III) dataset is a large, de-identified, publicly available collection of medical records. Each record in the dataset includes ICD-9 codes, which identify diagnoses and procedures performed. Each code is partitioned into sub-codes, which often include specific circumstantial details. The dataset consists of 112,000 clinical report records (average length 709.3 tokens) and 1,159 top-level ICD-9 codes. Each report is assigned 7.6 codes on average. Data includes vital signs, medications, laboratory measurements, observations and notes charted by care providers, fluid balance, procedure codes, diagnostic codes, imaging reports, hospital length of stay, survival data, and more.

The database supports applications including academic and industrial research, quality improvement initiatives, and higher education coursework.
mimic-iii-clinical-database-demo-1.4
kaggle.com
Updated Apr 1, 2025
Cite
Montassar bellah (2025). mimic-iii-clinical-database-demo-1.4 [Dataset]. https://www.kaggle.com/datasets/montassarba/mimic-iii-clinical-database-demo-1-4
Dataset updated
Apr 1, 2025
Dataset provided by
Kaggle: http://kaggle.com/
Authors
Montassar bellah
License
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
Description
Abstract
MIMIC-III is a large, freely-available database comprising deidentified health-related data associated with over 40,000 patients who stayed in critical care units of the Beth Israel Deaconess Medical Center between 2001 and 2012 [1]. The MIMIC-III Clinical Database is available on PhysioNet (doi: 10.13026/C2XW26). Though deidentified, MIMIC-III contains detailed information regarding the care of real patients, and as such requires credentialing before access. To allow researchers to ascertain whether the database is suitable for their work, we have manually curated a demo subset, which contains information for 100 patients also present in the MIMIC-III Clinical Database. Notably, the demo dataset does not include free-text notes.

Background
In recent years there has been a concerted move towards the adoption of digital health record systems in hospitals. Despite this advance, interoperability of digital systems remains an open issue, leading to challenges in data integration. As a result, the potential that hospital data offers in terms of understanding and improving care is yet to be fully realized.

MIMIC-III integrates deidentified, comprehensive clinical data of patients admitted to the Beth Israel Deaconess Medical Center in Boston, Massachusetts, and makes it widely accessible to researchers internationally under a data use agreement. The open nature of the data allows clinical studies to be reproduced and improved in ways that would not otherwise be possible.

The MIMIC-III database was populated with data that had been acquired during routine hospital care, so there was no associated burden on caregivers and no interference with their workflow. For more information on the collection of the data, see the MIMIC-III Clinical Database page.

Methods
The demo dataset contains all intensive care unit (ICU) stays for 100 patients. These patients were selected randomly from the subset of patients in the dataset who eventually die. Consequently, all patients will have a date of death (DOD). However, patients do not necessarily die during an individual hospital admission or ICU stay.

This project was approved by the Institutional Review Boards of Beth Israel Deaconess Medical Center (Boston, MA) and the Massachusetts Institute of Technology (Cambridge, MA). Requirement for individual patient consent was waived because the project did not impact clinical care and all protected health information was deidentified.

Data Description
MIMIC-III is a relational database consisting of 26 tables. For a detailed description of the database structure, see the MIMIC-III Clinical Database page. The demo shares an identical schema, except all rows in the NOTEEVENTS table have been removed.

The data files are distributed in comma separated value (CSV) format following the RFC 4180 standard. Notably, string fields which contain commas, newlines, and/or double quotes are encapsulated by double quotes ("). Actual double quotes in the data are escaped using an additional double quote. For example, the string she said "the patient was notified at 6pm" would be stored in the CSV as "she said ""the patient was notified at 6pm""". More detail is provided on the RFC 4180 description page: https://tools.ietf.org/html/rfc4180
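The quoting rule described above can be checked with Python's standard `csv` module, whose default dialect doubles embedded quotes in the same way. This is a minimal sketch: the row below is made up for illustration and is not taken from MIMIC-III.

```python
import csv
import io

# Hypothetical row: an id and a free-text field containing double quotes.
row = ["1", 'she said "the patient was notified at 6pm"']

# Write the row. Fields containing quotes, commas, or newlines are quoted,
# and embedded double quotes are doubled (the csv default, doublequote=True).
buffer = io.StringIO()
csv.writer(buffer, lineterminator="\r\n").writerow(row)
encoded = buffer.getvalue()

# Reading the encoded line back recovers the original field values.
decoded = next(csv.reader(io.StringIO(encoded)))
```

When reading a real file rather than an in-memory string, the `csv` documentation recommends opening it with `newline=""` so that newlines embedded in quoted fields are handled correctly.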

Usage Notes
The MIMIC-III demo provides researchers with an opportunity to review the structure and content of MIMIC-III before deciding whether or not to carry out an analysis on the full dataset.

CSV files can be opened natively using any text editor or spreadsheet program. However, some tables are large, and it may be preferable to navigate the data stored in a relational database. One alternative is to create an SQLite database using the CSV files. SQLite is a lightweight database format which stores all constituent tables in a single file, and SQLite databases interoperate well with a number of software tools.
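One way to build such an SQLite database is with Python's standard `csv` and `sqlite3` modules. The sketch below is only an illustration, not the procedure used by the dataset authors: the helper `load_csv_into_sqlite` is hypothetical, every column is stored as TEXT for simplicity, and the sample rows are invented stand-ins for a demo CSV file.

```python
import csv
import io
import sqlite3


def load_csv_into_sqlite(conn, table_name, csv_file):
    """Create table_name from a CSV stream, storing every column as TEXT."""
    reader = csv.reader(csv_file)
    header = next(reader)  # first CSV row holds the column names
    columns = ", ".join(f'"{name}" TEXT' for name in header)
    placeholders = ", ".join("?" for _ in header)
    conn.execute(f'CREATE TABLE "{table_name}" ({columns})')
    conn.executemany(f'INSERT INTO "{table_name}" VALUES ({placeholders})', reader)
    conn.commit()


# Stand-in for a demo CSV file; the values here are illustrative only.
sample = io.StringIO("ROW_ID,SUBJECT_ID,GENDER\r\n1,10006,F\r\n2,10011,F\r\n")

conn = sqlite3.connect(":memory:")  # pass a file path instead to persist the database
load_csv_into_sqlite(conn, "PATIENTS", sample)
row_count = conn.execute('SELECT COUNT(*) FROM "PATIENTS"').fetchone()[0]
```

For a file on disk, the same helper could be called with `open(path, newline="")` in place of the in-memory sample. Declaring real column types instead of TEXT would make numeric queries behave more naturally.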

DB Browser for SQLite is a high quality, visual, open source tool to create, design, and edit database files compatible with SQLite. We have found this tool to be useful for navigating SQLite files. Information regarding installation of the software and creation of the database can be found online: https://sqlitebrowser.org/

Release Notes
Release notes for the demo follow the release notes for the MIMIC-III database.

Acknowledgements
This research and development was supported by grants NIH-R01-EB017205, NIH-R01-EB001659, and NIH-R01-GM104987 from the National Institutes of Health. The authors would also like to thank Philips Healthcare and staff at the Beth Israel Deaconess Medical Center, Boston, for supporting database development, and Ken Pierce for providing ongoing support for the MIMIC research community.

Conflicts of Interest
The authors declare no competing financial interests.

References
Johnson, A. E. W., Pollard, T. J., Shen, L., Lehman, L. H., Feng, M., Ghassemi, M., Mo...
Clinical Admission Notes from MIMIC-III
opendatalab.com
paperswithcode.com
zip
Updated Sep 21, 2022
Cite
Beuth University of Applied Sciences Berlin (2022). Clinical Admission Notes from MIMIC-III [Dataset]. https://opendatalab.com/OpenDataLab/Clinical_Admission_Notes_from_etc
Available download formats: zip (282276 bytes)
Dataset updated
Sep 21, 2022
Dataset provided by
Beuth University of Applied Sciences Berlin
Charité – Berlin University of Medicine
License
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
Description
This dataset is created from MIMIC-III (Medical Information Mart for Intensive Care III) and contains simulated patient admission notes. The clinical notes contain information about a patient at admission time to the ICU and are labelled for four outcome prediction tasks: Diagnoses at discharge, procedures performed, in-hospital mortality and length-of-stay. To obtain the data one first has to gain access to the MIMIC-III dataset and then run the scripts introduced in the linked repository.
MIMIC-IV v2.2 Dataset
paperswithcode.com
Updated Feb 24, 2025
Cite
(2025). MIMIC-IV v2.2 Dataset [Dataset]. https://paperswithcode.com/dataset/mimic-iv-v2-2
Dataset updated
Feb 24, 2025
Description
Retrospectively collected medical data has the opportunity to improve patient care through knowledge discovery and algorithm development. Broad reuse of medical data is desirable for the greatest public good, but data sharing must be done in a manner which protects patient privacy. The Medical Information Mart for Intensive Care (MIMIC)-III database provided critical care data for over 40,000 patients admitted to intensive care units at the Beth Israel Deaconess Medical Center (BIDMC). Importantly, MIMIC-III was deidentified, and patient identifiers were removed according to the Health Insurance Portability and Accountability Act (HIPAA) Safe Harbor provision. MIMIC-III has been integral in driving large amounts of research in clinical informatics, epidemiology, and machine learning. Here we present MIMIC-IV, an update to MIMIC-III, which incorporates contemporary data and improves on numerous aspects of MIMIC-III. MIMIC-IV adopts a modular approach to data organization, highlighting data provenance and facilitating both individual and combined use of disparate data sources. MIMIC-IV is intended to carry on the success of MIMIC-III and support a broad set of applications within healthcare.
MIMIC-IV
physionet.org
Updated Oct 11, 2024
Cite
Alistair Johnson; Lucas Bulgarelli; Tom Pollard; Brian Gow; Benjamin Moody; Steven Horng; Leo Anthony Celi; Roger Mark (2024). MIMIC-IV [Dataset]. http://doi.org/10.13026/kpb9-mt58
Unique identifier
https://doi.org/10.13026/kpb9-mt58
Dataset updated
Oct 11, 2024
Authors
Alistair Johnson; Lucas Bulgarelli; Tom Pollard; Brian Gow; Benjamin Moody; Steven Horng; Leo Anthony Celi; Roger Mark
License
https://github.com/MIT-LCP/license-and-dua/tree/master/drafts
Description
Retrospectively collected medical data has the opportunity to improve patient care through knowledge discovery and algorithm development. Broad reuse of medical data is desirable for the greatest public good, but data sharing must be done in a manner which protects patient privacy. Here we present Medical Information Mart for Intensive Care (MIMIC)-IV, a large deidentified dataset of patients admitted to the emergency department or an intensive care unit at the Beth Israel Deaconess Medical Center in Boston, MA. MIMIC-IV contains data for over 65,000 patients admitted to an ICU and over 200,000 patients admitted to the emergency department. MIMIC-IV incorporates contemporary data and adopts a modular approach to data organization, highlighting data provenance and facilitating both individual and combined use of disparate data sources. MIMIC-IV is intended to carry on the success of MIMIC-III and support a broad set of applications within healthcare.
MIMIC-III-split
huggingface.co
Updated Mar 19, 2024
Cite
Corentin Royer (2024). MIMIC-III-split [Dataset]. https://huggingface.co/datasets/croyer/MIMIC-III-split
Dataset updated
Mar 19, 2024
Authors
Corentin Royer
License
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
Description
croyer/MIMIC-III-split dataset hosted on Hugging Face and contributed by the HF Datasets community
MIMIC-II Clinical Database
physionet.org
Updated Apr 24, 2011
Cite
Mohammed Saeed; Mauricio Villarroel; Andrew Reisner; Gari Clifford; Li-wei Lehman; George Moody; Thomas Heldt; Tin Kyaw; Benjamin Moody; Roger Mark (2011). MIMIC-II Clinical Database [Dataset]. http://doi.org/10.13026/fxn0-mk84
Unique identifier
https://doi.org/10.13026/fxn0-mk84
Dataset updated
Apr 24, 2011
Authors
Mohammed Saeed; Mauricio Villarroel; Andrew Reisner; Gari Clifford; Li-wei Lehman; George Moody; Thomas Heldt; Tin Kyaw; Benjamin Moody; Roger Mark
License
https://github.com/MIT-LCP/license-and-dua/tree/master/drafts
Description
MIMIC-II documents a diverse and large population of intensive care unit patient stays and contains comprehensive and detailed clinical data, including physiological waveforms and minute-by-minute trends for a subset of records. It establishes a unique public-access resource for critical care research, supporting a diverse range of analytic studies spanning epidemiology, clinical decision-rule development, and electronic tool development. The MIMIC-II Clinical Database, although de-identified, still contains detailed information regarding the clinical care of patients, and must be treated with appropriate care and respect.
EHR data from MIMIC-III
scidb.cn
Updated Aug 24, 2021
Cite
Tingyi Wanyan; Hossein Honarvar; Ariful Azad; Ying Ding; Benjamin S. Glicksberg (2021). EHR data from MIMIC-III [Dataset]. http://doi.org/10.11922/sciencedb.j00104.00094
Unique identifier
https://doi.org/10.11922/sciencedb.j00104.00094
Dataset updated
Aug 24, 2021
Dataset provided by
Science Data Bank
Authors
Tingyi Wanyan; Hossein Honarvar; Ariful Azad; Ying Ding; Benjamin S. Glicksberg
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
We conducted our experiments on de-identified EHR data from MIMIC-III. This data set contains various clinical data relating to patient admission to the ICU, such as disease diagnoses in the form of International Classification of Diseases (ICD)-9 codes, and lab test results as detailed in Supplementary Materials. We collected data for 5,956 patients, extracting lab tests every hour from admission. There are a total of 409 unique lab tests and 3,387 unique disease diagnoses observed. The diagnoses were obtained as ICD-9 codes and were represented using one-hot encoding, where one represents patients with the disease and zero indicates those without. We binned the lab test events into 6, 12, 24, and 48 hours prior to patient death or discharge from ICU. From these data, we performed mortality predictions with 10-fold cross-validation.
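The encoding and binning steps can be sketched in a few lines of Python. This is an illustration under assumptions, not the authors' code: the function names, the three-code vocabulary, and the lab values are hypothetical, and the binning is read here as "keep events in the N hours before the end of the ICU stay".

```python
def one_hot(patient_codes, vocabulary):
    """Return a 0/1 vector marking which codes in vocabulary the patient has."""
    present = set(patient_codes)
    return [1 if code in present else 0 for code in vocabulary]


def events_in_window(events, end_hour, window_hours):
    """Keep (hour, value) lab events within window_hours before end_hour."""
    start = end_hour - window_hours
    return [(hour, value) for hour, value in events if start <= hour <= end_hour]


# Hypothetical data: a small ICD-9 vocabulary and one patient's diagnoses.
vocabulary = ["038.9", "428.0", "584.9"]
encoded = one_hot(["428.0", "038.9"], vocabulary)

# Hypothetical hourly lab events as (hours since admission, value),
# for a stay that ends (death or discharge) at hour 48.
events = [(1, 7.1), (30, 7.3), (44, 7.2), (47, 7.0)]
last_6h = events_in_window(events, end_hour=48, window_hours=6)
last_24h = events_in_window(events, end_hour=48, window_hours=24)
```

With 3,387 distinct diagnoses, each patient's one-hot vector would have 3,387 entries, most of them zero.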
Data from: MIMIC-III and eICU-CRD: Feature Representation by FIDDLE...
physionet.org
Updated Apr 28, 2021
Cite
Shengpu Tang; Parmida Davarmanesh; Yanmeng Song; Danai Koutra; Michael Sjoding; Jenna Wiens (2021). MIMIC-III and eICU-CRD: Feature Representation by FIDDLE Preprocessing [Dataset]. http://doi.org/10.13026/2qtg-k467
Unique identifier
https://doi.org/10.13026/2qtg-k467
Dataset updated
Apr 28, 2021
Authors
Shengpu Tang; Parmida Davarmanesh; Yanmeng Song; Danai Koutra; Michael Sjoding; Jenna Wiens
License
https://github.com/MIT-LCP/license-and-dua/tree/master/drafts
Description
This is a preprocessed dataset derived from patient records in MIMIC-III and eICU, two large-scale electronic health record (EHR) databases. It contains features and labels for 5 prediction tasks involving 3 adverse outcomes (prediction times listed in parentheses): in-hospital mortality (48h), acute respiratory failure (4h and 12h), and shock (4h and 12h). We extracted comprehensive, high-dimensional feature representations (up to ~8,000 features) using FIDDLE (FlexIble Data-Driven pipeLinE), an open-source preprocessing pipeline for structured clinical data. These 5 prediction tasks were designed in consultation with a critical care physician for their clinical importance, and were used as part of the proof-of-concept experiments in the original paper to demonstrate FIDDLE's utility in aiding the feature engineering step of machine learning model development. The intent of this release is to share preprocessed MIMIC-III and eICU datasets used in the experiments to support and enable reproducible machine learning research on EHR data.
MIMIC_III_IPI - Discharge Summaries from MIMIC-III with Indirect Personal...
zenodo.org
Updated Mar 19, 2025
Cite
Ibrahim Baroud; Lisa Raithel; Sebastian Möller; Roland Roller (2025). MIMIC_III_IPI - Discharge Summaries from MIMIC-III with Indirect Personal Identifiers Annotations [Dataset]. http://doi.org/10.5281/zenodo.15044596
Unique identifier
https://doi.org/10.5281/zenodo.15044596
Dataset updated
Mar 19, 2025
Dataset provided by
Zenodo: http://zenodo.org/
Authors
Ibrahim Baroud; Lisa Raithel; Sebastian Möller; Roland Roller
License
Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
Description
MIMIC_III_IPI - Discharge Summaries from Medical Information Mart for Intensive Care-III with Indirect Personal Identifiers Annotations

The discharge summaries we use for demonstrating our Indirect Personal Identifiers (IPI) schema are randomly sampled from the Medical Information Mart for Intensive Care (MIMIC-III) dataset. MIMIC-III comprises health-related data from over 40,000 patients who stayed in critical care units of the Beth Israel Deaconess Medical Center between 2001 and 2012. Among other types of data, such as patient demographics, the database also includes various types of textual data, such as diagnostic reports and discharge summaries. We chose discharge summaries for our study, since these are richer in information than other notes in MIMIC-III. Details:

Johnson, A., Pollard, T., & Mark, R. (2016). MIMIC-III Clinical Database (version 1.4). PhysioNet. https://doi.org/10.13026/C2XW26.

Johnson, A. E., Pollard, T. J., Shen, L., Lehman, L. W., Feng, M., Ghassemi, M., Moody, B., Szolovits, P., Celi, L. A., & Mark, R. G. (2016). MIMIC-III, a freely accessible critical care database. Scientific data, 3, 160035. https://doi.org/10.1038/sdata.2016.35

This dataset, Discharge Summaries from MIMIC-III with Indirect Personal Identifiers Annotations, is an external resource for the paper accepted at the PrivateNLP workshop at NAACL 2025. A preprint is available:

Baroud, I., Raithel, L., Möller, S., & Roller, R. (2025). Beyond De-Identification: A Structured Approach for Defining and Detecting Indirect Identifiers in Medical Texts. arXiv preprint arXiv:2502.13342.

This repository contains the annotations in a CSV file and the annotation guidelines document. Inspecting the exact annotation texts requires access to the MIMIC-III Clinical Database, see https://physionet.org/content/mimiciii/1.4/. Each row in the CSV file has an ID together with a list of the IPI annotated spans, each in the format {"start": ,"end": ,"label": }. The ID in the ipi_annotations.csv table corresponds to the same ROW_ID in the MIMIC-III NOTEEVENTS.csv table and can be used for merging the tables to inspect the original documents and reconstruct the annotations using the offsets.
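The merge on ROW_ID and the reconstruction of spans from offsets might look like the following Python sketch. Several details are assumptions made for illustration: the column name `annotations`, the label `OCCUPATION`, the note text, and the convention that `end` is exclusive are not confirmed by the description above, and `reconstruct_spans` is a hypothetical helper.

```python
import json

# Hypothetical stand-ins: note text keyed by NOTEEVENTS ROW_ID, and one
# annotation row whose span list is stored as a JSON string.
notes_by_row_id = {"42": "Patient is a retired teacher who lives alone."}
annotation_rows = [
    {"ID": "42", "annotations": '[{"start": 13, "end": 28, "label": "OCCUPATION"}]'},
]


def reconstruct_spans(notes_by_row_id, annotation_rows):
    """Join annotations to note text on ROW_ID and slice out each annotated span."""
    results = []
    for row in annotation_rows:
        text = notes_by_row_id.get(row["ID"])
        if text is None:
            continue  # the matching note is not available locally
        for span in json.loads(row["annotations"]):
            # Assumes "end" is exclusive, as in Python slicing.
            snippet = text[span["start"]:span["end"]]
            results.append((row["ID"], span["label"], snippet))
    return results


spans = reconstruct_spans(notes_by_row_id, annotation_rows)
```

If the offsets turn out to be inclusive, the slice would need `span["end"] + 1`; checking a few spans against the annotation guidelines would settle this.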

Please note that only authenticated users can request access to review and download the annotations and guidelines. If you encounter any issues, feel free to reach out to the contact person.
mimic-iii
huggingface.co
Updated Apr 10, 2025
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Ali Hejazizo (2025). mimic-iii [Dataset]. https://huggingface.co/datasets/hejazizo/mimic-iii
Dataset updated
Apr 10, 2025
Authors
Ali Hejazizo
Description
hejazizo/mimic-iii dataset hosted on Hugging Face and contributed by the HF Datasets community
MIMIC-III - SequenceExamples for TensorFlow modeling
physionet.org
Updated Sep 29, 2020
Cite
Jonas Kemp; Kun Zhang; Andrew Dai (2020). MIMIC-III - SequenceExamples for TensorFlow modeling [Dataset]. http://doi.org/10.13026/n2v5-5b32
Unique identifier
https://doi.org/10.13026/n2v5-5b32
Dataset updated
Sep 29, 2020
Authors
Jonas Kemp; Kun Zhang; Andrew Dai
License
https://github.com/MIT-LCP/license-and-dua/tree/master/drafts
Description
This dataset contains TensorFlow SequenceExamples derived from patient records in MIMIC-III, a freely available set of deidentified medical records from critical care patients at Beth Israel Deaconess Medical Center. Each SequenceExample converts data from an individual patient encounter and any previous encounters into a set of timestamped “feature lists” describing the patient history up to a certain time, beyond which predictions can be made. These data are suitable for direct input into TensorFlow modeling pipelines, and include labels for inpatient mortality and discharge diagnosis codes for each encounter. The intent of this release is to provide a preprocessed, ready-to-use version of MIMIC-III to support and enable reproducible machine learning research for electronic health records.
Refined MIMIC-III 30-day mortality prediction of sepsis-3 patients
ieee-dataport.org
Updated Aug 15, 2022
Cite
JaeSung Yoo (2022). Refined MIMIC-III 30-day mortality prediction of sepsis-3 patients [Dataset]. https://ieee-dataport.org/documents/refined-mimic-iii-30-day-mortality-prediction-sepsis-3-patients
Dataset updated
Aug 15, 2022
Authors
JaeSung Yoo
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
and the data type of each column was adjusted.
Additional file 1 of A novel nomogram to predict mortality in patients with...
springernature.figshare.com
figshare.com
txt
Updated Jun 5, 2023
Cite
Xiao-Dan Li; Min-Min Li (2023). Additional file 1 of A novel nomogram to predict mortality in patients with stroke: a survival analysis based on the MIMIC-III clinical database [Dataset]. http://doi.org/10.6084/m9.figshare.19533957.v1
Available download formats: txt
Unique identifier
https://doi.org/10.6084/m9.figshare.19533957.v1
Dataset updated
Jun 5, 2023
Dataset provided by
figshare
Authors
Xiao-Dan Li; Min-Min Li
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Additional file 1: Raw data of relevant clinical data of stroke patients.
Atrial Fibrillation annotations of electrocardiogram from MIMIC III matched...
figshare.com
xlsx
Updated May 30, 2023
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Syed Khairul Bashar (2023). Atrial Fibrillation annotations of electrocardiogram from MIMIC III matched subset [Dataset]. http://doi.org/10.6084/m9.figshare.12149091.v1
Available download formats: xlsx
Unique identifier
https://doi.org/10.6084/m9.figshare.12149091.v1
Dataset updated
May 30, 2023
Dataset provided by
figshare
Authors
Syed Khairul Bashar
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
We provide some annotations of the Medical Information Mart for Intensive Care (MIMIC) III waveform database matched Subset. The annotations are for the electrocardiogram recordings and denote atrial fibrillation status. More annotations will be added in future. Details about MIMIC III matched subset can be found at Physionet: https://archive.physionet.org/physiobank/database/mimic3wdb/matched/

If you use the annotations, please cite the following paper:

Bashar, S.K., Ding, E., Walkey, A.J., McManus, D.D. and Chon, K.H., 2019. Noise Detection in Electrocardiogram Signals for Intensive Care Unit Patients. IEEE Access, 7, pp.88357-88368
Data from: Assessment of Non-Invasive Blood Pressure Prediction from PPG and...
explore.openaire.eu
zenodo.org
Updated Oct 7, 2021
Cite
Fabian Schrumpf; Patrick Frenzel; Christoph Aust; Georg Osterhoff; Mirco Fuch (2021). Assessment of Non-Invasive Blood Pressure Prediction from PPG and rPPG Signals Using Deep Learning [Dataset]. http://doi.org/10.5281/zenodo.5590603
Unique identifier
https://doi.org/10.5281/zenodo.5590603
Dataset updated
Oct 7, 2021
Authors
Fabian Schrumpf; Patrick Frenzel; Christoph Aust; Georg Osterhoff; Mirco Fuch
Description
This dataset is a subset of the MIMIC-III dataset used for non-invasive blood pressure prediction. PPG and ABP data were divided into windows of 7s length (875 data points). Systolic and diastolic blood pressure values were derived from the ABP windows. Each sample of the dataset consists of a PPG signal and blood pressure values as well as a unique subject identifier. The file consists of three datasets:

PPG: PPG data of size 905,400 x 875
label: BP data of size 905,400 x 2
subject_idx: subject affiliation of each sample (size 905,400 x 1)

Furthermore, this submission contains the following models:

AlexNet
ResNet50
LSTM
Architecture published by Slapnicar et al. 2019

The architectures were trained using a non-mixed dataset derived from the MIMIC-III waveform database. Samples were divided between training, validation and test set based on their subject affiliation preventing contamination of validation and test sets with samples from subjects used for training.
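A window of 875 points over 7 s implies 125 samples per second. The windowing and the subject-based split can be sketched as below. This is a simplified illustration, not the authors' pipeline: the helper names, the non-overlapping windows, the 80/10/10 split fractions, and the toy inputs are all assumptions.

```python
import random

WINDOW_LENGTH = 875  # samples per 7 s window, i.e. 125 samples per second


def split_into_windows(signal, window_length=WINDOW_LENGTH):
    """Cut a signal into consecutive non-overlapping windows, dropping the remainder."""
    n_windows = len(signal) // window_length
    return [signal[i * window_length:(i + 1) * window_length] for i in range(n_windows)]


def split_by_subject(subject_ids, fractions=(0.8, 0.1, 0.1), seed=0):
    """Assign whole subjects to train/val/test so no subject appears in two sets."""
    subjects = sorted(set(subject_ids))
    random.Random(seed).shuffle(subjects)
    n_train = int(len(subjects) * fractions[0])
    n_val = int(len(subjects) * fractions[1])
    return {
        "train": set(subjects[:n_train]),
        "val": set(subjects[n_train:n_train + n_val]),
        "test": set(subjects[n_train + n_val:]),
    }


# Toy inputs: a 2000-sample signal and per-sample subject identifiers.
windows = split_into_windows(list(range(2000)))
groups = split_by_subject([1, 1, 2, 3, 3, 4, 5, 6, 7, 8, 9, 10])
```

Splitting on subject identifiers rather than on individual samples is what keeps windows from the same person out of both the training and the evaluation sets.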
Structure Annotations of Assessment and Plan Sections from MIMIC-III
data.niaid.nih.gov
Updated Apr 17, 2022
Cite
Lee, I-Ching (2022). Structure Annotations of Assessment and Plan Sections from MIMIC-III [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_6413404
Dataset updated
Apr 17, 2022
Dataset provided by
Matias, Yossi
Hassidim, Avinatan
Barequet, Ronnie
Ofek, Eran
Stupp, Doron
Rajkomar, Alvin
Oren, Eyal
Lee, I-Ching
Benjamini, Ayelet
Feder, Amir
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Physicians record their detailed thought-processes about diagnoses and treatments as unstructured text in a section of a clinical note called the "assessment and plan". This information is more clinically rich than structured billing codes assigned for an encounter but harder to reliably extract given the complexity of clinical language and documentation habits. To structure these sections we collected a dataset of annotations over assessment and plan sections from the publicly available and de-identified MIMIC-III dataset, and developed deep-learning based models to perform this task, described in the associated paper available as a pre-print at: https://www.medrxiv.org/content/10.1101/2022.04.13.22273438v1

When using this data please cite our paper:

@article {Stupp2022.04.13.22273438,
  author = {Stupp, Doron and Barequet, Ronnie and Lee, I-Ching and Oren, Eyal and Feder, Amir and Benjamini, Ayelet and Hassidim, Avinatan and Matias, Yossi and Ofek, Eran and Rajkomar, Alvin},
  title = {Structured Understanding of Assessment and Plans in Clinical Documentation},
  year = {2022},
  doi = {10.1101/2022.04.13.22273438},
  publisher = {Cold Spring Harbor Laboratory Press},
  URL = {https://www.medrxiv.org/content/early/2022/04/17/2022.04.13.22273438},
  journal = {medRxiv}
}

The dataset, presented here, contains annotations of assessment and plan sections of notes from the publicly available and de-identified MIMIC-III dataset, marking the active problems, their assessment description, and plan action items. Action items are additionally marked as one of 8 categories (listed below). The dataset contains over 30,000 annotations of 579 notes from distinct patients, annotated by 6 medical residents and students.

The dataset is divided into 4 partitions - a training set (481 notes), validation set (50 notes), test set (48 notes) and an inter-rater set. The inter-rater set contains the annotations of each of the raters over the test set. Rater 1 in the inter-rater set should be regarded as an intra-rater comparison (details in the paper). The labels underwent automatic normalization to capture entire word boundaries and remove flanking non-alphanumeric characters.

Code for transforming labels into TensorFlow examples and training models as described in the paper will be made available at GitHub: https://github.com/google-research/google-research/tree/master/assessment_plan_modeling

In order to use these annotations, the user additionally needs to obtain the text of the notes, which is found in the NOTEEVENTS table of MIMIC-III. Access to MIMIC-III must be acquired independently (https://mimic.mit.edu/).

Annotations are given as character spans in a CSV file with the following schema:

Field | Type | Semantics
partition | categorical (one of [train, val, test, interrater]) | The set of ratings the span belongs to.
rater_id | int | Unique id for each of the raters.
note_id | int | The note's unique note_id; links to the MIMIC-III notes table (as ROW_ID).
span_type | categorical (one of [PROBLEM_TITLE, PROBLEM_DESCRIPTION, ACTION_ITEM]) | Type of the span as annotated by raters.
char_start, char_end | int | Character offsets from note start.
action_item_type | categorical (one of [MEDICATIONS, IMAGING, OBSERVATIONS_LABS, CONSULTS, NUTRITION, THERAPEUTIC_PROCEDURES, OTHER_DIAGNOSTIC_PROCEDURES, OTHER]) | Type of action item if the span is an action item (empty otherwise) as annotated by raters.
Annotated Question-Answer Pairs for Clinical Notes in the MIMIC-III Database...
physionet.org
Updated Jan 15, 2021
Cite
Xiang Yue; Xinliang Frederick Zhang; Huan Sun (2021). Annotated Question-Answer Pairs for Clinical Notes in the MIMIC-III Database [Dataset]. http://doi.org/10.13026/j0y6-bw05
Explore at:
Unique identifier
https://doi.org/10.13026/j0y6-bw05
Dataset updated
Jan 15, 2021
Authors
Xiang Yue; Xinliang Frederick Zhang; Huan Sun
License
https://github.com/MIT-LCP/license-and-dua/tree/master/drafts
Description
Clinical question answering (QA) (or reading comprehension) aims to automatically answer questions from medical professionals based on clinical texts. We release this dataset, which contains 1287 annotated QA pairs on 36 sampled discharge summaries from MIMIC-III Clinical Notes, to facilitate the clinical question answering task. Questions in our dataset are either verified or directly generated by clinical experts.

Note that the primary purpose of this dataset is to test the generalizability of a QA model, i.e., whether a QA model that is trained on other datasets can answer questions on this dataset (which may have a different distribution compared with the training data), rather than to train a QA model. Hence the scale of our annotations is relatively small compared to some existing QA datasets.
MIMIC-IV ICD-10 Dataset
paperswithcode.com
Updated Apr 20, 2023
Cite
Joakim Edin; Alexander Junge; Jakob D. Havtorn; Lasse Borgholt; Maria Maistro; Tuukka Ruotsalo; Lars Maaløe (2023). MIMIC-IV ICD-10 Dataset [Dataset]. https://paperswithcode.com/dataset/mimic-iv-icd-10
Explore at:
Dataset updated
Apr 20, 2023
Authors
Joakim Edin; Alexander Junge; Jakob D. Havtorn; Lasse Borgholt; Maria Maistro; Tuukka Ruotsalo; Lars Maaløe
Description
MIMIC-IV ICD-10 contains 122,279 discharge summaries (free-text medical documents) annotated with ICD-10 diagnosis and procedure codes. It contains data for patients admitted to the Beth Israel Deaconess Medical Center emergency department or ICU between 2008 and 2019. All codes with fewer than ten examples have been removed, and the train-val-test split was created using multi-label stratified sampling. The dataset is described further in Automated Medical Coding on MIMIC-III and MIMIC-IV: A Critical Review and Replicability Study, and the code to use the dataset is found here.

The dataset is intended for medical code prediction and was created using MIMIC-IV v2.2 and MIMIC-IV-NOTE v2.2. Using the two datasets requires a license obtained through PhysioNet; this can take a couple of days.
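The rule that codes with fewer than ten examples are removed can be sketched as a simple frequency filter. This is an illustration under assumptions, not the authors' pipeline: the function name drop_rare_codes is hypothetical, the codes below are placeholders, each document is assumed to count at most once per code, and the sketch does not decide what happens to documents left with no codes.

```python
from collections import Counter


def drop_rare_codes(docs, min_count=10):
    """Remove codes that appear in fewer than min_count documents.

    docs is a list of code lists, one per discharge summary.
    """
    # Count each code once per document, even if it is listed twice there.
    counts = Counter(code for codes in docs for code in set(codes))
    kept = {code for code, n in counts.items() if n >= min_count}
    filtered = [[code for code in codes if code in kept] for codes in docs]
    return filtered, kept


# Placeholder label sets: two frequent codes and one that occurs only once.
example_docs = [["I10", "E11.9"] for _ in range(10)] + [["I10", "Z99.9"]]
filtered_docs, kept_codes = drop_rare_codes(example_docs)
```

Filtering before splitting, as the description implies, keeps the label set identical across the train, validation, and test partitions.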

MIMIC-III Clinical Database

MIMIC-III Dataset

mimic-iii-clinical-database-demo-1.4

Clinical Admission Notes from MIMIC-III

MIMIC-IV v2.2 Dataset

MIMIC-IV

MIMIC-III-split

MIMIC-II Clinical Database

EHR data from MIMIC-III

Data from: MIMIC-III and eICU-CRD: Feature Representation by FIDDLE...

MIMIC_III_IPI - Discharge Summaries from MIMIC-III with Indirect Personal...

mimic-iii

MIMIC-III - SequenceExamples for TensorFlow modeling

Refined MIMIC-III 30-day mortality prediction of sepsis-3 patients

Additional file 1 of A novel nomogram to predict mortality in patients with...

Atrial Fibrillation annotations of electrocardiogram from MIMIC III matched...

Data from: Assessment of Non-Invasive Blood Pressure Prediction from PPG and...

Structure Annotations of Assessment and Plan Sections from MIMIC-III

Annotated Question-Answer Pairs for Clinical Notes in the MIMIC-III Database...

MIMIC-IV ICD-10 Dataset
