100+ datasets found
  1. MIMIC-III Clinical Database

    • physionet.org
    Updated Sep 4, 2016
    Alistair Johnson; Tom Pollard; Roger Mark (2016). MIMIC-III Clinical Database [Dataset]. http://doi.org/10.13026/C2XW26
    Dataset updated
    Sep 4, 2016
    Authors
    Alistair Johnson; Tom Pollard; Roger Mark
    License

    https://github.com/MIT-LCP/license-and-dua/tree/master/drafts

    Description

    MIMIC-III is a large, freely-available database comprising deidentified health-related data associated with over forty thousand patients who stayed in critical care units of the Beth Israel Deaconess Medical Center between 2001 and 2012. The database includes information such as demographics, vital sign measurements made at the bedside (~1 data point per hour), laboratory test results, procedures, medications, caregiver notes, imaging reports, and mortality (including post-hospital discharge).

    MIMIC supports a diverse range of analytic studies spanning epidemiology, clinical decision-rule improvement, and electronic tool development. It is notable for three factors: it is freely available to researchers worldwide; it encompasses a diverse and very large population of ICU patients; and it contains highly granular data, including vital signs, laboratory results, and medications.

  2. MIMIC-III Clinical Database(Open Access)

    • kaggle.com
    Updated Jun 2, 2025
    Ihssane Ned (2025). MIMIC-III Clinical Database(Open Access) [Dataset]. https://www.kaggle.com/datasets/ihssanened/mimic-iii-clinical-databaseopen-access
    Dataset updated
    Jun 2, 2025
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Ihssane Ned
    Description

    Dataset Source

    This dataset is a portion of the MIMIC-III Clinical Database, a large, freely-available database comprising deidentified health-related data associated with over forty thousand patients who stayed in critical care units of the Beth Israel Deaconess Medical Center between 2001 and 2012. The MIMIC-III demo provides researchers with an opportunity to review the structure and content of MIMIC-III before deciding whether or not to carry out an analysis on the full dataset. The full dataset is available on PhysioNet.

    Dataset Description:

    This dataset contains only 4 tables extracted from the original dataset. More information about each table is available via its corresponding link:

    • admissions.csv
    • d_labitems.csv
    • labevents.csv
    • patient.csv

    A visualization of this dataset is linked from the source page.

    Future Perspectives:

    This portion of the dataset will be combined to build a comprehensive dataset of simulated medical reports.

  3. MIMIC-III Clinical Database CareVue subset

    • physionet.org
    Updated Sep 21, 2022
    Alistair Johnson; Tom Pollard; Roger Mark (2022). MIMIC-III Clinical Database CareVue subset [Dataset]. http://doi.org/10.13026/8a4q-w170
    Dataset updated
    Sep 21, 2022
    Authors
    Alistair Johnson; Tom Pollard; Roger Mark
    License

    https://github.com/MIT-LCP/license-and-dua/tree/master/drafts

    Description

    MIMIC-III is a database of critically ill patients admitted to an intensive care unit (ICU) at the Beth Israel Deaconess Medical Center (BIDMC) in Boston, MA. MIMIC-III has seen broad use, and was updated with the release of MIMIC-IV. MIMIC-IV contains more contemporaneous stays, higher granularity data, and expanded domains of information. To maximize the sample size of MIMIC-IV, the database overlaps with MIMIC-III, and specifically both databases contain the same admissions which occurred between 2008 - 2012. This overlap complicates analyses of the two databases simultaneously. Here we provide a subset of MIMIC-III containing patients who are not in MIMIC-IV. The goal of this project is to simplify the combination of MIMIC-III with MIMIC-IV.

  4. MIMIC-III-split

    • huggingface.co
    Updated Mar 19, 2024
    Corentin Royer (2024). MIMIC-III-split [Dataset]. https://huggingface.co/datasets/croyer/MIMIC-III-split
    Dataset updated
    Mar 19, 2024
    Authors
    Corentin Royer
    License

    MIT License (https://opensource.org/licenses/MIT)
    License information was derived automatically

    Description

    croyer/MIMIC-III-split dataset hosted on Hugging Face and contributed by the HF Datasets community

  5. MIMIC-IV

    • physionet.org
    Updated Aug 13, 2020
    Alistair Johnson; Lucas Bulgarelli; Tom Pollard; Steven Horng; Leo Anthony Celi; Roger Mark (2020). MIMIC-IV [Dataset]. http://doi.org/10.13026/a2mm-bn44
    Dataset updated
    Aug 13, 2020
    Authors
    Alistair Johnson; Lucas Bulgarelli; Tom Pollard; Steven Horng; Leo Anthony Celi; Roger Mark
    License

    https://github.com/MIT-LCP/license-and-dua/tree/master/drafts

    Description

    Retrospectively collected medical data has the opportunity to improve patient care through knowledge discovery and algorithm development. Broad reuse of medical data is desirable for the greatest public good, but data sharing must be done in a manner which protects patient privacy. The Medical Information Mart for Intensive Care (MIMIC)-III database provided critical care data for over 40,000 patients admitted to intensive care units at the Beth Israel Deaconess Medical Center (BIDMC). Importantly, MIMIC-III was deidentified, and patient identifiers were removed according to the Health Insurance Portability and Accountability Act (HIPAA) Safe Harbor provision. MIMIC-III has been integral in driving large amounts of research in clinical informatics, epidemiology, and machine learning. Here we present MIMIC-IV, an update to MIMIC-III, which incorporates contemporary data and improves on numerous aspects of MIMIC-III. MIMIC-IV adopts a modular approach to data organization, highlighting data provenance and facilitating both individual and combined use of disparate data sources. MIMIC-IV is intended to carry on the success of MIMIC-III and support a broad set of applications within healthcare.

  6. EHR data from MIMIC-III

    • scidb.cn
    Updated Aug 24, 2021
    Tingyi Wanyan; Hossein Honarvar; Ariful Azad; Ying Ding; Benjamin S. Glicksberg (2021). EHR data from MIMIC-III [Dataset]. http://doi.org/10.11922/sciencedb.j00104.00094
    Dataset updated
    Aug 24, 2021
    Dataset provided by
    Science Data Bank
    Authors
    Tingyi Wanyan; Hossein Honarvar; Ariful Azad; Ying Ding; Benjamin S. Glicksberg
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    We conducted our experiments on de-identified EHR data from MIMIC-III. This data set contains various clinical data relating to patient admission to the ICU, such as disease diagnoses in the form of International Classification of Diseases (ICD)-9 codes, and lab test results as detailed in Supplementary Materials. We collected data for 5,956 patients, extracting lab tests every hour from admission. There are a total of 409 unique lab tests and 3,387 unique disease diagnoses observed. The diagnoses were obtained as ICD-9 codes and represented using one-hot encoding, where one marks patients with the disease and zero marks those without. We binned the lab test events into 6, 12, 24, and 48 hours prior to patient death or discharge from the ICU. From these data, we performed mortality predictions that were evaluated with 10-fold cross-validation.
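
    The one-hot representation of diagnoses described above can be illustrated with a minimal sketch. The patient ids, the codes, and the helper name `one_hot_encode` are hypothetical placeholders, not taken from the dataset or the authors' code:

```python
# Hypothetical toy records: patient id -> set of ICD-9 codes (not real MIMIC-III data).
diagnoses = {
    "p1": {"4280", "5849"},
    "p2": {"5849"},
    "p3": set(),
}


def one_hot_encode(records):
    """Return (sorted code vocabulary, {patient: 0/1 vector}) for the given code sets."""
    # The vocabulary is every code seen for any patient, in a fixed (sorted) order.
    vocab = sorted(set().union(*records.values()))
    # 1 marks a patient with the diagnosis, 0 marks a patient without it.
    vectors = {
        pid: [1 if code in codes else 0 for code in vocab]
        for pid, codes in records.items()
    }
    return vocab, vectors


vocab, vectors = one_hot_encode(diagnoses)
```

    With 3,387 unique diagnoses, each patient vector in the described dataset would have 3,387 entries; a sparse representation may be preferable at that size.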

  7. Mortality Prediction MIMIC-III

    • scidb.cn
    Updated May 6, 2021
    Yanrong Cai (2021). Mortality Prediction MIMIC-III [Dataset]. http://doi.org/10.11922/sciencedb.00787
    Dataset updated
    May 6, 2021
    Dataset provided by
    Science Data Bank
    Authors
    Yanrong Cai
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    This new dataset was derived from the MIMIC III dataset, an openly available database developed by the Laboratory of Computational Physiology at Massachusetts Institute of Technology (MIT), which consists of de-identified data from more than 25,000 patients who were admitted to the Beth Israel Deaconess Medical Center (BIDMC) since 2003. Here, we identified patients who were diagnosed with pelvic, acetabular, or combined pelvic and acetabular fractures according to ICD-9 codes and who survived at least 72 hours after ICU admission. All the data within the first 72 hours following ICU admission were collected and extracted from the MIMIC-III clinical database version 1.4.

  8. ECG_sepsis.xlsx

    • figshare.com
    xlsx
    Updated Nov 13, 2023
    MERVE APALAK (2023). ECG_sepsis.xlsx [Dataset]. http://doi.org/10.6084/m9.figshare.24265717.v1
    Dataset updated
    Nov 13, 2023
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    MERVE APALAK
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    This database was created to enable community-based sepsis detection research. It is a subset of the MIMIC-III Waveform Database Matched Subset. Sepsis onset is calculated based on Sepsis-3 criteria. A total of 447 patients are included. Further details can be found in our research paper or description file. If you use the annotations, please cite the following paper:.. Details about the MIMIC III matched subset can be found at PhysioNet: https://physionet.org/content/mimic3wdb-matched/1.0/

  9. mimiciii-hospitalcourse-meta

    • huggingface.co
    Updated Dec 20, 2023
    Dalton Macres (2023). mimiciii-hospitalcourse-meta [Dataset]. https://huggingface.co/datasets/dmacres/mimiciii-hospitalcourse-meta
    Dataset updated
    Dec 20, 2023
    Authors
    Dalton Macres
    Description

    Dataset Card for "mimiciii-hospitalcourse-meta"

    More Information needed

  10. MIMIC PERform Datasets

    • zenodo.org
    • data.niaid.nih.gov
    bin, zip
    Updated Aug 8, 2022
    Peter H Charlton (2022). MIMIC PERform Datasets [Dataset]. http://doi.org/10.5281/zenodo.6807403
    Dataset updated
    Aug 8, 2022
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Peter H Charlton
    License

    Open Database License (ODbL) v1.0 (https://www.opendatacommons.org/licenses/odbl/1.0/)
    License information was derived automatically

    Description

    The MIMIC PERform datasets are a series of datasets extracted from the MIMIC III Waveform Database. Each dataset contains recordings of physiological signals from critically-ill patients during routine clinical care. Specifically, the datasets contain the following signals:

    • electrocardiogram (ECG)
    • photoplethysmogram (PPG)
    • impedance pneumography (imp), also known as respiratory (resp)

    Further details of the datasets are provided in the documentation accompanying the ppg-beats project, which is available at: https://ppg-beats.readthedocs.io/en/latest/ . In particular, documentation is provided on the following datasets:

    • MIMIC PERform AF Dataset: Recordings from 35 critically-ill adults during routine clinical care, categorised as either AF (atrial fibrillation, 19 subjects) or non-AF (16 subjects).
      • Matlab format (all data, adults, neonates)
      • WFDB format (all data, adults, neonates)
      • CSV format (all data, adults, neonates)
    • MIMIC PERform Training Dataset: Recordings from 200 patients during routine clinical care, who are categorised as either adults (100 subjects) or neonates (100 subjects).
      • Matlab format (all data, adults, neonates)
      • WFDB format (all data, adults, neonates)
      • CSV format (all data, adults, neonates)
    • MIMIC PERform Testing Dataset: Recordings from 200 patients during routine clinical care, who are categorised as either adults (100 subjects) or neonates (100 subjects).

    Each dataset is accompanied by a licence which acknowledges the source(s) of the data - please see the individual licenses for these acknowledgements.

  11. Data_Sheet_1_Machine Learning Prediction Models for Mechanically Ventilated...

    • datasetcatalog.nlm.nih.gov
    • frontiersin.figshare.com
    Updated Jul 1, 2021
    Zheng, Hua; Guo, Junyang; Zhu, Yibing; Chen, Yan; Chen, Ge; Xi, Xiuming; Li, Wei; Li, Yang; Jin, Xin; Wang, Guowei; Ren, Chao; Guo, Qianqian; Liu, Shi; Du, Bin; Huang, Huibin; Yu, Qian; Zhang, Jin; Li, Lin; Yao, Renqi (2021). Data_Sheet_1_Machine Learning Prediction Models for Mechanically Ventilated Patients: Analyses of the MIMIC-III Database.pdf [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0000884385
    Dataset updated
    Jul 1, 2021
    Authors
    Zheng, Hua; Guo, Junyang; Zhu, Yibing; Chen, Yan; Chen, Ge; Xi, Xiuming; Li, Wei; Li, Yang; Jin, Xin; Wang, Guowei; Ren, Chao; Guo, Qianqian; Liu, Shi; Du, Bin; Huang, Huibin; Yu, Qian; Zhang, Jin; Li, Lin; Yao, Renqi
    Description

    Background: Mechanically ventilated patients in the intensive care unit (ICU) have high mortality rates. There are multiple prediction scores, such as the Simplified Acute Physiology Score II (SAPS II), Oxford Acute Severity of Illness Score (OASIS), and Sequential Organ Failure Assessment (SOFA), widely used in the general ICU population. We aimed to establish prediction scores on mechanically ventilated patients with the combination of these disease severity scores and other features available on the first day of admission.

    Methods: A retrospective administrative database study from the Medical Information Mart for Intensive Care (MIMIC-III) database was conducted. The exposures of interest consisted of the demographics, pre-ICU comorbidity, ICU diagnosis, disease severity scores, vital signs, and laboratory test results on the first day of ICU admission. Hospital mortality was used as the outcome. We used the machine learning methods of k-nearest neighbors (KNN), logistic regression, bagging, decision tree, random forest, Extreme Gradient Boosting (XGBoost), and neural network for model establishment. A sample of 70% of the cohort was used for the training set; the remaining 30% was applied for testing. Areas under the receiver operating characteristic curves (AUCs) and calibration plots would be constructed for the evaluation and comparison of the models' performance. The significance of the risk factors was identified through models and the top factors were reported.

    Results: A total of 28,530 subjects were enrolled through the screening of the MIMIC-III database. After data preprocessing, 25,659 adult patients with 66 predictors were included in the model analyses. With the training set, the models of KNN, logistic regression, decision tree, random forest, neural network, bagging, and XGBoost were established and the testing set obtained AUCs of 0.806, 0.818, 0.743, 0.819, 0.780, 0.803, and 0.821, respectively. The calibration curves of all the models, except for the neural network, performed well. The XGBoost model performed best among the seven models. The top five predictors were age, respiratory dysfunction, SAPS II score, maximum hemoglobin, and minimum lactate.

    Conclusion: The current study indicates that models with the risk of factors on the first day could be successfully established for predicting mortality in ventilated patients. The XGBoost model performs best among the seven machine learning models.

  12. MIMIC-III Waveform Database

    • physionet.org
    Updated Apr 7, 2020
    Benjamin Moody; George Moody; Mauricio Villarroel; Gari D. Clifford; Ikaro Silva (2020). MIMIC-III Waveform Database [Dataset]. http://doi.org/10.13026/c2607m
    Dataset updated
    Apr 7, 2020
    Authors
    Benjamin Moody; George Moody; Mauricio Villarroel; Gari D. Clifford; Ikaro Silva
    License

    Open Database License (ODbL) v1.0 (https://www.opendatacommons.org/licenses/odbl/1.0/)
    License information was derived automatically

    Description

    The MIMIC-III Waveform Database contains 67,830 record sets for approximately 30,000 ICU patients. Almost all record sets include a waveform record containing digitized signals (typically including ECG, ABP, respiration, and PPG, and frequently other signals) and a “numerics” record containing time series of periodic measurements, each presenting a quasi-continuous recording of vital signs of a single patient throughout an ICU stay (typically a few days, but many are several weeks in duration). A subset of this database contains waveform and numerics records that have been matched and time-aligned with MIMIC-III Clinical Database records.

  13. mimic-iii

    • kaggle.com
    Updated Feb 5, 2025
    chan hainguyen (2025). mimic-iii [Dataset]. https://www.kaggle.com/datasets/chanhainguyen/mimic-iii
    Dataset updated
    Feb 5, 2025
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    chan hainguyen
    License

    MIT License (https://opensource.org/licenses/MIT)
    License information was derived automatically

    Description

    Dataset

    This dataset was created by chan hainguyen

    Released under MIT


  14. Datasets of Acute Pancreatitis Patients from MIMIC - IV

    • ieee-dataport.org
    Updated May 1, 2025
    Mingchang Chen (2025). Datasets of Acute Pancreatitis Patients from MIMIC - IV [Dataset]. https://ieee-dataport.org/documents/datasets-acute-pancreatitis-patients-mimic-iv-mimic-iii-subsets-and-eicu
    Dataset updated
    May 1, 2025
    Authors
    Mingchang Chen
    Description

    and the eICU

  15. MIMIC-III-Clinical-Database

    • huggingface.co
    Truong-Phuc Nguyen, MIMIC-III-Clinical-Database [Dataset]. https://huggingface.co/datasets/ntphuc149/MIMIC-III-Clinical-Database
    Authors
    Truong-Phuc Nguyen
    License

    MIT License (https://opensource.org/licenses/MIT)
    License information was derived automatically

    Description

    ntphuc149/MIMIC-III-Clinical-Database dataset hosted on Hugging Face and contributed by the HF Datasets community

  16. Atrial Fibrillation annotations of electrocardiogram from MIMIC III matched...

    • figshare.com
    xlsx
    Updated May 30, 2023
    Syed Khairul Bashar (2023). Atrial Fibrillation annotations of electrocardiogram from MIMIC III matched subset [Dataset]. http://doi.org/10.6084/m9.figshare.12149091.v1
    Dataset updated
    May 30, 2023
    Dataset provided by
    figshare
    Authors
    Syed Khairul Bashar
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    We provide some annotations of the Medical Information Mart for Intensive Care (MIMIC) III waveform database matched Subset. The annotations are for the electrocardiogram recordings and denote atrial fibrillation status. More annotations will be added in future. Details about MIMIC III matched subset can be found at Physionet: https://archive.physionet.org/physiobank/database/mimic3wdb/matched/

    If you use the annotations, please cite the following paper: Bashar, S.K., Ding, E., Walkey, A.J., McManus, D.D. and Chon, K.H., 2019. Noise Detection in Electrocardiogram Signals for Intensive Care Unit Patients. IEEE Access, 7, pp.88357-88368

  17. Medical Information Mart for Intensive Care-III

    • rrid.site
    • scicrunch.org
    Updated Jan 29, 2022
    (2022). Medical Information Mart for Intensive Care-III [Dataset]. http://identifiers.org/RRID:SCR_017384/resolver
    Dataset updated
    Jan 29, 2022
    Description

    Collection of deidentified health-related data associated with patients who stayed in critical care units of Beth Israel Deaconess Medical Center between 2001 and 2012. The database includes information such as demographics, vital sign measurements made at the bedside (~1 data point per hour), laboratory test results, procedures, medications, caregiver notes, imaging reports, and mortality (both in and out of hospital).

  18. MIMIC-III-full and MIMIC-III-top 50

    • service.tib.eu
    Updated Dec 2, 2024
    (2024). MIMIC-III-full and MIMIC-III-top 50 [Dataset]. https://service.tib.eu/ldmservice/dataset/mimic-iii-full-and-mimic-iii-top-50
    Dataset updated
    Dec 2, 2024
    Description

    The MIMIC-III-full and MIMIC-III-top 50 datasets are used for training and testing the proposed model. The MIMIC-III-full dataset contains all the records, while the MIMIC-III-top 50 dataset contains only the top 50 most frequent ICD codes.
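
    As a rough illustration of how a top-50 subset relates to the full set, the sketch below counts code frequencies and keeps only the most frequent ones. The toy records, the function names, and the choice to drop records left with no codes are all assumptions made for illustration; the description above does not specify how the subset was actually built:

```python
from collections import Counter


def top_k_codes(records, k=50):
    """Return the k most frequent codes, counting each code once per record."""
    counts = Counter(
        code for codes in records for code in dict.fromkeys(codes)
    )
    return [code for code, _ in counts.most_common(k)]


def restrict_to_codes(records, keep):
    """Keep only codes in `keep`; drop records left empty (an assumption)."""
    keep = set(keep)
    restricted = []
    for codes in records:
        kept = [c for c in codes if c in keep]
        if kept:
            restricted.append(kept)
    return restricted


# Hypothetical toy records, one list of ICD codes per discharge record.
records = [["4019", "4280"], ["4019"], ["4019", "5849", "4280"], ["25000"]]
top2 = top_k_codes(records, k=2)
subset = restrict_to_codes(records, top2)
```

    Here `k=2` stands in for 50 so the toy example stays readable.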

  19. Structure Annotations of Assessment and Plan Sections from MIMIC-III

    • data.niaid.nih.gov
    Updated Apr 17, 2022
    Rajkomar, Alvin (2022). Structure Annotations of Assessment and Plan Sections from MIMIC-III [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_6413404
    Dataset updated
    Apr 17, 2022
    Dataset provided by
    Feder, Amir
    Ofek, Eran
    Hassidim, Avinatan
    Stupp, Doron
    Oren, Eyal
    Rajkomar, Alvin
    Lee, I-Ching
    Benjamini, Ayelet
    Barequet, Ronnie
    Matias, Yossi
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    Physicians record their detailed thought-processes about diagnoses and treatments as unstructured text in a section of a clinical note called the "assessment and plan". This information is more clinically rich than structured billing codes assigned for an encounter but harder to reliably extract given the complexity of clinical language and documentation habits. To structure these sections we collected a dataset of annotations over assessment and plan sections from the publicly available and de-identified MIMIC-III dataset, and developed deep-learning based models to perform this task, described in the associated paper available as a pre-print at: https://www.medrxiv.org/content/10.1101/2022.04.13.22273438v1

    When using this data please cite our paper:

    @article {Stupp2022.04.13.22273438, author = {Stupp, Doron and Barequet, Ronnie and Lee, I-Ching and Oren, Eyal and Feder, Amir and Benjamini, Ayelet and Hassidim, Avinatan and Matias, Yossi and Ofek, Eran and Rajkomar, Alvin}, title = {Structured Understanding of Assessment and Plans in Clinical Documentation}, year = {2022}, doi = {10.1101/2022.04.13.22273438}, publisher = {Cold Spring Harbor Laboratory Press}, URL = {https://www.medrxiv.org/content/early/2022/04/17/2022.04.13.22273438}, journal = {medRxiv} }

    The dataset, presented here, contains annotations of assessment and plan sections of notes from the publicly available and de-identified MIMIC-III dataset, marking the active problems, their assessment description, and plan action items. Action items are additionally marked as one of 8 categories (listed below). The dataset contains over 30,000 annotations of 579 notes from distinct patients, annotated by 6 medical residents and students.

    The dataset is divided into 4 partitions - a training set (481 notes), validation set (50 notes), test set (48 notes) and an inter-rater set. The inter-rater set contains the annotations of each of the raters over the test set. Rater 1 in the inter-rater set should be regarded as an intra-rater comparison (details in the paper). The labels underwent automatic normalization to capture entire word boundaries and remove flanking non-alphanumeric characters.

    Code for transforming labels into TensorFlow examples and training models as described in the paper will be made available at GitHub: https://github.com/google-research/google-research/tree/master/assessment_plan_modeling

    In order to use these annotations, the user additionally needs to obtain the text of the notes which is found in the NOTE_EVENTS table from MIMIC-III, access to which is to be acquired independently (https://mimic.mit.edu/)

    Annotations are given as character spans in a CSV file with the following schema:

    Each entry below gives the field, its type in parentheses, and its semantics:

    • partition (categorical, one of [train, val, test, interrater]): The set of ratings the span belongs to.
    • rater_id (int): Unique id for each of the raters.
    • note_id (int): The note's unique note_id; links to the MIMIC-III notes table (as ROW_ID).
    • span_type (categorical, one of [PROBLEM_TITLE, PROBLEM_DESCRIPTION, ACTION_ITEM]): Type of the span as annotated by raters.
    • char_start (int) and char_end (int): Character offsets from note start.
    • action_item_type (categorical, one of [MEDICATIONS, IMAGING, OBSERVATIONS_LABS, CONSULTS, NUTRITION, THERAPEUTIC_PROCEDURES, OTHER_DIAGNOSTIC_PROCEDURES, OTHER]): Type of action item if the span is an action item (empty otherwise), as annotated by raters.
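
    Since the annotations are character spans, recovering the annotated text means slicing the note text with the offsets. The sketch below shows one way this might look. The sample rows, the note text, the CSV column order, and the function name `extract_spans` are hypothetical, and it assumes `char_end` is exclusive, which the schema does not state:

```python
import csv
import io

# Hypothetical annotation rows using the field names from the schema (not real data).
SAMPLE_CSV = """partition,rater_id,note_id,span_type,char_start,char_end,action_item_type
train,1,42,PROBLEM_TITLE,0,3,
train,1,42,ACTION_ITEM,5,21,MEDICATIONS
"""

# Hypothetical note text keyed by note_id; the real text must be obtained from MIMIC-III.
NOTES = {42: "CHF: start furosemide today"}


def extract_spans(csv_text, notes):
    """Return one dict per annotation with the span text sliced from its note."""
    spans = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        note = notes[int(row["note_id"])]
        start, end = int(row["char_start"]), int(row["char_end"])
        spans.append({
            "span_type": row["span_type"],
            # Empty unless the span is an action item.
            "action_item_type": row["action_item_type"] or None,
            # Assumption: char_end is exclusive (Python slice semantics).
            "text": note[start:end],
        })
    return spans


spans = extract_spans(SAMPLE_CSV, NOTES)
```

    If the offsets turn out to be inclusive, the slice would need to be `note[start:end + 1]`; checking a few spans against the note text by eye should settle which convention applies.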
    
  20. Additional file 1 of Predicting 30-days mortality for MIMIC-III patients...

    • springernature.figshare.com
    txt
    Updated Feb 14, 2024
    Nianzong Hou; Mingzhe Li; Lu He; Bing Xie; Lin Wang; Rumin Zhang; Yong Yu; Xiaodong Sun; Zhengsheng Pan; Kai Wang (2024). Additional file 1 of Predicting 30-days mortality for MIMIC-III patients with sepsis-3: a machine learning approach using XGboost [Dataset]. http://doi.org/10.6084/m9.figshare.13346712.v1
    Dataset updated
    Feb 14, 2024
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Nianzong Hou; Mingzhe Li; Lu He; Bing Xie; Lin Wang; Rumin Zhang; Yong Yu; Xiaodong Sun; Zhengsheng Pan; Kai Wang
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    Additional file 1. Extracted raw data from the MIMIC-III.
