https://github.com/MIT-LCP/license-and-dua/tree/master/drafts
Retrospectively collected medical data has the potential to improve patient care through knowledge discovery and algorithm development. Broad reuse of medical data is desirable for the greatest public good, but data sharing must be done in a manner that protects patient privacy. Here we present Medical Information Mart for Intensive Care (MIMIC)-IV, a large deidentified dataset of patients admitted to the emergency department or an intensive care unit at the Beth Israel Deaconess Medical Center in Boston, MA. MIMIC-IV contains data for over 65,000 patients admitted to an ICU and over 200,000 patients admitted to the emergency department. MIMIC-IV incorporates contemporary data and adopts a modular approach to data organization, highlighting data provenance and facilitating both individual and combined use of disparate data sources. MIMIC-IV is intended to carry on the success of MIMIC-III and support a broad set of applications within healthcare.
Retrospectively collected medical data has the potential to improve patient care through knowledge discovery and algorithm development. Broad reuse of medical data is desirable for the greatest public good, but data sharing must be done in a manner that protects patient privacy. The Medical Information Mart for Intensive Care (MIMIC)-III database provided critical care data for over 40,000 patients admitted to intensive care units at the Beth Israel Deaconess Medical Center (BIDMC). Importantly, MIMIC-III was deidentified, and patient identifiers were removed according to the Health Insurance Portability and Accountability Act (HIPAA) Safe Harbor provision. MIMIC-III has been integral in driving a large body of research in clinical informatics, epidemiology, and machine learning. Here we present MIMIC-IV, an update to MIMIC-III, which incorporates contemporary data and improves on numerous aspects of MIMIC-III. MIMIC-IV adopts a modular approach to data organization, highlighting data provenance and facilitating both individual and combined use of disparate data sources. MIMIC-IV is intended to carry on the success of MIMIC-III and support a broad set of applications within healthcare.
The Medical Information Mart for Intensive Care (MIMIC)-IV database comprises deidentified electronic health records for patients admitted to the Beth Israel Deaconess Medical Center. Access to MIMIC-IV is limited to credentialed users. Here, we provide an openly available demo of MIMIC-IV containing a subset of 100 patients. The demo includes similar content to MIMIC-IV but excludes free-text clinical notes. It may be useful for running workshops and for assessing whether MIMIC-IV is appropriate for a study before making an access request.
MIMIC-IV-ED is a large, freely available database of emergency department (ED) admissions at the Beth Israel Deaconess Medical Center between 2011 and 2019. As of MIMIC-ED v1.0, the database contains 448,972 ED stays. Vital signs, triage information, medication reconciliation, medication administration, and discharge diagnoses are available. All data are deidentified to comply with the Health Insurance Portability and Accountability Act (HIPAA) Safe Harbor provision. MIMIC-ED is intended to support a diverse range of education initiatives and research studies.
MIMIC-IV ICD-10 contains 122,279 discharge summaries (free-text medical documents) annotated with ICD-10 diagnosis and procedure codes. It contains data for patients admitted to the Beth Israel Deaconess Medical Center emergency department or ICU between 2008 and 2019. All codes with fewer than ten examples have been removed, and the train-val-test split was created using multi-label stratified sampling. The dataset is described further in "Automated Medical Coding on MIMIC-III and MIMIC-IV: A Critical Review and Replicability Study", and the code to use the dataset is found here.
The dataset is intended for medical code prediction and was created using MIMIC-IV v2.2 and MIMIC-IV-NOTE v2.2. Using the two datasets requires a license obtained through PhysioNet; this can take a couple of days.
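The rare-code filter mentioned above (codes with fewer than ten examples removed) can be sketched as follows. This is a minimal illustration, not the dataset authors' code: the function name `drop_rare_codes`, the list-of-lists input, and the choice to count a code at most once per document are all assumptions. Whether documents left without any code are also dropped is not stated in the description, so the sketch keeps them.

```python
from collections import Counter


def drop_rare_codes(docs, min_count=10):
    """Remove codes that occur in fewer than `min_count` documents.

    `docs` is a list where each item is the list of codes assigned to one
    document (hypothetical input format). Returns a new list of the same
    length with rare codes filtered out of each document.
    """
    # Count in how many documents each code appears (once per document).
    counts = Counter(code for codes in docs for code in set(codes))
    keep = {code for code, n in counts.items() if n >= min_count}
    return [[code for code in codes if code in keep] for codes in docs]


if __name__ == "__main__":
    # Toy data: "A" appears 11 times, "B" 10 times, "C" once.
    toy = [["A", "B"] for _ in range(10)] + [["A", "C"]]
    filtered = drop_rare_codes(toy)
    print(filtered[-1])  # ['A']
```

Counting per document rather than per mention is one reasonable reading of "examples"; a per-mention count would only differ if a code could be repeated within one summary.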
https://github.com/MIT-LCP/license-and-dua/tree/master/drafts
Fast Healthcare Interoperability Resources (FHIR) has emerged as a robust standard for healthcare data exchange. To explore the use of FHIR for the process of data harmonization, we converted the Medical Information Mart for Intensive Care IV (MIMIC-IV) and MIMIC-IV Emergency Department (MIMIC-IV-ED) databases into FHIR. We extended base FHIR to encode information in MIMIC-IV and aimed to retain the data in FHIR with minimal additional processing, aligning to US Core v4.0.0 where possible. A total of 24 profiles were created for MIMIC-IV data, and an additional 6 profiles were created for MIMIC-IV-ED data. Code systems and value sets were created from MIMIC terminology. We hope MIMIC-IV in FHIR provides a useful restructuring of the data to support applications around data harmonization, interoperability, and other areas of research.
The MIMIC-IV-ECG module contains approximately 800,000 diagnostic electrocardiograms across nearly 160,000 unique patients. These diagnostic ECGs use 12 leads and are 10 seconds in length. They are sampled at 500 Hz. This subset contains all of the ECGs for patients who appear in the MIMIC-IV Clinical Database. When a cardiologist report is available for a given ECG, we provide the needed information to link the waveform to the report. The patients in MIMIC-IV-ECG have been matched against the MIMIC-IV Clinical Database, making it possible to link to information across the MIMIC-IV modules.
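The recording parameters above imply a fixed array size per ECG. The short sketch below only works through that arithmetic; the constant and function names are illustrative, and it makes no claim about the on-disk file format of the module.

```python
# Recording parameters stated in the module description.
LEADS = 12
DURATION_S = 10
SAMPLING_HZ = 500


def samples_per_lead(duration_s=DURATION_S, sampling_hz=SAMPLING_HZ):
    """Number of samples in one lead of one recording."""
    return duration_s * sampling_hz


def samples_per_record(leads=LEADS, duration_s=DURATION_S, sampling_hz=SAMPLING_HZ):
    """Total number of samples across all leads of one recording."""
    return leads * samples_per_lead(duration_s, sampling_hz)


if __name__ == "__main__":
    print(samples_per_lead())    # 5000
    print(samples_per_record())  # 60000
```

A loader can use these numbers as a sanity check, for example by asserting that each signal array has 5,000 rows and 12 columns before further processing.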
The advent of large, open access text databases has driven advances in state-of-the-art model performance in natural language processing (NLP). The relatively limited amount of clinical data available for NLP has been cited as a significant barrier to the field's progress. Here we describe MIMIC-IV-Note: a collection of deidentified free-text clinical notes for patients included in the MIMIC-IV clinical database. MIMIC-IV-Note contains 331,794 deidentified discharge summaries from 145,915 patients admitted to the hospital and emergency department at the Beth Israel Deaconess Medical Center in Boston, MA, USA. The database also contains 2,321,355 deidentified radiology reports for 237,427 patients. All notes have had protected health information removed in accordance with the Health Insurance Portability and Accountability Act (HIPAA) Safe Harbor provision. All notes are linkable to MIMIC-IV, providing important context to the clinical data therein. The database is intended to stimulate research in clinical natural language processing and associated areas.
Dataset for MIMIC-IV data, by default for the Mortality task. Available tasks are: Mortality, Length of Stay, Readmission, Phenotype. The data is extracted from the MIMIC-IV database using this pipeline: 'https://github.com/healthylaife/MIMIC-IV-Data-Pipeline/tree/main'. The MIMIC path should have this form: "path/to/mimic4data/from/username/mimiciv/2.2". If you choose a Custom task, provide a configuration file for the time series. Currently works with MIMIC-IV ICU data.
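The path and task rules described above could be checked before loading with a small helper like the one below. This is a hedged sketch: `check_mimic_path` and `resolve_task` are hypothetical names that are not part of the dataset loader, and requiring the path to end in `mimiciv/2.2` is an assumption drawn from the single example path given.

```python
from pathlib import PurePosixPath

# Task names listed in the dataset description.
AVAILABLE_TASKS = ("Mortality", "Length of Stay", "Readmission", "Phenotype")


def check_mimic_path(path, version="2.2"):
    """Return True if `path` ends with .../mimiciv/<version> (assumed form)."""
    parts = PurePosixPath(path).parts
    return len(parts) >= 2 and parts[-2:] == ("mimiciv", version)


def resolve_task(task="Mortality", config_path=None):
    """Validate the task name; a Custom task needs a configuration file."""
    if task in AVAILABLE_TASKS:
        return task
    if task == "Custom":
        if config_path is None:
            raise ValueError("A Custom task requires a time-series configuration file.")
        return task
    raise ValueError(f"Unknown task: {task!r}")


if __name__ == "__main__":
    print(check_mimic_path("path/to/mimic4data/from/username/mimiciv/2.2"))  # True
    print(resolve_task())  # Mortality
```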
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Background: Existing research suggests that using statins may reduce the incidence of enteritis caused by C. difficile and improve the prognosis of patients. This study aimed to explore the relation between Clostridium difficile-induced enteritis (CDE) and statin use. Methods: Data were collected from the Medical Information Mart for Intensive Care-IV (MIMIC-IV) database. Multivariate logistic regression analysis was employed to assess the impact of statin use on CDE incidence in patients in intensive care units (ICUs) and its effect on in-hospital mortality among them. The research findings were validated by performing propensity score matching (PSM), inverse probability of treatment weighting (IPTW), and subgroup analyses. Results: The study included data from 51,978 individuals to assess the effect of statin use on the occurrence of CDE in patients admitted to the ICU. The results indicate that statins can decrease the prevalence of CDE in ICU patients (odds ratio (OR): 0.758, 95% confidence interval (CI): 0.666–0.873, P < 0.05), which was further confirmed through PSM (OR: 0.760, 95% CI: 0.661–0.873, P < 0.05) and IPTW (OR: 0.818, 95% CI: 0.754–0.888, P < 0.05) analyses. For most subgroups, the favorable effect of statins in reducing CDE remained constant. A total of 1,208 patients were included to evaluate whether statins could lower the risk of death in ICU patients with enteritis caused by C. difficile. Statins did not reduce in-hospital mortality of ICU patients with CDE (OR: 0.911, 95% CI: 0.667–1.235, P = 0.553). The results were validated following PSM (OR: 0.877, 95% CI: 0.599–1.282, P = 0.499) and IPTW (OR: 0.781, 95% CI: 0.632–1.062, P = 0.071) analyses, and all subgroups demonstrated consistent results. Conclusion: Statin administration can reduce the incidence of CDE in patients in the ICU; however, it does not decrease the in-hospital mortality rate for individuals with CDE.
Objective: To assess the use of Health Level Seven Fast Healthcare Interoperability Resources (FHIR®) for implementing the Findable, Accessible, Interoperable, and Reusable guiding principles for scientific data (FAIR). Additionally, present a list of FAIR implementation choices for supporting future FAIR implementations that use FHIR. Material and Methods: A case study was conducted on the Medical Information Mart for Intensive Care-IV Emergency Department dataset (MIMIC-ED), a deidentified clinical dataset converted into FHIR. The FAIRness of this dataset was assessed using a set of common FAIR assessment indicators. Results: The FHIR distribution of MIMIC-ED, comprising an implementation guide and demo data, was more FAIR compared to the non-FHIR distribution. The FAIRness score increased from 60 to 82 out of 95 points, a relative improvement of 37%. The most notable improvements were observed in interoperability, with a score increase from 5 to 19 out of 19 points, and reusability, wit...
The authors of the paper collected the dataset.
Microsoft Word (.docx files) or Microsoft Excel (.csv files) (open-source alternatives: LibreOffice, OpenOffice). The data files (.csv) can also be opened using any text editor, R, etc.
# FAIR Indicator Scores and Qualitative Comments
This dataset is supplementary material to the paper entitled "Assessing the Use of HL7 FHIR for Implementing the FAIR Guiding Principles: A Case Study of the MIMIC-IV Emergency Department Module".
This dataset describes the indicator scores and qualitative comments of the FAIR data assessment of the Medical Information Mart for Intensive Care (MIMIC)-IV Emergency Department Module. Two distributions of the Emergency Department module were assessed, the PhysioNet distribution and the Fast Healthcare Interoperability Resources (FHIR) distribution. This dataset consists of two files: (1) PhysioNet.csv containing the data of the PhysioNet distribution; and (2) FHIR.csv containing the data of the FHIR distribution. Both files share the same structure and fields.
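The relative improvement quoted for the overall FAIRness score (60 to 82 out of 95 points) can be reproduced with a one-line calculation. The sketch below only checks that arithmetic; the function name is illustrative, and it does not read the CSV files, whose field names are not given here.

```python
def relative_improvement(before, after):
    """Relative change from `before` to `after`, as a fraction of `before`."""
    return (after - before) / before


if __name__ == "__main__":
    overall = relative_improvement(60, 82)
    print(f"{overall:.1%}")  # 36.7%, which rounds to the reported 37%
```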
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
Aim: To compare the effects of midazolam, propofol, and dexmedetomidine monotherapy and combination therapy on the prognosis of intensive care unit (ICU) patients receiving continuous mechanical ventilation (MV). Methods: 11,491 participants from the Medical Information Mart for Intensive Care (MIMIC)-IV database 2008–2019 were included in this retrospective cohort study. The primary outcomes were incidence of ventilator-associated pneumonia (VAP), in-hospital mortality, and duration of MV. Univariate and multivariate logistic regression analyses were used to evaluate the association between sedation and the incidence of VAP. Univariate and multivariate Cox analyses were performed to investigate the correlation between sedative therapy and in-hospital mortality. Additionally, univariate and multivariate linear analyses were conducted to explore the relationship between sedation and duration of MV. Results: Compared to patients not receiving these medications, propofol alone, dexmedetomidine alone, the combination of midazolam and dexmedetomidine, the combination of propofol and dexmedetomidine, and the combination of midazolam, propofol, and dexmedetomidine were all associated with an increased risk of VAP; dexmedetomidine alone, the combination of midazolam and dexmedetomidine, the combination of propofol and dexmedetomidine, and the combination of midazolam, propofol, and dexmedetomidine may be protective factors for in-hospital mortality, while propofol alone was a risk factor. There was a positive correlation between all types of tranquilizers and the duration of MV. Taking dexmedetomidine alone as the reference, all other drug groups were found to be associated with an increased risk of in-hospital mortality.
Compared with dexmedetomidine alone, propofol alone, the combination of midazolam and dexmedetomidine, the combination of propofol and dexmedetomidine, and the combination of midazolam, propofol, and dexmedetomidine were associated with an increased risk of VAP. Conclusion: Dexmedetomidine alone may be a favorable option for ICU patients receiving mechanical ventilation (MV).
F219091/mimic-iv-dataset dataset hosted on Hugging Face and contributed by the HF Datasets community
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
The dataset is part of the MIMIC database and specifically uses the data corresponding to two patients with IDs 221 and 230.
and the eICU
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
Background: Mechanically ventilated patients are susceptible to nosocomial infections such as ventilator-associated pneumonia. To treat ventilated patients with suspected infection, clinicians select appropriate antibiotics. However, decision-making regarding the use of antibiotics for methicillin-resistant Staphylococcus aureus (MRSA) is challenging because of the lack of evidence-supported criteria. This study aims to derive a machine learning model to predict MRSA as a possible pathogen responsible for infection in mechanically ventilated patients. Methods: Data were collected from the Medical Information Mart for Intensive Care (MIMIC)-IV database (an openly available database of patients treated at the Beth Israel Deaconess Medical Center in the period 2008–2019). Of 26,409 mechanically ventilated patients, 809 were screened for MRSA during the mechanical ventilation period and included in the study. The outcome was positivity to MRSA on screening, which was highly imbalanced in the dataset, with 93.9% positive outcomes. Therefore, after dividing the dataset into a training set (n = 566) and a test set (n = 243) for validation by stratified random sampling with a 7:3 allocation ratio, synthetic datasets with 50% positive outcomes were created by synthetic minority over-sampling for both sets individually (synthetic training set: n = 1,064; synthetic test set: n = 456). Using these synthetic datasets, we trained and validated an XGBoost machine learning model using 28 predictor variables for outcome prediction. Model performance was evaluated by area under the receiver operating characteristic curve (AUROC), sensitivity, specificity, and other statistical measurements. Feature importance was computed by the Gini method. Results: In validation, the XGBoost model demonstrated reliable outcome prediction with an AUROC value of 0.89 [95% confidence interval (CI): 0.83–0.95].
The model showed a high sensitivity of 0.98 [CI: 0.95–0.99], but a low specificity of 0.47 [CI: 0.41–0.54] and a positive predictive value of 0.65 [CI: 0.62–0.68]. Important predictor variables included admission from the emergency department, insertion of arterial lines, prior quinolone use, hemodialysis, and admission to a surgical intensive care unit. Conclusions: We were able to develop an effective machine learning model to predict positive MRSA screening during mechanical ventilation using synthetic datasets, thus encouraging further research to develop a clinically relevant machine learning model for antibiotics stewardship.
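The sample sizes reported above are enough to back out the class counts in each split. The sketch below does that arithmetic under one assumption that the abstract does not spell out: that over-sampling kept every original row and added synthetic minority-class rows until both classes were equal in size, so the majority class is half of each synthetic set. The function name is illustrative.

```python
def implied_class_counts(n_original, n_synthetic):
    """Back out (majority, minority) counts in the original split.

    Assumes the synthetic set is exactly balanced and that only
    minority-class rows were added, so majority = n_synthetic / 2.
    """
    majority = n_synthetic // 2
    minority = n_original - majority
    return majority, minority


if __name__ == "__main__":
    train = implied_class_counts(566, 1064)
    test = implied_class_counts(243, 456)
    print(train)  # (532, 34)
    print(test)   # (228, 15)

    # Share of the majority class across all 809 screened patients.
    majority_share = (train[0] + test[0]) / (566 + 243)
    print(round(majority_share, 3))  # 0.939
```

Under that assumption the majority class accounts for 760 of 809 patients, which matches the 93.9% figure quoted in the abstract, and the 566 to 243 split corresponds to the stated 7:3 ratio.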
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
Acute ischemic stroke (AIS) patients in the intensive care unit (ICU) face high mortality. This study examined the association between systolic blood pressure variability (SBPV), specifically average real variability (SBP-ARV), and short-term mortality in critically ill AIS patients. We conducted a retrospective cohort study using the MIMIC-IV database. The primary outcomes were 28-day and 90-day all-cause mortality. Cox regression, Kaplan-Meier curves, restricted cubic spline (RCS) models, and subgroup analyses were used to assess associations. A total of 861 AIS patients were included. The 28-day and 90-day mortality rates were 20.9% and 23.3%, respectively. Higher SBP-ARV was independently associated with increased mortality. Compared with the lowest tertile, the highest tertile of SBP-ARV had significantly increased 28-day mortality (HR: 1.53; 95% CI: 1.03–2.27; P = 0.035). SBP-ARV as a continuous variable was also significantly associated with 28-day and 90-day mortality. RCS analysis showed that mortality risk increased when SBP-ARV exceeded 11.63. Our findings suggest that elevated systolic blood pressure variability, particularly higher SBP-ARV within the first 24 hours of ICU admission, is significantly associated with increased 28-day and 90-day mortality in AIS patients. SBP-ARV may serve as a valuable prognostic marker for risk stratification and early clinical intervention in critically ill stroke patients. Stroke is a leading cause of death and disability worldwide. Some patients with severe strokes need to be treated in the intensive care unit (ICU), where close monitoring and advanced medical support are provided. However, many of these patients still face a high risk of death within a short period. Doctors are looking for reliable signs that can help predict which patients are at greater risk so that they can receive timely and targeted care. 
This study looked at changes in blood pressure over time—known as blood pressure variability (BPV)—in patients who had a severe ischemic stroke and were admitted to the ICU. Using data from a large hospital database, we found that patients with greater fluctuations in their systolic blood pressure (the top number in a blood pressure reading) during the first 24 hours in the ICU were more likely to die within 28 and 90 days. We also identified a specific threshold: once the fluctuation level exceeded a certain point, the risk of death increased steadily. This finding is important because blood pressure is a vital sign that is already measured routinely. Recognizing dangerous levels of blood pressure variability early could help doctors identify high-risk stroke patients sooner. By doing so, healthcare teams could adjust treatments—such as blood pressure management—more effectively and potentially improve survival outcomes.
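The abstract does not define how SBP-ARV was computed. Average real variability is commonly defined as the mean absolute difference between consecutive readings, and the sketch below implements that unweighted form. Treat it as an assumption: the study may have used a time-weighted variant or specific rules for selecting measurements within the first 24 hours, and the function name is illustrative.

```python
def average_real_variability(readings):
    """Mean absolute difference between consecutive readings.

    `readings` is a time-ordered sequence of systolic blood pressure
    values (mmHg). At least two readings are required.
    """
    if len(readings) < 2:
        raise ValueError("At least two readings are required.")
    diffs = [abs(b - a) for a, b in zip(readings, readings[1:])]
    return sum(diffs) / len(diffs)


if __name__ == "__main__":
    # Successive differences are 10, 5, and 15 mmHg, so the mean is 10.
    print(average_real_variability([120, 130, 125, 140]))  # 10.0
```

Unlike the standard deviation, this measure depends on the order of the readings, so a slow drift and a rapid oscillation with the same spread give different values.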
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
The original data