11 datasets found
  1. Data from: MIMIC-CXR-JPG - chest radiographs with structured labels

    • physionet.org
    Updated Mar 12, 2024
    Cite
    Alistair Johnson; Matthew Lungren; Yifan Peng; Zhiyong Lu; Roger Mark; Seth Berkowitz; Steven Horng (2024). MIMIC-CXR-JPG - chest radiographs with structured labels [Dataset]. http://doi.org/10.13026/jsn5-t979
    Authors
    Alistair Johnson; Matthew Lungren; Yifan Peng; Zhiyong Lu; Roger Mark; Seth Berkowitz; Steven Horng
    License

    https://github.com/MIT-LCP/license-and-dua/tree/master/drafts

    Description

    The MIMIC Chest X-ray JPG (MIMIC-CXR-JPG) Database v2.0.0 is a large publicly available dataset of chest radiographs in JPG format with structured labels derived from free-text radiology reports. The MIMIC-CXR-JPG dataset is wholly derived from MIMIC-CXR, providing JPG format files derived from the DICOM images and structured labels derived from the free-text reports. The aim of MIMIC-CXR-JPG is to provide a convenient processed version of MIMIC-CXR, as well as to provide a standard reference for data splits and image labels. The dataset contains 377,110 JPG format images and structured labels derived from the 227,827 free-text radiology reports associated with these images. The dataset is de-identified to satisfy the US Health Insurance Portability and Accountability Act of 1996 (HIPAA) Safe Harbor requirements. Protected health information (PHI) has been removed. The dataset is intended to support a wide body of research in medicine including image understanding, natural language processing, and decision support.

  2. Data from: MIMIC-CXR Database

    • physionet.org
    Updated Jul 23, 2024
    Cite
    Alistair Johnson; Tom Pollard; Roger Mark; Seth Berkowitz; Steven Horng (2024). MIMIC-CXR Database [Dataset]. http://doi.org/10.13026/4jqj-jw95
    Authors
    Alistair Johnson; Tom Pollard; Roger Mark; Seth Berkowitz; Steven Horng
    License

    https://github.com/MIT-LCP/license-and-dua/tree/master/drafts

    Description

    The MIMIC Chest X-ray (MIMIC-CXR) Database v2.0.0 is a large publicly available dataset of chest radiographs in DICOM format with free-text radiology reports. The dataset contains 377,110 images corresponding to 227,835 radiographic studies performed at the Beth Israel Deaconess Medical Center in Boston, MA. The dataset is de-identified to satisfy the US Health Insurance Portability and Accountability Act of 1996 (HIPAA) Safe Harbor requirements. Protected health information (PHI) has been removed. The dataset is intended to support a wide body of research in medicine including image understanding, natural language processing, and decision support.

  3. MNIST and MIMIC-CXR-JPG datasets

    • service.tib.eu
    Updated Jan 3, 2025
    Cite
    (2025). MNIST and MIMIC-CXR-JPG datasets [Dataset]. https://service.tib.eu/ldmservice/dataset/mnist-and-mimic-cxr-jpg-datasets
    Description

    The MNIST dataset is a large dataset of handwritten digits, and the MIMIC-CXR-JPG dataset is a large dataset of chest x-ray images.

  4. Visual Question Answering evaluation dataset for MIMIC CXR

    • physionet.org
    Updated Jan 28, 2025
    Cite
    Timo Kohlberger; Charles Lau; Tom Pollard; Andrew Sellergren; Atilla Kiraly; Fayaz Jamil (2025). Visual Question Answering evaluation dataset for MIMIC CXR [Dataset]. http://doi.org/10.13026/cvsk-ny21
    Authors
    Timo Kohlberger; Charles Lau; Tom Pollard; Andrew Sellergren; Atilla Kiraly; Fayaz Jamil
    License

    https://github.com/MIT-LCP/license-and-dua/tree/master/drafts

    Description

    MIMIC CXR [1] is a large publicly available dataset of chest radiographs in DICOM format with free-text radiology reports. In addition, labels for the presence of 12 different chest-related pathologies, as well as of any support devices, and overall normal/abnormal status were made available via the MIMIC Chest X-ray JPG (MIMIC-CXR-JPG) [2] labels, which were generated using the CheXpert and NegBio algorithms.

    Based on these labels, we created a visual question answering dataset comprising 224 questions for 48 cases from the official test set, and 111 questions for 23 validation cases. A majority (68%) of the questions are close-ended (answerable with yes or no), and focus on the presence of one out of 15 chest pathologies, or any support device, or generically on any abnormality, whereas the remaining open-ended questions inquire about the location, size, severity or type of a pathology/device, if present in the specific case, indicated by the MIMIC-CXR-JPG labels.

    For each question and case we also provide a reference answer, which was authored by a board-certified radiologist (with 17 years of post-residency experience) based on the chest X-ray and original radiology report.

  5. Curated CXR report generation dataset

    • kaggle.com
    Updated Feb 13, 2023
    Cite
    FinanceKim (2023). Curated CXR report generation dataset [Dataset]. https://www.kaggle.com/datasets/financekim/curated-cxr-report-generation-dataset
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    FinanceKim
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

  6. Data from: Image-derived cardiomegaly biomarker values for 96K chest X-rays...

    • physionet.org
    Updated Aug 23, 2024
    Cite
    Benjamin Duvieusart; Felix Krones; Guy Parsons; Lionel Tarassenko; Bartlomiej W Papiez; Adam Mahdi (2024). Image-derived cardiomegaly biomarker values for 96K chest X-rays in MIMIC-CXR/MIMIC-CXR-JPG [Dataset]. http://doi.org/10.13026/kfpv-zm25
    Authors
    Benjamin Duvieusart; Felix Krones; Guy Parsons; Lionel Tarassenko; Bartlomiej W Papiez; Adam Mahdi
    License

    Open Data Commons Attribution License (ODC-By) v1.0 (https://www.opendatacommons.org/licenses/by/1.0/)
    License information was derived automatically

    Description

    Cardiomegaly is a condition characterized by an abnormal enlargement of the heart. Its identification is of paramount importance because it is associated with a wide range of cardiac conditions. It is primarily identified via the cardiothoracic ratio (CTR); however, this metric can be inaccurate because it is affected by external factors such as breathing and body position. Multimodal approaches could mitigate these limitations by integrating non-imaging data, but reliable and explainable integration of imaging and non-imaging data remains a significant challenge. While this database does not directly use multimodal data, it aims to tackle this challenge by extracting cardiomegaly biomarkers (CTR and cardiopulmonary area ratio) from chest X-rays, thus encapsulating the relevant imaging information into individual data points and allowing easy integration of ‘imaging’ data with non-imaging data for more reliable diagnostic tools. The values were extracted from over 93,000 posterior-anterior MIMIC-CXR scans using detection and segmentation neural networks tuned for cardiac and pulmonary identification.
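
To make the two biomarkers concrete, here is a minimal sketch of how they could be computed from binary segmentation masks. It is not the dataset authors' pipeline: the function names are hypothetical, the masks are plain lists of 0/1 rows, the thoracic width is approximated by the horizontal extent of the lung mask, and the cardiopulmonary area ratio is assumed to be heart area divided by lung area.

```python
def horizontal_extent(mask):
    """Width in pixels between the leftmost and rightmost foreground columns."""
    cols = [c for row in mask for c, value in enumerate(row) if value]
    if not cols:
        raise ValueError("mask has no foreground pixels")
    return max(cols) - min(cols) + 1


def area(mask):
    """Number of foreground pixels in the mask."""
    return sum(1 for row in mask for value in row if value)


def cardiomegaly_biomarkers(heart_mask, lung_mask):
    """Return (CTR, CPAR) under the assumptions stated above.

    CTR  = widest cardiac extent / widest thoracic extent (lung extent used as a proxy).
    CPAR = heart area / lung area (assumed definition).
    """
    ctr = horizontal_extent(heart_mask) / horizontal_extent(lung_mask)
    cpar = area(heart_mask) / area(lung_mask)
    return ctr, cpar
```

Because both values are simple ratios of pixel counts, they do not depend on image resolution as long as the heart and lung masks come from the same image.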

  7. Code for generating the HAIM multimodal dataset of MIMIC-IV clinical data...

    • physionet.org
    Updated Aug 23, 2022
    Cite
    Luis R Soenksen; Yu Ma; Cynthia Zeng; Leonard David Jean Boussioux; Kimberly Villalobos Carballo; Liangyuan Na; Holly Wiberg; Michael Li; Ignacio Fuentes; Dimitris Bertsimas (2022). Code for generating the HAIM multimodal dataset of MIMIC-IV clinical data and x-rays [Dataset]. http://doi.org/10.13026/3f8d-qe93
    Authors
    Luis R Soenksen; Yu Ma; Cynthia Zeng; Leonard David Jean Boussioux; Kimberly Villalobos Carballo; Liangyuan Na; Holly Wiberg; Michael Li; Ignacio Fuentes; Dimitris Bertsimas
    License

    https://github.com/MIT-LCP/license-and-dua/tree/master/drafts

    Description

    This resource generates a multimodal combination of the MIMIC-IV v1.0.0 and MIMIC Chest X-ray (MIMIC-CXR-JPG) v2.0.0 databases, filtered to include only patients who have at least one chest X-ray, with the goal of validating multimodal predictive analytics in healthcare operations. The multimodal dataset generated by this code contains 34,540 individual patient files in the form of "pickle" Python object structures, covering a total of 7,279 hospitalization stays involving 6,485 unique patients. Code to extract feature embeddings and the list of pre-processed features are also included in this repository.
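
Since the generated files are Python pickles, reading them back could look roughly like the sketch below. The directory layout, the `*.pkl` extension, and the helper name `iter_patient_files` are assumptions for illustration; the description does not specify how the files are named or what each object contains.

```python
import pickle
from pathlib import Path


def iter_patient_files(directory, pattern="*.pkl"):
    """Yield (file name, unpickled object) for each matching file, in sorted order.

    The "*.pkl" pattern is an assumption; adjust it to the actual file names.
    """
    for path in sorted(Path(directory).glob(pattern)):
        with path.open("rb") as handle:
            # Unpickling can execute arbitrary code, so only load files
            # that you generated yourself or otherwise trust.
            yield path.name, pickle.load(handle)
```

Iterating lazily keeps memory use bounded, which matters when a directory holds tens of thousands of files.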

  8. GEMeX-CoT

    • huggingface.co
    Updated Jun 1, 2025
    Cite
    Kelvin Liu (2025). GEMeX-CoT [Dataset]. https://huggingface.co/datasets/BoKelvin/GEMeX-CoT
    Authors
    Kelvin Liu
    License

    Attribution-NonCommercial 4.0 (CC BY-NC 4.0) (https://creativecommons.org/licenses/by-nc/4.0/)
    License information was derived automatically

    Description

    For images, please refer to MIMIC-CXR-JPG (https://physionet.org/content/mimic-cxr-jpg/2.1.0/). After downloading, pad the shorter side with zeros and then resize the image to 336 × 336.
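
The pad-then-resize step can be sketched in dependency-free Python as follows. This is an illustration only: the image is a grayscale list of pixel rows, the padding is centered, and nearest-neighbour sampling is used, none of which the description specifies. All function names are hypothetical, and a real pipeline would normally use an image library's own padding and resampling routines.

```python
def pad_to_square(pixels, fill=0):
    """Pad the shorter side with `fill` so the image becomes square (centered)."""
    height = len(pixels)
    width = len(pixels[0])
    side = max(height, width)
    top = (side - height) // 2
    left = (side - width) // 2
    padded = [[fill] * side for _ in range(side)]
    for r, row in enumerate(pixels):
        # Copy the original row into its place inside the padded canvas.
        padded[top + r][left:left + width] = row
    return padded


def resize_nearest(pixels, size):
    """Resize a square image to size x size with nearest-neighbour sampling."""
    src = len(pixels)
    return [
        [pixels[r * src // size][c * src // size] for c in range(size)]
        for r in range(size)
    ]


def preprocess(pixels, size=336):
    """Zero-pad to a square, then resize to size x size."""
    return resize_nearest(pad_to_square(pixels), size)
```

Padding before resizing preserves the aspect ratio of the radiograph, whereas resizing directly to a square would stretch the anatomy.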

  9. Data from: CheXmask Database: a large-scale dataset of anatomical...

    • physionet.org
    Updated Jan 22, 2025
    Cite
    Nicolas Gaggion; Candelaria Mosquera; Martina Aineseder; Lucas Mansilla; Diego Milone; Enzo Ferrante (2025). CheXmask Database: a large-scale dataset of anatomical segmentation masks for chest x-ray images [Dataset]. http://doi.org/10.13026/3705-zg36
    Authors
    Nicolas Gaggion; Candelaria Mosquera; Martina Aineseder; Lucas Mansilla; Diego Milone; Enzo Ferrante
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    The CheXmask Database presents a comprehensive, uniformly annotated collection of chest radiographs, constructed from five public databases: ChestX-ray8, Chexpert, MIMIC-CXR-JPG, Padchest and VinDr-CXR. The database aggregates 657,566 anatomical segmentation masks derived from images which have been processed using the HybridGNet model to ensure consistent, high-quality segmentation. To confirm the quality of the segmentations, we include in this database individual Reverse Classification Accuracy (RCA) scores for each of the segmentation masks. This dataset is intended to catalyze further innovation and refinement in the field of semantic chest X-ray analysis, offering a significant resource for researchers in the medical imaging domain.

  10. GEMeX-ThinkVG

    • huggingface.co
    Updated Jun 26, 2025
    Cite
    Kelvin Liu (2025). GEMeX-ThinkVG [Dataset]. https://huggingface.co/datasets/BoKelvin/GEMeX-ThinkVG
    Authors
    Kelvin Liu
    License

    Attribution-NonCommercial 4.0 (CC BY-NC 4.0) (https://creativecommons.org/licenses/by-nc/4.0/)
    License information was derived automatically

    Description

    For images, please refer to MIMIC-CXR-JPG (https://physionet.org/content/mimic-cxr-jpg/2.1.0/). After downloading, pad the shorter side with zeros and then resize the image to 336 × 336. (Full data will be released soon)

    Reference

    If you find ThinkVG useful in your research, please consider citing the following paper: @misc{liu2025gemexthinkvg, title={GEMeX-ThinkVG: Towards Thinking with Visual Grounding in Medical VQA via Reinforcement Learning}, author={Bo Liu and Xiangyu… See the full description on the dataset page: https://huggingface.co/datasets/BoKelvin/GEMeX-ThinkVG.

  11. Data from: RadGraph2: Tracking Findings Over Time in Radiology Reports

    • physionet.org
    Updated Aug 8, 2024
    Cite
    Adam Dejl; Sameer Khanna; Patricia Therese Pile; Kibo Yoon; Steven QH Truong; Hanh Duong; Agustina Saenz; Pranav Rajpurkar (2024). RadGraph2: Tracking Findings Over Time in Radiology Reports [Dataset]. http://doi.org/10.13026/q65y-9688
    Authors
    Adam Dejl; Sameer Khanna; Patricia Therese Pile; Kibo Yoon; Steven QH Truong; Hanh Duong; Agustina Saenz; Pranav Rajpurkar
    License

    https://github.com/MIT-LCP/license-and-dua/tree/master/drafts

    Description

    RadGraph2 is a dataset of 800 chest radiology reports annotated using a fine-grained entity-relationship schema, which is an expanded version of the previously introduced RadGraph dataset. In contrast with the previous approaches and the original RadGraph, the new version of the used information extraction schema is designed to capture not only the key findings and their context but also the mentions of changes that occurred between the prior radiology examinations and the more recent study. These changes may include the appearance of new conditions affecting the patient, their progression, or the differences in the setup of the observed supporting devices. The information extracted from each report is represented in the form of a knowledge graph composed of clinically relevant entities and relations, which makes it easily amenable to automated processing. In addition to the dataset of manually labeled reports, we release more than 220,000 reports automatically annotated by our benchmark model. This model achieved an F1 micro performance of 0.88 and 0.74 on two differently sourced withheld test sets (from MIMIC-CXR-JPG and CheXpert, respectively). We believe that RadGraph2 could facilitate the development of clinically useful systems for the automated processing of radiology reports, particularly those reasoning about the evolution of a patient’s state over time.
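
For readers unfamiliar with the "F1 micro" figure quoted above, the sketch below shows how a micro-averaged F1 score is computed in general. It is not the RadGraph2 evaluation code: it assumes each report's annotations have been reduced to a set of hashable items (the tuples in the usage note are hypothetical), and the description does not say how predictions were matched to gold annotations.

```python
def micro_f1(gold_sets, predicted_sets):
    """Micro-averaged F1: pool true/false positives and false negatives
    over all reports, then compute a single F1 from the pooled counts."""
    tp = fp = fn = 0
    for gold, predicted in zip(gold_sets, predicted_sets):
        tp += len(gold & predicted)   # items found in both
        fp += len(predicted - gold)   # predicted but not in gold
        fn += len(gold - predicted)   # in gold but missed
    denominator = 2 * tp + fp + fn
    return 2 * tp / denominator if denominator else 0.0
```

For example, `micro_f1([{("edema", "present")}], [{("edema", "present"), ("effusion", "present")}])` pools one true positive and one false positive. Because counts are pooled before averaging, reports with many annotations weigh more than reports with few.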

Data from: MIMIC-CXR-JPG - chest radiographs with structured labels

108 scholarly articles cite this dataset.