7 datasets found
  1. MedPix

    • catalog.data.gov
    • data.virginia.gov
    • +3 more
    Updated Jun 19, 2025
    Cite
    National Library of Medicine (2025). MedPix [Dataset]. https://catalog.data.gov/dataset/medpix
    Dataset updated
    Jun 19, 2025
    Dataset provided by
    National Library of Medicine
    Description

    MedPix is a database of patient cases integrating images and textual information. The content material is organized by disease location (organ system), pathology category, patient profiles, and by image classification and caption. Additional information at https://medpix.nlm.nih.gov/home

  2. MedPix-VQA

    • huggingface.co
    Updated Apr 26, 2025
    Cite
    Moukouba Moutoumounkata (2025). MedPix-VQA [Dataset]. https://huggingface.co/datasets/mmoukouba/MedPix-VQA
    Dataset updated
    Apr 26, 2025
    Authors
    Moukouba Moutoumounkata
    License

    Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
    License information was derived automatically

    Description

    MedPix-VQA Dataset

    The MedPix-VQA dataset is a version of the data found at MEDPIX-ClinQA, specifically modified to address an image overlap issue that would result from directly splitting the original dataset. This overlap can let a model see the same image during both training and validation, which can lead to bias or data leakage.

    Key Modifications:

    We have modified the dataset to ensure no image overlap between the training and validation… See the full description on the dataset page: https://huggingface.co/datasets/mmoukouba/MedPix-VQA.
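The overlap issue described above can be avoided by splitting on images rather than on individual question-answer rows. The following is a minimal sketch of that idea, not the dataset author's actual procedure; the field name `image_id`, the function `split_by_image`, and the toy records are assumptions made for illustration.

```python
import random
from collections import defaultdict


def split_by_image(records, val_fraction=0.2, seed=0):
    """Split QA records so that each image appears in exactly one split."""
    # Collect all records that share the same image.
    by_image = defaultdict(list)
    for rec in records:
        by_image[rec["image_id"]].append(rec)  # "image_id" is a hypothetical field name

    # Shuffle the image identifiers, not the individual rows.
    image_ids = sorted(by_image)
    random.Random(seed).shuffle(image_ids)

    n_val = max(1, round(len(image_ids) * val_fraction))
    val = [r for i in image_ids[:n_val] for r in by_image[i]]
    train = [r for i in image_ids[n_val:] for r in by_image[i]]
    return train, val


# Toy data: 5 images with 10 question-answer rows each (placeholder values).
records = [
    {"image_id": f"img{i // 10}", "question": f"q{i}", "answer": f"a{i}"}
    for i in range(50)
]
train, val = split_by_image(records)
train_images = {r["image_id"] for r in train}
val_images = {r["image_id"] for r in val}
```

Because the shuffle operates on image identifiers, `train_images` and `val_images` cannot share an element, which is the property a naive row-level split does not guarantee.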

  3. MedPix-Grouped-QA

    • huggingface.co
    Updated Apr 16, 2025
    Cite
    Moukouba Moutoumounkata (2025). MedPix-Grouped-QA [Dataset]. https://huggingface.co/datasets/mmoukouba/MedPix-Grouped-QA
    Dataset updated
    Apr 16, 2025
    Authors
    Moukouba Moutoumounkata
    License

    Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
    License information was derived automatically

    Description

    MedPix Grouped QA Dataset

    This dataset is a grouped version of MedPix 2.0, a multimodal biomedical dataset, intended for use in training and fine-tuning Visual Question Answering (VQA) and Multimodal LLMs.

    🧩 Structure

    Each entry contains:

    • image: A unique medical image
    • questions: A list of 10 diagnosis-related questions per image
    • answers: A list of corresponding answers

    This version reduces the original 20,500 entries (10 Q&A per image) down to 2,050 unique images, with… See the full description on the dataset page: https://huggingface.co/datasets/mmoukouba/MedPix-Grouped-QA.
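The grouping described above (one entry per image, holding its questions and answers as parallel lists) can be sketched as follows. This is an illustration under assumptions, not the author's code: the flat field names `question` and `answer`, the function `group_by_image`, and the toy file names are hypothetical; only `image`, `questions`, and `answers` come from the description.

```python
from collections import defaultdict


def group_by_image(flat_entries):
    """Collapse one-row-per-question entries into one entry per image."""
    grouped = defaultdict(lambda: {"questions": [], "answers": []})
    for entry in flat_entries:
        bucket = grouped[entry["image"]]
        bucket["questions"].append(entry["question"])  # hypothetical flat field
        bucket["answers"].append(entry["answer"])      # hypothetical flat field
    # Emit entries shaped like the structure listed above.
    return [{"image": image, **qa} for image, qa in grouped.items()]


# Toy data: 3 images with 10 question-answer rows each (placeholder values).
flat = [
    {"image": f"img_{i // 10}.png", "question": f"question {i}", "answer": f"answer {i}"}
    for i in range(30)
]
grouped = group_by_image(flat)
```

With 10 rows per image, the number of grouped entries is one tenth of the number of flat rows, which matches the 20,500 to 2,050 reduction stated in the description.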

  4. MedPix-2.0

    • zenodo.org
    json, zip
    Updated Jan 13, 2025
    Cite
    Salvatore Contino (2025). MedPix-2.0 [Dataset]. http://doi.org/10.5281/zenodo.12624810
    Available download formats
    zip, json
    Dataset updated
    Jan 13, 2025
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Salvatore Contino
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Time period covered
    Jun 30, 2024
    Description

    MedPix 2.0: A Comprehensive Multimodal Biomedical Dataset for Advanced AI Applications.

    Please cite our work as follows if you use MedPix 2.0
    ```
    @misc{siragusa2025medpix20comprehensivemultimodal,
    title={MedPix 2.0: A Comprehensive Multimodal Biomedical Data set for Advanced AI Applications with Retrieval Augmented Generation and Knowledge Graphs},
    author={Irene Siragusa and Salvatore Contino and Massimo La Ciura and Rosario Alicata and Roberto Pirrone},
    year={2025},
    eprint={2407.02994},
    archivePrefix={arXiv},
    primaryClass={cs.DB},
    url={https://arxiv.org/abs/2407.02994},
    }
    ```

    A description of Case_topic.json and Descriptions.json is provided below. The images folder contains all the images of the dataset, while the splitted_dataset folder provides a split of the dataset; please refer to /splitted_dataset/README.md for further information.

    Case_topic.json

    Contains a list of JSON objects, each of which provides the information of a single clinical case. The structure of each element is reported below:

    • U_id the UID code that identifies a clinical case

    • TAC list of names of the .png files containing the CT scans (if present). Images are under the image folder.

    • MRI list of names of the .png files containing the MR scans (if present). Images are under the image folder.

    • Case dictionary with the information of the clinical case. It contains the following information:

      • Title the diagnosis
      • History patient's history
      • Exam
      • Findings
      • Differential Diagnosis
      • Case Diagnosis
      • Diagnosis By
    • Topic Dictionary with the general information about the disease. It contains the following information:

      • Title the diagnosis
      • Disease Discussion
      • ACR Code
      • Category

    Descriptions.json

    Contains a list of JSON objects, each of which provides the textual information about a single image stored in the image folder. The structure of each element is reported below:

    • Type Can be CT or MR, identifies the scanning modality of the image.
    • U_id The UID code of the clinical case the image belongs to.
    • image name of the image file
    • location fine-grained information about the body part location of the given image
    • location category macro-location of the body part shown in the given image
    • Description Dictionary with the descriptive information of the image. It contains the following information:
      • ACR codes
      • Age age of the patient
      • Sex sex of the patient
      • Caption refers to the specific caption of the image
      • Figure part
      • Modality scanning modality of the image
      • Plane
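Since each image description carries the U_id of its clinical case, the two files can be linked on that field. The sketch below indexes description elements by U_id; it is an assumption-laden illustration, with placeholder identifiers, file names, and captions, and a hypothetical helper `index_descriptions`. Exact key spelling and capitalization should be checked against the actual files.

```python
from collections import defaultdict


def index_descriptions(descriptions):
    """Map each case U_id to the list of image descriptions that belong to it."""
    by_case = defaultdict(list)
    for desc in descriptions:
        by_case[desc["U_id"]].append(desc)
    return by_case


# Hypothetical elements shaped like the Descriptions.json description.
descriptions = [
    {"Type": "CT", "U_id": "case-0001", "image": "case-0001_1.png",
     "Description": {"Caption": "placeholder caption A", "Modality": "CT"}},
    {"Type": "MR", "U_id": "case-0001", "image": "case-0001_2.png",
     "Description": {"Caption": "placeholder caption B", "Modality": "MR"}},
    {"Type": "CT", "U_id": "case-0002", "image": "case-0002_1.png",
     "Description": {"Caption": "placeholder caption C", "Modality": "CT"}},
]

by_case = index_descriptions(descriptions)

# All (image file, caption) pairs for one clinical case.
captions = [(d["image"], d["Description"]["Caption"]) for d in by_case["case-0001"]]
```

The same index can be looked up with the U_id of any element from Case_topic.json to retrieve the captions and modalities of that case's images.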

  5. MedPix - kuw7-t8hs - Archive Repository

    • healthdata.gov
    application/rdfxml +5
    Updated Jun 28, 2025
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    (2025). MedPix - kuw7-t8hs - Archive Repository [Dataset]. https://healthdata.gov/w/agpt-kk3v/default?cur=5_Dm7y9HbBA
    Available download formats
    csv, application/rssxml, json, xml, application/rdfxml, tsv
    Dataset updated
    Jun 28, 2025
    Description

    This dataset tracks the updates made on the dataset "MedPix" as a repository for previous versions of the data and metadata.

  6. medpix-clinqa-split

    • huggingface.co
    Cite
    Dharanidhar Reddy Yerram, medpix-clinqa-split [Dataset]. https://huggingface.co/datasets/dreddyyerram/medpix-clinqa-split
    Authors
    Dharanidhar Reddy Yerram
    Description

    dreddyyerram/medpix-clinqa-split dataset hosted on Hugging Face and contributed by the HF Datasets community

  7. vqa-rad

    • huggingface.co
    • opendatalab.com
    Updated Jun 3, 2023
    Cite
    Flavia Giammarino (2023). vqa-rad [Dataset]. https://huggingface.co/datasets/flaviagiammarino/vqa-rad
    Dataset updated
    Jun 3, 2023
    Authors
    Flavia Giammarino
    License

    https://choosealicense.com/licenses/cc0-1.0/

    Description

    Dataset Card for VQA-RAD

    Dataset Description

    VQA-RAD is a dataset of question-answer pairs on radiology images. The dataset is intended to be used for training and testing Medical Visual Question Answering (VQA) systems. The dataset includes both open-ended questions and binary "yes/no" questions. The dataset is built from MedPix, which is a free open-access online database of medical images. The question-answer pairs were manually generated by a team of clinicians.… See the full description on the dataset page: https://huggingface.co/datasets/flaviagiammarino/vqa-rad.
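The distinction between binary "yes/no" questions and open-ended ones is often used to report results separately. A minimal sketch of such a partition is shown below, assuming it is done on the answer text; the field names `question` and `answer`, the helper `is_yes_no`, and the toy pairs are hypothetical and not taken from the dataset.

```python
def is_yes_no(answer):
    """Return True when the answer is a plain 'yes' or 'no' (case-insensitive)."""
    return answer.strip().lower() in {"yes", "no"}


# Toy question-answer pairs (placeholder content).
qa_pairs = [
    {"question": "placeholder binary question 1", "answer": "Yes"},
    {"question": "placeholder open question 1", "answer": "left lung"},
    {"question": "placeholder binary question 2", "answer": "no "},
    {"question": "placeholder open question 2", "answer": "axial"},
]

closed_ended = [qa for qa in qa_pairs if is_yes_no(qa["answer"])]
open_ended = [qa for qa in qa_pairs if not is_yes_no(qa["answer"])]
```

Normalizing case and whitespace before the comparison keeps answers such as "Yes" and "no " in the binary group.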
