27 datasets found
  1. Lung Disease Classification Dataset (100+ images)

    • kaggle.com
    Updated Sep 1, 2023
    Cite
    AyushTankha (2023). Lung Disease Classification Dataset (100+ images) [Dataset]. https://www.kaggle.com/datasets/ayushtankha/lung-disease-classification-dataset-100-images
    Dataset updated
    Sep 1, 2023
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    AyushTankha
    Description

    This file does not have a description yet.

    Covid-19_and_Pneumonia_X-Ray_Detector: The aim of this project is to detect Covid-19 from X-rays and to differentiate Covid-19 from viral pneumonia and bacterial pneumonia. I have created a custom dataset that contains Covid-19 X-ray images, viral pneumonia X-ray images, bacterial pneumonia X-ray images and normal X-ray images. Each class contains 133 images.

    Dataset: I have used data from https://github.com/ieee8023/covid-chestxray-dataset and https://www.kaggle.com/paultimothymooney/chest-xray-pneumonia.

    0 - Covid-19

    1 - Normal X-ray

    2 - Viral Pneumonia X-ray

    3 - Bacterial Pneumonia X-ray
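
    The integer labels above can be kept in a small lookup table. The following is a minimal sketch in Python; the names `CLASS_NAMES` and `decode_label` are illustrative assumptions and are not part of the dataset.

```python
# Class indices as listed in the dataset description.
CLASS_NAMES = {
    0: "Covid-19",
    1: "Normal X-ray",
    2: "Viral Pneumonia X-ray",
    3: "Bacterial Pneumonia X-ray",
}


def decode_label(index: int) -> str:
    """Return the class name for an integer label (hypothetical helper)."""
    try:
        return CLASS_NAMES[index]
    except KeyError:
        raise ValueError(f"unknown class index: {index}") from None
```

    Raising on an unknown index, rather than returning a default, is a design choice that surfaces mislabeled rows early.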

  2. Chest X Rays Dataset

    • universe.roboflow.com
    zip
    Updated Nov 4, 2022
    Cite
    Mohamed Traore (2022). Chest X Rays Dataset [Dataset]. https://universe.roboflow.com/mohamed-traore-2ekkp/chest-x-rays-qjmia/model/2
    Available download formats: zip
    Dataset updated
    Nov 4, 2022
    Dataset authored and provided by
    Mohamed Traore
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Variables measured
    Pneumonia
    Description

    This classification dataset is from Kaggle and was uploaded to Kaggle by Paul Mooney.

    It contains over 5,000 images of chest x-rays in two categories: "PNEUMONIA" and "NORMAL."

    • Version 1 contains the raw images, and only has the pre-processing feature of "Auto-Orient" applied to strip out EXIF data, and ensure all images are "right side up."
    • Version 2 contains the raw images with pre-processing features of "Auto-Orient" and Resize of 640 by 640 applied
    • Version 3 was trained with Roboflow's model architecture for classification datasets and contains the raw images with pre-processing features of "Auto-Orient" and Resize of 640 by 640 applied + augmentations:
      • Outputs per training example: 3
      • Shear: ±3° Horizontal, ±2° Vertical
      • Saturation: Between -5% and +5%
      • Brightness: Between -5% and +5%
      • Exposure: Between -5% and +5%

    Below you will find the description provided on Kaggle:

    Context

    Figure S6. Illustrative Examples of Chest X-Rays in Patients with Pneumonia, Related to Figure 6. The normal chest X-ray (left panel) depicts clear lungs without any areas of abnormal opacification in the image. Bacterial pneumonia (middle) typically exhibits a focal lobar consolidation, in this case in the right upper lobe (white arrows), whereas viral pneumonia (right) manifests with a more diffuse "interstitial" pattern in both lungs. Source: http://www.cell.com/cell/fulltext/S0092-8674(18)30154-5 (image: https://i.imgur.com/jZqpV51.png)

    Content

    The dataset is organized into 3 folders (train, test, val) and contains subfolders for each image category (Pneumonia/Normal). There are 5,863 X-Ray images (JPEG) and 2 categories (Pneumonia/Normal).
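
    A folder layout like the one described above can be summarized by counting files per split and category. This is a minimal sketch using only the standard library; the function name `count_images`, the accepted file extensions, and the exact folder names on disk are assumptions.

```python
from collections import Counter
from pathlib import Path


def count_images(root, extensions=(".jpeg", ".jpg")):
    """Count image files per (split, category) under root/<split>/<category>/."""
    counts = Counter()
    for path in Path(root).glob("*/*/*"):
        if path.is_file() and path.suffix.lower() in extensions:
            # The two directories directly above the file are the
            # split (e.g. train) and the category (e.g. NORMAL).
            split, category = path.parts[-3], path.parts[-2]
            counts[(split, category)] += 1
    return counts
```

    Calling `count_images("chest_xray")` on a hypothetical extraction directory would return a `Counter` keyed by pairs such as `("train", "PNEUMONIA")`.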

    Chest X-ray images (anterior-posterior) were selected from retrospective cohorts of pediatric patients of one to five years old from Guangzhou Women and Children’s Medical Center, Guangzhou. All chest X-ray imaging was performed as part of patients’ routine clinical care.

    For the analysis of chest x-ray images, all chest radiographs were initially screened for quality control by removing all low quality or unreadable scans. The diagnoses for the images were then graded by two expert physicians before being cleared for training the AI system. In order to account for any grading errors, the evaluation set was also checked by a third expert.

    Acknowledgements

    Data: https://data.mendeley.com/datasets/rscbjbr9sj/2

    License: CC BY 4.0

    Citation: http://www.cell.com/cell/fulltext/S0092-8674(18)30154-5 (citation image, latest version on Kaggle: https://i.imgur.com/8AUJkin.png)

    Inspiration

    Automated methods to detect and classify human diseases from medical images.

  3. NIH Chest X-rays Bbox version

    • kaggle.com
    Updated Jun 25, 2024
    Cite
    Huthayfa Hodeb (2024). NIH Chest X-rays Bbox version [Dataset]. https://www.kaggle.com/datasets/huthayfahodeb/nih-chest-x-rays-bbox-version
    Dataset updated
    Jun 25, 2024
    Dataset provided by
    Kaggle
    Authors
    Huthayfa Hodeb
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    NIH Chest X-ray Dataset

    National Institutes of Health Chest X-Ray Dataset

    Chest X-ray exams are one of the most frequent and cost-effective medical imaging examinations available. However, clinical diagnosis of a chest X-ray can be challenging and sometimes more difficult than diagnosis via chest CT imaging. The lack of large publicly available datasets with annotations means it is still very difficult, if not impossible, to achieve clinically relevant computer-aided detection and diagnosis (CAD) in real world medical sites with chest X-rays. One major hurdle in creating large X-ray image datasets is the lack of resources for labeling so many images. Prior to the release of this dataset, Openi was the largest publicly available source of chest X-ray images with 4,143 images available.

    This NIH Chest X-ray Dataset is comprised of 112,120 X-ray images with disease labels from 30,805 unique patients. To create these labels, the authors used Natural Language Processing to text-mine disease classifications from the associated radiological reports. The labels are expected to be >90% accurate and suitable for weakly-supervised learning. The original radiology reports are not publicly available but you can find more details on the labeling process in this Open Access paper: "ChestX-ray8: Hospital-scale Chest X-ray Database and Benchmarks on Weakly-Supervised Classification and Localization of Common Thorax Diseases." (Wang et al.)

    Link to paper

    Data limitations

    • The image labels are NLP extracted so there could be some erroneous labels but the NLP labeling accuracy is estimated to be >90%.
    • Very limited numbers of disease region bounding boxes (See BBox_list_2017.csv)

    File contents

    • Image format: 880 total images with size 1024 x 1024
    • bbox_img: Contains 880 bbox images
    • README_ChestXray.pdf: Original README file
    • BBox_list_2017.csv: Bounding box coordinates. Note: Start at x,y, extend horizontally w pixels, and vertically h pixels
      • Image Index: File name
      • Finding Label: Disease type (Class label)
      • Bbox x
      • Bbox y
      • Bbox w
      • Bbox h
    • Data_entry_2017.csv: Class labels and patient data for the entire dataset
      • Image Index: File name
      • Finding Labels: Disease type (Class label)
      • Follow-up #
      • Patient ID
      • Patient Age
      • Patient Gender
      • View Position: X-ray orientation
      • OriginalImageWidth
      • OriginalImageHeight
      • OriginalImagePixelSpacing_x
      • OriginalImagePixelSpacing_y
    • label.csv: Class labels
    • tesnorlfow.csv: tensorflow version of the dataset
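
    The bounding-box convention noted above (start at x,y, extend w pixels horizontally and h pixels vertically) can be converted to corner coordinates as follows. This is a sketch under assumptions: the helper names are hypothetical, the column order is taken from the list above, and the header row is skipped rather than matched by name because the exact header strings are not confirmed here.

```python
import csv
import io


def bbox_to_corners(x, y, w, h):
    """Convert (x, y, w, h) to (x_min, y_min, x_max, y_max)."""
    return (x, y, x + w, y + h)


def read_boxes(csv_text):
    """Parse rows of: image index, finding label, x, y, w, h."""
    reader = csv.reader(io.StringIO(csv_text))
    next(reader)  # skip the header row
    boxes = []
    for row in reader:
        name, label = row[0], row[1]
        x, y, w, h = (float(value) for value in row[2:6])
        boxes.append((name, label, bbox_to_corners(x, y, w, h)))
    return boxes
```

    For a made-up row `example.png,Mass,10,20,30,40`, the resulting corners are `(10.0, 20.0, 40.0, 60.0)`.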

    Class descriptions

    There are 8 classes. Images can be classified as one or more disease classes:

    • Infiltrate
    • Atelectasis
    • Pneumonia
    • Cardiomegaly
    • Effusion
    • Pneumothorax
    • Mass
    • Nodule

    Citations

    Acknowledgements

    This work was supported by the Intramural Research Program of the NIH Clinical Center (clinicalcenter.nih.gov) and National Library of Medicine (www.nlm.nih.gov).

  4. Covid-19 X-Ray Classification Dataset

    • kaggle.com
    Updated Jun 13, 2024
    Cite
    Sanidhya Goel (2024). Covid-19 X-Ray Classification Dataset [Dataset]. https://www.kaggle.com/datasets/sanidhyagoel/covid-19-x-ray-classification-dataset/code
    Dataset updated
    Jun 13, 2024
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Sanidhya Goel
    License

    MIT License: https://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    This dataset contains 284 human chest X-ray images belonging to 2 classes (Covid-19 Positive and Negative). The dataset has been divided into train and validation splits with 112 and 30 images respectively. The dataset is intended for training deep learning models such as CNNs.

    Dataset Hierarchy

    Dataset.zip
    ├── train
    │   ├── normal
    │   └── infected
    └── val
        ├── normal
        └── infected

    Citations:
    • Covid-19 Positive Patient Chest X-ray images (Source: https://github.com/ieee8023/covid-chestxray-dataset/tree/master)
    • Kaggle Human Lung X-ray Image Dataset (Extracted only "Normal") (Source: https://www.kaggle.com/datasets/paultimothymooney/chest-xray-pneumonia)

  5. HDSNE Chest X-ray Dataset Dataset

    • paperswithcode.com
    Updated Feb 25, 2025
    + more versions
    Cite
    (2025). HDSNE Chest X-ray Dataset Dataset [Dataset]. https://paperswithcode.com/dataset/hdsne-chest-x-ray-dataset
    Dataset updated
    Feb 25, 2025
    Description

    The continuous release of medical image databases, often featuring overlapping or identical categories, poses a significant challenge for the development of autonomous Computer-Aided Diagnostics (CAD) systems. These systems are essential for creating truly comprehensive medical diagnostics. However, one of the main obstacles lies in the frequent bulk release of datasets, which commonly suffer from two critical issues: image duplication and data corruption.

    The Problem of Dataset Redundancy

    Repeated releases of the same categories often fail to integrate or deduplicate similar images across databases, which can severely impact the effectiveness of machine learning models. Data duplication not only reduces the efficiency of learning models but also leads to overfitting, wastes computational resources, and increases the carbon footprint due to the energy required for training complex models.

    Proposed Solution: Global Data Aggregation Model

    In response to these challenges, we introduce a global data aggregation model that intelligently combines data from six distinct and reputable medical imaging databases. Each database was carefully curated to ensure the elimination of redundancies while preserving data diversity. Two robust algorithms were employed:

    Hash MD5 Algorithm: This algorithm generates unique hash values for each image, helping in the effective detection and elimination of duplicate images.

    t-SNE Algorithm: This technique is used for dimensionality reduction, with a tunable perplexity parameter to ensure accurate representation of high-dimensional data.
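
    The MD5-based duplicate detection described above can be sketched as follows. This is an illustrative sketch and not the authors' implementation; the function names are assumptions. Note that a file hash only catches byte-identical copies, so resized or re-encoded versions of the same image would not be flagged.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicates(paths):
    """Return (kept, duplicates); the first file seen per digest is kept."""
    seen = {}
    duplicates = []
    for path in paths:
        digest = md5_of_file(path)
        if digest in seen:
            duplicates.append(path)
        else:
            seen[digest] = path
    return list(seen.values()), duplicates
```

    Reading in chunks keeps memory use flat even for large image files.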

    Dataset Categories

    The final dataset includes an equal number of samples from three key categories of chest X-ray images:

    • Normal
    • Pneumonia
    • COVID-19

    This uniform distribution ensures that the dataset is balanced, avoiding class imbalance—a common issue that can skew results in medical image analysis.

    Dataset Application & Model Evaluation

    The dataset was applied to the Inception V3 pre-trained model, a leading convolutional neural network (CNN) architecture known for its excellence in image classification tasks. The evaluation was conducted using the following performance metrics:

    Accuracy: An exceptional accuracy rate of 98.48% was achieved.

    Precision, Recall, and F1-score: The dataset showed strong performance across these metrics, reducing both false positives and false negatives.

    Statistical Validation: A t-test was conducted to validate the results, and the t-values and p-values confirm the statistical significance of the model’s performance.
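
    For reference, the metrics named above follow from the counts of a binary confusion matrix. The sketch below uses the standard definitions; the function name and the example counts are made up for illustration and are not results from this dataset.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute accuracy, precision, recall and F1 from confusion counts."""
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denominator = precision + recall
    f1 = 2 * precision * recall / denominator if denominator else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }
```

    Precision drops as false positives grow and recall drops as false negatives grow, which is why the two are reported together.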

    Conclusion

    The HDSNE Chest X-ray Dataset offers a novel and effective approach to data aggregation, tackling the issues of redundancy and data duplication that have long plagued the field of medical imaging. By maintaining a balanced class distribution and eliminating unnecessary data, this dataset provides a cleaner and more efficient resource for training machine learning models.

    This dataset is sourced from Kaggle.

  6. NIH Chest X-ray Dataset - Dataset - 國網中心Dataset平台 (NCHC Dataset Platform)

    • scidm.nchc.org.tw
    Updated Oct 10, 2020
    Cite
    (2020). NIH Chest X-ray Dataset - Dataset - 國網中心Dataset平台 [Dataset]. https://scidm.nchc.org.tw/dataset/nih-chest-x-ray-dataset
    Dataset updated
    Oct 10, 2020
    Description

    Source: https://www.kaggle.com/nih-chest-xrays

    Chest X-ray exams are one of the most frequent and cost-effective medical imaging examinations available. However, clinical diagnosis of a chest X-ray can be challenging and sometimes more difficult than diagnosis via chest CT imaging. The lack of large publicly available datasets with annotations means it is still very difficult, if not impossible, to achieve clinically relevant computer-aided detection and diagnosis (CAD) in real world medical sites with chest X-rays. One major hurdle in creating large X-ray image datasets is the lack of resources for labeling so many images. Prior to the release of this dataset, Openi was the largest publicly available source of chest X-ray images with 4,143 images available.

    This NIH Chest X-ray Dataset is comprised of 112,120 X-ray images with disease labels from 30,805 unique patients. To create these labels, the authors used Natural Language Processing to text-mine disease classifications from the associated radiological reports. The labels are expected to be >90% accurate and suitable for weakly-supervised learning. The original radiology reports are not publicly available but you can find more details on the labeling process in this Open Access paper: "ChestX-ray8: Hospital-scale Chest X-ray Database and Benchmarks on Weakly-Supervised Classification and Localization of Common Thorax Diseases." (Wang et al.)

  7. Lung Area Specific COVID-19 Xray Dataset

    • kaggle.com
    Updated Mar 26, 2021
    Cite
    foram sanghavi (2021). Lung Area Specific COVID-19 Xray Dataset [Dataset]. http://doi.org/10.34740/kaggle/dsv/2060331
    Dataset updated
    Mar 26, 2021
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    foram sanghavi
    Description

    In this dataset, the full radiographs are hand-cropped to obtain the lung area-specific radiographs. When using the lung area-specific dataset, please cite the following paper: "Automated Detection of COVID-19 cases on Radiographs using Shape-dependent Fibonacci-p Patterns"

    The full radiographs were collected from the Kaggle dataset (M. E. Chowdhury et al., "Can AI help in screening viral and COVID-19 pneumonia?," arXiv preprint arXiv:2003.13145, 2020), and from the COVIDGR dataset (S. Tabik et al., "COVIDGR dataset and COVID-SDNet methodology for predicting COVID-19 based on Chest X-Ray images," IEEE journal of biomedical and health informatics, vol. 24, no. 12, pp. 3595-3605, 2020.)

  8. ChestX-ray14 Dataset

    • paperswithcode.com
    Updated Feb 19, 2021
    Cite
    Xiaosong Wang; Yifan Peng; Le Lu; Zhiyong Lu; Mohammadhadi Bagheri; Ronald M. Summers (2021). ChestX-ray14 Dataset [Dataset]. https://paperswithcode.com/dataset/chestx-ray14
    Dataset updated
    Feb 19, 2021
    Authors
    Xiaosong Wang; Yifan Peng; Le Lu; Zhiyong Lu; Mohammadhadi Bagheri; Ronald M. Summers
    Description

    ChestX-ray14 is a medical imaging dataset comprising 112,120 frontal-view X-ray images of 30,805 unique patients, collected from 1992 to 2015, with fourteen common disease labels text-mined from the radiological reports via NLP techniques. It expands on ChestX-ray8 by adding six additional thorax diseases: Consolidation, Edema, Emphysema, Fibrosis, Pleural Thickening and Hernia.

  9. ChestX-ray8 Dataset

    • paperswithcode.com
    • opendatalab.com
    Updated Feb 9, 2021
    Cite
    Xiaosong Wang; Yifan Peng; Le Lu; Zhiyong Lu; Mohammadhadi Bagheri; Ronald M. Summers (2021). ChestX-ray8 Dataset [Dataset]. https://paperswithcode.com/dataset/chestx-ray8
    Dataset updated
    Feb 9, 2021
    Authors
    Xiaosong Wang; Yifan Peng; Le Lu; Zhiyong Lu; Mohammadhadi Bagheri; Ronald M. Summers
    Description

    ChestX-ray8 is a medical imaging dataset comprising 108,948 frontal-view X-ray images of 32,717 unique patients, collected from 1992 to 2015, with eight common disease labels text-mined from the radiological reports via NLP techniques.

  10. DECIMER Image classifier dataset

    • data.niaid.nih.gov
    Updated Jul 9, 2022
    Cite
    M. Isabel agea (2022). DECIMER Image classifier dataset [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_6670745
    Dataset updated
    Jul 9, 2022
    Dataset authored and provided by
    M. Isabel agea
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Image dataset divided into train (10,905,114 images), validation (2,115,528 images) and test (544,946 images) folders containing a balanced number of images for two classes (chemical structures and non-chemical structures).

    The chemical structures were generated by applying RanDepict to randomly picked compounds from the ChEMBL30 database and the COCONUT database.

    The non-chemical structures were generated using Python or they were retrieved from several public datasets:

    COCO dataset, MIT Places-205 dataset, Visual Genome dataset, Google Open labeled Images, MMU-OCR-21 (kaggle), HandWritten_Character (kaggle), CoronaHack -Chest X-Ray-dataset (kaggle), PANDAS Augmented Images (kaggle), Bacterial_Colony (kaggle), Ceylon Epigraphy Periods (kaggle), Chinese Calligraphy Styles by Calligraphers (kaggle), Graphs Dataset (kaggle), Function_Graphs Polynomial (kaggle), sketches (kaggle), Person Face Sketches (kaggle), Art Pictograms (kaggle), Russian handwritten letters (kaggle), Handwritten Russian Letters (kaggle), Covid-19 Misinformation Tweets Labeled Dataset (kaggle) and grapheme-imgs-224x224 (kaggle).

    This data was used to build a CNN classification model by fine-tuning an EfficientNetB0 base model. The model is available on GitHub.

  11. Pneumonia_chest_xray

    • kaggle.com
    Updated Nov 6, 2024
    Cite
    Adnan Alaref (2024). Pneumonia_chest_xray [Dataset]. https://www.kaggle.com/datasets/adnanalaref/pneumonia-chest-xray/data
    Dataset updated
    Nov 6, 2024
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Adnan Alaref
    Description

    Dataset

    This dataset was created by Adnan Alaref

    Released under Other (specified in description)

    Contents

  12. Chest X-Ray Worldwide Datasets

    • kaggle.com
    Updated Dec 9, 2020
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Homayoon khadivi (2020). Chest X-Ray Worldwide Datasets [Dataset]. https://www.kaggle.com/homayoonkhadivi/chest-xray-worldwide-datasets/tasks
    Dataset updated
    Dec 9, 2020
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Homayoon khadivi
    Description

    The ChestX-ray8 dataset contains 108,948 frontal-view X-ray images of 32,717 unique patients.

    Each image in the data set contains multiple text-mined labels identifying 14 different pathological conditions. These in turn can be used by physicians to diagnose 8 different diseases. We will use this data to develop a single model that will provide binary classification predictions for each of the 14 labeled pathologies. In other words it will predict 'positive' or 'negative' for each of the pathologies. You can download the entire dataset for free here. (https://nihcc.app.box.com/v/ChestXray-NIHCC)

    I have provided a ~1000 image subset of the images here. The dataset includes a CSV file that provides the labels for each X-ray.

    To make your job a bit easier, I have processed the labels for our small sample and generated three new files to get you started. These three files are:

    • train-small-new.csv: 875 images from our dataset to be used for training.
    • valid-small-new.csv: 109 images from our dataset to be used for validation.
    • test-small-new.csv: 420 images from our dataset to be used for testing. This dataset has been annotated by consensus among four different radiologists for 5 of our 14 pathologies:
      • Consolidation
      • Edema
      • Effusion
      • Cardiomegaly
      • Atelectasis

  13. Mini NIH XRay Dataset for Binary Classification

    • kaggle.com
    Updated Jan 4, 2023
    Cite
    Abby Morgan (2023). Mini NIH XRay Dataset for Binary Classification [Dataset]. https://www.kaggle.com/datasets/abbymorgan/create-mini-xray-dataset-binary-classification-100
    Dataset updated
    Jan 4, 2023
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Abby Morgan
    Description

    The original full dataset contained 112,120 X-ray images with disease labels from 30,805 unique patients.

    This notebook is modified from K Scott Mader's notebook here to create a mini chest x-ray dataset that is split 50:50 between normal and diseased images.
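
    A 50:50 split of the kind described above can be drawn by sampling the same number of images from each class. This is a minimal sketch and not the notebook's code; the function name, the `(image_name, is_diseased)` record shape, and the fixed seed are assumptions.

```python
import random


def balanced_subset(records, n_per_class, seed=0):
    """Sample n_per_class normal and n_per_class diseased records.

    records: iterable of (image_name, is_diseased) pairs.
    """
    records = list(records)
    rng = random.Random(seed)  # seeded so the subset is reproducible
    normal = [r for r in records if not r[1]]
    diseased = [r for r in records if r[1]]
    if min(len(normal), len(diseased)) < n_per_class:
        raise ValueError("not enough images in one of the classes")
    subset = rng.sample(normal, n_per_class) + rng.sample(diseased, n_per_class)
    rng.shuffle(subset)
    return subset
```

    Sampling without replacement within each class guarantees that no image appears twice in the subset.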

    In my notebook I will use this dataset to test a pretrained model on a binary classification task (diseased vs. healthy xray), and then visualize which specific labels the model has the most trouble with.

    Also, because disease classification is such an important task to get right, it's likely that any AI/ML medical classification task will include a human-in-the-loop. In this way, this process more closely resembles how this sort of ML would be used in the real world.

    Note that the original notebook on which this one was based had two versions: Standard and Equalized. In this notebook we will be using the equalized version in order to save ourselves the extra step of performing CLAHE during the tensor transformations.

    The goal of this notebook, as originally stated by Mader, is "to make a much easier to use mini-dataset out of the Chest X-Ray collection. The idea is to have something akin to MNIST or Fashion MNIST for medical images." In order to do this, we will preprocess, normalize, and scale down the images, and then save them into an HDF5 file with the corresponding tabular data.

    Data limitations:
    - The image labels are NLP extracted so there could be some erroneous labels but the NLP labeling accuracy is estimated to be >90%.
    - Very limited numbers of disease region bounding boxes (see BBoxlist2017.csv).
    - Chest x-ray radiology reports are not anticipated to be publicly shared. Parties who use this public dataset are encouraged to share their “updated” image labels and/or new bounding boxes in their own studies later, maybe through manual annotation.

    File Contents

    File is an HDF5 file of shape 200, 28. Main file contains nested HDF5 file of xray images with key images. Main HDF5 file keys are:
    - Image Index
    - Finding Labels: list of disease labels
    - Follow-up #
    - Patient ID
    - Patient Age
    - Patient Gender: 'F'/'M'
    - View Position: 'PA', 'AP'
    - OriginalImageWidth
    - OriginalImageHeight
    - OriginalImagePixelSpacing_x
    - Normal: Binary; if Xray finding is 'Normal'
    - Atelectasis: Binary; if Xray finding includes 'Atelectasis'
    - Cardiomegaly: Binary; if Xray finding includes 'Cardiomegaly'
    - Consolidation: Binary; if Xray finding includes 'Consolidation'
    - Edema: Binary; if Xray finding includes 'Edema'
    - Effusion: Binary; if Xray finding includes 'Effusion'
    - Emphysema: Binary; if Xray finding includes 'Emphysema'
    - Fibrosis: Binary; if Xray finding includes 'Fibrosis'
    - Hernia: Binary; if Xray finding includes 'Hernia'
    - Infiltration: Binary; if Xray finding includes 'Infiltration'
    - Mass: Binary; if Xray finding includes 'Mass'
    - Nodule: Binary; if Xray finding includes 'Nodule'
    - Pleural_Thickening: Binary; if Xray finding includes 'Pleural_Thickening'
    - Pneumonia: Binary; if Xray finding includes 'Pneumonia'
    - Pneumothorax: Binary; if Xray finding includes 'Pneumothorax'
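
    The binary indicator keys listed above can be derived from a list of finding labels. The sketch below is an assumption about how such columns might be built, not the notebook's actual code; `LABEL_COLUMNS` and `to_binary_columns` are hypothetical names.

```python
# The 15 indicator keys named in the file description.
LABEL_COLUMNS = [
    "Normal", "Atelectasis", "Cardiomegaly", "Consolidation", "Edema",
    "Effusion", "Emphysema", "Fibrosis", "Hernia", "Infiltration",
    "Mass", "Nodule", "Pleural_Thickening", "Pneumonia", "Pneumothorax",
]


def to_binary_columns(finding_labels):
    """Map a list of finding labels to a {column: 0 or 1} dictionary."""
    found = set(finding_labels)
    return {column: int(column in found) for column in LABEL_COLUMNS}
```

    For example, `to_binary_columns(["Effusion", "Mass"])` sets `Effusion` and `Mass` to 1 and every other column to 0.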

  14. NIH Chest X ray 14 (224x224 resized)

    • kaggle.com
    zip
    Updated Jul 8, 2020
    + more versions
    Cite
    Khan Fashee Monowar (Sawrup) (2020). NIH Chest X ray 14 (224x224 resized) [Dataset]. https://www.kaggle.com/khanfashee/nih-chest-x-ray-14-224x224-resized
    Available download formats: zip (2468882507 bytes)
    Dataset updated
    Jul 8, 2020
    Authors
    Khan Fashee Monowar (Sawrup)
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    National Institutes of Health Chest X-Ray Dataset

    Chest X-ray exams are one of the most frequent and cost-effective medical imaging examinations available. However, clinical diagnosis of a chest X-ray can be challenging and sometimes more difficult than diagnosis via chest CT imaging. The lack of large publicly available datasets with annotations means it is still very difficult, if not impossible, to achieve clinically relevant computer-aided detection and diagnosis (CAD) in real world medical sites with chest X-rays. One major hurdle in creating large X-ray image datasets is the lack of resources for labeling so many images. Prior to the release of this dataset, Openi was the largest publicly available source of chest X-ray images with 4,143 images available.

    This NIH Chest X-ray Dataset is comprised of 112,120 X-ray images with disease labels from 30,805 unique patients. To create these labels, the authors used Natural Language Processing to text-mine disease classifications from the associated radiological reports. The labels are expected to be >90% accurate and suitable for weakly-supervised learning. The original radiology reports are not publicly available but you can find more details on the labeling process in this Open Access paper: "ChestX-ray8: Hospital-scale Chest X-ray Database and Benchmarks on Weakly-Supervised Classification and Localization of Common Thorax Diseases." (Wang et al.)

    Data limitations:

    The image labels are NLP extracted so there could be some erroneous labels but the NLP labeling accuracy is estimated to be >90%.
    Very limited numbers of disease region bounding boxes (See BBoxlist2017.csv)
    Chest x-ray radiology reports are not anticipated to be publicly shared. Parties who use this public dataset are encouraged to share their “updated” image labels and/or new bounding boxes in their own studies later, maybe through manual annotation.
    

    File contents

    Image format: 112,120 total images with size 1024 x 1024
    
    images_001.zip: Contains 4999 images
    
    images_002.zip: Contains 10,000 images
    
    images_003.zip: Contains 10,000 images
    
    images_004.zip: Contains 10,000 images
    
    images_005.zip: Contains 10,000 images
    
    images_006.zip: Contains 10,000 images
    
    images_007.zip: Contains 10,000 images
    
    images_008.zip: Contains 10,000 images
    
    images_009.zip: Contains 10,000 images
    
    images_010.zip: Contains 10,000 images
    
    images_011.zip: Contains 10,000 images
    
    images_012.zip: Contains 7,121 images
    
    README_ChestXray.pdf: Original README file
    
    BBoxlist2017.csv: Bounding box coordinates. Note: Start at x,y, extend horizontally w pixels, and vertically h pixels
      Image Index: File name
      Finding Label: Disease type (Class label)
      Bbox x
      Bbox y
      Bbox w
      Bbox h
    
    Dataentry2017.csv: Class labels and patient data for the entire dataset
      Image Index: File name
      Finding Labels: Disease type (Class label)
      Follow-up #
      Patient ID
      Patient Age
      Patient Gender
      View Position: X-ray orientation
      OriginalImageWidth
      OriginalImageHeight
      OriginalImagePixelSpacing_x
      OriginalImagePixelSpacing_y
    

    Class descriptions

    There are 15 classes (14 diseases, and one for "No findings"). Images can be classified as "No findings" or one or more disease classes:

    Atelectasis
    Consolidation
    Infiltration
    Pneumothorax
    Edema
    Emphysema
    Fibrosis
    Effusion
    Pneumonia
    Pleural_thickening
    Cardiomegaly
    Nodule
    Mass
    Hernia
    

    Full Dataset Content

    There are 12 zip files in total and range from ~2 gb to 4 gb in size. Additionally, we randomly sampled 5% of these images and created a smaller dataset for use in Kernels. The random sample contains 5606 X-ray images and class labels.

    Sample: sample.zip
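
    The figures quoted in this entry are internally consistent, which can be checked with a few lines of arithmetic. The variable names below are illustrative.

```python
# Per-archive image counts as listed under "File contents":
# images_001 (4999), images_002..images_011 (10,000 each), images_012 (7,121).
ARCHIVE_COUNTS = [4999] + [10_000] * 10 + [7121]

TOTAL_IMAGES = sum(ARCHIVE_COUNTS)

# A 5% random sample of the full set, using integer arithmetic.
SAMPLE_SIZE = TOTAL_IMAGES * 5 // 100

print(len(ARCHIVE_COUNTS), TOTAL_IMAGES, SAMPLE_SIZE)  # 12 112120 5606
```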
    

    Modifications to original data

    Original TAR archives were converted to ZIP archives to be compatible with the Kaggle platform
    
    CSV headers slightly modified to be more explicit in comma separation and also to allow fields to be self-explanatory
    

    Citations

    Wang X, Peng Y, Lu L, Lu Z, Bagheri M, Summers RM. ChestX-ray8: Hospital-scale Chest X-ray Database and Benchmarks on Weakly-Supervised Classification and Localization of Common Thorax Diseases. IEEE CVPR 2017, ChestX-ray8Hospital-ScaleChestCVPR2017_paper.pdf
    
    NIH News release: NIH Clinical Center provides one of the largest publicly available chest x-ray datasets to scientific community
    
    Original source files and documents: https://nihcc.app.box.com/v/ChestXray-NIHCC/folder/36938765345
    
  15. Classification accuracy comparison.

    • figshare.com
    xls
    Updated Sep 1, 2023
    Cite
    Weiguang Liu; Rafael Delalibera Rodrigues; Jianglong Yan; Yu-tao Zhu; Everson José de Freitas Pereira; Gen Li; Qiusheng Zheng; Liang Zhao (2023). Classification accuracy comparison. [Dataset]. http://doi.org/10.1371/journal.pone.0290968.t004
    Available download formats: xls
    Dataset updated
    Sep 1, 2023
    Dataset provided by
    PLOS ONE
    Authors
    Weiguang Liu; Rafael Delalibera Rodrigues; Jianglong Yan; Yu-tao Zhu; Everson José de Freitas Pereira; Gen Li; Qiusheng Zheng; Liang Zhao
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    In this work, we present a network-based technique for chest X-ray image classification to help the diagnosis and prognosis of patients with COVID-19. From visual inspection, we perceive that healthy and COVID-19 chest radiographic images present different levels of geometric complexity. Therefore, we apply fractal dimension and quadtree as feature extractors to characterize such differences. Moreover, real-world datasets often present complex patterns, which can hardly be handled using only the physical features of the data (such as similarity, distance, or distribution). This issue is addressed by complex networks, which are suitable tools for characterizing data patterns and capturing spatial, topological, and functional relationships in data. Specifically, we propose a new approach combining complexity measures and complex networks to provide a modified high-level classification technique to be applied to COVID-19 chest radiographic image classification. The computational results on the Kaggle COVID-19 Radiography Database show that the proposed method can obtain high classification accuracy on X-ray images, being competitive with state-of-the-art classification techniques. Lastly, a set of network measures is evaluated according to their potential in distinguishing the network classes, which resulted in the choice of the communicability measure. We expect that the present work will make significant contributions to machine learning at the semantic level and to combating COVID-19.
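
To make the idea of fractal dimension as a complexity feature concrete, here is a generic box-counting estimate on a binary image. This is a minimal sketch of the textbook box-counting method, not the authors' feature extractor: the box sizes, the function names, and the assumption that the image has already been binarized are all illustrative choices.

```python
import numpy as np


def box_count(binary, size):
    """Count size x size boxes that contain at least one foreground pixel."""
    h, w = binary.shape
    # Trim so the image divides evenly into boxes of the given size.
    h_trim, w_trim = h - h % size, w - w % size
    blocks = binary[:h_trim, :w_trim].reshape(
        h_trim // size, size, w_trim // size, size
    )
    return int(blocks.any(axis=(1, 3)).sum())


def box_counting_dimension(binary, sizes=(1, 2, 4, 8, 16)):
    """Estimate the fractal dimension as the slope of log N(s) vs log(1/s).

    Assumes the image has at least one foreground pixel; an empty image
    would give log(0).
    """
    counts = [box_count(binary, s) for s in sizes]
    slope, _ = np.polyfit(np.log(1.0 / np.array(sizes)), np.log(counts), 1)
    return float(slope)


filled = np.ones((64, 64), dtype=bool)    # a filled plane region
line = np.zeros((64, 64), dtype=bool)
line[0, :] = True                         # a straight line
print(box_counting_dimension(filled), box_counting_dimension(line))
```

A filled region comes out close to dimension 2 and a straight line close to 1. Textured structures fall in between, which is what makes the measure usable as a complexity feature.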

  16. 5k trachea bifurcation on chest xray

    • kaggle.com
    Updated Feb 20, 2021
    Cite
    dr. Konya (2021). 5k trachea bifurcation on chest xray [Dataset]. https://www.kaggle.com/sandorkonya/5k-trachea-bifurcation-on-chest-xray
    Dataset updated
    Feb 20, 2021
    Dataset provided by
    Kaggle
    Authors
    dr. Konya
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Context

    I made these annotations for the RANZCR CLiP - Catheter and Line Position Challenge.

    Content

    The dataset contains:
    - 5,281 ROIs for trachea bifurcation in VGG JSON and COCO-style JSON formats.

    A Faster R-CNN trained with a 200 px bounding box around each point performs reasonably well: the average distance to ground truth (GT) is below 50 px. See the histogram (X distance in pixels from GT):

    Image: predicted trachea distance on image from GT (https://i.postimg.cc/4xZZQJYS/trachea-bifurcation.jpg)
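
One way to read the setup above: each annotated point is expanded into a square box for the detector, and a prediction is scored by the distance from its box center back to the ground-truth point. The sketch below illustrates that reading only. The function names are hypothetical, the box uses the COCO [x, y, width, height] convention, and clipping to image borders is deliberately left out.

```python
import math


def point_to_coco_bbox(cx, cy, box_size=200):
    """Build a square [x, y, width, height] box centered on a point."""
    half = box_size / 2
    return [cx - half, cy - half, float(box_size), float(box_size)]


def bbox_center(bbox):
    """Return the center (x, y) of a [x, y, width, height] box."""
    x, y, w, h = bbox
    return (x + w / 2, y + h / 2)


def distance_to_gt(pred_bbox, gt_point):
    """Euclidean distance in pixels from the predicted box center to the GT point."""
    px, py = bbox_center(pred_bbox)
    return math.hypot(px - gt_point[0], py - gt_point[1])


gt = (500, 400)
print(point_to_coco_bbox(*gt))                     # [400.0, 300.0, 200.0, 200.0]
print(distance_to_gt([430, 340, 200, 200], gt))    # 50.0
```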

    Inspiration

    I hope this helps you to determine the abnormal position of ET tubes on x-rays!

    If you use this dataset, please cite as: Trachea bifurcation dataset by Kónya et al., 2021, https://www.kaggle.com/sandorkonya/5k-trachea-bifurcation-on-chest-xray, https://orcid.org/0000-0001-7356-0541

    Thank you!

  17. Data from: Covid19 Detection

    • kaggle.com
    Updated May 27, 2021
    Cite
    donjon00 (2021). Covid19 Detection [Dataset]. https://www.kaggle.com/donjon00/covid19-detection/discussion
    Dataset updated
    May 27, 2021
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    donjon00
    Description

    Datasets Used

    This dataset is made from multiple publicly available datasets, which are listed below:
    1. NIH Chest X-ray Dataset of 14 Common Thorax Disease
    2. Tuberculosis (TB) Chest X-ray Database
    3. COVID-19 CHEST X-RAY DATABASE
    4. Labeled Optical Coherence Tomography (OCT) and Chest X-Ray Images for Classification (https://data.mendeley.com/datasets/rscbjbr9sj/2)

    Acknowledgements

    NIH Chest X-ray dataset: Xiaosong Wang, Yifan Peng, Le Lu, Zhiyong Lu, Mohammadhadi Bagheri, Ronald Summers, ChestX-ray8: Hospital-scale Chest X-ray Database and Benchmarks on Weakly-Supervised Classification and Localization of Common Thorax Diseases, IEEE CVPR, pp. 3462-3471, 2017.

    TB dataset: Tawsifur Rahman, Amith Khandakar, Muhammad A. Kadir, Khandaker R. Islam, Khandaker F. Islam, Zaid B. Mahbub, Mohamed Arselene Ayari, Muhammad E. H. Chowdhury. (2020) "Reliable Tuberculosis Detection using Chest X-ray with Deep Learning, Segmentation and Visualization". IEEE Access, Vol. 8, pp 191586 - 191601. DOI. 10.1109/ACCESS.2020.3031384.

    COVID dataset:
    - M.E.H. Chowdhury, T. Rahman, A. Khandakar, R. Mazhar, M.A. Kadir, Z.B. Mahbub, K.R. Islam, M.S. Khan, A. Iqbal, N. Al-Emadi, M.B.I. Reaz, M. T. Islam, “Can AI help in screening Viral and COVID-19 pneumonia?” IEEE Access, Vol. 8, 2020, pp. 132665 - 132676.
    - Rahman, T., Khandakar, A., Qiblawey, Y., Tahir, A., Kiranyaz, S., Kashem, S.B.A., Islam, M.T., Maadeed, S.A., Zughaier, S.M., Khan, M.S. and Chowdhury, M.E., 2020. Exploring the Effect of Image Enhancement Techniques on COVID-19 Detection using Chest X-ray Images. arXiv preprint arXiv:2012.02238.

    Pneumonia dataset: Kermany, Daniel; Zhang, Kang; Goldbaum, Michael (2018), “Labeled Optical Coherence Tomography (OCT) and Chest X-Ray Images for Classification”, Mendeley Data, V2, doi: 10.17632/rscbjbr9sj.2

    Inspiration

    Automating the detection and classification of pulmonary diseases using CXR images.

  18. Dataset (Covid-Bacterial-Viral-Normal-Emphysema)

    • kaggle.com
    Updated Jun 13, 2024
    Cite
    Nhật Nguyễn Minh (2024). Dataset (Covid-Bacterial-Viral-Normal-Emphysema) [Dataset]. https://www.kaggle.com/datasets/minhnhat232/dataset-covid-bacterial-viral-normal-emphysema/discussion
    Dataset updated
    Jun 13, 2024
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Nhật Nguyễn Minh
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    The dataset contains lung X-ray images in the following classes:

    1. Normal - 3,270 images
    2. Covid-19 - 3,017 images
    3. Viral-pneumonia - 3,013 images
    4. Bacterial-pneumonia - 3,000 images
    5. Emphysema - 2,550 images

    https://www.googleapis.com/download/storage/v1/b/kaggle-user-content/o/inbox%2F15315323%2F8041ddd2485bfe9cdf2ba1f9d96bd7e5%2F6_Class_Img.jpg?generation=1741951756137022&alt=media

    The dataset is compiled from several reputable sources:

    Dataset 1 [1]: This dataset includes four classes: COVID-19, viral pneumonia, bacterial pneumonia, and normal. It has multiple versions, and we are currently using the latest version (version 4). Previous studies, such as those by Hariri et al. [18] and Ahmad et al. [20], have also utilized earlier versions of this dataset.

    Dataset 2 [2]: This dataset is from the National Institutes of Health (NIH) Chest X-Ray Dataset, which contains over 100,000 chest X-ray images from over 30,000 patients. It includes 14 disease classes, including conditions like atelectasis, consolidation, and infiltration. For this study, we have selected 2,550 chest X-ray images specifically from the Emphysema class.

    Dataset 3 [3]: This is the COVQU dataset, which we have extended to include two additional classes: COVID-19 and viral pneumonia. This dataset has been widely used in previous studies by M.E.H. Chowdhury et al. [4] and Rahman T et al. [5], establishing its reputation as a reliable resource.

    In addition, we also publish a modified dataset that aims to remove image regions that do not contain lungs (abdomen, arms, etc.).

    References:
    [1] U. Sait, K. G. Lal, S. P. Prajapati, R. Bhaumik, T. Kumar, S. Shivakumar, K. Bhalla, Curated dataset for covid-19 posterior-anterior chest radiography images (x-rays), Mendeley Data V4 (2022). doi:10.17632/9xkhgts2s6.4.
    [2] X. Wang, Y. Peng, L. Lu, Z. Lu, M. Bagheri, R. M. Summers, Chestx-ray8: Hospital-scale chest x-ray database and benchmarks on weakly-supervised classification and localization of common thorax diseases (2017) 3462–3471. doi:10.1109/CVPR.2017.369.
    [3] A. M. Tahir, M. E. Chowdhury, A. Khandakar, T. Rahman, Y. Qiblawey, U. Khurshid, S. Kiranyaz, N. Ibtehaz, M. S. Rahman, S. Al-Maadeed, S. Mahmud, M. Ezeddin, K. Hameed, T. Hamid, Covid-19 infection localization and severity grading from chest x-ray images, Computers in Biology and Medicine 139 (2021) 105002. URL: https://www.sciencedirect.com/science/article/pii/S0010482521007964. doi:10.1016/j.compbiomed.2021.105002.
    [4] M. E. Chowdhury, T. Rahman, A. Khandakar, R. Mazhar, M. A. Kadir, Z. B. Mahbub, K. R. Islam, M. S. Khan, A. Iqbal, N. A. Emadi, M. B. I. Reaz, M. T. Islam, Can ai help in screening viral and covid-19 pneumonia?, IEEE Access 8 (2020) 132665–132676. doi:10.1109/ACCESS.2020.3010287.
    [5] T. Rahman, A. Khandakar, Y. Qiblawey, A. Tahir, S. Kiranyaz, S. B. A. Kashem, M. T. Islam, S. A. Maadeed, S. M. Zughaier, M. S. Khan, M. E. Chowdhury, Exploring the effect of image enhancement techniques on covid-19 detection using chest x-ray images, Computers in Biology and Medicine 132 (2021). doi:10.1016/j.compbiomed.2021.104319.

  19. UNET Lung Segmentation Weights for Chest X Rays

    • kaggle.com
    Updated Dec 31, 2023
    Cite
    Farhan Hai Khan (2023). UNET Lung Segmentation Weights for Chest X Rays [Dataset]. http://doi.org/10.34740/kaggle/dsv/7312855
    Dataset updated
    Dec 31, 2023
    Dataset provided by
    Kaggle
    Authors
    Farhan Hai Khan
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Context

    Chest X-rays (CXRs) often contain a lot of noise around the lungs. For cardiovascular disease identification, the lung is an essential part of the CXR and usually the only object of interest. To avoid learning from noise, it is often advisable to preprocess datasets with UNET lung segmentation first and then apply object detection or classification algorithms, which is why this model is being uploaded.
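
The preprocessing idea above can be sketched with NumPy alone: threshold the predicted lung mask, zero out everything outside it, and find the lung bounding box for cropping. This is a minimal sketch under stated assumptions: the segmentation model is assumed to return a per-pixel probability map of the same height and width as the image, and the 0.5 threshold and function name are illustrative.

```python
import numpy as np


def apply_lung_mask(image, prob_mask, threshold=0.5):
    """Zero out pixels outside the predicted lung region.

    Returns the masked image and the lung bounding box as
    (x_min, y_min, x_max, y_max) with exclusive max values,
    or None for the box when no pixel passes the threshold.
    """
    mask = prob_mask >= threshold
    masked = np.where(mask, image, 0)
    if not mask.any():
        return masked, None
    rows = np.flatnonzero(mask.any(axis=1))
    cols = np.flatnonzero(mask.any(axis=0))
    bbox = (int(cols[0]), int(rows[0]), int(cols[-1]) + 1, int(rows[-1]) + 1)
    return masked, bbox


image = np.arange(16).reshape(4, 4)
prob = np.zeros((4, 4))
prob[1:3, 1:3] = 0.9
masked, bbox = apply_lung_mask(image, prob)
x0, y0, x1, y1 = bbox
cropped = masked[y0:y1, x0:x1]   # lung-only crop for a downstream classifier
print(bbox, cropped.shape)
```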

    Starter Code

    I strongly recommend this notebook for training.

    Model Architecture:

    ```python
    # Imports assume the Keras API bundled with TensorFlow 2.x.
    from tensorflow.keras.layers import (
        Input, Conv2D, MaxPooling2D, Conv2DTranspose, concatenate,
    )
    from tensorflow.keras.models import Model


    def unet(input_size=(256, 256, 1)):
        inputs = Input(input_size)

        # Encoder: two 3x3 convolutions per level, then 2x2 max pooling.
        conv1 = Conv2D(32, (3, 3), activation='relu', padding='same')(inputs)
        conv1 = Conv2D(32, (3, 3), activation='relu', padding='same')(conv1)
        pool1 = MaxPooling2D(pool_size=(2, 2))(conv1)

        conv2 = Conv2D(64, (3, 3), activation='relu', padding='same')(pool1)
        conv2 = Conv2D(64, (3, 3), activation='relu', padding='same')(conv2)
        pool2 = MaxPooling2D(pool_size=(2, 2))(conv2)

        conv3 = Conv2D(128, (3, 3), activation='relu', padding='same')(pool2)
        conv3 = Conv2D(128, (3, 3), activation='relu', padding='same')(conv3)
        pool3 = MaxPooling2D(pool_size=(2, 2))(conv3)

        conv4 = Conv2D(256, (3, 3), activation='relu', padding='same')(pool3)
        conv4 = Conv2D(256, (3, 3), activation='relu', padding='same')(conv4)
        pool4 = MaxPooling2D(pool_size=(2, 2))(conv4)

        # Bottleneck.
        conv5 = Conv2D(512, (3, 3), activation='relu', padding='same')(pool4)
        conv5 = Conv2D(512, (3, 3), activation='relu', padding='same')(conv5)

        # Decoder: upsample, concatenate the matching encoder features (skip connection), convolve.
        up6 = concatenate([Conv2DTranspose(256, (2, 2), strides=(2, 2), padding='same')(conv5), conv4], axis=3)
        conv6 = Conv2D(256, (3, 3), activation='relu', padding='same')(up6)
        conv6 = Conv2D(256, (3, 3), activation='relu', padding='same')(conv6)

        up7 = concatenate([Conv2DTranspose(128, (2, 2), strides=(2, 2), padding='same')(conv6), conv3], axis=3)
        conv7 = Conv2D(128, (3, 3), activation='relu', padding='same')(up7)
        conv7 = Conv2D(128, (3, 3), activation='relu', padding='same')(conv7)

        up8 = concatenate([Conv2DTranspose(64, (2, 2), strides=(2, 2), padding='same')(conv7), conv2], axis=3)
        conv8 = Conv2D(64, (3, 3), activation='relu', padding='same')(up8)
        conv8 = Conv2D(64, (3, 3), activation='relu', padding='same')(conv8)

        up9 = concatenate([Conv2DTranspose(32, (2, 2), strides=(2, 2), padding='same')(conv8), conv1], axis=3)
        conv9 = Conv2D(32, (3, 3), activation='relu', padding='same')(up9)
        conv9 = Conv2D(32, (3, 3), activation='relu', padding='same')(conv9)

        # A 1x1 convolution with sigmoid gives a per-pixel lung probability.
        conv10 = Conv2D(1, (1, 1), activation='sigmoid')(conv9)

        return Model(inputs=[inputs], outputs=[conv10])
    ```
    
    
    Acknowledgements

    This model would not be possible without Nikhil Pandey (https://www.kaggle.com/nikhilpandey360).
    Here is the Source Notebook (https://www.kaggle.com/nikhilpandey360/lung-segmentation-from-chest-x-ray-dataset/output).
    Also, the dataset on which it is trained: Chest Xray Masks and Labels (https://www.kaggle.com/nikhilpandey360/chest-xray-masks-and-labels)

    Inspiration

    Go forth and apply your own amazing DEEP NEURAL NETWORKS!
    
  20. 2.3k tracheostomy tube annotated on chest x-ray

    • kaggle.com
    Updated Feb 19, 2021
    Cite
    dr. Konya (2021). 2.3k tracheostomy tube annotated on chest x-ray [Dataset]. https://www.kaggle.com/sandorkonya/23k-tracheostomy-tube-annotated-on-chest-xray/code
    Dataset updated
    Feb 19, 2021
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    dr. Konya
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Context

    I made these segmentations for the RANZCR CLiP - Catheter and Line Position Challenge.

    Content

    The dataset contains:
    - 2,231 bounding boxes for tracheostomy tubes in VGG JSON and COCO-style JSON formats.
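
For readers new to COCO-style JSON, the sketch below groups bounding boxes by image file name. It assumes the annotation file follows the standard COCO layout ("images" entries with "id" and "file_name", "annotations" entries with "image_id" and "bbox" as [x, y, width, height]). The inline example data and file names are made up for illustration and are not taken from this dataset.

```python
import json
from collections import defaultdict


def boxes_by_image(coco):
    """Map each image file name to its list of [x, y, width, height] boxes."""
    id_to_name = {img["id"]: img["file_name"] for img in coco["images"]}
    grouped = defaultdict(list)
    for ann in coco["annotations"]:
        grouped[id_to_name[ann["image_id"]]].append(ann["bbox"])
    return dict(grouped)


# Hypothetical in-memory example; a real file would be read with json.load(open(path)).
example = json.loads("""
{
  "images": [{"id": 1, "file_name": "a.jpg"}, {"id": 2, "file_name": "b.jpg"}],
  "annotations": [
    {"id": 10, "image_id": 1, "category_id": 1, "bbox": [10, 20, 30, 40]},
    {"id": 11, "image_id": 1, "category_id": 1, "bbox": [50, 60, 70, 80]},
    {"id": 12, "image_id": 2, "category_id": 1, "bbox": [5, 5, 10, 10]}
  ],
  "categories": [{"id": 1, "name": "tracheostomy_tube"}]
}
""")

for name, boxes in boxes_by_image(example).items():
    print(name, boxes)
```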

    Inspiration

    I hope this helps you segment tracheostomy tubes on X-rays!

    If you use this dataset, please cite as: Tracheostomy tube segmentation dataset by Kónya et al., 2021, https://www.kaggle.com/sandorkonya/23k-tracheostomy-tube-annotated-on-chest-xray
