41 datasets found
  1. Curated Breast Imaging Subset of Digital Database for Screening Mammography

    • cancerimagingarchive.net
    csv, dicom, n/a
    Updated Sep 14, 2017
    Cite
    The Cancer Imaging Archive (2017). Curated Breast Imaging Subset of Digital Database for Screening Mammography [Dataset]. http://doi.org/10.7937/K9/TCIA.2016.7O02S9CY
    Explore at:
    Available download formats: csv, dicom, n/a
    Dataset updated
    Sep 14, 2017
    Dataset authored and provided by
    The Cancer Imaging Archive
    License

    https://www.cancerimagingarchive.net/data-usage-policies-and-restrictions/

    Time period covered
    Sep 14, 2017
    Dataset funded by
    National Cancer Institute (http://www.cancer.gov/)
    Description

    This CBIS-DDSM (Curated Breast Imaging Subset of DDSM) is an updated and standardized version of the Digital Database for Screening Mammography (DDSM). The DDSM is a database of 2,620 scanned film mammography studies. It contains normal, benign, and malignant cases with verified pathology information. The scale of the database along with ground truth validation makes the DDSM a useful tool in the development and testing of decision support systems. The CBIS-DDSM collection includes a subset of the DDSM data selected and curated by a trained mammographer. The images have been decompressed and converted to DICOM format. Updated ROI segmentation and bounding boxes, and pathologic diagnosis for training data are also included. A manuscript describing how to use this dataset in detail is available at https://www.nature.com/articles/sdata2017177.

    Published research results from work in developing decision support systems in mammography are difficult to replicate due to the lack of a standard evaluation data set; most computer-aided diagnosis (CADx) and detection (CADe) algorithms for breast cancer in mammography are evaluated on private data sets or on unspecified subsets of public databases. Few well-curated public datasets have been provided for the mammography community. These include the DDSM, the Mammographic Imaging Analysis Society (MIAS) database, and the Image Retrieval in Medical Applications (IRMA) project. Although these public data sets are useful, they are limited in terms of data set size and accessibility.

    For example, most researchers using the DDSM do not leverage all its images for a variety of historical reasons. When the database was released in 1997, computational resources to process hundreds or thousands of images were not widely available. Additionally, the DDSM images are saved in non-standard compression files that require the use of decompression code that has not been updated or maintained for modern computers. Finally, the ROI annotations for the abnormalities in the DDSM were provided to indicate a general position of lesions, but not a precise segmentation for them. Therefore, many researchers must implement segmentation algorithms for accurate feature extraction. This causes an inability to directly compare the performance of methods or to replicate prior results. The CBIS-DDSM collection addresses that challenge by publicly releasing a curated and standardized version of the DDSM for evaluation of future CADx and CADe systems (sometimes referred to generally as CAD) research in mammography.

    Please note that the image data for this collection is structured such that each participant has multiple patient IDs. For example, participant 00038 has 10 separate patient IDs which provide information about the scans within the IDs (e.g. Calc-Test_P_00038_LEFT_CC, Calc-Test_P_00038_RIGHT_CC_1). This makes it appear as though there are 6,671 patients according to the DICOM metadata, but there are only 1,566 actual participants in the cohort.
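
The mapping from DICOM patient IDs back to participants can be sketched as follows. This is a minimal sketch, assuming every patient ID embeds the participant number after a `_P_` marker, as the two example IDs above do; the regular expression and the function names `participant_id` and `count_participants` are illustrative and not part of any TCIA tooling.

```python
import re

# Assumed pattern, inferred only from the two example IDs in the description:
# the participant number is the run of digits that follows "_P_".
_PARTICIPANT = re.compile(r"_P_(\d+)_")


def participant_id(dicom_patient_id: str) -> str:
    """Return the participant number embedded in a DICOM patient ID."""
    match = _PARTICIPANT.search(dicom_patient_id)
    if match is None:
        raise ValueError(f"unrecognised patient ID: {dicom_patient_id!r}")
    return match.group(1)


def count_participants(dicom_patient_ids):
    """Count distinct participants across many DICOM patient IDs."""
    return len({participant_id(pid) for pid in dicom_patient_ids})


example_ids = [
    "Calc-Test_P_00038_LEFT_CC",
    "Calc-Test_P_00038_RIGHT_CC_1",
]
# Both example IDs belong to the same participant, 00038.
print(count_participants(example_ids))
```

Grouping by the extracted number rather than by the raw DICOM patient ID helps avoid counting one participant several times, for example when checking that no participant appears in both a training and a test set.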

    For scientific and other inquiries about this dataset, please contact TCIA's Helpdesk.

  2. CBIS DDSM Dataset

    • kaggle.com
    Updated Apr 17, 2025
    Cite
    Orvile (2025). CBIS DDSM Dataset [Dataset]. https://www.kaggle.com/datasets/orvile/cbis-ddsm-dataset
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Apr 17, 2025
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Orvile
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Introduction

    The CBIS-DDSM: Curated Breast Imaging Subset of Digital Database for Screening Mammography includes decompressed images, data selection and curation by trained mammographers, updated mass segmentation and bounding boxes, and pathologic diagnosis for training data, formatted similarly to modern computer vision data sets. The data set contains 753 calcification cases and 891 mass cases, providing a data set size capable of analyzing decision support systems in mammography.

    Authors mention that published research results are difficult to replicate due to the lack of a standard evaluation data set in the area of decision support systems in mammography; most computer-aided diagnosis (CADx) and detection (CADe) algorithms for breast cancer in mammography are evaluated on private data sets or on unspecified subsets of public databases. This causes an inability to directly compare the performance of methods or to replicate prior results. Authors seek to resolve this substantial challenge by releasing an updated and standardized version of the Digital Database for Screening Mammography (DDSM) for evaluation of future CADx and CADe systems (sometimes referred to generally as CAD) research in mammography.

    The DDSM is a collection of mammograms from the following sources: Massachusetts General Hospital, Wake Forest University School of Medicine, Sacred Heart Hospital, and Washington University of St Louis School of Medicine. The DDSM was developed through a grant from the DOD Breast Cancer Research Program, US Army Research and Material Command, and the necessary patient consents were obtained by the original developers of the DDSM. The cases are annotated with ROIs for calcifications and masses, as well as the following information that may be useful for CADe and CADx algorithms: Breast Imaging Reporting and Data System (BI-RADS) descriptors for mass shape, mass margin, calcification type, calcification distribution, and breast density; overall BI-RADS assessment from 0 to 5; rating of the subtlety of the abnormality from 1 to 5; and patient age.

    Mass segmentation

    Mass margin and shape have long been proven substantial indicators for diagnosis in mammography. Because of this, many methods are based on developing mathematical descriptions of the tumour outline. Due to the dependence of these methods on accurate ROI segmentation and the imprecise nature of many of the DDSM-provided annotations, as seen in Fig. 1, we applied a lesion segmentation algorithm (described below) that is initialized by the general original DDSM contours but is able to supply much more accurate ROIs. Figure 1 contains example ROIs from the DDSM, our mammographer, and the automated segmentation algorithm. As shown, the DDSM outlines provide only a general location and not a precise mass boundary. The segmentation algorithm was designed to provide an exact delineation of the mass from the surrounding tissue. This segmentation was done only for masses and not calcifications.

    Standardized train/test splits

    Separate sets of cases for training and testing algorithms are important for ensuring that all researchers are using the same cases for these tasks. Specifically, the test set should contain cases of varying difficulty in order to ensure that the method is tested thoroughly. The data were split into a training set and a testing set based on the BI-RADS category. This allows for an appropriate stratification for researchers working on CADe as well as CADx. Note that many of the BI-RADS assessments likely were updated after additional information was gathered by the physician, as it is unconventional to assign BI-RADS 4 and 5 to screening images. The split was obtained using 20% of the cases for testing and the rest for training. The data were split for all mass cases and all calcification cases separately. Here ‘case’ is used to indicate a particular abnormality, seen on the craniocaudal (CC) and/or mediolateral oblique (MLO) views, which are the standard views for screening mammography. Figure 2 displays the histograms of BI-RADS assessment and pathology for the training and test sets for calcification cases and mass cases. As shown, the data split was performed in such a way as to provide an equal level of difficulty in the training and test sets.
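
A stratified 80/20 split of the kind described above can be sketched as follows. This is a minimal sketch under stated assumptions, not the procedure the CBIS-DDSM authors actually ran: the toy case records, the `birads` key, the fixed random seed, and the function name `stratified_split` are all hypothetical.

```python
import random
from collections import defaultdict


def stratified_split(cases, key, test_fraction=0.2, seed=0):
    """Split cases into (train, test), sampling test_fraction per stratum."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for case in cases:
        strata[case[key]].append(case)

    train, test = [], []
    for stratum in sorted(strata):
        members = strata[stratum][:]   # copy so the input order is untouched
        rng.shuffle(members)
        n_test = round(len(members) * test_fraction)
        test.extend(members[:n_test])
        train.extend(members[n_test:])
    return train, test


# Hypothetical toy data: 50 cases, ten in each BI-RADS category 1 to 5.
cases = [{"case_id": i, "birads": 1 + i % 5} for i in range(50)]
train, test = stratified_split(cases, key="birads")
print(len(train), len(test))
```

Running such a function once on the mass cases and once on the calcification cases would mirror the separate splits the description mentions.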

    Data Records

    The original images are distributed at the full mammography and abnormality level as DICOM files. Full mammography images include both MLO and CC views of the mammograms.

    Metadata for each abnormality was transferred from the original csv files to tag format. For example:

    Patient ID: the first 7 characters of images in the case file
    
    Density category
    
    Breast: Left or Right
    
    View: CC or MLO
    
    Mass shape (when applicable)
    
    Mass margin (when applicable)
    
    Calcification type (when applicable)
    
    Calcification d...
    
  3. DDSM Dataset

    • gts.ai
    json
    Updated Jul 5, 2024
    Cite
    GTS (2024). DDSM Dataset [Dataset]. https://gts.ai/dataset-download/ddsm-dataset/
    Explore at:
    Available download formats: json
    Dataset updated
    Jul 5, 2024
    Dataset provided by
    GLOBOSE TECHNOLOGY SOLUTIONS PRIVATE LIMITED
    Authors
    GTS
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    Explore the comprehensive DDSM and CBIS-DDSM mammogram image dataset, featuring 55,890 pre-processed images resized to 299x299 pixels.

  4. Re-curated Breast Imaging Subset DDSM Dataset (RBIS-DDSM)

    • ieee-dataport.org
    Updated Mar 3, 2022
    Cite
    RAKSHITH SATHISH (2022). Re-curated Breast Imaging Subset DDSM Dataset (RBIS-DDSM) [Dataset]. https://ieee-dataport.org/documents/re-curated-breast-imaging-subset-ddsm-dataset-rbis-ddsm
    Explore at:
    Dataset updated
    Mar 3, 2022
    Authors
    RAKSHITH SATHISH
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Re-curated Breast Imaging Subset DDSM Dataset (RBIS-DDSM) is a curated version of 849 images from the CBIS-DDSM dataset available online with a permissive copyright license (CC-BY-SA 3.0). The CBIS-DDSM dataset is an improved version of the DDSM dataset. The authors of the CBIS-DDSM dataset attempted to improve the ground truth by applying simple image processing based methods to enhance the edges without any manual intervention from medical experts in order to segment and annotate masses.

  5. DDSM-mammography-dataset

    • huggingface.co
    Updated Jul 15, 2025
    Cite
    Unidata Medical (2025). DDSM-mammography-dataset [Dataset]. https://huggingface.co/datasets/ud-medical/DDSM-mammography-dataset
    Explore at:
    Dataset updated
    Jul 15, 2025
    Authors
    Unidata Medical
    License

    Attribution-NonCommercial-NoDerivs 4.0 (CC BY-NC-ND 4.0): https://creativecommons.org/licenses/by-nc-nd/4.0/
    License information was derived automatically

    Description

    Mammogram Photos of breast cancer - 600,000+ Studies

    The dataset comprises 100,000+ studies with protocol and 500,000+ studies without protocol, totaling over 600,000 digital mammography exams curated for cancer detection and diagnosis research. It is designed for advancing breast cancer research, providing a comprehensive resource for studying screening mammography, malignant and benign cases, and computer-aided detection systems.

      Dataset characteristics:… See the full description on the dataset page: https://huggingface.co/datasets/ud-medical/DDSM-mammography-dataset.
    
  6. Data distribution of CBIS-DDSM dataset.

    • plos.figshare.com
    • datasetcatalog.nlm.nih.gov
    xls
    Updated Jul 11, 2024
    Cite
    Jawad Ahmad; Sheeraz Akram; Arfan Jaffar; Zulfiqar Ali; Sohail Masood Bhatti; Awais Ahmad; Shafiq Ur Rehman (2024). Data distribution of CBIS-DDSM dataset. [Dataset]. http://doi.org/10.1371/journal.pone.0304757.t001
    Explore at:
    Available download formats: xls
    Dataset updated
    Jul 11, 2024
    Dataset provided by
    PLOS ONE
    Authors
    Jawad Ahmad; Sheeraz Akram; Arfan Jaffar; Zulfiqar Ali; Sohail Masood Bhatti; Awais Ahmad; Shafiq Ur Rehman
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Recent advancements in AI, driven by big data technologies, have reshaped various industries, with a strong focus on data-driven approaches. This has resulted in remarkable progress in fields like computer vision, e-commerce, cybersecurity, and healthcare, primarily fueled by the integration of machine learning and deep learning models. Notably, the intersection of oncology and computer science has given rise to Computer-Aided Diagnosis (CAD) systems, offering vital tools to aid medical professionals in tumor detection, classification, recurrence tracking, and prognosis prediction. Breast cancer, a significant global health concern, is particularly prevalent in Asia due to diverse factors like lifestyle, genetics, environmental exposures, and healthcare accessibility. Early detection through mammography screening is critical, but the accuracy of mammograms can vary due to factors like breast composition and tumor characteristics, leading to potential misdiagnoses. To address this, an innovative CAD system leveraging deep learning and computer vision techniques was introduced. This system enhances breast cancer diagnosis by independently identifying and categorizing breast lesions, segmenting mass lesions, and classifying them based on pathology. Thorough validation using the Curated Breast Imaging Subset of Digital Database for Screening Mammography (CBIS-DDSM) demonstrated the CAD system’s exceptional performance, with a 99% success rate in detecting and classifying breast masses. While the accuracy of detection is 98.5%, when segmenting breast masses into separate groups for examination, the method’s performance was approximately 95.39%. Upon completing all the analysis, the system’s classification phase yielded an overall accuracy of 99.16% for classification. The potential for this integrated framework to outperform current deep learning techniques is proposed, despite potential challenges related to the high number of trainable parameters. 
Ultimately, this recommended framework offers valuable support to researchers and physicians in breast cancer diagnosis by harnessing cutting-edge AI and image processing technologies, extending recent advances in deep learning to the medical domain.

  7. Ddsm Detector Dataset

    • universe.roboflow.com
    zip
    Updated Aug 4, 2023
    Cite
    Bluespace (2023). Ddsm Detector Dataset [Dataset]. https://universe.roboflow.com/bluespace/ddsm-detector
    Explore at:
    Available download formats: zip
    Dataset updated
    Aug 4, 2023
    Dataset authored and provided by
    Bluespace
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Variables measured
    Patch Bounding Boxes
    Description

    DDSM Detector

    ## Overview
    
    DDSM Detector is a dataset for object detection tasks - it contains Patch annotations for 3,032 images.
    
    ## Getting Started
    
    You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.
    
      ## License
    
      This dataset is available under the [CC BY 4.0 license](https://creativecommons.org/licenses/by/4.0/).
    
  8. Approaches comparison on CBIS DDSM dataset.

    • plos.figshare.com
    xls
    Updated Oct 2, 2024
    Cite
    Mudassar Ali; Tong Wu; Haoji Hu; Tariq Mahmood (2024). Approaches comparison on CBIS DDSM dataset. [Dataset]. http://doi.org/10.1371/journal.pone.0309421.t005
    Explore at:
    Available download formats: xls
    Dataset updated
    Oct 2, 2024
    Dataset provided by
    PLOS ONE
    Authors
    Mudassar Ali; Tong Wu; Haoji Hu; Tariq Mahmood
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Purpose: Using computer-aided design (CAD) systems, this research endeavors to enhance breast cancer segmentation by addressing data insufficiency and data complexity during model training. As perceived by computer vision models, the inherent symmetry and complexity of mammography images make segmentation difficult. The objective is to optimize the precision and effectiveness of medical imaging.

    Methods: The study introduces a hybrid strategy combining shape-guided segmentation (SGS) and M3D-neural cellular automata (M3D-NCA), resulting in improved computational efficiency and performance. The implementation of Shape-guided segmentation (SGS) during the initialization phase, coupled with the elimination of convolutional layers, enables the model to effectively reduce computation time. The research proposes a novel loss function that combines segmentation losses from both components for effective training.

    Results: The robust technique provided aims to improve the accuracy and consistency of breast tumor segmentation, leading to significant improvements in medical imaging and breast cancer detection and treatment.

    Conclusion: This study enhances breast cancer segmentation in medical imaging using CAD systems. Combining shape-guided segmentation (SGS) and M3D-neural cellular automata (M3D-NCA) is a hybrid approach that improves performance and computational efficiency by dealing with complex data and not having enough training data. The approach also reduces computing time and improves training efficiency. The study aims to improve breast cancer detection and treatment methods in medical imaging technology.

  9. CBIS-DDSM_1024

    • huggingface.co
    Updated May 26, 2025
    Cite
    Sean Baek (2025). CBIS-DDSM_1024 [Dataset]. https://huggingface.co/datasets/dbaek111/CBIS-DDSM_1024
    Explore at:
    Dataset updated
    May 26, 2025
    Authors
    Sean Baek
    Description

    The CBIS-DDSM dataset consists of mammograms for 1,566 patients provided in DICOM format with metadata in CSV files. Of the 3,120 full mammogram images it originally contained, 34 were excluded, leaving 3,086 images. These were then converted to 8-bit PNG files and organized into 'cancer' and 'not_cancer' folders based on their pathology for both training and testing purposes. See the full description on the dataset page: https://huggingface.co/datasets/dbaek111/CBIS-DDSM_1024.
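
The folder layout and bit-depth reduction described above can be sketched as follows. This is a minimal sketch, not the dataset author's actual conversion script: the rule that only `MALIGNANT` maps to 'cancer', the `train`/`test` folder names, the linear 16-bit to 8-bit rescale, and the function names `target_folder` and `to_8bit` are all assumptions made for illustration.

```python
from pathlib import PurePosixPath

# Assumption: only MALIGNANT counts as 'cancer'. The dataset page does not
# say how other pathology labels (e.g. BENIGN_WITHOUT_CALLBACK) were handled.
CANCER_LABELS = {"MALIGNANT"}


def target_folder(split: str, pathology: str) -> PurePosixPath:
    """Return e.g. train/cancer or test/not_cancer for one image."""
    is_cancer = pathology.strip().upper() in CANCER_LABELS
    return PurePosixPath(split) / ("cancer" if is_cancer else "not_cancer")


def to_8bit(pixel: int, max_value: int = 65535) -> int:
    """Linearly rescale one pixel value to 0..255 (assumed scheme)."""
    return pixel * 255 // max_value


print(target_folder("train", "MALIGNANT"))
print(target_folder("test", "BENIGN_WITHOUT_CALLBACK"))
print(to_8bit(0), to_8bit(65535))
print(3120 - 34)  # image count remaining after the 34 exclusions
```

A real conversion would more likely rescale whole pixel arrays at once; the per-pixel function is kept only to make the arithmetic explicit.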

  10. Ddsm Dataset

    • universe.roboflow.com
    zip
    Updated Apr 27, 2022
    Cite
    MassDetectClass (2022). Ddsm Dataset [Dataset]. https://universe.roboflow.com/massdetectclass/ddsm
    Explore at:
    Available download formats: zip
    Dataset updated
    Apr 27, 2022
    Dataset authored and provided by
    MassDetectClass
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Variables measured
    Mass Bounding Boxes
    Description

    DDSM

    ## Overview
    
    DDSM is a dataset for object detection tasks - it contains Mass annotations for 1,591 images.
    
    ## Getting Started
    
    You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.
    
      ## License
    
      This dataset is available under the [CC BY 4.0 license](https://creativecommons.org/licenses/by/4.0/).
    
  11. Digital Database for Screening Mammography (DDSM) dataset - Dataset - LDM

    • service.tib.eu
    Updated Dec 2, 2024
    Cite
    (2024). Digital Database for Screening Mammography (DDSM) dataset - Dataset - LDM [Dataset]. https://service.tib.eu/ldmservice/dataset/digital-database-for-screening-mammography--ddsm--dataset
    Explore at:
    Dataset updated
    Dec 2, 2024
    Description

    The DDSM dataset is a public mammogram dataset used for training and testing the proposed method.

  12. CBIS-DDSM One View Mammograms TFRecords

    • kaggle.com
    Updated Dec 29, 2022
    Cite
    Sergio Fuentes (2022). CBIS-DDSM One View Mammograms TFRecords [Dataset]. http://doi.org/10.34740/kaggle/dsv/4429171
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Dec 29, 2022
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Sergio Fuentes
    License

    Attribution-ShareAlike 3.0 (CC BY-SA 3.0): https://creativecommons.org/licenses/by-sa/3.0/
    License information was derived automatically

    Description

    CBIS-DDSM images in TFRecords format. Each example has an image or view of a mammogram, with the corresponding label, which can be positive or negative for breast cancer. CBIS-DDSM images were taken from this Kaggle dataset: https://www.kaggle.com/datasets/awsaf49/cbis-ddsm-breast-cancer-image-dataset

  13. Cbis Ddsm Masses Dataset

    • universe.roboflow.com
    zip
    Updated May 9, 2023
    Cite
    Breast (2023). Cbis Ddsm Masses Dataset [Dataset]. https://universe.roboflow.com/breast-fp534/cbis-ddsm-masses/dataset/2
    Explore at:
    Available download formats: zip
    Dataset updated
    May 9, 2023
    Dataset authored and provided by
    Breast
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Variables measured
    Benign Malignant Bounding Boxes
    Description

    CBIS DDSM Masses

    ## Overview
    
    CBIS DDSM Masses is a dataset for object detection tasks - it contains Benign Malignant annotations for 1,514 images.
    
    ## Getting Started
    
    You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.
    
      ## License
    
      This dataset is available under the [CC BY 4.0 license](https://creativecommons.org/licenses/by/4.0/).
    
  14. CBIS-DDSM patches 448x448

    • kaggle.com
    Updated Dec 29, 2022
    Cite
    Sergio Fuentes (2022). CBIS-DDSM patches 448x448 [Dataset]. http://doi.org/10.34740/kaggle/dsv/4465927
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Dec 29, 2022
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Sergio Fuentes
    License

    Attribution-ShareAlike 3.0 (CC BY-SA 3.0): https://creativecommons.org/licenses/by-sa/3.0/
    License information was derived automatically

    Description

    Images obtained from Kaggle dataset created by AWSEF, in https://www.kaggle.com/datasets/awsaf49/cbis-ddsm-breast-cancer-image-dataset. Patches 448x448 created from CBIS-DDSM and their corresponding labels in TFRecords format.

  15. Updated Description Files for the CBIS-DDSM Dataset

    • ieee-dataport.org
    Updated Aug 12, 2025
    Cite
    Sora Park (2025). Updated Description Files for the CBIS-DDSM Dataset [Dataset]. https://ieee-dataport.org/documents/updated-description-files-cbis-ddsm-dataset
    Explore at:
    Dataset updated
    Aug 12, 2025
    Authors
    Sora Park
    Description

    cropped images

  16. Cbis Ddsm Dataset

    • universe.roboflow.com
    zip
    Updated May 29, 2022
    Cite
    frani1999 (2022). Cbis Ddsm Dataset [Dataset]. https://universe.roboflow.com/frani1999-do9am/cbis-ddsm-oekpf/model/1
    Explore at:
    Available download formats: zip
    Dataset updated
    May 29, 2022
    Dataset authored and provided by
    frani1999
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Variables measured
    Breat Cancer Bounding Boxes
    Description

    CBIS DDSM

    ## Overview
    
    CBIS DDSM is a dataset for object detection tasks - it contains Breat Cancer annotations for 352 images.
    
    ## Getting Started
    
    You can download this dataset for use within your own projects, or fork it into a workspace on Roboflow to create your own model.
    
      ## License
    
      This dataset is available under the [CC BY 4.0 license](https://creativecommons.org/licenses/by/4.0/).
    
  17. cbis-ddsm-breast-cancerz

    • huggingface.co
    Updated Aug 31, 2025
    Cite
    Zi Deng (2025). cbis-ddsm-breast-cancerz [Dataset]. https://huggingface.co/datasets/zkdeng/cbis-ddsm-breast-cancerz
    Explore at:
    Dataset updated
    Aug 31, 2025
    Authors
    Zi Deng
    Description

    zkdeng/cbis-ddsm-breast-cancerz dataset hosted on Hugging Face and contributed by the HF Datasets community

  18. Mammography Dataset from INbreast, MIAS, and DDSM

    • kaggle.com
    Updated May 31, 2024
    Cite
    Emilio A. Venegas Hernández (2024). Mammography Dataset from INbreast, MIAS, and DDSM [Dataset]. https://www.kaggle.com/datasets/emiliovenegas1/mammography-dataset-from-inbreast-mias-and-ddsm/versions/1
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    May 31, 2024
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Emilio A. Venegas Hernández
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Malignant and benign mammograms

    Malignant and benign mammograms from the INbreast, MIAS, and DDSM datasets were downloaded directly from Lin, Ting-Yu, and Huang, Mei-Ling, "Dataset of Breast mammography images with Masses": https://doi.org/10.17632/ywsbh3ndr8.2

    Normal mammograms

    Normal mammograms were sourced from the DDSM webpage: http://www.eng.usf.edu/cvprg/Mammography/Database.html. However, the FTP service is currently not operational. Consequently, using BeautifulSoup (bs4) and PIL, thumbnails of all the normal datasets were extracted, resulting in a total of 2026 files. These files were then augmented and enhanced using CLAHE (Contrast Limited Adaptive Histogram Equalization).

    Consult the Jupyter Notebook for more information on the methods used to extract and enhance images from the DDSM webpage.

  19. Breast Mammography Image Dataset with Masses

    • data.mendeley.com
    Updated Jan 27, 2023
    Cite
    David Faramonna (2023). Breast Mammography Image Dataset with Masses [Dataset]. http://doi.org/10.17632/8fztxggjnc.1
    Explore at:
    Dataset updated
    Jan 27, 2023
    Authors
    David Faramonna
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The mammography dataset includes both benign and malignant tumors. To create the images for this dataset, 106 masses from the INbreast dataset, 53 masses from the MIAS dataset, and 2188 masses from the DDSM dataset were initially extracted. We then preprocess the images using contrast-limited adaptive histogram equalization and data augmentation. After data augmentation, the INbreast dataset has 7632 images, the MIAS dataset has 3816 images, and the DDSM dataset has 13128 images. Additionally, we combine DDSM, MIAS, and INbreast. Each image was resized to 227*227 pixels.
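
The counts quoted above imply a per-source augmentation multiplier, which a short calculation makes explicit. This is a minimal sketch, assuming every extracted mass yields the same number of augmented images within a source; the description does not state the multipliers, so they are derived here rather than quoted.

```python
# Counts quoted in the description: (masses extracted, images after augmentation).
counts = {
    "INbreast": (106, 7632),
    "MIAS": (53, 3816),
    "DDSM": (2188, 13128),
}

for name, (masses, augmented) in counts.items():
    # A zero remainder means the counts are consistent with a whole-number
    # multiplier per mass (an inference, not a documented fact).
    factor, remainder = divmod(augmented, masses)
    print(f"{name}: {augmented} / {masses} = {factor} (remainder {remainder})")

total = sum(augmented for _, augmented in counts.values())
print("combined:", total)
```

Under that assumption the INbreast and MIAS masses were each expanded by the same factor, while the DDSM masses were expanded by a much smaller one, which is worth keeping in mind when comparing results across the three sources.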

  20. CBIS-DDSM

    • huggingface.co
    Updated Jun 30, 2025
    Cite
    FLARE25-Agent-Xray (2025). CBIS-DDSM [Dataset]. https://huggingface.co/datasets/FLARE25-Agent-Xray/CBIS-DDSM
    Explore at:
    Dataset updated
    Jun 30, 2025
    Dataset authored and provided by
    FLARE25-Agent-Xray
    License

    Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
    License information was derived automatically

    Description

    FLARE25-Agent-Xray/CBIS-DDSM dataset hosted on Hugging Face and contributed by the HF Datasets community

Curated Breast Imaging Subset of Digital Database for Screening Mammography

CBIS-DDSM

Explore at:
86 scholarly articles cite this dataset (View in Google Scholar)