13 datasets found
  1. Curated Breast Imaging Subset of Digital Database for Screening Mammography

    • cancerimagingarchive.net
    csv, dicom, n/a
    Updated Sep 14, 2017
    Cite
    The Cancer Imaging Archive (2017). Curated Breast Imaging Subset of Digital Database for Screening Mammography [Dataset]. http://doi.org/10.7937/K9/TCIA.2016.7O02S9CY
    Available download formats
    csv, dicom, n/a
    Dataset updated
    Sep 14, 2017
    Dataset authored and provided by
    The Cancer Imaging Archive
    License

    https://www.cancerimagingarchive.net/data-usage-policies-and-restrictions/

    Time period covered
    Sep 14, 2017
    Dataset funded by
    National Cancer Institute (http://www.cancer.gov/)
    Description

    This CBIS-DDSM (Curated Breast Imaging Subset of DDSM) is an updated and standardized version of the Digital Database for Screening Mammography (DDSM). The DDSM is a database of 2,620 scanned film mammography studies. It contains normal, benign, and malignant cases with verified pathology information. The scale of the database along with ground truth validation makes the DDSM a useful tool in the development and testing of decision support systems. The CBIS-DDSM collection includes a subset of the DDSM data selected and curated by a trained mammographer. The images have been decompressed and converted to DICOM format. Updated ROI segmentation and bounding boxes, and pathologic diagnosis for training data are also included. A manuscript describing how to use this dataset in detail is available at https://www.nature.com/articles/sdata2017177.

    Published research results from work in developing decision support systems in mammography are difficult to replicate due to the lack of a standard evaluation data set; most computer-aided diagnosis (CADx) and detection (CADe) algorithms for breast cancer in mammography are evaluated on private data sets or on unspecified subsets of public databases. Few well-curated public datasets have been provided for the mammography community. These include the DDSM, the Mammographic Imaging Analysis Society (MIAS) database, and the Image Retrieval in Medical Applications (IRMA) project. Although these public data sets are useful, they are limited in terms of data set size and accessibility.

    For example, most researchers using the DDSM do not leverage all its images for a variety of historical reasons. When the database was released in 1997, computational resources to process hundreds or thousands of images were not widely available. Additionally, the DDSM images are saved in non-standard compression files that require the use of decompression code that has not been updated or maintained for modern computers. Finally, the ROI annotations for the abnormalities in the DDSM were provided to indicate a general position of lesions, but not a precise segmentation for them. Therefore, many researchers must implement segmentation algorithms for accurate feature extraction. This causes an inability to directly compare the performance of methods or to replicate prior results. The CBIS-DDSM collection addresses that challenge by publicly releasing a curated and standardized version of the DDSM for evaluation of future CADx and CADe systems (sometimes referred to generally as CAD) research in mammography.

    Please note that the image data for this collection is structured such that each participant has multiple patient IDs. For example, participant 00038 has 10 separate patient IDs which provide information about the scans within the IDs (e.g. Calc-Test_P_00038_LEFT_CC, Calc-Test_P_00038_RIGHT_CC_1). This makes it appear as though there are 6,671 patients according to the DICOM metadata, but there are only 1,566 actual participants in the cohort.
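
    The participant number can be recovered from the patient ID strings themselves. The following is a minimal sketch, assuming every DICOM patient ID follows the pattern seen in the two examples above (a prefix, `_P_`, a five-digit participant number, then laterality, view, and an optional numeric suffix). The names `participant_of` and `group_by_participant` and the regular expression are illustrative assumptions, not part of any TCIA tooling.

```python
import re
from collections import defaultdict

# Assumed pattern, inferred only from the two example IDs quoted above:
# <prefix>_P_<5-digit participant>_<LEFT|RIGHT>_<view>[_<numeric suffix>]
_ID_PATTERN = re.compile(r"_P_(\d{5})_(LEFT|RIGHT)_([A-Z]+)(?:_(\d+))?$")


def participant_of(patient_id: str) -> str:
    """Return the participant number embedded in a DICOM patient ID."""
    match = _ID_PATTERN.search(patient_id)
    if match is None:
        raise ValueError(f"unrecognised patient ID: {patient_id!r}")
    return match.group(1)


def group_by_participant(patient_ids):
    """Map each participant number to the patient IDs that belong to it."""
    groups = defaultdict(list)
    for pid in patient_ids:
        groups[participant_of(pid)].append(pid)
    return dict(groups)


ids = ["Calc-Test_P_00038_LEFT_CC", "Calc-Test_P_00038_RIGHT_CC_1"]
groups = group_by_participant(ids)
print(len(ids), "patient IDs ->", len(groups), "participant(s)")
```

    Counting the keys of the grouped result, rather than the distinct patient IDs, is what should bring a cohort count down from the apparent 6,671 to the 1,566 actual participants, provided the ID pattern assumed here holds across the whole collection.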

    For scientific and other inquiries about this dataset, please contact TCIA's Helpdesk.

  2. DDSM-mammography-dataset

    • huggingface.co
    Updated Jul 15, 2025
    Cite
    Unidata Medical (2025). DDSM-mammography-dataset [Dataset]. https://huggingface.co/datasets/ud-medical/DDSM-mammography-dataset
    Dataset updated
    Jul 15, 2025
    Authors
    Unidata Medical
    License

    Attribution-NonCommercial-NoDerivs 4.0 (CC BY-NC-ND 4.0): https://creativecommons.org/licenses/by-nc-nd/4.0/
    License information was derived automatically

    Description

    Mammogram Photos of breast cancer - 600,000+ Studies

    The dataset comprises 100,000+ studies with protocol and 500,000+ studies without protocol, totaling over 600,000 digital mammography exams curated for cancer detection and diagnosis research. It is designed for advancing breast cancer research, providing a comprehensive resource for studying screening mammography, malignant and benign cases, and computer-aided detection systems.

      Dataset characteristics:… See the full description on the dataset page: https://huggingface.co/datasets/ud-medical/DDSM-mammography-dataset.
    
  3. CBIS-DDSM Dataset

    • datasetninja.com
    Updated Sep 14, 2017
    Cite
    Rebecca Sawyer Lee; Francisco Gimenez; Assaf Hoogi (2017). CBIS-DDSM Dataset [Dataset]. https://datasetninja.com/cbis-ddsm
    Dataset updated
    Sep 14, 2017
    Dataset provided by
    Dataset Ninja
    Authors
    Rebecca Sawyer Lee; Francisco Gimenez; Assaf Hoogi
    License

    Attribution 3.0 (CC BY 3.0): https://creativecommons.org/licenses/by/3.0/
    License information was derived automatically

    Description

    The CBIS-DDSM: Curated Breast Imaging Subset of Digital Database for Screening Mammography includes decompressed images, data selection and curation by trained mammographers, updated mass segmentation and bounding boxes, and pathologic diagnosis for training data, formatted similarly to modern computer vision data sets. The data set contains 753 calcification cases and 891 mass cases, providing a data set size capable of analyzing decision support systems in mammography.

  4. CBIS-DDSM: Mass Case Mammograms PNG Dataset

    • kaggle.com
    Updated Dec 5, 2023
    Cite
    Duru Alaylı (2023). CBIS-DDSM: Mass Case Mammograms PNG Dataset [Dataset]. https://www.kaggle.com/datasets/durualayl/cbis-ddsm-mass-case-mammograms-png-dataset
    Dataset updated
    Dec 5, 2023
    Dataset provided by
    Kaggle
    Authors
    Duru Alaylı
    License

    Attribution-ShareAlike 3.0 (CC BY-SA 3.0): https://creativecommons.org/licenses/by-sa/3.0/
    License information was derived automatically

    Description

    CBIS-DDSM is a publicly available dataset provided by The Cancer Imaging Archive. It is an updated and standardized version of the DDSM dataset that is provided by the University of South Florida. It consists of 2,620 mammograms that are benign with no callback, benign or malignant cases with mass and/or calcification abnormalities. The cases also have both the MLO and CC views or only one of them. Each case has a full image, a cropped image, and a region of interest (ROI) mask image. The dataset is also split into train and test sets for both the mass and calcification cases.

    Description

    For simpler usage CBIS-DDSM: Mass Case Mammograms PNG Dataset is only a subset of the mass cases from the dataset and it only contains the full image views of the mammograms. The images are in .png format and they are grouped in the following directories.

    MLO_full: MLO full views of mammograms for training (661 images)
      - MLO-ben-full-images: MLO benign full views of mammograms for training (291 images)
      - MLO-ben-wout-full-images: MLO benign with no callback full views of mammograms for training (52 images)
      - MLO-mal-full-images: MLO malignant full views of mammograms for training (318 images)

    MLO_full_test: MLO full views of mammograms for test (192 images)
      - MLO-ben-full-images: MLO benign full views of mammograms for test (94 images)
      - MLO-ben-wout-full-images: MLO benign with no callback full views of mammograms for test (19 images)
      - MLO-mal-full-images: MLO malignant full views of mammograms for test (79 images)

    CC_full: CC full views of mammograms for training (576 images)
      - CC-ben-full-images: CC benign full views of mammograms for training (259 images)
      - CC-ben-wout-full-images: CC benign with no callback full views of mammograms for training (33 images)
      - CC-mal-full-images: CC malignant full views of mammograms for training (284 images)

    CC_full_test: CC full views of mammograms for test (171 images)
      - CC-ben-full-images: CC benign full views of mammograms for test (90 images)
      - CC-ben-wout-full-images: CC benign with no callback full views of mammograms for test (15 images)
      - CC-mal-full-images: CC malignant full views of mammograms for test (66 images)
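
    The per-class counts in the directory listing above can be checked against the stated totals, and against a local copy of the files. The following is a rough sketch: the numbers are copied from the listing, while the on-disk layout `<root>/<split>/<class>/*.png` and the helper names `split_totals` and `count_png_files` are assumptions made for illustration.

```python
from pathlib import Path

# Per-class image counts copied from the directory listing above.
EXPECTED = {
    "MLO_full": {"MLO-ben-full-images": 291, "MLO-ben-wout-full-images": 52, "MLO-mal-full-images": 318},
    "MLO_full_test": {"MLO-ben-full-images": 94, "MLO-ben-wout-full-images": 19, "MLO-mal-full-images": 79},
    "CC_full": {"CC-ben-full-images": 259, "CC-ben-wout-full-images": 33, "CC-mal-full-images": 284},
    "CC_full_test": {"CC-ben-full-images": 90, "CC-ben-wout-full-images": 15, "CC-mal-full-images": 66},
}

# Totals stated next to each top-level directory in the listing.
STATED_TOTALS = {"MLO_full": 661, "MLO_full_test": 192, "CC_full": 576, "CC_full_test": 171}


def split_totals(counts):
    """Sum the per-class counts of every split."""
    return {split: sum(classes.values()) for split, classes in counts.items()}


def count_png_files(root):
    """Count *.png files under <root>/<split>/<class>/ (assumed layout)."""
    root = Path(root)
    return {
        split: {cls: sum(1 for _ in (root / split / cls).glob("*.png")) for cls in classes}
        for split, classes in EXPECTED.items()
    }


totals = split_totals(EXPECTED)
print("per-class sums match stated totals:", totals == STATED_TOTALS)
print("images across all four directories:", sum(totals.values()))
```

    Comparing `count_png_files("path/to/download")` with `EXPECTED` is one way to spot an incomplete download; the path here is a placeholder.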

    Citation for Databases

    TCIA. "Curated Breast Imaging Subset of Digital Database for Screening Mammography (CBIS-DDSM)." The Cancer Imaging Archive, (2023). https://wiki.cancerimagingarchive.net/pages/viewpage.action?pageId=22516629#22516629accaef0469834754b89af9e007760b10.

    University of South Florida. "DDSM: Digital Database for Screening Mammography." University of South Florida Digital Mammography, http://www.eng.usf.edu/cvprg/mammography/database.html.

  5. CBIS-DDSM: Breast Cancer Image Dataset

    • kaggle.com
    Updated Feb 7, 2021
    Cite
    Awsaf (2021). CBIS-DDSM: Breast Cancer Image Dataset [Dataset]. https://www.kaggle.com/awsaf49/cbis-ddsm-breast-cancer-image-dataset/tasks
    Dataset updated
    Feb 7, 2021
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Awsaf
    License

    Attribution-ShareAlike 3.0 (CC BY-SA 3.0): https://creativecommons.org/licenses/by-sa/3.0/
    License information was derived automatically

    Description

    Description

    This dataset is a JPEG conversion of the original dataset (163 GB). The original image resolution was kept.

    • Number of Studies: 6,775
    • Number of Series: 6,775
    • Number of Participants: 1,566 (see NB)
    • Number of Images: 10,239
    • Modalities: MG
    • Image Size (GB): 6 (.jpg)

    NB: The image data for this collection is structured such that each participant has multiple patient IDs. For example, participant 00038 has 10 separate patient IDs which provide information about the scans within the IDs (e.g. Calc-Test_P_00038_LEFT_CC, Calc-Test_P_00038_RIGHT_CC_1). This makes it appear as though there are 6,671 patients according to the DICOM metadata, but there are only 1,566 actual participants in the cohort.

    Summary

    This CBIS-DDSM (Curated Breast Imaging Subset of DDSM) is an updated and standardized version of the Digital Database for Screening Mammography (DDSM). The DDSM is a database of 2,620 scanned film mammography studies. It contains normal, benign, and malignant cases with verified pathology information. The scale of the database along with ground truth validation makes the DDSM a useful tool in the development and testing of decision support systems. The CBIS-DDSM collection includes a subset of the DDSM data selected and curated by a trained mammographer. The images have been decompressed and converted to DICOM format. Updated ROI segmentation and bounding boxes, and pathologic diagnosis for training data are also included. A manuscript describing how to use this dataset in detail is available at https://www.nature.com/articles/sdata2017177.

    Published research results from work in developing decision support systems in mammography are difficult to replicate due to the lack of a standard evaluation data set; most computer-aided diagnosis (CADx) and detection (CADe) algorithms for breast cancer in mammography are evaluated on private data sets or on unspecified subsets of public databases. Few well-curated public datasets have been provided for the mammography community. These include the DDSM, the Mammographic Imaging Analysis Society (MIAS) database, and the Image Retrieval in Medical Applications (IRMA) project. Although these public data sets are useful, they are limited in terms of data set size and accessibility.

    For example, most researchers using the DDSM do not leverage all its images for a variety of historical reasons. When the database was released in 1997, computational resources to process hundreds or thousands of images were not widely available. Additionally, the DDSM images are saved in non-standard compression files that require the use of decompression code that has not been updated or maintained for modern computers. Finally, the ROI annotations for the abnormalities in the DDSM were provided to indicate a general position of lesions, but not a precise segmentation for them. Therefore, many researchers must implement segmentation algorithms for accurate feature extraction. This causes an inability to directly compare the performance of methods or to replicate prior results. The CBIS-DDSM collection addresses that challenge by publicly releasing a curated and standardized version of the DDSM for evaluation of future CADx and CADe systems (sometimes referred to generally as CAD) research in mammography.

    Please note that the image data for this collection is structured such that each participant has multiple patient IDs. For example, participant 00038 has 10 separate patient IDs which provide information about the scans within the IDs (e.g. Calc-Test_P_00038_LEFT_CC, Calc-Test_P_00038_RIGHT_CC_1). This makes it appear as though there are 6,671 patients according to the DICOM metadata, but there are only 1,566 actual participants in the cohort.

    For scientific inquiries about this dataset, please contact Dr. Daniel Rubin, Department of Biomedical Data Science, Radiology, and Medicine, Stanford University School of Medicine (dlrubin@stanford.edu).

    Citations & Data Usage Policy

    Users of this data must abide by the TCIA Data Usage Policy and the Creative Commons Attribution 3.0 Unported License under which it has been published. Attribution should include references to the following citations:

    CBIS-DDSM Citation

    Rebecca Sawyer Lee, Francisco Gimenez, Assaf Hoogi, Daniel Rubin (2016). Curated Breast Imaging Subset of DDSM [Dataset]. The Cancer Imaging Archive. DOI: https://doi.org/10.7937/K9/TCIA.2016.7O02S9CY
    

    Publication Citation

    Rebecca Sawyer Lee, Francisco Gimenez, Assaf Hoogi, Kanae Kawai Miyake, Mia Gorovoy & Danie...
    
  6. Data distribution of CBIS-DDSM dataset.

    • plos.figshare.com
    • datasetcatalog.nlm.nih.gov
    xls
    Updated Jul 11, 2024
    Cite
    Jawad Ahmad; Sheeraz Akram; Arfan Jaffar; Zulfiqar Ali; Sohail Masood Bhatti; Awais Ahmad; Shafiq Ur Rehman (2024). Data distribution of CBIS-DDSM dataset. [Dataset]. http://doi.org/10.1371/journal.pone.0304757.t001
    Available download formats
    xls
    Dataset updated
    Jul 11, 2024
    Dataset provided by
    PLOS ONE
    Authors
    Jawad Ahmad; Sheeraz Akram; Arfan Jaffar; Zulfiqar Ali; Sohail Masood Bhatti; Awais Ahmad; Shafiq Ur Rehman
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Recent advancements in AI, driven by big data technologies, have reshaped various industries, with a strong focus on data-driven approaches. This has resulted in remarkable progress in fields like computer vision, e-commerce, cybersecurity, and healthcare, primarily fueled by the integration of machine learning and deep learning models. Notably, the intersection of oncology and computer science has given rise to Computer-Aided Diagnosis (CAD) systems, offering vital tools to aid medical professionals in tumor detection, classification, recurrence tracking, and prognosis prediction. Breast cancer, a significant global health concern, is particularly prevalent in Asia due to diverse factors like lifestyle, genetics, environmental exposures, and healthcare accessibility. Early detection through mammography screening is critical, but the accuracy of mammograms can vary due to factors like breast composition and tumor characteristics, leading to potential misdiagnoses. To address this, an innovative CAD system leveraging deep learning and computer vision techniques was introduced. This system enhances breast cancer diagnosis by independently identifying and categorizing breast lesions, segmenting mass lesions, and classifying them based on pathology. Thorough validation using the Curated Breast Imaging Subset of Digital Database for Screening Mammography (CBIS-DDSM) demonstrated the CAD system’s exceptional performance, with a 99% success rate in detecting and classifying breast masses. While the accuracy of detection is 98.5%, when segmenting breast masses into separate groups for examination, the method’s performance was approximately 95.39%. Upon completing all the analysis, the system’s classification phase yielded an overall accuracy of 99.16% for classification. The potential for this integrated framework to outperform current deep learning techniques is proposed, despite potential challenges related to the high number of trainable parameters. 
    Ultimately, this recommended framework offers valuable support to researchers and physicians in breast cancer diagnosis by harnessing cutting-edge AI and image processing technologies, extending recent advances in deep learning to the medical domain.

  7. Breast Mammography Image Dataset with Masses

    • data.mendeley.com
    Updated Jan 27, 2023
    Cite
    David Faramonna (2023). Breast Mammography Image Dataset with Masses [Dataset]. http://doi.org/10.17632/8fztxggjnc.1
    Dataset updated
    Jan 27, 2023
    Authors
    David Faramonna
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The mammography dataset includes both benign and malignant tumors. To create the images for this dataset, 106 masses from the INbreast dataset, 53 masses from the MIAS dataset, and 2,188 masses from the DDSM dataset were initially extracted. The images were then preprocessed using contrast-limited adaptive histogram equalization and data augmentation. After data augmentation, the INbreast dataset has 7,632 images, the MIAS dataset has 3,816 images, and the DDSM dataset has 13,128 images. A combined set of DDSM, MIAS, and INbreast images is also provided. Each image was resized to 227 × 227 pixels.
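
    The description does not state how many augmented copies were produced per mass, but the figures it gives imply a ratio for each source. The following is a small arithmetic sketch using only the numbers quoted above; the interpretation of the ratio as "augmented images per extracted mass" is an assumption, and the helper name `augmentation_factors` is hypothetical.

```python
from fractions import Fraction

# Extracted masses and post-augmentation image counts, as quoted above.
MASSES = {"INbreast": 106, "MIAS": 53, "DDSM": 2188}
AUGMENTED = {"INbreast": 7632, "MIAS": 3816, "DDSM": 13128}


def augmentation_factors(before, after):
    """Return the exact ratio of augmented images to extracted masses."""
    return {name: Fraction(after[name], before[name]) for name in before}


factors = augmentation_factors(MASSES, AUGMENTED)
for name, factor in factors.items():
    print(f"{name}: {MASSES[name]} masses -> {AUGMENTED[name]} images (x{factor})")

# Size of the combined set, assuming it is the plain union of the three.
print("combined images:", sum(AUGMENTED.values()))
```

    Using `Fraction` keeps the ratios exact, so a non-integer result would show up immediately instead of being hidden by rounding.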

  8. Essam Rashed, M. Samir Abou El Seoud (2024). Dataset: Curated Breast Imaging...

    • service.tib.eu
    Updated Dec 2, 2024
    Cite
    (2024). Essam Rashed, M. Samir Abou El Seoud (2024). Dataset: Curated Breast Imaging Subset of Digital Database of Screening Mammography (CBIS-DDSM). https://doi.org/10.57702/sjkug8pe [Dataset]. https://service.tib.eu/ldmservice/dataset/curated-breast-imaging-subset-of-digital-database-of-screening-mammography--cbis-ddsm-
    Dataset updated
    Dec 2, 2024
    Description

    The Curated Breast Imaging Subset of Digital Database of Screening Mammography (CBIS-DDSM) dataset is used for both training and testing of the developed deep learning approach.

  9. Approaches comparison on CBIS DDSM dataset.

    • plos.figshare.com
    xls
    Updated Oct 2, 2024
    Cite
    Mudassar Ali; Tong Wu; Haoji Hu; Tariq Mahmood (2024). Approaches comparison on CBIS DDSM dataset. [Dataset]. http://doi.org/10.1371/journal.pone.0309421.t005
    Available download formats
    xls
    Dataset updated
    Oct 2, 2024
    Dataset provided by
    PLOS ONE
    Authors
    Mudassar Ali; Tong Wu; Haoji Hu; Tariq Mahmood
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Purpose: Using computer-aided design (CAD) systems, this research endeavors to enhance breast cancer segmentation by addressing data insufficiency and data complexity during model training. As perceived by computer vision models, the inherent symmetry and complexity of mammography images make segmentation difficult. The objective is to optimize the precision and effectiveness of medical imaging.

    Methods: The study introduces a hybrid strategy combining shape-guided segmentation (SGS) and M3D-neural cellular automata (M3D-NCA), resulting in improved computational efficiency and performance. The implementation of Shape-guided segmentation (SGS) during the initialization phase, coupled with the elimination of convolutional layers, enables the model to effectively reduce computation time. The research proposes a novel loss function that combines segmentation losses from both components for effective training.

    Results: The robust technique provided aims to improve the accuracy and consistency of breast tumor segmentation, leading to significant improvements in medical imaging and breast cancer detection and treatment.

    Conclusion: This study enhances breast cancer segmentation in medical imaging using CAD systems. Combining shape-guided segmentation (SGS) and M3D-neural cellular automata (M3D-NCA) is a hybrid approach that improves performance and computational efficiency by dealing with complex data and not having enough training data. The approach also reduces computing time and improves training efficiency. The study aims to improve breast cancer detection and treatment methods in medical imaging technology.

  10. Data from: Mammo-Bench: A Large Scale Benchmark Dataset of Mammography...

    • india-data.org
    images
    Updated Sep 7, 2025
    Cite
    IIIT Hyderabad, IHUB (2025). Mammo-Bench: A Large Scale Benchmark Dataset of Mammography Images [Dataset]. https://india-data.org/googleSEO-list-dataset-search
    Available download formats
    images
    Dataset updated
    Sep 7, 2025
    Dataset authored and provided by
    IIIT Hyderabad, IHUB
    License

    https://india-data.org/terms-conditions

    Area covered
    India
    Description

    Breast cancer remains a significant global health concern, and machine learning algorithms and computer-aided detection systems have shown great promise in enhancing the accuracy and efficiency of mammography image analysis. However, there is a critical need for large, benchmark datasets for training deep learning models for breast cancer detection. In this work we developed Mammo-Bench, a large-scale benchmark dataset of mammography images, by collating data from six well-curated resources, viz., DDSM, INbreast, KAU-BCMD, CMMD, CDD-CESM and DMID. To ensure consistency across images from diverse sources while preserving clinically relevant features, a preprocessing pipeline that includes breast segmentation, pectoral muscle removal, and intelligent cropping is proposed. The dataset consists of 19,731 high-quality mammographic images from 6,500 patients across 6 countries and is one of the largest open-source mammography databases to the best of our knowledge. To show the efficacy of training on the large dataset, performance of ResNet101 architecture was evaluated on Mammo-Bench and the results compared by training independently on a few member datasets and an external dataset, VinDr-Mammo. An accuracy of 78.8% (with data augmentation of the minority classes) and 77.8% (without data augmentation) was achieved on the proposed benchmark dataset, compared to the other datasets for which accuracy varied from 25 – 69%. Noticeably, improved prediction of the minority classes is observed with the Mammo-Bench dataset. These results establish baseline performance and demonstrate Mammo-Bench's utility as a comprehensive resource for developing and evaluating mammography analysis systems.

  11. Performance of fused model approach.

    • plos.figshare.com
    xls
    Updated Jul 11, 2024
    Cite
    Jawad Ahmad; Sheeraz Akram; Arfan Jaffar; Zulfiqar Ali; Sohail Masood Bhatti; Awais Ahmad; Shafiq Ur Rehman (2024). Performance of fused model approach. [Dataset]. http://doi.org/10.1371/journal.pone.0304757.t003
    Available download formats
    xls
    Dataset updated
    Jul 11, 2024
    Dataset provided by
    PLOS ONE
    Authors
    Jawad Ahmad; Sheeraz Akram; Arfan Jaffar; Zulfiqar Ali; Sohail Masood Bhatti; Awais Ahmad; Shafiq Ur Rehman
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Recent advancements in AI, driven by big data technologies, have reshaped various industries, with a strong focus on data-driven approaches. This has resulted in remarkable progress in fields like computer vision, e-commerce, cybersecurity, and healthcare, primarily fueled by the integration of machine learning and deep learning models. Notably, the intersection of oncology and computer science has given rise to Computer-Aided Diagnosis (CAD) systems, offering vital tools to aid medical professionals in tumor detection, classification, recurrence tracking, and prognosis prediction. Breast cancer, a significant global health concern, is particularly prevalent in Asia due to diverse factors like lifestyle, genetics, environmental exposures, and healthcare accessibility. Early detection through mammography screening is critical, but the accuracy of mammograms can vary due to factors like breast composition and tumor characteristics, leading to potential misdiagnoses. To address this, an innovative CAD system leveraging deep learning and computer vision techniques was introduced. This system enhances breast cancer diagnosis by independently identifying and categorizing breast lesions, segmenting mass lesions, and classifying them based on pathology. Thorough validation using the Curated Breast Imaging Subset of Digital Database for Screening Mammography (CBIS-DDSM) demonstrated the CAD system’s exceptional performance, with a 99% success rate in detecting and classifying breast masses. While the accuracy of detection is 98.5%, when segmenting breast masses into separate groups for examination, the method’s performance was approximately 95.39%. Upon completing all the analysis, the system’s classification phase yielded an overall accuracy of 99.16% for classification. The potential for this integrated framework to outperform current deep learning techniques is proposed, despite potential challenges related to the high number of trainable parameters. 
    Ultimately, this recommended framework offers valuable support to researchers and physicians in breast cancer diagnosis by harnessing cutting-edge AI and image processing technologies, extending recent advances in deep learning to the medical domain.

  12. Evaluating the identification of mass lesions.

    • plos.figshare.com
    xls
    Updated Jul 11, 2024
    Cite
    Jawad Ahmad; Sheeraz Akram; Arfan Jaffar; Zulfiqar Ali; Sohail Masood Bhatti; Awais Ahmad; Shafiq Ur Rehman (2024). Evaluating the identification of mass lesions. [Dataset]. http://doi.org/10.1371/journal.pone.0304757.t010
    Available download formats
    xls
    Dataset updated
    Jul 11, 2024
    Dataset provided by
    PLOS ONE
    Authors
    Jawad Ahmad; Sheeraz Akram; Arfan Jaffar; Zulfiqar Ali; Sohail Masood Bhatti; Awais Ahmad; Shafiq Ur Rehman
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Recent advancements in AI, driven by big data technologies, have reshaped various industries, with a strong focus on data-driven approaches. This has resulted in remarkable progress in fields like computer vision, e-commerce, cybersecurity, and healthcare, primarily fueled by the integration of machine learning and deep learning models. Notably, the intersection of oncology and computer science has given rise to Computer-Aided Diagnosis (CAD) systems, offering vital tools to aid medical professionals in tumor detection, classification, recurrence tracking, and prognosis prediction. Breast cancer, a significant global health concern, is particularly prevalent in Asia due to diverse factors like lifestyle, genetics, environmental exposures, and healthcare accessibility. Early detection through mammography screening is critical, but the accuracy of mammograms can vary due to factors like breast composition and tumor characteristics, leading to potential misdiagnoses. To address this, an innovative CAD system leveraging deep learning and computer vision techniques was introduced. This system enhances breast cancer diagnosis by independently identifying and categorizing breast lesions, segmenting mass lesions, and classifying them based on pathology. Thorough validation using the Curated Breast Imaging Subset of Digital Database for Screening Mammography (CBIS-DDSM) demonstrated the CAD system’s exceptional performance, with a 99% success rate in detecting and classifying breast masses. While the accuracy of detection is 98.5%, when segmenting breast masses into separate groups for examination, the method’s performance was approximately 95.39%. Upon completing all the analysis, the system’s classification phase yielded an overall accuracy of 99.16% for classification. The potential for this integrated framework to outperform current deep learning techniques is proposed, despite potential challenges related to the high number of trainable parameters. 
    Ultimately, this recommended framework offers valuable support to researchers and physicians in breast cancer diagnosis by harnessing cutting-edge AI and image processing technologies, extending recent advances in deep learning to the medical domain.

  13. The detection effects of the two modules based on baselines Faster RCNN.

    • plos.figshare.com
    • datasetcatalog.nlm.nih.gov
    xls
    Updated Jun 21, 2023
    Cite
    Jingzhen He; Jing Wang; Zeyu Han; Baojun Li; Mei Lv; Yunfeng Shi (2023). The detection effects of the two modules based on baselines Faster RCNN. [Dataset]. http://doi.org/10.1371/journal.pone.0275194.t003
    Available download formats: xls
    Dataset updated
    Jun 21, 2023
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Jingzhen He; Jing Wang; Zeyu Han; Baojun Li; Mei Lv; Yunfeng Shi
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The detection effects of the two modules based on baselines Faster RCNN.


The Cancer Imaging Archive (2017). Curated Breast Imaging Subset of Digital Database for Screening Mammography [Dataset]. http://doi.org/10.7937/K9/TCIA.2016.7O02S9CY

Curated Breast Imaging Subset of Digital Database for Screening Mammography

CBIS-DDSM

94 scholarly articles cite this dataset (View in Google Scholar)
Available download formats: csv, dicom, n/a
Dataset updated
Sep 14, 2017
Dataset authored and provided by
The Cancer Imaging Archive
License

https://www.cancerimagingarchive.net/data-usage-policies-and-restrictions/

Time period covered
Sep 14, 2017
Dataset funded by
National Cancer Institute (http://www.cancer.gov/)
Description

This CBIS-DDSM (Curated Breast Imaging Subset of DDSM) is an updated and standardized version of the Digital Database for Screening Mammography (DDSM). The DDSM is a database of 2,620 scanned film mammography studies. It contains normal, benign, and malignant cases with verified pathology information. The scale of the database along with ground truth validation makes the DDSM a useful tool in the development and testing of decision support systems. The CBIS-DDSM collection includes a subset of the DDSM data selected and curated by a trained mammographer. The images have been decompressed and converted to DICOM format. Updated ROI segmentation and bounding boxes, and pathologic diagnosis for training data are also included. A manuscript describing how to use this dataset in detail is available at https://www.nature.com/articles/sdata2017177.

Published research results from work in developing decision support systems in mammography are difficult to replicate due to the lack of a standard evaluation data set; most computer-aided diagnosis (CADx) and detection (CADe) algorithms for breast cancer in mammography are evaluated on private data sets or on unspecified subsets of public databases. Few well-curated public datasets have been provided for the mammography community. These include the DDSM, the Mammographic Imaging Analysis Society (MIAS) database, and the Image Retrieval in Medical Applications (IRMA) project. Although these public data sets are useful, they are limited in terms of data set size and accessibility.

For example, most researchers using the DDSM do not leverage all its images, for a variety of historical reasons. When the database was released in 1997, computational resources to process hundreds or thousands of images were not widely available. Additionally, the DDSM images are saved in non-standard compression files that require decompression code that has not been updated or maintained for modern computers. Finally, the ROI annotations for the abnormalities in the DDSM were provided to indicate the general position of lesions, not a precise segmentation of them. Therefore, many researchers must implement their own segmentation algorithms for accurate feature extraction. This makes it impossible to directly compare the performance of methods or to replicate prior results. The CBIS-DDSM collection addresses that challenge by publicly releasing a curated and standardized version of the DDSM for the evaluation of future research on CADx and CADe systems (sometimes referred to generally as CAD) in mammography.

Please note that the image data for this collection is structured such that each participant has multiple patient IDs. For example, participant 00038 has 10 separate patient IDs which provide information about the scans within the IDs (e.g. Calc-Test_P_00038_LEFT_CC, Calc-Test_P_00038_RIGHT_CC_1). This makes it appear as though there are 6,671 patients according to the DICOM metadata, but there are only 1,566 actual participants in the cohort.
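
The note above implies that counting patients directly from the DICOM metadata overstates the cohort size. As a minimal sketch of one way to recover participants, the Python below groups patient IDs by the numeric participant code. The ID pattern is an assumption inferred only from the two example IDs quoted above, the function names are hypothetical, and the third ID in the usage example is made up for illustration; real IDs should be checked against the pattern before relying on it.

```python
import re
from collections import defaultdict

# Assumed layout, inferred from the two example IDs in the description:
#   <kind>-<split>_P_<participant>_<LEFT|RIGHT>_<view>[_<index>]
# This is not an official specification of the naming scheme.
PATIENT_ID_RE = re.compile(
    r"^(?P<kind>[A-Za-z]+)-(?P<split>[A-Za-z]+)_P_(?P<participant>\d+)"
    r"_(?P<side>LEFT|RIGHT)_(?P<view>[A-Z]+)(?:_(?P<index>\d+))?$"
)


def participant_of(patient_id):
    """Return the participant code (e.g. '00038') embedded in a patient ID."""
    match = PATIENT_ID_RE.match(patient_id)
    if match is None:
        # Fail loudly rather than guess when an ID does not fit the assumed layout.
        raise ValueError(f"unrecognized patient ID format: {patient_id!r}")
    return match.group("participant")


def group_by_participant(patient_ids):
    """Map each participant code to the list of patient IDs that belong to it."""
    groups = defaultdict(list)
    for patient_id in patient_ids:
        groups[participant_of(patient_id)].append(patient_id)
    return dict(groups)


if __name__ == "__main__":
    ids = [
        "Calc-Test_P_00038_LEFT_CC",       # from the description
        "Calc-Test_P_00038_RIGHT_CC_1",    # from the description
        "Mass-Training_P_00001_LEFT_MLO",  # hypothetical, for illustration
    ]
    groups = group_by_participant(ids)
    print(f"{len(ids)} patient IDs -> {len(groups)} participants")
```

Grouping on the participant code rather than the full patient ID also helps keep all images from one person on the same side of a train/test split.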

For scientific and other inquiries about this dataset, please contact TCIA's Helpdesk.
