Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset of breast cancer patients was obtained from the 2017 November update of the SEER Program of the NCI, which provides information on population-based cancer statistics. The dataset involved female patients with infiltrating duct and lobular carcinoma breast cancer (SEER primary cites recode NOS histology codes 8522/3) diagnosed in 2006-2010. Patients with unknown tumour size, examined regional LNs, positive regional LNs, and patients whose survival months were less than 1 month were excluded; thus, 4024 patients were ultimately included.
https://cubig.ai/store/terms-of-servicehttps://cubig.ai/store/terms-of-service
1) Data Introduction • The Breast Cancer Wisconsin (Diagnostic) data focuses on distinguishing between malignant (cancerous) and benign (non-cancerous) breast tumors. This dataset is crucial for developing machine learning models to aid in the early detection and classification of breast cancer, thereby potentially saving lives through timely intervention.
2) Data Utilization (1) Breast cancer data has characteristics that: • The dataset contains various features extracted from digitized images of fine needle aspirate (FNA) of breast masses, allowing for detailed analysis and classification of tumors. (2) Breast cancer data can be used to: • Healthcare and Medical Research: Useful for developing diagnostic tools and models to accurately classify breast tumors, aiding healthcare providers in making informed decisions. • Machine Learning and AI Development: Assists in creating and fine-tuning machine learning algorithms to improve predictive accuracy in medical diagnostics.
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The dataset used in this study consists of 7,632 mammogram images categorized into two classes: 2,520 benign and 5,112 malignant images from Huang and Lin (2020). The mammography images in the INbreast database were originally collected from the Centro Hospitalar de S. Joao (CHSJ) Breast Center in Porto. The database contains data collected from August 2008 to July 2010 and includes 115 cases with a total of 410 images (Moreira et al., 2012). Of these, 90 cases concern women with abnormalities in both breasts. Four different types of breast disease are recorded in the database: Mass, calcification, asymmetries and distortions. The mammograms are recorded from two standard perspectives: Craniocaudal (CC) and Mediolateral Oblique (MLO). In addition, breast density is classified into four categories based on the BI-RADS standards: Fully Fat (Density 1), Scattered Fibrous-Landular Density (Density 2), Heterogeneously Dense (Density 3) and Extremely Dense (Density 4). The images are stored in two resolutions: 3328 x 4084 pixels or 2560 x 3328 pixels, in DICOM format. 106 mammograms depicting breast masses were selected from the INbreast database. To enhance the dataset for model training, data augmentation techniques were applied, increasing the total number of breast mammography images to 7,632.
Dataset Card for "breast-cancer"
Dataset was taken from the MedSAM project and used in this notebook which fine-tunes Meta's SAM model on the dataset. More Information needed
This dataset was created by maede norouzi
This dataset was created to be used in machine learning tasks for both binary and multiclass classification problems. It consists of 7,909 microscopic images of breast tumor tissue, captured using different magnification factors (40X, 100X, 200X, and 400X). The images in this dataset were resized to 224x224 pixels and organized according to binary and multiclass classification tasks.
THIS DATASET IS ARCHIVED AT DANS/EASY, BUT NOT ACCESSIBLE HERE. TO VIEW A LIST OF FILES AND ACCESS THE FILES IN THIS DATASET CLICK ON THE DOI-LINK ABOVE
Interdisciplinary s... Identifier DOI https://doi.org/10.17632/jxwvdwhpc2.1 PID https://nbn-resolving.org/urn:nbn:nl:ui:13-i2-oi2g Metadata Access https://easy.dans.knaw.nl/oai?verb=GetRecord&metadataPrefix=oai_datacite&identifier=oai:easy.dans.knaw.nl:easy-dataset:301675 Provenance Creator Pereira, M Publisher Data Archiving and Networked Services (DANS) Contributor Mayke Pereira Publication Year 2023 Rights info:eu-repo/semantics/openAccess; License: http://creativecommons.org/licenses/by/4.0; http://creativecommons.org/licenses/by/4.0 OpenAccess true Representation Resource Type Dataset Discipline Other
Pereira, Mayke (2023), “BreakHis - Breast Cancer Histopathological Database”, Mendeley Data, V1, doi: 10.17632/jxwvdwhpc2.1
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Worldwide, breast cancer is the most common type of cancer in women and the second highest in terms of mortality rates.Diagnosis of breast cancer is performed when an abnormal lump is found (from self-examination or x-ray) or a tiny speck of calcium is seen (on an x-ray). After a suspicious lump is found, the doctor will conduct a diagnosis to determine whether it is cancerous and, if so, whether it has spread to other parts of the body. This breast cancer dataset was obtained from the University of Wisconsin Hospitals, Madison from Dr. William H. Wolberg.
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The AISSLab Breast Cancer Dataset is a collection of mammogram images by experts from the Ma'amon's Diagnostic Centre Mammogram Images for Breast Cancer (MDCMI-BC) in Yemen. It is designed to support advancements in breast cancer research and computer-aided diagnosis (CAD) systems. To facilitate research in breast cancer detection, focusing on harmonizing AI with diverse imaging data. This dataset emphasizes improving diagnostic accuracy and is available for academic and clinical research applications.
If you are using this dataset for research purpose kindly cite the following papers:
[1] A. M. Al-Hejri, R. M. Al-Tam, M. Fazea, A. H. Sable, S. Lee, and M. A. Al-antari, “ETECADx: Ensemble Self-Attention Transformer Encoder for Breast Cancer Diagnosis Using Full-Field Digital X-ray Breast Images,” Diagnostics, vol. 13, no. 1, p. 89, Dec. 2022, doi: 10.3390/diagnostics13010089.
[2] R. M. Al-Tam, A. M. Al-Hejri, S. S. Alshamrani, M. A. Al-antari, and S. M. Narangale, “Multimodal breast cancer hybrid explainable computer-aided diagnosis using medical mammograms and ultrasound Images,” Biocybern. Biomed. Eng., vol. 44, no. 3, pp. 731–758, Jul. 2024, doi: 10.1016/j.bbe.2024.08.007.
Breast cancer is a disease in which cells in the breast grow out of control. There are different kinds of breast cancer. The kind of breast cancer depends on which cells in the breast turn into cancer. Wisconsin Diagnostic Breast Cancer (WDBC) dataset obtained by the university of Wisconsin Hospital is used to classify tumors as benign or malignant.
Attribute Information:
1) ID number 2) Diagnosis (M = malignant, B = benign) 3-32)
Ten real-valued features are computed for each cell nucleus:
a) radius (mean of distances from center to points on the perimeter) b) texture (standard deviation of gray-scale values) c) perimeter d) area e) smoothness (local variation in radius lengths) f) compactness (perimeter^2 / area - 1.0) g) concavity (severity of concave portions of the contour) h) concave points (number of concave portions of the contour) i) symmetry j) fractal dimension ("coastline approximation" - 1)
The mean, standard error and "worst" or largest (mean of the three largest values) of these features were computed for each image, resulting in 30 features. For instance, field 3 is Mean Radius, field 13 is Radius SE, field 23 is Worst Radius.
All feature values are recoded with four significant digits.
Missing attribute values: none
Class distribution: 357 benign, 212 malignant
Creator: Dr. WIlliam H. Wolberg (physician) University of Wisconsin Hospitals Madison, Wisconsin, USA
This database is also available through the UW CS ftp server: ftp ftp.cs.wisc.edu cd math-prog/cpo-dataset/machine-learn/WDBC/
Also can be found on UCI Machine Learning Repository: https://archive.ics.uci.edu/ml/datasets/Breast+Cancer+Wisconsin+%28Diagnostic%29
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset consists of 1 .xlsx file, 2 .png files, 1 .json file and 1 .zip file:annotation_details.xlsx: The distribution of annotations in the previously mentioned six classes (mitosis, apoptosis, tumor nuclei, non-tumor nuclei, tubule, and non-tubule) is presented in a Excel spreadsheet.original.png: The input image.annotated.png: An example from the dataset. In the annotated image, blue circles indicate the tumor nuclei, pink circles show non-tumor nuclei such as blood cells, stroma nuclei, and lymphocytes; orange and green circles are mitosis and apoptosis, respectively; light blue circles are true lumen for tubules, and yellow circles represent white regions (non-lumen) such as fat, blood vessel, and broken tissues.data.json: The annotations for the BreCaHAD dataset are provided in JSON (JavaScript Object Notation) format. In the given example, the JSON file (ground truth) contains two mitosis and only one tumor nuclei annotations. Here, x and y are the coordinates of the centroid of the annotated object, and the values are between 0, 1.BreCaHAD.zip: An archive file containing dataset. Three folders are included: images (original images), groundTruth (json files), and groundTruth_display (groundTruth applied on original images)
This dataset consists of 361 whole slide images (WSI) - 296 malignant from women with invasive breast cancer (HER2 neg) and 65 benign. The tumours have been classified with four SNOMED-CT categories based on morphology: invasive duct carcinoma, invasive lobular carcinoma, in situ carcinoma, and others. 4144 separate annotations have been made to segment different tissue structures connected to ontologies.
https://www.cancerimagingarchive.net/data-usage-policies-and-restrictions/https://www.cancerimagingarchive.net/data-usage-policies-and-restrictions/
Ductal carcinoma in situ with microinvasion (DCISM) is a challenging subtype of breast cancer with controversial invasiveness and prognosis. Accurate diagnosis of DCISM from ductal carcinoma in situ (DCIS) is crucial for optimal treatment and improved clinical outcomes. This dataset provides histopathology images and paired CK5/6 immunohistochemical staining images from patients with DCISM, as well as multiphoton microscopy images of suspicious regions. It offers multi-modal imaging data from various perspectives for analysis and diagnosis of microinvasive breast cancer by other researchers in the field.
The dataset contains data from 12 breast cancer patients, including 10 cases of ductal carcinoma in situ with microinvasion (DCISM), 1 case of ductal carcinoma in situ (DCIS), and 1 case of invasive breast cancer.
The magnification of the glass slide images is 40x. The pathology slide scanner used was created by the Sunny Optical Technology (group) Co., Ltd., and the pixel aspect ratio of the images is 1. The dataset also includes multiphoton microscopy imaging of suspicious microinvasion areas. The multiphoton imaging system was manufactured by Zeiss, and it also has a pixel aspect ratio of 1.
Our database was specifically collected for the use of imaging methods in diagnosing DICSM. The suffixes in each case number indicate the patient's condition - "DCISM" for ductal carcinoma in situ with microinvasion, "DCIS" for ductal carcinoma in situ, and "IDC" for invasive ductal carcinoma. Apart from these labels, we have not collected any additional clinical information for these cases.
According to the WHO, breast cancer is the most commonly occurring cancer worldwide. In 2020 alone, there were 2.3 million new breast cancer diagnoses and 685,000 deaths. Yet breast cancer mortality in high-income countries has dropped by 40% since the 1980s when health authorities implemented regular mammography screening in age groups considered at risk. Early detection and treatment are critical to reducing cancer fatalities, and your machine learning skills could help streamline the process radiologists use to evaluate screening mammograms. Currently, early detection of breast cancer requires the expertise of highly-trained human observers, making screening mammography programs expensive to conduct. RSNA collected screening mammograms and supporting information from two sites, totaling just under 20,000 imaging studies.
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
categorizing
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Breast Cancer Dataset
https://choosealicense.com/licenses/cc/https://choosealicense.com/licenses/cc/
Data Source
https://www.kaggle.com/datasets/andrewmvd/breast-cancer-cell-segmentation
Dataset Card Authors
Mahadi Hassan
Dataset Card Contact
mahadise01@gmail.com
Linkdin: https://www.linkedin.com/in/mahadise01
Github: https://github.com/Mahadih534
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Breast Cancer Wisconsin Dataset: African Physiognomy Adjusted
Dataset Description
This dataset addresses representation bias in medical AI by providing an African physiognomy-adjusted version of the classic Wisconsin Breast Cancer Dataset. The adjustment methodology systematically modifies cellular morphology features to better reflect documented physiological differences in African populations.
Dataset Summary
Original Dataset: Wisconsin Breast Cancer Dataset… See the full description on the dataset page: https://huggingface.co/datasets/electricsheepafrica/breast-cancer-africa-adjusted-dataset.
https://www.cancerimagingarchive.net/data-usage-policies-and-restrictions/https://www.cancerimagingarchive.net/data-usage-policies-and-restrictions/
The Data Integration & Imaging Informatics (DI-Cubed) project explored the issue of lack of standardized data capture at the point of data creation, as reflected in the non-image data accompanying various TCIA breast cancer collections. The work addressed the desire for semantic interoperability between various NCI initiatives by aligning on common clinical metadata elements and supporting use cases that connect clinical, imaging, and genomics data. Accordingly, clinical and measurement data was imported into I2B2 and cross-mapped to industry standard concepts for names and values including those derived from BRIDG, CDISC SDTM, DICOM Structured Reporting models and using NCI Thesaurus, SNOMED CT and LOINC controlled terminology. A subset of the standardized data was then exported from I2B2 to CSV and thence converted to DICOM SR according to the the DICOM Breast Imaging Report template [1] , which supports description of patient characteristics, histopathology, receptor status and clinical findings including measurements. The purpose was not to advocate DICOM SR as an appropriate format for interchange or storage of such information for query purposes, but rather to demonstrate that use of standard concepts harmonized across multiple collections could be transformed into an existing standard report representation. The DICOM SR can be stored and used together with the images in repositories such as TCIA and in image viewers that support rendering of DICOM SR content. During the project, various deficiencies in the DICOM Breast Imaging Report template were identified with respect to describing breast MR studies, laterality of findings versus procedures, more recently developed receptor types, and patient characteristics and status. These were addressed via DICOM CP 1838, finalized in Jan 2019, and this subset reflects those changes. DICOM Breast Imaging Report Templates available from: http://dicom.nema.org/medical/dicom/current/output/chtml/part16/sect_BreastImagingReportTemplates.html
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset of breast cancer patients was obtained from the 2017 November update of the SEER Program of the NCI, which provides information on population-based cancer statistics. The dataset involved female patients with infiltrating duct and lobular carcinoma breast cancer (SEER primary cites recode NOS histology codes 8522/3) diagnosed in 2006-2010. Patients with unknown tumour size, examined regional LNs, positive regional LNs, and patients whose survival months were less than 1 month were excluded; thus, 4024 patients were ultimately included.