https://cubig.ai/store/terms-of-service
1) Data Introduction
• The Breast Cancer Wisconsin (Diagnostic) data focuses on distinguishing between malignant (cancerous) and benign (non-cancerous) breast tumors. This dataset is crucial for developing machine learning models to aid in the early detection and classification of breast cancer, thereby potentially saving lives through timely intervention.
2) Data Utilization
(1) Breast cancer data has characteristics that:
• The dataset contains various features extracted from digitized images of fine needle aspirate (FNA) of breast masses, allowing for detailed analysis and classification of tumors.
(2) Breast cancer data can be used to:
• Healthcare and Medical Research: Useful for developing diagnostic tools and models to accurately classify breast tumors, aiding healthcare providers in making informed decisions.
• Machine Learning and AI Development: Assists in creating and fine-tuning machine learning algorithms to improve predictive accuracy in medical diagnostics.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset of breast cancer patients was obtained from the November 2017 update of the SEER Program of the NCI, which provides population-based cancer statistics. It covers female patients with infiltrating duct and lobular carcinoma breast cancer (SEER primary sites recode NOS histology codes 8522/3) diagnosed in 2006-2010. Patients with unknown tumour size, unknown examined regional lymph nodes (LNs), unknown positive regional LNs, or survival of less than 1 month were excluded; 4,024 patients were ultimately included.
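The exclusion criteria above can be sketched as a simple record filter. This is a minimal illustration, not the authors' actual pipeline: the field names (`tumor_size_mm`, `regional_nodes_examined`, `regional_nodes_positive`, `survival_months`) are hypothetical, the sample records are made up, and treating a missing survival value as an exclusion is an assumption.

```python
from typing import Optional, TypedDict


class PatientRecord(TypedDict):
    # Hypothetical field names; a real SEER export uses its own column names.
    tumor_size_mm: Optional[float]
    regional_nodes_examined: Optional[int]
    regional_nodes_positive: Optional[int]
    survival_months: Optional[int]


def is_eligible(record: PatientRecord) -> bool:
    """Return True if the record passes the exclusion criteria described above."""
    required = (
        record["tumor_size_mm"],
        record["regional_nodes_examined"],
        record["regional_nodes_positive"],
    )
    # Exclude records with any unknown tumour size or lymph-node count.
    if any(value is None for value in required):
        return False
    # Exclude survival below 1 month (a missing value is also excluded: assumption).
    months = record["survival_months"]
    return months is not None and months >= 1


# Made-up records, for illustration only.
records = [
    {"tumor_size_mm": 22.0, "regional_nodes_examined": 14,
     "regional_nodes_positive": 2, "survival_months": 60},
    {"tumor_size_mm": None, "regional_nodes_examined": 10,
     "regional_nodes_positive": 1, "survival_months": 48},
    {"tumor_size_mm": 15.0, "regional_nodes_examined": 8,
     "regional_nodes_positive": 0, "survival_months": 0},
]
eligible = [r for r in records if is_eligible(r)]
print(len(eligible))
```

Only the first made-up record survives the filter: the second has an unknown tumour size and the third has survival below 1 month.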
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The dataset used in this study consists of 7,632 mammogram images in two classes, 2,520 benign and 5,112 malignant, from Huang and Lin (2020). The underlying mammograms come from the INbreast database, which was originally collected at the Centro Hospitalar de S. Joao (CHSJ) Breast Center in Porto. The database contains data collected from August 2008 to July 2010 and includes 115 cases with a total of 410 images (Moreira et al., 2012). Of these, 90 cases concern women with abnormalities in both breasts. Four types of breast disease are recorded in the database: masses, calcifications, asymmetries and distortions. The mammograms are recorded from two standard views: craniocaudal (CC) and mediolateral oblique (MLO). In addition, breast density is classified into four categories based on the BI-RADS standard: Fully Fat (Density 1), Scattered Fibroglandular Density (Density 2), Heterogeneously Dense (Density 3) and Extremely Dense (Density 4). The images are stored in DICOM format at one of two resolutions: 3328 x 4084 pixels or 2560 x 3328 pixels. From the INbreast database, 106 mammograms depicting breast masses were selected. Data augmentation was then applied to enlarge the dataset for model training, increasing the total number of mammography images to 7,632.
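The counts quoted above can be cross-checked with a few lines of arithmetic. The sketch below only uses numbers from the description; the variable names are illustrative. That 7,632 divides evenly by 106 is consistent with each selected mammogram yielding the same number of images, but the description does not state the augmentation scheme, so that reading is an assumption.

```python
# Figures taken from the dataset description above.
benign_images = 2_520
malignant_images = 5_112
total_images = 7_632
selected_mammograms = 106

# The two classes should add up to the stated total.
class_sum = benign_images + malignant_images

# Images per selected mammogram, if augmentation were uniform (assumption).
images_per_source, remainder = divmod(total_images, selected_mammograms)

print(class_sum == total_images)      # True
print(images_per_source, remainder)   # 72 0
```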
Dataset Card for "breast-cancer"
The dataset was taken from the MedSAM project and is used in a notebook that fine-tunes Meta's SAM model on it. More information needed.
Apache License, v2.0 https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
BIT/breast-cancer-dataset dataset hosted on Hugging Face and contributed by the HF Datasets community
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Worldwide, breast cancer is the most common type of cancer in women and the second highest in terms of mortality rates. Breast cancer is diagnosed when an abnormal lump is found (from self-examination or x-ray) or a tiny speck of calcium is seen (on an x-ray). After a suspicious lump is found, the doctor will conduct a diagnosis to determine whether it is cancerous and, if so, whether it has spread to other parts of the body. This breast cancer dataset was obtained from Dr. William H. Wolberg of the University of Wisconsin Hospitals, Madison.
https://www.cancerimagingarchive.net/data-usage-policies-and-restrictions/
Ductal carcinoma in situ with microinvasion (DCISM) is a challenging subtype of breast cancer with controversial invasiveness and prognosis. Accurate diagnosis of DCISM from ductal carcinoma in situ (DCIS) is crucial for optimal treatment and improved clinical outcomes. This dataset provides histopathology images and paired CK5/6 immunohistochemical staining images from patients with DCISM, as well as multiphoton microscopy images of suspicious regions. It offers multi-modal imaging data from various perspectives for analysis and diagnosis of microinvasive breast cancer by other researchers in the field.
The dataset contains data from 12 breast cancer patients, including 10 cases of ductal carcinoma in situ with microinvasion (DCISM), 1 case of ductal carcinoma in situ (DCIS), and 1 case of invasive breast cancer.
The magnification of the glass slide images is 40x. The pathology slide scanner was manufactured by Sunny Optical Technology (Group) Co., Ltd., and the pixel aspect ratio of the images is 1. The dataset also includes multiphoton microscopy imaging of suspicious microinvasion areas. The multiphoton imaging system was manufactured by Zeiss, and its images also have a pixel aspect ratio of 1.
Our database was collected specifically for the use of imaging methods in diagnosing DCISM. The suffix of each case number indicates the patient's condition: "DCISM" for ductal carcinoma in situ with microinvasion, "DCIS" for ductal carcinoma in situ, and "IDC" for invasive ductal carcinoma. Apart from these labels, we have not collected any additional clinical information for these cases.
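Since the case-number suffix is the only label available, a small lookup is enough to recover the condition. The sketch below is illustrative: the three suffixes and their meanings come from the description above, but the exact case-number format (here something like `Case01-DCISM`) is a hypothetical assumption, as is the function name.

```python
# Suffix meanings as given in the dataset description.
SUFFIX_TO_CONDITION = {
    "DCISM": "ductal carcinoma in situ with microinvasion",
    "DCIS": "ductal carcinoma in situ",
    "IDC": "invasive ductal carcinoma",
}


def condition_from_case_number(case_number: str) -> str:
    """Map a case number to its condition via its suffix.

    Assumes the label is the trailing part of the case number; the real
    naming scheme of the files may differ.
    """
    normalized = case_number.strip().upper()
    for suffix, condition in SUFFIX_TO_CONDITION.items():
        if normalized.endswith(suffix):
            return condition
    raise ValueError(f"No known condition suffix in case number: {case_number!r}")


# Hypothetical case numbers, for illustration only.
print(condition_from_case_number("Case01-DCISM"))
print(condition_from_case_number("Case11-DCIS"))
print(condition_from_case_number("Case12-IDC"))
```

Matching on the end of the string keeps "DCIS" and "DCISM" apart, because a case number ending in "DCISM" does not end in "DCIS".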
This dataset was created to be used in machine learning tasks for both binary and multiclass classification problems. It consists of 7,909 microscopic images of breast tumor tissue, captured using different magnification factors (40X, 100X, 200X, and 400X). The images in this dataset were resized to 224x224 pixels and organized according to binary and multiclass classification tasks.
Interdisciplinary s...
Identifier
DOI: https://doi.org/10.17632/jxwvdwhpc2.1
PID: https://nbn-resolving.org/urn:nbn:nl:ui:13-i2-oi2g
Metadata Access: https://easy.dans.knaw.nl/oai?verb=GetRecord&metadataPrefix=oai_datacite&identifier=oai:easy.dans.knaw.nl:easy-dataset:301675
Provenance
Creator: Pereira, M
Publisher: Data Archiving and Networked Services (DANS)
Contributor: Mayke Pereira
Publication Year: 2023
Rights: info:eu-repo/semantics/openAccess; License: http://creativecommons.org/licenses/by/4.0
OpenAccess: true
Representation
Resource Type: Dataset
Discipline: Other
Pereira, Mayke (2023), “BreakHis - Breast Cancer Histopathological Database”, Mendeley Data, V1, doi: 10.17632/jxwvdwhpc2.1
https://choosealicense.com/licenses/ecl-2.0/
This dataset, derived from the Wisconsin Breast Cancer (Diagnostic), is a comprehensive resource for developing and evaluating machine learning models focused on the binary classification of breast tumors as either benign (B) or malignant (M). The data consists of features computed from digitized images of fine needle aspirates (FNA) of breast masses, offering a rich set of quantitative metrics for computational pathology and diagnostic research. The dataset is a critical tool for healthcare… See the full description on the dataset page: https://huggingface.co/datasets/mnemoraorg/wisconsin-breast-cancer-diagnostic.
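As a rough illustration of how the B/M labels and numeric features described above might be prepared for a binary classifier, the sketch below parses a small CSV into feature vectors and 0/1 labels using only the standard library. The column names (`id`, `diagnosis`, `radius_mean`, `texture_mean`) and the three sample rows are made up for the example; the actual file layout on the dataset page may differ.

```python
import csv
import io

# Made-up rows for illustration; these are not real measurements.
SAMPLE_CSV = """id,diagnosis,radius_mean,texture_mean
1001,M,18.2,10.5
1002,B,12.1,17.9
1003,B,11.4,15.2
"""

LABELS = {"B": 0, "M": 1}  # benign -> 0, malignant -> 1


def load_rows(text: str):
    """Split CSV text into numeric feature vectors and binary labels."""
    features, labels = [], []
    for row in csv.DictReader(io.StringIO(text)):
        diagnosis = row["diagnosis"].strip()
        if diagnosis not in LABELS:
            raise ValueError(f"Unexpected diagnosis label: {diagnosis!r}")
        labels.append(LABELS[diagnosis])
        # Every column other than the identifier and the label is a feature.
        features.append(
            [float(value) for key, value in row.items() if key not in ("id", "diagnosis")]
        )
    return features, labels


features, labels = load_rows(SAMPLE_CSV)
print(labels)
print(features[0])
```

The resulting lists can be passed to whichever modelling library is in use; keeping the identifier column out of the features avoids leaking a meaningless number into the model.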
This dataset was created by maede norouzi
This dataset consists of 361 whole slide images (WSI) - 296 malignant from women with invasive breast cancer (HER2 neg) and 65 benign. The tumours have been classified with four SNOMED-CT categories based on morphology: invasive duct carcinoma, invasive lobular carcinoma, in situ carcinoma, and others. 4144 separate annotations have been made to segment different tissue structures connected to ontologies.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Breast Cancer Dataset
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
categorizing
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Breast cancer is the leading cause of cancer death among women worldwide. The vast majority of breast cancers are carcinomas that originate from cells lining the milk-forming ducts of the mammary gland. The molecular subtypes of breast cancer, which are based on the presence or absence of hormone receptors (estrogen and progesterone subtypes) and human epidermal growth factor receptor-2 (HER2), include:
* Luminal A subtype: Hormone receptor positive (progesterone and estrogen) and HER2 (ERBB2) negative
* Luminal B subtype: Hormone receptor positive (progesterone and estrogen) and HER2 (ERBB2) positive
* HER2 positive: Hormone receptor negative (progesterone and estrogen) and HER2 (ERBB2) positive
* Basal-like or triple-negative (TNBCs): Hormone receptor negative (progesterone and estrogen) and HER2 (ERBB2) negative
Hormone receptor positive breast cancers are largely driven by the estrogen/ER pathway. In HER2 positive breast tumors, HER2 activates the PI3K/AKT and the RAS/RAF/MAPK pathways, stimulating cell growth, survival and differentiation. In patients with TNBC, deregulation of various signaling pathways (Notch and Wnt/beta-catenin) and of the EGFR protein has been confirmed. Only 8% of all breast cancers are hereditary, a phenomenon linked to genetic changes in BRCA1 or BRCA2. Somatic mutations in only three genes (TP53, PIK3CA and GATA3) occurred at >10% incidence across all breast cancers. Phosphorylation sites were added based on information from PhosphoSitePlus (R), www.phosphosite.org.
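The four subtype definitions above amount to a two-by-two lookup on receptor status, which can be sketched as follows. This is an illustration of the definitions as stated here, not a clinical classifier: collapsing estrogen and progesterone receptor status into a single `hormone_receptor_positive` flag is a simplifying assumption, and the function name is hypothetical.

```python
def molecular_subtype(hormone_receptor_positive: bool, her2_positive: bool) -> str:
    """Return the subtype name for a hormone receptor / HER2 status pair.

    Follows the four definitions given in the text above. The single
    hormone-receptor flag is a simplification (assumption).
    """
    if hormone_receptor_positive:
        return "Luminal B" if her2_positive else "Luminal A"
    return "HER2 positive" if her2_positive else "Basal-like / triple-negative"


# Enumerate all four receptor-status combinations.
for hr in (True, False):
    for her2 in (True, False):
        print(f"HR+={hr!s:5} HER2+={her2!s:5} -> {molecular_subtype(hr, her2)}")
```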
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Breast Cancer Wisconsin Dataset: African Physiognomy Adjusted
Dataset Description
This dataset addresses representation bias in medical AI by providing an African physiognomy-adjusted version of the classic Wisconsin Breast Cancer Dataset. The adjustment methodology systematically modifies cellular morphology features to better reflect documented physiological differences in African populations.
Dataset Summary
Original Dataset: Wisconsin Breast Cancer Dataset… See the full description on the dataset page: https://huggingface.co/datasets/electricsheepafrica/breast-cancer-africa-adjusted-dataset.
Source:
Copied from the original dataset
Creators:
Dr. William H. Wolberg, General Surgery Dept. University of Wisconsin, Clinical Sciences Center Madison, WI 53792 wolberg '@' eagle.surgery.wisc.edu
W. Nick Street, Computer Sciences Dept. University of Wisconsin, 1210 West Dayton St., Madison, WI 53706 street '@' cs.wisc.edu 608-262-6619
Olvi L. Mangasarian, Computer Sciences Dept. University of Wisconsin, 1210 West Dayton St., Madison, WI 53706 olvi '@' cs.wisc.edu… See the full description on the dataset page: https://huggingface.co/datasets/wwydmanski/wisconsin-breast-cancer.
https://www.cancerimagingarchive.net/data-usage-policies-and-restrictions/
The Cancer Genome Atlas Breast Invasive Carcinoma (TCGA-BRCA) data collection is part of a larger effort to build a research community focused on connecting cancer phenotypes to genotypes by providing clinical images matched to subjects from The Cancer Genome Atlas (TCGA). Clinical, genetic, and pathological data resides in the Genomic Data Commons (GDC) Data Portal while the radiological data is stored on The Cancer Imaging Archive (TCIA).
Matched TCGA patient identifiers allow researchers to explore the TCGA/TCIA databases for correlations between tissue genotype, radiological phenotype and patient outcomes. Tissues for TCGA were collected from many sites all over the world in order to reach their accrual targets, usually around 500 specimens per cancer type. For this reason the image data sets are also extremely heterogeneous in terms of scanner modalities, manufacturers and acquisition protocols. In most cases the images were acquired as part of routine care and not as part of a controlled research study or clinical trial.
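Matching on patient identifiers can be sketched as a set intersection over the patient-level part of each TCGA barcode (the first three hyphen-separated fields: project, tissue source site, participant). The identifiers below are made-up placeholders, and the variable names are illustrative; how the identifiers are actually exported from TCIA or the GDC portal is not described here and is left as an assumption.

```python
def patient_id(barcode: str) -> str:
    """Return the patient-level prefix (project-site-participant) of a TCGA barcode."""
    return "-".join(barcode.strip().split("-")[:3])


# Made-up identifiers, for illustration only.
imaging_subjects = ["TCGA-AA-0001", "TCGA-AA-0002", "TCGA-CC-0004"]
genomic_samples = ["TCGA-AA-0001-01A", "TCGA-BB-0003-01A", "TCGA-CC-0004-11A"]

# Patients that have both imaging data and genomic data.
matched = sorted(
    {patient_id(b) for b in imaging_subjects} & {patient_id(b) for b in genomic_samples}
)
print(matched)
```

Reducing sample-level barcodes to the patient prefix before intersecting avoids missing a match just because one source lists samples and the other lists subjects.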
Imaging Source Site (ISS) Groups are being populated and governed by participants from institutions that have provided imaging data to the archive for a given cancer type. Modeled after TCGA analysis groups, ISS groups are given the opportunity to publish a marker paper for a given cancer type per the guidelines in the table above. This opportunity will generate increased participation in building these multi-institutional data sets as they become an open community resource. Learn more about the TCGA Breast Phenotype Research Group.
https://choosealicense.com/licenses/cc/
Data Source
https://www.kaggle.com/datasets/andrewmvd/breast-cancer-cell-segmentation
Dataset Card Authors
Mahadi Hassan
Dataset Card Contact
mahadise01@gmail.com
LinkedIn: https://www.linkedin.com/in/mahadise01
GitHub: https://github.com/Mahadih534