Facebook
TwitterAttribution-ShareAlike 4.0 (CC BY-SA 4.0)https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
Breast Cancer Wisconsin Diagnostic Dataset
Following description was retrieved from breast cancer dataset on UCI machine learning repository. Features are computed from a digitized image of a fine needle aspirate (FNA) of a breast mass. They describe characteristics of the cell nuclei present in the image. A few of the images can be found at here. Separating plane described above was obtained using Multisurface Method-Tree (MSM-T), a classification method which uses linear… See the full description on the dataset page: https://huggingface.co/datasets/scikit-learn/breast-cancer-wisconsin.
Facebook
TwitterAttribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset of breast cancer patients was obtained from the 2017 November update of the SEER Program of the NCI, which provides information on population-based cancer statistics. The dataset involved female patients with infiltrating duct and lobular carcinoma breast cancer (SEER primary cites recode NOS histology codes 8522/3) diagnosed in 2006-2010. Patients with unknown tumour size, examined regional LNs, positive regional LNs, and patients whose survival months were less than 1 month were excluded; thus, 4024 patients were ultimately included.
Facebook
TwitterMIT Licensehttps://opensource.org/licenses/MIT
License information was derived automatically
This dataset was created by Figo
Released under MIT
Facebook
Twitterhttps://cubig.ai/store/terms-of-servicehttps://cubig.ai/store/terms-of-service
1) Data Introduction • The Breast Cancer Wisconsin (Diagnostic) data focuses on distinguishing between malignant (cancerous) and benign (non-cancerous) breast tumors. This dataset is crucial for developing machine learning models to aid in the early detection and classification of breast cancer, thereby potentially saving lives through timely intervention.
2) Data Utilization (1) Breast cancer data has characteristics that: • The dataset contains various features extracted from digitized images of fine needle aspirate (FNA) of breast masses, allowing for detailed analysis and classification of tumors. (2) Breast cancer data can be used to: • Healthcare and Medical Research: Useful for developing diagnostic tools and models to accurately classify breast tumors, aiding healthcare providers in making informed decisions. • Machine Learning and AI Development: Assists in creating and fine-tuning machine learning algorithms to improve predictive accuracy in medical diagnostics.
Facebook
TwitterAttribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Worldwide, breast cancer is the most common type of cancer in women and the second highest in terms of mortality rates.Diagnosis of breast cancer is performed when an abnormal lump is found (from self-examination or x-ray) or a tiny speck of calcium is seen (on an x-ray). After a suspicious lump is found, the doctor will conduct a diagnosis to determine whether it is cancerous and, if so, whether it has spread to other parts of the body. This breast cancer dataset was obtained from the University of Wisconsin Hospitals, Madison from Dr. William H. Wolberg.
Facebook
TwitterAttribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset contains images of mammograms and can be used for research and education purposes only. The dataset contains DCM images, TIFF images, a Radiology report, a Segmented mask, and pixel level annotation on abnormal regions and csv file that contains other metadata.
Facebook
TwitterApache License, v2.0https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
BIT/breast-cancer-dataset dataset hosted on Hugging Face and contributed by the HF Datasets community
Facebook
TwitterDataset Card for "breast-cancer"
Dataset was taken from the MedSAM project and used in this notebook which fine-tunes Meta's SAM model on the dataset. More Information needed
Facebook
TwitterAttribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset is taken from the UCI Machine Learning Repository (Link: https://data.world/health/breast-cancer-wisconsin) by the Donor: Nick Street
The main idea and inspiration behind the upload was to provide datasets for Machine Learning as practice and reference for my peers at college. The main purpose is to analyze data and experiment with different machine learning ideas and techniques for this binary classification task. As such, this dataset is a very useful resource to practice on.
Breast cancer is when breast cells mutate and become cancerous cells that multiply and form tumors. It accounts for 25% of all cancer cases and affected over 2.1 Million people in 2015 alone. Breast cancer typically affects women and people assigned female at birth (AFAB) age 50 and older, but it can also affect men and people assigned male at birth (AMAB), as well as younger women. Healthcare providers may treat breast cancer with surgery to remove tumors or treatment to kill cancerous cells.
Features are computed from a digitized image of a fine needle aspirate (FNA) of a breast mass. They describe characteristics of the cell nuclei present in the image. A few of the images can be found at http://www.cs.wisc.edu/~street/images/
The task: To classify whether the tumor is benign (B) or malignant (M).
Relevant information
Features are computed from a digitized image of a fine needle
aspirate (FNA) of a breast mass. They describe
characteristics of the cell nuclei present in the image.
A few of the images can be found at
http://www.cs.wisc.edu/~street/images/
Separating plane described above was obtained using
Multisurface Method-Tree (MSM-T) [K. P. Bennett, "Decision Tree
Construction Via Linear Programming." Proceedings of the 4th
Midwest Artificial Intelligence and Cognitive Science Society,
pp. 97-101, 1992], a classification method which uses linear
programming to construct a decision tree. Relevant features
were selected using an exhaustive search in the space of 1-4
features and 1-3 separating planes.
The actual linear program used to obtain the separating plane
in the 3-dimensional space is that described in:
[K. P. Bennett and O. L. Mangasarian: "Robust Linear
Programming Discrimination of Two Linearly Inseparable Sets",
Optimization Methods and Software 1, 1992, 23-34].
This database is also available through the UW CS ftp server:
ftp ftp.cs.wisc.edu
cd math-prog/cpo-dataset/machine-learn/WDBC/
Number of instances: 569
Number of attributes: 32 (ID, diagnosis, 30 real-valued input features)
Original Creators:
Dr. William H. Wolberg, General Surgery Dept., University of
Wisconsin, Clinical Sciences Center, Madison, WI 53792
wolberg@eagle.surgery.wisc.edu
W. Nick Street, Computer Sciences Dept., University of
Wisconsin, 1210 West Dayton St., Madison, WI 53706
street@cs.wisc.edu 608-262-6619
Olvi L. Mangasarian, Computer Sciences Dept., University of
Wisconsin, 1210 West Dayton St., Madison, WI 53706
olvi@cs.wisc.edu
Donor: Nick Street
Date: November 1995
Past Usage:
first usage:
W.N. Street, W.H. Wolberg and O.L. Mangasarian
Nuclear feature extraction for breast tumor diagnosis.
IS&T/SPIE 1993 International Symposium on Electronic Imaging: Science
and Technology, volume 1905, pages 861-870, San Jose, CA, 1993.
OR literature:
O.L. Mangasarian, W.N. Street and W.H. Wolberg.
Breast cancer diagnosis and prognosis via linear programming.
Operations Research, 43(4), pages 570-577, July-August 1995.
Medical literature:
W.H. Wolberg, W.N. Street, and O.L. Mangasarian.
Machine learning techniques to diagnose breast cancer from
fine-needle aspirates.
Cancer Letters 77 (1994) 163-171.
W.H. Wolberg, W.N. Street, and O.L. Mangasarian.
Image analysis and machine learning applied to breast cancer
diagnosis and prognosis.
Analytical and Quantitative Cytology and Histology, Vol. 17
No. 2, pages 77-87, April 1995.
W.H. Wolberg, W.N. Street, D.M. Heisey, and O.L. Mangasarian.
Computerized breast cancer diagnosis and prognosis from fine
needle aspirates.
Archives of Surgery 1995;130:511-516.
W.H. Wolberg, W.N. Street, D.M. Heisey, and O.L. Mangasarian.
Computer-derived nuclear features distinguish malignant from
benign breast cytology.
Human Pathology, 26:792--796, 1995.
See also: http://www.cs.wisc.edu/~olvi/uwmp/mpml.html http://www.cs.wisc.edu/~olvi/uwmp/cancer.html
Facebook
TwitterAttribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The AISSLab Breast Cancer Dataset is a collection of mammogram images by experts from the Ma'amon's Diagnostic Centre Mammogram Images for Breast Cancer (MDCMI-BC) in Yemen. It is designed to support advancements in breast cancer research and computer-aided diagnosis (CAD) systems. To facilitate research in breast cancer detection, focusing on harmonizing AI with diverse imaging data. This dataset emphasizes improving diagnostic accuracy and is available for academic and clinical research applications.
If you are using this dataset for research purpose kindly cite the following papers:
[1] A. M. Al-Hejri, R. M. Al-Tam, M. Fazea, A. H. Sable, S. Lee, and M. A. Al-antari, “ETECADx: Ensemble Self-Attention Transformer Encoder for Breast Cancer Diagnosis Using Full-Field Digital X-ray Breast Images,” Diagnostics, vol. 13, no. 1, p. 89, Dec. 2022, doi: 10.3390/diagnostics13010089.
[2] R. M. Al-Tam, A. M. Al-Hejri, S. S. Alshamrani, M. A. Al-antari, and S. M. Narangale, “Multimodal breast cancer hybrid explainable computer-aided diagnosis using medical mammograms and ultrasound Images,” Biocybern. Biomed. Eng., vol. 44, no. 3, pp. 731–758, Jul. 2024, doi: 10.1016/j.bbe.2024.08.007.
Facebook
TwitterAttribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Breast Cancer Dataset
Facebook
TwitterPICO breast cancer dataset
This dataset has been extracted from PICO-Corpus. The corpus consists of 1,011 abstracts of breast cancer randomized controlled trials extracted from PubMed. The PICO breast cancer dataset contains a total of 26 entities, compared to the usual 4 found in PICO corpora. Specifically, the following image extracted by the dataset's authors shows the hierarchy of the entities. The preprocessed dataset, ready to serve as inputs for MLMs such as BERT-like… See the full description on the dataset page: https://huggingface.co/datasets/cuevascarlos/PICO-breast-cancer.
Facebook
TwitterCC0 1.0 Universal Public Domain Dedicationhttps://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Breast cancer dataset cited in 'Cross-Clustering: A Partial Clustering Algorithm with Automatic Estimation of the Number of Clusters'
Facebook
TwitterAttribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset consists of 1 .xlsx file, 2 .png files, 1 .json file and 1 .zip file:annotation_details.xlsx: The distribution of annotations in the previously mentioned six classes (mitosis, apoptosis, tumor nuclei, non-tumor nuclei, tubule, and non-tubule) is presented in a Excel spreadsheet.original.png: The input image.annotated.png: An example from the dataset. In the annotated image, blue circles indicate the tumor nuclei, pink circles show non-tumor nuclei such as blood cells, stroma nuclei, and lymphocytes; orange and green circles are mitosis and apoptosis, respectively; light blue circles are true lumen for tubules, and yellow circles represent white regions (non-lumen) such as fat, blood vessel, and broken tissues.data.json: The annotations for the BreCaHAD dataset are provided in JSON (JavaScript Object Notation) format. In the given example, the JSON file (ground truth) contains two mitosis and only one tumor nuclei annotations. Here, x and y are the coordinates of the centroid of the annotated object, and the values are between 0, 1.BreCaHAD.zip: An archive file containing dataset. Three folders are included: images (original images), groundTruth (json files), and groundTruth_display (groundTruth applied on original images)
Facebook
TwitterSource:
Copied from the original dataset
Creators:
Dr. William H. Wolberg, General Surgery Dept. University of Wisconsin, Clinical Sciences Center Madison, WI 53792 wolberg '@' eagle.surgery.wisc.edu
W. Nick Street, Computer Sciences Dept. University of Wisconsin, 1210 West Dayton St., Madison, WI 53706 street '@' cs.wisc.edu 608-262-6619
Olvi L. Mangasarian, Computer Sciences Dept. University of Wisconsin, 1210 West Dayton St., Madison, WI 53706 olvi '@' cs.wisc.edu… See the full description on the dataset page: https://huggingface.co/datasets/wwydmanski/wisconsin-breast-cancer.
Facebook
TwitterAttribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
categorizing
Facebook
TwitterThe resources for this dataset can be found at https://www.openml.org/d/13 and https://archive.ics.uci.edu/ml/datasets/Breast+Cancer+Wisconsin+%28Diagnostic%29
This data set includes 201 instances of one class and 85 instances of another class. The instances are described by 9 attributes, some of which are linear and some are nominal.
Number of Instances: 286
Number of Attributes: 9 + the class attribute
Attribute Information:
Class: no-recurrence-events, recurrence-events age: 10-19, 20-29, 30-39, 40-49, 50-59, 60-69, 70-79, 80-89, 90-99. menopause: lt40, ge40, premeno. tumor-size: 0-4, 5-9, 10-14, 15-19, 20-24, 25-29, 30-34, 35-39, 40-44, 45-49, 50-54, 55-59. inv-nodes: 0-2, 3-5, 6-8, 9-11, 12-14, 15-17, 18-20, 21-23, 24-26, 27-29, 30-32, 33-35, 36-39. node-caps: yes, no. deg-malig: 1, 2, 3. breast: left, right. breast-quad: left-up, left-low, right-up, right-low, central. irradiat: yes, no.
Missing Attribute Values: (denoted by “?”) Attribute #: Number of instances with missing values: 6. 8 9. 1.
Class Distribution:
no-recurrence-events: 201 instances recurrence-events: 85 instances
Original data https://archive.ics.uci.edu/ml/datasets/Breast+Cancer+Wisconsin+%28Diagnostic%29
With the attributes described above, can you predict if a patient has recurrence event ?
Facebook
TwitterThe dataset is used for biomedical data and contains 30 input features computed from digitalized images of needle aspirates, with binary labels malignant/benign.
Facebook
Twitterhttps://www.cancerimagingarchive.net/data-usage-policies-and-restrictions/https://www.cancerimagingarchive.net/data-usage-policies-and-restrictions/
Ductal carcinoma in situ with microinvasion (DCISM) is a challenging subtype of breast cancer with controversial invasiveness and prognosis. Accurate diagnosis of DCISM from ductal carcinoma in situ (DCIS) is crucial for optimal treatment and improved clinical outcomes. This dataset provides histopathology images and paired CK5/6 immunohistochemical staining images from patients with DCISM, as well as multiphoton microscopy images of suspicious regions. It offers multi-modal imaging data from various perspectives for analysis and diagnosis of microinvasive breast cancer by other researchers in the field.
The dataset contains data from 12 breast cancer patients, including 10 cases of ductal carcinoma in situ with microinvasion (DCISM), 1 case of ductal carcinoma in situ (DCIS), and 1 case of invasive breast cancer.
The magnification of the glass slide images is 40x. The pathology slide scanner used was created by the Sunny Optical Technology (group) Co., Ltd., and the pixel aspect ratio of the images is 1. The dataset also includes multiphoton microscopy imaging of suspicious microinvasion areas. The multiphoton imaging system was manufactured by Zeiss, and it also has a pixel aspect ratio of 1.
Our database was specifically collected for the use of imaging methods in diagnosing DICSM. The suffixes in each case number indicate the patient's condition - "DCISM" for ductal carcinoma in situ with microinvasion, "DCIS" for ductal carcinoma in situ, and "IDC" for invasive ductal carcinoma. Apart from these labels, we have not collected any additional clinical information for these cases.
Facebook
TwitterAttribution-ShareAlike 4.0 (CC BY-SA 4.0)https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
Breast Cancer Wisconsin Diagnostic Dataset
Following description was retrieved from breast cancer dataset on UCI machine learning repository. Features are computed from a digitized image of a fine needle aspirate (FNA) of a breast mass. They describe characteristics of the cell nuclei present in the image. A few of the images can be found at here. Separating plane described above was obtained using Multisurface Method-Tree (MSM-T), a classification method which uses linear… See the full description on the dataset page: https://huggingface.co/datasets/scikit-learn/breast-cancer-wisconsin.