81 datasets found

Data from: breast-cancer-wisconsin
huggingface.co
Updated May 26, 2025
scikit-learn (2025). breast-cancer-wisconsin [Dataset]. https://huggingface.co/datasets/scikit-learn/breast-cancer-wisconsin
Dataset authored and provided by
scikit-learn
License
Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
Description
Breast Cancer Wisconsin Diagnostic Dataset

The following description was retrieved from the breast cancer dataset on the UCI Machine Learning Repository. Features are computed from a digitized image of a fine needle aspirate (FNA) of a breast mass. They describe characteristics of the cell nuclei present in the image. A few of the images can be found at http://www.cs.wisc.edu/~street/images/ and the separating plane described above was obtained using Multisurface Method-Tree (MSM-T), a classification method which uses linear… See the full description on the dataset page: https://huggingface.co/datasets/scikit-learn/breast-cancer-wisconsin.

Breast Cancer Dataset [Wisconsin Diagnostic UCI]

kaggle.com

zip

Updated Jan 22, 2024

Abhinav Mangalore (2024). Breast Cancer Dataset [Wisconsin Diagnostic UCI] [Dataset]. https://www.kaggle.com/datasets/abhinavmangalore/breast-cancer-dataset-wisconsin-diagnostic-uci

Available download formats: zip (49831 bytes)

Authors

Abhinav Mangalore

License

Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically

Area covered

Wisconsin

Description

This dataset is taken from the UCI Machine Learning Repository (link: https://data.world/health/breast-cancer-wisconsin) and was donated by Nick Street.

The main idea and inspiration behind the upload was to provide datasets for Machine Learning as practice and reference for my peers at college. The main purpose is to analyze data and experiment with different machine learning ideas and techniques for this binary classification task. As such, this dataset is a very useful resource to practice on.

Breast cancer occurs when breast cells mutate and become cancerous cells that multiply and form tumors. It accounts for 25% of all cancer cases and affected over 2.1 million people in 2015 alone. Breast cancer typically affects women and people assigned female at birth (AFAB) aged 50 and older, but it can also affect men and people assigned male at birth (AMAB), as well as younger women. Healthcare providers may treat breast cancer with surgery to remove tumors or with treatment to kill cancerous cells.

Features are computed from a digitized image of a fine needle aspirate (FNA) of a breast mass. They describe characteristics of the cell nuclei present in the image. A few of the images can be found at http://www.cs.wisc.edu/~street/images/

The task: To classify whether the tumor is benign (B) or malignant (M).

Relevant information

Features are computed from a digitized image of a fine needle
aspirate (FNA) of a breast mass. They describe
characteristics of the cell nuclei present in the image.
A few of the images can be found at
http://www.cs.wisc.edu/~street/images/

Separating plane described above was obtained using
Multisurface Method-Tree (MSM-T) [K. P. Bennett, "Decision Tree
Construction Via Linear Programming." Proceedings of the 4th
Midwest Artificial Intelligence and Cognitive Science Society,
pp. 97-101, 1992], a classification method which uses linear
programming to construct a decision tree. Relevant features
were selected using an exhaustive search in the space of 1-4
features and 1-3 separating planes.

The actual linear program used to obtain the separating plane
in the 3-dimensional space is that described in:
[K. P. Bennett and O. L. Mangasarian: "Robust Linear
Programming Discrimination of Two Linearly Inseparable Sets",
Optimization Methods and Software 1, 1992, 23-34].


This database is also available through the UW CS ftp server:

ftp ftp.cs.wisc.edu
cd math-prog/cpo-dataset/machine-learn/WDBC/

Number of instances: 569

Number of attributes: 32 (ID, diagnosis, 30 real-valued input features)
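
As a rough illustration of the 32-attribute layout, the sketch below builds one plausible column list: an ID, the diagnosis, and 30 features formed as ten nuclear measurements times three summary statistics (mean, standard error, and "worst" value, as the UCI documentation describes). The column names and their ordering are illustrative assumptions, not necessarily the headers of any particular file.

```python
# Ten per-nucleus measurements (names are illustrative assumptions).
BASE_FEATURES = [
    "radius", "texture", "perimeter", "area", "smoothness",
    "compactness", "concavity", "concave_points", "symmetry",
    "fractal_dimension",
]
# Three summary statistics computed per measurement for each image.
STATISTICS = ["mean", "se", "worst"]


def wdbc_columns():
    """Return the assumed 32-column layout: id, diagnosis, 30 features."""
    features = [f"{name}_{stat}" for stat in STATISTICS for name in BASE_FEATURES]
    return ["id", "diagnosis"] + features


columns = wdbc_columns()
print(len(columns))  # number of attributes
print(columns[:4])
```

A list like this can be passed as the header when reading a headerless copy of the data, after checking it against the file actually downloaded.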

Original Creators:

Dr. William H. Wolberg, General Surgery Dept., University of
Wisconsin, Clinical Sciences Center, Madison, WI 53792
wolberg@eagle.surgery.wisc.edu

W. Nick Street, Computer Sciences Dept., University of
Wisconsin, 1210 West Dayton St., Madison, WI 53706
street@cs.wisc.edu 608-262-6619

Olvi L. Mangasarian, Computer Sciences Dept., University of
Wisconsin, 1210 West Dayton St., Madison, WI 53706
olvi@cs.wisc.edu

Donor: Nick Street

Date: November 1995

Past Usage:

first usage:

W.N. Street, W.H. Wolberg and O.L. Mangasarian 
Nuclear feature extraction for breast tumor diagnosis.
IS&T/SPIE 1993 International Symposium on Electronic Imaging: Science
and Technology, volume 1905, pages 861-870, San Jose, CA, 1993.

OR literature:

O.L. Mangasarian, W.N. Street and W.H. Wolberg. 
Breast cancer diagnosis and prognosis via linear programming. 
Operations Research, 43(4), pages 570-577, July-August 1995.

Medical literature:

W.H. Wolberg, W.N. Street, and O.L. Mangasarian. 
Machine learning techniques to diagnose breast cancer from
fine-needle aspirates. 
Cancer Letters 77 (1994) 163-171.

W.H. Wolberg, W.N. Street, and O.L. Mangasarian. 
Image analysis and machine learning applied to breast cancer
diagnosis and prognosis. 
Analytical and Quantitative Cytology and Histology, Vol. 17
No. 2, pages 77-87, April 1995. 

W.H. Wolberg, W.N. Street, D.M. Heisey, and O.L. Mangasarian. 
Computerized breast cancer diagnosis and prognosis from fine
needle aspirates. 
Archives of Surgery 1995;130:511-516.

W.H. Wolberg, W.N. Street, D.M. Heisey, and O.L. Mangasarian. 
Computer-derived nuclear features distinguish malignant from
benign breast cytology. 
Human Pathology, 26:792--796, 1995.

Breast Cancer Diagnosis Dataset - Wisconsin State

kaggle.com

zip

Updated Mar 31, 2024

Saurabh Badole (2024). Breast Cancer Diagnosis Dataset - Wisconsin State [Dataset]. https://www.kaggle.com/datasets/saurabhbadole/breast-cancer-wisconsin-state

Available download formats: zip (5844 bytes)

Authors

Saurabh Badole

Area covered

Wisconsin

Description

Description:

Explore the field of breast cancer diagnosis with the insightful Wisconsin Breast Cancer dataset (Original). This dataset provides detailed attributes representing tumor characteristics observed in breast tissue samples. By analyzing these attributes, researchers and medical professionals can gain insights into tumor behavior and develop predictive models for cancer detection and prognosis.

Features
1. Sample code number: Unique identifier for each tissue sample.
2. Clump Thickness: Assessment of the thickness of tumor cell clusters (1 - 10).
3. Uniformity of Cell Size: Uniformity in the size of tumor cells (1 - 10).
4. Uniformity of Cell Shape: Uniformity in the shape of tumor cells (1 - 10).
5. Marginal Adhesion: Degree of adhesion of tumor cells to surrounding tissue (1 - 10).
6. Single Epithelial Cell Size: Size of individual tumor cells (1 - 10).
7. Bare Nuclei: Presence of nuclei without surrounding cytoplasm (1 - 10).
8. Bland Chromatin: Assessment of chromatin structure in tumor cells (1 - 10).
9. Normal Nucleoli: Presence of normal-looking nucleoli in tumor cells (1 - 10).
10. Mitoses: Frequency of mitotic cell divisions (1 - 10).
11. Class: Classification of tumor type (2 for benign, 4 for malignant).
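
The feature list above maps naturally onto a small parser. The following is a minimal sketch, assuming the file is comma-separated with the eleven fields in the order listed; the column identifiers and the two sample rows are made up for illustration and are not real patient records.

```python
import csv
import io

# Column identifiers are hypothetical names for the eleven listed fields.
COLUMNS = [
    "sample_code_number", "clump_thickness", "uniformity_of_cell_size",
    "uniformity_of_cell_shape", "marginal_adhesion",
    "single_epithelial_cell_size", "bare_nuclei", "bland_chromatin",
    "normal_nucleoli", "mitoses", "class",
]
CLASS_LABELS = {2: "benign", 4: "malignant"}

# Made-up rows in the assumed layout (not real data).
SAMPLE = "1000001,5,1,1,1,2,1,3,1,1,2\n1000002,8,10,10,8,7,10,9,7,1,4\n"


def parse_rows(text):
    """Parse comma-separated rows into dicts with a readable diagnosis label."""
    records = []
    for row in csv.reader(io.StringIO(text)):
        values = dict(zip(COLUMNS, (int(v) for v in row)))
        values["diagnosis"] = CLASS_LABELS[values["class"]]
        records.append(values)
    return records


records = parse_rows(SAMPLE)
for record in records:
    print(record["sample_code_number"], record["diagnosis"])
```

This sketch assumes every field is an integer; files that mark missing values with a placeholder would need that case handled before the conversion.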

Usage:

Cancer diagnosis: Develop machine learning models to classify tumors as benign or malignant based on their characteristics, aiding in early detection and treatment planning.
Feature importance analysis: Identify key attributes contributing to tumor malignancy and understand their biological significance.
Clinical decision support: Assist healthcare professionals in interpreting biopsy results and making informed decisions about patient care.

Acknowledgements:

The Breast Cancer Wisconsin dataset is sourced from tissue samples collected for diagnostic purposes, with attributes derived from microscopic examination. The dataset is anonymized and made available for research purposes, contributing to advancements in cancer diagnosis and treatment.

Breast Cancer Wisconsin dataset - Dataset - LDM
service.tib.eu
Updated Dec 2, 2024
(2024). Breast Cancer Wisconsin dataset - Dataset - LDM [Dataset]. https://service.tib.eu/ldmservice/dataset/breast-cancer-wisconsin-dataset
Description
The Breast Cancer Wisconsin dataset is a binary classification dataset (benign vs. malignant). It contains 699 samples, each described by 9 features, and is used for cancer diagnosis.
Data from: BREAST CANCER WISCONSIN DATA SET
kaggle.com
zip
Updated Aug 19, 2022
Roopa Calistus (2022). BREAST CANCER WISCONSIN DATA SET [Dataset]. https://www.kaggle.com/datasets/roopacalistus/breast-cancer-prediction
Available download formats: zip (49796 bytes)
Authors
Roopa Calistus
License
https://creativecommons.org/publicdomain/zero/1.0/
Description
BREAST CANCER WISCONSIN (DIAGNOSTIC) DATA SET

Predict whether the cancer is benign or malignant. It consists of features that are computed from a digitized image of a fine needle aspirate (FNA) of a breast mass. They describe characteristics of the cell nuclei present in the image.

Ten real-valued features are computed for each cell nucleus:
a) radius (mean of distances from center to points on the perimeter)
b) texture (standard deviation of gray-scale values)
c) perimeter
d) area
e) smoothness (local variation in radius lengths)
f) compactness (perimeter^2 / area - 1.0)
g) concavity (severity of concave portions of the contour)
h) concave points (number of concave portions of the contour)
i) symmetry
j) fractal dimension ("coastline approximation" - 1)
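
To make a few of these definitions concrete, here is a minimal sketch that computes radius, perimeter, area, and compactness for a closed contour given as ordered boundary points. It assumes the center is the mean of the boundary points and uses the shoelace formula for area; the original feature-extraction software may compute these quantities differently.

```python
import math


def shape_features(points):
    """Compute radius, perimeter, area and compactness for a closed contour.

    `points` is a list of (x, y) boundary vertices in traversal order.
    """
    n = len(points)
    # Center approximated as the mean of the boundary points (an assumption).
    cx = sum(x for x, _ in points) / n
    cy = sum(y for _, y in points) / n
    radius = sum(math.hypot(x - cx, y - cy) for x, y in points) / n

    perimeter = 0.0
    twice_area = 0.0
    for i in range(n):
        x1, y1 = points[i]
        x2, y2 = points[(i + 1) % n]  # wrap around to close the contour
        perimeter += math.hypot(x2 - x1, y2 - y1)
        twice_area += x1 * y2 - x2 * y1  # shoelace formula term
    area = abs(twice_area) / 2.0
    compactness = perimeter ** 2 / area - 1.0
    return {"radius": radius, "perimeter": perimeter,
            "area": area, "compactness": compactness}


# A near-circular contour of radius 10 sampled at 360 points.
circle = [(10 * math.cos(math.radians(a)), 10 * math.sin(math.radians(a)))
          for a in range(360)]
features = shape_features(circle)
print(features)
```

For a circle, perimeter^2 / area equals 4π, so compactness as defined above approaches 4π - 1; less regular contours give larger values.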
Breast Cancer Prognostics
kaggle.com
zip
Updated Dec 4, 2022
The Devastator (2022). Breast Cancer Prognostics [Dataset]. https://www.kaggle.com/datasets/thedevastator/improve-breast-cancer-prognostics-using-machine
Available download formats: zip (78356 bytes)
Authors
The Devastator
Description
Breast Cancer Prognostics

Study the Wisconsin Dataset

By UCI [source]

About this dataset

The Breast Cancer Wisconsin (Prognostic) dataset brings together data collected from hundreds of breast cancer cases, making it valuable for predictive prognosis. It includes 30 features such as radius, texture, area, compactness and concavity that were generated from a digitized fine needle aspirate (FNA) of the mass to characterize the cell nuclei present in each case. It also includes outcomes (recurrence or nonrecurrence) and time-to-recurrence information for the cases that relapse.

This dataset was created by Dr. William H. Wolberg of the University of Wisconsin Clinical Sciences Center together with W. Nick Street and Olvi L. Mangasarian of the university's Computer Sciences Dept., who are credited with developing decision tree construction methods based on linear programming to predict disease recurrence.

The data is freely available through the UW CS ftp server and on Kaggle, giving researchers access to breast cancer prognosis and diagnosis data derived from images of FNA samples taken from patients' masses, with features paired with outcomes for both recurrent and nonrecurrent cases. Papers such as 'An inductive learning approach to prognostic prediction' by W. N. Street et al. have used this database extensively, showing how artificial neural networks can be applied to predictive tasks. These tested ideas give anyone a starting point for understanding how a predictive model built on this dataset can predict breast cancer outcomes and improve patient care.

How to use the dataset

This dataset is intended to support breast cancer prognosis using machine learning algorithms. The data consists of a time series of patient symptoms and various medical parameters, such as tumor size and malignancy, that can be used by algorithms to predict diagnosis and prognosis outcomes. Here are some steps for using this dataset:

Pre-process and clean the data: Since the dataset contains incomplete or missing values across various parameters, it is important to clean and pre-process the data before applying any machine learning algorithm (MLA). This includes deciding which values need imputation, standardizing or normalizing numerical features, and encoding categorical variables.

Choose an appropriate MLA: Depending on your goal with this data set, for example reliable classification results or weighted predictions based on factors, there is a variety of MLAs to select from; examples include logistic regression classifiers, least squares support vector machines (SVM), neural networks, nonsmooth optimization algorithms like A-Optimality, or global optimization methods such as extracting M-of-N rule sets from trained neural nets. It would be wise to read up on each algorithm to determine which one best meets your needs before experimenting with the dataset.

Train the model using your selected MLA: Once you have identified the MLA that best fits your desired outcome, or decided to experiment with several approaches, return to the data and run experiments, training models and examining their outcomes with cross-validation methods such as k-fold splitting. Then test the trained models against validation sets drawn from subsets of the original data set hosted on Kaggle to measure performance across the parameter combinations relevant to predicting breast cancer diagnostic and/or prognostic outcomes. Any trends revealed by these experiments will help inform future model selection when implementing a predictive solution that fits specific user requirements, especially where a particular MLA is not tailored to the purpose, generally falling outside the scope of designing said model so guaranteeing ac...
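
The k-fold splitting mentioned in the last step can be sketched in a few lines. This is a minimal, library-free illustration; the function name `k_fold_indices`, the seed, and the sample count of 20 are hypothetical choices rather than anything taken from the dataset page.

```python
import random


def k_fold_indices(n_samples, k, seed=0):
    """Yield (train_indices, validation_indices) for each of k folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Spread samples as evenly as possible: the first n_samples % k folds get one extra.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        validation = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, validation
        start += size


# Hypothetical example: 20 samples split into 5 folds.
folds = list(k_fold_indices(n_samples=20, k=5))
for train, validation in folds:
    print(len(train), len(validation))
```

Each sample appears in exactly one validation fold, so averaging a metric over the folds uses every row for evaluation once. In practice a library routine with stratification by class is usually preferable for small medical datasets.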
Breast Cancer Wisconsin - Dataset - LDM
service.tib.eu
Updated Dec 3, 2024
(2024). Breast Cancer Wisconsin - Dataset - LDM [Dataset]. https://service.tib.eu/ldmservice/dataset/breast-cancer-wisconsin
Description
This is the dataset used in a study exploring white-box attacks and defenses on quantum neural networks under depolarization noise.
wisconsin-breast-cancer-diagnostic
huggingface.co
Updated Oct 26, 2025
Mnemora (2025). wisconsin-breast-cancer-diagnostic [Dataset]. https://huggingface.co/datasets/mnemoraorg/wisconsin-breast-cancer-diagnostic
Dataset authored and provided by
Mnemora
License
https://choosealicense.com/licenses/ecl-2.0/
Description
This dataset, derived from the Wisconsin Breast Cancer (Diagnostic) dataset, is a resource for developing and evaluating machine learning models focused on the binary classification of breast tumors as either benign (B) or malignant (M). The data consists of features computed from digitized images of fine needle aspirates (FNA) of breast masses, offering a set of quantitative metrics for computational pathology and diagnostic research. The dataset is a critical tool for healthcare… See the full description on the dataset page: https://huggingface.co/datasets/mnemoraorg/wisconsin-breast-cancer-diagnostic.
wisconsin-breast-cancer
huggingface.co
Updated Feb 1, 2001
Witold Wydmański (2001). wisconsin-breast-cancer [Dataset]. https://huggingface.co/datasets/wwydmanski/wisconsin-breast-cancer
Authors
Witold Wydmański
Area covered
Wisconsin
Description
Source:

Copied from the original dataset

Creators:

Dr. William H. Wolberg, General Surgery Dept. University of Wisconsin, Clinical Sciences Center Madison, WI 53792 wolberg '@' eagle.surgery.wisc.edu

W. Nick Street, Computer Sciences Dept. University of Wisconsin, 1210 West Dayton St., Madison, WI 53706 street '@' cs.wisc.edu 608-262-6619

Olvi L. Mangasarian, Computer Sciences Dept. University of Wisconsin, 1210 West Dayton St., Madison, WI 53706 olvi '@' cs.wisc.edu… See the full description on the dataset page: https://huggingface.co/datasets/wwydmanski/wisconsin-breast-cancer.
Breast Cancer Dataset
cubig.ai
zip
Updated May 2, 2025
CUBIG (2025). Breast Cancer Dataset [Dataset]. https://cubig.ai/store/products/178/breast-cancer-dataset
Available download formats: zip
Dataset authored and provided by
CUBIG
License
https://cubig.ai/store/terms-of-service
Measurement technique
Synthetic data generation using AI techniques for model training, Privacy-preserving data transformation via differential privacy
Description
1) Data Introduction
• The Breast Cancer Wisconsin (Diagnostic) data focuses on distinguishing between malignant (cancerous) and benign (non-cancerous) breast tumors. This dataset is crucial for developing machine learning models to aid in the early detection and classification of breast cancer, thereby potentially saving lives through timely intervention.

2) Data Utilization
(1) Breast cancer data has characteristics that:
• The dataset contains various features extracted from digitized images of fine needle aspirate (FNA) of breast masses, allowing for detailed analysis and classification of tumors.
(2) Breast cancer data can be used to:
• Healthcare and Medical Research: Useful for developing diagnostic tools and models to accurately classify breast tumors, aiding healthcare providers in making informed decisions.
• Machine Learning and AI Development: Assists in creating and fine-tuning machine learning algorithms to improve predictive accuracy in medical diagnostics.
‘Breast Cancer Wisconsin (Diagnostic) ’ analyzed by Analyst-2
analyst-2.ai
Updated Sep 30, 2021
Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com) (2021). ‘Breast Cancer Wisconsin (Diagnostic) ’ analyzed by Analyst-2 [Dataset]. https://analyst-2.ai/analysis/kaggle-breast-cancer-wisconsin-diagnostic-4be8/0af307d3/?iid=022-284&v=presentation
Dataset authored and provided by
Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com)
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Analysis of ‘Breast Cancer Wisconsin (Diagnostic) ’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://www.kaggle.com/faroukbenarous/breast-cancer-wisconsin-diagnostic on 30 September 2021.

--- No further description of dataset provided by original source ---

--- Original source retains full ownership of the source dataset ---
Data from: Breast cancer Wisconsin
kaggle.com
zip
Updated Jan 16, 2021
PAVAN KUMAR D (2021). Breast cancer Wisconsin [Dataset]. https://www.kaggle.com/datasets/mragpavank/breast-cancer
Available download formats: zip (49796 bytes)
Authors
PAVAN KUMAR D
Description
Dataset

This dataset was created by PAVAN KUMAR D

Data from: Breast Cancer Wisconsin (Diagnostic)
kaggle.com
zip
Updated Aug 6, 2025
SARANGI Venkat (2025). Breast Cancer Wisconsin (Diagnostic) [Dataset]. https://www.kaggle.com/datasets/sarangivenkat/breast-cancer-wisconsin-diagnostic
Available download formats: zip (37566 bytes)
Authors
SARANGI Venkat
License
https://creativecommons.org/publicdomain/zero/1.0/
Description
Dataset

This dataset was created by SARANGI Venkat

Released under CC0: Public Domain

Breast Cancer Dataset - Dataset - CKAN
data.poltekkes-smg.ac.id
Updated Oct 7, 2024
(2024). Breast Cancer Dataset - Dataset - CKAN [Dataset]. https://data.poltekkes-smg.ac.id/dataset/breast-cancer-dataset
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Description: Breast cancer is the most common cancer amongst women in the world. It accounts for 25% of all cancer cases and affected over 2.1 million people in 2015 alone. It starts when cells in the breast begin to grow out of control. These cells usually form tumors that can be seen via X-ray or felt as lumps in the breast area. The key challenge in its detection is how to classify tumors as malignant (cancerous) or benign (non-cancerous). We ask you to complete the analysis of classifying these tumors using machine learning (with SVMs) and the Breast Cancer Wisconsin (Diagnostic) Dataset.
Acknowledgements: This dataset has been sourced from Kaggle.
Objective: Understand the dataset and clean it up (if required). Build classification models to predict whether the cancer type is malignant or benign. Also fine-tune the hyperparameters and compare the evaluation metrics of various classification algorithms.
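
Comparing evaluation metrics across classifiers can be sketched without any particular modeling library. The example below computes accuracy, precision, recall, and F1 with malignant (M) as the positive class; the labels and the two sets of predictions are made up for illustration and do not come from any trained model.

```python
def binary_metrics(y_true, y_pred, positive="M"):
    """Return accuracy, precision, recall and F1 for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = len(pairs) - tp - fp - fn
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": (tp + tn) / len(pairs), "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical ground truth and predictions from two classifiers.
y_true = ["M", "M", "M", "B", "B", "B", "B", "B"]
model_a = ["M", "M", "B", "B", "B", "B", "B", "M"]
model_b = ["M", "M", "M", "M", "M", "B", "B", "B"]

metrics_a = binary_metrics(y_true, model_a)
metrics_b = binary_metrics(y_true, model_b)
print("A:", metrics_a)
print("B:", metrics_b)
```

In this made-up comparison both classifiers have the same accuracy, but the second one misses no malignant case while the first misses one, which is why recall is often reported alongside accuracy for diagnostic tasks.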
Replication Data for: Wisconsin Breast Cancer Diagnostic
dataverse.harvard.edu
Updated Apr 6, 2016
Christopher Bartley (2016). Replication Data for: Wisconsin Breast Cancer Diagnostic [Dataset]. http://doi.org/10.7910/DVN/SP6VXJ
Unique identifier
https://doi.org/10.7910/DVN/SP6VXJ
Dataset provided by
Harvard Dataverse
Authors
Christopher Bartley
License
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Area covered
Wisconsin
Description
Original data from: https://archive.ics.uci.edu/ml/datasets/Breast+Cancer+Wisconsin+(Diagnostic).

Changes made:
- 16 rows with '?' for Bare Nuclei removed, leaving 683 records

#   Attribute                    Domain
--  ---------------------------  ---------------------------------
0.  Class                        (-1 for benign, +1 for malignant)
1.  Clump Thickness              1 - 10
2.  Uniformity of Cell Size      1 - 10
3.  Uniformity of Cell Shape     1 - 10
4.  Marginal Adhesion            1 - 10
5.  Single Epithelial Cell Size  1 - 10
6.  Bare Nuclei                  1 - 10
7.  Bland Chromatin              1 - 10
8.  Normal Nucleoli              1 - 10
9.  Mitoses                      1 - 10
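
The changes listed for this replication file can be sketched as a small cleaning step. This is a hedged illustration that assumes the source rows use an 11-field layout (sample ID, nine features, class coded 2 or 4) with '?' marking a missing Bare Nuclei value; the rows shown are made up, and the author's actual preprocessing script is not known.

```python
MISSING = "?"
CLASS_REMAP = {"2": -1, "4": 1}  # assumed source coding: 2 = benign, 4 = malignant
BARE_NUCLEI = 5  # zero-based position among the nine feature columns


def clean_rows(rows):
    """Drop rows with a missing Bare Nuclei value and move the class to column 0."""
    cleaned = []
    for row in rows:
        *features, label = row[1:]  # skip the sample ID in column 0
        if features[BARE_NUCLEI] == MISSING:
            continue
        cleaned.append([CLASS_REMAP[label]] + [int(v) for v in features])
    return cleaned


# Made-up rows in the assumed layout (not real data).
rows = [
    ["1000001", "5", "1", "1", "1", "2", "1", "3", "1", "1", "2"],
    ["1000002", "8", "4", "5", "1", "2", "?", "7", "3", "1", "4"],
    ["1000003", "8", "10", "10", "8", "7", "10", "9", "7", "1", "4"],
]
cleaned = clean_rows(rows)
print(cleaned)
```

Applied to a 699-row source, dropping the 16 incomplete rows would leave the 683 records stated in the description.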
Yu Wang, Hussein Sibai, Sayan Mitra, Geir E. Dullerud (2025). Dataset:...
service.tib.eu
Updated Jan 2, 2025
(2025). Yu Wang, Hussein Sibai, Sayan Mitra, Geir E. Dullerud (2025). Dataset: Breast Cancer Wisconsin (Diagnostic) and Landsat Satellite Data Sets. https://doi.org/10.57702/q6tqwgtn [Dataset]. https://service.tib.eu/ldmservice/dataset/breast-cancer-wisconsin--diagnostic--and-landsat-satellite-data-sets
Description
The dataset used in this paper is the Breast Cancer Wisconsin (Diagnostic) and the Landsat Satellite Data Sets from the UCI repository.
Data from: Cancer classification Dataset
cubig.ai
zip
Updated May 2, 2025
CUBIG (2025). Cancer classification Dataset [Dataset]. https://cubig.ai/store/products/166/cancer-classification-dataset
Available download formats: zip
Dataset authored and provided by
CUBIG
License
https://cubig.ai/store/terms-of-service
Measurement technique
Privacy-preserving data transformation via differential privacy, Synthetic data generation using AI techniques for model training
Description
1) Data Introduction
• The Cancer Classification dataset is derived from the UCI ML Breast Cancer Wisconsin (Diagnostic) datasets, containing 569 instances with 30 numerical attributes. The features are computed from digitized images of fine needle aspirates (FNA) of breast masses, aimed at distinguishing between malignant and benign tumors.

2) Data Utilization
(1) Cancer Classification data has characteristics that:
• It includes detailed measurements of cell nuclei characteristics such as radius, texture, perimeter, area, smoothness, compactness, concavity, symmetry, and fractal dimension. These attributes are essential for accurate classification of breast cancer tumors.
(2) Cancer Classification data can be used to:
• Medical Diagnosis: Assists in developing predictive models to classify breast cancer tumors as malignant or benign, aiding in early detection and treatment planning.
• Research and Development: Supports academic research and development of machine learning models in the medical field, providing a comprehensive dataset for testing various algorithms.
Data from: Cancer classification
kaggle.com
zip
Updated Apr 11, 2024
Sahil Bajaj (2024). Cancer classification [Dataset]. https://www.kaggle.com/datasets/sahilnbajaj/cancer-classification
Available download formats: zip (53037 bytes)
Authors
Sahil Bajaj
License
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
Description
Diagnostic Wisconsin Breast Cancer Database.

Dataset Characteristics

Features are computed from a digitized image of a fine needle aspirate (FNA) of a breast mass. They describe characteristics of the cell nuclei present in the image. A few of the images can be found at http://www.cs.wisc.edu/~street/images/

Separating plane described above was obtained using Multisurface Method-Tree (MSM-T) [K. P. Bennett, "Decision Tree Construction Via Linear Programming." Proceedings of the 4th Midwest Artificial Intelligence and Cognitive Science Society, pp. 97-101, 1992], a classification method which uses linear programming to construct a decision tree. Relevant features were selected using an exhaustive search in the space of 1-4 features and 1-3 separating planes.

The actual linear program used to obtain the separating plane in the 3-dimensional space is that described in: [K. P. Bennett and O. L. Mangasarian: "Robust Linear Programming Discrimination of Two Linearly Inseparable Sets", Optimization Methods and Software 1, 1992, 23-34].

This database is also available through the UW CS ftp server: ftp ftp.cs.wisc.edu cd math-prog/cpo-dataset/machine-learn/WDBC/
Table 1_A robust stacked neural network approach for early and accurate...
frontiersin.figshare.com
docx
Updated Oct 16, 2025
Xinkang Li; Menglong Gao; Chengyang Zhang; Guikai Ma; Qingyun Zhang; Wenjuan Meng; Tianbai Yuan; Yang Wang; Zhenhua Li (2025). Table 1_A robust stacked neural network approach for early and accurate breast cancer diagnosis.docx [Dataset]. http://doi.org/10.3389/fmed.2025.1644857.s001
Available download formats: docx
Unique identifier
https://doi.org/10.3389/fmed.2025.1644857.s001
Dataset provided by
Frontiers
Authors
Xinkang Li; Menglong Gao; Chengyang Zhang; Guikai Ma; Qingyun Zhang; Wenjuan Meng; Tianbai Yuan; Yang Wang; Zhenhua Li
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Timely and accurate diagnosis of breast cancer remains a critical clinical challenge. In this study, we propose Stacked Artificial Neural Network (StackANN), a robust stacking ensemble framework that integrates six classical machine learning classifiers with an Artificial Neural Network (ANN) meta-learner to enhance diagnostic precision and generalization. It incorporates the Synthetic Minority Over-Sampling Technique (SMOTE) to address class imbalance and employs SHapley Additive exPlanations (SHAP) for model interpretability. StackANN was comprehensively evaluated on the Wisconsin Diagnostic Breast Cancer (WDBC), Ljubljana Breast Cancer (LBC), and Wisconsin Breast Cancer Dataset (WBCD) datasets, as well as the METABRIC2 dataset for multi-subtype classification. Experimental results demonstrate that StackANN consistently outperforms individual classifiers and existing hybrid models, achieving near-perfect Recall and Area Under the Curve (AUC) values while maintaining balanced overall performance. Importantly, feature attribution analysis confirmed strong alignment with clinical diagnostic criteria, emphasizing tumor malignancy, size, and morphology as key determinants. These findings highlight StackANN as a reliable, interpretable, and clinically relevant tool with significant potential for early screening, subtype classification, and personalized treatment planning in breast cancer care.
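
The oversampling idea behind SMOTE can be illustrated in isolation. The following is a simplified sketch of the interpolation step only, not the StackANN pipeline and not a substitute for a maintained implementation such as the one in the imbalanced-learn package; the function name `smote_like`, the neighbour count, and the sample points are all hypothetical.

```python
import math
import random


def smote_like(minority, n_new, k=2, seed=0):
    """Create n_new synthetic points by interpolating between minority-class neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest neighbours of the chosen sample (excluding itself), by Euclidean distance.
        others = [p for p in minority if p is not base]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # position along the segment from base to neighbour
        synthetic.append(tuple(b + gap * (n - b) for b, n in zip(base, neighbour)))
    return synthetic


# Made-up two-feature minority-class samples.
minority = [(1.0, 2.0), (1.5, 1.8), (2.0, 2.5), (1.2, 2.2)]
new_points = smote_like(minority, n_new=6)
print(new_points)
```

Because every synthetic point lies on a segment between two existing minority samples, the new points stay inside the region the minority class already occupies. Oversampling of this kind is normally applied to the training split only, so that evaluation data remains untouched.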
Breast Cancer Wisconsin Diagnosis
kaggle.com
zip
Updated Aug 21, 2025
Aditya_Sahu500096455 (2025). Breast Cancer Wisconsin Diagnosis [Dataset]. https://www.kaggle.com/datasets/adityasahu500096455/breast-cancer-wisconsin-diagnosis
Available download formats: zip (49796 bytes)
Authors
Aditya_Sahu500096455
License
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
Description
Dataset

This dataset was created by Aditya_Sahu500096455

Released under MIT
