45 datasets found

‘Breast Cancer Wisconsin (Diagnostic) Data Set’ analyzed by Analyst-2
analyst-2.ai
Updated Nov 20, 2021
Cite
Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com) (2021). ‘Breast Cancer Wisconsin (Diagnostic) Data Set’ analyzed by Analyst-2 [Dataset]. https://analyst-2.ai/analysis/kaggle-breast-cancer-wisconsin-diagnostic-data-set-4f29/6238ad2a/?iid=010-987&v=presentation
Dataset updated
Nov 20, 2021
Dataset authored and provided by
Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com)
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Analysis of ‘Breast Cancer Wisconsin (Diagnostic) Data Set’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://www.kaggle.com/uciml/breast-cancer-wisconsin-data on 20 November 2021.

--- Dataset description provided by original source is as follows ---

Features are computed from a digitized image of a fine needle aspirate (FNA) of a breast mass. They describe characteristics of the cell nuclei present in the image. The actual linear program used to obtain the separating plane in the 3-dimensional space is that described in: [K. P. Bennett and O. L. Mangasarian: "Robust Linear Programming Discrimination of Two Linearly Inseparable Sets", Optimization Methods and Software 1, 1992, 23-34].

This database is also available through the UW CS ftp server: ftp ftp.cs.wisc.edu cd math-prog/cpo-dataset/machine-learn/WDBC/

Also can be found on UCI Machine Learning Repository: https://archive.ics.uci.edu/ml/datasets/Breast+Cancer+Wisconsin+%28Diagnostic%29

Attribute Information:

1) ID number
2) Diagnosis (M = malignant, B = benign)
3-32) Ten real-valued features are computed for each cell nucleus:

a) radius (mean of distances from center to points on the perimeter)
b) texture (standard deviation of gray-scale values)
c) perimeter
d) area
e) smoothness (local variation in radius lengths)
f) compactness (perimeter^2 / area - 1.0)
g) concavity (severity of concave portions of the contour)
h) concave points (number of concave portions of the contour)
i) symmetry
j) fractal dimension ("coastline approximation" - 1)

The mean, standard error and "worst" or largest (mean of the three largest values) of these features were computed for each image, resulting in 30 features. For instance, field 3 is Mean Radius, field 13 is Radius SE, field 23 is Worst Radius.
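The field numbering described above can be sketched in a few lines of Python. This is only an illustration of the stated layout (ID, diagnosis, then ten features for each of three statistics); the helper name `field_names` and the exact label strings are assumptions, not names taken from any published file.

```python
# Hypothetical helper: rebuild the 32-field layout from the description.
FEATURES = [
    "radius", "texture", "perimeter", "area", "smoothness",
    "compactness", "concavity", "concave points", "symmetry",
    "fractal dimension",
]
STATISTICS = ["mean", "SE", "worst"]


def field_names():
    """Return 32 labels in file order; 1-based field k is names[k - 1]."""
    names = ["ID number", "Diagnosis"]
    # Fields 3-12 are the means, 13-22 the standard errors, 23-32 the worst values.
    for stat in STATISTICS:
        for feature in FEATURES:
            names.append(f"{feature} {stat}")
    return names


names = field_names()
print(len(names))
print(names[3 - 1], "|", names[13 - 1], "|", names[23 - 1])
```

With this ordering, fields 3, 13 and 23 all refer to radius, which matches the example given in the description.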

All feature values are recoded with four significant digits.

Missing attribute values: none

Class distribution: 357 benign, 212 malignant

--- Original source retains full ownership of the source dataset ---
wisconsin-breast-cancer
huggingface.co
Updated Feb 1, 2001
Cite
Witold Wydmański (2001). wisconsin-breast-cancer [Dataset]. https://huggingface.co/datasets/wwydmanski/wisconsin-breast-cancer
Dataset updated
Feb 1, 2001
Authors
Witold Wydmański
Area covered
Wisconsin
Description
Source:

Copied from the original dataset

Creators:

Dr. William H. Wolberg, General Surgery Dept. University of Wisconsin, Clinical Sciences Center Madison, WI 53792 wolberg '@' eagle.surgery.wisc.edu

W. Nick Street, Computer Sciences Dept. University of Wisconsin, 1210 West Dayton St., Madison, WI 53706 street '@' cs.wisc.edu 608-262-6619

Olvi L. Mangasarian, Computer Sciences Dept. University of Wisconsin, 1210 West Dayton St., Madison, WI 53706 olvi '@' cs.wisc.edu… See the full description on the dataset page: https://huggingface.co/datasets/wwydmanski/wisconsin-breast-cancer.
Data from: BREAST CANCER WISCONSIN DATA SET
kaggle.com
Updated Aug 19, 2022
Cite
Roopa Calistus (2022). BREAST CANCER WISCONSIN DATA SET [Dataset]. http://doi.org/10.34740/kaggle/dsv/4092342
Unique identifier
https://doi.org/10.34740/kaggle/dsv/4092342
Dataset updated
Aug 19, 2022
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Roopa Calistus
License
https://creativecommons.org/publicdomain/zero/1.0/
Description
BREAST CANCER WISCONSIN (DIAGNOSTIC) DATA SET

Predict whether the cancer is benign or malignant. It consists of features that are computed from a digitized image of a fine needle aspirate (FNA) of a breast mass. They describe characteristics of the cell nuclei present in the image.

Ten real-valued features are computed for each cell nucleus:

a) radius (mean of distances from center to points on the perimeter)
b) texture (standard deviation of gray-scale values)
c) perimeter
d) area
e) smoothness (local variation in radius lengths)
f) compactness (perimeter^2 / area - 1.0)
g) concavity (severity of concave portions of the contour)
h) concave points (number of concave portions of the contour)
i) symmetry
j) fractal dimension ("coastline approximation" - 1)
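As a small worked example, the compactness formula quoted in the feature list (perimeter^2 / area - 1.0) can be applied literally to two simple shapes. This is a sketch only: the function name and the test shapes are made up, and it makes no claim about how the values in the published files were scaled.

```python
import math


def compactness(perimeter, area):
    """Compactness exactly as written in the description: perimeter^2 / area - 1.0."""
    return perimeter ** 2 / area - 1.0


# Illustrative shapes, not dataset values.
r = 2.0
circle = compactness(2 * math.pi * r, math.pi * r ** 2)  # equals 4*pi - 1 for any radius
side = 3.0
square = compactness(4 * side, side ** 2)  # equals 16 - 1 for any side length

print(circle, square)
```

Under this literal reading a circle gives the smallest value among the two shapes, so larger values indicate a less compact outline.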
Breast Cancer Dataset
cubig.ai
Updated May 2, 2025
Cite
CUBIG (2025). Breast Cancer Dataset [Dataset]. https://cubig.ai/store/products/178/breast-cancer-dataset
Dataset updated
May 2, 2025
Dataset authored and provided by
CUBIG
License
https://cubig.ai/store/terms-of-service
Measurement technique
Synthetic data generation using AI techniques for model training, Privacy-preserving data transformation via differential privacy
Description
1) Data Introduction
• The Breast Cancer Wisconsin (Diagnostic) data focuses on distinguishing between malignant (cancerous) and benign (non-cancerous) breast tumors. This dataset is crucial for developing machine learning models to aid in the early detection and classification of breast cancer, thereby potentially saving lives through timely intervention.

2) Data Utilization
(1) Breast cancer data has characteristics that:
• The dataset contains various features extracted from digitized images of fine needle aspirate (FNA) of breast masses, allowing for detailed analysis and classification of tumors.
(2) Breast cancer data can be used to:
• Healthcare and Medical Research: Useful for developing diagnostic tools and models to accurately classify breast tumors, aiding healthcare providers in making informed decisions.
• Machine Learning and AI Development: Assists in creating and fine-tuning machine learning algorithms to improve predictive accuracy in medical diagnostics.
‘Wisconsin Diagnostic Breast Cancer (WDBC)’ analyzed by Analyst-2
analyst-2.ai
Updated Sep 30, 2021
Cite
Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com) (2021). ‘Wisconsin Diagnostic Breast Cancer (WDBC)’ analyzed by Analyst-2 [Dataset]. https://analyst-2.ai/analysis/kaggle-wisconsin-diagnostic-breast-cancer-wdbc-b8cd/5b08ae03/?iid=009-999&v=presentation
Dataset updated
Sep 30, 2021
Dataset authored and provided by
Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com)
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Analysis of ‘Wisconsin Diagnostic Breast Cancer (WDBC)’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://www.kaggle.com/mohaiminul101/wisconsin-diagnostic-breast-cancer-wdbc on 30 September 2021.

--- Dataset description provided by original source is as follows ---

Context

Breast cancer is a disease in which cells in the breast grow out of control. There are different kinds of breast cancer. The kind of breast cancer depends on which cells in the breast turn into cancer. The Wisconsin Diagnostic Breast Cancer (WDBC) dataset, obtained from the University of Wisconsin Hospital, is used to classify tumors as benign or malignant.

Content

Attribute Information:

1) ID number
2) Diagnosis (M = malignant, B = benign)
3-32) Ten real-valued features are computed for each cell nucleus:

a) radius (mean of distances from center to points on the perimeter)
b) texture (standard deviation of gray-scale values)
c) perimeter
d) area
e) smoothness (local variation in radius lengths)
f) compactness (perimeter^2 / area - 1.0)
g) concavity (severity of concave portions of the contour)
h) concave points (number of concave portions of the contour)
i) symmetry
j) fractal dimension ("coastline approximation" - 1)

The mean, standard error and "worst" or largest (mean of the three largest values) of these features were computed for each image, resulting in 30 features. For instance, field 3 is Mean Radius, field 13 is Radius SE, field 23 is Worst Radius.

All feature values are recoded with four significant digits.

Missing attribute values: none

Class distribution: 357 benign, 212 malignant
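The class distribution above implies a total of 569 records and a majority-class baseline that any classifier should beat. A minimal sketch of that arithmetic, using only the two counts stated in the description (the variable names are illustrative):

```python
# Counts taken from the description; everything else is derived from them.
counts = {"benign": 357, "malignant": 212}

total = sum(counts.values())
shares = {label: n / total for label, n in counts.items()}

# Accuracy obtained by always predicting the most frequent class.
majority_label = max(counts, key=counts.get)
baseline_accuracy = counts[majority_label] / total

print(total, majority_label, round(baseline_accuracy, 3))
```

Because roughly 63% of records are benign, plain accuracy can look good even for a trivial model, which is one reason to also report per-class metrics.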

Acknowledgements

Creator: Dr. William H. Wolberg (physician) University of Wisconsin Hospitals Madison, Wisconsin, USA

This database is also available through the UW CS ftp server: ftp ftp.cs.wisc.edu cd math-prog/cpo-dataset/machine-learn/WDBC/

Also can be found on UCI Machine Learning Repository: https://archive.ics.uci.edu/ml/datasets/Breast+Cancer+Wisconsin+%28Diagnostic%29

--- Original source retains full ownership of the source dataset ---
Breast Cancer Wisconsin (Original) - Dataset - LDM
service.tib.eu
Updated Dec 3, 2024
Cite
(2024). Breast Cancer Wisconsin (Original) - Dataset - LDM [Dataset]. https://service.tib.eu/ldmservice/dataset/breast-cancer-wisconsin--original-
Dataset updated
Dec 3, 2024
Description
The Breast Cancer Wisconsin (Original) dataset consists of 699 observations and 11 features.
Wisconsin Diagnostic Breast Cancer (WDBC)
kaggle.com
Updated Oct 19, 2020
Cite
Mohaiminul Islam (2020). Wisconsin Diagnostic Breast Cancer (WDBC) [Dataset]. https://www.kaggle.com/mohaiminul101/wisconsin-diagnostic-breast-cancer-wdbc/code
Dataset updated
Oct 19, 2020
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Mohaiminul Islam
Area covered
Wisconsin
Description
Context

Breast cancer is a disease in which cells in the breast grow out of control. There are different kinds of breast cancer. The kind of breast cancer depends on which cells in the breast turn into cancer. The Wisconsin Diagnostic Breast Cancer (WDBC) dataset, obtained from the University of Wisconsin Hospital, is used to classify tumors as benign or malignant.

Content

Attribute Information:

1) ID number
2) Diagnosis (M = malignant, B = benign)
3-32) Ten real-valued features are computed for each cell nucleus:

a) radius (mean of distances from center to points on the perimeter)
b) texture (standard deviation of gray-scale values)
c) perimeter
d) area
e) smoothness (local variation in radius lengths)
f) compactness (perimeter^2 / area - 1.0)
g) concavity (severity of concave portions of the contour)
h) concave points (number of concave portions of the contour)
i) symmetry
j) fractal dimension ("coastline approximation" - 1)

The mean, standard error and "worst" or largest (mean of the three largest values) of these features were computed for each image, resulting in 30 features. For instance, field 3 is Mean Radius, field 13 is Radius SE, field 23 is Worst Radius.
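To make the three per-image statistics concrete, the sketch below computes a mean, a standard error and a "worst" value (mean of the three largest values) for one feature measured on several nuclei. The measurements are made up, the function name `summarise` is hypothetical, and the use of the sample standard deviation for the standard error is an assumption, since the description does not specify it.

```python
import math
import statistics


def summarise(values):
    """Return (mean, standard error, worst) for one feature across nuclei."""
    n = len(values)
    mean = statistics.fmean(values)
    # Assumption: standard error = sample standard deviation / sqrt(n).
    se = statistics.stdev(values) / math.sqrt(n)
    # "Worst" as described: the mean of the three largest values.
    worst = statistics.fmean(sorted(values)[-3:])
    return mean, se, worst


# Made-up radius measurements for the nuclei in a single image.
radii = [10.0, 12.0, 11.0, 15.0, 14.0, 13.0]
mean, se, worst = summarise(radii)
print(mean, se, worst)
```

Applying the same summary to each of the ten features yields the 30 values stored per image.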

All feature values are recoded with four significant digits.

Missing attribute values: none

Class distribution: 357 benign, 212 malignant

Acknowledgements

Creator: Dr. William H. Wolberg (physician) University of Wisconsin Hospitals Madison, Wisconsin, USA

This database is also available through the UW CS ftp server: ftp ftp.cs.wisc.edu cd math-prog/cpo-dataset/machine-learn/WDBC/

Also can be found on UCI Machine Learning Repository: https://archive.ics.uci.edu/ml/datasets/Breast+Cancer+Wisconsin+%28Diagnostic%29
wisconsin-breast-cancer-diagnostic
huggingface.co
Updated Sep 24, 2025
Cite
Mnemora (2025). wisconsin-breast-cancer-diagnostic [Dataset]. https://huggingface.co/datasets/mnemoraorg/wisconsin-breast-cancer-diagnostic
Dataset updated
Sep 24, 2025
Dataset authored and provided by
Mnemora
License
https://choosealicense.com/licenses/ecl-2.0/
Description
This dataset, derived from the Wisconsin Breast Cancer (Diagnostic), is a comprehensive resource for developing and evaluating machine learning models focused on the binary classification of breast tumors as either benign (B) or malignant (M). The data consists of features computed from digitized images of fine needle aspirates (FNA) of breast masses, offering a rich set of quantitative metrics for computational pathology and diagnostic research. The dataset is a critical tool for healthcare… See the full description on the dataset page: https://huggingface.co/datasets/mnemoraorg/wisconsin-breast-cancer-diagnostic.
breast-cancer-africa-adjusted-dataset
huggingface.co
Updated Sep 9, 2025
Cite
Electric Sheep (2025). breast-cancer-africa-adjusted-dataset [Dataset]. https://huggingface.co/datasets/electricsheepafrica/breast-cancer-africa-adjusted-dataset
Dataset updated
Sep 9, 2025
Dataset authored and provided by
Electric Sheep
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Breast Cancer Wisconsin Dataset: African Physiognomy Adjusted

Dataset Description

This dataset addresses representation bias in medical AI by providing an African physiognomy-adjusted version of the classic Wisconsin Breast Cancer Dataset. The adjustment methodology systematically modifies cellular morphology features to better reflect documented physiological differences in African populations.

Dataset Summary

Original Dataset: Wisconsin Breast Cancer Dataset… See the full description on the dataset page: https://huggingface.co/datasets/electricsheepafrica/breast-cancer-africa-adjusted-dataset.
Breast Cancer Dataset UCI ML
kaggle.com
Updated Apr 19, 2020
Cite
Jean de Dieu Nyandwi (2020). Breast Cancer Dataset UCI ML [Dataset]. https://www.kaggle.com/datasets/jeandedieunyandwi/breast-cancer-dataset-uci-ml/discussion
Dataset updated
Apr 19, 2020
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Jean de Dieu Nyandwi
Description
Context

Breast Cancer Wisconsin (Diagnostic) Data Set

Content

Data Set Characteristics:

:Number of Instances: 569
:Number of Attributes: 30 numeric, predictive attributes and the class
:Attribute Information:
- radius (mean of distances from center to points on the perimeter)
- texture (standard deviation of gray-scale values)
- perimeter
- area
- smoothness (local variation in radius lengths)
- compactness (perimeter^2 / area - 1.0)
- concavity (severity of concave portions of the contour)
- concave points (number of concave portions of the contour)
- symmetry
- fractal dimension ("coastline approximation" - 1)

The mean, standard error, and "worst" or largest (mean of the three largest values) of these features were computed for each image, resulting in 30 features. For instance, field 3 is Mean Radius, field 13 is Radius SE, field 23 is Worst Radius.

- class:
  - WDBC-Malignant
  - WDBC-Benign

Features are computed from a digitized image of a fine needle aspirate (FNA) of a breast mass. They describe characteristics of the cell nuclei present in the image.

Acknowledgements

Dua, D. and Graff, C. (2019). UCI Machine Learning Repository [http://archive.ics.uci.edu/ml]. Irvine, CA: University of California, School of Information and Computer Science. This is a copy of UCI ML Breast Cancer Wisconsin (Diagnostic) datasets. https://goo.gl/U2Uwz2
Breast Cancer Diagnostic Data Set
kaggle.com
Updated May 26, 2020
Cite
Ishan Dutta (2020). Breast Cancer Diagnostic Data Set [Dataset]. https://www.kaggle.com/ishandutta/breast-cancer-diagnostic-data-set/activity
Dataset updated
May 26, 2020
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Ishan Dutta
License
http://opendatacommons.org/licenses/dbcl/1.0/
Description
Features are computed from a digitized image of a fine needle aspirate (FNA) of a breast mass. They describe characteristics of the cell nuclei present in the image. A few of the images can be found at [Web Link]

Separating plane described above was obtained using Multisurface Method-Tree (MSM-T) [K. P. Bennett, "Decision Tree Construction Via Linear Programming." Proceedings of the 4th Midwest Artificial Intelligence and Cognitive Science Society, pp. 97-101, 1992], a classification method which uses linear programming to construct a decision tree. Relevant features were selected using an exhaustive search in the space of 1-4 features and 1-3 separating planes.

The actual linear program used to obtain the separating plane in the 3-dimensional space is that described in: [K. P. Bennett and O. L. Mangasarian: "Robust Linear Programming Discrimination of Two Linearly Inseparable Sets", Optimization Methods and Software 1, 1992, 23-34].

This database is also available through the UW CS ftp server: ftp ftp.cs.wisc.edu cd math-prog/cpo-dataset/machine-learn/WDBC/
Data from: Breast Cancer Wisconsin
kaggle.com
Updated Jan 21, 2023
Cite
Joshua Marsh (2023). Breast Cancer Wisconsin [Dataset]. https://www.kaggle.com/datasets/joshuamarsh/breast-cancer-wisconsin
Dataset updated
Jan 21, 2023
Dataset provided by
Kaggle
Authors
Joshua Marsh
Description
Dataset

This dataset was created by Joshua Marsh

Contents
Breast Cancer Dataset - Dataset - CKAN
data.poltekkes-smg.ac.id
Updated Oct 7, 2024
Cite
(2024). Breast Cancer Dataset - Dataset - CKAN [Dataset]. https://data.poltekkes-smg.ac.id/dataset/breast-cancer-dataset
Dataset updated
Oct 7, 2024
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Description: Breast cancer is the most common cancer amongst women in the world. It accounts for 25% of all cancer cases, and affected over 2.1 million people in 2015 alone. It starts when cells in the breast begin to grow out of control. These cells usually form tumors that can be seen via X-ray or felt as lumps in the breast area. A key challenge in its detection is how to classify tumors as malignant (cancerous) or benign (non-cancerous). We ask you to complete the analysis of classifying these tumors using machine learning (with SVMs) and the Breast Cancer Wisconsin (Diagnostic) Dataset.

Acknowledgements: This dataset was sourced from Kaggle.

Objective: Understand the dataset and clean it up (if required). Build classification models to predict whether the cancer type is malignant or benign. Also fine-tune the hyperparameters and compare the evaluation metrics of various classification algorithms.
Replication Data for: Wisconsin Breast Cancer Diagnostic
dataverse.harvard.edu
Updated Apr 6, 2016
Cite
Christopher Bartley (2016). Replication Data for: Wisconsin Breast Cancer Diagnostic [Dataset]. http://doi.org/10.7910/DVN/SP6VXJ
Unique identifier
https://doi.org/10.7910/DVN/SP6VXJ
Dataset updated
Apr 6, 2016
Dataset provided by
Harvard Dataverse
Authors
Christopher Bartley
License
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Area covered
Wisconsin
Description
Original data from: https://archive.ics.uci.edu/ml/datasets/Breast+Cancer+Wisconsin+(Diagnostic).

Changes made:
- 16 rows with '?' for Bare Nuclei removed, leaving 683 records

#   Attribute                     Domain
--  ----------------------------  ---------------------------------
0.  Class                         (-1 for benign, +1 for malignant)
1.  Clump Thickness               1 - 10
2.  Uniformity of Cell Size       1 - 10
3.  Uniformity of Cell Shape      1 - 10
4.  Marginal Adhesion             1 - 10
5.  Single Epithelial Cell Size   1 - 10
6.  Bare Nuclei                   1 - 10
7.  Bland Chromatin               1 - 10
8.  Normal Nucleoli               1 - 10
9.  Mitoses                       1 - 10
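The cleaning step described above (dropping rows with '?' for Bare Nuclei and recoding the class to -1/+1) could look roughly like the following. This is a sketch, not the author's script: the rows are invented, the helper name `clean` is hypothetical, and the raw layout (an ID column first, class last, coded 2 for benign and 4 for malignant) is an assumption about the upstream file.

```python
# Invented rows in the assumed raw layout: ID, nine scores (1-10), raw class.
RAW_ROWS = [
    ["id-001", "5", "1", "1", "1", "2", "1", "3", "1", "1", "2"],
    ["id-002", "5", "4", "4", "5", "7", "10", "3", "2", "1", "2"],
    ["id-003", "8", "4", "5", "1", "2", "?", "7", "3", "1", "4"],
    ["id-004", "8", "10", "10", "8", "7", "10", "9", "7", "1", "4"],
]

BARE_NUCLEI = 5  # position of Bare Nuclei among the nine scores (0-based)


def clean(rows):
    """Drop rows with a missing Bare Nuclei value and recode the class."""
    cleaned = []
    for row in rows:
        _row_id, *scores, raw_class = row
        if scores[BARE_NUCLEI] == "?":
            continue  # missing value: remove the whole record
        label = -1 if raw_class == "2" else 1  # assumed raw coding: 2 benign, 4 malignant
        cleaned.append([label] + [int(s) for s in scores])
    return cleaned


for record in clean(RAW_ROWS):
    print(record)
```

Each output record follows the attribute table above, with the class at position 0 and the nine scores after it. Removing 16 records to leave 683 is consistent with the 699 observations reported for the Original dataset elsewhere on this page.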
Breast Cancer Wisconsin Data
kaggle.com
Updated Feb 5, 2021
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
CDBezz (2021). Breast Cancer Wisconsin Data [Dataset]. https://www.kaggle.com/cdbezz/breast-cancer-wisconsin-data/tasks
Explore at:
CroissantCroissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
Dataset updated
Feb 5, 2021
Dataset provided by
Kaggle
Authors
CDBezz
Description
Dataset

This dataset was created by CDBezz

Contents
Data from: Breast Cancer Wisconsin (Diagnostic)
opendatalab.com
zip
Updated Apr 21, 2023
Cite
University of Wisconsin (2023). Breast Cancer Wisconsin (Diagnostic) [Dataset]. https://opendatalab.com/OpenDataLab/Breast_Cancer_Wisconsin_Diagnostic
Available download formats: zip
Dataset updated
Apr 21, 2023
Dataset provided by
University of Wisconsin
Description
UCI Breast Cancer Raw Dataset is a breast cancer dataset that contains three sets of breast cancer cytopathology image data. Features are calculated from digitized images of fine needle aspirates (FNA) of breast masses. They describe the characteristics of the cell nuclei appearing in the image. The original UCI Breast Cancer dataset was published in 1995 by Dr. William H. Wolberg (General Surgery Dept.), W. Nick Street (Computer Sciences Dept.), and Olvi L. Mangasarian (Computer Sciences Dept.). Related papers include "Breast cancer diagnosis and prognosis via linear programming", among others.
Breast Cancer Prediction Dataset - Dataset - CKAN
data.poltekkes-smg.ac.id
Updated Oct 7, 2024
Cite
(2024). Breast Cancer Prediction Dataset - Dataset - CKAN [Dataset]. https://data.poltekkes-smg.ac.id/dataset/breast-cancer-prediction-dataset
Dataset updated
Oct 7, 2024
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Worldwide, breast cancer is the most common type of cancer in women and the second highest in terms of mortality rates. Diagnosis of breast cancer is performed when an abnormal lump is found (from self-examination or x-ray) or a tiny speck of calcium is seen (on an x-ray). After a suspicious lump is found, the doctor will conduct a diagnosis to determine whether it is cancerous and, if so, whether it has spread to other parts of the body. This breast cancer dataset was obtained from Dr. William H. Wolberg at the University of Wisconsin Hospitals, Madison.
Data from: Cancer classification Dataset
cubig.ai
Updated May 2, 2025
Cite
CUBIG (2025). Cancer classification Dataset [Dataset]. https://cubig.ai/store/products/166/cancer-classification-dataset
Dataset updated
May 2, 2025
Dataset authored and provided by
CUBIG
License
https://cubig.ai/store/terms-of-service
Measurement technique
Privacy-preserving data transformation via differential privacy, Synthetic data generation using AI techniques for model training
Description
1) Data Introduction
• The Cancer Classification dataset is derived from the UCI ML Breast Cancer Wisconsin (Diagnostic) datasets, containing 569 instances with 30 numerical attributes. The features are computed from digitized images of fine needle aspirates (FNA) of breast masses, aimed at distinguishing between malignant and benign tumors.

2) Data Utilization
(1) Cancer Classification data has characteristics that:
• It includes detailed measurements of cell nuclei characteristics such as radius, texture, perimeter, area, smoothness, compactness, concavity, symmetry, and fractal dimension. These attributes are essential for accurate classification of breast cancer tumors.
(2) Cancer Classification data can be used to:
• Medical Diagnosis: Assists in developing predictive models to classify breast cancer tumors as malignant or benign, aiding in early detection and treatment planning.
• Research and Development: Supports academic research and development of machine learning models in the medical field, providing a comprehensive dataset for testing various algorithms.
Cancer Data
kaggle.com
Updated Mar 22, 2023
Cite
Erdem Taha (2023). Cancer Data [Dataset]. https://www.kaggle.com/datasets/erdemtaha/cancer-data/code
Dataset updated
Mar 22, 2023
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Erdem Taha
License
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0): https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
Description
🦠 Breast Cancer Data Set

This dataset contains the characteristics of patients diagnosed with cancer. The dataset contains a unique ID for each patient, the type of cancer (diagnosis), the visual characteristics of the cancer and the average values of these characteristics.

📚 The main features of the dataset are as follows:

id: Represents a unique ID of each patient.

diagnosis: Indicates the type of cancer. This property can take the values "M" (malignant) or "B" (benign).

radius_mean, texture_mean, perimeter_mean, area_mean, smoothness_mean, compactness_mean, concavity_mean, concave points_mean: Represents the mean values of the cancer's visual characteristics.

There are also several categorical features where patients in the dataset are labeled with numerical values. You can examine them in the Chart area.

Other features contain specific ranges of average values of the features of the cancer image:

radius_mean, texture_mean, perimeter_mean, area_mean, smoothness_mean, compactness_mean, concavity_mean, concave points_mean

Each of these features is mapped to a table containing the number of values in a given range. You can examine the Chart Tables

Each sample contains the patient's unique ID, the cancer diagnosis and the average values of the cancer's visual characteristics.

Such a dataset can be used to train or test models and algorithms used to make cancer diagnoses. Understanding and analyzing the dataset can contribute to the improvement of cancer-related visual features and diagnosis.

✨ Examples of Projects that can be done with the Data Set

Logistic Regression: This algorithm can be used effectively for binary classification problems. In this dataset, logistic regression may be an appropriate choice since there are two classes, "Malignant" and "Benign". It can be used to predict cancer type with the visual features in the dataset.

K-Nearest Neighbors (KNN): KNN classifies an example by looking at the k closest examples around it. This algorithm assumes that patients with similar characteristics tend to have similar types of cancer. KNN can be used for cancer diagnosis by taking into account neighborhood relationships in the data set.

Support Vector Machines (SVM): SVM is effective for classification tasks, especially for two-class problems. Focusing on the clear separation of classes in the dataset, SVM is a powerful algorithm that can be used for cancer diagnosis.
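As a rough illustration of the K-Nearest Neighbors idea described above, the sketch below classifies a query point by majority vote among its k closest training points. It uses only the Python standard library; the function name `knn_predict` and the two-feature training points are invented for the example and are not rows from the dataset.

```python
import math
from collections import Counter


def knn_predict(train, query, k=3):
    """Majority vote among the k training points closest to the query.

    train is a list of (features, label) pairs; distance is Euclidean.
    """
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _features, label in nearest)
    return votes.most_common(1)[0][0]


# Invented (radius_mean, texture_mean)-style points, labelled B or M.
TRAIN = [
    ((12.0, 14.0), "B"), ((11.5, 15.0), "B"), ((13.0, 16.0), "B"),
    ((19.0, 22.0), "M"), ((20.5, 25.0), "M"), ((18.0, 21.0), "M"),
]

print(knn_predict(TRAIN, (12.5, 15.0)))
print(knn_predict(TRAIN, (19.5, 23.0)))
```

In practice the features would usually be standardised first, since Euclidean distance is dominated by whichever feature has the largest numeric range.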

Data Set Related Training Notebooks 😊 ("I Recommend You Review")

K-NN Project: https://www.kaggle.com/code/erdemtaha/prediction-cancer-data-with-k-nn-95

Logistic Regression: https://www.kaggle.com/code/erdemtaha/cancer-prediction-96-5-with-logistic-regression

💖 Acknowledgements and Information

This is a copy of content that has been elaborated for educational purposes and published to reach more people. You can access the original source from the link below; please do not forget to support the original dataset.

🔗 https://www.kaggle.com/datasets/uciml/breast-cancer-wisconsin-data

This database can also be accessed via the UW CS ftp server: 🔗 ftp.cs.wisc.edu cd math-prog/cpo-dataset/machine-learn/WDBC/

It can also be found at the UCI Machine Learning Repository: 🔗 https://archive.ics.uci.edu/ml/datasets/Breast+Cancer+Wisconsin+%28Diagnostic%29

📩 Personal Information:

If you have any questions about the data or studies, you can contact me through the links below 😊

LinkedIn: https://www.linkedin.com/in/erdem-taha-sokullu/

Mail: erdemtahasokullu@gmail.com

Github: https://github.com/Prometheussx

Kaggle: https://www.kaggle.com/erdemtaha

📜 License:

This data has a CC BY-NC-SA 4.0 License. You can review the license rules from the link below.

License Link: https://creativecommons.org/licenses/by-nc-sa/4.0/
DATA SHEET.csv
figshare.com
csv
Updated Jan 14, 2025
Cite
Dola Saha (2025). DATA SHEET.csv [Dataset]. http://doi.org/10.6084/m9.figshare.28203392.v1
Available download formats: csv
Unique identifier
https://doi.org/10.6084/m9.figshare.28203392.v1
Dataset updated
Jan 14, 2025
Dataset provided by
figshare
Authors
Dola Saha
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Breast cancer is one of the most prevalent cancers among women worldwide, and early detection is crucial for reducing mortality rates and improving treatment outcomes. Mammography has been the gold standard for breast cancer screening, offering non-invasive imaging to identify suspicious abnormalities. However, mammography has limitations, such as variability in interpretation, false positives, false negatives, and challenges in distinguishing between benign and malignant lesions. Machine learning has the potential to revolutionize breast cancer detection by enhancing the capabilities of mammography. Its ability to improve accuracy, efficiency, and consistency in diagnosis makes it an indispensable tool for early detection efforts. This study focuses on developing a machine learning-based predictive model for the early detection and classification of breast cancer, utilizing the Wisconsin Breast Cancer Diagnostic dataset. Special emphasis is placed on the potential of ML algorithms, particularly the Support Vector Classifier with a Radial Basis Function (SVC-RBF), to enhance diagnostic accuracy and efficiency.
