By UCI [source]
This dataset contains breast cancer diagnostic data. Breast cancer is a devastating medical condition that affects thousands of people around the world each year. The data comprise a patient ID, a diagnosis (malignant or benign), and 30 computed features extracted from a digitized image of a fine needle aspirate (FNA) of a breast mass. The features include radius, texture, perimeter, area, smoothness, compactness, concavity, and concave points, as well as symmetry and fractal dimension.
The dataset was created by researchers in General Surgery and Computer Science at the University of Wisconsin-Madison, led by Dr. William H. Wolberg with contributions from Professor W. Nick Street and Olvi L. Mangasarian. It was used in research to predict breast cancer prognosis using linear programming methods. More recently, methods such as support vector machines have been used to classify tumour types from this dataset, and pattern recognition techniques such as artificial neural networks (ANN) have been used to identify hidden patterns in it.
It has also been used in studies exploring unsupervised classification tools such as Ant Colony Optimization to discover meaningful relationships among variables, which can help physicians better understand how certain types of tumors progress over time. For example, type cardinality analysis allowed researchers to assess a tumor's heterogeneity before deciding on appropriate treatments, potentially leading to improved prognoses. The Wisconsin Breast Cancer Diagnostic dataset is a valuable resource for scientists working on preventing or curing this disease, a goal we all hope to achieve soon.
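The count of 30 features in the description above can be read as ten base measurements, each summarised in three ways. The sketch below builds a column list on that basis. The three summary statistics (mean, standard error, worst) and their ordering, as well as the snake_case names, are assumptions for illustration and are not stated on this page; check them against the file's own documentation before relying on them.

```python
# The ten base measurements named in the dataset description.
BASE_FEATURES = [
    "radius", "texture", "perimeter", "area", "smoothness",
    "compactness", "concavity", "concave_points", "symmetry",
    "fractal_dimension",
]

# Assumption: each measurement is summarised three ways per image,
# and the file lists all means first, then standard errors, then worst values.
STATISTICS = ["mean", "se", "worst"]


def wdbc_column_names():
    """Return hypothetical column names: ID, diagnosis, then 30 features."""
    names = ["id", "diagnosis"]
    for stat in STATISTICS:
        names.extend(f"{feature}_{stat}" for feature in BASE_FEATURES)
    return names


columns = wdbc_column_names()
print(len(columns))  # 2 identifier columns + 10 * 3 feature columns
```

A list like this could be supplied as the header when loading a file that has no header row of its own.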
- Developing a classifier that can accurately predict breast cancer diagnoses based on the provided features.
- Clustering patient data with similar diagnoses to discover trends or connections between certain symptoms and diagnoses.
- Optimizing feature selection algorithms to identify the most relevant predictors of breast cancer diagnosis from a set of given cell nuclei features.
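As a rough illustration of the first idea in the list above, the following is a minimal nearest-centroid classifier in plain Python. It is only a sketch: the technique is chosen here for brevity and is not the method used by the dataset's authors, the function names are hypothetical, and the toy rows are invented for the example rather than taken from any of the files.

```python
from statistics import mean


def fit_centroids(rows, labels):
    """Return {label: per-feature mean vector} for the training rows."""
    grouped = {}
    for row, label in zip(rows, labels):
        grouped.setdefault(label, []).append(row)
    return {
        label: [mean(column) for column in zip(*members)]
        for label, members in grouped.items()
    }


def predict(centroids, row):
    """Assign the label whose centroid is closest in squared Euclidean distance."""
    def sq_dist(centroid):
        return sum((a - b) ** 2 for a, b in zip(row, centroid))
    return min(centroids, key=lambda label: sq_dist(centroids[label]))


# Toy values invented for illustration only (two features per sample);
# they are NOT taken from the dataset.
train_rows = [[1.0, 1.0], [2.0, 1.0], [8.0, 9.0], [9.0, 8.0]]
train_labels = ["B", "B", "M", "M"]

centroids = fit_centroids(train_rows, train_labels)
print(predict(centroids, [1.5, 2.0]))
print(predict(centroids, [7.0, 7.0]))
```

With real data, the features would normally be scaled first and the classifier evaluated on held-out samples, since the measurements differ widely in magnitude.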
If you use this dataset in your research, please credit the original authors.
See the dataset description for more information.
File: unformatted-data.csv
File: wpbc.data.csv

| Column name | Description |
|:--------------|:--------------------------------|
| 119513 | ID number (Integer) |
| N | Diagnosis (Binary) |
| 31 | Radius (Real-valued) |
| 18.02 | Texture (Real-valued) |
| 27.6 | Perimeter (Real-valued) |
| 117.5 | Area (Real-valued) |
| 1013 | Smoothness (Real-valued) |
| 0.09489 | Compactness (Real-valued) |
| 0.1036 | Concavity (Real-valued) |
| 0.1086 | Symmetry (Real-valued) |
| 0.07055 | Fractal Dimension (Real-valued) |
| 0.1865 | Mean Intensity (Real-valued) |
| 0.06333 | Standard Error (Real-valued) |
| 0.6249 | Worst Radius (Real-valued) |
| 1.89 | Worst Texture (Real-valued) |
| 3.972 | Worst Perimeter (Real-valued) |
| 71.55 | Worst Area (Real-valued) |
| 0.004433 | Worst Smoothness (Real-valued) |
| 0.01421 | Worst Compactness (Real-valued) |
| 0.03233 | Worst Concavity (Real-valued) |
File: breast-cancer-wisconsin.data.csv

| Column name | Description |
|:--------------|:--------------------------------------|
| 119513 | ID number (Integer) |
| 1000025 | ID number (Integer) |
| 1.1 | Uniformity of Cell Size (Integer) |
| 1.2 | Uniformity of Cell Shape (Integer) |
| 1.3 | Single Epithelial Cell Size (Integer) |
| 1.4 | Bland Chromatin (Integer) |
| 1.5 | Normal Nucleoli (Integer) |
| 2.1 | Mitoses (Integer) |
File: wdbc.data.csv

| Column name | Description |
|:--------------|:----------------------------------------|
| 842302 | Patient ID number (Integer Type) |
| M | Diagnosis (Binary Type) |

**...
Explore the field of breast cancer diagnosis with the insightful Wisconsin Breast Cancer dataset (Original). This dataset provides detailed attributes representing tumor characteristics observed in breast tissue samples. By analyzing these attributes, researchers and medical professionals can gain insights into tumor behavior and develop predictive models for cancer detection and prognosis.
Features:

1. Sample code number: Unique identifier for each tissue sample.
2. Clump Thickness: Assessment of the thickness of tumor cell clusters (1 - 10).
3. Uniformity of Cell Size: Uniformity in the size of tumor cells (1 - 10).
4. Uniformity of Cell Shape: Uniformity in the shape of tumor cells (1 - 10).
5. Marginal Adhesion: Degree of adhesion of tumor cells to surrounding tissue (1 - 10).
6. Single Epithelial Cell Size: Size of individual tumor cells (1 - 10).
7. Bare Nuclei: Presence of nuclei without surrounding cytoplasm (1 - 10).
8. Bland Chromatin: Assessment of chromatin structure in tumor cells (1 - 10).
9. Normal Nucleoli: Presence of normal-looking nucleoli in tumor cells (1 - 10).
10. Mitoses: Frequency of mitotic cell divisions (1 - 10).
11. Class: Classification of tumor type (2 for benign, 4 for malignant).
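The eleven attributes listed above map naturally onto a record per sample. The sketch below parses rows in that layout and decodes the class codes 2 and 4. It assumes the file is comma-separated with no header row and that a missing value is written as `?`; neither detail is stated on this page. The column names are hypothetical, and the two sample rows are invented for illustration.

```python
import csv
import io

# Hypothetical snake_case names for the eleven attributes, in listed order.
COLUMNS = [
    "sample_code_number", "clump_thickness", "uniformity_of_cell_size",
    "uniformity_of_cell_shape", "marginal_adhesion",
    "single_epithelial_cell_size", "bare_nuclei", "bland_chromatin",
    "normal_nucleoli", "mitoses", "class",
]

# Class codes as given in the feature list: 2 for benign, 4 for malignant.
CLASS_LABELS = {2: "benign", 4: "malignant"}


def parse_rows(text):
    """Parse headerless comma-separated rows into a list of dicts."""
    records = []
    for fields in csv.reader(io.StringIO(text)):
        if not fields:
            continue  # skip blank lines
        # Assumption: "?" marks a missing value; store it as None.
        values = [None if f.strip() == "?" else int(f) for f in fields]
        record = dict(zip(COLUMNS, values))
        record["class"] = CLASS_LABELS[record["class"]]
        records.append(record)
    return records


# Invented rows for illustration; not copied from the dataset.
sample = "1000001,5,1,1,1,2,1,3,1,1,2\n1000002,8,10,10,8,7,?,9,7,1,4\n"
records = parse_rows(sample)
print(records[0]["class"], records[1]["class"])
```

Supplying the names explicitly, rather than letting a loader treat the first data row as a header, keeps that first sample in the data.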
The Breast Cancer Wisconsin dataset is sourced from tissue samples collected for diagnostic purposes, with attributes derived from microscopic examination. The dataset is anonymized and made available for research purposes, contributing to advancements in cancer diagnosis and treatment.
UCI Breast Cancer Raw Dataset is a breast cancer dataset that contains three sets of breast cancer cytopathology image data. Features are calculated from digitized images of fine needle aspirates (FNA) of breast masses. They describe the characteristics of the cell nuclei appearing in the image. The original UCI Breast Cancer dataset was published in 1995 by Dr. William H. Wolberg (General Surgery Dept.), W. Nick Street (Computer Sciences Dept.), and Olvi L. Mangasarian (Computer Sciences Dept.). Related papers include "Breast cancer diagnosis and prognosis via linear programming".