3 datasets found
  1. Breast Cancer Diagnosis

    • kaggle.com
    Updated Dec 4, 2022
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    The Devastator (2022). Breast Cancer Diagnosis [Dataset]. https://www.kaggle.com/datasets/thedevastator/uncovering-breast-cancer-diagnosis-with-wisconsi/discussion
    Explore at:
    CroissantCroissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Dec 4, 2022
    Dataset provided by
    Kaggle
    Authors
    The Devastator
    Description

    Breast Cancer Diagnosis

    Study breast cancer diagnosis

    By UCI [source]

    About this dataset

    This dataset contains data on breast cancer diagnosis, a devastating medical condition that affects thousands of people around the world each year. The data is comprised of patient ID, diagnosis (Malignant or Benign), and 30 computed features extracted from a digitized image of a fine needle aspirate (FNA) of a breast mass. Features include radius, texture, perimeter, area, smoothness, compactness concavity and concave points as well as symmetry and fractal dimension.

    Created by renowned researchers in the fields of General Surgery and Computer Science at the University of Wisconsin-Madison led by Dr. William H Wolberg with contributions from Professor W Nick Street and Olvi L Mangasarian this dataset was used in some groundbreaking research to predict breast cancer prognosis using linear programming methods. More recently statistical methods such as support vector machines have been employed to classify tumour types from this dataset as well other tasks such as identify hidden patterns through pattern recognition techniques like Artificial Neural Networks (ANN).

    It has also been used for studies exploring unsupervised classification tools like Ant Colony Optimization for discovering meaningful relationships among different variables which can help physicians better understand the progression of certain types of tumors over time. For example types cardinality analysis allowed researchers to determine tumor’s heterogeneity before deciding on appropriate treatments potentially leading to improved prognosis success rates overall. This Wisconsin Breast Cancer Diagnostic dataset provides an invaluable resource to scientists working on preventing or curing this dreaded disease - a goal we all eagerly hope to achieve someday soon!

    More Datasets

    For more datasets, click here.

    Featured Notebooks

    • 🚨 Your notebook can be here! 🚨!

    How to use the dataset

    Research Ideas

    • Developing a classifier that can accurately predict breast cancer diagnoses based on the provided features.
    • Clustering patient data with similar diagnosis to discover trends or connections between certain symptoms and diagnoses.
    • Optimizing feature selection algorithms to identify the most relevant predictors of breast cancer diagnosis from a set of given cell nuclei features

    Acknowledgements

    If you use this dataset in your research, please credit the original authors. Data Source

    License

    See the dataset description for more information.

    Columns

    File: unformatted-data.csv

    File: wpbc.data.csv | Column name | Description | |:--------------|:--------------------------------| | 119513 | ID number (Integer) | | N | Diagnosis (Binary) | | 31 | Radius (Real-valued) | | 18.02 | Texture (Real-valued) | | 27.6 | Perimeter (Real-valued) | | 117.5 | Area (Real-valued) | | 1013 | Smoothness (Real-valued) | | 0.09489 | Compactness (Real-valued) | | 0.1036 | Concavity (Real-valued) | | 0.1086 | Symmetry (Real-valued) | | 0.07055 | Fractal Dimension (Real-valued) | | 0.1865 | Mean Intensity (Real-valued) | | 0.06333 | Standard Error (Real-valued) | | 0.6249 | Worst Radius (Real-valued) | | 1.89 | Worst Texture (Real-valued) | | 3.972 | Worst Perimeter (Real-valued) | | 71.55 | Worst Area (Real-valued) | | 0.004433 | Worst Smoothness (Real-valued) | | 0.01421 | Worst Compactness (Real-valued) | | 0.03233 | Worst Concavity (Real-valued) |

    File: breast-cancer-wisconsin.data.csv | Column name | Description | |:--------------|:--------------------------------------| | 119513 | ID number (Integer) | | 1000025 | ID number (Integer) | | 1.1 | Uniformity of Cell Size (Integer) | | 1.2 | Uniformity of Cell Shape (Integer) | | 1.3 | Single Epithelial Cell Size (Integer) | | 1.4 | Bland Chromatin (Integer) | | 1.5 | Normal Nucleoli (Integer) | | 2.1 | Mitoses (Integer) |

    File: wdbc.data.csv | Column name | Description | |:--------------|:----------------------------------------| | 842302 | Patient ID number (Integer Type) | | M | Diagnosis (Binary Type) | | **...

  2. Breast Cancer Diagnosis Dataset - Wisconsin State

    • kaggle.com
    Updated Mar 31, 2024
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Saurabh Badole (2024). Breast Cancer Diagnosis Dataset - Wisconsin State [Dataset]. https://www.kaggle.com/datasets/saurabhbadole/breast-cancer-wisconsin-state
    Explore at:
    CroissantCroissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Mar 31, 2024
    Dataset provided by
    Kaggle
    Authors
    Saurabh Badole
    Area covered
    Wisconsin
    Description

    Description:

    Explore the field of breast cancer diagnosis with the insightful Wisconsin Breast Cancer dataset (Original). This dataset provides detailed attributes representing tumor characteristics observed in breast tissue samples. By analyzing these attributes, researchers and medical professionals can gain insights into tumor behavior and develop predictive models for cancer detection and prognosis.

    Features
    1. Sample code number: Unique identifier for each tissue sample.
    2. Clump Thickness: Assessment of the thickness of tumor cell clusters (1 - 10).
    3. Uniformity of Cell Size: Uniformity in the size of tumor cells (1 - 10).
    4. Uniformity of Cell Shape: Uniformity in the shape of tumor cells (1 - 10).
    5. Marginal Adhesion: Degree of adhesion of tumor cells to surrounding tissue (1 - 10).
    6. Single Epithelial Cell Size: Size of individual tumor cells (1 - 10).
    7. Bare Nuclei: Presence of nuclei without surrounding cytoplasm (1 - 10).
    8. Bland Chromatin: Assessment of chromatin structure in tumor cells (1 - 10).
    9. Normal Nucleoli: Presence of normal-looking nucleoli in tumor cells (1 - 10).
    10. Mitoses: Frequency of mitotic cell divisions (1 - 10).
    11. Class: Classification of tumor type (2 for benign, 4 for malignant).

    Usage:

    • Cancer diagnosis: Develop machine learning models to classify tumors as benign or malignant based on their characteristics, aiding in early detection and treatment planning.
    • Feature importance analysis: Identify key attributes contributing to tumor malignancy and understand their biological significance.
    • Clinical decision support: Assist healthcare professionals in interpreting biopsy results and making informed decisions about patient care.

    Acknowledgements:

    The Breast Cancer Wisconsin dataset is sourced from tissue samples collected for diagnostic purposes, with attributes derived from microscopic examination. The dataset is anonymized and made available for research purposes, contributing to advancements in cancer diagnosis and treatment.

  3. O

    Data from: Breast Cancer Wisconsin (Diagnostic)

    • opendatalab.com
    zip
    Updated Apr 21, 2023
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    University of Wisconsin (2023). Breast Cancer Wisconsin (Diagnostic) [Dataset]. https://opendatalab.com/OpenDataLab/Breast_Cancer_Wisconsin_Diagnostic
    Explore at:
    zipAvailable download formats
    Dataset updated
    Apr 21, 2023
    Dataset provided by
    University of Wisconsin
    Description

    UCI Breast Cancer Raw Dataset is a breast cancer dataset that contains three sets of breast cancer cytopathology image data. Features are calculated from digitized images of fine needle aspiration (FNA) of breast masses. They describe the image The characteristics of the nuclei appearing in . The original UCI Breast Cancer dataset was published in 1995 by Dr. William H. Wolberg, General Surgery Dept. W. Nick Street, Computer Sciences Dept. Olvi L. Mangasarian, Computer Sciences Dept. Related papers are Breast cancer diagnosis and prognosis via linear programming etc.

  4. Not seeing a result you expected?
    Learn how you can add new datasets to our index.

Share
FacebookFacebook
TwitterTwitter
Email
Click to copy link
Link copied
Close
Cite
The Devastator (2022). Breast Cancer Diagnosis [Dataset]. https://www.kaggle.com/datasets/thedevastator/uncovering-breast-cancer-diagnosis-with-wisconsi/discussion
Organization logo

Breast Cancer Diagnosis

Study breast cancer diagnosis

Explore at:
CroissantCroissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
Dataset updated
Dec 4, 2022
Dataset provided by
Kaggle
Authors
The Devastator
Description

Breast Cancer Diagnosis

Study breast cancer diagnosis

By UCI [source]

About this dataset

This dataset contains data on breast cancer diagnosis, a devastating medical condition that affects thousands of people around the world each year. The data is comprised of patient ID, diagnosis (Malignant or Benign), and 30 computed features extracted from a digitized image of a fine needle aspirate (FNA) of a breast mass. Features include radius, texture, perimeter, area, smoothness, compactness concavity and concave points as well as symmetry and fractal dimension.

Created by renowned researchers in the fields of General Surgery and Computer Science at the University of Wisconsin-Madison led by Dr. William H Wolberg with contributions from Professor W Nick Street and Olvi L Mangasarian this dataset was used in some groundbreaking research to predict breast cancer prognosis using linear programming methods. More recently statistical methods such as support vector machines have been employed to classify tumour types from this dataset as well other tasks such as identify hidden patterns through pattern recognition techniques like Artificial Neural Networks (ANN).

It has also been used for studies exploring unsupervised classification tools like Ant Colony Optimization for discovering meaningful relationships among different variables which can help physicians better understand the progression of certain types of tumors over time. For example types cardinality analysis allowed researchers to determine tumor’s heterogeneity before deciding on appropriate treatments potentially leading to improved prognosis success rates overall. This Wisconsin Breast Cancer Diagnostic dataset provides an invaluable resource to scientists working on preventing or curing this dreaded disease - a goal we all eagerly hope to achieve someday soon!

More Datasets

For more datasets, click here.

Featured Notebooks

  • 🚨 Your notebook can be here! 🚨!

How to use the dataset

Research Ideas

  • Developing a classifier that can accurately predict breast cancer diagnoses based on the provided features.
  • Clustering patient data with similar diagnosis to discover trends or connections between certain symptoms and diagnoses.
  • Optimizing feature selection algorithms to identify the most relevant predictors of breast cancer diagnosis from a set of given cell nuclei features

Acknowledgements

If you use this dataset in your research, please credit the original authors. Data Source

License

See the dataset description for more information.

Columns

File: unformatted-data.csv

File: wpbc.data.csv | Column name | Description | |:--------------|:--------------------------------| | 119513 | ID number (Integer) | | N | Diagnosis (Binary) | | 31 | Radius (Real-valued) | | 18.02 | Texture (Real-valued) | | 27.6 | Perimeter (Real-valued) | | 117.5 | Area (Real-valued) | | 1013 | Smoothness (Real-valued) | | 0.09489 | Compactness (Real-valued) | | 0.1036 | Concavity (Real-valued) | | 0.1086 | Symmetry (Real-valued) | | 0.07055 | Fractal Dimension (Real-valued) | | 0.1865 | Mean Intensity (Real-valued) | | 0.06333 | Standard Error (Real-valued) | | 0.6249 | Worst Radius (Real-valued) | | 1.89 | Worst Texture (Real-valued) | | 3.972 | Worst Perimeter (Real-valued) | | 71.55 | Worst Area (Real-valued) | | 0.004433 | Worst Smoothness (Real-valued) | | 0.01421 | Worst Compactness (Real-valued) | | 0.03233 | Worst Concavity (Real-valued) |

File: breast-cancer-wisconsin.data.csv | Column name | Description | |:--------------|:--------------------------------------| | 119513 | ID number (Integer) | | 1000025 | ID number (Integer) | | 1.1 | Uniformity of Cell Size (Integer) | | 1.2 | Uniformity of Cell Shape (Integer) | | 1.3 | Single Epithelial Cell Size (Integer) | | 1.4 | Bland Chromatin (Integer) | | 1.5 | Normal Nucleoli (Integer) | | 2.1 | Mitoses (Integer) |

File: wdbc.data.csv | Column name | Description | |:--------------|:----------------------------------------| | 842302 | Patient ID number (Integer Type) | | M | Diagnosis (Binary Type) | | **...

Search
Clear search
Close search
Google apps
Main menu