45 datasets found
  1. ‘Breast Cancer Wisconsin (Diagnostic) Data Set’ analyzed by Analyst-2

    • analyst-2.ai
    Updated Nov 20, 2021
    Cite
    Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com) (2021). ‘Breast Cancer Wisconsin (Diagnostic) Data Set’ analyzed by Analyst-2 [Dataset]. https://analyst-2.ai/analysis/kaggle-breast-cancer-wisconsin-diagnostic-data-set-4f29/6238ad2a/?iid=010-987&v=presentation
    Dataset updated
    Nov 20, 2021
    Dataset authored and provided by
    Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com)
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Analysis of ‘Breast Cancer Wisconsin (Diagnostic) Data Set’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://www.kaggle.com/uciml/breast-cancer-wisconsin-data on 20 November 2021.

    --- Dataset description provided by original source is as follows ---

    Features are computed from a digitized image of a fine needle aspirate (FNA) of a breast mass. They describe characteristics of the cell nuclei present in the image. The actual linear program used to obtain the separating plane in the 3-dimensional space is that described in: [K. P. Bennett and O. L. Mangasarian: "Robust Linear Programming Discrimination of Two Linearly Inseparable Sets", Optimization Methods and Software 1, 1992, 23-34].

    This database is also available through the UW CS ftp server: ftp ftp.cs.wisc.edu cd math-prog/cpo-dataset/machine-learn/WDBC/

    Also can be found on UCI Machine Learning Repository: https://archive.ics.uci.edu/ml/datasets/Breast+Cancer+Wisconsin+%28Diagnostic%29

    Attribute Information:

    1) ID number
    2) Diagnosis (M = malignant, B = benign)
    3-32) Ten real-valued features are computed for each cell nucleus:

    a) radius (mean of distances from center to points on the perimeter)
    b) texture (standard deviation of gray-scale values)
    c) perimeter
    d) area
    e) smoothness (local variation in radius lengths)
    f) compactness (perimeter^2 / area - 1.0)
    g) concavity (severity of concave portions of the contour)
    h) concave points (number of concave portions of the contour)
    i) symmetry
    j) fractal dimension ("coastline approximation" - 1)

    The mean, standard error and "worst" or largest (mean of the three largest values) of these features were computed for each image, resulting in 30 features. For instance, field 3 is Mean Radius, field 13 is Radius SE, field 23 is Worst Radius.
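
    The field numbering described above can be sketched as follows. This is a minimal illustration, assuming the 30 features are stored as three consecutive blocks of ten (mean, then standard error, then worst), which is what the "field 3 / 13 / 23" example implies. The names FEATURES, STATS, field_names, and field_number are hypothetical and do not come from the source files.

```python
# Ten per-nucleus features, in the order a) to j) from the listing.
FEATURES = [
    "radius", "texture", "perimeter", "area", "smoothness",
    "compactness", "concavity", "concave points", "symmetry",
    "fractal dimension",
]
# Assumed block order: all means, then all standard errors, then all worsts.
STATS = ["mean", "se", "worst"]


def field_names():
    """Return the 32 field names: ID, diagnosis, then 3 x 10 features."""
    names = ["id", "diagnosis"]
    for stat in STATS:
        for feature in FEATURES:
            names.append(f"{feature} {stat}")
    return names


def field_number(feature, stat):
    """Return the 1-based field number used in the description."""
    return 3 + STATS.index(stat) * len(FEATURES) + FEATURES.index(feature)


print(field_number("radius", "mean"))   # field 3 in the description
print(field_number("radius", "se"))     # field 13
print(field_number("radius", "worst"))  # field 23
```

    Field numbers are 1-based, so field_names()[field_number(...) - 1] gives the matching name.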

    All feature values are recoded with four significant digits.

    Missing attribute values: none

    Class distribution: 357 benign, 212 malignant
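
    As a quick worked example of what this class distribution means for evaluation, the sketch below computes the class shares and the accuracy of a trivial classifier that always predicts the majority class. The variable names are illustrative only.

```python
# Class counts as stated in the description.
benign = 357
malignant = 212
total = benign + malignant

# Share of each class in the dataset.
benign_share = benign / total
malignant_share = malignant / total

# Accuracy of always predicting the larger class; any useful model
# should beat this figure.
majority_baseline = max(benign, malignant) / total

print(total)
print(f"benign share:    {benign_share:.3f}")
print(f"malignant share: {malignant_share:.3f}")
print(f"majority-class baseline accuracy: {majority_baseline:.3f}")
```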

    --- Original source retains full ownership of the source dataset ---

  2. wisconsin-breast-cancer

    • huggingface.co
    Updated Feb 1, 2001
    Cite
    Witold Wydmański (2001). wisconsin-breast-cancer [Dataset]. https://huggingface.co/datasets/wwydmanski/wisconsin-breast-cancer
    Dataset updated
    Feb 1, 2001
    Authors
    Witold Wydmański
    Area covered
    Wisconsin
    Description

    Source:

    Copied from the original dataset

      Creators:
    

    Dr. William H. Wolberg, General Surgery Dept. University of Wisconsin, Clinical Sciences Center Madison, WI 53792 wolberg '@' eagle.surgery.wisc.edu

    W. Nick Street, Computer Sciences Dept. University of Wisconsin, 1210 West Dayton St., Madison, WI 53706 street '@' cs.wisc.edu 608-262-6619

    Olvi L. Mangasarian, Computer Sciences Dept. University of Wisconsin, 1210 West Dayton St., Madison, WI 53706 olvi '@' cs.wisc.edu… See the full description on the dataset page: https://huggingface.co/datasets/wwydmanski/wisconsin-breast-cancer.

  3. Data from: BREAST CANCER WISCONSIN DATA SET

    • kaggle.com
    Updated Aug 19, 2022
    Cite
    Roopa Calistus (2022). BREAST CANCER WISCONSIN DATA SET [Dataset]. http://doi.org/10.34740/kaggle/dsv/4092342
    Dataset updated
    Aug 19, 2022
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Roopa Calistus
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    BREAST CANCER WISCONSIN (DIAGNOSTIC) DATA SET

    Predict whether the cancer is benign or malignant. It consists of features that are computed from a digitized image of a fine needle aspirate (FNA) of a breast mass. They describe characteristics of the cell nuclei present in the image.

    Ten real-valued features are computed for each cell nucleus:

    a) radius (mean of distances from center to points on the perimeter)
    b) texture (standard deviation of gray-scale values)
    c) perimeter
    d) area
    e) smoothness (local variation in radius lengths)
    f) compactness (perimeter^2 / area - 1.0)
    g) concavity (severity of concave portions of the contour)
    h) concave points (number of concave portions of the contour)
    i) symmetry
    j) fractal dimension ("coastline approximation" - 1)
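
    To make the compactness formula in the list above concrete, the sketch below evaluates perimeter^2 / area - 1.0 exactly as written for two simple shapes. The function name is hypothetical, and the values stored in the dataset may use a different scaling, so this only illustrates the formula itself.

```python
import math


def compactness(perimeter, area):
    """Evaluate the formula from the listing: perimeter^2 / area - 1.0."""
    return perimeter ** 2 / area - 1.0


# A circle of radius 3: perimeter 2*pi*r, area pi*r^2.
# The ratio perimeter^2 / area is 4*pi for any circle, independent of r.
circle = compactness(2 * math.pi * 3.0, math.pi * 3.0 ** 2)

# A unit square: perimeter 4, area 1.
square = compactness(4.0, 1.0)

print(circle)  # equals 4*pi - 1
print(square)  # equals 15.0
```

    Among shapes of equal area the circle has the shortest perimeter, so under this formula more irregular outlines score higher.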

  4. Breast Cancer Dataset

    • cubig.ai
    Updated May 2, 2025
    Cite
    CUBIG (2025). Breast Cancer Dataset [Dataset]. https://cubig.ai/store/products/178/breast-cancer-dataset
    Dataset updated
    May 2, 2025
    Dataset authored and provided by
    CUBIG
    License

    https://cubig.ai/store/terms-of-service

    Measurement technique
    Synthetic data generation using AI techniques for model training, Privacy-preserving data transformation via differential privacy
    Description

    1) Data Introduction
    • The Breast Cancer Wisconsin (Diagnostic) data focuses on distinguishing between malignant (cancerous) and benign (non-cancerous) breast tumors. This dataset is crucial for developing machine learning models to aid in the early detection and classification of breast cancer, thereby potentially saving lives through timely intervention.

    2) Data Utilization
    (1) Breast cancer data has characteristics that:
    • The dataset contains various features extracted from digitized images of fine needle aspirate (FNA) of breast masses, allowing for detailed analysis and classification of tumors.
    (2) Breast cancer data can be used to:
    • Healthcare and Medical Research: Useful for developing diagnostic tools and models to accurately classify breast tumors, aiding healthcare providers in making informed decisions.
    • Machine Learning and AI Development: Assists in creating and fine-tuning machine learning algorithms to improve predictive accuracy in medical diagnostics.

  5. ‘Wisconsin Diagnostic Breast Cancer (WDBC)’ analyzed by Analyst-2

    • analyst-2.ai
    Updated Sep 30, 2021
    + more versions
    Cite
    Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com) (2021). ‘Wisconsin Diagnostic Breast Cancer (WDBC)’ analyzed by Analyst-2 [Dataset]. https://analyst-2.ai/analysis/kaggle-wisconsin-diagnostic-breast-cancer-wdbc-b8cd/5b08ae03/?iid=009-999&v=presentation
    Dataset updated
    Sep 30, 2021
    Dataset authored and provided by
    Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com)
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Analysis of ‘Wisconsin Diagnostic Breast Cancer (WDBC)’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://www.kaggle.com/mohaiminul101/wisconsin-diagnostic-breast-cancer-wdbc on 30 September 2021.

    --- Dataset description provided by original source is as follows ---

    Context

    Breast cancer is a disease in which cells in the breast grow out of control. There are different kinds of breast cancer. The kind of breast cancer depends on which cells in the breast turn into cancer. The Wisconsin Diagnostic Breast Cancer (WDBC) dataset, obtained from the University of Wisconsin Hospitals, is used to classify tumors as benign or malignant.

    Content

    Attribute Information:

    1) ID number
    2) Diagnosis (M = malignant, B = benign)
    3-32) Ten real-valued features are computed for each cell nucleus:

    a) radius (mean of distances from center to points on the perimeter)
    b) texture (standard deviation of gray-scale values)
    c) perimeter
    d) area
    e) smoothness (local variation in radius lengths)
    f) compactness (perimeter^2 / area - 1.0)
    g) concavity (severity of concave portions of the contour)
    h) concave points (number of concave portions of the contour)
    i) symmetry
    j) fractal dimension ("coastline approximation" - 1)

    The mean, standard error and "worst" or largest (mean of the three largest values) of these features were computed for each image, resulting in 30 features. For instance, field 3 is Mean Radius, field 13 is Radius SE, field 23 is Worst Radius.
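
    As a rough illustration of the three per-image statistics named above, the sketch below computes the mean, standard error, and "worst" value of one feature over the nuclei in a single image. The per-nucleus radii are made-up numbers, and the standard error is assumed to be the sample standard deviation divided by sqrt(n), which the description does not spell out.

```python
import math
import statistics


def summarize(values):
    """Return (mean, standard error, worst) for one feature in one image."""
    n = len(values)
    mean = statistics.fmean(values)
    # Assumption: standard error = sample standard deviation / sqrt(n).
    se = statistics.stdev(values) / math.sqrt(n)
    # "Worst" as described: the mean of the three largest values.
    worst = statistics.fmean(sorted(values, reverse=True)[:3])
    return mean, se, worst


# Hypothetical radius measurements for five nuclei in one image.
radii = [10.0, 12.0, 14.0, 16.0, 18.0]
mean, se, worst = summarize(radii)
print(mean, se, worst)
```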

    All feature values are recoded with four significant digits.

    Missing attribute values: none

    Class distribution: 357 benign, 212 malignant

    Acknowledgements

    Creator: Dr. William H. Wolberg (physician), University of Wisconsin Hospitals, Madison, Wisconsin, USA

    This database is also available through the UW CS ftp server: ftp ftp.cs.wisc.edu cd math-prog/cpo-dataset/machine-learn/WDBC/

    Also can be found on UCI Machine Learning Repository: https://archive.ics.uci.edu/ml/datasets/Breast+Cancer+Wisconsin+%28Diagnostic%29

    --- Original source retains full ownership of the source dataset ---

  6. Breast Cancer Wisconsin (Original) - Dataset - LDM

    • service.tib.eu
    Updated Dec 3, 2024
    + more versions
    Cite
    (2024). Breast Cancer Wisconsin (Original) - Dataset - LDM [Dataset]. https://service.tib.eu/ldmservice/dataset/breast-cancer-wisconsin--original-
    Dataset updated
    Dec 3, 2024
    Description

    The Breast Cancer Wisconsin (Original) dataset consists of 699 observations and 11 features.

  7. Wisconsin Diagnostic Breast Cancer (WDBC)

    • kaggle.com
    Updated Oct 19, 2020
    Cite
    Mohaiminul Islam (2020). Wisconsin Diagnostic Breast Cancer (WDBC) [Dataset]. https://www.kaggle.com/mohaiminul101/wisconsin-diagnostic-breast-cancer-wdbc/code
    Dataset updated
    Oct 19, 2020
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Mohaiminul Islam
    Area covered
    Wisconsin
    Description

    Context

    Breast cancer is a disease in which cells in the breast grow out of control. There are different kinds of breast cancer. The kind of breast cancer depends on which cells in the breast turn into cancer. The Wisconsin Diagnostic Breast Cancer (WDBC) dataset, obtained from the University of Wisconsin Hospitals, is used to classify tumors as benign or malignant.

    Content

    Attribute Information:

    1) ID number
    2) Diagnosis (M = malignant, B = benign)
    3-32) Ten real-valued features are computed for each cell nucleus:

    a) radius (mean of distances from center to points on the perimeter)
    b) texture (standard deviation of gray-scale values)
    c) perimeter
    d) area
    e) smoothness (local variation in radius lengths)
    f) compactness (perimeter^2 / area - 1.0)
    g) concavity (severity of concave portions of the contour)
    h) concave points (number of concave portions of the contour)
    i) symmetry
    j) fractal dimension ("coastline approximation" - 1)

    The mean, standard error and "worst" or largest (mean of the three largest values) of these features were computed for each image, resulting in 30 features. For instance, field 3 is Mean Radius, field 13 is Radius SE, field 23 is Worst Radius.

    All feature values are recoded with four significant digits.
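
    What "four significant digits" means in practice can be shown with a small helper. This is only a sketch of rounding to significant digits; the name round_sig is hypothetical, the sample values are made up, and the source does not say how the original recoding was actually done.

```python
def round_sig(value, digits=4):
    """Round a number to the given count of significant digits."""
    # The "g" format keeps `digits` significant digits; float() converts
    # the formatted string back to a number.
    return float(f"{value:.{digits}g}")


# Made-up measurements at different magnitudes.
print(round_sig(17.98765))
print(round_sig(0.118412))
print(round_sig(1001.4))
```

    Significant digits differ from decimal places: a large area-like value and a small smoothness-like value both keep four meaningful digits.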

    Missing attribute values: none

    Class distribution: 357 benign, 212 malignant

    Acknowledgements

    Creator: Dr. William H. Wolberg (physician), University of Wisconsin Hospitals, Madison, Wisconsin, USA

    This database is also available through the UW CS ftp server: ftp ftp.cs.wisc.edu cd math-prog/cpo-dataset/machine-learn/WDBC/

    Also can be found on UCI Machine Learning Repository: https://archive.ics.uci.edu/ml/datasets/Breast+Cancer+Wisconsin+%28Diagnostic%29

  8. wisconsin-breast-cancer-diagnostic

    • huggingface.co
    Updated Sep 24, 2025
    Cite
    Mnemora (2025). wisconsin-breast-cancer-diagnostic [Dataset]. https://huggingface.co/datasets/mnemoraorg/wisconsin-breast-cancer-diagnostic
    Dataset updated
    Sep 24, 2025
    Dataset authored and provided by
    Mnemora
    License

    https://choosealicense.com/licenses/ecl-2.0/

    Description

    This dataset, derived from the Wisconsin Breast Cancer (Diagnostic), is a comprehensive resource for developing and evaluating machine learning models focused on the binary classification of breast tumors as either benign (B) or malignant (M). The data consists of features computed from digitized images of fine needle aspirates (FNA) of breast masses, offering a rich set of quantitative metrics for computational pathology and diagnostic research. The dataset is a critical tool for healthcare… See the full description on the dataset page: https://huggingface.co/datasets/mnemoraorg/wisconsin-breast-cancer-diagnostic.

  9. breast-cancer-africa-adjusted-dataset

    • huggingface.co
    Updated Sep 9, 2025
    Cite
    Electric Sheep (2025). breast-cancer-africa-adjusted-dataset [Dataset]. https://huggingface.co/datasets/electricsheepafrica/breast-cancer-africa-adjusted-dataset
    Dataset updated
    Sep 9, 2025
    Dataset authored and provided by
    Electric Sheep
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Breast Cancer Wisconsin Dataset: African Physiognomy Adjusted

      Dataset Description
    

    This dataset addresses representation bias in medical AI by providing an African physiognomy-adjusted version of the classic Wisconsin Breast Cancer Dataset. The adjustment methodology systematically modifies cellular morphology features to better reflect documented physiological differences in African populations.

      Dataset Summary
    

    Original Dataset: Wisconsin Breast Cancer Dataset… See the full description on the dataset page: https://huggingface.co/datasets/electricsheepafrica/breast-cancer-africa-adjusted-dataset.

  10. Breast Cancer Dataset UCI ML

    • kaggle.com
    Updated Apr 19, 2020
    Cite
    Jean de Dieu Nyandwi (2020). Breast Cancer Dataset UCI ML [Dataset]. https://www.kaggle.com/datasets/jeandedieunyandwi/breast-cancer-dataset-uci-ml/discussion
    Dataset updated
    Apr 19, 2020
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Jean de Dieu Nyandwi
    Description

    Context

    Breast Cancer Wisconsin (Diagnostic) Data Set

    Content

    Data Set Characteristics:

    :Number of Instances: 569
    
    :Number of Attributes: 30 numeric, predictive attributes and the class
    
    :Attribute Information:
      - radius (mean of distances from center to points on the perimeter)
      - texture (standard deviation of gray-scale values)
      - perimeter
      - area
      - smoothness (local variation in radius lengths)
      - compactness (perimeter^2 / area - 1.0)
      - concavity (severity of concave portions of the contour)
      - concave points (number of concave portions of the contour)
      - symmetry 
      - fractal dimension ("coastline approximation" - 1)
    
      The mean, standard error, and "worst" or largest (mean of the three
      largest values) of these features were computed for each image,
      resulting in 30 features. For instance, field 3 is Mean Radius, field
      13 is Radius SE, field 23 is Worst Radius.
    
      - class:
          - WDBC-Malignant
          - WDBC-Benign
    

    Features are computed from a digitized image of a fine needle aspirate (FNA) of a breast mass. They describe characteristics of the cell nuclei present in the image.

    Acknowledgements

    Dua, D. and Graff, C. (2019). UCI Machine Learning Repository [http://archive.ics.uci.edu/ml]. Irvine, CA: University of California, School of Information and Computer Science. This is a copy of UCI ML Breast Cancer Wisconsin (Diagnostic) datasets. https://goo.gl/U2Uwz2

  11. Breast Cancer Diagnostic Data Set

    • kaggle.com
    Updated May 26, 2020
    Cite
    Ishan Dutta (2020). Breast Cancer Diagnostic Data Set [Dataset]. https://www.kaggle.com/ishandutta/breast-cancer-diagnostic-data-set/activity
    Dataset updated
    May 26, 2020
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Ishan Dutta
    License

    http://opendatacommons.org/licenses/dbcl/1.0/

    Description

    Features are computed from a digitized image of a fine needle aspirate (FNA) of a breast mass. They describe characteristics of the cell nuclei present in the image. A few of the images can be found at [Web Link]

    Separating plane described above was obtained using Multisurface Method-Tree (MSM-T) [K. P. Bennett, "Decision Tree Construction Via Linear Programming." Proceedings of the 4th Midwest Artificial Intelligence and Cognitive Science Society, pp. 97-101, 1992], a classification method which uses linear programming to construct a decision tree. Relevant features were selected using an exhaustive search in the space of 1-4 features and 1-3 separating planes.

    The actual linear program used to obtain the separating plane in the 3-dimensional space is that described in: [K. P. Bennett and O. L. Mangasarian: "Robust Linear Programming Discrimination of Two Linearly Inseparable Sets", Optimization Methods and Software 1, 1992, 23-34].

    This database is also available through the UW CS ftp server: ftp ftp.cs.wisc.edu cd math-prog/cpo-dataset/machine-learn/WDBC/

  12. Data from: Breast Cancer Wisconsin

    • kaggle.com
    Updated Jan 21, 2023
    Cite
    Joshua Marsh (2023). Breast Cancer Wisconsin [Dataset]. https://www.kaggle.com/datasets/joshuamarsh/breast-cancer-wisconsin
    Dataset updated
    Jan 21, 2023
    Dataset provided by
    Kaggle
    Authors
    Joshua Marsh
    Description

    Dataset

    This dataset was created by Joshua Marsh

    Contents

  13. Breast Cancer Dataset - Dataset - CKAN

    • data.poltekkes-smg.ac.id
    Updated Oct 7, 2024
    Cite
    (2024). Breast Cancer Dataset - Dataset - CKAN [Dataset]. https://data.poltekkes-smg.ac.id/dataset/breast-cancer-dataset
    Dataset updated
    Oct 7, 2024
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Description: Breast cancer is the most common cancer among women in the world. It accounts for 25% of all cancer cases and affected over 2.1 million people in 2015 alone. It starts when cells in the breast begin to grow out of control. These cells usually form tumors that can be seen via X-ray or felt as lumps in the breast area. The key challenge in its detection is how to classify tumors as malignant (cancerous) or benign (non-cancerous). We ask you to complete the analysis of classifying these tumors using machine learning (with SVMs) and the Breast Cancer Wisconsin (Diagnostic) Dataset.

    Acknowledgements: This dataset has been referred from Kaggle.

    Objective: Understand the dataset and clean it up (if required). Build classification models to predict whether the cancer type is malignant or benign. Also fine-tune the hyperparameters and compare the evaluation metrics of various classification algorithms.

  14. Replication Data for: Wisconsin Breast Cancer Diagnostic

    • dataverse.harvard.edu
    Updated Apr 6, 2016
    Cite
    Christopher Bartley (2016). Replication Data for: Wisconsin Breast Cancer Diagnostic [Dataset]. http://doi.org/10.7910/DVN/SP6VXJ
    Dataset updated
    Apr 6, 2016
    Dataset provided by
    Harvard Dataverse
    Authors
    Christopher Bartley
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Area covered
    Wisconsin
    Description

    Original data from: https://archive.ics.uci.edu/ml/datasets/Breast+Cancer+Wisconsin+(Diagnostic).

    Changes made:
    - 16 rows with '?' for Bare Nuclei removed, leaving 683 records

    #   Attribute                     Domain
    --  ----------------------------  ---------------------------------
    0.  Class                         (-1 for benign, +1 for malignant)
    1.  Clump Thickness               1 - 10
    2.  Uniformity of Cell Size       1 - 10
    3.  Uniformity of Cell Shape      1 - 10
    4.  Marginal Adhesion             1 - 10
    5.  Single Epithelial Cell Size   1 - 10
    6.  Bare Nuclei                   1 - 10
    7.  Bland Chromatin               1 - 10
    8.  Normal Nucleoli               1 - 10
    9.  Mitoses                       1 - 10
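
    The cleanup described above (dropping rows with '?' and recoding the class to -1/+1) could look roughly like the sketch below. It assumes a raw layout of sample ID, nine attributes, then a class code of 2 for benign and 4 for malignant; that layout is an assumption about the upstream file, not something stated in this listing. The three sample rows and the name clean_rows are made up for illustration.

```python
import csv
import io

# Hypothetical raw rows: id, nine attributes, class (2 = benign, 4 = malignant).
SAMPLE = """\
1,5,1,1,1,2,1,3,1,1,2
2,8,4,5,1,2,?,7,3,1,4
3,8,10,10,8,7,10,9,7,1,4
"""


def clean_rows(text):
    """Drop rows with missing values and put a -1/+1 class label first."""
    cleaned = []
    for row in csv.reader(io.StringIO(text)):
        if not row:
            continue
        _sample_id, *attributes, class_code = row
        if "?" in attributes:
            continue  # missing value marker, e.g. in Bare Nuclei
        label = -1 if class_code == "2" else 1
        cleaned.append([label] + [int(a) for a in attributes])
    return cleaned


cleaned = clean_rows(SAMPLE)
for record in cleaned:
    print(record)
```

    Each output record has ten columns, matching the table above: the class label in position 0 followed by the nine attributes.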

  15. Breast Cancer Wisconsin Data

    • kaggle.com
    Updated Feb 5, 2021
    Cite
    CDBezz (2021). Breast Cancer Wisconsin Data [Dataset]. https://www.kaggle.com/cdbezz/breast-cancer-wisconsin-data/tasks
    Dataset updated
    Feb 5, 2021
    Dataset provided by
    Kaggle
    Authors
    CDBezz
    Description

    Dataset

    This dataset was created by CDBezz

    Contents

  16. Data from: Breast Cancer Wisconsin (Diagnostic)

    • opendatalab.com
    zip
    Updated Apr 21, 2023
    Cite
    University of Wisconsin (2023). Breast Cancer Wisconsin (Diagnostic) [Dataset]. https://opendatalab.com/OpenDataLab/Breast_Cancer_Wisconsin_Diagnostic
    Available download formats: zip
    Dataset updated
    Apr 21, 2023
    Dataset provided by
    University of Wisconsin
    Description

    UCI Breast Cancer Raw Dataset is a breast cancer dataset that contains three sets of breast cancer cytopathology image data. Features are calculated from digitized images of fine needle aspiration (FNA) of breast masses. They describe the characteristics of the cell nuclei appearing in the image. The original UCI Breast Cancer dataset was published in 1995 by Dr. William H. Wolberg (General Surgery Dept.), W. Nick Street (Computer Sciences Dept.), and Olvi L. Mangasarian (Computer Sciences Dept.). Related papers include "Breast cancer diagnosis and prognosis via linear programming".

  17. Breast Cancer Prediction Dataset - Dataset - CKAN

    • data.poltekkes-smg.ac.id
    Updated Oct 7, 2024
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    (2024). Breast Cancer Prediction Dataset - Dataset - CKAN [Dataset]. https://data.poltekkes-smg.ac.id/dataset/breast-cancer-prediction-dataset
    Dataset updated
    Oct 7, 2024
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Worldwide, breast cancer is the most common type of cancer in women and the second highest in terms of mortality rates. Diagnosis of breast cancer is performed when an abnormal lump is found (from self-examination or x-ray) or a tiny speck of calcium is seen (on an x-ray). After a suspicious lump is found, the doctor will conduct a diagnosis to determine whether it is cancerous and, if so, whether it has spread to other parts of the body. This breast cancer dataset was obtained from Dr. William H. Wolberg of the University of Wisconsin Hospitals, Madison.

  18. Data from: Cancer classification Dataset

    • cubig.ai
    Updated May 2, 2025
    Cite
    CUBIG (2025). Cancer classification Dataset [Dataset]. https://cubig.ai/store/products/166/cancer-classification-dataset
    Dataset updated
    May 2, 2025
    Dataset authored and provided by
    CUBIG
    License

    https://cubig.ai/store/terms-of-service

    Measurement technique
    Privacy-preserving data transformation via differential privacy, Synthetic data generation using AI techniques for model training
    Description

    1) Data Introduction
    • The Cancer Classification dataset is derived from the UCI ML Breast Cancer Wisconsin (Diagnostic) datasets, containing 569 instances with 30 numerical attributes. The features are computed from digitized images of fine needle aspirates (FNA) of breast masses, aimed at distinguishing between malignant and benign tumors.

    2) Data Utilization
    (1) Cancer Classification data has characteristics that:
    • It includes detailed measurements of cell nuclei characteristics such as radius, texture, perimeter, area, smoothness, compactness, concavity, symmetry, and fractal dimension. These attributes are essential for accurate classification of breast cancer tumors.
    (2) Cancer Classification data can be used to:
    • Medical Diagnosis: Assists in developing predictive models to classify breast cancer tumors as malignant or benign, aiding in early detection and treatment planning.
    • Research and Development: Supports academic research and development of machine learning models in the medical field, providing a comprehensive dataset for testing various algorithms.

  19. Cancer Data

    • kaggle.com
    Updated Mar 22, 2023
    Cite
    Erdem Taha (2023). Cancer Data [Dataset]. https://www.kaggle.com/datasets/erdemtaha/cancer-data/code
    Dataset updated
    Mar 22, 2023
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Erdem Taha
    License

    Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0): https://creativecommons.org/licenses/by-nc-sa/4.0/
    License information was derived automatically

    Description

    🦠 Breast Cancer Data Set

    This dataset contains the characteristics of patients diagnosed with cancer. The dataset contains a unique ID for each patient, the type of cancer (diagnosis), the visual characteristics of the cancer and the average values of these characteristics.

    📚 The main features of the dataset are as follows:

    1. id: Represents a unique ID of each patient.
    2. diagnosis: Indicates the type of cancer. This property can take the values "M" (malignant) or "B" (benign).
    3. radius_mean, texture_mean, perimeter_mean, area_mean, smoothness_mean, compactness_mean, concavity_mean, concave points_mean: Represents the mean values of the cancer's visual characteristics.

    There are also several categorical features where patients in the dataset are labeled with numerical values. You can examine them in the Chart area.

    Other features contain specific ranges of average values of the features of the cancer image:

    • radius_mean, texture_mean, perimeter_mean, area_mean, smoothness_mean, compactness_mean, concavity_mean, concave points_mean

    Each of these features is mapped to a table containing the number of values in a given range. You can examine the Chart Tables

    Each sample contains the patient's unique ID, the cancer diagnosis and the average values of the cancer's visual characteristics.

    Such a dataset can be used to train or test models and algorithms used to make cancer diagnoses. Understanding and analyzing the dataset can contribute to the improvement of cancer-related visual features and diagnosis.

    ✨ Examples of Projects that can be done with the Data Set

    Logistic Regression: This algorithm can be used effectively for binary classification problems. In this dataset, logistic regression may be an appropriate choice since there are two classes, "Malignant" and "Benign". It can be used to predict cancer type with the visual features in the dataset.

    K-Nearest Neighbors (KNN): KNN classifies an example by looking at the k closest examples around it. This algorithm assumes that patients with similar characteristics tend to have similar types of cancer. KNN can be used for cancer diagnosis by taking into account neighborhood relationships in the data set.

    Support Vector Machines (SVM): SVM is effective for classification tasks, especially for two-class problems. Focusing on the clear separation of classes in the dataset, SVM is a powerful algorithm that can be used for cancer diagnosis.
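
    The K-Nearest Neighbors idea described above can be sketched in a few lines of plain Python. This is a toy illustration, not the notebook author's code: the two-feature training points are made up, the distance is plain Euclidean distance without feature scaling, and knn_predict is a hypothetical name.

```python
import math
from collections import Counter


def knn_predict(train, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # Sort training examples by Euclidean distance to the query point.
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Made-up (feature vector, diagnosis) pairs forming two clusters.
train = [
    ((1.0, 1.0), "B"), ((1.2, 0.9), "B"), ((0.8, 1.1), "B"),
    ((3.0, 3.2), "M"), ((3.1, 2.9), "M"), ((2.9, 3.0), "M"),
]

print(knn_predict(train, (1.1, 1.0)))  # near the "B" cluster
print(knn_predict(train, (3.0, 3.0)))  # near the "M" cluster
```

    On real data the features span very different ranges (area versus smoothness, for example), so scaling them before computing distances is usually worth considering.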

    Data Set Related Training Notebooks 😊 ("I Recommend You Review")

    K-NN Project: https://www.kaggle.com/code/erdemtaha/prediction-cancer-data-with-k-nn-95

    Logistic Regression: https://www.kaggle.com/code/erdemtaha/cancer-prediction-96-5-with-logistic-regression

    💖 Acknowledgements and Information

    This is a copy of content that was prepared for educational purposes and published to reach more people. You can access the original source from the link below; please do not forget to support that data.

    🔗 https://www.kaggle.com/datasets/uciml/breast-cancer-wisconsin-data

    This database can also be accessed via the UW CS ftp server: 🔗 ftp.cs.wisc.edu cd math-prog/cpo-dataset/machine-learn/WDBC/

    It can also be found at the UCI Machine Learning Repository: 🔗 https://archive.ics.uci.edu/ml/datasets/Breast+Cancer+Wisconsin+%28Diagnostic%29

    📩 Personal Information:

    If you have questions about the data or the studies, you can contact me through the links below 😊

    LinkedIn: https://www.linkedin.com/in/erdem-taha-sokullu/

    Mail: erdemtahasokullu@gmail.com

    Github: https://github.com/Prometheussx

    Kaggle: https://www.kaggle.com/erdemtaha

    📜 License:

    This data is licensed under CC BY-NC-SA 4.0. You can review the license terms at the link below.

    License Link: https://creativecommons.org/licenses/by-nc-sa/4.0/

  20. f

    DATA SHEET.csv

    • figshare.com
    csv
    Updated Jan 14, 2025
    Cite
    Dola Saha (2025). DATA SHEET.csv [Dataset]. http://doi.org/10.6084/m9.figshare.28203392.v1
    Explore at:
    Available download formats: csv
    Dataset updated
    Jan 14, 2025
    Dataset provided by
    figshare
    Authors
    Dola Saha
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Breast cancer is one of the most prevalent cancers among women worldwide, and early detection is crucial for reducing mortality rates and improving treatment outcomes. Mammography has been the gold standard for breast cancer screening, offering non-invasive imaging to identify suspicious abnormalities. However, mammography has limitations, such as variability in interpretation, false positives, false negatives, and challenges in distinguishing between benign and malignant lesions.

    Machine learning has the potential to revolutionize breast cancer detection by enhancing the capabilities of mammography. Its ability to improve accuracy, efficiency, and consistency in diagnosis makes it an indispensable tool for early detection efforts.

    This study focuses on developing a machine learning-based predictive model for the early detection and classification of breast cancer, utilizing the Wisconsin Breast Cancer Diagnostic dataset. Special emphasis is placed on the potential of ML algorithms, particularly the Support Vector Classifier with a Radial Basis Function (SVC-RBF), to enhance diagnostic accuracy and efficiency.
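For readers unfamiliar with the radial basis function mentioned above, the kernel an RBF support vector classifier relies on can be sketched as follows. This is a minimal standard-library illustration, not the study's implementation: the `rbf_kernel` helper, the `gamma` value, and the sample points are all assumptions chosen for the example.

```python
import math


def rbf_kernel(x, y, gamma=0.1):
    """RBF kernel: exp(-gamma * squared Euclidean distance between x and y)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)


# Hypothetical two-feature points, not taken from the dataset.
a = (14.0, 19.0)
b = (14.5, 18.0)
c = (25.0, 30.0)

print(rbf_kernel(a, a))  # identical points have similarity 1.0
print(rbf_kernel(a, b))  # nearby points score close to 1
print(rbf_kernel(a, c))  # distant points score close to 0
```

The kernel value acts as a similarity score, which is what lets the classifier draw a non-linear boundary between the two classes.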

Cite
Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com) (2021). ‘Breast Cancer Wisconsin (Diagnostic) Data Set’ analyzed by Analyst-2 [Dataset]. https://analyst-2.ai/analysis/kaggle-breast-cancer-wisconsin-diagnostic-data-set-4f29/6238ad2a/?iid=010-987&v=presentation

‘Breast Cancer Wisconsin (Diagnostic) Data Set’ analyzed by Analyst-2

Explore at:
Dataset updated
Nov 20, 2021
Dataset authored and provided by
Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com)
License

Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically

Description

Analysis of ‘Breast Cancer Wisconsin (Diagnostic) Data Set’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://www.kaggle.com/uciml/breast-cancer-wisconsin-data on 20 November 2021.

--- Dataset description provided by original source is as follows ---

Features are computed from a digitized image of a fine needle aspirate (FNA) of a breast mass. They describe characteristics of the cell nuclei present in the image. The linear program used to obtain the separating plane in the 3-dimensional space is that described in: [K. P. Bennett and O. L. Mangasarian: "Robust Linear Programming Discrimination of Two Linearly Inseparable Sets", Optimization Methods and Software 1, 1992, 23-34].

This database is also available through the UW CS ftp server: ftp ftp.cs.wisc.edu cd math-prog/cpo-dataset/machine-learn/WDBC/

Also can be found on UCI Machine Learning Repository: https://archive.ics.uci.edu/ml/datasets/Breast+Cancer+Wisconsin+%28Diagnostic%29

Attribute Information:

1) ID number
2) Diagnosis (M = malignant, B = benign)
3-32) Ten real-valued features are computed for each cell nucleus:

a) radius (mean of distances from center to points on the perimeter)
b) texture (standard deviation of gray-scale values)
c) perimeter
d) area
e) smoothness (local variation in radius lengths)
f) compactness (perimeter^2 / area - 1.0)
g) concavity (severity of concave portions of the contour)
h) concave points (number of concave portions of the contour)
i) symmetry
j) fractal dimension ("coastline approximation" - 1)

The mean, standard error and "worst" or largest (mean of the three largest values) of these features were computed for each image, resulting in 30 features. For instance, field 3 is Mean Radius, field 13 is Radius SE, field 23 is Worst Radius.
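The field numbering in the paragraph above follows a simple pattern, sketched here in Python. The `field_number` helper and the short statistic names are hypothetical and exist only for this illustration; the feature order is taken from the attribute list.

```python
# Feature order as given in the attribute list (a through j).
FEATURES = [
    "radius", "texture", "perimeter", "area", "smoothness",
    "compactness", "concavity", "concave points", "symmetry",
    "fractal dimension",
]
# The three statistics computed per feature, in field order.
STATISTICS = ["mean", "se", "worst"]


def field_number(feature, statistic):
    """Return the 1-based field number for a feature/statistic pair.

    Fields 1 and 2 hold the ID and the diagnosis, so features start at 3.
    """
    block = STATISTICS.index(statistic) * len(FEATURES)
    return 3 + block + FEATURES.index(feature)


print(field_number("radius", "mean"))   # 3
print(field_number("radius", "se"))     # 13
print(field_number("radius", "worst"))  # 23
```

The last field, worst fractal dimension, comes out as 32, matching the 3-32 range given for the feature columns.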

All feature values are recoded with four significant digits.

Missing attribute values: none

Class distribution: 357 benign, 212 malignant
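The class counts above give a useful reference point when judging a classifier: always predicting the majority class already achieves a certain accuracy. A minimal sketch of that arithmetic, with variable names chosen only for this example:

```python
benign = 357
malignant = 212

total = benign + malignant
# Accuracy of a trivial model that always predicts the larger class.
majority_baseline = max(benign, malignant) / total

print(total)                        # 569
print(round(majority_baseline, 3))  # 0.627
```

Any model trained on this data should be compared against roughly 62.7% accuracy rather than against 50%.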

--- Original source retains full ownership of the source dataset ---
