12 datasets found
  1. Breast Cancer Wisconsin (Prognostic) Data Set
    Updated Mar 31, 2017
  2. Breast Cancer Wisconsin (Prognostic)
    zip, csv
    Updated Apr 10, 2020
  3. Breast Cancer Wisconsin (Diagnostic) Data Set
    Updated Sep 25, 2016
  4. Breast Cancer WI (Diagnostic)
    zip, csv
    Updated Apr 20, 2020
  5. Breast Cancer Wisconsin
    zip, csv
    Updated May 11, 2020
  6. Breast Cancer
    zip, csv
    Updated May 4, 2020
  7. Breast Cancer Wisconsin - Data Set
    Updated Jan 8, 2018
  8. Human triple negative breast cancer tissues
  9. Breast Cancer Prediction Dataset
    Updated Sep 26, 2018
  10. Data from: Gene expression variation to predict 10-year survival in...
  11. Ctr9, a key subunit of PAFc, affects global estrogen signaling and drives...
    Updated Sep 25, 2015
  12. Data from: Decision Tree-Based Learning Using Multi-Attributed Lens
    Updated 2013


Breast Cancer Wisconsin (Diagnostic) Data Set

Predict whether the cancer is benign or malignant

18 scholarly articles cite this dataset
Available download formats: zip (125204 bytes)
Dataset updated Sep 25, 2016
Dataset provided by
UCI Machine Learning

Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0)
License information was derived automatically


Features are computed from a digitized image of a fine needle aspirate (FNA) of a breast mass. They describe characteristics of the cell nuclei present in the image. The linear program used to obtain the separating plane in the 3-dimensional space is that described in: [K. P. Bennett and O. L. Mangasarian: "Robust Linear Programming Discrimination of Two Linearly Inseparable Sets", Optimization Methods and Software 1, 1992, 23-34].

This database is also available through the UW CS ftp server: ftp cd math-prog/cpo-dataset/machine-learn/WDBC/

It can also be found on the UCI Machine Learning Repository.

Attribute Information:

1) ID number
2) Diagnosis (M = malignant, B = benign)
3-32) Ten real-valued features are computed for each cell nucleus:

a) radius (mean of distances from center to points on the perimeter)
b) texture (standard deviation of gray-scale values)
c) perimeter
d) area
e) smoothness (local variation in radius lengths)
f) compactness (perimeter^2 / area - 1.0)
g) concavity (severity of concave portions of the contour)
h) concave points (number of concave portions of the contour)
i) symmetry
j) fractal dimension ("coastline approximation" - 1)

The mean, standard error and "worst" or largest (mean of the three largest values) of these features were computed for each image, resulting in 30 features. For instance, field 3 is Mean Radius, field 13 is Radius SE, field 23 is Worst Radius.
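The field layout described above can be sketched in Python. This is a minimal illustration, not part of the dataset's own tooling: the column names (such as `radius_mean`) and the helper `field_names` are hypothetical, and the sketch assumes the three statistics are stored in blocks of ten, which is what the example fields 3, 13 and 23 suggest.

```python
# Base feature names in the order a) through j) of the attribute list.
FEATURES = [
    "radius", "texture", "perimeter", "area", "smoothness",
    "compactness", "concavity", "concave_points", "symmetry",
    "fractal_dimension",
]
# Statistics in the order implied by the example fields 3, 13 and 23.
STATS = ["mean", "se", "worst"]


def field_names():
    """Map each 1-based field number to a hypothetical column name."""
    names = {1: "id", 2: "diagnosis"}
    field = 3
    for stat in STATS:            # one block of ten fields per statistic
        for feature in FEATURES:  # features keep the a) to j) order
            names[field] = f"{feature}_{stat}"
            field += 1
    return names


names = field_names()
print(names[3], names[13], names[23])
print(len(names))  # 2 identifying fields + 30 feature fields
```

Under these assumptions the mapping reproduces the example in the text: field 3 is the mean radius, field 13 the radius standard error, and field 23 the worst radius.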

All feature values are recoded with four significant digits.
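As a rough illustration of how one feature could be summarized per image, the sketch below computes the mean, standard error and "worst" value from per-nucleus measurements and rounds each to four significant digits. The function `summarize` and the sample radii are hypothetical, and the standard error formula (sample standard deviation divided by the square root of the count) is an assumption, since the description does not state how it was computed.

```python
import math
import statistics


def summarize(values):
    """Return (mean, standard error, worst) rounded to four significant digits."""
    mean = statistics.fmean(values)
    # Assumption: standard error = sample standard deviation / sqrt(n).
    se = statistics.stdev(values) / math.sqrt(len(values))
    # "Worst" as described in the text: mean of the three largest values.
    worst = statistics.fmean(sorted(values)[-3:])
    # The ".4g" format keeps four significant digits.
    return tuple(float(f"{x:.4g}") for x in (mean, se, worst))


# Hypothetical per-nucleus radius measurements for a single image.
radii = [10.0, 12.0, 14.0, 16.0, 18.0]
print(summarize(radii))
```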

Missing attribute values: none

Class distribution: 357 benign, 212 malignant
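The class counts above imply a moderately imbalanced dataset. The short sketch below works out the class shares and the accuracy of a trivial classifier that always predicts the majority class; the variable names are illustrative only.

```python
# Class counts as stated in the dataset description.
counts = {"benign": 357, "malignant": 212}

total = sum(counts.values())
shares = {label: n / total for label, n in counts.items()}

# Accuracy of always predicting the most frequent class.
majority_baseline = max(shares.values())

print(total)
for label, share in shares.items():
    print(f"{label}: {share:.1%}")
print(f"majority baseline accuracy: {majority_baseline:.1%}")
```

Any classifier trained on this data is worth comparing against that baseline rather than against 50%.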
