100+ datasets found

Breast Cancer Prediction Dataset
kaggle.com
Updated Sep 26, 2018
+ more versions
Cite
Merishna Singh Suwal (2018). Breast Cancer Prediction Dataset [Dataset]. https://www.kaggle.com/datasets/merishnasuwal/breast-cancer-prediction-dataset
Croissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
Dataset updated
Sep 26, 2018
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Merishna Singh Suwal
Description
Worldwide, breast cancer is the most common type of cancer in women and the second highest in terms of mortality rates. Diagnosis of breast cancer is performed when an abnormal lump is found (from self-examination or x-ray) or a tiny speck of calcium is seen (on an x-ray). After a suspicious lump is found, the doctor will conduct a diagnosis to determine whether it is cancerous and, if so, whether it has spread to other parts of the body.

This breast cancer dataset was obtained from the University of Wisconsin Hospitals, Madison from Dr. William H. Wolberg.
Breast Ultrasound Images Dataset(BUSI)
kaggle.com
Updated Mar 1, 2023
Cite
Saba Hesaraki (2023). Breast Ultrasound Images Dataset(BUSI) [Dataset]. https://www.kaggle.com/datasets/sabahesaraki/breast-ultrasound-images-dataset
Dataset updated
Mar 1, 2023
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Saba Hesaraki
Description
Abstract: The data collected at baseline include breast ultrasound images of women between 25 and 75 years old. The data were organized in 2018 and come from 600 female patients. The dataset consists of 780 images with an average image size of 500×500 pixels. The images are in PNG format. The ground truth images are presented with the original images. The images are categorized into three classes: normal, benign, and malignant.
Breast Cancer Coimbra
kaggle.com
Updated Jan 7, 2024
Cite
Vivek Agrawal (2024). Breast Cancer Coimbra [Dataset]. https://www.kaggle.com/datasets/atom1991/breast-cancer-coimbra
Dataset updated
Jan 7, 2024
Dataset provided by
Kaggle
Authors
Vivek Agrawal
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This dataset originates from a deep learning model trained on the "Coimbra Breast Cancer" dataset, with feature distributions closely resembling the original. The original data includes clinical observations from 64 patients with breast cancer and 52 healthy controls, encompassing 10 quantitative predictors and a binary dependent variable indicating the presence or absence of breast cancer.

Quantitative Attributes:

Age (years): Represents the age of individuals in the dataset.

BMI (kg/m²): Body Mass Index, a measure of body fat based on weight and height.

Glucose (mg/dL): Reflects blood glucose levels, a vital metabolic indicator.

Insulin (µU/mL): Indicates insulin levels, a hormone associated with glucose regulation.

HOMA: Homeostatic Model Assessment, a method assessing insulin resistance and beta-cell function.

Leptin (ng/mL): Represents leptin levels, a hormone involved in appetite and energy balance regulation.

Adiponectin (µg/mL): Reflects adiponectin levels, a protein associated with metabolic regulation.

Resistin (ng/mL): Indicates resistin levels, a protein implicated in insulin resistance.

MCP-1 (pg/dL): Reflects Monocyte Chemoattractant Protein-1 levels, a cytokine involved in inflammation.

Labels:

1: Healthy controls

2: Patients with breast cancer

These quantitative attributes, including anthropometric data and parameters gathered from routine blood analysis, serve as the foundation for potential biomarkers of breast cancer. The dataset presents an opportunity for developing accurate prediction models, aiding in the identification and understanding of factors associated with breast cancer.
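As a rough illustration of how the HOMA attribute relates to the Glucose and Insulin columns, the sketch below uses the widely cited HOMA-IR approximation, glucose (mg/dL) × insulin (µU/mL) / 405. Whether the HOMA column in this dataset was derived with exactly this formula is an assumption, and the function names and the sample record are hypothetical. The recoding of labels 1/2 to 0/1 is only a convenience for binary classifiers.

```python
def homa_ir(glucose_mg_dl: float, insulin_uu_ml: float) -> float:
    """HOMA-IR approximation for glucose in mg/dL and insulin in µU/mL.

    Assumption: the dataset's HOMA column follows this common formula.
    """
    return glucose_mg_dl * insulin_uu_ml / 405.0


def recode_label(label: int) -> int:
    """Map the dataset labels (1 = healthy control, 2 = patient) to 0/1."""
    if label not in (1, 2):
        raise ValueError(f"unexpected label: {label!r}")
    return label - 1


# Hypothetical record, not taken from the dataset.
record = {"Glucose": 90.0, "Insulin": 4.5, "Label": 2}

homa = homa_ir(record["Glucose"], record["Insulin"])
target = recode_label(record["Label"])
print(round(homa, 3), target)  # 1.0 1
```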
SEER Breast Cancer Data
ieee-dataport.org
data.niaid.nih.gov
+2 more
Updated Jul 29, 2025
Cite
jing teng (2025). SEER Breast Cancer Data [Dataset]. https://ieee-dataport.org/open-access/seer-breast-cancer-data
Dataset updated
Jul 29, 2025
Authors
jing teng
Description
examined regional LNs
Breast-Cancer-Cell-Dataset
huggingface.co
Updated Jun 7, 2024
Cite
Mahadi Hassan (2024). Breast-Cancer-Cell-Dataset [Dataset]. https://huggingface.co/datasets/Mahadih534/Breast-Cancer-Cell-Dataset
Dataset updated
Jun 7, 2024
Authors
Mahadi Hassan
License
https://choosealicense.com/licenses/cc/
Description
Data Source

https://www.kaggle.com/datasets/andrewmvd/breast-cancer-cell-segmentation

Dataset Card Authors

Mahadi Hassan

Dataset Card Contact: mahadise01@gmail.com, LinkedIn: https://www.linkedin.com/in/mahadise01, GitHub: https://github.com/Mahadih534
‘Breast Cancer Dataset’ analyzed by Analyst-2
analyst-2.ai
Updated Jan 28, 2022
Cite
Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com) (2022). ‘Breast Cancer Dataset’ analyzed by Analyst-2 [Dataset]. https://analyst-2.ai/analysis/kaggle-breast-cancer-dataset-ba67/2037810e/?iid=003-192&v=presentation
Dataset updated
Jan 28, 2022
Dataset authored and provided by
Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com)
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Analysis of ‘Breast Cancer Dataset’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://www.kaggle.com/yasserh/breast-cancer-dataset on 28 January 2022.

--- Dataset description provided by original source is as follows ---

Description:

Breast cancer is the most common cancer amongst women in the world. It accounts for 25% of all cancer cases, and affected over 2.1 Million people in 2015 alone. It starts when cells in the breast begin to grow out of control. These cells usually form tumors that can be seen via X-ray or felt as lumps in the breast area.

The key challenge in its detection is how to classify tumors as malignant (cancerous) or benign (non-cancerous). We ask you to complete the analysis of classifying these tumors using machine learning (with SVMs) and the Breast Cancer Wisconsin (Diagnostic) Dataset.

Acknowledgements:

This dataset was sourced from Kaggle.

Objective:

Understand the Dataset & cleanup (if required).

Build classification models to predict whether the cancer type is Malignant or Benign.

Also fine-tune the hyperparameters & compare the evaluation metrics of various classification algorithms.

--- Original source retains full ownership of the source dataset ---
Breast Cancer MSI Multimodal Image Dataset
kaggle.com
Updated May 31, 2025
Cite
Developer (2025). Breast Cancer MSI Multimodal Image Dataset [Dataset]. https://www.kaggle.com/datasets/zoya77/breast-cancer-msi-multimodal-image-dataset
Dataset updated
May 31, 2025
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Developer
License
https://creativecommons.org/publicdomain/zero/1.0/
Description
The multispectral breast cancer image datasets span three complementary imaging modalities: Ultrasound, Histopathological, and Chest X-ray. Each dataset includes balanced classes of benign and malignant cases, and the images are enhanced through spectral conversion (RGB, HSV, Jet) to support robust multispectral analysis for classification and fusion tasks.

MSI Ultrasound Breast Images for Breast Cancer: This dataset contains ultrasound images of breast tissue labeled as either benign or malignant.

Total Images: 806

Benign: 406 images

Malignant: 400 images

Augmentation: Data augmentation techniques such as rotation and sharpening were applied to enhance the diversity and volume of the dataset, enabling robust training of machine learning models.

MSI BreastHis – Breast Cancer Histopathological Images: This dataset comprises high-resolution microscopic images of breast tumor tissue collected for histopathological analysis. These images provide cellular-level detail and are essential for determining cancer grade and type.

Total Images Used: 1,246 (Subset of the full BreakHis dataset)

Benign: 623 images

Malignant: 623 images

MSI Chest X-Ray for Breast Cancer: This dataset consists of colorized chest X-ray images used for identifying breast cancer-related anomalies. While traditionally not the primary modality for breast cancer detection, chest X-rays can provide useful structural insights when used in conjunction with other imaging types.

Total Images: 1,000

Benign: 500 images

Malignant: 500 images
‘Breast Cancer Dataset’ analyzed by Analyst-2
analyst-2.ai
Updated Feb 10, 2022
Cite
Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com) (2022). ‘Breast Cancer Dataset’ analyzed by Analyst-2 [Dataset]. https://analyst-2.ai/analysis/kaggle-breast-cancer-dataset-8088/8c45569a/?iid=003-199&v=presentation
Dataset updated
Feb 10, 2022
Dataset authored and provided by
Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com)
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Analysis of ‘Breast Cancer Dataset’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://www.kaggle.com/jainilcoder/breast-cancer-dataset on 13 February 2022.

--- Dataset description provided by original source is as follows ---

Content

The dataset contains 32 columns and 570 rows comprising all the parameters used for detecting breast cancer.

Inspiration

Your task is to predict whether the cancer is benign or malignant. You can also perform exploratory data analysis and visualize the data for practice.

--- Original source retains full ownership of the source dataset ---
‘Breast Cancer Diagnostic Dataset (BCD)’ analyzed by Analyst-2
analyst-2.ai
Updated Feb 14, 2022
Cite
Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com) (2022). ‘Breast Cancer Diagnostic Dataset (BCD)’ analyzed by Analyst-2 [Dataset]. https://analyst-2.ai/analysis/kaggle-breast-cancer-diagnostic-dataset-bcd-63e2/50e77951/?iid=012-854&v=presentation
Dataset updated
Feb 14, 2022
Dataset authored and provided by
Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com)
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Analysis of ‘Breast Cancer Diagnostic Dataset (BCD)’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://www.kaggle.com/devraikwar/breast-cancer-diagnostic on 14 February 2022.

--- Dataset description provided by original source is as follows ---

Context

The resources for this dataset can be found at https://www.openml.org/d/13 and https://archive.ics.uci.edu/ml/datasets/Breast+Cancer+Wisconsin+%28Diagnostic%29

Content

This data set includes 201 instances of one class and 85 instances of another class. The instances are described by 9 attributes, some of which are linear and some are nominal.

Number of Instances: 286

Number of Attributes: 9 + the class attribute

Attribute Information:

Class: no-recurrence-events, recurrence-events
age: 10-19, 20-29, 30-39, 40-49, 50-59, 60-69, 70-79, 80-89, 90-99.
menopause: lt40, ge40, premeno.
tumor-size: 0-4, 5-9, 10-14, 15-19, 20-24, 25-29, 30-34, 35-39, 40-44, 45-49, 50-54, 55-59.
inv-nodes: 0-2, 3-5, 6-8, 9-11, 12-14, 15-17, 18-20, 21-23, 24-26, 27-29, 30-32, 33-35, 36-39.
node-caps: yes, no.
deg-malig: 1, 2, 3.
breast: left, right.
breast-quad: left-up, left-low, right-up, right-low, central.
irradiat: yes, no.

Missing Attribute Values (denoted by “?”):

Attribute 6: 8 instances with missing values
Attribute 9: 1 instance with missing values

Class Distribution:

no-recurrence-events: 201 instances
recurrence-events: 85 instances
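
The attribute description above suggests how a record could be parsed: binned attributes become numeric ranges and “?” becomes a missing value. The following is a minimal sketch under stated assumptions. The comma-separated layout, the column order (taken from the attribute listing), and the sample row are all hypothetical and should be checked against the actual file.

```python
from typing import Dict, Optional, Tuple

# Assumption: columns appear in the same order as the attribute listing.
COLUMNS = [
    "Class", "age", "menopause", "tumor-size", "inv-nodes",
    "node-caps", "deg-malig", "breast", "breast-quad", "irradiat",
]
RANGE_COLUMNS = {"age", "tumor-size", "inv-nodes"}


def parse_range(value: str) -> Optional[Tuple[int, int]]:
    """Turn a bin such as '30-39' into (30, 39); '?' means missing."""
    if value == "?":
        return None
    low, high = value.split("-")
    return int(low), int(high)


def parse_row(line: str) -> Dict[str, object]:
    """Parse one comma-separated record into a dict keyed by attribute."""
    values = line.strip().split(",")
    if len(values) != len(COLUMNS):
        raise ValueError(f"expected {len(COLUMNS)} fields, got {len(values)}")
    row: Dict[str, object] = {}
    for name, value in zip(COLUMNS, values):
        if value == "?":
            row[name] = None              # missing attribute value
        elif name in RANGE_COLUMNS:
            row[name] = parse_range(value)
        elif name == "deg-malig":
            row[name] = int(value)
        else:
            row[name] = value
    return row


# Hypothetical record, written to match the attribute listing.
sample = "no-recurrence-events,30-39,premeno,30-34,0-2,?,3,left,left-low,no"
row = parse_row(sample)
print(row["age"], row["node-caps"])  # (30, 39) None

# Accuracy of always predicting the majority class, from the stated counts.
majority_baseline = 201 / (201 + 85)
print(round(majority_baseline, 3))  # 0.703
```

The majority-class figure is a useful floor: with 201 of 286 instances in one class, a model that scores near 70% accuracy has learned little.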

Acknowledgements

Original data https://archive.ics.uci.edu/ml/datasets/Breast+Cancer+Wisconsin+%28Diagnostic%29

Inspiration

With the attributes described above, can you predict whether a patient has a recurrence event?

--- Original source retains full ownership of the source dataset ---
‘Breast Cancer Wisconsin (Diagnostic) Data Set’ analyzed by Analyst-2
analyst-2.ai
Updated Feb 1, 2001
Cite
Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com) (2001). ‘Breast Cancer Wisconsin (Diagnostic) Data Set’ analyzed by Analyst-2 [Dataset]. https://analyst-2.ai/analysis/kaggle-breast-cancer-wisconsin-diagnostic-data-set-2558/4a42d794/?iid=003-113&v=presentation
Dataset updated
Feb 1, 2001
Dataset authored and provided by
Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com)
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Analysis of ‘Breast Cancer Wisconsin (Diagnostic) Data Set’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://www.kaggle.com/uciml/breast-cancer-wisconsin-data on 28 January 2022.

--- Dataset description provided by original source is as follows ---

Features are computed from a digitized image of a fine needle aspirate (FNA) of a breast mass. They describe characteristics of the cell nuclei present in the image. The actual linear program used to get the separating plane in the 3-dimensional space is that described in: [K. P. Bennett and O. L. Mangasarian: "Robust Linear Programming Discrimination of Two Linearly Inseparable Sets", Optimization Methods and Software 1, 1992, 23-34].

This database is also available through the UW CS ftp server:

ftp ftp.cs.wisc.edu
cd math-prog/cpo-dataset/machine-learn/WDBC/

Also can be found on UCI Machine Learning Repository: https://archive.ics.uci.edu/ml/datasets/Breast+Cancer+Wisconsin+%28Diagnostic%29

Attribute Information:

1) ID number
2) Diagnosis (M = malignant, B = benign)
3-32) Ten real-valued features are computed for each cell nucleus:

a) radius (mean of distances from center to points on the perimeter)
b) texture (standard deviation of gray-scale values)
c) perimeter
d) area
e) smoothness (local variation in radius lengths)
f) compactness (perimeter^2 / area - 1.0)
g) concavity (severity of concave portions of the contour)
h) concave points (number of concave portions of the contour)
i) symmetry
j) fractal dimension ("coastline approximation" - 1)

The mean, standard error and "worst" or largest (mean of the three largest values) of these features were computed for each image, resulting in 30 features. For instance, field 3 is Mean Radius, field 13 is Radius SE, field 23 is Worst Radius.
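
The field numbering described above (field 3 is Mean Radius, field 13 is Radius SE, field 23 is Worst Radius) implies that the 30 features are laid out as three consecutive blocks of ten. A minimal sketch of that mapping follows. The assumption is that, within each block, features keep the a) to j) order of the list; the function name and the label strings are illustrative only.

```python
FEATURES = [
    "radius", "texture", "perimeter", "area", "smoothness",
    "compactness", "concavity", "concave points", "symmetry",
    "fractal dimension",
]
STATISTICS = ["mean", "se", "worst"]  # block order implied by fields 3, 13, 23


def field_name(field_number: int) -> str:
    """Return a readable name for a 1-based field number (1..32)."""
    if field_number == 1:
        return "id"
    if field_number == 2:
        return "diagnosis"
    if not 3 <= field_number <= 32:
        raise ValueError(f"field number out of range: {field_number}")
    # Fields 3-12 are means, 13-22 standard errors, 23-32 worst values.
    stat_index, feature_index = divmod(field_number - 3, len(FEATURES))
    return f"{FEATURES[feature_index]} {STATISTICS[stat_index]}"


for number in (3, 13, 23):
    print(number, field_name(number))
# 3 radius mean
# 13 radius se
# 23 radius worst
```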

All feature values are recoded with four significant digits.

Missing attribute values: none

Class distribution: 357 benign, 212 malignant

--- Original source retains full ownership of the source dataset ---
BreastSwinFedNetX
figshare.com
Updated Mar 6, 2025
Cite
Rezaul Haque (2025). BreastSwinFedNetX [Dataset]. http://doi.org/10.6084/m9.figshare.28548758.v1
Available download formats: zip
Unique identifier
https://doi.org/10.6084/m9.figshare.28548758.v1
Dataset updated
Mar 6, 2025
Dataset provided by
figshare
Authors
Rezaul Haque
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
The datasets used in this study were collected from the Kaggle platform. Below are their available links:

1. BreakHis: https://www.kaggle.com/datasets/ambarish/breakhis
2. Breast Ultrasound Images Dataset (BUSI): https://www.kaggle.com/datasets/sabahesaraki/breast-ultrasound-images-dataset
3. CBIS-DDSM: https://www.kaggle.com/datasets/seanbaek19/cbis-ddsm-4096
4. INbreast: https://www.kaggle.com/datasets/eoussama/breast-cancer-mammograms/data
5. Combined Dataset: https://www.kaggle.com/datasets/rezaullhaque/combined-dataset/data

The total data size is 26 GB.
machine learning models on the WDBC dataset
scidb.cn
Updated Apr 15, 2025
Cite
Mahdi Aghaziarati (2025). machine learning models on the WDBC dataset [Dataset]. http://doi.org/10.57760/sciencedb.23537
Unique identifier
https://doi.org/10.57760/sciencedb.23537
Dataset updated
Apr 15, 2025
Dataset provided by
Science Data Bank
Authors
Mahdi Aghaziarati
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
The dataset used in this study is the Wisconsin Diagnostic Breast Cancer (WDBC) dataset, originally provided by the University of Wisconsin and obtained via Kaggle. It consists of 569 observations, each corresponding to a digitized image of a fine needle aspirate (FNA) of a breast mass. The dataset contains 32 attributes: one identifier column (discarded during preprocessing), one diagnosis label (malignant or benign), and 30 continuous real-valued features that describe the morphology of cell nuclei. These features are grouped into three statistical descriptors—mean, standard error (SE), and worst (mean of the three largest values)—for ten morphological properties including radius, perimeter, area, concavity, and fractal dimension. All feature values were normalized using z-score standardization to ensure uniform scale across models sensitive to input ranges. No missing values were present in the original dataset. Label encoding was applied to the diagnosis column, assigning 1 to malignant and 0 to benign cases. The dataset was split into training (80%) and testing (20%) sets while preserving class balance via stratified sampling. The accompanying Python source code (breast_cancer_classification_models.py) performs data loading, preprocessing, model training, evaluation, and result visualization. Four lightweight classifiers—Decision Tree, Naïve Bayes, Perceptron, and K-Nearest Neighbors (KNN)—were implemented using the scikit-learn library (version 1.2 or later). Performance metrics including Accuracy, Precision, Recall, F1-score, and ROC-AUC were calculated for each model. Confusion matrices and ROC curves were generated and saved as PNG files for interpretability. All results are saved in a structured CSV file (classification_results.csv) that contains the performance metrics for each model. 
Supplementary visualizations include all_feature_histograms.png (distribution plots for all standardized features), model_comparison.png (metric-wise bar plot), and feature_correlation_heatmap.png (Pearson correlation matrix of all 30 features). The data files are in standard CSV and PNG formats and can be opened using any spreadsheet or image viewer, respectively. No rare file types are used, and all scripts are compatible with any Python 3.x environment. This data package enables reproducibility and offers a transparent overview of how baseline machine learning models perform in the domain of breast cancer diagnosis using a clinically-relevant dataset.
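
The preprocessing steps described above (label encoding, a stratified 80/20 split, and z-score standardization) can be sketched in plain Python. This is not the package's breast_cancer_classification_models.py, which uses scikit-learn. The function names and the synthetic feature values are hypothetical, and only the class counts (212 malignant, 357 benign) reflect the WDBC dataset. Fitting the mean and standard deviation on the training split alone is a choice made here; the description does not say which rows were used.

```python
import random
import statistics
from collections import defaultdict


def encode_label(diagnosis: str) -> int:
    """Label encoding as described: malignant -> 1, benign -> 0."""
    return {"M": 1, "B": 0}[diagnosis]


def stratified_split(labels, test_fraction=0.2, seed=0):
    """Split indices so each class keeps roughly the same proportion."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)
    train, test = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        n_test = round(len(indices) * test_fraction)
        test.extend(indices[:n_test])
        train.extend(indices[n_test:])
    return sorted(train), sorted(test)


def zscore(values, mean, std):
    """Standardize values with a given mean and standard deviation."""
    return [(v - mean) / std for v in values]


# WDBC class counts; the feature values below are synthetic.
labels = [encode_label(d) for d in ["M"] * 212 + ["B"] * 357]
rng = random.Random(42)
radius = [rng.gauss(17.0, 3.0) if y == 1 else rng.gauss(12.0, 2.0)
          for y in labels]

train_idx, test_idx = stratified_split(labels)

# Fit the scaling parameters on the training rows only, then apply to both.
train_values = [radius[i] for i in train_idx]
mean = statistics.fmean(train_values)
std = statistics.pstdev(train_values)
train_z = zscore(train_values, mean, std)
test_z = zscore([radius[i] for i in test_idx], mean, std)

print(len(train_idx), len(test_idx))  # 456 113
```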
‘Breast Cancer Prediction’ analyzed by Analyst-2
analyst-2.ai
Updated Jan 28, 2022
Cite
Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com) (2022). ‘Breast Cancer Prediction’ analyzed by Analyst-2 [Dataset]. https://analyst-2.ai/analysis/kaggle-breast-cancer-prediction-452a/fc63a1e1/?iid=002-655&v=presentation
Dataset updated
Jan 28, 2022
Dataset authored and provided by
Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com)
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Analysis of ‘Breast Cancer Prediction’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://www.kaggle.com/shubamsumbria/breast-cancer-prediction on 28 January 2022.

--- Dataset description provided by original source is as follows ---

Data Set Information: Features are computed from a digitized image of a fine needle aspirate (FNA) of a breast mass. They describe the characteristics of the cell nuclei. The separating plane described above was obtained using Multi-surface Method-Tree (MSM-T) [K. P. Bennett, “Decision Tree Construction Via Linear Programming.” Proceedings of the 4th Midwest Artificial Intelligence and Cognitive Science Society, pp. 97–101, 1992], a classification method which uses linear programming to construct a decision tree. Relevant features were selected using an exhaustive search in the space of 1–4 features and 1–3 separating planes. The actual linear program used to get the separating plane in the 3-dimensional space is that described in: [K. P. Bennett and O. L. Mangasarian: “Robust Linear Programming Discrimination of Two Linearly Inseparable Sets”, Optimization Methods and Software 1, 1992, 23–34].

Attribute Information:

ID number
Diagnosis (M = malignant, B = benign)
Ten real-valued features are computed for each cell nucleus (3–32):

a) radius (mean of distances from the center to points on the perimeter)
b) texture (standard deviation of gray-scale values)
c) perimeter
d) area
e) smoothness (local variation in radius lengths)
f) compactness (perimeter² / area - 1.0)
g) concavity (severity of concave portions of the contour)
h) concave points (number of concave portions of the contour)
i) symmetry
j) fractal dimension (“coastline approximation” - 1)
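
To make the geometric definitions above concrete, the sketch below computes radius, perimeter, area, and compactness (perimeter² / area - 1.0) for a contour given as a list of (x, y) points. It is an illustration only: how the original feature extraction traced nucleus boundaries is not described here, and using the mean of the contour points as the center is an assumption. The function names are hypothetical.

```python
import math


def perimeter(points):
    """Sum of edge lengths around a closed polygonal contour."""
    n = len(points)
    return sum(math.dist(points[i], points[(i + 1) % n]) for i in range(n))


def area(points):
    """Polygon area by the shoelace formula."""
    n = len(points)
    total = 0.0
    for i in range(n):
        x1, y1 = points[i]
        x2, y2 = points[(i + 1) % n]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def mean_radius(points):
    """Mean distance from the center to the contour points.

    Assumption: the center is the mean of the contour points.
    """
    cx = sum(x for x, _ in points) / len(points)
    cy = sum(y for _, y in points) / len(points)
    return sum(math.dist((cx, cy), p) for p in points) / len(points)


def compactness(points):
    """perimeter^2 / area - 1.0, as in the attribute list."""
    return perimeter(points) ** 2 / area(points) - 1.0


square = [(0.0, 0.0), (2.0, 0.0), (2.0, 2.0), (0.0, 2.0)]
print(compactness(square))  # 15.0

# A many-sided regular polygon approximates a circle, whose compactness
# under this definition is 4*pi - 1 regardless of its radius.
circle = [(5 * math.cos(2 * math.pi * k / 360),
           5 * math.sin(2 * math.pi * k / 360)) for k in range(360)]
print(compactness(circle), 4 * math.pi - 1)
```

Under this definition a circle gives the smallest possible value, so larger values indicate a less regular outline.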

Cite at: Dua, D. and Graff, C. (2019). UCI Machine Learning Repository [http://archive.ics.uci.edu/ml]. Irvine, CA: University of California, School of Information and Computer Science.

--- Original source retains full ownership of the source dataset ---
Breast Cancer Dataset V2
kaggle.com
Updated Apr 7, 2022
Cite
Cady Wang (2022). Breast Cancer Dataset V2 [Dataset]. https://www.kaggle.com/datasets/cadywang/breast-cancer-dataset-v2/data
Dataset updated
Apr 7, 2022
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Cady Wang
Description
Dataset

This dataset was created by Cady Wang

Contents
Breast_Cancer_Dataset
kaggle.com
Updated Mar 21, 2025
+ more versions
Cite
Shivam_ay (2025). Breast_Cancer_Dataset [Dataset]. https://www.kaggle.com/datasets/shivamay/breast-cancer-dataset
Dataset updated
Mar 21, 2025
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Shivam_ay
License
Apache License, v2.0, https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
Description
Dataset

This dataset was created by Shivam_ay

Released under Apache 2.0

Contents
Digital mammography Dataset for Breast Cancer Diagnosis Research (DMID)
figshare.com
Updated Nov 8, 2023
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Parita Oza; Rajiv Oza; Urvi Oza; Paawan Sharma; Samir Patel; Pankaj Kumar; Bakul Gohel (2023). Digital mammography Dataset for Breast Cancer Diagnosis Research (DMID) [Dataset]. http://doi.org/10.6084/m9.figshare.24522883.v2
Available download formats: zip
Unique identifier
https://doi.org/10.6084/m9.figshare.24522883.v2
Dataset updated
Nov 8, 2023
Dataset provided by
figshare
Authors
Parita Oza; Rajiv Oza; Urvi Oza; Paawan Sharma; Samir Patel; Pankaj Kumar; Bakul Gohel
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This dataset contains mammogram images and can be used for research and education purposes only. It includes DCM images, TIFF images, a radiology report, a segmented mask, pixel-level annotation of abnormal regions, and a CSV file that contains other metadata.
Breast_Cancer_dataset.
kaggle.com
Updated Jun 15, 2023
Cite
Nitesh Sahu☑️ (2023). Breast_Cancer_dataset. [Dataset]. https://www.kaggle.com/datasets/niteshsahu99/breast-cancer-dataset
Dataset updated
Jun 15, 2023
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Nitesh Sahu☑️
Description
Dataset

This dataset was created by Nitesh Sahu☑️

Contents
breast cancer dataset
kaggle.com
Updated Aug 23, 2019
Cite
song (2019). breast cancer dataset [Dataset]. https://www.kaggle.com/datasets/youzipi/breast-cancer-dataset/suggestions
Dataset updated
Aug 23, 2019
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
song
Description
Dataset

This dataset was created by song

Contents
‘Anticancer peptides Data Set’ analyzed by Analyst-2
analyst-2.ai
Updated Apr 6, 2019
Cite
Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com) (2019). ‘Anticancer peptides Data Set’ analyzed by Analyst-2 [Dataset]. https://analyst-2.ai/analysis/kaggle-anticancer-peptides-data-set-5a0f/dce97c92/?iid=000-327&v=presentation
Dataset updated
Apr 6, 2019
Dataset authored and provided by
Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com)
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Analysis of ‘Anticancer peptides Data Set’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://www.kaggle.com/anuragupadhyaya/anticancer-peptides-data-set on 28 January 2022.

--- Dataset description provided by original source is as follows ---

Membranolytic anticancer peptides (ACPs) are drawing increasing attention as potential future therapeutics against cancer, due to their ability to hinder the development of cellular resistance and their potential to overcome common hurdles of chemotherapy, e.g., side effects and cytotoxicity. This dataset contains information on peptides (annotated for their one-letter amino acid code) and their anticancer activity on breast and lung cancer cell lines.

Two peptide datasets targeting breast and lung cancer cells were assembled and curated manually from CancerPPD. EC50, IC50, LD50 and LC50 annotations on breast and lung cancer cells were retained (breast cell lines: MCF7 = 57%, MDA-MB-361 = 11%, MT-1 = 9%; lung cell lines: H-1299 = 45%, A-549 = 17.7%); mg ml⁻¹ values were converted to μM units. Linear and l-chiral peptides were retained, while cyclic, mixed or d-chiral peptides were discarded. In the presence of both amidated and non-amidated data for the same sequence, only the value referred to the amidated peptide was retained.

Peptides were split into three classes for model training:

(1) very active (EC/IC/LD/LC50 ≤ 5 μM)
(2) moderately active (EC/IC/LD/LC50 values up to 50 μM)
(3) inactive (EC/IC/LD/LC50 > 50 μM)

Duplicates with conflicting class annotations were compared manually to the original sources, and, if necessary, corrected. If multiple class annotations were present for the same sequence, the most frequently represented class was chosen; in case of ties, the less active class was chosen. Since the CancerPPD is biased towards the annotation of active peptides, we built a set of presumably inactive peptides by randomly extracting 750 alpha-helical sequences from crystal structures deposited in the Protein Data Bank (7–30 amino acids). The final training sets contained 949 peptides for Breast cancer and 901 peptides for Lung cancer.
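
The class thresholds and the tie-breaking rule described above can be expressed compactly. The sketch below is illustrative only: the function names are hypothetical, the unit conversion is ordinary concentration arithmetic rather than a documented step of the original curation, and the example values are invented.

```python
from collections import Counter
from typing import Sequence

# Ordered from most to least active, so a larger index means less active.
CLASSES = ("very active", "moderately active", "inactive")


def mg_per_ml_to_um(conc_mg_per_ml: float, mol_weight_g_per_mol: float) -> float:
    """Convert mg/ml to μM: (g/L) / (g/mol) gives mol/L, times 1e6 gives μM."""
    return conc_mg_per_ml / mol_weight_g_per_mol * 1e6


def activity_class(value_um: float) -> str:
    """Assign a class from an EC/IC/LD/LC50 value in μM."""
    if value_um <= 5:
        return "very active"
    if value_um <= 50:
        return "moderately active"
    return "inactive"


def resolve_annotations(annotations: Sequence[str]) -> str:
    """Pick the most frequent class; on ties, pick the less active one."""
    counts = Counter(annotations)
    top = max(counts.values())
    tied = [name for name, count in counts.items() if count == top]
    return max(tied, key=CLASSES.index)


# Invented example: 0.02 mg/ml of a peptide weighing 2000 g/mol is about 10 μM.
value = mg_per_ml_to_um(0.02, 2000.0)
print(activity_class(value))  # moderately active

print(resolve_annotations(["very active", "inactive"]))  # inactive
```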

--- Original source retains full ownership of the source dataset ---
Breast Cancer Dataset
kaggle.com
Updated Aug 20, 2020
Cite
Himanshu Madan (2020). Breast Cancer Dataset [Dataset]. https://www.kaggle.com/tug004/breast-cancer-dataset/code
Dataset updated
Aug 20, 2020
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Himanshu Madan
Description
Dataset

This dataset was created by Himanshu Madan

Contents

Breast Cancer Prediction Dataset

Dataset created for "AI for Social Good: Women Coders' Bootcamp"

6 scholarly articles cite this dataset
