100+ datasets found
  1. Cancer Incidence - Surveillance, Epidemiology, and End Results (SEER)...

    • catalog.data.gov
    • healthdata.gov
    • +2 more
    Updated Jul 16, 2025
    Cite
    National Cancer Institute (NCI), National Institutes of Health (NIH) (2025). Cancer Incidence - Surveillance, Epidemiology, and End Results (SEER) Registries Limited-Use [Dataset]. https://catalog.data.gov/dataset/cancer-incidence-surveillance-epidemiology-and-end-results-seer-registries-limited-use
    Dataset updated
    Jul 16, 2025
    Dataset provided by
    National Cancer Institute (http://www.cancer.gov/)
    Description

    SEER Limited-Use cancer incidence data with associated population data. Geographic areas available are county and SEER registry. The Surveillance, Epidemiology, and End Results (SEER) Program of the National Cancer Institute collects and distributes high quality, comprehensive cancer data from a number of population-based cancer registries. Data include patient demographics, primary tumor site, morphology, stage at diagnosis, first course of treatment, and follow-up for vital status. The SEER Program is the only comprehensive source of population-based information in the United States that includes stage of cancer at the time of diagnosis and survival rates within each stage.

  2. COSMIC

    • cancer.sanger.ac.uk
    • grch37-cancer.sanger.ac.uk
    Updated Nov 18, 2025
    + more versions
    Cite
    Wellcome Sanger Institute (2025). COSMIC [Dataset]. http://doi.org/10.1093/nar/gkw1121
    Dataset updated
    Nov 18, 2025
    Dataset provided by
    Wellcome Sanger Institute
    Description

    COSMIC, the Catalogue Of Somatic Mutations In Cancer, is the world's largest and most comprehensive resource for exploring the impact of somatic mutations in human cancer.

  3. SEER Cancer Statistics Database

    • dataverse.harvard.edu
    • data.niaid.nih.gov
    Updated Jul 11, 2011
    Cite
    Harvard Dataverse (2011). SEER Cancer Statistics Database [Dataset]. http://doi.org/10.7910/DVN/C9KBBC
    Dataset updated
    Jul 11, 2011
    Dataset provided by
    Harvard Dataverse
    License

    CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
    License information was derived automatically

    Description

    Users can access data about cancer statistics in the United States, including but not limited to searches by type of cancer and by race, sex, ethnicity, age at diagnosis, and age at death.

    Background: The mission of the Surveillance, Epidemiology, and End Results (SEER) database is to provide information on cancer statistics to help reduce the burden of disease in the U.S. population. The SEER database is a project of the National Cancer Institute. It collects information on incidence, prevalence, and survival from specific geographic areas representing 28 percent of the United States population.

    User functionality: Users can access a variety of resources. Cancer Stat Fact Sheets allow users to look at summaries of statistics by major cancer type. Cancer Statistics Reviews are available from 1975-2008 in table format. Users are also able to build their own tables and graphs using Fast Stats. The Cancer Query system provides more flexibility and a larger set of cancer statistics than Fast Stats but requires more input from the user. State Cancer Profiles include dynamic maps and graphs enabling the investigation of cancer trends at the county, state, and national levels. SEER research data files and SEER*Stat software are available to download through your Internet connection (SEER*Stat's client-server mode) or via discs shipped directly to you. A signed data agreement form is required to access the SEER data.

    Data notes: Data are available in different formats depending on which type of data is accessed. Some data are available in table, PDF, and HTML formats. Detailed information about the data is available under "Data Documentation and Variable Recodes".

  4. Veterans Affairs Central Cancer Registry (VACCR)

    • data.va.gov
    • datahub.va.gov
    • +3 more
    csv, xlsx, xml
    Updated Sep 12, 2019
    + more versions
    Cite
    (2019). Veterans Affairs Central Cancer Registry (VACCR) [Dataset]. https://www.data.va.gov/dataset/Veterans-Affairs-Central-Cancer-Registry-VACCR-/jvmd-8fgj
    Available download formats: xlsx, xml, csv
    Dataset updated
    Sep 12, 2019
    Description

    The Veterans Affairs Central Cancer Registry (VACCR) receives and stores information on cancer diagnosis and treatment constraints compiled and sent in by the local cancer registry staff at each of the 132 Veterans Affairs Medical Centers that diagnose and/or treat Veterans with cancer. The information sent is encoded to meet the site-specific requirements for registry inclusion as established by several oversight bodies, including the North American Association of Central Cancer Registries, the American College of Surgeons' Commission on Cancer, and the American Joint Committee on Cancer, among others. The information is obtained from a wide variety of medical record documents at the local medical center pertaining to each Veterans Health Administration (VHA) cancer patient. The information is then transmitted to the VACCR. Details collected include extensive demographics, cancer identification, extent of disease and staging, first course of treatment, and outcomes. Data extraction is available to researchers with VA-approved Institutional Review Board studies, peer review, and Data Use Agreements.

  5. The Cancer Genome Atlas Breast Invasive Carcinoma Collection

    • cancerimagingarchive.net
    dicom, n/a
    Updated Feb 2, 2014
    Cite
    The Cancer Imaging Archive (2014). The Cancer Genome Atlas Breast Invasive Carcinoma Collection [Dataset]. http://doi.org/10.7937/K9/TCIA.2016.AB2NAZRP
    Available download formats: n/a, dicom
    Dataset updated
    Feb 2, 2014
    Dataset authored and provided by
    The Cancer Imaging Archive
    License

    https://www.cancerimagingarchive.net/data-usage-policies-and-restrictions/

    Time period covered
    May 29, 2020
    Dataset funded by
    National Cancer Institute (http://www.cancer.gov/)
    Description

    The Cancer Genome Atlas Breast Invasive Carcinoma (TCGA-BRCA) data collection is part of a larger effort to build a research community focused on connecting cancer phenotypes to genotypes by providing clinical images matched to subjects from The Cancer Genome Atlas (TCGA). Clinical, genetic, and pathological data resides in the Genomic Data Commons (GDC) Data Portal while the radiological data is stored on The Cancer Imaging Archive (TCIA).

    Matched TCGA patient identifiers allow researchers to explore the TCGA/TCIA databases for correlations between tissue genotype, radiological phenotype and patient outcomes. Tissues for TCGA were collected from many sites all over the world in order to reach their accrual targets, usually around 500 specimens per cancer type. For this reason the image data sets are also extremely heterogeneous in terms of scanner modalities, manufacturers and acquisition protocols. In most cases the images were acquired as part of routine care and not as part of a controlled research study or clinical trial.

    CIP TCGA Radiology Initiative

    Imaging Source Site (ISS) Groups are being populated and governed by participants from institutions that have provided imaging data to the archive for a given cancer type. Modeled after TCGA analysis groups, ISS groups are given the opportunity to publish a marker paper for a given cancer type per the guidelines in the table above. This opportunity will generate increased participation in building these multi-institutional data sets as they become an open community resource. Learn more about the TCGA Breast Phenotype Research Group.

  6. CDC WONDER: Cancer Statistics

    • catalog.data.gov
    • healthdata.gov
    • +3 more
    Updated Jul 29, 2025
    + more versions
    Cite
    Centers for Disease Control and Prevention, Department of Health & Human Services (2025). CDC WONDER: Cancer Statistics [Dataset]. https://catalog.data.gov/dataset/cdc-wonder-cancer-statistics
    Dataset updated
    Jul 29, 2025
    Description

    The United States Cancer Statistics (USCS) online databases in WONDER provide cancer incidence and mortality data for the United States for the years since 1999, by year, state and metropolitan area (MSA), age group, race, ethnicity, sex, childhood cancer classification, and cancer site. The databases report case counts, deaths, crude and age-adjusted incidence and death rates, and 95% confidence intervals for rates. The USCS data are the official federal statistics on cancer incidence from registries having high-quality data and cancer mortality statistics for 50 states and the District of Columbia. USCS are produced by the Centers for Disease Control and Prevention (CDC) and the National Cancer Institute (NCI), in collaboration with the North American Association of Central Cancer Registries (NAACCR). Mortality data are provided by the Centers for Disease Control and Prevention (CDC), National Center for Health Statistics (NCHS), National Vital Statistics System (NVSS).
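
    The age-adjusted rates mentioned above are commonly obtained by direct standardization: each age group's rate is weighted by that group's share of a standard population. The following is a minimal sketch under that assumption; the function name, the three age groups, and all counts and weights are hypothetical and are not taken from WONDER.

```python
def age_adjusted_rate(cases, populations, std_weights, per=100_000):
    """Directly standardized rate: weighted sum of age-specific rates.

    cases, populations and std_weights are parallel sequences, one entry
    per age group. Weights are normalized, so they need not sum to 1.
    """
    if not (len(cases) == len(populations) == len(std_weights)):
        raise ValueError("all inputs must have one entry per age group")
    total_weight = sum(std_weights)
    weighted = sum(
        w * c / p for c, p, w in zip(cases, populations, std_weights)
    )
    return per * weighted / total_weight


# Hypothetical example with three age groups (young, middle, old).
cases = [10, 50, 200]
populations = [100_000, 80_000, 40_000]
std_weights = [50, 30, 20]  # hypothetical standard-population shares

print(round(age_adjusted_rate(cases, populations, std_weights), 2))
```

    Because the weights come from a fixed standard population, rates computed this way can be compared across areas or years whose age structures differ.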

  7. UAE Cancer Patient Dataset

    • kaggle.com
    zip
    Updated Mar 20, 2025
    Cite
    Akshay Kumar (2025). UAE Cancer Patient Dataset [Dataset]. https://www.kaggle.com/datasets/ak0212/uae-cancer-patient-dataset
    Available download formats: zip (283039 bytes)
    Dataset updated
    Mar 20, 2025
    Authors
    Akshay Kumar
    License

    MIT License (https://opensource.org/licenses/MIT)
    License information was derived automatically

    Area covered
    United Arab Emirates
    Description

    This dataset is designed for research, analysis, and machine learning applications in healthcare. It includes 10,000+ records of synthetic cancer patient data from the United Arab Emirates (UAE) with 20 features, such as:

    ✔ Patient demographics (Age, Gender, Nationality, Ethnicity)
    ✔ Diagnosis details (Cancer Type, Stage, Diagnosis Date)
    ✔ Treatment information (Treatment Type, Hospital, Physician)
    ✔ Health-related factors (Smoking Status, Comorbidities, Weight, Height)
    ✔ Outcomes (Recovered, Under Treatment, Deceased)

  8. breast-cancer

    • openml.org
    Updated Apr 6, 2014
    Cite
    Matjaz Zwitter; Milan Soklic (2014). breast-cancer [Dataset]. https://www.openml.org/d/13
    Dataset updated
    Apr 6, 2014
    Authors
    Matjaz Zwitter; Milan Soklic
    Description

    Author:
    Source: Unknown -
    Please cite:

    Citation Request: This breast cancer domain was obtained from the University Medical Centre, Institute of Oncology, Ljubljana, Yugoslavia. Thanks go to M. Zwitter and M. Soklic for providing the data. Please include this citation if you plan to use this database.

    1. Title: Breast cancer data (Michalski has used this)

    2. Sources: -- Matjaz Zwitter & Milan Soklic (physicians) Institute of Oncology University Medical Center Ljubljana, Yugoslavia -- Donors: Ming Tan and Jeff Schlimmer (Jeffrey.Schlimmer@a.gp.cs.cmu.edu) -- Date: 11 July 1988

    3. Past Usage: (Several: here are some)

      -- Michalski, R.S., Mozetic, I., Hong, J., & Lavrac, N. (1986). The Multi-Purpose Incremental Learning System AQ15 and its Testing Application to Three Medical Domains. In Proceedings of the Fifth National Conference on Artificial Intelligence, 1041-1045, Philadelphia, PA: Morgan Kaufmann.
         -- accuracy range: 66%-72%
      -- Clark, P. & Niblett, T. (1987). Induction in Noisy Domains. In Progress in Machine Learning (from the Proceedings of the 2nd European Working Session on Learning), 11-30, Bled, Yugoslavia: Sigma Press.
         -- 8 test results given: 65%-72% accuracy range
      -- Tan, M., & Eshelman, L. (1988). Using weighted networks to represent classification knowledge in noisy domains. Proceedings of the Fifth International Conference on Machine Learning, 121-134, Ann Arbor, MI.
         -- 4 systems tested: accuracy range was 68%-73.5%
      -- Cestnik, G., Kononenko, I., & Bratko, I. (1987). Assistant-86: A Knowledge-Elicitation Tool for Sophisticated Users. In I. Bratko & N. Lavrac (Eds.) Progress in Machine Learning, 31-45, Sigma Press.
         -- Assistant-86: 78% accuracy

    4. Relevant Information: This is one of three domains provided by the Oncology Institute that has repeatedly appeared in the machine learning literature. (See also lymphography and primary-tumor.)

      This data set includes 201 instances of one class and 85 instances of another class. The instances are described by 9 attributes, some of which are linear and some are nominal.

    5. Number of Instances: 286

    6. Number of Attributes: 9 + the class attribute

    7. Attribute Information:

      1. Class: no-recurrence-events, recurrence-events
      2. age: 10-19, 20-29, 30-39, 40-49, 50-59, 60-69, 70-79, 80-89, 90-99.
      3. menopause: lt40, ge40, premeno.
      4. tumor-size: 0-4, 5-9, 10-14, 15-19, 20-24, 25-29, 30-34, 35-39, 40-44, 45-49, 50-54, 55-59.
      5. inv-nodes: 0-2, 3-5, 6-8, 9-11, 12-14, 15-17, 18-20, 21-23, 24-26, 27-29, 30-32, 33-35, 36-39.
      6. node-caps: yes, no.
      7. deg-malig: 1, 2, 3.
      8. breast: left, right.
      9. breast-quad: left-up, left-low, right-up, right-low, central.
      10. irradiat: yes, no.
    8. Missing Attribute Values (denoted by "?"), listed as attribute: number of instances with missing values:

      • node-caps (attribute 6): 8
      • breast-quad (attribute 9): 1
    9. Class Distribution:

      1. no-recurrence-events: 201 instances
      2. recurrence-events: 85 instances

      Num Instances: 286
      Num Attributes: 10
      Num Continuous: 0 (Int 0 / Real 0)
      Num Discrete: 10
      Missing values: 9 / 0.3%

      #   name           type  enum  ints  real  missing  distinct  (1)
      1   'age'          Enum  100%  0%    0%    0 / 0%   6 / 2%    0%
      2   'menopause'    Enum  100%  0%    0%    0 / 0%   3 / 1%    0%
      3   'tumor-size'   Enum  100%  0%    0%    0 / 0%   11 / 4%   0%
      4   'inv-nodes'    Enum  100%  0%    0%    0 / 0%   7 / 2%    0%
      5   'node-caps'    Enum  97%   0%    0%    8 / 3%   2 / 1%    0%
      6   'deg-malig'    Enum  100%  0%    0%    0 / 0%   3 / 1%    0%
      7   'breast'       Enum  100%  0%    0%    0 / 0%   2 / 1%    0%
      8   'breast-quad'  Enum  100%  0%    0%    1 / 0%   5 / 2%    0%
      9   'irradiat'     Enum  100%  0%    0%    0 / 0%   2 / 1%    0%
      10  'Class'        Enum  100%  0%    0%    0 / 0%   2 / 1%    0%
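
    Because the two classes are imbalanced (201 versus 85 instances), it can help to read the accuracies quoted under Past Usage against the accuracy of always predicting the larger class. The following is a minimal sketch of that baseline; the function name is hypothetical, and the counts are the ones given in the class distribution above.

```python
# Class counts as listed in the dataset description.
class_counts = {"no-recurrence-events": 201, "recurrence-events": 85}


def majority_baseline(counts):
    """Return the most frequent label and the accuracy of always predicting it."""
    total = sum(counts.values())
    label = max(counts, key=counts.get)
    return label, counts[label] / total


label, accuracy = majority_baseline(class_counts)
print(f"{label}: {accuracy:.1%}")  # prints "no-recurrence-events: 70.3%"
```

    A classifier on this dataset only adds value to the extent that it beats this figure on held-out data.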

  9. Cancer Data

    • kaggle.com
    zip
    Updated Mar 22, 2023
    + more versions
    Cite
    Erdem Taha (2023). Cancer Data [Dataset]. https://www.kaggle.com/datasets/erdemtaha/cancer-data/data
    Available download formats: zip (49810 bytes)
    Dataset updated
    Mar 22, 2023
    Authors
    Erdem Taha
    License

    Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0) (https://creativecommons.org/licenses/by-nc-sa/4.0/)
    License information was derived automatically

    Description

    🦠 Breast Cancer Data Set

    This dataset contains the characteristics of patients diagnosed with cancer. The dataset contains a unique ID for each patient, the type of cancer (diagnosis), the visual characteristics of the cancer and the average values of these characteristics.

    📚 The main features of the dataset are as follows:

    1. id: Represents a unique ID of each patient.
    2. diagnosis: Indicates the type of cancer. This property can take the values "M" (malignant) or "B" (benign).
    3. radius_mean, texture_mean, perimeter_mean, area_mean, smoothness_mean, compactness_mean, concavity_mean, concave points_mean: Represents the mean values of the cancer's visual characteristics.

    There are also several categorical features where patients in the dataset are labeled with numerical values. You can examine them in the Chart area.

    Other features contain specific ranges of average values of the features of the cancer image:

    • radius_mean, texture_mean, perimeter_mean, area_mean, smoothness_mean, compactness_mean, concavity_mean, concave points_mean

    Each of these features is mapped to a table containing the number of values in a given range. You can examine the Chart Tables

    Each sample contains the patient's unique ID, the cancer diagnosis and the average values of the cancer's visual characteristics.

    Such a dataset can be used to train or test models and algorithms used to make cancer diagnoses. Understanding and analyzing the dataset can contribute to the improvement of cancer-related visual features and diagnosis.

    ✨ Examples of Projects that can be done with the Data Set

    Logistic Regression: This algorithm can be used effectively for binary classification problems. In this dataset, logistic regression may be an appropriate choice since there are two classes, "Malignant" and "Benign". It can be used to predict cancer type with the visual features in the dataset.

    K-Nearest Neighbors (KNN): KNN classifies an example by looking at the k closest examples around it. This algorithm assumes that patients with similar characteristics tend to have similar types of cancer. KNN can be used for cancer diagnosis by taking into account neighborhood relationships in the data set.

    Support Vector Machines (SVM): SVM is effective for classification tasks, especially for two-class problems. Focusing on the clear separation of classes in the dataset, SVM is a powerful algorithm that can be used for cancer diagnosis.
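
    The k-nearest-neighbors idea described above can be sketched in a few lines of plain Python. This is an illustration only: the function name is hypothetical, and the six training rows are made-up (radius_mean, texture_mean) pairs, not values from the dataset. In practice the features would usually be scaled first so that no single feature dominates the distance.

```python
import math
from collections import Counter


def knn_predict(train, query, k=3):
    """Classify `query` by majority vote among its k nearest training rows.

    `train` is a list of (features, label) pairs; distance is Euclidean.
    """
    nearest = sorted(train, key=lambda row: math.dist(row[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical toy rows: (radius_mean, texture_mean) -> diagnosis.
train = [
    ((12.0, 15.0), "B"),
    ((11.5, 17.0), "B"),
    ((13.0, 14.0), "B"),
    ((20.0, 25.0), "M"),
    ((19.0, 22.0), "M"),
    ((21.5, 24.0), "M"),
]

print(knn_predict(train, (12.5, 16.0)))  # prints "B"
print(knn_predict(train, (20.5, 23.0)))  # prints "M"
```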

    Data Set Related Training Notebooks 😊 ("I Recommend You Review")

    K-NN Project: https://www.kaggle.com/code/erdemtaha/prediction-cancer-data-with-k-nn-95

    Logistic Regression: https://www.kaggle.com/code/erdemtaha/cancer-prediction-96-5-with-logistic-regression

    💖 Acknowledgements and Information

    This is a copy of content that has been elaborated for educational purposes and published to reach more people, you can access the original source from the link below, please do not forget to support that data

    🔗 https://www.kaggle.com/datasets/uciml/breast-cancer-wisconsin-data

    This database can also be accessed via the UW CS ftp server: 🔗 ftp.cs.wisc.edu cd math-prog/cpo-dataset/machine-learn/WDBC/

    It can also be found at the UCI Machine Learning Repository: 🔗 https://archive.ics.uci.edu/ml/datasets/Breast+Cancer+Wisconsin+%28Diagnostic%29

    📩 Personal Information:

    If you have some questions or curiosities about the data or studies, you can contact me as you wish from the links below 😊

    LinkedIn: https://www.linkedin.com/in/erdem-taha-sokullu/

    Mail: erdemtahasokullu@gmail.com

    Github: https://github.com/Prometheussx

    Kaggle: https://www.kaggle.com/erdemtaha

    📜 License:

    This Data has a CC BY-NC-SA 4.0 License You can review the license rules from the link below

    License Link: https://creativecommons.org/licenses/by-nc-sa/4.0/

  10. SEER Breast Cancer Data

    • ieee-dataport.org
    • data.niaid.nih.gov
    • +2 more
    Updated Jul 29, 2025
    Cite
    jing teng (2025). SEER Breast Cancer Data [Dataset]. https://ieee-dataport.org/open-access/seer-breast-cancer-data
    Dataset updated
    Jul 29, 2025
    Authors
    jing teng
    Description

    examined regional LNs

  11. Prostate Cancer Dataset

    • kaggle.com
    zip
    Updated Jun 15, 2023
    Cite
    Soujanya Hassan Prabhakar (2023). Prostate Cancer Dataset [Dataset]. https://www.kaggle.com/datasets/soujanyahp/prostate-cancer-dataset
    Available download formats: zip (2573 bytes)
    Dataset updated
    Jun 15, 2023
    Authors
    Soujanya Hassan Prabhakar
    Description

    The data used in this example is sourced from a study conducted by Stamey et al. (1989). The study aimed to investigate the relationship between the level of prostate-specific antigen (PSA) and various clinical measures in a group of 97 men who were scheduled to undergo a radical prostatectomy. PSA is a protein that is produced by the prostate gland, and higher levels of PSA are often associated with a higher likelihood of having prostate cancer. The dataset provides valuable information for examining the correlation between PSA levels and other clinical factors in the context of prostate cancer.

    source: https://web.stanford.edu/~hastie/ElemStatLearn/datasets/prostate.data

  12. Massachusetts Cancer Data - Interactive City and Town

    • mass.gov
    Updated Feb 16, 2018
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Department of Public Health (2018). Massachusetts Cancer Data - Interactive City and Town [Dataset]. https://www.mass.gov/info-details/massachusetts-cancer-data-interactive-city-and-town
    Explore at:
    Dataset updated
    Feb 16, 2018
    Dataset provided by
    Office of Health Data, Strategy, and Innovation
    Department of Public Health
    Area covered
    Massachusetts
    Description

    This page presents data on cancer incidence (new cases) in Massachusetts cities and towns, provided by the Massachusetts Cancer Registry (MACR).

  13. The IQ-OTHNCCD lung cancer dataset

    • data.mendeley.com
    Updated Oct 19, 2020
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    hamdalla alyasriy (2020). The IQ-OTHNCCD lung cancer dataset [Dataset]. http://doi.org/10.17632/bhmdr45bh2.1
    Explore at:
    Dataset updated
    Oct 19, 2020
    Authors
    hamdalla alyasriy
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    The Iraq-Oncology Teaching Hospital/National Center for Cancer Diseases (IQ-OTH/NCCD) lung cancer dataset was collected in the above-mentioned specialist hospitals over a period of three months in fall 2019. It includes CT scans of patients diagnosed with lung cancer in different stages, as well as healthy subjects. IQ-OTH/NCCD slides were marked by oncologists and radiologists in these two centers. The dataset contains a total of 1190 images representing CT scan slices of 110 cases (see Figure 1). These cases are grouped into three classes: normal, benign, and malignant. Of these, 40 cases are diagnosed as malignant, 15 as benign, and 55 are classified as normal. The CT scans were originally collected in DICOM format. The scanner used is a SOMATOM from Siemens. The CT protocol includes 120 kV and a slice thickness of 1 mm; a window width ranging from 350 to 1200 HU and a window center from 50 to 600 were used for reading, with breath hold at full inspiration. All images were de-identified before performing analysis. Written consent was waived by the oversight review board. The study was approved by the institutional review board of participating medical centers. Each scan contains several slices. The number of slices ranges from 80 to 200, each of which represents an image of the human chest from different sides and angles. The 110 cases vary in gender, age, educational attainment, area of residence and living status. Some of them are employees of the Iraqi ministries of Transport and Oil, others are farmers and gainers. Most of them come from places in the middle region of Iraq, particularly the provinces of Baghdad, Wasit, Diyala, Salahuddin, and Babylon.

  14. Massachusetts Cancer Registry

    • mass.gov
    Updated Mar 8, 2007
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Office of Health Data, Strategy, and Innovation (2007). Massachusetts Cancer Registry [Dataset]. https://www.mass.gov/massachusetts-cancer-registry
    Explore at:
    Dataset updated
    Mar 8, 2007
    Dataset provided by
    Office of Health Data, Strategy, and Innovation
    Department of Public Health
    Area covered
    Massachusetts
    Description

    The Massachusetts Cancer Registry (MACR) works to improve and save lives through collection and reporting of cancer data.

  15. Multimodal imaging of ductal carcinoma in situ with microinvasion

    • cancerimagingarchive.net
    n/a +1
    Updated Dec 8, 2023
    Cite
    The Cancer Imaging Archive (2023). Multimodal imaging of ductal carcinoma in situ with microinvasion [Dataset]. http://doi.org/10.7937/3fyc-ac78
    Available download formats: svs, tiff, and xml, n/a
    Dataset updated
    Dec 8, 2023
    Dataset authored and provided by
    The Cancer Imaging Archive
    License

    https://www.cancerimagingarchive.net/data-usage-policies-and-restrictions/

    Time period covered
    Dec 8, 2023
    Dataset funded by
    National Cancer Institute (http://www.cancer.gov/)
    Description

    Ductal carcinoma in situ with microinvasion (DCISM) is a challenging subtype of breast cancer with controversial invasiveness and prognosis. Accurate diagnosis of DCISM from ductal carcinoma in situ (DCIS) is crucial for optimal treatment and improved clinical outcomes. This dataset provides histopathology images and paired CK5/6 immunohistochemical staining images from patients with DCISM, as well as multiphoton microscopy images of suspicious regions. It offers multi-modal imaging data from various perspectives for analysis and diagnosis of microinvasive breast cancer by other researchers in the field.

    The dataset contains data from 12 breast cancer patients, including 10 cases of ductal carcinoma in situ with microinvasion (DCISM), 1 case of ductal carcinoma in situ (DCIS), and 1 case of invasive breast cancer.

    The magnification of the glass slide images is 40x. The pathology slide scanner used was created by the Sunny Optical Technology (group) Co., Ltd., and the pixel aspect ratio of the images is 1. The dataset also includes multiphoton microscopy imaging of suspicious microinvasion areas. The multiphoton imaging system was manufactured by Zeiss, and it also has a pixel aspect ratio of 1.

    Our database was specifically collected for the use of imaging methods in diagnosing DCISM. The suffixes in each case number indicate the patient's condition - "DCISM" for ductal carcinoma in situ with microinvasion, "DCIS" for ductal carcinoma in situ, and "IDC" for invasive ductal carcinoma. Apart from these labels, we have not collected any additional clinical information for these cases.

  16. Melanoma Skin Cancer Dataset of 10000 Images

    • kaggle.com
    zip
    Updated Mar 29, 2022
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Muhammad Hasnain Javid (2022). Melanoma Skin Cancer Dataset of 10000 Images [Dataset]. https://www.kaggle.com/datasets/hasnainjaved/melanoma-skin-cancer-dataset-of-10000-images
    Explore at:
    zip(103508268 bytes)Available download formats
    Dataset updated
    Mar 29, 2022
    Authors
    Muhammad Hasnain Javid
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Melanoma Skin Cancer Dataset contains 10000 images. Melanoma skin cancer is a deadly cancer; early detection and cure can save many lives. This dataset will be useful for developing deep learning models for accurate classification of melanoma. The dataset consists of 9600 images for training the model and 1000 images for evaluating it.

  17. National Cancer Institute Imaging Data Commons (IDC) Collections

    • registry.opendata.aws
    Updated May 10, 2023
    + more versions
    Cite
    Imaging Data Commons (IDC)(https://imaging.datacommons.cancer.gov) team (2023). National Cancer Institute Imaging Data Commons (IDC) Collections [Dataset]. https://registry.opendata.aws/nci-imaging-data-commons/
    Dataset updated
    May 10, 2023
    Dataset provided by
    Imaging Data Commons (IDC) (https://imaging.datacommons.cancer.gov) team
    Description

    Imaging Data Commons (IDC) is a repository within the Cancer Research Data Commons (CRDC) that manages imaging data and enables its integration with the other components of CRDC. IDC hosts a growing number of imaging collections contributed either by funded US National Cancer Institute (NCI) data collection activities or by individual researchers. Image data hosted by IDC is stored in DICOM format.

  18. Cancer Dataset(Top 50 Populated Countries)

    • kaggle.com
    zip
    Updated Jan 17, 2025
    Cite
    Ankush Panday (2025). Cancer Dataset(Top 50 Populated Countries) [Dataset]. https://www.kaggle.com/datasets/ankushpanday1/cancer-datasettop-50-populated-countries
    Available download formats: zip (23228945 bytes)
    Dataset updated
    Jan 17, 2025
    Authors
    Ankush Panday
    License

    MIT License (https://opensource.org/licenses/MIT)
    License information was derived automatically

    Description

    This dataset provides a detailed view of global cancer trends across the 50 most populated countries. With 160,000 records, it encompasses a wide range of variables including cancer types, risk factors, healthcare expenditure, and environmental factors. The data is designed to assist researchers, healthcare policymakers, and data scientists in identifying patterns, predicting future trends, and crafting effective cancer control strategies.

  19. National Cancer Institute 3D Structure Database

    • neuinfo.org
    • dknet.org
    • +1 more
    Updated Feb 1, 2001
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    (2001). National Cancer Institute 3D Structure Database [Dataset]. http://identifiers.org/RRID:SCR_008211/resolver?q=&i=rrid
    Explore at:
    Dataset updated
    Feb 1, 2001
    Description

    The NCI DIS 3D database is a collection of 3D structures for over 400,000 drugs. The database is an extension of the NCI Drug Information System. The structural information stored in the DIS is only the connection table for each drug. The connection table is just a list of which atoms are connected and how they are connected. A searchable database of three-dimensional structures has been developed from the chemistry database of the NCI Drug Information System (DIS), a file of about 450,000 primarily organic compounds which have been tested by NCI for anticancer activity. The DIS database is very similar in size and content to the proprietary databases used in the pharmaceutical industry; its development began in the 1950s, and this history led to a number of problems in the generation of 3D structures. This information can be searched to find drugs that share similar patterns of connections, which can correlate with similar biological activity. But the cellular targets for drug action, as well as the drugs themselves, are three-dimensional objects, and advances in computer hardware and software have reached the point where they can be represented as such. In many cases the important points of interaction between a drug and its target can be represented by a 3D arrangement of a small number of atoms. Such a group of atoms is called a pharmacophore. The pharmacophore can be used to search 3D databases, and drugs that match the pharmacophore could have similar biological activity but very different patterns of atomic connections. Having a diverse set of lead compounds increases the chances of finding an active compound with acceptable properties for clinical development. Sponsor: The ICBG are supported by the Cooperative Agreement mechanism, with funds from nine components of the NIH, the National Science Foundation, and the Foreign Agricultural Service of the USDA.

  20.

    Data from: Cancer Rates

    • catalog.data.gov
    • data-lakecountyil.opendata.arcgis.com
    • +2more
    Updated Nov 22, 2024
    + more versions
    Cite
    Lake County Illinois GIS (2024). Cancer Rates [Dataset]. https://catalog.data.gov/dataset/cancer-rates-5cf0c
    Dataset updated
    Nov 22, 2024
    Dataset provided by
    Lake County Illinois GIS
    Description

    Cancer Rates for Lake County, Illinois. Explanation of field attributes: Colorectal Cancer – Cancer that develops in the colon (the longest part of the large intestine) and/or the rectum (the last several inches of the large intestine). This is a rate per 100,000. Lung Cancer – Cancer that forms in tissues of the lung, usually in the cells lining air passages. This is a rate per 100,000. Breast Cancer – Cancer that forms in tissues of the breast. This is a rate per 100,000. Prostate Cancer – Cancer that forms in tissues of the prostate. This is a rate per 100,000. Urinary System Cancer – Cancer that forms in the organs of the body that produce and discharge urine. These include the kidneys, ureters, bladder, and urethra. This is a rate per 100,000. All Cancer – All cancers including, but not limited to: colorectal cancer, lung cancer, breast cancer, prostate cancer, and cancer of the urinary system. This is a rate per 100,000.

Cancer Incidence - Surveillance, Epidemiology, and End Results (SEER) Registries Limited-Use

3 scholarly articles cite this dataset
