100+ datasets found

Lung Cancer Mortality Datasets v2
kaggle.com
zip
Updated Jun 1, 2024
Cite
MasterDataSan (2024). Lung Cancer Mortality Datasets v2 [Dataset]. https://www.kaggle.com/datasets/masterdatasan/lung-cancer-mortality-datasets-v2
Explore at:
Available download formats: zip (81127029 bytes)
Dataset updated
Jun 1, 2024
Authors
MasterDataSan
Description
This dataset contains data about lung cancer mortality. The database is a comprehensive collection of patient information, specifically focused on individuals diagnosed with cancer. It is designed to facilitate the analysis of various factors that may influence cancer prognosis and treatment outcomes. The database includes a range of demographic, medical, and treatment-related variables, capturing essential details about each patient's condition and history.

Key components of the database include:

Demographic Information: Basic details about the patients such as age, gender, and country of residence. This helps in understanding the distribution of cancer cases across different populations and regions.

Medical History: Information about each patient’s medical background, including family history of cancer, smoking status, Body Mass Index (BMI), cholesterol levels, and the presence of other health conditions such as hypertension, asthma, cirrhosis, and other cancers. This section is crucial for identifying potential risk factors and comorbidities.

Cancer Diagnosis: Detailed data about the cancer diagnosis itself, including the date of diagnosis and the stage of cancer at the time of diagnosis. This helps in tracking the progression and severity of the disease.

Treatment Details: Information regarding the type of treatment each patient received, the end date of the treatment, and the outcome (whether the patient survived or not). This is essential for evaluating the effectiveness of different treatment approaches.

The structure of the database allows for in-depth analysis and research, making it possible to identify patterns, correlations, and potential causal relationships between various factors and cancer outcomes. It is a valuable resource for medical researchers, epidemiologists, and healthcare providers aiming to improve cancer treatment and patient care.

id: A unique identifier for each patient in the dataset.
age: The age of the patient at the time of diagnosis.
gender: The gender of the patient (e.g., male, female).
country: The country or region where the patient resides.
diagnosis_date: The date on which the patient was diagnosed with lung cancer.
cancer_stage: The stage of lung cancer at the time of diagnosis (e.g., Stage I, Stage II, Stage III, Stage IV).
family_history: Indicates whether there is a family history of cancer (e.g., yes, no).
smoking_status: The smoking status of the patient (e.g., current smoker, former smoker, never smoked, passive smoker).
bmi: The Body Mass Index of the patient at the time of diagnosis.
cholesterol_level: The cholesterol level of the patient (value).
hypertension: Indicates whether the patient has hypertension (high blood pressure) (e.g., yes, no).
asthma: Indicates whether the patient has asthma (e.g., yes, no).
cirrhosis: Indicates whether the patient has cirrhosis of the liver (e.g., yes, no).
other_cancer: Indicates whether the patient has had any other type of cancer in addition to the primary diagnosis (e.g., yes, no).
treatment_type: The type of treatment the patient received (e.g., surgery, chemotherapy, radiation, combined).
end_treatment_date: The date on which the patient completed their cancer treatment or died.
survived: Indicates whether the patient survived (e.g., yes, no).
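The column descriptions above suggest a few simple derived quantities, such as the time between diagnosis and end of treatment, or the share of survivors per treatment type. The following is a minimal sketch only: the rows are invented for illustration, and it assumes ISO-formatted date strings and lowercase yes/no values, neither of which is confirmed by the description.

```python
from datetime import date

# Hypothetical rows that follow the column descriptions above; the values
# are invented for illustration and are not taken from the dataset.
rows = [
    {"id": 1, "diagnosis_date": "2016-04-05", "end_treatment_date": "2017-09-10",
     "treatment_type": "chemotherapy", "survived": "no"},
    {"id": 2, "diagnosis_date": "2023-04-20", "end_treatment_date": "2024-06-17",
     "treatment_type": "surgery", "survived": "yes"},
    {"id": 3, "diagnosis_date": "2023-04-05", "end_treatment_date": "2024-04-09",
     "treatment_type": "surgery", "survived": "no"},
]


def treatment_days(row):
    """Days between diagnosis and end of treatment (or death)."""
    start = date.fromisoformat(row["diagnosis_date"])  # assumes ISO dates
    end = date.fromisoformat(row["end_treatment_date"])
    return (end - start).days


def survival_rate_by_treatment(rows):
    """Share of rows with survived == 'yes' for each treatment_type."""
    totals, survivors = {}, {}
    for row in rows:
        t = row["treatment_type"]
        totals[t] = totals.get(t, 0) + 1
        survivors[t] = survivors.get(t, 0) + (row["survived"] == "yes")
    return {t: survivors[t] / totals[t] for t in totals}


durations = {row["id"]: treatment_days(row) for row in rows}
rates = survival_rate_by_treatment(rows)
```

Because the data is described as artificially generated, any pattern found this way says something about the generator rather than about real patients.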

This dataset contains artificially generated data with as close a representation of reality as possible. This data is free to use without any licence required.

Good luck Gakusei!
Cancer Rates by U.S. State
kaggle.com
zip
Updated Dec 26, 2022
Cite
Heemali Chaudhari (2022). Cancer Rates by U.S. State [Dataset]. https://www.kaggle.com/datasets/heemalichaudhari/cancer-rates-by-us-state
Explore at:
Available download formats: zip (219237 bytes)
Dataset updated
Dec 26, 2022
Authors
Heemali Chaudhari
License
https://creativecommons.org/publicdomain/zero/1.0/
Area covered
United States
Description
In the following maps, the U.S. states are divided into groups based on the rates at which people developed or died from cancer in 2013, the most recent year for which incidence data are available.

The rates are the numbers out of 100,000 people who developed or died from cancer each year.
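As a rough illustration of the "per 100,000" convention, the sketch below computes a crude rate from a count and a population. The function name and the example figures are hypothetical, and the published figures are additionally age-adjusted, which this simple calculation does not do.

```python
def rate_per_100k(events, population):
    """Crude rate: events per 100,000 people in the population."""
    if population <= 0:
        raise ValueError("population must be positive")
    return events / population * 100_000


# Hypothetical state: 12,500 new cases among 2,500,000 residents.
example_rate = rate_per_100k(12_500, 2_500_000)
```

A crude rate like this is only comparable between states with similar age structures, which is why the source reports age-adjusted rates instead.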

Incidence Rates by State

The number of people who get cancer is called cancer incidence. In the United States, the rate of getting cancer varies from state to state.

*Rates are per 100,000 and are age-adjusted to the 2000 U.S. standard population.

‡Rates are not shown if the state did not meet USCS publication criteria or if the state did not submit data to CDC.

†Source: U.S. Cancer Statistics Working Group. United States Cancer Statistics: 1999–2013 Incidence and Mortality Web-based Report. Atlanta (GA): Department of Health and Human Services, Centers for Disease Control and Prevention, and National Cancer Institute; 2016. Available at: http://www.cdc.gov/uscs.

Death Rates by State

Rates of dying from cancer also vary from state to state.

*Rates are per 100,000 and are age-adjusted to the 2000 U.S. standard population.

†Source: U.S. Cancer Statistics Working Group. United States Cancer Statistics: 1999–2013 Incidence and Mortality Web-based Report. Atlanta (GA): Department of Health and Human Services, Centers for Disease Control and Prevention, and National Cancer Institute; 2016. Available at: http://www.cdc.gov/uscs.

Source: https://www.cdc.gov/cancer/dcpc/data/state.htm
Lung Cancer Dataset
kaggle.com
Updated May 6, 2025
Cite
Aman_Kumar094 (2025). Lung Cancer Dataset [Dataset]. https://www.kaggle.com/datasets/amankumar094/lung-cancer-dataset
Explore at:
Croissant: a format for machine-learning datasets. Learn more at mlcommons.org/croissant.
Dataset updated
May 6, 2025
Dataset provided by
Kaggle
Authors
Aman_Kumar094
License
Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
License information was derived automatically
Description
This dataset contains data about lung cancer mortality and is a comprehensive collection of patient information, specifically focused on individuals diagnosed with cancer. It contains information on 800,000 individuals related to lung cancer diagnosis, treatment, and outcomes, organized in 16 well-structured columns. This large-scale dataset is designed to aid researchers, data scientists, and healthcare professionals in studying patterns, building predictive models, and enhancing early detection and treatment strategies.

🌍 The Societal Impact of Lung Cancer

Lung cancer is not just a disease — it's a global crisis that steals time, health, and hope from millions of people every year. As the #1 cause of cancer deaths worldwide, it takes more lives annually than breast, colon, and prostate cancer combined.

But behind every statistic is a story:

A parent who never saw their child graduate.

A worker who had to leave their job too soon.

A community that lost a leader, a friend, a neighbor.

Why does this matter? Lung cancer often goes undetected until it's too late. It’s aggressive, silent, and devastating — especially in underserved areas where early detection is rare and treatment options are limited. It doesn’t just affect patients. It affects families, economies, and healthcare systems on a massive scale.

This dataset represents more than numbers. It represents 800,000 real-world stories — people who can help us unlock patterns, train models, and advance life-saving research.

By working with this data, you're not just analyzing a dataset — you're stepping into the fight against one of humanity’s deadliest diseases.

Let’s turn insight into impact. (😊 The above description was generated with the help of AI. I just wanted to share this dataset, that is all. Thank you.)
Number and rates of new cases of primary cancer, by cancer type, age group...
www150.statcan.gc.ca
datasets.ai
Updated May 19, 2021
Cite
Government of Canada, Statistics Canada (2021). Number and rates of new cases of primary cancer, by cancer type, age group and sex [Dataset]. http://doi.org/10.25318/1310011101-eng
Explore at:
Unique identifier
https://doi.org/10.25318/1310011101-eng
Dataset updated
May 19, 2021
Dataset provided by
Statistics Canada (https://statcan.gc.ca/en)
Area covered
Canada
Description
Number and rate of new cancer cases diagnosed annually from 1992 to the most recent diagnosis year available. Included are all invasive cancers and in situ bladder cancer with cases defined using the Surveillance, Epidemiology and End Results (SEER) Groups for Primary Site based on the World Health Organization International Classification of Diseases for Oncology, Third Edition (ICD-O-3). Random rounding of case counts to the nearest multiple of 5 is used to prevent inappropriate disclosure of health-related information.
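The description does not say exactly how the random rounding is carried out. As a sketch of one common unbiased variant (an assumption, not necessarily the procedure Statistics Canada applies to this table), a count can be rounded up or down to an adjacent multiple of 5 with probabilities chosen so that the expected value equals the original count. The function name `random_round` is hypothetical.

```python
import random


def random_round(count, base=5, rng=random):
    """Randomly round a non-negative count to an adjacent multiple of `base`.

    Assumed unbiased scheme: a count with remainder r is rounded up with
    probability r / base and down otherwise, so the expected value equals
    the original count. Exact multiples of `base` are left unchanged.
    """
    remainder = count % base
    if remainder == 0:
        return count
    lower = count - remainder
    if rng.random() < remainder / base:
        return lower + base
    return lower


counts = (0, 3, 12, 15, 1999)
rng = random.Random(0)  # seeded only to make the illustration repeatable
rounded = [random_round(c, rng=rng) for c in counts]
```

One practical consequence of any such scheme is that rounded components may not sum exactly to a rounded total, so small differences between rows and totals are expected.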
Breast Cancer Dataset - Dataset - CKAN
data.poltekkes-smg.ac.id
Updated Oct 7, 2024
Cite
(2024). Breast Cancer Dataset - Dataset - CKAN [Dataset]. https://data.poltekkes-smg.ac.id/dataset/breast-cancer-dataset
Explore at:
Dataset updated
Oct 7, 2024
License
Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
License information was derived automatically
Description
Description: Breast cancer is the most common cancer amongst women in the world. It accounts for 25% of all cancer cases, and affected over 2.1 million people in 2015 alone. It starts when cells in the breast begin to grow out of control. These cells usually form tumors that can be seen via X-ray or felt as lumps in the breast area. The key challenge in its detection is how to classify tumors as malignant (cancerous) or benign (non-cancerous). We ask you to complete the analysis of classifying these tumors using machine learning (with SVMs) and the Breast Cancer Wisconsin (Diagnostic) Dataset.

Acknowledgements: This dataset was sourced from Kaggle.

Objective: Understand the dataset and clean it up (if required). Build classification models to predict whether the cancer type is malignant or benign. Also fine-tune the hyperparameters and compare the evaluation metrics of various classification algorithms.
[MI] Rapid Cancer Registration Data
digital.nhs.uk
Updated Nov 27, 2025
Cite
(2025). [MI] Rapid Cancer Registration Data [Dataset]. https://digital.nhs.uk/data-and-information/publications/statistical/mi-rapid-cancer-registration-data
Explore at:
Dataset updated
Nov 27, 2025
License
https://digital.nhs.uk/about-nhs-digital/terms-and-conditions
Description
Rapid Cancer Registration Data (RCRD) provides a quick, indicative source of cancer data. It is provided to support the planning and provision of cancer services. The data is based on a rapid processing of cancer registration data sources, in particular on Cancer Outcomes and Services Dataset (COSD) information. In comparison, National Cancer Registration Data (NCRD) relies on additional data sources, enhanced follow-up with trusts and expert processing by cancer registration officers. The Rapid Cancer Registration Data (RCRD) may be useful for service improvement projects including healthcare planning and prioritisation. However, it is poorly suited for epidemiological research due to limitations in the data quality and completeness.
Cancer (in persons of all ages): England
hub.arcgis.com
data.catchmentbasedapproach.org
Updated Apr 6, 2021
Cite
The Rivers Trust (2021). Cancer (in persons of all ages): England [Dataset]. https://hub.arcgis.com/datasets/c5c07229db684a65822fdc9a29388b0b
Explore at:
Dataset updated
Apr 6, 2021
Dataset authored and provided by
The Rivers Trust
Area covered

Description
SUMMARY

This analysis, designed and executed by Ribble Rivers Trust, identifies areas across England with the greatest levels of cancer (in persons of all ages). Please read the below information to gain a full understanding of what the data shows and how it should be interpreted.

ANALYSIS METHODOLOGY

The analysis was carried out using Quality and Outcomes Framework (QOF) data, derived from NHS Digital, relating to cancer (in persons of all ages).

This information was recorded at the GP practice level. However, GP catchment areas are not mutually exclusive: they overlap, with some areas covered by 30+ GP practices. Therefore, to increase the clarity and usability of the data, the GP-level statistics were converted into statistics based on Middle Layer Super Output Area (MSOA) census boundaries.

The percentage of each MSOA’s population (all ages) with cancer was estimated. This was achieved by calculating a weighted average based on:

The percentage of the MSOA area that was covered by each GP practice’s catchment area
Of the GPs that covered part of that MSOA: the percentage of registered patients that have that illness

The estimated percentage of each MSOA’s population with cancer was then combined with Office for National Statistics Mid-Year Population Estimates (2019) data for MSOAs, to estimate the number of people in each MSOA with cancer, within the relevant age range.

Each MSOA was assigned a relative score between 1 and 0 (1 = worst, 0 = best) based on:

A) the PERCENTAGE of the population within that MSOA who are estimated to have cancer
B) the NUMBER of people within that MSOA who are estimated to have cancer

An average of scores A & B was taken, and converted to a relative score between 1 and 0 (1 = worst, 0 = best). The closer to 1 the score, the greater both the number and percentage of the population in the MSOA that are estimated to have cancer, compared to other MSOAs. In other words, those are areas where it’s estimated a large number of people suffer from cancer, and where those people make up a large percentage of the population, indicating there is a real issue with cancer within the population and the investment of resources to address that issue could have the greatest benefits.

LIMITATIONS

1. GP data for the financial year 1st April 2018 – 31st March 2019 was used in preference to data for the financial year 1st April 2019 – 31st March 2020, as the onset of the COVID19 pandemic during the latter year could have affected the reporting of medical statistics by GPs. However, for 53 GPs (out of 7670) that did not submit data in 2018/19, data from 2019/20 was used instead. Note also that some GPs (997 out of 7670) did not submit data in either year. This dataset should be viewed in conjunction with the ‘Health and wellbeing statistics (GP-level, England): Missing data and potential outliers’ dataset, to determine areas where data from 2019/20 was used, where one or more GPs did not submit data in either year, or where there were large discrepancies between the 2018/19 and 2019/20 data (differences in statistics that were > mean +/- 1 St.Dev.), which suggests erroneous data in one of those years (it was not feasible for this study to investigate this further), and thus where data should be interpreted with caution. Note also that there are some rural areas (with little or no population) that do not officially fall into any GP catchment area (although this will not affect the results of this analysis if there are no people living in those areas).

2. Although all of the obesity/inactivity-related illnesses listed can be caused or exacerbated by inactivity and obesity, it was not possible to distinguish from the data the cause of the illnesses in patients: obesity and inactivity are highly unlikely to be the cause of all cases of each illness. By combining the data with data relating to levels of obesity and inactivity in adults and children (see the ‘Levels of obesity, inactivity and associated illnesses: Summary (England)’ dataset), we can identify where obesity/inactivity could be a contributing factor, and where interventions to reduce obesity and increase activity could be most beneficial for the health of the local population.

3. It was not feasible to incorporate ultra-fine-scale geographic distribution of populations that are registered with each GP practice or who live within each MSOA. Populations might be concentrated in certain areas of a GP practice’s catchment area or MSOA and relatively sparse in other areas. Therefore, the dataset should be used to identify general areas where there are high levels of cancer, rather than interpreting the boundaries between areas as ‘hard’ boundaries that mark definite divisions between areas with differing levels of cancer.

TO BE VIEWED IN COMBINATION WITH:

This dataset should be viewed alongside the following datasets, which highlight areas of missing data and potential outliers in the data:

Health and wellbeing statistics (GP-level, England): Missing data and potential outliers
Levels of obesity, inactivity and associated illnesses (England): Missing data

DOWNLOADING THIS DATA

To access this data on your desktop GIS, download the ‘Levels of obesity, inactivity and associated illnesses: Summary (England)’ dataset.

DATA SOURCES

This dataset was produced using:

Quality and Outcomes Framework data: Copyright © 2020, Health and Social Care Information Centre. The Health and Social Care Information Centre is a non-departmental body created by statute, also known as NHS Digital.
GP Catchment Outlines. Copyright © 2020, Health and Social Care Information Centre. The Health and Social Care Information Centre is a non-departmental body created by statute, also known as NHS Digital. Data was cleaned by Ribble Rivers Trust before use.
MSOA boundaries: © Office for National Statistics licensed under the Open Government Licence v3.0. Contains OS data © Crown copyright and database right 2021.
Population data: Mid-2019 (June 30) Population Estimates for Middle Layer Super Output Areas in England and Wales. © Office for National Statistics licensed under the Open Government Licence v3.0. © Crown Copyright 2020.

COPYRIGHT NOTICE

The reproduction of this data must be accompanied by the following statement:

© Ribble Rivers Trust 2021. Analysis carried out using data that is: Copyright © 2020, Health and Social Care Information Centre. The Health and Social Care Information Centre is a non-departmental body created by statute, also known as NHS Digital; © Office for National Statistics licensed under the Open Government Licence v3.0. Contains OS data © Crown copyright and database right 2021. © Crown Copyright 2020.

CaBA HEALTH & WELLBEING EVIDENCE BASE

This dataset forms part of the wider CaBA Health and Wellbeing Evidence Base.
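The MSOA estimation and scoring steps described in the methodology can be sketched roughly as follows. This is an illustration under stated assumptions, not the Trust's actual procedure: the MSOA names and figures are invented, the weights are normalised by their sum because overlapping catchments may cover more than 100% of an area, and min-max scaling is assumed for the "relative score between 1 and 0", which the description does not define precisely.

```python
def weighted_prevalence(coverage):
    """Area-weighted average prevalence (in percent) for one MSOA.

    `coverage` is a list of (area_share, gp_prevalence_pct) pairs, one per
    GP practice whose catchment overlaps the MSOA.
    """
    total_weight = sum(share for share, _ in coverage)
    if total_weight == 0:
        raise ValueError("MSOA is not covered by any GP catchment")
    return sum(share * pct for share, pct in coverage) / total_weight


def min_max(values):
    """Rescale values to [0, 1]; 1 is the highest (worst) value."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Hypothetical MSOAs: (coverage pairs, mid-year population estimate).
msoas = {
    "MSOA A": ([(60.0, 3.0), (40.0, 4.0)], 8000),
    "MSOA B": ([(100.0, 2.0)], 6000),
    "MSOA C": ([(50.0, 5.0), (50.0, 3.0)], 10000),
}

names = list(msoas)
pct = [weighted_prevalence(msoas[n][0]) for n in names]
count = [p / 100 * msoas[n][1] for p, n in zip(pct, names)]

score_a = min_max(pct)    # A) percentage of the population with cancer
score_b = min_max(count)  # B) number of people with cancer
combined = min_max([(a + b) / 2 for a, b in zip(score_a, score_b)])
scores = dict(zip(names, combined))
```

Because the scores are relative, adding or removing an MSOA changes every other MSOA's score, so they are only meaningful within one run of the analysis.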

Lung-Cancer-Risk-Dataset
kaggle.com
Updated Aug 23, 2025
Cite
Mikey-TraceGod (2025). Lung-Cancer-Risk-Dataset [Dataset]. http://doi.org/10.34740/kaggle/dsv/12844025
Explore at:
Croissant: a format for machine-learning datasets. Learn more at mlcommons.org/croissant.
Unique identifier
https://doi.org/10.34740/kaggle/dsv/12844025
Dataset updated
Aug 23, 2025
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Mikey-TraceGod
License
https://creativecommons.org/publicdomain/zero/1.0/
Description

Lung Cancer Risk Dataset

Overview

This dataset contains 50,000 patient profiles designed for lung cancer risk analysis and machine learning applications. The dataset is clean, preprocessed, and ready for immediate use in classification tasks, statistical analysis, and data visualization.

Rows: 50,000
Columns: 11
File: preprocessed_lung_cancer_dataset.csv
License: CC0: Public Domain

Dataset Description

The dataset includes patient profiles with features based on established lung cancer risk factors such as smoking history, environmental exposures, and chronic lung conditions. All data is synthetic and designed to reflect realistic risk factor distributions while maintaining patient privacy.

Features

Column	Type	Description	Values/Range
patient_id	Integer	Unique patient identifier	100000-149999
age	Integer	Patient age in years	18-100
gender	String	Patient gender	'Male', 'Female'
pack_years	Float	Smoking exposure (years × packs per day)	0-100
radon_exposure	String	Residential radon exposure level	'Low', 'Medium', 'High'
asbestos_exposure	String	Occupational asbestos exposure history	'Yes', 'No'
secondhand_smoke_exposure	String	Passive smoking exposure	'Yes', 'No'
copd_diagnosis	String	Chronic obstructive pulmonary disease diagnosis	'Yes', 'No'
alcohol_consumption	String	Alcohol consumption pattern	'None', 'Moderate', 'Heavy'
family_history	String	Family history of lung cancer	'Yes', 'No'
lung_cancer	String	Target variable: Lung cancer diagnosis	'Yes', 'No'
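To make the feature table concrete, the sketch below computes pack-years as defined in the table and checks a single row against the listed values and ranges. The helper names and the sample row are hypothetical; only the column names, categories, and ranges come from the table.

```python
def pack_years(packs_per_day, years_smoked):
    """Smoking exposure as defined in the table: years x packs per day."""
    return packs_per_day * years_smoked


# Numeric ranges and allowed categories copied from the feature table.
RANGES = {"patient_id": (100000, 149999), "age": (18, 100), "pack_years": (0, 100)}
CATEGORIES = {
    "gender": {"Male", "Female"},
    "radon_exposure": {"Low", "Medium", "High"},
    "asbestos_exposure": {"Yes", "No"},
    "secondhand_smoke_exposure": {"Yes", "No"},
    "copd_diagnosis": {"Yes", "No"},
    "alcohol_consumption": {"None", "Moderate", "Heavy"},
    "family_history": {"Yes", "No"},
    "lung_cancer": {"Yes", "No"},
}


def validate(row):
    """Return the column names whose values fall outside the table."""
    problems = []
    for col, (lo, hi) in RANGES.items():
        if not lo <= row[col] <= hi:
            problems.append(col)
    for col, allowed in CATEGORIES.items():
        if row[col] not in allowed:
            problems.append(col)
    return problems


# Invented row for illustration; not taken from the dataset.
sample = {
    "patient_id": 100001, "age": 61, "gender": "Female",
    "pack_years": pack_years(1.5, 20), "radon_exposure": "Medium",
    "asbestos_exposure": "No", "secondhand_smoke_exposure": "Yes",
    "copd_diagnosis": "No", "alcohol_consumption": "None",
    "family_history": "Yes", "lung_cancer": "No",
}
issues = validate(sample)
```

One thing to watch when loading the real CSV: some parsers treat the string 'None' as a missing value by default, which would make `alcohol_consumption` appear to have gaps.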

Data Quality

Complete: No missing values or duplicates
Clean: All values within realistic ranges
Balanced Features: Realistic distribution of risk factors
Target Distribution: Approximately 25% positive cases, reflecting real-world lung cancer prevalence

Use Cases

Binary classification modeling
Risk factor correlation analysis
Data visualization and exploratory analysis
Machine learning pipeline development
Statistical hypothesis testing

lung-cancer
huggingface.co
Updated Jun 24, 2022
Cite
Nate Raw (2022). lung-cancer [Dataset]. https://huggingface.co/datasets/nateraw/lung-cancer
Explore at:
Croissant: a format for machine-learning datasets. Learn more at mlcommons.org/croissant.
Dataset updated
Jun 24, 2022
Authors
Nate Raw
License
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0) (https://creativecommons.org/licenses/by-nc-sa/4.0/)
License information was derived automatically
Description
Dataset Card for Lung Cancer

Dataset Summary

An effective cancer prediction system helps people learn their cancer risk at low cost and make appropriate decisions based on their risk status. The data was collected from the website of an online lung cancer prediction system.

Supported Tasks and Leaderboards

[More Information Needed]

Languages

[More Information Needed]

Dataset Structure… See the full description on the dataset page: https://huggingface.co/datasets/nateraw/lung-cancer.
Mortality rate from oral cancer, all ages - WMCA
cityobservatory.birmingham.gov.uk
csv, excel, geojson +1
Updated Nov 3, 2025
Cite
(2025). Mortality rate from oral cancer, all ages - WMCA [Dataset]. https://cityobservatory.birmingham.gov.uk/explore/dataset/mortality-rate-from-oral-cancer-all-ages-wmca/
Explore at:
Available download formats: csv, geojson, json, excel
Dataset updated
Nov 3, 2025
License
Open Government Licence 3.0 (http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/)
License information was derived automatically
Description
Age-standardised rate of mortality from oral cancer (ICD-10 codes C00-C14) in persons of all ages and sexes per 100,000 population.

Rationale

Over the last decade in the UK (between 2003-2005 and 2012-2014), oral cancer mortality rates have increased by 20% for males and 19% for females (1). Five year survival rates are 56%. Most oral cancers are triggered by tobacco and alcohol, which together account for 75% of cases (2). Cigarette smoking is associated with an increased risk of the more common forms of oral cancer. The risk among cigarette smokers is estimated to be 10 times that for non-smokers. More intense use of tobacco increases the risk, while ceasing to smoke for 10 years or more reduces it to almost the same as that of non-smokers (3). Oral cancer mortality rates can be used in conjunction with registration data to inform service planning as well as comparing survival rates across areas of England to assess the impact of public health prevention policies such as smoking cessation.

References:

(1) Cancer Research Campaign. Cancer Statistics: Oral – UK. London: CRC, 2000.
(2) Blot WJ, McLaughlin JK, Winn DM et al. Smoking and drinking in relation to oral and pharyngeal cancer. Cancer Res 1988; 48: 3282-7.
(3) La Vecchia C, Tavani A, Franceschi S et al. Epidemiology and prevention of oral cancer. Oral Oncology 1997; 33: 302-12.

Definition of numerator

All cancer mortality for lip, oral cavity and pharynx (ICD-10 C00-C14) in the respective calendar years aggregated into quinary age bands (0-4, 5-9,…, 85-89, 90+). This does not include secondary cancers or recurrences. Data are reported according to the calendar year in which the cancer was diagnosed.

Counts of deaths for years up to and including 2019 have been adjusted where needed to take account of the MUSE ICD-10 coding change introduced in 2020. Detailed guidance on the MUSE implementation is available at: https://www.ons.gov.uk/peoplepopulationandcommunity/birthsdeathsandmarriages/deaths/articles/causeofdeathcodinginmortalitystatisticssoftwarechanges/january2020

Counts of deaths for years up to and including 2013 have been double adjusted by applying comparability ratios from both the IRIS coding change and the MUSE coding change where needed to take account of both the MUSE ICD-10 coding change and the IRIS ICD-10 coding change introduced in 2014. The detailed guidance on the IRIS implementation is available at: https://www.ons.gov.uk/peoplepopulationandcommunity/birthsdeathsandmarriages/deaths/bulletins/impactoftheimplementationofirissoftwareforicd10causeofdeathcodingonmortalitystatisticsenglandandwales/2014-08-08

Counts of deaths for years up to and including 2010 have been triple adjusted by applying comparability ratios from the 2011 coding change, the IRIS coding change and the MUSE coding change where needed to take account of the MUSE ICD-10 coding change, the IRIS ICD-10 coding change and the ICD-10 coding change introduced in 2011. The detailed guidance on the 2011 implementation is available at https://webarchive.nationalarchives.gov.uk/ukgwa/20160108084125/http://www.ons.gov.uk/ons/guide-method/classifications/international-standard-classifications/icd-10-for-mortality/comparability-ratios/index.html

Definition of denominator

Population-years (aggregated populations for the three years) for people of all ages, aggregated into quinary age bands (0-4, 5-9, …, 85-89, 90+)
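The numerator and denominator defined above feed a directly age-standardised rate. The sketch below shows the general technique under stated assumptions: the description does not name the standard population, so the weights here are hypothetical, and only three age bands are used to keep the example short (the indicator itself uses the full set of quinary bands).

```python
def directly_standardised_rate(deaths, population_years, standard_weights):
    """Directly age-standardised rate per 100,000.

    All three arguments are dicts keyed by age band. Each band's crude
    rate (deaths / population-years) is weighted by the standard
    population for that band, then the weighted sum is normalised.
    """
    bands = set(standard_weights)
    if set(deaths) != bands or set(population_years) != bands:
        raise ValueError("age bands must match across inputs")
    weighted = sum(
        standard_weights[b] * deaths[b] / population_years[b] for b in bands
    )
    return weighted / sum(standard_weights.values()) * 100_000


# Hypothetical inputs; neither the counts nor the weights come from the source.
deaths = {"0-4": 0, "5-9": 1, "90+": 4}
population_years = {"0-4": 10_000, "5-9": 20_000, "90+": 2_000}
standard_weights = {"0-4": 1_000, "5-9": 1_000, "90+": 500}

dsr = directly_standardised_rate(deaths, population_years, standard_weights)
```

The comparability-ratio adjustments described above would be applied to the death counts before this step; they are not modelled here.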
Under 75 mortality rate from cancer - ICP Outcomes Framework - Resident...
cityobservatory.birmingham.gov.uk
csv, excel, geojson +1
Updated Sep 9, 2025
Cite
(2025). Under 75 mortality rate from cancer - ICP Outcomes Framework - Resident Locality [Dataset]. https://cityobservatory.birmingham.gov.uk/explore/dataset/under-75-mortality-rate-from-cancer-icp-outcomes-framework-resident-locality/
Explore at:
Available download formats: geojson, csv, excel, json
Dataset updated
Sep 9, 2025
License
Open Government Licence 3.0 (http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/)
License information was derived automatically
Description
This dataset presents the mortality rate from cancer among individuals under the age of 75 within the Birmingham and Solihull area. It captures the number of deaths attributed to all cancers (classified under ICD-10 codes C00 to C97) and expresses this as a directly age-standardised rate per 100,000 population. The data is structured in quinary age bands and is available for both single-year and three-year rolling averages, providing a comprehensive view of premature cancer mortality trends in the region.

Rationale

Reducing premature mortality from cancer is a key public health priority. This indicator helps track progress in lowering the number of cancer-related deaths among people under 75, supporting efforts to improve early diagnosis, treatment, and prevention strategies.

Numerator

The numerator is the number of deaths from all cancers (ICD-10 codes C00 to C97) registered in the respective calendar years, for individuals aged under 75. These figures are aggregated into quinary age bands and sourced from the Death Register.

Denominator

The denominator is the population of individuals under 75 years of age, also aggregated into quinary age bands. For single-year rates, the population for that year is used. For three-year rolling averages, the population-years are aggregated across the three years. The source of this data is the 2021 Census.
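The difference between single-year and three-year rolling figures can be sketched as follows for a single age band: deaths and population are pooled over each three-year window before dividing. This is an illustration only, with invented figures; the published indicator applies this per age band and then age-standardises, which is not shown here.

```python
def rolling_three_year(deaths_by_year, population_by_year):
    """Pooled three-year rates per 100,000, keyed by (first_year, last_year)."""
    years = sorted(deaths_by_year)
    rates = {}
    for i in range(len(years) - 2):
        window = years[i:i + 3]
        if window[2] - window[0] != 2:
            continue  # skip windows that contain a missing year
        deaths = sum(deaths_by_year[y] for y in window)
        pop_years = sum(population_by_year[y] for y in window)
        rates[(window[0], window[2])] = deaths / pop_years * 100_000
    return rates


# Hypothetical counts for one age band in one locality.
deaths_by_year = {2019: 30, 2020: 36, 2021: 24, 2022: 30}
population_by_year = {2019: 30_000, 2020: 30_000, 2021: 30_000, 2022: 30_000}

rolling = rolling_three_year(deaths_by_year, population_by_year)
```

Pooling smooths year-to-year noise in small areas: in this example the single-year rates range from 80 to 120 per 100,000, while both rolling windows come out at 100.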

Caveats

Data may not align exactly with published Office for National Statistics (ONS) figures due to differences in postcode lookup versions and the application of comparability ratios in Office for Health Improvement and Disparities (OHID) data. Users should be cautious when comparing this dataset with other national statistics.

External references

Further information and related indicators can be found on the OHID Fingertips platform.

Localities Explained

This dataset contains data based on either the resident locality or registered locality of the patient. A distinction is made between resident locality and registered locality populations:

Resident Locality refers to individuals who live within the defined geographic boundaries of the locality. These boundaries are aligned with official administrative areas such as wards and Lower Layer Super Output Areas (LSOAs).

Registered Locality refers to individuals who are registered with GP practices that are assigned to a locality based on the Primary Care Network (PCN) they belong to. These assignments are approximate: PCNs are mapped to a locality based on the location of most of their GP surgeries. As a result, locality-registered patients may live outside the locality, sometimes even in different towns or cities.

This distinction is important because some health indicators are only available at GP practice level, without information on where patients actually reside. In such cases, data is attributed to the locality based on GP registration, not residential address.

Click here to explore more from the Birmingham and Solihull Integrated Care Partnerships Outcome Framework.
One-year survival from all cancers (NHSOF 1.4.i) - Dataset - data.gov.uk
ckan.publishing.service.gov.uk
Updated Aug 4, 2015
Cite
ckan.publishing.service.gov.uk (2015). One-year survival from all cancers (NHSOF 1.4.i) - Dataset - data.gov.uk [Dataset]. https://ckan.publishing.service.gov.uk/dataset/one-year-survival-from-all-cancers-nhsof-1-4-i
Explore at:
Dataset updated
Aug 4, 2015
Dataset provided by
CKAN (https://ckan.org/)
License
Open Government Licence 3.0 (http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/)
License information was derived automatically
Description
A measure of the number of adults diagnosed with any type of cancer in a year who are still alive one year after diagnosis.

Purpose

This indicator attempts to capture the success of the NHS in preventing people from dying once they have been diagnosed with any type of cancer.

Current version updated: Feb-17
Next version due: Feb-18
Mortality Rates
catalog.data.gov
datasets.ai
Updated Nov 22, 2024
Cite
Lake County Illinois GIS (2024). Mortality Rates [Dataset]. https://catalog.data.gov/dataset/mortality-rates-6fb72
Explore at:
Dataset updated
Nov 22, 2024
Dataset provided by
Lake County Illinois GIS
Description
Mortality Rates for Lake County, Illinois. Explanation of field attributes:
Average Age of Death – The average age at which people in the given zip code die.
Cancer Deaths – Individuals who have died of cancer as the underlying cause. This is a rate per 100,000.
Heart Disease Related Deaths – Individuals who have died of heart disease as the underlying cause. This is a rate per 100,000.
COPD Related Deaths – Individuals who have died of chronic obstructive pulmonary disease (COPD) as the underlying cause. This is a rate per 100,000.
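As a reading aid for the "rate per 100,000" fields, here is a minimal sketch of the arithmetic. It assumes a crude rate (deaths divided by population); the source does not say whether the published figures are crude or adjusted, and the numbers below are invented.

```python
def rate_per_100k(deaths, population):
    """Crude mortality rate expressed per 100,000 people."""
    if population <= 0:
        raise ValueError("population must be positive")
    return deaths * 100_000 / population


# Invented figures: 45 deaths in a zip code of 30,000 residents.
example_rate = rate_per_100k(45, 30_000)
```

Scaling to a common denominator lets zip codes of very different sizes be compared directly.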
Mortality rate per 10,000 of people with cancer 2012-14 - Dataset -...
ckan.publishing.service.gov.uk
Updated Dec 19, 2016
Cite
ckan.publishing.service.gov.uk (2016). Mortality rate per 10,000 of people with cancer 2012-14 - Dataset - data.gov.uk [Dataset]. https://ckan.publishing.service.gov.uk/dataset/mortality-rate-per-10-000-of-people-with-cancer-2012-14
Explore at:
Dataset updated
Dec 19, 2016
Dataset provided by
CKAN: https://ckan.org/
License
Open Government Licence 3.0: http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/
License information was derived automatically
Description
Mortality rate per 10,000 people with cancer in Plymouth, 2012-14.
Cancer survival in England - adults diagnosed
ons.gov.uk
cy.ons.gov.uk
xlsx
Updated Aug 12, 2019
Cite
Office for National Statistics (2019). Cancer survival in England - adults diagnosed [Dataset]. https://www.ons.gov.uk/peoplepopulationandcommunity/healthandsocialcare/conditionsanddiseases/datasets/cancersurvivalratescancersurvivalinenglandadultsdiagnosed
Explore at:
Available download formats: xlsx
Dataset updated
Aug 12, 2019
Dataset provided by
Office for National Statistics: http://www.ons.gov.uk/
License
Open Government Licence 3.0: http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/
License information was derived automatically
Description
One-year and five-year net survival for adults (15-99) in England diagnosed with one of 29 common cancers, by age and sex.
SHIP Cancer Mortality Rate 2009-2021
catalog.data.gov
opendata.maryland.gov
+2more
Updated Aug 16, 2024
Cite
opendata.maryland.gov (2024). SHIP Cancer Mortality Rate 2009-2021 [Dataset]. https://catalog.data.gov/dataset/ship-cancer-mortality-rate-2009-2017
Explore at:
Dataset updated
Aug 16, 2024
Dataset provided by
opendata.maryland.gov
Description
This is historical data. The update frequency has been set to "Static Data" and is here for historic value. Updated on 8/14/2024.
Cancer Mortality Rate - This indicator shows the age-adjusted mortality rate from cancer (per 100,000 population). Maryland’s age-adjusted cancer mortality rate is higher than the US cancer mortality rate. Cancer impacts people across all population groups; however, wide racial disparities exist.
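To make "age-adjusted" concrete, the sketch below uses direct standardization: each age group's rate is weighted by that group's share of a standard population. The source does not state which method or standard population Maryland uses, so the method choice, the age bands, the weights, and the counts are all assumptions for illustration.

```python
def age_adjusted_rate(strata, standard_weights):
    """Directly standardized rate per 100,000.

    strata: {age_group: (deaths, population)}
    standard_weights: {age_group: share of the standard population}
    """
    total_weight = sum(standard_weights.values())
    adjusted = 0.0
    for group, (deaths, population) in strata.items():
        specific_rate = deaths * 100_000 / population
        adjusted += specific_rate * standard_weights[group] / total_weight
    return adjusted


# Invented age bands, counts, and weights.
strata = {
    "0-44": (10, 100_000),
    "45-64": (60, 50_000),
    "65+": (200, 25_000),
}
weights = {"0-44": 0.60, "45-64": 0.25, "65+": 0.15}

adjusted = age_adjusted_rate(strata, weights)
crude = sum(d for d, _ in strata.values()) * 100_000 / sum(
    p for _, p in strata.values()
)
```

Because the weights come from a fixed reference population, two regions with different age structures can be compared on the adjusted figure even when their crude rates differ.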
Five-year survival from all cancers (NHSOF 1.4.ii) - Dataset - data.gov.uk
ckan.publishing.service.gov.uk
Updated Aug 4, 2015
Cite
ckan.publishing.service.gov.uk (2015). Five-year survival from all cancers (NHSOF 1.4.ii) - Dataset - data.gov.uk [Dataset]. https://ckan.publishing.service.gov.uk/dataset/five-year-survival-from-all-cancers-nhsof-1-4-ii
Explore at:
Dataset updated
Aug 4, 2015
Dataset provided by
CKAN: https://ckan.org/
License
Open Government Licence 3.0: http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/
License information was derived automatically
Description
A measure of the number of adults diagnosed with any type of cancer in a year who are still alive five years after diagnosis.
Purpose: This indicator attempts to capture the success of the NHS in preventing people from dying once they have been diagnosed with any type of cancer.
Current version updated: Feb-17
Next version due: Feb-18

Breast Cancer Dataset [Wisconsin Diagnostic UCI]

kaggle.com

zip

Updated Jan 22, 2024

Cite

Abhinav Mangalore (2024). Breast Cancer Dataset [Wisconsin Diagnostic UCI] [Dataset]. https://www.kaggle.com/datasets/abhinavmangalore/breast-cancer-dataset-wisconsin-diagnostic-uci

Explore at:

Available download formats: zip (49831 bytes)

Dataset updated

Jan 22, 2024

Authors

Abhinav Mangalore

License

Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically

Area covered

Wisconsin

Description

This dataset is taken from the UCI Machine Learning Repository (link: https://data.world/health/breast-cancer-wisconsin). The donor is Nick Street.

The main idea and inspiration behind the upload was to provide datasets for Machine Learning as practice and reference for my peers at college. The main purpose is to analyze data and experiment with different machine learning ideas and techniques for this binary classification task. As such, this dataset is a very useful resource to practice on.

Breast cancer occurs when breast cells mutate and become cancerous cells that multiply and form tumors. It accounts for 25% of all cancer cases and affected over 2.1 million people in 2015 alone. Breast cancer typically affects women and people assigned female at birth (AFAB) age 50 and older, but it can also affect men and people assigned male at birth (AMAB), as well as younger women. Healthcare providers may treat breast cancer with surgery to remove tumors or treatment to kill cancerous cells.

Features are computed from a digitized image of a fine needle aspirate (FNA) of a breast mass. They describe characteristics of the cell nuclei present in the image. A few of the images can be found at http://www.cs.wisc.edu/~street/images/

The task: To classify whether the tumor is benign (B) or malignant (M).

Relevant information

Separating plane described above was obtained using
Multisurface Method-Tree (MSM-T) [K. P. Bennett, "Decision Tree
Construction Via Linear Programming." Proceedings of the 4th
Midwest Artificial Intelligence and Cognitive Science Society,
pp. 97-101, 1992], a classification method which uses linear
programming to construct a decision tree. Relevant features
were selected using an exhaustive search in the space of 1-4
features and 1-3 separating planes.

The actual linear program used to obtain the separating plane
in the 3-dimensional space is that described in:
[K. P. Bennett and O. L. Mangasarian: "Robust Linear
Programming Discrimination of Two Linearly Inseparable Sets",
Optimization Methods and Software 1, 1992, 23-34].


This database is also available through the UW CS ftp server:

ftp ftp.cs.wisc.edu
cd math-prog/cpo-dataset/machine-learn/WDBC/

Number of instances: 569

Number of attributes: 32 (ID, diagnosis, 30 real-valued input features)
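
The 32-attribute layout stated above (ID, diagnosis, 30 real-valued features) can be checked while reading the data. The following is a minimal sketch that assumes a comma-separated file with no header row and the columns in that order; the record type, the function name, and the two sample rows are made up for illustration and are not real patients.

```python
import csv
import io
from dataclasses import dataclass
from typing import List


@dataclass
class WdbcRecord:
    patient_id: str
    diagnosis: str         # "M" (malignant) or "B" (benign)
    features: List[float]  # the 30 real-valued inputs


def parse_wdbc(text):
    """Parse rows of ID, diagnosis, then 30 numeric features."""
    records = []
    for row in csv.reader(io.StringIO(text)):
        if not row:
            continue  # skip blank lines
        if len(row) != 32:
            raise ValueError(f"expected 32 columns, got {len(row)}")
        if row[1] not in ("M", "B"):
            raise ValueError(f"unexpected diagnosis label: {row[1]!r}")
        records.append(
            WdbcRecord(row[0], row[1], [float(v) for v in row[2:]])
        )
    return records


# Two made-up rows with placeholder feature values.
sample = (
    "1001,M," + ",".join(["1.0"] * 30) + "\n"
    + "1002,B," + ",".join(["0.5"] * 30) + "\n"
)
records = parse_wdbc(sample)
malignant_count = sum(r.diagnosis == "M" for r in records)
```

Validating the column count and the label up front catches truncated or shifted rows before they reach a classifier.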

Original Creators:

Dr. William H. Wolberg, General Surgery Dept., University of
Wisconsin, Clinical Sciences Center, Madison, WI 53792
wolberg@eagle.surgery.wisc.edu

W. Nick Street, Computer Sciences Dept., University of
Wisconsin, 1210 West Dayton St., Madison, WI 53706
street@cs.wisc.edu 608-262-6619

Olvi L. Mangasarian, Computer Sciences Dept., University of
Wisconsin, 1210 West Dayton St., Madison, WI 53706
olvi@cs.wisc.edu

Donor: Nick Street

Date: November 1995

Past Usage:

first usage:

W.N. Street, W.H. Wolberg and O.L. Mangasarian 
Nuclear feature extraction for breast tumor diagnosis.
IS&T/SPIE 1993 International Symposium on Electronic Imaging: Science
and Technology, volume 1905, pages 861-870, San Jose, CA, 1993.

OR literature:

O.L. Mangasarian, W.N. Street and W.H. Wolberg. 
Breast cancer diagnosis and prognosis via linear programming. 
Operations Research, 43(4), pages 570-577, July-August 1995.

Medical literature:

W.H. Wolberg, W.N. Street, and O.L. Mangasarian. 
Machine learning techniques to diagnose breast cancer from
fine-needle aspirates. 
Cancer Letters 77 (1994) 163-171.

W.H. Wolberg, W.N. Street, and O.L. Mangasarian. 
Image analysis and machine learning applied to breast cancer
diagnosis and prognosis. 
Analytical and Quantitative Cytology and Histology, Vol. 17
No. 2, pages 77-87, April 1995. 

W.H. Wolberg, W.N. Street, D.M. Heisey, and O.L. Mangasarian. 
Computerized breast cancer diagnosis and prognosis from fine
needle aspirates. 
Archives of Surgery 1995;130:511-516.

W.H. Wolberg, W.N. Street, D.M. Heisey, and O.L. Mangasarian. 
Computer-derived nuclear features distinguish malignant from
benign breast cytology. 
Human Pathology, 26:792--796, 1995.

Data from: CSAW-M: An Ordinal Classification Dataset for Benchmarking...
figshare.scilifelab.se
Updated Jan 15, 2025
Cite
Moein Sorkhei; Yue Liu; Hossein Azizpour; Edward Azavedo; Karin Dembrower; Dimitra Ntoula; Athanasios Zouzos; Fredrik Strand; Kevin Smith (2025). CSAW-M: An Ordinal Classification Dataset for Benchmarking Mammographic Masking of Cancer [Dataset]. http://doi.org/10.17044/scilifelab.14687271.v2
Explore at:
Unique identifier
https://doi.org/10.17044/scilifelab.14687271.v2
Dataset updated
Jan 15, 2025
Dataset provided by
KTH Royal Institute of Technology
Authors
Moein Sorkhei; Yue Liu; Hossein Azizpour; Edward Azavedo; Karin Dembrower; Dimitra Ntoula; Athanasios Zouzos; Fredrik Strand; Kevin Smith
License
https://www.scilifelab.se/data/restricted-access/
Description
Welcome to the CSAW-M dataset homepage

This page includes the files and metadata related to CSAW-M, a curated dataset of mammograms with expert assessments of the masking of cancer. CSAW-M is collected from over 10,000 individuals and annotated with potential masking. In contrast to previous approaches, which measure breast image density as a proxy, our dataset directly provides annotations of masking potential assessments from five specialists. We trained deep learning models on CSAW-M to estimate the masking level, and showed that the estimated masking is significantly more predictive of screening participants diagnosed with interval and large invasive cancers (without being explicitly trained for these tasks) than its breast density counterparts. Please find the paper corresponding to our work here and the GitHub repo here.

CSAW-M Research Use License

Please read carefully all the terms and conditions of the CSAW-M Research Use License.

How to access the dataset:

If you want to get access to the data, please use the "Request access to files" option above (currently, non-Swedish researchers need to have a general figshare account to be able to request access). We will ask you to agree to our terms and conditions and provide us with some information about what you will use the data for. We will then receive the request and process it, after which you will be able to download all the files.

If you use this Work, please cite our paper:

@article{sorkhei2021csaw,
  title={CSAW-M: An Ordinal Classification Dataset for Benchmarking Mammographic Masking of Cancer},
  author={Sorkhei, Moein and Liu, Yue and Azizpour, Hossein and Azavedo, Edward and Dembrower, Karin and Ntoula, Dimitra and Zouzos, Athanasios and Strand, Fredrik and Smith, Kevin},
  year={2021}
}
Data from: Dataset description.
plos.figshare.com
xls
Updated Aug 27, 2024
Cite
Refat Khan Pathan; Israt Jahan Shorna; Md. Sayem Hossain; Mayeen Uddin Khandaker; Huda I. Almohammed; Zuhal Y. Hamd (2024). Dataset description. [Dataset]. http://doi.org/10.1371/journal.pone.0305035.t001
Explore at:
Available download formats: xls
Unique identifier
https://doi.org/10.1371/journal.pone.0305035.t001
Dataset updated
Aug 27, 2024
Dataset provided by
PLOS: http://plos.org/
Authors
Refat Khan Pathan; Israt Jahan Shorna; Md. Sayem Hossain; Mayeen Uddin Khandaker; Huda I. Almohammed; Zuhal Y. Hamd
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Among many types of cancers, to date, lung cancer remains one of the deadliest cancers around the world. Many researchers, scientists, doctors, and people from other fields continuously contribute to this subject regarding early prediction and diagnosis. One of the significant problems in prediction is the black-box nature of machine learning models. Though the detection rate is comparatively satisfactory, people have yet to learn how a model came to that decision, causing trust issues among patients and healthcare workers. This work uses multiple machine learning models on a numerical dataset of lung cancer-relevant parameters and compares performance and accuracy. After comparison, each model has been explained using different methods. The main contribution of this research is to give logical explanations of why the model reached a particular decision to achieve trust. This research has also been compared with a previous study that worked with a similar dataset and took expert opinions regarding their proposed model. We also showed that our research achieved better results than their proposed model and specialist opinion using hyperparameter tuning, having an improved accuracy of almost 100% in all four models.

Cite

MasterDataSan (2024). Lung Cancer Mortality Datasets v2 [Dataset]. https://www.kaggle.com/datasets/masterdatasan/lung-cancer-mortality-datasets-v2

Lung Cancer Mortality Datasets v2

Dataset of lung cancer with time observation during the treatment period

Explore at:

Available download formats: zip (81127029 bytes)

Dataset updated

Jun 1, 2024

Authors

MasterDataSan

Description

id: A unique identifier for each patient in the dataset.
age: The age of the patient at the time of diagnosis.
gender: The gender of the patient (e.g., male, female).
country: The country or region where the patient resides.
diagnosis_date: The date on which the patient was diagnosed with lung cancer.
cancer_stage: The stage of lung cancer at the time of diagnosis (e.g., Stage I, Stage II, Stage III, Stage IV).
family_history: Indicates whether there is a family history of cancer (e.g., yes, no).
smoking_status: The smoking status of the patient (e.g., current smoker, former smoker, never smoked, passive smoker).
bmi: The Body Mass Index of the patient at the time of diagnosis.
cholesterol_level: The cholesterol level of the patient (value).
hypertension: Indicates whether the patient has hypertension (high blood pressure) (e.g., yes, no).
asthma: Indicates whether the patient has asthma (e.g., yes, no).
cirrhosis: Indicates whether the patient has cirrhosis of the liver (e.g., yes, no).
other_cancer: Indicates whether the patient has had any other type of cancer in addition to the primary diagnosis (e.g., yes, no).
treatment_type: The type of treatment the patient received (e.g., surgery, chemotherapy, radiation, combined).
end_treatment_date: The date on which the patient completed their cancer treatment or died.
survived: Indicates whether the patient survived (e.g., yes, no).
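
A short sketch of how these columns might be used follows. It reads only a few of the listed fields, and it assumes ISO-formatted dates (YYYY-MM-DD) and "yes"/"no" values in survived; the actual file encoding is not confirmed by the description, and the three sample rows are invented.

```python
import csv
import io
from datetime import date


def treatment_days(row):
    """Days between diagnosis and the end of treatment (or death)."""
    start = date.fromisoformat(row["diagnosis_date"])
    end = date.fromisoformat(row["end_treatment_date"])
    return (end - start).days


def survival_rate_by_stage(rows):
    """Share of patients marked as survived within each cancer stage."""
    totals, survivors = {}, {}
    for row in rows:
        stage = row["cancer_stage"]
        totals[stage] = totals.get(stage, 0) + 1
        if row["survived"].strip().lower() == "yes":
            survivors[stage] = survivors.get(stage, 0) + 1
    return {s: survivors.get(s, 0) / n for s, n in totals.items()}


# Invented rows using a subset of the documented column names.
sample_csv = """id,cancer_stage,diagnosis_date,end_treatment_date,survived
1,Stage I,2020-01-10,2020-07-10,yes
2,Stage I,2020-02-01,2020-05-01,no
3,Stage IV,2021-03-15,2021-04-14,no
"""
rows = list(csv.DictReader(io.StringIO(sample_csv)))
durations = [treatment_days(r) for r in rows]
rates = survival_rate_by_stage(rows)
```

Note that end_treatment_date covers both completed treatment and death, so the duration should be interpreted together with the survived column.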

This dataset contains artificially generated data with as close a representation of reality as possible. This data is free to use without any licence required.

Good luck Gakusei!
