94 datasets found
  1. Cancer Rates by U.S. State

    • kaggle.com
    zip
    Updated Dec 26, 2022
    Cite
    Heemali Chaudhari (2022). Cancer Rates by U.S. State [Dataset]. https://www.kaggle.com/datasets/heemalichaudhari/cancer-rates-by-us-state
    Explore at:
    Available download formats: zip (219237 bytes)
    Dataset updated
    Dec 26, 2022
    Authors
    Heemali Chaudhari
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Area covered
    United States
    Description

    In the following maps, the U.S. states are divided into groups based on the rates at which people developed or died from cancer in 2013, the most recent year for which incidence data are available.

    The rates are the numbers out of 100,000 people who developed or died from cancer each year.

    Incidence Rates by State The number of people who get cancer is called cancer incidence. In the United States, the rate of getting cancer varies from state to state.

    *Rates are per 100,000 and are age-adjusted to the 2000 U.S. standard population.

    ‡Rates are not shown if the state did not meet USCS publication criteria or if the state did not submit data to CDC.

    †Source: U.S. Cancer Statistics Working Group. United States Cancer Statistics: 1999–2013 Incidence and Mortality Web-based Report. Atlanta (GA): Department of Health and Human Services, Centers for Disease Control and Prevention, and National Cancer Institute; 2016. Available at: http://www.cdc.gov/uscs.

    Death Rates by State Rates of dying from cancer also vary from state to state.

    *Rates are per 100,000 and are age-adjusted to the 2000 U.S. standard population.

    †Source: U.S. Cancer Statistics Working Group. United States Cancer Statistics: 1999–2013 Incidence and Mortality Web-based Report. Atlanta (GA): Department of Health and Human Services, Centers for Disease Control and Prevention, and National Cancer Institute; 2016. Available at: http://www.cdc.gov/uscs.

    Source: https://www.cdc.gov/cancer/dcpc/data/state.htm

  2. Cancer Mortality & Incidence Rates: (Country LVL)

    • kaggle.com
    zip
    Updated Dec 3, 2022
    Cite
    The Devastator (2022). Cancer Mortality & Incidence Rates: (Country LVL) [Dataset]. https://www.kaggle.com/datasets/thedevastator/us-county-level-cancer-mortality-and-incidence-r
    Explore at:
    Available download formats: zip (146998 bytes)
    Dataset updated
    Dec 3, 2022
    Authors
    The Devastator
    Description

    Cancer Mortality & Incidence Rates: (Country LVL)

    Investigating Cancer Trends over time

    By Data Exercises [source]

    About this dataset

    This dataset is a collection of county-level cancer mortality and incidence rates in the United States from 2000 to 2014. It provides detail on cancer cases, deaths, and trends at a local level. The columns include County, FIPS, age-adjusted death rate, average death rate per year, recent trend (2) in death rates, recent 5-year trend (2) in death rates, and average annual count for each county. The dataset can give insight into the patterns and effects of cancer on communities and can help inform policy decisions on mitigating risk factors or increasing preventive measures such as screenings. With records from across the United States over 15 years, it can inform decisions about patient care or policy development within a community.

    How to use the dataset

    This dataset provides comprehensive US county-level cancer mortality and incidence rates from 2000 to 2014. It includes the mortality and incidence rate for each county, as well as whether the county met the objective of 45.5 deaths per 100,000 people. It also provides information on recent trends in death rates and average annual counts of cases over the five year period studied.

    This dataset can be useful to researchers studying trends in cancer death rates across counties. With it, researchers can examine how different counties are performing in providing treatment and prevention services for cancer patients, and whether preventive measures and healthcare access are reducing cancer mortality rates over time. The data can also inform policy makers about counties that need more targeted prevention efforts or additional resources for better healthcare access within at-risk communities.

    When using this dataset, pay close attention to qualitative columns such as “Recent Trend” or “Recent 5-Year Trend (2)”, which may reveal long-term changes that are not apparent from quantitative variables such as age-adjusted death rate or average deaths per year measured over shorter periods of one or five years. Additionally, when comparing counties, take note of any differences in standard FIPS codes, which may indicate that data was collected by a different source with a different methodology from that used in other areas.

    Research Ideas

    • Using this dataset, we can identify patterns in cancer mortality and incidence rates that are statistically significant to create treatment regimens or preventive measures specifically targeting those areas.
    • This data can be useful for policymakers to target areas with elevated cancer mortality and incidence rates so they can allocate financial resources to these areas more efficiently.
    • This dataset can be used to investigate which factors (such as pollution levels, access to medical care, genetic makeup) may influence the cancer mortality and incidence rates in different US counties.

    Acknowledgements

    If you use this dataset in your research, please credit the original authors. Data Source

    License

    License: Dataset copyright by authors

    You are free to:
    • Share: copy and redistribute the material in any medium or format for any purpose, even commercially.
    • Adapt: remix, transform, and build upon the material for any purpose, even commercially.

    You must:
    • Give appropriate credit: provide a link to the license, and indicate if changes were made.
    • ShareAlike: you must distribute your contributions under the same license as the original.
    • Keep intact: all notices that refer to this license, including copyright notices.

    Columns

    File: death .csv | Column name | Description | |:-------------------------------------------|:-------------------------------------------------------------------...

  3. Number and rates of new cases of primary cancer, by cancer type, age group...

    • www150.statcan.gc.ca
    • datasets.ai
    • +2more
    Updated May 19, 2021
    + more versions
    Cite
    Government of Canada, Statistics Canada (2021). Number and rates of new cases of primary cancer, by cancer type, age group and sex [Dataset]. http://doi.org/10.25318/1310011101-eng
    Explore at:
    Dataset updated
    May 19, 2021
    Dataset provided by
    Statistics Canada (https://statcan.gc.ca/en)
    Area covered
    Canada
    Description

    Number and rate of new cancer cases diagnosed annually from 1992 to the most recent diagnosis year available. Included are all invasive cancers and in situ bladder cancer with cases defined using the Surveillance, Epidemiology and End Results (SEER) Groups for Primary Site based on the World Health Organization International Classification of Diseases for Oncology, Third Edition (ICD-O-3). Random rounding of case counts to the nearest multiple of 5 is used to prevent inappropriate disclosure of health-related information.
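    The random rounding mentioned above can be illustrated with a minimal sketch. It assumes an unbiased scheme in which a count is moved to one of its two adjacent multiples of 5, with the probability of rounding up proportional to the remainder; the description does not state the exact procedure, so the function name `random_round` and its parameters are hypothetical.

```python
import random
from typing import Optional


def random_round(count: int, base: int = 5,
                 rng: Optional[random.Random] = None) -> int:
    """Randomly round a non-negative count to an adjacent multiple of `base`.

    Assumption: unbiased random rounding, so the expected value of the
    rounded count equals the original count.
    """
    rng = rng or random.Random()
    remainder = count % base
    if remainder == 0:
        # Counts that are already multiples of the base are left unchanged.
        return count
    lower = count - remainder
    # Round up with probability remainder/base, otherwise round down.
    if rng.random() < remainder / base:
        return lower + base
    return lower
```

    Under this scheme a published count of 10 could stand for any true count from 6 to 14, which is what limits disclosure for small cells.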

  4. Lung Cancer Mortality Datasets v2

    • kaggle.com
    zip
    Updated Jun 1, 2024
    Cite
    MasterDataSan (2024). Lung Cancer Mortality Datasets v2 [Dataset]. https://www.kaggle.com/datasets/masterdatasan/lung-cancer-mortality-datasets-v2
    Explore at:
    Available download formats: zip (81127029 bytes)
    Dataset updated
    Jun 1, 2024
    Authors
    MasterDataSan
    Description

    This dataset contains data about lung cancer mortality. It is a collection of patient information focused on individuals diagnosed with cancer, designed to facilitate the analysis of factors that may influence cancer prognosis and treatment outcomes. The database includes a range of demographic, medical, and treatment-related variables, capturing essential details about each patient's condition and history.

    Key components of the database include:

    Demographic Information: Basic details about the patients such as age, gender, and country of residence. This helps in understanding the distribution of cancer cases across different populations and regions.

    Medical History: Information about each patient’s medical background, including family history of cancer, smoking status, Body Mass Index (BMI), cholesterol levels, and the presence of other health conditions such as hypertension, asthma, cirrhosis, and other cancers. This section is crucial for identifying potential risk factors and comorbidities.

    Cancer Diagnosis: Detailed data about the cancer diagnosis itself, including the date of diagnosis and the stage of cancer at the time of diagnosis. This helps in tracking the progression and severity of the disease.

    Treatment Details: Information regarding the type of treatment each patient received, the end date of the treatment, and the outcome (whether the patient survived or not). This is essential for evaluating the effectiveness of different treatment approaches.

    The structure of the database allows for in-depth analysis and research, making it possible to identify patterns, correlations, and potential causal relationships between various factors and cancer outcomes. It is a valuable resource for medical researchers, epidemiologists, and healthcare providers aiming to improve cancer treatment and patient care.

    • id: A unique identifier for each patient in the dataset.
    • age: The age of the patient at the time of diagnosis.
    • gender: The gender of the patient (e.g., male, female).
    • country: The country or region where the patient resides.
    • diagnosis_date: The date on which the patient was diagnosed with lung cancer.
    • cancer_stage: The stage of lung cancer at the time of diagnosis (e.g., Stage I, Stage II, Stage III, Stage IV).
    • family_history: Indicates whether there is a family history of cancer (e.g., yes, no).
    • smoking_status: The smoking status of the patient (e.g., current smoker, former smoker, never smoked, passive smoker).
    • bmi: The Body Mass Index of the patient at the time of diagnosis.
    • cholesterol_level: The cholesterol level of the patient (value).
    • hypertension: Indicates whether the patient has hypertension (high blood pressure) (e.g., yes, no).
    • asthma: Indicates whether the patient has asthma (e.g., yes, no).
    • cirrhosis: Indicates whether the patient has cirrhosis of the liver (e.g., yes, no).
    • other_cancer: Indicates whether the patient has had any other type of cancer in addition to the primary diagnosis (e.g., yes, no).
    • treatment_type: The type of treatment the patient received (e.g., surgery, chemotherapy, radiation, combined).
    • end_treatment_date: The date on which the patient completed their cancer treatment or died.
    • survived: Indicates whether the patient survived (e.g., yes, no).
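    As a rough illustration of how the columns listed above might be used, the sketch below computes the share of survivors per cancer stage and the number of days between diagnosis and the end of treatment. The sample rows are hypothetical, and the ISO date format and lowercase yes/no values are assumptions; the actual file may encode these fields differently.

```python
import csv
import io
from collections import defaultdict
from datetime import date

# Hypothetical sample rows (not taken from the dataset). Only a few of the
# listed columns are shown; ISO dates and yes/no flags are assumed.
SAMPLE_CSV = """id,cancer_stage,diagnosis_date,end_treatment_date,survived
1,Stage I,2020-01-01,2020-01-31,yes
2,Stage III,2021-03-01,2021-09-01,no
3,Stage III,2022-05-10,2023-01-15,yes
"""


def load_rows(text):
    """Parse CSV text into a list of dictionaries keyed by column name."""
    return list(csv.DictReader(io.StringIO(text)))


def survival_share_by_stage(rows):
    """Return the fraction of patients marked as survived, per cancer stage."""
    totals = defaultdict(int)
    survivors = defaultdict(int)
    for row in rows:
        stage = row["cancer_stage"]
        totals[stage] += 1
        if row["survived"].strip().lower() == "yes":
            survivors[stage] += 1
    return {stage: survivors[stage] / totals[stage] for stage in totals}


def days_to_end_of_treatment(row):
    """Days between diagnosis_date and end_treatment_date for one patient."""
    start = date.fromisoformat(row["diagnosis_date"])
    end = date.fromisoformat(row["end_treatment_date"])
    return (end - start).days
```

    Because the description states that the data is artificially generated, any figures derived this way describe the generator rather than real patients.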

    This dataset contains artificially generated data with as close a representation of reality as possible. This data is free to use without any licence required.

    Good luck Gakusei!

  5. CDC WONDER: Cancer Statistics

    • catalog.data.gov
    • healthdata.gov
    • +4more
    Updated Jul 29, 2025
    + more versions
    Cite
    Centers for Disease Control and Prevention, Department of Health & Human Services (2025). CDC WONDER: Cancer Statistics [Dataset]. https://catalog.data.gov/dataset/cdc-wonder-cancer-statistics
    Explore at:
    Dataset updated
    Jul 29, 2025
    Description

    The United States Cancer Statistics (USCS) online databases in WONDER provide cancer incidence and mortality data for the United States for the years since 1999, by year, state and metropolitan areas (MSA), age group, race, ethnicity, sex, childhood cancer classifications and cancer site. Report case counts, deaths, crude and age-adjusted incidence and death rates, and 95% confidence intervals for rates. The USCS data are the official federal statistics on cancer incidence from registries having high-quality data and cancer mortality statistics for 50 states and the District of Columbia. USCS are produced by the Centers for Disease Control and Prevention (CDC) and the National Cancer Institute (NCI), in collaboration with the North American Association of Central Cancer Registries (NAACCR). Mortality data are provided by the Centers for Disease Control and Prevention (CDC), National Center for Health Statistics (NCHS), National Vital Statistics System (NVSS).
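    To make the reported quantities concrete, here is a minimal sketch of a crude rate per 100,000 with an approximate 95% confidence interval. It uses a simple normal approximation for a Poisson count, which is an assumption chosen for brevity; the description does not say which interval method WONDER applies, and the function names are hypothetical.

```python
import math


def crude_rate(cases: int, population: int, per: int = 100_000) -> float:
    """Crude rate: cases divided by population, scaled to `per` people."""
    return cases / population * per


def approx_ci95(cases: int, population: int, per: int = 100_000):
    """Approximate 95% interval for the crude rate.

    Assumption: normal approximation to a Poisson count, so the standard
    error of the rate is rate / sqrt(cases). Unsuitable for small counts.
    """
    if cases <= 0:
        raise ValueError("cases must be positive for this approximation")
    rate = crude_rate(cases, population, per)
    se = rate / math.sqrt(cases)
    return rate - 1.96 * se, rate + 1.96 * se
```

    For example, 50 cases in a population of 100,000 gives a crude rate of about 50 per 100,000 with an interval of roughly 36 to 64.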

  6. Deaths from All Cancers - Dataset - data.gov.uk

    • ckan.publishing.service.gov.uk
    Updated Jul 28, 2017
    + more versions
    Cite
    ckan.publishing.service.gov.uk (2017). Deaths from All Cancers - Dataset - data.gov.uk [Dataset]. https://ckan.publishing.service.gov.uk/dataset/deaths-from-all-cancers
    Explore at:
    Dataset updated
    Jul 28, 2017
    Dataset provided by
    CKAN (https://ckan.org/)
    License

    Open Government Licence 3.0 (http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/)
    License information was derived automatically

    Description

    This data shows premature deaths (Age under 75) from all Cancers, numbers and rates by gender, as 3-year moving-averages. Cancers are a major cause of premature deaths. Inequalities exist in cancer rates between the most deprived areas and the most affluent areas. Directly Age-Standardised Rates (DASR) are shown in the data (where numbers are sufficient) so that death rates can be directly compared between areas. The DASR calculation applies Age-specific rates to a Standard (European) population to cancel out possible effects on crude rates due to different age structures among populations, thus enabling direct comparisons of rates. A limitation on using mortalities as a proxy for prevalence of health conditions is that mortalities may give an incomplete view of health conditions in an area, as ill-health might not lead to premature death. Data source: Office for Health Improvement and Disparities (OHID), indicator ID 40501, E05a. This data is updated annually.
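    The DASR calculation described above can be sketched as follows. Age-specific rates (deaths divided by population in each age band) are weighted by a standard population and summed. The function name and the two-band example figures are hypothetical; a real calculation would use the full set of age bands and the published European standard population weights.

```python
def directly_standardised_rate(deaths, population, standard_population,
                               per=100_000):
    """Directly age-standardised rate per `per` people.

    Each argument is a sequence with one entry per age band, in the same
    order. Age-specific rates are applied to the standard population and
    the expected deaths are divided by the total standard population.
    """
    if not (len(deaths) == len(population) == len(standard_population)):
        raise ValueError("all inputs must have one entry per age band")
    expected = sum(
        d / p * s
        for d, p, s in zip(deaths, population, standard_population)
    )
    return expected / sum(standard_population) * per


# Hypothetical two-band example (illustrative numbers only).
example = directly_standardised_rate(
    deaths=[10, 40],
    population=[10_000, 20_000],
    standard_population=[60_000, 40_000],
)
```

    In the example the age-specific rates are 100 and 200 per 100,000, and the standard population weights them 60:40, giving a standardised rate of about 140 per 100,000 regardless of the area's own age structure.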

  7. Mortality rate from oral cancer, all ages - WMCA

    • cityobservatory.birmingham.gov.uk
    csv, excel, geojson +1
    Updated Nov 3, 2025
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    (2025). Mortality rate from oral cancer, all ages - WMCA [Dataset]. https://cityobservatory.birmingham.gov.uk/explore/dataset/mortality-rate-from-oral-cancer-all-ages-wmca/
    Explore at:
    Available download formats: csv, geojson, json, excel
    Dataset updated
    Nov 3, 2025
    License

    Open Government Licence 3.0 (http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/)
    License information was derived automatically

    Description

    Age-standardised rate of mortality from oral cancer (ICD-10 codes C00-C14) in persons of all ages and sexes per 100,000 population.

    Rationale
    Over the last decade in the UK (between 2003-2005 and 2012-2014), oral cancer mortality rates have increased by 20% for males and 19% for females (1). Five year survival rates are 56%. Most oral cancers are triggered by tobacco and alcohol, which together account for 75% of cases (2). Cigarette smoking is associated with an increased risk of the more common forms of oral cancer. The risk among cigarette smokers is estimated to be 10 times that for non-smokers. More intense use of tobacco increases the risk, while ceasing to smoke for 10 years or more reduces it to almost the same as that of non-smokers (3). Oral cancer mortality rates can be used in conjunction with registration data to inform service planning as well as comparing survival rates across areas of England to assess the impact of public health prevention policies such as smoking cessation.

    References:
    (1) Cancer Research Campaign. Cancer Statistics: Oral – UK. London: CRC, 2000.
    (2) Blot WJ, McLaughlin JK, Winn DM et al. Smoking and drinking in relation to oral and pharyngeal cancer. Cancer Res 1988; 48: 3282-7.
    (3) La Vecchia C, Tavani A, Franceschi S et al. Epidemiology and prevention of oral cancer. Oral Oncology 1997; 33: 302-12.

    Definition of numerator
    All cancer mortality for lip, oral cavity and pharynx (ICD-10 C00-C14) in the respective calendar years aggregated into quinary age bands (0-4, 5-9, …, 85-89, 90+). This does not include secondary cancers or recurrences. Data are reported according to the calendar year in which the cancer was diagnosed.

    Counts of deaths for years up to and including 2019 have been adjusted where needed to take account of the MUSE ICD-10 coding change introduced in 2020. Detailed guidance on the MUSE implementation is available at: https://www.ons.gov.uk/peoplepopulationandcommunity/birthsdeathsandmarriages/deaths/articles/causeofdeathcodinginmortalitystatisticssoftwarechanges/january2020

    Counts of deaths for years up to and including 2013 have been double adjusted by applying comparability ratios from both the IRIS coding change and the MUSE coding change where needed to take account of both the MUSE ICD-10 coding change and the IRIS ICD-10 coding change introduced in 2014. The detailed guidance on the IRIS implementation is available at: https://www.ons.gov.uk/peoplepopulationandcommunity/birthsdeathsandmarriages/deaths/bulletins/impactoftheimplementationofirissoftwareforicd10causeofdeathcodingonmortalitystatisticsenglandandwales/2014-08-08

    Counts of deaths for years up to and including 2010 have been triple adjusted by applying comparability ratios from the 2011 coding change, the IRIS coding change and the MUSE coding change where needed to take account of the MUSE ICD-10 coding change, the IRIS ICD-10 coding change and the ICD-10 coding change introduced in 2011. The detailed guidance on the 2011 implementation is available at https://webarchive.nationalarchives.gov.uk/ukgwa/20160108084125/http://www.ons.gov.uk/ons/guide-method/classifications/international-standard-classifications/icd-10-for-mortality/comparability-ratios/index.html

    Definition of denominator
    Population-years (aggregated populations for the three years) for people of all ages, aggregated into quinary age bands (0-4, 5-9, …, 85-89, 90+)
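    The single, double and triple adjustments described above can be sketched as a product of comparability ratios selected by year. The ratio values below are hypothetical placeholders, not published figures, and the sketch assumes every count needs each applicable adjustment, whereas the description says ratios are applied only "where needed".

```python
# Hypothetical comparability ratios; real values depend on the cause of
# death and are published in the guidance linked in the description.
EXAMPLE_RATIOS = {
    "muse_2020": 1.02,   # MUSE coding change introduced in 2020
    "iris_2014": 0.99,   # IRIS coding change introduced in 2014
    "icd10_2011": 1.01,  # ICD-10 coding change introduced in 2011
}


def adjusted_deaths(count, year, ratios):
    """Scale a death count by every coding-change ratio that applies.

    Years up to 2019 get the MUSE ratio, years up to 2013 also get the
    IRIS ratio, and years up to 2010 also get the 2011 ratio.
    """
    factor = 1.0
    if year <= 2019:
        factor *= ratios["muse_2020"]
    if year <= 2013:
        factor *= ratios["iris_2014"]
    if year <= 2010:
        factor *= ratios["icd10_2011"]
    return count * factor
```

    Applying the ratios multiplicatively keeps counts from different coding eras on a comparable footing before rates are calculated.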

  8. Deaths from All Cancers - Dataset - Datopian CKAN instance

    • demo.dev.datopian.com
    Updated Oct 7, 2025
    Cite
    (2025). Deaths from All Cancers - Dataset - Datopian CKAN instance [Dataset]. https://demo.dev.datopian.com/dataset/lcc--deaths-from-all-cancers
    Explore at:
    Dataset updated
    Oct 7, 2025
    License

    Open Government Licence 3.0 (http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/)
    License information was derived automatically

    Description

    This data shows premature deaths (Age under 75) from all Cancers, numbers and rates by gender, as 3-year moving-averages. Cancers are a major cause of premature deaths. Inequalities exist in cancer rates between the most deprived areas and the most affluent areas. Directly Age-Standardised Rates (DASR) are shown in the data (where numbers are sufficient) so that death rates can be directly compared between areas. The DASR calculation applies Age-specific rates to a Standard (European) population to cancel out possible effects on crude rates due to different age structures among populations, thus enabling direct comparisons of rates. A limitation on using mortalities as a proxy for prevalence of health conditions is that mortalities may give an incomplete view of health conditions in an area, as ill-health might not lead to premature death. Data source: Office for Health Improvement and Disparities (OHID), indicator ID 40501, E05a. This data is updated annually.

  9. Cancer County-Level

    • kaggle.com
    zip
    Updated Dec 3, 2022
    Cite
    The Devastator (2022). Cancer County-Level [Dataset]. https://www.kaggle.com/datasets/thedevastator/exploring-county-level-correlations-in-cancer-ra
    Explore at:
    Available download formats: zip (146998 bytes)
    Dataset updated
    Dec 3, 2022
    Authors
    The Devastator
    Description

    Exploring County-Level Correlations in Cancer Rates and Trends

    A Multivariate Ordinary Least Squares Regression Model

    By Noah Rippner [source]

    About this dataset

    This dataset offers an opportunity to examine patterns and trends in cancer rates in the United States at the individual county level. Using data from cancer.gov and the US Census American Community Survey, it shows how age-adjusted death rate, average deaths per year, and recent trends vary between counties, along with other metrics such as average annual counts, whether the objective of 45.5 was met, and recent trends (2) in death rates. Linear regression models can be built on the data to determine correlations between variables, which can help in understanding cancer prevalence across counties over time and in targeting health initiatives and resources.

    How to use the dataset

    This Kaggle dataset provides county-level data from the US Census American Community Survey and cancer.gov for exploring correlations between county-level cancer rates, trends, and mortality statistics. It contains records from all U.S. counties covering the age-adjusted death rate, average deaths per year, recent trend (2) in death rates, average annual count of cases detected within 5 years, and whether or not an objective of 45.5 (1) was met in the county associated with each row in the table.

    To use this dataset to its fullest potential, you need to understand how to perform simple descriptive analytics: calculating summary statistics such as the mean or median; summarizing categorical variables using frequency tables; creating data visualizations such as charts and histograms; applying linear regression or other machine learning techniques such as support vector machines (SVMs), random forests, or neural networks; distinguishing supervised from unsupervised learning techniques; reviewing diagnostic tests to evaluate your models; interpreting your findings; hypothesizing reasons for the patterns discovered during exploration; and communicating results through effective presentation slides or documents. This understanding will enable you to apply different methods of analysis to this dataset accurately and effectively.

    Once these concepts are understood, you can start exploring the dataset by importing it into your tool of choice, for example Tableau Public or Desktop, QlikView, the SAS analytical suite, or Python notebooks (loading packages such as scikit-learn if you plan to build predictive models). A brief description of the table's column structure is provided above. With a working knowledge of basic SQL, you can compute statistics with simple queries, select subsets of columns under specific conditions, and sort by the attributes you need. In Python, you can parse the portion of the data you want, group and aggregate categories before building any predictive model, and join additional tables where necessary; the details vary by tool. From there, you can analyze the available features in depth, build correlation and covariance matrices, and use scatter plots to reveal trends and relationships between the metrics of interest.
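    As a small illustration of the correlation step mentioned above, the sketch below computes a Pearson correlation coefficient between two numeric columns using only the standard library. The function name is hypothetical and the example values are invented; in practice the inputs would be columns such as the age-adjusted death rate and the average annual count read from the dataset's file.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant column")
    return cov / math.sqrt(var_x * var_y)


# Invented values standing in for two numeric columns.
death_rates = [40.1, 45.5, 52.3, 61.0]
annual_counts = [12, 15, 22, 30]
r = pearson(death_rates, annual_counts)
```

    A full correlation matrix is this calculation repeated for every pair of numeric columns; a coefficient near 1 or -1 indicates a strong linear relationship, which a scatter plot can then confirm.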

    Research Ideas

    • Building a predictive cancer incidence model based on county-level demographic data to identify high-risk areas and target public health interventions.
    • Analyzing correlations between age-adjusted death rate, average annual count, and recent trends in order to develop more effective policy initiatives for cancer prevention and healthcare access.
    • Using the dataset to build a machine learning model that predicts county-level mortality rates from socio-economic factors such as poverty levels and educational attainment rates.

    Acknowledgements

    If you use this dataset i...

  10. Five-year survival from all cancers (NHSOF 1.4.ii) - Dataset - data.gov.uk

    • ckan.publishing.service.gov.uk
    Updated Aug 4, 2015
    + more versions
    Cite
    ckan.publishing.service.gov.uk (2015). Five-year survival from all cancers (NHSOF 1.4.ii) - Dataset - data.gov.uk [Dataset]. https://ckan.publishing.service.gov.uk/dataset/five-year-survival-from-all-cancers-nhsof-1-4-ii
    Explore at:
    Dataset updated
    Aug 4, 2015
    Dataset provided by
    CKAN (https://ckan.org/)
    License

    Open Government Licence 3.0 (http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/)
    License information was derived automatically

    Description

    A measure of the number of adults diagnosed with any type of cancer in a year who are still alive five years after diagnosis. Purpose This indicator attempts to capture the success of the NHS in preventing people from dying once they have been diagnosed with any type of cancer. Current version updated: Feb-17 Next version due: Feb-18

  11. Under 75 mortality rate from cancer - ICP Outcomes Framework - Resident...

    • cityobservatory.birmingham.gov.uk
    csv, excel, geojson +1
    Updated Sep 9, 2025
    + more versions
    Cite
    (2025). Under 75 mortality rate from cancer - ICP Outcomes Framework - Resident Locality [Dataset]. https://cityobservatory.birmingham.gov.uk/explore/dataset/under-75-mortality-rate-from-cancer-icp-outcomes-framework-resident-locality/
    Explore at:
    Available download formats: geojson, csv, excel, json
    Dataset updated
    Sep 9, 2025
    License

    Open Government Licence 3.0 (http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/)
    License information was derived automatically

    Description

    This dataset presents the mortality rate from cancer among individuals under the age of 75 within the Birmingham and Solihull area. It captures the number of deaths attributed to all cancers (classified under ICD-10 codes C00 to C97) and expresses this as a directly age-standardised rate per 100,000 population. The data is structured in quinary age bands and is available for both single-year and three-year rolling averages, providing a comprehensive view of premature cancer mortality trends in the region.

    Rationale Reducing premature mortality from cancer is a key public health priority. This indicator helps track progress in lowering the number of cancer-related deaths among people under 75, supporting efforts to improve early diagnosis, treatment, and prevention strategies.

    Numerator The numerator is the number of deaths from all cancers (ICD-10 codes C00 to C97) registered in the respective calendar years, for individuals aged under 75. These figures are aggregated into quinary age bands and sourced from the Death Register.

    Denominator The denominator is the population of individuals under 75 years of age, also aggregated into quinary age bands. For single-year rates, the population for that year is used. For three-year rolling averages, the population-years are aggregated across the three years. The source of this data is the 2021 Census.

    Caveats Data may not align exactly with published Office for National Statistics (ONS) figures due to differences in postcode lookup versions and the application of comparability ratios in Office for Health Improvement and Disparities (OHID) data. Users should be cautious when comparing this dataset with other national statistics.

    External references Further information and related indicators can be found on the OHID Fingertips platform.

    Localities Explained
    This dataset contains data based on either the resident locality or registered locality of the patient. A distinction is made between resident locality and registered locality populations:
    • Resident Locality refers to individuals who live within the defined geographic boundaries of the locality. These boundaries are aligned with official administrative areas such as wards and Lower Layer Super Output Areas (LSOAs).
    • Registered Locality refers to individuals who are registered with GP practices that are assigned to a locality based on the Primary Care Network (PCN) they belong to. These assignments are approximate: PCNs are mapped to a locality based on the location of most of their GP surgeries. As a result, locality-registered patients may live outside the locality, sometimes even in different towns or cities.
    This distinction is important because some health indicators are only available at GP practice level, without information on where patients actually reside. In such cases, data is attributed to the locality based on GP registration, not residential address.

    Click here to explore more from the Birmingham and Solihull Integrated Care Partnerships Outcome Framework.

  12. One-year survival from all cancers (NHSOF 1.4.i) - Dataset - data.gov.uk

    • ckan.publishing.service.gov.uk
    Updated Aug 4, 2015
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    ckan.publishing.service.gov.uk (2015). One-year survival from all cancers (NHSOF 1.4.i) - Dataset - data.gov.uk [Dataset]. https://ckan.publishing.service.gov.uk/dataset/one-year-survival-from-all-cancers-nhsof-1-4-i
    Explore at:
    Dataset updated
    Aug 4, 2015
    Dataset provided by
    CKAN https://ckan.org/
    License

    Open Government Licence 3.0 http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/
    License information was derived automatically

    Description

    A measure of the number of adults diagnosed with any type of cancer in a year who are still alive one year after diagnosis.

    Purpose: This indicator attempts to capture the success of the NHS in preventing people from dying once they have been diagnosed with any type of cancer.

    Current version updated: Feb-17. Next version due: Feb-18.

  13. Cancer (in persons of all ages): England

    • hub.arcgis.com
    • data.catchmentbasedapproach.org
    Updated Apr 6, 2021
    Cite
    The Rivers Trust (2021). Cancer (in persons of all ages): England [Dataset]. https://hub.arcgis.com/datasets/c5c07229db684a65822fdc9a29388b0b
    Explore at:
    Dataset updated
    Apr 6, 2021
    Dataset authored and provided by
    The Rivers Trust
    Area covered
    Description

    SUMMARY

    This analysis, designed and executed by Ribble Rivers Trust, identifies areas across England with the greatest levels of cancer (in persons of all ages). Please read the below information to gain a full understanding of what the data shows and how it should be interpreted.

    ANALYSIS METHODOLOGY

    The analysis was carried out using Quality and Outcomes Framework (QOF) data, derived from NHS Digital, relating to cancer (in persons of all ages).

    This information was recorded at the GP practice level. However, GP catchment areas are not mutually exclusive: they overlap, with some areas covered by 30+ GP practices. Therefore, to increase the clarity and usability of the data, the GP-level statistics were converted into statistics based on Middle Layer Super Output Area (MSOA) census boundaries.

    The percentage of each MSOA’s population (all ages) with cancer was estimated. This was achieved by calculating a weighted average based on:

    • The percentage of the MSOA area that was covered by each GP practice’s catchment area

    • Of the GPs that covered part of that MSOA: the percentage of registered patients that have that illness

    The estimated percentage of each MSOA’s population with cancer was then combined with Office for National Statistics Mid-Year Population Estimates (2019) data for MSOAs, to estimate the number of people in each MSOA with cancer, within the relevant age range.

    Each MSOA was assigned a relative score between 1 and 0 (1 = worst, 0 = best) based on:

    A) the PERCENTAGE of the population within that MSOA who are estimated to have cancer

    B) the NUMBER of people within that MSOA who are estimated to have cancer

    An average of scores A & B was taken, and converted to a relative score between 1 and 0 (1 = worst, 0 = best). The closer to 1 the score, the greater both the number and percentage of the population in the MSOA that are estimated to have cancer, compared to other MSOAs. In other words, those are areas where it’s estimated a large number of people suffer from cancer, and where those people make up a large percentage of the population, indicating there is a real issue with cancer within the population and the investment of resources to address that issue could have the greatest benefits.

    LIMITATIONS

    1. GP data for the financial year 1st April 2018 – 31st March 2019 was used in preference to data for the financial year 1st April 2019 – 31st March 2020, as the onset of the COVID-19 pandemic during the latter year could have affected the reporting of medical statistics by GPs. However, for 53 GPs (out of 7670) that did not submit data in 2018/19, data from 2019/20 was used instead. Note also that some GPs (997 out of 7670) did not submit data in either year. This dataset should be viewed in conjunction with the ‘Health and wellbeing statistics (GP-level, England): Missing data and potential outliers’ dataset, to determine areas where data from 2019/20 was used, where one or more GPs did not submit data in either year, or where there were large discrepancies between the 2018/19 and 2019/20 data (differences in statistics that were > mean +/- 1 St.Dev.), which suggests erroneous data in one of those years (it was not feasible for this study to investigate this further), and thus where data should be interpreted with caution. Note also that there are some rural areas (with little or no population) that do not officially fall into any GP catchment area (although this will not affect the results of this analysis if there are no people living in those areas).

    2. Although all of the obesity/inactivity-related illnesses listed can be caused or exacerbated by inactivity and obesity, it was not possible to distinguish from the data the cause of the illnesses in patients: obesity and inactivity are highly unlikely to be the cause of all cases of each illness. By combining the data with data relating to levels of obesity and inactivity in adults and children (see the ‘Levels of obesity, inactivity and associated illnesses: Summary (England)’ dataset), we can identify where obesity/inactivity could be a contributing factor, and where interventions to reduce obesity and increase activity could be most beneficial for the health of the local population.

    3. It was not feasible to incorporate ultra-fine-scale geographic distribution of populations that are registered with each GP practice or who live within each MSOA. Populations might be concentrated in certain areas of a GP practice’s catchment area or MSOA and relatively sparse in other areas. Therefore, the dataset should be used to identify general areas where there are high levels of cancer, rather than interpreting the boundaries between areas as ‘hard’ boundaries that mark definite divisions between areas with differing levels of cancer.

    TO BE VIEWED IN COMBINATION WITH:

    This dataset should be viewed alongside the following datasets, which highlight areas of missing data and potential outliers in the data:

    • Health and wellbeing statistics (GP-level, England): Missing data and potential outliers

    • Levels of obesity, inactivity and associated illnesses (England): Missing data

    DOWNLOADING THIS DATA

    To access this data on your desktop GIS, download the ‘Levels of obesity, inactivity and associated illnesses: Summary (England)’ dataset.

    DATA SOURCES

    This dataset was produced using:

    • Quality and Outcomes Framework data: Copyright © 2020, Health and Social Care Information Centre. The Health and Social Care Information Centre is a non-departmental body created by statute, also known as NHS Digital.

    • GP Catchment Outlines. Copyright © 2020, Health and Social Care Information Centre. The Health and Social Care Information Centre is a non-departmental body created by statute, also known as NHS Digital. Data was cleaned by Ribble Rivers Trust before use.

    • MSOA boundaries: © Office for National Statistics licensed under the Open Government Licence v3.0. Contains OS data © Crown copyright and database right 2021.

    • Population data: Mid-2019 (June 30) Population Estimates for Middle Layer Super Output Areas in England and Wales. © Office for National Statistics licensed under the Open Government Licence v3.0. © Crown Copyright 2020.

    COPYRIGHT NOTICE

    The reproduction of this data must be accompanied by the following statement:

    © Ribble Rivers Trust 2021. Analysis carried out using data that is: Copyright © 2020, Health and Social Care Information Centre. The Health and Social Care Information Centre is a non-departmental body created by statute, also known as NHS Digital; © Office for National Statistics licensed under the Open Government Licence v3.0. Contains OS data © Crown copyright and database right 2021. © Crown Copyright 2020.

    CaBA HEALTH & WELLBEING EVIDENCE BASE

    This dataset forms part of the wider CaBA Health and Wellbeing Evidence Base.
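
    The weighted-average and scoring steps in the methodology can be sketched in a few lines of Python. This is an illustration only, not the Ribble Rivers Trust implementation: the function names are hypothetical, and two details the description leaves open are assumed here, namely that area-coverage weights are normalised to sum to 1 and that a "relative score" means min-max rescaling.

```python
def estimate_msoa_prevalence(coverage):
    """Area-weighted average prevalence for one MSOA.

    coverage: list of (pct_of_msoa_area_covered, pct_of_patients_with_cancer),
    one pair per GP practice whose catchment overlaps the MSOA.
    """
    total_weight = sum(area for area, _ in coverage)
    if total_weight == 0:
        return None  # no GP catchment overlaps this MSOA
    # Assumption: weights are normalised by the total coverage.
    return sum(area * pct for area, pct in coverage) / total_weight


def relative_scores(values):
    """Assumed min-max rescaling: highest value -> 1 (worst), lowest -> 0 (best)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def combined_scores(percentages, populations):
    """Average of score A (percentage) and score B (number), rescaled to 0..1."""
    counts = [p / 100 * n for p, n in zip(percentages, populations)]
    score_a = relative_scores(percentages)
    score_b = relative_scores(counts)
    averaged = [(a + b) / 2 for a, b in zip(score_a, score_b)]
    return relative_scores(averaged)


# Hypothetical MSOA covered 60% by one practice and 40% by another.
example_prevalence = estimate_msoa_prevalence([(60, 3.0), (40, 4.0)])
# Three hypothetical MSOAs: estimated percentages and mid-year populations.
example_scores = combined_scores([2.0, 3.0, 4.0], [1000, 2000, 1000])
```

    In this toy input the second and third MSOAs tie for the worst combined score: one has the highest percentage, the other the highest estimated number of people.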

  14. Cancer death rates by county, 2019-2023 - Dataset - Healthy Communities Data...

    • midb.uspatial.umn.edu
    Updated Oct 24, 2025
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    (2025). Cancer death rates by county, 2019-2023 - Dataset - Healthy Communities Data Portal [Dataset]. https://midb.uspatial.umn.edu/hcdp/dataset/cancer-death-rates-by-county-2019-2023
    Explore at:
    Dataset updated
    Oct 24, 2025
    Description

    Cancer death rates by county, all races (includes Hispanic/Latino), all sexes, all ages, 2019-2023. Death data were provided by the National Vital Statistics System. Death rates (deaths per 100,000 population per year) are age-adjusted to the 2000 US standard population (20 age groups: <1, 1-4, 5-9, ... , 80-84, 85-89, 90+). Rates were calculated using SEER*Stat. Population counts for denominators are based on Census populations as modified by the National Cancer Institute. The US Population Data File is used for mortality data. The Average Annual Percent Change is based on the APCs calculated by the Joinpoint Regression Program (Version 4.9.0.0). Due to data availability issues, the time period used in the calculation of the joinpoint regression model may differ for selected counties. Counties with a (3) after their name may have their joinpoint regression model calculated using a different time period due to data availability issues.
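
    Age adjustment of the kind described above is usually done by direct standardisation: each age group's crude rate is weighted by that group's share of a standard population. The sketch below shows the arithmetic under stated assumptions. The function name is hypothetical, and the three age groups and their weights are made up for illustration; the dataset itself uses 20 age groups, the 2000 US standard population and SEER*Stat.

```python
def age_adjusted_rate(deaths, population, standard_weights, per=100_000):
    """Directly standardised rate per `per` population.

    deaths, population, standard_weights: one entry per age group.
    standard_weights need not sum to 1; they are normalised here.
    """
    if not (len(deaths) == len(population) == len(standard_weights)):
        raise ValueError("all inputs must have one entry per age group")
    total_weight = sum(standard_weights)
    rate = 0.0
    for d, n, w in zip(deaths, population, standard_weights):
        # Age-specific crude rate times the standard population share.
        rate += (d / n) * (w / total_weight)
    return rate * per


# Illustrative numbers only (not real county data or real standard weights).
example_rate = age_adjusted_rate(
    deaths=[10, 50, 200],
    population=[100_000, 50_000, 20_000],
    standard_weights=[0.5, 0.3, 0.2],
)
```

    Because the same standard weights are applied to every county, differences in the adjusted rates reflect differences in age-specific mortality rather than in age structure.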

  15. Cervical Cancer Risk Classification - Dataset - CKAN

    • data.poltekkes-smg.ac.id
    Updated Oct 7, 2024
    + more versions
    Cite
    (2024). Cervical Cancer Risk Classification - Dataset - CKAN [Dataset]. https://data.poltekkes-smg.ac.id/dataset/cervical-cancer-risk-classification
    Explore at:
    Dataset updated
    Oct 7, 2024
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Cervical Cancer Risk Factors for Biopsy: This dataset was obtained from the UCI Repository, which is kindly acknowledged. This file contains a list of risk factors for cervical cancer leading to a biopsy examination. About 11,000 new cases of invasive cervical cancer are diagnosed each year in the U.S. However, the number of new cervical cancer cases has been declining steadily over the past decades. Although it is the most preventable type of cancer, each year cervical cancer kills about 4,000 women in the U.S. and about 300,000 women worldwide. In the United States, cervical cancer mortality rates plunged by 74% from 1955 - 1992 thanks to increased screening and early detection with the Pap test.

    AGE

    Fifty percent of cervical cancer diagnoses occur in women ages 35 - 54, and about 20% occur in women over 65 years of age. The median age of diagnosis is 48 years. About 15% of women develop cervical cancer between the ages of 20 - 30. Cervical cancer is extremely rare in women younger than age 20. However, many young women become infected with multiple types of human papilloma virus, which then can increase their risk of getting cervical cancer in the future. Young women with early abnormal changes who do not have regular examinations are at high risk for localized cancer by the time they are age 40, and for invasive cancer by age 50.

    SOCIOECONOMIC AND ETHNIC FACTORS

    Although the rate of cervical cancer has declined among both Caucasian and African-American women over the past decades, it remains much more prevalent in African-Americans, whose death rates are twice as high as those of Caucasian women. Hispanic American women have more than twice the risk of invasive cervical cancer as Caucasian women, also due to a lower rate of screening. These differences, however, are almost certainly due to social and economic differences. Numerous studies report that high poverty levels are linked with low screening rates. In addition, lack of health insurance, limited transportation, and language difficulties hinder a poor woman’s access to screening services.

    HIGH SEXUAL ACTIVITY

    Human papilloma virus (HPV) is the main risk factor for cervical cancer. In adults, the most important risk factor for HPV is sexual activity with an infected person. Women most at risk for cervical cancer are those with a history of multiple sexual partners, sexual intercourse at age 17 years or younger, or both. A woman who has never been sexually active has a very low risk for developing cervical cancer. Sexual activity with multiple partners increases the likelihood of many other sexually transmitted infections (chlamydia, gonorrhea, syphilis). Studies have found an association between chlamydia and cervical cancer risk, including the possibility that chlamydia may prolong HPV infection.

    FAMILY HISTORY

    Women have a higher risk of cervical cancer if they have a first-degree relative (mother, sister) who has had cervical cancer.

    USE OF ORAL CONTRACEPTIVES

    Studies have reported a strong association between cervical cancer and long-term use of oral contraception (OC). Women who take birth control pills for more than 5 - 10 years appear to have a much higher risk of HPV infection (up to four times higher) than those who do not use OCs. (Women taking OCs for fewer than 5 years do not have a significantly higher risk.) The reasons for this risk from OC use are not entirely clear. Women who use OCs may be less likely to use a diaphragm, condoms, or other methods that offer some protection against sexually transmitted diseases, including HPV. Some research also suggests that the hormones in OCs might help the virus enter the genetic material of cervical cells.

    HAVING MANY CHILDREN

    Studies indicate that having many children increases the risk for developing cervical cancer, particularly in women infected with HPV.

    SMOKING

    Smoking is associated with a higher risk for precancerous changes (dysplasia) in the cervix and for progression to invasive cervical cancer, especially for women infected with HPV.

    IMMUNOSUPPRESSION

    Women with weak immune systems (such as those with HIV / AIDS) are more susceptible to acquiring HPV. Immunocompromised patients are also at higher risk for having cervical precancer develop rapidly into invasive cancer.

    DIETHYLSTILBESTROL (DES)

    From 1938 - 1971, diethylstilbestrol (DES), an estrogen-related drug, was widely prescribed to pregnant women to help prevent miscarriages. The daughters of these women face a higher risk for cervical cancer. DES is no longer prescribed.

  16. Data_Sheet_1_Revising Incidence and Mortality of Lung Cancer in Central...

    • frontiersin.figshare.com
    • datasetcatalog.nlm.nih.gov
    docx
    Updated May 30, 2023
    Cite
    Krisztina Bogos; Zoltán Kiss; Gabriella Gálffy; Lilla Tamási; Gyula Ostoros; Veronika Müller; László Urbán; Nóra Bittner; Veronika Sárosi; Aladár Vastag; Zoltán Polányi; Zsófia Nagy-Erdei; Zoltán Vokó; Balázs Nagy; Krisztián Horváth; György Rokszin; Zsolt Abonyi-Tóth; Judit Moldvay (2023). Data_Sheet_1_Revising Incidence and Mortality of Lung Cancer in Central Europe: An Epidemiology Review From Hungary.docx [Dataset]. http://doi.org/10.3389/fonc.2019.01051.s001
    Explore at:
    Available download formats: docx
    Dataset updated
    May 30, 2023
    Dataset provided by
    Frontiers Media http://www.frontiersin.org/
    Authors
    Krisztina Bogos; Zoltán Kiss; Gabriella Gálffy; Lilla Tamási; Gyula Ostoros; Veronika Müller; László Urbán; Nóra Bittner; Veronika Sárosi; Aladár Vastag; Zoltán Polányi; Zsófia Nagy-Erdei; Zoltán Vokó; Balázs Nagy; Krisztián Horváth; György Rokszin; Zsolt Abonyi-Tóth; Judit Moldvay
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Central Europe, Hungary, Europe
    Description

    Objective: While Hungary is often reported to have the highest incidence and mortality rates of lung cancer, until 2018 no nationwide epidemiology study was conducted to confirm these trends. The objective of this study was to estimate the occurrence of lung cancer in Hungary based on a retrospective review of the National Health Insurance Fund (NHIF) database.

    Methods: Our retrospective, longitudinal study included patients aged ≥20 years who were diagnosed with lung cancer (ICD-10 C34) between 1 Jan 2011 and 31 Dec 2016. Age-standardized incidence and mortality rates were calculated using both the 1976 and 2013 European Standard Populations (ESP).

    Results: Between 2011 and 2016, 6,996 – 7,158 new lung cancer cases were recorded in the NHIF database annually, and 6,045 – 6,465 all-cause deaths occurred per year. Age-adjusted incidence rates were 115.7–101.6/100,000 person-years among men (ESP 1976: 84.7–72.6), showing a mean annual change of −2.26% (p = 0.008). Incidence rates among women increased from 48.3 to 50.3/100,000 person-years (ESP 1976: 36.9–38.0), corresponding to a mean annual change of 1.23% (p = 0.028). Age-standardized mortality rates varied between 103.8 and 97.2/100,000 person-years (ESP 1976: 72.8–69.7) in men and between 38.3 and 42.7/100,000 person-years (ESP 1976: 27.8–29.3) in women.

    Conclusion: Age-standardized incidence and mortality rates of lung cancer in Hungary were found to be high compared to Western-European countries, but lower than those reported by previous publications. The incidence of lung cancer decreased in men, while there was an increase in incidence and mortality among female lung cancer patients.

  17. Mortality Rates

    • catalog.data.gov
    • datasets.ai
    • +4more
    Updated Nov 22, 2024
    Cite
    Lake County Illinois GIS (2024). Mortality Rates [Dataset]. https://catalog.data.gov/dataset/mortality-rates-6fb72
    Explore at:
    Dataset updated
    Nov 22, 2024
    Dataset provided by
    Lake County Illinois GIS
    Description

    Mortality Rates for Lake County, Illinois. Explanation of field attributes:

    • Average Age of Death – The average age at which people in the given zip code die.

    • Cancer Deaths – Individuals who have died of cancer as the underlying cause. This is a rate per 100,000.

    • Heart Disease Related Deaths – Individuals who have died of heart disease as the underlying cause. This is a rate per 100,000.

    • COPD Related Deaths – Individuals who have died of chronic obstructive pulmonary disease (COPD) as the underlying cause. This is a rate per 100,000.

  18. Multimodal Head and Neck cancer dataset

    • cancerimagingarchive.net
    n/a, svs and png
    Updated Nov 18, 2025
    Cite
    The Cancer Imaging Archive (2025). Multimodal Head and Neck cancer dataset [Dataset]. http://doi.org/10.7937/rcty-5h16
    Explore at:
    Available download formats: svs and png, n/a
    Dataset updated
    Nov 18, 2025
    Dataset authored and provided by
    The Cancer Imaging Archive
    License

    https://www.cancerimagingarchive.net/data-usage-policies-and-restrictions/

    Time period covered
    Nov 18, 2025
    Dataset funded by
    National Cancer Institute http://www.cancer.gov/
    Description

    Abstract

    HANCOCK is a comprehensive, monocentric dataset of 763 head and neck cancer patients, including diverse data modalities. It contains histopathology imaging (whole-slide images of H&E-stained primary tumors and tissue microarrays with immunohistochemical staining) alongside structured clinical data (demographics, tumor pathology characteristics, laboratory blood measurements) and textual data (de-identified surgery reports and medical histories). All patients were treated curatively, and data span diagnoses from 2005–2019. This multimodal collection enables research into integrative analyses – for example, combining histologic features with clinical parameters for outcome prediction. Early analyses have demonstrated that fusing these modalities improves prognostic modeling compared to single-source data, and that leveraging histology with foundation models can enhance endpoint prediction. HANCOCK aims to facilitate precision oncology studies by providing a large public resource for developing and benchmarking multimodal machine learning methods in head and neck cancer.

    Introduction

    Head and neck cancer (HNC) is a prevalent malignancy with poor outcomes – it is the 7th most common cancer globally and carries a 5-year survival of only ~25–60% despite modern treatments. Improving patient prognosis may require personalized, multimodal therapy decisions, using information from pathology, clinical, and other data sources. However, progress in multimodal prediction has been limited by the lack of large public datasets that integrate these diverse data types. To our knowledge, existing HNC datasets are either small or incomplete; for example, a radiomics study included 288 oropharyngeal cases, and a proteomics-focused set with imaging had only 122 cases. The Cancer Genome Atlas (TCGA) provides multi-omics for >500 HNC cases, but lacks crucial data like pathology reports, blood tests, or comprehensive imaging for each patient. These limitations hinder robust multimodal research.

    HANCOCK was created to address this gap. It aggregates 763 patients’ data from a single academic center, capturing a real-world, uniformly treated cohort. The dataset uniquely combines whole slide histopathology images, tissue microarray images, detailed clinical parameters, pathology reports, and lab values in one resource. By curating and harmonizing these modalities, HANCOCK enables researchers to explore complex data interdependencies and develop multimodal predictive models. The patient population reflects typical HNC demographics – 80% male, median age 61, with 72% being former or current smokers – aligning with expected epidemiology and supporting generalizability. In summary, HANCOCK is an unprecedented multimodal HNC dataset that can fuel research in machine learning, prognostic biomarker discovery, and integrative oncology, ultimately advancing personalized head and neck cancer care.

    Methods

    The following sections describe how the HANCOCK data were collected, processed, and prepared for public sharing.

    Subject Inclusion and Exclusion Criteria

    Patients included in HANCOCK were those diagnosed with head and neck cancer between 2005 and 2019 at University Hospital Erlangen (Germany) who underwent a curative-intent initial treatment (surgery and/or definitive therapy). This encompasses cancers of the oral cavity, oropharynx, hypopharynx, and larynx. Patients treated palliatively or with recurrent/metastatic disease at presentation were excluded to focus on first-course, curative treatments. The cohort consists of 763 patients (approximately 80% male, 20% female) with a median age of 61 years. Notably, ~72% have a history of tobacco use, which is consistent with real-world HNC risk factors. The distribution of tumor subsites and stages reflects typical HNC presentation, and thus the dataset is broadly representative of the general HNC patient population. Being a single-center dataset, there is limited geographic diversity; however, the homogeneous data acquisition and treatment context reduce variability in data quality. No significant selection biases were introduced aside from the exclusion of non-curative cases – all major HNC subsite cases over the inclusion period were captured, providing a comprehensive real-world sample. Ethical approval was obtained for this retrospective data collection and sharing (Ethics Committee vote #23-22-Br), and all data were fully de-identified prior to release.

    Data Acquisition

    Histopathology: Tissue specimens from the primary tumors (and involved lymph nodes, if present) were obtained from the pathology archives. All samples were formalin-fixed and paraffin-embedded (FFPE) and stained with hematoxylin and eosin (H&E) following routine protocols. Digital whole-slide imaging was performed on these histology slides. A total of 709 H&E slides of primary tumor tissue (701 patients had one slide, 8 patients had two slides) were scanned at high resolution using a 3DHISTECH P1000 scanner at an effective 82.44× magnification (0.1213 µm/pixel). Additionally, 396 H&E slides of lymph node metastases were scanned, using two systems: an Aperio Leica GT450 at 40× (0.2634 µm/pixel) and the 3DHISTECH P1000 at ~51× (0.1945 µm/pixel). (Multiple scanners were utilized over the course of the project; all resulting images were cross-verified for quality.) The digital whole slide images (WSIs) are provided in the pyramidal Aperio SVS format, a TIFF-based format compatible with standard viewers.

    In addition to full slides, tissue microarrays (TMAs) were constructed from each patient’s tumor block to sample important regions. For each case, two cylindrical core biopsies (diameter 1.5 mm) were taken – one from the tumor center and one from the invasive tumor front. These cores were assembled into TMA blocks and stained on separate slides with a panel of eight stains: H&E plus immunohistochemical (IHC) markers targeting various immune cells and tumor biomarkers. The IHC markers include CD3, CD8, CD56, CD68, CD163, PD-L1, and MHC-1, which label T cells (CD3, CD8), natural killer cells (CD56), monocytes/macrophages (CD68, CD163), and a tumor immune checkpoint ligand (PD-L1), as well as MHC class I expression. Each core appears on up to 8 stained TMA slides (one per stain), yielding up to 16 TMA images per patient (two cores × eight stains). In the dataset, TMA images are provided for both the tumor-center and tumor-front cores; these too are digitized high-resolution images (consistent microscope settings, ~40×). The combination of WSIs and TMAs yields a rich imaging dataset: 701 patients have at least one primary tumor WSI (62 patients lack WSIs due to unavailable tissue), and all patients have TMA core images unless the tumor block was exhausted. This imaging data offers both broad tissue context from WSIs and targeted cellular detail from TMAs. Manual tumor region annotations are also included for the primary tumor WSIs (see Data Analysis below).
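
    The slide and image counts in the two paragraphs above can be cross-checked with simple arithmetic. The sketch below uses only figures quoted in the text; the variable names are hypothetical. Read literally, 701 single-slide patients plus 8 two-slide patients would give 717 slides, so the check assumes instead that 701 is the number of patients with at least one slide (which also matches 763 minus the 62 patients without WSIs). That reading is an assumption, not something the source confirms.

```python
TOTAL_PATIENTS = 763
PATIENTS_WITHOUT_WSI = 62
PATIENTS_WITH_TWO_SLIDES = 8
STATED_PRIMARY_SLIDES = 709

# Patients with at least one primary tumor whole-slide image.
patients_with_wsi = TOTAL_PATIENTS - PATIENTS_WITHOUT_WSI

# Assumed reading: the 8 two-slide patients are part of the 701.
patients_with_one_slide = patients_with_wsi - PATIENTS_WITH_TWO_SLIDES
primary_slides = patients_with_one_slide * 1 + PATIENTS_WITH_TWO_SLIDES * 2

# TMA images: two cores (tumor center, invasive front), eight stains each.
CORES_PER_PATIENT = 2
STAINS = ["H&E", "CD3", "CD8", "CD56", "CD68", "CD163", "PD-L1", "MHC-1"]
max_tma_images_per_patient = CORES_PER_PATIENT * len(STAINS)

print(patients_with_wsi, primary_slides, max_tma_images_per_patient)
```

    Under this reading the totals agree with the stated 709 primary tumor slides and the stated maximum of 16 TMA images per patient.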

    Clinical and Pathology Data: A wide array of non-imaging data was extracted from hospital information systems and pathology reports for each patient. Key demographic variables (age, sex, etc.) and tumor pathology details were collected, including primary tumor site, histologic subtype, grade, TNM stage, resection margin status, depth of invasion, perineural and lymphovascular invasion, and nodal metastasis status. These pathology parameters were recorded in a structured format for each case. Standard clinical coding systems were used where applicable: e.g., diagnoses are coded with ICD-10 codes and procedures with OPS codes (the German procedure classification system). The dataset includes these codes for each patient’s conditions and treatments. Comprehensive laboratory blood test results at diagnosis or pre-treatment were also compiled, covering complete blood counts, coagulation measures, electrolytes, kidney function, C-reactive protein, and other relevant analytes. Reference ranges for each lab parameter are provided alongside the values to indicate whether a result was normal or abnormal. Most patients have a full panel of these lab results, though some values are missing if a test was not clinically indicated; the dataset notes availability per patient. All structured data have been cleaned and validated – for example, harmonizing category values and checking consistency (e.g. TNM stages align with recorded tumor sites).

    Textual Data (Surgical Reports and Histories): Unstructured clinical text was also included to add rich context on treatment details. Surgery reports (operative notes) from the primary tumor resection and associated medical history summaries were retrieved from the hospital’s electronic records. For each patient, the operative report from their first definitive surgery and the corresponding

  19. Five year survival from all cancers - ICP Outcomes Framework - Registered...

    • cityobservatory.birmingham.gov.uk
    csv, excel, geojson +1
    Updated Sep 9, 2025
    + more versions
    Cite
    (2025). Five year survival from all cancers - ICP Outcomes Framework - Registered Locality [Dataset]. https://cityobservatory.birmingham.gov.uk/explore/dataset/five-year-survival-from-all-cancers-icp-outcomes-framework-registered-locality/
    Explore at:
    Available download formats: csv, json, geojson, excel
    Dataset updated
    Sep 9, 2025
    License

    Open Government Licence 3.0 http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/
    License information was derived automatically

    Description

    This dataset presents information on five-year survival rates from all cancers, focusing on individuals diagnosed with invasive cancers (ICD-10 codes C00 to C97, excluding non-melanoma skin cancer C44). It provides a simplified local methodology for calculating survival outcomes, enabling analysis by ethnicity, deprivation, and within the Birmingham and Solihull (BSol) geography. While it does not replicate the national calculation, it offers valuable insights into cancer survival trends at a more granular level.

    Rationale

    The primary aim of this indicator is to increase five-year survival rates from all cancers. Monitoring survival over a five-year period provides a meaningful measure of cancer outcomes and the effectiveness of early diagnosis and treatment interventions.

    Numerator

    The numerator includes individuals who were diagnosed with a specific type of cancer and subsequently died from the same type of cancer within five years of diagnosis. Only invasive cancers (ICD-10 codes C00 to C97, excluding C44) are included.

    Denominator

    The denominator comprises all individuals diagnosed with an invasive cancer (ICD-10 codes C00 to C97, excluding C44) within a five-year period.
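
    The numerator above counts deaths, while the indicator is a survival rate. The description does not give the formula, so the sketch below assumes one plausible reading: survival is the share of the denominator that is not in the numerator. The function name and the example numbers are hypothetical.

```python
def five_year_survival_pct(diagnosed, died_within_five_years):
    """Percentage of diagnosed patients not recorded as dying within five years.

    diagnosed: denominator (invasive cancer diagnoses in the period).
    died_within_five_years: numerator (deaths from the same cancer type
    within five years of diagnosis).
    """
    if diagnosed <= 0:
        raise ValueError("denominator must be positive")
    if not 0 <= died_within_five_years <= diagnosed:
        raise ValueError("numerator must be between 0 and the denominator")
    # Assumed formula: survivors as a share of all diagnosed patients.
    return 100.0 * (diagnosed - died_within_five_years) / diagnosed


# Hypothetical locality: 200 diagnoses, 80 deaths within five years.
example_survival = five_year_survival_pct(200, 80)
```

    Because only deaths from the same cancer type enter the numerator, this simplified figure is not directly comparable with nationally published survival estimates, as the caveats for this dataset note.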

    Caveats

    This dataset uses a simplified methodology that does not replicate the national calculation. As a result, the values reported here may differ from nationally published figures. However, this approach allows for the inclusion of breakdowns by ethnicity, deprivation, and local geography (BSol), which are not always available in national statistics.

    External References

    For more information, refer to the National Cancer Registration and Analysis Service (NCRAS).

    Localities Explained

    This dataset contains data based on either the resident locality or the registered locality of the patient. A distinction is made between resident locality and registered locality populations:

    Resident Locality refers to individuals who live within the defined geographic boundaries of the locality. These boundaries are aligned with official administrative areas such as wards and Lower Layer Super Output Areas (LSOAs).

    Registered Locality refers to individuals who are registered with GP practices that are assigned to a locality based on the Primary Care Network (PCN) they belong to. These assignments are approximate: PCNs are mapped to a locality based on the location of most of their GP surgeries. As a result, locality-registered patients may live outside the locality, sometimes even in different towns or cities.

    This distinction is important because some health indicators are only available at GP practice level, without information on where patients actually reside. In such cases, data is attributed to the locality based on GP registration, not residential address.
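
    The registered-locality rule described above, where a PCN is mapped to the locality containing most of its GP surgeries, amounts to a majority vote. A minimal sketch, assuming each surgery carries a single locality label; the function name and locality names are hypothetical, and the tie-breaking behaviour (first locality encountered wins) is an assumption the source does not address.

```python
from collections import Counter


def assign_pcn_locality(surgery_localities):
    """Return the locality where most of a PCN's GP surgeries are located.

    surgery_localities: one locality label per GP surgery in the PCN.
    Returns None for a PCN with no surgeries listed.
    """
    if not surgery_localities:
        return None
    counts = Counter(surgery_localities)
    # most_common keeps first-seen order among equal counts.
    locality, _ = counts.most_common(1)[0]
    return locality


# Hypothetical PCN with three surgeries, two of them in "North".
example_locality = assign_pcn_locality(["North", "East", "North"])
```

    Every patient registered with any surgery in the PCN is then counted under that one locality, which is why registered-locality figures can include people who live elsewhere.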

    Click here to explore more from the Birmingham and Solihull Integrated Care Partnerships Outcome Framework.

  20. Cancer data of United States of America

    • kaggle.com
    zip
    Updated Apr 18, 2024
    Cite
    Tanisha1604 (2024). Cancer data of United States of America [Dataset]. https://www.kaggle.com/datasets/tanisha1604/cancer-data-of-united-states-of-america
    Explore at:
    Available download formats
    zip (346754 bytes)
    Dataset updated
    Apr 18, 2024
    Authors
    Tanisha1604
    License

    MIT License (https://opensource.org/licenses/MIT)
    License information was derived automatically

    Area covered
    United States
    Description

    About Dataset

    The dataset contains two .csv files. The first file contains various demographic and health-related data for different regions. Here's a brief description of each column:

    File 1

    • avganncount: Average number of cancer cases diagnosed annually.

    • avgdeathsperyear: Average number of deaths due to cancer per year.

    • target_deathrate: Target death rate due to cancer.

    • incidencerate: Incidence rate of cancer.

    • medincome: Median income in the region.

    • popest2015: Estimated population in 2015.

    • povertypercent: Percentage of population below the poverty line.

    • studypercap: Per capita number of cancer-related clinical trials conducted.

    • binnedinc: Binned median income.

    • medianage: Median age in the region.

    • pctprivatecoveragealone: Percentage of population covered by private health insurance alone.

    • pctempprivcoverage: Percentage of population covered by employee-provided private health insurance.

    • pctpubliccoverage: Percentage of population covered by public health insurance.

    • pctpubliccoveragealone: Percentage of population covered by public health insurance only.

    • pctwhite: Percentage of White population.

    • pctblack: Percentage of Black population.

    • pctasian: Percentage of Asian population.

    • pctotherrace: Percentage of population belonging to other races.

    • pctmarriedhouseholds: Percentage of married households.

    • birthrate: Birth rate in the region.

    File 2

    This file contains demographic information about different regions, including details about household size and geographical location. Here's a description of each column:

    • statefips: The FIPS code representing the state.

    • countyfips: The FIPS code representing the county or census area within the state.

    • avghouseholdsize: The average household size in the region.

    • geography: The geographical location, typically represented as the county or census area name followed by the state name.

    Each row in the file represents a specific region, providing details about household size and geographical location. This information can be used for various demographic analyses and studies.
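    As a rough sketch of how the second file's columns might be used, the example below combines `statefips` and `countyfips` into the conventional five-digit county code (two-digit state plus three-digit county, zero-padded) and splits `geography` into county and state. The sample rows, the `avghouseholdsize` values, and the assumption that `geography` uses a comma and space as separator are illustrative and were not taken from the dataset.

```python
import csv
import io

# Illustrative rows only; the household sizes are made up for the example.
SAMPLE_CSV = """statefips,countyfips,avghouseholdsize,geography
1,1,2.5,"Autauga County, Alabama"
53,35,2.5,"Kitsap County, Washington"
"""


def county_geoid(statefips: str, countyfips: str) -> str:
    """Zero-pad to a 2-digit state and 3-digit county code, then concatenate."""
    return f"{int(statefips):02d}{int(countyfips):03d}"


def parse_household_rows(text: str) -> list:
    """Parse CSV text into dicts with a combined county code and split geography."""
    rows = []
    for row in csv.DictReader(io.StringIO(text)):
        # Split on the last ", " so county names containing commas stay intact.
        # If the separator is absent, county is "" and state holds the whole value.
        county, _, state = row["geography"].rpartition(", ")
        rows.append({
            "geoid": county_geoid(row["statefips"], row["countyfips"]),
            "county": county,
            "state": state,
            "avghouseholdsize": float(row["avghouseholdsize"]),
        })
    return rows


rows = parse_household_rows(SAMPLE_CSV)
for r in rows:
    print(r["geoid"], r["county"], r["state"], r["avghouseholdsize"])
```

    Keeping the code as a zero-padded string rather than an integer avoids losing the leading zero for states such as Alabama.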

Cite
Heemali Chaudhari (2022). Cancer Rates by U.S. State [Dataset]. https://www.kaggle.com/datasets/heemalichaudhari/cancer-rates-by-us-state

Cancer Rates by U.S. State


Explore at:
6 scholarly articles cite this dataset
Available download formats
zip (219237 bytes)
Dataset updated
Dec 26, 2022
Authors
Heemali Chaudhari
License

https://creativecommons.org/publicdomain/zero/1.0/

Area covered
United States
Description

In the following maps, the U.S. states are divided into groups based on the rates at which people developed or died from cancer in 2013, the most recent year for which incidence data are available.

The rates are the numbers out of 100,000 people who developed or died from cancer each year.

Incidence Rates by State

The number of people who get cancer is called cancer incidence. In the United States, the rate of getting cancer varies from state to state.

*Rates are per 100,000 and are age-adjusted to the 2000 U.S. standard population.

‡Rates are not shown if the state did not meet USCS publication criteria or if the state did not submit data to CDC.

†Source: U.S. Cancer Statistics Working Group. United States Cancer Statistics: 1999–2013 Incidence and Mortality Web-based Report. Atlanta (GA): Department of Health and Human Services, Centers for Disease Control and Prevention, and National Cancer Institute; 2016. Available at: http://www.cdc.gov/uscs.

Death Rates by State

Rates of dying from cancer also vary from state to state.

*Rates are per 100,000 and are age-adjusted to the 2000 U.S. standard population.

†Source: U.S. Cancer Statistics Working Group. United States Cancer Statistics: 1999–2013 Incidence and Mortality Web-based Report. Atlanta (GA): Department of Health and Human Services, Centers for Disease Control and Prevention, and National Cancer Institute; 2016. Available at: http://www.cdc.gov/uscs.

Source: https://www.cdc.gov/cancer/dcpc/data/state.htm
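The footnotes above describe the rates as per 100,000 and age-adjusted to a standard population. A minimal sketch of that arithmetic, using direct standardization, is given here. The age bands, case counts, populations, and weights are invented for illustration; they are not the 2000 U.S. standard population weights and not values from this dataset.

```python
def crude_rate(cases: float, population: float, per: int = 100_000) -> float:
    """Cases per `per` people, with no adjustment for age structure."""
    return cases / population * per


def age_adjusted_rate(strata: dict, standard_weights: dict, per: int = 100_000) -> float:
    """Direct standardization: weight each age-specific rate by the
    standard population's share for that age band, then sum."""
    total_weight = sum(standard_weights.values())
    return sum(
        crude_rate(cases, population, per) * standard_weights[band] / total_weight
        for band, (cases, population) in strata.items()
    )


# Hypothetical state: (cases, population) per age band.
strata = {
    "0-39": (10, 100_000),
    "40-64": (200, 100_000),
    "65+": (500, 50_000),
}
# Hypothetical standard-population shares for the same bands.
weights = {"0-39": 0.5, "40-64": 0.3, "65+": 0.2}

total_cases = sum(cases for cases, _ in strata.values())
total_population = sum(population for _, population in strata.values())

print(round(crude_rate(total_cases, total_population), 1))   # crude rate
print(round(age_adjusted_rate(strata, weights), 1))          # age-adjusted rate
```

With these made-up numbers the crude rate is about 284 per 100,000 and the age-adjusted rate about 265, because the standard population here gives less weight to the oldest band than the hypothetical state's own age structure does. This is why age-adjusted rates let states with different age profiles be compared on the same footing.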
