81 datasets found
  1. County Cancer Death Rates

    • kaggle.com
    Updated Dec 3, 2023
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    The Devastator (2023). County Cancer Death Rates [Dataset]. https://www.kaggle.com/datasets/thedevastator/county-cancer-death-rates
    Explore at:
    CroissantCroissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Dec 3, 2023
    Dataset provided by
    Kaggle
    Authors
    The Devastator
    Description

    County Cancer Death Rates

    County-level cancer death rates with related variables

    By Noah Rippner [source]

    About this dataset

    This dataset provides comprehensive information on county-level cancer death and incidence rates, as well as various related variables. It includes data on age-adjusted death rates, average deaths per year, recent trends in cancer death rates, recent 5-year trends in death rates, and average annual counts of cancer deaths or incidence. The dataset also includes the federal information processing standards (FIPS) codes for each county.

    Additionally, the dataset indicates whether each county met the objective of a targeted death rate of 45.5. The recent trend in cancer deaths or incidence is also captured for analysis purposes.

    The purpose of the death.csv file within this dataset is to offer detailed information specifically concerning county-level cancer death rates and related variables. On the other hand, the incd.csv file contains data on county-level cancer incidence rates and additional relevant variables.

    To provide more context and understanding about the included data points, there is a separate file named cancer_data_notes.csv. This file serves to provide informative notes and explanations regarding the various aspects of the cancer data used in this dataset.

    Please note that this particular description provides an overview for a linear regression walkthrough using this dataset based on Python programming language. It highlights how to source and import the data properly before moving into data preparation steps such as exploratory analysis. The walkthrough further covers model selection and important model diagnostics measures.

    It's essential to bear in mind that this example serves as an initial attempt at creating a multivariate Ordinary Least Squares regression model using these datasets from various sources like cancer.gov along with US Census American Community Survey data. This baseline model allows easy comparisons with future iterations intended for improvements or refinements.

    Important columns found within this extensively documented Kaggle dataset include County names along with their corresponding FIPS codes—a standardized coding system by Federal Information Processing Standards (FIPS). Moreover,Met Objective of 45.5? (1) column denotes whether a specific county achieved the targeted objective of a death rate of 45.5 or not.

    Overall, this dataset aims to offer valuable insights into county-level cancer death and incidence rates across various regions, providing policymakers, researchers, and healthcare professionals with essential information for analysis and decision-making purposes

    How to use the dataset

    • Familiarize Yourself with the Columns:

      • County: The name of the county.
      • FIPS: The Federal Information Processing Standards code for the county.
      • Met Objective of 45.5? (1): Indicates whether the county met the objective of a death rate of 45.5 (Boolean).
      • Age-Adjusted Death Rate: The age-adjusted death rate for cancer in the county.
      • Average Deaths per Year: The average number of deaths per year due to cancer in the county.
      • Recent Trend (2): The recent trend in cancer death rates/incidence in the county.
      • Recent 5-Year Trend (2) in Death Rates: The recent 5-year trend in cancer death rates/incidence in the county.
      • Average Annual Count: The average annual count of cancer deaths/incidence in the county.
    • Determine Counties Meeting Objective: Use this dataset to identify counties that have met or not met an objective death rate threshold of 45.5%. Look for entries where Met Objective of 45.5? (1) is marked as True or False.

    • Analyze Age-Adjusted Death Rates: Study and compare age-adjusted death rates across different counties using Age-Adjusted Death Rate values provided as floats.

    • Explore Average Deaths per Year: Examine and compare average annual counts and trends regarding deaths caused by cancer, using Average Deaths per Year as a reference point.

    • Investigate Recent Trends: Assess recent trends related to cancer deaths or incidence by analyzing data under columns such as Recent Trend, Recent Trend (2), and Recent 5-Year Trend (2) in Death Rates. These columns provide information on how cancer death rates/incidence have changed over time.

    • Compare Counties: Utilize this dataset to compare counties based on their cancer death rates and related variables. Identify counties with lower or higher average annual counts, age-adjusted death rates, or recent trends to analyze and understand the factors contributing ...

  2. CDC WONDER: Cancer Statistics

    • catalog.data.gov
    • healthdata.gov
    • +5more
    Updated Feb 22, 2025
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Centers for Disease Control and Prevention, Department of Health & Human Services (2025). CDC WONDER: Cancer Statistics [Dataset]. https://catalog.data.gov/dataset/cdc-wonder-cancer-statistics
    Explore at:
    Dataset updated
    Feb 22, 2025
    Description

    The United States Cancer Statistics (USCS) online databases in WONDER provide cancer incidence and mortality data for the United States for the years since 1999, by year, state and metropolitan areas (MSA), age group, race, ethnicity, sex, childhood cancer classifications and cancer site. Report case counts, deaths, crude and age-adjusted incidence and death rates, and 95% confidence intervals for rates. The USCS data are the official federal statistics on cancer incidence from registries having high-quality data and cancer mortality statistics for 50 states and the District of Columbia. USCS are produced by the Centers for Disease Control and Prevention (CDC) and the National Cancer Institute (NCI), in collaboration with the North American Association of Central Cancer Registries (NAACCR). Mortality data are provided by the Centers for Disease Control and Prevention (CDC), National Center for Health Statistics (NCHS), National Vital Statistics System (NVSS).

  3. A

    ‘🎗️ Cancer Rates by U.S. State’ analyzed by Analyst-2

    • analyst-2.ai
    Updated Feb 13, 2022
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com) (2022). ‘🎗️ Cancer Rates by U.S. State’ analyzed by Analyst-2 [Dataset]. https://analyst-2.ai/analysis/kaggle-cancer-rates-by-u-s-state-5f6a/af56eb24/?iid=000-919&v=presentation
    Explore at:
    Dataset updated
    Feb 13, 2022
    Dataset authored and provided by
    Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com)
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    United States
    Description

    Analysis of ‘🎗️ Cancer Rates by U.S. State’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://www.kaggle.com/yamqwe/cancer-rates-by-u-s-statee on 13 February 2022.

    --- Dataset description provided by original source is as follows ---

    About this dataset

    In the following maps, the U.S. states are divided into groups based on the rates at which people developed or died from cancer in 2013, the most recent year for which incidence data are available.

    The rates are the numbers out of 100,000 people who developed or died from cancer each year.

    Incidence Rates by State
    The number of people who get cancer is called cancer incidence. In the United States, the rate of getting cancer varies from state to state.

    • *Rates are per 100,000 and are age-adjusted to the 2000 U.S. standard population.

    • ‡Rates are not shown if the state did not meet USCS publication criteria or if the state did not submit data to CDC.

    • †Source: U.S. Cancer Statistics Working Group. United States Cancer Statistics: 1999–2013 Incidence and Mortality Web-based Report. Atlanta (GA): Department of Health and Human Services, Centers for Disease Control and Prevention, and National Cancer Institute; 2016. Available at: http://www.cdc.gov/uscs.

    Death Rates by State
    Rates of dying from cancer also vary from state to state.

    • *Rates are per 100,000 and are age-adjusted to the 2000 U.S. standard population.

    • †Source: U.S. Cancer Statistics Working Group. United States Cancer Statistics: 1999–2013 Incidence and Mortality Web-based Report. Atlanta (GA): Department of Health and Human Services, Centers for Disease Control and Prevention, and National Cancer Institute; 2016. Available at: http://www.cdc.gov/uscs.

    Source: https://www.cdc.gov/cancer/dcpc/data/state.htm

    This dataset was created by Adam Helsinger and contains around 100 samples along with Range, Rate, technical information and other features such as: - Range - Rate - and more.

    How to use this dataset

    • Analyze Range in relation to Rate
    • Study the influence of Range on Rate
    • More datasets

    Acknowledgements

    If you use this dataset in your research, please credit Adam Helsinger

    Start A New Notebook!

    --- Original source retains full ownership of the source dataset ---

  4. Cancer Mortality & Incidence Rates: (Country LVL)

    • kaggle.com
    Updated Dec 3, 2022
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    The Devastator (2022). Cancer Mortality & Incidence Rates: (Country LVL) [Dataset]. https://www.kaggle.com/datasets/thedevastator/us-county-level-cancer-mortality-and-incidence-r/data
    Explore at:
    CroissantCroissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Dec 3, 2022
    Dataset provided by
    Kaggle
    Authors
    The Devastator
    Description

    Cancer Mortality & Incidence Rates: (Country LVL)

    Investigating Cancer Trends over time

    By Data Exercises [source]

    About this dataset

    This dataset is a comprehensive collection of data from county-level cancer mortality and incidence rates in the United States between 2000-2014. This data provides an unprecedented level of detail into cancer cases, deaths, and trends at a local level. The included columns include County, FIPS, age-adjusted death rate, average death rate per year, recent trend (2) in death rates, recent 5-year trend (2) in death rates and average annual count for each county. This dataset can be used to provide deep insight into the patterns and effects of cancer on communities as well as help inform policy decisions related to mitigating risk factors or increasing preventive measures such as screenings. With this comprehensive set of records from across the United States over 15 years, you will be able to make informed decisions regarding individual patient care or policy development within your own community!

    More Datasets

    For more datasets, click here.

    Featured Notebooks

    • 🚨 Your notebook can be here! 🚨!

    How to use the dataset

    This dataset provides comprehensive US county-level cancer mortality and incidence rates from 2000 to 2014. It includes the mortality and incidence rate for each county, as well as whether the county met the objective of 45.5 deaths per 100,000 people. It also provides information on recent trends in death rates and average annual counts of cases over the five year period studied.

    This dataset can be extremely useful to researchers looking to study trends in cancer death rates across counties. By using this data, researchers will be able to gain valuable insight into how different counties are performing in terms of providing treatment and prevention services for cancer patients and whether preventative measures and healthcare access are having an effect on reducing cancer mortality rates over time. This data can also be used to inform policy makers about counties needing more target prevention efforts or additional resources for providing better healthcare access within at risk communities.

    When using this dataset, it is important to pay close attention to any qualitative columns such as “Recent Trend” or “Recent 5-Year Trend (2)” that may provide insights into long term changes that may not be readily apparent when using quantitative variables such as age-adjusted death rate or average deaths per year over shorter periods of time like one year or five years respectively. Additionally, when studying differences between different counties it is important to take note of any standard FIPS code differences that may indicate that data was collected by a different source with a difference methodology than what was used in other areas studied

    Research Ideas

    • Using this dataset, we can identify patterns in cancer mortality and incidence rates that are statistically significant to create treatment regimens or preventive measures specifically targeting those areas.
    • This data can be useful for policymakers to target areas with elevated cancer mortality and incidence rates so they can allocate financial resources to these areas more efficiently.
    • This dataset can be used to investigate which factors (such as pollution levels, access to medical care, genetic make up) may have an influence on the cancer mortality and incidence rates in different US counties

    Acknowledgements

    If you use this dataset in your research, please credit the original authors. Data Source

    License

    License: Dataset copyright by authors - You are free to: - Share - copy and redistribute the material in any medium or format for any purpose, even commercially. - Adapt - remix, transform, and build upon the material for any purpose, even commercially. - You must: - Give appropriate credit - Provide a link to the license, and indicate if changes were made. - ShareAlike - You must distribute your contributions under the same license as the original. - Keep intact - all notices that refer to this license, including copyright notices.

    Columns

    File: death .csv | Column name | Description | |:-------------------------------------------|:-------------------------------------------------------------------...

  5. G

    Number and rates of new cases of primary cancer, by cancer type, age group...

    • open.canada.ca
    • www150.statcan.gc.ca
    • +2more
    csv, html, xml
    Updated Feb 3, 2025
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Statistics Canada (2025). Number and rates of new cases of primary cancer, by cancer type, age group and sex [Dataset]. https://open.canada.ca/data/en/dataset/e667992c-5f2e-425a-8a44-a880930d82d8
    Explore at:
    csv, xml, htmlAvailable download formats
    Dataset updated
    Feb 3, 2025
    Dataset provided by
    Statistics Canada
    License

    Open Government Licence - Canada 2.0https://open.canada.ca/en/open-government-licence-canada
    License information was derived automatically

    Description

    Number and rate of new cancer cases diagnosed annually from 1992 to the most recent diagnosis year available. Included are all invasive cancers and in situ bladder cancer with cases defined using the Surveillance, Epidemiology and End Results (SEER) Groups for Primary Site based on the World Health Organization International Classification of Diseases for Oncology, Third Edition (ICD-O-3). Random rounding of case counts to the nearest multiple of 5 is used to prevent inappropriate disclosure of health-related information.

  6. Deaths from All Cancers - Datasets - Lincolnshire Open Data

    • lincolnshire.ckan.io
    Updated May 9, 2017
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    ckan.io (2017). Deaths from All Cancers - Datasets - Lincolnshire Open Data [Dataset]. https://lincolnshire.ckan.io/dataset/deaths-from-all-cancers
    Explore at:
    Dataset updated
    May 9, 2017
    Dataset provided by
    CKANhttps://ckan.org/
    License

    Open Government Licence 3.0http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/
    License information was derived automatically

    Description

    This data shows premature deaths (Age under 75) from all Cancers, numbers and rates by gender, as 3-year moving-averages. Cancers are a major cause of premature deaths. Inequalities exist in cancer rates between the most deprived areas and the most affluent areas. Directly Age-Standardised Rates (DASR) are shown in the data (where numbers are sufficient) so that death rates can be directly compared between areas. The DASR calculation applies Age-specific rates to a Standard (European) population to cancel out possible effects on crude rates due to different age structures among populations, thus enabling direct comparisons of rates. A limitation on using mortalities as a proxy for prevalence of health conditions is that mortalities may give an incomplete view of health conditions in an area, as ill-health might not lead to premature death. Data source: Office for Health Improvement and Disparities (OHID), indicator ID 40501, E05a. This data is updated annually.

  7. Cancer Statistics in US States

    • kaggle.com
    Updated Jun 17, 2022
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Ms. Nancy Al Aswad (2022). Cancer Statistics in US States [Dataset]. https://www.kaggle.com/datasets/nancyalaswad90/cancer-statistics-in-us-states
    Explore at:
    CroissantCroissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Jun 17, 2022
    Dataset provided by
    Kaggle
    Authors
    Ms. Nancy Al Aswad
    License

    https://creativecommons.org/publicdomain/zero/1.0/https://creativecommons.org/publicdomain/zero/1.0/

    Area covered
    United States
    Description

    What are Cancer Statistics in US States?

    The circled group of good survivors has genetic indicators of poor survivors (i.e. low ESR1 levels, which is typically the prognostic indicator of poor outcomes in breast cancer) – understanding this group could be critical for helping improve mortality rates for this disease. Why this group survived was quickly analysed by using the Outcome Column (here Event Death - which is binary - 0,1) as a Data Lens (which we term Supervised vs Unsupervised analyses).

    How to use this dataset

    • A network was built using only gene expression with 272 breast cancer patients (as rows), and 1570 columns.

    • Metadata includes patient info, treatment, and survival.

    • Each node is a group of patients similar to each other. Flares (left) represent sub-populations that are distinct from the larger population. (One differentiating factor between the two flares is estrogen expression (low = top flare, high = bottom flare)).

    • A bottom flare is a group of patients with 100% survival. The top flare shows a range of survival – very poor towards the tip (red), and very good near the base (circled).

    Acknowledgments

    When we use this dataset in our research, we credit the authors as :

    The main idea for uploading this dataset is to practice data analysis with my students, as I am working in college and want my student to train our studying ideas in a big dataset, It may be not up to date and I mention the collecting years, but it is a good resource of data to practice

  8. n

    Data from: A ten-year (2009–2018) database of cancer mortality rates in...

    • data.niaid.nih.gov
    • datadryad.org
    zip
    Updated Oct 24, 2022
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Arianna Di Paola; Roberto Cazzolla Gatti; Alfonso Monaco; Alena Velichevskaya; Nicola Amoroso; Roberto Bellotti (2022). A ten-year (2009–2018) database of cancer mortality rates in Italy [Dataset]. http://doi.org/10.5061/dryad.ns1rn8pvg
    Explore at:
    zipAvailable download formats
    Dataset updated
    Oct 24, 2022
    Dataset provided by
    University of Bologna
    Italian National Research Council
    Istituto Nazionale di Fisica Nucleare, Sezione di Bari
    National Research Tomsk State University
    University of Bari Aldo Moro
    Authors
    Arianna Di Paola; Roberto Cazzolla Gatti; Alfonso Monaco; Alena Velichevskaya; Nicola Amoroso; Roberto Bellotti
    License

    https://spdx.org/licenses/CC0-1.0.htmlhttps://spdx.org/licenses/CC0-1.0.html

    Area covered
    Italy
    Description

    AbstractIn Italy, approximately 400.000 new cases of malignant tumors are recorded every year. The average of annual deaths caused by tumors, according to the Italian Cancer Registers, is about 3.5 deaths and about 2.5 per 1,000 men and women respectively, for a total of about 3 deaths every 1,000 people. Long-term (at least a decade) and spatially detailed data (up to the municipality scale) are neither easily accessible nor fully available for public consultation by the citizens, scientists, research groups, and associations. Therefore, here we present a ten-year (2009–2018) database on cancer mortality rates (in the form of Standardized Mortality Ratios, SMR) for 23 cancer macro-types in Italy on municipal, provincial, and regional scales. We aim to make easily accessible a comprehensive, ready-to-use, and openly accessible source of data on the most updated status of cancer mortality in Italy for local and national stakeholders, researchers, and policymakers and to provide researchers with ready-to-use data to perform specific studies. Methods For a given locality, year, and cause of death, the SMR is the ratio between the observed number of deaths (Om) and the number of expected deaths (Em): SMR = Om/Em (1) where Om should be an available observational data and Em is estimated as the weighted sum of age-specific population size for the given locality (ni) per age-specific death rates of the reference population (MRi): Em = sum(MRi x ni) (2) MRi could be provided by a public health organization or be estimated as the ratio between the age-specific number of deaths of reference population (Mi) to the age-specific reference population size (Ni): MRi = Mi/Ni (3) Thus, the value of Em is weighted by the age distribution of deaths and population size. SMR assumes value 1 when the number of observed and expected deaths are equal. Following eqns. (1-3), the SMR was computed for single years of the period 2009-2018 and for single cause of death as defined by the International ICD-10 classification system by using the following data: age-specific number of deaths by cause of reference population (i.e., Mi) from the Italian National Institute of Statistics (ISTAT, (http://www.istat.it/en/, last access: 26/01/2022)); age-specific census data on reference population (i.e., Ni) from ISTAT; the observed number of deaths by cause (i.e., Om) from ISTAT; the age-specific census data on population (ni); the SMR was estimated at three different level of aggregation: municipal, provincial (equivalent to the European classification NUTS 3) and regional (i.e., NUTS2). The SMR was also computed for the broad category of malignant tumors (i.e. C00-C979, hereinafter cancer macro-type C), and for the broad category of malignant tumor plus non-malignant tumors (i.e. C00-C979 plus D0-D489, hereinafter cancer macro-type CD). Lower 90% and 95% confidence intervals of 10-year average values were computed according to the Byar method.

  9. H

    SEER Cancer Statistics Database

    • data.niaid.nih.gov
    Updated Jul 11, 2011
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    (2011). SEER Cancer Statistics Database [Dataset]. http://doi.org/10.7910/DVN/C9KBBC
    Explore at:
    Dataset updated
    Jul 11, 2011
    License

    CC0 1.0 Universal Public Domain Dedicationhttps://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    Users can access data about cancer statistics in the United States including but not limited to searches by type of cancer and race, sex, ethnicity, age at diagnosis, and age at death. Background Surveillance Epidemiology and End Results (SEER) database’s mission is to provide information on cancer statistics to help reduce the burden of disease in the U.S. population. The SEER database is a project to the National Cancer Institute. The SEER database collects information on incidence, prevalence, and survival from specific geographic areas representing 28 percent of the United States population. User functionality Users can access a variety of reso urces. Cancer Stat Fact Sheets allow users to look at summaries of statistics by major cancer type. Cancer Statistic Reviews are available from 1975-2008 in table format. Users are also able to build their own tables and graphs using Fast Stats. The Cancer Query system provides more flexibility and a larger set of cancer statistics than F ast Stats but requires more input from the user. State Cancer Profiles include dynamic maps and graphs enabling the investigation of cancer trends at the county, state, and national levels. SEER research data files and SEER*Stat software are available to download through your Internet connection (SEER*Stat’s client-server mode) or via discs shipped directly to you. A signed data agreement form is required to access the SEER data Data Notes Data is available in different formats depending on which type of data is accessed. Some data is available in table, PDF, and html formats. Detailed information about the data is available under “Data Documentation and Variable Recodes”.

  10. Cancer incidence, by selected sites of cancer and sex, three-year average,...

    • www150.statcan.gc.ca
    • data.urbandatacentre.ca
    • +4more
    Updated Feb 14, 2018
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Government of Canada, Statistics Canada (2018). Cancer incidence, by selected sites of cancer and sex, three-year average, census metropolitan areas [Dataset]. http://doi.org/10.25318/1310011201-eng
    Explore at:
    Dataset updated
    Feb 14, 2018
    Dataset provided by
    Statistics Canadahttps://statcan.gc.ca/en
    Area covered
    Canada
    Description

    Age standardized rate of cancer incidence, by selected sites of cancer and sex, three-year average, census metropolitan areas.

  11. Cancer death rate (per 100,000), New Jersey, by year: Beginning 2010

    • healthdata.nj.gov
    • data.wu.ac.at
    application/rdfxml +5
    Updated Dec 8, 2020
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Death Certificate Database, Office of Vital Statistics and Registry, New Jersey Department of Health (2020). Cancer death rate (per 100,000), New Jersey, by year: Beginning 2010 [Dataset]. https://healthdata.nj.gov/dataset/Cancer-death-rate-per-100-000-New-Jersey-by-year-B/sc3j-a37s
    Explore at:
    csv, application/rdfxml, json, application/rssxml, xml, tsvAvailable download formats
    Dataset updated
    Dec 8, 2020
    Dataset provided by
    New Jersey Department of Healthhttps://www.nj.gov/health/
    Authors
    Death Certificate Database, Office of Vital Statistics and Registry, New Jersey Department of Health
    Area covered
    New Jersey
    Description

    Rate: Number of deaths due to all kinds of Cancer per 100,000 Population.

    Definition: Number of deaths per 100,000 with malignant neoplasm (cancer) as the underlying cause (ICD-10 codes: C00-C97).

    Data Sources:

    (1) Centers for Disease Control and Prevention, National Center for Health Statistics. Compressed Mortality File. CDC WONDER On-line Database accessed at http://wonder.cdc.gov/cmf-icd10.html

    (2) Death Certificate Database, Office of Vital Statistics and Registry, New Jersey Department of Health

    (3) Population Estimates, State Data Center, New Jersey Department of Labor and Workforce Development

  12. d

    Mortality Rates

    • catalog.data.gov
    • data.amerigeoss.org
    • +3more
    Updated Nov 22, 2024
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Lake County Illinois GIS (2024). Mortality Rates [Dataset]. https://catalog.data.gov/dataset/mortality-rates-6fb72
    Explore at:
    Dataset updated
    Nov 22, 2024
    Dataset provided by
    Lake County Illinois GIS
    Description

    Mortality Rates for Lake County, Illinois. Explanation of field attributes: Average Age of Death – The average age at which a people in the given zip code die. Cancer Deaths – Cancer deaths refers to individuals who have died of cancer as the underlying cause. This is a rate per 100,000. Heart Disease Related Deaths – Heart Disease Related Deaths refers to individuals who have died of heart disease as the underlying cause. This is a rate per 100,000. COPD Related Deaths – COPD Related Deaths refers to individuals who have died of chronic obstructive pulmonary disease (COPD) as the underlying cause. This is a rate per 100,000.

  13. p

    Cervical Cancer Risk Classification - Dataset - CKAN

    • data.poltekkes-smg.ac.id
    Updated Oct 7, 2024
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    (2024). Cervical Cancer Risk Classification - Dataset - CKAN [Dataset]. https://data.poltekkes-smg.ac.id/dataset/cervical-cancer-risk-classification
    Explore at:
    Dataset updated
    Oct 7, 2024
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Cervical Cancer Risk Factors for Biopsy: This Dataset is Obtained from UCI Repository and kindly acknowledged! This file contains a List of Risk Factors for Cervical Cancer leading to a Biopsy Examination! About 11,000 new cases of invasive cervical cancer are diagnosed each year in the U.S. However, the number of new cervical cancer cases has been declining steadily over the past decades. Although it is the most preventable type of cancer, each year cervical cancer kills about 4,000 women in the U.S. and about 300,000 women worldwide. In the United States, cervical cancer mortality rates plunged by 74% from 1955 - 1992 thanks to increased screening and early detection with the Pap test. AGE Fifty percent of cervical cancer diagnoses occur in women ages 35 - 54, and about 20% occur in women over 65 years of age. The median age of diagnosis is 48 years. About 15% of women develop cervical cancer between the ages of 20 - 30. Cervical cancer is extremely rare in women younger than age 20. However, many young women become infected with multiple types of human papilloma virus, which then can increase their risk of getting cervical cancer in the future. Young women with early abnormal changes who do not have regular examinations are at high risk for localized cancer by the time they are age 40, and for invasive cancer by age 50. SOCIOECONOMIC AND ETHNIC FACTORS Although the rate of cervical cancer has declined among both Caucasian and African-American women over the past decades, it remains much more prevalent in African-Americans -- whose death rates are twice as high as Caucasian women. Hispanic American women have more than twice the risk of invasive cervical cancer as Caucasian women, also due to a lower rate of screening. These differences, however, are almost certainly due to social and economic differences. Numerous studies report that high poverty levels are linked with low screening rates. In addition, lack of health insurance, limited transportation, and language difficulties hinder a poor woman’s access to screening services. HIGH SEXUAL ACTIVITY Human papilloma virus (HPV) is the main risk factor for cervical cancer. In adults, the most important risk factor for HPV is sexual activity with an infected person. Women most at risk for cervical cancer are those with a history of multiple sexual partners, sexual intercourse at age 17 years or younger, or both. A woman who has never been sexually active has a very low risk for developing cervical cancer. Sexual activity with multiple partners increases the likelihood of many other sexually transmitted infections (chlamydia, gonorrhea, syphilis).Studies have found an association between chlamydia and cervical cancer risk, including the possibility that chlamydia may prolong HPV infection. FAMILY HISTORY Women have a higher risk of cervical cancer if they have a first-degree relative (mother, sister) who has had cervical cancer. USE OF ORAL CONTRACEPTIVES Studies have reported a strong association between cervical cancer and long-term use of oral contraception (OC). Women who take birth control pills for more than 5 - 10 years appear to have a much higher risk HPV infection (up to four times higher) than those who do not use OCs. (Women taking OCs for fewer than 5 years do not have a significantly higher risk.) The reasons for this risk from OC use are not entirely clear. Women who use OCs may be less likely to use a diaphragm, condoms, or other methods that offer some protection against sexual transmitted diseases, including HPV. Some research also suggests that the hormones in OCs might help the virus enter the genetic material of cervical cells. HAVING MANY CHILDREN Studies indicate that having many children increases the risk for developing cervical cancer, particularly in women infected with HPV. SMOKING Smoking is associated with a higher risk for precancerous changes (dysplasia) in the cervix and for progression to invasive cervical cancer, especially for women infected with HPV. IMMUNOSUPPRESSION Women with weak immune systems, (such as those with HIV / AIDS), are more susceptible to acquiring HPV. Immunocompromised patients are also at higher risk for having cervical precancer develop rapidly into invasive cancer. DIETHYLSTILBESTROL (DES) From 1938 - 1971, diethylstilbestrol (DES), an estrogen-related drug, was widely prescribed to pregnant women to help prevent miscarriages. The daughters of these women face a higher risk for cervical cancer. DES is no longer prsecribed.

  14. Data from: County-level cumulative environmental quality associated with...

    • catalog.data.gov
    • s.cnmilf.com
    Updated Nov 12, 2020
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    U.S. EPA Office of Research and Development (ORD) (2020). County-level cumulative environmental quality associated with cancer incidence. [Dataset]. https://catalog.data.gov/dataset/county-level-cumulative-environmental-quality-associated-with-cancer-incidence
    Explore at:
    Dataset updated
    Nov 12, 2020
    Dataset provided by
    United States Environmental Protection Agencyhttp://www.epa.gov/
    Description

    Population based cancer incidence rates were abstracted from National Cancer Institute, State Cancer Profiles for all available counties in the United States for which data were available. This is a national county-level database of cancer data that are collected by state public health surveillance systems. All-site cancer is defined as any type of cancer that is captured in the state registry data, though non-melanoma skin cancer is not included. All-site age-adjusted cancer incidence rates were abstracted separately for males and females. County-level annual age-adjusted all-site cancer incidence rates for years 2006–2010 were available for 2687 of 3142 (85.5%) counties in the U.S. Counties for which there are fewer than 16 reported cases in a specific area-sex-race category are suppressed to ensure confidentiality and stability of rate estimates; this accounted for 14 counties in our study. Two states, Kansas and Virginia, do not provide data because of state legislation and regulations which prohibit the release of county level data to outside entities. Data from Michigan does not include cases diagnosed in other states because data exchange agreements prohibit the release of data to third parties. Finally, state data is not available for three states, Minnesota, Ohio, and Washington. The age-adjusted average annual incidence rate for all counties was 453.7 per 100,000 persons. We selected 2006–2010 as it is subsequent in time to the EQI exposure data which was constructed to represent the years 2000–2005. We also gathered data for the three leading causes of cancer for males (lung, prostate, and colorectal) and females (lung, breast, and colorectal). The EQI was used as an exposure metric as an indicator of cumulative environmental exposures at the county-level representing the period 2000 to 2005. A complete description of the datasets used in the EQI are provided in Lobdell et al. and methods used for index construction are described by Messer et al. The EQI was developed for the period 2000– 2005 because it was the time period for which the most recent data were available when index construction was initiated. The EQI includes variables representing each of the environmental domains. The air domain includes 87 variables representing criteria and hazardous air pollutants. The water domain includes 80 variables representing overall water quality, general water contamination, recreational water quality, drinking water quality, atmospheric deposition, drought, and chemical contamination. The land domain includes 26 variables representing agriculture, pesticides, contaminants, facilities, and radon. The built domain includes 14 variables representing roads, highway/road safety, public transit behavior, business environment, and subsidized housing environment. The sociodemographic environment includes 12 variables representing socioeconomics and crime. This dataset is not publicly accessible because: EPA cannot release personally identifiable information regarding living individuals, according to the Privacy Act and the Freedom of Information Act (FOIA). This dataset contains information about human research subjects. Because there is potential to identify individual participants and disclose personal information, either alone or in combination with other datasets, individual level data are not appropriate to post for public access. Restricted access may be granted to authorized persons by contacting the party listed. It can be accessed through the following means: Human health data are not available publicly. EQI data are available at: https://edg.epa.gov/data/Public/ORD/NHEERL/EQI. Format: Data are stored as csv files. This dataset is associated with the following publication: Jagai, J., L. Messer, K. Rappazzo , C. Gray, S. Grabich , and D. Lobdell. County-level environmental quality and associations with cancer incidence#. Cancer. John Wiley & Sons Incorporated, New York, NY, USA, 123(15): 2901-2908, (2017).

  15. a

    5 Year Male Cancer Incidence MSSA

    • usc-geohealth-hub-uscssi.hub.arcgis.com
    • uscssi.hub.arcgis.com
    Updated Nov 10, 2021
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Spatial Sciences Institute (2021). 5 Year Male Cancer Incidence MSSA [Dataset]. https://usc-geohealth-hub-uscssi.hub.arcgis.com/datasets/5-year-male-cancer-incidence-mssa
    Explore at:
    Dataset updated
    Nov 10, 2021
    Dataset authored and provided by
    Spatial Sciences Institute
    Area covered
    Description

    Medical Service Study Areas (MSSAs)As defined by California's Office of Statewide Health Planning and Development (OSHPD) in 2013, "MSSAs are sub-city and sub-county geographical units used to organize and display population, demographic and physician data" (Source). Each census tract in CA is assigned to a given MSSA. The most recent MSSA dataset (2014) was used. Spatial data are available via OSHPD at the California Open Data Portal. This information may be useful in studying health equity.Age-Adjusted Incidence Rate (AAIR)Age-adjustment is a statistical method that allows comparisons of incidence rates to be made between populations with different age distributions. This is important since the incidence of most cancers increases with age. An age-adjusted cancer incidence (or death) rate is defined as the number of new cancers (or deaths) per 100,000 population that would occur in a certain period of time if that population had a 'standard' age distribution. In the California Health Maps, incidence rates are age-adjusted using the U.S. 2000 Standard Population.

    Cancer incidence rates

    Incidence rates were calculated using case counts from the California Cancer Registry. Population data from 2010 Census and SEER 2015 census tract estimates by race/origin (controlling to Vintage 2015) were used to estimate population denominators. Yearly SEER 2015 census tract estimates by race/origin (controlling to Vintage 2015) were used to estimate population denominators for 5-year incidence rates (2013-2017)According to California Department of Public Health guidelines, cancer incidence rates cannot be reported if based on <15 cancer cases and/or a population <10,000 to ensure confidentiality and stable statistical rates.Spatial extent: CaliforniaSpatial Unit: MSSACreated: n/aUpdated: n/aSource: California Health MapsContact Email: gbacr@ucsf.eduSource Link: https://www.californiahealthmaps.org/?areatype=mssa&address=&sex=Both&site=AllSite&race=&year=05yr&overlays=none&choropleth=Obesity

  16. o

    Most Fatal Cancers in South Africa - Dataset - openAFRICA

    • open.africa
    Updated Oct 22, 2015
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    (2015). Most Fatal Cancers in South Africa - Dataset - openAFRICA [Dataset]. https://open.africa/dataset/most-fatal-cancers-in-south-africa
    Explore at:
    Dataset updated
    Oct 22, 2015
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    South Africa
    Description

    Two datasets that explore causes of death due to cancer in South Africa, drawing on data from the Revised Burden of Disease estimates for the Comparative Risk Factor Assessment for South Africa, 2000. The number and percentage of deaths due to cancer by cause are ranked for persons, males and females in the tables below. Lung cancer is the leading cause of cancer in SA accounting for 17% of all cancer deaths. This is followed by oesophagus Ca which accounts for 13%, cervix cancer accounting for 8%, breast cancer accounting for 8% and liver cancer which accounts for 6% of all cancers. Many more males suffer from lung and oesophagus cancer than females.

  17. Lung Cancer Death Rate (per 100,000), New Jersey, by year: Beginning 2010

    • healthdata.nj.gov
    • data.wu.ac.at
    application/rdfxml +5
    Updated Dec 8, 2020
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Death Certificate Database, Office of Vital Statistics and Registry, New Jersey Department of Health (2020). Lung Cancer Death Rate (per 100,000), New Jersey, by year: Beginning 2010 [Dataset]. https://healthdata.nj.gov/dataset/Lung-Cancer-Death-Rate-per-100-000-New-Jersey-by-y/ia77-ctqr
    Explore at:
    csv, tsv, application/rdfxml, application/rssxml, xml, jsonAvailable download formats
    Dataset updated
    Dec 8, 2020
    Dataset provided by
    New Jersey Department of Healthhttps://www.nj.gov/health/
    Authors
    Death Certificate Database, Office of Vital Statistics and Registry, New Jersey Department of Health
    Area covered
    New Jersey
    Description

    Rate: Number of deaths due to cancer of the trachea, bronchus, and lung per 100,000 Population.

    Definition: Number of deaths per 100,000 with malignant neoplasm (cancer) cancer of the trachea, bronchus, and lung as the underlying cause (ICD-10 codes: C33-C34).

    Data Sources:

    (1) Centers for Disease Control and Prevention, National Center for Health Statistics. Compressed Mortality File. CDC WONDER On-line Database accessed at http://wonder.cdc.gov/cmf-icd10.html

    (2) Death Certificate Database, Office of Vital Statistics and Registry, New Jersey Department of Health

    (3) Population Estimates, State Data Center, New Jersey Department of Labor and Workforce Development

  18. a

    NCI State Late Stage Breast Cancer Incidence Rates

    • hub.arcgis.com
    Updated Jan 21, 2020
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    National Cancer Institute (2020). NCI State Late Stage Breast Cancer Incidence Rates [Dataset]. https://hub.arcgis.com/datasets/9dd0d923f8034cc8806173fdc224777d
    Explore at:
    Dataset updated
    Jan 21, 2020
    Dataset authored and provided by
    National Cancer Institute
    License

    MIT Licensehttps://opensource.org/licenses/MIT
    License information was derived automatically

    Area covered
    Description

    This dataset contains Cancer Incidence data for Breast Cancer (Late Stage^) including: Age-Adjusted Rate, Confidence Interval, Average Annual Count, and Trend field information for US States for the average 5 year span from 2016 to 2020.Data are for females segmented by age (All Ages, Ages Under 50, Ages 50 & Over, Ages Under 65, and Ages 65 & Over), with field names and aliases describing the sex and age group tabulated.For more information, visit statecancerprofiles.cancer.govData NotationsState Cancer Registries may provide more current or more local data.TrendRising when 95% confidence interval of average annual percent change is above 0.Stable when 95% confidence interval of average annual percent change includes 0.Falling when 95% confidence interval of average annual percent change is below 0.† Incidence rates (cases per 100,000 population per year) are age-adjusted to the 2000 US standard population (19 age groups: <1, 1-4, 5-9, ... , 80-84, 85+). Rates are for invasive cancer only (except for bladder cancer which is invasive and in situ) or unless otherwise specified. Rates calculated using SEER*Stat. Population counts for denominators are based on Census populations as modified by NCI. The US Population Data File is used for SEER and NPCR incidence rates.‡ Incidence Trend data come from different sources. Due to different years of data availability, most of the trends are AAPCs based on APCs but some are APCs calculated in SEER*Stat. Please refer to the source for each area for additional information.Rates and trends are computed using different standards for malignancy. For more information see malignant.^ Late Stage is defined as cases determined to be regional or distant. Due to changes in stage coding, Combined Summary Stage (2004+) is used for data from Surveillance, Epidemiology, and End Results (SEER) databases and Merged Summary Stage is used for data from National Program of Cancer Registries databases. Due to the increased complexity with staging, other staging variables maybe used if necessary.Data Source Field Key(1) Source: National Program of Cancer Registries and Surveillance, Epidemiology, and End Results SEER*Stat Database - United States Department of Health and Human Services, Centers for Disease Control and Prevention and National Cancer Institute. Based on the 2022 submission.(5) Source: National Program of Cancer Registries and Surveillance, Epidemiology, and End Results SEER*Stat Database - United States Department of Health and Human Services, Centers for Disease Control and Prevention and National Cancer Institute. Based on the 2022 submission.(6) Source: National Program of Cancer Registries SEER*Stat Database - United States Department of Health and Human Services, Centers for Disease Control and Prevention (based on the 2022 submission).(7) Source: SEER November 2022 submission.(8) Source: Incidence data provided by the SEER Program. AAPCs are calculated by the Joinpoint Regression Program and are based on APCs. Data are age-adjusted to the 2000 US standard population (19 age groups: <1, 1-4, 5-9, ... , 80-84,85+). Rates are for invasive cancer only (except for bladder cancer which is invasive and in situ) or unless otherwise specified. Population counts for denominators are based on Census populations as modified by NCI. The US Population Data File is used with SEER November 2022 data.Some data are not available, see Data Not Available for combinations of geography, cancer site, age, and race/ethnicity.Data for the United States does not include data from Nevada.Data for the United States does not include Puerto Rico.

  19. Cancer survival in England - adults diagnosed

    • ons.gov.uk
    • cy.ons.gov.uk
    xlsx
    Updated Aug 12, 2019
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Office for National Statistics (2019). Cancer survival in England - adults diagnosed [Dataset]. https://www.ons.gov.uk/peoplepopulationandcommunity/healthandsocialcare/conditionsanddiseases/datasets/cancersurvivalratescancersurvivalinenglandadultsdiagnosed
    Explore at:
    xlsxAvailable download formats
    Dataset updated
    Aug 12, 2019
    Dataset provided by
    Office for National Statisticshttp://www.ons.gov.uk/
    License

    Open Government Licence 3.0http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/
    License information was derived automatically

    Description

    One-year and five-year net survival for adults (15-99) in England diagnosed with one of 29 common cancers, by age and sex.

  20. o

    Synthetic Oral Cancer Prediction Dataset

    • opendatabay.com
    .undefined
    Updated Jun 26, 2025
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Opendatabay Labs (2025). Synthetic Oral Cancer Prediction Dataset [Dataset]. https://www.opendatabay.com/data/synthetic/09f348fc-a2e8-4132-9f1b-195765d80afc
    Explore at:
    .undefinedAvailable download formats
    Dataset updated
    Jun 26, 2025
    Dataset authored and provided by
    Opendatabay Labs
    License

    CC0 1.0 Universal Public Domain Dedicationhttps://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Area covered
    Patient Health Records & Digital Health
    Description

    The Synthetic Oral Cancer Prediction Dataset is designed for educational and research purposes to analyse factors associated with oral cancer risk, progression, and treatment outcomes. The dataset includes anonymised, synthetic data on various clinical, lifestyle, and demographic factors for individuals diagnosed with oral cancer.

    Dataset Features

    • ID: Unique identifier for each participant.
    • Country: Country of residence of the participant.
    • Age: Age of the participant (in years).
    • Gender: Gender of the participant (Male/Female).
    • Tobacco Use: History of tobacco use (Yes/No).
    • Alcohol Consumption: History of alcohol consumption (Yes/No).
    • HPV Infection: Presence of human papillomavirus infection (Yes/No).
    • Betel Quid Use: History of Betel quid use (Yes/No).
    • Chronic Sun Exposure: History of chronic sun exposure (Yes/No).
    • Poor Oral Hygiene: Poor oral hygiene habits (Yes/No).
    • Diet (Fruits & Vegetables Intake): Frequency of consuming fruits and vegetables (Yes/No).
    • Family History of Cancer: Family history of cancer (Yes/No).
    • Compromised Immune System: Whether the participant has a compromised immune system (Yes/No).
    • Oral Lesions: Presence of oral lesions (Yes/No).
    • Unexplained Bleeding: Presence of unexplained bleeding (Yes/No).
    • Difficulty Swallowing: Difficulty in swallowing (Yes/No).
    • White or Red Patches in Mouth: Presence of white or red patches in the mouth (Yes/No).
    • Tumor Size (cm): Size of the tumor in centimeters.
    • Cancer Stage: Stage of the oral cancer (1-4).
    • Treatment Type: Type of treatment received (e.g., Surgery, Radiation, Chemotherapy).
    • Survival Rate (5-Year, %): 5-year survival rate in percentage.
    • Cost of Treatment (USD): Total cost of treatment in USD.
    • Economic Burden (Lost Workdays per Year): Economic burden due to lost workdays each year.
    • Early Diagnosis: Whether early diagnosis was made (Yes/No).
    • Oral Cancer (Diagnosis): Diagnosis of oral cancer (Yes/No).

    Distribution

    https://storage.googleapis.com/opendatabay_public/09f348fc-a2e8-4132-9f1b-195765d80afc/622bf59174d1_plot_output.png" alt="Synthetic oral cancer dataset plot_output.png">

    Usage

    This dataset can be used for the following applications:

    • Cancer Research: Investigate the relationship between various lifestyle, clinical, and demographic factors with oral cancer risk and progression.
    • Predictive Modeling: Build machine learning models to predict cancer diagnosis, survival rate, or treatment outcomes based on participant data.
    • Healthcare and Public Health: Study the impact of lifestyle factors (e.g., tobacco, alcohol, diet) on the development and progression of oral cancer.
    • Educational Purposes: Provide a dataset for students and researchers in oncology, medical data science, and public health fields to analyze cancer risk factors and treatment outcomes.

    Coverage

    This synthetic dataset is fully anonymized and complies with data privacy standards. It includes a wide array of factors that support diverse research and analysis in the oncology and public health domains.

    License

    CC0 (Public Domain)

    Who Can Use It

    • Cancer Researchers: To explore correlations between lifestyle factors, clinical features, and treatment outcomes in oral cancer.
    • Oncologists and Healthcare Providers: To analyze the effectiveness of different treatments and factors that affect prognosis and survival.
    • Public Health Professionals: To study the broader societal and economic impacts of oral cancer and develop preventive measures.
    • Data Scientists and Machine Learning Practitioners: To develop predictive models for diagnosing oral cancer and improving treatment planning.
    • Educators and Students: As a resource for studying cancer risk analysis, healthcare data science, and public health analytics.
Share
FacebookFacebook
TwitterTwitter
Email
Click to copy link
Link copied
Close
Cite
The Devastator (2023). County Cancer Death Rates [Dataset]. https://www.kaggle.com/datasets/thedevastator/county-cancer-death-rates
Organization logo

County Cancer Death Rates

County-level cancer death rates with related variables

Explore at:
CroissantCroissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
Dataset updated
Dec 3, 2023
Dataset provided by
Kaggle
Authors
The Devastator
Description

County Cancer Death Rates

County-level cancer death rates with related variables

By Noah Rippner [source]

About this dataset

This dataset provides comprehensive information on county-level cancer death and incidence rates, as well as various related variables. It includes data on age-adjusted death rates, average deaths per year, recent trends in cancer death rates, recent 5-year trends in death rates, and average annual counts of cancer deaths or incidence. The dataset also includes the federal information processing standards (FIPS) codes for each county.

Additionally, the dataset indicates whether each county met the objective of a targeted death rate of 45.5. The recent trend in cancer deaths or incidence is also captured for analysis purposes.

The purpose of the death.csv file within this dataset is to offer detailed information specifically concerning county-level cancer death rates and related variables. On the other hand, the incd.csv file contains data on county-level cancer incidence rates and additional relevant variables.

To provide more context and understanding about the included data points, there is a separate file named cancer_data_notes.csv. This file serves to provide informative notes and explanations regarding the various aspects of the cancer data used in this dataset.

Please note that this particular description provides an overview for a linear regression walkthrough using this dataset based on Python programming language. It highlights how to source and import the data properly before moving into data preparation steps such as exploratory analysis. The walkthrough further covers model selection and important model diagnostics measures.

It's essential to bear in mind that this example serves as an initial attempt at creating a multivariate Ordinary Least Squares regression model using these datasets from various sources like cancer.gov along with US Census American Community Survey data. This baseline model allows easy comparisons with future iterations intended for improvements or refinements.

Important columns found within this extensively documented Kaggle dataset include County names along with their corresponding FIPS codes—a standardized coding system by Federal Information Processing Standards (FIPS). Moreover,Met Objective of 45.5? (1) column denotes whether a specific county achieved the targeted objective of a death rate of 45.5 or not.

Overall, this dataset aims to offer valuable insights into county-level cancer death and incidence rates across various regions, providing policymakers, researchers, and healthcare professionals with essential information for analysis and decision-making purposes

How to use the dataset

  • Familiarize Yourself with the Columns:

    • County: The name of the county.
    • FIPS: The Federal Information Processing Standards code for the county.
    • Met Objective of 45.5? (1): Indicates whether the county met the objective of a death rate of 45.5 (Boolean).
    • Age-Adjusted Death Rate: The age-adjusted death rate for cancer in the county.
    • Average Deaths per Year: The average number of deaths per year due to cancer in the county.
    • Recent Trend (2): The recent trend in cancer death rates/incidence in the county.
    • Recent 5-Year Trend (2) in Death Rates: The recent 5-year trend in cancer death rates/incidence in the county.
    • Average Annual Count: The average annual count of cancer deaths/incidence in the county.
  • Determine Counties Meeting Objective: Use this dataset to identify counties that have met or not met an objective death rate threshold of 45.5%. Look for entries where Met Objective of 45.5? (1) is marked as True or False.

  • Analyze Age-Adjusted Death Rates: Study and compare age-adjusted death rates across different counties using Age-Adjusted Death Rate values provided as floats.

  • Explore Average Deaths per Year: Examine and compare average annual counts and trends regarding deaths caused by cancer, using Average Deaths per Year as a reference point.

  • Investigate Recent Trends: Assess recent trends related to cancer deaths or incidence by analyzing data under columns such as Recent Trend, Recent Trend (2), and Recent 5-Year Trend (2) in Death Rates. These columns provide information on how cancer death rates/incidence have changed over time.

  • Compare Counties: Utilize this dataset to compare counties based on their cancer death rates and related variables. Identify counties with lower or higher average annual counts, age-adjusted death rates, or recent trends to analyze and understand the factors contributing ...

Search
Clear search
Close search
Google apps
Main menu