100+ datasets found
  1. Death in the United States

    • kaggle.com
    zip
    Updated Aug 3, 2017
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Centers for Disease Control and Prevention (2017). Death in the United States [Dataset]. https://www.kaggle.com/datasets/cdc/mortality
    Explore at:
    zip(766333584 bytes)Available download formats
    Dataset updated
    Aug 3, 2017
    Dataset authored and provided by
    Centers for Disease Control and Preventionhttp://www.cdc.gov/
    License

    https://creativecommons.org/publicdomain/zero/1.0/https://creativecommons.org/publicdomain/zero/1.0/

    Area covered
    United States
    Description

    Every year the CDC releases the country’s most detailed report on death in the United States under the National Vital Statistics Systems. This mortality dataset is a record of every death in the country for 2005 through 2015, including detailed information about causes of death and the demographic background of the deceased.

    It's been said that "statistics are human beings with the tears wiped off." This is especially true with this dataset. Each death record represents somebody's loved one, often connected with a lifetime of memories and sometimes tragically too short.

    Putting the sensitive nature of the topic aside, analyzing mortality data is essential to understanding the complex circumstances of death across the country. The US Government uses this data to determine life expectancy and understand how death in the U.S. differs from the rest of the world. Whether you’re looking for macro trends or analyzing unique circumstances, we challenge you to use this dataset to find your own answers to one of life’s great mysteries.

    Overview

    This dataset is a collection of CSV files each containing one year's worth of data and paired JSON files containing the code mappings, plus an ICD 10 code set. The CSVs were reformatted from their original fixed-width file formats using information extracted from the CDC's PDF manuals using this script. Please note that this process may have introduced errors as the text extracted from the pdf is not a perfect match. If you have any questions or find errors in the preparation process, please leave a note in the forums. We hope to publish additional years of data using this method soon.

    A more detailed overview of the data can be found here. You'll find that the fields are consistent within this time window, but some of data codes change every few years. For example, the 113_cause_recode entry 069 only covers ICD codes (I10,I12) in 2005, but by 2015 it covers (I10,I12,I15). When I post data from years prior to 2005, expect some of the fields themselves to change as well.

    All data comes from the CDC’s National Vital Statistics Systems, with the exception of the Icd10Code, which are sourced from the World Health Organization.

    Project ideas

    • The CDC's mortality data was the basis of a widely publicized paper, by Anne Case and Nobel prize winner Angus Deaton, arguing that middle-aged whites are dying at elevated rates. One of the criticisms against the paper is that it failed to properly account for the exact ages within the broad bins available through the CDC's WONDER tool. What do these results look like with exact/not-binned age data?
    • Similarly, how sensitive are the mortality trends being discussed in the news to the choice of bin-widths?
    • As noted above, the data preparation process could have introduced errors. Can you find any discrepancies compared to the aggregate metrics on WONDER? If so, please let me know in the forums!
    • WONDER is cited in numerous economics, sociology, and public health research papers. Can you find any papers whose conclusions would be altered if they used the exact data available here rather than binned data from Wonder?

    Differences from the first version of the dataset

    • This version of the dataset was prepared in a completely different many. This has allowed us to provide a much larger volume of data and ensure that codes are available for every field.
    • We've replaced the batch of sql files with a single JSON per year. Kaggle's platform currently offer's better support for JSON files, and this keeps the number of files manageable.
    • A tutorial kernel providing a quick introduction to the new format is available here.
    • Lastly, I apologize if the transition has interrupted anyone's work! If need be, you can still download v1.
  2. Death Profiles by County

    • data.ca.gov
    • data.chhs.ca.gov
    • +3more
    csv, zip
    Updated Aug 22, 2025
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    California Department of Public Health (2025). Death Profiles by County [Dataset]. https://data.ca.gov/dataset/death-profiles-by-county
    Explore at:
    zip, csvAvailable download formats
    Dataset updated
    Aug 22, 2025
    Dataset authored and provided by
    California Department of Public Healthhttps://www.cdph.ca.gov/
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset contains counts of deaths for California counties based on information entered on death certificates. Final counts are derived from static data and include out-of-state deaths to California residents, whereas provisional counts are derived from incomplete and dynamic data. Provisional counts are based on the records available when the data was retrieved and may not represent all deaths that occurred during the time period. Deaths involving injuries from external or environmental forces, such as accidents, homicide and suicide, often require additional investigation that tends to delay certification of the cause and manner of death. This can result in significant under-reporting of these deaths in provisional data.

    The final data tables include both deaths that occurred in each California county regardless of the place of residence (by occurrence) and deaths to residents of each California county (by residence), whereas the provisional data table only includes deaths that occurred in each county regardless of the place of residence (by occurrence). The data are reported as totals, as well as stratified by age, gender, race-ethnicity, and death place type. Deaths due to all causes (ALL) and selected underlying cause of death categories are provided. See temporal coverage for more information on which combinations are available for which years.

    The cause of death categories are based solely on the underlying cause of death as coded by the International Classification of Diseases. The underlying cause of death is defined by the World Health Organization (WHO) as "the disease or injury which initiated the train of events leading directly to death, or the circumstances of the accident or violence which produced the fatal injury." It is a single value assigned to each death based on the details as entered on the death certificate. When more than one cause is listed, the order in which they are listed can affect which cause is coded as the underlying cause. This means that similar events could be coded with different underlying causes of death depending on variations in how they were entered. Consequently, while underlying cause of death provides a convenient comparison between cause of death categories, it may not capture the full impact of each cause of death as it does not always take into account all conditions contributing to the death.

  3. Statewide Death Profiles

    • data.chhs.ca.gov
    • data.ca.gov
    • +3more
    csv, zip
    Updated Aug 22, 2025
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    California Department of Public Health (2025). Statewide Death Profiles [Dataset]. https://data.chhs.ca.gov/dataset/statewide-death-profiles
    Explore at:
    csv(5401561), csv(2026589), csv(463460), csv(5034), csv(16301), csv(200270), csv(164006), csv(419332), csv(4689434), zip, csv(406971)Available download formats
    Dataset updated
    Aug 22, 2025
    Dataset authored and provided by
    California Department of Public Healthhttps://www.cdph.ca.gov/
    Description

    This dataset contains counts of deaths for California as a whole based on information entered on death certificates. Final counts are derived from static data and include out-of-state deaths to California residents, whereas provisional counts are derived from incomplete and dynamic data. Provisional counts are based on the records available when the data was retrieved and may not represent all deaths that occurred during the time period. Deaths involving injuries from external or environmental forces, such as accidents, homicide and suicide, often require additional investigation that tends to delay certification of the cause and manner of death. This can result in significant under-reporting of these deaths in provisional data.

    The final data tables include both deaths that occurred in California regardless of the place of residence (by occurrence) and deaths to California residents (by residence), whereas the provisional data table only includes deaths that occurred in California regardless of the place of residence (by occurrence). The data are reported as totals, as well as stratified by age, gender, race-ethnicity, and death place type. Deaths due to all causes (ALL) and selected underlying cause of death categories are provided. See temporal coverage for more information on which combinations are available for which years.

    The cause of death categories are based solely on the underlying cause of death as coded by the International Classification of Diseases. The underlying cause of death is defined by the World Health Organization (WHO) as "the disease or injury which initiated the train of events leading directly to death, or the circumstances of the accident or violence which produced the fatal injury." It is a single value assigned to each death based on the details as entered on the death certificate. When more than one cause is listed, the order in which they are listed can affect which cause is coded as the underlying cause. This means that similar events could be coded with different underlying causes of death depending on variations in how they were entered. Consequently, while underlying cause of death provides a convenient comparison between cause of death categories, it may not capture the full impact of each cause of death as it does not always take into account all conditions contributing to the death.

  4. Deaths registered by single year of age, UK

    • ons.gov.uk
    • cy.ons.gov.uk
    xlsx
    Updated Jan 18, 2022
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Office for National Statistics (2022). Deaths registered by single year of age, UK [Dataset]. https://www.ons.gov.uk/peoplepopulationandcommunity/birthsdeathsandmarriages/deaths/datasets/deathregistrationssummarytablesenglandandwalesdeathsbysingleyearofagetables
    Explore at:
    xlsxAvailable download formats
    Dataset updated
    Jan 18, 2022
    Dataset provided by
    Office for National Statisticshttp://www.ons.gov.uk/
    License

    Open Government Licence 3.0http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/
    License information was derived automatically

    Description

    Annual data on death registrations by single year of age for the UK (1974 onwards) and England and Wales (1963 onwards).

  5. Provisional COVID-19 death counts, rates, and percent of total deaths, by...

    • catalog.data.gov
    • data.virginia.gov
    • +2more
    Updated Sep 26, 2025
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Centers for Disease Control and Prevention (2025). Provisional COVID-19 death counts, rates, and percent of total deaths, by jurisdiction of residence [Dataset]. https://catalog.data.gov/dataset/provisional-covid-19-death-counts-rates-and-percent-of-total-deaths-by-jurisdiction-of-res
    Explore at:
    Dataset updated
    Sep 26, 2025
    Dataset provided by
    Centers for Disease Control and Preventionhttp://www.cdc.gov/
    Description

    This file contains COVID-19 death counts, death rates, and percent of total deaths by jurisdiction of residence. The data is grouped by different time periods including 3-month period, weekly, and total (cumulative since January 1, 2020). United States death counts and rates include the 50 states, plus the District of Columbia and New York City. New York state estimates exclude New York City. Puerto Rico is included in HHS Region 2 estimates. Deaths with confirmed or presumed COVID-19, coded to ICD–10 code U07.1. Number of deaths reported in this file are the total number of COVID-19 deaths received and coded as of the date of analysis and may not represent all deaths that occurred in that period. Counts of deaths occurring before or after the reporting period are not included in the file. Data during recent periods are incomplete because of the lag in time between when the death occurred and when the death certificate is completed, submitted to NCHS and processed for reporting purposes. This delay can range from 1 week to 8 weeks or more, depending on the jurisdiction and cause of death. Death counts should not be compared across states. Data timeliness varies by state. Some states report deaths on a daily basis, while other states report deaths weekly or monthly. The ten (10) United States Department of Health and Human Services (HHS) regions include the following jurisdictions. Region 1: Connecticut, Maine, Massachusetts, New Hampshire, Rhode Island, Vermont; Region 2: New Jersey, New York, New York City, Puerto Rico; Region 3: Delaware, District of Columbia, Maryland, Pennsylvania, Virginia, West Virginia; Region 4: Alabama, Florida, Georgia, Kentucky, Mississippi, North Carolina, South Carolina, Tennessee; Region 5: Illinois, Indiana, Michigan, Minnesota, Ohio, Wisconsin; Region 6: Arkansas, Louisiana, New Mexico, Oklahoma, Texas; Region 7: Iowa, Kansas, Missouri, Nebraska; Region 8: Colorado, Montana, North Dakota, South Dakota, Utah, Wyoming; Region 9: Arizona, California, Hawaii, Nevada; Region 10: Alaska, Idaho, Oregon, Washington. Rates were calculated using the population estimates for 2021, which are estimated as of July 1, 2021 based on the Blended Base produced by the US Census Bureau in lieu of the April 1, 2020 decennial population count. The Blended Base consists of the blend of Vintage 2020 postcensal population estimates, 2020 Demographic Analysis Estimates, and 2020 Census PL 94-171 Redistricting File (see https://www2.census.gov/programs-surveys/popest/technical-documentation/methodology/2020-2021/methods-statement-v2021.pdf). Rates are based on deaths occurring in the specified week/month and are age-adjusted to the 2000 standard population using the direct method (see https://www.cdc.gov/nchs/data/nvsr/nvsr70/nvsr70-08-508.pdf). These rates differ from annual age-adjusted rates, typically presented in NCHS publications based on a full year of data and annualized weekly/monthly age-adjusted rates which have been adjusted to allow comparison with annual rates. Annualization rates presents deaths per year per 100,000 population that would be expected in a year if the observed period specific (weekly/monthly) rate prevailed for a full year. Sub-national death counts between 1-9 are suppressed in accordance with NCHS data confidentiality standards. Rates based on death counts less than 20 are suppressed in accordance with NCHS standards of reliability as specified in NCHS Data Presentation Standards for Proportions (available from: https://www.cdc.gov/nchs/data/series/sr_02/sr02_175.pdf.).

  6. Effect of suicide rates on life expectancy dataset

    • zenodo.org
    • data.niaid.nih.gov
    csv
    Updated Apr 16, 2021
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Filip Zoubek; Filip Zoubek (2021). Effect of suicide rates on life expectancy dataset [Dataset]. http://doi.org/10.5281/zenodo.4694270
    Explore at:
    csvAvailable download formats
    Dataset updated
    Apr 16, 2021
    Dataset provided by
    Zenodohttp://zenodo.org/
    Authors
    Filip Zoubek; Filip Zoubek
    License

    Attribution-NonCommercial-ShareAlike 3.0 (CC BY-NC-SA 3.0)https://creativecommons.org/licenses/by-nc-sa/3.0/
    License information was derived automatically

    Description

    Effect of suicide rates on life expectancy dataset

    Abstract
    In 2015, approximately 55 million people died worldwide, of which 8 million committed suicide. In the USA, one of the main causes of death is the aforementioned suicide, therefore, this experiment is dealing with the question of how much suicide rates affects the statistics of average life expectancy.
    The experiment takes two datasets, one with the number of suicides and life expectancy in the second one and combine data into one dataset. Subsequently, I try to find any patterns and correlations among the variables and perform statistical test using simple regression to confirm my assumptions.

    Data

    The experiment uses two datasets - WHO Suicide Statistics[1] and WHO Life Expectancy[2], which were firstly appropriately preprocessed. The final merged dataset to the experiment has 13 variables, where country and year are used as index: Country, Year, Suicides number, Life expectancy, Adult Mortality, which is probability of dying between 15 and 60 years per 1000 population, Infant deaths, which is number of Infant Deaths per 1000 population, Alcohol, which is alcohol, recorded per capita (15+) consumption, Under-five deaths, which is number of under-five deaths per 1000 population, HIV/AIDS, which is deaths per 1 000 live births HIV/AIDS, GDP, which is Gross Domestic Product per capita, Population, Income composition of resources, which is Human Development Index in terms of income composition of resources, and Schooling, which is number of years of schooling.

    LICENSE

    THE EXPERIMENT USES TWO DATASET - WHO SUICIDE STATISTICS AND WHO LIFE EXPECTANCY, WHICH WERE COLLEECTED FROM WHO AND UNITED NATIONS WEBSITE. THEREFORE, ALL DATASETS ARE UNDER THE LICENSE ATTRIBUTION-NONCOMMERCIAL-SHAREALIKE 3.0 IGO (https://creativecommons.org/licenses/by-nc-sa/3.0/igo/).

    [1] https://www.kaggle.com/szamil/who-suicide-statistics

    [2] https://www.kaggle.com/kumarajarshi/life-expectancy-who

  7. Deaths registered weekly in England and Wales, provisional

    • ons.gov.uk
    • cy.ons.gov.uk
    xlsx
    Updated Sep 24, 2025
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Office for National Statistics (2025). Deaths registered weekly in England and Wales, provisional [Dataset]. https://www.ons.gov.uk/peoplepopulationandcommunity/birthsdeathsandmarriages/deaths/datasets/weeklyprovisionalfiguresondeathsregisteredinenglandandwales
    Explore at:
    xlsxAvailable download formats
    Dataset updated
    Sep 24, 2025
    Dataset provided by
    Office for National Statisticshttp://www.ons.gov.uk/
    License

    Open Government Licence 3.0http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/
    License information was derived automatically

    Area covered
    England
    Description

    Provisional counts of the number of deaths registered in England and Wales, by age, sex, region and Index of Multiple Deprivation (IMD), in the latest weeks for which data are available.

  8. TABLE III. Deaths in 122 U.S. cities

    • data.cdc.gov
    • data.virginia.gov
    • +6more
    csv, xlsx, xml
    Updated Oct 6, 2016
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    National Center for Immunization and Respiratory Diseases (NCIRD) (2016). TABLE III. Deaths in 122 U.S. cities [Dataset]. https://data.cdc.gov/dataset/TABLE-III-Deaths-in-122-U-S-cities/rpjd-ejph
    Explore at:
    csv, xlsx, xmlAvailable download formats
    Dataset updated
    Oct 6, 2016
    Dataset provided by
    National Center for Immunization and Respiratory Diseases
    Authors
    National Center for Immunization and Respiratory Diseases (NCIRD)
    Area covered
    United States
    Description

    TABLE III. Deaths in 122 U.S. cities – 2016. 122 Cities Mortality Reporting System — Each week, the vital statistics offices of 122 cities across the United States report the total number of death certificates processed and the number of those for which pneumonia or influenza was listed as the underlying or contributing cause of death by age group (Under 28 days, 28 days –1 year, 1-14 years, 15-24 years, 25-44 years, 45-64 years, 65-74 years, 75-84 years, and ≥ 85 years).

    FOOTNOTE: U: Unavailable. —: No reported cases. * Mortality data in this table are voluntarily reported from 122 cities in the United States, most of which have populations of 100,000 or more. A death is reported by the place of its occurrence and by the week that the death certificate was filed. Fetal deaths are not included.

    † Pneumonia and influenza.

    § Total includes unknown ages.

  9. N

    New York City Leading Causes of Death

    • data.cityofnewyork.us
    • catalog.data.gov
    application/rdfxml +5
    Updated Dec 9, 2024
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Department of Health and Mental Hygiene (DOHMH) (2024). New York City Leading Causes of Death [Dataset]. https://data.cityofnewyork.us/Health/New-York-City-Leading-Causes-of-Death/jb7j-dtam
    Explore at:
    csv, json, tsv, xml, application/rssxml, application/rdfxmlAvailable download formats
    Dataset updated
    Dec 9, 2024
    Dataset authored and provided by
    Department of Health and Mental Hygiene (DOHMH)
    Area covered
    New York
    Description

    The leading causes of death by sex and ethnicity in New York City in since 2007. Cause of death is derived from the NYC death certificate which is issued for every death that occurs in New York City.

    Report last ran: 09/24/2019
    Rates based on small numbers (RSE > 30) as well as aggregate counts less than 5 have been suppressed in downloaded data

    Source: Bureau of Vital Statistics and New York City Department of Health and Mental Hygiene

  10. d

    COVID-19 Cases and Deaths by Race/Ethnicity - ARCHIVE

    • catalog.data.gov
    • data.ct.gov
    Updated Aug 12, 2023
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    data.ct.gov (2023). COVID-19 Cases and Deaths by Race/Ethnicity - ARCHIVE [Dataset]. https://catalog.data.gov/dataset/covid-19-cases-and-deaths-by-race-ethnicity
    Explore at:
    Dataset updated
    Aug 12, 2023
    Dataset provided by
    data.ct.gov
    Description

    Note: DPH is updating and streamlining the COVID-19 cases, deaths, and testing data. As of 6/27/2022, the data will be published in four tables instead of twelve. The COVID-19 Cases, Deaths, and Tests by Day dataset contains cases and test data by date of sample submission. The death data are by date of death. This dataset is updated daily and contains information back to the beginning of the pandemic. The data can be found at https://data.ct.gov/Health-and-Human-Services/COVID-19-Cases-Deaths-and-Tests-by-Day/g9vi-2ahj. The COVID-19 State Metrics dataset contains over 93 columns of data. This dataset is updated daily and currently contains information starting June 21, 2022 to the present. The data can be found at https://data.ct.gov/Health-and-Human-Services/COVID-19-State-Level-Data/qmgw-5kp6 . The COVID-19 County Metrics dataset contains 25 columns of data. This dataset is updated daily and currently contains information starting June 16, 2022 to the present. The data can be found at https://data.ct.gov/Health-and-Human-Services/COVID-19-County-Level-Data/ujiq-dy22 . The COVID-19 Town Metrics dataset contains 16 columns of data. This dataset is updated daily and currently contains information starting June 16, 2022 to the present. The data can be found at https://data.ct.gov/Health-and-Human-Services/COVID-19-Town-Level-Data/icxw-cada . To protect confidentiality, if a town has fewer than 5 cases or positive NAAT tests over the past 7 days, those data will be suppressed. COVID-19 cases and associated deaths that have been reported among Connecticut residents, broken down by race and ethnicity. All data in this report are preliminary; data for previous dates will be updated as new reports are received and data errors are corrected. Deaths reported to the either the Office of the Chief Medical Examiner (OCME) or Department of Public Health (DPH) are included in the COVID-19 update. The following data show the number of COVID-19 cases and associated deaths per 100,000 population by race and ethnicity. Crude rates represent the total cases or deaths per 100,000 people. Age-adjusted rates consider the age of the person at diagnosis or death when estimating the rate and use a standardized population to provide a fair comparison between population groups with different age distributions. Age-adjustment is important in Connecticut as the median age of among the non-Hispanic white population is 47 years, whereas it is 34 years among non-Hispanic blacks, and 29 years among Hispanics. Because most non-Hispanic white residents who died were over 75 years of age, the age-adjusted rates are lower than the unadjusted rates. In contrast, Hispanic residents who died tend to be younger than 75 years of age which results in higher age-adjusted rates. The population data used to calculate rates is based on the CT DPH population statistics for 2019, which is available online here: https://portal.ct.gov/DPH/Health-Information-Systems--Reporting/Population/Population-Statistics. Prior to 5/10/2021, the population estimates from 2018 were used. Rates are standardized to the 2000 US Millions Standard population (data available here: https://seer.cancer.gov/stdpopulations/). Standardization was done using 19 age groups (0, 1-4, 5-9, 10-14, ..., 80-84, 85 years and older). More information about direct standardization for age adjustment is available here: https://www.cdc.gov/nchs/data/statnt/statnt06rv.pdf Categories are mutually exclusive. The category “multiracial” includes people who answered ‘yes’ to more than one race category. Counts may not add up to total case counts as data on race and ethnicity may be missing. Age adjusted rates calculated only for groups with more than 20 deaths. Abbreviation: NH=Non-Hispanic. Data on Connecticut deaths were obtained from the Connecticut Deaths Registry maintained by the DPH Office of Vital Records. Cause of death was determined by a death certifier (e.g., physician, APRN, medical

  11. Provisional COVID-19 death counts and rates by month, jurisdiction of...

    • data.virginia.gov
    • healthdata.gov
    • +3more
    csv, json, rdf, xsl
    Updated Sep 25, 2025
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Centers for Disease Control and Prevention (2025). Provisional COVID-19 death counts and rates by month, jurisdiction of residence, and demographic characteristics [Dataset]. https://data.virginia.gov/dataset/provisional-covid-19-death-counts-and-rates-by-month-jurisdiction-of-residence-and-demographic-
    Explore at:
    json, csv, xsl, rdfAvailable download formats
    Dataset updated
    Sep 25, 2025
    Dataset provided by
    Centers for Disease Control and Preventionhttp://www.cdc.gov/
    Description

    This file contains COVID-19 death counts and rates by month and year of death, jurisdiction of residence (U.S., HHS Region) and demographic characteristics (sex, age, race and Hispanic origin, and age/race and Hispanic origin). United States death counts and rates include the 50 states, plus the District of Columbia.

    Deaths with confirmed or presumed COVID-19, coded to ICD–10 code U07.1. Number of deaths reported in this file are the total number of COVID-19 deaths received and coded as of the date of analysis and may not represent all deaths that occurred in that period. Counts of deaths occurring before or after the reporting period are not included in the file.

    Data during recent periods are incomplete because of the lag in time between when the death occurred and when the death certificate is completed, submitted to NCHS and processed for reporting purposes. This delay can range from 1 week to 8 weeks or more, depending on the jurisdiction and cause of death.

    Death counts should not be compared across jurisdictions. Data timeliness varies by state. Some states report deaths on a daily basis, while other states report deaths weekly or monthly.

    The ten (10) United States Department of Health and Human Services (HHS) regions include the following jurisdictions. Region 1: Connecticut, Maine, Massachusetts, New Hampshire, Rhode Island, Vermont; Region 2: New Jersey, New York; Region 3: Delaware, District of Columbia, Maryland, Pennsylvania, Virginia, West Virginia; Region 4: Alabama, Florida, Georgia, Kentucky, Mississippi, North Carolina, South Carolina, Tennessee; Region 5: Illinois, Indiana, Michigan, Minnesota, Ohio, Wisconsin; Region 6: Arkansas, Louisiana, New Mexico, Oklahoma, Texas; Region 7: Iowa, Kansas, Missouri, Nebraska; Region 8: Colorado, Montana, North Dakota, South Dakota, Utah, Wyoming; Region 9: Arizona, California, Hawaii, Nevada; Region 10: Alaska, Idaho, Oregon, Washington.

    Rates were calculated using the population estimates for 2021, which are estimated as of July 1, 2021 based on the Blended Base produced by the US Census Bureau in lieu of the April 1, 2020 decennial population count. The Blended Base consists of the blend of Vintage 2020 postcensal population estimates, 2020 Demographic Analysis Estimates, and 2020 Census PL 94-171 Redistricting File (see https://www2.census.gov/programs-surveys/popest/technical-documentation/methodology/2020-2021/methods-statement-v2021.pdf).

    Rate are based on deaths occurring in the specified week and are age-adjusted to the 2000 standard population using the direct method (see https://www.cdc.gov/nchs/data/nvsr/nvsr70/nvsr70-08-508.pdf). These rates differ from annual age-adjusted rates, typically presented in NCHS publications based on a full year of data and annualized weekly age-adjusted rates which have been adjusted to allow comparison with annual rates. Annualization rates presents deaths per year per 100,000 population that would be expected in a year if the observed period specific (weekly) rate prevailed for a full year.

    Sub-national death counts between 1-9 are suppressed in accordance with NCHS data confidentiality standards. Rates based on death counts less than 20 are suppressed in accordance with NCHS standards of reliability as specified in NCHS Data Presentation Standards for Proportions (available from: https://www.cdc.gov/nchs/data/series/sr_02/sr02_175.pdf.).

  12. Deaths, by month

    • www150.statcan.gc.ca
    • gimi9.com
    • +3more
    Updated Feb 19, 2025
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Government of Canada, Statistics Canada (2025). Deaths, by month [Dataset]. http://doi.org/10.25318/1310070801-eng
    Explore at:
    Dataset updated
    Feb 19, 2025
    Dataset provided by
    Government of Canadahttp://www.gg.ca/
    Statistics Canadahttps://statcan.gc.ca/en
    Area covered
    Canada
    Description

    Number and percentage of deaths, by month and place of residence, 1991 to most recent year.

  13. T

    CORONAVIRUS DEATHS by Country Dataset

    • tradingeconomics.com
    csv, excel, json, xml
    Updated Mar 4, 2020
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    TRADING ECONOMICS (2020). CORONAVIRUS DEATHS by Country Dataset [Dataset]. https://tradingeconomics.com/country-list/coronavirus-deaths
    Explore at:
    csv, excel, xml, jsonAvailable download formats
    Dataset updated
    Mar 4, 2020
    Dataset authored and provided by
    TRADING ECONOMICS
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    2025
    Area covered
    World
    Description

    This dataset provides values for CORONAVIRUS DEATHS reported in several countries. The data includes current values, previous releases, historical highs and record lows, release frequency, reported unit and currency.

  14. Deaths from Suicide - Dataset - data.gov.uk

    • ckan.publishing.service.gov.uk
    Updated Jul 11, 2017
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    ckan.publishing.service.gov.uk (2017). Deaths from Suicide - Dataset - data.gov.uk [Dataset]. https://ckan.publishing.service.gov.uk/dataset/deaths-from-suicide
    Explore at:
    Dataset updated
    Jul 11, 2017
    Dataset provided by
    CKANhttps://ckan.org/
    License

    Open Government Licence 3.0http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/
    License information was derived automatically

    Description

    This data shows deaths (of people age 10 and over) from Suicide and Undetermined Injury, numbers and rates by gender, as 3-year moving-averages. Suicide is a significant cause of premature deaths occurring generally at younger ages than other common causes of premature mortality. It may also be seen as an indicator of underlying rates of mental ill-health. Directly Age-Standardised Rates (DASR) are shown in the data, where numbers are sufficient, so that death rates can be directly compared between areas. The DASR calculation applies Age-specific rates to a Standard (European) population to cancel out possible effects on crude rates due to different age structures among populations, thus enabling direct comparisons of rates. The figures in this dataset include deaths recorded as suicide (people age 10 and over) and undetermined injury (age 15 and over) as those are mostly likely also to have been caused by self-harm rather than unverifiable accident, neglect or abuse. The population denominators for rates are age 10 and over. Low numbers may result in zero values or missing data. Data source: Office for Health Improvement and Disparities (OHID), Public Health Outcomes Framework (PHOF) indicator 41001 (E10). This data is updated annually.

  15. Death Statistics | DATA.GOV.HK

    • data.gov.hk
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    data.gov.hk, Death Statistics | DATA.GOV.HK [Dataset]. https://data.gov.hk/en-data/dataset/hk-dh-dh_ncddhss-ncdd-dataset-3
    Explore at:
    Dataset provided by
    data.gov.hk
    Description

    Death statistics (i) Number of Deaths for Different Sexes and Crude Death Rate for the Period from 1981 to 2023 (ii) Age-standardised Death Rate (Overall and by Sex) for the Period from 1981 to 2023 (iii) Age-specific Death Rate for Year 2013 and 2023 (iv) Death Rates by Leading Causes of Death for the Period from 2001 to 2023 (v) Number of Deaths by Leading Causes of Death for the Period from 2001 to 2023 (vi) Age-standardised Death Rates by Leading Causes of Death for the Period from 2001 to 2023 (vii) Late Foetal Mortality Rate for the Period from 1981 to 2023 (viii) Perinatal Mortality Rate for the Period from 1981 to 2023 (ix) Neonatal Mortality Rate for the Period from 1981 to 2023 (x) Infant Mortality Rate for the Period from 1981 to 2023 (xi) Number of Maternal Deaths for the Period from 1981 to 2023 (xii) Maternal Mortality Ratio for the Period from 1981 to 2023

  16. Rates of COVID-19 Cases or Deaths by Age Group and Vaccination Status and...

    • healthdata.gov
    • odgavaprod.ogopendata.com
    • +2more
    application/rdfxml +5
    Updated Jun 16, 2023
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    data.cdc.gov (2023). Rates of COVID-19 Cases or Deaths by Age Group and Vaccination Status and Booster Dose [Dataset]. https://healthdata.gov/w/pifi-rn2z/default?cur=dU-uRhCR4oE
    Explore at:
    xml, csv, application/rdfxml, tsv, json, application/rssxmlAvailable download formats
    Dataset updated
    Jun 16, 2023
    Dataset provided by
    data.cdc.gov
    Description

    Data for CDC’s COVID Data Tracker site on Rates of COVID-19 Cases and Deaths by Vaccination Status. Click 'More' for important dataset description and footnotes

    Dataset and data visualization details: These data were posted on October 21, 2022, archived on November 18, 2022, and revised on February 22, 2023. These data reflect cases among persons with a positive specimen collection date through September 24, 2022, and deaths among persons with a positive specimen collection date through September 3, 2022.

    Vaccination status: A person vaccinated with a primary series had SARS-CoV-2 RNA or antigen detected on a respiratory specimen collected ≥14 days after verifiably completing the primary series of an FDA-authorized or approved COVID-19 vaccine. An unvaccinated person had SARS-CoV-2 RNA or antigen detected on a respiratory specimen and has not been verified to have received COVID-19 vaccine. Excluded were partially vaccinated people who received at least one FDA-authorized vaccine dose but did not complete a primary series ≥14 days before collection of a specimen where SARS-CoV-2 RNA or antigen was detected. Additional or booster dose: A person vaccinated with a primary series and an additional or booster dose had SARS-CoV-2 RNA or antigen detected on a respiratory specimen collected ≥14 days after receipt of an additional or booster dose of any COVID-19 vaccine on or after August 13, 2021. For people ages 18 years and older, data are graphed starting the week including September 24, 2021, when a COVID-19 booster dose was first recommended by CDC for adults 65+ years old and people in certain populations and high risk occupational and institutional settings. For people ages 12-17 years, data are graphed starting the week of December 26, 2021, 2 weeks after the first recommendation for a booster dose for adolescents ages 16-17 years. For people ages 5-11 years, data are included starting the week of June 5, 2022, 2 weeks after the first recommendation for a booster dose for children aged 5-11 years. For people ages 50 years and older, data on second booster doses are graphed starting the week including March 29, 2022, when the recommendation was made for second boosters. Vertical lines represent dates when changes occurred in U.S. policy for COVID-19 vaccination (details provided above). Reporting is by primary series vaccine type rather than additional or booster dose vaccine type. The booster dose vaccine type may be different than the primary series vaccine type. ** Because data on the immune status of cases and associated deaths are unavailable, an additional dose in an immunocompromised person cannot be distinguished from a booster dose. This is a relevant consideration because vaccines can be less effective in this group. Deaths: A COVID-19–associated death occurred in a person with a documented COVID-19 diagnosis who died; health department staff reviewed to make a determination using vital records, public health investigation, or other data sources. Rates of COVID-19 deaths by vaccination status are reported based on when the patient was tested for COVID-19, not the date they died. Deaths usually occur up to 30 days after COVID-19 diagnosis. Participating jurisdictions: Currently, these 31 health departments that regularly link their case surveillance to immunization information system data are included in these incidence rate estimates: Alabama, Arizona, Arkansas, California, Colorado, Connecticut, District of Columbia, Florida, Georgia, Idaho, Indiana, Kansas, Kentucky, Louisiana, Massachusetts, Michigan, Minnesota, Nebraska, New Jersey, New Mexico, New York, New York City (New York), North Carolina, Philadelphia (Pennsylvania), Rhode Island, South Dakota, Tennessee, Texas, Utah, Washington, and West Virginia; 30 jurisdictions also report deaths among vaccinated and unvaccinated people. These jurisdictions represent 72% of the total U.S. population and all ten of the Health and Human Services Regions. Data on cases

  17. COVID-19 Time-Series Metrics by County and State (ARCHIVED)

    • data.chhs.ca.gov
    • data.ca.gov
    • +2more
    csv, xlsx, zip
    Updated Aug 28, 2024
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    California Department of Public Health (2024). COVID-19 Time-Series Metrics by County and State (ARCHIVED) [Dataset]. https://data.chhs.ca.gov/dataset/covid-19-time-series-metrics-by-county-and-state
    Explore at:
    csv(3313), csv(7729431), xlsx(6471), csv(6223281), csv(4836928), xlsx(11305), zip, xlsx(7811)Available download formats
    Dataset updated
    Aug 28, 2024
    Dataset authored and provided by
    California Department of Public Healthhttps://www.cdph.ca.gov/
    Description

    Note: This COVID-19 data set is no longer being updated as of December 1, 2023. Access current COVID-19 data on the CDPH respiratory virus dashboard (https://www.cdph.ca.gov/Programs/CID/DCDC/Pages/Respiratory-Viruses/RespiratoryDashboard.aspx) or in open data format (https://data.chhs.ca.gov/dataset/respiratory-virus-dashboard-metrics).

    As of August 17, 2023, data is being updated each Friday.

    For death data after December 31, 2022, California uses Provisional Deaths from the Center for Disease Control and Prevention’s National Center for Health Statistics (NCHS) National Vital Statistics System (NVSS). Prior to January 1, 2023, death data was sourced from the COVID-19 registry. The change in data source occurred in July 2023 and was applied retroactively to all 2023 data to provide a consistent source of death data for the year of 2023.

    As of May 11, 2023, data on cases, deaths, and testing is being updated each Thursday. Metrics by report date have been removed, but previous versions of files with report date metrics are archived below.

    All metrics include people in state and federal prisons, US Immigration and Customs Enforcement facilities, US Marshal detention facilities, and Department of State Hospitals facilities. Members of California's tribal communities are also included.

    The "Total Tests" and "Positive Tests" columns show totals based on the collection date. There is a lag between when a specimen is collected and when it is reported in this dataset. As a result, the most recent dates on the table will temporarily show NONE in the "Total Tests" and "Positive Tests" columns. This should not be interpreted as no tests being conducted on these dates. Instead, these values will be updated with the number of tests conducted as data is received.

  18. Death Profiles by ZIP Code

    • data.chhs.ca.gov
    • data.ca.gov
    • +3more
    csv, zip
    Updated Apr 22, 2025
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    California Department of Public Health (2025). Death Profiles by ZIP Code [Dataset]. https://data.chhs.ca.gov/dataset/death-profiles-by-zip-code
    Explore at:
    csv(80055974), csv(78958555), csv(4571), csv(80054609), csv(40627562), zipAvailable download formats
    Dataset updated
    Apr 22, 2025
    Dataset authored and provided by
    California Department of Public Healthhttps://www.cdph.ca.gov/
    Description

    This dataset contains counts of deaths for California residents by ZIP Code based on information entered on death certificates. Final counts are derived from static data and include out-of-state deaths of California residents. The data tables include deaths of residents of California by ZIP Code of residence (by residence). The data are reported as totals, as well as stratified by age and gender. Deaths due to all causes (ALL) and selected underlying cause of death categories are provided. See temporal coverage for more information on which combinations are available for which years.

    The cause of death categories are based solely on the underlying cause of death as coded by the International Classification of Diseases. The underlying cause of death is defined by the World Health Organization (WHO) as "the disease or injury which initiated the train of events leading directly to death, or the circumstances of the accident or violence which produced the fatal injury." It is a single value assigned to each death based on the details as entered on the death certificate. When more than one cause is listed, the order in which they are listed can affect which cause is coded as the underlying cause. This means that similar events could be coded with different underlying causes of death depending on variations in how they were entered. Consequently, while underlying cause of death provides a convenient comparison between cause of death categories, it may not capture the full impact of each cause of death as it does not always take into account all conditions contributing to the death.

  19. d

    Johns Hopkins COVID-19 Case Tracker

    • data.world
    csv, zip
    Updated Sep 26, 2025
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    The Associated Press (2025). Johns Hopkins COVID-19 Case Tracker [Dataset]. https://data.world/associatedpress/johns-hopkins-coronavirus-case-tracker
    Explore at:
    zip, csvAvailable download formats
    Dataset updated
    Sep 26, 2025
    Authors
    The Associated Press
    Time period covered
    Jan 22, 2020 - Mar 9, 2023
    Area covered
    Description

    Updates

    • Notice of data discontinuation: Since the start of the pandemic, AP has reported case and death counts from data provided by Johns Hopkins University. Johns Hopkins University has announced that they will stop their daily data collection efforts after March 10. As Johns Hopkins stops providing data, the AP will also stop collecting daily numbers for COVID cases and deaths. The HHS and CDC now collect and visualize key metrics for the pandemic. AP advises using those resources when reporting on the pandemic going forward.

    • April 9, 2020

      • The population estimate data for New York County, NY has been updated to include all five New York City counties (Kings County, Queens County, Bronx County, Richmond County and New York County). This has been done to match the Johns Hopkins COVID-19 data, which aggregates counts for the five New York City counties to New York County.
    • April 20, 2020

      • Johns Hopkins death totals in the US now include confirmed and probable deaths in accordance with CDC guidelines as of April 14. One significant result of this change was an increase of more than 3,700 deaths in the New York City count. This change will likely result in increases for death counts elsewhere as well. The AP does not alter the Johns Hopkins source data, so probable deaths are included in this dataset as well.
    • April 29, 2020

      • The AP is now providing timeseries data for counts of COVID-19 cases and deaths. The raw counts are provided here unaltered, along with a population column with Census ACS-5 estimates and calculated daily case and death rates per 100,000 people. Please read the updated caveats section for more information.
    • September 1st, 2020

      • Johns Hopkins is now providing counts for the five New York City counties individually.
    • February 12, 2021

      • The Ohio Department of Health recently announced that as many as 4,000 COVID-19 deaths may have been underreported through the state’s reporting system, and that the "daily reported death counts will be high for a two to three-day period."
      • Because deaths data will be anomalous for consecutive days, we have chosen to freeze Ohio's rolling average for daily deaths at the last valid measure until Johns Hopkins is able to back-distribute the data. The raw daily death counts, as reported by Johns Hopkins and including the backlogged death data, will still be present in the new_deaths column.
    • February 16, 2021

      - Johns Hopkins has reconciled Ohio's historical deaths data with the state.

      Overview

    The AP is using data collected by the Johns Hopkins University Center for Systems Science and Engineering as our source for outbreak caseloads and death counts for the United States and globally.

    The Hopkins data is available at the county level in the United States. The AP has paired this data with population figures and county rural/urban designations, and has calculated caseload and death rates per 100,000 people. Be aware that caseloads may reflect the availability of tests -- and the ability to turn around test results quickly -- rather than actual disease spread or true infection rates.

    This data is from the Hopkins dashboard that is updated regularly throughout the day. Like all organizations dealing with data, Hopkins is constantly refining and cleaning up their feed, so there may be brief moments where data does not appear correctly. At this link, you’ll find the Hopkins daily data reports, and a clean version of their feed.

    The AP is updating this dataset hourly at 45 minutes past the hour.

    To learn more about AP's data journalism capabilities for publishers, corporations and financial institutions, go here or email kromano@ap.org.

    Queries

    Use AP's queries to filter the data or to join to other datasets we've made available to help cover the coronavirus pandemic

    Interactive

    The AP has designed an interactive map to track COVID-19 cases reported by Johns Hopkins.

    @(https://datawrapper.dwcdn.net/nRyaf/15/)

    Interactive Embed Code

    <iframe title="USA counties (2018) choropleth map Mapping COVID-19 cases by county" aria-describedby="" id="datawrapper-chart-nRyaf" src="https://datawrapper.dwcdn.net/nRyaf/10/" scrolling="no" frameborder="0" style="width: 0; min-width: 100% !important;" height="400"></iframe><script type="text/javascript">(function() {'use strict';window.addEventListener('message', function(event) {if (typeof event.data['datawrapper-height'] !== 'undefined') {for (var chartId in event.data['datawrapper-height']) {var iframe = document.getElementById('datawrapper-chart-' + chartId) || document.querySelector("iframe[src*='" + chartId + "']");if (!iframe) {continue;}iframe.style.height = event.data['datawrapper-height'][chartId] + 'px';}}});})();</script>
    

    Caveats

    • This data represents the number of cases and deaths reported by each state and has been collected by Johns Hopkins from a number of sources cited on their website.
    • In some cases, deaths or cases of people who've crossed state lines -- either to receive treatment or because they became sick and couldn't return home while traveling -- are reported in a state they aren't currently in, because of state reporting rules.
    • In some states, there are a number of cases not assigned to a specific county -- for those cases, the county name is "unassigned to a single county"
    • This data should be credited to Johns Hopkins University's COVID-19 tracking project. The AP is simply making it available here for ease of use for reporters and members.
    • Caseloads may reflect the availability of tests -- and the ability to turn around test results quickly -- rather than actual disease spread or true infection rates.
    • Population estimates at the county level are drawn from 2014-18 5-year estimates from the American Community Survey.
    • The Urban/Rural classification scheme is from the Center for Disease Control and Preventions's National Center for Health Statistics. It puts each county into one of six categories -- from Large Central Metro to Non-Core -- according to population and other characteristics. More details about the classifications can be found here.

    Johns Hopkins timeseries data - Johns Hopkins pulls data regularly to update their dashboard. Once a day, around 8pm EDT, Johns Hopkins adds the counts for all areas they cover to the timeseries file. These counts are snapshots of the latest cumulative counts provided by the source on that day. This can lead to inconsistencies if a source updates their historical data for accuracy, either increasing or decreasing the latest cumulative count. - Johns Hopkins periodically edits their historical timeseries data for accuracy. They provide a file documenting all errors in their timeseries files that they have identified and fixed here

    Attribution

    This data should be credited to Johns Hopkins University COVID-19 tracking project

  20. Single year of age and average age of death of people whose death was due to...

    • ons.gov.uk
    xlsx
    Updated Aug 23, 2023
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Office for National Statistics (2023). Single year of age and average age of death of people whose death was due to or involved coronavirus (COVID-19) [Dataset]. https://www.ons.gov.uk/peoplepopulationandcommunity/birthsdeathsandmarriages/deaths/datasets/singleyearofageandaverageageofdeathofpeoplewhosedeathwasduetoorinvolvedcovid19
    Explore at:
    xlsxAvailable download formats
    Dataset updated
    Aug 23, 2023
    Dataset provided by
    Office for National Statisticshttp://www.ons.gov.uk/
    License

    Open Government Licence 3.0http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/
    License information was derived automatically

    Description

    Provisional deaths registration data for single year of age and average age of death (median and mean) of persons whose death involved coronavirus (COVID-19), England and Wales. Includes deaths due to COVID-19 and breakdowns by sex.

Share
FacebookFacebook
TwitterTwitter
Email
Click to copy link
Link copied
Close
Cite
Centers for Disease Control and Prevention (2017). Death in the United States [Dataset]. https://www.kaggle.com/datasets/cdc/mortality
Organization logo

Death in the United States

Learn more about the leading causes of death from 2005-2015

Explore at:
zip(766333584 bytes)Available download formats
Dataset updated
Aug 3, 2017
Dataset authored and provided by
Centers for Disease Control and Preventionhttp://www.cdc.gov/
License

https://creativecommons.org/publicdomain/zero/1.0/https://creativecommons.org/publicdomain/zero/1.0/

Area covered
United States
Description

Every year the CDC releases the country’s most detailed report on death in the United States under the National Vital Statistics Systems. This mortality dataset is a record of every death in the country for 2005 through 2015, including detailed information about causes of death and the demographic background of the deceased.

It's been said that "statistics are human beings with the tears wiped off." This is especially true with this dataset. Each death record represents somebody's loved one, often connected with a lifetime of memories and sometimes tragically too short.

Putting the sensitive nature of the topic aside, analyzing mortality data is essential to understanding the complex circumstances of death across the country. The US Government uses this data to determine life expectancy and understand how death in the U.S. differs from the rest of the world. Whether you’re looking for macro trends or analyzing unique circumstances, we challenge you to use this dataset to find your own answers to one of life’s great mysteries.

Overview

This dataset is a collection of CSV files each containing one year's worth of data and paired JSON files containing the code mappings, plus an ICD 10 code set. The CSVs were reformatted from their original fixed-width file formats using information extracted from the CDC's PDF manuals using this script. Please note that this process may have introduced errors as the text extracted from the pdf is not a perfect match. If you have any questions or find errors in the preparation process, please leave a note in the forums. We hope to publish additional years of data using this method soon.

A more detailed overview of the data can be found here. You'll find that the fields are consistent within this time window, but some of data codes change every few years. For example, the 113_cause_recode entry 069 only covers ICD codes (I10,I12) in 2005, but by 2015 it covers (I10,I12,I15). When I post data from years prior to 2005, expect some of the fields themselves to change as well.

All data comes from the CDC’s National Vital Statistics Systems, with the exception of the Icd10Code, which are sourced from the World Health Organization.

Project ideas

  • The CDC's mortality data was the basis of a widely publicized paper, by Anne Case and Nobel prize winner Angus Deaton, arguing that middle-aged whites are dying at elevated rates. One of the criticisms against the paper is that it failed to properly account for the exact ages within the broad bins available through the CDC's WONDER tool. What do these results look like with exact/not-binned age data?
  • Similarly, how sensitive are the mortality trends being discussed in the news to the choice of bin-widths?
  • As noted above, the data preparation process could have introduced errors. Can you find any discrepancies compared to the aggregate metrics on WONDER? If so, please let me know in the forums!
  • WONDER is cited in numerous economics, sociology, and public health research papers. Can you find any papers whose conclusions would be altered if they used the exact data available here rather than binned data from Wonder?

Differences from the first version of the dataset

  • This version of the dataset was prepared in a completely different many. This has allowed us to provide a much larger volume of data and ensure that codes are available for every field.
  • We've replaced the batch of sql files with a single JSON per year. Kaggle's platform currently offer's better support for JSON files, and this keeps the number of files manageable.
  • A tutorial kernel providing a quick introduction to the new format is available here.
  • Lastly, I apologize if the transition has interrupted anyone's work! If need be, you can still download v1.
Search
Clear search
Close search
Google apps
Main menu