100+ datasets found
  1. Death Profiles by County

    • data.chhs.ca.gov
    • data.ca.gov
    • +3 more
    csv, zip
    Updated Nov 26, 2025
    Cite
    California Department of Public Health (2025). Death Profiles by County [Dataset]. https://data.chhs.ca.gov/dataset/death-profiles-by-county
    Explore at:
    Available download formats: csv(74351424), csv(75015194), csv(11738570), csv(1128641), csv(15127221), csv(60517511), csv(73906266), csv(60201673), csv(60676655), csv(28125832), csv(60023260), csv(51592721), csv(74689382), csv(52019564), csv(5095), csv(74043128), csv(24235858), csv(74497014), zip, csv(29775349)
    Dataset updated
    Nov 26, 2025
    Dataset authored and provided by
    California Department of Public Health
    Description

    This dataset contains counts of deaths for California counties based on information entered on death certificates. Final counts are derived from static data and include out-of-state deaths to California residents, whereas provisional counts are derived from incomplete and dynamic data. Provisional counts are based on the records available when the data was retrieved and may not represent all deaths that occurred during the time period. Deaths involving injuries from external or environmental forces, such as accidents, homicide and suicide, often require additional investigation that tends to delay certification of the cause and manner of death. This can result in significant under-reporting of these deaths in provisional data.

    The final data tables include both deaths that occurred in each California county regardless of the place of residence (by occurrence) and deaths to residents of each California county (by residence), whereas the provisional data table only includes deaths that occurred in each county regardless of the place of residence (by occurrence). The data are reported as totals, as well as stratified by age, gender, race-ethnicity, and death place type. Deaths due to all causes (ALL) and selected underlying cause of death categories are provided. See temporal coverage for more information on which combinations are available for which years.

    The cause of death categories are based solely on the underlying cause of death as coded by the International Classification of Diseases. The underlying cause of death is defined by the World Health Organization (WHO) as "the disease or injury which initiated the train of events leading directly to death, or the circumstances of the accident or violence which produced the fatal injury." It is a single value assigned to each death based on the details as entered on the death certificate. When more than one cause is listed, the order in which they are listed can affect which cause is coded as the underlying cause. This means that similar events could be coded with different underlying causes of death depending on variations in how they were entered. Consequently, while underlying cause of death provides a convenient comparison between cause of death categories, it may not capture the full impact of each cause of death as it does not always take into account all conditions contributing to the death.

  2. Causes of Death - Our World In Data

    • kaggle.com
    zip
    Updated Mar 29, 2022
    Cite
    IVAN CHAVEZ (2022). Causes of Death - Our World In Data [Dataset]. https://www.kaggle.com/ivanchvez/causes-of-death-our-world-in-data
    Explore at:
    Available download formats: zip (1553815 bytes)
    Dataset updated
    Mar 29, 2022
    Authors
    IVAN CHAVEZ
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    56 million people died in 2017. What did they die from?

    The Global Burden of Disease is a major global study on the causes of death and disease published in the medical journal The Lancet. Its estimates of the annual number of deaths are shown here.

    I downloaded the dataset behind the first chart at https://ourworldindata.org/causes-of-death as a CSV, then loaded the raw file into Tableau Prep to explore the data distribution and apply some pivoting and cleaning. The output was uploaded to this dataset along with the original raw file.

    Please note that the raw file contains some groupings of countries by region, but nothing marks those rows as aggregations, so be careful not to analyze the whole dataset on the assumption that every row is a single country. To be more accurate, I analyze countries using the ISO country code (the column named "Code"). If, like me, you have no clue which country ZAF is, Google is your best friend (South Africa) 😉.
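
    The caution about regional aggregates can be sketched in code. This is a minimal sketch, not the author's workflow: it assumes aggregate rows either leave "Code" empty or carry a non-ISO value such as OWID_WRL, and the sample rows and the "Deaths" column are invented for illustration. Only the "Code" column name comes from the description above.

```python
import csv
import io
import re

# Hypothetical sample shaped like the raw file. Only the "Code" column name
# is taken from the dataset description; everything else is invented.
SAMPLE = """Entity,Code,Year,Deaths
South Africa,ZAF,2017,100
Sub-Saharan Africa,,2017,900
World,OWID_WRL,2017,5000
Mexico,MEX,2017,200
"""

# Assumption: a real country has exactly three uppercase letters in "Code".
ISO3 = re.compile(r"[A-Z]{3}")


def country_rows(csv_text):
    """Keep only rows whose Code looks like a three-letter ISO country code."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [row for row in reader if ISO3.fullmatch(row["Code"] or "")]


countries = country_rows(SAMPLE)
print([row["Entity"] for row in countries])
```

    A pattern check like this is only a heuristic; comparing "Code" against an actual list of ISO 3166-1 alpha-3 codes would be stricter.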

  3. Deaths registered by single year of age, UK

    • ons.gov.uk
    • cy.ons.gov.uk
    xlsx
    Updated Jan 18, 2022
    Cite
    Office for National Statistics (2022). Deaths registered by single year of age, UK [Dataset]. https://www.ons.gov.uk/peoplepopulationandcommunity/birthsdeathsandmarriages/deaths/datasets/deathregistrationssummarytablesenglandandwalesdeathsbysingleyearofagetables
    Explore at:
    Available download formats: xlsx
    Dataset updated
    Jan 18, 2022
    Dataset provided by
    Office for National Statistics (http://www.ons.gov.uk/)
    License

    Open Government Licence 3.0 (http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/)
    License information was derived automatically

    Description

    Annual data on death registrations by single year of age for the UK (1974 onwards) and England and Wales (1963 onwards).

  4. MD COVID-19 - Confirmed Deaths by Age Distribution

    • catalog.data.gov
    • opendata.maryland.gov
    Updated Oct 18, 2025
    + more versions
    Cite
    opendata.maryland.gov (2025). MD COVID-19 - Confirmed Deaths by Age Distribution [Dataset]. https://catalog.data.gov/dataset/md-covid-19-confirmed-deaths-by-age-distribution
    Explore at:
    Dataset updated
    Oct 18, 2025
    Dataset provided by
    opendata.maryland.gov
    Description

    Note: Starting October 10th, 2025 this dataset is deprecated and is no longer being updated. As of April 27, 2023 updates changed from daily to weekly.

    Summary

    The cumulative number of confirmed COVID-19 deaths among Maryland residents by age: 0-9; 10-19; 20-29; 30-39; 40-49; 50-59; 60-69; 70-79; 80+; Unknown.

    Description

    The MD COVID-19 - Confirmed Deaths by Age Distribution data layer is a collection of the statewide confirmed COVID-19 related deaths that have been reported each day by the Vital Statistics Administration by designated age ranges. A death is classified as confirmed if the person had a laboratory-confirmed positive COVID-19 test result. Some data on deaths may be unavailable due to the time lag between the death, typically reported by a hospital or other facility, and the submission of the complete death certificate. Probable deaths are available from the MD COVID-19 - Probable Deaths by Age Distribution data layer.

    Terms of Use

    The Spatial Data, and the information therein, (collectively the "Data") is provided "as is" without warranty of any kind, either expressed, implied, or statutory. The user assumes the entire risk as to quality and performance of the Data. No guarantee of accuracy is granted, nor is any responsibility for reliance thereon assumed. In no event shall the State of Maryland be liable for direct, indirect, incidental, consequential or special damages of any kind. The State of Maryland does not accept liability for any damages or misrepresentation caused by inaccuracies in the Data or as a result to changes to the Data, nor is there responsibility assumed to maintain the Data in any manner or form. The Data can be freely distributed as long as the metadata entry is not modified or deleted. Any data derived from the Data must acknowledge the State of Maryland in the metadata.

  5. Data from: Life Expectancy prediction Dataset

    • kaggle.com
    zip
    Updated Dec 6, 2023
    Cite
    Sujay Kapadnis (2023). Life Expectancy prediction Dataset [Dataset]. https://www.kaggle.com/datasets/sujaykapadnis/life-expectancy-prediction-dataset
    Explore at:
    Available download formats: zip (765628 bytes)
    Dataset updated
    Dec 6, 2023
    Authors
    Sujay Kapadnis
    Description

    Across the world, people are living longer. In 1900, the average life expectancy of a newborn was 32 years. By 2021 this had more than doubled to 71 years. But where, when, how, and why has this dramatic change occurred? To understand it, we can look at data on life expectancy worldwide. The large reduction in child mortality has played an important role in increasing life expectancy. But life expectancy has increased at all ages. Infants, children, adults, and the elderly are all less likely to die than in the past, and death is being delayed. This remarkable shift results from advances in medicine, public health, and living standards. Along with it, many predictions of the ‘limit’ of life expectancy have been broken.

    Data Dictionary

    life_expectancy.csv

    variable        class      description
    Entity          character  Country or region entity
    Code            character  Entity code
    Year            double     Year
    LifeExpectancy  double     Period life expectancy at birth - Sex: all - Age: 0

    life_expectancy_different_ages.csv

    variable          class      description
    Entity            character  Country or region entity
    Code              character  Entity code
    Year              double     Year
    LifeExpectancy0   double     Period life expectancy at birth - Sex: all - Age: 0
    LifeExpectancy10  double     Period life expectancy - Sex: all - Age: 10
    LifeExpectancy25  double     Period life expectancy - Sex: all - Age: 25
    LifeExpectancy45  double     Period life expectancy - Sex: all - Age: 45
    LifeExpectancy65  double     Period life expectancy - Sex: all - Age: 65
    LifeExpectancy80  double     Period life expectancy - Sex: all - Age: 80

    life_expectancy_female_male.csv

    variable              class      description
    Entity                character  Country or region entity
    Code                  character  Entity code
    Year                  double     Year
    LifeExpectancyDiffFM  double     Life expectancy difference (f-m) - Type: period - Sex: both - Age: 0

    citation(tidytuesday)

  6. NCHS - Leading Causes of Death: United States

    • catalog.data.gov
    • healthdata.gov
    • +5 more
    Updated Apr 23, 2025
    + more versions
    Cite
    Centers for Disease Control and Prevention (2025). NCHS - Leading Causes of Death: United States [Dataset]. https://catalog.data.gov/dataset/nchs-leading-causes-of-death-united-states
    Explore at:
    Dataset updated
    Apr 23, 2025
    Dataset provided by
    Centers for Disease Control and Prevention (http://www.cdc.gov/)
    Area covered
    United States
    Description

    This dataset presents the age-adjusted death rates for the 10 leading causes of death in the United States beginning in 1999. Data are based on information from all resident death certificates filed in the 50 states and the District of Columbia using demographic and medical characteristics.

    Age-adjusted death rates (per 100,000 population) are based on the 2000 U.S. standard population. Populations used for computing death rates after 2010 are postcensal estimates based on the 2010 census, estimated as of July 1, 2010. Rates for census years are based on populations enumerated in the corresponding censuses. Rates for non-census years before 2010 are revised using updated intercensal population estimates and may differ from rates previously published.

    Causes of death classified by the International Classification of Diseases, Tenth Revision (ICD–10) are ranked according to the number of deaths assigned to rankable causes. Cause of death statistics are based on the underlying cause of death.

    SOURCES

    CDC/NCHS, National Vital Statistics System, mortality data (see http://www.cdc.gov/nchs/deaths.htm); and CDC WONDER (see http://wonder.cdc.gov).

    REFERENCES

    National Center for Health Statistics. Vital statistics data available. Mortality multiple cause files. Hyattsville, MD: National Center for Health Statistics. Available from: https://www.cdc.gov/nchs/data_access/vitalstatsonline.htm.

    Murphy SL, Xu JQ, Kochanek KD, Curtin SC, and Arias E. Deaths: Final data for 2015. National vital statistics reports; vol 66. no. 6. Hyattsville, MD: National Center for Health Statistics. 2017. Available from: https://www.cdc.gov/nchs/data/nvsr/nvsr66/nvsr66_06.pdf.
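
    As an illustration of what "age-adjusted" means, the sketch below applies direct standardization: each age group's crude rate is weighted by that group's share of a standard population. The age groups, death counts, populations, and weights are invented for the example; they are not the 2000 U.S. standard population values, and the function name is hypothetical.

```python
def age_adjusted_rate(deaths, population, standard_weights, per=100_000):
    """Directly standardized death rate.

    deaths, population, standard_weights: dicts keyed by age group.
    standard_weights holds each group's share of the standard population
    and must sum to 1.
    """
    if abs(sum(standard_weights.values()) - 1.0) > 1e-9:
        raise ValueError("standard weights must sum to 1")
    return sum(
        deaths[group] / population[group] * per * standard_weights[group]
        for group in standard_weights
    )


# Invented figures, for illustration only.
deaths = {"0-44": 50, "45-64": 200, "65+": 900}
population = {"0-44": 500_000, "45-64": 200_000, "65+": 100_000}
weights = {"0-44": 0.60, "45-64": 0.25, "65+": 0.15}

rate = age_adjusted_rate(deaths, population, weights)
print(round(rate, 1))
```

    Because the weights come from a fixed standard population, rates computed this way can be compared across years even when the real age structure shifts.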

  7. Deaths, by place of death (hospital or non-hospital)

    • www150.statcan.gc.ca
    • ouvert.canada.ca
    • +2 more
    Updated Feb 19, 2025
    Cite
    Government of Canada, Statistics Canada (2025). Deaths, by place of death (hospital or non-hospital) [Dataset]. http://doi.org/10.25318/1310071501-eng
    Explore at:
    Dataset updated
    Feb 19, 2025
    Dataset provided by
    Government of Canada (http://www.gg.ca/)
    Statistics Canada (https://statcan.gc.ca/en)
    Area covered
    Canada
    Description

    Number and percentage of deaths, by place of death (in hospital or non-hospital), 1991 to most recent year.

  8. Single year of age and average age of death of people whose death was due to...

    • ons.gov.uk
    xlsx
    Updated Aug 23, 2023
    + more versions
    Cite
    Office for National Statistics (2023). Single year of age and average age of death of people whose death was due to or involved coronavirus (COVID-19) [Dataset]. https://www.ons.gov.uk/peoplepopulationandcommunity/birthsdeathsandmarriages/deaths/datasets/singleyearofageandaverageageofdeathofpeoplewhosedeathwasduetoorinvolvedcovid19
    Explore at:
    Available download formats: xlsx
    Dataset updated
    Aug 23, 2023
    Dataset provided by
    Office for National Statistics (http://www.ons.gov.uk/)
    License

    Open Government Licence 3.0 (http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/)
    License information was derived automatically

    Description

    Provisional deaths registration data for single year of age and average age of death (median and mean) of persons whose death involved coronavirus (COVID-19), England and Wales. Includes deaths due to COVID-19 and breakdowns by sex.
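
    To show how a mean and median age of death can be derived from counts by single year of age, here is a minimal sketch. It assumes ages are recorded in completed years and uses invented counts; the exact method behind the published figures is not described above, and the function name is hypothetical.

```python
def mean_and_median_age(deaths_by_age):
    """deaths_by_age maps age in completed years -> number of deaths."""
    total = sum(deaths_by_age.values())
    if total == 0:
        raise ValueError("no deaths recorded")
    mean = sum(age * n for age, n in deaths_by_age.items()) / total

    # Locate the middle record(s) by cumulative count instead of expanding
    # the table into one entry per death.
    lower_pos, upper_pos = (total - 1) // 2, total // 2
    seen, lower, upper = 0, None, None
    for age in sorted(deaths_by_age):
        seen += deaths_by_age[age]
        if lower is None and seen > lower_pos:
            lower = age
        if seen > upper_pos:
            upper = age
            break
    return mean, (lower + upper) / 2


# Invented counts, for illustration only.
print(mean_and_median_age({70: 1, 80: 2, 90: 1}))
```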

  9. COVID-19 Deaths Mapping Tool - Dataset - data.gov.uk

    • ckan.publishing.service.gov.uk
    Updated Jun 4, 2020
    + more versions
    Cite
    ckan.publishing.service.gov.uk (2020). COVID-19 Deaths Mapping Tool - Dataset - data.gov.uk [Dataset]. https://ckan.publishing.service.gov.uk/dataset/covid-19-deaths-mapping-tool
    Explore at:
    Dataset updated
    Jun 4, 2020
    Dataset provided by
    CKAN (https://ckan.org/)
    Description

    This mapping tool enables you to see how COVID-19 deaths in your area may relate to factors in the local population that research has shown are associated with COVID-19 mortality. It maps COVID-19 death rates for small areas of London (known as MSOAs) and enables you to compare these to a number of other factors, including the Index of Multiple Deprivation, the age and ethnicity of the local population, the extent of pre-existing health conditions in the local population, and occupational data.

    Research has shown that the mortality risk from COVID-19 is higher for people of older age groups, for men, for people with pre-existing health conditions, and for people from BAME backgrounds. London boroughs had some of the highest mortality rates from COVID-19 according to Office for National Statistics (ONS) data up to April 17th 2020. Analysis from the ONS has also shown how mortality is related to socio-economic issues such as occupations classified ‘at risk’ and area deprivation.

    There is much about COVID-19-related mortality that is still not fully understood, including the intersection between the different factors, e.g. the relationship between BAME groups and occupation. On their own, none of these individual factors correlate strongly with deaths for these small areas. This is most likely because the most relevant factors will vary from area to area. In some cases it may relate to the age of the population, in others it may relate to the prevalence of underlying health conditions, area deprivation or the proportion of the population working in ‘at risk occupations’, and in some cases a combination of these or none of them.

    Further descriptive analysis of the factors in this tool can be found here: https://data.london.gov.uk/dataset/covid-19--socio-economic-risk-factors-briefing

  10. Deaths, by month

    • www150.statcan.gc.ca
    • gimi9.com
    • +2 more
    Updated Feb 19, 2025
    Cite
    Government of Canada, Statistics Canada (2025). Deaths, by month [Dataset]. http://doi.org/10.25318/1310070801-eng
    Explore at:
    Dataset updated
    Feb 19, 2025
    Dataset provided by
    Government of Canada (http://www.gg.ca/)
    Statistics Canada (https://statcan.gc.ca/en)
    Area covered
    Canada
    Description

    Number and percentage of deaths, by month and place of residence, 1991 to most recent year.

  11. Excess Winter Deaths - Dataset - data.gov.uk

    • ckan.publishing.service.gov.uk
    Updated Jul 11, 2017
    + more versions
    Cite
    ckan.publishing.service.gov.uk (2017). Excess Winter Deaths - Dataset - data.gov.uk [Dataset]. https://ckan.publishing.service.gov.uk/dataset/excess-winter-deaths
    Explore at:
    Dataset updated
    Jul 11, 2017
    Dataset provided by
    CKAN (https://ckan.org/)
    License

    Open Government Licence 3.0 (http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/)
    License information was derived automatically

    Description

    The Excess Winter Mortality Index (EWD Index) shows excess winter deaths as a Percentage Ratio of the number of deaths expected in the (eight) warmer months either side of Winter (01 December to 31 March). So the data’s yearly time period is from 01 August to 31 July the following year. In other words, EWD is the ratio of extra deaths from all causes during the winter months compared to average non-winter deaths.

    The EWD Index is partly dependent on the proportion of Older People in the population, as most excess winter deaths affect Older People. This indicator covers all ages, but there is no standardisation in its calculation by age or any other factor. So figures for an area can be influenced for example by the proportion of Older People.

    This dataset is updated annually.

    Source: Office for Health Improvement and Disparities (OHID) Public Health Outcomes Framework (PHOF), indicator 90360 / E14. Age breakouts, confidence intervals and metadata are shown on the PHE (PHOF) site.

    Note: Please be advised that the ONS currently has this dataset under consultation for review (as of 09/01/2025) so may not be updated annually until the review has concluded. The full notice can be found on the ONS link for the Winter Mortality publication - please see link in the Additional Information Section.
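
    The index described above can be sketched as a small calculation. This is a sketch under an assumption: it uses the commonly cited form in which winter deaths (1 December to 31 March) are compared with the average of the two neighbouring four-month periods (1 August to 30 November before, 1 April to 31 July after). The description does not spell out the exact formula, and the function name and figures are invented.

```python
def excess_winter_deaths_index(winter, preceding_non_winter, following_non_winter):
    """Return (excess winter deaths, index as a percentage).

    winter: deaths from 1 December to 31 March
    preceding_non_winter: deaths from 1 August to 30 November before it
    following_non_winter: deaths from 1 April to 31 July after it
    """
    average_non_winter = (preceding_non_winter + following_non_winter) / 2
    if average_non_winter <= 0:
        raise ValueError("non-winter deaths must be positive")
    excess = winter - average_non_winter
    return excess, excess / average_non_winter * 100


# Invented counts: 1,200 winter deaths against 1,000 per non-winter period.
excess, index = excess_winter_deaths_index(1200, 1000, 1000)
print(excess, index)
```

    An index of 20 in this example would mean 20% more deaths in winter than in an average non-winter period of the same length.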

  12. 💀Deaths And Obesity - 🎀Health

    • kaggle.com
    zip
    Updated May 24, 2024
    Cite
    waticson (2024). 💀Deaths And Obesity - 🎀Health [Dataset]. https://www.kaggle.com/datasets/yutodennou/death-and-obesity
    Explore at:
    Available download formats: zip (224551 bytes)
    Dataset updated
    May 24, 2024
    Authors
    waticson
    License

    Attribution-ShareAlike 4.0 (CC BY-SA 4.0) (https://creativecommons.org/licenses/by-sa/4.0/)
    License information was derived automatically

    Description

    This data set summarizes obesity and the number of deaths caused by it in each country


    💡I have already divided these into TRAIN data, TEST data, and ANSWER data so you guys can start working on the regression problem right away.

    • train.csv: Obesity and deaths data from 1990 to 2013
    • test.csv: The explanatory variable in 2014
    • answer.csv: The objective variable in 2014

    These data were created with the assumption that the number of deaths due to obesity in 2014 will be estimated from data from 1990 to 2013.

    There is also something called HINT data(hint.csv). This is data for 2015 and beyond. I have left it out of the train or test data because it has many missing values, but it may be useful for forecasting and for those who are interested in more recent data.

    Variable                      Description
    Country                       205 country names
    Code                          Country code, like AFG for Afghanistan
    Year                          Year of data collection
    Population                    Population of the country
    Percentage-Overweight         Percentage of adults defined as overweight, BMI >= 25 (age-standardized estimate) (%), Sex: both sexes, Age group: 18+
    Mean-Daily-Caloric-Supply     Mean daily supply of calories among people with overweight or obesity, BMI >= 25 (age-standardized). Men only
    Mean-BMI                      BMI, Age group: 18+ years. 2 columns, one each for male and female
    Percentage-Overweighted-Male  Percentage of adults who are overweight (age-standardized) - Age group: 18+ years. 2 columns, one each for male and female
    Prevalence-Hypertension-Male  Prevalence of hypertension among adults aged 30-79 years (age-standardized). 2 columns, one each for male and female
    Prevalence-Obesity            Prevalence of obesity among adults, BMI >= 30 (age-standardized estimate) (%), Sex: both sexes, Age group: 18+
    Death-By-High-BMI             Deaths from all causes attributed to high body-mass index per 100,000 people, both sexes, age-standardized

  13. Mass Killings in America, 2006 - present

    • data.world
    csv, zip
    Updated Dec 1, 2025
    Cite
    The Associated Press (2025). Mass Killings in America, 2006 - present [Dataset]. https://data.world/associatedpress/mass-killings-public
    Explore at:
    Available download formats: zip, csv
    Dataset updated
    Dec 1, 2025
    Authors
    The Associated Press
    Time period covered
    Jan 1, 2006 - Nov 29, 2025
    Area covered
    Description

    THIS DATASET WAS LAST UPDATED AT 7:11 AM EASTERN ON DEC. 1

    OVERVIEW

    2019 had the most mass killings since at least the 1970s, according to the Associated Press/USA TODAY/Northeastern University Mass Killings Database.

    In all, there were 45 mass killings, defined as when four or more people are killed excluding the perpetrator. Of those, 33 were mass shootings. This summer was especially violent, with three high-profile public mass shootings occurring in the span of just four weeks, leaving 38 killed and 66 injured.

    A total of 229 people died in mass killings in 2019.

    The AP's analysis found that more than 50% of the incidents were family annihilations, which is similar to prior years. Although they are far less common, the 9 public mass shootings during the year were the most deadly type of mass murder, resulting in 73 people's deaths, not including the assailants.

    One-third of the offenders died at the scene of the killing or soon after, half from suicides.

    About this Dataset

    The Associated Press/USA TODAY/Northeastern University Mass Killings database tracks all U.S. homicides since 2006 involving four or more people killed (not including the offender) over a short period of time (24 hours) regardless of weapon, location, victim-offender relationship or motive. The database includes information on these and other characteristics concerning the incidents, offenders, and victims.

    The AP/USA TODAY/Northeastern database represents the most complete tracking of mass murders by the above definition currently available. Other efforts, such as the Gun Violence Archive or Everytown for Gun Safety may include events that do not meet our criteria, but a review of these sites and others indicates that this database contains every event that matches the definition, including some not tracked by other organizations.

    This data will be updated periodically and can be used as an ongoing resource to help cover these events.

    Using this Dataset

    To get basic counts of incidents of mass killings and mass shootings by year nationwide, use these queries:

    Mass killings by year

    Mass shootings by year

    To get these counts just for your state:

    Filter killings by state

    Definition of "mass murder"

    Mass murder is defined as the intentional killing of four or more victims by any means within a 24-hour period, excluding the deaths of unborn children and the offender(s). The standard of four or more dead was initially set by the FBI.

    This definition does not exclude cases based on method (e.g., shootings only), type or motivation (e.g., public only), victim-offender relationship (e.g., strangers only), or number of locations (e.g., one). The time frame of 24 hours was chosen to eliminate conflation with spree killers, who kill multiple victims in quick succession in different locations or incidents, and to satisfy the traditional requirement of occurring in a “single incident.”

    Offenders who commit mass murder during a spree (before or after committing additional homicides) are included in the database, and all victims within seven days of the mass murder are included in the victim count. Negligent homicides related to driving under the influence or accidental fires are excluded due to the lack of offender intent. Only incidents occurring within the 50 states and Washington D.C. are considered.
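
    The counting rule in this definition can be expressed as a simple check. The sketch below is illustrative only and is not the project's own code: it assumes the input already excludes offenders and unborn children, covers only the four-victims-in-24-hours rule, and uses a hypothetical function name and invented timestamps.

```python
from datetime import datetime, timedelta


def meets_definition(victim_death_times, min_victims=4, window=timedelta(hours=24)):
    """True if at least min_victims deaths fall inside some 24-hour window.

    victim_death_times: datetimes for victims only (offenders and unborn
    children are assumed to have been excluded beforehand).
    """
    times = sorted(victim_death_times)
    start = 0
    for end in range(len(times)):
        # Shrink the window from the left until it spans at most 24 hours.
        while times[end] - times[start] > window:
            start += 1
        if end - start + 1 >= min_victims:
            return True
    return False


# Invented timestamps, for illustration only.
base = datetime(2019, 1, 1, 8, 0)
same_day = [base + timedelta(hours=h) for h in (0, 1, 5, 10)]
print(meets_definition(same_day))
```

    The seven-day rule for spree-related victims affects the victim count, not whether an incident qualifies, so it is left out of this check.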

    Methodology

    Project researchers first identified potential incidents using the Federal Bureau of Investigation’s Supplementary Homicide Reports (SHR). Homicide incidents in the SHR were flagged as potential mass murder cases if four or more victims were reported on the same record, and the type of death was murder or non-negligent manslaughter.

    Cases were subsequently verified utilizing media accounts, court documents, academic journal articles, books, and local law enforcement records obtained through Freedom of Information Act (FOIA) requests. Each data point was corroborated by multiple sources, which were compiled into a single document to assess the quality of information.

    In case(s) of contradiction among sources, official law enforcement or court records were used, when available, followed by the most recent media or academic source.

    Case information was subsequently compared with every other known mass murder database to ensure reliability and validity. Incidents listed in the SHR that could not be independently verified were excluded from the database.

    Project researchers also conducted extensive searches for incidents not reported in the SHR during the time period, utilizing internet search engines, Lexis-Nexis, and Newspapers.com. Search terms include: [number] dead, [number] killed, [number] slain, [number] murdered, [number] homicide, mass murder, mass shooting, massacre, rampage, family killing, familicide, and arson murder. Offender, victim, and location names were also directly searched when available.

    This project started at USA TODAY in 2012.

    Contacts

    Contact AP Data Editor Justin Myers with questions, suggestions or comments about this dataset at jmyers@ap.org. The Northeastern University researcher working with AP and USA TODAY is Professor James Alan Fox, who can be reached at j.fox@northeastern.edu or 617-416-4400.

  14. Data Science for Good: WHO NCDs Dataset

    • kaggle.com
    zip
    Updated Jun 22, 2020
    Cite
    Beni Vitai (2020). Data Science for Good: WHO NCDs Dataset [Dataset]. https://www.kaggle.com/datasets/benivitai/ncd-who-dataset/suggestions
    Explore at:
    Available download formats: zip (15630 bytes)
    Dataset updated
    Jun 22, 2020
    Authors
    Beni Vitai
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    Context

    In the shadows of the Covid-19 pandemic, there is another global health crisis that has gone largely unnoticed. This is the Noncommunicable Disease (NCD) pandemic.

    The WHO website describes NCDs as follows:

    Noncommunicable diseases (NCDs), also known as chronic diseases, tend to be of long duration and are the result of a combination of genetic, physiological, environmental and behavioural factors.

    The main types of NCDs are cardiovascular diseases (like heart attacks and stroke), cancers, chronic respiratory diseases (such as chronic obstructive pulmonary disease and asthma) and diabetes.

    NCDs disproportionately affect people in low- and middle-income countries where more than three quarters of global NCD deaths – 32 million – occur.

    Key facts:

    • Noncommunicable diseases (NCDs) kill 41 million people each year, equivalent to 71% of all deaths globally.
    • Each year, 15 million people die from a NCD between the ages of 30 and 69 years; over 85% of these "premature" deaths occur in low- and middle-income countries.
    • Cardiovascular diseases account for most NCD deaths, or 17.9 million people annually, followed by cancers (9.0 million), respiratory diseases (3.9 million), and diabetes (1.6 million).
    • These 4 groups of diseases account for over 80% of all premature NCD deaths.
    • Tobacco use, physical inactivity, the harmful use of alcohol and unhealthy diets all increase the risk of dying from a NCD.
    • Detection, screening and treatment of NCDs, as well as palliative care, are key components of the response to NCDs.

    Content

    This data repository consists of 3 CSV files: WHO-cause-of-death-by-NCD.csv is the main dataset, which provides the percentage of deaths caused by NCDs out of all causes of death, for each nation globally. Metadata_Country.csv and Metadata_Indicator.csv provide additional metadata which is helpful for interpreting the main CSV.

    The data collected spans a period from 2000 to 2016. The main CSV has columns for every year from 1960 to 2019. It is advisable to drop all redundant columns where no data was collected.

    Furthermore, it is advisable to merge Metadata_Country.csv with the main CSV as it provides valuable additional information, particularly on the economic situation of each nation.
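
    The two pieces of advice above (drop year columns with no data, then merge in the country metadata) can be sketched with the standard library. The column names "Country Name", "Country Code" and "IncomeGroup" are assumptions based on the usual World Bank CSV layout, and the sample values are invented; check the actual headers before relying on this.

```python
import csv
import io

# Invented samples standing in for WHO-cause-of-death-by-NCD.csv and
# Metadata_Country.csv. Column names are assumed, values are made up.
MAIN = """Country Name,Country Code,1999,2000,2016,2017
Aruba,ABW,,,,
Kenya,KEN,,31.2,27.0,
"""
META = """Country Code,IncomeGroup
KEN,Lower middle income
ABW,High income
"""


def load(text):
    return list(csv.DictReader(io.StringIO(text)))


def drop_empty_year_columns(rows):
    """Remove year columns in which no country has a value."""
    year_cols = [col for col in rows[0] if col.isdigit()]
    keep = {col for col in year_cols if any(row[col].strip() for row in rows)}
    return [
        {k: v for k, v in row.items() if not k.isdigit() or k in keep}
        for row in rows
    ]


def merge_metadata(rows, meta_rows, key="Country Code"):
    """Left-join the metadata onto the main rows by country code."""
    by_code = {meta[key]: meta for meta in meta_rows}
    return [{**row, **by_code.get(row[key], {})} for row in rows]


cleaned = merge_metadata(drop_empty_year_columns(load(MAIN)), load(META))
print(list(cleaned[1].keys()))
```

    Rows without a metadata match are kept unchanged, which makes it easy to spot aggregates or unmatched codes afterwards.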

    Acknowledgements

    This dataset has been extracted from The World Bank 'Cause of death, by non-communicable diseases (% of total)' Dataset, derived based on the data from WHO's Global Health Estimates. It is freely provided under a Creative Commons Attribution 4.0 International License (CC BY 4.0), with the additional terms as stated on the World Bank website: World Bank Terms of Use for Datasets.

    Inspiration

    I would be interested to see some good data wrangling (dropping redundant columns), as well as kernels interpreting the additional information in the 'SpecialNotes' column in Metadata_Country.csv.

    It would also be great to see what different factors influence NCDs: most of all, the geopolitical factors. Would be great to see some choropleth visualisations to get an idea of which regions are most affected by NCDs.

  15. Asthma Deaths by County

    • data.chhs.ca.gov
    • data.ca.gov
    • +6more
    csv, zip
    Updated Nov 6, 2025
    Cite
    California Department of Public Health (2025). Asthma Deaths by County [Dataset]. https://data.chhs.ca.gov/dataset/asthma-deaths-by-county
    Explore at:
    csv(43300), zip
    Available download formats
    Dataset updated
    Nov 6, 2025
    Dataset authored and provided by
    California Department of Public Health (https://www.cdph.ca.gov/)
    Description

    This dataset contains counts and rates (per 1,000,000 residents) of asthma deaths among Californians statewide and by county. The data are stratified by age group (all ages, 0-17, 18+) and reported for 3-year periods. The data are derived from the California Death Statistical Master Files, which contain information collected from death certificates. All deaths with asthma coded as the underlying cause of death (ICD-10 CM J45 or J46) are included.

  16. Excess Winter Deaths - Dataset - Datopian CKAN instance

    • demo.dev.datopian.com
    Updated Oct 7, 2025
    Cite
    (2025). Excess Winter Deaths - Dataset - Datopian CKAN instance [Dataset]. https://demo.dev.datopian.com/dataset/marmar--excess-winter-deaths
    Explore at:
    Dataset updated
    Oct 7, 2025
    License

    Open Government Licence 3.0 (http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/)
    License information was derived automatically

    Description

    The Excess Winter Mortality Index (EWD Index) shows excess winter deaths as a Percentage Ratio of the number of deaths expected in the (eight) warmer months either side of Winter (01 December to 31 March). So the data’s yearly time period is from 01 August to 31 July the following year. In other words, EWD is the ratio of extra deaths from all causes during the winter months compared to average non-winter deaths. The EWD Index is partly dependent on the proportion of Older People in the population, as most excess winter deaths affect Older People. This indicator covers all ages, but there is no standardisation in its calculation by age or any other factor. So figures for an area can be influenced for example by the proportion of Older People. This dataset is updated annually. Source: Office for Health Improvement and Disparities (OHID) Public Health Outcomes Framework (PHOF), indicator 90360 / E14. Age breakouts, confidence intervals and metadata are shown on the PHE (PHOF) site. Note: Please be advised that the ONS currently has this dataset under consultation for review (as of 09/01/2025) so may not be updated annually until the review has concluded. The full notice can be found on the ONS link for the Winter Mortality publication - please see link in the Additional Information Section.
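    One reading of the index described above can be sketched as follows. The sketch assumes that the "expected" deaths for the four winter months are the total for the eight surrounding months scaled to a four-month period (that is, halved); the function and parameter names are hypothetical, and the publisher's exact calculation may differ.

```python
def ewd_index(winter_deaths, non_winter_deaths):
    """Excess winter deaths as a percentage of expected deaths.

    winter_deaths: deaths in the four winter months (December to March).
    non_winter_deaths: deaths in the eight surrounding months
        (August to November before, April to July after).
    """
    # Assumption: expected deaths for a four-month period are half of the
    # eight-month non-winter total.
    expected = non_winter_deaths / 2
    excess = winter_deaths - expected
    return 100 * excess / expected


example_index = ewd_index(1200, 2000)  # 200 excess deaths against 1000 expected
```

    Because the calculation is not age-standardised, two areas with the same underlying risk can still show different index values if their age structures differ.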

  17. Johns Hopkins COVID-19 Case Tracker

    • data.world
    • kaggle.com
    csv, zip
    Updated Dec 3, 2025
    Cite
    The Associated Press (2025). Johns Hopkins COVID-19 Case Tracker [Dataset]. https://data.world/associatedpress/johns-hopkins-coronavirus-case-tracker
    Explore at:
    zip, csv
    Available download formats
    Dataset updated
    Dec 3, 2025
    Authors
    The Associated Press
    Time period covered
    Jan 22, 2020 - Mar 9, 2023
    Area covered
    Description

    Updates

    • Notice of data discontinuation: Since the start of the pandemic, AP has reported case and death counts from data provided by Johns Hopkins University. Johns Hopkins University has announced that they will stop their daily data collection efforts after March 10. As Johns Hopkins stops providing data, the AP will also stop collecting daily numbers for COVID cases and deaths. The HHS and CDC now collect and visualize key metrics for the pandemic. AP advises using those resources when reporting on the pandemic going forward.

    • April 9, 2020

      • The population estimate data for New York County, NY has been updated to include all five New York City counties (Kings County, Queens County, Bronx County, Richmond County and New York County). This has been done to match the Johns Hopkins COVID-19 data, which aggregates counts for the five New York City counties to New York County.
    • April 20, 2020

      • Johns Hopkins death totals in the US now include confirmed and probable deaths in accordance with CDC guidelines as of April 14. One significant result of this change was an increase of more than 3,700 deaths in the New York City count. This change will likely result in increases for death counts elsewhere as well. The AP does not alter the Johns Hopkins source data, so probable deaths are included in this dataset as well.
    • April 29, 2020

      • The AP is now providing timeseries data for counts of COVID-19 cases and deaths. The raw counts are provided here unaltered, along with a population column with Census ACS-5 estimates and calculated daily case and death rates per 100,000 people. Please read the updated caveats section for more information.
    • September 1st, 2020

      • Johns Hopkins is now providing counts for the five New York City counties individually.
    • February 12, 2021

      • The Ohio Department of Health recently announced that as many as 4,000 COVID-19 deaths may have been underreported through the state’s reporting system, and that the "daily reported death counts will be high for a two to three-day period."
      • Because deaths data will be anomalous for consecutive days, we have chosen to freeze Ohio's rolling average for daily deaths at the last valid measure until Johns Hopkins is able to back-distribute the data. The raw daily death counts, as reported by Johns Hopkins and including the backlogged death data, will still be present in the new_deaths column.
    • February 16, 2021

      • Johns Hopkins has reconciled Ohio's historical deaths data with the state.

    Overview

    The AP is using data collected by the Johns Hopkins University Center for Systems Science and Engineering as our source for outbreak caseloads and death counts for the United States and globally.

    The Hopkins data is available at the county level in the United States. The AP has paired this data with population figures and county rural/urban designations, and has calculated caseload and death rates per 100,000 people. Be aware that caseloads may reflect the availability of tests -- and the ability to turn around test results quickly -- rather than actual disease spread or true infection rates.
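    The per-100,000 rates mentioned above come down to a simple scaling of a raw count by population. A minimal sketch follows; the function name and the example figures are hypothetical and are not taken from the AP's own code or data.

```python
def rate_per(count, population, per=100_000):
    """Scale a raw count to a rate per `per` people.

    Returns None when the population is missing or not positive, so that
    rows such as unassigned cases do not produce a misleading rate.
    """
    if not population or population <= 0:
        return None
    # Multiply before dividing to limit floating-point rounding.
    return count * per / population


county_rate = rate_per(50, 200_000)  # 50 deaths in a county of 200,000 people
```

    Rates make counties of different sizes comparable, but they inherit every caveat of the raw counts, including the dependence of caseloads on test availability.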

    This data is from the Hopkins dashboard that is updated regularly throughout the day. Like all organizations dealing with data, Hopkins is constantly refining and cleaning up their feed, so there may be brief moments where data does not appear correctly. At this link, you’ll find the Hopkins daily data reports, and a clean version of their feed.

    The AP is updating this dataset hourly at 45 minutes past the hour.

    To learn more about AP's data journalism capabilities for publishers, corporations and financial institutions, go here or email kromano@ap.org.

    Queries

    Use AP's queries to filter the data or to join to other datasets we've made available to help cover the coronavirus pandemic.

    Interactive

    The AP has designed an interactive map to track COVID-19 cases reported by Johns Hopkins.

    @(https://datawrapper.dwcdn.net/nRyaf/15/)

    Interactive Embed Code

    <iframe title="USA counties (2018) choropleth map Mapping COVID-19 cases by county" aria-describedby="" id="datawrapper-chart-nRyaf" src="https://datawrapper.dwcdn.net/nRyaf/10/" scrolling="no" frameborder="0" style="width: 0; min-width: 100% !important;" height="400"></iframe><script type="text/javascript">(function() {'use strict';window.addEventListener('message', function(event) {if (typeof event.data['datawrapper-height'] !== 'undefined') {for (var chartId in event.data['datawrapper-height']) {var iframe = document.getElementById('datawrapper-chart-' + chartId) || document.querySelector("iframe[src*='" + chartId + "']");if (!iframe) {continue;}iframe.style.height = event.data['datawrapper-height'][chartId] + 'px';}}});})();</script>
    

    Caveats

    • This data represents the number of cases and deaths reported by each state and has been collected by Johns Hopkins from a number of sources cited on their website.
    • In some cases, deaths or cases of people who've crossed state lines -- either to receive treatment or because they became sick and couldn't return home while traveling -- are reported in a state they aren't currently in, because of state reporting rules.
    • In some states, there are a number of cases not assigned to a specific county -- for those cases, the county name is "unassigned to a single county".
    • This data should be credited to Johns Hopkins University's COVID-19 tracking project. The AP is simply making it available here for ease of use for reporters and members.
    • Caseloads may reflect the availability of tests -- and the ability to turn around test results quickly -- rather than actual disease spread or true infection rates.
    • Population estimates at the county level are drawn from 2014-18 5-year estimates from the American Community Survey.
    • The Urban/Rural classification scheme is from the Centers for Disease Control and Prevention's National Center for Health Statistics. It puts each county into one of six categories -- from Large Central Metro to Non-Core -- according to population and other characteristics. More details about the classifications can be found here.

    Johns Hopkins timeseries data

    • Johns Hopkins pulls data regularly to update their dashboard. Once a day, around 8pm EDT, Johns Hopkins adds the counts for all areas they cover to the timeseries file. These counts are snapshots of the latest cumulative counts provided by the source on that day. This can lead to inconsistencies if a source updates their historical data for accuracy, either increasing or decreasing the latest cumulative count.
    • Johns Hopkins periodically edits their historical timeseries data for accuracy. They provide a file documenting all errors in their timeseries files that they have identified and fixed here.
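    Because the timeseries holds cumulative snapshots, a downward revision by a source shows up as a negative day-over-day change. The sketch below shows one way to derive daily changes and flag such days; the function names and sample numbers are hypothetical, and this is not how the AP computes its new_deaths column.

```python
def daily_changes(cumulative):
    """Return day-over-day differences for a list of cumulative counts."""
    return [cumulative[i] - cumulative[i - 1] for i in range(1, len(cumulative))]


def revised_days(cumulative):
    """Return the indexes of days whose cumulative count fell below the prior day."""
    return [i for i, change in enumerate(daily_changes(cumulative), start=1) if change < 0]


snapshots = [10, 15, 14, 20]           # cumulative counts on four consecutive days
changes = daily_changes(snapshots)     # the drop from 15 to 14 gives a negative change
flagged = revised_days(snapshots)
```

    Flagged days are worth checking against the published corrections file before they are used in rolling averages.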

    Attribution

    This data should be credited to the Johns Hopkins University COVID-19 tracking project.

  18. Covid19 Global Excess Deaths (daily updates)

    • kaggle.com
    zip
    Updated Dec 2, 2025
    Cite
    Joakim Arvidsson (2025). Covid19 Global Excess Deaths (daily updates) [Dataset]. https://www.kaggle.com/datasets/joebeachcapital/covid19-global-excess-deaths-daily-updates
    Explore at:
    zip(2989004967 bytes)
    Available download formats
    Dataset updated
    Dec 2, 2025
    Authors
    Joakim Arvidsson
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    Daily updates of Covid-19 Global Excess Deaths from the Economist's GitHub repository: https://github.com/TheEconomist/covid-19-the-economist-global-excess-deaths-model

    Interpreting estimates

    Estimating excess deaths for every country every day since the pandemic began is a complex and difficult task. Rather than being overly confident in a single number, limited data means that we can often only give a very wide range of plausible values. Focusing on central estimates in such cases would be misleading: unless ranges are very narrow, the 95% range should be reported when possible. The ranges assume that the conditions for bootstrap confidence intervals are met. Please see our tracker page and methodology for more information.

    New variants

    The Omicron variant, first detected in southern Africa in November 2021, appears to have characteristics that are different to earlier versions of SARS-CoV-2. Where this variant is now dominant, this change makes estimates uncertain beyond the ranges indicated. Other new variants may do the same. As more data is incorporated from places where new variants are dominant, predictions improve.

    Non-reporting countries

    Turkmenistan and the Democratic People's Republic of Korea have not reported any covid-19 figures since the start of the pandemic. They also have not published all-cause mortality data. Exports of estimates for the Democratic People's Republic of Korea have been temporarily disabled as it now issues contradictory data: reporting a significant outbreak through its state media, but zero confirmed covid-19 cases/deaths to the WHO.

    Acknowledgements

    A special thanks to all our sources and to those who have made the data to create these estimates available. We list all our sources in our methodology. Within script 1, the source for each variable is also given as the data is loaded, with the exception of our sources for excess deaths data, which we detail on our free-to-read excess deaths tracker as well as on GitHub. The gradient booster implementation used to fit the models is aGTBoost, detailed here.

    Calculating excess deaths for the entire world over multiple years is both complex and imprecise. We welcome any suggestions on how to improve the model, be it data, algorithm, or logic. If you have one, please open an issue.

    The Economist would also like to acknowledge the many people who have helped us refine the model so far, be it through discussions, facilitating data access, or offering coding assistance. A special thanks to Ariel Karlinsky, Philip Schellekens, Oliver Watson, Lukas Appelhans, Berent Å. S. Lunde, Gideon Wakefield, Johannes Hunger, Carol D'Souza, Yun Wei, Mehran Hosseini, Samantha Dolan, Mollie Van Gordon, Rahul Arora, Austin Teda Atmaja, Dirk Eddelbuettel and Tom Wenseleers.

    All coding and data collection to construct these models (and make them update dynamically) was done by Sondre Ulvund Solstad. Should you have any questions about them after reading the methodology, please open an issue or contact him at sondresolstad@economist.com.

    Suggested citation: The Economist and Solstad, S. (corresponding author), 2021. The pandemic’s true death toll. [online] The Economist. Available at: https://www.economist.com/graphic-detail/coronavirus-excess-deaths-estimates [Accessed ---]. First published in the article "Counting the dead", The Economist, issue 20, 2021.

  19. Deaths by vaccination status, England

    • ons.gov.uk
    • cy.ons.gov.uk
    xlsx
    Updated Aug 25, 2023
    + more versions
    Cite
    Office for National Statistics (2023). Deaths by vaccination status, England [Dataset]. https://www.ons.gov.uk/peoplepopulationandcommunity/birthsdeathsandmarriages/deaths/datasets/deathsbyvaccinationstatusengland
    Explore at:
    xlsx
    Available download formats
    Dataset updated
    Aug 25, 2023
    Dataset provided by
    Office for National Statistics (http://www.ons.gov.uk/)
    License

    Open Government Licence 3.0 (http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/)
    License information was derived automatically

    Description

    Age-standardised mortality rates for deaths involving coronavirus (COVID-19), non-COVID-19 deaths and all deaths by vaccination status, broken down by age group.

  20. Data from: VSRR Provisional Drug Overdose Death Counts

    • catalog.data.gov
    • healthdata.gov
    • +8more
    Updated Sep 20, 2025
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Centers for Disease Control and Prevention (2025). VSRR Provisional Drug Overdose Death Counts [Dataset]. https://catalog.data.gov/dataset/vsrr-provisional-drug-overdose-death-counts
    Explore at:
    Dataset updated
    Sep 20, 2025
    Dataset provided by
    Centers for Disease Control and Prevention (http://www.cdc.gov/)
    Description

    This data presents provisional counts for drug overdose deaths based on a current flow of mortality data in the National Vital Statistics System. Counts for the most recent final annual data are provided for comparison. National provisional counts include deaths occurring within the 50 states and the District of Columbia as of the date specified and may not include all deaths that occurred during a given time period. Provisional counts are often incomplete and causes of death may be pending investigation resulting in an underestimate relative to final counts. To address this, methods were developed to adjust provisional counts for reporting delays by generating a set of predicted provisional counts. Several data quality metrics, including the percent completeness in overall death reporting, percentage of deaths with cause of death pending further investigation, and the percentage of drug overdose deaths with specific drugs or drug classes reported are included to aid in interpretation of provisional data as these measures are related to the accuracy of provisional counts. Reporting of the specific drugs and drug classes involved in drug overdose deaths varies by jurisdiction, and comparisons of death rates involving specific drugs across selected jurisdictions should not be made. Provisional data presented will be updated on a monthly basis as additional records are received. For more information please visit: https://www.cdc.gov/nchs/nvss/vsrr/drug-overdose-data.htm

Cite
California Department of Public Health (2025). Death Profiles by County [Dataset]. https://data.chhs.ca.gov/dataset/death-profiles-by-county

Death Profiles by County

Explore at:
3 scholarly articles cite this dataset (View in Google Scholar)
csv(74351424), csv(75015194), csv(11738570), csv(1128641), csv(15127221), csv(60517511), csv(73906266), csv(60201673), csv(60676655), csv(28125832), csv(60023260), csv(51592721), csv(74689382), csv(52019564), csv(5095), csv(74043128), csv(24235858), csv(74497014), zip, csv(29775349)
Available download formats
Dataset updated
Nov 26, 2025
Dataset authored and provided by
California Department of Public Health
Description

This dataset contains counts of deaths for California counties based on information entered on death certificates. Final counts are derived from static data and include out-of-state deaths to California residents, whereas provisional counts are derived from incomplete and dynamic data. Provisional counts are based on the records available when the data was retrieved and may not represent all deaths that occurred during the time period. Deaths involving injuries from external or environmental forces, such as accidents, homicide and suicide, often require additional investigation that tends to delay certification of the cause and manner of death. This can result in significant under-reporting of these deaths in provisional data.

The final data tables include both deaths that occurred in each California county regardless of the place of residence (by occurrence) and deaths to residents of each California county (by residence), whereas the provisional data table only includes deaths that occurred in each county regardless of the place of residence (by occurrence). The data are reported as totals, as well as stratified by age, gender, race-ethnicity, and death place type. Deaths due to all causes (ALL) and selected underlying cause of death categories are provided. See temporal coverage for more information on which combinations are available for which years.

The cause of death categories are based solely on the underlying cause of death as coded by the International Classification of Diseases. The underlying cause of death is defined by the World Health Organization (WHO) as "the disease or injury which initiated the train of events leading directly to death, or the circumstances of the accident or violence which produced the fatal injury." It is a single value assigned to each death based on the details as entered on the death certificate. When more than one cause is listed, the order in which they are listed can affect which cause is coded as the underlying cause. This means that similar events could be coded with different underlying causes of death depending on variations in how they were entered. Consequently, while underlying cause of death provides a convenient comparison between cause of death categories, it may not capture the full impact of each cause of death as it does not always take into account all conditions contributing to the death.
