100+ datasets found
  1. Death in the United States

    • kaggle.com
    zip
    Updated Aug 3, 2017
    Cite
    Centers for Disease Control and Prevention (2017). Death in the United States [Dataset]. https://www.kaggle.com/datasets/cdc/mortality
    Explore at:
    Available download formats: zip (766333584 bytes)
    Dataset updated
    Aug 3, 2017
    Dataset authored and provided by
    Centers for Disease Control and Prevention (http://www.cdc.gov/)
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Area covered
    United States
    Description

    Every year the CDC releases the country’s most detailed report on death in the United States under the National Vital Statistics Systems. This mortality dataset is a record of every death in the country for 2005 through 2015, including detailed information about causes of death and the demographic background of the deceased.

    It's been said that "statistics are human beings with the tears wiped off." This is especially true with this dataset. Each death record represents somebody's loved one, often connected with a lifetime of memories and sometimes tragically too short.

    Putting the sensitive nature of the topic aside, analyzing mortality data is essential to understanding the complex circumstances of death across the country. The US Government uses this data to determine life expectancy and understand how death in the U.S. differs from the rest of the world. Whether you’re looking for macro trends or analyzing unique circumstances, we challenge you to use this dataset to find your own answers to one of life’s great mysteries.

    Overview

    This dataset is a collection of CSV files, each containing one year's worth of data, and paired JSON files containing the code mappings, plus an ICD-10 code set. The CSVs were reformatted from their original fixed-width file formats using information extracted from the CDC's PDF manuals using this script. Please note that this process may have introduced errors, as the text extracted from the PDFs is not a perfect match. If you have any questions or find errors in the preparation process, please leave a note in the forums. We hope to publish additional years of data using this method soon.

    A more detailed overview of the data can be found here. You'll find that the fields are consistent within this time window, but some of the data codes change every few years. For example, the 113_cause_recode entry 069 only covers ICD codes (I10,I12) in 2005, but by 2015 it covers (I10,I12,I15). When I post data from years prior to 2005, expect some of the fields themselves to change as well.

    All data comes from the CDC’s National Vital Statistics Systems, with the exception of the Icd10Code data, which is sourced from the World Health Organization.
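As a rough illustration of how a yearly CSV and its paired JSON code mapping might be combined, the sketch below decodes coded values into labels. The column names, code values, and the JSON layout (column name mapped to a code-to-label dictionary) are assumptions made for this example, not the dataset's documented schema.

```python
import csv
import io
import json

# Hypothetical stand-ins for one year's CSV and its paired JSON code mapping.
# Column names, codes, and labels are illustrative assumptions only.
SAMPLE_CSV = "sex,age\nM,67\nF,82\n"
SAMPLE_CODES = '{"sex": {"M": "Male", "F": "Female"}}'


def decode_rows(csv_text, codes_json):
    """Replace coded values with their labels wherever a mapping exists."""
    mappings = json.loads(codes_json)
    decoded = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        decoded.append({
            # Fall back to the raw value when the column or code has no mapping.
            column: mappings.get(column, {}).get(value, value)
            for column, value in row.items()
        })
    return decoded


rows = decode_rows(SAMPLE_CSV, SAMPLE_CODES)
```

Because unmapped values pass through unchanged, a year whose codes differ from the mapping file would show up as raw codes in the output rather than failing silently.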

    Project ideas

    • The CDC's mortality data was the basis of a widely publicized paper, by Anne Case and Nobel prize winner Angus Deaton, arguing that middle-aged whites are dying at elevated rates. One of the criticisms against the paper is that it failed to properly account for the exact ages within the broad bins available through the CDC's WONDER tool. What do these results look like with exact/not-binned age data?
    • Similarly, how sensitive are the mortality trends being discussed in the news to the choice of bin-widths?
    • As noted above, the data preparation process could have introduced errors. Can you find any discrepancies compared to the aggregate metrics on WONDER? If so, please let me know in the forums!
    • WONDER is cited in numerous economics, sociology, and public health research papers. Can you find any papers whose conclusions would be altered if they used the exact data available here rather than binned data from WONDER?
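The bin-width questions above can be explored with a small helper that counts the same exact ages under different bin widths. This is a minimal sketch; the ages are made up for illustration and the function name is hypothetical.

```python
from collections import Counter


def bin_counts(ages, width):
    """Count deaths per age bin of the given width, keyed by the bin's lower edge."""
    if width <= 0:
        raise ValueError("width must be positive")
    counts = Counter((age // width) * width for age in ages)
    return dict(sorted(counts.items()))


# Made-up exact ages at death, for illustration only.
ages = [44, 45, 46, 49, 50, 53, 55, 58]

by_ten = bin_counts(ages, 10)   # two even-looking 10-year bins
by_five = bin_counts(ages, 5)   # 5-year bins reveal a cluster at ages 45-49
```

Here the 10-year bins look balanced while the 5-year bins do not, which is the kind of sensitivity the project ideas ask about.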

    Differences from the first version of the dataset

    • This version of the dataset was prepared in a completely different manner. This has allowed us to provide a much larger volume of data and ensure that codes are available for every field.
    • We've replaced the batch of SQL files with a single JSON per year. Kaggle's platform currently offers better support for JSON files, and this keeps the number of files manageable.
    • A tutorial kernel providing a quick introduction to the new format is available here.
    • Lastly, I apologize if the transition has interrupted anyone's work! If need be, you can still download v1.
  2. Death Profiles by County

    • data.chhs.ca.gov
    • data.ca.gov
    • +3 more
    csv, zip
    Updated Aug 22, 2025
    + more versions
    Cite
    California Department of Public Health (2025). Death Profiles by County [Dataset]. https://data.chhs.ca.gov/dataset/death-profiles-by-county
    Explore at:
    Available download formats: csv(24235858), csv(11738570), csv(15127221), csv(60676655), csv(1128641), csv(60023260), csv(28125832), csv(75015194), zip, csv(74043128), csv(74351424), csv(74497014), csv(60201673), csv(74689382), csv(73906266), csv(60517511), csv(52019564), csv(51592721), csv(5095), csv(25609913)
    Dataset updated
    Aug 22, 2025
    Dataset authored and provided by
    California Department of Public Health
    Description

    This dataset contains counts of deaths for California counties based on information entered on death certificates. Final counts are derived from static data and include out-of-state deaths to California residents, whereas provisional counts are derived from incomplete and dynamic data. Provisional counts are based on the records available when the data was retrieved and may not represent all deaths that occurred during the time period. Deaths involving injuries from external or environmental forces, such as accidents, homicide and suicide, often require additional investigation that tends to delay certification of the cause and manner of death. This can result in significant under-reporting of these deaths in provisional data.

    The final data tables include both deaths that occurred in each California county regardless of the place of residence (by occurrence) and deaths to residents of each California county (by residence), whereas the provisional data table only includes deaths that occurred in each county regardless of the place of residence (by occurrence). The data are reported as totals, as well as stratified by age, gender, race-ethnicity, and death place type. Deaths due to all causes (ALL) and selected underlying cause of death categories are provided. See temporal coverage for more information on which combinations are available for which years.

    The cause of death categories are based solely on the underlying cause of death as coded by the International Classification of Diseases. The underlying cause of death is defined by the World Health Organization (WHO) as "the disease or injury which initiated the train of events leading directly to death, or the circumstances of the accident or violence which produced the fatal injury." It is a single value assigned to each death based on the details as entered on the death certificate. When more than one cause is listed, the order in which they are listed can affect which cause is coded as the underlying cause. This means that similar events could be coded with different underlying causes of death depending on variations in how they were entered. Consequently, while underlying cause of death provides a convenient comparison between cause of death categories, it may not capture the full impact of each cause of death as it does not always take into account all conditions contributing to the death.

  3. Provisional COVID-19 death counts, rates, and percent of total deaths, by...

    • catalog.data.gov
    • data.virginia.gov
    • +2 more
    Updated Sep 26, 2025
    + more versions
    Cite
    Centers for Disease Control and Prevention (2025). Provisional COVID-19 death counts, rates, and percent of total deaths, by jurisdiction of residence [Dataset]. https://catalog.data.gov/dataset/provisional-covid-19-death-counts-rates-and-percent-of-total-deaths-by-jurisdiction-of-res
    Explore at:
    Dataset updated
    Sep 26, 2025
    Dataset provided by
    Centers for Disease Control and Prevention (http://www.cdc.gov/)
    Description

    This file contains COVID-19 death counts, death rates, and percent of total deaths by jurisdiction of residence. The data is grouped by different time periods including 3-month period, weekly, and total (cumulative since January 1, 2020). United States death counts and rates include the 50 states, plus the District of Columbia and New York City. New York state estimates exclude New York City. Puerto Rico is included in HHS Region 2 estimates. Deaths are those with confirmed or presumed COVID-19, coded to ICD–10 code U07.1.

    Number of deaths reported in this file are the total number of COVID-19 deaths received and coded as of the date of analysis and may not represent all deaths that occurred in that period. Counts of deaths occurring before or after the reporting period are not included in the file. Data during recent periods are incomplete because of the lag in time between when the death occurred and when the death certificate is completed, submitted to NCHS and processed for reporting purposes. This delay can range from 1 week to 8 weeks or more, depending on the jurisdiction and cause of death. Death counts should not be compared across states. Data timeliness varies by state. Some states report deaths on a daily basis, while other states report deaths weekly or monthly.

    The ten (10) United States Department of Health and Human Services (HHS) regions include the following jurisdictions.

    • Region 1: Connecticut, Maine, Massachusetts, New Hampshire, Rhode Island, Vermont
    • Region 2: New Jersey, New York, New York City, Puerto Rico
    • Region 3: Delaware, District of Columbia, Maryland, Pennsylvania, Virginia, West Virginia
    • Region 4: Alabama, Florida, Georgia, Kentucky, Mississippi, North Carolina, South Carolina, Tennessee
    • Region 5: Illinois, Indiana, Michigan, Minnesota, Ohio, Wisconsin
    • Region 6: Arkansas, Louisiana, New Mexico, Oklahoma, Texas
    • Region 7: Iowa, Kansas, Missouri, Nebraska
    • Region 8: Colorado, Montana, North Dakota, South Dakota, Utah, Wyoming
    • Region 9: Arizona, California, Hawaii, Nevada
    • Region 10: Alaska, Idaho, Oregon, Washington

    Rates were calculated using the population estimates for 2021, which are estimated as of July 1, 2021 based on the Blended Base produced by the US Census Bureau in lieu of the April 1, 2020 decennial population count. The Blended Base consists of the blend of Vintage 2020 postcensal population estimates, 2020 Demographic Analysis Estimates, and 2020 Census PL 94-171 Redistricting File (see https://www2.census.gov/programs-surveys/popest/technical-documentation/methodology/2020-2021/methods-statement-v2021.pdf). Rates are based on deaths occurring in the specified week/month and are age-adjusted to the 2000 standard population using the direct method (see https://www.cdc.gov/nchs/data/nvsr/nvsr70/nvsr70-08-508.pdf). These rates differ from annual age-adjusted rates, typically presented in NCHS publications based on a full year of data, and from annualized weekly/monthly age-adjusted rates, which have been adjusted to allow comparison with annual rates. Annualized rates present deaths per year per 100,000 population that would be expected in a year if the observed period-specific (weekly/monthly) rate prevailed for a full year.

    Sub-national death counts between 1 and 9 are suppressed in accordance with NCHS data confidentiality standards. Rates based on death counts less than 20 are suppressed in accordance with NCHS standards of reliability as specified in NCHS Data Presentation Standards for Proportions (available from: https://www.cdc.gov/nchs/data/series/sr_02/sr02_175.pdf).
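The suppression thresholds and the idea of annualizing a period rate can be sketched as follows. This is a simplified illustration, not the NCHS procedure: it computes a crude rate rather than an age-adjusted one, and scaling by `days_in_year / period_days` is an assumption about how a period rate might be extended to a full year. The function names are hypothetical.

```python
def suppress_count(count, subnational=True):
    """Return None for sub-national counts between 1 and 9, otherwise the count."""
    if subnational and 1 <= count <= 9:
        return None
    return count


def annualized_crude_rate(deaths, population, period_days, days_in_year=365):
    """Crude deaths per 100,000 per year if the period's rate held for a full year.

    Returns None when the rate would rest on fewer than 20 deaths.
    Assumption: simple linear scaling by days_in_year / period_days.
    """
    if deaths < 20:
        return None
    period_rate = deaths / population * 100_000
    return period_rate * days_in_year / period_days


weekly = annualized_crude_rate(deaths=70, population=1_000_000, period_days=7)
```

A count of zero is returned unchanged by `suppress_count`, since the description only suppresses counts from 1 to 9.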

  4. CORONAVIRUS DEATH by Country Dataset

    • tradingeconomics.com
    csv, excel, json, xml
    Updated Aug 14, 2021
    + more versions
    Cite
    TRADING ECONOMICS (2021). CORONAVIRUS DEATH by Country Dataset [Dataset]. https://tradingeconomics.com/country-list/coronavirus-death
    Explore at:
    Available download formats: csv, xml, excel, json
    Dataset updated
    Aug 14, 2021
    Dataset authored and provided by
    TRADING ECONOMICS
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    2025
    Area covered
    World
    Description

    This dataset provides values for CORONAVIRUS DEATH reported in several countries. The data includes current values, previous releases, historical highs and record lows, release frequency, reported unit and currency.

  5. Natural Disasters Deaths

    • kaggle.com
    Updated Nov 19, 2022
    Cite
    The Devastator (2022). Natural Disasters Deaths [Dataset]. https://www.kaggle.com/datasets/thedevastator/the-fatal-cost-of-natural-disasters
    Explore at:
    Dataset updated
    Nov 19, 2022
    Dataset provided by
    Kaggle
    Authors
    The Devastator
    Description

    Natural Disasters Deaths

    People killed in natural disasters by country by year

    About this dataset

    How much do natural disasters cost us? In lives, in dollars, in infrastructure? This dataset attempts to answer those questions, tracking the death toll and damage cost of major natural disasters since 1985. Disasters included are storms (hurricanes, typhoons, and cyclones), floods, earthquakes, droughts, wildfires, and extreme temperatures.

    How to use the dataset

    This dataset contains information on natural disasters that have occurred around the world from 1900 to 2017. The data includes the date of the disaster, the location, the type of disaster, the number of people killed, and the estimated cost in US dollars.

    Research Ideas

    • An all-in-one disaster map displaying all recorded natural disasters dating back to 1900.
    • Natural disaster hotspots - where do natural disasters most commonly occur and kill the most people?
    • A live map tracking current natural disasters around the world

    Acknowledgements

    License

    See the dataset description for more information.

  6. World Coronavirus COVID-19 Deaths

    • tradingeconomics.com
    csv, excel, json, xml
    Updated Mar 9, 2020
    + more versions
    Cite
    TRADING ECONOMICS (2020). World Coronavirus COVID-19 Deaths [Dataset]. https://tradingeconomics.com/world/coronavirus-deaths
    Explore at:
    Available download formats: excel, csv, xml, json
    Dataset updated
    Mar 9, 2020
    Dataset authored and provided by
    TRADING ECONOMICS
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    Jan 4, 2020 - May 17, 2023
    Area covered
    World
    Description

    The World Health Organization reported 6,932,591 Coronavirus Deaths since the epidemic began. In addition, countries reported 766,440,796 Coronavirus Cases. This dataset provides World Coronavirus Deaths actual values, historical data, forecast, chart, statistics, economic calendar and news.

  7. Johns Hopkins COVID-19 Case Tracker

    • data.world
    csv, zip
    Updated Oct 7, 2025
    Cite
    The Associated Press (2025). Johns Hopkins COVID-19 Case Tracker [Dataset]. https://data.world/associatedpress/johns-hopkins-coronavirus-case-tracker
    Explore at:
    Available download formats: zip, csv
    Dataset updated
    Oct 7, 2025
    Authors
    The Associated Press
    Time period covered
    Jan 22, 2020 - Mar 9, 2023
    Area covered
    Description

    Updates

    • Notice of data discontinuation: Since the start of the pandemic, AP has reported case and death counts from data provided by Johns Hopkins University. Johns Hopkins University has announced that they will stop their daily data collection efforts after March 10. As Johns Hopkins stops providing data, the AP will also stop collecting daily numbers for COVID cases and deaths. The HHS and CDC now collect and visualize key metrics for the pandemic. AP advises using those resources when reporting on the pandemic going forward.

    • April 9, 2020

      • The population estimate data for New York County, NY has been updated to include all five New York City counties (Kings County, Queens County, Bronx County, Richmond County and New York County). This has been done to match the Johns Hopkins COVID-19 data, which aggregates counts for the five New York City counties to New York County.
    • April 20, 2020

      • Johns Hopkins death totals in the US now include confirmed and probable deaths in accordance with CDC guidelines as of April 14. One significant result of this change was an increase of more than 3,700 deaths in the New York City count. This change will likely result in increases for death counts elsewhere as well. The AP does not alter the Johns Hopkins source data, so probable deaths are included in this dataset as well.
    • April 29, 2020

      • The AP is now providing timeseries data for counts of COVID-19 cases and deaths. The raw counts are provided here unaltered, along with a population column with Census ACS-5 estimates and calculated daily case and death rates per 100,000 people. Please read the updated caveats section for more information.
    • September 1st, 2020

      • Johns Hopkins is now providing counts for the five New York City counties individually.
    • February 12, 2021

      • The Ohio Department of Health recently announced that as many as 4,000 COVID-19 deaths may have been underreported through the state’s reporting system, and that the "daily reported death counts will be high for a two to three-day period."
      • Because deaths data will be anomalous for consecutive days, we have chosen to freeze Ohio's rolling average for daily deaths at the last valid measure until Johns Hopkins is able to back-distribute the data. The raw daily death counts, as reported by Johns Hopkins and including the backlogged death data, will still be present in the new_deaths column.
    • February 16, 2021

      • Johns Hopkins has reconciled Ohio's historical deaths data with the state.

    Overview

    The AP is using data collected by the Johns Hopkins University Center for Systems Science and Engineering as our source for outbreak caseloads and death counts for the United States and globally.

    The Hopkins data is available at the county level in the United States. The AP has paired this data with population figures and county rural/urban designations, and has calculated caseload and death rates per 100,000 people. Be aware that caseloads may reflect the availability of tests -- and the ability to turn around test results quickly -- rather than actual disease spread or true infection rates.
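Pairing counts with population figures to get rates per 100,000 people can be sketched as a simple join. The county keys and numbers below are invented for illustration, and skipping records without a population figure is an assumption of this sketch, not necessarily what the AP does.

```python
def death_rates_per_100k(deaths_by_county, population_by_county):
    """Join death counts to population figures and compute deaths per 100,000 people."""
    rates = {}
    for county, deaths in deaths_by_county.items():
        population = population_by_county.get(county)
        if population:  # skip records with a missing or zero population
            rates[county] = deaths / population * 100_000
    return rates


# Hypothetical county keys and values, for illustration only.
rates = death_rates_per_100k(
    {"A": 50, "B": 10, "unassigned": 3},
    {"A": 200_000, "B": 40_000},
)
```

Counts that cannot be matched to a population figure simply drop out of the rate table, so totals computed from rates will not include them.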

    This data is from the Hopkins dashboard that is updated regularly throughout the day. Like all organizations dealing with data, Hopkins is constantly refining and cleaning up their feed, so there may be brief moments where data does not appear correctly. At this link, you’ll find the Hopkins daily data reports, and a clean version of their feed.

    The AP is updating this dataset hourly at 45 minutes past the hour.

    To learn more about AP's data journalism capabilities for publishers, corporations and financial institutions, go here or email kromano@ap.org.

    Queries

    Use AP's queries to filter the data or to join to other datasets we've made available to help cover the coronavirus pandemic

    Interactive

    The AP has designed an interactive map to track COVID-19 cases reported by Johns Hopkins.

    https://datawrapper.dwcdn.net/nRyaf/15/

    Interactive Embed Code

    <iframe title="USA counties (2018) choropleth map Mapping COVID-19 cases by county" aria-describedby="" id="datawrapper-chart-nRyaf" src="https://datawrapper.dwcdn.net/nRyaf/10/" scrolling="no" frameborder="0" style="width: 0; min-width: 100% !important;" height="400"></iframe><script type="text/javascript">(function() {'use strict';window.addEventListener('message', function(event) {if (typeof event.data['datawrapper-height'] !== 'undefined') {for (var chartId in event.data['datawrapper-height']) {var iframe = document.getElementById('datawrapper-chart-' + chartId) || document.querySelector("iframe[src*='" + chartId + "']");if (!iframe) {continue;}iframe.style.height = event.data['datawrapper-height'][chartId] + 'px';}}});})();</script>
    

    Caveats

    • This data represents the number of cases and deaths reported by each state and has been collected by Johns Hopkins from a number of sources cited on their website.
    • In some cases, deaths or cases of people who've crossed state lines -- either to receive treatment or because they became sick and couldn't return home while traveling -- are reported in a state they aren't currently in, because of state reporting rules.
    • In some states, there are a number of cases not assigned to a specific county -- for those cases, the county name is "unassigned to a single county"
    • This data should be credited to Johns Hopkins University's COVID-19 tracking project. The AP is simply making it available here for ease of use for reporters and members.
    • Caseloads may reflect the availability of tests -- and the ability to turn around test results quickly -- rather than actual disease spread or true infection rates.
    • Population estimates at the county level are drawn from 2014-18 5-year estimates from the American Community Survey.
    • The Urban/Rural classification scheme is from the Centers for Disease Control and Prevention's National Center for Health Statistics. It puts each county into one of six categories -- from Large Central Metro to Non-Core -- according to population and other characteristics. More details about the classifications can be found here.

    Johns Hopkins timeseries data

    • Johns Hopkins pulls data regularly to update their dashboard. Once a day, around 8pm EDT, Johns Hopkins adds the counts for all areas they cover to the timeseries file. These counts are snapshots of the latest cumulative counts provided by the source on that day. This can lead to inconsistencies if a source updates their historical data for accuracy, either increasing or decreasing the latest cumulative count.
    • Johns Hopkins periodically edits their historical timeseries data for accuracy. They provide a file documenting all errors in their timeseries files that they have identified and fixed here.
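Because the timeseries stores cumulative snapshots, daily counts are differences between consecutive days, and a downward revision by a source appears as a negative daily value. A minimal sketch, using made-up numbers and a hypothetical function name:

```python
def daily_from_cumulative(cumulative):
    """Return day-over-day differences and the positions where the total dropped.

    Positions index into the cumulative series (position 0 has no difference).
    """
    daily = [later - earlier for earlier, later in zip(cumulative, cumulative[1:])]
    revised = [i + 1 for i, change in enumerate(daily) if change < 0]
    return daily, revised


# Made-up cumulative counts; the third snapshot was revised downward.
daily, revised = daily_from_cumulative([100, 120, 115, 140])
```

Flagging negative days, rather than clipping them to zero, keeps the revision visible to whoever analyzes the series.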

    Attribution

    This data should be credited to Johns Hopkins University's COVID-19 tracking project.

  8. CIA Factbook, Death Rate by Country, World, 2007

    • geocommons.com
    Updated May 27, 2008
    Cite
    data (2008). CIA Factbook, Death Rate by Country, World, 2007 [Dataset]. http://geocommons.com/search.html
    Explore at:
    Dataset updated
    May 27, 2008
    Dataset provided by
    data
    Description

    This dataset gives the average annual number of deaths during a year per 1,000 population at midyear; also known as crude death rate. This information was found at the CIA's World Factbook 2007. The site had this to say about death rate, "The death rate, while only a rough indicator of the mortality situation in a country, accurately indicates the current mortality impact on population growth. This indicator is significantly affected by age distribution, and most countries will eventually show a rise in the overall death rate, in spite of continued decline in mortality at all ages, as declining fertility results in an aging population." Source: https://www.cia.gov/library/publications/the-world-factbook/docs/notesanddefs.html#2010 Accessed: 9.17.07
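Written as a formula, the crude death rate described above is the following, where $D$ (deaths during the year) and $P_{\text{mid}}$ (midyear population) are symbols introduced here for illustration:

```latex
\text{crude death rate} = \frac{D}{P_{\text{mid}}} \times 1000
```

For example, 8,000 deaths in a midyear population of 1,000,000 gives a crude death rate of 8 per 1,000.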

  9. Causes of death - deaths by country of residence and occurrence

    • ec.europa.eu
    + more versions
    Cite
    Eurostat, Causes of death - deaths by country of residence and occurrence [Dataset]. http://doi.org/10.2908/HLTH_CD_ARO
    Explore at:
    Available download formats: tsv, application/vnd.sdmx.genericdata+xml;version=2.1, application/vnd.sdmx.data+xml;version=3.0.0, json, application/vnd.sdmx.data+csv;version=1.0.0, application/vnd.sdmx.data+csv;version=2.0.0
    Dataset authored and provided by
    Eurostat (https://ec.europa.eu/eurostat)
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    2011 - 2023
    Area covered
    Romania, France, United Kingdom, Spain, Serbia, Liechtenstein, Switzerland, Türkiye, Metropolitan France, Luxembourg
    Description

    Data on causes of death (COD) provide information on mortality patterns and form a major element of public health information.

    The COD data refer to the underlying cause which - according to the World Health Organisation (WHO) - is "the disease or injury which initiated the train of morbid events leading directly to death, or the circumstances of the accident or violence which produced the fatal injury".

    The data are derived from the medical certificate of death, which is obligatory in the Member States. The information recorded in the death certificate is according to the rules specified by the WHO.

    Data published in Eurostat's dissemination database are broken down by sex, 5-year age groups, cause of death and by residency and country of occurrence. For stillbirths and neonatal deaths additional breakdowns might include age of mother and parity.

    Data are available for Member States, Iceland, Norway, Liechtenstein, Switzerland, United Kingdom, Serbia, Turkey, North Macedonia and Albania. Regional data (NUTS level 2) are available for all of the countries having NUTS2 regions except Albania.

    Annual national data are available in Eurostat's dissemination database in absolute numbers, crude death rates and standardised death rates. At regional level the same is provided in the form of 3-year averages (the average of year, year-1 and year-2). Annual crude and standardised death rates are also available at NUTS2 level. Monthly national data are available for 21 EU Member States from reference year 2019 and in 24 Member States from reference year 2022 in absolute numbers and standardised death rates.
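The regional 3-year average (the reference year and the two years before it) can be sketched as below. The values are invented, the function name is hypothetical, and returning `None` when any of the three years is missing is an assumption of this sketch rather than Eurostat's documented rule.

```python
def three_year_average(values_by_year, year):
    """Average of year, year-1 and year-2, or None if any of the three is missing."""
    years = (year, year - 1, year - 2)
    if any(y not in values_by_year for y in years):
        return None
    return sum(values_by_year[y] for y in years) / 3


# Made-up annual values for one region.
regional = {2019: 90, 2020: 120, 2021: 90}
avg_2021 = three_year_average(regional, 2021)
```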

  10. Mass Killings in America, 2006 - present

    • data.world
    csv, zip
    Updated Oct 7, 2025
    Cite
    The Associated Press (2025). Mass Killings in America, 2006 - present [Dataset]. https://data.world/associatedpress/mass-killings-public
    Explore at:
    Available download formats: zip, csv
    Dataset updated
    Oct 7, 2025
    Authors
    The Associated Press
    Time period covered
    Jan 1, 2006 - Sep 28, 2025
    Area covered
    Description

    THIS DATASET WAS LAST UPDATED AT 2:10 AM EASTERN ON OCT. 7

    OVERVIEW

    2019 had the most mass killings since at least the 1970s, according to the Associated Press/USA TODAY/Northeastern University Mass Killings Database.

    In all, there were 45 mass killings, defined as when four or more people are killed excluding the perpetrator. Of those, 33 were mass shootings. This summer was especially violent, with three high-profile public mass shootings occurring in the span of just four weeks, leaving 38 killed and 66 injured.

    A total of 229 people died in mass killings in 2019.

    The AP's analysis found that more than 50% of the incidents were family annihilations, which is similar to prior years. Although they are far less common, the 9 public mass shootings during the year were the most deadly type of mass murder, resulting in 73 people's deaths, not including the assailants.

    One-third of the offenders died at the scene of the killing or soon after, half from suicides.

    About this Dataset

    The Associated Press/USA TODAY/Northeastern University Mass Killings database tracks all U.S. homicides since 2006 involving four or more people killed (not including the offender) over a short period of time (24 hours) regardless of weapon, location, victim-offender relationship or motive. The database includes information on these and other characteristics concerning the incidents, offenders, and victims.

    The AP/USA TODAY/Northeastern database represents the most complete tracking of mass murders by the above definition currently available. Other efforts, such as the Gun Violence Archive or Everytown for Gun Safety may include events that do not meet our criteria, but a review of these sites and others indicates that this database contains every event that matches the definition, including some not tracked by other organizations.

    This data will be updated periodically and can be used as an ongoing resource to help cover these events.

    Using this Dataset

    To get basic counts of incidents of mass killings and mass shootings by year nationwide, use these queries:

    Mass killings by year

    Mass shootings by year

    To get these counts just for your state:

    Filter killings by state

    Definition of "mass murder"

    Mass murder is defined as the intentional killing of four or more victims by any means within a 24-hour period, excluding the deaths of unborn children and the offender(s). The standard of four or more dead was initially set by the FBI.

    This definition does not exclude cases based on method (e.g., shootings only), type or motivation (e.g., public only), victim-offender relationship (e.g., strangers only), or number of locations (e.g., one). The time frame of 24 hours was chosen to eliminate conflation with spree killers, who kill multiple victims in quick succession in different locations or incidents, and to satisfy the traditional requirement of occurring in a “single incident.”

    Offenders who commit mass murder during a spree (before or after committing additional homicides) are included in the database, and all victims within seven days of the mass murder are included in the victim count. Negligent homicides related to driving under the influence or accidental fires are excluded due to the lack of offender intent. Only incidents occurring within the 50 states and Washington D.C. are considered.
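The core of the definition above can be expressed as a filter over incident records. This is a simplified sketch with hypothetical field names: it checks intent, the four-victim threshold, and the 24-hour window, and does not model the seven-day spree rule or the geographic restriction.

```python
from dataclasses import dataclass


@dataclass
class Incident:
    # Hypothetical fields, not the database's actual column names.
    victims_killed: int     # excludes the offender(s) and unborn children
    hours_spanned: float    # time between the first and last killing
    intentional: bool       # False for negligent homicides (e.g. DUI, accidental fires)


def meets_definition(incident):
    """True if the incident is an intentional killing of 4+ victims within 24 hours."""
    return (
        incident.intentional
        and incident.victims_killed >= 4
        and incident.hours_spanned <= 24
    )


examples = [
    Incident(victims_killed=4, hours_spanned=2, intentional=True),
    Incident(victims_killed=3, hours_spanned=1, intentional=True),
    Incident(victims_killed=5, hours_spanned=1, intentional=False),
]
flags = [meets_definition(i) for i in examples]
```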

    Methodology

    Project researchers first identified potential incidents using the Federal Bureau of Investigation’s Supplementary Homicide Reports (SHR). Homicide incidents in the SHR were flagged as potential mass murder cases if four or more victims were reported on the same record, and the type of death was murder or non-negligent manslaughter.

    Cases were subsequently verified utilizing media accounts, court documents, academic journal articles, books, and local law enforcement records obtained through Freedom of Information Act (FOIA) requests. Each data point was corroborated by multiple sources, which were compiled into a single document to assess the quality of information.

    Where sources contradicted one another, official law enforcement or court records were used when available, followed by the most recent media or academic source.
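
    The precedence rule for conflicting sources could be sketched as follows. The `SourceClaim` type and the `kind` labels are hypothetical, and choosing the most recent record when several official records disagree is an assumption the text does not address.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional

# Hypothetical labels for source types treated as official records.
OFFICIAL_KINDS = {"law_enforcement", "court"}


@dataclass
class SourceClaim:
    kind: str        # e.g. "law_enforcement", "court", "media", "academic"
    published: date  # publication or filing date of the source
    value: str       # the value this source reports for a data point


def resolve(claims: List[SourceClaim]) -> Optional[str]:
    """Prefer official records; otherwise fall back to the most recent source."""
    if not claims:
        return None
    official = [c for c in claims if c.kind in OFFICIAL_KINDS]
    pool = official or claims
    # Assumption: among equally ranked sources, the most recent one wins.
    return max(pool, key=lambda c: c.published).value
```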

    Case information was subsequently compared with every other known mass murder database to ensure reliability and validity. Incidents listed in the SHR that could not be independently verified were excluded from the database.

    Project researchers also conducted extensive searches for incidents not reported in the SHR during the time period, utilizing internet search engines, Lexis-Nexis, and Newspapers.com. Search terms include: [number] dead, [number] killed, [number] slain, [number] murdered, [number] homicide, mass murder, mass shooting, massacre, rampage, family killing, familicide, and arson murder. Offender, victim, and location names were also directly searched when available.

    This project started at USA TODAY in 2012.

    Contacts

    Contact AP Data Editor Justin Myers with questions, suggestions or comments about this dataset at jmyers@ap.org. The Northeastern University researcher working with AP and USA TODAY is Professor James Alan Fox, who can be reached at j.fox@northeastern.edu or 617-416-4400.

  11. Leading causes of death, total population, by age group

    • www150.statcan.gc.ca
    • open.canada.ca
    Updated Feb 19, 2025
    Cite
    Government of Canada, Statistics Canada (2025). Leading causes of death, total population, by age group [Dataset]. http://doi.org/10.25318/1310039401-eng
    Explore at:
    Dataset updated
    Feb 19, 2025
    Dataset provided by
    Statistics Canada (https://statcan.gc.ca/en)
    Area covered
    Canada
    Description

    Rank, number of deaths, percentage of deaths, and age-specific mortality rates for the leading causes of death, by age group and sex, 2000 to most recent year.

  12. Fire statistics data tables

    • gov.uk
    • s3.amazonaws.com
    Updated Sep 25, 2025
    Cite
    Ministry of Housing, Communities and Local Government (2025). Fire statistics data tables [Dataset]. https://www.gov.uk/government/statistical-data-sets/fire-statistics-data-tables
    Explore at:
    Dataset updated
    Sep 25, 2025
    Dataset provided by
    GOV.UK
    Authors
    Ministry of Housing, Communities and Local Government
    Description

    On 1 April 2025 responsibility for fire and rescue transferred from the Home Office to the Ministry of Housing, Communities and Local Government.

    This information covers fires, false alarms and other incidents attended by fire crews, and the statistics include the numbers of incidents, fires, fatalities and casualties as well as information on response times to fires. The Ministry of Housing, Communities and Local Government (MHCLG) also collect information on the workforce, fire prevention work, health and safety and firefighter pensions. All data tables on fire statistics are below.

    MHCLG has responsibility for fire services in England. The vast majority of data tables produced by the Ministry of Housing, Communities and Local Government are for England, but some tables (0101, 0103, 0201, 0501, 1401) are for Great Britain split by nation. In the past the Department for Communities and Local Government (which previously had responsibility for fire services in England) produced data tables for Great Britain and at times the UK. Similar information for the devolved administrations is available at Scotland: Fire and Rescue Statistics (https://www.firescotland.gov.uk/about/statistics/), Wales: Community safety (https://statswales.gov.wales/Catalogue/Community-Safety-and-Social-Inclusion/Community-Safety) and Northern Ireland: Fire and Rescue Statistics (https://www.nifrs.org/home/about-us/publications/).

    If you use assistive technology (for example, a screen reader) and need a version of any of these documents in a more accessible format, please email alternativeformats@communities.gov.uk. Please tell us what format you need. It will help us if you say what assistive technology you use.

    Related content

    Fire statistics guidance
    Fire statistics incident level datasets

    Incidents attended

    FIRE0101: Incidents attended by fire and rescue services by nation and population (MS Excel Spreadsheet, 153 KB) (https://assets.publishing.service.gov.uk/media/686d2aa22557debd867cbe14/FIRE0101.xlsx). Previous FIRE0101 tables

    FIRE0102: Incidents attended by fire and rescue services in England, by incident type and fire and rescue authority (MS Excel Spreadsheet, 2.19 MB) (https://assets.publishing.service.gov.uk/media/686d2ab52557debd867cbe15/FIRE0102.xlsx). Previous FIRE0102 tables

    FIRE0103: Fires attended by fire and rescue services by nation and population (MS Excel Spreadsheet, 201 KB) (https://assets.publishing.service.gov.uk/media/686d2aca10d550c668de3c69/FIRE0103.xlsx). Previous FIRE0103 tables

    FIRE0104: Fire false alarms by reason for false alarm, England (MS Excel Spreadsheet, 492 KB) (https://assets.publishing.service.gov.uk/media/686d2ad92557debd867cbe16/FIRE0104.xlsx). Previous FIRE0104 tables

    Dwelling fires attended

    FIRE0201: Dwelling fires attended by fire and rescue services by motive, population and nation (MS Excel Spreadsheet, 192 KB) (https://assets.publishing.service.gov.uk/media/686d2af42cfe301b5fb6789f/FIRE0201.xlsx). Previous FIRE0201 tables

  13. Leading Causes of Death US

    • kaggle.com
    Updated Jan 29, 2023
    Cite
    The Devastator (2023). Leading Causes of Death US [Dataset]. https://www.kaggle.com/datasets/thedevastator/leading-causes-of-death-us
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Jan 29, 2023
    Dataset provided by
    Kaggle
    Authors
    The Devastator
    Area covered
    United States
    Description

    Leading Causes of Death US

    1980–2009 by Sex, Race, and Hispanic Origin

    By Health [source]

    About this dataset

    This dataset covers the leading causes of death in the United States from 1980 to 2009, broken down by sex, race, and Hispanic origin. It sheds light on how mortality in the US has changed over time among these categories. Covering everything from heart disease to cancer to suicide, it can be used by health researchers and policy makers to better understand disparities in healthcare and deaths across different groups. Whether studying broad public health questions or more targeted population issues such as gender differences in death rates, this dataset provides a resource for anyone interested in examining mortality across demographic lines.

    How to use the dataset

    This dataset can be used to explore some of the leading causes of death in the United States from 1980 to 2009, broken down by sex, race, and Hispanic origin. This data can be used to better understand mortality trends and risk factors associated with different populations in America.

    By using this dataset you can compare and contrast mortality rates across different gender, racial, and ethnic groups during this time period. You can also compare different causes of death within these demographic categories to see if there are any patterns over time or notable differences between groups.

    You could also use this data to track changes across population groups as a whole, or to look at details for specific years or causes of death in particular groups. This can give insight into health disparities across population segments in America, which may help advocates for social change and public policy shifts toward improved health outcomes for all Americans.

    Research Ideas

    • Analyzing regional or state-level differences in mortality rates over time.
    • Examining the behavioral or other risk factors associated with each cause of death for different genders and populations.
    • Examining the prevalence of each cause of death as a proportion of the overall population trend in different socio-economic categories such as race or income level.

    Acknowledgements

    If you use this dataset in your research, please credit the original authors. Data Source

    License

    License: Dataset copyright by authors

    You are free to:
    • Share: copy and redistribute the material in any medium or format for any purpose, even commercially.
    • Adapt: remix, transform, and build upon the material for any purpose, even commercially.

    You must:
    • Give appropriate credit: provide a link to the license, and indicate if changes were made.
    • ShareAlike: you must distribute your contributions under the same license as the original.
    • Keep intact: all notices that refer to this license, including copyright notices.

    Columns

    File: Selected_Trend_Table_from_Health_United_States_2011._Leading_causes_of_death_and_numbers_of_deaths_by_sex_race_and_Hispanic_origin_United_States_1980_and_2009.csv

    | Column name    | Description                                                                                              |
    |:---------------|:---------------------------------------------------------------------------------------------------------|
    | Group          | The group of people the cause of death applies to (e.g. men, women, whites, blacks, hispanics). (String) |
    | Year           | The year the cause of death was recorded. (Integer)                                                      |
    | Cause of death | The cause of death. (String)                                                                             |
    | Flag           | A flag indicating whether the cause of death is considered a leading cause. (Boolean)                    |
    | Deaths         | The number of deaths attributed to the cause of death. (Integer)                                         |

    Acknowledgements

    If you use this dataset in your research, please credit Health.

  14. Israel IL: Death Rate: Crude: per 1000 People

    • ceicdata.com
    Updated Mar 15, 2018
    Cite
    CEICdata.com (2018). Israel IL: Death Rate: Crude: per 1000 People [Dataset]. https://www.ceicdata.com/en/israel/population-and-urbanization-statistics/il-death-rate-crude-per-1000-people
    Explore at:
    Dataset updated
    Mar 15, 2018
    Dataset provided by
    CEICdata.com
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Time period covered
    Dec 1, 2005 - Dec 1, 2016
    Area covered
    Israel
    Variables measured
    Population
    Description

    Israel IL: Death Rate: Crude: per 1000 People data was reported at 5.100 Ratio in 2016. This records a decrease from the previous number of 5.300 Ratio for 2015. Israel IL: Death Rate: Crude: per 1000 People data is updated yearly, averaging 6.300 Ratio (median) from Dec 1960 to 2016, with 57 observations. The data reached an all-time high of 7.200 Ratio in 1972 and a record low of 5.100 Ratio in 2016. Israel IL: Death Rate: Crude: per 1000 People data remains in active status in CEIC and is reported by World Bank. The data is categorized under Global Database’s Israel – Table IL.World Bank.WDI: Population and Urbanization Statistics.

    Crude death rate indicates the number of deaths occurring during the year, per 1,000 population estimated at midyear. Subtracting the crude death rate from the crude birth rate provides the rate of natural increase, which is equal to the rate of population change in the absence of migration.

    Sources: (1) United Nations Population Division, World Population Prospects: 2017 Revision; (2) census reports and other statistical publications from national statistical offices; (3) Eurostat: Demographic Statistics; (4) United Nations Statistical Division, Population and Vital Statistics Report (various years); (5) U.S. Census Bureau: International Database; and (6) Secretariat of the Pacific Community: Statistics and Demography Programme. Aggregation: weighted average.
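
    The two definitions above translate directly into arithmetic. The sketch below uses hypothetical function names, and the figures in the usage note are made up for illustration; they are not taken from this dataset.

```python
def crude_rate_per_1000(events: int, midyear_population: int) -> float:
    """Events (deaths or births) during the year per 1,000 midyear population."""
    if midyear_population <= 0:
        raise ValueError("midyear_population must be positive")
    return 1000.0 * events / midyear_population


def natural_increase_per_1000(crude_birth_rate: float, crude_death_rate: float) -> float:
    """Rate of natural increase: crude birth rate minus crude death rate."""
    return crude_birth_rate - crude_death_rate
```

    For example, a hypothetical population of 1,000,000 at midyear with 5,100 deaths has a crude death rate of about 5.1 per 1,000; with a crude birth rate of 20.0 per 1,000, its rate of natural increase would be about 14.9 per 1,000.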

  15. Death Profiles by ZIP Code

    • data.chhs.ca.gov
    csv, zip
    Updated Apr 22, 2025
    Cite
    California Department of Public Health (2025). Death Profiles by ZIP Code [Dataset]. https://data.chhs.ca.gov/dataset/death-profiles-by-zip-code
    Explore at:
    Available download formats: csv(4571), csv(78958555), csv(80055974), csv(80054609), csv(40627562), zip
    Dataset updated
    Apr 22, 2025
    Dataset authored and provided by
    California Department of Public Health
    Description

    This dataset contains counts of deaths for California residents by ZIP Code based on information entered on death certificates. Final counts are derived from static data and include out-of-state deaths of California residents. The data tables include deaths of residents of California by ZIP Code of residence (by residence). The data are reported as totals, as well as stratified by age and gender. Deaths due to all causes (ALL) and selected underlying cause of death categories are provided. See temporal coverage for more information on which combinations are available for which years.

    The cause of death categories are based solely on the underlying cause of death as coded by the International Classification of Diseases. The underlying cause of death is defined by the World Health Organization (WHO) as "the disease or injury which initiated the train of events leading directly to death, or the circumstances of the accident or violence which produced the fatal injury." It is a single value assigned to each death based on the details as entered on the death certificate. When more than one cause is listed, the order in which they are listed can affect which cause is coded as the underlying cause. This means that similar events could be coded with different underlying causes of death depending on variations in how they were entered. Consequently, while underlying cause of death provides a convenient comparison between cause of death categories, it may not capture the full impact of each cause of death as it does not always take into account all conditions contributing to the death.

  16. Cancer Mortality & Incidence Rates: (Country LVL)

    • kaggle.com
    Updated Dec 3, 2022
    Cite
    The Devastator (2022). Cancer Mortality & Incidence Rates: (Country LVL) [Dataset]. https://www.kaggle.com/datasets/thedevastator/us-county-level-cancer-mortality-and-incidence-r/versions/2
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Dec 3, 2022
    Dataset provided by
    Kaggle
    Authors
    The Devastator
    Description

    Cancer Mortality & Incidence Rates: (Country LVL)

    Investigating Cancer Trends over time

    By Data Exercises [source]

    About this dataset

    This dataset is a collection of county-level cancer mortality and incidence rates in the United States from 2000 to 2014. It provides a detailed view of cancer cases, deaths, and trends at a local level. The columns include County, FIPS, age-adjusted death rate, average death rate per year, recent trend (2) in death rates, recent 5-year trend (2) in death rates, and average annual count for each county. The dataset can be used to study the patterns and effects of cancer on communities and to help inform policy decisions on mitigating risk factors or increasing preventive measures such as screenings. With records from across the United States over 15 years, it can support decisions on policy development within your own community.

    How to use the dataset

    This dataset provides comprehensive US county-level cancer mortality and incidence rates from 2000 to 2014. It includes the mortality and incidence rate for each county, as well as whether the county met the objective of 45.5 deaths per 100,000 people. It also provides information on recent trends in death rates and average annual counts of cases over the five year period studied.

    This dataset can be useful to researchers studying trends in cancer death rates across counties. With it, researchers can examine how different counties are performing in providing treatment and prevention services for cancer patients, and whether preventive measures and healthcare access are reducing cancer mortality rates over time. The data can also inform policy makers about counties needing more targeted prevention efforts or additional resources for better healthcare access within at-risk communities.

    When using this dataset, pay close attention to qualitative columns such as “Recent Trend” or “Recent 5-Year Trend (2)”. They may reveal long-term changes that are not readily apparent from quantitative variables such as the age-adjusted death rate or average deaths per year, which cover shorter periods of one or five years respectively. Additionally, when comparing counties, take note of any FIPS code differences, which may indicate that data was collected by a different source with a different methodology than the one used in other areas studied.

    Research Ideas

    • Using this dataset, we can identify patterns in cancer mortality and incidence rates that are statistically significant to create treatment regimens or preventive measures specifically targeting those areas.
    • This data can be useful for policymakers to target areas with elevated cancer mortality and incidence rates so they can allocate financial resources to these areas more efficiently.
    • This dataset can be used to investigate which factors (such as pollution levels, access to medical care, genetic makeup) may influence cancer mortality and incidence rates in different US counties.

    Acknowledgements

    If you use this dataset in your research, please credit the original authors. Data Source

    License

    License: Dataset copyright by authors

    You are free to:
    • Share: copy and redistribute the material in any medium or format for any purpose, even commercially.
    • Adapt: remix, transform, and build upon the material for any purpose, even commercially.

    You must:
    • Give appropriate credit: provide a link to the license, and indicate if changes were made.
    • ShareAlike: you must distribute your contributions under the same license as the original.
    • Keep intact: all notices that refer to this license, including copyright notices.

    Columns

    File: death .csv | Column name | Description | |:-------------------------------------------|:-------------------------------------------------------------------...

  17. Global Country Information 2023

    • zenodo.org
    • data.niaid.nih.gov
    csv
    Updated Jun 15, 2024
    Cite
    Nidula Elgiriyewithana (2024). Global Country Information 2023 [Dataset]. http://doi.org/10.5281/zenodo.8165229
    Explore at:
    Available download formats: csv
    Dataset updated
    Jun 15, 2024
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Nidula Elgiriyewithana
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    This comprehensive dataset provides a wealth of information about all countries worldwide, covering a wide range of indicators and attributes. It encompasses demographic statistics, economic indicators, environmental factors, healthcare metrics, education statistics, and much more. With every country represented, this dataset offers a complete global perspective on various aspects of nations, enabling in-depth analyses and cross-country comparisons.

    Key Features

    • Country: Name of the country.
    • Density (P/Km2): Population density measured in persons per square kilometer.
    • Abbreviation: Abbreviation or code representing the country.
    • Agricultural Land (%): Percentage of land area used for agricultural purposes.
    • Land Area (Km2): Total land area of the country in square kilometers.
    • Armed Forces Size: Size of the armed forces in the country.
    • Birth Rate: Number of births per 1,000 population per year.
    • Calling Code: International calling code for the country.
    • Capital/Major City: Name of the capital or major city.
    • CO2 Emissions: Carbon dioxide emissions in tons.
    • CPI: Consumer Price Index, a measure of inflation and purchasing power.
    • CPI Change (%): Percentage change in the Consumer Price Index compared to the previous year.
    • Currency_Code: Currency code used in the country.
    • Fertility Rate: Average number of children born to a woman during her lifetime.
    • Forested Area (%): Percentage of land area covered by forests.
    • Gasoline_Price: Price of gasoline per liter in local currency.
    • GDP: Gross Domestic Product, the total value of goods and services produced in the country.
    • Gross Primary Education Enrollment (%): Gross enrollment ratio for primary education.
    • Gross Tertiary Education Enrollment (%): Gross enrollment ratio for tertiary education.
    • Infant Mortality: Number of deaths per 1,000 live births before reaching one year of age.
    • Largest City: Name of the country's largest city.
    • Life Expectancy: Average number of years a newborn is expected to live.
    • Maternal Mortality Ratio: Number of maternal deaths per 100,000 live births.
    • Minimum Wage: Minimum wage level in local currency.
    • Official Language: Official language(s) spoken in the country.
    • Out of Pocket Health Expenditure (%): Percentage of total health expenditure paid out-of-pocket by individuals.
    • Physicians per Thousand: Number of physicians per thousand people.
    • Population: Total population of the country.
    • Population: Labor Force Participation (%): Percentage of the population that is part of the labor force.
    • Tax Revenue (%): Tax revenue as a percentage of GDP.
    • Total Tax Rate: Overall tax burden as a percentage of commercial profits.
    • Unemployment Rate: Percentage of the labor force that is unemployed.
    • Urban Population: Percentage of the population living in urban areas.
    • Latitude: Latitude coordinate of the country's location.
    • Longitude: Longitude coordinate of the country's location.

    Potential Use Cases

    • Analyze population density and land area to study spatial distribution patterns.
    • Investigate the relationship between agricultural land and food security.
    • Examine carbon dioxide emissions and their impact on climate change.
    • Explore correlations between economic indicators such as GDP and various socio-economic factors.
    • Investigate educational enrollment rates and their implications for human capital development.
    • Analyze healthcare metrics such as infant mortality and life expectancy to assess overall well-being.
    • Study labor market dynamics through indicators such as labor force participation and unemployment rates.
    • Investigate the role of taxation and its impact on economic development.
    • Explore urbanization trends and their social and environmental consequences.

  18. Cross-National Statistics on the Causes of Death, 1966-1974 - Archival...

    • search.gesis.org
    Updated Feb 26, 2021
    Cite
    United Nations (2021). Cross-National Statistics on the Causes of Death, 1966-1974 - Archival Version [Dataset]. http://doi.org/10.3886/ICPSR07624
    Explore at:
    Dataset updated
    Feb 26, 2021
    Dataset provided by
    GESIS search
    ICPSR - Interuniversity Consortium for Political and Social Research
    Authors
    United Nations
    License

    https://search.gesis.org/research_data/datasearch-httpwww-da-ra-deoaip--oaioai-da-ra-de441841

    Description

    Abstract (en): These data are a collection of demographic statistics for the populations of 125 countries or areas throughout the world, prepared by the Statistical Office of the United Nations. The units of analysis are both country and data year. The primary source of data is a set of questionnaires sent monthly and annually to national statistical services and other appropriate government offices. Data include statistics on approximately 50 types of causes of death for the years 1966 through 1974 for males, females, and total populations. Causes of death in 125 countries or areas throughout the world between the years 1966 and 1974.

    2005-11-04: On 2005-03-14 new files were added to one or more datasets. These files included additional setup files as well as one or more of the following: SAS program, SAS transport, SPSS portable, and Stata system files. The metadata record was revised 2005-11-04 to reflect these additions.

    The causes of death are classified according to the 6th, 7th, and 8th versions of an abbreviated list of the World Health Organization's INTERNATIONAL STATISTICAL CLASSIFICATION OF DISEASES, INJURIES, AND CAUSES OF DEATH. Therefore, data for causes of death are not necessarily comparable across countries or data years. Users should refer to Variable 5 in the Variable List for full discussion of this problem. Users interested in comparing deaths for countries or years that use different versions of the Abbreviated list should consult two publications: A. Joan Klebba, and Alice B. Dolman. COMPARABILITY OF MORTALITY STATISTICS FOR THE SEVENTH AND EIGHTH REVISIONS OF THE INTERNATIONAL CLASSIFICATION OF DISEASES, UNITED STATES. Rockville, MD: United States Department of Health, Education, and Welfare. Public Health Service. Health Services and Mental Health Administration. National Center for Health Statistics, 1975, and World Health Organization. MANUAL OF THE INTERNATIONAL STATISTICAL CLASSIFICATION OF DISEASES, INJURIES, AND CAUSES OF DEATH. Geneva, Switzerland: World Health Organization, 1967.

    The user should note that countries have data covering a variety of time spans (the maximum span being 1965-1973), and the data have not been padded to supply missing data codes for those years for which a country does not have data. Thus, Egypt has data for years 1965 through 1972, while Kenya has data for only 1970. (See Appendix D in the codebook to determine the years for which a country has data.)

    It is important that any user of these data consult the United Nations' DEMOGRAPHIC YEARBOOK, 1976, for further explanation of the data's limitations. Certain countries have modified reporting procedures which are presented in both the footnotes and the technical notes accompanying the tables in the Yearbook. There is no way to identify these problems using only the machine-readable data.

    In order to eliminate unnecessary repetition of identifying information, data were merged so that each record now contains all the data for a country for one particular year. In this process, breakdowns of deaths by ethnic group and/or urban/rural classification were omitted since only a few countries provided such information. Each record now contains the data for the number of deaths from each cause of death for male, female, and total.

    While the data appear to be in a rectangular matrix, such is not the case. This occurs because different versions of the abbreviated list are referenced in different data years. The lack of a rectangular data matrix does little to restrict the manageability of the dataset. See codebook for examples.

    While the data have been reformatted and documented by ICPSR staff, there has been no attempt to verify the accuracy and consistency of the data received from the U.N. Statistical Office.

  19. Decadal Avg. Natural Disasters Data [ 1900 - 2010]

    • kaggle.com
    Updated Feb 25, 2022
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Shubam Sumbria (2022). Decadal Avg. Natural Disasters Data [ 1900 - 2010] [Dataset]. https://www.kaggle.com/datasets/shubamsumbria/decadal-avg-natural-disasters-data-1900-2010/discussion
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Feb 25, 2022
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Shubam Sumbria
    Description

    Data published by Our World in Data based on EM-DAT, CRED / UCLouvain, Brussels, Belgium – www.emdat.be (D. Guha-Sapir)

    Variable time span 1900 – 2010

    This dataset has been calculated and compiled by Our World in Data based on raw disaster data published by EM-DAT, CRED / UCLouvain, Brussels, Belgium – www.emdat.be (D. Guha-Sapir). EM-DAT publishes comprehensive, global data on each individual disaster event, estimating the number of deaths, people affected, and economic damages from UN reports, government records, expert opinion, and additional sources.

    Our World in Data has calculated annual aggregates, and decadal averages, for each country based on this raw event-by-event dataset. Decadal figures are measured as the annual average over the subsequent ten-year period. This means figures for ‘1900’ represent the average from 1900 to 1909; ‘1910’ is the average from 1910 to 1919, etc.

    We have calculated per capita rates using population figures from Gapminder (gapminder.org) and the UN World Population Prospects (https://population.un.org/wpp/). Economic damages data is provided by EM-DAT in current US$. We have calculated this as a share of gross domestic product (GDP) using the World Bank’s GDP figures (also in current US$) (https://data.worldbank.org/indicator).

    Definitions of specific metrics are as follows:
    • ‘All disasters’ includes all geophysical, meteorological, and climate events including earthquakes, volcanic activity, landslides, drought, wildfires, storms, and flooding.
    • People affected are those requiring immediate assistance during an emergency situation.
    • The total number of people affected is the sum of injured, affected, and homeless.

    Link: www.emdat.be
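
    The decade labelling described above (‘1900’ meaning 1900 to 1909) can be sketched as follows. The function name is hypothetical, and treating years that are absent from the input as zero is an assumption made here, not a documented detail of the Our World in Data calculation.

```python
from collections import defaultdict
from typing import Dict


def decadal_averages(annual: Dict[int, float]) -> Dict[int, float]:
    """Average annual values per decade, labelled by the decade's first year."""
    totals: Dict[int, float] = defaultdict(float)
    for year, value in annual.items():
        decade_start = year - year % 10  # e.g. 1907 -> 1900
        totals[decade_start] += value
    # Assumption: years absent from the input count as zero, so divide by 10.
    return {decade: total / 10 for decade, total in sorted(totals.items())}
```

    If missing years should instead be left out of the average, the divisor would be the number of years present in each decade rather than 10.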

  20. m

    An Extensive Dataset for the Heart Disease Classification System

    • data.mendeley.com
    Updated Feb 17, 2022
    Cite
    Sozan S. Maghdid (2022). An Extensive Dataset for the Heart Disease Classification System [Dataset]. http://doi.org/10.17632/65gxgy2nmg.2
    Explore at:
    Dataset updated
    Feb 17, 2022
    Authors
    Sozan S. Maghdid
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    Finding a good data source is the first step toward creating a database. Cardiovascular diseases (CVDs) are the leading cause of death worldwide. CVDs include coronary heart disease, cerebrovascular disease, rheumatic heart disease, and other heart and blood vessel problems. According to the World Health Organization, 17.9 million people die from CVDs each year. Heart attacks and strokes account for more than four out of every five CVD deaths, with one-third of these deaths occurring before the age of 70. A comprehensive database of factors that contribute to a heart attack has been constructed. The main purpose here is to collect the characteristics of a heart attack or the factors that contribute to it. The dataset contains 1319 samples with nine fields: eight input fields and one output field. The input fields are age, gender, heart rate (impulse), systolic BP (pressurehight), diastolic BP (pressurelow), blood sugar (glucose), CK-MB (kcm), and Test-Troponin (troponin). The output field (class) indicates the presence of a heart attack and takes one of two values: negative refers to the absence of a heart attack, while positive refers to the presence of a heart attack.
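The nine-field layout described above could be read along the following lines. This is a sketch under stated assumptions: the column names are taken from the parenthesized names in the description (including their original spellings) and may not match the published file's header exactly, all fields are assumed to be numeric (including gender), and `split_record` is a hypothetical helper, not part of the dataset.

```python
# Column names as given in the description; the real file header may differ.
INPUT_FIELDS = [
    "age", "gender", "impulse", "pressurehight",
    "pressurelow", "glucose", "kcm", "troponin",
]
OUTPUT_FIELD = "class"


def split_record(record):
    """Split one row (a dict of strings) into (features, has_heart_attack)."""
    missing = [f for f in INPUT_FIELDS + [OUTPUT_FIELD] if f not in record]
    if missing:
        raise ValueError(f"missing fields: {missing}")
    label = record[OUTPUT_FIELD].strip().lower()
    if label not in ("negative", "positive"):
        raise ValueError(f"unexpected class value: {record[OUTPUT_FIELD]!r}")
    # Assumption: every input field, including gender, is numerically encoded.
    features = [float(record[f]) for f in INPUT_FIELDS]
    return features, label == "positive"
```

Rows produced by `csv.DictReader` can be passed to `split_record` directly, since it yields one dict of strings per row.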

Cite
Centers for Disease Control and Prevention (2017). Death in the United States [Dataset]. https://www.kaggle.com/datasets/cdc/mortality

Death in the United States

Learn more about the leading causes of death from 2005-2015

Explore at:
Available download formats: zip (766333584 bytes)
Dataset updated
Aug 3, 2017
Dataset authored and provided by
Centers for Disease Control and Prevention (http://www.cdc.gov/)
License

https://creativecommons.org/publicdomain/zero/1.0/

Area covered
United States
Description

Every year the CDC releases the country’s most detailed report on death in the United States under the National Vital Statistics System. This mortality dataset is a record of every death in the country for 2005 through 2015, including detailed information about causes of death and the demographic background of the deceased.

It's been said that "statistics are human beings with the tears wiped off." This is especially true with this dataset. Each death record represents somebody's loved one, often connected with a lifetime of memories and sometimes tragically too short.

Putting the sensitive nature of the topic aside, analyzing mortality data is essential to understanding the complex circumstances of death across the country. The US Government uses this data to determine life expectancy and understand how death in the U.S. differs from the rest of the world. Whether you’re looking for macro trends or analyzing unique circumstances, we challenge you to use this dataset to find your own answers to one of life’s great mysteries.

Overview

This dataset is a collection of CSV files, each containing one year's worth of data, with paired JSON files containing the code mappings, plus an ICD-10 code set. The CSVs were reformatted from their original fixed-width file formats using information extracted from the CDC's PDF manuals using this script. Please note that this process may have introduced errors, as the text extracted from the PDFs is not a perfect match. If you have any questions or find errors in the preparation process, please leave a note in the forums. We hope to publish additional years of data using this method soon.
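As a rough sketch of how a year's CSV might be decoded with its paired JSON mappings: the layout assumed here (a JSON object keyed by column name, each holding a code-to-label object) is a guess rather than the documented format, and the column names and codes in the usage note are hypothetical.

```python
import csv
import io
import json


def decode_rows(csv_text, mapping_json):
    """Replace coded values with labels wherever a mapping exists.

    Assumed JSON layout: {"column_name": {"code": "label", ...}, ...}.
    Values without a mapping are passed through unchanged.
    """
    mappings = json.loads(mapping_json)
    decoded = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        decoded.append(
            {col: mappings.get(col, {}).get(val, val) for col, val in row.items()}
        )
    return decoded
```

With a hypothetical `sex` column coded `M`, `decode_rows("sex,age\nM,70\n", '{"sex": {"M": "Male"}}')` leaves `age` untouched and replaces the `sex` code with its label. For the real files, the text arguments would come from reading the year's CSV and JSON from disk.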

A more detailed overview of the data can be found here. You'll find that the fields are consistent within this time window, but some of the data codes change every few years. For example, the 113_cause_recode entry 069 only covers ICD codes (I10,I12) in 2005, but by 2015 it covers (I10,I12,I15). When I post data from years prior to 2005, expect some of the fields themselves to change as well.
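One way to spot code changes like the 069 example is to diff a field's mapping between two years. The sketch below assumes each year's mapping for a field has already been loaded as a plain dict from code to description; the description strings shown are placeholders based on the example above, not the files' actual wording.

```python
def changed_codes(old, new):
    """Return {code: (old_value, new_value)} for codes that differ between years.

    Codes present in only one year are reported with None on the other side.
    """
    changes = {}
    for code in sorted(set(old) | set(new)):
        if old.get(code) != new.get(code):
            changes[code] = (old.get(code), new.get(code))
    return changes


# Placeholder mappings for the 113_cause_recode field, based on the example above.
recode_2005 = {"069": "(I10,I12)"}
recode_2015 = {"069": "(I10,I12,I15)"}
print(changed_codes(recode_2005, recode_2015))
```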

All data comes from the CDC’s National Vital Statistics System, with the exception of the Icd10Code file, which is sourced from the World Health Organization.

Project ideas

  • The CDC's mortality data was the basis of a widely publicized paper, by Anne Case and Nobel Prize winner Angus Deaton, arguing that middle-aged whites are dying at elevated rates. One of the criticisms of the paper is that it failed to properly account for the exact ages within the broad bins available through the CDC's WONDER tool. What do these results look like with exact (unbinned) age data?
  • Similarly, how sensitive are the mortality trends being discussed in the news to the choice of bin-widths?
  • As noted above, the data preparation process could have introduced errors. Can you find any discrepancies compared to the aggregate metrics on WONDER? If so, please let me know in the forums!
  • WONDER is cited in numerous economics, sociology, and public health research papers. Can you find any papers whose conclusions would be altered if they used the exact data available here rather than binned data from WONDER?
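For the bin-width questions above, a starting point might be to count deaths per age bin at several widths and compare the resulting shapes. This is only a sketch: `bin_counts` is a hypothetical helper, ages are assumed to be whole years, and the ages in the usage note are invented for illustration rather than drawn from the dataset.

```python
from collections import Counter


def bin_counts(ages, width):
    """Count ages per bin, keyed by each bin's lower edge (e.g. 40 for 40-49)."""
    if width <= 0:
        raise ValueError("width must be positive")
    counts = Counter((age // width) * width for age in ages)
    return dict(sorted(counts.items()))
```

With the invented ages `[44, 45, 49, 50, 54, 55]`, ten-year bins give two equal groups, while five-year bins split the same records across four bins with unequal counts. This is the kind of detail that broad bins hide.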

Differences from the first version of the dataset

  • This version of the dataset was prepared in a completely different manner. This has allowed us to provide a much larger volume of data and ensure that codes are available for every field.
  • We've replaced the batch of SQL files with a single JSON per year. Kaggle's platform currently offers better support for JSON files, and this keeps the number of files manageable.
  • A tutorial kernel providing a quick introduction to the new format is available here.
  • Lastly, I apologize if the transition has interrupted anyone's work! If need be, you can still download v1.