https://creativecommons.org/publicdomain/zero/1.0/https://creativecommons.org/publicdomain/zero/1.0/
Every year the CDC releases the country’s most detailed report on death in the United States under the National Vital Statistics Systems. This mortality dataset is a record of every death in the country for 2005 through 2015, including detailed information about causes of death and the demographic background of the deceased.
It's been said that "statistics are human beings with the tears wiped off." This is especially true with this dataset. Each death record represents somebody's loved one, often connected with a lifetime of memories and sometimes tragically too short.
Putting the sensitive nature of the topic aside, analyzing mortality data is essential to understanding the complex circumstances of death across the country. The US Government uses this data to determine life expectancy and understand how death in the U.S. differs from the rest of the world. Whether you’re looking for macro trends or analyzing unique circumstances, we challenge you to use this dataset to find your own answers to one of life’s great mysteries.
This dataset is a collection of CSV files each containing one year's worth of data and paired JSON files containing the code mappings, plus an ICD 10 code set. The CSVs were reformatted from their original fixed-width file formats using information extracted from the CDC's PDF manuals using this script. Please note that this process may have introduced errors as the text extracted from the pdf is not a perfect match. If you have any questions or find errors in the preparation process, please leave a note in the forums. We hope to publish additional years of data using this method soon.
A more detailed overview of the data can be found here. You'll find that the fields are consistent within this time window, but some of data codes change every few years. For example, the 113_cause_recode entry 069 only covers ICD codes (I10,I12) in 2005, but by 2015 it covers (I10,I12,I15). When I post data from years prior to 2005, expect some of the fields themselves to change as well.
All data comes from the CDC’s National Vital Statistics Systems, with the exception of the Icd10Code, which are sourced from the World Health Organization.
This dataset contains counts of deaths for California counties based on information entered on death certificates. Final counts are derived from static data and include out-of-state deaths to California residents, whereas provisional counts are derived from incomplete and dynamic data. Provisional counts are based on the records available when the data was retrieved and may not represent all deaths that occurred during the time period. Deaths involving injuries from external or environmental forces, such as accidents, homicide and suicide, often require additional investigation that tends to delay certification of the cause and manner of death. This can result in significant under-reporting of these deaths in provisional data.
The final data tables include both deaths that occurred in each California county regardless of the place of residence (by occurrence) and deaths to residents of each California county (by residence), whereas the provisional data table only includes deaths that occurred in each county regardless of the place of residence (by occurrence). The data are reported as totals, as well as stratified by age, gender, race-ethnicity, and death place type. Deaths due to all causes (ALL) and selected underlying cause of death categories are provided. See temporal coverage for more information on which combinations are available for which years.
The cause of death categories are based solely on the underlying cause of death as coded by the International Classification of Diseases. The underlying cause of death is defined by the World Health Organization (WHO) as "the disease or injury which initiated the train of events leading directly to death, or the circumstances of the accident or violence which produced the fatal injury." It is a single value assigned to each death based on the details as entered on the death certificate. When more than one cause is listed, the order in which they are listed can affect which cause is coded as the underlying cause. This means that similar events could be coded with different underlying causes of death depending on variations in how they were entered. Consequently, while underlying cause of death provides a convenient comparison between cause of death categories, it may not capture the full impact of each cause of death as it does not always take into account all conditions contributing to the death.
This dataset contains counts of deaths for California as a whole based on information entered on death certificates. Final counts are derived from static data and include out-of-state deaths to California residents, whereas provisional counts are derived from incomplete and dynamic data. Provisional counts are based on the records available when the data was retrieved and may not represent all deaths that occurred during the time period. Deaths involving injuries from external or environmental forces, such as accidents, homicide and suicide, often require additional investigation that tends to delay certification of the cause and manner of death. This can result in significant under-reporting of these deaths in provisional data.
The final data tables include both deaths that occurred in California regardless of the place of residence (by occurrence) and deaths to California residents (by residence), whereas the provisional data table only includes deaths that occurred in California regardless of the place of residence (by occurrence). The data are reported as totals, as well as stratified by age, gender, race-ethnicity, and death place type. Deaths due to all causes (ALL) and selected underlying cause of death categories are provided. See temporal coverage for more information on which combinations are available for which years.
The cause of death categories are based solely on the underlying cause of death as coded by the International Classification of Diseases. The underlying cause of death is defined by the World Health Organization (WHO) as "the disease or injury which initiated the train of events leading directly to death, or the circumstances of the accident or violence which produced the fatal injury." It is a single value assigned to each death based on the details as entered on the death certificate. When more than one cause is listed, the order in which they are listed can affect which cause is coded as the underlying cause. This means that similar events could be coded with different underlying causes of death depending on variations in how they were entered. Consequently, while underlying cause of death provides a convenient comparison between cause of death categories, it may not capture the full impact of each cause of death as it does not always take into account all conditions contributing to the death.
This file contains COVID-19 death counts, death rates, and percent of total deaths by jurisdiction of residence. The data is grouped by different time periods including 3-month period, weekly, and total (cumulative since January 1, 2020). United States death counts and rates include the 50 states, plus the District of Columbia and New York City. New York state estimates exclude New York City. Puerto Rico is included in HHS Region 2 estimates. Deaths with confirmed or presumed COVID-19, coded to ICD–10 code U07.1. Number of deaths reported in this file are the total number of COVID-19 deaths received and coded as of the date of analysis and may not represent all deaths that occurred in that period. Counts of deaths occurring before or after the reporting period are not included in the file. Data during recent periods are incomplete because of the lag in time between when the death occurred and when the death certificate is completed, submitted to NCHS and processed for reporting purposes. This delay can range from 1 week to 8 weeks or more, depending on the jurisdiction and cause of death. Death counts should not be compared across states. Data timeliness varies by state. Some states report deaths on a daily basis, while other states report deaths weekly or monthly. The ten (10) United States Department of Health and Human Services (HHS) regions include the following jurisdictions. Region 1: Connecticut, Maine, Massachusetts, New Hampshire, Rhode Island, Vermont; Region 2: New Jersey, New York, New York City, Puerto Rico; Region 3: Delaware, District of Columbia, Maryland, Pennsylvania, Virginia, West Virginia; Region 4: Alabama, Florida, Georgia, Kentucky, Mississippi, North Carolina, South Carolina, Tennessee; Region 5: Illinois, Indiana, Michigan, Minnesota, Ohio, Wisconsin; Region 6: Arkansas, Louisiana, New Mexico, Oklahoma, Texas; Region 7: Iowa, Kansas, Missouri, Nebraska; Region 8: Colorado, Montana, North Dakota, South Dakota, Utah, Wyoming; Region 9: Arizona, California, Hawaii, Nevada; Region 10: Alaska, Idaho, Oregon, Washington. Rates were calculated using the population estimates for 2021, which are estimated as of July 1, 2021 based on the Blended Base produced by the US Census Bureau in lieu of the April 1, 2020 decennial population count. The Blended Base consists of the blend of Vintage 2020 postcensal population estimates, 2020 Demographic Analysis Estimates, and 2020 Census PL 94-171 Redistricting File (see https://www2.census.gov/programs-surveys/popest/technical-documentation/methodology/2020-2021/methods-statement-v2021.pdf). Rates are based on deaths occurring in the specified week/month and are age-adjusted to the 2000 standard population using the direct method (see https://www.cdc.gov/nchs/data/nvsr/nvsr70/nvsr70-08-508.pdf). These rates differ from annual age-adjusted rates, typically presented in NCHS publications based on a full year of data and annualized weekly/monthly age-adjusted rates which have been adjusted to allow comparison with annual rates. Annualization rates presents deaths per year per 100,000 population that would be expected in a year if the observed period specific (weekly/monthly) rate prevailed for a full year. Sub-national death counts between 1-9 are suppressed in accordance with NCHS data confidentiality standards. Rates based on death counts less than 20 are suppressed in accordance with NCHS standards of reliability as specified in NCHS Data Presentation Standards for Proportions (available from: https://www.cdc.gov/nchs/data/series/sr_02/sr02_175.pdf.).
Data for CDC’s COVID Data Tracker site on Rates of COVID-19 Cases and Deaths by Vaccination Status.
Click 'More' for important dataset description and footnotes
Dataset and data visualization details:
These data were posted on October 21, 2022, archived on November 18, 2022, and revised on February 22, 2023. These data reflect cases among persons with a positive specimen collection date through September 24, 2022, and deaths among persons with a positive specimen collection date through September 3, 2022.
Vaccination status: A person vaccinated with a primary series had SARS-CoV-2 RNA or antigen detected on a respiratory specimen collected ≥14 days after verifiably completing the primary series of an FDA-authorized or approved COVID-19 vaccine. An unvaccinated person had SARS-CoV-2 RNA or antigen detected on a respiratory specimen and has not been verified to have received COVID-19 vaccine. Excluded were partially vaccinated people who received at least one FDA-authorized vaccine dose but did not complete a primary series ≥14 days before collection of a specimen where SARS-CoV-2 RNA or antigen was detected.
Additional or booster dose: A person vaccinated with a primary series and an additional or booster dose had SARS-CoV-2 RNA or antigen detected on a respiratory specimen collected ≥14 days after receipt of an additional or booster dose of any COVID-19 vaccine on or after August 13, 2021. For people ages 18 years and older, data are graphed starting the week including September 24, 2021, when a COVID-19 booster dose was first recommended by CDC for adults 65+ years old and people in certain populations and high risk occupational and institutional settings. For people ages 12-17 years, data are graphed starting the week of December 26, 2021, 2 weeks after the first recommendation for a booster dose for adolescents ages 16-17 years. For people ages 5-11 years, data are included starting the week of June 5, 2022, 2 weeks after the first recommendation for a booster dose for children aged 5-11 years. For people ages 50 years and older, data on second booster doses are graphed starting the week including March 29, 2022, when the recommendation was made for second boosters. Vertical lines represent dates when changes occurred in U.S. policy for COVID-19 vaccination (details provided above). Reporting is by primary series vaccine type rather than additional or booster dose vaccine type. The booster dose vaccine type may be different than the primary series vaccine type.
** Because data on the immune status of cases and associated deaths are unavailable, an additional dose in an immunocompromised person cannot be distinguished from a booster dose. This is a relevant consideration because vaccines can be less effective in this group.
Deaths: A COVID-19–associated death occurred in a person with a documented COVID-19 diagnosis who died; health department staff reviewed to make a determination using vital records, public health investigation, or other data sources. Rates of COVID-19 deaths by vaccination status are reported based on when the patient was tested for COVID-19, not the date they died. Deaths usually occur up to 30 days after COVID-19 diagnosis.
Participating jurisdictions: Currently, these 31 health departments that regularly link their case surveillance to immunization information system data are included in these incidence rate estimates: Alabama, Arizona, Arkansas, California, Colorado, Connecticut, District of Columbia, Florida, Georgia, Idaho, Indiana, Kansas, Kentucky, Louisiana, Massachusetts, Michigan, Minnesota, Nebraska, New Jersey, New Mexico, New York, New York City (New York), North Carolina, Philadelphia (Pennsylvania), Rhode Island, South Dakota, Tennessee, Texas, Utah, Washington, and West Virginia; 30 jurisdictions also report deaths among vaccinated and unvaccinated people. These jurisdictions represent 72% of the total U.S. population and all ten of the Health and Human Services Regions. Data on cases
Splitgraph serves as an HTTP API that lets you run SQL queries directly on this data to power Web applications. For example:
See the Splitgraph documentation for more information.
Number and percentage of deaths, by month and place of residence, 1991 to most recent year.
This dataset includes births, deaths and the ratio of births to deaths by metropolitan area for the years 2000-2006. The actual births and deaths for 2000 and estimates were taken from the U.S. Census Components of Population Change. Ratios were calculated based on that data.
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset provides values for CORONAVIRUS DEATHS reported in several countries. The data includes current values, previous releases, historical highs and record lows, release frequency, reported unit and currency.
Data for CDC’s COVID Data Tracker site on Rates of COVID-19 Cases and Deaths by Updated (Bivalent) Booster Status. Click 'More' for important dataset description and footnotes
Webpage: https://covid.cdc.gov/covid-data-tracker/#rates-by-vaccine-status
Dataset and data visualization details:
These data were posted and archived on May 30, 2023 and reflect cases among persons with a positive specimen collection date through April 22, 2023, and deaths among persons with a positive specimen collection date through April 1, 2023. These data will no longer be updated after May 2023.
Vaccination status: A person vaccinated with at least a primary series had SARS-CoV-2 RNA or antigen detected on a respiratory specimen collected ≥14 days after verifiably completing the primary series of an FDA-authorized or approved COVID-19 vaccine. An unvaccinated person had SARS-CoV-2 RNA or antigen detected on a respiratory specimen and has not been verified to have received COVID-19 vaccine. Excluded were partially vaccinated people who received at least one FDA-authorized vaccine dose but did not complete a primary series ≥14 days before collection of a specimen where SARS-CoV-2 RNA or antigen was detected. A person vaccinated with a primary series and a monovalent booster dose had SARS-CoV-2 RNA or antigen detected on a respiratory specimen collected ≥14 days after verifiably receiving a primary series of an FDA-authorized or approved vaccine and at least one additional dose of any monovalent FDA-authorized or approved COVID-19 vaccine on or after August 13, 2021. (Note: this definition does not distinguish between vaccine recipients who are immunocompromised and are receiving an additional dose versus those who are not immunocompromised and receiving a booster dose.) A person vaccinated with a primary series and an updated (bivalent) booster dose had SARS-CoV-2 RNA or antigen detected in a respiratory specimen collected ≥14 days after verifiably receiving a primary series of an FDA-authorized or approved vaccine and an additional dose of any bivalent FDA-authorized or approved vaccine COVID-19 vaccine on or after September 1, 2022. (Note: Doses with bivalent doses reported as first or second doses are classified as vaccinated with a bivalent booster dose.) People with primary series or a monovalent booster dose were combined in the “vaccinated without an updated booster” category.
Deaths: A COVID-19–associated death occurred in a person with a documented COVID-19 diagnosis who died; health department staff reviewed to make a determination using vital records, public health investigation, or other data sources. Per the interim guidance of the Council of State and Territorial Epidemiologists (CSTE), this should include persons whose death certificate lists COVID-19 disease or SARS-CoV-2 as the underlying cause of death or as a significant condition contributing to death. Rates of COVID-19 deaths by vaccination status are primarily reported based on when the patient was tested for COVID-19. In select jurisdictions, deaths are included that are not laboratory confirmed and are reported based on alternative dates (i.e., onset date for most; or date of death or report date, where onset date is unavailable). Deaths usually occur up to 30 days after COVID-19 diagnosis.
Participating jurisdictions: Currently, these 24 health departments that regularly link their case surveillance to immunization information system data are included in these incidence rate estimates: Alabama, Arizona, Colorado, District of Columbia, Georgia, Idaho, Indiana, Kansas, Kentucky, Louisiana, Massachusetts, Michigan, Minnesota, Nebraska, New Jersey, New Mexico, New York, New York City (NY), North Carolina, Rhode Island, Tennessee, Texas, Utah, and West Virginia; 23 jurisdictions also report deaths among vaccinated and unvaccinated people. These jurisdictions represent 48% of the total U.S. population and all ten of the Health and Human Services Regions. This list will be updated as more jurisdictions participate.
Incidence rate estimates: Weekly age-specific incidence rates by vaccination status were calculated as the number of cases or deaths divided by the number of people vaccinated with a primary series, overall or with/without a booster dose (cumulative) or unvaccinated (obtained by subtracting the cumulative number of people vaccinated with at least a primary series and partially vaccinated people from the 2019 U.S. intercensal population estimates) and multiplied by 100,000. Overall incidence rates were age-standardized using the 2000 U.S. Census standard population. To estimate population counts for ages 6-12 months, half of the single-year population counts for ages <12 months were used. All rates are plotted by positive specimen collection date to reflect when incident infections occurred.
Continuity correction: A continuity correction has been applied to the denominators by capping the percent population coverage at 95%. To do this, we assumed that at least 5% of each age group would always be unvaccinated in each jurisdiction. Adding this correction ensures that there is always a reasonable denominator for the unvaccinated population that would prevent incidence and death rates from growing unrealistically large due to potential overestimates of vaccination coverage.
Incidence rate ratios (IRRs): IRRs for the past one month were calculated by dividing the average weekly incidence rates among unvaccinated people by that among people vaccinated without an updated (bivalent) booster dose) or vaccinated with an updated (bivalent) booster dose.
Archive: An archive of historic data, including April 3, 2021-September 24, 2022 and posted on October 21, 2022 is available on data.cdc.gov. The analysis by vaccination status (unvaccinated and at least a primary series) for 31 jurisdictions is posted here: https://data.cdc.gov/Public-Health-Surveillance/Rates-of-COVID-19-Cases-or-Deaths-by-Age-Group-and/3rge-nu2a. The analysis for one booster dose (unvaccinated, primary series only, and at least one booster dose) in 31 jurisdictions is posted here: https://data.cdc.gov/Public-Health-Surveillance/Rates-of-COVID-19-Cases-or-Deaths-by-Age-Group-and/d6p8-wqjm. The analysis for two booster doses (unvaccinated, primary series only, one booster dose, and at least two booster doses) in 28 jurisdictions is posted here: https://data.cdc.gov/Public-Health-Surveillance/Rates-of-COVID-19-Cases-or-Deaths-by-Age-Group-and/ukww-au2k.
References
Scobie HM, Johnson AG, Suthar AB, et al. Monitoring Incidence of COVID-19 Cases, Hospitalizations, and Deaths, by Vaccination Status — 13 U.S. Jurisdictions, April 4–July 17, 2021. MMWR Morb Mortal Wkly Rep 2021;70:1284–1290.
Johnson AG, Amin AB, Ali AR, et al. COVID-19 Incidence and Death Rates Among Unvaccinated and Fully Vaccinated Adults with and Without Booster Doses During Periods of Delta and Omicron Variant Emergence — 25 U.S. Jurisdictions, April 4–December 25, 2021. MMWR Morb Mortal Wkly Rep 2022;71:132–138
Johnson AG, Linde L, Ali AR, et al. COVID-19 Incidence and Mortality Among Unvaccinated and Vaccinated Persons Aged ≥12 Years by Receipt of Bivalent Booster Doses and Time Since Vaccination — 24 U.S. Jurisdictions, October 3, 2021–December 24, 2022. MMWR Morb Mortal Wkly Rep 2023;72:145–152
CAPITAL PUNISHMENT IN THE UNITED STATES, 1973-2010 provides annual data on prisoners under a sentence of death, as well as those who had their sentences commuted or vacated and prisoners who were executed. This study examines basic sociodemographic classifications including age, sex, race and ethnicity, marital status at time of imprisonment, level of education, and State and region of incarceration. Criminal history information includes prior felony convictions and prior convictions for criminal homicide and the legal status at the time of the capital offense. Additional information is provided on those inmates removed from death row by yearend 2010. The dataset consists of one part which contains 9,058 cases. The file provides information on inmates whose death sentences were removed in addition to information on those inmates who were executed. The file also gives information about inmates who received a second death sentence by yearend 2010 as well as inmates who were already on death row.
Data for CDC’s COVID Data Tracker site on Rates of COVID-19 Cases and Deaths by Vaccination Status. Click 'More' for important dataset description and footnotes
Dataset and data visualization details: These data were posted on October 21, 2022, archived on November 18, 2022, and revised on February 22, 2023. These data reflect cases among persons with a positive specimen collection date through September 24, 2022, and deaths among persons with a positive specimen collection date through September 3, 2022.
Vaccination status: A person vaccinated with a primary series had SARS-CoV-2 RNA or antigen detected on a respiratory specimen collected ≥14 days after verifiably completing the primary series of an FDA-authorized or approved COVID-19 vaccine. An unvaccinated person had SARS-CoV-2 RNA or antigen detected on a respiratory specimen and has not been verified to have received COVID-19 vaccine. Excluded were partially vaccinated people who received at least one FDA-authorized vaccine dose but did not complete a primary series ≥14 days before collection of a specimen where SARS-CoV-2 RNA or antigen was detected. Additional or booster dose: A person vaccinated with a primary series and an additional or booster dose had SARS-CoV-2 RNA or antigen detected on a respiratory specimen collected ≥14 days after receipt of an additional or booster dose of any COVID-19 vaccine on or after August 13, 2021. For people ages 18 years and older, data are graphed starting the week including September 24, 2021, when a COVID-19 booster dose was first recommended by CDC for adults 65+ years old and people in certain populations and high risk occupational and institutional settings. For people ages 12-17 years, data are graphed starting the week of December 26, 2021, 2 weeks after the first recommendation for a booster dose for adolescents ages 16-17 years. For people ages 5-11 years, data are included starting the week of June 5, 2022, 2 weeks after the first recommendation for a booster dose for children aged 5-11 years. For people ages 50 years and older, data on second booster doses are graphed starting the week including March 29, 2022, when the recommendation was made for second boosters. Vertical lines represent dates when changes occurred in U.S. policy for COVID-19 vaccination (details provided above). Reporting is by primary series vaccine type rather than additional or booster dose vaccine type. The booster dose vaccine type may be different than the primary series vaccine type. ** Because data on the immune status of cases and associated deaths are unavailable, an additional dose in an immunocompromised person cannot be distinguished from a booster dose. This is a relevant consideration because vaccines can be less effective in this group. Deaths: A COVID-19–associated death occurred in a person with a documented COVID-19 diagnosis who died; health department staff reviewed to make a determination using vital records, public health investigation, or other data sources. Rates of COVID-19 deaths by vaccination status are reported based on when the patient was tested for COVID-19, not the date they died. Deaths usually occur up to 30 days after COVID-19 diagnosis. Participating jurisdictions: Currently, these 31 health departments that regularly link their case surveillance to immunization information system data are included in these incidence rate estimates: Alabama, Arizona, Arkansas, California, Colorado, Connecticut, District of Columbia, Florida, Georgia, Idaho, Indiana, Kansas, Kentucky, Louisiana, Massachusetts, Michigan, Minnesota, Nebraska, New Jersey, New Mexico, New York, New York City (New York), North Carolina, Philadelphia (Pennsylvania), Rhode Island, South Dakota, Tennessee, Texas, Utah, Washington, and West Virginia; 30 jurisdictions also report deaths among vaccinated and unvaccinated people. These jurisdictions represent 72% of the total U.S. population and all ten of the Health and Human Services Regions. Data on cases
Rank, number of deaths, percentage of deaths, and age-specific mortality rates for the leading causes of death, by age group and sex, 2000 to most recent year.
THIS DATASET WAS LAST UPDATED AT 8:11 PM EASTERN ON SEPT. 30
2019 had the most mass killings since at least the 1970s, according to the Associated Press/USA TODAY/Northeastern University Mass Killings Database.
In all, there were 45 mass killings, defined as when four or more people are killed excluding the perpetrator. Of those, 33 were mass shootings . This summer was especially violent, with three high-profile public mass shootings occurring in the span of just four weeks, leaving 38 killed and 66 injured.
A total of 229 people died in mass killings in 2019.
The AP's analysis found that more than 50% of the incidents were family annihilations, which is similar to prior years. Although they are far less common, the 9 public mass shootings during the year were the most deadly type of mass murder, resulting in 73 people's deaths, not including the assailants.
One-third of the offenders died at the scene of the killing or soon after, half from suicides.
The Associated Press/USA TODAY/Northeastern University Mass Killings database tracks all U.S. homicides since 2006 involving four or more people killed (not including the offender) over a short period of time (24 hours) regardless of weapon, location, victim-offender relationship or motive. The database includes information on these and other characteristics concerning the incidents, offenders, and victims.
The AP/USA TODAY/Northeastern database represents the most complete tracking of mass murders by the above definition currently available. Other efforts, such as the Gun Violence Archive or Everytown for Gun Safety may include events that do not meet our criteria, but a review of these sites and others indicates that this database contains every event that matches the definition, including some not tracked by other organizations.
This data will be updated periodically and can be used as an ongoing resource to help cover these events.
To get basic counts of incidents of mass killings and mass shootings by year nationwide, use these queries:
To get these counts just for your state:
Mass murder is defined as the intentional killing of four or more victims by any means within a 24-hour period, excluding the deaths of unborn children and the offender(s). The standard of four or more dead was initially set by the FBI.
This definition does not exclude cases based on method (e.g., shootings only), type or motivation (e.g., public only), victim-offender relationship (e.g., strangers only), or number of locations (e.g., one). The time frame of 24 hours was chosen to eliminate conflation with spree killers, who kill multiple victims in quick succession in different locations or incidents, and to satisfy the traditional requirement of occurring in a “single incident.”
Offenders who commit mass murder during a spree (before or after committing additional homicides) are included in the database, and all victims within seven days of the mass murder are included in the victim count. Negligent homicides related to driving under the influence or accidental fires are excluded due to the lack of offender intent. Only incidents occurring within the 50 states and Washington D.C. are considered.
Project researchers first identified potential incidents using the Federal Bureau of Investigation’s Supplementary Homicide Reports (SHR). Homicide incidents in the SHR were flagged as potential mass murder cases if four or more victims were reported on the same record, and the type of death was murder or non-negligent manslaughter.
Cases were subsequently verified utilizing media accounts, court documents, academic journal articles, books, and local law enforcement records obtained through Freedom of Information Act (FOIA) requests. Each data point was corroborated by multiple sources, which were compiled into a single document to assess the quality of information.
In case(s) of contradiction among sources, official law enforcement or court records were used, when available, followed by the most recent media or academic source.
Case information was subsequently compared with every other known mass murder database to ensure reliability and validity. Incidents listed in the SHR that could not be independently verified were excluded from the database.
Project researchers also conducted extensive searches for incidents not reported in the SHR during the time period, utilizing internet search engines, Lexis-Nexis, and Newspapers.com. Search terms include: [number] dead, [number] killed, [number] slain, [number] murdered, [number] homicide, mass murder, mass shooting, massacre, rampage, family killing, familicide, and arson murder. Offender, victim, and location names were also directly searched when available.
This project started at USA TODAY in 2012.
Contact AP Data Editor Justin Myers with questions, suggestions or comments about this dataset at jmyers@ap.org. The Northeastern University researcher working with AP and USA TODAY is Professor James Alan Fox, who can be reached at j.fox@northeastern.edu or 617-416-4400.
This dataset displays all the hazardous waste sites in the United States and it's Territories as of 5.08. The data comes from the Agency for Toxic Substances and Disease Registry(ATSDR). The dataset contains information about the site: Site ID Site Name CERCLIS # Address City State County Latitude Longitude Population Region # Congressional Districts Federal Facility National Priorities List Status Ownership Status Classification For more information go to the Agency for Toxic Substances and Disease Registry(ATSDR)website at http://www.atsdr.cdc.gov
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
To estimate county of residence of Filipinx healthcare workers who died of COVID-19, we retrieved data from the Kanlungan website during the month of December 2020.22 In deciding who to include on the website, the AF3IRM team that established the Kanlungan website set two standards in data collection. First, the team found at least one source explicitly stating that the fallen healthcare worker was of Philippine ancestry; this was mostly media articles or obituaries sharing the life stories of the deceased. In a few cases, the confirmation came directly from the deceased healthcare worker's family member who submitted a tribute. Second, the team required a minimum of two sources to identify and announce fallen healthcare workers. We retrieved 86 US tributes from Kanlungan, but only 81 of them had information on county of residence. In total, 45 US counties with at least one reported tribute to a Filipinx healthcare worker who died of COVID-19 were identified for analysis and will hereafter be referred to as “Kanlungan counties.” Mortality data by county, race, and ethnicity came from the National Center for Health Statistics (NCHS).24 Updated weekly, this dataset is based on vital statistics data for use in conducting public health surveillance in near real time to provide provisional mortality estimates based on data received and processed by a specified cutoff date, before data are finalized and publicly released.25 We used the data released on December 30, 2020, which included provisional COVID-19 death counts from February 1, 2020 to December 26, 2020—during the height of the pandemic and prior to COVID-19 vaccines being available—for counties with at least 100 total COVID-19 deaths. During this time period, 501 counties (15.9% of the total 3,142 counties in all 50 states and Washington DC)26 met this criterion. Data on COVID-19 deaths were available for six major racial/ethnic groups: Non-Hispanic White, Non-Hispanic Black, Non-Hispanic Native Hawaiian or Other Pacific Islander, Non-Hispanic American Indian or Alaska Native, Non-Hispanic Asian (hereafter referred to as Asian American), and Hispanic. People with more than one race, and those with unknown race were included in the “Other” category. NCHS suppressed county-level data by race and ethnicity if death counts are less than 10. In total, 133 US counties reported COVID-19 mortality data for Asian Americans. These data were used to calculate the percentage of all COVID-19 decedents in the county who were Asian American. We used data from the 2018 American Community Survey (ACS) five-year estimates, downloaded from the Integrated Public Use Microdata Series (IPUMS) to create county-level population demographic variables.27 IPUMS is publicly available, and the database integrates samples using ACS data from 2000 to the present using a high degree of precision.27 We applied survey weights to calculate the following variables at the county-level: median age among Asian Americans, average income to poverty ratio among Asian Americans, the percentage of the county population that is Filipinx, and the percentage of healthcare workers in the county who are Filipinx. Healthcare workers encompassed all healthcare practitioners, technical occupations, and healthcare service occupations, including nurse practitioners, physicians, surgeons, dentists, physical therapists, home health aides, personal care aides, and other medical technicians and healthcare support workers. County-level data were available for 107 out of the 133 counties (80.5%) that had NCHS data on the distribution of COVID-19 deaths among Asian Americans, and 96 counties (72.2%) with Asian American healthcare workforce data. The ACS 2018 five-year estimates were also the source of county-level percentage of the Asian American population (alone or in combination) who are Filipinx.8 In addition, the ACS provided county-level population counts26 to calculate population density (people per 1,000 people per square mile), estimated by dividing the total population by the county area, then dividing by 1,000 people. The county area was calculated in ArcGIS 10.7.1 using the county boundary shapefile and projected to Albers equal area conic (for counties in the US contiguous states), Hawai’i Albers Equal Area Conic (for Hawai’i counties), and Alaska Albers Equal Area Conic (for Alaska counties).20
This dataset was created from the CDC's National Vital Statistics Reports Volume 56, Number 6. The dataset includes all data available from this report by state level and includes births by race and Hispanic origin, births to unmarried women, rates of cesarean delivery, and twin and multiple birth rates. The data are final for 2005. No value is represented by a -1. "Descriptive tabulations of data reported on the birth certificates of the 4.1 million births that occurred in 2005 are presented. Denominators for population-based rates are postcensal estimates derived from the U.S. 2000 census".
CC0 1.0 Universal Public Domain Dedicationhttps://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
The data included in this publication depict components of wildfire risk specifically for populated areas in the United States. These datasets represent where people live in the United States and the in situ risk from wildfire, i.e., the risk at the location where the adverse effects take place.
National wildfire hazard datasets of annual burn probability and fire intensity, generated by the USDA Forest Service, Rocky Mountain Research Station and Pyrologix LLC, form the foundation of the Wildfire Risk to Communities data. Vegetation and wildland fuels data from LANDFIRE 2020 (version 2.2.0) were used as input to two different but related geospatial fire simulation systems. Annual burn probability was produced with the USFS geospatial fire simulator (FSim) at a relatively coarse cell size of 270 meters (m). To bring the burn probability raster data down to a finer resolution more useful for assessing hazard and risk to communities, we upsampled them to the native 30 m resolution of the LANDFIRE fuel and vegetation data. In this upsampling process, we also spread values of modeled burn probability into developed areas represented in LANDFIRE fuels data as non-burnable. Burn probability rasters represent landscape conditions as of the end of 2020. Fire intensity characteristics were modeled at 30 m resolution using a process that performs a comprehensive set of FlamMap runs spanning the full range of weather-related characteristics that occur during a fire season and then integrates those runs into a variety of results based on the likelihood of those weather types occurring. Before the fire intensity modeling, the LANDFIRE 2020 data were updated to reflect fuels disturbances occurring in 2021 and 2022. As such, the fire intensity datasets represent landscape conditions as of the end of 2022. The data products in this publication that represent where people live, reflect 2020 estimates of housing units and 2021 estimates of population counts from the U.S. Census Bureau, combined with building footprint data from Onegeo and USA Structures, both reflecting 2022 conditions.
The specific raster datasets included in this publication include:
Building Count: Building Count is a 30-m raster representing the count of buildings in the building footprint dataset located within each 30-m pixel.
Building Density: Building Density is a 30-m raster representing the density of buildings in the building footprint dataset (buildings per square kilometer [km²]).
Building Coverage: Building Coverage is a 30-m raster depicting the percentage of habitable land area covered by building footprints.
Population Count (PopCount): PopCount is a 30-m raster with pixel values representing residential population count (persons) in each pixel.
Population Density (PopDen): PopDen is a 30-m raster of residential population density (people/km²).
Housing Unit Count (HUCount): HUCount is a 30-m raster representing the number of housing units in each pixel.
Housing Unit Density (HUDen): HUDen is a 30-m raster of housing-unit density (housing units/km²).
Housing Unit Exposure (HUExposure): HUExposure is a 30-m raster that represents the expected number of housing units within a pixel potentially exposed to wildfire in a year. This is a long-term annual average and not intended to represent the actual number of housing units exposed in any specific year.
Housing Unit Impact (HUImpact): HUImpact is a 30-m raster that represents the relative potential impact of fire to housing units at any pixel, if a fire were to occur. It is an index that incorporates the general consequences of fire on a home as a function of fire intensity and uses flame length probabilities from wildfire modeling to capture likely intensity of fire.
Housing Unit Risk (HURisk): HURisk is a 30-m raster that integrates all four primary elements of wildfire risk - likelihood, intensity, susceptibility, and exposure - on pixels where housing unit density is greater than zero.The geospatial data products described and distributed here are part of the Wildfire Risk to Communities project. This project was directed by Congress in the 2018 Consolidated Appropriations Act (i.e., 2018 Omnibus Act, H.R. 1625, Section 210: Wildfire Hazard Severity Mapping) to help U.S. communities understand components of their relative wildfire risk profile, the nature and effects of wildfire risk, and actions communities can take to mitigate risk. The first edition of these data represented the first time wildfire risk to communities had been mapped nationally with consistent methodology. They provided foundational information for comparing the relative wildfire risk among populated communities in the United States. In this version, the 2nd edition, we use improved modeling and mapping methodology and updated input data to generate the current suite of products.See the Wildfire Risk to Communities website at https://www.wildfirerisk.org for complete project information and an interactive web application for exploring some of the datasets published here. We deliver the data here as zip files by U.S. state (including AK and HI), and for the full extent of the continental U.S.
This data publication is a second edition and represents an update to any previous versions of Wildfire Risk to Communities risk datasets published by the USDA Forest Service. This second edition was originally published on 06/03/2024. On 09/10/2024, a minor correction was made to the abstract in this overall metadata document as well as the individual metadata documents associated with each raster dataset. The supplemental file containing data product descriptions was also updated. In addition, we separated the large CONUS download into a series of smaller zip files (one for each layer).
There are two companion data publications that are part of the WRC 2.0 data update: one that characterizes landscape-wide wildfire hazard and risk for the nation (Scott et al. 2024, https://doi.org/10.2737/RDS-2020-0016-2), and one that delineates wildfire risk reduction zones and provides tabular summaries of wildfire hazard and risk raster datasets (Dillon et al. 2024, https://doi.org/10.2737/RDS-2024-0030).
This dataset illustrates the largest difference between high and low temperatures and the smallest difference between high and low temperatures in cities with 50,000 people or more. A value of -1 means that the data was not applicable. Also included are the rankings, the inverse ranking to be used for mapping purposes, the popualtion, the name of city and state, and the temperature degree difference. Source City-Data URL http//www.city-data.com/top2/c489.html http//www.city-data.com/top2/c490.html Date Accessed November 13,2007
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Context
The dataset presents the mean household income for each of the five quintiles in Newark, DE, as reported by the U.S. Census Bureau. The dataset highlights the variation in mean household income across quintiles, offering valuable insights into income distribution and inequality.
Key observations
When available, the data consists of estimates from the U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates.
Income Levels:
Variables / Data Columns
Good to know
Margin of Error
Data in the dataset are based on the estimates and are subject to sampling variability and thus a margin of error. Neilsberg Research recommends using caution when presening these estimates in your research.
Custom data
If you do need custom data for any of your research project, report or presentation, you can contact our research staff at research@neilsberg.com for a feasibility of a custom tabulation on a fee-for-service basis.
Neilsberg Research Team curates, analyze and publishes demographics and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research aggregated datasets and insights is made available for free download at https://www.neilsberg.com/research/.
This dataset is a part of the main dataset for Newark median household income. You can refer the same here
This dataset provides highly detailed (Block Level) views of various demographics for Manhattan, New York city. this dataset includes information on age, race, sex, income, housing, and various other attributes. This data comes from the 2000 Us Census and was joined to the Census Tiger line files to create the output. enjoy!
https://creativecommons.org/publicdomain/zero/1.0/https://creativecommons.org/publicdomain/zero/1.0/
Every year the CDC releases the country’s most detailed report on death in the United States under the National Vital Statistics Systems. This mortality dataset is a record of every death in the country for 2005 through 2015, including detailed information about causes of death and the demographic background of the deceased.
It's been said that "statistics are human beings with the tears wiped off." This is especially true with this dataset. Each death record represents somebody's loved one, often connected with a lifetime of memories and sometimes tragically too short.
Putting the sensitive nature of the topic aside, analyzing mortality data is essential to understanding the complex circumstances of death across the country. The US Government uses this data to determine life expectancy and understand how death in the U.S. differs from the rest of the world. Whether you’re looking for macro trends or analyzing unique circumstances, we challenge you to use this dataset to find your own answers to one of life’s great mysteries.
This dataset is a collection of CSV files each containing one year's worth of data and paired JSON files containing the code mappings, plus an ICD 10 code set. The CSVs were reformatted from their original fixed-width file formats using information extracted from the CDC's PDF manuals using this script. Please note that this process may have introduced errors as the text extracted from the pdf is not a perfect match. If you have any questions or find errors in the preparation process, please leave a note in the forums. We hope to publish additional years of data using this method soon.
A more detailed overview of the data can be found here. You'll find that the fields are consistent within this time window, but some of data codes change every few years. For example, the 113_cause_recode entry 069 only covers ICD codes (I10,I12) in 2005, but by 2015 it covers (I10,I12,I15). When I post data from years prior to 2005, expect some of the fields themselves to change as well.
All data comes from the CDC’s National Vital Statistics Systems, with the exception of the Icd10Code, which are sourced from the World Health Organization.