28 datasets found
  1. Mass Killings in America, 2006 - present

    • data.world
    csv, zip
    Updated Dec 1, 2025
    Cite
    The Associated Press (2025). Mass Killings in America, 2006 - present [Dataset]. https://data.world/associatedpress/mass-killings-public
    Explore at:
    Available download formats: zip, csv
    Dataset updated
    Dec 1, 2025
    Authors
    The Associated Press
    Time period covered
    Jan 1, 2006 - Nov 29, 2025
    Area covered
    Description

    THIS DATASET WAS LAST UPDATED AT 7:11 AM EASTERN ON DEC. 1

    OVERVIEW

    2019 had the most mass killings since at least the 1970s, according to the Associated Press/USA TODAY/Northeastern University Mass Killings Database.

    In all, there were 45 mass killings, defined as when four or more people are killed excluding the perpetrator. Of those, 33 were mass shootings. This summer was especially violent, with three high-profile public mass shootings occurring in the span of just four weeks, leaving 38 killed and 66 injured.

    A total of 229 people died in mass killings in 2019.

    The AP's analysis found that more than 50% of the incidents were family annihilations, which is similar to prior years. Although they are far less common, the 9 public mass shootings during the year were the most deadly type of mass murder, resulting in 73 people's deaths, not including the assailants.

    One-third of the offenders died at the scene of the killing or soon after, half from suicides.

    About this Dataset

    The Associated Press/USA TODAY/Northeastern University Mass Killings database tracks all U.S. homicides since 2006 involving four or more people killed (not including the offender) over a short period of time (24 hours) regardless of weapon, location, victim-offender relationship or motive. The database includes information on these and other characteristics concerning the incidents, offenders, and victims.

    The AP/USA TODAY/Northeastern database represents the most complete tracking of mass murders by the above definition currently available. Other efforts, such as the Gun Violence Archive or Everytown for Gun Safety, may include events that do not meet our criteria, but a review of these sites and others indicates that this database contains every event that matches the definition, including some not tracked by other organizations.

    This data will be updated periodically and can be used as an ongoing resource to help cover these events.

    Using this Dataset

    To get basic counts of incidents of mass killings and mass shootings by year nationwide, use these queries:

    Mass killings by year

    Mass shootings by year

    To get these counts just for your state:

    Filter killings by state
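
    The linked queries are not reproduced in this listing. As a rough local equivalent, the sketch below counts incidents per year from a downloaded CSV using only the Python standard library. The column names (`date`, `state`, `type`) and the sample rows are assumptions made for illustration, not the dataset's documented schema.

```python
import csv
import io
from collections import Counter

# Hypothetical sample standing in for the downloaded CSV; the column names
# ("date", "state", "type") are assumptions, not the dataset's real schema.
SAMPLE_CSV = """date,state,type
2006-01-15,TX,Shooting
2006-07-04,CA,Other
2007-03-09,TX,Shooting
"""

def incidents_by_year(csv_text, state=None, shootings_only=False):
    """Count incidents per year, optionally filtered by state or to shootings."""
    counts = Counter()
    for row in csv.DictReader(io.StringIO(csv_text)):
        if state is not None and row["state"] != state:
            continue
        if shootings_only and row["type"] != "Shooting":
            continue
        counts[row["date"][:4]] += 1  # ISO dates start with the four-digit year
    return dict(counts)

by_year = incidents_by_year(SAMPLE_CSV)
texas_shootings = incidents_by_year(SAMPLE_CSV, state="TX", shootings_only=True)
```

    Passing `state` mirrors the state filter, and `shootings_only` mirrors the mass-shootings count; both would need adjusting to the real column names and values.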

    Definition of "mass murder"

    Mass murder is defined as the intentional killing of four or more victims by any means within a 24-hour period, excluding the deaths of unborn children and the offender(s). The standard of four or more dead was initially set by the FBI.

    This definition does not exclude cases based on method (e.g., shootings only), type or motivation (e.g., public only), victim-offender relationship (e.g., strangers only), or number of locations (e.g., one). The time frame of 24 hours was chosen to eliminate conflation with spree killers, who kill multiple victims in quick succession in different locations or incidents, and to satisfy the traditional requirement of occurring in a “single incident.”

    Offenders who commit mass murder during a spree (before or after committing additional homicides) are included in the database, and all victims within seven days of the mass murder are included in the victim count. Negligent homicides related to driving under the influence or accidental fires are excluded due to the lack of offender intent. Only incidents occurring within the 50 states and Washington D.C. are considered.
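
    The inclusion criteria above can be read as a simple predicate. The sketch below is one possible encoding; every field name is hypothetical, and it checks only whether an incident qualifies. It does not model the rule that spree victims killed within seven days are added to the victim count.

```python
from dataclasses import dataclass

@dataclass
class Incident:
    # All field names are hypothetical; they mirror the prose definition only.
    victims_killed: int        # deaths excluding offender(s) and unborn children
    hours_spanned: float       # time between the first and last killing
    intentional: bool          # False for e.g. DUI or accidental-fire deaths
    in_us_states_or_dc: bool   # 50 states and Washington D.C. only

def is_mass_murder(incident: Incident) -> bool:
    """Apply the four-victim, 24-hour, intentional-killing criteria."""
    return (
        incident.victims_killed >= 4
        and incident.hours_spanned <= 24
        and incident.intentional
        and incident.in_us_states_or_dc
    )
```

    Method, motive, victim-offender relationship, and number of locations are deliberately absent from the check, since the definition does not exclude cases on those grounds.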

    Methodology

    Project researchers first identified potential incidents using the Federal Bureau of Investigation’s Supplementary Homicide Reports (SHR). Homicide incidents in the SHR were flagged as potential mass murder cases if four or more victims were reported on the same record, and the type of death was murder or non-negligent manslaughter.

    Cases were subsequently verified utilizing media accounts, court documents, academic journal articles, books, and local law enforcement records obtained through Freedom of Information Act (FOIA) requests. Each data point was corroborated by multiple sources, which were compiled into a single document to assess the quality of information.

    In case(s) of contradiction among sources, official law enforcement or court records were used, when available, followed by the most recent media or academic source.

    Case information was subsequently compared with every other known mass murder database to ensure reliability and validity. Incidents listed in the SHR that could not be independently verified were excluded from the database.

    Project researchers also conducted extensive searches for incidents not reported in the SHR during the time period, utilizing internet search engines, Lexis-Nexis, and Newspapers.com. Search terms include: [number] dead, [number] killed, [number] slain, [number] murdered, [number] homicide, mass murder, mass shooting, massacre, rampage, family killing, familicide, and arson murder. Offender, victim, and location names were also directly searched when available.

    This project started at USA TODAY in 2012.

    Contacts

    Contact AP Data Editor Justin Myers with questions, suggestions or comments about this dataset at jmyers@ap.org. The Northeastern University researcher working with AP and USA TODAY is Professor James Alan Fox, who can be reached at j.fox@northeastern.edu or 617-416-4400.

  2. US Mass Shootings

    • kaggle.com
    zip
    Updated May 25, 2022
    + more versions
    Cite
    Zeeshan-ul-hassan Usmani (2022). US Mass Shootings [Dataset]. https://www.kaggle.com/zusmani/us-mass-shootings-last-50-years
    Explore at:
    Available download formats: zip (317763 bytes)
    Dataset updated
    May 25, 2022
    Authors
    Zeeshan-ul-hassan Usmani
    License

    http://opendatacommons.org/licenses/dbcl/1.0/

    Area covered
    United States
    Description

    Context

    Mass Shootings in the United States of America (1966-2017)

    The US has witnessed 398 mass shootings in the last 50 years, resulting in 1,996 deaths and 2,488 injuries. The latest and worst mass shooting, on October 2, 2017, killed 58 and injured 515 so far. The number of people injured in this attack is more than the number of people injured in all mass shootings of 2015 and 2016 combined. Over the last 50 years, there have been on average 7 mass shootings per year, claiming 39 lives and injuring 48 people per year.

    Content

    Geography: United States of America

    Time period: 1966-2017

    Unit of analysis: Mass Shooting Attack

    Dataset: The dataset contains detailed information of 398 mass shootings in the United States of America that killed 1996 and injured 2488 people.

    Variables: The dataset contains Serial No, Title, Location, Date, Summary, Fatalities, Injured, Total Victims, Mental Health Issue, Race, Gender, and Lat-Long information.

    Acknowledgements

    I’ve consulted several public datasets and web pages to compile this data. Some of the major data sources include Wikipedia, Mother Jones, Stanford, USA Today and other web sources.

    Inspiration

    With a broken heart, I'd like to call on my fellow Kagglers to use Machine Learning and Data Science to help me explore these ideas:

    • How many people got killed and injured per year?

    • Visualize mass shootings on the U.S map

    • Is there any correlation between shooter and his/her race, gender

    • Any correlation with calendar dates? Do we have more deadly days, weeks or months on average

    • What cities and states are more prone to such attacks

    • Can you find and combine any other external datasets to enrich the analysis, for example, gun ownership by state

    • Any other pattern you see that can help in prediction, crowd safety or in-depth analysis of the event

    • How many shooters have some kind of mental health problem? Can we compare that shooter with general population with same condition

    Mass Shootings Dataset Ver 3

    This is the new Version of Mass Shootings Dataset. I've added eight new variables:

    1. Incident Area (where the incident took place),
    2. Open/Close Location (Inside a building or open space)
    3. Target (possible target audience or company),
    4. Cause (Terrorism, Hate Crime, Fun (for no obvious reason), etc.)
    5. Policeman Killed (how many on duty officers got killed)
    6. Age (age of the shooter)
    7. Employed (Y/N)
    8. Employed at (Employer Name)

    Age, Employed and Employed at (3 variables) contain shooter details

    Mass Shootings Dataset Ver 4

    Quite a few missing values have been filled in

    Mass Shootings Dataset Ver 5

    Three more recent mass shootings have been added including the Texas Church shooting of November 5, 2017

    I hope it will help create more visualization and extract patterns.

    Keep Coding!

  3. Johns Hopkins COVID-19 Case Tracker

    • data.world
    • kaggle.com
    csv, zip
    Updated Dec 3, 2025
    Cite
    The Associated Press (2025). Johns Hopkins COVID-19 Case Tracker [Dataset]. https://data.world/associatedpress/johns-hopkins-coronavirus-case-tracker
    Explore at:
    Available download formats: zip, csv
    Dataset updated
    Dec 3, 2025
    Authors
    The Associated Press
    Time period covered
    Jan 22, 2020 - Mar 9, 2023
    Area covered
    Description

    Updates

    • Notice of data discontinuation: Since the start of the pandemic, AP has reported case and death counts from data provided by Johns Hopkins University. Johns Hopkins University has announced that they will stop their daily data collection efforts after March 10. As Johns Hopkins stops providing data, the AP will also stop collecting daily numbers for COVID cases and deaths. The HHS and CDC now collect and visualize key metrics for the pandemic. AP advises using those resources when reporting on the pandemic going forward.

    • April 9, 2020

      • The population estimate data for New York County, NY has been updated to include all five New York City counties (Kings County, Queens County, Bronx County, Richmond County and New York County). This has been done to match the Johns Hopkins COVID-19 data, which aggregates counts for the five New York City counties to New York County.
    • April 20, 2020

      • Johns Hopkins death totals in the US now include confirmed and probable deaths in accordance with CDC guidelines as of April 14. One significant result of this change was an increase of more than 3,700 deaths in the New York City count. This change will likely result in increases for death counts elsewhere as well. The AP does not alter the Johns Hopkins source data, so probable deaths are included in this dataset as well.
    • April 29, 2020

      • The AP is now providing timeseries data for counts of COVID-19 cases and deaths. The raw counts are provided here unaltered, along with a population column with Census ACS-5 estimates and calculated daily case and death rates per 100,000 people. Please read the updated caveats section for more information.
    • September 1st, 2020

      • Johns Hopkins is now providing counts for the five New York City counties individually.
    • February 12, 2021

      • The Ohio Department of Health recently announced that as many as 4,000 COVID-19 deaths may have been underreported through the state’s reporting system, and that the "daily reported death counts will be high for a two to three-day period."
      • Because deaths data will be anomalous for consecutive days, we have chosen to freeze Ohio's rolling average for daily deaths at the last valid measure until Johns Hopkins is able to back-distribute the data. The raw daily death counts, as reported by Johns Hopkins and including the backlogged death data, will still be present in the new_deaths column.
    • February 16, 2021

      • Johns Hopkins has reconciled Ohio's historical deaths data with the state.

    Overview

    The AP is using data collected by the Johns Hopkins University Center for Systems Science and Engineering as our source for outbreak caseloads and death counts for the United States and globally.

    The Hopkins data is available at the county level in the United States. The AP has paired this data with population figures and county rural/urban designations, and has calculated caseload and death rates per 100,000 people. Be aware that caseloads may reflect the availability of tests -- and the ability to turn around test results quickly -- rather than actual disease spread or true infection rates.
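
    The per-100,000 rate mentioned above is a straightforward scaling of a count by population. A minimal sketch follows; the function name and the example figures are illustrative assumptions, not values taken from the dataset.

```python
def rate_per_100k(count: int, population: int) -> float:
    """Return count per 100,000 residents; population must be positive."""
    if population <= 0:
        raise ValueError("population must be positive")
    # Multiply before dividing to keep the arithmetic exact for round inputs.
    return count * 100_000 / population

# Illustrative figures only: 250 cases in a county of 50,000 people.
example = rate_per_100k(250, 50_000)
```

    Scaling by population makes counties of different sizes comparable, though the caveat about test availability applies to the rates as much as to the raw counts.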

    This data is from the Hopkins dashboard that is updated regularly throughout the day. Like all organizations dealing with data, Hopkins is constantly refining and cleaning up their feed, so there may be brief moments where data does not appear correctly. At this link, you’ll find the Hopkins daily data reports, and a clean version of their feed.

    The AP is updating this dataset hourly at 45 minutes past the hour.

    To learn more about AP's data journalism capabilities for publishers, corporations and financial institutions, go here or email kromano@ap.org.

    Queries

    Use AP's queries to filter the data or to join to other datasets we've made available to help cover the coronavirus pandemic

    Interactive

    The AP has designed an interactive map to track COVID-19 cases reported by Johns Hopkins.

    @(https://datawrapper.dwcdn.net/nRyaf/15/)

    Interactive Embed Code

    <iframe title="USA counties (2018) choropleth map Mapping COVID-19 cases by county" aria-describedby="" id="datawrapper-chart-nRyaf" src="https://datawrapper.dwcdn.net/nRyaf/10/" scrolling="no" frameborder="0" style="width: 0; min-width: 100% !important;" height="400"></iframe><script type="text/javascript">(function() {'use strict';window.addEventListener('message', function(event) {if (typeof event.data['datawrapper-height'] !== 'undefined') {for (var chartId in event.data['datawrapper-height']) {var iframe = document.getElementById('datawrapper-chart-' + chartId) || document.querySelector("iframe[src*='" + chartId + "']");if (!iframe) {continue;}iframe.style.height = event.data['datawrapper-height'][chartId] + 'px';}}});})();</script>
    

    Caveats

    • This data represents the number of cases and deaths reported by each state and has been collected by Johns Hopkins from a number of sources cited on their website.
    • In some cases, deaths or cases of people who've crossed state lines -- either to receive treatment or because they became sick and couldn't return home while traveling -- are reported in a state they aren't currently in, because of state reporting rules.
    • In some states, there are a number of cases not assigned to a specific county -- for those cases, the county name is "unassigned to a single county"
    • This data should be credited to Johns Hopkins University's COVID-19 tracking project. The AP is simply making it available here for ease of use for reporters and members.
    • Caseloads may reflect the availability of tests -- and the ability to turn around test results quickly -- rather than actual disease spread or true infection rates.
    • Population estimates at the county level are drawn from 2014-18 5-year estimates from the American Community Survey.
    • The Urban/Rural classification scheme is from the Centers for Disease Control and Prevention's National Center for Health Statistics. It puts each county into one of six categories -- from Large Central Metro to Non-Core -- according to population and other characteristics. More details about the classifications can be found here.

    Johns Hopkins timeseries data

    • Johns Hopkins pulls data regularly to update their dashboard. Once a day, around 8pm EDT, Johns Hopkins adds the counts for all areas they cover to the timeseries file. These counts are snapshots of the latest cumulative counts provided by the source on that day. This can lead to inconsistencies if a source updates their historical data for accuracy, either increasing or decreasing the latest cumulative count.
    • Johns Hopkins periodically edits their historical timeseries data for accuracy. They provide a file documenting all errors in their timeseries files that they have identified and fixed here
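
    Because the timeseries holds cumulative snapshots, daily new counts come from differencing consecutive days, and a downward revision shows up as a negative difference. The sketch below illustrates this with a made-up series; the function names are hypothetical and this is not AP's or Johns Hopkins' own code.

```python
def daily_new_counts(cumulative):
    """Difference consecutive cumulative totals into per-day new counts."""
    return [today - yesterday for yesterday, today in zip(cumulative, cumulative[1:])]

def revised_days(cumulative):
    """Return indexes (into the cumulative list) where the total dropped."""
    return [i + 1 for i, delta in enumerate(daily_new_counts(cumulative)) if delta < 0]

# Made-up cumulative series; the dip on the fourth day mimics a downward revision.
totals = [10, 15, 21, 19, 25]
new_counts = daily_new_counts(totals)
flagged = revised_days(totals)
```

    Flagging negative days rather than silently clipping them to zero keeps the revision visible to whoever analyzes the series.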

    Attribution

    This data should be credited to Johns Hopkins University COVID-19 tracking project

  4. Fatal Police Shootings in the US (2015-2020)

    • kaggle.com
    zip
    Updated Jun 1, 2020
    Cite
    Larxel (2020). Fatal Police Shootings in the US (2015-2020) [Dataset]. https://www.kaggle.com/datasets/andrewmvd/police-deadly-force-usage-us/code
    Explore at:
    Available download formats: zip (135929 bytes)
    Dataset updated
    Jun 1, 2020
    Authors
    Larxel
    License

    Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0) https://creativecommons.org/licenses/by-nc-sa/4.0/
    License information was derived automatically

    Description

    About this dataset

    The Washington Post compiled a dataset of every fatal shooting in the United States by a police officer in the line of duty since Jan. 1, 2015.

    In 2015, The Post began tracking more than a dozen details about each killing by culling local news reports, law enforcement websites and social media and by monitoring independent databases such as Killed by Police and Fatal Encounters. The available features are:

    • Race of the deceased;
    • Circumstances of the shooting;
    • Whether the person was armed;
    • Whether the victim was experiencing a mental-health crisis;
    • Among others.

    In 2016, The Post is gathering additional information about each fatal shooting that occurs this year and is filing open-records requests with departments. More than a dozen additional details are being collected about officers in each shooting.

    The Post is documenting only those shootings in which a police officer, in the line of duty, shot and killed a civilian — the circumstances that most closely parallel the 2014 killing of Michael Brown in Ferguson, Mo., which began the protest movement culminating in Black Lives Matter and an increased focus on police accountability nationwide. The Post is not tracking deaths of people in police custody, fatal shootings by off-duty officers or non-shooting deaths.

    The FBI and the Centers for Disease Control and Prevention log fatal shootings by police, but officials acknowledge that their data is incomplete. In 2015, The Post documented more than twice as many fatal shootings by police as had been recorded by the FBI. Last year, the FBI announced plans to overhaul how it tracks fatal police encounters.

    How to use this dataset

    Acknowledgements

    If you use this dataset in your research, please credit the authors.

    BibTeX

    @misc{wapo-police-shootings-bot,
      author = {The Washington Post},
      title = {data-police-shootings},
      month = jan,
      year = 2015,
      publisher = {Github},
      url = {https://github.com/washingtonpost/data-police-shootings}
    }

    License

    CC BY NC SA 4.0

    Splash banner

    Image by pixabay available on pexels.

  5. Deaths registered weekly in England and Wales, provisional

    • ons.gov.uk
    • cy.ons.gov.uk
    xlsx
    Updated Nov 26, 2025
    + more versions
    Cite
    Office for National Statistics (2025). Deaths registered weekly in England and Wales, provisional [Dataset]. https://www.ons.gov.uk/peoplepopulationandcommunity/birthsdeathsandmarriages/deaths/datasets/weeklyprovisionalfiguresondeathsregisteredinenglandandwales
    Explore at:
    Available download formats: xlsx
    Dataset updated
    Nov 26, 2025
    Dataset provided by
    Office for National Statistics http://www.ons.gov.uk/
    License

    Open Government Licence 3.0 http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/
    License information was derived automatically

    Description

    Provisional counts of the number of deaths registered in England and Wales, by age, sex, region and Index of Multiple Deprivation (IMD), in the latest weeks for which data are available.

  6. Number of United States military fatalities in major wars 1775-2025

    • statista.com
    • boostndoto.org
    • +4more
    Updated Jul 30, 2025
    Cite
    Statista (2025). Number of United States military fatalities in major wars 1775-2025 [Dataset]. https://www.statista.com/statistics/1009819/total-us-military-fatalities-in-american-wars-1775-present/
    Explore at:
    Dataset updated
    Jul 30, 2025
    Dataset authored and provided by
    Statista http://statista.com/
    Area covered
    United States
    Description

    The American Civil War is the conflict with the largest number of American military fatalities in history. In fact, the Civil War's death toll is comparable to all other major wars combined, the deadliest of which were the World Wars, which have a combined death toll of more than 520,000 American fatalities. The ongoing series of conflicts and interventions in the Middle East and North Africa, collectively referred to as the War on Terror in the west, has a combined death toll of more than 7,000 for the U.S. military since 2001.

    Other records

    In terms of the number of deaths per day, the American Civil War is still at the top, with an average of 425 deaths per day, while the First and Second World Wars have averages of roughly 100 and 200 fatalities per day respectively. Technically, the costliest battle in U.S. military history was the Battle of Elsenborn Ridge, which was a part of the Battle of the Bulge in the Second World War, and saw upwards of 5,000 deaths over 10 days. However, the Battle of Gettysburg had more military fatalities of American soldiers, with almost 3,200 Union deaths and over 3,900 Confederate deaths, giving a combined total of more than 7,000. The Battle of Antietam is viewed as the bloodiest day in American military history, with over 3,600 combined fatalities and almost 23,000 total casualties on September 17, 1862.

    Revised Civil War figures

    For more than a century, the total death toll of the American Civil War was generally accepted to be around 620,000, a number which was first proposed by Union historians William F. Fox and Thomas L. Livermore in 1888. This number was calculated by using enlistment figures, battle reports, and census data; however, many prominent historians since then have thought the number should be higher. In 2011, historian J. David Hacker conducted further investigations and claimed that the number was closer to 750,000 (and possibly as high as 850,000). While many Civil War historians agree that this is possible, and even likely, obtaining consistently accurate figures has proven to be impossible until now; both sides were poor at keeping detailed records throughout the war, and much of the Confederacy's records were lost by the war's end. Many Confederate widows also did not register their husbands' deaths with the authorities, as they would have then been ineligible for benefits.

  7. Kill Devil Hills, NC Annual Population and Growth Analysis Dataset: A...

    • neilsberg.com
    csv, json
    Updated Jul 30, 2024
    + more versions
    Cite
    Neilsberg Research (2024). Kill Devil Hills, NC Annual Population and Growth Analysis Dataset: A Comprehensive Overview of Population Changes and Yearly Growth Rates in Kill Devil Hills from 2000 to 2023 // 2024 Edition [Dataset]. https://www.neilsberg.com/insights/kill-devil-hills-nc-population-by-year/
    Explore at:
    Available download formats: csv, json
    Dataset updated
    Jul 30, 2024
    Dataset authored and provided by
    Neilsberg Research
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Kill Devil Hills, Kill Devil Hills, NC, North Carolina
    Variables measured
    Annual Population Growth Rate, Population Between 2000 and 2023, Annual Population Growth Rate Percent
    Measurement technique
    The data presented in this dataset is derived from more than 20 years of U.S. Census Bureau Population Estimates Program (PEP) data, 2000 - 2023. To measure the variables, namely (a) population and (b) population change (in absolute terms and as a percentage), we initially analyzed and tabulated the data for each of the years between 2000 and 2023. For further information regarding these estimates, please feel free to reach out to us via email at research@neilsberg.com.
    Dataset funded by
    Neilsberg Research
    Description
    About this dataset

    Context

    The dataset tabulates the Kill Devil Hills population over the last 20 plus years. It lists the population for each year, along with the year on year change in population, as well as the change in percentage terms for each year. The dataset can be utilized to understand the population change of Kill Devil Hills across the last two decades. For example, using this dataset, we can identify whether the population is declining or increasing and, if there is a change, when the population peaked or whether it is still growing and has not reached its peak. We can also compare the trend with the overall trend of the United States population over the same period of time.

    Key observations

    In 2023, the population of Kill Devil Hills was 7,778, a 0.32% year-over-year decrease from 2022. Previously, in 2022, the Kill Devil Hills population was 7,803, a decline of 0.12% compared to a population of 7,812 in 2021. Over the last 20 plus years, between 2000 and 2023, the population of Kill Devil Hills increased by 1,865. In this period, the peak population was 7,812 in the year 2021. The numbers suggest that the population has already reached its peak and is showing a trend of decline. Source: U.S. Census Bureau Population Estimates Program (PEP).

    Content

    When available, the data consists of estimates from the U.S. Census Bureau Population Estimates Program (PEP).

    Data Coverage:

    • From 2000 to 2023

    Variables / Data Columns

    • Year: This column displays the data year (Measured annually and for years 2000 to 2023)
    • Population: The population for the specific year for the Kill Devil Hills is shown in this column.
    • Year on Year Change: This column displays the change in Kill Devil Hills population for each year compared to the previous year.
    • Change in Percent: This column displays the year on year change as a percentage. Please note that the sum of all percentages may not equal one due to rounding of values.
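
    The two derived columns can be recomputed from the Year and Population columns alone. The sketch below does so for the three years quoted in the key observations; the function name is hypothetical, and rounding to two decimals is an assumption about how the published percentages were produced.

```python
def year_on_year(populations):
    """Return (year, change, percent change) rows for each year after the first."""
    years = sorted(populations)
    rows = []
    for prev, curr in zip(years, years[1:]):
        change = populations[curr] - populations[prev]
        percent = round(change / populations[prev] * 100, 2)
        rows.append((curr, change, percent))
    return rows

# Population figures quoted in the key observations above.
population = {2021: 7812, 2022: 7803, 2023: 7778}
rows = year_on_year(population)
```

    For these inputs the changes are -9 and -25 people, matching the 0.12% and 0.32% declines stated in the key observations.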

    Good to know

    Margin of Error

    Data in the dataset are based on estimates and are subject to sampling variability and thus a margin of error. Neilsberg Research recommends using caution when presenting these estimates in your research.

    Custom data

    If you need custom data for a research project, report or presentation, you can contact our research staff at research@neilsberg.com to discuss the feasibility of a custom tabulation on a fee-for-service basis.

    Inspiration

    The Neilsberg Research Team curates, analyzes and publishes demographic and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research aggregated datasets and insights are made available for free download at https://www.neilsberg.com/research/.

    Recommended for further research

    This dataset is a part of the main dataset for Kill Devil Hills Population by Year. You can refer to it here

  8. American Community Survey: 1-Year Estimates: Detailed Tables 1-Year

    • catalog.data.gov
    • datasets.ai
    Updated Jul 19, 2023
    + more versions
    Cite
    U.S. Census Bureau (2023). American Community Survey: 1-Year Estimates: Detailed Tables 1-Year [Dataset]. https://catalog.data.gov/dataset/american-community-survey-1-year-estimates-detailed-tables-1-year-3092c
    Explore at:
    Dataset updated
    Jul 19, 2023
    Dataset provided by
    United States Census Bureau http://census.gov/
    Description

    The American Community Survey (ACS) is an ongoing survey that provides data every year -- giving communities the current information they need to plan investments and services. The ACS covers a broad range of topics about social, economic, demographic, and housing characteristics of the U.S. population. Much of the ACS data provided on the Census Bureau's Web site are available separately by age group, race, Hispanic origin, and sex. Summary files, Subject tables, Data profiles, and Comparison profiles are available for the nation, all 50 states, the District of Columbia, Puerto Rico, every congressional district, every metropolitan area, and all counties and places with populations of 65,000 or more. Detailed Tables contain the most detailed cross-tabulations published for areas 65k and more. The data are population counts. There are over 31,000 variables in this dataset.

  9. Annual AIDS Deaths in the U.S.: Declining Trend (1981-2021)

    • consumershield.com
    csv
    Updated Nov 8, 2024
    Cite
    ConsumerShield Research Team (2024). Annual AIDS Deaths in the U.S.: Declining Trend (1981-2021) [Dataset]. https://www.consumershield.com/articles/aids-deaths-per-year
    Explore at:
    Available download formats: csv
    Dataset updated
    Nov 8, 2024
    Dataset authored and provided by
    ConsumerShield Research Team
    License

    Attribution-ShareAlike 4.0 (CC BY-SA 4.0) https://creativecommons.org/licenses/by-sa/4.0/
    License information was derived automatically

    Area covered
    United States
    Description

    The graph depicts the number of AIDS-related deaths in the United States annually from 1981 to 2021. The x-axis represents the years, labeled with two-digit abbreviations from '81 to '21, while the y-axis shows the number of deaths in thousands. Over this 41-year span, AIDS deaths increased dramatically from 1,675.77 in 1981, reaching a peak of 43,276.94 in 1994, and then declined significantly to 6,306.24 by 2021. The data highlights a sharp upward trend in the early years of the epidemic, followed by a substantial downward trend starting in the mid-1990s, reflecting improvements in treatment and prevention. The information is presented in a line graph format, effectively illustrating the rise and subsequent decline in AIDS-related fatalities over the four decades.

  10. ACS 5 Year Data by Community Area

    • catalog.data.gov
    Updated Jun 7, 2025
    + more versions
    Cite
    data.cityofchicago.org (2025). ACS 5 Year Data by Community Area [Dataset]. https://catalog.data.gov/dataset/acs-5-year-data-by-community-area
    Explore at:
    Dataset updated
    Jun 7, 2025
    Dataset provided by
    data.cityofchicago.org
    Description

    Selected variables from the most recent American Community Survey (ACS) release (released 2023), aggregated by Community Area. Additional years will be added as they become available. The underlying algorithm calculates the percentage of each census tract that falls within the boundaries of a given community area. Because census tract and community area boundaries are not aligned, these figures should be considered estimates.

    Total population in this dataset: 2,647,621
    Total Chicago population per ACS 2023: 2,664,452
    % Difference: -0.632%

    There are different approaches in common use for displaying Hispanic or Latino population counts. In this dataset, following the approach taken by the Census Bureau, a person who identifies as Hispanic or Latino is also counted in the race category with which they identify. However, again following the Census Bureau data, there is also a column for White Not Hispanic or Latino.

    Code can be found here: https://github.com/Chicago/5-Year-ACS-Survey-Data
    Community Area Shapefile: https://data.cityofchicago.org/Facilities-Geographic-Boundaries/Boundaries-Community-Areas-current-/cauq-8yn6
    Census Area Python Package Documentation: https://census-area.readthedocs.io/en/latest/index.html
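
    The tract-to-community-area apportionment described above can be sketched roughly as follows. This is an illustrative sketch only, not the code from the linked repository: the tract IDs, populations, and overlap shares are hypothetical, and it assumes population is spread evenly within each tract. The last function reproduces the percent-difference figure quoted in the description.

```python
# Hypothetical inputs: population per census tract, and the share of each
# tract's area that falls inside each community area (assumed precomputed
# from a geometric overlay; the values here are made up for illustration).
tract_population = {"T1": 4000, "T2": 2500, "T3": 3200}
overlap_share = {
    ("T1", "A"): 1.0,   # tract T1 lies entirely inside community area A
    ("T2", "A"): 0.4,   # tract T2 is split between A and B
    ("T2", "B"): 0.6,
    ("T3", "B"): 0.95,  # 5% of T3 falls outside every community area
}


def apportion(populations, shares):
    """Sum each tract's population into community areas, weighted by area share."""
    totals = {}
    for (tract, area), share in shares.items():
        totals[area] = totals.get(area, 0.0) + populations[tract] * share
    return totals


def percent_difference(estimate, reference):
    """Signed difference of the estimate from the reference, in percent."""
    return (estimate - reference) / reference * 100.0


estimates = apportion(tract_population, overlap_share)
citywide_gap = round(percent_difference(2_647_621, 2_664_452), 3)
```

    Tracts that extend past every community area boundary lose part of their population in this scheme, which is one plausible reason an aggregated total can fall slightly below the citywide figure.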

  11. Population Health (BRFSS: HRQOL)

    • kaggle.com
    zip
    Updated Dec 14, 2022
    Cite
    The Devastator (2022). Population Health (BRFSS: HRQOL) [Dataset]. https://www.kaggle.com/datasets/thedevastator/unlock-population-health-needs-with-brfss-hrqol
    Explore at:
    Available download formats: zip (2247473 bytes)
    Dataset updated
    Dec 14, 2022
    Authors
    The Devastator
    Description

    Population Health (BRFSS: HRQOL)

    Examining Trends, Disparities and Determinants of Health in the US Population

    By Health [source]

    About this dataset

    The Behavioral Risk Factor Surveillance System (BRFSS) offers an expansive collection of data on health-related quality of life (HRQOL) from 1993 to 2010. The Health-Related Quality of Life dataset is drawn from a comprehensive survey reflecting the health and well-being of non-institutionalized US adults aged 18 years or older. The data can help track and identify unmet population health needs, recognize trends, identify disparities in healthcare, identify determinants of public health, inform decision making and policy development, and evaluate programs within public healthcare services.

    The HRQOL surveillance system has developed a compact set of HRQOL measures, such as a summary measure of unhealthy days, which have been validated for population health surveillance and widely used in practice since 1993. The dataset includes the year recorded, location abbreviations and descriptions, category and topic, the survey questions asked, the type and unit of each data value, sample sizes, and geographic locations.

    How to use the dataset

    This dataset tracks the Health-Related Quality of Life (HRQOL) from 1993 to 2010 using data from the Behavioral Risk Factor Surveillance System (BRFSS). This dataset includes information on the year, location abbreviation, location description, type and unit of data value, sample size, category and topic of survey questions.

    Using this dataset on BRFSS: HRQOL data between 1993-2010 will allow for a variety of analyses related to population health needs. The compact set of HRQOL measures can be used to identify trends in population health needs as well as determine disparities among various locations. Additionally, responses to survey questions can be used to inform decision making and program and policy development in public health initiatives.

    Research Ideas

    • Analyzing trends in HRQOL over the years by location to identify disparities in health outcomes between different populations and develop targeted policy interventions.
    • Developing new models for predicting HRQOL indicators at a regional level, and using this information to inform medical practice and public health implementation efforts.
    • Using the data to understand differences between states in their HRQOL scores and to establish best practices for healthcare provision based on that understanding, including areas such as access to care and availability of preventative care services.

    Acknowledgements

    If you use this dataset in your research, please credit the original authors. Data Source

    License

    See the dataset description for more information.

    Columns

    File: rows.csv

    | Column name | Description |
    |:-------------------------------|:----------------------------------------------------------|
    | Year | Year of survey. (Integer) |
    | LocationAbbr | Abbreviation of location. (String) |
    | LocationDesc | Description of location. (String) |
    | Category | Category of survey. (String) |
    | Topic | Topic of survey. (String) |
    | Question | Question asked in survey. (String) |
    | DataSource | Source of data. (String) |
    | Data_Value_Unit | Unit of data value. (String) |
    | Data_Value_Type | Type of data value. (String) |
    | Data_Value_Footnote_Symbol | Footnote symbol for data value. (String) |
    | Data_Value_Std_Err | Standard error of the data value. (Float) |
    | Sample_Size | Sample size used in sample. (Integer) |
    | Break_Out | Break out categories used. (String) |
    | Break_Out_Category | Type break out assessed. (String) |
    | **GeoLocation*...

  12. Deaths by vaccination status, England

    • ons.gov.uk
    • cy.ons.gov.uk
    xlsx
    Updated Aug 25, 2023
    + more versions
    Cite
    Office for National Statistics (2023). Deaths by vaccination status, England [Dataset]. https://www.ons.gov.uk/peoplepopulationandcommunity/birthsdeathsandmarriages/deaths/datasets/deathsbyvaccinationstatusengland
    Explore at:
    Available download formats: xlsx
    Dataset updated
    Aug 25, 2023
    Dataset provided by
    Office for National Statistics (http://www.ons.gov.uk/)
    License

    Open Government Licence 3.0: http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/
    License information was derived automatically

    Description

    Age-standardised mortality rates for deaths involving coronavirus (COVID-19), non-COVID-19 deaths and all deaths by vaccination status, broken down by age group.

  13. US County Demographics

    • kaggle.com
    zip
    Updated Jan 24, 2023
    Cite
    The Devastator (2023). US County Demographics [Dataset]. https://www.kaggle.com/datasets/thedevastator/us-county-demographics/data
    Explore at:
    Available download formats: zip (7779793 bytes)
    Dataset updated
    Jan 24, 2023
    Authors
    The Devastator
    Area covered
    United States
    Description

    US County Demographics

    Social, Health, and Economic Indicators

    By Danny [source]

    About this dataset

    This dataset contains US county-level demographic data from 2016, giving insight into the health and economic conditions of counties in the United States. Aggregated and filtered from sources such as the US Census Small Area Income and Poverty Estimates (SAIPE) Program, the American Community Survey, and the CDC National Center for Health Statistics, it provides information on population as well as desert population for each county. Data is split between metropolitan and nonmetropolitan areas according to the Office of Management and Budget's 2013 classification scheme. Infant mortality rates and total population are also included.

    How to use the dataset

    • Read the 'About this dataset' section to understand which data sources were used to create this dataset and any transformations applied while creating it.
    • Familiarize yourself with the columns in the dataset to understand what information is available for each county, such as total population (totpop), parental education level (educationLvl), and median household income (medianIncome).
    • Use a combination of filtering and sorting to narrow results to the county demographics you are looking for, such as total households living below the poverty line by state or median household income for two counties.
    • Keep in mind any transformations, simplifications, or aggregations applied when the dataset was built. For example, if variables were pivoted from rows into columns to consolidate multiple years of income levels, some states may not appear in all records, which could produce missing values or other inconsistencies in downstream analysis.
    • Consult resources such as Wikipedia and government census estimates if you need more detail on these demographic characteristics than the dataset provides.
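
    The filtering and sorting step in the list above might look something like the sketch below. Only the column names (state_fips, totpop, medianIncome, fips) come from the dataset description; the sample rows are made-up placeholder values, and the helper name top_counties_by_income is hypothetical.

```python
import csv
import io

# Placeholder rows for illustration only; these are NOT values from the dataset.
SAMPLE = """fips,state_fips,totpop,medianIncome
17001,17,1000,50000
17003,17,2000,84000
6001,6,3000,61000
17005,17,1500,80000
"""


def top_counties_by_income(csv_text, state_fips, limit=2):
    """Filter rows to one state, then sort by median household income, descending."""
    rows = csv.DictReader(io.StringIO(csv_text))
    in_state = [r for r in rows if int(r["state_fips"]) == state_fips]
    in_state.sort(key=lambda r: int(r["medianIncome"]), reverse=True)
    return [(r["fips"], int(r["medianIncome"])) for r in in_state[:limit]]


result = top_counties_by_income(SAMPLE, state_fips=17)
```

    With a real download, the in-memory string would be replaced by the opened CSV file, and missing or non-numeric values would need handling before the int() conversions.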

    Research Ideas

    • Creating a US county-level heat map of infant mortality rates, offering insight into which areas are most at risk for poor health outcomes.
    • Generating predictive models from the population data to anticipate and prepare for future population trends in different states or regions.
    • Developing an interactive web-based tool for school districts to explore potential impacts of student mobility on their area's population stability and diversity

    Acknowledgements

    If you use this dataset in your research, please credit the original authors. Data Source

    License

    License: Dataset copyright by authors

    You are free to:
    • Share - copy and redistribute the material in any medium or format for any purpose, even commercially.
    • Adapt - remix, transform, and build upon the material for any purpose, even commercially.

    You must:
    • Give appropriate credit - Provide a link to the license, and indicate if changes were made.
    • ShareAlike - You must distribute your contributions under the same license as the original.
    • Keep intact - all notices that refer to this license, including copyright notices.

    Columns

    File: Food Desert.csv

    | Column name | Description |
    |:--------------------|:----------------------------------------------------------------------------------|
    | year | The year the data was collected. (Integer) |
    | fips | The Federal Information Processing Standard (FIPS) code for the county. (Integer) |
    | state_fips | The FIPS code for the state. (Integer) |
    | county_fips | The FIPS code for the county. (Integer)...

  14. Number of victims of the Holocaust and Nazi persecution 1933-1945, by...

    • statista.com
    Updated Feb 5, 2022
    Cite
    Statista (2022). Number of victims of the Holocaust and Nazi persecution 1933-1945, by background [Dataset]. https://www.statista.com/statistics/1071011/holocaust-nazi-persecution-victims-wwii/
    Explore at:
    Dataset updated
    Feb 5, 2022
    Dataset authored and provided by
    Statista (http://statista.com/)
    Area covered
    Europe
    Description

    Most estimates place the total number of deaths during the Second World War at around 70-85 million people. Approximately 17 million of these deaths (20-25 percent of the total) were due to crimes against humanity carried out by the Nazi regime in Europe. In comparison to the millions of deaths that took place through conflict, famine, or disease, these 17 million stand out due to the reasoning behind them, along with the systematic nature and scale in which they were carried out. Nazi ideology claimed that the Aryan race (a non-existent ethnic group referring to northern Europeans) was superior to all other ethnicities; this became the justification for German expansion and the extermination of others. During the war, millions of people deemed to be of lesser races were captured and used as slave laborers, with a large share dying of exhaustion, starvation, or individual execution. Murder campaigns were also used for systematic extermination; the most famous of these were the extermination camps, such as at Auschwitz, where roughly 80 percent of the 1.1 million victims were murdered in gas chambers upon arrival at the camp. German death squads in Eastern Europe carried out widespread mass shootings, and up to two million people were killed in this way. In Germany itself, many disabled, homosexual, and "undesirables" were also killed or euthanized as part of a wider eugenics program, which aimed to "purify" German society.

    The Holocaust

    Of all races, the Nazis viewed Jews as being the most inferior. Conspiracy theories involving Jews go back for centuries in Europe, and they have been repeatedly marginalized throughout history. German fascists used the Jews as scapegoats for the economic struggles during the interwar period. Following Hitler's ascendency to the Chancellorship in 1933, the German authorities began constructing concentration camps for political opponents and so-called undesirables, but the share of Jews being transported to these camps gradually increased in the following years, particularly after Kristallnacht (the Night of Broken Glass) in 1938. In 1939, Germany then invaded Poland, home to Europe's largest Jewish population. German authorities segregated the Jewish population into ghettos, and constructed thousands more concentration and detention camps across Eastern Europe, to which millions of Jews were transported from other territories. By the end of the war, over two thirds of Europe's Jewish population had been killed, and this share is higher still when one excludes the neutral or non-annexed territories.

    Lebensraum

    Another key aspect of Nazi ideology was that of the Lebensraum (living space). The populations of both the Soviet Union and the United States were heavily concentrated on one side of the country, with vast territories extending to the east and west, respectively. Germany was much smaller and more densely populated, so Hitler aspired to extend Germany's territory to the east and create new "living space" for Germany's population and industry to grow. While Hitler may have envied the U.S. in this regard, the USSR was seen as undeserving; Slavs were the largest major group in the east and the Nazis viewed them as inferior, which was again used to justify the annexation of their land and subjugation of their people. As the Germans took Slavic lands in Poland, the USSR, and Yugoslavia, ethnic cleansings (often with the help of local conspirators) became commonplace in the annexed territories. It is also believed that the majority of Soviet prisoners of war (PoWs) died through starvation and disease, and they were not given the same treatment as PoWs on the western front. The Soviet Union lost as many as 27 million people during the war, and 10 million of these were due to Nazi genocide. It is estimated that Poland lost up to six million people, and almost all of these were through genocide.

  15. FiveThirtyEight Police Killings Dataset

    • kaggle.com
    zip
    Updated Apr 26, 2019
    Cite
    FiveThirtyEight (2019). FiveThirtyEight Police Killings Dataset [Dataset]. https://www.kaggle.com/fivethirtyeight/fivethirtyeight-police-killings-dataset
    Explore at:
    Available download formats: zip (53916 bytes)
    Dataset updated
    Apr 26, 2019
    Dataset authored and provided by
    FiveThirtyEight (https://abcnews.go.com/538)
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Content

    Police Killings

    This directory contains the data behind the story Where Police Have Killed Americans In 2015.

    We linked entries from the Guardian's database on police killings to census data from the American Community Survey. The Guardian data was downloaded on June 2, 2015. More information about its database is available here.

    Census data was calculated at the tract level from the 2015 5-year American Community Survey using the tables S0601 (demographics), S1901 (tract-level income and poverty), S1701 (employment and education) and DP03 (county-level income). Census tracts were determined by geocoding addresses to latitude/longitude using the Bing Maps and Google Maps APIs and then overlaying points onto 2014 census tracts. GEOIDs are census-standard and should be easily joinable to other ACS tables -- let us know if you find anything interesting.

    Field descriptions:

    | Header | Description | Source |
    |:-----------------------|:---------------------------------------------------------|:-----------------------|
    | name | Name of deceased | Guardian |
    | age | Age of deceased | Guardian |
    | gender | Gender of deceased | Guardian |
    | raceethnicity | Race/ethnicity of deceased | Guardian |
    | month | Month of killing | Guardian |
    | day | Day of incident | Guardian |
    | year | Year of incident | Guardian |
    | streetaddress | Address/intersection where incident occurred | Guardian |
    | city | City where incident occurred | Guardian |
    | state | State where incident occurred | Guardian |
    | latitude | Latitude, geocoded from address | |
    | longitude | Longitude, geocoded from address | |
    | state_fp | State FIPS code | Census |
    | county_fp | County FIPS code | Census |
    | tract_ce | Tract ID code | Census |
    | geo_id | Combined tract ID code | |
    | county_id | Combined county ID code | |
    | namelsad | Tract description | Census |
    | lawenforcementagency | Agency involved in incident | Guardian |
    | cause | Cause of death | Guardian |
    | armed | How/whether deceased was armed | Guardian |
    | pop | Tract population | Census |
    | share_white | Share of pop that is non-Hispanic white | Census |
    | share_bloack | Share of pop that is black (alone, not in combination) | Census |
    | share_hispanic | Share of pop that is Hispanic/Latino (any race) | Census |
    | p_income | Tract-level median personal income | Census |
    | h_income | Tract-level median household income | Census |
    | county_income | County-level median household income | Census |
    | comp_income | h_income / county_income | Calculated from Census |
    | county_bucket | Household income, quintile within county | Calculated from Census |
    | nat_bucket | Household income, quintile nationally | Calculated from Census |
    | pov | Tract-level poverty rate (official) | Census |
    | urate | Tract-level unemployment rate | Calculated from Census |
    | college | Share of 25+ pop with BA or higher | Calculated from Census |
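
    The field list describes geo_id and county_id as combined codes. A minimal sketch of how such codes are conventionally assembled from their parts is shown below. It relies on the standard Census layout (2-digit state, 3-digit county, 6-digit tract) and assumes the component fields may have lost their leading zeros when stored as integers; the function names and example values are hypothetical.

```python
def county_geoid(state_fp, county_fp):
    """5-character county GEOID: 2-digit state FIPS + 3-digit county FIPS."""
    return f"{int(state_fp):02d}{int(county_fp):03d}"


def tract_geoid(state_fp, county_fp, tract_ce):
    """11-character tract GEOID: county GEOID + 6-digit tract code."""
    return f"{county_geoid(state_fp, county_fp)}{int(tract_ce):06d}"


# Zero-padding matters when joining to other ACS tables keyed by string GEOID.
example_county = county_geoid(1, 51)
example_tract = tract_geoid(1, 51, 30902)
```

    Keeping the result as a string avoids dropping the leading zero again, which would otherwise break joins for states with FIPS codes below 10.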

    Note regarding income calculations:

    All income fields are in inflation-adjusted 2013 dollars.

    comp_income is simply tract-level median household income as a share of county-level median household income.

    county_bucket provides where the tract's median household income falls in the distribution (by quintile) of all tracts in the county. (1 indicates a tract falls in the poorest 20% of tracts within the county.) Distribution is not weighted by population.

    nat_bucket is the same but for all U.S. counties.
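
    The income fields described above can be illustrated with a short sketch. The ratio follows the stated definition of comp_income directly. The bucket function is only one plausible way to assign unweighted quintiles; the exact cut-point and tie-breaking rule used for county_bucket and nat_bucket is not stated in the description, so treat that part as an assumption. All income values below are made up.

```python
def comp_income(h_income, county_income):
    """Tract median household income as a share of the county median."""
    return h_income / county_income


def quintile_bucket(value, all_values):
    """Return 1-5, where 1 means the value is in the poorest 20% of all_values.

    Unweighted: every tract counts once, regardless of population.
    Assumed rule: bucket by the share of values strictly below `value`.
    """
    below = sum(1 for v in all_values if v < value)
    return min(5, below * 5 // len(all_values) + 1)


# Hypothetical tract-level median household incomes within one county.
tract_incomes = [20000, 30000, 40000, 50000, 60000,
                 70000, 80000, 90000, 100000, 110000]

ratio = comp_income(45000, 60000)
buckets = [quintile_bucket(v, tract_incomes) for v in tract_incomes]
```

    Passing a national list of medians instead of one county's list gives the analogous national bucket.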

    Context

    This is a dataset from FiveThirtyEight hosted on their GitHub. Explore FiveThirtyEight data using Kaggle and all of the data sources available through the FiveThirtyEight organization page!

    • Update Frequency: This dataset is updated daily.

    Acknowledgements

    This dataset is maintained using GitHub's API and Kaggle's API.

    This dataset is distributed under the Attribution 4.0 International (CC BY 4.0) license.

  16. Covid-19 Highest City Population Density

    • kaggle.com
    zip
    Updated Mar 25, 2020
    Cite
    lookfwd (2020). Covid-19 Highest City Population Density [Dataset]. https://www.kaggle.com/lookfwd/covid19highestcitypopulationdensity
    Explore at:
    Available download formats: zip (4685 bytes)
    Dataset updated
    Mar 25, 2020
    Authors
    lookfwd
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Context

    This is a dataset of the most highly populated city (if applicable) in a form easy to join with the COVID19 Global Forecasting (Week 1) dataset. You can see how to use it in this kernel

    Content

    There are four columns. The first two correspond to columns from the original COVID19 Global Forecasting (Week 1) dataset. The other two give the highest population density, at city level, for the given country/state. Note that some countries are very small, and in those cases the population density reflects the entire country. Since the original dataset includes a few cruise ships as well, I've added them here too.

    Acknowledgements

    Thanks a lot to Kaggle for this competition that gave me the opportunity to look closely at some data and understand this problem better.

    Inspiration

    Summary: I believe that the square root of the population density should relate to the logistic growth factor of the SIR model. I think the SEIR model isn't applicable due to any intervention being too late for a fast-spreading virus like this, especially in places with dense populations.

    After playing with the data provided in COVID19 Global Forecasting (Week 1) (and everything else online or media) a bit, one thing becomes clear. They have nothing to do with epidemiology. They reflect sociopolitical characteristics of a country/state and, more specifically, the reactivity and attitude towards testing.

    The testing method used (PCR tests) means that what we measure could potentially be a proxy for the number of people infected during the last 3 weeks, i.e., the growth (with lag). It's not how many people have been infected and recovered. Antibody or serology tests would measure that, and by using them, we could go back to normality faster... but those will arrive too late. Well before then, China will have experimentally shown that it's safe to go back to normal as soon as the number of newly infected per day is close to zero.

    Figure (covid-summary.png): https://www.googleapis.com/download/storage/v1/b/kaggle-user-content/o/inbox%2F197482%2F429e0fdd7f1ce86eba882857ac7a735e%2Fcovid-summary.png?generation=1585072438685236&alt=media

    My view, as a person living in NYC, about this virus, is that by the time governments react to media pressure, to lock down or even test, it's too late. In dense areas, everyone susceptible has already had ample opportunities to be infected. Especially for a virus with a 5-14 day lag between infection and symptoms, a period during which hosts spread it all over the subway, the conditions are hopeless. Active populations have already been exposed, mostly asymptomatic and recovered. Sensitive/older populations are more self-isolated/careful in affluent societies (maybe this isn't the case in North Italy). As the virus finishes exploring the active population, it starts penetrating the more isolated ones. At this point in time, the first fatalities happen. Then testing starts. Then the media and the lockdown. Lockdown seems overly effective because it coincides with the tail of the disease spread. It helps slow down the virus exploring the long tail of the sensitive population, and we should all contribute by doing it, but it doesn't cause the end of the disease. If it did, then as soon as people were back in the streets (see China), there would be repeated outbreaks.

    Smart politicians will test a lot because it will make their condition look worse. It helps them demand more resources. At the same time, they will have a low rate of fatalities due to large denominator. They can take credit for managing well a disproportionally major crisis - in contrast to people who didn't test.

    We were lucky this time. We, Westerners, have woken up to the potential of a pandemic. I'm sure we will give further resources for prevention. Additionally, we will be more open-minded, helping politicians to have more direct responses. We will also require them to be more responsible in their messages and reactions.

  17. Indian Terrorism Deaths

    • kaggle.com
    zip
    Updated Jan 6, 2023
    Cite
    The Devastator (2023). Indian Terrorism Deaths [Dataset]. https://www.kaggle.com/datasets/thedevastator/indian-terrorism-death-toll-and-incident-informa/versions/2
    Explore at:
    Available download formats: zip (3695892 bytes)
    Dataset updated
    Jan 6, 2023
    Authors
    The Devastator
    Description

    Indian Terrorism Deaths

    Consequences of Terror Incidents in India

    By CrowdFlower [source]

    About this dataset

    This dataset contains information on deaths from terrorism in India, including death tolls due to violence, civilian deaths, and militant/terrorist/insurgent fatalities. Estimates are derived from 27,233 sentences sourced and verified from the South Asia Terrorism Portal. Each row includes variables for the state, district, and date reported, as well as features indicating the accuracy of judgments for that row. Golden rows mark the highest-accuracy annotations and include totals for civilians killed or injured according to the gold standard. Additional features, such as the trusted judgments count and the extracted subjects and objects of sentences, are also available, giving researchers access to key aspects of lethal force events in India.

    How to use the dataset

    This dataset provides information on deaths that have occurred in India due to terrorism, as well as the incident details. This can be a valuable source of information for researchers looking to better understand the impacts of terrorism on Indian society and the associated prevention measures.

    Here’s how you can use this dataset:

    • Analyze death tolls by type (civilian, militant/terrorist/insurgent, security forces): Use descriptive statistics to compare and contrast the number of deaths among civilians, militants/terrorists/insurgents, and security forces over time. You could also look for correlations between these types of incidents and other factors such as region or date.
    • Explore different regions impacted by terrorism: Explore which states or districts in India are affected most adversely by terrorist activities using location data from this dataset. You could also examine trends related to where incidents take place over time as well as total cumulative death counts per region; these findings may help inform where intense anti-terrorism efforts are required most.
    • Generate insight on key dates of events: Utilize date fields such as report date or last judgment at in order to pinpoint when certain major events have taken place related to terrorism in India; you could then dive deeper into any relevant context surrounding those dates that may spark further curiosity into the topic itself (e.g., who was involved? what was going on politically?)

    Research Ideas

    • Identifying trends in the number of deaths for different types of people over time in each district, state and country. This can be used to identify areas where violence is increasing or decreasing, and help develop interventions to reduce casualties from terrorism.
    • Investigating correlations between the type of people killed (civilians, militants/terrorists/insurgents etc.) and other factors such as political instability or development levels in the region.
    • Performing sentiment analysis on the sentences found in this dataset to measure how public opinion about terrorism is changing over time. This could be combined with other datasets such as media coverage to provide an even more comprehensive understanding of public attitudes towards terrorism

    Acknowledgements

    If you use this dataset in your research, please credit the original authors. Data Source

    License

    Unknown License - Please check the dataset description for more information.

    Columns

    File: deaths-in-india-satp-dfe.csv

    | Column name | Description |
    |:-----------------------------------------|:--------------------------------------------------------------------------------------------------------------------|
    | _golden | A boolean value indicating whether the annotation is a golden annotation or not. (Boolean) |
    | _unit_state | A value indicating the state of the annotation unit. (String) |
    | _trusted_judgments | The number of trusted judgments for the annotation unit. (Integer) |
    | _last_judgment_at | The date and time of the last judgment for the annotation unit. (DateTime) ...

  18. Countries CO2 Emission and more...

    • kaggle.com
    zip
    Updated Feb 26, 2022
    Cite
    Benjamin Vanous (2022). Countries CO2 Emission and more... [Dataset]. https://www.kaggle.com/datasets/lobosi/c02-emission-by-countrys-grouth-and-population
    Explore at:
    Available download formats: zip (1321587 bytes)
    Dataset updated
    Feb 26, 2022
    Authors
    Benjamin Vanous
    License

    http://opendatacommons.org/licenses/dbcl/1.0/

    Description

    The world is becoming more modernized by the year, and with this it is becoming all the more polluted.

    This data was pulled from the U.S. Energy Information Administration and joined together for easier analysis. It's a collection of some big factors that play into CO2 emissions, covering the production and consumption of each major type of energy source for each country and its pollution rating each year. It also includes each country's GDP, population, energy intensity per capita (person), and energy intensity per GDP (per person GDP). The data spans from the 1980s to 2020.

    Feature Descriptions:

    • Country - Country in question
    • Energy_type - Type of energy source
    • Year - Year the data was recorded
    • Energy_consumption - Amount of consumption for the specific energy source, measured in quad Btu
    • Energy_production - Amount of production for the specific energy source, measured in quad Btu
    • GDP - Country's GDP at purchasing power parities, measured in billion 2015$ PPP
    • Population - Population of the specific country, measured in Mperson (millions of persons)
    • Energy_intensity_per_capita - Energy intensity is a measure of the energy inefficiency of an economy. It is calculated as units of energy per capita (capita = individual person), measured in MMBtu/person
    • Energy_intensity_by_GDP - Energy intensity is a measure of the energy inefficiency of an economy. It is calculated as units of energy per unit of GDP, measured in 1000 Btu/2015$ GDP PPP
    • CO2_emission - The amount of CO2 emitted, measured in MMtonnes CO2
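
The units in the feature list above imply simple conversions between the consumption, population, GDP and intensity columns. The sketch below works through that arithmetic. It assumes the intensity columns are derived from total energy consumption, which the description does not state explicitly, and the function names and input numbers are made up for illustration.

```python
# Unit conversion factors implied by the column units.
MMBTU_PER_QUAD = 1e9          # 1 quad Btu = 1e15 Btu = 1e9 MMBtu
PERSONS_PER_MPERSON = 1e6     # Population is given in millions of persons
BTU_PER_QUAD = 1e15
DOLLARS_PER_BILLION = 1e9     # GDP is given in billion 2015$ PPP


def intensity_per_capita(consumption_quad_btu, population_mperson):
    """Energy per person, in MMBtu/person."""
    mmbtu = consumption_quad_btu * MMBTU_PER_QUAD
    persons = population_mperson * PERSONS_PER_MPERSON
    return mmbtu / persons


def intensity_by_gdp(consumption_quad_btu, gdp_billion_usd):
    """Energy per unit of GDP, in 1000 Btu per 2015$ PPP."""
    btu = consumption_quad_btu * BTU_PER_QUAD
    dollars = gdp_billion_usd * DOLLARS_PER_BILLION
    return (btu / dollars) / 1000.0


# Made-up inputs: 100 quad Btu, 400 Mperson, 20,000 billion 2015$ PPP.
per_capita = intensity_per_capita(100.0, 400.0)
per_gdp = intensity_by_gdp(100.0, 20000.0)
```

Both conversions reduce to multiplying the ratio by 1000, which is a quick sanity check to run against a few rows before trusting the intensity columns.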
  19. Profitability by Crops Field

    • kaggle.com
    zip
    Updated Nov 25, 2022
    Cite
    The Devastator (2022). Profitability by Crops Field [Dataset]. https://www.kaggle.com/datasets/thedevastator/the-world-s-most-profitable-cash-crops
    Explore at:
    Available download formats: zip (1574784 bytes)
    Dataset updated
    Nov 25, 2022
    Authors
    The Devastator
    Description

    Profitability by Crops Field

    Harvested Crops and Their Value by Country and Year

    By Andy Kriebel [source]

    About this dataset

    This dataset contains information on the world's harvested crops. The data includes the value of the crop, the country of origin, the year of harvest, and more. This data can be used to understand which crops are the most valuable, and how this value has changed over time.

    How to use the dataset

    This dataset provides information on the world's most profitable cash crops. The data includes the value of the crop, the country of origin, and the year of harvest. This dataset can be used to understand which crops are most valuable and how this value has changed over time.

    Research Ideas

    • To find out which crops are grown in which countries
    • To find out the value of harvested crops by country and year
    • To find out the world's biggest cash crop

    Acknowledgements

    Data Source

    License

    License: Dataset copyright by authors

    You are free to:
    • Share - copy and redistribute the material in any medium or format for any purpose, even commercially.
    • Adapt - remix, transform, and build upon the material for any purpose, even commercially.

    You must:
    • Give appropriate credit - Provide a link to the license, and indicate if changes were made.
    • ShareAlike - You must distribute your contributions under the same license as the original.
    • Keep intact - all notices that refer to this license, including copyright notices.

    Columns

    File: Harvested Crops.csv

    | Column name      | Description                                                 |
    |:-----------------|:------------------------------------------------------------|
    | Area             | The country where the crop was grown. (String)              |
    | Element          | The type of crop. (String)                                  |
    | Item             | The name of the crop. (String)                              |
    | Year             | The year the crop was harvested. (Integer)                  |
    | Unit             | The unit of measurement for the value of the crop. (String) |
    | Value            | The value of the crop. (Float)                              |
    | Flag             | A code that indicates the quality of the data. (String)     |
    | Flag Description | A description of the flag code. (String)                    |

    File: Harvested Crops Summary.csv
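
One of the research ideas listed for this dataset is the value of harvested crops by country and year. A minimal sketch of that aggregation over the Harvested Crops.csv columns follows. The sample rows, the area names and the `USD` unit are invented for illustration, and grouping by Unit is a precaution added here so that values in different units are never summed together.

```python
import csv
import io
from collections import defaultdict


def total_value_by_area_year(rows):
    """Sum Value per (Area, Year, Unit), skipping rows without a usable Value."""
    totals = defaultdict(float)
    for row in rows:
        try:
            key = (row["Area"], int(row["Year"]), row["Unit"])
            value = float(row["Value"])
        except (KeyError, TypeError, ValueError):
            continue  # missing column, empty cell, or non-numeric value
        totals[key] += value
    return dict(totals)


# Invented sample in the documented column layout (not real data).
SAMPLE_CSV = (
    "Area,Element,Item,Year,Unit,Value,Flag,Flag Description\n"
    "Atlantis,Example element,Wheat,2000,USD,10.5,,\n"
    "Atlantis,Example element,Maize,2000,USD,4.5,,\n"
    "Atlantis,Example element,Wheat,2001,USD,7.0,,\n"
    "Utopia,Example element,Rice,2000,USD,,M,Missing value\n"
)

crop_totals = total_value_by_area_year(csv.DictReader(io.StringIO(SAMPLE_CSV)))
```

For the real file, the same function can be fed `csv.DictReader(open("Harvested Crops.csv", newline="", encoding="utf-8"))`, assuming the file is UTF-8 encoded.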

    Acknowledgements

    If you use this dataset in your research, please credit Andy Kriebel.

  20. 2021 US Federal Award Data

    • kaggle.com
    zip
    Updated Mar 11, 2022
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Stephen Keller (2022). 2021 US Federal Award Data [Dataset]. https://www.kaggle.com/datasets/skeller/2021-us-federal-award-data
    Explore at:
    Available download formats: zip (2046620087 bytes)
    Dataset updated
    Mar 11, 2022
    Authors
    Stephen Keller
    Area covered
    United States
    Description

    Context

    USAspending.gov is the government's official tool for tracking spending. It shows where money goes and who benefits from federal funds.

    The Federal Funding Accountability and Transparency Act of 2006 required that federal contract, grant, and loan awards over $25k be searchable online to give the American public access to government spending. The data collected in USAspending.gov is derived from data gathered at more than a hundred agencies, as well as other government systems. Federal agencies submit contracts, grants, loans and other awards information to be uploaded on USAspending.gov at least twice a month.

    Content

    The United States spends a lot of money on contracts every year but where does it all go? This data set has information about how much different agencies have spent on awards for the fiscal year 2021. More data can be downloaded, for other years, on USAspending.gov.

    Contracts are published to the GSA's Federal Procurement Data System within five days of being awarded, with contract reporting automatically posted on USAspending.gov by 9 a.m. the next day and going live at 8 a.m. EST two mornings later.

    Learn more about the contents here: https://www.usaspending.gov/data-dictionary

    The Bureau of the Fiscal Service, United States Department of the Treasury, is dedicated to making government spending data available to everyone.

    Data Description

    This data starts off separated into smaller files that need to be joined.
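
A minimal sketch of one way to combine the part files, assuming that "joined" here means stacking rows from files that share the same header (the description does not say whether a key-based merge is needed instead). The function name and the two sample parts are hypothetical.

```python
import csv
import io
from itertools import chain


def iter_award_rows(streams):
    """Yield dict rows from several CSV text streams that share one header."""
    return chain.from_iterable(csv.DictReader(stream) for stream in streams)


# Two invented part files, held in memory so the sketch is self-contained.
PART_1 = "recipient_name,awarding_agency_name\nExample Corp,Agency A\n"
PART_2 = "recipient_name,awarding_agency_name\nSample LLC,Agency B\nDemo Inc,Agency A\n"

award_rows = list(iter_award_rows([io.StringIO(PART_1), io.StringIO(PART_2)]))
```

With files on disk, the streams would be opened file objects (for example `open(path, newline="", encoding="utf-8")` for each part). Because rows are yielded lazily, a download of this size does not have to fit in memory unless the result is wrapped in `list()` as in the demo.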

    Data Overview

    The federal government buys a lot of things, like office furniture and aircraft. It also buys services, like telephone and Internet access. The Federal Government and its sub-agencies use contracts to buy these things. They use Product and Service Codes (PSC) to classify the items and services they purchase.

    An obligation is a promise to spend money. An outlay is when the government spends money. When the government enters into a contract or grant, it promises to spend all of the money. This is so it can pay people who do what they agreed to do. When the government actually pays someone, then it counts as an outlay.

    Data Items That Help You Get Started

    There are many different variables in this database, which are spread across multiple files. The most important ones to start learning are:

    1. The contractor who won the award - recipient_name
    2. The agency issuing the award - awarding_agency_name
    3. The product or service code (PSC) - product_or_service_code
    4. The industry classification code (NAICS) of the vendor - naics_code
    5. How much was obligated - total_dollars_obligated or total_obligated_amount
    6. The contract modification number - modification_number
    7. The description of the award - award_description
    8. The date of award - action_date or award_base_action_date
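
The fields listed above are enough for a first aggregation, such as obligations per awarding agency. The sketch below is illustrative only: it assumes each row carries the amount in either `total_dollars_obligated` or `total_obligated_amount` (the list names both), and the sample rows are invented. Whether those amounts are per action or cumulative across modifications is not stated here, so check the data dictionary before summing rows that share an award.

```python
from collections import defaultdict

# The field list names two possible amount columns; try them in this order.
AMOUNT_COLUMNS = ("total_dollars_obligated", "total_obligated_amount")


def obligated_amount(row):
    """Return the first filled amount column as a float, or 0.0 if none is filled."""
    for column in AMOUNT_COLUMNS:
        raw = row.get(column)
        if raw not in (None, ""):
            return float(raw)
    return 0.0


def obligations_by_agency(rows):
    """Sum obligated dollars per awarding_agency_name."""
    totals = defaultdict(float)
    for row in rows:
        totals[row["awarding_agency_name"]] += obligated_amount(row)
    return dict(totals)


# Invented rows, shaped like csv.DictReader output (all values are strings).
sample_rows = [
    {"awarding_agency_name": "Agency A", "total_dollars_obligated": "1000.50"},
    {"awarding_agency_name": "Agency A", "total_obligated_amount": "499.50"},
    {"awarding_agency_name": "Agency B", "total_dollars_obligated": "", "total_obligated_amount": "250"},
]
agency_totals = obligations_by_agency(sample_rows)
```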

    Data Dictionary and Analyst Guide

    To learn more about the data, you can reference the data dictionary. The data dictionary includes information on outlays, which are not included in the data provided here. https://www.usaspending.gov/data-dictionary

    Please see the analysts guide for more information: https://datalab.usaspending.gov/analyst-guide/

    License

    The U.S. Department of the Treasury, Bureau of the Fiscal Service is committed to providing open data to enable effective tracking of federal spending. The data is available to copy, adapt, redistribute, or otherwise use for non-commercial or for commercial purposes, subject to the Limitation on Permissible Use of Dun & Bradstreet, Inc. Data noted on the homepage. https://www.usaspending.gov/db_info

    Acknowledgements

    USAspending.gov collects data from all over the government to provide information to the public. Special thanks to the Data Transparency Team within the Office of the Chief Data Officer at the Bureau of the Fiscal Service.

    Inspiration

    Can we find any patterns to help the public? How about predicting future spending needs or opportunities? Test out your ideas here!

Cite
The Associated Press (2025). Mass Killings in America, 2006 - present [Dataset]. https://data.world/associatedpress/mass-killings-public

Mass Killings in America, 2006 - present

Data from the AP-USA TODAY-Northeastern project tracking the killings of four or more victims from 2006-present

Explore at:
6 scholarly articles cite this dataset (View in Google Scholar)
Available download formats: zip, csv
Dataset updated
Dec 1, 2025
Authors
The Associated Press
Time period covered
Jan 1, 2006 - Nov 29, 2025
Area covered
Description

THIS DATASET WAS LAST UPDATED AT 7:11 AM EASTERN ON DEC. 1

OVERVIEW

2019 had the most mass killings since at least the 1970s, according to the Associated Press/USA TODAY/Northeastern University Mass Killings Database.

In all, there were 45 mass killings, defined as when four or more people are killed excluding the perpetrator. Of those, 33 were mass shootings. This summer was especially violent, with three high-profile public mass shootings occurring in the span of just four weeks, leaving 38 killed and 66 injured.

A total of 229 people died in mass killings in 2019.

The AP's analysis found that more than 50% of the incidents were family annihilations, which is similar to prior years. Although they are far less common, the 9 public mass shootings during the year were the most deadly type of mass murder, resulting in 73 people's deaths, not including the assailants.

One-third of the offenders died at the scene of the killing or soon after, half from suicides.

About this Dataset

The Associated Press/USA TODAY/Northeastern University Mass Killings database tracks all U.S. homicides since 2006 involving four or more people killed (not including the offender) over a short period of time (24 hours) regardless of weapon, location, victim-offender relationship or motive. The database includes information on these and other characteristics concerning the incidents, offenders, and victims.

The AP/USA TODAY/Northeastern database represents the most complete tracking of mass murders by the above definition currently available. Other efforts, such as the Gun Violence Archive or Everytown for Gun Safety, may include events that do not meet our criteria, but a review of these sites and others indicates that this database contains every event that matches the definition, including some not tracked by other organizations.

This data will be updated periodically and can be used as an ongoing resource to help cover these events.

Using this Dataset

To get basic counts of incidents of mass killings and mass shootings by year nationwide, use these queries:

Mass killings by year

Mass shootings by year

To get these counts just for your state:

Filter killings by state
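
The saved queries themselves are not reproduced in this text, so the sketch below only illustrates the kind of count they describe: incidents per year, optionally restricted to shootings or to one state. The field names `date`, `state` and `is_shooting` are hypothetical stand-ins, not the dataset's actual column names, and the sample records are invented.

```python
from collections import Counter


def incidents_by_year(incidents, state=None, shootings_only=False):
    """Count incidents per calendar year, with optional state and shooting filters."""
    counts = Counter()
    for incident in incidents:
        if state is not None and incident["state"] != state:
            continue
        if shootings_only and not incident["is_shooting"]:
            continue
        year = int(incident["date"][:4])  # assumes ISO dates such as "2019-08-03"
        counts[year] += 1
    return dict(sorted(counts.items()))


# Invented records with placeholder state codes (not real data).
sample_incidents = [
    {"date": "2018-02-14", "state": "AA", "is_shooting": True},
    {"date": "2019-05-31", "state": "AA", "is_shooting": True},
    {"date": "2019-08-03", "state": "BB", "is_shooting": True},
    {"date": "2019-11-17", "state": "BB", "is_shooting": False},
]

all_by_year = incidents_by_year(sample_incidents)
shootings_by_year = incidents_by_year(sample_incidents, shootings_only=True)
state_by_year = incidents_by_year(sample_incidents, state="BB")
```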

Definition of "mass murder"

Mass murder is defined as the intentional killing of four or more victims by any means within a 24-hour period, excluding the deaths of unborn children and the offender(s). The standard of four or more dead was initially set by the FBI.

This definition does not exclude cases based on method (e.g., shootings only), type or motivation (e.g., public only), victim-offender relationship (e.g., strangers only), or number of locations (e.g., one). The time frame of 24 hours was chosen to eliminate conflation with spree killers, who kill multiple victims in quick succession in different locations or incidents, and to satisfy the traditional requirement of occurring in a “single incident.”

Offenders who commit mass murder during a spree (before or after committing additional homicides) are included in the database, and all victims within seven days of the mass murder are included in the victim count. Negligent homicides related to driving under the influence or accidental fires are excluded due to the lack of offender intent. Only incidents occurring within the 50 states and Washington D.C. are considered.

Methodology

Project researchers first identified potential incidents using the Federal Bureau of Investigation’s Supplementary Homicide Reports (SHR). Homicide incidents in the SHR were flagged as potential mass murder cases if four or more victims were reported on the same record, and the type of death was murder or non-negligent manslaughter.

Cases were subsequently verified utilizing media accounts, court documents, academic journal articles, books, and local law enforcement records obtained through Freedom of Information Act (FOIA) requests. Each data point was corroborated by multiple sources, which were compiled into a single document to assess the quality of information.

In case(s) of contradiction among sources, official law enforcement or court records were used, when available, followed by the most recent media or academic source.
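
The precedence rule above can be sketched as a small resolver. This is an illustration, not the project's actual tooling: the `SourceClaim` type and the source kind labels are hypothetical, and choosing the most recent record when several official records disagree is an assumption the methodology does not spell out.

```python
from dataclasses import dataclass
from datetime import date

# Source kinds treated as official records (labels are hypothetical).
OFFICIAL_KINDS = {"law_enforcement", "court"}


@dataclass(frozen=True)
class SourceClaim:
    kind: str        # e.g. "law_enforcement", "court", "media", "academic"
    published: date
    value: str


def resolve(claims):
    """Pick the value to record when sources contradict each other."""
    if not claims:
        raise ValueError("at least one claim is required")
    official = [claim for claim in claims if claim.kind in OFFICIAL_KINDS]
    # Official records win when available; otherwise fall back to all sources.
    pool = official if official else claims
    # Within the chosen pool, take the most recently published claim.
    return max(pool, key=lambda claim: claim.published).value


# Invented example: an older court record outranks a newer media report.
claims = [
    SourceClaim("media", date(2020, 3, 1), "five victims"),
    SourceClaim("court", date(2019, 6, 15), "four victims"),
    SourceClaim("academic", date(2018, 1, 10), "six victims"),
]
resolved = resolve(claims)
```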

Case information was subsequently compared with every other known mass murder database to ensure reliability and validity. Incidents listed in the SHR that could not be independently verified were excluded from the database.

Project researchers also conducted extensive searches for incidents not reported in the SHR during the time period, utilizing internet search engines, Lexis-Nexis, and Newspapers.com. Search terms include: [number] dead, [number] killed, [number] slain, [number] murdered, [number] homicide, mass murder, mass shooting, massacre, rampage, family killing, familicide, and arson murder. Offender, victim, and location names were also directly searched when available.

This project started at USA TODAY in 2012.

Contacts

Contact AP Data Editor Justin Myers with questions, suggestions or comments about this dataset at jmyers@ap.org. The Northeastern University researcher working with AP and USA TODAY is Professor James Alan Fox, who can be reached at j.fox@northeastern.edu or 617-416-4400.
