100+ datasets found
  1. Effect of suicide rates on life expectancy dataset

    • data.niaid.nih.gov
    • zenodo.org
    Updated Apr 16, 2021
    Cite
    Filip Zoubek (2021). Effect of suicide rates on life expectancy dataset [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_4694269
    Dataset updated
    Apr 16, 2021
    Dataset authored and provided by
    Filip Zoubek
    License

    Attribution-NonCommercial-ShareAlike 3.0 (CC BY-NC-SA 3.0) https://creativecommons.org/licenses/by-nc-sa/3.0/
    License information was derived automatically

    Description

    Effect of suicide rates on life expectancy dataset

    Abstract: In 2015, approximately 55 million people died worldwide, of which 8 million committed suicide. In the USA, suicide is one of the main causes of death; this experiment therefore asks how much suicide rates affect average life expectancy statistics. The experiment takes two datasets, one with the number of suicides and one with life expectancy, and combines them into a single dataset. Subsequently, I look for patterns and correlations among the variables and perform a statistical test using simple regression to confirm my assumptions.

    Data

    The experiment uses two datasets, WHO Suicide Statistics[1] and WHO Life Expectancy[2], which were first preprocessed. The final merged dataset has 13 variables, with country and year used as the index: Country; Year; Suicides number; Life expectancy; Adult Mortality (probability of dying between 15 and 60 years per 1000 population); Infant deaths (number of infant deaths per 1000 population); Alcohol (recorded per capita (15+) consumption); Under-five deaths (number of under-five deaths per 1000 population); HIV/AIDS (deaths per 1 000 live births HIV/AIDS); GDP (Gross Domestic Product per capita); Population; Income composition of resources (Human Development Index in terms of income composition of resources); and Schooling (number of years of schooling).
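
    As a rough illustration of the merge-then-regress workflow described above, the sketch below joins two tables on the (Country, Year) key and fits a simple least-squares line. It uses only the Python standard library; the function names and the toy rows are assumptions made for this example and are not taken from the WHO files or from the author's code.

```python
from statistics import mean

def merge_on_country_year(suicides, life):
    """Inner-join two lists of dict rows on the (Country, Year) key."""
    life_by_key = {(r["Country"], r["Year"]): r for r in life}
    merged = []
    for row in suicides:
        key = (row["Country"], row["Year"])
        if key in life_by_key:
            # Keep columns from both tables for matching keys only.
            merged.append({**life_by_key[key], **row})
    return merged

def simple_regression(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar

# Toy rows invented for illustration only (not real WHO values).
suicides = [
    {"Country": "A", "Year": 2014, "Suicides number": 10},
    {"Country": "A", "Year": 2015, "Suicides number": 20},
    {"Country": "B", "Year": 2015, "Suicides number": 30},
    {"Country": "C", "Year": 2015, "Suicides number": 5},  # no match in `life`
]
life = [
    {"Country": "A", "Year": 2014, "Life expectancy": 80.0},
    {"Country": "A", "Year": 2015, "Life expectancy": 79.0},
    {"Country": "B", "Year": 2015, "Life expectancy": 78.0},
]

merged = merge_on_country_year(suicides, life)
slope, intercept = simple_regression(
    [r["Suicides number"] for r in merged],
    [r["Life expectancy"] for r in merged],
)
```

    An inner join is used here so that only country-year pairs present in both tables survive; whether the original experiment dropped or imputed unmatched rows is not stated.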

    LICENSE

    The experiment uses two datasets, WHO Suicide Statistics and WHO Life Expectancy, which were collected from the WHO and United Nations websites. Therefore, all datasets are under the Attribution-NonCommercial-ShareAlike 3.0 IGO license (https://creativecommons.org/licenses/by-nc-sa/3.0/igo/).

    [1] https://www.kaggle.com/szamil/who-suicide-statistics

    [2] https://www.kaggle.com/kumarajarshi/life-expectancy-who

  2. Mass Killings in America, 2006 - present

    • data.world
    csv, zip
    Updated Aug 11, 2025
    Cite
    The Associated Press (2025). Mass Killings in America, 2006 - present [Dataset]. https://data.world/associatedpress/mass-killings-public
    Available download formats: zip, csv
    Dataset updated
    Aug 11, 2025
    Authors
    The Associated Press
    Time period covered
    Jan 1, 2006 - Aug 1, 2025
    Description

    THIS DATASET WAS LAST UPDATED AT 2:11 AM EASTERN ON AUG. 11

    OVERVIEW

    2019 had the most mass killings since at least the 1970s, according to the Associated Press/USA TODAY/Northeastern University Mass Killings Database.

    In all, there were 45 mass killings, defined as when four or more people are killed excluding the perpetrator. Of those, 33 were mass shootings. This summer was especially violent, with three high-profile public mass shootings occurring in the span of just four weeks, leaving 38 killed and 66 injured.

    A total of 229 people died in mass killings in 2019.

    The AP's analysis found that more than 50% of the incidents were family annihilations, which is similar to prior years. Although they are far less common, the 9 public mass shootings during the year were the most deadly type of mass murder, resulting in 73 people's deaths, not including the assailants.

    One-third of the offenders died at the scene of the killing or soon after, half from suicides.

    About this Dataset

    The Associated Press/USA TODAY/Northeastern University Mass Killings database tracks all U.S. homicides since 2006 involving four or more people killed (not including the offender) over a short period of time (24 hours) regardless of weapon, location, victim-offender relationship or motive. The database includes information on these and other characteristics concerning the incidents, offenders, and victims.

    The AP/USA TODAY/Northeastern database represents the most complete tracking of mass murders by the above definition currently available. Other efforts, such as the Gun Violence Archive or Everytown for Gun Safety may include events that do not meet our criteria, but a review of these sites and others indicates that this database contains every event that matches the definition, including some not tracked by other organizations.

    This data will be updated periodically and can be used as an ongoing resource to help cover these events.

    Using this Dataset

    To get basic counts of incidents of mass killings and mass shootings by year nationwide, use these queries:

    Mass killings by year

    Mass shootings by year

    To get these counts just for your state:

    Filter killings by state

    Definition of "mass murder"

    Mass murder is defined as the intentional killing of four or more victims by any means within a 24-hour period, excluding the deaths of unborn children and the offender(s). The standard of four or more dead was initially set by the FBI.

    This definition does not exclude cases based on method (e.g., shootings only), type or motivation (e.g., public only), victim-offender relationship (e.g., strangers only), or number of locations (e.g., one). The time frame of 24 hours was chosen to eliminate conflation with spree killers, who kill multiple victims in quick succession in different locations or incidents, and to satisfy the traditional requirement of occurring in a “single incident.”

    Offenders who commit mass murder during a spree (before or after committing additional homicides) are included in the database, and all victims within seven days of the mass murder are included in the victim count. Negligent homicides related to driving under the influence or accidental fires are excluded due to the lack of offender intent. Only incidents occurring within the 50 states and Washington D.C. are considered.
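
    The core of the definition above (four or more intentionally killed victims within 24 hours, excluding offenders and unborn children) can be sketched as a small check. This is only an illustration under stated assumptions: the `Death` record and its field names are hypothetical and do not reflect the database's actual schema, and the seven-day spree rule for victim counts is not modeled.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

@dataclass
class Death:
    # Hypothetical record layout, invented for this sketch.
    time: datetime
    is_offender: bool = False
    is_unborn: bool = False
    intentional: bool = True

def is_mass_murder(deaths, min_victims=4, window=timedelta(hours=24)):
    """Return True if some 24-hour window contains at least min_victims
    intentional killings, not counting offenders or unborn children."""
    times = sorted(
        d.time
        for d in deaths
        if d.intentional and not d.is_offender and not d.is_unborn
    )
    # Slide over the sorted times: compare the i-th death with the one
    # that would be the min_victims-th death starting from it.
    for i in range(len(times) - min_victims + 1):
        if times[i + min_victims - 1] - times[i] <= window:
            return True
    return False
```

    For example, four victims killed over ten hours plus an offender who died at the scene would qualify, while the same four victims spread over several days would not.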

    Methodology

    Project researchers first identified potential incidents using the Federal Bureau of Investigation’s Supplementary Homicide Reports (SHR). Homicide incidents in the SHR were flagged as potential mass murder cases if four or more victims were reported on the same record, and the type of death was murder or non-negligent manslaughter.

    Cases were subsequently verified utilizing media accounts, court documents, academic journal articles, books, and local law enforcement records obtained through Freedom of Information Act (FOIA) requests. Each data point was corroborated by multiple sources, which were compiled into a single document to assess the quality of information.

    In case(s) of contradiction among sources, official law enforcement or court records were used, when available, followed by the most recent media or academic source.

    Case information was subsequently compared with every other known mass murder database to ensure reliability and validity. Incidents listed in the SHR that could not be independently verified were excluded from the database.

    Project researchers also conducted extensive searches for incidents not reported in the SHR during the time period, utilizing internet search engines, Lexis-Nexis, and Newspapers.com. Search terms include: [number] dead, [number] killed, [number] slain, [number] murdered, [number] homicide, mass murder, mass shooting, massacre, rampage, family killing, familicide, and arson murder. Offender, victim, and location names were also directly searched when available.

    This project started at USA TODAY in 2012.

    Contacts

    Contact AP Data Editor Justin Myers with questions, suggestions or comments about this dataset at jmyers@ap.org. The Northeastern University researcher working with AP and USA TODAY is Professor James Alan Fox, who can be reached at j.fox@northeastern.edu or 617-416-4400.

  3. Statewide Death Profiles

    • data.chhs.ca.gov
    • data.ca.gov
    • +3 more
    csv, zip
    Updated Jul 28, 2025
    Cite
    California Department of Public Health (2025). Statewide Death Profiles [Dataset]. https://data.chhs.ca.gov/dataset/statewide-death-profiles
    Available download formats: csv(5401561), csv(200270), csv(16301), csv(164006), csv(5034), csv(463460), csv(2026589), csv(419332), csv(4689434), zip, csv(385695)
    Dataset updated
    Jul 28, 2025
    Dataset authored and provided by
    California Department of Public Health https://www.cdph.ca.gov/
    Description

    This dataset contains counts of deaths for California as a whole based on information entered on death certificates. Final counts are derived from static data and include out-of-state deaths to California residents, whereas provisional counts are derived from incomplete and dynamic data. Provisional counts are based on the records available when the data was retrieved and may not represent all deaths that occurred during the time period. Deaths involving injuries from external or environmental forces, such as accidents, homicide and suicide, often require additional investigation that tends to delay certification of the cause and manner of death. This can result in significant under-reporting of these deaths in provisional data.

    The final data tables include both deaths that occurred in California regardless of the place of residence (by occurrence) and deaths to California residents (by residence), whereas the provisional data table only includes deaths that occurred in California regardless of the place of residence (by occurrence). The data are reported as totals, as well as stratified by age, gender, race-ethnicity, and death place type. Deaths due to all causes (ALL) and selected underlying cause of death categories are provided. See temporal coverage for more information on which combinations are available for which years.

    The cause of death categories are based solely on the underlying cause of death as coded by the International Classification of Diseases. The underlying cause of death is defined by the World Health Organization (WHO) as "the disease or injury which initiated the train of events leading directly to death, or the circumstances of the accident or violence which produced the fatal injury." It is a single value assigned to each death based on the details as entered on the death certificate. When more than one cause is listed, the order in which they are listed can affect which cause is coded as the underlying cause. This means that similar events could be coded with different underlying causes of death depending on variations in how they were entered. Consequently, while underlying cause of death provides a convenient comparison between cause of death categories, it may not capture the full impact of each cause of death as it does not always take into account all conditions contributing to the death.

  4. PDI (Police Data Initiative) Crime Incidents

    • data.cincinnati-oh.gov
    csv, xlsx, xml
    Updated Aug 10, 2025
    Cite
    City of Cincinnati (2025). PDI (Police Data Initiative) Crime Incidents [Dataset]. https://data.cincinnati-oh.gov/Safety/PDI-Police-Data-Initiative-Crime-Incidents/k59e-2pvf
    Available download formats: xlsx, xml, csv
    Dataset updated
    Aug 10, 2025
    Dataset authored and provided by
    City of Cincinnati
    License

    U.S. Government Works https://www.usa.gov/government-works
    License information was derived automatically

    Description

    Note: Due to the RMS change for CPS, this data set stops on 6/2/2024. For records beginning on 6/3/2024, please see the dataset at this link: https://data.cincinnati-oh.gov/safety/Reported-Crime-STARS-Category-Offenses-/7aqy-xrv9/about_data

    Data Description: This data represents reported Crime Incidents in the City of Cincinnati. Incidents are the records of reported crimes collated by an agency for management. Incidents are typically housed in a Records Management System (RMS) that stores agency-wide data about law enforcement operations. This does not include police calls for service, arrest information, final case determination, or any other incident outcome data.

    Data Creation: The Cincinnati Police Department (CPD) records crime incidents in the City through a Records Management System (RMS) that stores agency-wide data about law enforcement operations.

    Data Created By: The source of this data is the Cincinnati Police Department.

    Refresh Frequency: This data is updated daily.

    CincyInsights: The City of Cincinnati maintains an interactive dashboard portal, CincyInsights, in addition to our Open Data, in an effort to increase access and usage of city data. This data set has an associated dashboard available here: https://insights.cincinnati-oh.gov/stories/s/8eaa-xrvz

    Data Dictionary: A data dictionary providing definitions of columns and attributes is available as an attachment to this dataset.

    Processing: The City of Cincinnati is committed to providing the most granular and accurate data possible. In that pursuit, the Office of Performance and Data Analytics applies standard processing to most raw data prior to publication. Processing includes but is not limited to: address verification, geocoding, decoding attributes, and addition of administrative areas (i.e. Census, neighborhoods, police districts, etc.).

    Data Usage: For directions on downloading and using open data please visit our How-to Guide: https://data.cincinnati-oh.gov/dataset/Open-Data-How-To-Guide/gdr9-g3ad

    Disclaimer: In compliance with privacy laws, all Public Safety datasets are anonymized and appropriately redacted prior to publication on the City of Cincinnati’s Open Data Portal. This means that for all public safety datasets: (1) the last two digits of all addresses have been replaced with “XX,” and in cases where there is a single digit street address, the entire address number is replaced with "X"; and (2) Latitude and Longitude have been randomly skewed to represent values within the same block area (but not the exact location) of the incident.
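
    The address rule in item (1) above can be illustrated with a short sketch. The function name `mask_address` and the assumption that the street number appears at the start of the address string are ours; the City's actual redaction code is not published in this description.

```python
import re

def mask_address(address):
    """Mask a leading street number: a single digit becomes "X",
    otherwise the last two digits become "XX"."""
    match = re.match(r"\s*(\d+)(.*)", address, flags=re.DOTALL)
    if not match:
        # No leading street number found; leave the text unchanged.
        return address
    number, rest = match.groups()
    masked = "X" if len(number) == 1 else number[:-2] + "XX"
    return masked + rest

print(mask_address("1234 MAIN ST"))
print(mask_address("7 ELM ST"))
```

    The coordinate skewing in item (2) is not sketched here, because the description does not say how the random offset within a block is chosen.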

  5. Deaths; suicide (residents), various themes

    • cbs.nl
    • dexes.eu
    • +3 more
    xml
    Updated Jan 23, 2025
    + more versions
    Cite
    Centraal Bureau voor de Statistiek (2025). Deaths; suicide (residents), various themes [Dataset]. https://www.cbs.nl/en-gb/figures/detail/7022eng
    Available download formats: xml
    Dataset updated
    Jan 23, 2025
    Dataset provided by
    Statistics Netherlands
    Authors
    Centraal Bureau voor de Statistiek
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    1950 - 2023
    Area covered
    The Netherlands
    Description

    This table contains the number of victims of suicide arranged by marital status, method, motives, age and sex. They represent the number of deaths by suicide in the resident population of the Netherlands.

    The figures in this table are equal to the suicide figures in the causes of death statistics, because they are based on the same files. The causes of death statistics do not contain information on the motive of suicide. For the years 1950-1995, this information is obtained from a historical data file on suicides. For the years 1996-now the motive is taken from the external causes of death (Niet-Natuurlijke dood) file. Before the 9th revision of the International Statistical Classification of Diseases and Related Health Problems (ICD), i.e. for the years 1950-1978, it was not possible to code "jumping in front of train/metro". For these years 1950-1978 "jumping in front of train/metro" has been left empty, and it has been counted in the group "other method".

    Relative figures have been calculated per 100 000 of the corresponding population group. The figures are calculated based on the average population of the corresponding year.
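
    A minimal sketch of the relative-figure calculation follows. The description does not say how the average population of a year is computed; the sketch assumes the mean of the population at the start and at the end of the year, and the function name is hypothetical.

```python
def rate_per_100k(deaths, population_start, population_end):
    """Deaths per 100 000 of the average population in the year.

    Assumption: the average population is the mean of the start-of-year
    and end-of-year population of the corresponding group.
    """
    average_population = (population_start + population_end) / 2
    return deaths / average_population * 100_000

# Illustrative numbers only, not taken from the CBS table.
example_rate = rate_per_100k(50, 900_000, 1_100_000)
```

    Because the rate is computed per population group, the same function would be applied separately to each age, sex, or marital-status group with that group's own population.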

    Data available from: 1950

    Status of the figures: The figures up to and including 2023 are final.

    Changes as of January 23rd 2025: The figures for 2023 are made final.

    When will new figures be published: In the third quarter of 2025 the provisional figures for 2024 will be published.

  6. Prevalence of Suicidal Ideation in Chinese College Students: A Meta-Analysis...

    • datasetcatalog.nlm.nih.gov
    • plos.figshare.com
    Updated Oct 6, 2014
    Cite
    Li, Ya-Ming; Tang, Si-Yuan; Li, Zhan-Zhan; Lei, Xian-Yang; Liu, Li; Chen, Lizhang; Zhang, Dan (2014). Prevalence of Suicidal Ideation in Chinese College Students: A Meta-Analysis [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0001185115
    Dataset updated
    Oct 6, 2014
    Authors
    Li, Ya-Ming; Tang, Si-Yuan; Li, Zhan-Zhan; Lei, Xian-Yang; Liu, Li; Chen, Lizhang; Zhang, Dan
    Description

    Background: About 1 million people worldwide commit suicide each year, and college students with suicidal ideation are at high risk of suicide. The prevalence of suicidal ideation in college students has been estimated extensively, but quantitative syntheses of overall prevalence are scarce, especially in China. Accurate estimates of prevalence are important for making public policy. In this paper, we aimed to determine the prevalence of suicidal ideation in Chinese college students.

    Objective and Methods: Databases including PubMed, Web of Knowledge, Chinese Web of Knowledge, Wangfang (Chinese database) and Weipu (Chinese database) were systematically reviewed to identify articles published between 2004 and July 2013, in either English or Chinese, reporting prevalence estimates of suicidal ideation among Chinese college students. The strategy also included a secondary search of reference lists of records retrieved from databases. Then the prevalence estimates were summarized using a random effects model. The effects of moderator variables on the prevalence estimates were assessed using a meta-regression model.

    Results: A total of 41 studies involving 160339 college students were identified, and the prevalence ranged from 1.24% to 26.00%. The overall pooled prevalence of suicidal ideation among Chinese college students was 10.72% (95% CI: 8.41% to 13.28%). We noted substantial heterogeneity in prevalence estimates. Subgroup analyses showed that prevalence of suicidal ideation in females is higher than in males.

    Conclusions: The prevalence of suicidal ideation in Chinese college students is relatively high, although the suicide rate is lower compared with the entire society, suggesting the need for local surveys to inform the development of health services for college students.

  7. Suicides in England and Wales

    • ons.gov.uk
    • cy.ons.gov.uk
    xlsx
    Updated Aug 29, 2024
    + more versions
    Cite
    Office for National Statistics (2024). Suicides in England and Wales [Dataset]. https://www.ons.gov.uk/peoplepopulationandcommunity/birthsdeathsandmarriages/deaths/datasets/suicidesintheunitedkingdomreferencetables
    Available download formats: xlsx
    Dataset updated
    Aug 29, 2024
    Dataset provided by
    Office for National Statistics http://www.ons.gov.uk/
    License

    Open Government Licence 3.0 http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/
    License information was derived automatically

    Description

    Number of suicides and suicide rates, by sex and age, in England and Wales. Information on conclusion type is provided, along with the proportion of suicides by method and the median registration delay.

  8. SuicideBD: A Suicidal Dataset for Bangladesh Public

    • figshare.com
    Updated May 31, 2023
    + more versions
    Cite
    Md. Abir Hassan; Subangkar Karmaker Shanto; Md. Saddam Hossain Mukta; salekul Islam; Md.Arafat Hossain (2023). SuicideBD: A Suicidal Dataset for Bangladesh Public [Dataset]. http://doi.org/10.6084/m9.figshare.19550761.v4
    Dataset updated
    May 31, 2023
    Dataset provided by
    figshare
    Authors
    Md. Abir Hassan; Subangkar Karmaker Shanto; Md. Saddam Hossain Mukta; salekul Islam; Md.Arafat Hossain
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Bangladesh
    Description

    This dataset contains details of individuals who committed suicide in Bangladesh during the pandemic, between February 2020 and November 2020. For every individual it includes personal details, family and social life, profession, financial condition, method of suicide, location and weather information. The dataset is freely available. The major fields included in this dataset are: age group, age, gender, profession group, reason, method, suicide date & time, addiction status, mental status, economic condition, marital status, family details, academic qualification, weather. Apart from the above data, this dataset also contains a CSV file of a Bengali wordcloud built on social media posts of the suicide victims.

    Access to the dataset files is restricted. Fill in the form (link in the References section) to request the data.

  9. Weapons Used in Crimes in LA

    • kaggle.com
    Updated Apr 11, 2024
    Cite
    Benjamin Mann (2024). Weapons Used in Crimes in LA [Dataset]. https://www.kaggle.com/datasets/benmann2448/weapons-used-in-crimes-in-la/suggestions
    Dataset updated
    Apr 11, 2024
    Dataset provided by
    Kaggle
    Authors
    Benjamin Mann
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Area covered
    Los Angeles
    Description

    This dataset contains all the different kinds of weapons and how many times they were used to commit crimes in Los Angeles from 2020 to early 2024. This dataset was created from the data published by the LAPD, and you can find the original dataset here.

  10. Number and percentage of homicide victims, by type of firearm used to commit...

    • www150.statcan.gc.ca
    • open.canada.ca
    • +1 more
    Updated Jul 22, 2025
    + more versions
    Cite
    Government of Canada, Statistics Canada (2025). Number and percentage of homicide victims, by type of firearm used to commit the homicide [Dataset]. http://doi.org/10.25318/3510017001-eng
    Dataset updated
    Jul 22, 2025
    Dataset provided by
    Statistics Canada https://statcan.gc.ca/en
    Area covered
    Canada
    Description

    Number and percentage of homicide victims, by type of firearm used to commit the homicide (total firearms; handgun; rifle or shotgun; other firearm-like weapons; firearm, type of firearm is unknown), Canada, 1974 to 2024.

  11. predict-criminal

    • kaggle.com
    Updated Jan 12, 2021
    Cite
    RANOELISON Dimbisoa Patrick (2021). predict-criminal [Dataset]. https://www.kaggle.com/dimbisoa/predictcriminal
    Dataset updated
    Jan 12, 2021
    Dataset provided by
    Kaggle http://kaggle.com/
    Authors
    RANOELISON Dimbisoa Patrick
    Description

    There has been a surge in crimes committed in recent years, making crime a top cause of concern for law enforcement. If we are able to estimate whether someone is going to commit a crime in the future, we can take precautions and be prepared. You are given a dataset containing answers to various questions concerning the professional and private lives of several people. A few of them have been arrested for various small and large crimes in the past.

    The train data consists of 39999 rows, while the test data consists of 5710 rows.

    Use the given data to predict if the people in the test data will commit a crime. You are given three files to download: train, test and sample submission. The evaluation metric is precision score.

  12. Suicides in India during 2015

    • kaggle.com
    Updated Aug 22, 2020
    Cite
    Vidya Pb (2020). Suicides in India during 2015 [Dataset]. https://www.kaggle.com/vidyapb/suicides-in-india-during-2015/discussion
    Dataset updated
    Aug 22, 2020
    Dataset provided by
    Kaggle http://kaggle.com/
    Authors
    Vidya Pb
    Area covered
    India
    Description

    Context

    This dataset contains information on suicides which happened in India during 2015.

    The singular age-old social precept of 'Lok Kya Kahenge?' (loosely translated: "What will people say?") suppresses the much-needed psychological care in India. It's high time that we understand why suicides happen and what the reasons behind them are. This dataset aims to spread awareness about suicides in India.

    Content

    I acquired this dataset from here. Have a look at the website.

    This dataset contains 9 files in .csv format. You can find a description for each column. Let me summarize it here as well.

    1. Cause-wise distribution of suicides in Central Armed Police Force (CAPF) during 2015.
    2. Economic Status-wise distribution of suicides during 2015.
    3. Educational Status-wise distribution of suicides during 2015.
    4. Farmer or Cultivators distribution of suicides during 2015.
    5. Profession-wise distribution of suicides during 2015.
    6. Social status-wise distribution of suicides during 2015.
    7. Cause-wise distribution of suicides during 2015.
    8. Suicides by Agricultural labourers during 2015.
    9. Suicides by means adopted during 2015.

    Inspiration

    We now have plenty of data to explore to draw some conclusions about suicides which happened in India during 2015. Let's start by answering these questions:

    - What are the top 5 states where Farmers' suicides occurred the most?
    - What's the top reason that agricultural labourers committed suicide?
    - Which Profession has the most suicides? What could be the reason?
    - How many Transgender suicides have occurred in different categories?

    I hope these questions interest you in starting to explore this dataset.

    Acknowledgements

    I thank the Indian Government for making it public under their Open Government Data (OGD) Platform India. Please use this dataset strictly for educational purposes. Thank you.

  13. Violence Reduction - Victim Demographics - Aggregated

    • data.cityofchicago.org
    • s.cnmilf.com
    • +1 more
    csv, xlsx, xml
    Updated Aug 10, 2025
    + more versions
    Cite
    City of Chicago (2025). Violence Reduction - Victim Demographics - Aggregated [Dataset]. https://data.cityofchicago.org/Public-Safety/Violence-Reduction-Victim-Demographics-Aggregated/gj7a-742p
    Available download formats: xml, xlsx, csv
    Dataset updated
    Aug 10, 2025
    Dataset authored and provided by
    City of Chicago
    Description

    This dataset contains aggregate data on violent index victimizations at the quarter level of each year (i.e., January – March, April – June, July – September, October – December), from 2001 to the present (1991 to present for Homicides), with a focus on those related to gun violence. Index crimes are 10 crime types selected by the FBI (codes 1-4) for special focus due to their seriousness and frequency. This dataset includes only those index crimes that involve bodily harm or the threat of bodily harm and are reported to the Chicago Police Department (CPD). Each row is aggregated up to victimization type, age group, sex, race, and whether the victimization was domestic-related. Aggregating at the quarter level provides large enough blocks of incidents to protect anonymity while allowing the end user to observe inter-year and intra-year variation. Any row where there were fewer than three incidents during a given quarter has been deleted to help prevent re-identification of victims. For example, if there were three domestic criminal sexual assaults during January to March 2020, all victims associated with those incidents have been removed from this dataset. Human trafficking victimizations have been aggregated separately due to the extremely small number of victimizations.

    This dataset includes a "GUNSHOT_INJURY_I" column to indicate whether the victimization involved a shooting, showing either Yes ("Y"), No ("N"), or Unknown ("UNKNOWN"). For homicides, injury descriptions are available dating back to 1991, so the "shooting" column will read either "Y" or "N" to indicate whether the homicide was a fatal shooting or not. For non-fatal shootings, data is only available as of 2010. As a result, for any non-fatal shootings that occurred from 2010 to the present, the shooting column will read as "Y." Non-fatal shooting victims will not be included in this dataset prior to 2010; they will be included in the authorized dataset, but with "UNKNOWN" in the shooting column.

    The dataset is refreshed daily, but excludes the most recent complete day to allow CPD time to gather the best available information. Each time the dataset is refreshed, records can change as CPD learns more about each victimization, especially those victimizations that are most recent. The data on the Mayor's Office Violence Reduction Dashboard is updated daily with an approximately 48-hour lag. As cases are passed from the initial reporting officer to the investigating detectives, some recorded data about incidents and victimizations may change once additional information arises. Regularly updated datasets on the City's public portal may change to reflect new or corrected information.

    How does this dataset classify victims?

    The methodology by which this dataset classifies victims of violent crime differs by victimization type:

    Homicide and non-fatal shooting victims: A victimization is considered a homicide victimization or non-fatal shooting victimization depending on its presence in CPD's homicide victims data table or its shooting victims data table. A victimization is considered a homicide only if it is present in CPD's homicide data table, while a victimization is considered a non-fatal shooting only if it is present in CPD's shooting data tables and absent from CPD's homicide data table.

    To determine the IUCR code of homicide and non-fatal shooting victimizations, we defer to the incident IUCR code available in CPD's Crimes, 2001-present dataset (available on the City's open data portal). If the IUCR code in CPD's Crimes dataset is inconsistent with the homicide/non-fatal shooting categorization, we defer to CPD's Victims dataset.

    For a criminal homicide, the only sensible IUCR codes are 0110 (first-degree murder) or 0130 (second-degree murder). For a non-fatal shooting, a sensible IUCR code must signify a criminal sexual assault, a robbery, or, most commonly, an aggravated battery. In rare instances, the IUCR codes in CPD's Crimes and Victims datasets do not align with the homicide/non-fatal shooting categorization:

    1. In instances where a homicide victimization does not correspond to an IUCR code 0110 or 0130, we set the IUCR code to "01XX" to indicate that the victimization was a homicide but we do not know whether it was a first-degree murder (IUCR code = 0110) or a second-degree murder (IUCR code = 0130).
    2. When a non-fatal shooting victimization does not correspond to an IUCR code that signifies a criminal sexual assault, robbery, or aggravated battery, we enter “UNK” in the IUCR column, “YES” in the GUNSHOT_I column, and “NON-FATAL” in the PRIMARY column to indicate that the victim was non-fatally shot, but the precise IUCR code is unknown.
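
The homicide and non-fatal shooting rules above can be sketched as two small functions. This is an illustration under stated assumptions, not the City's code: the function names, the category labels, and the `shooting_iucr_codes` argument are hypothetical, and the set of IUCR codes that signify criminal sexual assault, robbery, or aggravated battery is not listed in the description, so the caller must supply it.

```python
HOMICIDE_IUCR = {"0110", "0130"}  # first- and second-degree murder, per the description

def classify_victimization(victim_id, homicide_ids, shooting_ids):
    """Homicide if in the homicide table; non-fatal shooting if only in the shooting table."""
    if victim_id in homicide_ids:
        return "HOMICIDE"
    if victim_id in shooting_ids:
        return "NON-FATAL SHOOTING"
    return "OTHER"

def resolve_iucr(category, incident_iucr, shooting_iucr_codes):
    """Return the column values to record for one victimization."""
    if category == "HOMICIDE":
        # Unknown degree of murder is recorded as "01XX".
        iucr = incident_iucr if incident_iucr in HOMICIDE_IUCR else "01XX"
        return {"IUCR": iucr}
    if category == "NON-FATAL SHOOTING":
        if incident_iucr in shooting_iucr_codes:
            return {"IUCR": incident_iucr}
        # The victim was shot, but the precise IUCR code is unknown.
        return {"IUCR": "UNK", "GUNSHOT_I": "YES", "PRIMARY": "NON-FATAL"}
    return {"IUCR": incident_iucr}
```

A victim who appears in both tables is classified as a homicide because the homicide check runs first, which mirrors the "absent from CPD's homicide data table" condition.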

    Other violent crime victims: For other violent crime types, we refer to the IUCR classification that exists in CPD's victim table, with only one exception:

    1. When there is an incident that is associated with no victim with a matching IUCR code, we assume that this is an error. Every crime should have at least 1 victim with a matching IUCR code. In these cases, we change the IUCR code to reflect the incident IUCR code because CPD's incident table is considered to be more reliable than the victim table.

    Note: All businesses identified as victims in CPD data have been removed from this dataset.

    Note: The definition of “homicide” (shooting or otherwise) does not include justifiable homicide or involuntary manslaughter. This dataset also excludes any cases that CPD considers to be “unfounded” or “noncriminal.”

    Note: In some instances, the police department's raw incident-level data and victim-level data that were inputs into this dataset do not align on the type of crime that occurred. In those instances, this dataset attempts to correct mismatches between incident and victim specific crime types. When it is not possible to determine which victims are associated with the most recent crime determination, the dataset will show empty cells in the respective demographic fields (age, sex, race, etc.).

    Note: The initial reporting officer usually asks victims to report demographic data. If victims are unable to recall, the reporting officer will use their best judgment. “Unknown” can be reported if it is truly unknown.

  14. Number of suicides India 1971-2022

    • statista.com
    Updated May 27, 2025
    Cite
    Statista (2025). Number of suicides India 1971-2022 [Dataset]. https://www.statista.com/statistics/665354/number-of-suicides-india/
    Dataset updated
    May 27, 2025
    Dataset authored and provided by
    Statista (http://statista.com/)
    Area covered
    India
    Description

    Over *** thousand deaths due to suicide were recorded in India in 2022. The majority of suicides were reported in the state of Tamil Nadu, followed by Rajasthan. The number of suicides that year increased from the previous year. Causes of suicide in the country included professional problems, abuse, violence, family problems, financial loss, a sense of isolation, and mental disorders.

    Depressive disorders and suicide

    As of 2015, over ****** million people worldwide suffered from some kind of depressive disorder. Furthermore, over ** percent of the total population in India suffered from different forms of mental disorders as of 2017. There is a positive correlation between suicide mortality rates and the presence of select mental disorders.

    Risk factors for mental disorders

    Every ******* person in India suffers from some form of mental disorder. Today, depressive disorders are regarded as the leading contributor not only to disease burden and morbidity worldwide, but also to suicide if not addressed. In 2022, the leading cause of suicide deaths in India was family problems. The second leading cause was illness. Risk factors for developing mental disorders, including depressive and anxiety disorders, include bullying victimization, poverty, unemployment, childhood sexual abuse, and intimate partner violence.

  15. Number of homicide victims, by method used to commit the homicide

    • www150.statcan.gc.ca
    • open.canada.ca
    Updated Jul 22, 2025
    Cite
    Government of Canada, Statistics Canada (2025). Number of homicide victims, by method used to commit the homicide [Dataset]. http://doi.org/10.25318/3510006901-eng
    Dataset updated
    Jul 22, 2025
    Dataset provided by
    Statistics Canada (https://statcan.gc.ca/en)
    Area covered
    Canada
    Description

    Number of homicide victims, by method used to commit the homicide (total methods used; shooting; stabbing; beating; strangulation; fire (burns or suffocation); other methods used; methods used unknown), Canada, 1974 to 2024.

  16. CommitBench

    • zenodo.org
    csv, json
    Updated Feb 14, 2024
    Cite
    Maximilian Schall; Tamara Czinczoll; Gerard de Melo (2024). CommitBench [Dataset]. http://doi.org/10.5281/zenodo.10497442
    Available download formats: json, csv
    Dataset updated
    Feb 14, 2024
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Maximilian Schall; Tamara Czinczoll; Gerard de Melo
    License

    Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
    License information was derived automatically

    Time period covered
    Dec 15, 2023
    Description

    Data Statement for CommitBench

    - Dataset Title: CommitBench
    - Dataset Curator: Maximilian Schall, Tamara Czinczoll, Gerard de Melo
    - Dataset Version: 1.0, 15.12.2023
    - Data Statement Author: Maximilian Schall, Tamara Czinczoll
    - Data Statement Version: 1.0, 16.01.2023

    EXECUTIVE SUMMARY

    We provide CommitBench as an open-source, reproducible and privacy- and license-aware benchmark for commit message generation. The dataset is gathered from GitHub repositories with licenses that permit redistribution. We provide six programming languages: Java, Python, Go, JavaScript, PHP and Ruby. The commit messages in natural language are restricted to English, as it is the working language in many software development projects. The dataset has 1,664,590 examples that were generated by using extensive quality-focused filtering techniques (e.g. excluding bot commits). Additionally, we provide a version with longer sequences for benchmarking models with more extended sequence input, as well as a version with

    CURATION RATIONALE

    We created this dataset due to quality and legal issues with previous commit message generation datasets. Given a git diff displaying code changes between two file versions, the task is to predict the accompanying commit message describing these changes in natural language. We base our GitHub repository selection on that of a previous dataset, CodeSearchNet, but apply a large number of filtering techniques to improve the data quality and eliminate noise. Due to the original repository selection, we are also restricted to the aforementioned programming languages. It was important to us, however, to provide some number of programming languages to accommodate any changes in the task due to the degree of hardware-relatedness of a language. The dataset is provided as a large CSV file containing all samples. We provide the following fields: Diff, Commit Message, Hash, Project, Split.
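
As a rough sketch of how such a CSV could be consumed, the snippet below groups (diff, commit message) pairs by split. The lowercase column names (`diff`, `message`, `hash`, `project`, `split`) and the two sample rows are assumptions made for illustration; check the header of the actual file before relying on them.

```python
import csv
import io
from collections import defaultdict

# Hypothetical two-row sample standing in for the real CSV file.
SAMPLE_CSV = (
    "diff,message,hash,project,split\n"
    '"+ return a + b",Add sum helper,abc123,example/repo,train\n'
    '"- x = 1",Remove unused variable,def456,example/repo,test\n'
)

def load_by_split(csv_file):
    """Group (diff, message) pairs by the value of the split column."""
    by_split = defaultdict(list)
    for row in csv.DictReader(csv_file):
        by_split[row["split"]].append((row["diff"], row["message"]))
    return dict(by_split)

pairs_by_split = load_by_split(io.StringIO(SAMPLE_CSV))
```

For the real file, the same function can be called on `open(path, newline="", encoding="utf-8")`, since diffs usually contain embedded newlines inside quoted fields.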

    DOCUMENTATION FOR SOURCE DATASETS

    Repository selection based on CodeSearchNet, which can be found under https://github.com/github/CodeSearchNet

    LANGUAGE VARIETIES

    Since GitHub hosts software projects from all over the world, there is no single uniform variety of English used across all commit messages. This means that phrasing can be regional or subject to influences from the programmer's native language. It also means that different spelling conventions may co-exist and that different terms may be used for the same concept. Any model trained on this data should take these factors into account. For the number of samples for different programming languages, see the table below:

    Language      Number of Samples
    Java          153,119
    Ruby          233,710
    Go            137,998
    JavaScript    373,598
    Python        472,469
    PHP           294,394

    SPEAKER DEMOGRAPHIC

    Due to the extremely diverse (geographically, but also socio-economically) backgrounds of the software development community, there is no single demographic the data comes from. Of course, this does not entail that there are no biases when it comes to the data origin. Globally, the average software developer tends to be male and has obtained higher education. Due to the anonymous nature of GitHub profiles, gender distribution information cannot be extracted.

    ANNOTATOR DEMOGRAPHIC

    Due to the automated generation of the dataset, no annotators were used.

    SPEECH SITUATION AND CHARACTERISTICS

    The public nature and often business-related creation of the data by the original GitHub users fosters a more neutral, information-focused and formal language. As it is not uncommon for developers to find the writing of commit messages tedious, there can also be commit messages representing the frustration or boredom of the commit author. While our filtering is supposed to catch these types of messages, there can be some instances still in the dataset.

    PREPROCESSING AND DATA FORMATTING

    See paper for all preprocessing steps. We do not provide the un-processed raw data due to privacy concerns, but it can be obtained via CodeSearchNet or requested from the authors.

    CAPTURE QUALITY

    While our dataset is completely reproducible at the time of writing, there are external dependencies that could restrict this. If GitHub shuts down or someone with a software project in the dataset deletes their repository, there can be instances that are non-reproducible.

    LIMITATIONS

    While our filters are meant to ensure a high quality for each data sample in the dataset, we cannot ensure that only low-quality examples were removed. Similarly, we cannot guarantee that our extensive filtering methods catch all low-quality examples. Some might remain in the dataset. Another limitation of our dataset is the low number of programming languages (there are many more) as well as our focus on English commit messages. There might be some people that only write commit messages in their respective languages, e.g., because the organization they work at has established this or because they do not speak English (confidently enough). Perhaps some languages' syntax better aligns with that of programming languages. These effects cannot be investigated with CommitBench.

    Although we anonymize the data as far as possible, the required information for reproducibility, including the organization, project name, and project hash, makes it possible to refer back to the original authoring user account, since this information is freely available in the original repository on GitHub.

    METADATA

    License: Dataset under the CC BY-NC 4.0 license

    DISCLOSURES AND ETHICAL REVIEW

    While we put substantial effort into removing privacy-sensitive information, our solutions cannot find 100% of such cases. This means that researchers and anyone using the data need to incorporate their own safeguards to effectively reduce the amount of personal information that can be exposed.

    ABOUT THIS DOCUMENT

    A data statement is a characterization of a dataset that provides context to allow developers and users to better understand how experimental results might generalize, how software might be appropriately deployed, and what biases might be reflected in systems built on the software.

    This data statement was written based on the template for the Data Statements Version 2 schema. The template was prepared by Angelina McMillan-Major, Emily M. Bender, and Batya Friedman, can be found at https://techpolicylab.uw.edu/data-statements/, and was updated from the community Version 1 Markdown template by Leon Derczynski.

  17. Number, percentage and rate of homicide victims, by racialized identity...

    • www150.statcan.gc.ca
    • data.urbandatacentre.ca
    • +3 more
    Updated Jul 22, 2025
    + more versions
    Cite
    Government of Canada, Statistics Canada (2025). Number, percentage and rate of homicide victims, by racialized identity group, gender and region [Dataset]. http://doi.org/10.25318/3510020601-eng
    Dataset updated
    Jul 22, 2025
    Dataset provided by
    Statistics Canada (https://statcan.gc.ca/en)
    Area covered
    Canada
    Description

    Number, percentage and rate (per 100,000 population) of homicide victims, by racialized identity group (total, by racialized identity group; racialized identity group; South Asian; Chinese; Black; Filipino; Arab; Latin American; Southeast Asian; West Asian; Korean; Japanese; other racialized identity group; multiple racialized identity; racialized identity, but racialized identity group is unknown; rest of the population; unknown racialized identity group), gender (all genders; male; female; gender unknown) and region (Canada; Atlantic region; Quebec; Ontario; Prairies region; British Columbia; territories), 2019 to 2024.

  18. Immigration system statistics data tables

    • gov.uk
    Updated May 22, 2025
    Cite
    Home Office (2025). Immigration system statistics data tables [Dataset]. https://www.gov.uk/government/statistical-data-sets/immigration-system-statistics-data-tables
    Dataset updated
    May 22, 2025
    Dataset provided by
    GOV.UK
    Authors
    Home Office
    Description

    List of the data tables as part of the Immigration System Statistics Home Office release. Summary and detailed data tables covering the immigration system, including out-of-country and in-country visas, asylum, detention, and returns.

    If you have any feedback, please email MigrationStatsEnquiries@homeoffice.gov.uk.

    Accessible file formats

    The Microsoft Excel .xlsx files may not be suitable for users of assistive technology.
    If you use assistive technology (such as a screen reader) and need a version of these documents in a more accessible format, please email MigrationStatsEnquiries@homeoffice.gov.uk
    Please tell us what format you need. It will help us if you say what assistive technology you use.

    Related content

    Immigration system statistics, year ending March 2025
    Immigration system statistics quarterly release
    Immigration system statistics user guide
    Publishing detailed data tables in migration statistics
    Policy and legislative changes affecting migration to the UK: timeline
    Immigration statistics data archives

    Passenger arrivals

    Passenger arrivals summary tables, year ending March 2025 (MS Excel Spreadsheet, 66.5 KB): https://assets.publishing.service.gov.uk/media/68258d71aa3556876875ec80/passenger-arrivals-summary-mar-2025-tables.xlsx

    ‘Passengers refused entry at the border summary tables’ and ‘Passengers refused entry at the border detailed datasets’ have been discontinued. The latest published versions of these tables are from February 2025 and are available in the ‘Passenger refusals – release discontinued’ section. A similar data series, ‘Refused entry at port and subsequently departed’, is available within the Returns detailed and summary tables.

    Electronic travel authorisation

    Electronic travel authorisation detailed datasets, year ending March 2025 (MS Excel Spreadsheet, 56.7 KB): https://assets.publishing.service.gov.uk/media/681e406753add7d476d8187f/electronic-travel-authorisation-datasets-mar-2025.xlsx
    ETA_D01: Applications for electronic travel authorisations, by nationality
    ETA_D02: Outcomes of applications for electronic travel authorisations, by nationality

    Entry clearance visas granted outside the UK

    Entry clearance visas summary tables, year ending March 2025 (MS Excel Spreadsheet, 113 KB): https://assets.publishing.service.gov.uk/media/68247953b296b83ad5262ed7/visas-summary-mar-2025-tables.xlsx

    Entry clearance visa applications and outcomes detailed datasets, year ending March 2025 (MS Excel Spreadsheet, 29.1 MB): https://assets.publishing.service.gov.uk/media/682c4241010c5c28d1c7e820/entry-clearance-visa-outcomes-datasets-mar-2025.xlsx
    Vis_D01: Entry clearance visa applications, by nationality and visa type
    Vis_D02: Outcomes of entry clearance visa applications, by nationality, visa type, and outcome

    Additional d

  19. Number, rate and percentage changes in rates of homicide victims

    • www150.statcan.gc.ca
    • datasets.ai
    • +1 more
    Updated Jul 22, 2025
    + more versions
    Cite
    Government of Canada, Statistics Canada (2025). Number, rate and percentage changes in rates of homicide victims [Dataset]. http://doi.org/10.25318/3510006801-eng
    Dataset updated
    Jul 22, 2025
    Dataset provided by
    Statistics Canada (https://statcan.gc.ca/en)
    Area covered
    Canada
    Description

    Number, rate and percentage changes in rates of homicide victims, Canada, provinces and territories, 1961 to 2024.

  20. FiveThirtyEight Hate Crimes Dataset

    • kaggle.com
    Updated Apr 26, 2019
    Cite
    FiveThirtyEight (2019). FiveThirtyEight Hate Crimes Dataset [Dataset]. https://www.kaggle.com/datasets/fivethirtyeight/fivethirtyeight-hate-crimes-dataset/discussion
    Dataset updated
    Apr 26, 2019
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    FiveThirtyEight
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Content

    Hate Crimes

    This folder contains data behind the story Higher Rates Of Hate Crimes Are Tied To Income Inequality.

    Header                                    Definition
    state                                     State name
    median_household_income                   Median household income, 2016
    share_unemployed_seasonal                 Share of the population that is unemployed (seasonally adjusted), Sept. 2016
    share_population_in_metro_areas           Share of the population that lives in metropolitan areas, 2015
    share_population_with_high_school_degree  Share of adults 25 and older with a high-school degree, 2009
    share_non_citizen                         Share of the population that are not U.S. citizens, 2015
    share_white_poverty                       Share of white residents who are living in poverty, 2015
    gini_index                                Gini Index, 2015
    share_non_white                           Share of the population that is not white, 2015
    share_voters_voted_trump                  Share of 2016 U.S. presidential voters who voted for Donald Trump
    hate_crimes_per_100k_splc                 Hate crimes per 100,000 population, Southern Poverty Law Center, Nov. 9-18, 2016
    avg_hatecrimes_per_100k_fbi               Average annual hate crimes per 100,000 population, FBI, 2010-2015
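
One simple way to explore these columns is to compute a Pearson correlation between two of them, for example `gini_index` and `avg_hatecrimes_per_100k_fbi`. The sketch below is only an illustration: the rows are invented placeholders rather than real values, it assumes missing values appear as empty cells, and it is not the analysis used in the original story.

```python
import csv
import io
import math

# Invented placeholder rows; the last one has a missing value.
SAMPLE_CSV = (
    "state,gini_index,avg_hatecrimes_per_100k_fbi\n"
    "State A,0.42,1.0\n"
    "State B,0.44,2.0\n"
    "State C,0.46,3.0\n"
    "State D,0.45,\n"
)

def column_pairs(csv_file, x_col, y_col):
    """Collect (x, y) float pairs, skipping rows where either cell is empty."""
    pairs = []
    for row in csv.DictReader(csv_file):
        if row[x_col] and row[y_col]:
            pairs.append((float(row[x_col]), float(row[y_col])))
    return pairs

def pearson(pairs):
    """Pearson correlation coefficient of a list of (x, y) pairs."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    var_x = sum((x - mean_x) ** 2 for x, _ in pairs)
    var_y = sum((y - mean_y) ** 2 for _, y in pairs)
    return cov / math.sqrt(var_x * var_y)

pairs = column_pairs(io.StringIO(SAMPLE_CSV), "gini_index", "avg_hatecrimes_per_100k_fbi")
r = pearson(pairs)
```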

    Sources: Kaiser Family Foundation; Kaiser Family Foundation; Kaiser Family Foundation; Census Bureau; Kaiser Family Foundation; Kaiser Family Foundation; Census Bureau; Kaiser Family Foundation; United States Elections Project; Southern Poverty Law Center; FBI

    Correction

    Please see the following commit: https://github.com/fivethirtyeight/data/commit/fbc884a5c8d45a0636e1d6b000021632a0861986

    Context

    This is a dataset from FiveThirtyEight hosted on their GitHub. Explore FiveThirtyEight data using Kaggle and all of the data sources available through the FiveThirtyEight organization page!

    • Update Frequency: This dataset is updated daily.

    Acknowledgements

    This dataset is maintained using GitHub's API and Kaggle's API.

    This dataset is distributed under the Attribution 4.0 International (CC BY 4.0) license.

Cite
Filip Zoubek (2021). Effect of suicide rates on life expectancy dataset [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_4694269

Effect of suicide rates on life expectancy dataset

Dataset updated
Apr 16, 2021
Dataset authored and provided by
Filip Zoubek
License

Attribution-NonCommercial-ShareAlike 3.0 (CC BY-NC-SA 3.0): https://creativecommons.org/licenses/by-nc-sa/3.0/
License information was derived automatically

Description

Effect of suicide rates on life expectancy dataset

Abstract: In 2015, approximately 55 million people died worldwide, of whom 8 million committed suicide. Suicide is one of the main causes of death in the USA, so this experiment asks how much suicide rates affect average life expectancy statistics. The experiment takes two datasets, one with the number of suicides and the other with life expectancy, and combines them into one dataset. Subsequently, I look for patterns and correlations among the variables and perform a statistical test using simple regression to confirm my assumptions.

Data

The experiment uses two datasets, WHO Suicide Statistics[1] and WHO Life Expectancy[2], which were first preprocessed. The final merged dataset has 13 variables, with country and year used as the index:

- Country
- Year
- Suicides number
- Life expectancy
- Adult Mortality: probability of dying between 15 and 60 years per 1000 population
- Infant deaths: number of infant deaths per 1000 population
- Alcohol: recorded per capita (15+) alcohol consumption
- Under-five deaths: number of under-five deaths per 1000 population
- HIV/AIDS: deaths per 1,000 live births from HIV/AIDS
- GDP: Gross Domestic Product per capita
- Population
- Income composition of resources: Human Development Index in terms of income composition of resources
- Schooling: number of years of schooling
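
The merge on country and year, followed by a simple regression, could look roughly like the sketch below. The rows are invented placeholders, and field names such as `suicides_no` and `life_expectancy` are assumptions rather than the author's actual column names. The regression is plain ordinary least squares, since the description does not say which implementation was used.

```python
def merge_on_country_year(suicides, life_expectancy):
    """Inner-join two lists of dict rows on the (country, year) key."""
    life_by_key = {(r["country"], r["year"]): r for r in life_expectancy}
    merged = []
    for row in suicides:
        key = (row["country"], row["year"])
        if key in life_by_key:
            merged.append({**life_by_key[key], **row})
    return merged

def simple_regression(xs, ys):
    """Return (slope, intercept) of the least-squares line y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Invented placeholder rows; country "C" has no life-expectancy match and is dropped.
suicides = [
    {"country": "A", "year": 2000, "suicides_no": 100},
    {"country": "A", "year": 2001, "suicides_no": 120},
    {"country": "B", "year": 2000, "suicides_no": 80},
    {"country": "C", "year": 2000, "suicides_no": 50},
]
life_expectancy = [
    {"country": "A", "year": 2000, "life_expectancy": 70.0},
    {"country": "A", "year": 2001, "life_expectancy": 69.0},
    {"country": "B", "year": 2000, "life_expectancy": 71.0},
]

merged = merge_on_country_year(suicides, life_expectancy)
slope, intercept = simple_regression(
    [r["suicides_no"] for r in merged],
    [r["life_expectancy"] for r in merged],
)
```

An inner join is used here so that only country-year pairs present in both sources enter the regression; a different join would need a policy for missing values.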

License

The experiment uses two datasets, WHO Suicide Statistics and WHO Life Expectancy, which were collected from the WHO and United Nations websites. Therefore, all datasets are under the Attribution-NonCommercial-ShareAlike 3.0 IGO license (https://creativecommons.org/licenses/by-nc-sa/3.0/igo/).

[1] https://www.kaggle.com/szamil/who-suicide-statistics

[2] https://www.kaggle.com/kumarajarshi/life-expectancy-who
