Attribution-NonCommercial-ShareAlike 3.0 (CC BY-NC-SA 3.0) https://creativecommons.org/licenses/by-nc-sa/3.0/
License information was derived automatically
Effect of suicide rates on life expectancy dataset
Abstract
In 2015, approximately 55 million people died worldwide, of whom 8 million died by suicide. Suicide is also one of the main causes of death in the USA, so this experiment asks how much suicide rates affect average life expectancy statistics.
The experiment takes two datasets, one with the number of suicides and one with life expectancy, and combines them into a single dataset. I then look for patterns and correlations among the variables and perform a statistical test using simple regression to confirm my assumptions.
Data
The experiment uses two datasets, WHO Suicide Statistics [1] and WHO Life Expectancy [2], both of which were preprocessed before merging. The final merged dataset has 13 variables, with Country and Year used as the index: Country; Year; Suicides number; Life expectancy; Adult Mortality, the probability of dying between 15 and 60 years of age per 1,000 population; Infant deaths, the number of infant deaths per 1,000 population; Alcohol, the recorded per capita (15+) alcohol consumption; Under-five deaths, the number of under-five deaths per 1,000 population; HIV/AIDS, the deaths per 1,000 live births due to HIV/AIDS; GDP, the Gross Domestic Product per capita; Population; Income composition of resources, the Human Development Index in terms of income composition of resources; and Schooling, the number of years of schooling.
License
The experiment uses two datasets, WHO Suicide Statistics and WHO Life Expectancy, which were collected from the WHO and United Nations websites. Therefore, all datasets are under the Attribution-NonCommercial-ShareAlike 3.0 IGO license (https://creativecommons.org/licenses/by-nc-sa/3.0/igo/).
[1] https://www.kaggle.com/szamil/who-suicide-statistics
[2] https://www.kaggle.com/kumarajarshi/life-expectancy-who
Data on death rates for suicide, by selected population characteristics. Please refer to the PDF or Excel version of this table in the HUS 2019 Data Finder (https://www.cdc.gov/nchs/hus/contents2019.htm) for critical information about measures, definitions, and changes over time. SOURCE: NCHS, National Vital Statistics System (NVSS); Grove RD, Hetzel AM. Vital statistics rates in the United States, 1940–1960. National Center for Health Statistics. 1968; numerator data from NVSS annual public-use Mortality Files; denominator data from U.S. Census Bureau national population estimates; and Murphy SL, Xu JQ, Kochanek KD, Arias E, Tejada-Vera B. Deaths: Final data for 2018. National Vital Statistics Reports; vol 69 no 13. Hyattsville, MD: National Center for Health Statistics. 2021. Available from: https://www.cdc.gov/nchs/products/nvsr.htm. For more information on the National Vital Statistics System, see the corresponding Appendix entry at https://www.cdc.gov/nchs/data/hus/hus19-appendix-508.pdf.
Open Government Licence 3.0 http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/
License information was derived automatically
Number of suicides and suicide rates, by sex and age, in England and Wales. Information on conclusion type is provided, along with the proportion of suicides by method and the median registration delay.
Attribution-ShareAlike 4.0 (CC BY-SA 4.0) https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
There is a well-documented phenomenon of increased suicide rates among United States military veterans. One recent analysis, published in 2016, found the suicide rate amongst veterans to be around 20 per day. The widespread nature of the problem has resulted in efforts by and pressure on the United States military services to combat and address mental health issues in and after service in the country's armed forces.
In 2013 News21 published a sequence of reports on the phenomenon, aggregating and using data provided by individual states to typify the nationwide pattern. This dataset is the underlying data used in that report, as collected by the News21 team.
The data consists of six files, one for each year between 2005 and 2011. Each year's worth of data includes the general population of each US state, a count of suicides, a count of state veterans, and a count of veteran suicides.
This data was originally published by News21. It has been converted from an XLS to a CSV format for publication on Kaggle. The original data, visualizations, and stories can be found at the source.
What is the geospatial pattern of veterans in the United States? How much more vulnerable is the average veteran to suicide than the average citizen? Is the problem increasing or decreasing over time?
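As one way to approach the second question, the four counts described for each state (general population, suicides, veterans, veteran suicides) can be turned into rates per 100,000 and compared. The sketch below assumes, without confirmation from the source, that the state population and suicide totals include veterans; the field names and the example numbers are hypothetical.

```python
def rate_per_100k(deaths, population):
    """Crude rate per 100,000 population."""
    return deaths / population * 100_000


def veteran_rate_ratio(row):
    """Return (veteran rate, non-veteran rate, ratio) for one state-year row.

    Assumes 'population' and 'suicides' are totals that include veterans,
    so the non-veteran figures are obtained by subtraction.
    """
    civ_deaths = row["suicides"] - row["vet_suicides"]
    civ_pop = row["population"] - row["veterans"]
    vet_rate = rate_per_100k(row["vet_suicides"], row["veterans"])
    civ_rate = rate_per_100k(civ_deaths, civ_pop)
    return vet_rate, civ_rate, vet_rate / civ_rate


# Hypothetical state-year record, not taken from the News21 files.
example = {
    "state": "Example",
    "population": 5_000_000,
    "suicides": 600,
    "veterans": 400_000,
    "vet_suicides": 120,
}

vet_rate, civ_rate, ratio = veteran_rate_ratio(example)
print(f"veteran={vet_rate:.1f} non-veteran={civ_rate:.1f} ratio={ratio:.3f}")
```

Repeating the calculation per year would give a simple view of whether the ratio is rising or falling, though crude rates like these ignore differences in age and sex structure between veterans and the general population.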
The included dataset contains 10,000 synthetic Veteran patient records generated by Synthea. The scope of the data includes over 500 clinical concepts across 90 disease modules, as well as additional social determinants of health (SDoH) data elements that are not traditionally tracked in electronic health records. Each synthetic patient conceptually represents one Veteran in the existing US population; each Veteran has a name, sociodemographic profile, a series of documented clinical encounters and diagnoses, as well as associated cost and payer data. To learn more about Synthea, please visit the Synthea wiki at https://github.com/synthetichealth/synthea/wiki. To find a description of how this dataset is organized by data type, please visit the Synthea CSV File Data Dictionary at https://github.com/synthetichealth/synthea/wiki/CSV-File-Data-Dictionary.
ODC Public Domain Dedication and Licence (PDDL) v1.0 http://www.opendatacommons.org/licenses/pddl/1.0/
License information was derived automatically
This dataset shows the suicide rates for just over 100 countries. The data is compiled from World Health Organization figures from 2008, in which a country's rank is determined by its total rate of deaths officially recorded as suicides. Rates are expressed per 100,000 of population. Note: the year is not consistent for all entries; please refer to the year column to determine what year the data represents. Data sourced from WHO website - Mental health. World Health Organization. 2009. http://www.who.int/mental_health/prevention/suicide/country_reports/en/index.html. GIS vector data. This dataset was first accessioned in the EDINA ShareGeo Open repository on 2011-01-31 and migrated to Edinburgh DataShare on 2017-02-21.
Download data on suicides in Massachusetts by demographics and year. This page also includes reporting on military & veteran suicide, and suicides during COVID-19.
Apache License, v2.0 https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
This dataset provides comprehensive information on the death rates for suicide in the United States, segmented by sex, race, Hispanic origin, and age, spanning from 1950 to 2020. The data is sourced from reputable public health records and aims to offer valuable insights into the demographic factors associated with suicide rates over an extensive period.
Open Government Licence 3.0 http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/
License information was derived automatically
Provisional rate and number of suicide deaths registered in England per quarter. Includes 2001 to 2023 registrations and provisional data for Quarter 1 (Jan to Mar) to Quarter 4 (Oct to Dec) 2024. These are official statistics in development.
Attribution-NonCommercial-ShareAlike 3.0 (CC BY-NC-SA 3.0) https://creativecommons.org/licenses/by-nc-sa/3.0/
License information was derived automatically
Close to 800 000 people die due to suicide every year, which is one person every 40 seconds. Suicide is a global phenomenon and occurs throughout the lifespan. Effective and evidence-based interventions can be implemented at population, sub-population and individual levels to prevent suicide and suicide attempts. There are indications that for each adult who died by suicide there may have been more than 20 others attempting suicide.
Suicide is a complex issue and therefore suicide prevention efforts require coordination and collaboration among multiple sectors of society, including the health sector and other sectors such as education, labour, agriculture, business, justice, law, defense, politics, and the media. These efforts must be comprehensive and integrated as no single approach alone can make an impact on an issue as complex as suicide.
This dataset contains counts of deaths for California counties based on information entered on death certificates. Final counts are derived from static data and include out-of-state deaths to California residents, whereas provisional counts are derived from incomplete and dynamic data. Provisional counts are based on the records available when the data was retrieved and may not represent all deaths that occurred during the time period. Deaths involving injuries from external or environmental forces, such as accidents, homicide and suicide, often require additional investigation that tends to delay certification of the cause and manner of death. This can result in significant under-reporting of these deaths in provisional data.
The final data tables include both deaths that occurred in each California county regardless of the place of residence (by occurrence) and deaths to residents of each California county (by residence), whereas the provisional data table only includes deaths that occurred in each county regardless of the place of residence (by occurrence). The data are reported as totals, as well as stratified by age, gender, race-ethnicity, and death place type. Deaths due to all causes (ALL) and selected underlying cause of death categories are provided. See temporal coverage for more information on which combinations are available for which years.
The cause of death categories are based solely on the underlying cause of death as coded by the International Classification of Diseases. The underlying cause of death is defined by the World Health Organization (WHO) as "the disease or injury which initiated the train of events leading directly to death, or the circumstances of the accident or violence which produced the fatal injury." It is a single value assigned to each death based on the details as entered on the death certificate. When more than one cause is listed, the order in which they are listed can affect which cause is coded as the underlying cause. This means that similar events could be coded with different underlying causes of death depending on variations in how they were entered. Consequently, while underlying cause of death provides a convenient comparison between cause of death categories, it may not capture the full impact of each cause of death as it does not always take into account all conditions contributing to the death.
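The difference between counting deaths by occurrence and by residence, as described for the final and provisional tables above, can be illustrated with a small sketch. The record layout and field names are hypothetical and are not the dataset's actual schema; an out-of-state death of a California resident is represented here by a missing county of occurrence.

```python
from collections import Counter

# Hypothetical death records; None marks a place outside California.
deaths = [
    {"county_of_occurrence": "Alameda", "county_of_residence": "Alameda"},
    {"county_of_occurrence": "Alameda", "county_of_residence": "Contra Costa"},
    {"county_of_occurrence": "Contra Costa", "county_of_residence": "Contra Costa"},
    # A California resident who died out of state: counted by residence only.
    {"county_of_occurrence": None, "county_of_residence": "Alameda"},
]

# By occurrence: where the death happened, regardless of where the person lived.
by_occurrence = Counter(
    d["county_of_occurrence"] for d in deaths if d["county_of_occurrence"]
)

# By residence: where the person lived, regardless of where the death happened.
by_residence = Counter(
    d["county_of_residence"] for d in deaths if d["county_of_residence"]
)

print("by occurrence:", dict(by_occurrence))
print("by residence:", dict(by_residence))
```

The same records yield different county totals under the two definitions, which is why the two kinds of table should not be compared directly.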
Open Government Licence 3.0 http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/
License information was derived automatically
This dataset provides the age-standardised rate of deaths from suicide and injury of undetermined intent. It includes deaths registered in calendar years and classified using specific ICD-10 codes. The data is aggregated into quinary age bands starting from age 10 and is expressed per 100,000 population. Age standardisation ensures comparability across different population groups and time periods.
Rationale Reducing the suicide rate is a critical public health goal. Monitoring this indicator helps identify trends, assess the effectiveness of mental health interventions, and inform policy decisions aimed at preventing suicide and supporting at-risk populations.
Numerator The numerator is the number of deaths from suicide and injury of undetermined intent. These are classified by underlying cause of death using ICD-10 codes X60–X84 (ages 10+ only) and Y10–Y34 (ages 15+ only), and are registered in the respective calendar years. The data is grouped into quinary age bands.
Denominator The denominator is the aggregated population-years for individuals aged 10 and over, also grouped into quinary age bands. These population estimates are based on the 2021 Census.
Caveats Rates for the period 2001 to 2006 were revised in March 2015. Prior to this revision, ICD code Y33.9 was incorrectly included, which resulted in inflated rates for those years. Users should be cautious when comparing data across this time span.
External References Further details and related indicators can be accessed on the Fingertips Public Health Profiles website.
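The numerator and denominator described above combine into an age-standardised rate by weighting each age band's crude rate with a standard population. The following is a minimal sketch of direct standardisation. The age bands, counts and weights are hypothetical, and the source does not state which standard population is used, so the weights here are an assumption for illustration only.

```python
def age_standardised_rate(deaths, population_years, standard_weights):
    """Directly age-standardised rate per 100,000.

    Each argument is a dict keyed by age band. The result is the weighted
    average of the band-specific crude rates, using the standard
    population weights.
    """
    total_weight = sum(standard_weights.values())
    weighted_sum = 0.0
    for band, weight in standard_weights.items():
        crude_rate = deaths[band] / population_years[band] * 100_000
        weighted_sum += weight * crude_rate
    return weighted_sum / total_weight


# Hypothetical quinary age bands and values (illustration only).
deaths = {"10-14": 1, "15-19": 5, "20-24": 10}
population_years = {"10-14": 100_000, "15-19": 100_000, "20-24": 200_000}
standard_weights = {"10-14": 5500, "15-19": 5500, "20-24": 6000}

asr = age_standardised_rate(deaths, population_years, standard_weights)
print(f"age-standardised rate per 100,000: {asr:.2f}")
```

Because every area and period is weighted by the same standard population, differences in the resulting rates reflect differences in band-specific rates rather than in age structure.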
Localities Explained
This dataset contains data based on either the resident locality or the registered locality of the patient. A distinction is made between resident locality and registered locality populations:
Resident Locality refers to individuals who live within the defined geographic boundaries of the locality. These boundaries are aligned with official administrative areas such as wards and Lower Layer Super Output Areas (LSOAs).
Registered Locality refers to individuals who are registered with GP practices that are assigned to a locality based on the Primary Care Network (PCN) they belong to. These assignments are approximate: PCNs are mapped to a locality based on the location of most of their GP surgeries. As a result, locality-registered patients may live outside the locality, sometimes even in different towns or cities.
This distinction is important because some health indicators are only available at GP practice level, without information on where patients actually reside. In such cases, data is attributed to the locality based on GP registration, not residential address.
Click here to explore more from the Birmingham and Solihull Integrated Care Partnerships Outcome Framework.
Dataset 1: Participants were asked to determine their agreement or disagreement with 23 common views on suicide using a five-point Likert scale (“1” means “strongly disagree,” and “5” means “strongly agree”). The data were collected from different academic years from 2021 to 2023, and the teacher group. Dataset 2: The survey required the first-year students to imagine themselves facing a classmate in a high-risk crisis. They were told to consider a case where they had to reduce the classmate’s suicidal intention with only one sentence, and what it would be. Ultimately, 1,284 feedback items were collected.
Suicide Rate - This indicator shows the suicide rate per 100,000 population. Suicide is a serious public health problem that can have lasting effects on individuals, families, and communities. Mental disorders and/or substance abuse have been found in the great majority of people who have died by suicide. In Maryland, approximately 500 lives are lost each year to this preventable cause of death.
This dataset provides model-based provisional estimates of the weekly numbers of drug overdose, suicide, and transportation-related deaths using “nowcasting” methods to account for the normal lag between the occurrence and reporting of these deaths. Estimates less than 10 are suppressed. These early model-based provisional estimates were generated using a multi-stage hierarchical Bayesian modeling process to generate smoothed estimates of the weekly numbers of death, accounting for reporting lags. These estimates are based on several assumptions about how the reporting lags have changed in recent months across different jurisdictions, and the resulting estimates differ from other sources of provisional mortality data. For now, these estimates should be considered highly uncertain until further evaluations can be done to determine the validity of these assumptions about timeliness. The true patterns in reporting lags will not be known until data are finalized, typically 11–12 months after the end of the calendar year. Importantly, these estimates are not a replacement for monthly provisional drug overdose death counts, or quarterly provisional mortality estimates. For more detail about the nowcasting methods and models, see: Rossen LM, Hedegaard H, Warner M, Ahmad FB, Sutton PD. Early provisional estimates of drug overdose, suicide, and transportation-related deaths: Nowcasting methods to account for reporting lags. Vital Statistics Rapid Release; no 11. Hyattsville, MD: National Center for Health Statistics. February 2021. DOI: https://doi.org/10.15620/cdc:101132
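The basic idea of adjusting for reporting lag can be illustrated with a deliberately simplified sketch: each week's reported count is scaled up by an assumed fraction of deaths already reported, and estimates below 10 are suppressed. This is not the multi-stage hierarchical Bayesian model used for the dataset. The function `nowcast`, the completeness fractions and the counts are all hypothetical.

```python
def nowcast(observed, completeness, suppress_below=10):
    """Scale weekly reported counts by an assumed reporting completeness.

    observed: reported deaths per week, oldest week first.
    completeness: assumed fraction (0 < f <= 1) of each week's deaths
        already reported; recent weeks are typically less complete.
    Returns rounded estimates, with None where the estimate falls below
    the suppression threshold.
    """
    estimates = []
    for count, fraction in zip(observed, completeness):
        if not 0 < fraction <= 1:
            raise ValueError("completeness must be in (0, 1]")
        estimate = count / fraction
        estimates.append(None if estimate < suppress_below else round(estimate))
    return estimates


# Hypothetical reported counts and completeness fractions for four weeks.
observed = [40, 36, 24, 4]
completeness = [1.0, 0.9, 0.6, 0.5]

result = nowcast(observed, completeness)
print(result)
```

The estimates are only as good as the assumed completeness fractions, which mirrors the caveat above that the results depend on assumptions about how reporting lags have changed.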