Open Government Licence 3.0 http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/
License information was derived automatically
Potential years of life lost (PYLL) due to alcohol-related conditions, all ages, directly age-standardised per 100,000 population (standardised to the ESP).
Rationale: Alcohol consumption is a contributing factor to hospital admissions and deaths from a diverse range of conditions. Alcohol misuse is estimated to cost the NHS about £3.5 billion per year and society as a whole £21 billion annually. The Government has said that everyone has a role to play in reducing the harmful use of alcohol. This indicator is one of the key contributions by the Government (and the Department of Health and Social Care) to promote measurable, evidence-based prevention activities at a local level, and it supports the national ambitions to reduce harm set out in the Government's Alcohol Strategy. This ambition is part of the monitoring arrangements for the Responsibility Deal Alcohol Network. Alcohol-related deaths can be reduced through local interventions to reduce alcohol misuse and harm.
Potential years of life lost (PYLL) is a measure of the potential number of years lost when a person dies prematurely. The basic concept of PYLL is that deaths at younger ages are weighted more heavily than those at older ages. This weighting matters because, if cause-specific death rates were used on their own to highlight the burden of disease and injury, deaths at younger ages could appear less important: conditions such as cancer and heart disease usually occur at older ages and have relatively high mortality rates.
To enable comparisons between areas and over time, PYLL rates are age-standardised to represent the PYLL if each area had the same population structure as the 2013 European Standard Population (ESP). PYLL rates are presented as years of life lost per 100,000 population.
Definition of numerator: The number of age-specific alcohol-related deaths multiplied by the national life expectancy for each age group and summed to give the total potential years of life lost due to alcohol-related conditions.
Definition of denominator: ONS Mid-Year Population Estimates aggregated into quinary age bands.
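The numerator, denominator and standardisation described above can be sketched as follows. This is a minimal illustration, not the published method's code: the function name, the three age bands and all figures are invented for the example, and real use would substitute the quinary age bands, the England life expectancies and the 2013 ESP weights.

```python
def standardised_pyll_rate(deaths, life_expectancy, population, standard_population):
    """Directly age-standardised PYLL per 100,000 population.

    Each argument is a list with one entry per age band, in the same order.
    """
    total_weight = sum(standard_population)
    weighted_sum = 0.0
    for d, e, n, w in zip(deaths, life_expectancy, population, standard_population):
        pyll = d * e          # years of life lost in this age band (numerator)
        rate = pyll / n       # age-specific PYLL rate per person
        weighted_sum += rate * w  # weight by the standard population
    return weighted_sum / total_weight * 100_000


# Hypothetical figures for three age bands (illustration only).
example_rate = standardised_pyll_rate(
    deaths=[2, 5, 10],
    life_expectancy=[50.0, 30.0, 10.0],
    population=[10_000, 20_000, 10_000],
    standard_population=[30_000, 40_000, 30_000],
)
print(example_rate)
```

With these invented inputs each band contributes equally, and the result is about 900 years of life lost per 100,000 population.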
Caveats: There is the potential for the underlying cause of death to be incorrectly attributed on the death certificate and the cause of death misclassified. Alcohol-attributable fractions were not available for children. Conditions where low levels of alcohol consumption are protective (have a negative alcohol-attributable fraction) are not included in the calculation of the indicator.
The national life expectancies for England have been used for all sub-national geographies to illustrate the disparities in the burden caused by alcohol between local areas and the national average.
The confidence intervals do not take into account the uncertainty in the calculation of the alcohol-attributable fractions (AAFs): the proportion of deaths caused by alcohol and the alcohol consumption prevalence used in the AAF formula are only estimates and so carry uncertainty. The confidence intervals published here are based only on the observed number of deaths, so they may be too narrow.
The Detailed Mortality - Underlying Cause of Death data on CDC WONDER are county-level national mortality and population data spanning the years 1999-2009. Data are based on death certificates for U.S. residents. Each death certificate contains a single underlying cause of death and demographic data. The number of deaths, crude death rates, age-adjusted death rates, standard errors and 95% confidence intervals for death rates can be obtained by place of residence (total U.S., region, state, and county), age group (including infants and single-year-of-age cohorts), race (4 groups), Hispanic ethnicity, sex, cause of death (4-digit ICD-10 code or group of codes, injury intent and mechanism categories, or drug and alcohol related causes), year, month and weekday of death, place of death, and whether an autopsy was performed. The data are produced by the National Center for Health Statistics.
This dataset presents information on alcohol-attributable mortality rates for Alberta, for selected causes of death, per 100,000 population, for the years 2002 to 2012.
Attribution-NonCommercial-ShareAlike 3.0 (CC BY-NC-SA 3.0) https://creativecommons.org/licenses/by-nc-sa/3.0/
License information was derived automatically
Effect of suicide rates on life expectancy dataset
Abstract
In 2015, approximately 55 million people died worldwide, of whom roughly 800,000 died by suicide. In the USA, suicide is one of the main causes of death, so this experiment asks how much suicide rates affect average life expectancy statistics.
The experiment takes two datasets, one with the number of suicides and one with life expectancy, and combines them into a single dataset. I then look for patterns and correlations among the variables and run a statistical test using simple regression to confirm my assumptions.
Data
The experiment uses two datasets, WHO Suicide Statistics[1] and WHO Life Expectancy[2], which were first preprocessed appropriately. The final merged dataset for the experiment has 13 variables, with country and year used as the index:
- Country
- Year
- Suicides number
- Life expectancy
- Adult Mortality: probability of dying between 15 and 60 years per 1000 population
- Infant deaths: number of infant deaths per 1000 population
- Alcohol: recorded per capita (15+) alcohol consumption
- Under-five deaths: number of under-five deaths per 1000 population
- HIV/AIDS: deaths per 1 000 live births HIV/AIDS
- GDP: Gross Domestic Product per capita
- Population
- Income composition of resources: Human Development Index in terms of income composition of resources
- Schooling: number of years of schooling
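The merge-then-regress workflow described above might look roughly like this. It is a sketch under stated assumptions: the helper names, the dictionary layout keyed by (country, year) and every number are hypothetical, and ordinary least squares is written out by hand rather than taken from the author's code.

```python
def merge_on_keys(suicides, life_expectancy):
    """Inner-join two dicts keyed by (country, year); unmatched keys are dropped."""
    shared = sorted(suicides.keys() & life_expectancy.keys())
    return [(key, suicides[key], life_expectancy[key]) for key in shared]


def simple_regression(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical rows: suicide rate and life expectancy by (country, year).
suicide_rate = {("A", 2015): 10.0, ("B", 2015): 20.0, ("C", 2015): 30.0, ("D", 2015): 5.0}
life_exp = {("A", 2015): 80.0, ("B", 2015): 78.0, ("C", 2015): 76.0, ("E", 2015): 70.0}

merged = merge_on_keys(suicide_rate, life_exp)
slope, intercept = simple_regression(
    [row[1] for row in merged], [row[2] for row in merged]
)
print(len(merged), slope, intercept)
```

Only the three countries present in both inputs survive the join. A negative slope in such a fit would indicate an association, not that suicide rates cause lower life expectancy.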
LICENSE
The experiment uses two datasets, WHO Suicide Statistics and WHO Life Expectancy, which were collected from the WHO and United Nations websites. Therefore, all datasets are under the Attribution-NonCommercial-ShareAlike 3.0 IGO license (https://creativecommons.org/licenses/by-nc-sa/3.0/igo/).
[1] https://www.kaggle.com/szamil/who-suicide-statistics
[2] https://www.kaggle.com/kumarajarshi/life-expectancy-who
A straightforward way to assess the health status of a population is to focus on mortality, or on concepts like child mortality or life expectancy, which are based on mortality estimates. A focus on mortality, however, does not take into account that the burden of diseases is not only that they kill people, but that they cause suffering to people who live with them. Assessing health outcomes by both mortality and morbidity (the prevalent diseases) provides a more encompassing view on health outcomes. This is the topic of this entry. The sum of mortality and morbidity is referred to as the ‘burden of disease’ and can be measured by a metric called ‘Disability Adjusted Life Years’ (DALYs). DALYs measure lost health and are a standardized metric that allows direct comparisons of disease burdens of different diseases across countries, between different populations, and over time. Conceptually, one DALY is the equivalent of losing one year in good health because of either premature death or disease or disability. One DALY represents one lost year of healthy life. The first ‘Global Burden of Disease’ (GBD) was GBD 1990 and the DALY metric was prominently featured in the World Bank’s 1993 World Development Report. Today DALY estimates are published by both the researchers at the Institute for Health Metrics and Evaluation (IHME) and the ‘Disease Burden Unit’ at the World Health Organization (WHO), which was created in 1998. The IHME continues the work that was started in the early 1990s and publishes the Global Burden of Disease study.
In this Dataset, we have Historical Data of different causes of death for all ages around the World. The key features of this Dataset are: Meningitis, Alzheimer's Disease and Other Dementias, Parkinson's Disease, Nutritional Deficiencies, Malaria, Drowning, Interpersonal Violence, Maternal Disorders, HIV/AIDS, Drug Use Disorders, Tuberculosis, Cardiovascular Diseases, Lower Respiratory Infections, Neonatal Disorders, Alcohol Use Disorders, Self-harm, Exposure to Forces of Nature, Diarrheal Diseases, Environmental Heat and Cold Exposure, Neoplasms, Conflict and Terrorism, Diabetes Mellitus, Chronic Kidney Disease, Poisonings, Protein-Energy Malnutrition, Road Injuries, Chronic Respiratory Diseases, Cirrhosis and Other Chronic Liver Diseases, Digestive Diseases, Fire, Heat, and Hot Substances, Acute Hepatitis.
Background: Although excessive alcohol-related mortality in the post-Soviet countries remains the major public health threat, determinants of this phenomenon are still poorly understood.
Aims: We simultaneously assess individual- and area-level factors associated with an elevated risk of alcohol-related mortality among Lithuanian males aged 30–64.
Methods: Our analysis is based on a census-linked dataset containing information on individual- and area-level characteristics and death events which occurred between March 1st, 2011 and December 31st, 2013. We limit the analysis to a few causes of death which are directly linked to excessive alcohol consumption: accidental poisonings by alcohol (X45) and liver cirrhosis (K70 and K74). Multilevel Poisson regression models with random intercepts are applied to estimate mortality rate ratios (MRR).
Results: The selected individual-level characteristics are important predictors of alcohol-related mortality, whereas area-level variables show much less pronounced or insignificant effects. Compared to married men, never married (MRR = 1.9, CI: 1.6–2.2), divorced (MRR = 2.6, CI: 2.3–2.9), and widowed (MRR = 2.4, CI: 1.8–3.1) men are disadvantaged groups. Men who have the lowest level of educational attainment have the highest mortality risk (MRR = 1.7, CI: 1.4–2.1). Being unemployed is associated with a five-fold risk of alcohol-related death (MRR = 5.1, CI: 4.4–5.9), even after adjusting for all other individual variables. Lithuanian males have an advantage over Russian (MRR = 1.3, CI: 1.1–1.6) and Polish (MRR = 1.8, CI: 1.5–2.2) males. After adjusting for all individual characteristics, only two out of seven area-level variables (the share of ethnic minorities in the population and the election turnout) have statistically significant direct associations. These variables contribute to a higher risk of alcohol-related mortality at the individual level.
Conclusions: The huge and increasing socio-economic disparities in alcohol-related mortality indicate that recently implemented anti-alcohol measures in Lithuania should be reinforced by specific measures targeting the most disadvantaged population groups and geographical areas.
U.S. Government Works https://www.usa.gov/government-works
License information was derived automatically
ARDI is an online application that provides national and state estimates of alcohol-related health impacts, including deaths and years of potential life lost (YPLL). These estimates are calculated for 54 acute and chronic causes using alcohol-attributable fractions, and are reported by age and sex for 2006-2010. This dataset provides estimates of the proportion of deaths from various causes that are attributable to alcohol.
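The way an alcohol-attributable fraction turns cause-specific deaths into alcohol-attributable deaths can be illustrated as below. This is only a sketch: the cause labels, death counts and fractions are invented, and the function is not part of ARDI.

```python
def attributable_deaths(deaths_by_cause, aaf_by_cause):
    """Multiply each cause's death count by its alcohol-attributable fraction."""
    return {
        cause: deaths_by_cause[cause] * aaf_by_cause[cause]
        for cause in deaths_by_cause
    }


# Hypothetical inputs: a fraction of 1.0 means every death from that cause
# is treated as alcohol-attributable; 0.25 means a quarter of them are.
deaths = {"cause_a": 200, "cause_b": 50}
fractions = {"cause_a": 0.25, "cause_b": 1.0}

estimates = attributable_deaths(deaths, fractions)
total = sum(estimates.values())
print(estimates, total)
```

Summing the per-cause estimates gives the overall alcohol-attributable death count for whichever age and sex group the inputs describe.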
Background: Socioeconomic inequalities in alcohol-related mortality have been documented in several European countries, but it is unknown whether the magnitude of these inequalities differs between countries and whether these inequalities increase or decrease over time.
Methods and Findings: We collected and harmonized data on mortality from four alcohol-related causes (alcoholic psychosis, dependence, and abuse; alcoholic cardiomyopathy; alcoholic liver cirrhosis; and accidental poisoning by alcohol) by age, sex, education level, and occupational class in 20 European populations from 17 different countries, both for a recent period and for previous points in time, using data from mortality registers. Mortality was age-standardized using the European Standard Population, and measures for both relative and absolute inequality between low and high socioeconomic groups (as measured by educational level and occupational class) were calculated.
Rates of alcohol-related mortality are higher in lower educational and occupational groups in all countries. Both relative and absolute inequalities are largest in Eastern Europe, and Finland and Denmark also have very large absolute inequalities in alcohol-related mortality. For example, for educational inequality among Finnish men, the relative index of inequality is 3.6 (95% CI 3.3–4.0) and the slope index of inequality is 112.5 (95% CI 106.2–118.8) deaths per 100,000 person-years. Over time, the relative inequality in alcohol-related mortality has increased in many countries, but the main change is a strong rise of absolute inequality in several countries in Eastern Europe (Hungary, Lithuania, Estonia) and Northern Europe (Finland, Denmark) because of a rapid rise in alcohol-related mortality in lower socioeconomic groups. In some of these countries, alcohol-related causes now account for 10% or more of the socioeconomic inequality in total mortality.
Because our study relies on routinely collected underlying causes of death, it is likely that our results underestimate the true extent of the problem.
Conclusions: Alcohol-related conditions play an important role in generating inequalities in total mortality in many European countries. Countering increases in alcohol-related mortality in lower socioeconomic groups is essential for reducing inequalities in mortality. Studies of why such increases have not occurred in countries like France, Switzerland, Spain, and Italy can help in developing evidence-based policies in other European countries.
https://creativecommons.org/publicdomain/zero/1.0/
A straightforward way to assess the health status of a population is to focus on mortality, or on concepts like child mortality or life expectancy, which are based on mortality estimates. A focus on mortality, however, does not take into account that the burden of diseases is not only that they kill people, but that they cause suffering to people who live with them. Assessing health outcomes by both mortality and morbidity (the prevalent diseases) provides a more encompassing view on health outcomes. This is the topic of this entry. The sum of mortality and morbidity is referred to as the ‘burden of disease’ and can be measured by a metric called ‘Disability Adjusted Life Years’ (DALYs).
DALYs measure lost health and are a standardized metric that allows direct comparisons of disease burdens of different diseases across countries, between different populations, and over time. Conceptually, one DALY is the equivalent of losing one year in good health because of either premature death or disease or disability. One DALY represents one lost year of healthy life. The first ‘Global Burden of Disease’ (GBD) was GBD 1990 and the DALY metric was prominently featured in the World Bank’s 1993 World Development Report. Today DALY estimates are published by both the researchers at the Institute for Health Metrics and Evaluation (IHME) and the ‘Disease Burden Unit’ at the World Health Organization (WHO), which was created in 1998. The IHME continues the work that was started in the early 1990s and publishes the Global Burden of Disease study.
In this Dataset, we have Historical Data of different causes of death for all ages around the World. The key features of this Dataset are: Meningitis, Alzheimer's Disease and Other Dementias, Parkinson's Disease, Nutritional Deficiencies, Malaria, Drowning, Interpersonal Violence, Maternal Disorders, HIV/AIDS, Drug Use Disorders, Tuberculosis, Cardiovascular Diseases, Lower Respiratory Infections, Neonatal Disorders, Alcohol Use Disorders, Self-harm, Exposure to Forces of Nature, Diarrheal Diseases, Environmental Heat and Cold Exposure, Neoplasms, Conflict and Terrorism, Diabetes Mellitus, Chronic Kidney Disease, Poisonings, Protein-Energy Malnutrition, Road Injuries, Chronic Respiratory Diseases, Cirrhosis and Other Chronic Liver Diseases, Digestive Diseases, Fire, Heat, and Hot Substances, Acute Hepatitis.
This Dataset is created from Our World in Data. This Dataset falls under open access under the Creative Commons BY license. You can check the FAQ for more informa...
This data set depicts unintentional overdose deaths by county for Tennessee from 1999-2017. Data was compiled from the CDC Wonder database for each year and combined into a single spreadsheet. Each year has both a death field and a rate of fatalities per 100,000 people. The CDC does not publish the number of fatalities by county if the total is less than 10 in a given year. The CDC does not post a rate of fatalities if the total number of deaths per county is less than 20. The population field contains estimates from 2018 and is NOT the data used to generate the rates over time.
The following details are copied directly from the CDC Wonder database text file. Note that the year is different for each data download from the original database.
Dataset: Underlying Cause of Death, 1999-2017
Query Parameters:
Drug/Alcohol Induced Causes: Drug poisonings (overdose) Unintentional (X40-X44)
States: Tennessee (47)
Year/Month: 1999
Group By: County
Show Totals: True
Show Zero Values: False
Show Suppressed: False
Calculate Rates Per: 100,000
Rate Options: Default intercensal populations for years 2001-2009 (except Infant Age Groups)
---
Help: See http://wonder.cdc.gov/wonder/help/ucd.html for more information.
---
Query Date: Aug 19, 2019 10:22:15 PM
1. Rows with suppressed Deaths are hidden, but the Deaths and Population values in those rows are included in the totals. Use Quick Options above to show suppressed rows.
---
Caveats:
1. Data are Suppressed when the data meet the criteria for confidentiality constraints. More information: http://wonder.cdc.gov/wonder/help/ucd.html#Assurance of Confidentiality.
2. Death rates are flagged as Unreliable when the rate is calculated with a numerator of 20 or less. More information: http://wonder.cdc.gov/wonder/help/ucd.html#Unreliable.
3. The population figures for year 2017 are bridged-race estimates of the July 1 resident population, from the Vintage 2017 postcensal series released by NCHS on June 27, 2018. The population figures for year 2016 are bridged-race estimates of the July 1 resident population, from the Vintage 2016 postcensal series released by NCHS on June 26, 2017. The population figures for year 2015 are bridged-race estimates of the July 1 resident population, from the Vintage 2015 postcensal series released by NCHS on June 28, 2016. The population figures for year 2014 are bridged-race estimates of the July 1 resident population, from the Vintage 2014 postcensal series released by NCHS on June 30, 2015. The population figures for year 2013 are bridged-race estimates of the July 1 resident population, from the Vintage 2013 postcensal series released by NCHS on June 26, 2014. The population figures for year 2012 are bridged-race estimates of the July 1 resident population, from the Vintage 2012 postcensal series released by NCHS on June 13, 2013. The population figures for year 2011 are bridged-race estimates of the July 1 resident population, from the Vintage 2011 postcensal series released by NCHS on July 18, 2012. Population figures for 2010 are April 1 Census counts. The population figures for years 2001 - 2009 are bridged-race estimates of the July 1 resident population, from the revised intercensal county-level 2000 - 2009 series released by NCHS on October 26, 2012. Population figures for 2000 are April 1 Census counts. Population figures for 1999 are from the 1990-1999 intercensal series of July 1 estimates. Population figures for the infant age groups are the number of live births. Note: Rates and population figures for years 2001 - 2009 differ slightly from previously published reports, due to use of the population estimates which were available at the time of release.
4. The population figures used in the calculation of death rates for the age group 'under 1 year' are the estimates of the resident population that is under one year of age. More information: http://wonder.cdc.gov/wonder/help/ucd.html#Age Group.
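The publication thresholds stated in the dataset description (no count below 10 deaths, no rate below 20 deaths) can be expressed as a small filter. This is a sketch, not CDC code: the function name, the returned field names and the sample figures are hypothetical, and the cut-offs follow the description's wording rather than the "20 or less" wording in caveat 2.

```python
def publishable_row(deaths, population):
    """Apply the description's thresholds to one county-year.

    Returns None for any value that would not be published.
    """
    if deaths < 10:
        # Count is withheld, so no rate can be shown either.
        return {"deaths": None, "rate_per_100k": None}
    if deaths < 20:
        # Count is shown, but the rate is not posted.
        return {"deaths": deaths, "rate_per_100k": None}
    rate = round(deaths / population * 100_000, 1)
    return {"deaths": deaths, "rate_per_100k": rate}


# Hypothetical county-years with a population of 50,000.
for count in (5, 15, 25):
    print(count, publishable_row(count, 50_000))
```

Returning None rather than zero keeps withheld values distinguishable from true zeros when the rows are later aggregated or plotted.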
https://creativecommons.org/publicdomain/zero/1.0/
This dataset is an alternative, more reliable version of KumarRajarshi's Life Expectancy (WHO) dataset, some of whose values and methods can be questioned.
Context: All of the data in this dataset is compiled and downloaded from the Global Health Observatory (GHO), which is a public health data repository established by the World Health Organization (WHO). This makes the dataset very reliable and valid.
Challenges:
- Perform EDA to explore factors that affect life expectancy.
- Produce a model to predict life expectancy.
Dataset Contents:
Life Expectancy from birth: - https://www.who.int/data/gho/data/indicators/indicator-details/GHO/life-expectancy-at-birth-(years)
Mean BMI (kg/m²) (crude estimate): - https://www.who.int/data/gho/data/indicators/indicator-details/GHO/mean-bmi-(kg-m-)-(crude-estimate)
Alcohol, total per capita (15+) consumption (in litres of pure alcohol): - https://www.who.int/data/gho/data/indicators/indicator-details/GHO/total-(recorded-unrecorded)-alcohol-per-capita-(15-)-consumption
The rest of the factors: - https://www.who.int/data/gho/data/themes/mortality-and-global-health-estimates/ghe-leading-causes-of-death (BY COUNTRY, Summary tables of mortality estimates by cause, age and sex, by country, 2000–2019, Number of Deaths [2000, 2010, 2015, 2019]). All of the values are crude estimates of the number of deaths per 1000.
I did this so you don't have to!
Data Collected: March 2023
Data underlying figures and relative risk curves within the article. Provides readers the mean value and uncertainty intervals for prevalence of current drinking, drinks per day by location, relative risks by outcome and dose, along with results for the weighted all-cause relative risk curve used to justify TMREL within the study. Based on sources mentioned in Appendix I.
From Abstract in linked paper:
Background Alcohol use is a leading risk factor for death and disability, but its overall association with health remains complex given the possible protective effects of moderate alcohol consumption on some conditions. With our comprehensive approach to health accounting within the Global Burden of Diseases, Injuries, and Risk Factors Study 2016, we generated improved estimates of alcohol use and alcohol-attributable deaths and disability-adjusted life-years (DALYs) for 195 locations from 1990 to 2016, for both sexes and for 5-year age groups between the ages of 15 years and 95 years and older.
Methods Using 694 data sources of individual and population-level alcohol consumption, along with 592 prospective and retrospective studies on the risk of alcohol use, we produced estimates of the prevalence of current drinking, abstention, the distribution of alcohol consumption among current drinkers in standard drinks daily (defined as 10 g of pure ethyl alcohol), and alcohol-attributable deaths and DALYs. We made several methodological improvements compared with previous estimates: first, we adjusted alcohol sales estimates to take into account tourist and unrecorded consumption; second, we did a new meta-analysis of relative risks for 23 health outcomes associated with alcohol use; and third, we developed a new method to quantify the level of alcohol consumption that minimises the overall risk to individual health
https://www.usa.gov/government-works
View annual counts of Accidental or Undetermined overdose deaths for 2012 forward, including provisional estimates of annual counts of overdose deaths for recent years, as noted with an asterisk and the month the data was pulled.
NOTE: Finalized death records for overdose deaths are often delayed by 3-6 months. Counties labeled “no value” have data suppressed because the counts are between 1 and 9. Dataset includes overdose deaths where the Manner of Death is Accidental or Undetermined.
County complement counts file located here - https://data.pa.gov/Opioid-Related/Estimated-Accidental-and-Undetermined-Drug-Overdos/azzc-q64m
Overdose Deaths are classified using the International Classification of Diseases, Tenth Revision (ICD–10). Accidental and Undetermined drug overdose deaths are identified using underlying cause-of-death codes X40–X44, and Y10–Y14, and include
- R99 when the Injury Description indicates an overdose death.
- X49 when literal COD is Mixed or Combined or Multiple Substance Toxicity, as these are likely drug overdoses
- X47 when substance indicated is difluoroethane, alone or in combination with other drugs
Source: Pennsylvania Prescription Drug Monitoring Program *
* These data were supplied by the Bureau of Health Statistics and Registries, Harrisburg, Pennsylvania. The Bureau of Health Statistics and Registries specifically disclaims responsibility for any analyses, interpretations or conclusions.
Estimates are broken down by type of drugs involved in the overdose
- Any Drug Overdose Death - all drug overdose deaths, regardless of type of drug involved, excluding alcohol only deaths
- Opioid Overdose Death - any overdose death involving opioids, prescription or illegal
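The inclusion rules and the suppression rule above can be written as a short classifier. This is a sketch under assumptions: the function names and keyword flags are hypothetical, codes are assumed to arrive as three-character ICD-10 categories, and the flags stand in for whatever free-text review of the death record the publisher actually performs.

```python
# Codes that always count: X40-X44 and Y10-Y14.
CORE_CODES = {f"X{n}" for n in range(40, 45)} | {f"Y{n}" for n in range(10, 15)}


def is_counted_overdose(
    code,
    *,
    description_indicates_overdose=False,
    mixed_substance_toxicity=False,
    involves_difluoroethane=False,
):
    """Return True if a death with this underlying-cause code is counted."""
    if code in CORE_CODES:
        return True
    if code == "R99":
        return description_indicates_overdose
    if code == "X49":
        return mixed_substance_toxicity
    if code == "X47":
        return involves_difluoroethane
    return False


def display_count(count):
    """Counts between 1 and 9 are suppressed and shown as 'no value'."""
    return "no value" if 1 <= count <= 9 else count


print(is_counted_overdose("X42"))
print(is_counted_overdose("R99"))
print(is_counted_overdose("R99", description_indicates_overdose=True))
print(display_count(7), display_count(0), display_count(12))
```

Keeping the conditional codes behind explicit flags makes it visible that R99, X49 and X47 count only when the extra condition from the record is met.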
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Cardiovascular diseases (CVDs) are the number 1 cause of death globally, taking an estimated 17.9 million lives each year, which accounts for 31% of all deaths worldwide. Heart failure is a common event caused by CVDs, and this dataset contains 12 features that can be used to predict mortality by heart failure. Most cardiovascular diseases can be prevented by addressing behavioural risk factors such as tobacco use, unhealthy diet and obesity, physical inactivity and harmful use of alcohol using population-wide strategies. People with cardiovascular disease or who are at high cardiovascular risk (due to the presence of one or more risk factors such as hypertension, diabetes, hyperlipidaemia or already established disease) need early detection and management, where a machine learning model can be of great help.
Open Government Licence 3.0 http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/
License information was derived automatically
This dataset presents the age-standardised mortality rate from drug misuse across the population. It captures deaths where the underlying cause is linked to mental and behavioural disorders due to psychoactive substance use (excluding alcohol, tobacco, and volatile solvents), as well as deaths involving poisoning by controlled drugs. The data is sourced from the Office for National Statistics (ONS) and is intended to support public health monitoring and policy development aimed at reducing drug-related harm.
Rationale: The indicator is designed to track and reduce the mortality rate from drug misuse. Monitoring these deaths helps inform public health strategies, resource allocation, and interventions aimed at preventing drug-related harm and supporting individuals with substance use disorders.
Numerator: The numerator includes deaths where the underlying cause is coded to specific categories of mental and behavioural disorders due to psychoactive substance use (excluding alcohol, tobacco, and volatile solvents), as well as deaths involving poisoning by drugs controlled under the Misuse of Drugs Act 1971. These include accidental, intentional, undetermined, and assault-related poisonings, as well as disorders due to volatile solvents.
Denominator: The denominator is the total population of the relevant age group, as recorded in the 2021 Census.
Caveats: There are limitations in the classification and reporting of drug-related deaths, including potential underreporting or misclassification in death records. The indicator may not capture all deaths indirectly related to drug misuse, and changes in coding practices or legal definitions over time may affect comparability.
External references: Public Health England - Fingertips: Deaths from drug misuse
Localities Explained
This dataset contains data based on either the resident locality or the registered locality of the patient. A distinction is made between resident locality and registered locality populations:
Resident Locality refers to individuals who live within the defined geographic boundaries of the locality. These boundaries are aligned with official administrative areas such as wards and Lower Layer Super Output Areas (LSOAs).
Registered Locality refers to individuals who are registered with GP practices that are assigned to a locality based on the Primary Care Network (PCN) they belong to. These assignments are approximate: PCNs are mapped to a locality based on the location of most of their GP surgeries. As a result, locality-registered patients may live outside the locality, sometimes even in different towns or cities.
This distinction is important because some health indicators are only available at GP practice level, without information on where patients actually reside. In such cases, data is attributed to the locality based on GP registration, not residential address.
More indicators are available from the Birmingham and Solihull Integrated Care Partnerships Outcome Framework.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Dataset from the Federal State Statistics Service of the Russian Federation (Росстат), collected, translated into English, and published. It covers mortality in Russia by cause of death in 2018 (absolute numbers).
Cause-of-death statistics are obtained from the entries in medical death certificates, filled in by a physician, that refer to the disease, accident, homicide, suicide or other external factor (injuries due to actions envisaged by the law, non-specified injuries, injuries caused by military actions) which led directly to death. These entries are the basis for classifying causes of death in civil registration records of deaths.
Some of the presented causes of death: Cause of death, Cholera, Typhoid fever, Paratyphoid, Salmonella infections, Shigellosis, Food poisoning, Intestinal infections, Tuberculosis, Plague, Anthrax, Brucellosis, Leprosy, Tetanus, Diphtheria, Whooping cough, Scarlet fever, Meningococcal infection, Sepsis, Erysipelas, Other bacterial infections, Syphilis, Sexually transmitted infections, Typhus, Poliomyelitis, Rabies, Viral encephalitis, Measles, Hepatitis A, Human Immunodeficiency Virus (HIV) Disease, Other diseases caused by viruses, Malaria, Leishmaniasis, Trypanosomiasis, Schistosomiasis, Malignant, Leukemia, Neoplasms, Diabetes, Diseases of the endocrine system, eating disorders and metabolic disorders, Mental disorders, Parkinson's disease, Alzheimer's disease, Multiple sclerosis, Hypertension, myocardial infarction, Myocardial infarction, Stroke, Urolithiasis, Birth injury, Intrauterine hypoxia and asphyxia in childbirth, Suicides, Murder, Firearm Accident, Other accidents, Causes of death due to alcohol, Drug-related causes of death, All types of transport accidents, and many more.
CC0 1.0 Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
According to the CDC, heart disease is a leading cause of death for people of most races in the U.S. (African Americans, American Indians and Alaska Natives, and whites). About half of all Americans (47%) have at least 1 of 3 major risk factors for heart disease: high blood pressure, high cholesterol, and smoking. Other key indicators include diabetes status, obesity (high BMI), not getting enough physical activity, or drinking too much alcohol. Identifying and preventing the factors that have the greatest impact on heart disease is very important in healthcare. In turn, developments in computing allow the application of machine learning methods to detect "patterns" in the data that can predict a patient's condition.
The dataset originally comes from the CDC and is a major part of the Behavioral Risk Factor Surveillance System (BRFSS), which conducts annual telephone surveys to collect data on the health status of U.S. residents. As described by the CDC: "Established in 1984 with 15 states, BRFSS now collects data in all 50 states, the District of Columbia, and three U.S. territories. BRFSS completes more than 400,000 adult interviews each year, making it the largest continuously conducted health survey system in the world." The most recent dataset includes data from 2023. In this dataset, I noticed many factors (questions) that directly or indirectly influence heart disease, so I decided to select the most relevant variables from it. I also decided to share two versions of the most recent dataset: one with NaNs and one without them.
As described above, the original dataset of nearly 300 variables was reduced to 40 variables. In addition to classical EDA, this dataset can be used to apply a number of machine learning methods, especially classifier models (logistic regression, SVM, random forest, etc.). You should treat the variable "HadHeartAttack" as binary ("Yes" - respondent had heart disease; "No" - respondent did not have heart disease). Note, however, that the classes are unbalanced, so fitting a model with its default settings is not advisable. Adjusting the class weights or undersampling the majority class should yield much better results. Based on the dataset, I built a logistic regression model and embedded it in an application that might inspire you: https://share.streamlit.io/kamilpytlak/heart-condition-checker/main/app.py. Can you indicate which variables have a significant effect on the likelihood of heart disease?
Check out this notebook in my GitHub repository: https://github.com/kamilpytlak/data-science-projects/blob/main/heart-disease-prediction/2022/notebooks/data_processing.ipynb
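The class imbalance mentioned above can be handled in more than one way; the following is a minimal sketch of two common options, not the author's actual code. It assumes rows are plain dictionaries keyed by column name, uses made-up toy labels rather than the real data, and the helper names `balanced_class_weights` and `undersample` are hypothetical. The weighting formula, n_samples / (n_classes * class_count), is the same heuristic scikit-learn applies for `class_weight="balanced"`.

```python
import random
from collections import Counter


def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {cls: n_samples / (n_classes * cnt) for cls, cnt in counts.items()}


def undersample(rows, label_key, seed=0):
    """Randomly keep rows so every class is as large as the smallest class."""
    rng = random.Random(seed)
    by_class = {}
    for row in rows:
        by_class.setdefault(row[label_key], []).append(row)
    target = min(len(group) for group in by_class.values())
    balanced = []
    for group in by_class.values():
        balanced.extend(rng.sample(group, target))
    rng.shuffle(balanced)
    return balanced


# Toy, made-up labels (95 "No", 5 "Yes"); not taken from the real dataset.
toy_rows = [{"HadHeartAttack": "No"}] * 95 + [{"HadHeartAttack": "Yes"}] * 5
toy_labels = [row["HadHeartAttack"] for row in toy_rows]

weights = balanced_class_weights(toy_labels)
balanced_rows = undersample(toy_rows, "HadHeartAttack")
```

With these toy labels the minority class receives a weight of 10 and the majority class a weight of about 0.53, while undersampling keeps 5 rows of each class. Weighting keeps all the data; undersampling discards majority rows, which is simpler but loses information.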
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
Video on risk factors of lung cancer: https://youtu.be/0vVRp5eNDlA?feature=shared
Dataset columns:
1. GENDER: Gender of the individual (M: Male, F: Female)
2. AGE: Age of the individual
3. SMOKING: Smoking status (2: Yes, 1: No)
4. YELLOW_FINGERS: Presence of yellow fingers (2: Yes, 1: No)
5. ANXIETY: Anxiety level (2: High, 1: Low)
6. PEER_PRESSURE: Peer pressure level (2: High, 1: Low)
7. CHRONIC DISEASE: Presence of chronic disease (2: Yes, 1: No)
8. FATIGUE: Fatigue level (2: High, 1: Low)
9. ALLERGY: Allergy status (2: Yes, 1: No)
10. WHEEZING: Wheezing condition (2: Yes, 1: No)
11. ALCOHOL CONSUMING: Alcohol consumption status (2: Yes, 1: No)
12. COUGHING: Presence of coughing (2: Yes, 1: No)
13. SHORTNESS OF BREATH: Shortness of breath condition (2: Yes, 1: No)
14. SWALLOWING DIFFICULTY: Difficulty in swallowing (2: Yes, 1: No)
15. CHEST PAIN: Presence of chest pain (2: Yes, 1: No)
16. LUNG_CANCER: Lung cancer diagnosis (2: Yes, 1: No)
The data has 309 rows (indexed 0 to 308) and 16 columns of float, integer, and object types.
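Because most columns use a 2/1 coding rather than the more usual 1/0, a recoding step is often convenient before modelling. The sketch below assumes the coding is exactly as listed above (including 2/1 for LUNG_CANCER) and that each record is a plain dictionary; the function name `recode_row` and the sample record are hypothetical, not part of the dataset.

```python
YES_NO = {2: 1, 1: 0}       # 2 (Yes/High) -> 1, 1 (No/Low) -> 0
GENDER = {"M": 1, "F": 0}   # arbitrary choice: male encoded as 1


def recode_row(row):
    """Return a copy of one record with binary columns recoded to 1/0.

    AGE is passed through unchanged. Any value outside the documented
    coding raises KeyError, so unexpected data is not silently recoded.
    """
    out = {}
    for key, value in row.items():
        if key == "AGE":
            out[key] = value
        elif key == "GENDER":
            out[key] = GENDER[value]
        else:
            out[key] = YES_NO[value]
    return out


# Made-up example record covering a few of the 16 columns.
sample = {"GENDER": "F", "AGE": 61, "SMOKING": 2, "LUNG_CANCER": 1}
recoded = recode_row(sample)
```

Raising on unknown values is a deliberate choice here: if the actual file encodes a column differently from the description, the mismatch surfaces immediately.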
Lung cancer is the uncontrollable growth of abnormal cells in one or both of the lungs. Cigarette smoking causes most lung cancers when smoke gets in the lungs. Lung cancer kills 1.8 million people each year, more than any other cancer. It has an 80-90% death rate, and is the leading cause of cancer death in men, and the second leading cause of cancer death in women.
The global cancer burden is estimated to have risen to 18.1 million new cases and 9.6 million deaths in 2018. One in 5 men and one in 6 women worldwide develop cancer during their lifetime, and one in 8 men and one in 11 women die from the disease. Worldwide, the total number of people who are alive within 5 years of a cancer diagnosis, called the 5-year prevalence, is estimated to be 43.8 million.
The project, based at the University of Greenwich, UK and Stellenbosch University, South Africa, aimed to examine epidemiologic transitions by identifying and quantifying the drivers of change in CVD risk in the middle-income country of South Africa compared to the high-income nation of England. The project produced a harmonised dataset of national surveys measuring CVD risk factors in South Africa and England for others to use in future work. The harmonised dataset includes microdata from nationally-representative surveys in South Africa derived from the Demographic and Health Surveys, National Income Dynamics Study, South Africa National Health and Nutrition Examination Survey and Study on Global Ageing and Adult Health, covering 11 cross-sections and approximately 156,000 individuals aged 15+ years, representing South Africa’s adult population from 1998 to 2017.
Data for England come from 17 Health Surveys for England (HSE) over the same time period, covering over 168,000 individuals aged 16+ years, representing England’s adult population.
This study uses existing data to identify drivers of recent health transitions in South Africa compared to England. The global burden of non-communicable diseases (NCDs) on health is increasing. Cardiovascular diseases (CVD) in particular are the leading causes of death globally and often share characteristics with many major NCDs. Namely, they tend to increase with age and are influenced by behavioural factors such as diet, exercise and smoking. Risk factors for CVD are routinely measured in population surveys and thus provide an opportunity to study health transitions. Understanding the drivers of health transitions in countries that have not followed expected paths (eg, South Africa) compared to those that exemplified models of 'epidemiologic transition' (eg, England) can generate knowledge on where resources may best be directed to reduce the burden of disease. In the middle-income country of South Africa, CVD is the second leading cause of death after HIV/AIDS and tuberculosis (TB). Moreover, many of the known risk factors for NCDs like CVD are highly prevalent. Rates of hypertension are high, with recent estimates suggesting that over 40% of adults have high blood pressure. Around 60% of women and 30% of men over 15 are overweight in South Africa. In addition, excessive alcohol consumption, a risk factor for many chronic diseases, is high, with over 30% of men aged 15 and older having engaged in heavy episodic drinking within a 30-day period. Nevertheless, infectious diseases such as HIV/AIDS remain the leading cause of death, though many with HIV/AIDS and TB also have NCDs. In high-income countries like England, by contrast, NCDs such as CVD have been the leading causes of death since the mid-1900s. However, CVD and risk factors such as hypertension have been declining in recent decades due to increased prevention and treatment. 
The major drivers of change in disease burden have been attributed to factors including ageing, improved living standards, urbanisation, lifestyle change, and reduced infectious disease. Together, these changes are often referred to as the epidemiologic transition. However, recent research has questioned whether epidemiologic transition theory accurately describes the experience of many low- and middle-income countries or, in fact, of high-income nations such as England. Furthermore, few studies have empirically tested the relative contributions of demographic, behavioural, health and economic factors to trends in disease burden and risk, particularly on the African continent. In addition, many social and environmental factors are overlooked in this research. To address these gaps, our study will use population measurements of CVD risk derived from surveys in South Africa over nearly 20 years in order to examine whether and to what extent demographic, behavioural, environmental, medical, social and other factors contribute to recent health trends and transitions. We will compare these trends to those occurring in England over the same time period. Thus, this analysis seeks to illuminate the drivers of health transitions in a country which is assumed to still be 'transitioning' to a chronic disease profile but which continues to have a high infectious disease burden (South Africa) as compared to a country which is assumed to have already transitioned following epidemiological transition theory (England). The analysis will employ modelling techniques on pooled cross-sectional data to examine how various factors explain the variation in CVD risk over time in representative population samples from South Africa and England. The results of this analysis may help to identify some of the main contributors to recent changes in CVD risk in South Africa and England. 
Such information can be used to pinpoint potential areas for intervention, such as social policy and services, thereby helping to set priorities for governmental and nongovernmental action to control the CVD epidemic and improve health.
Open Government Licence 3.0: http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/
License information was derived automatically
Potential years of life lost (PYLL) due to alcohol-related conditions, all ages, directly age-standardised per 100,000 population (standardised to the ESP).
Rationale Alcohol consumption is a contributing factor to hospital admissions and deaths from a diverse range of conditions. Alcohol misuse is estimated to cost the NHS about £3.5 billion per year and society as a whole £21 billion annually. The Government has said that everyone has a role to play in reducing the harmful use of alcohol - this indicator is one of the key contributions by the Government (and the Department of Health and Social Care) to promote measurable, evidence-based prevention activities at a local level, and supports the national ambitions to reduce harm set out in the Government's Alcohol Strategy. This ambition is part of the monitoring arrangements for the Responsibility Deal Alcohol Network. Alcohol-related deaths can be reduced through local interventions to reduce alcohol misuse and harm.
Potential years of life lost (PYLL) is a measure of the potential number of years lost when a person dies prematurely. The basic concept of PYLL is that deaths at younger ages are weighted more heavily than those at older ages. The advantage of this weighting is that deaths at younger ages might otherwise be seen as less important if cause-specific death rates alone were used to highlight the burden of disease and injury, since conditions such as cancer and heart disease usually occur at older ages and have relatively high mortality rates.
To enable comparisons between areas and over time, PYLL rates are age-standardised to represent the PYLL if each area had the same population structure as the 2013 European Standard Population (ESP). PYLL rates are presented as years of life lost per 100,000 population.
Definition of numerator The number of age-specific alcohol-related deaths multiplied by the national life expectancy for each age group and summed to give the total potential years of life lost due to alcohol-related conditions.
Definition of denominator ONS Mid-Year Population Estimates aggregated into quinary age bands.
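The numerator and denominator definitions above can be combined into a directly age-standardised rate. The source does not spell out the formula, so the following is a minimal sketch assuming standard direct standardisation: each age band's PYLL rate is weighted by that band's share of the standard population. All figures are made-up toy values for two age bands (not ESP weights or real deaths), and the function names are hypothetical.

```python
def pyll_total(deaths, life_expectancy):
    """Numerator: age-specific deaths times remaining life expectancy, summed."""
    return sum(d * e for d, e in zip(deaths, life_expectancy))


def standardised_pyll_rate(deaths, life_expectancy, population,
                           standard_population, per=100_000):
    """Directly age-standardised PYLL rate per `per` population.

    Each argument is a list with one entry per age band, in the same order.
    """
    weighted = 0.0
    for d, e, n, w in zip(deaths, life_expectancy, population,
                          standard_population):
        weighted += w * (d * e / n)   # standard weight x age-specific PYLL rate
    return per * weighted / sum(standard_population)


# Toy values for two age bands (illustrative only).
deaths = [2, 10]                  # alcohol-related deaths per band
life_expectancy = [40, 10]        # assumed remaining years of life per band
population = [10_000, 5_000]      # local population per band
standard_population = [60_000, 40_000]

total = pyll_total(deaths, life_expectancy)
rate = standardised_pyll_rate(deaths, life_expectancy, population,
                              standard_population)
```

For these toy inputs the total is 180 years of life lost and the standardised rate is about 1,280 per 100,000. In the real indicator the age bands would be the quinary bands of the ONS mid-year estimates and the weights would come from the 2013 ESP.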
Caveats There is the potential for the underlying cause of death to be incorrectly attributed on the death certificate and the cause of death misclassified. Alcohol-attributable fractions were not available for children. Conditions where low levels of alcohol consumption are protective (have a negative alcohol-attributable fraction) are not included in the calculation of the indicator.
The national life expectancies for England have been used for all sub-national geographies to illustrate the disparities in the burden caused by alcohol between local areas and the national average.
The confidence intervals do not take into account the uncertainty involved in the calculation of the alcohol-attributable fractions (AAFs): the proportion of deaths caused by alcohol and the alcohol consumption prevalence included in the AAF formula are only estimates and so carry uncertainty. The confidence intervals published here are based only on the observed number of deaths and do not account for this uncertainty in the attributable fractions, so the intervals may be too narrow.