Open Government Licence 3.0 http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/
License information was derived automatically
Provisional deaths registration data for single year of age and average age of death (median and mean) of persons whose death involved coronavirus (COVID-19), England and Wales. Includes deaths due to COVID-19 and breakdowns by sex.
Open Government Licence - Canada 2.0 https://open.canada.ca/en/open-government-licence-canada
License information was derived automatically
This table contains 2394 series, with data for years 1991 - 1991 (not all combinations necessarily have data for all years). This table contains data described by the following dimensions (not all combinations are available): Geography (1 item: Canada ...), Population group (19 items: Entire cohort; Income adequacy quintile 1 (lowest); Income adequacy quintile 2; Income adequacy quintile 3 ...), Age (14 items: At 25 years; At 30 years; At 40 years; At 35 years ...), Sex (3 items: Both sexes; Females; Males ...), Characteristics (3 items: Life expectancy; High 95% confidence interval, life expectancy; Low 95% confidence interval, life expectancy ...).
This dataset contains the number of deaths and the average age at death for all deaths in a ZIP Code between 2011 and 2015. The data were obtained by special request from Texas Department of State Health Services Vital Statistics.
VITAL SIGNS INDICATOR Life Expectancy (EQ6)
FULL MEASURE NAME Life Expectancy
LAST UPDATED April 2017
DESCRIPTION Life expectancy refers to the average number of years a newborn is expected to live if mortality patterns remain the same. The measure reflects the mortality rate across a population for a point in time.
DATA SOURCE State of California, Department of Health: Death Records (1990-2013) No link
California Department of Finance: Population Estimates Annual Intercensal Population Estimates (1990-2010) Table P-2: County Population by Age (2010-2013) http://www.dof.ca.gov/Forecasting/Demographics/Estimates/
U.S. Census Bureau: Decennial Census ZCTA Population (2000-2010) http://factfinder.census.gov
U.S. Census Bureau: American Community Survey 5-Year Population Estimates (2013) http://factfinder.census.gov
CONTACT INFORMATION vitalsigns.info@mtc.ca.gov
METHODOLOGY NOTES (across all datasets for this indicator) Life expectancy is commonly used as a measure of the health of a population. Life expectancy does not reflect how long any given individual is expected to live; rather, it is an artificial measure that captures an aspect of the mortality rates across a population and that can be compared across time and populations. More information about the determinants of life expectancy that may lead to differences in life expectancy between neighborhoods can be found in the Bay Area Regional Health Inequities Initiative (BARHII) Health Inequities in the Bay Area report at http://www.barhii.org/wp-content/uploads/2015/09/barhii_hiba.pdf. Vital Signs measures life expectancy at birth (as opposed to cohort life expectancy). A statistical model was used to estimate life expectancy for Bay Area counties and ZIP Codes based on current life tables, which require both age and mortality data. A life table is a table that shows, for each age, the survivorship of people from a certain population.
Current life tables were created using death records and population estimates by age. The California Department of Public Health provided death records based on the California death certificate information. Records include age at death and residential ZIP Code. Single-year age population estimates at the regional and county levels come from the California Department of Finance population estimates and projections for ages 0-100+. Population estimates for ages 100 and over are aggregated to a single age interval. Using these data, death rates in a population within age groups for a given year are computed to form unabridged life tables (as opposed to abridged life tables). To calculate life expectancy, the probability of dying between the jth and (j+1)st birthday is assumed uniform after age 1. Special consideration is taken to account for infant mortality.
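The life-table step described above can be illustrated with a minimal sketch. It assumes single-year death rates are already available, treats deaths as uniformly spread within each year of age after infancy, and closes the table at the open-ended final interval. The function name, the infant fraction `a0`, and the sample rates are hypothetical; this is not the Vital Signs implementation.

```python
def life_expectancy_at_birth(death_rates, a0=0.1):
    """Period life expectancy at birth from single-year death rates.

    death_rates[x] is deaths per person-year at age x; the last entry is the
    open-ended interval (for example 100+). a0 is the assumed average fraction
    of the first year lived by infants who die (a hypothetical default).
    """
    if not death_rates or death_rates[-1] <= 0:
        raise ValueError("need a positive death rate for the open-ended interval")
    survivors = 1.0      # l_x, scaled so that the table starts with one person
    person_years = 0.0   # running sum of L_x
    last = len(death_rates) - 1
    for age, m in enumerate(death_rates):
        if age == last:
            # Open-ended interval: everyone still alive dies in it.
            person_years += survivors / m
            break
        # Deaths are assumed uniform within the year except in infancy.
        a = a0 if age == 0 else 0.5
        q = min(m / (1.0 + (1.0 - a) * m), 1.0)  # probability of dying at this age
        deaths = survivors * q
        person_years += survivors - (1.0 - a) * deaths
        survivors -= deaths
    return person_years


# Hypothetical three-interval table: ages 0, 1, and an open-ended 2+ interval.
example = life_expectancy_at_birth([0.0, 0.0, 0.5])
```

With no deaths at ages 0 and 1 and a rate of 0.5 in the open interval, the sketch returns one year for each of the first two ages plus 1/0.5 years for the last interval.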
For the ZIP Code-level life expectancy calculation, it is assumed that postal ZIP Codes share the same boundaries as ZIP Code Census Tabulation Areas (ZCTAs). More information on the relationship between ZIP Codes and ZCTAs can be found at http://www.census.gov/geo/reference/zctas.html. ZIP Code-level data use three years of mortality data to make robust estimates, because single-year samples are small. Year 2013 ZIP Code life expectancy estimates reflect death records from 2011 through 2013. 2013 is the last year with available mortality data. Death records for ZIP Codes with zero population (like those associated with P.O. Boxes) were assigned to the nearest ZIP Code with population. ZIP Code population for 2000 estimates comes from the Decennial Census. ZIP Code population for 2013 estimates is from the American Community Survey (5-Year Average). ACS estimates are adjusted using Decennial Census data for more accurate population estimates. An adjustment factor was calculated using the ratio between the 2010 Decennial Census population estimates and the 2012 ACS 5-Year (with middle year 2010) population estimates. This adjustment factor is particularly important for ZCTAs with a large homeless population (not living in group quarters), where the ACS may underestimate the ZCTA population and therefore underestimate the life expectancy. The ACS provides ZIP Code population by age in five-year age intervals. Single-year age population estimates were calculated by distributing population within an age interval to single-year ages using the county distribution. Counties were assigned to ZIP Codes based on majority land area.
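The two population adjustments described above can be sketched as follows. This is an illustration under stated assumptions: the text does not say whether the adjustment factor is applied per age group or to the total, so the example applies it to one age group; the function names, the even-split fallback, and all numbers are hypothetical.

```python
def adjustment_factor(census_2010_pop, acs_2012_5yr_pop):
    """Ratio of the 2010 Decennial Census count to the 2012 ACS 5-Year estimate."""
    if acs_2012_5yr_pop <= 0:
        raise ValueError("ACS population must be positive")
    return census_2010_pop / acs_2012_5yr_pop


def split_to_single_years(group_pop, county_single_year_pops):
    """Distribute one age group's population across its single-year ages.

    county_single_year_pops holds the county population at each single-year
    age in the group; the ZIP Code total is shared out in the same proportions.
    """
    county_total = sum(county_single_year_pops)
    if county_total <= 0:
        # No county information: fall back to an even split (an assumption).
        n = len(county_single_year_pops)
        return [group_pop / n] * n
    return [group_pop * p / county_total for p in county_single_year_pops]


# Hypothetical ZCTA: the Census counted 1,100 people, the ACS estimated 1,000.
factor = adjustment_factor(1100, 1000)
adjusted_25_29 = 200 * factor  # hypothetical ACS estimate for ages 25-29, adjusted
single_years = split_to_single_years(adjusted_25_29, [10, 20, 30, 20, 20])
```

The single-year values sum back to the adjusted five-year total, so the split changes only how the population is distributed within the interval.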
ZIP Codes in the Bay Area vary in population from over 10,000 residents to fewer than 20 residents. Traditional life expectancy estimation (like the one used for the regional- and county-level Vital Signs estimates) cannot be used because it is highly inaccurate for small populations and may result in over- or underestimation of life expectancy. To avoid inaccurate estimates, ZIP Codes with populations of fewer than 5,000 were aggregated with neighboring ZIP Codes until the merged areas had a population of more than 5,000. ZIP Code 94103, representing Treasure Island, was dropped from the dataset due to its small population and having no bordering ZIP Codes. In this way, the original 305 Bay Area ZIP Codes were reduced to 217 ZIP Code areas for 2013 estimates. Next, a form of Bayesian random-effects analysis was used, which established a prior distribution of the probability of death at each age using the regional distribution. This prior is used to shore up the life expectancy calculations where data were sparse.
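The aggregation and prior-based steps above can be sketched in a simplified form. The text does not specify the merge order or the exact Bayesian model, so both functions below are stand-ins: a greedy merge that joins each small area to its least-populous adjacent group, and a Beta-Binomial posterior mean centred on the regional probability of death. All names, the tie-break rule, and `prior_strength` are hypothetical.

```python
def merge_small_areas(populations, neighbors, threshold=5000):
    """Greedily merge areas below `threshold` with an adjacent area.

    populations maps area id -> population; neighbors maps area id -> iterable
    of adjacent area ids. Areas with no neighbors are left unmerged.
    Returns a list of sets of area ids.
    """
    group_of = {area: area for area in populations}   # area -> group label
    members = {area: {area} for area in populations}  # group label -> areas
    pop = dict(populations)                           # group label -> population
    while True:
        small = sorted((g for g in members if pop[g] < threshold),
                       key=lambda g: pop[g])
        for g in small:
            adjacent = {group_of[n] for a in members[g]
                        for n in neighbors.get(a, ())}
            adjacent.discard(g)
            if not adjacent:
                continue
            # Join the least-populous adjacent group (an arbitrary rule).
            target = min(adjacent, key=lambda h: (pop[h], str(h)))
            members[target] |= members.pop(g)
            pop[target] += pop.pop(g)
            for a in members[target]:
                group_of[a] = target
            break
        else:
            # No further merge was possible.
            return list(members.values())


def shrunk_death_probability(deaths, population, regional_q, prior_strength=1000.0):
    """Posterior mean of the probability of death under a Beta prior.

    The prior is centred on regional_q and carries the weight of
    prior_strength pseudo-observations (a hypothetical tuning value).
    """
    alpha = regional_q * prior_strength
    beta = (1.0 - regional_q) * prior_strength
    return (alpha + deaths) / (alpha + beta + population)


areas = merge_small_areas(
    {"A": 3000, "B": 2500, "C": 9000, "D": 100},
    {"A": ["B", "C"], "B": ["A"], "C": ["A"], "D": []},
)
```

In the shrinkage function, an area with little data stays close to the regional probability, while an area with a large population is dominated by its own observed deaths.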
Note: This dataset is historical only and there are not corresponding datasets for more recent time periods. For that more-recent information, please visit the Chicago Health Atlas at https://chicagohealthatlas.org.
This dataset gives the average life expectancy and corresponding confidence intervals for each Chicago community area for the years 1990, 2000 and 2010. See the full description at: https://data.cityofchicago.org/api/views/qjr3-bm53/files/AAu4x8SCRz_bnQb8SVUyAXdd913TMObSYj6V40cR6p8?download=true&filename=P:\EPI\OEPHI\MATERIALS\REFERENCES\Life Expectancy\Dataset description - LE by community area.pdf
VITAL SIGNS INDICATOR Life Expectancy (EQ6)
FULL MEASURE NAME Life Expectancy
LAST UPDATED April 2017
DESCRIPTION Life expectancy refers to the average number of years a newborn is expected to live if mortality patterns remain the same. The measure reflects the mortality rate across a population for a point in time.
DATA SOURCE State of California, Department of Health: Death Records (1990-2013) No link
California Department of Finance: Population Estimates Annual Intercensal Population Estimates (1990-2010) Table P-2: County Population by Age (2010-2013) http://www.dof.ca.gov/Forecasting/Demographics/Estimates/
CONTACT INFORMATION vitalsigns.info@mtc.ca.gov
METHODOLOGY NOTES (across all datasets for this indicator) Life expectancy is commonly used as a measure of the health of a population. Life expectancy does not reflect how long any given individual is expected to live; rather, it is an artificial measure that captures an aspect of the mortality rates across a population. Vital Signs measures life expectancy at birth (as opposed to cohort life expectancy). A statistical model was used to estimate life expectancy for Bay Area counties and Zip codes based on current life tables, which require both age and mortality data. A life table is a table that shows, for each age, the survivorship of people from a certain population.
Current life tables were created using death records and population estimates by age. The California Department of Public Health provided death records based on the California death certificate information. Records include age at death and residential Zip code. Single-year age population estimates at the regional and county levels come from the California Department of Finance population estimates and projections for ages 0-100+. Population estimates for ages 100 and over are aggregated to a single age interval. Using these data, death rates in a population within age groups for a given year are computed to form unabridged life tables (as opposed to abridged life tables). To calculate life expectancy, the probability of dying between the jth and (j+1)st birthday is assumed uniform after age 1. Special consideration is taken to account for infant mortality.
For the Zip code-level life expectancy calculation, it is assumed that postal Zip codes share the same boundaries as Zip Code Census Tabulation Areas (ZCTAs). More information on the relationship between Zip codes and ZCTAs can be found at https://www.census.gov/geo/reference/zctas.html. Zip code-level data use three years of mortality data to make robust estimates, because single-year samples are small. Year 2013 Zip code life expectancy estimates reflect death records from 2011 through 2013. 2013 is the last year with available mortality data. Death records for Zip codes with zero population (like those associated with P.O. Boxes) were assigned to the nearest Zip code with population. Zip code population for 2000 estimates comes from the Decennial Census. Zip code population for 2013 estimates is from the American Community Survey (5-Year Average). The ACS provides Zip code population by age in five-year age intervals. Single-year age population estimates were calculated by distributing population within an age interval to single-year ages using the county distribution. Counties were assigned to Zip codes based on majority land area.
Zip codes in the Bay Area vary in population from over 10,000 residents to fewer than 20 residents. Traditional life expectancy estimation (like the one used for the regional- and county-level Vital Signs estimates) cannot be used because it is highly inaccurate for small populations and may result in over- or underestimation of life expectancy. To avoid inaccurate estimates, Zip codes with populations of fewer than 5,000 were aggregated with neighboring Zip codes until the merged areas had a population of more than 5,000. In this way, the original 305 Bay Area Zip codes were reduced to 218 Zip code areas for 2013 estimates. Next, a form of Bayesian random-effects analysis was used, which established a prior distribution of the probability of death at each age using the regional distribution. This prior is used to shore up the life expectancy calculations where data were sparse.
Open Government Licence - Canada 2.0 https://open.canada.ca/en/open-government-licence-canada
License information was derived automatically
Life expectancy at birth and at age 65, by sex, on a three-year average basis.
Open Government Licence 3.0 http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/
License information was derived automatically
Mean, median and modal ages at death in the UK and its constituent countries, 2001 to 2003 and 2016 to 2018.
Mortality Rates for Lake County, Illinois. Explanation of field attributes: Average Age of Death – The average age at which people in the given ZIP Code die. Cancer Deaths – Deaths for which cancer was the underlying cause. This is a rate per 100,000. Heart Disease Related Deaths – Deaths for which heart disease was the underlying cause. This is a rate per 100,000. COPD Related Deaths – Deaths for which chronic obstructive pulmonary disease (COPD) was the underlying cause. This is a rate per 100,000.
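As a rough illustration of the two kinds of field described above, the following sketch computes an average age at death and a cause-specific rate per 100,000. The description does not state the denominator, so using the ZIP Code's resident population is an assumption; the function names and sample values are hypothetical.

```python
from statistics import mean


def average_age_at_death(ages_at_death):
    """Arithmetic mean of the ages at death recorded for one ZIP Code."""
    return mean(ages_at_death)


def rate_per_100k(deaths, population):
    """Deaths per 100,000 residents (population denominator is an assumption)."""
    if population <= 0:
        raise ValueError("population must be positive")
    return deaths * 100_000 / population


# Hypothetical ZIP Code with 50,000 residents and 25 cancer deaths.
cancer_rate = rate_per_100k(25, 50_000)
avg_age = average_age_at_death([70, 80, 90])
```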
https://creativecommons.org/publicdomain/zero/1.0/
I am developing my data science skills in areas outside of my previous work. An interesting problem for me was to identify which factors influence life expectancy on a national level. There is an existing Kaggle data set that explored this, but that information was corrupted. Part of the problem solving process is to step back periodically and ask "does this make sense?" Without reasonable data, it is harder to notice mistakes in my analysis code (as opposed to unusual behavior due to the data itself). I wanted to make a similar data set, but with reliable information.
This is my first time exploring life expectancy, so I had to guess which features might be of interest when making the data set. Some were included for comparison with the other Kaggle data set. A number of potentially interesting features (like air pollution) were left off due to limited year or country coverage. Since the data was collected from more than one server, some features are present more than once, to explore the differences.
A goal of the World Health Organization (WHO) is to ensure that a billion more people are protected from health emergencies and provided with better health and well-being. The WHO provides public data collected from many sources to identify and monitor factors that are important to reach this goal. This set was primarily made using GHO (Global Health Observatory) and UNESCO (United Nations Educational, Scientific and Cultural Organization) information. The set covers the years 2000-2016 for 183 countries, in a single CSV file. Missing data is left in place, for the user to decide how to deal with it.
Three notebooks are provided for my cursory analysis, a comparison with the other Kaggle set, and a template for creating this data set.
There is a lot to explore, if the user is interested. The GHO server alone has over 2000 "indicators".
- How are the GHO and UNESCO life expectancies calculated, and what is causing the difference? That could also be asked for Gross National Income (GNI) and mortality features.
- How does the life expectancy after age 60 compare to the life expectancy at birth? Is the relationship with the features in this data set different for those two targets?
- What other indicators on the servers might be interesting to use? Some of the GHO indicators are different studies with different coverage. Can they be combined to make a more useful and robust data feature?
- Unraveling the correlations between the features would take significant work.
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
The median age at death is calculated for each municipality in Allegheny County. Data is based on the decedent's residence at the time of death, not the location where the death occurred. Median age by municipality is based on “official” death records that have been released by the Pennsylvania Department of Health. Data is broken out by race (white/black), and also includes a count of deaths for City of Pittsburgh neighborhoods and Allegheny County Municipalities.
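A minimal sketch of the grouping described above, assuming each death record has already been reduced to a (municipality, race, age at death) tuple. The record layout, function name, and sample values are hypothetical and do not reflect the actual file structure.

```python
from collections import defaultdict
from statistics import median


def median_age_by_group(records):
    """Median age at death and death count per (municipality, race).

    records is an iterable of (municipality, race, age_at_death) tuples,
    where municipality is the decedent's place of residence.
    """
    ages = defaultdict(list)
    for municipality, race, age in records:
        ages[(municipality, race)].append(age)
    return {key: (median(values), len(values)) for key, values in ages.items()}


# Hypothetical records for two municipalities.
sample = [
    ("Town X", "white", 60),
    ("Town X", "white", 70),
    ("Town X", "white", 80),
    ("Town Y", "black", 50),
    ("Town Y", "black", 60),
]
summary = median_age_by_group(sample)
```

Returning the count alongside the median makes it easier to spot groups where the median rests on very few deaths.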
Support for Health Equity datasets and tools provided by Amazon Web Services (AWS) through their Health Equity Initiative.
https://www.worldbank.org/en/about/legal/terms-of-use-for-datasets
File Description: "Life Expectancy Data.csv"
This dataset contains 2,938 entries and 22 columns, covering life expectancy and related health indicators for multiple nations from 2000 to 2015. It includes country-wise data and other economic, social, and health metrics.
Column Description:
1. Country – Name of the country.
2. Year – Data year (ranging from 2000 to 2015).
3. Status – Economic classification (Developing/Developed).
4. Life expectancy – Average lifespan in years.
5. Adult Mortality – Probability of death between ages 15-60 per 1,000 individuals.
6. Infant Deaths – Number of infant deaths per 1,000 live births.
7. Alcohol – Per capita alcohol consumption.
8. Percentage Expenditure – Government health expenditure as a percentage of GDP.
9. Hepatitis B – Immunization coverage percentage.
10. Measles – Number of reported measles cases.
11. BMI – Average Body Mass Index.
12. Under-Five Deaths – Mortality rate for children under five.
13. Polio & Diphtheria – Immunization rates.
14. HIV/AIDS – Deaths due to HIV/AIDS per 1,000 individuals.
15. GDP – Gross Domestic Product per capita.
16. Population – Total population of the country.
17. Thinness (1-19 years, 5-9 years) – Percentage of underweight children.
18. Income Composition of Resources – Human development index proxy.
19. Schooling – Average number of years of schooling.
Missing Data: Some columns (like Hepatitis B, GDP, Population, Total Expenditure) contain missing values.
Further File Information:
Total Countries: 193
Years Covered: 2000–2015
Total Entries: 2,938
Missing Data Overview: Some columns have missing values, notably:
Hepatitis B (553 missing)
GDP (448 missing)
Population (652 missing)
Total expenditure (226 missing)
Income Composition of Resources (167 missing)
Schooling (163 missing)
Summary Statistics:
Life Expectancy: Range: 36.3 to 89 years; Mean: 69.2 years
Adult Mortality: Mean: 165 per 1,000; Max: 723 per 1,000
GDP per Capita: Mean: $7,483; Max: $119,172
Population: Mean: ~12.75 million; Max: 1.29 billion
Education: Schooling Average: 12 years; Max: 20.7 years
Future scope of this data: For comparative analysis of the 2000–2015 life expectancy dataset with new datasets on the same parameters, you can perform several statistical tests and analytical methods, depending on the research question. Below are some key tests and approaches:
Open Government Licence - Canada 2.0 https://open.canada.ca/en/open-government-licence-canada
License information was derived automatically
This table contains 2754 series, with data for years 2005/2007 - 2012/2014 (not all combinations necessarily have data for all years). This table contains data described by the following dimensions (Not all combinations are available): Geography (153 items: Canada; Newfoundland and Labrador; Eastern Regional Integrated Health Authority, Newfoundland and Labrador; Central Regional Integrated Health Authority, Newfoundland and Labrador; ...); Age group (2 items: At birth; At age 65); Sex (3 items: Both sexes; Males; Females); Characteristics (3 items: Life expectancy; Low 95% confidence interval, life expectancy; High 95% confidence interval, life expectancy).
Average age at death in the Basque Country by province, major cause-of-death group (ICD-9; CIE-9 in Spanish) and sex (1996 to 1998)
This dataset contains counts of deaths for California counties based on information entered on death certificates. Final counts are derived from static data and include out-of-state deaths to California residents, whereas provisional counts are derived from incomplete and dynamic data. Provisional counts are based on the records available when the data was retrieved and may not represent all deaths that occurred during the time period. Deaths involving injuries from external or environmental forces, such as accidents, homicide and suicide, often require additional investigation that tends to delay certification of the cause and manner of death. This can result in significant under-reporting of these deaths in provisional data.
The final data tables include both deaths that occurred in each California county regardless of the place of residence (by occurrence) and deaths to residents of each California county (by residence), whereas the provisional data table only includes deaths that occurred in each county regardless of the place of residence (by occurrence). The data are reported as totals, as well as stratified by age, gender, race-ethnicity, and death place type. Deaths due to all causes (ALL) and selected underlying cause of death categories are provided. See temporal coverage for more information on which combinations are available for which years.
The cause of death categories are based solely on the underlying cause of death as coded by the International Classification of Diseases. The underlying cause of death is defined by the World Health Organization (WHO) as "the disease or injury which initiated the train of events leading directly to death, or the circumstances of the accident or violence which produced the fatal injury." It is a single value assigned to each death based on the details as entered on the death certificate. When more than one cause is listed, the order in which they are listed can affect which cause is coded as the underlying cause. This means that similar events could be coded with different underlying causes of death depending on variations in how they were entered. Consequently, while underlying cause of death provides a convenient comparison between cause of death categories, it may not capture the full impact of each cause of death as it does not always take into account all conditions contributing to the death.
Attribution-NonCommercial-ShareAlike 3.0 (CC BY-NC-SA 3.0) https://creativecommons.org/licenses/by-nc-sa/3.0/
License information was derived automatically
This dataset, released February 2021, contains: median age at death of males, 2014 to 2018; median age at death of females, 2014 to 2018; median age at death of persons, 2014 to 2018.
Open Government Licence 3.0 http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/
License information was derived automatically
Annual data on death registrations by single year of age for the UK (1974 onwards) and England and Wales (1963 onwards).
Note: This dataset is historical only and there are not corresponding datasets for more recent time periods. For that more-recent information, please visit the Chicago Health Atlas at https://chicagohealthatlas.org.
This dataset gives the average life expectancy and corresponding confidence intervals for sex and racial-ethnic groups in Chicago for the years 1990, 2000 and 2010. See the full description at: https://data.cityofchicago.org/api/views/3qdj-cqb8/files/pJ3PVVyubnsS2SpGO5P5IOPtNgCJZTE3LNOeLagC3mw?download=true&filename=P:\EPI\OEPHI\MATERIALS\REFERENCES\Life Expectancy\Dataset description_LE_ Sex_Race_Ethnicity.pdf
Attribution-NonCommercial-ShareAlike 3.0 (CC BY-NC-SA 3.0) https://creativecommons.org/licenses/by-nc-sa/3.0/
License information was derived automatically
Effect of suicide rates on life expectancy dataset
Abstract
In 2015, approximately 55 million people died worldwide, of which 8 million committed suicide. In the USA, suicide is one of the main causes of death; this experiment therefore asks how much suicide rates affect average life expectancy statistics. The experiment takes two datasets, one with the number of suicides and the other with life expectancy, and combines them into one dataset. I then look for patterns and correlations among the variables and perform a statistical test using simple regression to confirm my assumptions.
Data
The experiment uses two datasets, WHO Suicide Statistics[1] and WHO Life Expectancy[2], which were first preprocessed. The final merged dataset for the experiment has 13 variables, with country and year used as the index: Country; Year; Suicides number; Life expectancy; Adult Mortality (probability of dying between 15 and 60 years per 1,000 population); Infant deaths (number of infant deaths per 1,000 population); Alcohol (recorded per capita (15+) consumption); Under-five deaths (number of under-five deaths per 1,000 population); HIV/AIDS (HIV/AIDS deaths per 1,000 live births); GDP (Gross Domestic Product per capita); Population; Income composition of resources (Human Development Index in terms of income composition of resources); and Schooling (number of years of schooling).
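The merge on country and year described above could look roughly like the sketch below. The original preprocessing and join type are not documented here, so this assumes an inner join on the (Country, Year) pair using plain dictionaries; the function name and the sample rows are hypothetical.

```python
def merge_on_country_year(suicide_rows, life_rows):
    """Inner-join two lists of row dicts on the (Country, Year) pair.

    Rows without a match in the other dataset are dropped (an assumption;
    the original experiment may have handled them differently).
    """
    life_index = {(row["Country"], row["Year"]): row for row in life_rows}
    merged = []
    for row in suicide_rows:
        key = (row["Country"], row["Year"])
        if key in life_index:
            # Combine the columns of both rows into one record.
            merged.append({**life_index[key], **row})
    return merged


# Hypothetical rows, not taken from either WHO dataset.
suicides = [
    {"Country": "Aland", "Year": 2010, "Suicides number": 12},
    {"Country": "Borduria", "Year": 2010, "Suicides number": 30},
]
life = [
    {"Country": "Aland", "Year": 2010, "Life expectancy": 80.1},
    {"Country": "Aland", "Year": 2011, "Life expectancy": 80.3},
]
combined = merge_on_country_year(suicides, life)
```

Only country-year pairs present in both inputs survive, which keeps the merged table free of rows with a missing target variable.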
LICENSE
The experiment uses two datasets, WHO Suicide Statistics and WHO Life Expectancy, which were collected from the WHO and United Nations websites. Therefore, all datasets are under the Attribution-NonCommercial-ShareAlike 3.0 IGO license (https://creativecommons.org/licenses/by-nc-sa/3.0/igo/).
Number of deaths and mortality rates, by age group, sex, and place of residence, 1991 to most recent year.