https://www.worldbank.org/en/about/legal/terms-of-use-for-datasets
File Description: "Life Expectancy Data.csv"
This dataset contains 2,938 entries and 22 columns, covering life expectancy and related health indicators for multiple nations from 2000 to 2015. It includes country-wise data and other economic, social, and health metrics.

Column Description:
1. Country – Name of the country.
2. Year – Data year (ranging from 2000 to 2015).
3. Status – Economic classification (Developing/Developed).
4. Life expectancy – Average lifespan in years.
5. Adult Mortality – Probability of death between ages 15-60 per 1,000 individuals.
6. Infant Deaths – Number of infant deaths per 1,000 live births.
7. Alcohol – Per capita alcohol consumption.
8. Percentage Expenditure – Government health expenditure as a percentage of GDP.
9. Hepatitis B – Immunization coverage percentage.
10. Measles – Number of reported measles cases.
11. BMI – Average Body Mass Index.
12. Under-Five Deaths – Mortality rate for children under five.
13. Polio & Diphtheria – Immunization rates.
14. HIV/AIDS – Deaths due to HIV/AIDS per 1,000 individuals.
15. GDP – Gross Domestic Product per capita.
16. Population – Total population of the country.
17. Thinness (1-19 years, 5-9 years) – Percentage of underweight children.
18. Income Composition of Resources – Human development index proxy.
19. Schooling – Average number of years of schooling.

Missing Data: Some columns (like Hepatitis B, GDP, Population, Total Expenditure) contain missing values.

Further File Information:
Total Countries: 193
Years Covered: 2000–2015
Total Entries: 2,938

Missing Data Overview: Some columns have missing values, notably:
Hepatitis B (553 missing)
GDP (448 missing)
Population (652 missing)
Total expenditure (226 missing)
Income Composition of Resources (167 missing)
Schooling (163 missing)

Summary Statistics:
Life Expectancy: Range: 36.3 to 89 years; Mean: 69.2 years
Adult Mortality: Mean: 165 per 1,000; Max: 723 per 1,000
GDP per Capita: Mean: $7,483; Max: $119,172
Population: Mean: ~12.75 million; Max: 1.29 billion
Education: Schooling Average: 12 years; Max: 20.7 years
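As a rough illustration of how a missing-data overview like the one above could be reproduced, the sketch below counts empty cells per column using only the Python standard library. The sample rows, the country names in them, and the set of tokens treated as "missing" are invented for the example; the real file may encode missing values differently.

```python
import csv
import io
from collections import Counter


def count_missing(rows, missing_tokens=("", "NA", "NaN")):
    """Count, per column, how many rows hold a value treated as missing."""
    counts = Counter()
    for row in rows:
        for column, value in row.items():
            if value is None or value.strip() in missing_tokens:
                counts[column] += 1
    return dict(counts)


# Tiny made-up sample standing in for "Life Expectancy Data.csv".
SAMPLE = """Country,Year,GDP,Population
Aland,2000,,1000
Aland,2001,512.3,
Borduria,2000,,
"""

missing = count_missing(csv.DictReader(io.StringIO(SAMPLE)))
print(missing)
```

To run this against the actual file, the `io.StringIO(SAMPLE)` argument would be replaced by an opened file handle, for example `open("Life Expectancy Data.csv", newline="")`.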
Future scope of this data: For comparative analysis of the 2000–2015 life expectancy dataset against newer datasets with the same parameters, several statistical tests and analytical methods can be applied, depending on the research question. Below are some key tests and approaches:
https://cubig.ai/store/terms-of-service
1) Data Introduction • The Life Expectancy (WHO) Dataset is a WHO-based national health dataset that tabulates life expectancy, vaccination, mortality, economy, and society in 193 countries around the world from 2000 to 2015.
2) Data Utilization
(1) The Life Expectancy (WHO) Dataset has the following characteristics:
• Each row contains more than 20 health, economic, and social variables plus the target variable (life expectancy), including country, year, vaccination rates (e.g., hepatitis B, polio, diphtheria), infant and adult mortality, GDP, population, education level, drinking and smoking.
• Although some values are missing, the data are well structured for analyzing health levels and their influencing factors by country, because they cover many countries as a time series.
(2) The Life Expectancy (WHO) Dataset can be used for:
• Analysis of factors affecting life expectancy: The effects of factors such as vaccination, mortality, and economic and social variables on life expectancy can be assessed using statistical methods such as regression analysis.
• Health policy and international comparative study: The national and annual health indicators can support international health research, such as evaluating the effectiveness of health policies, analyzing health gaps, and establishing strategies to support low-income countries.
Many past studies have examined factors affecting life expectancy using demographic variables, income composition and mortality rates. However, the effect of immunization and the human development index was not taken into account. In addition, some past research used multiple linear regression on a data set covering a single year for all countries. This motivates addressing both gaps by formulating a regression model based on a mixed effects model and multiple linear regression, using data from 2000 to 2015 for all countries. Important immunizations such as Hepatitis B, Polio and Diphtheria will also be considered. In a nutshell, this study will focus on immunization factors, mortality factors, economic factors, social factors and other health-related factors. Since the observations in this dataset are organized by country, it will be easier for a country to determine which predicting factor contributes to a lower life expectancy. This will help suggest which area a country should prioritize in order to efficiently improve the life expectancy of its population.
The project relies on the accuracy of the data. The Global Health Observatory (GHO) data repository under the World Health Organization (WHO) keeps track of the health status as well as many other related factors for all countries. The data-sets are made available to the public for the purpose of health data analysis. The data-set related to life expectancy and health factors for 193 countries was collected from the same WHO data repository website, and the corresponding economic data was collected from the United Nations website. Among all categories of health-related factors, only the most representative critical factors were chosen. It has been observed that in the past 15 years there has been a huge development in the health sector, resulting in improvement of human mortality rates, especially in the developing nations, in comparison to the past 30 years. Therefore, in this project we have considered data from 2000-2015 for 193 countries for further analysis. The individual data files have been merged together into a single data-set. Initial visual inspection of the data showed some missing values. As the data-sets were from WHO, we found no evident errors. Missing data was handled in R software by using the Missmap command. The result indicated that most of the missing data was for population, Hepatitis B and GDP. The missing data were from less known countries like Vanuatu, Tonga, Togo, Cabo Verde etc. Finding all data for these countries was difficult and hence it was decided to exclude these countries from the final model data-set. The final merged file (final dataset) consists of 22 columns and 2938 rows, which meant 20 predicting variables. All predicting variables were then divided into several broad categories: immunization related factors, mortality factors, economical factors and social factors.
The data was collected from WHO and United Nations website with the help of Deeksha Russell and Duan Wang.
The data-set aims to answer the following key questions:
1. Do the various predicting factors chosen initially really affect life expectancy? Which predicting variables actually affect life expectancy?
2. Should a country having a lower life expectancy value (<65) increase its healthcare expenditure in order to improve its average lifespan?
3. How do infant and adult mortality rates affect life expectancy?
4. Does life expectancy have a positive or negative correlation with eating habits, lifestyle, exercise, smoking, drinking alcohol etc.?
5. What is the impact of schooling on the lifespan of humans?
6. Does life expectancy have a positive or negative relationship with drinking alcohol?
7. Do densely populated countries tend to have lower life expectancy?
8. What is the impact of immunization coverage on life expectancy?
Open Government Licence - Canada 2.0 https://open.canada.ca/en/open-government-licence-canada
License information was derived automatically
This table contains 2394 series, with data for years 1991 - 1991 (not all combinations necessarily have data for all years). This table contains data described by the following dimensions (not all combinations are available): Geography (1 item: Canada ...), Population group (19 items: Entire cohort; Income adequacy quintile 1 (lowest); Income adequacy quintile 2; Income adequacy quintile 3 ...), Age (14 items: At 25 years; At 30 years; At 40 years; At 35 years ...), Sex (3 items: Both sexes; Females; Males ...), Characteristics (3 items: Life expectancy; High 95% confidence interval, life expectancy; Low 95% confidence interval, life expectancy ...).
http://data.europa.eu/eli/dec/2011/833/oj
This dataset shows the life expectancy at regional level for 2011.
Life expectancy in the EU, which is a reflection of well-being, is among the highest in the world. Of the 50 countries in the world with the highest life expectancy in 2012, 21 were EU Member States, 18 of which had a higher life expectancy than the US. Differences between regions in the EU are marked. Life expectancy at birth is less than 74 in many parts of Bulgaria as well as in Latvia and Lithuania, while overall across the EU it is over 80 years in two out of every three regions. In 17 regions in Spain, France and Italy, it is 83 years or more.
EU-28 = 80.3. BE, IT, UK: 2010. Source: Eurostat
This dataset contains replication files for "The Association Between Income and Life Expectancy in the United States, 2001-2014" by Augustin Bergeron, Raj Chetty, David Cutler, Benjamin Scuderi, Michael Stepner, and Nicholas Turner. For more information, see https://opportunityinsights.org/paper/lifeexpectancy/. A summary of the related publication follows. How can we reduce socioeconomic disparities in health outcomes? Although it is well known that there are significant differences in health and longevity between income groups, debate remains about the magnitudes and determinants of these differences. We use new data from 1.4 billion anonymous earnings and mortality records to construct more precise estimates of the relationship between income and life expectancy at the national level than was feasible in prior work. We then construct new local area (county and metro area) estimates of life expectancy by income group and identify factors that are associated with higher levels of life expectancy for low-income individuals. Our findings show that disparities in life expectancy are not inevitable. There are cities throughout America — from New York to San Francisco to Birmingham, AL — where gaps in life expectancy are relatively small or are narrowing over time. Replicating these successes more broadly will require targeted local efforts, focusing on improving health behaviors among the poor in cities such as Las Vegas and Detroit. Our findings also imply that federal programs such as Social Security and Medicare are less redistributive than they might appear because low-income individuals obtain these benefits for significantly fewer years than high-income individuals, especially in cities like Detroit. Going forward, the challenge is to understand the mechanisms that lead to better health and longevity for low-income individuals in some parts of the U.S. 
To facilitate future research and monitor local progress, we have posted annual statistics on life expectancy by income group and geographic area (state, CZ, and county) at The Health Inequality Project website. Using these data, researchers will be able to study why certain places have high or improving levels of life expectancy and ultimately apply these lessons to reduce health disparities in other parts of the country.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Time series data for the statistic Life expectancy at age 60, female (years) and country Seychelles. Indicator Definition: Life expectancy at age 60, female is the average number of years that a female at age 60 would live if prevailing patterns of mortality at the time of age 60 were to stay the same throughout her life. The indicator "Life expectancy at age 60, female (years)" stands at 20.88 as of 12/31/2023. The one-year change is an increase of 6.97 percent over the prior year's value. The 3-year change is -9.83 percent, the 5-year change is 1.02 percent, and the 10-year change is 0.7689 percent. The series' long-term average value is 19.46. Its latest available value, on 12/31/2023, is 7.31 percent higher than its long-term average. The series' change from its minimum value, on 12/31/1960, to its latest available value, on 12/31/2023, is +25.12%. The series' change from its maximum value, on 12/31/2020, to its latest available value, on 12/31/2023, is -9.83%.
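The percent-change figures quoted for this series all follow the same arithmetic. The sketch below is only an illustration of that formula, applied to the two values stated in the description (20.88 and 19.46); the function name is hypothetical. Because the published inputs are rounded, the result lands close to, but not exactly on, the reported 7.31 percent.

```python
def pct_change(current, reference):
    """Percent change of `current` relative to `reference`."""
    if reference == 0:
        raise ValueError("reference value must be non-zero")
    return (current - reference) / reference * 100.0


latest_value = 20.88        # value stated for 12/31/2023
long_term_average = 19.46   # long-term average stated in the description

vs_average = pct_change(latest_value, long_term_average)
print(f"latest vs long-term average: {vs_average:.2f}%")
```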
https://creativecommons.org/publicdomain/zero/1.0/
Every year the CDC releases the country’s most detailed report on death in the United States under the National Vital Statistics Systems. This mortality dataset is a record of every death in the country for 2005 through 2015, including detailed information about causes of death and the demographic background of the deceased.
It's been said that "statistics are human beings with the tears wiped off." This is especially true with this dataset. Each death record represents somebody's loved one, often connected with a lifetime of memories and sometimes tragically too short.
Putting the sensitive nature of the topic aside, analyzing mortality data is essential to understanding the complex circumstances of death across the country. The US Government uses this data to determine life expectancy and understand how death in the U.S. differs from the rest of the world. Whether you’re looking for macro trends or analyzing unique circumstances, we challenge you to use this dataset to find your own answers to one of life’s great mysteries.
This dataset is a collection of CSV files each containing one year's worth of data and paired JSON files containing the code mappings, plus an ICD 10 code set. The CSVs were reformatted from their original fixed-width file formats using information extracted from the CDC's PDF manuals using this script. Please note that this process may have introduced errors as the text extracted from the pdf is not a perfect match. If you have any questions or find errors in the preparation process, please leave a note in the forums. We hope to publish additional years of data using this method soon.
A more detailed overview of the data can be found here. You'll find that the fields are consistent within this time window, but some of data codes change every few years. For example, the 113_cause_recode entry 069 only covers ICD codes (I10,I12) in 2005, but by 2015 it covers (I10,I12,I15). When I post data from years prior to 2005, expect some of the fields themselves to change as well.
All data comes from the CDC’s National Vital Statistics Systems, with the exception of the Icd10Code, which are sourced from the World Health Organization.
Attribution-ShareAlike 4.0 (CC BY-SA 4.0) https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
This data set has been generated using data from the Gapminder website, which focuses on gathering and sharing statistics and other information about social, economic and environmental development at local, national and global levels.
This particular data set describes the values of several parameters (see the list below) between 1998 and 2018 for a total of 175 countries, having a total of 3675 rows. The parameters included in the data set and the column name of the dataframe are as follows:
The Health Inequality Project uses big data to measure differences in life expectancy by income across areas and identify strategies to improve health outcomes for low-income Americans.
This table reports life expectancy point estimates and standard errors for men and women at age 40 for each percentile of the national income distribution. Both race-adjusted and unadjusted estimates are reported.
This table reports life expectancy point estimates and standard errors for men and women at age 40 for each percentile of the national income distribution separately by year. Both race-adjusted and unadjusted estimates are reported.
This dataset was created on 2020-01-10 18:53:00.508 by merging multiple datasets together. The source datasets for this version were:
Commuting Zone Life Expectancy Estimates by year: CZ-level by-year life expectancy estimates for men and women, by income quartile
Commuting Zone Life Expectancy: Commuting zone (CZ)-level life expectancy estimates for men and women, by income quartile
Commuting Zone Life Expectancy Trends: CZ-level estimates of trends in life expectancy for men and women, by income quartile
Commuting Zone Characteristics: CZ-level characteristics
Commuting Zone Life Expectancy for larger populations: CZ-level life expectancy estimates for men and women, by income ventile
This table reports life expectancy point estimates and standard errors for men and women at age 40 for each quartile of the national income distribution by state of residence and year. Both race-adjusted and unadjusted estimates are reported.
This table reports US mortality rates by gender, age, year and household income percentile. Household incomes are measured two years prior to the mortality rate for mortality rates at ages 40-63, and at age 61 for mortality rates at ages 64-76. The “lag” variable indicates the number of years between measurement of income and mortality.
Observations with 1 or 2 deaths have been masked: all mortality rates that reflect only 1 or 2 deaths have been recoded to reflect 3 deaths.
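The masking rule can be pictured with a short sketch. This is not the project's own code; the function names and the example population are assumptions made for illustration. It only shows what "recoded to reflect 3 deaths" means for a death count and the rate derived from it.

```python
def mask_small_count(deaths, floor=3):
    """Recode death counts of 1 or 2 to `floor`; leave 0 and larger counts as-is."""
    return floor if 0 < deaths < floor else deaths


def masked_mortality_rate(deaths, population, floor=3):
    """Mortality rate computed from the masked death count."""
    if population <= 0:
        raise ValueError("population must be positive")
    return mask_small_count(deaths, floor) / population


# Hypothetical cell with 2 deaths among 1,500 people.
print(masked_mortality_rate(2, 1500))
```

One consequence worth keeping in mind is that rates in very small cells are biased slightly upward by construction.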
This table reports coefficients and standard errors from regressions of life expectancy estimates for men and women at age 40 for each quartile of the national income distribution on calendar year by commuting zone of residence. Only the slope coefficient, representing the average increase or decrease in life expectancy per year, is reported. Trend estimates for both race-adjusted and unadjusted life expectancies are reported. Estimates are reported for the 100 largest CZs (populations greater than 590,000) only.
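The slope coefficient described here is the ordinary least-squares slope of life expectancy on calendar year. The sketch below shows that calculation in plain Python; the yearly values are invented for the example and the function name is hypothetical, so it illustrates the idea rather than the authors' estimation code.

```python
def trend_slope(years, values):
    """OLS slope of `values` regressed on `years` (change per year)."""
    n = len(years)
    if n < 2 or n != len(values):
        raise ValueError("need two or more paired observations")
    mean_x = sum(years) / n
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in years)
    if sxx == 0:
        raise ValueError("years must not all be identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, values))
    return sxy / sxx


# Made-up life expectancy estimates for one commuting zone and income quartile.
years = [2001, 2002, 2003, 2004, 2005]
life_expectancy = [80.0, 80.2, 80.4, 80.6, 80.8]
slope = trend_slope(years, life_expectancy)
print(f"average change per year: {slope:.3f}")
```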
This table reports life expectancy estimates at age 40 for Males and Females for all countries. Source: World Health Organization, accessed at: http://apps.who.int/gho/athena/
This table reports life expectancy point estimates and standard errors for men and women at age 40 for each quartile of the national income distribution by county of residence. Both race-adjusted and unadjusted estimates are reported. Estimates are reported for counties with populations larger than 25,000 only.
This table reports life expectancy point estimates and standard errors for men and women at age 40 for each quartile of the national income distribution by commuting zone of residence and year. Both race-adjusted and unadjusted estimates are reported. Estimates are reported for the 100 largest CZs (populations greater than 590,000) only.
This table reports US population and death counts by age, year, and sex from various sources. Counts labelled “dm1” are derived from the Social Security Administration Data Master 1 file. Counts labelled “irs” are derived from tax data. Counts labelled “cdc” are derived from NCHS life tables.
This table reports numerous county characteristics, compiled from various sources. These characteristics are described in the county life expectancy table.
Two variables constructed by the Cen
Life Expectancy - This indicator shows life expectancy from birth, in years. Life expectancy is a summary measure used to describe overall health. Life expectancy at birth is the average number of years a newborn is expected to live given current conditions. The life expectancy in the US is the highest in recorded history thanks to public health interventions such as improvements in sanitation and food safety, development and use of vaccines, and health promotion efforts.
Note: This dataset is historical only and there are not corresponding datasets for more recent time periods. For that more-recent information, please visit the Chicago Health Atlas at https://chicagohealthatlas.org.
This dataset gives the average life expectancy and corresponding confidence intervals for each Chicago community area for the years 1990, 2000 and 2010. See the full description at: https://data.cityofchicago.org/api/views/qjr3-bm53/files/AAu4x8SCRz_bnQb8SVUyAXdd913TMObSYj6V40cR6p8?download=true&filename=P:\EPI\OEPHI\MATERIALS\REFERENCES\Life Expectancy\Dataset description - LE by community area.pdf
The United States Census Bureau’s international dataset provides estimates of country populations since 1950 and projections through 2050. Specifically, the dataset includes midyear population figures broken down by age and gender assignment at birth. Additionally, time-series data is provided for attributes including fertility rates, birth rates, death rates, and migration rates.
You can use the BigQuery Python client library to query tables in this dataset in Kernels. Note that methods available in Kernels are limited to querying data. Tables are at bigquery-public-data.census_bureau_international.
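A minimal sketch of what querying with the Python client might look like follows. It assumes the `google-cloud-bigquery` package is installed and that credentials and a billing project are already configured in the environment; the helper name `run_query` and the example SQL are illustrative, with table and column names taken from the description of this dataset.

```python
EXAMPLE_SQL = """
SELECT
  country_name,
  life_expectancy
FROM
  `bigquery-public-data.census_bureau_international.mortality_life_expectancy`
WHERE
  year = 2016
ORDER BY
  life_expectancy DESC
LIMIT
  10
"""


def run_query(sql):
    """Run `sql` on BigQuery and return the result rows as plain dictionaries."""
    # Imported lazily so this module can be loaded without the client installed.
    from google.cloud import bigquery

    client = bigquery.Client()  # assumption: picks up default credentials/project
    return [dict(row.items()) for row in client.query(sql).result()]


# Usage (requires network access and credentials):
# for record in run_query(EXAMPLE_SQL):
#     print(record["country_name"], record["life_expectancy"])
```

Note that the fully qualified table name is wrapped in backticks, which standard SQL in BigQuery requires because the project ID contains hyphens.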
What countries have the longest life expectancy? In this query, 2016 census information is retrieved by joining the mortality_life_expectancy and country_names_area tables for countries larger than 25,000 km2. Without the size constraint, Monaco is the top result with an average life expectancy of over 89 years!
SELECT
  age.country_name,
  age.life_expectancy,
  size.country_area
FROM (
  SELECT
    country_name,
    life_expectancy
  FROM
    `bigquery-public-data.census_bureau_international.mortality_life_expectancy`
  WHERE
    year = 2016) age
INNER JOIN (
  SELECT
    country_name,
    country_area
  FROM
    `bigquery-public-data.census_bureau_international.country_names_area`
  WHERE
    country_area > 25000) size
ON
  age.country_name = size.country_name
ORDER BY
  2 DESC
/* Limit removed for Data Studio Visualization */
LIMIT
  10
Which countries have the largest proportion of their population under 25? Over 40% of the world’s population is under 25 and greater than 50% of the world’s population is under 30! This query retrieves the countries with the largest proportion of young people by joining the age-specific population table with the midyear (total) population table.
SELECT
  age.country_name,
  SUM(age.population) AS under_25,
  pop.midyear_population AS total,
  ROUND((SUM(age.population) / pop.midyear_population) * 100, 2) AS pct_under_25
FROM (
  SELECT
    country_name,
    population,
    country_code
  FROM
    `bigquery-public-data.census_bureau_international.midyear_population_agespecific`
  WHERE
    year = 2017
    AND age < 25) age
INNER JOIN (
  SELECT
    midyear_population,
    country_code
  FROM
    `bigquery-public-data.census_bureau_international.midyear_population`
  WHERE
    year = 2017) pop
ON
  age.country_code = pop.country_code
GROUP BY
  1,
  3
ORDER BY
  4 DESC /* Remove limit for visualization */
LIMIT
  10
The International Census dataset contains growth information in the form of birth rates, death rates, and migration rates. Net migration is the net number of migrants per 1,000 population, an important component of total population and one that often drives the work of the United Nations Refugee Agency. This query joins the growth rate table with the area table to retrieve 2017 data for countries greater than 500 km2.
SELECT
  growth.country_name,
  growth.net_migration,
  CAST(area.country_area AS INT64) AS country_area
FROM (
  SELECT
    country_name,
    net_migration,
    country_code
  FROM
    `bigquery-public-data.census_bureau_international.birth_death_growth_rates`
  WHERE
    year = 2017) growth
INNER JOIN (
  SELECT
    country_area,
    country_code
  FROM
    `bigquery-public-data.census_bureau_international.country_names_area`
Historic (none)
United States Census Bureau
Terms of use: This dataset is publicly available for anyone to use under the following terms provided by the Dataset Source - http://www.data.gov/privacy-policy#data_policy - and is provided "AS IS" without any warranty, express or implied, from Google. Google disclaims all liability for any damages, direct or indirect, resulting from the use of the dataset.
See the GCP Marketplace listing for more details and sample queries: https://console.cloud.google.com/marketplace/details/united-states-census-bureau/international-census-data
Life expectancy at birth and at age 65, by sex, on a three-year average basis.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Spain ES: Life Expectancy at Birth: Total data was reported at 82.832 years in 2016. This stayed constant from the previous number of 82.832 years for 2015. Spain ES: Life Expectancy at Birth: Total data is updated yearly, averaging 76.747 years (median) from Dec 1960 to 2016, with 57 observations. The data reached an all-time high of 83.229 years in 2014 and a record low of 69.109 years in 1960. Spain ES: Life Expectancy at Birth: Total data remains active status in CEIC and is reported by World Bank. The data is categorized under Global Database’s Spain – Table ES.World Bank: Health Statistics. Life expectancy at birth indicates the number of years a newborn infant would live if prevailing patterns of mortality at the time of its birth were to stay the same throughout its life. Sources: (1) United Nations Population Division. World Population Prospects: 2017 Revision, or derived from male and female life expectancy at birth from sources such as: (2) Census reports and other statistical publications from national statistical offices, (3) Eurostat: Demographic Statistics, (4) United Nations Statistical Division. Population and Vital Statistics Report (various years), (5) U.S. Census Bureau: International Database, and (6) Secretariat of the Pacific Community: Statistics and Demography Programme. Aggregation: weighted average.
VITAL SIGNS INDICATOR Life Expectancy (EQ6)
FULL MEASURE NAME Life Expectancy
LAST UPDATED April 2017
DESCRIPTION Life expectancy refers to the average number of years a newborn is expected to live if mortality patterns remain the same. The measure reflects the mortality rate across a population for a point in time.
DATA SOURCE State of California, Department of Health: Death Records (1990-2013) No link
California Department of Finance: Population Estimates Annual Intercensal Population Estimates (1990-2010) Table P-2: County Population by Age (2010-2013) http://www.dof.ca.gov/Forecasting/Demographics/Estimates/
U.S. Census Bureau: Decennial Census ZCTA Population (2000-2010) http://factfinder.census.gov
U.S. Census Bureau: American Community Survey 5-Year Population Estimates (2013) http://factfinder.census.gov
CONTACT INFORMATION vitalsigns.info@mtc.ca.gov
METHODOLOGY NOTES (across all datasets for this indicator) Life expectancy is commonly used as a measure of the health of a population. Life expectancy does not reflect how long any given individual is expected to live; rather, it is an artificial measure that captures an aspect of the mortality rates across a population that can be compared across time and populations. More information about the determinants of life expectancy that may lead to differences in life expectancy between neighborhoods can be found in the Bay Area Regional Health Inequities Initiative (BARHII) Health Inequities in the Bay Area report at http://www.barhii.org/wp-content/uploads/2015/09/barhii_hiba.pdf. Vital Signs measures life expectancy at birth (as opposed to cohort life expectancy). A statistical model was used to estimate life expectancy for Bay Area counties and ZIP Codes based on current life tables, which require both age and mortality data. A life table is a table which shows, for each age, the survivorship of people from a certain population.
Current life tables were created using death records and population estimates by age. The California Department of Public Health provided death records based on the California death certificate information. Records include age at death and residential ZIP Code. Single-year age population estimates at the regional and county level come from the California Department of Finance population estimates and projections for ages 0-100+. Population estimates for ages 100 and over are aggregated to a single age interval. Using this data, death rates in a population within age groups for a given year are computed to form unabridged life tables (as opposed to abridged life tables). To calculate life expectancy, the probability of dying between the jth and (j+1)st birthday is assumed uniform after age 1. Special consideration is taken to account for infant mortality.
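The life-table calculation described above can be sketched as follows. This is a textbook-style unabridged period life table, not the Vital Signs model itself: the function name is hypothetical, deaths are assumed to fall on average at mid-year after age 1 (the uniform assumption), and the infant value `infant_a0` is a placeholder assumption standing in for the "special consideration" the text mentions without specifying.

```python
def life_expectancy_at_birth(death_rates, infant_a0=0.1):
    """Life expectancy at birth from single-year age-specific death rates.

    death_rates[x] is the death rate at age x; the last entry is the
    open-ended interval (for example ages 100 and over).
    """
    if len(death_rates) < 2:
        raise ValueError("need one closed age interval plus the open interval")
    alive = 1.0          # survivors at the start of each age, per person born
    person_years = 0.0   # total years lived by the cohort
    for age, rate in enumerate(death_rates[:-1]):
        # Average fraction of the year lived by those who die at this age.
        a = infant_a0 if age == 0 else 0.5
        prob_dying = min(1.0, rate / (1.0 + (1.0 - a) * rate))
        deaths = alive * prob_dying
        person_years += (alive - deaths) + a * deaths
        alive -= deaths
    open_rate = death_rates[-1]
    if open_rate <= 0:
        raise ValueError("open-ended death rate must be positive")
    # Remaining years lived in the open-ended interval.
    person_years += alive / open_rate
    return person_years


# Sanity check with artificial rates: nobody dies before age 10, and the
# open-ended rate of 1.0 implies one further year of life on average.
print(life_expectancy_at_birth([0.0] * 10 + [1.0]))
```

A real calculation would feed in roughly 100 single-year rates plus the aggregated 100-and-over interval, matching the age structure described in the text.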
For the ZIP Code-level life expectancy calculation, it is assumed that postal ZIP Codes share the same boundaries as ZIP Code Census Tabulation Areas (ZCTAs). More information on the relationship between ZIP Codes and ZCTAs can be found at http://www.census.gov/geo/reference/zctas.html. ZIP Code-level data uses three years of mortality data to make robust estimates due to small sample size. Year 2013 ZIP Code life expectancy estimates reflect death records from 2011 through 2013. 2013 is the last year with available mortality data. Death records for ZIP Codes with zero population (like those associated with P.O. Boxes) were assigned to the nearest ZIP Code with population. ZIP Code population for 2000 estimates comes from the Decennial Census. ZIP Code population for 2013 estimates comes from the American Community Survey (5-Year Average). ACS estimates are adjusted using Decennial Census data for more accurate population estimates. An adjustment factor was calculated using the ratio between the 2010 Decennial Census population estimates and the 2012 ACS 5-Year (with middle year 2010) population estimates. This adjustment factor is particularly important for ZCTAs with a high homeless population (not living in group quarters), where the ACS may underestimate the ZCTA population and therefore underestimate the life expectancy. The ACS provides ZIP Code population by age in five-year age intervals. Single-year age population estimates were calculated by distributing population within an age interval to single-year ages using the county distribution. Counties were assigned to ZIP Codes based on majority land area.
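Two of the population steps above are simple ratios, sketched here under stated assumptions. The function names and all numbers are invented for illustration, and the text does not spell out exactly how the adjustment factor is applied, so multiplying the ACS estimate by it is an assumption.

```python
def adjustment_factor(census_2010_population, acs_2012_5yr_population):
    """Ratio of the 2010 Decennial Census count to the 2012 ACS 5-Year estimate."""
    if acs_2012_5yr_population <= 0:
        raise ValueError("ACS population must be positive")
    return census_2010_population / acs_2012_5yr_population


def split_interval(interval_population, county_single_year_counts):
    """Distribute a five-year interval total to single ages using county shares."""
    total = sum(county_single_year_counts)
    if total <= 0:
        raise ValueError("county counts must sum to a positive number")
    return [interval_population * count / total for count in county_single_year_counts]


# Hypothetical ZCTA: Census counted 1,100 people, the ACS estimated 1,000.
factor = adjustment_factor(1100, 1000)
adjusted_total = 950 * factor  # assumed use: scale a later ACS estimate

# Hypothetical ages 20-24: 100 people in the ZCTA, county counts by single age.
single_ages = split_interval(100, [10, 20, 30, 20, 20])
print(factor, adjusted_total, single_ages)
```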
ZIP Codes in the Bay Area vary in population from over 10,000 residents to less than 20 residents. Traditional life expectancy estimation (like the one used for the regional- and county-level Vital Signs estimates) cannot be used because they are highly inaccurate for small populations and may result in over/underestimation of life expectancy. To avoid inaccurate estimates, ZIP Codes with populations of less than 5,000 were aggregated with neighboring ZIP Codes until the merged areas had a population of more than 5,000. ZIP Code 94103, representing Treasure Island, was dropped from the dataset due to its small population and having no bordering ZIP Codes. In this way, the original 305 Bay Area ZIP Codes were reduced to 217 ZIP Code areas for 2013 estimates. Next, a form of Bayesian random-effects analysis was used which established a prior distribution of the probability of death at each age using the regional distribution. This prior is used to shore up the life expectancy calculations where data were sparse.
We used individual-level death data to estimate county-level life expectancy at 25 (e25) for White, Black, AIAN and Asian populations in the contiguous US for 2000-2005. Race-sex-stratified models were used to examine the associations among e25, rurality and specific race proportion, adjusted for socioeconomic variables. Individual death data from the National Center for Health Statistics were aggregated as death counts into five-year age groups by county and race-sex groups for the contiguous US for years 2000-2005 (National Center for Health Statistics 2000-2005). We used bridged-race population estimates to calculate five-year mortality rates. The bridged population data mapped 31 race categories, as specified in the 1997 Office of Management and Budget standards for the collection of data on race and ethnicity, to the four race categories specified under the 1977 standards (the same as race categories in mortality registration) (Ingram et al. 2003). The urban-rural gradient was represented by the 2003 Rural Urban Continuum Codes (RUCC), which distinguished metropolitan counties by population size, and nonmetropolitan counties by degree of urbanization and adjacency to a metro area (United States Department of Agriculture 2016). We obtained county-level sociodemographic data for 2000-2005 from the US Census Bureau. These included median household income, percent of population attaining greater than high school education (high school%), and percent of county occupied rental units (rent%). We obtained county violent crime from Uniform Crime Reports and used it to calculate mean number of violent crimes per capita (Federal Bureau of Investigation 2010). This dataset is not publicly accessible because: EPA cannot release personally identifiable information regarding living individuals, according to the Privacy Act and the Freedom of Information Act (FOIA). This dataset contains information about human research subjects.
Because there is potential to identify individual participants and disclose personal information, either alone or in combination with other datasets, individual level data are not appropriate to post for public access. Restricted access may be granted to authorized persons by contacting the party listed. It can be accessed through the following means: Request to author. Format: Data are stored as csv files. This dataset is associated with the following publication: Jian, Y., L. Neas, L. Messer, C. Gray, J. Jagai, K. Rappazzo, and D. Lobdell. Divergent trends in life expectancy across the rural-urban gradient among races in the contiguous United States. International Journal of Public Health. Springer Basel AG, Basel, SWITZERLAND, 64(9): 1367-1374, (2019).
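To illustrate how life expectancy at 25 can be derived from five-year mortality rates like those described above, here is a minimal abridged life table sketch. It is not the authors' code: it assumes deaths fall on average at the midpoint of each five-year interval, and the function and parameter names are hypothetical.

```python
def life_expectancy_at_25(closed_rates, open_rate, width=5):
    """Estimate e25 from age-specific mortality rates (deaths per person-year).

    closed_rates: rates for the closed groups 25-29, 30-34, ..., in order.
    open_rate:    rate for the final open-ended group (for example 85+).
    Assumes deaths occur on average halfway through each closed interval.
    """
    survivors = 1.0      # proportion of the cohort alive at age 25
    person_years = 0.0   # person-years lived after age 25
    for rate in closed_rates:
        # Convert the interval's mortality rate into a probability of dying.
        prob_death = width * rate / (1.0 + 0.5 * width * rate)
        deaths = survivors * prob_death
        # Survivors live the whole interval; those who die live half of it.
        person_years += width * (survivors - deaths) + 0.5 * width * deaths
        survivors -= deaths
    # In the open-ended group, remaining years lived are survivors / rate.
    person_years += survivors / open_rate
    return person_years  # the starting cohort is 1.0, so this equals e25


# Illustrative rates only, not values from the dataset.
example_rates = [0.001, 0.001, 0.002, 0.003, 0.004, 0.006,
                 0.009, 0.014, 0.022, 0.035, 0.055, 0.090]
print(round(life_expectancy_at_25(example_rates, open_rate=0.16), 1))
```

In small counties the age-specific rates themselves are noisy, so a calculation like this is only as stable as the death counts behind it.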
The first edition of the World Happiness Report was published on April 1, 2012 (omitted from this dataset), with a methodology that ranks countries based on their Happiness Index. Since its release, this report has garnered global recognition and has been issued on a yearly basis, excluding 2014.
Happiness Index is explained by:
• Dystopia (adds 1.83 Index score) + residual
• GDP per capita
• Social support
• Healthy life expectancy
• Freedom to make life choices
• Generosity
• Perceptions of corruption
The dataset is provided in .xlsx and .csv formats. For convenience, it is provided both with and without NULLs.
Possible use cases & questions: Initially the dataset was meant to be used for visualization practice with BI tools, for example, Tableau.
Clean dataset without nulls: 1) “Yearly average Happiness Index change” 2) “Are there countries whose happiness was increasing for years, but had a sudden drop in recent years?” 3) “Is there a year (or years) where the average world happiness decreased compared to the previous year?” 4) Geo data visualization.
With nulls: 1) Practice dealing with nulls. 2) “Which countries in which year had no ranking?” Deep dive and exploration into possible causes (for example, war, internal conflict, government or policy changes, diseases) via other sources.
Columns: Country – country name. Year – year of the report. Index – Happiness Index score. Rank – country rank according to their Happiness Index score.
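As a starting point for the questions above, the sketch below computes the yearly average Index, the report years in which that average fell, and the country-year pairs with no score. It assumes rows have already been loaded as (Country, Year, Index) tuples with None for nulls; the function names and the sample values are hypothetical.

```python
from collections import defaultdict


def yearly_average_index(rows):
    """Average Index per report year, skipping rows whose Index is None."""
    totals = defaultdict(lambda: [0.0, 0])  # year -> [sum of scores, count]
    for _country, year, index in rows:
        if index is None:
            continue
        totals[year][0] += index
        totals[year][1] += 1
    return {year: total / count for year, (total, count) in sorted(totals.items())}


def years_with_decrease(averages):
    """Report years whose average is lower than the previous available report."""
    years = sorted(averages)
    return [cur for prev, cur in zip(years, years[1:]) if averages[cur] < averages[prev]]


def unranked(rows):
    """(country, year) pairs that have no Index score."""
    return sorted((country, year) for country, year, index in rows if index is None)


# Made-up sample rows for illustration only.
sample = [("A", 2013, 5.0), ("B", 2013, 7.0), ("A", 2015, 4.0), ("B", 2015, None)]
averages = yearly_average_index(sample)
print(averages, years_with_decrease(averages), unranked(sample))
```

Because there is no 2014 report, the comparison is made against the previous available report year rather than the previous calendar year.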
Reports used in this dataset:
1. The 2013 World Happiness Report provides rankings of 156 countries based on their happiness index during the period of 2010-2012. Data collected from: https://resources.unsdsn.org/world-happiness-report-2013
2. No report for 2014.
3. The 2015 World Happiness Report provides rankings of 158 countries based on their happiness index during the period of 2012-2014.
4. The 2016 World Happiness Report provides rankings of 157 countries based on their happiness index during the period of 2013-2015.
5. The 2017 World Happiness Report provides rankings of 155 countries based on their happiness index during the period of 2014-2016.
6. The 2018 World Happiness Report provides rankings of 156 countries based on their happiness index during the period of 2015-2017.
7. The 2019 World Happiness Report provides rankings of 156 countries based on their happiness index during the period of 2016-2018.
8. The 2020 World Happiness Report provides rankings of 153 countries based on their happiness index during the period of 2017-2019.
9. The 2021 World Happiness Report provides rankings of 149 countries based on their happiness index during the period of 2018-2020.
10. The 2022 World Happiness Report provides rankings of 146 countries based on their happiness index during the period of 2019-2021.
11. The 2023 World Happiness Report provides rankings of 137 countries based on their happiness index during the period of 2020-2022.
If not stated differently, data collected from: https://worldhappiness.report/
Licence: CDLA-Permissive-1.0
Notes: Please note that some country names have been shortened, for example, “Hong Kong Special Administrative Region of the People's Republic of China” was shortened to “Hong Kong”.
Additional notes: [name in older Reports, other data sources] = [name used in this file] [data source]
• Czech Republic = Czechia [1]
• Macedonia = North Macedonia [2]
• Turkey = Turkiye [3]
Data sources:
1) https://european-union.europa.eu/principles-countries-history/country-profiles/czechia_en
2) https://www.strasbourg-europe.eu/macedonia/
3) https://www.un.org/en/about-us/member-states/turkiye
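When joining this file with other sources that still use the older names, a small lookup table based on the notes above can help. This is a minimal sketch; the dictionary and function names are hypothetical, and the mapping covers only the names mentioned in these notes.

```python
# Older or longer names (keys) mapped to the names used in this file (values).
CANONICAL_NAMES = {
    "Czech Republic": "Czechia",
    "Macedonia": "North Macedonia",
    "Turkey": "Turkiye",
    "Hong Kong Special Administrative Region of the People's Republic of China": "Hong Kong",
}


def canonical_country(name):
    """Return the name used in this file, or the trimmed input if no mapping is known."""
    cleaned = name.strip()
    return CANONICAL_NAMES.get(cleaned, cleaned)


print(canonical_country("Czech Republic"))  # Czechia
print(canonical_country("Finland"))         # Finland
```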
Update History: 2023-03-14 to 2023-03-17: initial data collection for 2013-2022. 2023-03-25: updated for 2023.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This comprehensive dataset provides a wealth of information about all countries worldwide, covering a wide range of indicators and attributes. It encompasses demographic statistics, economic indicators, environmental factors, healthcare metrics, education statistics, and much more. With every country represented, this dataset offers a complete global perspective on various aspects of nations, enabling in-depth analyses and cross-country comparisons.
Key Features
Country: Name of the country.
Density (P/Km2): Population density measured in persons per square kilometer.
Abbreviation: Abbreviation or code representing the country.
Agricultural Land (%): Percentage of land area used for agricultural purposes.
Land Area (Km2): Total land area of the country in square kilometers.
Armed Forces Size: Size of the armed forces in the country.
Birth Rate: Number of births per 1,000 population per year.
Calling Code: International calling code for the country.
Capital/Major City: Name of the capital or major city.
CO2 Emissions: Carbon dioxide emissions in tons.
CPI: Consumer Price Index, a measure of inflation and purchasing power.
CPI Change (%): Percentage change in the Consumer Price Index compared to the previous year.
Currency_Code: Currency code used in the country.
Fertility Rate: Average number of children born to a woman during her lifetime.
Forested Area (%): Percentage of land area covered by forests.
Gasoline_Price: Price of gasoline per liter in local currency.
GDP: Gross Domestic Product, the total value of goods and services produced in the country.
Gross Primary Education Enrollment (%): Gross enrollment ratio for primary education.
Gross Tertiary Education Enrollment (%): Gross enrollment ratio for tertiary education.
Infant Mortality: Number of deaths per 1,000 live births before reaching one year of age.
Largest City: Name of the country's largest city.
Life Expectancy: Average number of years a newborn is expected to live.
Maternal Mortality Ratio: Number of maternal deaths per 100,000 live births.
Minimum Wage: Minimum wage level in local currency.
Official Language: Official language(s) spoken in the country.
Out of Pocket Health Expenditure (%): Percentage of total health expenditure paid out-of-pocket by individuals.
Physicians per Thousand: Number of physicians per thousand people.
Population: Total population of the country.
Population: Labor Force Participation (%): Percentage of the population that is part of the labor force.
Tax Revenue (%): Tax revenue as a percentage of GDP.
Total Tax Rate: Overall tax burden as a percentage of commercial profits.
Unemployment Rate: Percentage of the labor force that is unemployed.
Urban Population: Percentage of the population living in urban areas.
Latitude: Latitude coordinate of the country's location.
Longitude: Longitude coordinate of the country's location.
Potential Use Cases
Analyze population density and land area to study spatial distribution patterns.
Investigate the relationship between agricultural land and food security.
Examine carbon dioxide emissions and their impact on climate change.
Explore correlations between economic indicators such as GDP and various socio-economic factors.
Investigate educational enrollment rates and their implications for human capital development.
Analyze healthcare metrics such as infant mortality and life expectancy to assess overall well-being.
Study labor market dynamics through indicators such as labor force participation and unemployment rates.
Investigate the role of taxation and its impact on economic development.
Explore urbanization trends and their social and environmental consequences.
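Several of the use cases above come down to measuring how two indicators move together. The sketch below computes a Pearson correlation in plain Python, skipping countries where either value is missing. It assumes the columns have already been parsed into numbers (with None for missing values); the function name and the sample figures are hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length columns, ignoring pairs with a None."""
    pairs = [(x, y) for x, y in zip(xs, ys) if x is not None and y is not None]
    n = len(pairs)
    if n < 2:
        raise ValueError("need at least two complete pairs")
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    var_x = sum((x - mean_x) ** 2 for x, _ in pairs)
    var_y = sum((y - mean_y) ** 2 for _, y in pairs)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant column")
    return cov / math.sqrt(var_x * var_y)


# Made-up values for illustration: physicians per thousand vs. infant mortality.
physicians = [0.3, 1.2, 2.5, None, 4.0]
infant_mortality = [48.0, 21.0, 9.0, 15.0, 3.0]
print(round(pearson(physicians, infant_mortality), 3))
```

A correlation computed this way describes association across countries only; it does not by itself support causal claims about any indicator.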
Open Government Licence - Canada 2.0 https://open.canada.ca/en/open-government-licence-canada
License information was derived automatically
This table contains 2754 series, with data for years 2005/2007 - 2012/2014 (not all combinations necessarily have data for all years). This table contains data described by the following dimensions (Not all combinations are available): Geography (153 items: Canada; Newfoundland and Labrador; Eastern Regional Integrated Health Authority, Newfoundland and Labrador; Central Regional Integrated Health Authority, Newfoundland and Labrador; ...); Age group (2 items: At birth; At age 65); Sex (3 items: Both sexes; Males; Females); Characteristics (3 items: Life expectancy; Low 95% confidence interval, life expectancy; High 95% confidence interval, life expectancy).
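The series count follows directly from the dimension sizes listed above: 153 × 2 × 3 × 3 = 2754. The short sketch below reproduces that arithmetic and enumerates the 18 combinations available per geography; the variable names are illustrative.

```python
import math
from itertools import product

# Dimension sizes as listed in the table description.
dimension_sizes = {"Geography": 153, "Age group": 2, "Sex": 3, "Characteristics": 3}
total_series = math.prod(dimension_sizes.values())

age_groups = ["At birth", "At age 65"]
sexes = ["Both sexes", "Males", "Females"]
characteristics = [
    "Life expectancy",
    "Low 95% confidence interval, life expectancy",
    "High 95% confidence interval, life expectancy",
]
# Every (age group, sex, characteristic) combination for a single geography.
series_per_geography = list(product(age_groups, sexes, characteristics))

print(total_series, len(series_per_geography))  # 2754 18
```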
https://www.worldbank.org/en/about/legal/terms-of-use-for-datasets
File Description: "Life Expectancy Data.csv"
This dataset contains 2,938 entries and 22 columns, covering life expectancy and related health indicators for multiple nations from 2000 to 2015. It includes country-wise data and other economic, social, and health metrics.
Column Description:
1. Country – Name of the country.
2. Year – Data year (ranging from 2000 to 2015).
3. Status – Economic classification (Developing/Developed).
4. Life expectancy – Average lifespan in years.
5. Adult Mortality – Probability of death between ages 15-60 per 1,000 individuals.
6. Infant Deaths – Number of infant deaths per 1,000 live births.
7. Alcohol – Per capita alcohol consumption.
8. Percentage Expenditure – Government health expenditure as a percentage of GDP.
9. Hepatitis B – Immunization coverage percentage.
10. Measles – Number of reported measles cases.
11. BMI – Average Body Mass Index.
12. Under-Five Deaths – Mortality rate for children under five.
13. Polio & Diphtheria – Immunization rates.
14. HIV/AIDS – Deaths due to HIV/AIDS per 1,000 individuals.
15. GDP – Gross Domestic Product per capita.
16. Population – Total population of the country.
17. Thinness (1-19 years, 5-9 years) – Percentage of underweight children.
18. Income Composition of Resources – Human development index proxy.
19. Schooling – Average number of years of schooling.
Missing Data: Some columns (like Hepatitis B, GDP, Population, Total Expenditure) contain missing values.
Further File Information:
Total Countries: 193
Years Covered: 2000–2015
Total Entries: 2,938
Missing Data Overview: Some columns have missing values, notably:
Hepatitis B (553 missing)
GDP (448 missing)
Population (652 missing)
Total expenditure (226 missing)
Income Composition of Resources (167 missing)
Schooling (163 missing)
Summary Statistics:
Life Expectancy: Range: 36.3 to 89 years; Mean: 69.2 years
Adult Mortality: Mean: 165 per 1,000; Max: 723 per 1,000
GDP per Capita: Mean: $7,483; Max: $119,172
Population: Mean: ~12.75 million; Max: 1.29 billion
Education: Schooling Average: 12 years; Max: 20.7 years
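Missing-value counts like those listed above can be checked with a short script. This is a minimal sketch using only the Python standard library; it assumes a missing value appears as an empty cell, and the function name and the sample rows are hypothetical rather than taken from the file.

```python
import csv
import io


def missing_counts(csv_text):
    """Count empty cells per column in CSV text that has a header row."""
    reader = csv.DictReader(io.StringIO(csv_text))
    counts = {name: 0 for name in reader.fieldnames}
    for row in reader:
        for name in reader.fieldnames:
            # Treat blank or whitespace-only cells (and short rows) as missing.
            if (row[name] or "").strip() == "":
                counts[name] += 1
    return counts


# Made-up rows for illustration only.
sample = "Country,Year,GDP,Schooling\nA,2000,,10.1\nB,2000,500.5,\nC,2000,,\n"
print(missing_counts(sample))
```

To run it on the real file, the text could be read first, for example with open("Life Expectancy Data.csv", newline="") and its read() method, and the resulting counts compared with the overview above.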
Future Scope of this data: To compare the 2000–2015 life expectancy dataset with newer datasets on the same parameters, you can apply several statistical tests and analytical methods, depending on the research question. Below are some key tests and approaches: