http://opendatacommons.org/licenses/dbcl/1.0/
This dataset contains estimates of the socioeconomic status (SES) position of each of 149 countries over the period 1880-2010. SES is measured at decade intervals, which allows for a 130-year time-series analysis of the changing position of countries in the global status hierarchy. SES scores are the average of each country’s income and education rankings and are reported as percentile rankings ranging from 1 to 99. As such, they can be interpreted like other percentile rankings, such as high school standardized test scores. If country A has an SES score of 55, for example, 55 percent of the countries in this dataset have a lower average income and education ranking than country A. ISO alpha and numeric country codes are included so that users can merge these data with other variables, such as those found in the World Bank’s World Development Indicators Database and the United Nations Common Database.
See here for a working example of how the data might be used to better understand how the world came to look the way it does, at least in terms of status position of countries.
VARIABLE DESCRIPTIONS:
unid: ISO numeric country code (used by the United Nations)
wbid: ISO alpha country code (used by the World Bank)
SES: Country socioeconomic status score (percentile) based on GDP per capita and educational attainment (n=174)
country: Short country name
year: Survey year
gdppc: GDP per capita: Single time-series (imputed)
yrseduc: Completed years of education in the adult (15+) population
region5: Five category regional coding schema
regionUN: United Nations regional coding schema
DATA SOURCES:
The dataset was compiled by Shawn Dorius (sdorius@iastate.edu) from a large number of data sources, listed below.
GDP per Capita:
Maddison, Angus. 2004. 'The World Economy: Historical Statistics'. Organization for Economic Co-operation and Development: Paris. GDP & GDP per capita data in 1990 Geary-Khamis dollars (PPPs of currencies and average prices of commodities). Maddison data collected from: http://www.ggdc.net/MADDISON/Historical_Statistics/horizontal-file_02-2010.xls.
World Development Indicators Database
Years of Education:
1. Morrisson and Murtin. 2009. 'The Century of Education'. Journal of Human Capital 3(1):1-42. Data downloaded from http://www.fabricemurtin.com/
2. Cohen, Daniel & Marcelo Soto. 2007. 'Growth and human capital: Good data, good results'. Journal of Economic Growth 12(1):51-76. Data downloaded from http://soto.iae-csic.org/Data.htm
3. Barro, Robert and Jong-Wha Lee. 2013. 'A New Data Set of Educational Attainment in the World, 1950-2010'. Journal of Development Economics 104:184-198. Data downloaded from http://www.barrolee.com/
Maddison, Angus. 2004. 'The World Economy: Historical Statistics'. Organization for Economic Co-operation and Development: Paris. 13.
United Nations Population Division. 2009.
Dataset consisting of inequality measures for 46 nation states and a global bibliography of all known household expenditure surveys covering the period roughly 1880-1960. Each entry notes when and where the survey was carried out and salient characteristics of the survey, such as the number of households, whether income and/or expenditure data were collected, etc. These bibliographies are organised by six world regions and then by 118 nation states. For a sub-set of the most useful surveys we have estimated various inequality measures from the published data for 46 nation states, organised by world region.
This project will calculate new estimates of world inequality in the period from the end of the nineteenth century until the 1960s, based on the results of household expenditure surveys. Our investigations have located a vast cache of household expenditure surveys for the period. Thus far, we have identified around 800 household surveys from around the world, carried out between the 1880s and 1960s, of which around half are of sufficient scope as to be potentially useful for the investigation of inequality. We will extract the reported demographic and expenditure data by income group from these reports and use them to estimate parameters of the income distribution. Using these estimates, we will investigate the changing nature of inequality within a number of key nation states, and also investigate the time path and geography of global inequality 1880-1960. In addition, we would use these data to estimate other indicators of living conditions, such as nutritional attainment, which may provide further insights into the impact of industrialisation on inequality.
This project utilised the published reports of household expenditure surveys. These published reports are held at copyright libraries or national statistical offices and were typically part of the output of government departments (for example, the UK Board of Trade).
We compiled our bibliographies through library searches and requests to various national statistical offices. Many of these reports are published in English, but a substantial number are only published in the language of the relevant nation state. The published household expenditure survey reports typically include summary tables of grouped data of income, expenditures, and household structure. All of these reports, and the data therein, are already in the public domain, and our bibliography provides details of when and where they were published. From these data we estimated a suite of inequality measures, using three different techniques. The inequality measures are: Gini coefficient, 90/10 percentile ratio, 90/50 percentile ratio, and the 50/10 percentile ratio. These inequality measures were estimated three ways: linear interpolation, the Beta-Lorenz method and a log normal density estimation. Not all published household expenditure survey reports contain sufficient data to estimate inequality measures. Our selection was based simply on whether the reports published the appropriate data. All that we required to estimate inequality were total household income or expenditure grouped by class (and the group average incomes/expenditures) and the total number of households and average household size.
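As a rough illustration of the linear interpolation approach mentioned above, the following Python sketch computes a Gini coefficient from grouped data (number of households and average income per class) by joining the Lorenz curve points with straight lines. It is not the project's own code: the function name `gini_grouped` and the example figures are hypothetical, and household size adjustments are ignored. Because straight-line interpolation treats everyone within a class as having the class average, the result is a lower bound on the Gini of the underlying distribution.

```python
def gini_grouped(households, mean_incomes):
    """Gini coefficient from grouped data via a piecewise-linear Lorenz curve.

    households[i]   : number of households in income class i
    mean_incomes[i] : average income (or expenditure) of class i
    Classes are sorted by mean income before the curve is built.
    """
    groups = sorted(zip(mean_incomes, households))
    total_pop = sum(count for _, count in groups)
    total_inc = sum(mean * count for mean, count in groups)
    cum_pop = cum_inc = 0.0
    area = 0.0  # twice the area under the Lorenz curve (trapezoid rule)
    for mean, count in groups:
        prev_pop, prev_inc = cum_pop, cum_inc
        cum_pop += count / total_pop
        cum_inc += mean * count / total_inc
        area += (cum_pop - prev_pop) * (cum_inc + prev_inc)
    return 1.0 - area


# Made-up example: two equally sized classes with average incomes 1 and 3.
example = gini_grouped([50, 50], [1.0, 3.0])
print(round(example, 4))
```

The Beta-Lorenz and log-normal approaches named in the text fit a smooth curve instead and are not shown here.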
https://datacatalog.worldbank.org/public-licenses?fragment=cc
The Poverty and Inequality Platform: Percentiles database reports 100 points ranked according to the consumption or income distributions for country-year survey data available in the World Bank’s Poverty and Inequality Platform (PIP). There are, as of September 19, 2024, a total of 2,456 country-survey-year data points, which include 2,274 distributions based on microdata, binned data, or imputed/synthetic data, and 182 based on grouped data. For the grouped data, the percentiles are derived by fitting a parametric Lorenz distribution following Datt (1998). For ease of communication, all distributions are referred to as survey data henceforth, and the welfare variable is referred to as income.
Each distribution reports 100 points per country per survey year ranked from the smallest (percentile 1) to the largest (percentile 100) income or consumption. For each income percentile, the database reports the following variables: the average daily per person income or consumption (avg_welfare); the income or consumption value for the upper threshold of the percentile (quantile); the share of the population in the percentile (which might deviate slightly from 1% due to coarseness in the raw data) (pop_share); and the share of income or consumption held by each percentile (welfare_share). In addition, the database reports the welfare measure (welfare_type) used in the survey data—income or consumption—and the region covered (reporting_level)—urban, rural, or national. The distributions are available in 2011 or 2017 PPP$.
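Below is an example of how to use the database to generate an anonymous growth incidence curve for Bangladesh between 2005 and 2010:
* Keep Bangladesh in the two survey years. The condition reporting_level == 1 assumes a
* labelled numeric variable; check its value labels to confirm which level code 1 denotes.
keep if country_code == "BGD" & reporting_level == 1 & ///
    inlist(year, 2005, 2010)
* Percent change in average welfare of each percentile between the two years
bys country_code percentile (year): ///
    gen growth05_10 = (avg_welfare/avg_welfare[_n-1] - 1) * 100
twoway connected growth05_10 percentile, ytitle("%") ///
    title("Cumulative growth in Bangladesh, 2005-2010")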
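For readers without Stata, the same percentile-by-percentile calculation can be sketched in plain Python. This is a hedged illustration only: the field names follow the database description (`percentile`, `year`, `avg_welfare`), the function name `growth_incidence` is hypothetical, and the values are made up rather than taken from the Bangladesh surveys.

```python
def growth_incidence(rows, start_year, end_year):
    """Percent change in avg_welfare per percentile between two survey years.

    rows: iterable of dicts with 'percentile', 'year' and 'avg_welfare'.
    Percentiles missing in either year are skipped.
    """
    by_key = {(r["percentile"], r["year"]): r["avg_welfare"] for r in rows}
    curve = {}
    for (pct, year), start in by_key.items():
        if year != start_year:
            continue
        end = by_key.get((pct, end_year))
        if end is not None:
            curve[pct] = (end / start - 1) * 100
    return curve


# Made-up values for two percentiles (not real survey data).
toy = [
    {"percentile": 1, "year": 2005, "avg_welfare": 1.0},
    {"percentile": 1, "year": 2010, "avg_welfare": 1.5},
    {"percentile": 2, "year": 2005, "avg_welfare": 2.0},
    {"percentile": 2, "year": 2010, "avg_welfare": 2.5},
]
gic = growth_incidence(toy, 2005, 2010)
print(gic)
```

The curve is "anonymous" in the sense that it compares the same percentile across years, not the same households.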
Some metadata of the data set, such as the version of the data, can be found by typing char dir in the Stata console. Alternatively, please refer to this portal, which contains all the information available.
PIP version date: 20250401 (updated June 05, 2025)
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
This data is part of a research project on the impact of consumption taxes on inequality by Julien Blasco, Elvire Guillaud and Michaël Zemmour.
Our project is currently published in the LIS Working Paper Series. You may cite it as:
Blasco J., Guillaud E., Zemmour M. (2020) “Consumption Taxes and Income Inequality: An International Perspective with Microsimulation”, LIS Working Paper Series, No. 785.
You are free to use the datasets we provide here, but please cite them as:
Blasco J., Guillaud E., Zemmour M., Data on the Impact of Consumption Taxes on Income Inequality, https://doi.org/10.5281/zenodo.4291984, October 2020.
For detailed information on the method used, please refer to Blasco, Guillaud and Zemmour (2020). In particular, the appendices describe the imputation models used for consumption. All the coefficients are given, which allows for a replication of this imputation method in other datasets.
The code used is available at https://github.com/JulienBlasco/consumption-taxes.
Our data is based on surveys on income and consumption, harmonized by the Luxembourg Income Study. We used OECD Statistics for National Accounts data on income, consumption and consumption tax revenue.
Description of the data
The data set consists of five tables.
Two datasets are aggregated indicators at the country-year level, such as Gini coefficients for different concepts of income, global consumption tax-to-income ratios, anti-redistributive effect of consumption taxes:
For the core model (82 country-years): ConsumptionTaxes_indicators_coremodel.dta
For the lighter model (126 country-years): ConsumptionTaxes_indicators_xtnddmodel
Two datasets are variables broken down by percentiles of disposable income, within each country-year. Please note that these data are mainly for graphing purposes, not detailed analysis at the percentile level:
Core model (82 country-years): ConsumptionTaxes_percentiles_coremodel
Lighter model (126 country-years): ConsumptionTaxes_percentiles_xtnddmodel
One dataset contains the implicit effective tax rates on consumption, computed with National Accounts data: 18-07-27 OECD_itrcs.dta
This table presents income shares, thresholds, tax shares, and total counts of individual Canadian tax filers, with a focus on high-income individuals (95% income threshold, 99% threshold, etc.). Income thresholds are geography-specific; for example, the number of Nova Scotians in the top 1% is calculated as the number of tax-filing Nova Scotians whose total income exceeded the 99% income threshold of Nova Scotian tax filers. Several definitions of income are available in the table, namely market, total, and after-tax income, each with and without capital gains.
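The geography-specific threshold rule can be illustrated with a short Python sketch. It assumes a nearest-rank percentile computed separately within each geography, which may not match the statistical agency's exact method. The function name `top_group_by_geography`, the labels "NS" and "ON", and all incomes are made up for illustration.

```python
import math
from collections import defaultdict


def top_group_by_geography(filers, top_percent=1):
    """filers: iterable of (geography, total_income) pairs.

    Returns {geography: (threshold, count_above)}, where the threshold is the
    (100 - top_percent)th percentile of income within that geography
    (nearest-rank method, an assumption) and count_above is the number of
    filers whose income exceeds it.
    """
    by_geo = defaultdict(list)
    for geo, income in filers:
        by_geo[geo].append(income)
    result = {}
    for geo, incomes in by_geo.items():
        incomes.sort()
        rank = math.ceil((100 - top_percent) * len(incomes) / 100)  # 1-based
        threshold = incomes[rank - 1]
        result[geo] = (threshold, sum(1 for x in incomes if x > threshold))
    return result


# Ten made-up filers per geography; the top 10% is used so the toy sample is large enough.
toy = [("NS", 10 * k) for k in range(1, 11)] + [("ON", 100 * k) for k in range(1, 11)]
counts = top_group_by_geography(toy, top_percent=10)
print(counts)
```

In this toy sample the filer with income 100 is in the top group of "NS" but would fall well below the threshold of "ON", which is the point of computing thresholds per geography.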
Families of tax filers; Single-earner and dual-earner census families by number of children (final T1 Family File; T1FF).
As of 2022, the top 10 percent of India's population by pre-tax income was estimated to hold **** percent of total income in India, whereas the bottom 50 percent accounted for just ** percent. This reflected an even greater income gap than in 2000.