This multi-country harmonized dataset concerning forcibly displaced populations (FDPs) and their host communities was produced by the World Bank’s Poverty and Equity Global Practice. It incorporates representative surveys conducted in 10 countries across five regions that hosted FDPs in the period 2015 to 2020. The goal of this harmonization exercise is to provide researchers and policymakers with a valuable input for comparative analyses of forced displacement across key developing country settings.
The datasets included in the harmonization effort cover key recent displacement contexts: the Venezuelan influx in Latin America’s Andean states; the Syrian crisis in the Mashreq; the Rohingya displacement in Bangladesh; and forcible displacement in Sub-Saharan Africa (Sahel and East Africa). The harmonization exercise encompasses 10 different surveys. These include nationally representative surveys with a separate representative stratum for displaced populations; sub-national representative surveys covering displaced populations and their host communities; and surveys designed specifically to provide insights on displacement contexts. Most of the surveys were collected between 2015 and 2020.
Household
Forcibly displaced populations and their host communities.
Sample survey data [ssd]
Computer Assisted Personal Interview [capi]
After 2022-01-25, Sentinel-2 scenes with PROCESSING_BASELINE '04.00' or above have their DN (value) range shifted by 1000. The HARMONIZED collection shifts data in newer scenes to be in the same range as in older scenes. Sentinel-2 is a wide-swath, high-resolution, multi-spectral imaging mission supporting Copernicus Land Monitoring studies, including the monitoring of vegetation, soil and water cover, as well as observation of inland waterways and coastal areas. The Sentinel-2 data contain 13 UINT16 spectral bands representing TOA reflectance scaled by 10000. See the Sentinel-2 User Handbook for details. QA60 is a bitmask band that contained rasterized cloud mask polygons until Feb 2022, when these polygons stopped being produced. Starting in February 2024, legacy-consistent QA60 bands are constructed from the MSK_CLASSI cloud classification bands. For more details, see the full explanation of how cloud masks are computed. Each Sentinel-2 product (zip archive) may contain multiple granules. Each granule becomes a separate Earth Engine asset. EE asset ids for Sentinel-2 assets have the following format: COPERNICUS/S2/20151128T002653_20151128T102149_T56MNN. Here the first numeric part represents the sensing date and time, the second numeric part represents the product generation date and time, and the final 6-character string is a unique granule identifier indicating its UTM grid reference (see MGRS). The Level-2 data produced by ESA can be found in the collection COPERNICUS/S2_SR. For datasets to assist with cloud and/or cloud shadow detection, see COPERNICUS/S2_CLOUD_PROBABILITY and GOOGLE/CLOUD_SCORE_PLUS/V1/S2_HARMONIZED. For more details on Sentinel-2 radiometric resolution, see this page.
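As an illustration of the asset id format and the DN scaling described above, the following minimal Python sketch splits an asset id into its three parts and converts a raw DN to TOA reflectance. The function names are hypothetical and not part of any Earth Engine API. The sketch also assumes that the shift in newer scenes is an upward offset of 1000 that must be subtracted before dividing by 10000; the description above states only that the range is shifted.

```python
from datetime import datetime

# Timestamp layout used in the asset id, e.g. 20151128T002653
_TIME_FORMAT = "%Y%m%dT%H%M%S"


def parse_s2_asset_id(asset_id):
    """Split a Sentinel-2 asset id into sensing time, generation time, granule."""
    # Keep only the last path component, dropping the collection prefix.
    name = asset_id.rsplit("/", 1)[-1]
    sensing, generation, granule = name.split("_")
    return {
        "sensing_time": datetime.strptime(sensing, _TIME_FORMAT),
        "generation_time": datetime.strptime(generation, _TIME_FORMAT),
        "granule": granule,
    }


def dn_to_toa_reflectance(dn, processing_baseline, already_harmonized=False):
    """Convert a DN to TOA reflectance (hypothetical helper).

    Assumption: scenes with baseline '04.00' or above carry a +1000 offset
    unless they come from the HARMONIZED collection, which has removed it.
    """
    offset = 0
    if not already_harmonized and float(processing_baseline) >= 4.0:
        offset = 1000
    return (dn - offset) / 10000.0


info = parse_s2_asset_id("COPERNICUS/S2/20151128T002653_20151128T102149_T56MNN")
print(info["granule"], info["sensing_time"].isoformat())
```

With data read from the HARMONIZED collection, `already_harmonized=True` would apply, so only the division by 10000 remains.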
Harmonized Landsat Sentinel is a NASA initiative to produce a Virtual Constellation of surface reflectance (SR) data from the Operational Land Imager (OLI) and Multi-Spectral Instrument (MSI) aboard the Landsat 8-9 and Sentinel-2 remote sensing satellites, respectively. The combined measurement enables global observations of the land every 2–3 days. Input products are Landsat 8-9 Collection 2 Level 1 top-of-atmosphere reflectance and Sentinel-2 L1C top-of-atmosphere reflectance, which NASA radiometrically harmonizes to the maximum extent, resamples to common 30-meter resolution, and grids using the Sentinel-2 Military Grid Reference System (MGRS) UTM grid. Because of this, the products are different from Landsat 8-9 Collection 2 Level 2 surface reflectance and Sentinel-2 L2A surface reflectance.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Survey-based Harmonized Indicators (SHIP) files are harmonized data files from household surveys conducted by countries in Africa. To ensure the quality and transparency of the data, it is critical to document the procedures for compiling consumption aggregates and other indicators so that the results can be replicated with ease. This process enables consistency and continuity, which make temporal and cross-country comparisons more reliable. Four harmonized data files are prepared for each survey to generate a set of harmonized variables that have the same variable names. Invariably, questions are asked in a slightly different way in each survey, which poses challenges for the consistent definition of harmonized variables. The harmonized household survey data present the best available variables with harmonized definitions, but not identical variables. The four harmonized data files are:
a) Individual level file (labor force indicators are in a separate file): This file has information on basic characteristics of individuals such as age and sex, literacy, education, health, anthropometry and child survival.
b) Labor force file: This file has information on the labor force, including employment/unemployment, earnings, sectors of employment, etc.
c) Household level file: This file has information on household expenditure, household head characteristics (age and sex, level of education, employment), housing amenities, assets, and access to infrastructure and services.
d) Household expenditure file: This file has consumption/expenditure aggregates by consumption group according to the UN Classification of Individual Consumption According to Purpose (COICOP).
To facilitate the use of data collected through the high-frequency phone surveys on COVID-19, the Living Standards Measurement Study (LSMS) team has created harmonized datafiles using two household surveys: 1) the country’s latest face-to-face survey, which has become the sample frame for the phone survey, and 2) the country’s high-frequency phone survey on COVID-19.
The LSMS team has extracted and harmonized variables from these surveys, based on the harmonized definitions and ensuring the same variable names. These variables include demography as well as housing, household consumption expenditure, food security, and agriculture. Inevitably, many of the original variables are collected using questions that are asked differently. The harmonized datafiles include the best available variables with harmonized definitions.
Two harmonized datafiles are prepared for each survey. The two datafiles are:
1. HH: This datafile contains household-level variables. The information includes basic household characteristics, housing, water and sanitation, asset ownership, consumption expenditure, consumption quintile, food security, and livestock ownership. It also contains information on agricultural activities such as crop cultivation, use of organic and inorganic fertilizer, hired labor, use of tractors and crop sales.
2. IND: This datafile contains individual-level variables. It includes basic characteristics of individuals such as age, sex, marital status, disability status, literacy, education and work.
National coverage
The survey covered all de jure households excluding prisons, hospitals, military barracks, and school dormitories.
Sample survey data [ssd]
See “Malawi - Integrated Household Panel Survey 2010-2013-2016-2019 (Long-Term Panel, 102 EAs)” and “Malawi - High-Frequency Phone Survey on COVID-19” available in the Microdata Library for details.
Computer Assisted Personal Interview [capi]
Malawi Integrated Household Panel Survey (IHPS) 2019 and Malawi High-Frequency Phone Survey on COVID-19 data were harmonized following the harmonization guidelines (see “Harmonized Datafiles and Variables for High-Frequency Phone Surveys on COVID-19” for more details).
The high-frequency phone survey on COVID-19 has multiple rounds of data collection. When variables are extracted from multiple rounds of the survey, the originating round of the survey is noted with “_rX” in the variable name, where X represents the number of the round. For example, “_r3” in a variable name indicates that the variable was extracted from Round 3 of the high-frequency phone survey. Round 0 refers to the country’s latest face-to-face survey, which has become the sample frame for the high-frequency phone surveys on COVID-19. Variables without “_rX” were extracted from Round 0.
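The round naming convention can be handled programmatically. The following minimal Python sketch recovers the base name and round number from a variable name. It assumes that “_rX” appears as a suffix at the end of the name, which the description does not state explicitly, and the variable names used are hypothetical rather than taken from the datafiles.

```python
import re

# Assumption: the round marker is a trailing "_r" followed by digits.
_ROUND_SUFFIX = re.compile(r"_r(\d+)$")


def split_round(variable_name):
    """Return (base_name, round_number) for a harmonized variable name.

    Names without a round marker are treated as Round 0, i.e. the
    face-to-face survey that served as the sample frame.
    """
    match = _ROUND_SUFFIX.search(variable_name)
    if match is None:
        return variable_name, 0
    return variable_name[:match.start()], int(match.group(1))


# Hypothetical variable names, for illustration only.
for name in ["food_insecure_r3", "hh_size"]:
    print(name, split_round(name))
```

Grouping variables by the returned round number is one way to assemble a round-by-round panel from the HH or IND datafile.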
See “Malawi - Integrated Household Panel Survey 2010-2013-2016-2019 (Long-Term Panel, 102 EAs)” and “Malawi - High-Frequency Phone Survey on COVID-19” available in the Microdata Library for details.
This data set describes select global soil parameters from the Harmonized World Soil Database (HWSD) v1.2, including additional calculated parameters such as area-weighted soil organic carbon (kg C per m2), as high-resolution NetCDF files. These data were regridded and upscaled from the Harmonized World Soil Database v1.2.
The HWSD provides information for addressing emerging problems of land competition for food production, bio-energy demand and threats to biodiversity and can be used as input to model global carbon cycles.
The data are presented as a series of 27 NetCDF v3/v4 (*.nc4) files at 0.05-degree spatial resolution, and one NetCDF file regridded to the Community Land Model (CLM) grid cell resolution (0.9 degree x 1.25 degree) for the nominal year of 2000.
This SOils DAta Harmonization (SoDaH) database is designed to bring together soil carbon data from diverse research networks into a harmonized dataset that can be used for synthesis activities and model development. The research network sources for SoDaH span different biomes and climates, encompass multiple ecosystem types, and have collected data across a range of spatial, temporal, and depth gradients. The rich data sets assembled in SoDaH consist of observations from monitoring efforts and long-term ecological experiments. The SoDaH database also incorporates related environmental covariate data pertaining to climate, vegetation, soil chemistry, and soil physical properties. The data are harmonized and aggregated using open-source code that enables a scripted, repeatable approach for soil data synthesis.
To facilitate the use of data collected through the high-frequency phone surveys on COVID-19, the Living Standards Measurement Study (LSMS) team has created harmonized datafiles using two household surveys: 1) the country’s latest face-to-face survey, which has become the sample frame for the phone survey, and 2) the country’s high-frequency phone survey on COVID-19.
The LSMS team has extracted and harmonized variables from these surveys, based on the harmonized definitions and ensuring the same variable names. These variables include demography as well as housing, household consumption expenditure, food security, and agriculture. Inevitably, many of the original variables are collected using questions that are asked differently. The harmonized datafiles include the best available variables with harmonized definitions.
Two harmonized datafiles are prepared for each survey. The two datafiles are:
1. HH: This datafile contains household-level variables. The information includes basic household characteristics, housing, water and sanitation, asset ownership, consumption expenditure, consumption quintile, food security, and livestock ownership. It also contains information on agricultural activities such as crop cultivation, use of organic and inorganic fertilizer, hired labor, use of tractors and crop sales.
2. IND: This datafile contains individual-level variables. It includes basic characteristics of individuals such as age, sex, marital status, disability status, literacy, education and work.
National coverage
The survey covered all de jure households excluding prisons, hospitals, military barracks, and school dormitories.
Sample survey data [ssd]
See “Ethiopia - Socioeconomic Survey 2018-2019” and “Ethiopia - COVID-19 High Frequency Phone Survey of Households 2020” available in the Microdata Library for details.
Computer Assisted Personal Interview [capi]
Ethiopia Socioeconomic Survey (ESS) 2018-2019 and Ethiopia COVID-19 High Frequency Phone Survey of Households (HFPS) 2020 data were harmonized following the harmonization guidelines (see “Harmonized Datafiles and Variables for High-Frequency Phone Surveys on COVID-19” for more details).
The high-frequency phone survey on COVID-19 has multiple rounds of data collection. When variables are extracted from multiple rounds of the survey, the originating round of the survey is noted with “_rX” in the variable name, where X represents the number of the round. For example, “_r3” in a variable name indicates that the variable was extracted from Round 3 of the high-frequency phone survey. Round 0 refers to the country’s latest face-to-face survey, which has become the sample frame for the high-frequency phone surveys on COVID-19. Variables without “_rX” were extracted from Round 0.
See “Ethiopia - Socioeconomic Survey 2018-2019” and “Ethiopia - COVID-19 High Frequency Phone Survey of Households 2020” available in the Microdata Library for details.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
Meta-analysis sample size of harmonized variables for each study.
ST_LUCAS is a harmonized dataset derived from the LUCAS (Land Use and Coverage Area frame Survey) dataset. LUCAS is a Eurostat activity that has performed repeated in situ surveys over Europe every three years since 2006. Original LUCAS data (https://ec.europa.eu/eurostat/web/lucas/data), starting with the 2006 survey, were harmonized into a common nomenclature based on the 2018 survey. The ST_LUCAS dataset is provided in two versions:
lucas_points: each LUCAS survey is represented by a single record
lucas_st_points: each LUCAS point is represented by a single location calculated from multiple surveys and by a set of harmonized attributes for each survey year
Harmonization and space-aggregation of LUCAS data were performed by the ST_LUCAS system, available from https://geoforall.fsv.cvut.cz/st_lucas. The methodology is described in Landa, M.; Brodský, L.; Halounová, L.; Bouček, T.; Pešek, O. Open Geospatial System for LUCAS In Situ Data Harmonization and Distribution. ISPRS Int. J. Geo-Inf. 2022, 11, 361. https://doi.org/10.3390/ijgi11070361.
List of harmonized LUCAS attributes: https://geoforall.fsv.cvut.cz/st_lucas/tables/list_of_attributes.html
The ST_LUCAS dataset is provided under the same conditions (“free of charge”) as the original LUCAS data (https://ec.europa.eu/eurostat/web/lucas/data).
This dataset contains the final analysis cases used in our paper Psychosocial Constructs Related to Alcohol, Cigarette, and Marijuana Use: An Integrated and Harmonized Analysis. We assembled raw data from 25 longitudinal research projects. We collected data from our own research projects (7 projects) as well as data provided by 18 researchers. Datasets included epidemiological studies and prevention studies. For the latter, only control group and pretest data were included. All data, including surveys and projects, have been de-identified.
https://fred.stlouisfed.org/legal/#copyright-citation-required
Graph and download economic data for Harmonized Index of Consumer Prices: All-Items HICP for United States (CP0000USM086NEST) from Dec 2001 to Dec 2024 about harmonized, all items, CPI, price index, indexes, price, and USA.
This dataset is the current 2025 Harmonized Tariff Schedule plus all revisions for the current year. It provides the applicable tariff rates and statistical categories for all merchandise imported into the United States; it is based on the international Harmonized System, the global system of nomenclature that is used to describe most world trade in goods.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
The Danish and Greenlandic MIN4EU INSPIRE dataset contains spatial features extracted and harmonized from the GEUS Mineral deposit database to the MIN4EU data model. The mineral deposit database contains all mineral deposits and occurrences in Denmark and Greenland. Data are based on publicly available information on the deposits, including published literature and archived reports.
This dataset is the current 2024 Harmonized Tariff Schedule plus all revisions for the current year. It provides the applicable tariff rates and statistical categories for all merchandise imported into the United States; it is based on the international Harmonized System, the global system of nomenclature that is used to describe most world trade in goods.
The dataset contains spatial features extracted and harmonized from GTK’s Mineral deposit database to MIN4EU data model. The mineral deposit database contains all mineral deposits, occurrences, and prospects in Finland. Data is based on all public information on the deposits available including published literature, archive reports, press releases and companies’ web pages.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
MIN4EU harmonized dataset - "electronic EUROPEAN MINERALS YEARBOOK" - resources, reserves and exploration data. Data sets collected in the MINTELL4EU project (reference year of 2019) as well as in the Minerals4EU project by (reference year 2011, 2012 or 2013).
The Sudan Household Health Survey 2nd round (SHHS2) 2010 provides up-to-date information on the situation of children and women and measures of key indicators that allow countries to monitor progress towards the Millennium Development Goals (MDGs) and other internationally agreed upon commitments.
The raw survey data provided by the Statistical Office were then harmonized by the Economic Research Forum, to create a comparable version with the 2006 Household Health Survey in Sudan. Harmonization at this stage only included unifying variables' names, labels and some definitions. See: Sudan 2006 & 2010- Variables Mapping & Availability Matrix.pdf provided in the external resources for further information on the mapping of the original variables on the harmonized ones, in addition to more indications on the variables' availability in both survey years and relevant comments.
The sample harmonized and disseminated by the Economic Research Forum represents Northern Sudan only.
The Sudan Household Health Survey (SHHS) 2010 dataset covers the states of Northern Sudan only (Northern, River Nile, Red Sea, Kassala, Gedarif, Khartoum, Gezira, White Nile, Sinnar, Blue Nile, North Kordofan, South Kordofan, North Darfur, West Darfur and South Darfur).
1- Household/family. 2- Individual/person. 3- Woman. 4- Child.
The target universe for the SHHS includes the households and members of individual households, including nomadic households camping at a location/place at the time of the survey. The population living in institutions and group quarters such as hospitals, military bases and prisons, were excluded from the sampling frame.
Sample survey data [ssd]
Face-to-face [f2f]
Five sets of questionnaires were used in the Sudan Household Health Survey. The first three questionnaires are based on the MICS3 and PAPFAM model questionnaires. Those three were subject to harmonization.
1) Household questionnaire, which was used to collect information on all de jure household members and the household. It included the following modules:
- Household information panel
- Household listing
- Education
- Female Genital Mutilation
- Chronic diseases & injuries (Northern States only)
- Tobacco use (Northern States only)
- Child disability
- Water and sanitation
- Household characteristics
- Insecticide treated nets
- Salt iodization
2) Women's questionnaire administered to all women aged 15-49 years in each household. It included the following modules:
- Women's information panel
- Women's background
- Child mortality
- Desire for last birth
- Maternal and newborn health
- Illness symptoms
- Contraception
- Unmet need
- Marriage and union
- HIV/AIDS
- Birth history
- Female Genital Mutilation
- Attitudes towards domestic violence
- Sexual behavior STIs (Southern States only)
3) Under-five questionnaire administered to mothers. In case the mother was not listed in the household list/roster, a primary caretaker for the child was identified and interviewed. The Questionnaire for Children under Five included the following modules:
- Under-five children information panel
- Birth registration
- Vitamin A supplementation
- Breastfeeding
- Care of illness
- Immunization
- Malaria
- Anthropometry
4) Men's questionnaire administered to all men aged 15-49 years in each household. It included the following modules:
- Men's information panel
- Men's background
- Marriage
- Circumcision
- Condom
- Sexual behavior STIs
- HIV/AIDS
5) Food Security Questionnaire, which included the following modules:
- Food security information panel
- Income sources
- Expenditures
- Food consumption and dietary diversity
In addition to the administration of questionnaires, fieldwork teams tested the salt used for cooking in the households for iodine content, and measured the weights and heights of children under five years of age.
---> Harmonized Data:
Of the 15,000 households selected for the sample, 14,778 were successfully interviewed, yielding a response rate of 99 percent. Of the 18,614 women (age 15-49 years) identified in the selected households, 17,174 were successfully interviewed, yielding a response rate of 91.4 percent. Of the 13,587 children under age five listed in the households, questionnaires were completed for 13,282 children, which corresponds to a response rate of 96.8 percent.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
Selecting study datasets for the HU use harmonization and secondary analysis using the CureSCi Metadata Catalog.
Version 2.0.0 of the study is outdated and therefore only available in exceptional cases. Please use the revised version 3.0.0.
We carried out a harmonization of the Eurobarometer 2004-2021 (spring). This dataset includes 35 single standard Eurobarometers and more than 140 variables about EU policies, attitudes towards Europe and the EU, identity, cognitive mobilization, political institutions, socio-political characteristics and partisanship, etc.
The harmonization was carried out using existing Eurobarometer datasets published by GESIS. To allow the user to replicate the harmonization and modify some of the code if needed, we publish one example of the do-files used to carry out the harmonization, as well as the corresponding (harmonized) dataset. The user can find the do-file containing the code used to modify and clean EB 953 (ZA7783, conducted in spring 2021) according to the harmonization procedure that we followed. Moreover, the user can find the cleaned dataset for EB 953 that was obtained after running the do-file. The files are named “EB 953.do” and “953_new.dta”.
We include: - a harmonized dataset ("harmonised_EB_2004-2021.dta"), - a technical report ("User Guide Harmonized Eurobarometer 2004-2021"), - a summary of the original survey questions corresponding to the variables included in the dataset ("Trends_EBs_1970-2021.xlsx"), - one of the do-files used to carry out the harmonization (“EB 953.do” ), - one of the datasets used before merging all datasets (“953_new.dta”).