ODC Public Domain Dedication and Licence (PDDL) v1.0 http://www.opendatacommons.org/licenses/pddl/1.0/
License information was derived automatically
Population figures for countries, regions (e.g. Asia) and the world. Data comes originally from World Bank and has been converted into standard CSV.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The "Forest Proximate People" (FPP) dataset is one of the data layers contributing to the development of indicator #13, “number of forest-dependent people in extreme poverty,” of the Collaborative Partnership on Forests (CPF) Global Core Set of forest-related indicators (GCS). The FPP dataset provides an estimate of the number of people living in or within 5 kilometers of forests (forest-proximate people) for the year 2019 with a spatial resolution of 100 meters at a global level.
For more detail, such as the theory behind this indicator and the definition of parameters, and to cite this data, see: Newton, P., Castle, S.E., Kinzer, A.T., Miller, D.C., Oldekop, J.A., Linhares-Juvenal, T., Pina, L., Madrid, M., & de Lamo, J. 2022. The number of forest- and tree-proximate people: A new methodology and global estimates. Background Paper to The State of the World’s Forests 2022 report. Rome, FAO.
Contact points:
Maintainer: Leticia Pina
Maintainer: Sarah E. Castle
Data lineage:
The FPP data are generated using Google Earth Engine. Forests are defined by the Copernicus Global Land Cover (CGLC) (Buchhorn et al. 2020) classification system’s definition of forests: tree cover ranging from 15-100%, with or without understory of shrubs and grassland, and including both open and closed forests. Any area classified as forest sized ≥ 1 ha in 2019 was included in this definition. Population density was defined by the WorldPop global population data for 2019 (WorldPop 2018). High density urban populations were excluded from the analysis. High density urban areas were defined as any contiguous area with a total population (using 2019 WorldPop data for population) of at least 50,000 people and comprised of pixels all of which met at least one of two criteria: either the pixel a) had at least 1,500 people per square km, or b) was classified as “built-up” land use by the CGLC dataset (where “built-up” was defined as land covered by buildings and other manmade structures) (Dijkstra et al. 2020). Using these datasets, any rural people living in or within 5 kilometers of forests in 2019 were classified as forest proximate people. Euclidean distance was used as the measure to create a 5-kilometer buffer zone around each forest cover pixel. The scripts for generating the forest-proximate people and the rural-urban datasets using different parameters or for different years are published and available to users. For more detail, such as the theory behind this indicator and the definition of parameters, and to cite this data, see: Newton, P., Castle, S.E., Kinzer, A.T., Miller, D.C., Oldekop, J.A., Linhares-Juvenal, T., Pina, L., Madrid, M., & de Lamo, J. 2022. The number of forest- and tree-proximate people: a new methodology and global estimates. Background Paper to The State of the World’s Forests 2022. Rome, FAO.
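The buffer logic described above can be illustrated on a toy grid. This is only a minimal sketch in plain Python, not the published Google Earth Engine scripts: the function name, the grid layout, and the 2.5 km cell size are assumptions chosen for readability (the real dataset uses 100 m pixels), and the brute-force distance search would not scale to a global raster.

```python
import math

def forest_proximate_population(forest, population, urban, cell_km, cutoff_km=5.0):
    """Sum the rural population in cells within cutoff_km of any forest cell.

    forest, urban: 2-D lists of booleans; population: 2-D list of counts.
    Distances are Euclidean, measured between cell centres.
    """
    rows, cols = len(forest), len(forest[0])
    forest_cells = [(r, c) for r in range(rows) for c in range(cols) if forest[r][c]]
    total = 0.0
    for r in range(rows):
        for c in range(cols):
            if urban[r][c]:
                continue  # high-density urban cells are excluded from the count
            near_forest = any(
                math.hypot(r - fr, c - fc) * cell_km <= cutoff_km
                for fr, fc in forest_cells
            )
            if near_forest:
                total += population[r][c]
    return total

# Hypothetical 1 x 5 strip of 2.5 km cells with forest in the first cell.
forest = [[True, False, False, False, False]]
urban = [[False, True, False, False, False]]
population = [[10, 20, 30, 40, 50]]
print(forest_proximate_population(forest, population, urban, cell_km=2.5))  # 40.0
```

In this example the cells at 0 km and 5 km are counted (10 + 30), the cell at 2.5 km is skipped because it is flagged urban, and the cells at 7.5 km and 10 km fall outside the buffer.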
References:
Buchhorn, M., Smets, B., Bertels, L., De Roo, B., Lesiv, M., Tsendbazar, N.E., Herold, M., Fritz, S., 2020. Copernicus Global Land Service: Land Cover 100m: collection 3 epoch 2019. Globe.
Dijkstra, L., Florczyk, A.J., Freire, S., Kemper, T., Melchiorri, M., Pesaresi, M. and Schiavina, M., 2020. Applying the degree of urbanisation to the globe: A new harmonised definition reveals a different picture of global urbanisation. Journal of Urban Economics, p.103312.
WorldPop (www.worldpop.org - School of Geography and Environmental Science, University of Southampton; Department of Geography and Geosciences, University of Louisville; Departement de Geographie, Universite de Namur) and Center for International Earth Science Information Network (CIESIN), Columbia University, 2018. Global High Resolution Population Denominators Project - Funded by The Bill and Melinda Gates Foundation (OPP1134076). https://dx.doi.org/10.5258/SOTON/WP00645
Online resources:
GEE asset for "Forest proximate people - 5km cutoff distance"
The Afrobarometer project assesses attitudes and public opinion on democracy, markets, and civil society in several sub-Saharan African countries. This dataset was compiled from the studies in Round 2 of the Afrobarometer, conducted from 2002-2004 in 16 countries: Botswana, Cape Verde, Ghana, Kenya, Lesotho, Malawi, Mali, Mozambique, Namibia, Nigeria, Senegal, South Africa, Tanzania, Uganda, Zambia, and Zimbabwe.
The Round 2 Afrobarometer surveys have national coverage for the following countries: Botswana, Ghana, Kenya, Lesotho, Malawi, Mali, Mozambique, Namibia, Nigeria, Republic of Cabo Verde, Senegal, South Africa, Tanzania, Uganda, Zambia, Zimbabwe.
Individuals
The sample universe for Afrobarometer surveys includes all citizens of voting age within the country. In other words, we exclude anyone who is not a citizen and anyone who has not attained this age (usually 18 years) on the day of the survey. Also excluded are areas determined to be either inaccessible or not relevant to the study, such as those experiencing armed conflict or natural disasters, as well as national parks and game reserves. As a matter of practice, we have also excluded people living in institutionalized settings, such as students in dormitories and persons in prisons or nursing homes.
What to do about areas experiencing political unrest? On the one hand we want to include them because they are politically important. On the other hand, we want to avoid stretching out the fieldwork over many months while we wait for the situation to settle down. It was agreed at the 2002 Cape Town Planning Workshop that it is difficult to come up with a general rule that will fit all imaginable circumstances. We will therefore make judgments on a case-by-case basis on whether or not to proceed with fieldwork or to exclude or substitute areas of conflict. National Partners are requested to consult Core Partners on any major delays, exclusions or substitutions of this sort.
Sample survey data [ssd]
Afrobarometer uses national probability samples designed to meet the following criteria. Samples are designed to generate a sample that is a representative cross-section of all citizens of voting age in a given country. The goal is to give every adult citizen an equal and known chance of being selected for an interview. They achieve this by:
• using random selection methods at every stage of sampling;
• sampling at all stages with probability proportionate to population size wherever possible to ensure that larger (i.e., more populated) geographic units have a proportionally greater probability of being chosen into the sample.
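Selection with probability proportionate to population size can be sketched as follows. This is one common way to do it (systematic selection along the cumulative population total), shown under assumptions: the function name, the unit list, and the seed are hypothetical, and the source does not say which PPS procedure Afrobarometer partners actually use.

```python
import random

def pps_systematic(units, k, rng):
    """Pick k units with probability proportional to size.

    units: list of (name, population) pairs.
    A unit larger than the sampling interval can be picked more than once.
    """
    total = sum(size for _, size in units)
    interval = total / k
    start = rng.random() * interval  # random start in [0, interval)
    targets = [start + i * interval for i in range(k)]
    chosen, cumulative, idx = [], 0.0, 0
    for name, size in units:
        cumulative += size
        # Every target that falls inside this unit's cumulative range selects it.
        while idx < k and targets[idx] < cumulative:
            chosen.append(name)
            idx += 1
    return chosen

# Hypothetical geographic units with made-up population counts.
units = [("A", 1000), ("B", 3000), ("C", 500), ("D", 5500)]
print(pps_systematic(units, k=2, rng=random.Random(0)))
```

With these numbers the sampling interval is 5,000, so unit D (5,500 people) is certain to be selected, while the smaller units compete for the other slot in proportion to their size.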
The sampling universe normally includes all citizens age 18 and older. As a standard practice, we exclude people living in institutionalized settings, such as students in dormitories, patients in hospitals, and persons in prisons or nursing homes. Occasionally, we must also exclude people living in areas determined to be inaccessible due to conflict or insecurity. Any such exclusion is noted in the technical information report (TIR) that accompanies each data set.
Sample size and design
Samples usually include either 1,200 or 2,400 cases. A randomly selected sample of n=1200 cases allows inferences to national adult populations with a margin of sampling error of no more than +/-2.8% with a confidence level of 95 percent. With a sample size of n=2400, the margin of error decreases to +/-2.0% at 95 percent confidence level.
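The quoted margins can be reproduced with the standard formula for a proportion under simple random sampling. This is a sketch under stated assumptions: it uses the worst case p = 0.5 and z = 1.96 for 95 percent confidence, and it ignores any design effect from clustering or stratification.

```python
import math

def margin_of_error(n, p=0.5, z=1.96):
    """Half-width of the confidence interval for a proportion (simple random sample)."""
    return z * math.sqrt(p * (1 - p) / n)

for n in (1200, 2400):
    print(n, f"+/-{100 * margin_of_error(n):.1f}%")
# 1200 +/-2.8%
# 2400 +/-2.0%
```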
The sample design is a clustered, stratified, multi-stage, area probability sample. Specifically, we first stratify the sample according to the main sub-national unit of government (state, province, region, etc.) and by urban or rural location.
Area stratification reduces the likelihood that distinctive ethnic or language groups are left out of the sample. Afrobarometer occasionally purposely oversamples certain populations that are politically significant within a country to ensure that the size of the sub-sample is large enough to be analysed. Any oversample is noted in the TIR.
Sample stages
Samples are drawn in either four or five stages:
Stage 1: In rural areas only, the first stage is to draw secondary sampling units (SSUs). SSUs are not used in urban areas, and in some countries they are not used in rural areas. See the TIR that accompanies each data set for specific details on the sample in any given country.
Stage 2: We randomly select primary sampling units (PSU).
Stage 3: We then randomly select sampling start points.
Stage 4: Interviewers then randomly select households.
Stage 5: Within the household, the interviewer randomly selects an individual respondent. Each interviewer alternates in each household between interviewing a man and interviewing a woman to ensure gender balance in the sample.
To keep the costs and logistics of fieldwork within manageable limits, eight interviews are clustered within each selected PSU.
Data weights
For some national surveys, data are weighted to correct for over- or under-sampling or for household size. "Withinwt" should be turned on for all national-level descriptive statistics in countries that contain this weighting variable. It is included as the last variable in the data set, with details described in the codebook. For merged data sets, "Combinwt" should be turned on for cross-national comparisons of descriptive statistics. Note: this weighting variable standardizes each national sample as if it were equal in size.
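The effect of turning a weight on can be seen in a small example. This is a sketch with made-up responses and weights; only the variable name withinwt mirrors the data set, and the helper function is hypothetical.

```python
def weighted_share(responses, weights, category):
    """Weighted proportion of respondents giving the stated answer."""
    total = sum(weights)
    matching = sum(w for r, w in zip(responses, weights) if r == category)
    return matching / total

# Hypothetical answers and within-country weights for four respondents.
responses = ["yes", "no", "yes", "no"]
withinwt = [1.5, 0.5, 1.0, 1.0]

print(weighted_share(responses, [1.0] * 4, "yes"))  # unweighted: 0.5
print(weighted_share(responses, withinwt, "yes"))   # weighted: 0.625
```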
Further information on sampling protocols, including full details of the methodologies used for each stage of sample selection, can be found at https://afrobarometer.org/surveys-and-methods/sampling-principles
Face-to-face [f2f]
Certain questions in the questionnaires for the Afrobarometer 2 survey addressed country-specific issues, but many of the same questions were asked across surveys. Citizens of the 16 countries were asked questions about their economic and social situations, and their opinions were elicited on recent political and economic changes within their country.
Attribution-NonCommercial-ShareAlike 3.0 (CC BY-NC-SA 3.0) https://creativecommons.org/licenses/by-nc-sa/3.0/
License information was derived automatically
The "Forest Proximate People" (FPP) dataset is one of the data layers contributing to the development of indicator #13, “number of forest-dependent people in extreme poverty,” of the Collaborative Partnership on Forests (CPF) Global Core Set of forest-related indicators (GCS). The FPP dataset provides an estimate of the number of people living in or within 5 kilometers of forests (forest-proximate people) for the year 2019 with a spatial resolution of 100 meters at a global level.
For more detail, such as the theory behind this indicator and the definition of parameters, and to cite this data, see: Newton, P., Castle, S.E., Kinzer, A.T., Miller, D.C., Oldekop, J.A., Linhares-Juvenal, T., Madrid, M., & Pina, L. 2022. The number of forest- and tree-proximate people: a new methodology and global estimates. Rome, FAO.
Contact points:
Metadata Contact: Leticia Pina
Resource Contact: Sarah E. Castle
Data lineage:
The FPP data are generated using Google Earth Engine. Forests are defined by the Copernicus Global Land Cover (CGLC) (Buchhorn et al. 2020) classification system’s definition of forests: tree cover ranging from 15-100%, with or without understory of shrubs and grassland, and including both open and closed forests. Any area classified as forest sized ≥ 1 ha in 2019 was included in this definition. Population density was defined by the WorldPop global population data for 2019 (WorldPop 2018). High density urban populations were excluded from the analysis. High density urban areas were defined as any contiguous area with a total population (using 2019 WorldPop data for population) of at least 50,000 people and comprised of pixels all of which met at least one of two criteria: either the pixel a) had at least 1,500 people per square km, or b) was classified as “built-up” land use by the CGLC dataset (where “built-up” was defined as land covered by buildings and other manmade structures) (Dijkstra et al. 2020). Using these datasets, any rural people living in or within 5 kilometers of forests in 2019 were classified as forest proximate people. Euclidean distance was used as the measure to create a 5-kilometer buffer zone around each forest cover pixel. The scripts for generating the forest-proximate people and the rural-urban datasets using different parameters or for different years are published and available to users. For more detail, such as the theory behind this indicator and the definition of parameters, and to cite this data, see: Newton, P., Castle, S.E., Kinzer, A.T., Miller, D.C., Oldekop, J.A., Linhares-Juvenal, T., Madrid, M., & Pina, L. 2022. The number of forest- and tree-proximate people: a new methodology and global estimates. Rome, FAO.
References: Buchhorn, M., Smets, B., Bertels, L., De Roo, B., Lesiv, M., Tsendbazar, N.E., Herold, M., Fritz, S., 2020. Copernicus Global Land Service: Land Cover 100m: collection 3 epoch 2019. Globe.
Dijkstra, L., Florczyk, A.J., Freire, S., Kemper, T., Melchiorri, M., Pesaresi, M. and Schiavina, M., 2020. Applying the degree of urbanisation to the globe: A new harmonised definition reveals a different picture of global urbanisation. Journal of Urban Economics, p.103312.
WorldPop (www.worldpop.org - School of Geography and Environmental Science, University of Southampton; Department of Geography and Geosciences, University of Louisville; Departement de Geographie, Universite de Namur) and Center for International Earth Science Information Network (CIESIN), Columbia University, 2018. Global High Resolution Population Denominators Project - Funded by The Bill and Melinda Gates Foundation (OPP1134076). https://dx.doi.org/10.5258/SOTON/WP00645
Online resources:
Based on information released by the White House, with details of trade between the US and other countries. You will find the relevant information for each country, including exports, imports and deficit (or surplus).
Version 2 includes population (if data is available). Figures gathered from https://datahub.io/core/population
Round 1 of the Afrobarometer survey was conducted from July 1999 through June 2001 in 12 African countries, to solicit public opinion on democracy, governance, markets, and national identity. The full 12-country dataset was pieced together from different projects: Round 1 of the Afrobarometer survey, the old Southern African Democracy Barometer, and similar surveys done in West and East Africa.
The 7-country dataset is a subset of the Round 1 survey dataset, and consists of a combined dataset for the 7 Southern African countries surveyed with other African countries in Round 1, 1999-2000 (Botswana, Lesotho, Malawi, Namibia, South Africa, Zambia and Zimbabwe). It is a useful dataset because, in contrast to the full 12-country Round 1 dataset, all countries in this dataset were surveyed with the identical questionnaire.
Botswana, Lesotho, Malawi, Namibia, South Africa, Zambia, Zimbabwe
Basic units of analysis that the study investigates include: individuals and groups
Sample survey data [ssd]
A new sample has to be drawn for each round of Afrobarometer surveys. Whereas the standard sample size for Round 3 surveys will be 1200 cases, a larger sample size will be required in societies that are extremely heterogeneous (such as South Africa and Nigeria), where the sample size will be increased to 2400. Other adaptations may be necessary within some countries to account for the varying quality of the census data or the availability of census maps.
The sample is designed as a representative cross-section of all citizens of voting age in a given country. The goal is to give every adult citizen an equal and known chance of selection for interview. We strive to reach this objective by (a) strictly applying random selection methods at every stage of sampling and by (b) applying sampling with probability proportionate to population size wherever possible. A randomly selected sample of 1200 cases allows inferences to national adult populations with a margin of sampling error of no more than plus or minus 2.5 percent with a confidence level of 95 percent. If the sample size is increased to 2400, the confidence interval shrinks to plus or minus 2 percent.
Sample Universe
The sample universe for Afrobarometer surveys includes all citizens of voting age within the country. In other words, we exclude anyone who is not a citizen and anyone who has not attained this age (usually 18 years) on the day of the survey. Also excluded are areas determined to be either inaccessible or not relevant to the study, such as those experiencing armed conflict or natural disasters, as well as national parks and game reserves. As a matter of practice, we have also excluded people living in institutionalized settings, such as students in dormitories and persons in prisons or nursing homes.
What to do about areas experiencing political unrest? On the one hand we want to include them because they are politically important. On the other hand, we want to avoid stretching out the fieldwork over many months while we wait for the situation to settle down. It was agreed at the 2002 Cape Town Planning Workshop that it is difficult to come up with a general rule that will fit all imaginable circumstances. We will therefore make judgments on a case-by-case basis on whether or not to proceed with fieldwork or to exclude or substitute areas of conflict. National Partners are requested to consult Core Partners on any major delays, exclusions or substitutions of this sort.
Sample Design
The sample design is a clustered, stratified, multi-stage, area probability sample.
To repeat the main sampling principle, the objective of the design is to give every sample element (i.e. adult citizen) an equal and known chance of being chosen for inclusion in the sample. We strive to reach this objective by (a) strictly applying random selection methods at every stage of sampling and by (b) applying sampling with probability proportionate to population size wherever possible.
In a series of stages, geographically defined sampling units of decreasing size are selected. To ensure that the sample is representative, the probability of selection at various stages is adjusted as follows:
• The sample is stratified by key social characteristics in the population such as sub-national area (e.g. region/province) and residential locality (urban or rural). The area stratification reduces the likelihood that distinctive ethnic or language groups are left out of the sample. And the urban/rural stratification is a means to make sure that these localities are represented in their correct proportions.
• Wherever possible, and always in the first stage of sampling, random sampling is conducted with probability proportionate to population size (PPPS). The purpose is to guarantee that larger (i.e., more populated) geographical units have a proportionally greater probability of being chosen into the sample.
The sampling design has four stages:
A first stage to stratify and randomly select primary sampling units;
A second stage to randomly select sampling start-points;
A third stage to randomly choose households;
A final stage involving the random selection of individual respondents.
We shall deal with each of these stages in turn.
STAGE ONE: Selection of Primary Sampling Units (PSUs)
The primary sampling units (PSU's) are the smallest, well-defined geographic units for which reliable population data are available. In most countries, these will be Census Enumeration Areas (or EAs). Most national census data and maps are broken down to the EA level. In the text that follows we will use the acronyms PSU and EA interchangeably because, when census data are employed, they refer to the same unit.
We strongly recommend that NIs use official national census data as the sampling frame for Afrobarometer surveys. Where recent or reliable census data are not available, NIs are asked to inform the relevant Core Partner before they substitute any other demographic data. Where the census is out of date, NIs should consult a demographer to obtain the best possible estimates of population growth rates. These should be applied to the outdated census data in order to make projections of population figures for the year of the survey. It is important to bear in mind that population growth rates vary by area (region) and (especially) between rural and urban localities. Therefore, any projected census data should include adjustments to take such variations into account.
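A projection of this kind can be sketched with a compound annual growth formula applied separately to each locality. This is only an illustration: the source does not prescribe a projection method, and the growth rates, population counts, and function name below are hypothetical.

```python
def project_population(base_population, annual_growth_rate, years):
    """Project a census count forward assuming constant compound annual growth."""
    return base_population * (1 + annual_growth_rate) ** years

# Hypothetical region: the urban and rural parts grow at different rates.
census = {"urban": 100_000, "rural": 250_000}
growth = {"urban": 0.04, "rural": 0.02}
years_since_census = 5

projected = {
    locality: project_population(count, growth[locality], years_since_census)
    for locality, count in census.items()
}
for locality, value in projected.items():
    print(locality, round(value))
```

Applying one national growth rate to both localities would understate the urban share here, which is the variation the paragraph above asks NIs to account for.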
Indeed, we urge NIs to establish collegial working relationships with professionals in the national census bureau, not only to obtain the most recent census data, projections, and maps, but to gain access to sampling expertise. NIs may even commission a census statistician to draw the sample to Afrobarometer specifications, provided that provision for this service has been made in the survey budget.
Regardless of who draws the sample, the NIs should thoroughly acquaint themselves with the strengths and weaknesses of the available census data and the availability and quality of EA maps. The country and methodology reports should cite the exact census data used, its known shortcomings, if any, and any projections made from the data. At minimum, the NI must know the size of the population and the urban/rural population divide in each region in order to specify how to distribute population and PSU's in the first stage of sampling. National investigators should obtain this written data before they attempt to stratify the sample.
Once this data is obtained, the sample population (either 1200 or 2400) should be stratified, first by area (region/province) and then by residential locality (urban or rural). In each case, the proportion of the sample in each locality in each region should be the same as its proportion in the national population as indicated by the updated census figures.
Having stratified the sample, it is then possible to determine how many PSU's should be selected for the country as a whole, for each region, and for each urban or rural locality.
The total number of PSU's to be selected for the whole country is determined by calculating the maximum degree of clustering of interviews one can accept in any PSU. Because PSUs (which are usually geographically small EAs) tend to be socially homogenous we do not want to select too many people in any one place. Thus, the Afrobarometer has established a standard of no more than 8 interviews per PSU. For a sample size of 1200, the sample must therefore contain 150 PSUs/EAs (1200 divided by 8). For a sample size of 2400, there must be 300 PSUs/EAs.
These PSUs should then be allocated proportionally to the urban and rural localities within each regional stratum of the sample. Let's take a couple of examples from a country with a sample size of 1200. If the urban locality of Region X in this country constitutes 10 percent of the current national population, then the sample for this stratum should be 15 PSUs (calculated as 10 percent of 150 PSUs). If the rural population of Region Y constitutes 4 percent of the current national population, then the sample for this stratum should be 6 PSU's.
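The arithmetic in the last two paragraphs can be written out as a short sketch. The function names and stratum labels are hypothetical; rounding each stratum independently is an assumption, and in practice the rounded counts may need adjusting so that they add up to the national total.

```python
def psu_count(sample_size, interviews_per_psu=8):
    """Number of PSUs needed when at most 8 interviews are clustered in each PSU."""
    return sample_size // interviews_per_psu

def allocate_psus(total_psus, population_shares):
    """Allocate PSUs to strata in proportion to their share of the national population."""
    return {
        stratum: round(total_psus * share)
        for stratum, share in population_shares.items()
    }

total = psu_count(1200)
print(total)  # 150
print(allocate_psus(total, {"Region X urban": 0.10, "Region Y rural": 0.04}))
# {'Region X urban': 15, 'Region Y rural': 6}
```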
The next step is to select particular PSUs/EAs using random methods. Using the above example of the rural localities in Region Y, let us say that you need to pick 6 sample EAs out of a census list that contains a total of 240 rural EAs in Region Y. But which 6? If the EAs created by the national census bureau are of equal or roughly equal population size, then selection is relatively straightforward. Just number all EAs consecutively, then make six selections using a table of random numbers. This procedure, known as simple random sampling (SRS), will
https://creativecommons.org/publicdomain/zero/1.0/
Data
Data comes from the Organisation for Economic Cooperation and Development at https://data.oecd.org/healthres/pharmaceutical-spending.htm
It consists of useful information about the percent of health spending, percent of GDP and US dollars per capita for specific countries. Also, we added total spending by countries using their population data.
Population data comes from DataHub http://datahub.io/core/population since it is regularly updated and includes all country codes.
Preparation
Several steps were taken to produce the final data:
• We extracted each resource separately by “percent of health spending”, “percent of GDP” and “US dollars per capita”.
• We merged them into one resource and added a new column, “TOTAL_SPEND”.
• “TOTAL_SPEND” is calculated using “US dollars per capita” and “population” data.
Source for original pharmacy drug spending: https://stats.oecd.org/sdmx-json/data/DP_LIVE/.PHARMAEXP.../OECD?contentType=csv&detail=code&separator=comma&csv-lang=en.
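The merge and the derived column can be sketched as follows. This is an illustration under assumptions, not the actual preparation script: the column names other than TOTAL_SPEND, the sample values, and the country code "AAA" are hypothetical, and TOTAL_SPEND is taken here as the plain product of US dollars per capita and population, whereas the published column may use a different unit or scaling.

```python
def merge_resources(pc_health, pc_gdp, usd_cap, population):
    """Merge three per-indicator tables keyed by (country, year) and add TOTAL_SPEND."""
    rows = []
    for key in sorted(usd_cap):
        country, year = key
        people = population.get(key)
        rows.append({
            "LOCATION": country,
            "TIME": year,
            "PC_HEALTHXP": pc_health.get(key),  # percent of health spending
            "PC_GDP": pc_gdp.get(key),          # percent of GDP
            "USD_CAP": usd_cap[key],            # US dollars per capita
            # Assumed formula: per-capita spend times population (left empty if unknown).
            "TOTAL_SPEND": usd_cap[key] * people if people is not None else None,
        })
    return rows

# Made-up values for one hypothetical country-year.
key = ("AAA", 2015)
rows = merge_resources({key: 15.0}, {key: 1.2}, {key: 400.0}, {key: 2_000_000})
print(rows[0]["TOTAL_SPEND"])  # 800000000.0
```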
Gallup Worldwide Research continually surveys residents in more than 150 countries, representing more than 98% of the world's adult population, using randomly selected, nationally representative samples. Gallup typically surveys 1,000 individuals in each country, using a standard set of core questions that has been translated into the major languages of the respective country. In some regions, supplemental questions are asked in addition to core questions. Face-to-face interviews are approximately 1 hour, while telephone interviews are about 30 minutes. In many countries, the survey is conducted once per year, and fieldwork is generally completed in two to four weeks. The Country Dataset Details spreadsheet displays each country's sample size, month/year of the data collection, mode of interviewing, languages employed, design effect, margin of error, and details about sample coverage.
Gallup is entirely responsible for the management, design, and control of Gallup Worldwide Research. For the past 70 years, Gallup has been committed to the principle that accurately collecting and disseminating the opinions and aspirations of people around the globe is vital to understanding our world. Gallup's mission is to provide information in an objective, reliable, and scientifically grounded manner. Gallup is not associated with any political orientation, party, or advocacy group and does not accept partisan entities as clients. Any individual, institution, or governmental agency may access the Gallup Worldwide Research regardless of nationality. The identities of clients and all surveyed respondents will remain confidential.
Sample survey data [ssd]
SAMPLING AND DATA COLLECTION METHODOLOGY
With some exceptions, all samples are probability based and nationally representative of the resident population aged 15 and older. The coverage area is the entire country including rural areas, and the sampling frame represents the entire civilian, non-institutionalized, aged 15 and older population of the entire country. Exceptions include areas where the safety of interviewing staff is threatened, scarcely populated islands in some countries, and areas that interviewers can reach only by foot, animal, or small boat.
Telephone surveys are used in countries where telephone coverage represents at least 80% of the population or is the customary survey methodology (see the Country Dataset Details for detailed information for each country). In Central and Eastern Europe, as well as in the developing world, including much of Latin America, the former Soviet Union countries, nearly all of Asia, the Middle East, and Africa, an area frame design is used for face-to-face interviewing.
The typical Gallup Worldwide Research survey includes at least 1,000 surveys of individuals. In some countries, oversamples are collected in major cities or areas of special interest. Additionally, in some large countries, such as China and Russia, sample sizes of at least 2,000 are collected. Although rare, in some instances the sample size is between 500 and 1,000. See the Country Dataset Details for detailed information for each country.
FACE-TO-FACE SURVEY DESIGN
FIRST STAGE
In countries where face-to-face surveys are conducted, the first stage of sampling is the identification of 100 to 135 ultimate clusters (Sampling Units), consisting of clusters of households. Sampling units are stratified by population size and/or geography, and clustering is achieved through one or more stages of sampling. Where population information is available, sample selection is based on probabilities proportional to population size; otherwise simple random sampling is used. Samples are drawn independent of any samples drawn for surveys conducted in previous years.
There are two methods for sample stratification:
METHOD 1: The sample is stratified into 100 to 125 ultimate clusters drawn proportional to the national population, using the following strata:
1) Areas with population of at least 1 million
2) Areas 500,000-999,999
3) Areas 100,000-499,999
4) Areas 50,000-99,999
5) Areas 10,000-49,999
6) Areas with less than 10,000
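The six population-size strata of Method 1 amount to a simple lookup. A minimal sketch, assuming a hypothetical function name and that an area's stratum depends only on its population count:

```python
def population_stratum(population):
    """Return the Method 1 stratum number (1 to 6) for an area's population."""
    lower_bounds = [
        (1_000_000, 1),
        (500_000, 2),
        (100_000, 3),
        (50_000, 4),
        (10_000, 5),
    ]
    for lower, stratum in lower_bounds:
        if population >= lower:
            return stratum
    return 6  # areas with fewer than 10,000 people

for size in (2_500_000, 750_000, 75_000, 4_000):
    print(size, population_stratum(size))
```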
The design could include additional strata to reflect populations that exceed 1 million as well as areas with populations less than 10,000.
Worldwide Research Methodology and Codebook Copyright © 2008-2012 Gallup, Inc. All rights reserved.
METHOD 2:
A multi-stage design is used. The country is first stratified by large geographic units, and then by smaller units within geography. A minimum of 33 Primary Sampling Units (PSUs), which are first stage sampling units, are selected. The sample design results in 100 to 125 ultimate clusters.
SECOND STAGE
Random route procedures are used to select sampled households. Unless an outright refusal occurs, interviewers make up to three attempts to survey the sampled household. To increase the probability of contact and completion, attempts are made at different times of the day, and where possible, on different days. If an interviewer cannot obtain an interview at the initial sampled household, he or she uses a simple substitution method. Refer to Appendix C for a more in-depth description of random route procedures.
THIRD STAGE
Respondents are randomly selected within the selected households. Interviewers list all eligible household members and their ages or birthdays. The respondent is selected by means of the Kish grid (refer to Appendix C) in countries where face-to-face interviewing is used. The interviewer does not inform the person who answers the door of the selection criteria until after the respondent has been identified. In a few Middle East and Asian countries where cultural restrictions dictate gender matching, respondents are randomly selected using the Kish grid from among all eligible adults of the matching gender.
TELEPHONE SURVEY DESIGN
In countries where telephone interviewing is employed, random-digit-dial (RDD) or a nationally representative list of phone numbers is used. In select countries where cell phone penetration is high, a dual sampling frame is used. Random respondent selection is achieved by using either the latest birthday or Kish grid method. At least three attempts are made to reach a person in each household, spread over different days and times of day. Appointments for callbacks that fall within the survey data collection period are made.
PANEL SURVEY DESIGN
Prior to 2009, United States data were collected using The Gallup Panel. The Gallup Panel is a probability-based, nationally representative panel, for which all members are recruited via random-digit-dial methodology and is only used in the United States. Participants who elect to join the panel are committing to the completion of two to three surveys per month, with the typical survey lasting 10 to 15 minutes. The Gallup Worldwide Research panel survey is conducted over the telephone and takes approximately 30 minutes. No incentives are given to panel participants.
QUESTION DESIGN
Many of the Worldwide Research questions are items that Gallup has used for years. When developing additional questions, Gallup employed its worldwide network of research and political scientists to better understand key issues with regard to question development and construction and data gathering. Hundreds of items were developed, tested, piloted, and finalized. The best questions were retained for the core questionnaire and organized into indexes. Most items have a simple dichotomous ("yes or no") response set to minimize contamination of data because of cultural differences in response styles and to facilitate cross-cultural comparisons.
The Gallup Worldwide Research measures key indicators such as Law and Order, Food and Shelter, Job Creation, Migration, Financial Wellbeing, Personal Health, Civic Engagement, and Evaluative Wellbeing and demonstrates their correlations with world development indicators such as GDP and Brain Gain. These indicators assist leaders in understanding the broad context of national interests and establishing organization-specific correlations between leading indexes and lagging economic outcomes.
Gallup organizes its core group of indicators into the Gallup World Path. The Path is an organizational conceptualization of the seven indexes and is not to be construed as a causal model. The individual indexes have many properties of a strong theoretical framework. A more in-depth description of the questions and Gallup indexes is included in the indexes section of this document. In addition to World Path indexes, Gallup Worldwide Research questions also measure opinions about national institutions, corruption, youth development, community basics, diversity, optimism, communications, religiosity, and numerous other topics. For many regions of the world, additional questions that are specific to that region or country are included in surveys. Region-specific questions have been developed for predominantly Muslim nations, former Soviet Union countries, the Balkans, sub-Saharan Africa, Latin America, China and India, South Asia, and Israel and the Palestinian Territories.
The questionnaire is translated into the major conversational languages of each country. The translation process starts with an English, French, or Spanish version, depending on the region. One of two translation methods may be used.
METHOD 1: Two independent translations are completed. An independent third party, with some knowledge of survey research methods, adjudicates the differences. A professional translator translates the final version back into the source language.
METHOD 2: A translator
FAOSTAT provides access to over 3 million time-series and cross-sectional records relating to food and agriculture. The full FAO data can be found in the large zipfile, while a (somewhat out of date) summary of FAOSTAT is in the top level csv files. FAOSTAT contains data for 200 countries and more than 200 primary products and inputs in its core data set. The national version of FAOSTAT, CountrySTAT, is being implemented in about 20 countries and three regions. It offers a two-way bridge amongst sub-national, national, regional and international statistics on food and agriculture.
This dataset was kindly published by the United Nations on the UNData site. You can find the original dataset here.
Per the UNData terms of use: all data and metadata provided on UNdata’s website are available free of charge and may be copied freely, duplicated and further distributed provided that UNdata is cited as the reference.
The Afrobarometer is a comparative series of public attitude surveys that assess African citizens' attitudes to democracy and governance, markets, and civil society, among other topics.
The 12-country dataset is a combined dataset for the 12 African countries surveyed during Round 1 of the survey, conducted between 1999 and 2000 (Botswana, Ghana, Lesotho, Mali, Malawi, Namibia, Nigeria, South Africa, Tanzania, Uganda, Zambia and Zimbabwe), plus data from the old Southern African Democracy Barometer, and similar surveys done in West and East Africa.
The Round 1 Afrobarometer surveys have national coverage for the following countries: Botswana, Ghana, Lesotho, Malawi, Mali, Namibia, Nigeria, South Africa, Tanzania, Uganda, Zambia, Zimbabwe.
Individuals
The sample universe for Afrobarometer surveys includes all citizens of voting age within the country. In other words, we exclude anyone who is not a citizen and anyone who has not attained this age (usually 18 years) on the day of the survey. Also excluded are areas determined to be either inaccessible or not relevant to the study, such as those experiencing armed conflict or natural disasters, as well as national parks and game reserves. As a matter of practice, we have also excluded people living in institutionalized settings, such as students in dormitories and persons in prisons or nursing homes.
What to do about areas experiencing political unrest? On the one hand we want to include them because they are politically important. On the other hand, we want to avoid stretching out the fieldwork over many months while we wait for the situation to settle down. It was agreed at the 2002 Cape Town Planning Workshop that it is difficult to come up with a general rule that will fit all imaginable circumstances. We will therefore make judgments on a case-by-case basis on whether or not to proceed with fieldwork or to exclude or substitute areas of conflict. National Partners are requested to consult Core Partners on any major delays, exclusions or substitutions of this sort.
Sample survey data [ssd]
Afrobarometer uses national probability samples designed to meet the following criteria. Samples are designed to generate a sample that is a representative cross-section of all citizens of voting age in a given country. The goal is to give every adult citizen an equal and known chance of being selected for an interview. They achieve this by:
• using random selection methods at every stage of sampling;
• sampling at all stages with probability proportionate to population size wherever possible to ensure that larger (i.e., more populated) geographic units have a proportionally greater probability of being chosen into the sample.
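As a rough illustration of sampling with probability proportionate to population size, the sketch below uses systematic selection along cumulative population totals. This is one common way to implement the idea and is an assumption here, not necessarily the procedure Afrobarometer follows; the function name, unit names, and population figures are hypothetical.

```python
import random

def pps_systematic(units, n, rng):
    """Pick n units with probability proportionate to size.

    units: list of (name, population) pairs (hypothetical data).
    Uses systematic selection: a random start, then equal steps
    along the cumulative population total.
    """
    total = sum(pop for _, pop in units)
    step = total / n
    start = rng.random() * step          # random start in [0, step)
    targets = iter(start + i * step for i in range(n))

    selected = []
    cumulative = 0
    target = next(targets, None)
    for name, pop in units:
        cumulative += pop
        # A unit is picked once for every target that falls inside its range,
        # so a unit larger than the step can be selected more than once.
        while target is not None and target < cumulative:
            selected.append(name)
            target = next(targets, None)
    return selected

# Hypothetical geographic units and populations.
example_units = [("A", 5000), ("B", 1000), ("C", 3000), ("D", 1000)]
print(pps_systematic(example_units, 2, random.Random(0)))
```

In this toy data, unit A holds half of the total population, so with two draws it is always selected, while B, C, and D compete for the second slot in proportion to their sizes.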
The sampling universe normally includes all citizens age 18 and older. As a standard practice, we exclude people living in institutionalized settings, such as students in dormitories, patients in hospitals, and persons in prisons or nursing homes. Occasionally, we must also exclude people living in areas determined to be inaccessible due to conflict or insecurity. Any such exclusion is noted in the technical information report (TIR) that accompanies each data set.
Sample size and design
Samples usually include either 1,200 or 2,400 cases. A randomly selected sample of n=1200 cases allows inferences to national adult populations with a margin of sampling error of no more than +/-2.8% with a confidence level of 95 percent. With a sample size of n=2400, the margin of error decreases to +/-2.0% at 95 percent confidence level.
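The quoted margins can be reproduced with the standard formula for a proportion under simple random sampling, z * sqrt(p(1-p)/n), evaluated at the worst case p = 0.5 with z = 1.96 for 95 percent confidence. This is a minimal sketch under that assumption; it ignores any design effect from clustering, and the function name is illustrative.

```python
import math

def margin_of_error(n, p=0.5, z=1.96):
    """Margin of sampling error for a proportion, assuming simple random sampling."""
    return z * math.sqrt(p * (1 - p) / n)

for n in (1200, 2400):
    # Express as a percentage rounded to one decimal place.
    print(n, round(100 * margin_of_error(n), 1))
```

Doubling the sample size shrinks the margin by a factor of sqrt(2), not 2, which is why 2,400 cases give roughly 2.0% rather than 1.4%.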
The sample design is a clustered, stratified, multi-stage, area probability sample. Specifically, we first stratify the sample according to the main sub-national unit of government (state, province, region, etc.) and by urban or rural location.
Area stratification reduces the likelihood that distinctive ethnic or language groups are left out of the sample. Afrobarometer occasionally purposely oversamples certain populations that are politically significant within a country to ensure that the size of the sub-sample is large enough to be analysed. Any oversample is noted in the TIR.
Sample stages
Samples are drawn in either four or five stages:
Stage 1: In rural areas only, the first stage is to draw secondary sampling units (SSUs). SSUs are not used in urban areas, and in some countries they are not used in rural areas. See the TIR that accompanies each data set for specific details on the sample in any given country.
Stage 2: We randomly select primary sampling units (PSU).
Stage 3: We then randomly select sampling start points.
Stage 4: Interviewers then randomly select households.
Stage 5: Within the household, the interviewer randomly selects an individual respondent. Each interviewer alternates in each household between interviewing a man and interviewing a woman to ensure gender balance in the sample.
To keep the costs and logistics of fieldwork within manageable limits, eight interviews are clustered within each selected PSU.
Data weights
For some national surveys, data are weighted to correct for over- or under-sampling or for household size. "Withinwt" should be turned on for all national-level descriptive statistics in countries that contain this weighting variable. It is included as the last variable in the data set, with details described in the codebook. For merged data sets, "Combinwt" should be turned on for cross-national comparisons of descriptive statistics. Note: this weighting variable standardizes each national sample as if it were equal in size.
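To show what turning on a weight means for a descriptive statistic, here is a minimal sketch of a weighted proportion. The responses and weight values are hypothetical, and the list named withinwt only mimics the "Withinwt" variable; real values come from the data set and its codebook.

```python
def weighted_mean(values, weights):
    """Weighted mean: each value counts in proportion to its weight."""
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("weights sum to zero")
    return sum(v * w for v, w in zip(values, weights)) / total_weight

# Hypothetical respondents: 1 = agrees with a statement, 0 = does not.
responses = [1, 0, 1, 1]
# Hypothetical within-country weights (stand-in for "Withinwt").
withinwt = [0.5, 1.5, 1.0, 1.0]

unweighted = sum(responses) / len(responses)
weighted = weighted_mean(responses, withinwt)
print(unweighted, weighted)
```

Here the respondent who disagrees carries a larger weight, so the weighted share of agreement is lower than the unweighted share.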
Further information on sampling protocols, including full details of the methodologies used for each stage of sample selection, can be found at https://afrobarometer.org/surveys-and-methods/sampling-principles
Face-to-face [f2f]
Because Afrobarometer Round 1 emerged out of several different survey research efforts, survey instruments were not standardized across all countries. A number of features of the questionnaires should therefore be noted:
• In most cases, the data set only includes those questions/variables that were asked in nine or more countries. Complete Round 1 data sets for each individual country have already been released, and are available from ICPSR or from the Afrobarometer website at www.afrobarometer.org.
• In the seven countries that originally formed the Southern Africa Barometer (SAB) - Botswana, Lesotho, Malawi, Namibia, South Africa, Zambia and Zimbabwe - a standardized questionnaire was used, so question wording and response categories are generally the same for all of these countries. The questionnaires in Mali and Tanzania were also essentially identical (in the original English version). Ghana, Uganda and Nigeria each had distinct questionnaires.
• This merged dataset combines, into a single variable, responses from across these different countries where either identical or very similar questions were used, or where conceptually equivalent questions can be found in at least nine of the different countries. For each variable, the exact question text from each of the countries or groups of countries ("SAB" refers to the Southern Africa Barometer countries) is listed.
• Response options also varied on some questions, and where applicable, these differences are also noted.
Gallup’s World Poll continually surveys residents in more than 150 countries and areas, representing more than 98% of the world’s adult population, using randomly selected, nationally representative samples. Gallup typically surveys 1,000 individuals in each country or area, using a standard set of core questions that has been translated into the major languages of the respective country. In some regions, supplemental questions are asked in addition to core questions. Face-to-face interviews are approximately 1 hour, while telephone interviews are about 30 minutes. In many countries, the survey is conducted once per year, and fieldwork is generally completed in two to four weeks. The Country Dataset Details document displays each country’s sample size, month/year of the data collection, mode of interviewing, languages employed, design effect, margin of error and details about sample coverage.
The data was last updated March 2025.
Data set is for private consumption for the competition.
According to IBEF, “Domestic automobiles production increased at 2.36% CAGR between FY16-20 with 26.36 million vehicles being manufactured in the country in FY20. Overall, domestic automobiles sales increased at 1.29% CAGR between FY16-FY20 with 21.55 million vehicles being sold in FY20”. The rise in vehicles on the road will also lead to multiple challenges, and the roads will be more vulnerable to accidents. Increased accident rates also lead to more insurance claims, and payouts rise for insurance companies.
In order to pre-emptively plan for the losses, insurance firms leverage accident data to understand the risk across geographical units, e.g. postal code, district, etc.
In this challenge, we are providing you the dataset to predict the “Accident_Risk_Index” against the postcodes.
Accident_Risk_Index (mean casualties at a postcode) = sum(Number_of_casualities) / count(Accident_ID)
Working example:
Train Data (given)
Accident_ID  Postcode  Number_of_casualities
1            AL1 1JJ   2
2            AL1 1JP   3
3            AL1 3PS   2
4            AL1 3PS   1
5            AL1 3PS   1
Modelling Train Data (Rolled up at Postcode level)
Postcode  Derived_feature1  Derived_feature2  Accident_risk_Index
AL1 1JJ   _                 _                 2
AL1 1JP   _                 _                 3
AL1 3PS   _                 _                 1.33
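The roll-up in the working example can be sketched in a few lines of plain Python. This is only an illustration of the stated formula (sum of casualties divided by number of accidents per postcode) using the five example rows; the variable and function names are assumptions, and a real solution would read train.csv and add derived features.

```python
from collections import defaultdict

# The five example rows: (Accident_ID, Postcode, number of casualties).
train_rows = [
    (1, "AL1 1JJ", 2),
    (2, "AL1 1JP", 3),
    (3, "AL1 3PS", 2),
    (4, "AL1 3PS", 1),
    (5, "AL1 3PS", 1),
]

def accident_risk_index(rows):
    """Mean casualties per accident for each postcode."""
    casualties = defaultdict(int)
    accidents = defaultdict(int)
    for _accident_id, postcode, n_casualties in rows:
        casualties[postcode] += n_casualties
        accidents[postcode] += 1
    return {pc: casualties[pc] / accidents[pc] for pc in casualties}

risk = accident_risk_index(train_rows)
for postcode, value in risk.items():
    print(postcode, round(value, 2))
```

For AL1 3PS the three accidents have 2 + 1 + 1 = 4 casualties, giving 4 / 3, which rounds to the 1.33 shown in the rolled-up table.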
Participants are required to predict the 'Accident_risk_index' for each postcode in test.csv.
Then submit your 'my_submission_file.csv' on the submission tab of the hackathon page.
Pro-tip: Participants need to perform feature engineering to first roll up the train data at postcode level, create a column called “accident_risk_index”, and optimize the model at postcode level.
A few hypotheses to help you think: "More accidents happen in the later part of the day as those are office hours causing congestion"
"Postal codes with more single carriage roads have more accidents"
(***From the above hypotheses, features such as office_hours_flag and #single_carriage roads can be formed)
Additionally, we are providing you with road network data (contains info on the nearest road to a postcode and its characteristics) and population data (contains info about population at area level). This information is for feature augmentation and is not mandatory to use.
The provided dataset contains the following files:
train.csv & test.csv:
'Accident_ID', 'Police_Force', 'Number_of_Vehicles', 'Number_of_Casualties', 'Date', 'Day_of_Week', 'Time', 'Local_Authority_(District)', 'Local_Authority_(Highway)', '1st_Road_Class', '1st_Road_Number', 'Road_Type', 'Speed_limit', '2nd_Road_Class', '2nd_Road_Number', 'Pedestrian_Crossing-Human_Control', 'Pedestrian_Crossing-Physical_Facilities', 'Light_Conditions', 'Weather_Conditions', 'Road_Surface_Conditions', 'Special_Conditions_at_Site', 'Carriageway_Hazards', 'Urban_or_Rural_Area', 'Did_Police_Officer_Attend_Scene_of_Accident', 'state', 'postcode', 'country'
population.csv:
'postcode', 'Rural Urban', 'Variable: All usual residents; measures: Value', 'Variable: Males; measures: Value', 'Variable: Females; measures: Value', 'Variable: Lives in a household; measures: Value', 'Variable: Lives in a communal establishment; measures: Value', 'Variable: Schoolchild or full-time student aged 4 and over at their non term-time address; measures: Value', 'Variable: Area (Hectares); measures: Value', 'Variable: Density (number of persons per hectare); measures: Value'
roads_network.csv:
'WKT', 'roadClassi', 'roadFuncti', 'formOfWay', 'length', 'primaryRou', 'distance to the nearest point on rd', 'postcode'
Overview
Swiss Re is one of the largest reinsurers in the world, headquartered in Zurich with offices in over 25 countries. Swiss Re’s core expertise is in underwriting in life, health, as well as the property and casualty insurance space, whereas its tech strategy focuses on developing smarter and innovative solutions for clients’ value chains by leveraging data and technology.
The company’s vision is to make the world more resilient. Swiss Re believes in applying fresh perspectives, knowledge and capital to anticipate and manage risk to create smarter solutions and help the world rebuild, renew and move forward. About 1300 professionals who work in the Swiss Re Global Business Solutions Center (BSC), Bangalore combine experience, expertise and out-of-the-box thinking to bring Swiss Re's core business to life by creating new business opportunities.
ODC Public Domain Dedication and Licence (PDDL) v1.0http://www.opendatacommons.org/licenses/pddl/1.0/
License information was derived automatically
ISO 3166-1-alpha-2 English country names and code elements. This list states the country names (official short names in English) in alphabetical order as given in ISO 3166-1 and the corresponding ISO 3166-1-alpha-2 code elements.
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
SABA Core dataset
This Syria core dataset comprises 14 quantitative indicators based on publicly available information from humanitarian organisations. It is updated on a monthly basis, and it covers the whole country.
This dataset brings together data from a range of sources to provide a greater overall and comparative understanding of the current situation and context inside each district. The core dataset indicators cover a range of categories including agriculture, commodity prices (food and fuel), conflict, displacement, exchange rate, food security, humanitarian access severity, health, people in need per sector, and rainfall.
When analysing and interpreting the data, please be aware that while we aim to include district-level data that is updated monthly, some indicators are updated on only a quarterly or annual basis and some data is only available on admin 1 level. Please ensure you check the details in the ‘indicator list’ tab and the references for each indicator before conducting analysis.
The Syria Area Based Analysis (SABA) team recommends that this dataset is used only as a starting point. It will enable you to quickly examine and compare cross-sectoral, quantitative data at the district level in Syria. However, for operational decision making, we recommend you consult ACAPS analysis products on Syria available here: https://www.acaps.org/en/countries/syria
ACAPS conducts random quality checking of a sample of entries to try to limit errors. However, it is likely that errors remain. For sensitive analysis, we recommend you cross check findings with the source data in the list of indicators and at the top row of each column.
Do you have ideas to make this data set more useful? Do you see mistakes or disagree with our opinions or assumptions? Contact us at info@acaps.org. Your feedback helps us do better.
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The KHOJ (Know Your High Court Judges) dataset includes data on more than 1700 judges appointed between 1993 (after the creation of the collegium) and 2021. The dataset captures information across 43 variables including the personal, educational and professional backgrounds of India’s High Court judges. It opens pathways for researchers who are looking to probe deeper or wider into the composition of the High Courts and those who want to undertake jurimetrics studies which explore the linkage between judicial behaviour and the background of judges.
The core philosophy behind building such a dataset is the realization that people of the country should have more information about judges whose decisions have a real impact on such people's lives.
This dataset is the result of a joint effort over 15 months involving more than 30 students and 10 professionals who volunteered their time and efforts in preparing this dataset. This was a collaboration between NLUO’s Centre for Public Policy, Law and Good Governance, Agami and CivicDataLab. It started with the Summer of Data 2021 programme where students from across the country became the original data creators using official and publicly accessible data sources.
Biocapacity by Countries - 2017
"The capacity of ecosystems to regenerate what people demand from those surfaces. Life, including human life, competes for space. The biocapacity of a surface represents its ability to renew what people demand. Biocapacity is therefore the ecosystems' capacity to produce biological materials used by people and to absorb waste material generated by humans, under current management schemes and extraction technologies. Biocapacity can change from year to year due to climate, management, and proportion considered useful inputs to the human economy. In the National Footprint and Biocapacity Accounts, biocapacity is calculated by multiplying the physical area by the yield factor and the appropriate equivalence factor. Biocapacity is expressed in global hectares." https://data.footprintnetwork.org/?_ga=2.79482327.2142738977.1627570256-507256724.1627570256#/abouttheData
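The calculation described in the quoted definition (physical area multiplied by the yield factor and the equivalence factor) can be written out as a short sketch. The function name and the input values are hypothetical and chosen only to make the arithmetic easy to follow; actual factors come from the National Footprint and Biocapacity Accounts.

```python
def biocapacity_gha(area_ha, yield_factor, equivalence_factor):
    """Biocapacity in global hectares: area x yield factor x equivalence factor."""
    return area_ha * yield_factor * equivalence_factor

# Hypothetical inputs: 1,000 ha of a land type that is 1.2 times as productive
# as the world average for that type, with an equivalence factor of 2.5.
example_gha = biocapacity_gha(1000, 1.2, 2.5)
print(example_gha)
```

With these made-up factors, 1,000 physical hectares count as about 3,000 global hectares, which illustrates why more productive land represents more biocapacity per unit of area.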
Biocapacity (gha)
"Global hectares are the accounting unit for the Ecological Footprint and biocapacity accounts. These productivity weighted biologically productive hectares allow researchers to report both the biocapacity of the earth or a region and the demand on biocapacity (the Ecological Footprint). A global hectare is a biologically productive hectare with world average biological productivity for a given year. Global hectares are useful because different land types have different productivities. A global hectare of cropland, for example, would occupy a smaller physical area than the much less biologically productive pasture land, as more pasture would be needed to provide the same biocapacity as one hectare of cropland. Because world productivity varies slightly from year to year, the value of a global hectare may change slightly from year to year." https://data.footprintnetwork.org/?_ga=2.79482327.2142738977.1627570256-507256724.1627570256#/abouttheData
https://data.footprintnetwork.org/?_ga=2.79482327.2142738977.1627570256-507256724.1627570256#/compareCountries?type=BCtot&cn=all&yr=2017 https://api.footprintnetwork.org/v1/data/all/2017/BCtot
"Ecological resources are at the core of every country’s long-term wealth. Yet population growth and consumption patterns are putting more pressure on these critical assets. Generate bar graphs to compare the ecological assets and consumption patterns of countries by year."
Brazilian biocapacity and biodiversity.
Open Government Licence - Canada 2.0https://open.canada.ca/en/open-government-licence-canada
License information was derived automatically
Number of persons by core housing need, tenure, First Nations people living off reserve, Métis and Inuit, and gender, Canada, provinces and territories.
The Afrobarometer is a comparative series of public attitude surveys that assess African citizens' attitudes to democracy and governance, markets, and civil society, among other topics. The surveys have been undertaken at periodic intervals since 1999. The Afrobarometer's coverage has increased over time. Round 1 (1999-2001) initially covered 7 countries and was later extended to 12 countries. Round 2 (2002-2004) surveyed citizens in 16 countries. Round 3 (2005-2006) covered 18 countries. The survey covered 20 countries in Round 4 (2008-2009).
The Round 4 Afrobarometer surveys have national coverage for the following countries: Benin, Botswana, Burkina Faso, Ghana, Kenya, Lesotho, Liberia, Madagascar, Malawi, Mali, Mozambique, Namibia, Nigeria, Republic of Cabo Verde, Senegal, South Africa, Tanzania, Uganda, Zambia, Zimbabwe.
Individuals
The sample universe for Afrobarometer surveys includes all citizens of voting age within the country. In other words, we exclude anyone who is not a citizen and anyone who has not attained this age (usually 18 years) on the day of the survey. Also excluded are areas determined to be either inaccessible or not relevant to the study, such as those experiencing armed conflict or natural disasters, as well as national parks and game reserves. As a matter of practice, we have also excluded people living in institutionalized settings, such as students in dormitories and persons in prisons or nursing homes.
What to do about areas experiencing political unrest? On the one hand we want to include them because they are politically important. On the other hand, we want to avoid stretching out the fieldwork over many months while we wait for the situation to settle down. It was agreed at the 2002 Cape Town Planning Workshop that it is difficult to come up with a general rule that will fit all imaginable circumstances. We will therefore make judgments on a case-by-case basis on whether or not to proceed with fieldwork or to exclude or substitute areas of conflict. National Partners are requested to consult Core Partners on any major delays, exclusions or substitutions of this sort.
Sample survey data [ssd]
Afrobarometer uses national probability samples designed to meet the following criteria. Samples are designed to generate a sample that is a representative cross-section of all citizens of voting age in a given country. The goal is to give every adult citizen an equal and known chance of being selected for an interview. They achieve this by:
• using random selection methods at every stage of sampling;
• sampling at all stages with probability proportionate to population size wherever possible to ensure that larger (i.e., more populated) geographic units have a proportionally greater probability of being chosen into the sample.
The sampling universe normally includes all citizens age 18 and older. As a standard practice, we exclude people living in institutionalized settings, such as students in dormitories, patients in hospitals, and persons in prisons or nursing homes. Occasionally, we must also exclude people living in areas determined to be inaccessible due to conflict or insecurity. Any such exclusion is noted in the technical information report (TIR) that accompanies each data set.
Sample size and design
Samples usually include either 1,200 or 2,400 cases. A randomly selected sample of n=1200 cases allows inferences to national adult populations with a margin of sampling error of no more than +/-2.8% with a confidence level of 95 percent. With a sample size of n=2400, the margin of error decreases to +/-2.0% at 95 percent confidence level.
The sample design is a clustered, stratified, multi-stage, area probability sample. Specifically, we first stratify the sample according to the main sub-national unit of government (state, province, region, etc.) and by urban or rural location.
Area stratification reduces the likelihood that distinctive ethnic or language groups are left out of the sample. Afrobarometer occasionally purposely oversamples certain populations that are politically significant within a country to ensure that the size of the sub-sample is large enough to be analysed. Any oversample is noted in the TIR.
Sample stages
Samples are drawn in either four or five stages:
Stage 1: In rural areas only, the first stage is to draw secondary sampling units (SSUs). SSUs are not used in urban areas, and in some countries they are not used in rural areas. See the TIR that accompanies each data set for specific details on the sample in any given country.
Stage 2: We randomly select primary sampling units (PSU).
Stage 3: We then randomly select sampling start points.
Stage 4: Interviewers then randomly select households.
Stage 5: Within the household, the interviewer randomly selects an individual respondent. Each interviewer alternates in each household between interviewing a man and interviewing a woman to ensure gender balance in the sample.
To keep the costs and logistics of fieldwork within manageable limits, eight interviews are clustered within each selected PSU.
Data weights
For some national surveys, data are weighted to correct for over- or under-sampling or for household size. "Withinwt" should be turned on for all national-level descriptive statistics in countries that contain this weighting variable. It is included as the last variable in the data set, with details described in the codebook. For merged data sets, "Combinwt" should be turned on for cross-national comparisons of descriptive statistics. Note: this weighting variable standardizes each national sample as if it were equal in size.
Further information on sampling protocols, including full details of the methodologies used for each stage of sample selection, can be found at https://afrobarometer.org/surveys-and-methods/sampling-principles
Face-to-face [f2f]
Certain questions in the questionnaires for the Afrobarometer 4 survey addressed country-specific issues, but many of the same questions were asked across surveys. Citizens of the 20 countries were asked questions about their economic and social situations, and their opinions were elicited on recent political and economic changes within their country.
The Integrated Survey of Living Standards (ISLS), renamed in 2004 to the Integrated Living Conditions Survey (ILCS), is conducted annually by the National Statistical Service (NSS) of the Republic of Armenia and forms the basis for monitoring living conditions in Armenia. The ILCS is a universally recognized best-practice survey for collecting data on the living standards of households. It comprises comprehensive and valuable data on the welfare of households and individuals, which gives the NSS an opportunity to provide the public with up-to-date information on the population’s income, expenditures, level of poverty and other changes in living standards on an annual basis.
Urban and rural communities
Sample survey data [ssd]
During the 2001-2003 surveys, a two-stage random sample was used: the first stage covered the selection of settlements (cities and villages), while the second stage focused on the selection of households in these settlements. The surveys were conducted on the principle of monthly rotation of households by clusters (sample units). In 2002 and 2003 the number of households was 387, with the sample covering 14 cities and 30 villages in 2002 and 17 cities and 20 villages in 2003.
During the 2004-2006 surveys, the sampling frame for the ILCS was built using the database of addresses for the 2001 Population Census; the database was developed with World Bank technical assistance. The database of addresses of all households in Armenia was divided into 48 strata, including 12 communities of Yerevan city. The households from other regions (marzes) were grouped into the following three categories: big towns with a population of 15,000 or more, villages, and other towns. Big towns formed 16 strata (the only exception was the Vayots Dzor marz, where there are no big towns). The villages and other towns formed 10 strata each. According to this division, a random, two-step sample stratified at marz level was developed. All marzes, as well as all urban and rural settlements, were included in the sample population according to the share of population residing in those settlements as a percentage of the total population in the country. In the first step, the settlements, i.e. primary sample units, were selected: 43 towns out of 48, or 90 percent of all towns in Armenia, were surveyed during the year; also 216 villages out of 951, or 23 percent of all villages in the country, were covered by the survey. In the second step, the respondent households were selected: 6,816 households (5,088 from urban and 1,728 from rural settlements). As a result, for the first time since 1996, survey data were representative at the marz level.
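The figures quoted for the 2004-2006 design can be cross-checked with simple arithmetic. The sketch below only recomputes numbers stated in the paragraph above (strata counts, coverage percentages, and household totals); the variable names are illustrative.

```python
# Strata: 12 Yerevan communities + 16 big-town + 10 village + 10 other-town strata.
strata_total = 12 + 16 + 10 + 10

# Share of settlements covered in the first sampling step, in percent.
town_coverage = round(100 * 43 / 48)      # towns surveyed out of all towns
village_coverage = round(100 * 216 / 951)  # villages surveyed out of all villages

# Households selected in the second step.
households_total = 5088 + 1728

print(strata_total, town_coverage, village_coverage, households_total)
```

The recomputed values match the stated 48 strata, roughly 90 and 23 percent coverage, and 6,816 households.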
During the 2007-2012 surveys the sampling frame for ILCS was designed according to the database of addresses for the 2001 Population Census, which was developed with the World Bank technical assistance. The sample consisted of two parts: core sample and oversample.
1) For the creation of the core sample, the sample frame (database of addresses of all households in Armenia) was divided into 48 strata, including 12 communities of Yerevan city. The households from other regions (marzes) were grouped into three categories: large towns (with a population of 15,000 or more), villages, and other towns. Large towns formed 16 strata, while the villages and other towns formed 10 strata each. According to that division, a random, two-step sample stratified at the marz level was developed. All marzes, as well as all urban and rural settlements, were included in the sample population according to the share of households residing in those settlements as a percentage of the total households in the country. In the first step, the enumeration units (i.e., primary sample units to be surveyed during the year) were selected using the PPS (probability proportional to size) method. The 2007 sample included 48 urban and 18 rural enumeration areas per month.
2) The oversample was drawn from the list of villages included in the MCA-Armenia Rural Roads Rehabilitation Project. The enumeration areas of villages that were already in the core sample were excluded from that list. From the remaining enumeration areas, 18 enumeration areas were selected per month. Thus, the rural sample size was doubled.
3) After merging the core sample and the oversample, the survey households were selected in the second step. 656 households were surveyed per month, of which 368 were from urban and 288 from rural settlements. Each month 82 interviewers conducted field work, with a workload of 8 households each per month. In 2007 the number of surveyed households was 7,872 (4,416 from urban and 3,456 from rural areas).
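To illustrate how a PPS first-stage selection can work, here is a minimal sketch of systematic PPS sampling. The documentation only says that "the PPS method" was used, so the systematic variant, the function name, and the enumeration-area sizes below are all assumptions made for illustration.

```python
import random


def pps_systematic(sizes, n, seed=0):
    """Select n units with probability proportional to size (systematic PPS).

    sizes: dict mapping unit id -> size measure (e.g. number of households).
    A unit whose size exceeds the sampling step can be selected more than once.
    """
    rng = random.Random(seed)
    total = sum(sizes.values())
    step = total / n
    # Random start within the first interval, then equally spaced targets.
    start = rng.uniform(0, step)
    targets = [start + k * step for k in range(n)]

    selected, cumulative, i = [], 0.0, 0
    for unit, size in sizes.items():
        cumulative += size
        # Pick this unit for every target falling inside its cumulative range.
        while i < n and targets[i] < cumulative:
            selected.append(unit)
            i += 1
    return selected


# Hypothetical enumeration areas with household counts as the size measure.
areas = {f"ea_{k}": 50 + 10 * k for k in range(30)}
picked = pps_systematic(areas, n=6)
```

Larger enumeration areas occupy a wider slice of the cumulative range, so they are more likely to contain one of the equally spaced targets.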
For the 2013 survey the sample frame for the ILCS was designed in accordance with the database of addresses of all private households in the country, developed on the basis of the 2001 Population Census results with the technical assistance of the World Bank. The method of systematic representative probability sampling was used to frame the sample. For the purpose of drawing the sample, the sample frame was divided into 32 strata, including 12 communities of Yerevan City (currently, the administrative districts). According to this division, a two-tier sample was drawn, stratified by regions and by Yerevan. All regions and Yerevan, as well as all urban and rural communities, were included in the sample in accordance with the shares of their resident households within the total number of households in the country. In the first round, enumeration areas - that is, primary sample units to be surveyed during the year - were selected. The ILCS 2013 sample included 32 enumeration areas in urban and 16 enumeration areas in rural communities per month. The households to be surveyed were selected in the second round. A total of 432 households were surveyed per month, of which 279 were from urban and 153 from rural communities. Every month 48 interviewers went on field work, with a workload of 9 households each per month.
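The monthly totals reported for 2007 and 2013 follow from the number of interviewers and their workload, and can be checked against the urban/rural split. The helper below is a hypothetical sketch; the annual figure it returns assumes the same design ran for all twelve months, which the text confirms for 2007 (7,872 households) but does not state for 2013.

```python
def check_monthly_design(interviewers, workload, urban, rural):
    """Return (monthly, annual) household totals for a survey design.

    Raises ValueError if interviewers * workload does not match urban + rural.
    The annual total assumes 12 identical survey months (an assumption).
    """
    monthly = interviewers * workload
    if monthly != urban + rural:
        raise ValueError("interviewer workload and urban/rural split disagree")
    return monthly, monthly * 12


# Figures quoted in the text for the 2007 and 2013 surveys.
design_2007 = check_monthly_design(interviewers=82, workload=8, urban=368, rural=288)
design_2013 = check_monthly_design(interviewers=48, workload=9, urban=279, rural=153)
```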
The sample frame for 2014-2016 was designed in accordance with the database of addresses of all private households in the country, developed on the basis of the 2011 Population Census results with the technical assistance of the World Bank. The method of systematic representative probability sampling was used to frame the sample.
For drawing the sample, the sample frame was divided into 32 strata, including 12 communities of Yerevan City (currently, the administrative districts). According to this division, a two-tier sample was drawn, stratified by regions and by Yerevan. All regions and Yerevan, as well as all urban and rural communities, were included in the sample in accordance with the shares of their resident households within the total number of households in the country. In the first round, enumeration areas - that is, primary sample units to be surveyed during the year - were selected. The ILCS 2014 sample included 30 enumeration areas in urban and 18 enumeration areas in rural communities per month.
The method of representative probability sampling was used to frame the sample. At the regional level, all communities were grouped into two categories - towns and villages. According to this division, a two-tier sample was drawn, stratified by regions and by Yerevan. All regions and Yerevan, as well as all rural and urban communities, were included in the sample in accordance with the shares of their resident households within the total number of households in the country. In the first round, enumeration districts - that is, primary sample units to be surveyed during the year - were selected. The ILCS 2015 sample included 30 enumeration districts in urban and 18 enumeration districts in rural communities per month.
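Including communities "in accordance with the shares of their resident households" amounts to a proportional allocation of sample units across strata. The sketch below shows one way to do this, using largest-remainder rounding so the allocations sum to the target. The rounding rule, the function name, and the stratum household counts are assumptions, not taken from the survey documentation.

```python
def proportional_allocation(households, n_units):
    """Allocate n_units across strata in proportion to household counts.

    households: dict mapping stratum -> number of resident households.
    Uses largest-remainder rounding so allocations sum exactly to n_units.
    """
    total = sum(households.values())
    quotas = {s: n_units * h / total for s, h in households.items()}
    # Start from the integer part of each quota.
    allocation = {s: int(q) for s, q in quotas.items()}
    leftover = n_units - sum(allocation.values())
    # Hand remaining units to the strata with the largest fractional parts.
    by_remainder = sorted(quotas, key=lambda s: quotas[s] - allocation[s],
                          reverse=True)
    for s in by_remainder[:leftover]:
        allocation[s] += 1
    return allocation


# Hypothetical household counts for three strata; 48 units per month.
frame = {"stratum_a": 500, "stratum_b": 300, "stratum_c": 200}
allocation = proportional_allocation(frame, n_units=48)
```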
Face-to-face [f2f]
The Questionnaire is filled in by the interviewer during at least five visits to the household per month. During face-to-face interviews with the household head or another knowledgeable adult member, the interviewer collects information on the composition and housing conditions of the household; the employment status, educational level and health condition of its members; the availability and use of land, livestock, and agricultural machinery; monetary and commodity flows between households; and other information.
The 2003 survey questionnaire had the following sections: (1) "List of Household Members", (2) "Housing Facilities", (3) "Migration", (4) "Education", (5) "Agriculture", (6) "Monetary and Commodity Flows between Households", (7) "Health (General) and Healthcare", (8) "Savings and Debts", (9) "Social Assistance"
The Diary is completed directly by the household for one month. Every day the household records all its expenditures on food, non-food products and services, giving a detailed description of each purchase; e.g. for food products, the name, quantity, cost, and place of purchase are recorded. In addition, the household records its consumption of food products received and used from its own land and livestock, as well as from other sources (e.g. gifts, humanitarian aid). Non-food products and services purchased or received for free are also recorded in the diary. The household then records the income it received during the month. At the end of the month, information on rarely used food products, durable goods and ceremonies is recorded as well. The records in the diary are verified by the interviewer in the course of 5 mandatory visits to the household during the survey month.
The Survey Diary has the following sections: (1) food purchased during the day, (2) food consumed at home