The Household Income and Expenditure Survey (HIES) collects a wealth of information on household (HH) income and expenditure, such as source of income by industry, HH expenditure on goods and services, and income and expenditure associated with subsistence production and consumption. In addition, the HIES collects information on sectoral and thematic areas, such as education, health, labour force, primary activities, transport, information and communication, transfers and remittances, food expenditure (as a proxy for HH food consumption and nutrition analysis), and gender.
The Pacific Islands regionally standardized HIES instruments and procedures were adopted by the Government of Tokelau for the 2015/16 Tokelau HIES. These standards were designed to feed high-quality data to HIES data end users for:
The data allow for the production of useful indicators and information on the sectors covered in the survey, including providing data to inform indicators under the UN Sustainable Development Goals (SDGs). This report, the above listed outputs, and any thematic analyses of HIES data, collectively provide information to assist with social and economic planning and policy formation.
National coverage.
Households and Individuals.
The universe of the 2015/16 Tokelau Household Income and Expenditure Survey (HIES) is all occupied households (HHs) in Tokelau. HHs are the sampling unit, defined as a group of people (related or not) who pool their money, cook and eat together. It is not the physical structure (dwelling) in which people live. The HH must have been living in Tokelau for a period of six months, or have had the intention to live in Tokelau for a period of twelve months in order to be included in the survey.
Household members covered in the survey include: -usual residents currently living in the HH; -usual residents who are temporarily away (e.g., for work or a holiday); -usual residents who are away for an extended period, but are financially dependent on, or supporting, the HH (e.g., students living in school dormitories outside Tokelau, or a provider working overseas who hasn't formed or joined another HH in the host country) and plan to return; -persons who frequently come and go from the HH, but consider the HH being interviewed as their main place of stay; -any person who lives with the HH and is employed (paid or in-kind) as a domestic worker and who shares accommodation and eats with the host HH; and -visitors currently living with the HH for a period of six months or more.
Sample survey data [ssd]
The 2015/16 Tokelau Household Income and Expenditure Survey (HIES) sampling approach was designed to generate reliable results at the national level. That is, the survey was not designed to produce reliable results at any lower level, such as for the three individual atolls. The reason for this is partly budgetary constraint, but also because the HIES will serve its primary objectives with a sample size that will provide reliable national aggregates.
The sampling frame used for the random selection of HHs was from December 2013, i.e. the HH listing updated in the 2013 Population Count.
The 2015/16 Tokelau HIES had a quota of 120 HHs. The sample covered all three populated atolls in Tokelau (Fakaofo, Nukunonu and Atafu) and the sample was evenly allocated between the three atoll clusters (i.e., 40 HHs per atoll surveyed over a ten-month period). The HHs within each cluster were randomly selected using a single-stage selection process.
In addition to the 120 selected HHs, 60 HHs (20 per cluster) were randomly selected as replacement HHs to ensure that the desired sample was met. The replacement HHs were only approached for interview in the case that one of the primarily selected HHs could not be interviewed.
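The allocation described above (40 primary and 20 replacement HHs per atoll, drawn in a single stage) can be illustrated with a minimal Python sketch. The frame contents, the frame size of 90 HHs per atoll, and the function name `select_households` are hypothetical and used only for illustration; the actual selection software is not described in this documentation.

```python
import random

def select_households(frame, n_primary=40, n_replacement=20, seed=None):
    """Draw primary and replacement households from one atoll's listing.

    A single simple random draw without replacement is split into the
    primary sample and the replacement pool, so the two never overlap.
    """
    rng = random.Random(seed)
    drawn = rng.sample(frame, n_primary + n_replacement)
    return drawn[:n_primary], drawn[n_primary:]

# Hypothetical household listings: 90 occupied HHs per atoll (assumed size).
frames = {
    atoll: [f"{atoll}-{i:03d}" for i in range(1, 91)]
    for atoll in ("Fakaofo", "Nukunonu", "Atafu")
}

selection = {atoll: select_households(hhs, seed=1) for atoll, hhs in frames.items()}
total_primary = sum(len(primary) for primary, _ in selection.values())
```

With these assumed inputs, `total_primary` matches the quota of 120 HHs, and each atoll holds 20 replacements in reserve.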
Face-to-face [f2f]
The questionnaires for this Household Income and Expenditure Survey (HIES) are composed of a diary and 4 modules published in English and in Tokelauan. All English questionnaires and modules are provided as external resources.
Here is the list of the questionnaires for this 2015-2016 HIES: - Diary: weeks 1 and 2; - Module 1: Demographic information (Household listing, Demographic profile, Activities, Educational status, Communication status, etc.); - Module 2: Household expenditure (Housing characteristics, Housing tenure expenditure, Utilities and communication, Land and home, etc.); - Module 3: Individual expenditure (Education, Health, Clothing, Communication, Luxury items, Alcohol & tobacco); - Module 4: Household and individual income (Wages and salary, Agricultural and forestry activities, Fishing, gathering and hunting activities, Livestock and aquaculture activities, etc.).
All inconsistencies and missing values were corrected using a variety of methods: 1. Manual correction: verified on actual questionnaires (double check on the form, questionnaire notes, local knowledge, manual verifications) 2. Subjective: the answer is obvious and can be deduced from other questions 3. Donor hot deck: the value is imputed based on similar characteristics from other HHs or individuals (see example below) 4. Donor median: missing values or outliers were imputed from the median value reported for similar items 5. Record deletion: the record was filled by mistake and had to be removed.
Several questions used the hot deck method to impute missing and outlying values. This method can use one to three dimensions, depending on the section and module in which the question was placed. The process works by placing valid values in a coded matrix. For example, in Tokelau the “Drink Alcohol” questions used a three-dimension hot deck to store in-range reported data. The constraining dimensions are the AGE, SEX and RELATIONSHIP questions, which act as a key for the hot deck. On the first pass, the valid yes/no responses are placed into this three-dimension hot deck. On the second pass, the data in the matrix are updated one person at a time. If a “Drink Alcohol” question contains a missing response, the person's coded age, sex and relationship key is searched for in the “valid” matrix. Once the key is found, the result contained in the matrix is imputed for the missing value. The preferred method for correcting missing or outlying data is manual correction (trying to obtain the real value, which could have been mis-keyed or reported incorrectly). If manual correction was unsuccessful, a subjective approach was used; the next method was the hot deck, then the donor median, and the last resort was record deletion.
The survey procedure and enumeration team structure allowed for in-round data entry, which gave the field staff the opportunity to correct the data by manual review and by using the error messages generated by the entry system. This process was designed to improve data quality. The data entry system used system-controlled entry, interactive coding, and validity and consistency checks. Despite these checks, the data still required cleaning. The cleaning was a two-stage process: the first stage was manual cleaning while referencing the questionnaire, and the second stage involved computer-assisted code verification and, in some cases, imputation.
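The two-pass hot deck procedure described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the survey's actual processing code: the field names (`age_group`, `sex`, `relationship`, `drinks_alcohol`), the use of `None` for a missing response, and the choice to leave a value missing when no donor exists for the key are all hypothetical.

```python
def hot_deck_impute(people, target,
                    key_fields=("age_group", "sex", "relationship"),
                    valid_values=("yes", "no")):
    """Impute missing values of `target` from a hot deck keyed on `key_fields`."""
    deck = {}

    # First pass: load every valid (in-range) response into the deck.
    for person in people:
        if person.get(target) in valid_values:
            key = tuple(person[f] for f in key_fields)
            deck[key] = person[target]

    # Second pass: go through one person at a time. A valid response
    # refreshes its cell; a missing response takes the cell's current value.
    result = []
    for person in people:
        row = dict(person)  # copy, so the input records stay untouched
        key = tuple(row[f] for f in key_fields)
        if row.get(target) in valid_values:
            deck[key] = row[target]
        elif key in deck:
            row[target] = deck[key]
            row["imputed"] = True
        # Assumption: with no donor for this key, the value stays missing
        # and falls through to the next method (e.g. donor median).
        result.append(row)
    return result

people = [
    {"age_group": "25-34", "sex": "M", "relationship": "head", "drinks_alcohol": "yes"},
    {"age_group": "25-34", "sex": "M", "relationship": "head", "drinks_alcohol": None},
    {"age_group": "25-34", "sex": "M", "relationship": "head", "drinks_alcohol": "no"},
    {"age_group": "65+", "sex": "F", "relationship": "parent", "drinks_alcohol": None},
]
result = hot_deck_impute(people, "drinks_alcohol")
```

In this example the second person receives the value of the most recent donor with the same key, while the fourth person has no donor and remains missing.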
Once the data were clean, verified and consistent, they were recoded to form a final aggregated database, consisting of: Person level record - characteristics of every household (HH) member, including activity and education profile; HH level record - characteristics of the dwelling and access to services; Final aggregated income - all HH income streams, by category and type; Final aggregated expenditure - all HH expenditure items, by category and type.
Overall, 99% of the response rate objective was achieved.
Refer to Appendix 2 of the Tokelau 2015/2016 Household Income and Expenditure Survey report attached as an external resource.
This data collection contains information gathered in the Survey of Income and Education (SIE) conducted in April-July 1976 by the Census Bureau for the United States Department of Health, Education, and Welfare (HEW). Although national estimates of the number of children in poverty were available each year from the Census Bureau's Current Population Survey (CPS), those estimates were not statistically reliable on a state-by-state basis. In enacting the Educational Amendments of 1974, Congress mandated that HEW conduct a survey to obtain reliable state-by-state data on the numbers of school-age children in local areas with family incomes below the federal poverty level. This was the statistic that determined the amount of grant a local educational agency was entitled to under Title 1, Elementary and Secondary Education Act of 1965. (Such funds were distributed by HEW's Office of Education.) The SIE was the survey created to fulfill that mandate. Its questions include those used in the Current Population Survey regarding current employment, past work experience, and income. Additional questions covering school enrollment, disability, health insurance, bilingualism, food stamp recipiency, assets, and housing costs enabled the study of the poverty concept and of program effectiveness in reaching target groups. Basic household information also was recorded, including tenure of unit (a determination of whether the occupants of the living quarters owned, rented, or occupied the unit without rent), type of unit, household language, and for each member of the household: age, sex, race, ethnicity, marital history, and education.
The survey was conducted during December 2006, following an initial mini census listing exercise conducted about two months earlier, in late September 2006. The objectives of the HIES were as follows: a) Provide information on income and expenditure distribution within the population; b) Provide income estimates of the household sector for the national accounts; c) Provide data for the rebasing of the consumer price index; d) Provide data for the analysis of poverty and hardship.
National coverage: whole island was covered for the survey.
The survey covered all private households on the island of Nauru. When the survey was in the field, interviewers were further required to reduce the scope by removing those households which had not been residing in Nauru for the last 12 months and did not intend to stay in Nauru for the next 12 months. Persons living in special dwellings (Hospital, Prison, etc) were not included in the survey.
Sample survey data [ssd]
The sample size adopted for the survey was 500 households which allowed for expected sample loss, whilst still maintaining a suitable responding sample for the analysis.
Before the sample was selected, the population was stratified by constituency in order to assist with the logistical issues associated with the fieldwork. There were eight constituencies in total, along with "Location", which stretches across the districts of Denigomodu and Aiwo, forming nine strata in total. Although constituency level analysis was not a priority for the survey, sample sizes within each stratum were kept to a minimum of 40 households, to enable some basic forms of analysis at this level if required.
The sample selection procedure within each stratum was then to sort each household on the frame by household size (number of people), and then run a systematic skip through the list in order to achieve the desirable sample size.
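The within-stratum procedure above (sort by household size, then take a systematic skip through the list) can be sketched in Python. The function name, the record layout, and the use of a random fractional start are assumptions for illustration; the documentation does not state how the starting point was chosen.

```python
import random

def systematic_sample(households, n, seed=None):
    """Select n households by a systematic skip through a size-sorted list."""
    if not 0 < n <= len(households):
        raise ValueError("n must be between 1 and the number of households")
    # Sorting by household size spreads the sample across small and large HHs.
    ordered = sorted(households, key=lambda hh: hh["size"])
    step = len(ordered) / n                    # skip interval (may be fractional)
    start = random.Random(seed).random() * step  # assumed random start in [0, step)
    last = len(ordered) - 1
    # min() guards against a floating-point overshoot on the final pick.
    return [ordered[min(int(start + i * step), last)] for i in range(n)]

# Hypothetical stratum frame of 200 households with sizes 1 to 10.
size_rng = random.Random(0)
frame = [{"id": i, "size": size_rng.randint(1, 10)} for i in range(200)]
sample = systematic_sample(frame, 40, seed=2)
```

Because the list is sorted before the skip is applied, the selected households come back in order of increasing size, which makes it easy to check that all size groups are represented.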
No deviations from the sample design took place.
Face-to-face [f2f]
The survey schedules adopted for the Household Income and Expenditure Survey (HIES) included the following: · Expenditure questionnaire; · Income questionnaire; · Miscellaneous questionnaire; · Diary (x2).
Whilst a Household Control Form collecting basic demographics is also normally included with the survey, this wasn't required for this HIES as this activity took place for all households in the mini census.
Information collected in the four schedules covered the following: -Expenditure questionnaire: Covers basic details about the dwelling structure and its access to things like water and sanitation. It was also used as the vehicle to collect expenditure on major and infrequent expenditures incurred by the household. -Income questionnaire: Covers each of the main types of household income generated by the household such as wages and salaries, business income and income from subsistence activities. -Miscellaneous questionnaire: Covers topics relating to health access, labour force status and education. -Diary: Covers all day to day expenditures incurred by the household, consumption of items produced by the household such as fish and crops, and gifts both received and given by the household.
All questionnaires are provided as External Resources.
There were 3 phases to the editing process for the 2006 Household Income and Expenditure Survey (HIES) of Nauru which included: 1. Data Verification operations; 2. Data Editing operations; 3. Data Auditing operations.
The software used for data editing is CSPro 3.0. After each batch is completed the supervisor should check that all person details have been entered from the household listing form (HCF) and should review the income and expenditure questionnaires for each batch, ensuring that all items have been entered correctly. Any omitted or incorrect items should be entered into the system. The supervisor is required to perform outlier checks (large or small values) on the batched diary data by calculating unit price (amount/quantity) and comparing prices for each item. This is to be conducted by loading the data into Excel files and sorting data by unit price for each item. Any changes to prices or quantities will be made on the batch file.
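The unit price check described above was carried out in Excel. The Python sketch below shows the same idea under stated assumptions: the field names and the rule of flagging prices more than five times above or below the item's median (`ratio=5.0`) are hypothetical, since the source only says that prices were sorted and compared by item.

```python
from collections import defaultdict
from statistics import median

def flag_unit_price_outliers(entries, ratio=5.0):
    """Return diary entries whose unit price is far from the item's median."""
    prices_by_item = defaultdict(list)
    for e in entries:
        if e["quantity"] > 0:
            prices_by_item[e["item"]].append(e["amount"] / e["quantity"])
    medians = {item: median(prices) for item, prices in prices_by_item.items()}

    flagged = []
    for e in entries:
        if e["quantity"] <= 0:
            flagged.append(e)  # unit price cannot be computed; needs review
            continue
        unit_price = e["amount"] / e["quantity"]
        mid = medians[e["item"]]
        # Assumed rule: flag prices more than `ratio` times above or below the median.
        if mid > 0 and (unit_price > mid * ratio or unit_price < mid / ratio):
            flagged.append(e)
    return flagged

# Hypothetical diary batch; the last entry looks like cents keyed as dollars.
diary = [
    {"item": "rice", "amount": 4.0, "quantity": 2.0},
    {"item": "rice", "amount": 2.2, "quantity": 1.0},
    {"item": "rice", "amount": 5.7, "quantity": 3.0},
    {"item": "rice", "amount": 400.0, "quantity": 2.0},
]
outliers = flag_unit_price_outliers(diary)
```

Flagged entries are candidates for review against the paper diary, not automatic corrections.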
For more information on what each phase entailed, go to the document HIES Processing Instructions attached to this documentation.
The survey response rates were a lot lower than expected, especially in some districts. The districts of Aiwo, Uaboe and Denigomodu had the lowest response rates, with 16.7%, 20.0% and 34.8% respectively. The area of Location was also extremely low, with a response rate of 32.2%. On a more positive note, the districts of Yaren, Ewa, Anabar, Ijuw and Anibare all had response rates of 80.0% or better.
The major contributing factor to the low response rates was households refusing to take part in the survey. The response figures above include only fully responding households, and given there were many partial responses, this also brought the values down. The other significant contributing factor to the low response rates was the interviewers not being able to make contact with the household during the survey period.
Unfortunately, low response rates not only increase the sampling error of the survey estimates, because the final sample is smaller, they also introduce non-response bias into the final estimates. Non-response bias takes place when the households responding to the survey possess different characteristics to the households not responding, thus generating different results to what would have been achieved if all selected households had responded. It is extremely difficult to measure the impact of the non-response bias, as little information is generally known about the non-responding households in the survey. For the Nauru 2006 HIES, however, it was noted during the fieldwork that the Chinese population residing in Nauru was more likely not to respond. Given it is expected their income and expenditure patterns would differ from the rest of the population, this would contribute to the magnitude of the bias.
Below is the list of all response rates by district: -Yaren: 80.5% -Boe: 70% -Aiwo: 16.7% -Buada: 62.5% -Denigomodu: 34.8% -Nibok: 68.4% -Uaboe: 20% -Baitsi: 47.8% -Ewa: 80% -Anetan: 76.5% -Anabar: 81.8% -Ijuw: 85.7% -Anibare: 80% -Meneng: 64.3% -Location: 32.2% -TOTAL: 54.4%
To determine the impact of sampling error on the survey results, relative standard errors (RSEs) for key estimates were produced. When interpreting these results, one must remember that these figures don't include any of the non-sampling errors discussed in other sections of this documentation.
To also provide a rough guide on how to interpret the RSEs provided in the main report, the following information can be used:
Category           Description
RSE < 5%           Estimate can be regarded as very reliable
5% < RSE < 10%     Estimate can be regarded as good and usable
10% < RSE < 25%    Estimate can be considered usable, with caution
RSE > 25%          Estimate should only be used with extreme caution
The actual RSEs for the key estimates can be found in Section 4.1 of the main report.
As can be seen from these tables, the estimates for Total Income and Total Expenditure from the Household Income and Expenditure Survey (HIES) can be considered to be very good, from a sampling error perspective. The same can also be said for the Wage and Salary estimate in income and the Food estimate in expenditure, which make up a high proportion of each respective group.
Many of the other estimates should be used with caution, depending on the magnitude of their RSE. Some of these high RSEs are to be expected, due to the expected degree of variability for how households would report for these items. For example, with Business Income (RSE 56.8%), most households would report no business income as no household members undertook this activity, whereas other households would report large business incomes as it's their main source of income.
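The RSE guide given earlier can be applied programmatically. The sketch below computes an RSE as the standard error divided by the estimate, expressed as a percentage, and maps it to the guide's categories. The function names are hypothetical, and because the guide uses strict inequalities, the treatment of values falling exactly on 5%, 10% or 25% is an assumption.

```python
def relative_standard_error(estimate, standard_error):
    """RSE in percent: standard error relative to the size of the estimate."""
    if estimate == 0:
        raise ValueError("RSE is undefined for an estimate of zero")
    return 100.0 * standard_error / abs(estimate)

def rse_category(rse_percent):
    """Map an RSE (in percent) to the reliability guide's description."""
    # Assumption: boundary values (exactly 5, 10 or 25) fall into the
    # band shown here; the guide itself leaves them unassigned.
    if rse_percent < 5:
        return "very reliable"
    if rse_percent < 10:
        return "good and usable"
    if rse_percent <= 25:
        return "usable, with caution"
    return "use only with extreme caution"

# The Business Income RSE reported in this documentation.
business_income_category = rse_category(56.8)
```

For the Business Income estimate (RSE 56.8%), the guide's last category applies.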
Other than the non-response issues discussed in this documentation, other quality issues were identified, which included:
1) Reporting errors
Some of the different aspects contributing to the reporting errors generated from the survey, with some examples/explanations for each, include the following:
a) Misinterpretation of survey questions: A common mistake which takes place when conducting a survey is that the person responding to the questionnaire may interpret a question differently to the interviewer, who in turn may have interpreted the question differently to the people who designed the questionnaire. Some examples of this for a Household Income and Expenditure Survey (HIES) can include people providing answers in dollars and cents, instead of just dollars, or the reference/recall period for an “income” or “expenditure” is misunderstood. These errors can often see reported amounts out by a factor of 10 or even 100, which can have major impacts on final results.
b) Recall problems for the questionnaire information: The majority of questions in both of the income and expenditure questionnaires require the respondent to recall what took place over a 12 month period. As would be expected, people will often forget what took place up to 12 months ago so some
This data collection is part of a longitudinal survey designed to provide detailed information on the economic situation of households and persons in the United States. These data examine the distribution of income, wealth, and poverty in American society and gauge the effects of federal and state programs on the well-being of families and individuals. There are three basic elements contained in the survey. The first is a control card that records basic social and demographic characteristics for each person in a household, as well as changes in such characteristics over the course of the interviewing period. The second element is the core portion of the questionnaire, with questions repeated at each interview on labor force activity, types and amounts of income, participation in various cash and noncash benefit programs, attendance in postsecondary schools, private health insurance coverage, public or subsidized rental housing, low-income energy assistance, and school breakfast and lunch participation. The third element consists of topical modules, which are series of supplemental questions asked during selected household visits. Topical modules were not created for the first or second waves of the 1985 panel. The topical module for Wave III contains information on assets and liabilities. Included are questions on loans, IRAs, medical bills, other debts, checking accounts, and savings bonds, as well as questions related to mortgages, royalties, and other investments, real estate property and vehicles, rental income, self-employment, and stocks and mutual fund shares. The Wave IV topical module contains information on fertility history, household relationships, marital history, migration history, support for non-household members, and work-related expenses. 
The topical module for Wave VI includes data on child care arrangements, child support agreements, support for non-household members, job offers, health status and utilization of health care services, long-term care, and disability status of children. Wave VII topical module contains information on assets and liabilities. Included are questions on pension plan coverage, lump sum distributions from pension plans, characteristics of job from which retired, and characteristics of home financing arrangements. Frequencies for each wave are also provided. Parts 27 and 28 of this study are the unedited research files for Wave V and Wave VIII Topical Modules, obtained from the Census Bureau. These files include data on annual income, retirement accounts, taxes, school enrollment, and financing. These two topical module files have not been edited nor imputed, although they have been topcoded or bottomcoded and recoded if necessary by the Census Bureau to avoid disclosure of individual respondents' identities. (Source: downloaded from ICPSR 7/13/10)
THE CLEANED AND HARMONIZED VERSION OF THE SURVEY DATA PRODUCED AND PUBLISHED BY THE ECONOMIC RESEARCH FORUM REPRESENTS 25% OF THE ORIGINAL SURVEY DATA COLLECTED BY THE DEPARTMENT OF STATISTICS OF THE HASHEMITE KINGDOM OF JORDAN
Surveys related to the family budget are considered among the most important survey types carried out by the Department of Statistics, since they provide data on household expenditure and income and their relationship with different indicators. Therefore, most countries undertake periodic surveys on household income and expenditure. Since its establishment, the Department of Statistics has conducted a series of Expenditure and Income Surveys in the years 1966, 1980, 1986/1987, 1992, 1997, 2002/2003, 2006/2007, 2008/2009 and 2010/2011. Because of continuous changes in spending patterns, income levels and prices, as well as in the internal and external migration of the population, it was necessary to update data on household income and expenditure over time. Hence the need arose to implement the Household Expenditure and Income Survey for the year 2013.
The survey was then conducted to achieve the following objectives: 1. Provide data on income and expenditure to enable computation of poverty indices and determine the characteristics of the poor and prepare poverty maps. 2. Provide data weights that reflect the relative importance of consumer expenditure items used in the preparation of the consumer price index. 3. Provide the necessary data for the national accounts related to overall consumption and income of the household sector. 4. Provide the data necessary for the formulation, follow-up and evaluation of economic and social development programs, including those addressed to eradicate poverty. 5. Identify consumer spending patterns prevailing in the society, and the impact of demographic, social and economic variables on those patterns. 6. Calculate the average annual income of the household and the individual, and identify the relationship between income and different socio-economic factors, such as profession and educational level of the head of the household and other indicators. 7. Study the distribution of individuals and households by income and expenditure categories and analyze the factors associated with it.
The raw survey data provided by the Statistical Agency were cleaned and harmonized by the Economic Research Forum, in the context of a major project that started in 2009, during which extensive efforts have been exerted to acquire, clean, harmonize, preserve and disseminate micro data of existing household surveys in several Arab countries.
The General Census of Population and Housing in 2004 provided a detailed framework for housing and households at different administrative levels in the Kingdom. The Kingdom is administratively divided into 12 governorates; each governorate is composed of a number of districts, and each district (Liwa) includes one or more sub-districts (Qada). In each sub-district, there are a number of communities (cities and villages). Each community was divided into a number of blocks, each containing between 60 and 100 houses. Nomads and persons living in collective dwellings such as hotels, hospitals and prisons were excluded from the survey framework.
1- Household/family. 2- Individual/person.
The survey covered a national sample of households and all individuals permanently residing in surveyed households.
Sample survey data [ssd]
The Household Expenditure and Income Survey sample for the year 2013 was designed to serve the basic objectives of the survey by providing a relatively large sample in each sub-district, to enable drawing a poverty map of Jordan. A two-stage stratified cluster sampling technique was used. In the first stage, clusters were selected with probability proportional to size, where the number of households in each cluster was taken as the weight of the cluster. In the second stage, a sample of 10 households was selected from each cluster using a systematic sampling technique, together with another 5 households selected as a backup for the basic sample. Those 5 households were to be used during the first visit to the block in case a visit to an originally selected household was not possible for any reason. For the purposes of this survey, each sub-district was considered a separate stratum, to make it possible to produce results at the sub-district level. In this respect, the survey adopted the framework provided by the General Census of Population and Housing in dividing the sample strata. To estimate the sample size, the coefficient of variation and the design effect of the expenditure variable in the Household Expenditure and Income Survey for the year 2010 were calculated for each sub-district. These results were used to estimate the sample size at the sub-district level so that the coefficient of variation for the expenditure variable in each sub-district would be less than 10%, subject to a minimum number of clusters in each sub-district (8 clusters). This was to ensure adequate representation of clusters in different administrative areas, to enable drawing an indicative poverty map. It should be noted that, in addition to the standard non-response rate assumed, higher rates were expected in areas of major cities where poor households are concentrated. Therefore, those rates were taken into consideration during the sampling design phase, and a higher number of households was selected from those areas, with the aim of covering well all regions where poverty is widespread.
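The first-stage selection described above (clusters chosen with probability proportional to the number of households) is commonly implemented as a systematic pass over cumulative sizes. The Python sketch below illustrates that approach. It is an assumption that the survey used this exact variant, and the cluster sizes and function name are hypothetical.

```python
import random
from itertools import accumulate

def pps_systematic(cluster_sizes, n_clusters, seed=None):
    """Select cluster indices with probability proportional to size.

    Selection points are laid out at a fixed interval along the cumulative
    household count; each point selects the cluster whose range contains it.
    """
    cumulative = list(accumulate(cluster_sizes))
    step = cumulative[-1] / n_clusters
    start = random.Random(seed).random() * step  # random start in [0, step)
    selected, idx = [], 0
    for i in range(n_clusters):
        point = start + i * step
        # Advance to the first cluster whose cumulative total exceeds the point.
        while idx < len(cumulative) - 1 and cumulative[idx] <= point:
            idx += 1
        selected.append(idx)
    return selected

# Hypothetical stratum: 40 blocks of 60 to 100 households each.
size_rng = random.Random(0)
sizes = [size_rng.randint(60, 100) for _ in range(40)]
chosen = pps_systematic(sizes, n_clusters=8, seed=3)
```

A cluster larger than the sampling interval could be selected more than once under this scheme; with the block sizes assumed here the interval is several times larger than any block, so the eight selected clusters are distinct.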
Face-to-face [f2f]
To reach the survey objectives, 3 forms have been developed. Those forms were finalized after being tested and reviewed by specialists taking into account making the data entry, and validation, process on the computer as simple as possible.
(1) General Form/Questionnaire This form includes: - Housing characteristics such as geographic location variables, household area, predominant building material for external walls, type of tenure, monthly rent or lease, main source of water, lighting, heating and cooking fuel, sanitation type and water cycle, and the number of rooms in the dwelling, in addition to the ownership status of some home appliances and a car. - Characteristics of household members: This part focused on the social characteristics of the family members, such as relation to the head of the family, gender, age, educational status and marital status. It also included economic characteristics such as economic activity, main occupation, employment status and labor sector. In addition, it included questions on whether each individual continued to stay with the family, in order to update the information at the end of each of the four rounds of the survey. - Income section, which included three parts: · Family ownership of assets · Productive activities for the family · Current income sources
(2) Expenditure on food commodities form/Questionnaire This form indicates expenditure data on 17 consumption groups. Each group includes a number of food commodities, with the exception of the last group, which was confined to some non-food goods and services that, like food commodities, are purchased on a frequent, daily basis. For the purposes of the efficient use of results, expenditure data of the last group was moved to the non-food commodities expenditure. The form also includes estimated amounts of own-produced food items and those received as gifts or in kind, as well as the spending of servants living with the family who buy food for themselves from their own wages.
(3) Expenditure on non-food commodities form/Questionnaire This form indicates expenditure data on 11 groups of non-food items, and 5 sets of spending on services, in addition to a group of consumption expenditure. It also includes an estimate of self-consumption, and non-food gifts or other items in an in-kind form received or sent by the household, as well as servants living with the family spending on themselves from their own wages to buy non-food items.
----> Raw Data
The data collection phase was then followed by the data processing stage, accomplished through the following procedures: 1- Organizing forms/questionnaires An archive system compatible with the nature of the subsequent operations was used to classify the forms according to the different rounds throughout the year. This made it possible to extract the forms efficiently when required for processing. A registry was prepared to record the different stages of data checking, coding and entry until the forms were returned to the archive system. 2- Data office checking This phase was carried out concurrently with the data collection phase in the field, with questionnaires completed in the fieldwork sent immediately to the data office checking phase. 3- Data coding A team was trained to work on the data coding phase, which in this survey was limited to education specialization, profession and economic activity. In this respect, international classifications were used, while for the rest of the questions, all codes were predefined
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
analyze the survey of income and program participation (sipp) with r if the census bureau's budget was gutted and only one complex sample survey survived, pray it's the survey of income and program participation (sipp). it's giant. it's rich with variables. it's monthly. it follows households over three, four, now five year panels. the congressional budget office uses it for their health insurance simulation. analysts read that sipp has person-month files, get scurred, and retreat to inferior options. the american community survey may be the mount everest of survey data, but sipp is most certainly the amazon. questions swing wild and free through the jungle canopy i mean core data dictionary. legend has it that there are still species of topical module variables that scientists like you have yet to analyze. ponce de león would've loved it here. ponce. what a name. what a guy. the sipp 2008 panel data started from a sample of 105,663 individuals in 42,030 households. once the sample gets drawn, the census bureau surveys one-fourth of the respondents every four months, over four or five years (panel durations vary). you absolutely must read and understand pdf pages 3, 4, and 5 of this document before starting any analysis (start at the header 'waves and rotation groups'). if you don't comprehend what's going on, try their survey design tutorial. since sipp collects information from respondents regarding every month over the duration of the panel, you'll need to be hyper-aware of whether you want your results to be point-in-time, annualized, or specific to some other period. the analysis scripts below provide examples of each. at every four-month interview point, every respondent answers every core question for the previous four months. after that, wave-specific addenda (called topical modules) get asked, but generally only regarding a single prior month. to repeat: core wave files contain four records per person, topical modules contain one.
if you stacked every core wave, you would have one record per person per month for the duration of the panel. mmmassive. ~100,000 respondents x 12 months x ~4 years. have an analysis plan before you start writing code so you extract exactly what you need, nothing more. better yet, modify something of mine. cool? this new github repository contains eight, you read me, eight scripts: 1996 panel - download and create database.R 2001 panel - download and create database.R 2004 panel - download and create database.R 2008 panel - download and create database.R since some variables are character strings in one file and integers in another, initiate an r function to harmonize variable class inconsistencies in the sas importation scripts properly handle the parentheses seen in a few of the sas importation scripts, because the SAScii package currently does not create an rsqlite database, initiate a variant of the read.SAScii
function that imports ascii data directly into a sql database (.db) download each microdata file - weights, topical modules, everything - then read 'em into sql 2008 panel - full year analysis examples.R define which waves and specific variables to pull into ram, based on the year chosen loop through each of twelve months, constructing a single-year temporary table inside the database read that twelve-month file into working memory, then save it for faster loading later if you like read the main and replicate weights columns into working memory too, merge everything construct a few annualized and demographic columns using all twelve months' worth of information construct a replicate-weighted complex sample design with a fay's adjustment factor of one-half, again save it for faster loading later, only if you're so inclined reproduce census-published statistics, not precisely (due to topcoding described here on pdf page 19) 2008 panel - point-in-time analysis examples.R define which wave(s) and specific variables to pull into ram, based on the calendar month chosen read that interview point (srefmon)- or calendar month (rhcalmn)-based file into working memory read the topical module and replicate weights files into working memory too, merge it like you mean it construct a few new, exciting variables using both core and topical module questions construct a replicate-weighted complex sample design with a fay's adjustment factor of one-half reproduce census-published statistics, not exactly cuz the authors of this brief used the generalized variance formula (gvf) to calculate the margin of error - see pdf page 4 for more detail - the friendly statisticians at census recommend using the replicate weights whenever possible. oh hayy, now it is.
2008 panel - median value of household assets.R define which wave(s) and specific variables to pull into ram, based on the topical module chosen read the topical module and replicate weights files into working memory too, merge once again construct a replicate-weighted complex sample design with a...
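The script descriptions above repeatedly mention a replicate-weighted design with a Fay's adjustment factor of one-half. As a rough illustration of what that factor does, here is a minimal Python sketch of the standard Fay's balanced repeated replication variance formula. It is not the code in the R scripts; the function name and the toy replicate estimates are hypothetical.

```python
import math
from typing import Sequence


def fay_brr_variance(full_estimate: float,
                     replicate_estimates: Sequence[float],
                     fay_k: float = 0.5) -> float:
    """Fay's BRR variance: sum of squared deviations of the replicate
    estimates from the full-sample estimate, divided by R * (1 - k)^2."""
    n_reps = len(replicate_estimates)
    if n_reps == 0:
        raise ValueError("at least one replicate estimate is required")
    if not 0 <= fay_k < 1:
        raise ValueError("fay_k must be in [0, 1)")
    squared_dev = sum((rep - full_estimate) ** 2 for rep in replicate_estimates)
    return squared_dev / (n_reps * (1 - fay_k) ** 2)


# Toy numbers only (hypothetical): one full-sample estimate and four
# estimates recomputed with four sets of replicate weights.
variance = fay_brr_variance(10.0, [9.0, 11.0, 10.0, 10.0], fay_k=0.5)
standard_error = math.sqrt(variance)
```

With a factor of one-half the divisor becomes R/4, so the squared deviations are scaled up by four relative to a plain average over replicates.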
https://www.icpsr.umich.edu/web/ICPSR/studies/36801/terms
The 2015 American Housing Survey marks the first release of a newly integrated national sample and independent metropolitan area samples. The 2015 release features many variable name revisions, as well as the integration of an AHS Codebook Interactive Tool available on the U.S. Census Bureau Web site. This data collection provides information on the characteristics of a national sample of housing units in 2015, including apartments, single-family homes, mobile homes, and vacant housing units. Data from the 15 largest metropolitan areas in the United States are included in the national sample survey (the AHS 2015 Metropolitan Data are also available as ICPSR 36805). The data are presented in three separate parts: Part 1, Household Record (Main Record), Part 2, Person Record, and Part 3, Project Record. Household Record data includes questions about household occupancy and tenure, household exterior and interior structural features, household equipment and appliances, housing problems, housing costs, home improvement, neighborhood features, recent moving information, income, and basic demographic information. The household record data also features four rotating topical modules: Arts and Culture, Food Security, Housing Counseling, and Healthy Homes. Person Record data includes questions about personal disabilities, income, and basic demographic information. Finally, the Project Record data includes questions about home improvement projects. Specific questions were asked about the types of projects, costs, funding sources, and year of completion.
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
Although the American Community Survey (ACS) produces population, demographic and housing unit estimates, it is the Census Bureau's Population Estimates Program that produces and disseminates the official estimates of the population for the nation, states, counties, cities, and towns and estimates of housing units for states and counties.
Supporting documentation on code lists, subject definitions, data accuracy, and statistical testing can be found on the American Community Survey website in the Technical Documentation section. Sample size and data quality measures (including coverage rates, allocation rates, and response rates) can be found on the American Community Survey website in the Methodology section.
Source: U.S. Census Bureau, 2019 American Community Survey 1-Year Estimates.
Data are based on a sample and are subject to sampling variability. The degree of uncertainty for an estimate arising from sampling variability is represented through the use of a margin of error. The value shown here is the 90 percent margin of error. The margin of error can be interpreted roughly as providing a 90 percent probability that the interval defined by the estimate minus the margin of error and the estimate plus the margin of error (the lower and upper confidence bounds) contains the true value. In addition to sampling variability, the ACS estimates are subject to nonsampling error (for a discussion of nonsampling variability, see ACS Technical Documentation). The effect of nonsampling error is not represented in these tables.
Between 2018 and 2019 the American Community Survey retirement income question changed. These changes resulted in an increase in both the number of households reporting retirement income and higher aggregate retirement income at the national level. For more information see Changes to the Retirement Income Question.
The categories for relationship to householder were revised in 2019. For more information see Revisions to the Relationship to Household item.
The 2019 American Community Survey (ACS) data generally reflect the September 2018 Office of Management and Budget (OMB) delineations of metropolitan and micropolitan statistical areas. In certain instances the names, codes, and boundaries of the principal cities shown in ACS tables may differ from the OMB delineations due to differences in the effective dates of the geographic entities.
Estimates of urban and rural populations, housing units, and characteristics reflect boundaries of urban areas defined based on Census 2010 data. As a result, data for urban and rural areas from the ACS do not necessarily reflect the results of ongoing urbanization.
Explanation of Symbols:
An "**" entry in the margin of error column indicates that either no sample observations or too few sample observations were available to compute a standard error and thus the margin of error. A statistical test is not appropriate.
An "-" entry in the estimate column indicates that either no sample observations or too few sample observations were available to compute an estimate, or a ratio of medians cannot be calculated because one or both of the median estimates falls in the lowest interval or upper interval of an open-ended distribution, or the margin of error associated with a median was larger than the median itself.
An "-" following a median estimate means the median falls in the lowest interval of an open-ended distribution.
An "+" following a median estimate means the median falls in the upper interval of an open-ended distribution.
An "***" entry in the margin of error column indicates that the median falls in the lowest interval or upper interval of an open-ended distribution. A statistical test is not appropriate.
An "*****" entry in the margin of error column indicates that the estimate is controlled. A statistical test for sampling variability is not appropriate.
An "N" entry in the estimate and margin of error columns indicates that data for this geographic area cannot be displayed because the number of sample cases is too small.
An "(X)" means that the estimate is not applicable or not available.
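The margin-of-error interpretation in the notes above can be made concrete with a short Python sketch. It computes the lower and upper confidence bounds from an estimate and its 90 percent margin of error, and recovers an approximate standard error by dividing by 1.645, the normal critical value for a 90 percent interval. The function names and the example figures are hypothetical and are not taken from any ACS table.

```python
Z_90 = 1.645  # normal critical value for a 90 percent confidence level


def confidence_bounds(estimate: float, margin_of_error: float) -> tuple:
    """Lower and upper bounds: estimate minus and plus the margin of error."""
    return (estimate - margin_of_error, estimate + margin_of_error)


def standard_error_from_moe(margin_of_error: float, z: float = Z_90) -> float:
    """Approximate standard error implied by a margin of error."""
    return margin_of_error / z


# Hypothetical estimate of 50,000 with a 90 percent margin of error of 1,200.
lower, upper = confidence_bounds(50_000, 1_200)
se = standard_error_from_moe(1_200)
```

Entries flagged with the symbols listed above (for example "**" or "***") have no usable margin of error, so a calculation like this would not apply to them.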
This data collection is part of a longitudinal survey designed to provide detailed information on the economic situation of households and persons in the United States. These data examine the distribution of income, wealth, and poverty in American society and gauge the effects of federal and state programs on the well-being of families and individuals. There are three basic elements contained in the survey. The first is a control card that records basic social and demographic characteristics for each person in a household, as well as changes in such characteristics over the course of the interviewing period. These include age, sex, race, ethnic origin, marital status, household relationship, education, and veteran status. Limited data are provided on housing unit characteristics such as units in structure, tenure, access, and complete kitchen facilities. The second element is the core portion of the questionnaire, with questions repeated at each interview on labor force activity, types and amounts of income, and participation in various cash and noncash benefit programs for each month of the four-month reference period. Data for employed persons include number of hours and weeks worked, earnings, and weeks without a job. Nonworkers are classified as unemployed or not in the labor force. In addition to providing income data associated with labor force activity, the core questions cover nearly 50 other types of income. Core data also include postsecondary school attendance, public or private subsidized rental housing, low-income energy assistance, and school breakfast and lunch participation. The third element consists of topical modules, which are a series of supplemental questions asked during selected household visits. Topical modules include some core data to link individuals to the core files. The Wave 1 Topical Module covers recipiency and employment history.
The Wave 2 Topical Module includes work disability, education and training, marital, migration, and fertility histories, and household relationships. The Wave 3 Topical Module covers medical expenses and utilization of health care, work-related expenses and child support, assets and liabilities, real estate, shelter costs, dependent care, vehicles, value of business, interest earning accounts, rental properties, stocks and mutual fund shares, mortgages, and other assets. The Wave 4 Topical Module covers work schedule, taxes, child care, and annual income and retirement accounts. Data in the Wave 5 Topical Module describe child support agreements, school enrollment and financing, support for non-household members, adult and child disability, and employer-provided health benefits. The Wave 6 Topical Module covers medical expenses and utilization of health care, work-related expenses, child support paid and child care poverty, assets and liabilities, real estate, shelter costs, dependent care, vehicles, value of business, interest earning accounts, rental properties, stocks and mutual fund shares, mortgages, and other financial investments. The Wave 7 Topical Module covers informal caregiving, children's well-being, and annual income and retirement accounts. The Wave 8 Topical Module and Wave 8 Welfare Reform Topical Module cover child support agreements, support for nonhousehold members, adult disability, child disability, adult well-being, and welfare reform. The Wave 9 Topical Module covers medical expenses and utilization of health care (adults and children), work-related expenses, child support paid and child care poverty, assets and liabilities, real estate, shelter costs, dependent care, vehicles, value of business, interest earning accounts, rental properties, stocks and mutual fund shares, mortgages, and other financial investments. (Source: downloaded from ICPSR 7/13/10)
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
Comparable household income measures are crucial for most social science analyses of cross-national public opinion survey data. However, income questions in many cross-national surveys suffer from comparability and interpretability limitations that have not been adequately addressed by the existing literature. In this article, we examine the income measure in one major survey, the World Values Survey (WVS), arguing that a variety of problems arise when drawing inferences - descriptive or causal, individual or aggregate - using the standard 10-category measure. We then propose and implement a number of corrections to these potential biases and present a series of diagnostics that confirm the importance of our proposed corrections. We conclude by documenting some of the same challenges in the income measures used in other cross-national surveys. The accompanying data set can be merged with the WVS to make better use of the income measure.
This is a longitudinal survey designed to provide detailed information on the economic situation of households and persons in the United States. These data examine the distribution of income, wealth, and poverty in American society and gauge the effects of federal and state programs on the well-being of families and individuals. There are three basic elements contained in the survey. The first is a control card that records basic social and demographic characteristics for each person in a household, as well as changes in such characteristics over the course of the interviewing period. The second element is the core portion of the questionnaire, with questions repeated at each interview on labor force activity, types and amounts of income, participation in various cash and noncash benefit programs, attendance in postsecondary schools, private health insurance coverage, public or subsidized rental housing, low-income energy assistance, and school breakfast and lunch participation. The third element consists of topical modules, which are a series of supplemental questions asked during selected household visits. Topical modules include some core data to help link individuals to the core files. 
Topical module data for the 1992 Panel cover the following topics: Topical Module 1 -- welfare and other aid recipiency and employment, Topical Module 2 -- work disability, education and training, marital status, migration, and fertility histories, Topical Module 3 -- extended measures of well-being, including consumer durables, living conditions, and basic needs, Topical Module 4 -- assets and liabilities, retirement expectations and pension plan coverage, real estate, property, and vehicles, Topical Module 5 -- school enrollment and financing, Topical Module 6 -- work schedules, child care, support for nonhousehold members, functional limitations and disabilities, utilization of health care services, and home-based self-employment and size of firm, Topical Module 7 -- selected financial assets, medical expenses and work disability, real estate, shelter costs, dependent care, and vehicles, Topical Module 8 -- school enrollment and financing, Topical Module 9 -- work schedule, child care, child support agreements, child support, support for nonhousehold members, functional limitations and disability, utilization of health care, functional limitations and disability of children, health status and utilization of health care services, and utilization of health care services for children. Parts 26 and 27 are the Wave 5 and Wave 8 Topical Module Microdata Research Files obtained from the Census Bureau. These two topical module files include data on annual income, retirement accounts and taxes, and school enrollment and financing. These topical module files have not been edited nor imputed, although they have been topcoded or bottomcoded and recoded if necessary by the Census Bureau to avoid disclosure of individual respondents' identities. (Source: downloaded from ICPSR 7/13/10)
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
The dataset consists of 113 responses taken directly from a Google Form survey comprising four demographic questions (age, sex, country, income), a single item on Love for Money, and the WHO-5 Wellbeing Questionnaire. This is a completely raw, anonymous dataset. The data were collected as part of a study examining the relationship between income and wellbeing as mediated or moderated by love for money.
This special survey consists of three topics:
- standard questions of the extended housing survey
- working conditions
- a question on income
The emphasis is on questions about working conditions, especially stress and environmental influences in the workplace. This is essentially a shortened repetition of the survey on working conditions from June 1994.
The purpose of the HIES survey is to obtain information on the income, consumption pattern, incidence of poverty, and saving propensities for different groups of people in Nauru. This information will be used to guide policy makers in framing socio-economic developmental policies and in initiating financial measures for improving economic conditions of the people.
Some more specific outputs from the survey are listed below:
a) To obtain expenditure weights and other useful data for the revision of the consumer price index;
b) To supplement the data available for use in compiling official estimates of household accounts in the systems of national accounts;
c) To supply basic data needed for policy making in connection with social and economic planning;
d) To provide data for assessing the impact on household living conditions of existing or proposed economic and social measures, particularly changes in the structure of household expenditures and in household consumption;
e) To gather information on poverty lines and incidence of poverty throughout Nauru.
National
The survey covered all private households on the island of Nauru. When the survey was in the field, interviewers were further required to reduce the scope by removing those households which had not been residing in Nauru for the last 12 months and did not intend to stay in Nauru for the next 12 months.
Persons living in special dwellings (hospitals, prisons, etc.) were not included in the survey.
Sample survey data [ssd]
The sample size adopted for the survey was 500 households which allowed for expected sample loss, whilst still maintaining a suitable responding sample for the analysis.
Before the sample was selected, the population was stratified by constituency in order to assist with the logistical issues associated with the fieldwork. There were eight constituencies in total, along with "Location", which stretches across the districts of Denigomodu and Aiwo, forming nine strata in total. Although constituency-level analysis was not a priority for the survey, sample sizes within each stratum were kept to a minimum of 40 households, to enable some basic forms of analysis at this level if required.
The sample selection procedure within each stratum was to sort the households on the frame by household size (number of people), and then run a systematic skip through the list to achieve the desired sample size.
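The within-stratum procedure described above (sort by household size, then skip systematically through the list) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the survey team's actual routine: the function name, the fractional skip interval, the random start, and the toy frame are all hypothetical.

```python
import random


def systematic_sample(frame, sample_size, sort_key, rng=None):
    """Sort the frame, then take every k-th unit from a random start,
    where k = frame size / sample size (a fractional skip is allowed)."""
    rng = rng or random.Random()
    ordered = sorted(frame, key=sort_key)
    frame_size = len(ordered)
    if not 0 < sample_size <= frame_size:
        raise ValueError("sample_size must be between 1 and the frame size")
    skip = frame_size / sample_size
    start = rng.random() * skip  # random start within the first interval
    positions = (int(start + i * skip) for i in range(sample_size))
    # Guard against floating-point rounding at the very end of the list.
    return [ordered[min(pos, frame_size - 1)] for pos in positions]


# Toy stratum of 100 households with made-up sizes; draw 40 of them.
stratum = [{"hh_id": i, "size": (i * 7) % 9 + 1} for i in range(100)]
selected = systematic_sample(stratum, 40, sort_key=lambda hh: hh["size"],
                             rng=random.Random(2006))
```

Sorting before skipping spreads the sample across small and large households, which is the usual motivation for ordering a frame this way.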
No deviations from the sample design took place.
Face-to-face [f2f] for questionnaires, self-enumeration for the diaries
The survey schedules adopted for the HIES included the following:
· Expenditure questionnaire
· Income questionnaire
· Miscellaneous questionnaire
· Diary (x2)
Whilst a Household Control Form collecting basic demographics is also normally included with the survey, this wasn't required for this HIES as this activity took place for all households in the mini census.
Information collected in the four schedules covered the following:
Expenditure questionnaire: Covers basic details about the dwelling structure and its access to things like water and sanitation. It was also used as the vehicle to collect expenditure on major and infrequent expenditures incurred by the household.
Income questionnaire: Covers each of the main types of household income generated by the household such as wages and salaries, business income and income from subsistence activities.
Miscellaneous questionnaire: Covers topics relating to health access, labour force status and education.
Diary: Covers all day to day expenditures incurred by the household, consumption of items produced by the household such as fish and crops, and gifts both received and given by the household.
There were 3 phases to the editing process for the 2006 Nauru HIES:
1. Data Verification operations
2. Data Editing operations
3. Data Auditing operations
For more information on what each phase entailed, go to the document HIES Processing Instructions attached to this documentation.
The survey response rates were a lot lower than expected, especially in some districts. The districts of Aiwo, Uaboe and Denigomodu had the lowest response rates, with 16.7%, 20.0% and 34.8% respectively. The area of Location was also extremely low, with a response rate of 32.2%. On a more positive note, the districts of Yaren, Ewa, Anabar, Ijuw and Anibare all had response rates of 80.0% or better.
The major contributing factor to the low response rates was households refusing to take part in the survey. The response figures above only include fully responding households, and given there were many partial responses, this also brought the values down. The other significant contributing factor to the low response rates was the interviewers not being able to make contact with the household during the survey period.
Unfortunately, low response rates not only increase the sampling error of the survey estimates, because the final sample is smaller, they can also introduce non-response bias into the final estimates. Non-response bias takes place when the households responding to the survey possess different characteristics to the households not responding, thus generating different results to what would have been achieved if all selected households had responded. It is extremely difficult to measure the impact of the non-response bias, as little information is generally known about the non-responding households in the survey. For the Nauru 2006 HIES, however, it was noted during the fieldwork that the Chinese population residing in Nauru was more likely not to respond. Given it is expected their income and expenditure patterns would differ from the rest of the population, this would contribute to the magnitude of the bias.
To determine the impact of sampling error on the survey results, relative standard errors (RSEs) for key estimates were produced. When interpreting these results, one must remember that these figures don't include any of the non-sampling errors discussed in other sections of this documentation.
To also provide a rough guide on how to interpret the RSEs provided in the main report, the following information can be used:
Category Description
RSE < 5% Estimate can be regarded as very reliable
5% < RSE < 10% Estimate can be regarded as good and usable
10% < RSE < 25% Estimate can be considered usable, with caution
RSE > 25% Estimate should only be used with extreme caution
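The guide above can be expressed as a small Python helper. This is only a sketch: it assumes the RSE is the standard error divided by the estimate, expressed as a percentage, and, because the table leaves the exact boundary values (5%, 10%, 25%) unassigned, it assumes a boundary value falls into the more cautious category. The function names are hypothetical.

```python
def relative_standard_error(estimate: float, standard_error: float) -> float:
    """RSE as a percentage: standard error relative to the estimate."""
    if estimate == 0:
        raise ValueError("RSE is undefined for an estimate of zero")
    return abs(standard_error / estimate) * 100


def reliability_category(rse_percent: float) -> str:
    """Map an RSE (in percent) to the descriptions in the guide above.
    Assumption: exact boundary values go to the more cautious category."""
    if rse_percent < 5:
        return "very reliable"
    if rse_percent < 10:
        return "good and usable"
    if rse_percent < 25:
        return "usable, with caution"
    return "use only with extreme caution"


# Business Income is reported with an RSE of 56.8% in this documentation.
business_income_category = reliability_category(56.8)
```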
The actual RSEs for the key estimates can be found in Section 4.1 of the main report.
As can be seen from these tables, the estimates for Total Income and Total Expenditure from the HIES can be considered to be very good, from a sampling error perspective. The same can also be said for the Wage and Salary estimate in income and the Food estimate in expenditure, which make up a high proportion of each respective group.
Many of the other estimates should be used with caution, depending on the magnitude of their RSE. Some of these high RSEs are to be expected, due to the expected degree of variability for how households would report for these items. For example, with Business Income (RSE 56.8%), most households would report no business income as no household members undertook this activity, whereas other households would report large business incomes as it's their main source of income.
Other than the non-response issues discussed in this documentation, other quality issues were identified, which included:
1) Reporting errors
Some of the different aspects contributing to the reporting errors generated from the survey, with some examples/explanations for each, include the following:
a) Misinterpretation of survey questions: A common mistake which takes place when conducting a survey is that the person responding to the questionnaire may interpret a question differently to the interviewer, who in turn may have interpreted the question differently to the people who designed the questionnaire. Some examples of this for a HIES can include people providing answers in dollars and cents, instead of just dollars, or the reference/recall period for an “income” or “expenditure” is misunderstood. These errors can often see reported amounts out by a factor of 10 or even 100, which can have major impacts on final results.
b) Recall problems for the questionnaire information: The majority of questions in both of the income and expenditure questionnaires require the respondent to recall what took place over a 12 month period. As would be expected, people will often forget what took place up to 12 months ago so some information will be forgotten.
c) Intentional under-reporting for some items: For whatever reasons, a household may still participate in a survey but not be willing to provide accurate responses for some questions. Examples for a HIES include people not fully disclosing their total income, and intentionally under-reporting expenditures on items such as alcohol and tobacco.
d) Accidental under-reporting in the household diaries: Although the two diaries are left with the household for a period of two weeks, it is easy for the household to forget to enter all expenditures throughout this period - this problem most likely increases as the two
Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
NFCorpus: 20 generated queries (BEIR Benchmark)
This HF dataset contains the top-20 synthetic queries generated for each passage in the above BEIR benchmark dataset.
DocT5query model used: BeIR/query-gen-msmarco-t5-base-v1
id (str): unique document id in NFCorpus in the BEIR benchmark (corpus.jsonl).
Questions generated: 20
Code used for generation: evaluate_anserini_docT5query_parallel.py
The old dataset card for the BEIR benchmark follows.
Dataset Card for BEIR… See the full description on the dataset page: https://huggingface.co/datasets/income/cqadupstack-gaming-top-20-gen-queries.
The main purpose of the HIES was to provide high-quality, representative national household data on income and expenditure in order to update the Consumer Price Index (CPI), improve National Accounts statistics and measure poverty within the country. These statistics are a requirement for evidence-based policy-making to reduce poverty within the country and for monitoring progress in the national strategic plan "Te Kakeega 3".
The 2015-16 Household Income and Expenditure Survey (HIES) is the third HIES conducted by the Central Statistics Division since Tuvalu gained political independence in 1978. With great assistance from Pacific Community (SPC) experts, the HIES was conducted over a period of 12 months in urban (Funafuti) and rural (4 outer islands) areas. From a total of 1,872 households in Tuvalu, a sample of 38 percent of all households was selected to provide valid responses.
National Coverage.
Household and Individual.
The scope of the 2015/2016 Household Income and Expenditure Survey (HIES) was all occupied households in Tuvalu. Households are the sampling unit, defined as a group of people (related or not) who pool their money, and cook and eat together. It is not the physical structure (dwelling) in which people live. HIES covered all persons who were considered to be usual residents of private dwellings (they must have been living in Tuvalu for a period of 12 months, or have the intention to live in Tuvalu for a period of 12 months, in order to be included in the survey). Usual residents who are temporarily away (e.g., for work or a holiday) are included as well.
Sample survey data [ssd]
Out of the total 1,872 households (HHs) listed in 2015, a sample of 706 households, or 38 percent of the total, was successfully interviewed, for a response rate of 98%.
SAMPLING FRAME: The 2010 Household Income and Expenditure Survey (HIES) sample was spread over 12 months in rounds, one each quarter, and the specifications of the final responding households are summarised below: Tuvalu urban: 259 households selected, 217 responded; Tuvalu rural: 346 households selected, 324 responded.
In 2010, 605 HHs were selected and 541 sufficiently responded. The 2010 HIES provided solid estimates for expenditure aggregates at the national level (sampling error for national expenditure estimate is 3.1%).
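The 2010 figures quoted above imply the following response rates. The short Python sketch below simply divides responding households by selected households; the function and variable names are hypothetical, and the counts are the ones given in the text.

```python
def response_rate(responded: int, selected: int) -> float:
    """Responding households as a percentage of selected households."""
    if selected <= 0:
        raise ValueError("selected must be positive")
    return 100 * responded / selected


# 2010 Tuvalu HIES counts as quoted in the text: (selected, responded).
counts_2010 = {"urban": (259, 217), "rural": (346, 324)}

rates_2010 = {
    domain: round(response_rate(responded, selected), 1)
    for domain, (selected, responded) in counts_2010.items()
}
total_selected = sum(selected for selected, _ in counts_2010.values())
total_responded = sum(responded for _, responded in counts_2010.values())
national_rate_2010 = round(response_rate(total_responded, total_selected), 1)
```

On these counts the urban rate is about 83.8%, the rural rate about 93.6%, and the national rate about 89.4%.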
Similarly to the 2010 HIES, private occupied dwellings were the statistical unit for the 2015/2016 HIES. Institutions and vacant dwellings were removed from the sampling frame. Some areas in Tuvalu are very difficult to reach due to the cost of transportation and the remoteness of some islands, which is why they are excluded from the sample selection. The following table presents the distribution of the HHs according to their location (main island or outer islands in each domain) based on the 2012 Population and Housing Census:
-Urban - Funafuti: 845 (48%);
-Rural - Nanumea: 115 (7%);
-Rural - Nanumaga: 116 (7%);
-Rural - Niutao: 123 (7%);
-Rural - Nui: 138 (8%);
-Rural - Vaitupu: 226 (13%);
-Rural - Nukufetau: 124 (%);
-Rural - Nukulaelae: 67 (%);
-Rural - Niulakita: 7 (%);
-TOTAL: 1761 (100%).
The 2012 Population and Housing Census (PHC) was used to select the islands to interview, and then in each selected island the HH listing was updated for selection. For budget and logistics reasons the islands of Nui, Nukufetau, Nukulaelae and Niulakita were excluded from the sample selection. In total 19% of the HHs were excluded from the selection, but this decision should not affect the HIES outputs, as those 19% show a similar profile to other HHs who live in the outer islands. This exclusion is taken into consideration in the sampling weight computation in order to cover 100% of the outer island HHs.
SAMPLE SELECTION AND SAMPLE SIZE: A simple random selection was used in each of the selected islands (HHs were selected directly from the sampling frame). Based on the findings from the 2010 Tuvalu HIES, the sample in Funafuti has been increased and the rural sample remains stable. Within each selected rural atoll, the allocation of the sample size is proportional to its size (based on the 2012 population census). The table below shows the number of HHs to survey:
-Urban - Funafuti: 384;
-Rural - Vaitupu: 126;
-Rural - Nanumea: 63;
-Rural - Niutao: 84;
-Rural - Nanumaga: 63;
-TUVALU: 720.
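The statement that the excluded islands are accounted for in the sampling weights, so that the rural sample covers 100% of outer island HHs, can be illustrated with a rough Python sketch. It is an assumption-laden illustration, not the weighting actually used for the HIES: it takes the 2012 census counts as the frame, uses the planned sample sizes rather than responding households, and applies a single rural coverage factor. All names are hypothetical.

```python
# Household counts from the 2012 census distribution quoted in the text.
CENSUS_HH = {
    "Funafuti": 845, "Nanumea": 115, "Nanumaga": 116, "Niutao": 123,
    "Nui": 138, "Vaitupu": 226, "Nukufetau": 124, "Nukulaelae": 67,
    "Niulakita": 7,
}
URBAN = {"Funafuti"}
# Planned sample sizes for the selected islands, as quoted in the text.
PLANNED_SAMPLE = {"Funafuti": 384, "Vaitupu": 126, "Nanumea": 63,
                  "Niutao": 84, "Nanumaga": 63}


def household_weights(census_hh, sample, urban):
    """Base weight = frame households / sampled households per island.
    Rural weights are then inflated so the selected rural islands also
    represent the rural islands left out of the selection."""
    rural_all = sum(n for isl, n in census_hh.items() if isl not in urban)
    rural_selected = sum(census_hh[isl] for isl in sample if isl not in urban)
    coverage_factor = rural_all / rural_selected
    weights = {}
    for island, n_sampled in sample.items():
        base = census_hh[island] / n_sampled
        weights[island] = base if island in urban else base * coverage_factor
    return weights


weights = household_weights(CENSUS_HH, PLANNED_SAMPLE, URBAN)
weighted_total = sum(weights[isl] * n for isl, n in PLANNED_SAMPLE.items())
```

Under these assumptions the weighted sample adds back up to the 1,761 census households, with the four selected rural islands standing in for all outer island HHs.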
The expected sample size was increased by one third (361 HHs) to allow for HHs that could not be interviewed (refusals, absences, etc.). The 2015/2016 HIES adopted the standardized HIES methodology and survey instruments for the Pacific Islands region. This approach, developed by the Pacific Community (SPC), has resulted in proven survey forms being used for data collection. It involves collection of data over a 12-month period to account for seasonal changes in income and expenditure patterns, and to keep the field team to a smaller and more qualified group. The standardized methodology and instruments were implemented with the objective of producing consistent, high-quality data.
Face-to-face [f2f]
The survey contains 4 modules and 2 diaries (1 diary for each of the two weeks that a household was enumerated). The purpose of a diary is to record all the daily expenses and income of a household, as shown by the topics below:
- DIARY
The Diary module contains questions such as "What did your Household buy Today (Food and Non-Food Items)?", "Payments for Services made Today", "Food, Non-Food and Services Received for Free", "Home-Produced Items Today", "Overflow Sheet for Items Bought This Week", "Overflow Sheet for Services Paid for This Week", "Overflow Sheet for Items Received for Free this Week", and an "Overflow Sheet for Home-Produced Items This Week".
The 4 modules are detailed below:
- MODULE 1 - DEMOGRAPHIC INFORMATION
The module contains individual-level questions on Demographic Profiles, Labour Force status (Activities), Education status, Health status, Communication status and questions on "Household members that have left the household".
- MODULE 2 - HOUSEHOLD EXPENDITURE
The module contains household expenditure questions on Housing characteristics, Housing tenure expenditures, Utilities and Communication, Land, Household goods and assets, Vehicles and accessories, Private Travel details, Household services expenditures, Cash contributions, Provisions of Financial support, Household asset insurance and taxes and questions on Personal insurance.
- MODULE 3 - INDIVIDUAL EXPENDITURE
This module contains individual expenditure questions on Education, Health, Clothing, Communication, Luxury Items, Alcohol, Kava and Tobacco, and Deprivation questions.
- MODULE 4 - HOUSEHOLD & INDIVIDUAL INCOME
This module contains household and individual questions on their income, on topics such as Wages and Salary, Agricultural and Forestry Activities, Fishing, Gathering and Hunting Activities, Livestock and Aquaculture Activities, Handicraft/Home-processed Food Activities, Income from Non-subsistence Business, Property income, Transfer income and other Receipts, and Remittances and other Cash gifts.
Depending on the information being collected, a recall period (ranging from the last 7 days to the last 12 months) is applied to various sections of the questionnaire. The forms were completed by face-to-face interview, usually with the HH head providing most of the information, with other household (HH) members being interviewed when necessary. The interviews took place over a 2-week period such that the HH diary, which is completed by the HH on a daily basis for 2 weeks, can be monitored while the module interviews take place. The HH diary collects information on the HH's daily expenditure on goods and services; and the harvest, capture, collection or slaughter of primary produce (fruit, vegetables and animals) by intended purpose (home consumption, sale or to give away). The income and expenditure data from the modules and the diary are concatenated (ensuring that double counting does not occur), annualised, and extrapolated to form the income and expenditure aggregates presented herein.
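The annualisation step can be sketched as follows. This is a minimal illustration, assuming each record is scaled by 365 divided by the length of its recall period in days and that the HH total is then multiplied by a sampling weight; the amounts, the weight and the function name are hypothetical, and the procedure actually used (including the checks against double counting) may differ.

```python
DAYS_PER_YEAR = 365

def annualise(amount, recall_days):
    """Scale an amount reported over `recall_days` to a 12-month value."""
    return amount * DAYS_PER_YEAR / recall_days

# Hypothetical records for one HH: a 2-week diary total, a 7-day recall item
# and a 12-month recall item from the module interviews.
records = [
    {"source": "diary", "amount": 140.0, "recall_days": 14},
    {"source": "module", "amount": 28.0, "recall_days": 7},
    {"source": "module", "amount": 500.0, "recall_days": 365},
]

household_weight = 2.5  # hypothetical sampling weight

annual_total = sum(annualise(r["amount"], r["recall_days"]) for r in records)
weighted_total = household_weight * annual_total

print(f"Annualised HH total: {annual_total:.2f}")
print(f"Contribution to the extrapolated aggregate: {weighted_total:.2f}")
```

Items with a short recall period are scaled up strongly, which is why a data collection spread over 12 months matters for capturing seasonal patterns.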
The survey procedure and enumeration team structure allowed for in-round data entry, which gives the field staff the opportunity to correct the data by manual review and by using the error messages generated by the entry system. This process was designed to improve data quality. The data entry system used system-controlled entry, interactive coding and validity and consistency checks. Despite the validity and consistency checks put in place, the data still required cleaning. The cleaning was a 2-stage process: the first stage was manual cleaning while referencing the questionnaire, and the second stage involved computer-assisted code verification and, in some cases, imputation. Once the data were clean, verified and consistent, they were recoded to form a final aggregated database, consisting of: 1. Person level record - characteristics of every HH member, including activity
The Household Market and Nonmarket Activities (HUS) project started as a joint research project between the Industrial Institute for Economic and Social Research (IUI) and Göteborg University in 1980. The ambition was to build a consistent longitudinal micro data base on the use of time, money and public services of households. The first main survey was carried out in 1984. In addition to a contact interview with the selected individuals, all designated individuals participated in a personal interview and two telephone interviews. All respondents were asked about their family background, education, marital status, labor market experience, and employment. In addition, questions about the household were asked of the head of household, concerning family composition, child care, health status, housing, possession of vacation homes, cars, boats and other consumer durables. At the end of the personal interview the household head had to fill out a questionnaire including questions about financing of current home, construction costs for building a house, house value and loans, imputation of property values and loans, additions/renovations 1983, maintenance and repairs, leasing, sale of previous home, assets and liabilities, and non-taxable benefits. All the respondents had to fill out a questionnaire including questions about tax-return information 1983, employment income, and taxes and support payments. Two telephone interviews were used primarily to collect data on the household's time use and consumption expenditures. The 1986 HUS-survey included both a follow-up of the 1984 sample (panel study) and a supplementary sample.
The 1986 sample included 1) all respondents participating in the 1984 survey, 2) the household heads, partners and third persons who should have participated in 1984 but did not (1984 nonresponse), 3) those individuals who started living together after the 1984 interview with a selected individual who participated or was supposed to participate in 1984, 4) members of the 1984 household born in 1966 or 1967. If entering a new household, for example because of leaving their parental home, the household head and his/her partner were also interviewed. Respondents participating in the 1984 survey were interviewed by telephone in 1986. Questions dealt with changes in family composition, housing, employment, wages and child care, and it was not only recorded whether a change had occurred, and what sort of change, but also when it occurred. The respondents also received a questionnaire by mail with questions mainly concerning income and assets. Respondents not participating in the earlier survey were interviewed in person and were asked approximately the same questions as in the 1984 personal interview. The 1988 HUS-survey was considerably smaller than the previous ones. It was addressed exclusively to participants in the 1986 survey, and consisted of a self-enumerated questionnaire with a nonrespondent follow-up by telephone. The questions dealt with changes in housing conditions, employment and household composition. The questionnaire also contained some questions on household income. In many respects the 1991 HUS-survey replicated the 1988 survey. The questions were basically the same in content and range, and the survey was conducted as a self-enumerated questionnaire sent out by mail. This time, however, in contrast to the 1988 survey, an attempt was made to include in the survey the new household members who had moved into sample households since 1986, as well as young people who turned 18 after the 1986 survey.
Earlier respondents received a questionnaire by mail containing questions about their home, their primary occupation and weekly work hours since May 1988 (event-history data), earnings in 1989, 1990 and 1991, household composition and any changes in it that might have occurred since 1988, child care and some questions on income. New respondents were also asked about their education and labor-market experience. With respect to its design and question wording, the 1993 survey is a new version of the 1986 survey. The survey is made up of four parts: 1) the panel survey, which was addressed mainly to respondents in the 1991 survey, with certain additions; 2) the so-called supplementary survey, which focused on a new random sample of individuals; 3) the so-called nonresponse survey, which encompassed respondents who had participated in at least one of the earlier surveys but had since dropped out; 4) the time-use survey, which included the same sample of respondents as those in the panel and supplementary surveys. Individuals in the nonresponse group were not included in the time-use survey. Most of the questions in the first three surveys were the same, but certain question sequences were targeted to the respondents in a specific survey. Thus certain retrospective questions were asked of the nonresponse group, while specific questions on social background, labor market experience etc. were addressed to new respondents. In the case of respondents who had already participated in the panel, a combined contact and main interview was conducted by telephone, after which a self-enumerated questionnaire was sent out to each respondent by mail. The panel sample also included young people in panel households who were born in 1973 or 1974 as well as certain new household members who had not previously been interviewed. These individuals, like new respondents, were not interviewed by telephone until they had been interviewed personally.
Thus technically they were treated in the same manner as individuals in the supplementary sample. The new supplementary sample was first contacted by telephone and then given a fairly lengthy personal interview, at the conclusion of which each respondent was asked to fill out a written questionnaire. In this respect the survey design for the nonresponse sample was the same as for the supplementary sample. The nonresponse sample also included young people born in 1973 or 1974 as well as certain new household members. The time-use interviews were conducted by telephone. For each respondent two days were chosen at random from the period from February 15, 1993 to February 14, 1994 and the respondents were interviewed about their time use during those two days. If possible, the time-use interviews were preceded by the other parts of the survey, but this was not always feasible. In each household the household head and spouse/partner were interviewed, as well as an additional person in certain households. Questions regarding the household as a whole were asked of only one person in the household, preferably the household head. As in earlier surveys, data from the interviews were subsequently supplemented by registry data, but only for those respondents who had given their express consent. There is registry information for 75-80 percent of the sample. The telephone interview is divided into the following sections: administrative data; labor market experience; employment; job-seekers; not in labor force; education; family composition; child care; health status; other household members; housing conditions; vacation homes; and cars and boats. The questionnaire was divided into twelve sections: sale of previous home; acquisition of current home; construction costs for building a home; house value and loans; repairs; insurance; home-related expenses; sale of previous home; assets; household income; taxes; and respondent income 1992.
The 1996 telephone interview is divided into the following sections: administrative data; labor market experience; employment; job-seekers; not in labor force; education; family composition; child care; health status; other household members; housing conditions; vacation homes; cars and boats; and environment. The questionnaire was divided into twelve sections: sale of previous home; acquisition of current home; construction costs for building a home; house value and loans; repairs; insurance; home-related expenses; sale of previous home; assets; household income; taxes; and respondent income 1995. The 1998 telephone interview is divided into the following sections: administrative data; labor market experience; employment; job-seekers; not in labor force; education; family composition; child care; health status; other household members; housing conditions; vacation homes; cars and boats; and municipal service. The questionnaire was divided into nine sections: sale of previous home; house value and loans; insurance; home-related expenses; assets; household income; inheritances and gifts; black-market work; and respondent income 1997.
The Household Income and Expenditure Survey (HIES) is a field operation which consists of collecting information from households through face-to-face interviews.
The questions asked of the households relate to living conditions, expenditures, purchases and incomes. It is the only survey conducted at a national level which deals with households' habits in terms of expenditure and income. Like the private and public sectors, households are economic and social actors in the country whose behaviour needs to be understood.
The purpose of the HIES survey is to obtain information on the income, consumption pattern, incidence of poverty, and saving propensities for different groups of people in the Solomon Islands. This information will be used to guide policy makers in framing socio-economic developmental policies and in initiating financial measures for improving economic conditions of the people.
Some more specific outputs from the survey are listed below: a) To obtain expenditure weights and other useful data for the revision of the consumer price index; b) To supplement the data available for use in compiling official estimates of household accounts in the systems of national accounts; c) To supply basic data needed for policy making in connection with social and economic planning; d) To provide data for assessing the impact on household living conditions of existing or proposed economic and social measures, particularly changes in the structure of household expenditures and in household consumption; e) To gather information on poverty lines and incidence of poverty throughout the Solomon Islands.
The previous HIES was conducted in 2005-2006, 7 years ago. All the indicators based on that survey now need to be updated. a) In the CPI, new items have appeared on the market since 2005, and the purchasing and consumption habits of households have changed; b) The poverty assessment of the country has to be updated as well, based on household living conditions in 2012 (job opportunities, income and education levels have changed); c) In terms of national accounts, this survey will provide aggregates of 2012 household consumption.
This survey will highlight the level of expenditure and income of the households, their employment situation, equipment and assets, education and health information, and sources of income and remittances. It will derive indicators that would provide the Solomon Islands Government and its development partners with a core set of statistics to facilitate evidence-based policy development and planning, to monitor development progress and measure policy performance, and ultimately to describe development impact.
Version 01: Cleaned, labelled and anonymized version of the Master file.
INDIVIDUAL: demographic characteristics, economic activity, education, health, expenditure, income.
Collection start: 2012
Collection end: 2013
THE CLEANED AND HARMONIZED VERSION OF THE SURVEY DATA PRODUCED AND PUBLISHED BY THE ECONOMIC RESEARCH FORUM REPRESENTS 50% OF THE ORIGINAL SURVEY DATA COLLECTED BY THE CENTRAL AGENCY FOR PUBLIC MOBILIZATION AND STATISTICS (CAPMAS)
The Household Income, Expenditure and Consumption Survey (HIECS) is of great importance among other household surveys conducted by statistical agencies in various countries around the world. This survey provides a large amount of data to rely on in measuring the living standards of households and individuals, as well as establishing databases that serve in measuring poverty, designing social assistance programs, and providing necessary weights to compile consumer price indices, considered to be an important indicator to assess inflation.
The HIECS 2008/2009 is the tenth Household Income, Expenditure and Consumption Survey in a long series of similar surveys that started back in 1955.
The main objectives of the survey are:
- To identify expenditure levels and patterns of the population as well as socio-economic and demographic differentials.
- To estimate the quantities and values of commodities and services consumed by households during the survey period, in order to determine the levels of consumption and estimate current demand, which is important for predicting future demand.
- To measure mean household and per-capita expenditure for various expenditure items along with socio-economic correlates.
- To define the percentage distribution of expenditure for various items used in compiling consumer price indices, which are considered an important indicator for measuring inflation.
- To define mean household and per-capita income from different sources.
- To provide data necessary to measure the standard of living of households and individuals. Poverty analysis and setting up a basis for social welfare assistance are highly dependent on the results of this survey.
- To provide essential data to measure elasticity, which reflects the percentage change in expenditure for various commodity and service groups against the percentage change in total expenditure, for the purpose of predicting the levels of expenditure and consumption for different commodity and service items in urban and rural areas.
- To provide data essential for comparing change in expenditure against change in income to measure the income elasticity of expenditure.
- To study the relationships between demographic, geographical and housing characteristics of households and their income and expenditure for commodities and services.
- To provide data necessary for national accounts, especially in compiling input and output tables.
- To identify changes in consumer behavior among socio-economic groups in urban and rural areas.
- To identify per capita food consumption and its main components of calories, proteins and fats according to its sources and the levels of expenditure in both urban and rural areas.
- To identify the value of expenditure for food according to sources, either from household production or not, in addition to household expenditure for non-food commodities and services.
- To identify the distribution of households according to the possession of some appliances and equipment (such as cars, satellites, mobiles ...) in urban and rural areas.
- To identify the percentage distribution of income recipients according to some background variables such as housing conditions, size of household and characteristics of head of household.
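The elasticity mentioned in the objectives can be illustrated with a simple two-point calculation: the percentage change in expenditure on one commodity group divided by the percentage change in total expenditure. The function name and figures below are hypothetical, and the estimates actually derived from survey data would normally come from a regression on household-level records rather than from two points.

```python
def expenditure_elasticity(group_before, group_after, total_before, total_after):
    """Percentage change in group expenditure per percentage change in total expenditure."""
    pct_change_group = (group_after - group_before) / group_before
    pct_change_total = (total_after - total_before) / total_before
    return pct_change_group / pct_change_total

# Hypothetical example: food spending rises from 400 to 420 (5%)
# while total expenditure rises from 1000 to 1100 (10%).
food_elasticity = expenditure_elasticity(400.0, 420.0, 1000.0, 1100.0)
print(round(food_elasticity, 2))
```

A value below 1, as in this made-up example, means that spending on the group grows more slowly than total expenditure.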
Compared to previous surveys, the current survey has several distinguishing features: 1- Doubling the number of area segments from 1200 in the previous survey to 2526, while decreasing the number of households selected from each segment to 20 instead of 40 in the previous survey, to ensure appropriate representation of the population. 2- Changing the survey period to 15 days instead of one month in the previous survey (2004/2005), to lighten the respondent burden and encourage more cooperation. 3- Adding some additional questions on: a- Participation in, or benefits gained from, the pension and social security system. b- Participation in the health insurance system. 4- Strengthening quality control procedures, especially for fieldwork, to ensure data accuracy and to detect and correct errors in a timely manner.
The raw survey data provided by the Statistical Agency were cleaned and harmonized by the Economic Research Forum, in the context of a major project that started in 2009, during which extensive efforts have been exerted to acquire, clean, harmonize, preserve and disseminate micro data of existing household surveys in several Arab countries.
Covering a sample of urban and rural areas in all the governorates.
1- Household/family. 2- Individual/person.
The survey covered a national sample of households and all individuals permanently residing in surveyed households.
Sample survey data [ssd]
The sample of HIECS, 2008-2009 is a two-stage stratified cluster sample, approximately self-weighted, of nearly 48000 households. The main elements of the sampling design are described in the following.
1- Sample Size
It has been deemed important to retain the same sample size as in the previous two HIECS rounds. Thus, a sample of about 48000 households has been considered. The justification for maintaining the sample size at this level is to have estimates with levels of precision similar to those of the previous two rounds: trend analysis with the previous two surveys will therefore not be distorted by substantial changes in sampling errors from one round to another. In addition, this relatively large national sample implies proportional samples of reasonable sizes for smaller governorates. Nonetheless, over-sampling has been introduced to raise the sample size of small governorates to about 1000 households. As a result, reasonably precise estimates could be extracted for those governorates. The over-sampling has resulted in a slight increase in the national sample to 48658 households.
2- Cluster size
An important lesson learned from the previous two HIECS rounds is that the cluster size applied in both surveys was too large to yield acceptable design effect estimates. The cluster size was 40 households in the 2004-2005 round, down from 80 households in the 1999-2000 round. The estimates of the design effect (deft) for most survey measures of the latest round were extraordinarily large. As a result, it was decided to decrease the cluster size to only 19 households (20 households in urban governorates, to account for anticipated non-response in those governorates; past experience shows that non-response is almost nil in rural governorates).
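The link between cluster size and design effect can be illustrated with the standard approximation deff = 1 + (m - 1) * rho, where m is the number of households per cluster, rho is the intra-cluster correlation, and deft is the square root of deff. The value of rho below is purely hypothetical; the survey documentation does not report it, and the actual design effects depend on the variable being estimated.

```python
import math

def design_effect(cluster_size, rho):
    """Approximate design effect for equal-sized clusters: 1 + (m - 1) * rho."""
    return 1 + (cluster_size - 1) * rho

rho = 0.05  # hypothetical intra-cluster correlation, for illustration only

# Cluster sizes used in the 1999-2000, 2004-2005 and 2008-2009 rounds.
for cluster_size in (80, 40, 19):
    deff = design_effect(cluster_size, rho)
    deft = math.sqrt(deff)
    print(f"m={cluster_size:2d}  deff={deff:.2f}  deft={deft:.2f}")
```

For a fixed total sample, smaller clusters spread over more segments reduce the design effect, which is the rationale given for moving to 19 or 20 households per cluster.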
A more detailed description of the different sampling stages and allocation of sample across governorates is provided in the Methodology document available among the documentation materials published in both Arabic and English.
Face-to-face [f2f]
Three different questionnaires have been designed as follows: 1- Expenditure and consumption questionnaire. 2- Diary questionnaire for expenditure and consumption. 3- Income questionnaire.
In designing the expenditure, consumption and income questionnaires, the following were taken into consideration: - Using the recent concepts and definitions of the International Labour Organization approved at the International Conference of Labour Statisticians held in Geneva in 2003. - Using the recent Classification of Individual Consumption according to Purpose (COICOP). - Using more than one approach to expenditure measurement to serve the many purposes of the survey.
A brief description of each questionnaire is given next:
This questionnaire comprises 14 tables in addition to identification and geographic data of household on the cover page. The questionnaire is divided into two main sections.
Section one: Household schedule and other information. It includes: - Demographic characteristics and basic data for all household individuals consisting of 18 questions for every person. - Members of household who are currently working abroad. - The household ration card. - The main outlets that provide food and beverage. - Domestic and foreign tourism. - The housing conditions including 15 questions. - Means of transportation used to go to work or school. - The household possession of appliances and means of transportation. - This section includes some questions which help to define the social and economic level of households which in turn, help interviewers to check the plausibility of expenditure, consumption and income data.
Section two: Expenditure and consumption data. It includes 14 tables as follows: - The quantity and value of food and beverage commodities actually consumed. - The quantity and value of the actual consumption of alcoholic beverages, tobacco and narcotics. - The quantity and value of clothing and footwear. - The household expenditure for housing. - The household expenditure for furnishings, household equipment and routine maintenance of the house. - The household expenditure for health care services. - The household expenditure for transportation. - The household
The Household Income and Expenditure Survey collects data on the income, consumption and expenditure patterns of households, in accordance with the methodological principles of statistical enquiries, and links them to the demographic and socio-economic characteristics of households. A Household Income and Expenditure Survey is the sole source of information on expenditure, consumption and income patterns of households, which is used to calculate poverty and income distribution indicators. It also serves as a statistical infrastructure for the compilation of the national basket of goods used to measure changes in price levels. Furthermore, it is used for updating the national accounts.
The main objective of the NHIES 2009/2010 is to comprehensively describe the levels of living of Namibians using actual patterns of consumption and income, as well as a range of other socio-economic indicators based on collected data. This survey was designed to inform policy making at the international, national and regional levels within the context of the Fourth National Development Plan, in support of monitoring and evaluation of Vision 2030 and the Millennium Development Goals. The NHIES was designed to provide policy decision making with reliable estimates at regional levels as well as to meet rural - urban disaggregation requirements.
National Coverage
Individuals and Households
In each week of the four-week survey round, all persons in the household were asked whether they had spent at least 4 nights of the week in the household. Any person who spent at least 4 nights in the household was taken as having spent the whole week in the household. To qualify as a household member, a person must have stayed in the household for at least two weeks out of the four.
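The membership rule can be expressed compactly. The sketch below is only an illustration of the two thresholds stated above (4 nights per week, 2 weeks out of 4); the function and constant names are hypothetical.

```python
MIN_NIGHTS_PER_WEEK = 4  # nights needed for a week to count as spent in the household
MIN_WEEKS_PRESENT = 2    # weeks needed (out of four) to qualify as a member

def is_household_member(nights_per_week):
    """Apply the rule to the nights spent in the household in each of the four weeks."""
    weeks_present = sum(1 for nights in nights_per_week if nights >= MIN_NIGHTS_PER_WEEK)
    return weeks_present >= MIN_WEEKS_PRESENT

print(is_household_member([7, 4, 0, 0]))  # two qualifying weeks
print(is_household_member([7, 3, 3, 3]))  # only one qualifying week
```

Note that the rule counts qualifying weeks, not total nights: the second person spends more nights in the household overall but qualifies in only one week.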
Sample survey data [ssd]
The targeted population of NHIES 2009/2010 was the private households of Namibia. The population living in institutions, such as hospitals, hostels, police barracks and prisons were not covered in the survey. However, private households residing within institutional settings were covered. The sample design for the survey was a stratified two-stage probability sample, where the first stage units were geographical areas designated as the Primary Sampling Units (PSUs) and the second stage units were the households. The PSUs were based on the 2001 Census EAs and the list of PSUs serves as the national sample frame. The urban part of the sample frame was updated to include the changes that take place due to rural to urban migration and the new developments in housing. The sample frame is stratified first by region followed by urban and rural areas within region. In urban areas further stratification is carried out by level of living which is based on geographic location and housing characteristics. The first stage units were selected from the sampling frame of PSUs and the second stage units were selected from a current list of households within each selected PSU, which was compiled just before the interviews.
PSUs were selected using probability proportional to size sampling coupled with the systematic sampling procedure where the size measure was the number of households within the PSU in the 2001 Population and Housing Census. The households were selected from the current list of households using systematic sampling procedure.
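Systematic selection with probability proportional to size can be sketched as follows. This is a generic textbook version under stated assumptions, not the procedure actually programmed for the survey: the PSU names and household counts are made up, and stratification is ignored.

```python
import random

def pps_systematic(sizes, n, rng):
    """Select n units with probability proportional to size, using a random
    start and a fixed interval along the cumulated size measure."""
    total = sum(sizes.values())
    interval = total / n
    start = rng.random() * interval
    targets = [start + k * interval for k in range(n)]

    selected = []
    cumulative = 0.0
    remaining = iter(targets)
    target = next(remaining, None)
    for unit, size in sizes.items():
        cumulative += size
        # A unit is selected once for each target falling within its size range.
        while target is not None and target < cumulative:
            selected.append(unit)
            target = next(remaining, None)
    return selected

# Hypothetical PSUs with their census household counts as the size measure.
psu_sizes = {"PSU-A": 120, "PSU-B": 80, "PSU-C": 200, "PSU-D": 100}
chosen = pps_systematic(psu_sizes, 2, random.Random(1))
print(chosen)
```

A PSU whose size exceeds the sampling interval could be selected more than once with this method, which is one reason very large PSUs are sometimes split before selection.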
The sample size was designed to achieve reliable estimates at the region level and for urban and rural areas within each region. However the actual sample sizes in urban or rural areas within some of the regions may not satisfy the expected precision levels for certain characteristics. The final sample consists of 10 660 households in 533 PSUs. The selected PSUs were randomly allocated to the 13 survey rounds.
The entire expected sample of 533 PSUs was covered. However, a number of originally selected PSUs had to be substituted by new ones for the following reasons:
Urban areas: Movement of people for resettlement in informal settlement areas from one place to another caused a selected PSU to be empty of households.
Rural areas: In addition to Caprivi region (where one constituency is generally flooded every year), Ohangwena and Oshana regions were badly affected by an unusual flood situation. Although this situation was generally addressed by interchanging the PSUs between survey rounds, some PSUs were still under water close to the end of the survey period. There were five empty PSUs in the urban areas of Hardap (1), Karas (3) and Omaheke (1) regions. Since these PSUs were found in the low strata within the urban areas of the relevant regions, the substituting PSUs were selected from the same strata. There were also five PSUs under water in the rural areas of Caprivi (1), Ohangwena (2) and Oshana (2) regions. Wherever possible the substituting PSUs were selected from the same constituency where the original PSU was selected. If not, the selection was carried out from the rural stratum of the particular region. One sampled PSU in the urban area of Khomas region (Windhoek city) had grown so large that it had to be split into 7 PSUs. This was incorporated into the geographical information system (GIS) and one PSU out of the seven was selected for the survey. In one PSU in Erongo region only fourteen households were listed, and in one PSU in Omusati region only eleven households were listed. All these households were interviewed and no additional selection was done to cover for the loss in sample.
Face-to-face [f2f]
The instruments for data collection were, as in the previous survey, the questionnaires and manuals. The Form I questionnaire collected demographic and socio-economic information on household members, such as sex, age, education and employment status, among others. It also collected information on household possessions like animals, land, housing, household goods, utilities, household income and expenditure, etc.
Form II or the Daily Record Book is a diary for recording daily household transactions. A book was administered to each sample household each week for four consecutive weeks (survey round). Households were asked to record transactions, item by item, for all expenditures and receipts, including incomes and gifts received or given out. Own produce items were also recorded. Prices of items from different outlets were also collected in both rural and urban areas. The price collection was needed to supplement information from areas where price collection for consumer price indices (CPI) does not currently take place.
The questionnaires received from the regions were registered and counterchecked at the survey head office. The data processing team consisted of Systems administrator, IT technician, Programmers, Statisticians and Data typists.
Data capturing
The data capturing process was undertaken in the following ways: Form 1 was scanned, interpreted and verified using the “Scan”, “Interpret” & “Verify” modules of the Eyes & Hands software respectively. Some basic checks were carried out to ensure that each PSU was valid and every household was unique. Invalid characters were removed. The scanned and verified data were converted into text files using the “Transfer” module of Eyes & Hands. Finally, the data were transferred to a SQL database for further processing, using the “TranScan” application. The Daily Record Books (DRB or Form 2) were manually entered after the scanned data had been transferred to the SQL database. The reason was to ensure that all DRBs were linked to the correct Form 1, i.e. each household’s Form 1 was linked to the corresponding Daily Record Book. In total, 10 645 questionnaires (Form 1), comprising around 500 questions each, were scanned and close to one million transactions from Form 2 (DRBs) were manually captured.
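The basic checks mentioned above (valid PSU, unique household, removal of invalid characters) might look like the following minimal sketch. It does not reproduce the Eyes & Hands or TranScan processing. The field names, the PSU codes and the definition of "invalid characters" (here assumed to be anything outside printable ASCII) are hypothetical.

```python
import re
from collections import Counter

def basic_checks(rows, valid_psus):
    """Return rows with an unknown PSU and (psu, household) keys that occur more than once."""
    invalid_psu = [r for r in rows if r["psu"] not in valid_psus]
    counts = Counter((r["psu"], r["household"]) for r in rows)
    duplicates = sorted(key for key, n in counts.items() if n > 1)
    return invalid_psu, duplicates

def strip_invalid_chars(value):
    """Drop characters outside printable ASCII (an assumed definition of 'invalid')."""
    return re.sub(r"[^\x20-\x7E]", "", value)

# Hypothetical scanned records.
rows = [
    {"psu": "0101", "household": "01"},
    {"psu": "0101", "household": "01"},   # duplicate household
    {"psu": "9999", "household": "03"},   # PSU not in the sample
]
invalid_psu, duplicates = basic_checks(rows, valid_psus={"0101", "0102"})
cleaned = strip_invalid_chars("Wind\x00hoek\x07")
```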
Household response rate: Total number of responding households and non-responding households and the reason for non-response are shown below. Non-contacts and incomplete forms, which were rejected due to a lot of missing data in the questionnaire, at 3.4 and 4.0 percent, respectively, formed the largest part of non-response. At the regional level Erongo, Khomas, and Kunene reported the lowest response rate and Caprivi and Kavango the highest. See page 17 of the report for a detailed breakdown of response rates by region.
To allow comparison with the previous survey in 2003/2004 and to follow the development of the country, the methodology and definitions were kept the same. Comparisons between the surveys can be found in the different chapters of this report. Experience from the previous survey gave valuable input to this one, and the data collection was improved to avoid errors encountered earlier. Some additional questions in the questionnaire also helped to confirm the accuracy of reported data. During the data cleaning process it turned out that some households had difficulty separating their household consumption from their business consumption when recording their daily transactions in the DRB. This applied in particular to guest farms, the number of which has increased considerably during the past five years. All households with extremely high consumption were examined manually, and business transactions were recorded and separated from private consumption.
Household Income and Expenditure Survey (HIES) collects a wealth of information on HH income and expenditure, such as source of income by industry, HH expenditure on goods and services, and income and expenditure associated with subsistence production and consumption. In addition to this, HIES collects information on sectoral and thematic areas, such as education, health, labour force, primary activities, transport, information and communication, transfers and remittances, food expenditure (as a proxy for HH food consumption and nutrition analysis), and gender.
The Pacific Islands regionally standardized HIES instruments and procedures were adopted by the Government of Tokelau for the 2015/16 Tokelau HIES. These standards were designed to feed high-quality data to HIES data end users for:
The data allow for the production of useful indicators and information on the sectors covered in the survey, including providing data to inform indicators under the UN Sustainable Development Goals (SDGs). This report, the above listed outputs, and any thematic analyses of HIES data, collectively provide information to assist with social and economic planning and policy formation.
National coverage.
Households and Individuals.
The universe of the 2015/16 Tokelau Household Income and Expenditure Survey (HIES) is all occupied households (HHs) in Tokelau. HHs are the sampling unit, defined as a group of people (related or not) who pool their money, cook and eat together. It is not the physical structure (dwelling) in which people live. The HH must have been living in Tokelau for a period of six months, or have had the intention to live in Tokelau for a period of twelve months in order to be included in the survey.
Household members covered in the survey include: -usual residents currently living in the HH; -usual residents who are temporarily away (e.g., for work or a holiday); -usual residents who are away for an extended period, but are financially dependent on, or supporting, the HH (e.g., students living in school dormitories outside Tokelau, or a provider working overseas who hasn't formed or joined another HH in the host country) and plan to return; -persons who frequently come and go from the HH, but consider the HH being interviewed as their main place of stay; -any person who lives with the HH and is employed (paid or in-kind) as a domestic worker and who shares accommodation and eats with the host HH; and -visitors currently living with the HH for a period of six months or more.
Sample survey data [ssd]
The 2015/16 Tokelau Household Income and Expenditure Survey (HIES) sampling approach was designed to generate reliable results at the national level. That is, the survey was not designed to produce reliable results at any lower level, such as for the three individual atolls. The reason for this is partly budgetary constraint, but also because the HIES will serve its primary objectives with a sample size that will provide reliable national aggregates.
The sampling frame used for the random selection of HHs was from December 2013, i.e. the HH listing updated in the 2013 Population Count.
The 2015/16 Tokelau HIES had a quota of 120 HHs. The sample covered all three populated atolls in Tokelau (Fakaofo, Nukunonu and Atafu) and the sample was evenly allocated between the three atoll clusters (i.e., 40 HHs per atoll surveyed over a ten-month period). The HHs within each cluster were randomly selected using a single-stage selection process.
In addition to the 120 selected HHs, 60 HHs (20 per cluster) were randomly selected as replacement HHs to ensure that the desired sample was met. The replacement HHs were only approached for interview in the case that one of the primarily selected HHs could not be interviewed.
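The allocation described above (40 primary and 20 replacement HHs per atoll, selected at random in a single stage) can be sketched as follows. This is an illustration only: the frame contents, the household identifiers and the frame size of 80 HHs per atoll are invented, and drawing the primary and replacement HHs in one draw without replacement is an assumption about how the selection could be done.

```python
import random

def select_sample(frame, n_primary=40, n_replacement=20, seed=None):
    """frame maps atoll name -> list of household ids from the listing."""
    rng = random.Random(seed)
    selection = {}
    for atoll, households in frame.items():
        # One draw without replacement, then split into primary and replacement HHs.
        drawn = rng.sample(households, n_primary + n_replacement)
        selection[atoll] = {
            "primary": drawn[:n_primary],
            "replacement": drawn[n_primary:],
        }
    return selection

# Hypothetical frame: 80 invented household ids per atoll.
frame = {
    atoll: [f"{atoll[:3].upper()}-{i:03d}" for i in range(1, 81)]
    for atoll in ("Fakaofo", "Nukunonu", "Atafu")
}
sample = select_sample(frame, seed=2015)
```

A replacement HH would only be taken from the "replacement" list when a primary HH in the same atoll could not be interviewed.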
Face-to-face [f2f]
The questionnaires for this Household Income and Expenditure Survey (HIES) are composed of a diary and 4 modules published in English and in Tokelauan. All English questionnaires and modules are provided as external resources.
Here is the list of the questionnaires for this 2015-2016 HIES: - Diary: week 1 and 2; - Module 1: Demographic information (Household listing, Demographic profile, Activities, Educational status, Communication status...); - Module 2: Household expenditure (Housing characteristics, Housing tenure expenditure, Utilities and communication, Land and home...etc); - Module 3: Individual expenditure (Education, Health, Clothing, Communication, Luxury items, Alcohol & tobacco); - Module 4: Household and individual income (Wages and salary, Agricultural and forestry activities, Fishing gathering and hunting activities, livestock and aquaculture activities...etc).
All inconsistencies and missing values were corrected using a variety of methods: 1. Manual correction: the value was verified against the actual questionnaire (double check on the form, questionnaire notes, local knowledge, manual verifications) 2. Subjective: the answer is obvious and can be deduced from other questions 3. Donor hot deck: the value is imputed based on similar characteristics from other HHs or individuals (see example below) 4. Donor median: missing values or outliers were imputed using the median value reported for similar items 5. Record deletion: the record was filled in by mistake and had to be removed.
Several questions used the hot deck method to impute missing and outlying values. The method can use one to three dimensions, depending on the section and module in which the question was placed. It works by storing valid values in a coded matrix. For example, in Tokelau the “Drink Alcohol” questions used a three-dimension hot deck to store in-range reported data. The constraining dimensions are the AGE, SEX and RELATIONSHIP questions, which together act as the key for the hot deck. On the first pass, the valid yes/no responses are placed into this three-dimension hot deck. On the second pass, the data in the matrix are updated one person at a time. If a “Drink Alcohol” question contains a missing response, the person's coded age, sex and relationship key is looked up in the “valid” matrix. Once the key is found, the value held in the matrix is imputed for the missing value. The preferred method for correcting missing or outlying data was manual correction (trying to obtain the real value, which could have been mis-keyed or reported incorrectly). If manual correction was unsuccessful, a subjective approach was used; the next method was the hot deck, then the donor median, and the last resort was record deletion. The survey procedure and enumeration team structure allowed for in-round data entry, which gave the field staff the opportunity to correct the data by manual review and by using the error messages generated by the entry system. This process was designed to improve data quality. The data entry system used system-controlled entry, interactive coding, and validity and consistency checks. Despite these checks, the data still required cleaning. The cleaning was a two-stage process: the first stage was manual cleaning with reference to the questionnaire, and the second stage involved computer-assisted code verification and, in some cases, imputation.
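The two-pass hot deck described above can be sketched as follows. This is a minimal illustration, not the actual data entry system: the field names, the age grouping and the sample records are hypothetical, and a plain dictionary keyed on (age group, sex, relationship) stands in for the three-dimension matrix.

```python
VALID = ("yes", "no")

def hotdeck_impute(records, target, keys=("age_group", "sex", "relationship")):
    """Impute missing values of `target` from donors sharing the same key."""
    # First pass: place every valid response into the deck under its key.
    deck = {}
    for rec in records:
        if rec[target] in VALID:
            deck[tuple(rec[k] for k in keys)] = rec[target]

    # Second pass: go through persons one at a time; a valid response
    # refreshes the deck, a missing response is filled from the deck.
    result = []
    for rec in records:
        out = dict(rec)
        key = tuple(out[k] for k in keys)
        if out[target] in VALID:
            deck[key] = out[target]
        elif key in deck:
            out[target] = deck[key]
        result.append(out)
    return result

# Hypothetical person records; None marks a missing response.
people = [
    {"id": 1, "age_group": "25-34", "sex": "M", "relationship": "head", "drinks_alcohol": "yes"},
    {"id": 2, "age_group": "25-34", "sex": "M", "relationship": "head", "drinks_alcohol": None},
    {"id": 3, "age_group": "15-24", "sex": "F", "relationship": "child", "drinks_alcohol": "no"},
    {"id": 4, "age_group": "15-24", "sex": "F", "relationship": "child", "drinks_alcohol": None},
    {"id": 5, "age_group": "65+", "sex": "F", "relationship": "parent", "drinks_alcohol": None},
]
imputed = hotdeck_impute(people, "drinks_alcohol")
```

In this sketch, person 5 has no donor with a matching key, so the value stays missing and would have to be handled by the next method in the sequence (for example, the donor median).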
Once the data were clean, verified and consistent, they were recoded to form a final aggregated database, consisting of: Person level record - characteristics of every household (HH) member, including activity and education profile; HH level record - characteristics of the dwelling and access to services; Final aggregated income - all HH income streams, by category and type; Final aggregated expenditure - all HH expenditure items, by category and type.
Overall, 99% of the response rate objective was achieved.
Refer to Appendix 2 of the Tokelau 2015/2016 Household Income and Expenditure Survey report attached as an external resource.