Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The China Living Standards Survey (CLSS) consists of one household survey and one community (village) survey, conducted in Hebei and Liaoning Provinces (northern and northeastern China) in July 1995 and July 1997, respectively. Five villages were selected from each of the three sample counties in each province (six were selected in Liaoyang County of Liaoning Province because of a change in administrative areas). About 880 farm households were selected from the thirty-one sample villages for the household survey. The same thirty-one villages formed the sample for the community survey. This document provides information on the content of the different questionnaires, the survey design and implementation, data processing activities, and the available data sets.
Regional
Households
Sample survey data [ssd]
The China LSS sample is not a rigorous random sample drawn from a well-defined population. Instead, it is only a rough approximation of the rural population in Hebei and Liaoning provinces in northern and northeastern China. The reason is that part of the motivation for the survey was to compare current conditions with those that existed in Hebei and Liaoning in the 1930s. Accordingly, three counties in Hebei and three counties in Liaoning were selected as "primary sampling units" because data had been collected from those six counties by the Japanese occupation government in the 1930s. Within each of these six counties (xian) five villages (cun) were selected, for an overall total of 30 villages (in fact, an administrative change in one village led to 31 villages being selected). In each county a "main village" was selected that had itself been surveyed in the 1930s. Because of the interest in these villages, 50 households were selected from each of the six main villages (one per county). In addition, four other villages were selected in each county. These other villages were not drawn randomly but were selected so as to "represent" variation within the county. Within each of these villages 20 households were selected for interviews. Thus, the intended sample size was 780 households, 130 from each county.
Unlike county and village selection, the selection of households within each village was done according to standard sample selection procedures. In each village, a list of all households in the village was obtained from village leaders. An "interval" was calculated as the number of households in the village divided by the number of households desired for the sample (50 for main villages and 20 for other villages). For the list of households, a random number was drawn between 1 and the interval number. This was used as a starting point. The interval was then added to this number to get a second number, then the interval was added to this second number to get a third number, and so on. The resulting set of numbers identified the selected households by their position on the list.
In fact, the number of households in the sample is 785, as opposed to 780. Most of this difference is due to a village in which 24 households were interviewed, as opposed to the goal of 20 households.
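The household selection procedure and the intended sample size described above can be sketched in Python. This is a minimal illustration, not the survey team's actual program: the function name `systematic_sample` and the constants are hypothetical, and the handling of a fractional interval (rounding positions up) is an assumption, since the documentation does not say what was done when the village size was not an exact multiple of the target.

```python
import random

# Intended sample size as described in the text:
# one main village (50 households) plus four other villages (20 each) per county.
COUNTIES = 6
MAIN_VILLAGE_HOUSEHOLDS = 50
OTHER_VILLAGES_PER_COUNTY = 4
OTHER_VILLAGE_HOUSEHOLDS = 20
PER_COUNTY = MAIN_VILLAGE_HOUSEHOLDS + OTHER_VILLAGES_PER_COUNTY * OTHER_VILLAGE_HOUSEHOLDS
INTENDED_SAMPLE_SIZE = COUNTIES * PER_COUNTY  # 6 * 130 = 780


def systematic_sample(num_households, sample_size, rng=None):
    """Return 1-based positions on the household list chosen by interval sampling.

    The interval is num_households / sample_size. It is kept as an exact
    ratio so that a fractional interval never produces duplicate or
    out-of-range positions (ASSUMPTION: positions are rounded up).
    """
    if not 0 < sample_size <= num_households:
        raise ValueError("sample_size must be between 1 and num_households")
    rng = rng or random.Random()
    # Random start in (0, interval], expressed in units of 1/sample_size.
    start_numerator = rng.randint(1, num_households)
    positions = []
    for i in range(sample_size):
        # start + i * interval, still in units of 1/sample_size
        numerator = start_numerator + i * num_households
        positions.append(-(-numerator // sample_size))  # ceiling division
    return positions
```

When the village size is an exact multiple of the target (for example 100 households and a target of 20), the positions are evenly spaced by the interval, starting from a random position between 1 and the interval, which matches the procedure in the text.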
Face-to-face [f2f]
(a) DATA ENTRY
All responses obtained from the household interviews were recorded in the household questionnaires. These were then entered into the computer, in the field, using data entry programs written in BASIC. The data produced by the data entry program were in the form of household files, i.e. one data file for all of the data in one household/community questionnaire. Thus, for the household there were about 880 data files. These data files were processed at the University of Toronto and the World Bank to produce datasets in statistical software formats, each of which contained information for all households for a subset of variables. The subset of variables chosen corresponded to data entry screens, so these files are hereafter referred to as "screen files". For the household survey component 66 data files were created. Members of the survey team checked and corrected data by checking the questionnaires for original recorded information. We would like to emphasize that correction here refers to checking questionnaires, in case of errors in skip patterns, incorrect values, or outlying values, and changing values if and only if data in the computer were different from those in the questionnaires. The personnel in charge of data preparation were given specific instructions not to change data even if values in the questionnaires were clearly incorrect. We have no reason to believe that these instructions were not followed, and every reason to believe that the data resulting from these checks and corrections are accurate and of the highest quality possible.
(b) DATA EDITING
The screen files were then brought to World Bank headquarters in Washington, D.C. and uploaded to a mainframe computer, where they were converted to "standard" LSMS formats by merging datasets to produce separate datasets for each section with variable names corresponding to the questionnaires. In some cases, this has meant a single dataset for a section, while in others it has meant retaining "screen" datasets with just the variable names changed.
Linking Parts of the Household Survey
Each household has a unique identification number which is contained in the variable HID. Values for this variable range from 10101 to 60520. The first digit is the code for one of the six counties in which data were collected, and the second and third digits identify the village within the county. Finally, the last two digits of HID contain the household number within the village. Data for households from different parts of the survey can be merged by using the HID variable, which appears in each dataset of the household survey. To link information for an individual, use both the household identification number, HID, and the person identification number, PID. A child in the household can be linked to the parents, if the parents are household members, through the parents' id codes in Section 01B. For parents who are not in the household, information is collected on the parent's schooling, main occupation and whether he/she is currently alive. Household members can be linked with their non-resident children through the parents' id codes in Section 01C.
Linking the Household to the Community Data
The community data have a somewhat different set of identifying variables than the household data. Each community dataset has four identifying variables: province (code 7 for Hebei and code 8 for Liaoning); county (six two-digit codes, of which the first digit represents the province and the second digit represents one of the three counties in that province); township (3-digit code: the first two digits are the county code and the third digit is the township); and village (4-digit code: the first two digits are the county code, the third digit is the township, and the fourth digit is the village).
Constructed Data Set
Researchers at the World Bank and the University of Toronto have created a data set with information on annual household expenditures, region codes, etc. This constructed data set is made available for general use with the understanding that the description below is the only documentation that will be provided. Any manipulation of the data requires assumptions to be made and, as much as possible, those assumptions are explained below. Except where noted, the data set has been created using only the original (raw) data sets. A researcher could construct a somewhat different data set by incorporating different assumptions.
Aggregate Expenditure, TOTEXP
The dataset TOTEXP contains variables for total household annual expenditures (for the year 1994) and variables for the different components of total household expenditures: food expenditures, non-food expenditures, use value of consumer durables, etc. These, along with the algorithm used to calculate household expenditures, are detailed in Appendix D. The dataset also contains the variable HID, which can be used to match this dataset to the household level data set. Note that all of the expenditure variables are totals for the household; they are not in per capita terms. Researchers will have to divide these variables by household size to get per capita numbers. The household size variable is included in the data set.
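The HID layout and the per capita adjustment described above can be illustrated with a short Python sketch. The names `HouseholdId`, `decode_hid`, and `per_capita_expenditure` are hypothetical helpers introduced here for illustration; the documentation does not give the variable names inside TOTEXP, so the function simply takes the two numbers as arguments.

```python
from typing import NamedTuple


class HouseholdId(NamedTuple):
    county: int     # first digit of HID (1 to 6)
    village: int    # second and third digits of HID
    household: int  # last two digits of HID


def decode_hid(hid: int) -> HouseholdId:
    """Split a five-digit HID into its county, village, and household parts."""
    # The documented range of HID values is 10101 to 60520.
    if not 10101 <= hid <= 60520:
        raise ValueError(f"HID {hid} is outside the documented range")
    county, rest = divmod(hid, 10000)
    village, household = divmod(rest, 100)
    return HouseholdId(county, village, household)


def per_capita_expenditure(total_expenditure: float, household_size: int) -> float:
    """Convert a household total into a per capita figure."""
    if household_size <= 0:
        raise ValueError("household_size must be positive")
    return total_expenditure / household_size
```

For example, `decode_hid(60520)` yields county 6, village 5, household 20, which is the upper end of the documented range.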
The China Living Standards Survey (CLSS) was conducted only in Hebei and Liaoning Provinces (northern and northeast China).
Household Questionnaire
The household questionnaire contains sections that collect data on household demographic structure, education, housing conditions, land, agricultural management, household non-agricultural business, household expenditures, gifts, remittances and other income sources, and saving and loans. For some sections (general household information, schooling, housing, gift-exchange, remittance, other income, and credit and savings) the individual designated by the household members as the household head provided responses. For some other sections (farm land, agricultural management, family-run non-farm business, and household consumption expenditure) a member identified as the most knowledgeable provided responses. Identification codes for respondents of different sections indicate who provided the information. In sections where the information collected pertains to individuals (employment), whenever possible, each member of the household was asked to respond for himself or herself, except that parents were allowed to respond for younger children. Therefore, in the case of the employment section it is possible that the information was not provided by the relevant person; variables in this section indicate when this is true.
The household questionnaire was completed in a one-time interview in the summer of 1995. The survey was designed so that more sensitive issues such as credit and savings were discussed near the end. The content of each section is briefly described below.
Section 0 SURVEY INFORMATION
This section mainly summarizes the results of the survey visits. The following information was entered into the computer: whether the survey and the data entry were completed, and codes for the supervisor's brief comments on the interviewer and the data entry operator, together with any related revision suggestion (e.g., 1. good, 2. revise at office, and 3. re-interview needed). The date of the interview, the names of the interviewer, supervisor, and data entry operator, and the detailed notes of the interviewer and supervisor were not entered into the computer.
Section 1 GENERAL HOUSEHOLD INFORMATION
1A HOUSEHOLD STRUCTURE
1B INFORMATION ABOUT THE HOUSEHOLD MEMBERS’ PARENTS
1C INFORMATION ABOUT THE CHILDREN WHO ARE NOT LIVING IN HOME
Section 1A lists the personal id code, sex, relationship to the household head, ethnic group, type of resident permit (agricultural [nongye], non-agricultural [fei nongye], or no resident permit), date of birth, and marital status of all people who spent the previous night in the household and of household members who were temporarily away from home. The household head is listed first and receives the personal id code 1. Household members were defined to include “all the people who normally live and eat their meals together in this dwelling.” Those who were absent more than nine of the last twelve months were excluded, except for the head of household. For individuals who are married and whose spouse resides in the household, the personal id number of the spouse is noted. Information on the spouse can therefore be obtained by appropriately merging information from Section 1A with other parts of the survey.
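A spouse lookup of the kind described above might be sketched as follows. This is only an illustration under stated assumptions: HID and PID are the identifiers named in this documentation, but the field name `SPOUSE_PID` and the function `attach_spouse_info` are hypothetical, and the rows would come from whichever datasets the researcher has loaded.

```python
def attach_spouse_info(roster, details):
    """Attach each person's resident spouse record from another section.

    roster  : list of dicts with keys "HID", "PID", and "SPOUSE_PID"
              ("SPOUSE_PID" is a hypothetical name; None means no resident spouse).
    details : list of dicts from another section, each with "HID" and "PID".
    """
    # Index the other section by the (household, person) pair.
    by_person = {(row["HID"], row["PID"]): row for row in details}
    linked = []
    for person in roster:
        spouse_pid = person.get("SPOUSE_PID")
        spouse = None
        if spouse_pid is not None:
            # The spouse lives in the same household, so HID is shared.
            spouse = by_person.get((person["HID"], spouse_pid))
        linked.append({**person, "spouse": spouse})
    return linked
```

Keying the lookup on both HID and PID matters because person id codes restart within every household, so PID alone does not identify an individual.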
Section 1B collects information on the parents of all household members. For individuals whose parents reside in the household, parents’ personal id numbers are noted, and information can be obtained by appropriately merging information from other parts of the survey. For individuals whose parents do not reside in the household, information is recorded on whether each parent is alive, as well as their schooling and occupation.
Section 1C collects information on children of household members who are not living at home. Children who have died are not included. For each such child, the name, sex, type of resident permit, age, education level, education cost, reason for not living at home, current place of residence, and type of job are recorded.
Section 2 SCHOOLING
Section 2 collects information about literacy and numeracy, school attendance, completion, and current enrollment for all household members of preschool age and older. The interpretation of preschool age appears to have varied, with the result that while education information is available for some children of preschool age, not all preschool children were included in this section. For ages 6 and above, however, information is available for nearly all individuals, so in essence the data on schooling can be said to apply to all persons aged 6 and above. For those who were enrolled in school at the time of the survey, information was also collected on school attendance, expenses, and scholarships. If applicable, information on apprenticeships and technical or professional training was also collected.
Section 3 EMPLOYMENT
3A GENERAL INFORMATION
3B MAJOR NON-FARM JOB IN 1994
3C THE SECOND NON-FARM JOB IN 1994
3D OTHER EMPLOYMENT ACTIVITIES IN 1994
3E SEARCHING FOR NON-FARM JOB
3F PROCESS FOR GETTING MAJOR NON-FARM JOB
3G CORVEE LABOR
All individuals aged thirteen and above were asked to respond to the employment activity questions in Section 3. Section 3A collects general information on farm and non-farm employment, such as whether the household member worked on the household's own farm in 1994, the last year in which the member worked on the farm if he/she did not work there in 1994, work days and hours during the busy season, occupation and sector codes of the major, second, and third non-farm jobs, and the work days and total income of these non-farm jobs. A variable indicates whether or not the individual responded for himself or herself.
Sections 3B and 3C collect detailed information on the major and the second non-farm job. Information includes the number of months worked and which months in 1994 the member worked on these jobs, average work days (or hours) per month (per day), total number of years worked in these jobs by the end of 1994, the different components of income, and the type of employment contract. Information on the employer’s ownership type and location was also collected.
Section 3D collects information on the average hours spent each day doing chores and housework at home during the non-busy and busy seasons. The chores include cooking, laundry, cleaning, shopping, and cutting wood, as well as small-scale raising of farmyard animals such as pigs or chickens. Large-scale animal