https://search.gesis.org/research_data/datasearch-httpwww-da-ra-deoaip--oaioai-da-ra-de438965
Abstract (en): The American Time Use Survey (ATUS) collects information on how people living in the United States spend their time. Data collected in this study measured the amount of time that people spent doing various activities in 2005, such as paid work, child care, religious activities, volunteering, and socializing. Respondents were randomly selected from households that had completed their final month of the Current Population Survey (CPS), and were interviewed two to five months after their household's last CPS interview. Respondents were interviewed only once and reported their activities for the 24-hour period from 4 a.m. on the day before the interview until 4 a.m. on the day of the interview. Respondents indicated the total number of minutes spent on each activity, including where they were and whom they were with. Except for secondary child care, data on activities done simultaneously with primary activities were not collected. Part 1, Respondent and Activity Summary File, contains demographic information about respondents and a summary of the total amount of time they spent doing each activity that day. Part 2, Roster File, contains information about household members and nonhousehold children under the age of 18. Part 3, Activity File, includes additional information on activities in which respondents participated, including the location of each activity and the total time spent on secondary child care. Part 4, Who File, includes data on who was present during each activity. Part 5, ATUS-CPS 2005 File, contains data on respondents and members of their household collected two to five months prior to the ATUS interviews during their participation in the Current Population Survey (CPS). Parts 6-10 contain supplemental data files that can be used for further analysis of the data. Part 6, Case History File, contains information about the interview process, such as identifiers and interview outcome codes. 
Part 7, Call History File, gives information about each call attempt, including the call date and outcome. Part 8, Trips File, provides information about the number, duration, and purpose of overnight trips away from home for two or more nights in a row. Part 9, Replicate Weights File I, contains base weights, replicated base weights, and replicate final weights for each case that was selected to be interviewed for ATUS, while Part 10, Replicate Weights File II, contains replicate weights that were generated using the 2006 weighting method. Demographic variables include sex, age, race, ethnicity, education level, income, employment status, occupation, citizenship status, country of origin, relationship to household members, and the ages and number of children in the household. The data contain weight variables which should be used in analyzing the data. Unweighted data are not representative of the population due to differences between population groups in both sampling and nonresponse. ATUS weight variables include the ATUS final weight (TUFINLWGT), which indicates the number of person-days the respondent represents, the ATUS base weight (TUBWGT), and an ATUS final weight based on 2006 weighting methodology (TU06FWGT). ATUS weights were selected from the Current Population Survey (CPS), and CPS weights (after the first-stage adjustment) are the basis for the ATUS weights. These base weights were adjusted to account for the fact that less populous states were not oversampled in ATUS, as they were in the CPS. Further adjustments were made to account for the probability of selecting each household within the ATUS sampling strata and the probability of selecting each person from each sample household. Part 9 contains replicate weights for the variable TUFINLWGT, as well as base weights, while Part 10 contains replicate weights for the variable TU06FWGT. ATUS replicate weights were based on the replicate weights developed for the CPS.
ATUS began with the CPS replicate weight after the first-stage ratio adjustment, and each replicate was processed through all of the stages of the ATUS weighting procedure. The CPS replicate weights were based on a modified balanced half-sample method of replication, developed in the 1980s by Robert Fay. For more information about the replicate weights, see the publication, Technical Paper 63RV: Current Population Survey -- Design and Methodology, available via the Bureau of Labor Statistics Web site. More information on the weighting variables used in this study can be found in t...
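The weighting guidance above can be illustrated with a minimal Python sketch. It computes a weighted mean from final weights and a Fay-type replicate variance of the form sum((theta_r - theta)^2) / (R * (1 - k)^2). The toy data, the function names, and the Fay coefficient k = 0.5 are assumptions made for illustration only; the actual coefficient and number of replicates should be taken from the ATUS documentation.

```python
from math import sqrt


def weighted_mean(values, weights):
    """Weighted average: sum(w * x) / sum(w)."""
    total_weight = sum(weights)
    return sum(w * x for w, x in zip(weights, values)) / total_weight


def fay_variance(full_estimate, replicate_estimates, fay_k=0.5):
    """Fay-type replicate variance (fay_k is an assumed coefficient)."""
    r = len(replicate_estimates)
    squared_dev = sum((est - full_estimate) ** 2 for est in replicate_estimates)
    return squared_dev / (r * (1.0 - fay_k) ** 2)


# Hypothetical toy data: minutes spent on one activity by four respondents.
minutes = [0, 60, 120, 30]
final_wgt = [1000.0, 2000.0, 1500.0, 500.0]   # stand-in for a final weight
replicate_wgts = [                            # stand-ins for replicate weights
    [1500.0, 1000.0, 2250.0, 250.0],
    [500.0, 3000.0, 750.0, 750.0],
]

theta = weighted_mean(minutes, final_wgt)
theta_reps = [weighted_mean(minutes, w) for w in replicate_wgts]
std_error = sqrt(fay_variance(theta, theta_reps))
print(theta, std_error)
```

Each replicate estimate reuses the same data with a different weight column, so the spread of the replicate estimates around the full-sample estimate reflects the sampling design.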
https://search.gesis.org/research_data/datasearch-httpwww-da-ra-deoaip--oaioai-da-ra-de457357
Abstract (en): The Public Use Microdata Sample (PUMS) 1-Percent Sample contains household and person records for a sample of housing units that received the "long form" of the 1990 Census questionnaire. Data items include the full range of population and housing information collected in the 1990 Census, including 500 occupation categories, age by single years up to 90, and wages in dollars up to $140,000. Each person identified in the sample has an associated household record, containing information on household characteristics such as type of household and family income. All persons and housing units in the United States. A stratified sample, consisting of a subsample of the household units that received the 1990 Census "long-form" questionnaire (approximately 15.9 percent of all housing units). 2006-01-12 All files were removed from dataset 85 and flagged as study-level files, so that they will accompany all downloads. 2006-01-12 All files were removed from dataset 83 and flagged as study-level files, so that they will accompany all downloads. 2006-01-12 All files were removed from dataset 82 and flagged as study-level files, so that they will accompany all downloads. 2006-01-12 All files were removed from dataset 81 and flagged as study-level files, so that they will accompany all downloads. 2006-01-12 All files were removed from dataset 80 and flagged as study-level files, so that they will accompany all downloads. 1998-08-28 The following data files were replaced by the Census Bureau: the state files (Parts 1-56), Puerto Rico (Part 72), Geographic Equivalency File (Part 84), and Public Use Microdata Areas (PUMAS) Crossing State Lines (Part 99). These files now incorporate revised group quarters data. Parts 201-256, which were separate revised group quarters files for each state, have been removed from the collection. The data fields affected by the group quarters data revisions were POWSTATE, POWPUMA, MIGSTATE and MIGPUMA.
As a result of the revisions, the Maine file (Part 23) gained 763 records and Part 99 lost 763 records. In addition, the following files have been added to the collection: Ancestry Code List, Place of Birth Code List, Industry Code List, Language Code List, Occupation Code List, and Race Code List (Parts 86-91). Also, the codebook is now available as a PDF file. (1) Although all records are 231 characters in length, each file is hierarchical in structure, containing a housing unit record followed by a variable number of person records. Both record types contain approximately 120 variables. Two improvements over the 1980 PUMS files have been incorporated. First, the housing unit serial number is identified on both the housing unit record and on the person record, allowing the file to be processed as a rectangular file. In addition, each person record is assigned an individual weight, allowing users to more closely approximate published reports. Unlike previous years, the 1990 PUMS 1-Percent and 5-Percent Samples have not been released in separate geographic series (known as "A," "B," etc. records). Instead, each sample has its own set of geographies, known as "Public Use Microdata Areas" (PUMAs), established by the Census Bureau with assistance from each State Data Center. The PUMAs in the 1-Percent Sample are based on a distinction between metropolitan and nonmetropolitan areas. Metropolitan areas encompass whole central cities, Primary Metropolitan Statistical Areas (PMSAs), Metropolitan Statistical Areas (MSAs), or groups thereof, except where the city or metropolitan area contains more than 200,000 inhabitants. In that case, the city or metropolitan area is divided into several PUMAs. Nonmetropolitan PUMAs are based on areas or groups of areas outside the central city, PMSA, or MSA. PUMAs in this 1-Percent Sample may cross state lines. (2) The codebook is provided as a Portable Document Format (PDF) file. 
The PDF file format was developed by Adobe Systems Incorporated and can be accessed using PDF reader software, such as the Adobe Acrobat Reader. Information on how to obtain a copy of the Acrobat Reader is provided through the ICPSR Website on the Internet.
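The hierarchical layout described above (a housing unit record followed by its person records, with the serial number on both) can be flattened into a rectangular form along the following lines. This is a sketch only: the record-type flag in the first column ("H" or "P") and the serial number position are assumptions for illustration and should be checked against the codebook, and the toy records are far shorter than the real 231-character records.

```python
def rectangularize(lines, type_col=0, serial_slice=slice(1, 8)):
    """Pair each person record with the housing-unit record that precedes it.

    type_col and serial_slice are assumed positions, not taken from the codebook.
    """
    rows = []
    current_housing = None
    for line in lines:
        record_type = line[type_col]
        if record_type == "H":
            current_housing = line
        elif record_type == "P":
            if current_housing is None:
                raise ValueError("person record found before any housing record")
            if line[serial_slice] != current_housing[serial_slice]:
                raise ValueError("serial number mismatch between record types")
            rows.append((line[serial_slice], current_housing, line))
    return rows


# Hypothetical toy records for illustration.
sample = [
    "H0000001aaa",
    "P0000001p1",
    "P0000001p2",
    "H0000002bbb",
    "P0000002p3",
]
rows = rectangularize(sample)
for serial, housing, person in rows:
    print(serial, housing, person)
```

Housing units with no person records (for example, vacant units) simply contribute no rows in this sketch.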
https://search.gesis.org/research_data/datasearch-httpwww-da-ra-deoaip--oaioai-da-ra-de441781
Abstract (en): This data collection contains information gathered from questionnaires administered to high school seniors on two separate occasions. Part 1 contains data gathered in 1965 in order to provide information about the social and political climate of the peer groups and the entire senior classes of the student interviewees who were the subjects of the STUDENT-PARENT SOCIALIZATION STUDY, 1965 (ICPSR 7286). Part 2 contains similar data gathered in 1973 to provide a trend line and to cover slightly different topics. The schools used were defined by the 97 included in the socialization study, in which all members of the senior class were potential respondents. In the cohort study, several key political measures (especially trust, efficacy, tolerance, cosmopolitanism, salience, and partisanship) and personal measures were developed paralleling those used in the socialization study. Data include respondent's attitudes toward politics, things the respondent was least proud of (e.g., discrimination against minorities or dirty politics in government), concept of a good citizen, faith in government, political interest, attitudes toward federal government, party identification, academic courses, interest in public affairs, attitudes toward school and students, school activities, respondent's personality, academic background and plans, occupational plans, and family background. The 1965 and 1973 interviews differed in some respects: Part 1 included more attention to the social studies curriculum and the social climate, and Part 2 devoted more attention to political data and ethnic and racial composition. Additional information about the schools attended by the students was collected from school officials through a school characteristics form, e.g., percentages of various ethnic groups making up the student population, percentage of graduating seniors entering college, and whether the school had a formal social studies curriculum guide. 
These data are located at the end of each file. ICPSR data undergo a confidentiality review and are altered when necessary to limit the risk of disclosure. ICPSR also routinely creates ready-to-go data files along with setups in the major statistical software formats as well as standard codebooks to accompany the data. In addition to these procedures, ICPSR performed the following processing steps for this data collection: standardized missing values and checked for undocumented or out-of-range codes. High school seniors in the United States. The number of cooperating schools was 77 in 1965 and 85 in 1973. The number of respondents was 20,682 in 1965 (85 percent response rate) and 16,929 in 1973 (approximately 80 percent response rate). The weighted data provide a sample that can be considered a nationally representative sample of high school seniors based on population distributions as of the mid-1960s.
The purpose of the Tajikistan LSS surveys has been to provide quantitative data at the individual, household and community level that will facilitate purposeful policy design on issues of welfare and living standards of the population of the Republic of Tajikistan. Since 2007, the studies have been done in collaboration with the World Bank and UNICEF and implemented by the Tajik National Committee for Statistics. The 2007 LSS survey is based on the 2003 LSS and 2005 MICS surveys, with additional questions and modules.
National
Households
Sample survey data [ssd]
A detailed description of the sampling methodology is available in the appendix to the document "Basic Information Document".
The Tajikistan LSS sample was designed to allow reliable estimation of poverty and most other living standard indicators for the domains of interest, based on a representative probability sample at the level of:
• Tajikistan as a whole
• Total urban and total rural areas
• The five main administrative regions (oblasts) of the country: Dushanbe, Rayons of Republican Subordination (RRS), Sogd, Khatlon, and Gorno-Badakhshan Autonomous Oblast (GBAO)
The last census was conducted in 2000 and covered all five main administrative regions (oblasts) of the country (Dushanbe, RRS, Sogd, Khatlon, and GBAO). Each oblast was further subdivided into smaller areas called census section, instructor's sector and enumeration sector (ES). Each ES is either totally urban or rural. The list of ESs has census information on the population of each ES, and the ES lists were grouped by oblast.
In 2005, UNICEF implemented a Multiple Indicator Cluster Survey (MIC-05) in Tajikistan during which an electronic database of the ES information was created. Information in this database included: oblast, rayon, jamoat, settlement type, city/village, ES code, and population. Information from this database was used in the sample design of the TLSS07.
The total number of clusters for the Tajikistan LSS 2007 was established as 270 and total number of households per cluster was established as 18, resulting in a sample size of 4,860. The sample size was determined by considering:
• The reliability of the survey estimates on both regional and national level
• Quality of the data collected for the survey
• Cost in time for the data collection
• An oversample in 7 rayons in Khatlon
Face-to-face [f2f]
Data Entry and Cleaning
The data entry program was designed using CSPro, a data entry package developed by the US Census Bureau. This software allows programs to be developed to perform three types of data checks: (a) range checks; (b) intra-record checks to verify inconsistencies pertinent to the particular module of the questionnaire; and (c) inter-record checks to determine inconsistencies between the different modules of the questionnaire.
The data from the First Round were key entered at the Goskomstat headquarters in Dushanbe from 4 October 2007 through 25 November 2007. The Second Round and Sughd data were key entered from 26 November 2007 through 12 December 2007. All of the data were double entered, with double entry of the First Round, Second Round and Sughd re-collection data completed by 22 January 2008.
The data cleaning process began in February 2008 and was completed at the end of May 2008.
How to Use the Data:
There are three separate data bases with the data from the TLSS07. The data from each data collection is maintained separately. The data sets have similar names in each of the three separate data collections. First Round data sets have names in the form of "r1mnp" where "n" is the number of the module, and "p" is the part of the module (if any). Data from the Subjective Poverty module would be stored as "r1m9" and data from the Migration module, Part C Family Members Living Away from the Household would be stored as "r1m2c". Second Round data set names have a similar form "r2mnp". Data sets from the Sughd collection replace the "m" of the First Round with "sm", such as sm12a1.
The variable names have a similar format. Each variable name includes the module in which the variable is found and the question number. For example, question 10 in Module 4 Health, Part B Utilization of Outpatient Health Care is "m4b_q10". The variable names in all three of the data collections have the same format.
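The naming conventions for data sets and variables described above can be captured in two small helpers. This is a minimal sketch: the function names and the collection keys are hypothetical, and the Sughd prefix "sm" is inferred from the single example "sm12a1" given in the text.

```python
# Assumed prefixes, inferred from the examples r1m9, r2mnp and sm12a1.
PREFIXES = {"round1": "r1m", "round2": "r2m", "sughd": "sm"}


def dataset_name(collection, module, part=""):
    """Build a data set name such as r1m2c (collection, module number, part)."""
    return f"{PREFIXES[collection]}{module}{part.lower()}"


def variable_name(module, question, part=""):
    """Build a variable name such as m4b_q10 (module, part, question number)."""
    return f"m{module}{part.lower()}_q{question}"


print(dataset_name("round1", 9))         # Subjective Poverty module, First Round
print(dataset_name("round1", 2, "C"))    # Migration module, Part C
print(dataset_name("sughd", 12, "a1"))   # Sughd collection example from the text
print(variable_name(4, 10, "B"))         # Module 4, Part B, question 10
```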
In addition to the individual roster files for each data base, there is also one roster file for all three data bases, rosterall. This roster file contains the information on all of the households and household members who are included in the data. There is a variable (source) indicating if the household/member is: (a) in Round 1 only; (b) in Round 2 only; (c) in Round 1 and Round 2; or (d) in the Sughd data. It is important to pay attention to this variable as the recall period for the Subjective Poverty and Food Security Module (9A) is the last 4 weeks in the First Round, but changed to the last 2 weeks in the Second Round and the Sughd collection. In addition, the order of the question in the Expenditure On Food In The Last 7 Days, Module 10, changed
This data collection contains two separate data files, both of which are the results of the systematic evaluation of job worth performed by the Committee on Occupational Classification and Analysis of the National Academy of Sciences. The Committee acquired a selection of variables from the April 1971 Current Population Survey (CPS) that were gathered from a sample of households which yielded 60,441 workers in the experienced civilian labor force. The CPS survey provided detailed information about the workers and their family backgrounds, education, and employment. Part 1 contains that data augmented with Dictionary of Occupational Titles (DOT) characteristics, e.g., job classification and description, for each worker in the survey. Part 2 of this data collection is a file created by the Committee containing aggregate DOT characteristics (based on the DOT, Fourth Edition) for the 574 expanded occupation categories of the 1970 United States Census. The motivation for aggregating DOT characteristics (which exist as scores for each of 12,099 occupations) into 1970 United States Census codes was to allow researchers to relate the characteristics of occupations from the DOT to the characteristics of the individuals in those occupations gathered from the Census and survey data. The file's data -- the aggregated scores for all the workers in each of the 574 occupational categories -- are based on a variety of criteria, e.g., Specific Vocational Preparation (SVP), aptitudes, interest factors, preferences, physical demands, environmental conditions, and General Educational Development (GED).
MARF is the 1980 Census counterpart of the Master Enumeration District List (MEDList) prepared for the 1970 census. It links state or state equivalent, county or county equivalent, minor civil division (MCD)/census county division (CCD), and place names with their respective geographic codes. It is also an abbreviated summary file containing selected population and housing unit counts. MARF 2 has the same geographic coverage as the first MARF and includes the following additional information: FIPS place codes, latitude and longitude coordinates for geographic areas down to the BG/ED level, land area in square miles for geographic areas down to the level of places or minor civil divisions (for 11 selected states) with a population of 2,500 or more, total population and housing count estimates based on sample returns, and per capita income for all geographic areas included in the file. There are 51 files, one for each state and the District of Columbia. (Source: downloaded from ICPSR 7/13/10)
Please Note: This dataset is part of the historical CISER Data Archive Collection and is also available at ICPSR at https://doi.org/10.3886/ICPSR08258.v1. We highly recommend using the ICPSR version as they may make this dataset available in multiple data formats in the future.
The Consumer Expenditure Survey (CE) program provides a continuous and comprehensive flow of data on the buying habits of American consumers. These data are used widely in economic research and analysis, and in support of revisions of the Consumer Price Index. To meet the needs of users, the Bureau of Labor Statistics (BLS) produces population estimates for consumer units (CUs) of average expenditures in news releases, reports, issues, and articles in the Monthly Labor Review. Tabulated CE data are also available on the Internet and by facsimile transmission (See Section XV. APPENDIX 4). The microdata are available online at http://www.bls.gov/cex/pumdhome.htm.
These microdata files present detailed expenditure and income data for the Diary component of the CE for 2002. They include weekly expenditure (EXPD) and annual income (DTBD) files. The data in EXPD and DTBD files are categorized by a Universal Classification Code (UCC). The advantage of the EXPD and DTBD files is that with the data classified in a standardized format, the user may perform comparative expenditure (income) analysis with relative ease. The FMLD and MEMD files present data on the characteristics and demographics of CUs and CU members. The summary level expenditure and income information on the FMLD files permits the data user to link consumer spending, by general expenditure category, and household characteristics and demographics on one set of files.
Estimates of average expenditures in 2002 from the Diary survey, integrated with data from the Interview survey, are published in Consumer Expenditures in 2002. A list of recent publications containing data from the CE appears at the end of this documentation.
The microdata files are in the public domain and with appropriate credit, may be reproduced without permission. A suggested citation is: "U.S. Department of Labor, Bureau of Labor Statistics, Consumer Expenditure Survey, Diary Survey, 2002".
STATE IDENTIFIER
Since the CE is not designed to produce state-level estimates, summing the consumer unit weights by state will not yield state population totals. A CU's basic weight reflects its probability of selection among a group of primary sampling units of similar characteristics. For example, sample units in an urban nonmetropolitan area in California may represent similar areas in Wyoming and Nevada. Among other adjustments, CUs are post-stratified nationally by sex-age-race. For example, the weights of consumer units containing a black male, age 16-24 in Alabama, Colorado, or New York, are all adjusted equivalently. Therefore, weighted population state totals will not match population totals calculated from other surveys that are designed to represent state data. To summarize, the CE sample was not designed to produce precise estimates for individual states. Although state-level estimates that are unbiased in a repeated sampling sense can be calculated for various statistical measures, such as means and aggregates, their estimates will generally be subject to large variances. Additionally, a particular state-population estimate from the CE sample may be far from the true state-population estimate.
INTERPRETING THE DATA
Several factors should be considered when interpreting the expenditure data. The average expenditure for an item may be considerably lower than the expenditure by those CUs that purchased the item. The less frequently an item is purchased, the greater the difference between the average for all consumer units and the average of those purchasing. (See Section V.B. for ESTIMATION OF TOTAL AND MEAN EXPENDITURES). Also, an individual CU may spend more or less than the average, depending on its particular characteristics. Factors such as income, age of family members, geographic location, taste and personal preference also influence expenditures. Furthermore, even within groups with similar characteristics, the distribution of expenditures varies substantially.
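The gap between the average over all consumer units and the average among purchasers can be seen in a small weighted calculation. The sketch below uses hypothetical expenditures and weights and a hypothetical function name; it is not the estimation procedure of Section V.B.

```python
def mean_all_and_purchasers(expenditures, weights):
    """Return (weighted mean over all CUs, weighted mean over purchasing CUs)."""
    total_weight = sum(weights)
    total_spend = sum(w * x for w, x in zip(weights, expenditures))
    buyer_weight = sum(w for w, x in zip(weights, expenditures) if x > 0)
    mean_all = total_spend / total_weight
    mean_buyers = total_spend / buyer_weight if buyer_weight else 0.0
    return mean_all, mean_buyers


# Hypothetical item bought by one of four equally weighted consumer units.
spend = [0.0, 0.0, 0.0, 200.0]
wgts = [1000.0, 1000.0, 1000.0, 1000.0]
mean_all, mean_buyers = mean_all_and_purchasers(spend, wgts)
print(mean_all, mean_buyers)
```

With only a quarter of the weighted units purchasing, the all-unit average is a quarter of the purchasers' average, which is the pattern the paragraph describes for infrequently purchased items.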
Expenditures reported are the direct out-of-pocket expenditures. Indirect expenditures, which may be significant, may be reflected elsewhere. For example, rental contracts often include utilities. Renters with such contracts would record no direct expense for utilities, and therefore, appear to have no utility expenses. Employers or insurance companies frequently pay other costs. CUs with members whose employers pay for all or part of their health insurance or life insurance would have lower direct expenses for these items than those who pay the entire amount themselves. These points should be considered when relating reported averages to individual circumstances.
Consumer Unit
Sample survey data [ssd]
A. SURVEY SAMPLE DESIGN
Samples for the CE are national probability samples of households designed to be representative of the total U. S. civilian population. Eligible population includes all civilian noninstitutional persons.
The first step in sampling is the selection of primary sampling units (PSUs), which consist of counties (or parts thereof) or groups of counties. The set of sample PSUs used for the 2002 sample is composed of 105 areas. The design classifies the PSUs into four categories:
• 31 "A" certainty PSUs are Metropolitan Statistical Areas (MSA's) with a population greater than 1.5 million.
• 46 "B" PSUs are medium-sized MSA's.
• 10 "C" PSUs are nonmetropolitan areas that are included in the CPI.
• 18 "D" PSUs are nonmetropolitan areas where only the urban population data will be included in the CPI.
The sampling frame (that is, the list from which housing units were chosen) for the 2002 survey is generated from the 1990 Population Census 100-percent-detail file. The sampling frame is augmented by new construction permits and by techniques used to eliminate recognized deficiencies in census coverage. All Enumeration Districts (ED's) from the Census that fail to meet the criterion for good addresses for new construction, and all ED's in nonpermit-issuing areas are grouped into the area segment frame.
To the extent possible, an unclustered sample of units is selected within each PSU. This lack of clustering is desirable because the sample size of the Diary Survey is small relative to other surveys, while the intraclass correlations for expenditure characteristics are relatively large. This suggests that any clustering of the sample units could result in an unacceptable increase in the within-PSU variance and, as a result, the total variance.
Each selected sample unit is requested to keep two 1-week diaries of expenditures over consecutive weeks. The earliest possible day for placing a diary with a household is predesignated, with each day of the week having an equal chance of being the first day of the reference week. The diaries are evenly spaced throughout the year. During the last 6 weeks of the year, however, the Diary Survey sample is supplemented to twice its normal size to increase the reporting of types of expenditures unique to the holidays.
B. COOPERATION LEVELS
The annual target sample size at the United States level for the Diary Survey is 7,800 participating sample units. To achieve this target the total estimated work load is 11,275 sample units. This allows for refusals, vacancies, or nonexistent sample unit addresses.
Each participating sample unit selected is asked to keep two 1-week diaries. Each diary is treated independently, so response rates are based on twice the number of housing units sampled.
Computer Assisted Personal Interview [capi]
The response rate for the 2002 Diary Survey is 74.2%. This response rate refers to all diaries in the year.
The microdata file on households and dwellings, derived from the 1986 Census, contains a wide range of statistical data on the population of Canada, the provinces and most metropolitan areas. These data are based on a sample of 115,000 households, representing approximately 1% of all households in Canada. The file provides data on households per se, such as total income of household members, and the dwellings they occupy, such as date of construction. As well, extensive demographic, social and economic information about the household maintainer and his/her spouse, and about the maintainer's economic family are provided. The information is organized according to four themes: dwellings; households; household maintainer and spouse; and household maintainer's economic family.
AP VoteCast is a survey of the American electorate conducted by NORC at the University of Chicago for Fox News, NPR, PBS NewsHour, Univision News, USA Today Network, The Wall Street Journal and The Associated Press.
AP VoteCast combines interviews with a random sample of registered voters drawn from state voter files with self-identified registered voters selected using nonprobability approaches. In general elections, it also includes interviews with self-identified registered voters conducted using NORC’s probability-based AmeriSpeak® panel, which is designed to be representative of the U.S. population.
Interviews are conducted in English and Spanish. Respondents may receive a small monetary incentive for completing the survey. Participants selected as part of the random sample can be contacted by phone and mail and can take the survey by phone or online. Participants selected as part of the nonprobability sample complete the survey online.
In the 2020 general election, the survey of 133,103 interviews with registered voters was conducted between Oct. 26 and Nov. 3, concluding as polls closed on Election Day. AP VoteCast delivered data about the presidential election in all 50 states as well as all Senate and governors’ races in 2020.
This is survey data and must be properly weighted during analysis: DO NOT REPORT THIS DATA AS RAW OR AGGREGATE NUMBERS!!
Instead, use statistical software such as R or SPSS to weight the data.
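As a rough illustration of why weighting matters, the following Python sketch compares a raw share with a weighted share. The responses, the weights, and the function name are hypothetical and do not reflect the actual AP VoteCast variable names or weighting scheme.

```python
def weighted_share(responses, weights, category):
    """Share of the total weight held by respondents giving `category`."""
    total_weight = sum(weights)
    matching = sum(w for r, w in zip(responses, weights) if r == category)
    return matching / total_weight


# Hypothetical responses and survey weights for four respondents.
answers = ["A", "A", "B", "B"]
wgts = [0.5, 0.5, 1.5, 1.5]

raw_share = answers.count("A") / len(answers)
wtd_share = weighted_share(answers, wgts, "A")
print(raw_share, wtd_share)
```

In this toy case the raw share for "A" is twice the weighted share, because the respondents answering "A" carry smaller weights.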
National Survey
The national AP VoteCast survey of voters and nonvoters in 2020 is based on the results of the 50 state-based surveys and a nationally representative survey of 4,141 registered voters conducted between Nov. 1 and Nov. 3 on the probability-based AmeriSpeak panel. It included 41,776 probability interviews completed online and via telephone, and 87,186 nonprobability interviews completed online. The margin of sampling error is plus or minus 0.4 percentage points for voters and 0.9 percentage points for nonvoters.
State Surveys
In 20 states in 2020, AP VoteCast is based on roughly 1,000 probability-based interviews conducted online and by phone, and roughly 3,000 nonprobability interviews conducted online. In these states, the margin of sampling error is about plus or minus 2.3 percentage points for voters and 5.5 percentage points for nonvoters.
In an additional 20 states, AP VoteCast is based on roughly 500 probability-based interviews conducted online and by phone, and roughly 2,000 nonprobability interviews conducted online. In these states, the margin of sampling error is about plus or minus 2.9 percentage points for voters and 6.9 percentage points for nonvoters.
In the remaining 10 states, AP VoteCast is based on about 1,000 nonprobability interviews conducted online. In these states, the margin of sampling error is about plus or minus 4.5 percentage points for voters and 11.0 percentage points for nonvoters.
Although there is no statistically agreed upon approach for calculating margins of error for nonprobability samples, these margins of error were estimated using a measure of uncertainty that incorporates the variability associated with the poll estimates, as well as the variability associated with the survey weights as a result of calibration. After calibration, the nonprobability sample yields approximately unbiased estimates.
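One common way to let weight variability widen a margin of error is Kish's approximate design effect, sketched below in Python. This is a swapped-in textbook approximation and an assumption for illustration; it is not the measure of uncertainty that AP VoteCast used, and the function names are hypothetical.

```python
from math import sqrt


def kish_design_effect(weights):
    """Kish approximation: n * sum(w^2) / (sum(w))^2; equals 1 for equal weights."""
    n = len(weights)
    return n * sum(w * w for w in weights) / sum(weights) ** 2


def margin_of_error(p, weights, z=1.96):
    """Approximate margin of error for a proportion p, inflated by the design effect."""
    n = len(weights)
    deff = kish_design_effect(weights)
    return z * sqrt(deff * p * (1.0 - p) / n)


equal_weights = [1.0] * 1000
moe_equal = margin_of_error(0.5, equal_weights)
deff_unequal = kish_design_effect([1.0, 3.0])
print(moe_equal, deff_unequal)
```

Unequal weights push the design effect above 1, which reduces the effective sample size and widens the margin of error relative to a simple random sample of the same size.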
As with all surveys, AP VoteCast is subject to multiple sources of error, including from sampling, question wording and order, and nonresponse.
Sampling Details
Probability-based Registered Voter Sample
In each of the 40 states in which AP VoteCast included a probability-based sample, NORC obtained a sample of registered voters from Catalist LLC’s registered voter database. This database includes demographic information, as well as addresses and phone numbers for registered voters, allowing potential respondents to be contacted via mail and telephone. The sample is stratified by state, partisanship, and a modeled likelihood to respond to the postcard based on factors such as age, race, gender, voting history, and census block group education. In addition, NORC attempted to match sampled records to a registered voter database maintained by L2, which provided additional phone numbers and demographic information.
Prior to dialing, all probability sample records were mailed a postcard inviting them to complete the survey either online using a unique PIN or via telephone by calling a toll-free number. Postcards were addressed by name to the sampled registered voter if that individual was under age 35; postcards were addressed to “registered voter” in all other cases. Telephone interviews were conducted with the adult that answered the phone following confirmation of registered voter status in the state.
Nonprobability Sample
Nonprobability participants include panelists from Dynata or Lucid, including members of their third-party panels. In addition, some registered voters were selected from the voter file, matched to email addresses by V12, and recruited via an email invitation to the survey. Digital fingerprint software and panel-level ID validation are used to prevent respondents from completing the AP VoteCast survey multiple times.
AmeriSpeak Sample
During the initial recruitment phase of the AmeriSpeak panel, randomly selected U.S. households were sampled with a known, non-zero probability of selection from the NORC National Sample Frame and then contacted by mail, email, telephone and field interviewers (face-to-face). The panel provides sample coverage of approximately 97% of the U.S. household population. Those excluded from the sample include people with P.O. Box-only addresses, some addresses not listed in the U.S. Postal Service Delivery Sequence File and some newly constructed dwellings. Registered voter status was confirmed in field for all sampled panelists.
Weighting Details
AP VoteCast employs a four-step weighting approach that combines the probability sample with the nonprobability sample and refines estimates at a subregional level within each state. In a general election, the 50 state surveys and the AmeriSpeak survey are weighted separately and then combined into a survey representative of voters in all 50 states.
State Surveys
First, weights are constructed separately for the probability sample (when available) and the nonprobability sample for each state survey. These weights are adjusted to population totals to correct for demographic imbalances in age, gender, education and race/ethnicity of the responding sample compared to the population of registered voters in each state. In 2020, the adjustment targets are derived from a combination of data from the U.S. Census Bureau’s November 2018 Current Population Survey Voting and Registration Supplement, Catalist’s voter file and the Census Bureau’s 2018 American Community Survey. Prior to adjusting to population totals, the probability-based registered voter list sample weights are adjusted for differential non-response related to factors such as availability of phone numbers, age, race and partisanship.
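Adjusting weights to population totals on several demographic margins is commonly done by raking (iterative proportional fitting). The sketch below assumes that technique; the source does not name the algorithm, and the field names and target totals used here are hypothetical.

```python
def rake(rows, targets, n_iter=50):
    """Rake unit weights so weighted totals match population margins.

    rows    : list of dicts holding each respondent's categories
    targets : {variable: {category: population_total}}
    Every category present in the sample must have a target total.
    """
    weights = [1.0] * len(rows)
    for _ in range(n_iter):
        for var, totals in targets.items():
            # Current weighted total for each category of this variable
            current = {}
            for row, w in zip(rows, weights):
                current[row[var]] = current.get(row[var], 0.0) + w
            # Scale each weight so the category total hits its target
            weights = [
                w * totals[row[var]] / current[row[var]]
                for row, w in zip(rows, weights)
            ]
    return weights
```

A fixed number of passes keeps the sketch short; a production routine would normally stop on a convergence tolerance and trim extreme weights.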
Second, all respondents receive a calibration weight. The calibration weight is designed to ensure the nonprobability sample is similar to the probability sample in regard to variables that are predictive of vote choice, such as partisanship or direction of the country, which cannot be fully captured through the prior demographic adjustments. The calibration benchmarks are based on regional level estimates from regression models that incorporate all probability and nonprobability cases nationwide.
Third, all respondents in each state are weighted to improve estimates for substate geographic regions. This weight combines the weighted probability (if available) and nonprobability samples, and then uses a small area model to improve the estimate within subregions of a state.
Fourth, the survey results are weighted to the actual vote count following the completion of the election. This weighting is done in 10–30 subregions within each state.
National Survey
In a general election, the national survey is weighted to combine the 50 state surveys with the nationwide AmeriSpeak survey. Each of the state surveys is weighted as described. The AmeriSpeak survey receives a nonresponse-adjusted weight that is then adjusted to national totals for registered voters that in 2020 were derived from the U.S. Census Bureau’s November 2018 Current Population Survey Voting and Registration Supplement, the Catalist voter file and the Census Bureau’s 2018 American Community Survey. The state surveys are further adjusted to represent their appropriate proportion of the registered voter population for the country and combined with the AmeriSpeak survey. After all votes are counted, the national data file is adjusted to match the national popular vote for president.
This file contains data on Gini coefficients, cumulative quintile shares, explanations regarding the basis on which the Gini coefficient was computed, and the source of the information. There are two data-sets, one containing the "high quality" sample and the other one including all the information (of lower quality) that had been collected.
The database was constructed for the production of the following paper:
Deininger, Klaus and Lyn Squire, "A New Data Set Measuring Income Inequality", The World Bank Economic Review, 10(3): 565-91, 1996.
This article presents a new data set on inequality in the distribution of income. The authors explain the criteria they applied in selecting data on Gini coefficients and on individual quintile groups’ income shares. Comparison of the new data set with existing compilations reveals that the data assembled here represent an improvement in quality and a significant expansion in coverage, although differences in the definition of the underlying data might still affect intertemporal and international comparability. Based on this new data set, the authors do not find a systematic link between growth and changes in aggregate inequality. They do find a strong positive relationship between growth and reduction of poverty.
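For readers who want to relate the quintile shares in the file to a Gini coefficient, the sketch below approximates the Gini from grouped shares using the trapezoid rule on the Lorenz curve. This is a generic textbook approximation, not necessarily the basis on which the Gini coefficients in the data set were computed; the function name is hypothetical. Because it ignores inequality within each group, it understates the Gini computed from unit-record data.

```python
def gini_from_group_shares(shares):
    """Approximate Gini from income shares of equal-sized population groups.

    shares must be ordered from poorest to richest group (for example five
    quintile shares). They may be given in percent or as fractions.
    """
    total = sum(shares)
    k = len(shares)
    cum_prev = 0.0
    area = 0.0  # area under the piecewise-linear Lorenz curve
    for s in shares:
        cum = cum_prev + s / total
        area += (cum_prev + cum) / (2 * k)
        cum_prev = cum
    return 1 - 2 * area
```

Perfectly equal shares give 0, and a distribution in which the top quintile holds all income gives 0.8, the maximum attainable with five groups under this approximation.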
In what follows, we provide brief descriptions of main features for individual countries that are included in the data-base. Without being comprehensive, these notes are intended to indicate some of the considerations underlying our decision to include or exclude certain observations.
Argentina Various permanent household surveys, all covering urban centers only, have been regularly conducted since 1972 and are quoted in a wide variety of sources and years, e.g., for 1980 (World Bank 1992), 1985 (Altimir 1994), and 1989 (World Bank 1992). Estimates for 1963, 1965, 1969/70, 1970/71, 1974, 1975, 1980, and 1981 (Altimir 1987) are based only on Greater Buenos Aires. Estimates for 1961, 1963, 1970 (Jain 1975) and for 1970 (van Ginneken 1984) have only limited geographic coverage and do not satisfy our minimum criteria.
Despite the many urban surveys, there are no income distribution data that are representative of the population as a whole. References to national income distribution for the years 1953, 1959, and 1961 (CEPAL 1968 in Altimir 1986) are based on extrapolation from national accounts and have therefore not been included. Data for 1953 and 1961 from Weisskoff (1970), from Lecaillon (1984), and from Cromwell (1977) are also excluded.
Australia Household surveys, the results of which are reported in the statistical yearbook, have been conducted in 1968/9, 1975/6, 1978/9, 1981, 1985, 1986, 1989, and 1990.
Data for 1962 (Cromwell, 1977) and 1966/67 (Sawyer 1976) were excluded as they covered only tax payers. Jain's data for 1970 were excluded because they covered income recipients only. Data from Podder (1972) for 1967/68, from Jain (1975) for the same year, from UN (1985) for 1978/79, and from Sunders and Hobbes (1993) for 1986 and for 1989 were excluded given the availability of the primary sources. Data from Bishop (1991) for 1981/82, from Buhman (1988) for 1981/82, from Kakwani (1986) for 1975/76, and from Sunders and Hobbes (1993) for 1986 were utilized to test for the effect of different definitions. The values for 1967 used by Persson and Tabellini and Alesina and Rodrik (based on Paukert and Jain) are close to the ones reported in the Statistical Yearbook for 1969.
Austria: In addition to data referring to the employed population (Guger 1989), national household surveys for 1987 and 1991 are included in the LIS data base. As these data do not include income from self-employment, we do not report them in our high quality data-set.
Bahamas Data for Ginis and shares are available for 1973, 1977, 1979, 1986, 1988, 1989, 1991, 1992, and 1993 in government reports on population censuses and household budget surveys, and for 1973 and 1975 from UN (1981). Estimates for 1970 (Jain 1975), 1973, 1975, 1977, and 1979 (Fields 1989) have been excluded given the availability of primary sources.
Bangladesh Data from household surveys for 1973/74, 1976/77, 1977/78, 1981/82, and 1985/86 are available from the Statistical Yearbook, complemented by household-survey based information from Chen (1995) and the World Development Report. Household surveys with rural coverage for 1959, 1960, 1963/64, 1965, 1966/67 and 1968/69, and with urban coverage for 1963/64, 1965, 1966/67, and 1968/69 are also available from the Statistical Yearbook. Data for 1963/64, 1964, and 1966/67 (Jain 1975) are not included due to limited geographic coverage. We also excluded secondary sources for 1973/74, 1976/77, 1981/82 (Fields 1989), 1977 (UN 1981), 1983 (Milanovic 1994), and 1985/86 due to availability of the primary source.
Barbados National household surveys have been conducted in 1951/52 and 1978/79 (Downs, 1988). Estimates based on personal tax returns, reported consistently for 1951-1981 (Holder and Prescott, 1989), had to be excluded as they exclude the non-wage earning population. Jain's figure (used by Alesina and Rodrik) is based on the same source.
Belgium Household surveys with national coverage are available for 1978/79 (UN 1985), and for 1985, 1988, and 1992 (LIS 1995). Earlier data for 1969, 1973, 1975, 1976 and 1977 (UN 1981) refer to taxable households only and are not included.
Bolivia The only survey with national coverage is the 1990 LSMS (World Development Report). Surveys for 1986 and 1989 cover the main cities only (Psacharopoulos et al. 1992) and are therefore not included. Data for 1968 (Cromwell 1977) do not refer to a clear definition and are therefore excluded.
Botswana The only survey with national coverage was conducted in 1985-1986 (Chen et al 1993); surveys in 1974/75 and 1985/86 included rural areas only (UN 1981). We excluded Gini estimates for 1971/72 that refer to the economically active population only (Jain 1975), as well as 1974/75 and 1985/86 (Valentine 1993) due to lack of national coverage or consistency in definition.
Brazil Data from 1960, 1970, 1974/75, 1976, 1977, 1978, 1980, 1982, 1983, 1985, 1987 and 1989 are available from the statistical yearbook, in addition to data for 1978 (Fields 1987) and for 1979 (Psacharopoulos et al. 1992). Other sources have been excluded as they were either not of national coverage, based on wage earners only, or because a more consistent source was available.
Bulgaria: Data from household surveys are available for 1963-69 (in two-year intervals) and for 1970-90 (on an annual basis) from the Statistical Yearbook, and for 1991-93 from household surveys by the World Bank (Milanovic and Ying).
Burkina Faso A priority survey was undertaken in 1995.
Central African Republic: Except for a household survey conducted in 1992, no information was available.
Cameroon The only data are from a 1983/4 household budget survey (World Bank Poverty Assessment).
Canada Gini and share data for 1950-61 (in irregular intervals), 1961-81 (biennially), and 1981-91 (annually) are available from official sources (Statistical Yearbook for years before 1971 and Income Distributions by Size in Canada for years since 1973, various issues). All other references seem to be based on these primary sources.
Chad: An estimate for 1958 is available in the literature, and used by Alesina and Rodrik and Persson and Tabellini but was not included due to lack of primary sources.
Chile The first nation-wide survey that included not only employment income was carried out in 1968 (UN 1981). This is complemented by household survey-based data for 1971 (Fields 1989), 1989, and 1994. Other data, which either refer only to part of the population or, as in the case of a long series available from World Bank country operations, are not clearly based on primary sources, are excluded.
China Annual household surveys from 1980 to 1992, conducted separately in rural and urban areas, were consolidated by Ying (1995), based on the statistical yearbook. Data from other secondary sources are excluded due to limited geographic and population coverage, and data from Chen et al (1993) for 1985 and 1990 have not been included, to maintain consistency of sources.
Colombia The first household survey with national coverage was conducted in 1970 (DANE 1970). In addition, there are data for 1971, 1972, and 1974 (CEPAL 1986), and for 1978, 1988/89, and 1991 (World Bank Poverty Assessment 1992 and Chen et al. 1995). Data referring to years before 1970, including the 1964 estimate used in Persson and Tabellini, were excluded, as were estimates for the wage earning population only.
Costa Rica Data on Gini coefficients and quintile shares are available for 1961, 1971 (Cespedes 1973), 1977 (OPNPE 1982), 1979 (Fields 1989), 1981 (Chen et al 1993), 1983 (Bourguignon and Morrison 1989), 1986 (Sauma-Fiatt 1990), and 1989 (Chen et al 1993). Gini coefficients for 1971 (Gonzalez-Vega and Cespedes in Rottenberg 1993), 1973 and 1985 (Bourguignon and Morrison 1989) cover urban areas only and were excluded.
Cote d'Ivoire: Data based on national-level household surveys (LSMS) are available for 1985, 1986, 1987, 1988, and 1995. Information for the 1970s (Schneider 1991) is based on national accounting information and therefore excluded.
Cuba Official information on income distribution is limited. Data from secondary sources are available for 1953, 1962, 1973, and 1978, relying on personal wage income, i.e. excluding the population that is not economically active (Brundenius 1984).
Czech Republic Household surveys for 1993 and 1994 were obtained from Milanovic and Ying. While it is in principle possible to go back further, splitting national level surveys for the former Czechoslovakia into their independent parts, we decided not to do so as the same argument could be used to
The 2003 Family Income and Expenditure Survey (FIES) had the following primary objectives:
1) to gather data on family income and family expenditure and related information affecting income and expenditure levels and patterns in the Philippines;
2) to determine the sources of income and income distribution, levels of living and spending patterns, and the degree of inequality among families;
3) to provide benchmark information to update weights for the estimation of consumer price index; and
4) to provide information for the estimation of the country's poverty threshold and incidence.
National coverage
Household Consumption expenditure item Income by source
The 2003 FIES has as its target population, all households and members of households nationwide. A household is defined as an aggregate of persons, generally but not necessarily bound by ties of kinship, who live together under the same roof and eat together or share in common the household food. Household membership comprises the head of the household, relatives living with him such as his/her spouse, children, parent, brother/sister, son-in-law/daughter-in-law, grandson/granddaughter and other relatives. Household membership likewise includes boarders, domestic helpers and non-relatives. A person who lives alone is considered a separate household.
Institutional population is not within the scope of the survey.
Sample survey data [ssd]
The 2003 Master Sample (MS) considers the country's 17 administrative regions as defined in Executive Orders (EO) 36 and 131 as the sampling domains. A domain is a subdivision of the country for which estimates with an adequate level of precision are generated. It must be noted that while there is demand for data at the provincial level (and to some extent municipal and barangay levels), the provinces were not treated as sampling domains because there are more than 80 provinces, which would entail a large resource requirement. Below are the 17 administrative regions of the country:
National Capital Region
Cordillera Administrative Region
Region I - Ilocos
Region II - Cagayan Valley
Region III - Central Luzon
Region IVA - CALABARZON
Region IVB - MIMAROPA
Region V - Bicol
Region VI - Western Visayas
Region VII - Central Visayas
Region VIII - Eastern Visayas
Region IX - Zamboanga Peninsula
Region X - Northern Mindanao
Region XI - Davao
Region XII - SOCCSKSARGEN
Region XIII - Caraga
Autonomous Region in Muslim Mindanao
As in most household surveys, the 2003 MS made use of an area sample design. For this purpose, the Enumeration Area Reference File (EARF) of the 2000 Census of Population and Housing (CPH) was utilized as sampling frame. The EARF contains the number of households by enumeration area (EA) in each barangay.
This frame was used to form the primary sampling units (PSUs). With consideration of the period for which the 2003 MS will be in use, the PSUs were formed/defined as a barangay or a combination of barangays with at least 500 households.
The 2003 MS considers the 17 regions of the country as the primary strata. Within each region, further stratification was performed using geographic groupings such as provinces, highly urbanized cities (HUCs), and independent component cities (ICCs). Within each of these substrata formed within regions, the PSUs were further stratified, to the extent possible, using the proportion of strong houses (PSTRONG), indicator of engagement in agriculture of the area (AGRI), and a measure of per capita income (PERCAPITA) as stratification factors.
The 2003 MS consists of a sample of 2,835 PSUs. The entire MS was divided into four sub-samples, or independent replicates, such that a quarter sample contains one fourth of the total PSUs and a half sample contains two of the four sub-samples, that is, all PSUs in two replicates.
The final number of sample PSUs for each domain was determined by first classifying PSUs as either self-representing (SR) or non-self-representing (NSR). In addition, to facilitate the selection of sub-samples, the total number of NSR PSUs in each region was adjusted to make it a multiple of 4.
An SR PSU is a very large PSU in the region/domain with a selection probability of approximately 1 or higher. It is included in the MS outright and is properly treated as a stratum; it is also known as a certainty PSU. An NSR PSU is a regular PSU that is too small to be self-representing; it is also known as a non-certainty PSU. The 2003 MS consists of 330 certainty PSUs and 2,505 non-certainty PSUs.
To have some control over the sub-sample size, the PSUs were selected with probability proportional to some estimated measure of size. The size measure refers to the total number of households from the 2000 CPH. Because of the wide variation in PSU sizes, PSUs with selection probabilities greater than 1 were identified and were included in the sample as certainty selections.
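The selection rule above can be sketched as follows: PSUs whose expected number of selections reaches 1 are set aside as certainty selections, and the rest are drawn with probability proportional to size. Systematic PPS is assumed here for the non-certainty draw; the source does not state which PPS procedure was used, and the function name, seed handling, and sizes are hypothetical.

```python
import random


def pps_with_certainty(sizes, n, seed=0):
    """Select n PSUs with probability proportional to size.

    Returns (certainty_indices, sampled_indices). PSUs whose expected
    number of hits is at least 1 are taken with certainty; the threshold
    is re-evaluated after each removal.
    """
    remaining = dict(enumerate(sizes))
    certainty = []
    while True:
        k = n - len(certainty)
        total = sum(remaining.values())
        big = [i for i, s in remaining.items() if k > 0 and k * s >= total]
        if not big:
            break
        for i in big:
            certainty.append(i)
            del remaining[i]

    # Systematic PPS draw among the non-certainty PSUs
    k = n - len(certainty)
    sampled = []
    if k > 0 and remaining:
        total = sum(remaining.values())
        step = total / k
        start = random.Random(seed).uniform(0, step)
        points = iter(start + j * step for j in range(k))
        point = next(points, None)
        cum = 0.0
        for i, s in remaining.items():
            cum += s
            while point is not None and point < cum:
                sampled.append(i)
                point = next(points, None)
    return certainty, sampled
```

Because every remaining PSU is smaller than the sampling interval, no non-certainty PSU can be selected more than once.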
At the second stage, enumeration areas (EAs) were selected within sampled PSUs, and at the third stage, housing units were selected within sampled EAs. Generally, all households in sampled housing units were enumerated, except in the few cases where the number of households in a housing unit exceeded three. In those cases, a sample of three households in the sampled housing unit was selected at random with equal probability.
An EA is defined as an area with discernable boundaries within barangays consisting of about 150 contiguous households. These EAs were identified during the 2000 CPH. A housing unit, on the other hand, is a structurally separate and independent place of abode which, by the way it has been constructed, converted, or arranged, is intended for habitation by a household.
The 2003 FIES involved the interview of a national sample of about 51,000 households, deemed sufficient to gather data on family income and family expenditure and related information affecting income and expenditure levels and patterns in the Philippines at the national and regional level. The sample households covered in the survey were the same households interviewed in the July 2003 and January 2004 rounds of the LFS.
Face-to-face [f2f]
The 2003 FIES questionnaire contains about 800 data items and a summary for comparing income and expenditures. The questionnaires were subjected to rigorous manual and machine edit checks for completeness, arithmetic accuracy, range validity and internal consistency.
The major steps in the machine processing are as follows:
1. Data Entry
2. Completeness Check
3. Matching of visit records
4. Consistency and Macro Edit (Big Edit)
5. Generation of the Public Use File
6. Tabulation
Steps 1 and 2 were done right after each visit. The remaining steps were carried out only after the second visit had been completed.
Steps 1 to 4 were done at the Regional Office while Steps 5 and 6 were completed in the Central Office.
After completing Steps 1 to 4, data files were transmitted to the Central Office where a summary file was generated. The summary file was used to produce the consistency tables as well as the preliminary and textual tables.
When the generated tables showed inconsistencies, selected data items were subjected to further scrutiny and validation. The cycle of generating consistency tables and validating data was repeated until questionable data items were verified.
FAME (FIES computer-Aided Consistency and Macro Editing), an interactive Windows-based application system, was used in data processing. This system has been in use since the 2000 FIES round. The interactive module of FAME enabled the following activities to be done simultaneously:
a) Matching of visit records
b) Consistency and macro edit (big edit)
c) Range check
The improved system reduced processing time and reduced, if not eliminated, the need for paper to generate the reject listing.
Note: For data entry, CSPro Version 2.6 was used.
The response rate for this survey is 95.7%. The response rate is the ratio of the total responding households to the total number of eligible households. Eligible households include households who were completely interviewed, refused to be interviewed or were temporarily away or not at home or on vacation during the survey period.
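The definition above amounts to a simple ratio. The sketch below spells it out; the household counts are hypothetical values chosen only to reproduce a 95.7% rate and are not figures from the survey.

```python
def response_rate(completed, refused, away_or_not_at_home):
    """Responding households as a percentage of eligible households.

    Eligible households are those completely interviewed, those that
    refused, and those temporarily away, not at home, or on vacation.
    """
    eligible = completed + refused + away_or_not_at_home
    return 100.0 * completed / eligible


# Hypothetical counts for illustration only
rate = response_rate(completed=957, refused=20, away_or_not_at_home=23)
```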
As in all surveys, two types of non-response were encountered in the 2003 FIES: interview non-response and item non-response. Interview non-response refers to a sample household that could not be interviewed. Since the survey requires that the sample households be interviewed in both visits, the interview non-response cases include households that, at the second visit, had transferred to another dwelling unit, were temporarily away, on vacation or not at home, had their housing unit demolished or destroyed by fire or typhoon, or refused to be interviewed.
Item non-response, or the failure to obtain responses to particular survey items, resulted from factors such as respondents being unaware of the answer to a particular question, being unwilling to provide the requested information, or enumerators' omission of questions during the interview. Deterministic imputation was done to address item non-response. This imputation is a process in which the proper entry for a particular missing item is deduced from other items of the questionnaire in which the non-response was observed. Notes and remarks indicated in the questionnaire were likewise used as a basis for imputation.
Refer to the
The main objectives of the CLF-CLS 2011-2012 are to collect detailed information on the country's labour force of persons 15 years old and above and children 5 to 17 years old disaggregated by age, gender, region, sector and social category. The survey provides information on the national labour market that can then be used to develop, manage and evaluate labour market policies and programmes. Also, the survey provides detailed information on child workers and hazards at work.
It is intended to promote a gender mainstreamed analysis of the labour market and compile national and provincial statistics relating to informal employment, working poor and vulnerable employment. These statistics will be especially useful to government as it attempts to identify the problems that Cambodians face in the area of employment. With this information available, planners and policy makers will then be better placed to develop policies and programmes to improve the welfare of the people and some information on working people and child labour.
National, urban, rural, and all provinces in Cambodia (24 provinces)
Individuals
Sample survey data [ssd]
The Cambodia Labour Force and Child Labour Survey 2011-12 covered the 24 capital/provinces in the country and involved 600 Enumeration Areas (EAs) randomly selected as primary sampling units, or PSUs, and households randomly selected as secondary sampling units, or SSUs. In each EA, 16 sample households were randomly selected, giving a total of 9,600 households to be interviewed.
The sampling frame was based on the village population data files from the 2008 general population census, conducted by the NIS. The CLF-CLS 2011-12 sample was selected in two stages, with EAs as the primary sampling units and households as secondary sampling units. Of the 600 sample EAs, 54 were allocated to urban areas and the remaining 546 to rural areas.
For details please refer to the document entitled "Report on Selection of Sampled Households from the Sampling Frame for Cambodia Labour Force and Child Labour Survey 2011-2012".
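To show how a household weight can follow from a two-stage design of this kind, the sketch below computes the inverse of the overall selection probability. It assumes EAs were selected with probability proportional to their frame household counts and that 16 households were drawn with equal probability from the listing in each EA; the source does not state the first-stage selection method, so both the assumption and the input figures are hypothetical.

```python
def household_weight(n_sample_eas, ea_frame_households,
                     stratum_frame_households, listed_households,
                     n_sample_households=16):
    """Design weight = 1 / (P(EA selected) * P(household selected | EA)).

    Assumes PPS selection of EAs within a stratum (an assumption, not
    confirmed by the source) and equal-probability selection of households
    from the EA listing.
    """
    p_ea = n_sample_eas * ea_frame_households / stratum_frame_households
    p_hh = min(1.0, n_sample_households / listed_households)
    return 1.0 / (p_ea * p_hh)


# Total sample size implied by the design: 600 EAs x 16 households
total_households = 600 * 16
```

The listing count enters the second-stage probability, which is one reason an up-to-date household listing is needed to derive weights.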
Face-to-face [f2f]
The following are the LFCLS forms used during the field enumeration and a brief outline of the fieldwork procedures:
2.1 Listing Sheet
This is a sheet containing a list of the buildings, housing units and households within an enumeration area (EA). Other information pertaining to the population of households was also recorded.
The listing sheet was used to record all households in the village, or part thereof, selected for household enumeration. The current list of households was necessary for sampling households and also as an input to derive household weights.
2.2 Questionnaire
The Cambodia Labour Force and Child Labour questionnaire consists of a cover page, which contains general information on the household, followed by the 12 sections:
A. Household composition and characteristics of household members
B. Literacy and Education
C. Training within the last 12 months (outside of the general education system)
D. Current activities
E. Characteristics of the main job/activity in the last 7 days
F. Characteristics of the secondary job/activity in the last 7 days
G. Hours of work
H. Underemployment
I. Job search
J. Occupational injuries within the last 12 months
K. Participation in production of goods for use by own household
L. Other activities
All completed questionnaires were brought to NIS for processing. Although completed questionnaires were checked and edited by supervisors in the field, the length of the questionnaires and the complexity of the topics covered made manual editing and coding by trained staff an essential priority for producing a cleaned data file without delay. In all, 4 staff (3 processing staff and 1 supervisor) were trained for two days by the project staff. An instruction manual for manual editing and coding was prepared and translated into Khmer for the guidance of processing staff.
In order to produce an unedited data file, keying in the data as recorded by field enumerators and supervisors (without subjecting data to manual edit, as required by the Analysis Component Project staff), it was necessary to structure manual editing as a two-phase operation. Thus, in the first phase, the processing staff coded the questions that required coding, such as those on industry and occupation. Editing was restricted to selected structural edits and some error corrections. These edits were restricted to checking the completeness and consistency of responses, legibility, and totaling of selected questions. Error corrections were made without canceling or obliterating the original entry made by the enumerator, by inserting the correction close to the original entry.
Much of the manual editing was carried out in the second phase, after key entry, one hundred percent verification, and extraction of error printouts. A wide range of errors had to be corrected, which was expected in view of the complexity of the survey and the skill background of the enumeration and processing staff. The manual edits involved the correction of errors arising from incorrect key entry, incorrect or missing identification, miscoding of answers, failure to follow skip patterns, misinterpretation of measures, range errors, and other consistency errors.
Despite the length of the questionnaire, the respondents cooperated with the survey staff and provided answers to both questionnaires and it was possible to achieve a 100% response rate. At this stage it is not possible to comment on item non-response, and completeness of information provided by the respondents, and the respondent's fatigue arising from the length of the interviews which may have had a bearing on these issues.
https://search.gesis.org/research_data/datasearch-httpwww-da-ra-deoaip--oaioai-da-ra-de457513
Abstract (en): The Global Entrepreneurship Monitor [GEM] research program was developed to provide comparisons among countries related to participation of adults in the firm creation process. The initial data was assembled as a pretest of five countries in 1998 and by 2012 over 100 countries had been involved in the program. The initial design for the GEM initiative was based on the first US Panel Study of Entrepreneurial Dynamics, and by 2012 data from 1,827,513 individuals had been gathered in 563 national samples and 6 specialized regional samples. This dataset is a harmonized file capturing results from all of the surveys. The procedure has been to harmonize the basic items across all surveys in all years, followed by implementing a standardized transform to identify those active as nascent entrepreneurs in the start-up process, as owner-managers of new firms, or as owner-managers of established firms. Those identified as nascent entrepreneurs or new business owners are the basis for the Total Entrepreneurial Activity [TEA] or Total Early-Stage index. This harmonized, consolidated assessment not only facilitates comparisons across countries, but provides a basis for temporal comparisons for individual countries. Respondents were queried on the following main topics: general entrepreneurship, start-up activities, ownership and management of the firm, and business angels (angel investors). Respondents were initially screened by way of a series of general questions pertaining to starting a business, such as whether they were currently trying to start a new business, whether they knew anyone who had started a new business, whether they thought it was a good time to start a new business, as well as their perceptions of the income potential and the prestige associated with starting a new business. Demographic variables include respondent age, sex, and employment status. 
The data are not weighted; however, this collection contains three weight variables that should be used in any analysis: WEIGHT, WEIGHT_L, and WEIGHT_A. National survey vendors implemented weights that would match the annual cohorts with the best available national data, later adjusted by matching the sample to the U.S. Census Bureau International Data Base (IDB) on national population distributions by age and gender. For more information on weights and sampling, please refer to the Original P.I. Documentation section in the ICPSR Codebook. ICPSR data undergo a confidentiality review and are altered when necessary to limit the risk of disclosure. ICPSR also routinely creates ready-to-go data files along with setups in the major statistical software formats as well as standard codebooks to accompany the data. In addition to these procedures, ICPSR performed the following processing steps for this data collection: created an online analysis version with question text; checked for undocumented or out-of-range codes. Adult populations of 100 countries. Smallest Geographic Unit: Country. Developing representative samples of adults was a two-stage process. The first step involved a random selection of households leading to a contact with an adult resident. In countries where a high proportion of households have landline telephones, this was done by creating a random set of numbers considered to be household phone numbers. In countries with a high proportion of cell-phone-only adults, this has been supplemented with random samples of active cell phone numbers. Numbers were then called, generally up to three times, until an adult respondent answered the phone. In countries with a low proportion of households with phones, geographic areas were selected at random for personal contacts by interviewers, who then approached households for a face-to-face interview. 
In some developing countries, phone interviews are conducted in the major urban areas, supplemented with face-to-face interviews in rural regions. Adults from each household were selected for interviews in one of two ways. In some cases it was the first adult contacted, and in others a person was randomly selected for the interview from the adults living in the household. In many developed countries there was a deliberate attempt (quota sampling) to complete half of all interviews with men and half with women. For additional information on sampling, please refer to the Original P.I. Documentation section in the ICPSR Codebook. 2016-12-14 Data have been resupplied and now in...
The primary objective of the 2014 GDHS was to generate recent reliable information on fertility, family planning, infant and child mortality, maternal and child health, and nutrition. In addition, the survey collected specialised data on malaria treatment, prevention, and prevalence among children age 6-59 months; blood pressure among adults; anaemia among women and children; and HIV prevalence among adults. This information is essential for making informed policy decisions and for planning, monitoring, and evaluating programmes related to health in general, and reproductive health in particular, at both the national and regional levels. Analysis of data collected in the 2014 GDHS provides updated estimates of basic demographic and health indicators covered in the earlier rounds of the 1988, 1993, 1998, 2003, and 2008 surveys.
The GDHS will assist policymakers and programme managers in evaluating and designing programmes and strategies for improving the health of Ghana’s population. The 2014 GDHS also provides comparable data for long-term trend analysis in Ghana, since the surveys were implemented by the same organisation, using similar data collection procedures. Furthermore, the survey adds to the international database on demographic and health–related information for research purposes.
National
Sample survey data [ssd]
The sampling frame used for the 2014 GDHS is an updated frame from the 2010 Ghana Population and Housing Census provided by the Ghana Statistical Service (GSS 2013b). The sampling frame excluded nomadic and institutional populations such as persons in hotels, barracks, and prisons.
The 2014 GDHS followed a two-stage sample design and was intended to allow estimates of key indicators at the national level as well as for urban and rural areas and each of Ghana's 10 administrative regions. The first stage involved selecting sample points (clusters) consisting of enumeration areas (EAs) delineated for the 2010 PHC. A total of 427 clusters were selected, 216 in urban areas and 211 in rural areas.
The second stage involved the systematic sampling of households. A household listing operation was undertaken in all the selected EAs in January-March 2014, and households to be included in the survey were randomly selected from the list. About 30 households were selected from each cluster to constitute the total sample size of 12,831 households. Because of the approximately equal sample sizes in each region, the sample is not self-weighting at the national level, and weighting factors have been added to the data file so that the results will be proportional at the national level.
All women age 15-49 who were either permanent residents of the selected households or visitors who stayed in the household the night before the survey were eligible to be interviewed and have their blood pressure measured.
In half of the households, all men age 15-59 who were either permanent residents of the selected households or visitors who stayed in the households the night before the survey were eligible to be interviewed. In addition, in the subsample of households selected for the male survey: • blood pressure measurements were performed among eligible men who consented to being tested; • children age 6-59 months were tested for anaemia and malaria with the parent's or guardian's consent; • eligible women who consented were tested for anaemia; • blood samples were collected for laboratory testing of HIV from eligible women and men who consented; and • height and weight information was collected from eligible women, men, and children age 0-59 months.
For further details on sample selection, see Appendix A of the final report.
Face-to-face [f2f]
Three questionnaires were used for the 2014 GDHS: the Household Questionnaire, the Woman’s Questionnaire, and the Man’s Questionnaire. These questionnaires, which were based on standard Demographic and Health Survey (DHS) questionnaires, were adapted to reflect the population and health issues relevant to Ghana. Comments on the questionnaires were solicited from various stakeholders representing government ministries and agencies, nongovernmental organisations, and international donors. The definitive questionnaires were first prepared in English; they were then translated into the major local languages, namely Akan, Ga, and Ewe.
The Household Questionnaire was used to list all the members of and visitors to the selected households. Basic demographic information was collected on the characteristics of each person listed, including his or her age, sex, marital status, education, and relationship to the head of the household. For children under age 18, parents’ survival status was determined. The data on age and sex of household members obtained in the Household Questionnaire were used to identify women and men who were eligible for individual interviews. The Household Questionnaire also included questions on child education as well as the characteristics of the household’s dwelling unit, such as source of water, type of toilet facilities, materials used for the floor of the dwelling unit, and ownership of various durable goods.
The Woman’s Questionnaire was used to collect information from all eligible women age 15-49.
In half of the selected households, the Man’s Questionnaire was administered to all men age 15-59. The Man’s Questionnaire collected much of the same information found in the Woman’s Questionnaire but was shorter because it did not contain a detailed reproductive history or questions on maternal and child health.
The data processing operation included 100 percent verification (also called second data entry) and secondary editing, which involved resolution of computer-identified inconsistencies. The data processing activities at the central office were led by one key GSS officer who took part in the main fieldwork training. Data processing was accomplished using CSPro software. Data entry and editing were initiated in September 2014 and completed in February 2015.
A total of 12,831 households were selected for the sample, of which 12,010 were occupied. Of the occupied households, 11,835 were successfully interviewed, yielding a response rate of 99 percent, the same as the 2008 GDHS household response rate (GSS, GHS, and ICF Macro 2009).
In the interviewed households, 9,656 eligible women were identified for individual interviews; interviews were completed with 9,396 women, yielding a response rate of 97 percent. In the subsample of households selected for the male survey, 4,609 eligible men were identified and 4,388 were successfully interviewed, yielding a response rate of 95 percent. The lower response rate for men was likely due to their more frequent and longer absences from the household.
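The response rates quoted above follow directly from the counts in the text. As a minimal sketch (the function name `response_rate` is illustrative and not part of the survey documentation), the arithmetic can be reproduced as follows; the three results round to the 99, 97, and 95 percent reported.

```python
def response_rate(completed, eligible):
    """Return the response rate as a percentage of eligible units."""
    return 100.0 * completed / eligible


# Counts taken from the text: (completed interviews, eligible units).
counts = {
    "households": (11_835, 12_010),  # interviewed / occupied households
    "women": (9_396, 9_656),         # interviewed / eligible women
    "men": (4_388, 4_609),           # interviewed / eligible men
}

for group, (completed, eligible) in counts.items():
    print(f"{group}: {response_rate(completed, eligible):.1f}%")
```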
The estimates from a sample survey are affected by two types of errors: non-sampling errors and sampling errors. Non-sampling errors are the results of mistakes made in implementing data collection and data processing, such as failure to locate and interview the correct household, misunderstanding of the questions on the part of either the interviewer or the respondent, and data entry errors. Although numerous efforts were made during the implementation of the 2014 Ghana DHS (GDHS) to minimize this type of error, non-sampling errors are impossible to avoid and difficult to evaluate statistically.
Sampling errors, on the other hand, can be evaluated statistically. The sample of respondents selected in the 2014 GDHS is only one of many samples that could have been selected from the same population, using the same design and expected size. Each of these samples would yield results that differ somewhat from the results of the actual sample selected. Sampling errors are a measure of the variability between all possible samples. Although the degree of variability is not known exactly, it can be estimated from the survey results.
Sampling error is usually measured in terms of the standard error for a particular statistic (mean, percentage, etc.), which is the square root of the variance. The standard error can be used to calculate confidence intervals within which the true value for the population can reasonably be assumed to fall. For example, for any given statistic calculated from a sample survey, the value of that statistic will fall within a range of plus or minus two times the standard error of that statistic in 95 percent of all possible samples of identical size and design.
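The relationship between variance, standard error, and the "plus or minus two standard errors" interval can be sketched in a few lines. This is only an illustration of the rule stated above; the function names and the example figures (an estimate of 0.40 with a variance of 0.0004) are hypothetical and are not taken from the GDHS tables.

```python
import math


def standard_error(variance):
    """The standard error is the square root of the variance."""
    return math.sqrt(variance)


def approx_95_interval(estimate, se):
    """Approximate 95 percent interval: estimate plus or minus two standard errors."""
    return estimate - 2 * se, estimate + 2 * se


# Hypothetical example values, for illustration only.
se = standard_error(0.0004)
low, high = approx_95_interval(0.40, se)
print(f"SE = {se:.3f}, interval = ({low:.3f}, {high:.3f})")
```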
If the sample of respondents had been selected as a simple random sample, it would have been possible to use straightforward formulas for calculating sampling errors. However, the 2014 GDHS sample is the result of a multi-stage stratified design, and, consequently, it was necessary to use more complex formulae. Sampling errors are computed in either ISSA or SAS, using programs developed by ICF International. These programs use the Taylor linearization method of variance estimation for survey estimates that are means, proportions or ratios. The Jackknife repeated replication method is used for variance estimation of more complex statistics such as fertility and mortality rates.
The Taylor linearization method treats any percentage or average as a ratio estimate, r = y/x, where y represents the total sample value for variable y, and x represents the
THE CLEANED AND HARMONIZED VERSION OF THE SURVEY DATA PRODUCED AND PUBLISHED BY THE ECONOMIC RESEARCH FORUM REPRESENTS 100% OF THE ORIGINAL SURVEY DATA COLLECTED BY THE PALESTINIAN CENTRAL BUREAU OF STATISTICS
The Palestinian Central Bureau of Statistics (PCBS) carried out four rounds of the Labor Force Survey (LFS) in 2005. The survey rounds covered a total sample of about 30,252 households, and the number of completed questionnaires was 26,595, which amounts to a sample of around 92,384 individuals aged 15 years and over.
The survey is important because it focuses mainly on key labour force indicators: the main characteristics of the employed, unemployed, underemployed and persons outside the labour force; the labour force by level of education; the distribution of the employed population by occupation, economic activity, place of work and employment status; hours and days worked; and the average daily wage in NIS for employees.
The main objectives of the survey are: - To estimate the labor force and its percentage of the population. - To estimate the number of employed individuals. - To analyze the labour force according to gender, employment status, educational level, occupation and economic activity. - To provide information about the main changes in the labour market structure and its socioeconomic characteristics. - To estimate the number of unemployed individuals and analyze their general characteristics. - To estimate the rate of working hours and wages for employed individuals, in addition to analyzing other characteristics.
The raw survey data provided by the Statistical Agency were cleaned and harmonized by the Economic Research Forum in the context of a major project that started in 2009, during which extensive efforts have been exerted to acquire, clean, harmonize, preserve and disseminate micro data of existing labor force surveys in several Arab countries.
The survey covers a representative sample at the region level (West Bank, Gaza Strip), by locality type (urban, rural, camp) and at the governorate level.
1- Household/family. 2- Individual/person.
The survey covered all Palestinian households whose usual residence is in the Palestinian Territory.
Sample survey data [ssd]
The methodology was designed according to the context of the survey, international standards, data processing requirements and comparability of outputs with other related surveys.
All Palestinians aged 10 years or older living in the Palestinian Territory, excluding those living in institutions such as prisons or shelters.
The sampling frame consisted of a master sample of Enumeration Areas (EAs) selected from the 1997 Population, Housing and Establishment Census. The master sample consists of area units of relatively equal size (number of households); these units have been used as Primary Sampling Units (PSUs).
The sample is a two-stage stratified cluster random sample.
Stratification: Four levels of stratification were made:
The sample in the first round consisted of 7,563 households, which amounts to a sample of around 22,759 persons aged 15 years and over. In the second round the sample consisted of 7,563 households, which amounts to a sample of around 23,104 persons aged 15 years and over. In the third round the sample consisted of 7,563 households, which amounts to a sample of around 23,123 persons aged 15 years and over. In the fourth round the sample consisted of 7,563 households, which amounts to a sample of around 23,398 persons aged 15 years and over.
The sample size allowed for non-response and related losses. In addition, the average number of households selected in each cell was 16.
Each round of the Labor Force Survey covers all 481 master sample areas. Basically, the areas remain fixed over time, but households in 50% of the EAs are replaced each round. The same household remains in the sample for two consecutive rounds, rests for the next two rounds, and is represented again in the sample for a final two consecutive rounds before it is dropped from the sample. A 50% overlap is thus achieved both between consecutive rounds and between consecutive years (making the sample efficient for monitoring purposes). In earlier applications of the LFS (rounds 1 to 11), the rotation pattern used was different, requiring a household to remain in the sample for six consecutive rounds before being dropped. The objective of such a pattern was to increase the overlap between consecutive rounds. The new rotation pattern was introduced to reduce the burden on households of being visited six consecutive times.
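The 50% overlap claimed for the two-in, two-out, two-in rotation can be checked with a small simulation. The sketch below is an illustration only: it assumes equal-sized household cohorts, one new cohort entering per round, and four rounds per year (as in the 2005 survey); the function names are hypothetical.

```python
def cohorts_in_sample(round_no):
    """Entry rounds of the cohorts interviewed in a given round.

    Under the pattern described above, a cohort that enters in round t
    is interviewed in rounds t and t+1, rests in t+2 and t+3, and is
    interviewed again in t+4 and t+5 before being dropped.
    """
    interview_offsets = (0, 1, 4, 5)
    return {round_no - offset for offset in interview_offsets}


def overlap(round_a, round_b):
    """Share of round_a's cohorts that are also interviewed in round_b."""
    a = cohorts_in_sample(round_a)
    b = cohorts_in_sample(round_b)
    return len(a & b) / len(a)


print(overlap(20, 21))  # consecutive rounds
print(overlap(20, 24))  # same round one year later (four rounds per year assumed)
```

Both calls give an overlap of one half, matching the statement that half of the sample is shared between consecutive rounds and between consecutive years.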
Face-to-face [f2f]
One of the main survey tools is the questionnaire, which was designed according to the International Labour Organization (ILO) recommendations. The questionnaire includes four main parts:
Identification Data: The main objective for this part is to record the necessary information to identify the household, such as, cluster code, sector, type of locality, cell, housing number and the cell code.
Quality Control: This part involves groups of controlling standards to monitor the field and office operations and to keep the questionnaire stages in sequence (data collection, field and office coding, data entry, editing after entry, and data storage).
Household Roster: This part involves demographic characteristics about the household, like number of persons in the household, date of birth, sex, educational level…etc.
Employment Part: This part involves the major research indicators. One questionnaire was answered by every household member aged 15 years and over, in order to explore their labour force status and identify their major characteristics with respect to employment status, economic activity, occupation, place of work, and other employment indicators.
Data editing took place at a number of stages through the processing including: 1. office editing and coding 2. during data entry 3. structure checking and completeness 4. structural checking of SPSS data files
The overall response rate for the survey was 93.2%
More information on the distribution of response rates by survey round is available on page 12 of the data user guide provided among the disseminated survey materials under a file named "Palestine 2005- Data User Guide (English).pdf".
Since the data reported here are based on a sample survey and not on a complete enumeration, they are subject to sampling errors as well as non-sampling errors. Sampling errors are random outcomes of the sample design and are, therefore, in principle measurable by the statistical concept of standard error. A description of the estimated standard errors and the effects of the sample design on sampling errors is provided in the annual report provided among the disseminated survey materials under a file named "Palestine 2005- LFS Annual Report (Arabic).pdf".
Non-sampling errors can occur at the various stages of survey implementation, whether in data collection or in data processing. They are generally difficult to evaluate statistically. They cover a wide range of errors, including errors resulting from non-response, sampling frame coverage, coding and classification, data processing, and survey response (both respondent and interviewer-related). The use of effective training and supervision and the careful design of questions have a direct bearing on limiting the magnitude of non-sampling errors, and hence on enhancing the quality of the resulting data. The following are possible sources of non-sampling errors:
• Errors due to non-response because households were away from home or refused to participate. The overall non-response rate amounted to almost 12.1%, which is relatively low; much higher rates are rather common in an international perspective. The refusal rate was only 0.8%. It is difficult
The survey was conducted during December 2006, following an initial mini census listing exercise which was conducted about two months earlier in late September 2006. The objectives of the HIES were as follows: a) Provide information on income and expenditure distribution within the population; b) Provide income estimates of the household sector for the national accounts; c) Provide data for rebasing the consumer price index; d) Provide data for the analysis of poverty and hardship.
National coverage: whole island was covered for the survey.
The survey covered all private households on the island of Nauru. When the survey was in the field, interviewers were further required to reduce the scope by removing those households which had not been residing in Nauru for the last 12 months and did not intend to stay in Nauru for the next 12 months. Persons living in special dwellings (Hospital, Prison, etc) were not included in the survey.
Sample survey data [ssd]
The sample size adopted for the survey was 500 households which allowed for expected sample loss, whilst still maintaining a suitable responding sample for the analysis.
Before the sample was selected, the population was stratified by constituency in order to assist with the logistical issues associated with the fieldwork. There were eight constituencies in total, along with "Location", which stretches across the districts of Denigomodu and Aiwo, forming nine strata in total. Although constituency-level analysis was not a priority for the survey, sample sizes within each stratum were kept to a minimum of 40 households, to enable some basic forms of analysis at this level if required.
The sample selection procedure within each stratum was then to sort each household on the frame by household size (number of people), and then run a systematic skip through the list in order to achieve the desirable sample size.
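The within-stratum selection described above (sort the frame by household size, then take a systematic skip through the list) can be sketched as follows. This is an illustration under stated assumptions, not the survey's actual program: the random start within the first skip interval, the fractional skip, and the record layout (`id`, `size`) are all hypothetical choices.

```python
import random


def systematic_sample(frame, n, seed=None):
    """Select n households by a systematic skip through a size-sorted frame.

    frame: list of dicts with a "size" key (number of people); hypothetical layout.
    n:     desired sample size for the stratum (must not exceed len(frame)).
    """
    if not 0 < n <= len(frame):
        raise ValueError("n must be between 1 and the frame size")
    # Sorting by household size spreads the sample across small and large households.
    ordered = sorted(frame, key=lambda hh: hh["size"])
    skip = len(ordered) / n                     # fractional skip interval
    start = random.Random(seed).random() * skip  # assumed random start in [0, skip)
    last = len(ordered) - 1
    return [ordered[min(int(start + i * skip), last)] for i in range(n)]


# Hypothetical stratum frame of 100 households.
frame = [{"id": i, "size": (i % 7) + 1} for i in range(100)]
sample = systematic_sample(frame, 40, seed=1)
print(len(sample), [hh["size"] for hh in sample][:10])
```

Because the list is ordered by size before the skip is applied, the selected households follow the size distribution of the stratum more closely than a simple random draw of the same size would on average.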
No deviations from the sample design took place.
Face-to-face [f2f]
The survey schedules adopted for the Household Income and Expenditure Survey (HIES) included the following: · Expenditure questionnaire; · Income questionnaire; · Miscellaneous questionnaire; · Diary (x2).
Whilst a Household Control Form collecting basic demographics is also normally included with the survey, this wasn't required for this HIES as this activity took place for all households in the mini census.
Information collected in the four schedules covered the following: -Expenditure questionnaire: Covers basic details about the dwelling structure and its access to things like water and sanitation. It was also used as the vehicle to collect expenditure on major and infrequent expenditures incurred by the household. -Income questionnaire: Covers each of the main types of household income generated by the household such as wages and salaries, business income and income from subsistence activities. -Miscellaneous questionnaire: Covers topics relating to health access, labour force status and education. -Diary: Covers all day to day expenditures incurred by the household, consumption of items produced by the household such as fish and crops, and gifts both received and given by the household.
All questionnaires are provided as External Resources.
There were 3 phases to the editing process for the 2006 Household Income and Expenditure Survey (HIES) of Nauru which included: 1. Data Verification operations; 2. Data Editing operations; 3. Data Auditing operations.
The software used for data editing is CSPro 3.0. After each batch is completed, the supervisor should check that all person details have been entered from the household listing form (HCF) and should review the income and expenditure questionnaires for each batch, ensuring that all items have been entered correctly. Any omitted or incorrect items should be entered into the system. The supervisor is required to perform outlier checks (unusually large or small values) on the batched diary data by calculating the unit price (amount/quantity) and comparing prices for each item. This is to be conducted by loading the data into Excel files and sorting the data by unit price for each item. Any changes to prices or quantities will be made on the batch file.
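The unit-price outlier check described above was carried out by sorting in Excel. A rough programmatic equivalent is sketched below for illustration; the flagging rule (a unit price more than `factor` times above or below the item's median) and the record layout (`item`, `amount`, `quantity`) are assumptions, not part of the HIES processing instructions.

```python
from collections import defaultdict
from statistics import median


def flag_unit_price_outliers(records, factor=5.0):
    """Return (record, unit_price) pairs whose unit price looks unusual.

    records: list of dicts with "item", "amount" and "quantity" keys
             (hypothetical layout for diary entries).
    factor:  assumed tolerance; prices more than `factor` times above or
             below the item median are flagged for manual review.
    """
    prices = defaultdict(list)
    for rec in records:
        if rec["quantity"] > 0:
            prices[rec["item"]].append(rec["amount"] / rec["quantity"])
    medians = {item: median(values) for item, values in prices.items()}

    flagged = []
    for rec in records:
        if rec["quantity"] <= 0:
            continue  # unit price undefined; such entries need a separate check
        unit_price = rec["amount"] / rec["quantity"]
        mid = medians[rec["item"]]
        if mid > 0 and (unit_price > factor * mid or unit_price < mid / factor):
            flagged.append((rec, unit_price))
    return flagged


# Hypothetical diary entries: the last one looks like cents entered as dollars.
diary = [
    {"item": "rice", "amount": 2.0, "quantity": 1},
    {"item": "rice", "amount": 4.4, "quantity": 2},
    {"item": "rice", "amount": 1.8, "quantity": 1},
    {"item": "rice", "amount": 200.0, "quantity": 1},
]
for rec, price in flag_unit_price_outliers(diary):
    print(rec["item"], price)
```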
For more information on what each phase entailed, go to the document HIES Processing Instructions attached to this documentation.
The survey response rates were a lot lower than expected, especially in some districts. The districts of Aiwo, Uaboe and Denigomodu had the lowest response rates, with 16.7%, 20.0% and 34.8% respectively. The area of Location was also extremely low, with a response rate of 32.2%. On a more positive note, the districts of Yaren, Ewa, Anabar, Ijuw and Anibare all had response rates of 80.0% or better.
The major contributing factor to the low response rates was households refusing to take part in the survey. The response figures above include only fully responding households, and given that there were many partial responses, this also brought the values down. The other significant contributing factor to the low response rates was the interviewers not being able to make contact with the household during the survey period.
Unfortunately, low response rates not only often increase the sampling error of the survey estimates, because the final sample is smaller, but also introduce non-response bias into the final estimates. Non-response bias takes place when the households responding to the survey possess different characteristics from the households not responding, thus generating different results from those that would have been achieved if all selected households had responded. It is extremely difficult to measure the impact of non-response bias, as little information is generally known about the non-responding households in the survey. For the Nauru 2006 HIES, however, it was noted during the fieldwork that the Chinese population residing in Nauru was more likely not to respond. Given that their income and expenditure patterns are expected to differ from those of the rest of the population, this would contribute to the magnitude of the bias.
Below is the list of all response rates by district:
- Yaren: 80.5%
- Boe: 70%
- Aiwo: 16.7%
- Buada: 62.5%
- Denigomodu: 34.8%
- Nibok: 68.4%
- Uaboe: 20%
- Baitsi: 47.8%
- Ewa: 80%
- Anetan: 76.5%
- Anabar: 81.8%
- Ijuw: 85.7%
- Anibare: 80%
- Meneng: 64.3%
- Location: 32.2%
- TOTAL: 54.4%
To determine the impact of sampling error on the survey results, relative standard errors (RSEs) for key estimates were produced. When interpreting these results, one must remember that these figures do not include any of the non-sampling errors discussed in other sections of this documentation.
As a rough guide to interpreting the RSEs provided in the main report, the following categories can be used:
Category Description
RSE < 5% Estimate can be regarded as very reliable
5% < RSE < 10% Estimate can be regarded as good and usable
10% < RSE < 25% Estimate can be considered usable, with caution
RSE > 25% Estimate should only be used with extreme caution
The actual RSEs for the key estimates can be found in Section 4.1 of the main report.
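The guide above can be expressed as a small lookup. The sketch below is illustrative only: it uses the standard definition of the RSE (standard error divided by the estimate, as a percentage), and because the table leaves the exact boundary values (5%, 10%, 25%) unassigned, it assumes that a boundary value falls into the more cautious category. The function names are hypothetical.

```python
def rse_percent(estimate, standard_error):
    """Relative standard error: the standard error as a percentage of the estimate."""
    return 100.0 * standard_error / abs(estimate)


def reliability(rse):
    """Map an RSE (in percent) to the guide's categories.

    Assumption: values exactly on a boundary are placed in the more
    cautious category, since the guide does not say where they belong.
    """
    if rse < 5:
        return "very reliable"
    if rse < 10:
        return "good and usable"
    if rse < 25:
        return "usable, with caution"
    return "use only with extreme caution"


print(reliability(56.8))  # RSE reported for Business Income in this documentation
print(reliability(rse_percent(estimate=200.0, standard_error=6.0)))  # hypothetical figures
```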
As can be seen from these tables, the estimates for Total Income and Total Expenditure from the Household Income and Expenditure Survey (HIES) can be considered to be very good, from a sampling error perspective. The same can also be said for the Wage and Salary estimate in income and the Food estimate in expenditure, which make up a high proportion of each respective group.
Many of the other estimates should be used with caution, depending on the magnitude of their RSE. Some of these high RSEs are to be expected, due to the expected degree of variability for how households would report for these items. For example, with Business Income (RSE 56.8%), most households would report no business income as no household members undertook this activity, whereas other households would report large business incomes as it's their main source of income.
Other than the non-response issues discussed in this documentation, other quality issues were identified which included: 1) Reporting errors. Some of the different aspects contributing to the reporting errors generated from the survey, with some examples/explanations for each, include the following:
a) Misinterpretation of survey questions: A common mistake which takes place when conducting a survey is that the person responding to the questionnaire may interpret a question differently to the interviewer, who in turn may have interpreted the question differently to the people who designed the questionnaire. Some examples of this for a Household Income and Expenditure Survey (HIES) can include people providing answers in dollars and cents, instead of just dollars, or the reference/recall period for an “income” or “expenditure” is misunderstood. These errors can often see reported amounts out by a factor of 10 or even 100, which can have major impacts on final results.
b) Recall problems for the questionnaire information: The majority of questions in both of the income and expenditure questionnaires require the respondent to recall what took place over a 12 month period. As would be expected, people will often forget what took place up to 12 months ago so some
Note: Data on gender diverse households (formerly "2SLGBTQ+" households) has been added as of March 28th, 2025.
For more information, please visit HART.ubc.ca.
This dataset contains 18 tables which draw upon data from the 2021 Canadian Census of Population. The tables are a custom order and contain data pertaining to core housing need and characteristics of households and dwellings. This custom order was placed in collaboration with Housing, Infrastructure and Communities Canada to fill data gaps in their Housing Needs Assessment Template.
17 of the tables each cover a different geography in Canada: one for Canada as a whole, one for all Canadian census divisions (CD), and 15 for all census subdivisions (CSD) across Canada. The 18th table contains the median income for all geographies. Statistics Canada used these median incomes as the "area median household income (AMHI)," from which they derived some of the data fields within the Shelter Costs/Household Income dimension.
The dataset is in Beyond 20/20 (.ivt) format. The Beyond 20/20 browser is required in order to open it. This software can be freely downloaded from the Statistics Canada website: https://www.statcan.gc.ca/eng/public/beyond20-20 (Windows only). For information on how to use Beyond 20/20, please see: http://odesi2.scholarsportal.info/documentation/Beyond2020/beyond20-quickstart.pdf https://wiki.ubc.ca/Library:Beyond_20/20_Guide
Custom order from Statistics Canada includes the following dimensions and data fields:
Geography:
- Country of Canada, all CDs & Country as a whole
- All 10 Provinces (Newfoundland, Prince Edward Island (PEI), Nova Scotia, New Brunswick, Quebec, Ontario, Manitoba, Saskatchewan, Alberta, and British Columbia), all CSDs & each Province as a whole
- All 3 Territories (Nunavut, Northwest Territories, Yukon), all CSDs & each Territory as a whole
*- Data on gender diverse households is only available for geographies (provinces, territories, CDs, CSDs) with a population count greater than 50,000.
Data Quality and Suppression:
- The global non-response rate (GNR) is an important measure of census data quality. It combines total non-response (households) and partial non-response (questions). A lower GNR indicates a lower risk of non-response bias and, as a result, a lower risk of inaccuracy. The counts and estimates for geographic areas with a GNR equal to or greater than 50% are not published in the standard products. The counts and estimates for these areas have a high risk of non-response bias, and in most cases, should not be released.
- Area suppression is used to replace all income characteristic data with an 'x' for geographic areas with populations and/or number of households below a specific threshold. If a tabulation contains quantitative income data (e.g., total income, wages), qualitative data based on income concepts (e.g., low income before tax status) or derived data based on quantitative income variables (e.g., indexes) for individuals, families or households, then the following rule applies: income characteristic data are replaced with an 'x' for areas where the population is less than 250 or where the number of private households is less than 40.
Source: Statistics Canada
- When showing count data, Statistics Canada employs random rounding to reduce the possibility of identifying individuals within the tabulations, which matters most for very small (sub)populations. Random rounding transforms all raw counts into randomly rounded counts. All counts greater than 10 are rounded to a base of 5, meaning they will end in either 0 or 5. The random rounding algorithm rounds the unit value of the count up or down according to a predetermined frequency. Counts already ending in 0 or 5 are not changed. Counts less than 10 are rounded to a base of 10, meaning they will be rounded to either 10 or zero.
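The suppression and rounding rules above can be illustrated with a short Python sketch. It is not Statistics Canada's algorithm: the source only says that rounding follows "a predetermined frequency", so the round-up probability used here (remainder divided by base) is an assumption, as are the function names `random_round` and `suppress_income`. The sketch also applies the base-10 rule to every count under 10, which is one possible reading of the text.

```python
import random


def random_round(count, rng):
    """Randomly round a non-negative integer count.

    Counts of 10 or more use base 5; counts under 10 use base 10.
    """
    base = 10 if count < 10 else 5
    remainder = count % base
    if remainder == 0:
        return count  # already a multiple of the base, left unchanged
    lower = count - remainder
    # Assumption: round up with probability remainder/base, so the
    # rounded value is unbiased in expectation.
    if rng.random() < remainder / base:
        return lower + base
    return lower


def suppress_income(value, population, private_households):
    """Replace an income value with 'x' when the area is below either threshold."""
    if population < 250 or private_households < 40:
        return "x"
    return value


if __name__ == "__main__":
    rng = random.Random(2021)  # seeded only to make the demo repeatable
    print([random_round(c, rng) for c in (3, 7, 12, 20, 48)])
    print(suppress_income(52000, population=180, private_households=75))
```

Because each cell is rounded independently, rounded components will not always sum to the rounded total, which is worth remembering when aggregating the tables.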
Universe:
Private Households in Non-farm Non-band Off-reserve Occupied Private Dwellings with Income Greater than zero.
Households examined for Core Housing Need:
Private, non-farm, non-reserve, owner- or renter-households with incomes greater than zero and shelter-cost-to-income ratios less than 100% are assessed for 'Core Housing Need.' Non-family Households with at least one household maintainer aged 15 to 29 attending school are considered not to be in Core Housing Need, regardless of their housing circumstances.
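As a rough illustration of the screening rule just described, the following Python sketch checks whether a household would be examined for Core Housing Need. The function and parameter names are hypothetical, and income and shelter cost are assumed to be expressed over the same period (for example, both annual). It covers only the eligibility test, not the adequacy, suitability, or affordability standards, and not the student-maintainer exception.

```python
def examined_for_core_housing_need(income, shelter_cost,
                                   is_farm=False, on_reserve=False):
    """Return True if the household meets the screening rule in the text."""
    if is_farm or on_reserve:
        return False
    if income <= 0:
        return False  # only households with income greater than zero
    # Shelter-cost-to-income ratio must be below 100%.
    return shelter_cost / income < 1.0


if __name__ == "__main__":
    print(examined_for_core_housing_need(50_000, 15_000))
    print(examined_for_core_housing_need(10_000, 12_000))
```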
Data Fields:
Tenure Including Presence of Mortgage and Subsidized Housing; Household size (7)
1. Total - Private households by tenure including presence of mortgage payments and subsidized housing
2. Owner
3. With mortgage
4. Without mortgage
5. Renter
6. Subsidized housing
7. Not subsidized housing
Housing indicators in Core Housing Universe (12)
1. Total - Private Households by core housing need status
2. Households examined for core housing need
3. Households in core housing need
4. Below one standard only
5. Below affordability standard only
6. Below adequacy standard only
7. Below suitability standard only
8. Below 2 or more standards
9. Below affordability and suitability
10. Below affordability and adequacy
11. Below suitability and adequacy
12. Below affordability, suitability, and adequacy
Period of construction (10)
1. Total – Period of Construction
2. Before 2016
3. 1960 or before
4. 1961 to 1980
5. 1981 to 1990
6. 1991 to 2000
7. 2001 to 2005
8. 2006 to 2010
9. 2011 to 2015
10. 2016 to 2021 (Note 1)
Note 1: Includes data up to May 11, 2021.
Structural type of dwelling and Household income as proportion to AMHI (16)
1. Total - Structural type of dwelling
2. Single-detached house
3. Apartment in a building that has five or more storeys
4. Other attached dwelling
5. Apartment or flat in a duplex
6. Apartment in a building that has fewer than five storeys
7. Other single-attached house
8. Row house
9. Semi-detached house
10. Movable dwelling
11. Total – Private households by household income proportion to AMHI
12. Households with income 20% or under of area median household income (AMHI)
13. Households with income 21% to 50% of AMHI
14. Households with income 51% to 80% of AMHI
15. Households with income 81% to 120% of AMHI
16. Households with income 121% or more of AMHI
Selected characteristics (12)
1. Total – Private households by presence of activity limitation (Q18e only)
2. HH has at least one person who had an activity limitation reported for Question 18 e) only
3. Total – Age of primary household maintainer
4. 18 to 29 years
5. Total – Private households by military service status of the HH members
6. HH includes a person who is a currently serving member and/or veteran
11. Total – Private households by shelter cost proportion to AMHI_1
12. Households with shelter cost 0.5% and under of AMHI
13. Households with shelter cost 0.6% to 1.25% of AMHI
14. Households with shelter cost 1.26% to 2% of AMHI
15. Households with shelter cost 2.1% to 3% of AMHI
16. Households with shelter cost 3.1% or more of AMHI*
Median income (2)
1. Number of households
2. Median income of household ($)
The household median incomes in the custom tabulation are estimates from 25% sample-based data that have undergone weighting. These weights were applied to the sample data to produce estimates from the census long-form sample. The incomes were drawn from the previous tax year and therefore represent 2020 dollars.
[Only in "Census 2021 - Gender Diverse HHs" file] Genderdiversity (2)
1. Total - Gender diversity status of households
2. HH is gender diverse
File list (19 total):
Original data files (18):
1. Census 2021 - Table 1 - Median Incomes.ivt
2. Census 2021 - Table 2 - Canada.ivt
3. Census 2021 - Table 3 - Census Divisions.ivt
4. Census 2021 - Table 4 - Ontario CSDs.ivt
5. Census 2021 - Table 5 - BC CSDs.ivt
6. Census 2021 - Table 6 - Alberta CSDs.ivt
7. Census 2021 - Table 7 - Manitoba CSDs.ivt
8. Census 2021 - Table 8 - Saskatchewan CSDs.ivt
9. Census 2021 - Table 9-1 - Quebec CSDs (Part 1 of 3).ivt
10. Census 2021 - Table 9-2 - Quebec CSDs (Part 2 of 3).ivt
11. Census 2021 - Table 9-3 - Quebec CSDs (Part 3 of 3).ivt
12. Census 2021 - Table 10 - Newfoundland&Labrador CSDs.ivt
13. Census 2021 - Table 11 - PEI CSDs.ivt
14. Census 2021 - Table 12 - Nova Scotia CSDs.ivt
15. Census 2021 - Table 13 - New Brunswick CSDs.ivt
16. Census 2021 - Table 14 - Yukon CSDs.ivt
17. Census 2021 - Table 15 - NWT CSDs.ivt
18. Census 2021 - Table 16 - Nunavut CSDs.ivt
19. Census 2021 - Gender Diverse HHs.ivt
For more information, please visit HART.ubc.ca.
This dataset contains 18 tables based on the data
https://opendata.vancouver.ca/pages/licence/
The census is Canada's largest and most comprehensive data source, conducted by Statistics Canada every five years. The Census of Population collects demographic and linguistic information on every man, woman and child living in Canada. The data shown here are provided by Statistics Canada from the 2001 Census as a custom profile data order for the City of Vancouver, using the City's 22 local planning areas. The data may be reproduced provided they are credited to Statistics Canada, Census 2001, custom order for City of Vancouver Local Areas.
Data Access
This dataset has not yet been converted to a format compatible with our new platform. Please use the links below to access the files from our legacy site.
- Census local area profiles 2001 (CSV)
- Census local area profiles 2001 (XLS)
Dataset schema (Attributes)
Please see the Census local area profiles 2001 attributes page.
Note
The 22 Local Areas are defined by the Census blocks, are equal to the City's 22 local planning areas, and include the Musqueam 2 reserve.
Vancouver CSD (Census Subdivision) is defined by the City of Vancouver municipal boundary, which excludes the Musqueam 2 reserve but includes Stanley Park.
Vancouver CMA (Census Metropolitan Area) is defined by the Metro Vancouver boundary, which includes the following Census Subdivisions: Vancouver, Surrey, Burnaby, Richmond, Coquitlam, District of Langley, Delta, District of North Vancouver, Maple Ridge, New Westminster, Port Coquitlam, City of North Vancouver, West Vancouver, Port Moody, City of Langley, White Rock, Pitt Meadows, Greater Vancouver A, Bowen Island, Capilano 5, Anmore, Musqueam 2, Burrard Inlet 3, Lions Bay, Tsawwassen, Belcarra, Mission 1, Matsqui 4, Katzie 1, Semiahmoo, Seymour Creek 2, McMillan Island 6, Coquitlam 1, Musqueam 4, Coquitlam 2, Katzie 2, Whonnock 1, Barnston Island 3, and Langley 5.
Data products that are identified as 20% sample data refer to information that was collected using the long census questionnaire. For the most part, these data were collected from 20% of the households; however, they also include some areas, such as First Nations communities and remote areas, where long census form data were collected from 100% of the households.
The following changes were made to the census family concept for 2001 and account for some of the increase in the total number of families, single parent families and children living at home:
- Two persons living in a same-sex common-law relationship are now considered a family.
- Children living at home now include previously married children, provided they are not currently living with a spouse or common-law partner.
- A grandchild living in a three-generation household where the parent (middle generation) was never married is now considered a child of the census family.
- A grandchild of a three-generation household where the middle generation is not present is now considered a child of the census family.
Mode of transportation to work data is not reliable for the 2001 Census due to the TransLink Transit Strike that occurred during the data collection period.
Data currency
The data for Census 2001 were collected in May 2001.
Data accuracy
Statistics Canada is committed to protecting the privacy of all Canadians and the confidentiality of the data they provide to us. As part of this commitment, some population counts of geographic areas are adjusted in order to ensure confidentiality. Counts of the total population are rounded to a base of 5 for any dissemination block having a population less than 15. Population counts for all standard geographic areas above the dissemination block level are derived by summing the adjusted dissemination block counts. The adjustment of dissemination block counts is controlled to ensure that the population counts for dissemination areas will always be within 5 of the actual values. The adjustment has no impact on the population counts of census divisions and large census subdivisions.
Websites for further information Statistics Canada 2001 Census Dictionary Local area boundary dataset
The 2023 Jordan Population and Family Health Survey (JPFHS) is the eighth Population and Family Health Survey conducted in Jordan, following those conducted in 1990, 1997, 2002, 2007, 2009, 2012, and 2017–18. It was implemented by the Department of Statistics (DoS) at the request of the Ministry of Health (MoH).
The primary objective of the 2023 JPFHS is to provide up-to-date estimates of key demographic and health indicators. Specifically, the 2023 JPFHS:
• Collected data at the national level that allowed calculation of key demographic indicators
• Explored the direct and indirect factors that determine levels of and trends in fertility and childhood mortality
• Measured contraceptive knowledge and practice
• Collected data on key aspects of family health, including immunisation coverage among children, prevalence and treatment of diarrhoea and other diseases among children under age 5, and maternity care indicators such as antenatal visits and assistance at delivery
• Obtained data on child feeding practices, including breastfeeding, and conducted anthropometric measurements to assess the nutritional status of children under age 5 and women age 15–49
• Conducted haemoglobin testing with eligible children age 6–59 months and women age 15–49 to gather information on the prevalence of anaemia
• Collected data on women’s and men’s knowledge and attitudes regarding sexually transmitted infections and HIV/AIDS
• Obtained data on women’s experience of emotional, physical, and sexual violence
• Gathered data on disability among household members
The information collected through the 2023 JPFHS is intended to assist policymakers and programme managers in evaluating and designing programmes and strategies for improving the health of the country’s population. The survey also provides indicators relevant to the Sustainable Development Goals (SDGs) for Jordan.
National coverage
The survey covered all de jure household members (usual residents), all women aged 15-49, men aged 15-59, and all children aged 0-4 resident in the household.
Sample survey data [ssd]
The sampling frame used for the 2023 JPFHS was the 2015 Jordan Population and Housing Census (JPHC) frame. The survey was designed to produce representative results for the country as a whole, for urban and rural areas separately, for each of the country’s 12 governorates, and for four nationality domains: the Jordanian population, the Syrian population living in refugee camps, the Syrian population living outside of camps, and the population of other nationalities. Each of the 12 governorates is subdivided into districts, each district into subdistricts, each subdistrict into localities, and each locality into areas and subareas. In addition to these administrative units, during the 2015 JPHC each subarea was divided into convenient area units called census blocks. An electronic file of a complete list of all of the census blocks is available from DoS. The list contains census information on households, populations, geographical locations, and socioeconomic characteristics of each block. Based on this list, census blocks were regrouped to form a general statistical unit of moderate size, called a cluster, which is widely used in various surveys as the primary sampling unit (PSU). The sample clusters for the 2023 JPFHS were selected from the frame of cluster units provided by the DoS.
The sample for the 2023 JPFHS was a stratified sample selected in two stages from the 2015 census frame. Stratification was achieved by separating each governorate into urban and rural areas. In addition, the Syrian refugee camps in Zarqa and Mafraq each formed a special sampling stratum. In total, 26 sampling strata were constructed. Samples were selected independently in each sampling stratum, through a two-stage selection process, according to the sample allocation. Before the sample selection, the sampling frame was sorted by district and subdistrict within each sampling stratum. By using a probability proportional to size selection at the first stage of sampling, an implicit stratification and proportional allocation were achieved at each of the lower administrative levels.
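A first-stage selection of this kind can be sketched in Python. The sketch assumes systematic selection along the sorted frame, which is a common way to obtain implicit stratification but is not stated in the text; the function name `pps_systematic` and the cluster sizes are hypothetical.

```python
import random


def pps_systematic(sizes, n, rng):
    """Select n units with probability proportional to size.

    `sizes` lists the measure of size (e.g. households) for each cluster,
    already sorted by district and subdistrict within one stratum.
    Returns the indices of the selected clusters.
    """
    total = sum(sizes)
    interval = total / n
    start = rng.random() * interval          # random start in [0, interval)
    points = [start + i * interval for i in range(n)]

    selected = []
    cumulative = 0
    next_point = 0
    for index, size in enumerate(sizes):
        cumulative += size
        # A cluster is hit once for every selection point inside its range.
        while next_point < n and points[next_point] < cumulative:
            selected.append(index)
            next_point += 1
    return selected


if __name__ == "__main__":
    rng = random.Random(2023)  # seeded only to make the demo repeatable
    cluster_sizes = [120, 80, 200, 150, 90, 110, 130, 70, 160, 140]
    print(pps_systematic(cluster_sizes, n=3, rng=rng))
```

Because the selection points are evenly spaced along the sorted list, the sample is spread across districts and subdistricts roughly in proportion to their size. A cluster larger than the sampling interval can be hit more than once, which real designs usually handle separately.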
For further details on sample design, see APPENDIX A of the final report.
Computer Assisted Personal Interview [capi]
Five questionnaires were used for the 2023 JPFHS: (1) the Household Questionnaire, (2) the Woman’s Questionnaire, (3) the Man’s Questionnaire, (4) the Biomarker Questionnaire, and (5) the Fieldworker Questionnaire. The questionnaires, based on The DHS Program’s model questionnaires, were adapted to reflect the population and health issues relevant to Jordan. Input was solicited from various stakeholders representing government ministries and agencies, nongovernmental organisations, and international donors. After all questionnaires were finalised in English, they were translated into Arabic.
All electronic data files for the 2023 JPFHS were transferred via SynCloud to the DoS central office in Amman, where they were stored on a password-protected computer. The data processing operation included secondary editing, which required resolution of computer-identified inconsistencies and coding of open-ended questions. Data editing was accomplished using CSPro software. During fieldwork, tables were generated to check various data quality parameters, and specific feedback was given to the teams to improve performance. Secondary editing and data processing were initiated in July and completed in September 2023.
A total of 20,054 households were selected for the sample, of which 19,809 were occupied. Of the occupied households, 19,475 were successfully interviewed, yielding a response rate of 98%.
In the interviewed households, 13,020 eligible women age 15–49 were identified for individual interviews; interviews were completed with 12,595 women, yielding a response rate of 97%. In the subsample of households selected for the male survey, 6,506 men age 15–59 were identified as eligible for individual interviews and 5,873 were successfully interviewed, yielding a response rate of 90%.
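The response rates quoted above follow directly from the counts given. A minimal Python check, with `response_rate` as an illustrative helper name:

```python
def response_rate(completed, eligible):
    """Return the response rate as a percentage."""
    return 100.0 * completed / eligible


if __name__ == "__main__":
    counts = {
        "households": (19_475, 19_809),  # interviewed / occupied
        "women 15-49": (12_595, 13_020),  # interviewed / eligible
        "men 15-59": (5_873, 6_506),      # interviewed / eligible
    }
    for label, (completed, eligible) in counts.items():
        print(f"{label}: {response_rate(completed, eligible):.1f}%")
```

Note that the household rate is computed against occupied households (19,809), not against all 20,054 selected households.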
The estimates from a sample survey are affected by two types of errors: nonsampling errors and sampling errors. Nonsampling errors are the results of mistakes made in implementing data collection and in data processing, such as failure to locate and interview the correct household, misunderstanding of the questions on the part of either the interviewer or the respondent, and data entry errors. Although numerous efforts were made during the implementation of the 2023 Jordan Population and Family Health Survey (2023 JPFHS) to minimise this type of error, nonsampling errors are impossible to avoid and difficult to evaluate statistically.
Sampling errors, on the other hand, can be evaluated statistically. The sample of respondents selected in the 2023 JPFHS is only one of many samples that could have been selected from the same population, using the same design and sample size. Each of these samples would yield results that differ somewhat from the results of the actual sample selected. Sampling errors are a measure of the variability among all possible samples. Although the degree of variability is not known exactly, it can be estimated from the survey results.
Sampling error is usually measured in terms of the standard error for a particular statistic (mean, percentage, etc.), which is the square root of the variance. The standard error can be used to calculate confidence intervals within which the true value for the population can reasonably be assumed to fall. For example, for any given statistic calculated from a sample survey, the value of that statistic will fall within a range of plus or minus two times the standard error of that statistic in 95% of all possible samples of identical size and design.
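As a small worked example of the plus-or-minus two standard errors rule, the Python sketch below builds an approximate 95% confidence interval. The estimate and standard error used are made-up illustrative numbers, not values from the survey.

```python
def confidence_interval(estimate, standard_error, multiplier=2.0):
    """Return (lower, upper) as estimate -/+ multiplier * standard error."""
    margin = multiplier * standard_error
    return estimate - margin, estimate + margin


if __name__ == "__main__":
    # Hypothetical indicator: 50.0% with a standard error of 1.5 points.
    lower, upper = confidence_interval(50.0, 1.5)
    print(f"Approximate 95% CI: {lower:.1f}% to {upper:.1f}%")
```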
If the sample of respondents had been selected by simple random sampling, it would have been possible to use straightforward formulas for calculating sampling errors. However, the 2023 JPFHS sample was the result of a multistage stratified design, and, consequently, it was necessary to use more complex formulas. Sampling errors are computed using SAS programs developed by ICF. These programs use the Taylor linearisation method to estimate variances for survey estimates that are means, proportions, or ratios. The Jackknife repeated replication method is used for variance estimation of more complex statistics such as fertility and mortality rates.
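To give a feel for repeated replication, here is a minimal Python sketch of a delete-one-cluster jackknife for a ratio estimate. It is a generic textbook form, not the ICF SAS programs, and it ignores stratification and sampling weights for brevity; the function name and the cluster totals are hypothetical.

```python
import math


def jackknife_ratio_se(clusters):
    """Estimate a ratio r = sum(y) / sum(x) and its jackknife standard error.

    `clusters` is a list of (y_total, x_total) pairs, one per cluster.
    Each replicate drops one cluster and recomputes the ratio.
    """
    k = len(clusters)
    total_y = sum(y for y, _ in clusters)
    total_x = sum(x for _, x in clusters)
    r = total_y / total_x

    squared_deviations = 0.0
    for y, x in clusters:
        r_without = (total_y - y) / (total_x - x)   # replicate estimate
        pseudo_value = k * r - (k - 1) * r_without
        squared_deviations += (pseudo_value - r) ** 2

    variance = squared_deviations / (k * (k - 1))
    return r, math.sqrt(variance)


if __name__ == "__main__":
    # Hypothetical (events, exposure) totals for five clusters.
    data = [(12, 100), (9, 95), (15, 110), (8, 90), (11, 105)]
    ratio, se = jackknife_ratio_se(data)
    print(f"ratio = {ratio:.4f}, jackknife SE = {se:.4f}")
```

The appeal of replication is that the same loop works for statistics with no simple variance formula, which is why it suits rates such as fertility and mortality.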
A more detailed description of estimates of sampling errors is presented in APPENDIX B of the survey report.
Data Quality Tables
https://search.gesis.org/research_data/datasearch-httpwww-da-ra-deoaip--oaioai-da-ra-de438965
Abstract (en): The American Time Use Survey (ATUS) collects information on how people living in the United States spend their time. Data collected in this study measured the amount of time that people spent doing various activities in 2005, such as paid work, child care, religious activities, volunteering, and socializing. Respondents were randomly selected from households that had completed their final month of the Current Population Survey (CPS), and were interviewed two to five months after their household's last CPS interview. Respondents were interviewed only once and reported their activities for the 24-hour period from 4 a.m. on the day before the interview until 4 a.m. on the day of the interview. Respondents indicated the total number of minutes spent on each activity, including where they were and whom they were with. Except for secondary child care, data on activities done simultaneously with primary activities were not collected. Part 1, Respondent and Activity Summary File, contains demographic information about respondents and a summary of the total amount of time they spent doing each activity that day. Part 2, Roster File, contains information about household members and nonhousehold children under the age of 18. Part 3, Activity File, includes additional information on activities in which respondents participated, including the location of each activity and the total time spent on secondary child care. Part 4, Who File, includes data on who was present during each activity. Part 5, ATUS-CPS 2005 File, contains data on respondents and members of their household collected two to five months prior to the ATUS interviews during their participation in the Current Population Survey (CPS). Parts 6-10 contain supplemental data files that can be used for further analysis of the data. Part 6, Case History File, contains information about the interview process, such as identifiers and interview outcome codes. 
Part 7, Call History File, gives information about each call attempt, including the call date and outcome. Part 8, Trips File, provides information about the number, duration, and purpose of overnight trips away from home for two or more nights in a row. Part 9, Replicate Weights File I, contains base weights, replicated base weights, and replicate final weights for each case that was selected to be interviewed for ATUS, while Part 10, Replicate Weights File II, contains replicate weights that were generated using the 2006 weighting method. Demographic variables include sex, age, race, ethnicity, education level, income, employment status, occupation, citizenship status, country of origin, relationship to household members, and the ages and number of children in the household. The data contain weight variables which should be used in analyzing the data. Unweighted data are not representative of the population due to differences between population groups in both sampling and nonresponse. ATUS weight variables include the ATUS final weight (TUFINLWGT), which indicates the number of person-days the respondent represents, the ATUS base weight (TUBWGT), and an ATUS final weight based on 2006 weighting methodology (TU06FWGT). ATUS weights were selected from the Current Population Survey (CPS), and CPS weights (after the first-stage adjustment) are the basis for the ATUS weights. These base weights were adjusted to account for the fact that less populous states were not oversampled in ATUS, as they were in the CPS. Further adjustments were made to account for the probability of selecting each household within the ATUS sampling strata and the probability of selecting each person from each sample household. Part 9 contains replicate weights for the variable TUFINLWGT, as well as base weights, while Part 10 contains replicate weights for the variable TU06FWGT. ATUS replicate weights were based on the replicate weights developed for the CPS. 
ATUS began with the CPS replicate weight after the first-stage ratio adjustment, and each replicate was processed through all of the stages of the ATUS weighting procedure. The CPS replicate weights were based on a modified balanced half-sample method of replication, developed in the 1980s by Robert Fay. For more information about the replicate weights, see the publication, Technical Paper 63RV: Current Population Survey -- Design and Methodology, available via the Bureau of Labor Statistics Web site. More information on the weighting variables used in this study can be found in t...
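How a final weight and a set of replicate weights are typically used together can be sketched in Python. This is a generic illustration of Fay's modified balanced half-sample variance formula, not code from the study documentation. The Fay coefficient of 0.5 and the toy weights are assumptions, and the number of replicates should be taken from the replicate weight files themselves.

```python
def weighted_mean(values, weights):
    """Weighted mean, e.g. minutes per day weighted by TUFINLWGT."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


def replicate_variance(values, final_weights, replicate_weights, fay_k=0.5):
    """Variance of the weighted mean using Fay's method.

    `replicate_weights` is a list of weight vectors, one per replicate.
    `fay_k` is the Fay coefficient (0.5 is assumed here, not sourced).
    """
    full_estimate = weighted_mean(values, final_weights)
    replicate_estimates = [weighted_mean(values, w) for w in replicate_weights]
    n_replicates = len(replicate_weights)
    sum_sq = sum((est - full_estimate) ** 2 for est in replicate_estimates)
    return sum_sq / (n_replicates * (1.0 - fay_k) ** 2)


if __name__ == "__main__":
    minutes = [30, 0, 120, 45]               # hypothetical activity minutes
    final_w = [1000.0, 1500.0, 800.0, 1200.0]
    reps = [                                  # hypothetical replicate weights
        [1500.0, 750.0, 1200.0, 600.0],
        [500.0, 2250.0, 400.0, 1800.0],
    ]
    estimate = weighted_mean(minutes, final_w)
    variance = replicate_variance(minutes, final_w, reps)
    print(f"mean = {estimate:.2f}, standard error = {variance ** 0.5:.2f}")
```

Each replicate estimate is computed exactly like the full-sample estimate, only with a different weight column, so any statistic that can be computed with TUFINLWGT can be given a standard error this way.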