Facebook
TwitterA random sample of households were invited to participate in this survey. In the dataset, you will find the respondent level data in each row with the questions in each column. The numbers represent a scale option from the survey, such as 1=Excellent, 2=Good, 3=Fair, 4=Poor. The question stem, response option, and scale information for each field can be found in the var "variable labels" and "value labels" sheets. VERY IMPORTANT NOTE: The scientific survey data were weighted, meaning that the demographic profile of respondents was compared to the demographic profile of adults in Bloomington from US Census data. Statistical adjustments were made to bring the respondent profile into balance with the population profile. This means that some records were given more "weight" and some records were given less weight. The weights that were applied are found in the field "wt". If you do not apply these weights, you will not obtain the same results as can be found in the report delivered to the Bloomington. The easiest way to replicate these results is likely to create pivot tables, and use the sum of the "wt" field rather than a count of responses.
Facebook
TwitterAttribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
N.B. This is not real data. Only here for an example for project templates.
Project Title: Add title here
Project Team: Add contact information for research project team members
Summary: Provide a descriptive summary of the nature of your research project and its aims/focal research questions.
Relevant publications/outputs: When available, add links to the related publications/outputs from this data.
Data availability statement: If your data is not linked on figshare directly, provide links to where it is being hosted here (i.e., Open Science Framework, Github, etc.). If your data is not going to be made publicly available, please provide details here as to the conditions under which interested individuals could gain access to the data and how to go about doing so.
Data collection details: 1. When was your data collected? 2. How were your participants sampled/recruited?
Sample information: How many and who are your participants? Demographic summaries are helpful additions to this section.
Research Project Materials: What materials are necessary to fully reproduce your the contents of your dataset? Include a list of all relevant materials (e.g., surveys, interview questions) with a brief description of what is included in each file that should be uploaded alongside your datasets.
List of relevant datafile(s): If your project produces data that cannot be contained in a single file, list the names of each of the files here with a brief description of what parts of your research project each file is related to.
Data codebook: What is in each column of your dataset? Provide variable names as they are encoded in your data files, verbatim question associated with each response, response options, details of any post-collection coding that has been done on the raw-response (and whether that's encoded in a separate column).
Examples available at: https://www.thearda.com/data-archive?fid=PEWMU17 https://www.thearda.com/data-archive?fid=RELLAND14
Facebook
TwitterThe 1997 Jordan Population and Family Health Survey (JPFHS) is a national sample survey carried out by the Department of Statistics (DOS) as part of its National Household Surveys Program (NHSP). The JPFHS was specifically aimed at providing information on fertility, family planning, and infant and child mortality. Information was also gathered on breastfeeding, on maternal and child health care and nutritional status, and on the characteristics of households and household members. The survey will provide policymakers and planners with important information for use in formulating informed programs and policies on reproductive behavior and health.
National
Sample survey data
SAMPLE DESIGN AND IMPLEMENTATION
The 1997 JPFHS sample was designed to produce reliable estimates of major survey variables for the country as a whole, for urban and rural areas, for the three regions (each composed of a group of governorates), and for the three major governorates, Amman, Irbid, and Zarqa.
The 1997 JPFHS sample is a subsample of the master sample that was designed using the frame obtained from the 1994 Population and Housing Census. A two-stage sampling procedure was employed. First, primary sampling units (PSUs) were selected with probability proportional to the number of housing units in the PSU. A total of 300 PSUs were selected at this stage. In the second stage, in each selected PSU, occupied housing units were selected with probability inversely proportional to the number of housing units in the PSU. This design maintains a self-weighted sampling fraction within each governorate.
UPDATING OF SAMPLING FRAME
Prior to the main fieldwork, mapping operations were carried out and the sample units/blocks were selected and then identified and located in the field. The selected blocks were delineated and the outer boundaries were demarcated with special signs. During this process, the numbers on buildings and housing units were updated, listed and documented, along with the name of the owner/tenant of the unit or household and the name of the household head. These activities took place between January 7 and February 28, 1997.
Note: See detailed description of sample design in APPENDIX A of the survey report.
Face-to-face
The 1997 JPFHS used two questionnaires, one for the household interview and the other for eligible women. Both questionnaires were developed in English and then translated into Arabic. The household questionnaire was used to list all members of the sampled households, including usual residents as well as visitors. For each member of the household, basic demographic and social characteristics were recorded and women eligible for the individual interview were identified. The individual questionnaire was developed utilizing the experience gained from previous surveys, in particular the 1983 and 1990 Jordan Fertility and Family Health Surveys (JFFHS).
The 1997 JPFHS individual questionnaire consists of 10 sections: - Respondent’s background - Marriage - Reproduction (birth history) - Contraception - Pregnancy, breastfeeding, health and immunization - Fertility preferences - Husband’s background, woman’s work and residence - Knowledge of AIDS - Maternal mortality - Height and weight of children and mothers.
Fieldwork and data processing activities overlapped. After a week of data collection, and after field editing of questionnaires for completeness and consistency, the questionnaires for each cluster were packaged together and sent to the central office in Amman where they were registered and stored. Special teams were formed to carry out office editing and coding.
Data entry started after a week of office data processing. The process of data entry, editing, and cleaning was done by means of the ISSA (Integrated System for Survey Analysis) program DHS has developed especially for such surveys. The ISSA program allows data to be edited while being entered. Data entry was completed on November 14, 1997. A data processing specialist from Macro made a trip to Jordan in November and December 1997 to identify problems in data entry, editing, and cleaning, and to work on tabulations for both the preliminary and final report.
A total of 7,924 occupied housing units were selected for the survey; from among those, 7,592 households were found. Of the occupied households, 7,335 (97 percent) were successfully interviewed. In those households, 5,765 eligible women were identified, and complete interviews were obtained with 5,548 of them (96 percent of all eligible women). Thus, the overall response rate of the 1997 JPFHS was 93 percent. The principal reason for nonresponse among the women was the failure of interviewers to find them at home despite repeated callbacks.
Note: See summarized response rates by place of residence in Table 1.1 of the survey report.
The estimates from a sample survey are subject to two types of errors: nonsampling errors and sampling errors. Nonsampling errors are the result of mistakes made in implementing data collection and data processing (such as failure to locate and interview the correct household, misunderstanding questions either by the interviewer or the respondent, and data entry errors). Although during the implementation of the 1997 JPFHS numerous efforts were made to minimize this type of error, nonsampling errors are not only impossible to avoid but also difficult to evaluate statistically.
Sampling errors, on the other hand, can be evaluated statistically. The respondents selected in the 1997 JPFHS constitute only one of many samples that could have been selected from the same population, given the same design and expected size. Each of those samples would have yielded results differing somewhat from the results of the sample actually selected. Sampling errors are a measure of the variability among all possible samples. Although the degree of variability is not known exactly, it can be estimated from the survey results.
A sampling error is usually measured in terms of the standard error for a particular statistic (mean, percentage, etc.), which is the square root of the variance. The standard error can be used to calculate confidence intervals within which the true value for the population can reasonably be assumed to fall. For example, for any given statistic calculated from a sample survey, the value of that statistic will fall within a range of plus or minus two times the standard error of that statistic in 95 percent of all possible samples of identical size and design.
If the sample of respondents had been selected as a simple random sample, it would have been possible to use straightforward formulas for calculating sampling errors. However, since the 1997 JDHS-II sample resulted from a multistage stratified design, formulae of higher complexity had to be used. The computer software used to calculate sampling errors for the 1997 JDHS-II was the ISSA Sampling Error Module, which uses the Taylor linearization method of variance estimation for survey estimates that are means or proportions. The Jackknife repeated replication method is used for variance estimation of more complex statistics, such as fertility and mortality rates.
Note: See detailed estimate of sampling error calculation in APPENDIX B of the survey report.
Data Quality Tables - Household age distribution - Age distribution of eligible and interviewed women - Completeness of reporting - Births by calendar years - Reporting of age at death in days - Reporting of age at death in months
Note: See detailed tables in APPENDIX C of the survey report.
Facebook
TwitterThe harmonized data set on health, created and published by the ERF, is a subset of Iraq Household Socio Economic Survey (IHSES) 2012. It was derived from the household, individual and health modules, collected in the context of the above mentioned survey. The sample was then used to create a harmonized health survey, comparable with the Iraq Household Socio Economic Survey (IHSES) 2007 micro data set.
----> Overview of the Iraq Household Socio Economic Survey (IHSES) 2012:
Iraq is considered a leader in household expenditure and income surveys where the first was conducted in 1946 followed by surveys in 1954 and 1961. After the establishment of Central Statistical Organization, household expenditure and income surveys were carried out every 3-5 years in (1971/ 1972, 1976, 1979, 1984/ 1985, 1988, 1993, 2002 / 2007). Implementing the cooperation between CSO and WB, Central Statistical Organization (CSO) and Kurdistan Region Statistics Office (KRSO) launched fieldwork on IHSES on 1/1/2012. The survey was carried out over a full year covering all governorates including those in Kurdistan Region.
The survey has six main objectives. These objectives are:
The raw survey data provided by the Statistical Office were then harmonized by the Economic Research Forum, to create a comparable version with the 2006/2007 Household Socio Economic Survey in Iraq. Harmonization at this stage only included unifying variables' names, labels and some definitions. See: Iraq 2007 & 2012- Variables Mapping & Availability Matrix.pdf provided in the external resources for further information on the mapping of the original variables on the harmonized ones, in addition to more indications on the variables' availability in both survey years and relevant comments.
National coverage: Covering a sample of urban, rural and metropolitan areas in all the governorates including those in Kurdistan Region.
1- Household/family. 2- Individual/person.
The survey was carried out over a full year covering all governorates including those in Kurdistan Region.
Sample survey data [ssd]
----> Design:
Sample size was (25488) household for the whole Iraq, 216 households for each district of 118 districts, 2832 clusters each of which includes 9 households distributed on districts and governorates for rural and urban.
----> Sample frame:
Listing and numbering results of 2009-2010 Population and Housing Survey were adopted in all the governorates including Kurdistan Region as a frame to select households, the sample was selected in two stages: Stage 1: Primary sampling unit (blocks) within each stratum (district) for urban and rural were systematically selected with probability proportional to size to reach 2832 units (cluster). Stage two: 9 households from each primary sampling unit were selected to create a cluster, thus the sample size of total survey clusters was 25488 households distributed on the governorates, 216 households in each district.
----> Sampling Stages:
In each district, the sample was selected in two stages: Stage 1: based on 2010 listing and numbering frame 24 sample points were selected within each stratum through systematic sampling with probability proportional to size, in addition to the implicit breakdown urban and rural and geographic breakdown (sub-district, quarter, street, county, village and block). Stage 2: Using households as secondary sampling units, 9 households were selected from each sample point using systematic equal probability sampling. Sampling frames of each stages can be developed based on 2010 building listing and numbering without updating household lists. In some small districts, random selection processes of primary sampling may lead to select less than 24 units therefore a sampling unit is selected more than once , the selection may reach two cluster or more from the same enumeration unit when it is necessary.
Face-to-face [f2f]
----> Preparation:
The questionnaire of 2006 survey was adopted in designing the questionnaire of 2012 survey on which many revisions were made. Two rounds of pre-test were carried out. Revision were made based on the feedback of field work team, World Bank consultants and others, other revisions were made before final version was implemented in a pilot survey in September 2011. After the pilot survey implemented, other revisions were made in based on the challenges and feedbacks emerged during the implementation to implement the final version in the actual survey.
----> Questionnaire Parts:
The questionnaire consists of four parts each with several sections: Part 1: Socio – Economic Data: - Section 1: Household Roster - Section 2: Emigration - Section 3: Food Rations - Section 4: housing - Section 5: education - Section 6: health - Section 7: Physical measurements - Section 8: job seeking and previous job
Part 2: Monthly, Quarterly and Annual Expenditures: - Section 9: Expenditures on Non – Food Commodities and Services (past 30 days). - Section 10 : Expenditures on Non – Food Commodities and Services (past 90 days). - Section 11: Expenditures on Non – Food Commodities and Services (past 12 months). - Section 12: Expenditures on Non-food Frequent Food Stuff and Commodities (7 days). - Section 12, Table 1: Meals Had Within the Residential Unit. - Section 12, table 2: Number of Persons Participate in the Meals within Household Expenditure Other Than its Members.
Part 3: Income and Other Data: - Section 13: Job - Section 14: paid jobs - Section 15: Agriculture, forestry and fishing - Section 16: Household non – agricultural projects - Section 17: Income from ownership and transfers - Section 18: Durable goods - Section 19: Loans, advances and subsidies - Section 20: Shocks and strategy of dealing in the households - Section 21: Time use - Section 22: Justice - Section 23: Satisfaction in life - Section 24: Food consumption during past 7 days
Part 4: Diary of Daily Expenditures: Diary of expenditure is an essential component of this survey. It is left at the household to record all the daily purchases such as expenditures on food and frequent non-food items such as gasoline, newspapers…etc. during 7 days. Two pages were allocated for recording the expenditures of each day, thus the roster will be consists of 14 pages.
----> Raw Data:
Data Editing and Processing: To ensure accuracy and consistency, the data were edited at the following stages: 1. Interviewer: Checks all answers on the household questionnaire, confirming that they are clear and correct. 2. Local Supervisor: Checks to make sure that questions has been correctly completed. 3. Statistical analysis: After exporting data files from excel to SPSS, the Statistical Analysis Unit uses program commands to identify irregular or non-logical values in addition to auditing some variables. 4. World Bank consultants in coordination with the CSO data management team: the World Bank technical consultants use additional programs in SPSS and STAT to examine and correct remaining inconsistencies within the data files. The software detects errors by analyzing questionnaire items according to the expected parameter for each variable.
----> Harmonized Data:
Iraq Household Socio Economic Survey (IHSES) reached a total of 25488 households. Number of households refused to response was 305, response rate was 98.6%. The highest interview rates were in Ninevah and Muthanna (100%) while the lowest rates were in Sulaimaniya (92%).
Facebook
TwitterThe Tanzania Demographic and Health Survey (TDHS) is part of the worldwide Demographic and Health Surveys (DHS) programme, which is designed to collect data on fertility, family planning, and maternal and child health.
The primary objective of the 1999 TRCHS was to collect data at the national level (with breakdowns by urban-rural and Mainland-Zanzibar residence wherever warranted) on fertility levels and preferences, family planning use, maternal and child health, breastfeeding practices, nutritional status of young children, childhood mortality levels, knowledge and behaviour regarding HIV/AIDS, and the availability of specific health services within the community.1 Related objectives were to produce these results in a timely manner and to ensure that the data were disseminated to a wide audience of potential users in governmental and nongovernmental organisations within and outside Tanzania. The ultimate intent is to use the information to evaluate current programmes and to design new strategies for improving health and family planning services for the people of Tanzania.
National. The sample was designed to provide estimates for the whole country, for urban and rural areas separately, and for Zanzibar and, in some cases, Unguja and Pemba separately.
Sample survey data
The TRCHS used a three-stage sample design. Overall, 176 census enumeration areas were selected (146 on the Mainland and 30 in Zanzibar) with probability proportional to size on an approximately self-weighting basis on the Mainland, but with oversampling of urban areas and Zanzibar. To reduce costs and maximise the ability to identify trends over time, these enumeration areas were selected from the 357 sample points that were used in the 1996 TDHS, which in turn were selected from the 1988 census frame of enumeration in a two-stage process (first wards/branches and then enumeration areas within wards/branches). Before the data collection, fieldwork teams visited the selected enumeration areas to list all the households. From these lists, households were selected to be interviewed. The sample was designed to provide estimates for the whole country, for urban and rural areas separately, and for Zanzibar and, in some cases, Unguja and Pemba separately. The health facilities component of the TRCHS involved visiting hospitals, health centres, and pharmacies located in areas around the households interviewed. In this way, the data from the two components can be linked and a richer dataset produced.
See detailed sample implementation in the APPENDIX A of the final report.
Face-to-face
The household survey component of the TRCHS involved three questionnaires: 1) a Household Questionnaire, 2) a Women’s Questionnaire for all individual women age 15-49 in the selected households, and 3) a Men’s Questionnaire for all men age 15-59.
The health facilities survey involved six questionnaires: 1) a Community Questionnaire administered to men and women in each selected enumeration area; 2) a Facility Questionnaire; 3) a Facility Inventory; 4) a Service Provider Questionnaire; 5) a Pharmacy Inventory Questionnaire; and 6) a questionnaire for the District Medical Officers.
All these instruments were based on model questionnaires developed for the MEASURE programme, as well as on the questionnaires used in the 1991-92 TDHS, the 1994 TKAP, and the 1996 TDHS. These model questionnaires were adapted for use in Tanzania during meetings with representatives from the Ministry of Health, the University of Dar es Salaam, the Tanzania Food and Nutrition Centre, USAID/Tanzania, UNICEF/Tanzania, UNFPA/Tanzania, and other potential data users. The questionnaires and manual were developed in English and then translated into and printed in Kiswahili.
The Household Questionnaire was used to list all the usual members and visitors in the selected households. Some basic information was collected on the characteristics of each person listed, including his/her age, sex, education, and relationship to the head of the household. The main purpose of the Household Questionnaire was to identify women and men who were eligible for individual interview and children under five who were to be weighed and measured. Information was also collected about the dwelling itself, such as the source of water, type of toilet facilities, materials used to construct the house, ownership of various consumer goods, and use of iodised salt. Finally, the Household Questionnaire was used to collect some rudimentary information about the extent of child labour.
The Women’s Questionnaire was used to collect information from women age 15-49. These women were asked questions on the following topics: · Background characteristics (age, education, religion, type of employment) · Birth history · Knowledge and use of family planning methods · Antenatal, delivery, and postnatal care · Breastfeeding and weaning practices · Vaccinations, birth registration, and health of children under age five · Marriage and recent sexual activity · Fertility preferences · Knowledge and behaviour concerning HIV/AIDS.
The Men’s Questionnaire covered most of these same issues, except that it omitted the sections on the detailed reproductive history, maternal health, and child health. The final versions of the English questionnaires are provided in Appendix E.
Before the questionnaires could be finalised, a pretest was done in July 1999 in Kibaha District to assess the viability of the questions, the flow and logical sequence of the skip pattern, and the field organisation. Modifications to the questionnaires, including wording and translations, were made based on lessons drawn from the exercise.
In all, 3,826 households were selected for the sample, out of which 3,677 were occupied. Of the households found, 3,615 were interviewed, representing a response rate of 98 percent. The shortfall is primarily due to dwellings that were vacant or in which the inhabitants were not at home despite of several callbacks.
In the interviewed households, a total of 4,118 eligible women (i.e., women age 15-49) were identified for the individual interview, and 4,029 women were actually interviewed, yielding a response rate of 98 percent. A total of 3,792 eligible men (i.e., men age 15-59), were identified for the individual interview, of whom 3,542 were interviewed, representing a response rate of 93 percent. The principal reason for nonresponse among both eligible men and women was the failure to find them at home despite repeated visits to the household. The lower response rate among men than women was due to the more frequent and longer absences of men.
The response rates are lower in urban areas due to longer absence of respondents from their homes. One-member households are more common in urban areas and are more difficult to interview because they keep their houses locked most of the time. In urban settings, neighbours often do not know the whereabouts of such people.
The estimates from a sample survey are affected by two types of errors: (1) non-sampling errors, and (2) sampling errors. Non-sampling errors are the results of mistakes made in implementing data collection and data processing, such as failure to locate and interview the correct household, misunderstanding of the questions on the part of either the interviewer or the respondent, and data entry errors. Although numerous efforts were made during the implementation of the TRCHS to minimise this type of error, nonsampling errors are impossible to avoid and difficult to evaluate statistically.
Sampling errors, on the other hand, can be evaluated statistically. The sample of respondents selected in the TRCHS is only one of many samples that could have been selected from the same population, using the same design and expected size. Each of these samples would yield results that differ somewhat from the results of the actual sample selected. Sampling errors are a measure of the variability between all possible samples. Although the degree of variability is not known exactly, it can be estimated from the survey results.
A sampling error is usually measured in terms of the standard error for a particular statistic (mean, percentage, etc.), which is the square root of the variance. The standard error can be used to calculate confidence intervals within which the true value for the population can reasonably be assumed to fall. For example, for any given statistic calculated from a sample survey, the value of that statistic will fall within a range of plus or minus two times the standard error of that statistic in 95 percent of all possible samples of identical size and design.
If the sample of respondents had been selected as a simple random sample, it would have been possible to use straightforward formulas for calculating sampling errors. However, the TRCHS sample is the result of a two-stage stratified design, and, consequently, it was necessary to use more complex formulae. The computer software used to calculate sampling errors for the TRCHS is the ISSA Sampling Error Module (SAMPERR). This module used the Taylor linearisation method of variance estimation for survey estimates that are means or proportions. The Jackknife repeated replication method is used for variance estimation of more complex statistics such as fertility and mortality rate
Note: See detailed sampling error calculation in the APPENDIX B
Facebook
TwitterThe documented dataset covers Enterprise Survey (ES) panel data collected in Malawi in 2009 and 2014, as part of Africa Enterprise Surveys roll-out, an initiative of the World Bank.
New Enterprise Surveys target a sample consisting of longitudinal (panel) observations and new cross-sectional data. Panel firms are prioritized in the sample selection, comprising up to 50% of the sample in the current wave. For all panel firms, regardless of the sample, current eligibility or operating status is determined and included in panel datasets.
Malawi ES 2014 was conducted between April 2014 and February 2015, Malawi ES 2009 was carried out in May - July 2009. The objective of the Enterprise Survey is to obtain feedback from enterprises on the state of the private sector as well as to help in building a panel of enterprise data that will make it possible to track changes in the business environment over time, thus allowing, for example, impact assessments of reforms. Through interviews with firms in the manufacturing and services sectors, the survey assesses the constraints to private sector growth and creates statistically significant business environment indicators that are comparable across countries.
Stratified random sampling was used to select the surveyed businesses. The data was collected using face-to-face interviews.
Data from 673 establishments was analyzed: 436 businesses were from 2014 ES only, 63 - from 2009 ES only, and 174 firms were from both 2009 and 2014 panels.
The standard Enterprise Survey topics include firm characteristics, gender participation, access to finance, annual sales, costs of inputs and labor, workforce composition, bribery, licensing, infrastructure, trade, crime, competition, capacity utilization, land and permits, taxation, informality, business-government relations, innovation and technology, and performance measures. Over 90 percent of the questions objectively measure characteristics of a country’s business environment. The remaining questions assess the survey respondents’ opinions on what are the obstacles to firm growth and performance.
National
The primary sampling unit of the study is an establishment. An establishment is a physical location where business is carried out and where industrial operations take place or services are provided. A firm may be composed of one or more establishments. For example, a brewery may have several bottling plants and several establishments for distribution. For the purposes of this survey an establishment must make its own financial decisions and have its own financial statements separate from those of the firm. An establishment must also have its own management and control over its payroll.
The whole population, or the universe, covered in the Enterprise Surveys is the non-agricultural private economy. It comprises: all manufacturing sectors according to the ISIC Revision 3.1 group classification (group D), construction sector (group F), services sector (groups G and H), and transport, storage, and communications sector (group I). Note that this population definition excludes the following sectors: financial intermediation (group J), real estate and renting activities (group K, except sub-sector 72, IT, which was added to the population under study), and all public or utilities sectors. Companies with 100% government ownership are not eligible to participate in the Enterprise Surveys.
Sample survey data [ssd]
For the Malawi ES, multiple sample frames were used: a sample frame was built using data compiled from local and municipal business registries. Due to the fact that the previous round of surveys utilized different stratification criteria in the 2009 survey sample, the presence of panel firms was limited to a maximum of 50% of the achieved interviews in each stratum. That sample is referred to as the panel.
Face-to-face [f2f]
The following survey instruments were used for Malawi ES 2009 and 2014: - Manufacturing Module Questionnaire - Services Module Questionnaire
The survey is fielded via manufacturing or services questionnaires in order not to ask questions that are irrelevant to specific types of firms, e.g. a question that relates to production and nonproduction workers should not be asked of a retail firm. In addition to questions that are asked across countries, all surveys are customized and contain country-specific questions. An example of customization would be including tourism-related questions that are asked in certain countries when tourism is an existing or potential sector of economic growth. There is a skip pattern in the Service Module Questionnaire for questions that apply only to retail firms.
Data entry and quality controls are implemented by the contractor and data is delivered to the World Bank in batches (typically 10%, 50% and 100%). These data deliveries are checked for logical consistency, out of range values, skip patterns, and duplicate entries. Problems are flagged by the World Bank and corrected by the implementing contractor through data checks, callbacks, and revisiting establishments.
Survey non-response must be differentiated from item non-response. The former refers to refusals to participate in the survey altogether whereas the latter refers to the refusals to answer some specific questions. Enterprise Surveys suffer from both problems and different strategies were used to address these issues.
Item non-response was addressed by two strategies: a- For sensitive questions that may generate negative reactions from the respondent, such as corruption or tax evasion, enumerators were instructed to collect "Refusal to respond" (-8) as a different option from "Don't know" (-9). b- Establishments with incomplete information were re-contacted in order to complete this information, whenever necessary.
Survey non-response was addressed by maximizing efforts to contact establishments that were initially selected for interview. Attempts were made to contact the establishment for interview at different times/days of the week before a replacement establishment (with similar strata characteristics) was suggested for interview. Survey non-response did occur but substitutions were made in order to potentially achieve strata-specific goals.
Facebook
TwitterWithin the frame of PCBS' efforts in providing official Palestinian statistics in the different life aspects of Palestinian society and because the wide spread of Computer, Internet and Mobile Phone among the Palestinian people, and the important role they may play in spreading knowledge and culture and contribution in formulating the public opinion, PCBS conducted the Household Survey on Information and Communications Technology, 2014.
The main objective of this survey is to provide statistical data on Information and Communication Technology in the Palestine in addition to providing data on the following: -
· Prevalence of computers and access to the Internet. · Study the penetration and purpose of Technology use.
Palestine (West Bank and Gaza Strip) , type of locality (Urban, Rural, Refugee Camps) and governorate
Household. Person 10 years and over .
All Palestinian households and individuals whose usual place of residence in Palestine with focus on persons aged 10 years and over in year 2014.
Sample survey data [ssd]
Sampling Frame The sampling frame consists of a list of enumeration areas adopted in the Population, Housing and Establishments Census of 2007. Each enumeration area has an average size of about 124 households. These were used in the first phase as Preliminary Sampling Units in the process of selecting the survey sample.
Sample Size The total sample size of the survey was 7,268 households, of which 6,000 responded.
Sample Design The sample is a stratified clustered systematic random sample. The design comprised three phases:
Phase I: Random sample of 240 enumeration areas. Phase II: Selection of 25 households from each enumeration area selected in phase one using systematic random selection. Phase III: Selection of an individual (10 years or more) in the field from the selected households; KISH TABLES were used to ensure indiscriminate selection.
Sample Strata Distribution of the sample was stratified by: 1- Governorate (16 governorates, J1). 2- Type of locality (urban, rural and camps).
-
Face-to-face [f2f]
The survey questionnaire consists of identification data, quality controls and three main sections: Section I: Data on household members that include identification fields, the characteristics of household members (demographic and social) such as the relationship of individuals to the head of household, sex, date of birth and age.
Section II: Household data include information regarding computer processing, access to the Internet, and possession of various media and computer equipment. This section includes information on topics related to the use of computer and Internet, as well as supervision by households of their children (5-17 years old) while using the computer and Internet, and protective measures taken by the household in the home.
Section III: Data on persons (aged 10 years and over) about computer use, access to the Internet and possession of a mobile phone.
Preparation of Data Entry Program: This stage included preparation of the data entry programs using an ACCESS package and defining data entry control rules to avoid errors, plus validation inquiries to examine the data after it had been captured electronically.
Data Entry: The data entry process started on 8 May 2014 and ended on 23 June 2014. The data entry took place at the main PCBS office and in field offices using 28 data clerks.
Editing and Cleaning procedures: Several measures were taken to avoid non-sampling errors. These included editing of questionnaires before data entry to check field errors, using a data entry application that does not allow mistakes during the process of data entry, and then examining the data by using frequency and cross tables. This ensured that data were error free; cleaning and inspection of the anomalous values were conducted to ensure harmony between the different questions on the questionnaire.
Response Rates= 79%
There are many aspects of the concept of data quality; this includes the initial planning of the survey to the dissemination of the results and how well users understand and use the data. There are three components to the quality of statistics: accuracy, comparability, and quality control procedures.
Checks on data accuracy cover many aspects of the survey and include statistical errors due to the use of a sample, non-statistical errors resulting from field workers or survey tools, and response rates and their effect on estimations. This section includes:
Statistical Errors Data of this survey may be affected by statistical errors due to the use of a sample and not a complete enumeration. Therefore, certain differences can be expected in comparison with the real values obtained through censuses. Variances were calculated for the most important indicators.
Variance calculations revealed that there is no problem in disseminating results nationally or regionally (the West Bank, Gaza Strip), but some indicators show high variance by governorate, as noted in the tables of the main report.
Non-Statistical Errors Non-statistical errors are possible at all stages of the project, during data collection or processing. These are referred to as non-response errors, response errors, interviewing errors and data entry errors. To avoid errors and reduce their effects, strenuous efforts were made to train the field workers intensively. They were trained on how to carry out the interview, what to discuss and what to avoid, and practical and theoretical training took place during the training course. Training manuals were provided for each section of the questionnaire, along with practical exercises in class and instructions on how to approach respondents to reduce refused cases. Data entry staff were trained on the data entry program, which was tested before starting the data entry process.
Several measures were taken to avoid non-sampling errors. These included editing of questionnaires before data entry to check field errors, using a data entry application that does not allow mistakes during the process of data entry, and then examining the data by using frequency and cross tables. This ensured that data were error free; cleaning and inspection of the anomalous values were conducted to ensure harmony between the different questions on the questionnaire.
The sources of non-statistical errors can be summarized as: 1. Some of the households were not at home and could not be interviewed, and some households refused to be interviewed. 2. In unique cases, errors occurred due to the way the questions were asked by interviewers and respondents misunderstood some of the questions.
Facebook
TwitterEconomists are shifting attention and resources from work on survey data towork on “big data.” This analysis is an empirical exploration of the trade-offs this transition requires. Parallel models are estimated using the Federal Reserve Bank of New York Consumer Credit Panel/Equifax and the Survey of Consumer Finances. After adjustments to account for different variable definitions and sampled populations, it is possible to arrive at similar models of total household debt. However, the estimates are sensitive to the adjustments. Little similarity is observed in parallel models of nonmortgage debt. While surveys intentionally collect theoretically related variables, it may be necessary to merge external data into commercial big data. In this example, some education and income measures are successfully integrated with the big data, but other external aggregates fail to adequately substitute for survey responses. Big data offers sample sizes, frequencies, and details that surveys cannot match. However, this example illustrates why caution is appropriate when attempting to substitute big data for a carefully executed survey.
Facebook
TwitterThe Associated Press is sharing data from the COVID Impact Survey, which provides statistics about physical health, mental health, economic security and social dynamics related to the coronavirus pandemic in the United States.
Conducted by NORC at the University of Chicago for the Data Foundation, the probability-based survey provides estimates for the United States as a whole, as well as in 10 states (California, Colorado, Florida, Louisiana, Minnesota, Missouri, Montana, New York, Oregon and Texas) and eight metropolitan areas (Atlanta, Baltimore, Birmingham, Chicago, Cleveland, Columbus, Phoenix and Pittsburgh).
The survey is designed to allow for an ongoing gauge of public perception, health and economic status to see what is shifting during the pandemic. When multiple sets of data are available, it will allow for the tracking of how issues ranging from COVID-19 symptoms to economic status change over time.
The survey is focused on three core areas of research:
Instead, use our queries linked below or statistical software such as R or SPSS to weight the data.
If you'd like to create a table to see how people nationally or in your state or city feel about a topic in the survey, use the survey questionnaire and codebook to match a question (the variable label) to a variable name. For instance, "How often have you felt lonely in the past 7 days?" is variable "soc5c".
Nationally: Go to this query and enter soc5c as the variable. Hit the blue Run Query button in the upper right hand corner.
Local or State: To find figures for that response in a specific state, go to this query and type in a state name and soc5c as the variable, and then hit the blue Run Query button in the upper right hand corner.
The resulting sentence you could write out of these queries is: "People in some states are less likely to report loneliness than others. For example, 66% of Louisianans report feeling lonely on none of the last seven days, compared with 52% of Californians. Nationally, 60% of people said they hadn't felt lonely."
The margin of error for the national and regional surveys is found in the attached methods statement. You will need the margin of error to determine if the comparisons are statistically significant. If the difference is:
The survey data will be provided under embargo in both comma-delimited and statistical formats.
Each set of survey data will be numbered and have the date the embargo lifts in front of it in the format of: 01_April_30_covid_impact_survey. The survey has been organized by the Data Foundation, a non-profit non-partisan think tank, and is sponsored by the Federal Reserve Bank of Minneapolis and the Packard Foundation. It is conducted by NORC at the University of Chicago, a non-partisan research organization. (NORC is not an abbreviation, it part of the organization's formal name.)
Data for the national estimates are collected using the AmeriSpeak Panel, NORC’s probability-based panel designed to be representative of the U.S. household population. Interviews are conducted with adults age 18 and over representing the 50 states and the District of Columbia. Panel members are randomly drawn from AmeriSpeak with a target of achieving 2,000 interviews in each survey. Invited panel members may complete the survey online or by telephone with an NORC telephone interviewer.
Once all the study data have been made final, an iterative raking process is used to adjust for any survey nonresponse as well as any noncoverage or under and oversampling resulting from the study specific sample design. Raking variables include age, gender, census division, race/ethnicity, education, and county groupings based on county level counts of the number of COVID-19 deaths. Demographic weighting variables were obtained from the 2020 Current Population Survey. The count of COVID-19 deaths by county was obtained from USA Facts. The weighted data reflect the U.S. population of adults age 18 and over.
Data for the regional estimates are collected using a multi-mode address-based (ABS) approach that allows residents of each area to complete the interview via web or with an NORC telephone interviewer. All sampled households are mailed a postcard inviting them to complete the survey either online using a unique PIN or via telephone by calling a toll-free number. Interviews are conducted with adults age 18 and over with a target of achieving 400 interviews in each region in each survey.Additional details on the survey methodology and the survey questionnaire are attached below or can be found at https://www.covid-impact.org.
Results should be credited to the COVID Impact Survey, conducted by NORC at the University of Chicago for the Data Foundation.
To learn more about AP's data journalism capabilities for publishers, corporations and financial institutions, go here or email kromano@ap.org.
Facebook
TwitterCC0 1.0 Universal Public Domain Dedicationhttps://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Raw data and descriptive statistic data of the market survey performed with the Add-In XLSTAT 2009.1.02 is provided as Excel-file (CSV). The data include file name, sample name, area, calculated N2O amounts, test result and statistical values.
Facebook
TwitterDescriptive statistics for the demographic variables of the full survey sample.
Facebook
TwitterTHE CLEANED AND HARMONIZED VERSION OF THE SURVEY DATA PRODUCED AND PUBLISHED BY THE ECONOMIC RESEARCH FORUM REPRESENTS 100% OF THE ORIGINAL SURVEY DATA COLLECTED BY THE DEPARTMENT OF STATISTICS OF THE HASHEMITE KINGDOM OF JORDAN
The Department of Statistics (DOS) carried out four rounds of the 2016 Employment and Unemployment Survey (EUS). The survey rounds covered a sample of about fourty nine thousand households Nation-wide. The sampled households were selected using a stratified multi-stage cluster sampling design.
It is worthy to mention that the DOS employed new technology in data collection and data processing. Data was collected using electronic questionnaire instead of a hard copy, namely a hand held device (PDA).
The survey main objectives are: - To identify the demographic, social and economic characteristics of the population and manpower. - To identify the occupational structure and economic activity of the employed persons, as well as their employment status. - To identify the reasons behind the desire of the employed persons to search for a new or additional job. - To measure the economic activity participation rates (the number of economically active population divided by the population of 15+ years old). - To identify the different characteristics of the unemployed persons. - To measure unemployment rates (the number of unemployed persons divided by the number of economically active population of 15+ years old) according to the various characteristics of the unemployed, and the changes that might take place in this regard. - To identify the most important ways and means used by the unemployed persons to get a job, in addition to measuring durations of unemployment for such persons. - To identify the changes overtime that might take place regarding the above-mentioned variables.
The raw survey data provided by the Statistical Agency were cleaned and harmonized by the Economic Research Forum, in the context of a major project that started in 2009. During which extensive efforts have been exerted to acquire, clean, harmonize, preserve and disseminate micro data of existing labor force surveys in several Arab countries.
Covering a sample representative on the national level (Kingdom), governorates, and the three Regions (Central, North and South).
1- Household/family. 2- Individual/person.
The survey covered a national sample of households and all individuals permanently residing in surveyed households.
Sample survey data [ssd]
THE CLEANED AND HARMONIZED VERSION OF THE SURVEY DATA PRODUCED AND PUBLISHED BY THE ECONOMIC RESEARCH FORUM REPRESENTS 100% OF THE ORIGINAL SURVEY DATA COLLECTED BY THE DEPARTMENT OF STATISTICS OF THE HASHEMITE KINGDOM OF JORDAN
Computer Assisted Personal Interview [capi]
----> Raw Data
A tabulation results plan has been set based on the previous Employment and Unemployment Surveys while the required programs were prepared and tested. When all prior data processing steps were completed, the actual survey results were tabulated using an ORACLE package. The tabulations were then thoroughly checked for consistency of data. The final report was then prepared, containing detailed tabulations as well as the methodology of the survey.
----> Harmonized Data
Facebook
TwitterThe Estimating the Size of Populations through a Household Survey (EPSHS), sought to assess the feasibility of the network scale-up and proxy respondent methods for estimating the sizes of key populations at higher risk of HIV infection and to compare the results to other estimates of the population sizes. The study was undertaken based on the assumption that if these methods proved to be feasible with a reasonable amount of data collection for making adjustments, countries would be able to add this module to their standard household survey to produce size estimates for their key populations at higher risk of HIV infection. This would facilitate better programmatic responses for prevention and caring for people living with HIV and would improve the understanding of how HIV is being transmitted in the country.
The specific objectives of the ESPHS were: 1. To assess the feasibility of the network scale-up method for estimating the sizes of key populations at higher risk of HIV infection in a Sub-Saharan African context; 2. To assess the feasibility of the proxy respondent method for estimating the sizes of key populations at higher risk of HIV infection in a Sub-Saharan African context; 3. To estimate the population size of MSM, FSW, IDU, and clients of sex workers in Rwanda at a national level; 4. To compare the estimates of the sizes of key populations at higher risk for HIV produced by the network scale-up and proxy respondent methods with estimates produced using other methods; and 5. To collect data to be used in scientific publications comparing the use of the network scale-up method in different national and cultural environments.
National
The Estimating the Size of Populations through a Household Survey (ESPHS) used a two-stage sample design, implemented in a representative sample of 2,125 households selected nationwide in which all women and men age 15 years and above where eligible for an individual interview. The sampling frame used was the preparatory frame for the Rwanda Population and Housing Census (RPHC), which was conducted in 2012; it was provided by the National Institute of Statistics of Rwanda (NISR).
The sampling frame was a complete list of natural villages covering the whole country (14,837 villages). Two strata were defined: the city of Kigali and the rest of the country. One hundred and thirty Primary Sampling Units (PSU) were selected from the sampling frame (35 in Kigali and 95 in the other stratum). To reduce clustering effect, only 20 households were selected per cluster in Kigali and 15 in the other clusters. As a result, 33 percent of the households in the sample were located in Kigali.
The list of households in each cluster was updated upon arrival of the survey team in the cluster. Once the listing had been updated, a number was assigned to each existing household in the cluster. The supervisor then identified the households to be interviewed in the survey by using a table in which the households were randomly pre-selected. This table also provided the list of households pre-selected for each of the two different definitions of what it means "to know" someone.
For further details on sample design and implementation, see Appendix A of the final report.
Face-to-face [f2f]
The Estimating the Size of Populations through a Household Survey (ESPHS) used two types of questionnaires: a household questionnaire and an individual questionnaire. The same individual questionnaire was used to interview both women and men. In addition, two versions of the individual questionnaire were developed, using two different definitions of what it means “to know” someone. Each version of the individual questionnaire was used in half of the selected households.
The processing of the ESPHS data began shortly after the fieldwork commenced. Completed questionnaires were returned periodically from the field to the SPH office in Kigali, where they were entered and checked for consistency by data processing personnel who were specially trained for this task. Data were entered using CSPro, a programme specially developed for use in DHS surveys. All data were entered twice (100 percent verification). The concurrent processing of the data was a distinct advantage for data quality, because the School of Public Health had the opportunity to advise field teams of problems detected during data entry. The data entry and editing phase of the survey was completed in late August 2011.
A total of 2,125 households were selected in the sample, of which 2,120 were actually occupied at the time of the interview. The number of occupied households successfully interviewed was 2,102, yielding a household response rate of 99 percent.
From the households interviewed, 2,629 women were found to be eligible and 2,567 were interviewed, giving a response rate of 98 percent. Interviews with men covered 2,102 of the eligible 2,149 men, yielding a response rate of 98 percent. The response rates do not significantly vary by type of questionnaire or residence.
The estimates from a sample survey are affected by two types of errors: (1) non-sampling errors, and (2) sampling errors. Non-sampling errors are the results of mistakes made in implementing data collection and data processing, such as failure to locate and interview the correct household, misunderstanding of the questions on the part of either the interviewer or the respondent, and data entry errors. Although numerous efforts were made to minimize this type of error during the implementation of the Rwanda ESPHS 2011, non-sampling errors are impossible to avoid and difficult to evaluate statistically.
Sampling errors, on the other hand, can be evaluated statistically. The sample of respondents selected in the ESPHS 2011 is only one of many samples that could have been selected from the same population, using the same design and identical size. Each of these samples would yield results that differ somewhat from the results of the actual sample selected. Sampling errors are a measure of the variability between all possible samples. Although the degree of variability is not known exactly, it can be estimated from the survey results.
A sampling error is usually measured in terms of the standard error for a particular statistic (mean, percentage, etc.), which is the square root of the variance. The standard error can be used to calculate confidence intervals within which the true value for the population can reasonably be assumed to fall. For example, for any given statistic calculated from a sample survey, the value of that statistic will fall within a range of plus or minus two times the standard error of that statistic in 95 percent of all possible samples of identical size and design.
If the sample of respondents had been selected as a simple random sample, it would have been possible to use straightforward formulas for calculating sampling errors. However, the ESPHS 2011 sample is the result of a multi-stage stratified design, and, consequently, it was necessary to use more complex formulae. The computer software used to calculate sampling errors for the ESPHS 2011 is a SAS program. This program uses the Taylor linearization method for variance estimation for survey estimates that are means or proportions.
A more detailed description of estimates of sampling errors are presented in Appendix B of the survey report.
Facebook
TwitterThe City of Bloomington contracted with National Research Center, Inc. to conduct the 2019 Bloomington Community Survey. This was the second time a scientific citywide survey had been completed covering resident opinions on service delivery satisfaction by the City of Bloomington and quality of life issues. The first was in 2017. The survey captured the responses of 610 households from a representative sample of 3,000 residents of Bloomington who were randomly selected to complete the survey. VERY IMPORTANT NOTE: The scientific survey data were weighted, meaning that the demographic profile of respondents was compared to the demographic profile of adults in Bloomington from US Census data. Statistical adjustments were made to bring the respondent profile into balance with the population profile. This means that some records were given more "weight" and some records were given less weight. The weights that were applied are found in the field "wt". If you do not apply these weights, you will not obtain the same results as can be found in the report delivered to the City of Bloomington. The easiest way to replicate these results is likely to create pivot tables, and use the sum of the "wt" field rather than a count of responses.
Facebook
TwitterThe National Sample Survey of Registered Nurses (NSSRN) Download makes data from the survey readily available to users in a one-stop download. The Survey has been conducted approximately every four years since 1977. For each survey year, HRSA has prepared two Public Use File databases in flat ASCII file format without delimiters. The 2008 data are also offerred in SAS and SPSS formats. Information likely to point to an individual in a sparsely-populated county has been withheld. General Public Use Files are State-based and provide information on nurses without identifying the County and Metropolitan Area in which they live or work. County Public Use Files provide most, but not all, the same information on the nurse from the General Public Use File, and also identifies the County and Metropolitan Areas in which the nurses live or work. NSSRN data are to be used for research purposes only and may not be used in any manner to identify individual respondents.
Facebook
TwitterThe basic goal of this survey is to provide the necessary database for formulating national policies at various levels. It represents the contribution of the household sector to the Gross National Product (GNP). Household Surveys help as well in determining the incidence of poverty, and providing weighted data which reflects the relative importance of the consumption items to be employed in determining the benchmark for rates and prices of items and services. Generally, the Household Expenditure and Consumption Survey is a fundamental cornerstone in the process of studying the nutritional status in the Palestinian territory.
The raw survey data provided by the Statistical Office was cleaned and harmonized by the Economic Research Forum, in the context of a major research project to develop and expand knowledge on equity and inequality in the Arab region. The main focus of the project is to measure the magnitude and direction of change in inequality and to understand the complex contributing social, political and economic forces influencing its levels. However, the measurement and analysis of the magnitude and direction of change in this inequality cannot be consistently carried out without harmonized and comparable micro-level data on income and expenditures. Therefore, one important component of this research project is securing and harmonizing household surveys from as many countries in the region as possible, adhering to international statistics on household living standards distribution. Once the dataset has been compiled, the Economic Research Forum makes it available, subject to confidentiality agreements, to all researchers and institutions concerned with data collection and issues of inequality. Data is a public good, in the interest of the region, and it is consistent with the Economic Research Forum's mandate to make micro data available, aiding regional research on this important topic.
The survey data covers urban, rural and camp areas in West Bank and Gaza Strip.
1- Household/families. 2- Individuals.
The survey covered all the Palestinian households who are a usual residence in the Palestinian Territory.
Sample survey data [ssd]
The sampling frame consists of all enumeration areas which were enumerated in 1997; the enumeration area consists of buildings and housing units and is composed of an average of 120 households. The enumeration areas were used as Primary Sampling Units (PSUs) in the first stage of the sampling selection. The enumeration areas of the master sample were updated in 2003.
The sample is a stratified cluster systematic random sample with two stages: First stage: selection of a systematic random sample of 299 enumeration areas. Second stage: selection of a systematic random sample of 12-18 households from each enumeration area selected in the first stage. A person (18 years and more) was selected from each household in the second stage.
The population was divided by: 1- Governorate 2- Type of Locality (urban, rural, refugee camps)
The calculated sample size is 3,781 households.
The target cluster size or "sample-take" is the average number of households to be selected per PSU. In this survey, the sample take is around 12 households.
Detailed information/formulas on the sampling design are available in the user manual.
Face-to-face [f2f]
The PECS questionnaire consists of two main sections:
First section: Certain articles / provisions of the form filled at the beginning of the month,and the remainder filled out at the end of the month. The questionnaire includes the following provisions:
Cover sheet: It contains detailed and particulars of the family, date of visit, particular of the field/office work team, number/sex of the family members.
Statement of the family members: Contains social, economic and demographic particulars of the selected family.
Statement of the long-lasting commodities and income generation activities: Includes a number of basic and indispensable items (i.e, Livestock, or agricultural lands).
Housing Characteristics: Includes information and data pertaining to the housing conditions, including type of shelter, number of rooms, ownership, rent, water, electricity supply, connection to the sewer system, source of cooking and heating fuel, and remoteness/proximity of the house to education and health facilities.
Monthly and Annual Income: Data pertaining to the income of the family is collected from different sources at the end of the registration / recording period.
Second section: The second section of the questionnaire includes a list of 54 consumption and expenditure groups itemized and serially numbered according to its importance to the family. Each of these groups contains important commodities. The number of commodities items in each for all groups stood at 667 commodities and services items. Groups 1-21 include food, drink, and cigarettes. Group 22 includes homemade commodities. Groups 23-45 include all items except for food, drink and cigarettes. Groups 50-54 include all of the long-lasting commodities. Data on each of these groups was collected over different intervals of time so as to reflect expenditure over a period of one full year.
Both data entry and tabulation were performed using the ACCESS and SPSS software programs. The data entry process was organized in 6 files, corresponding to the main parts of the questionnaire. A data entry template was designed to reflect an exact image of the questionnaire, and included various electronic checks: logical check, range checks, consistency checks and cross-validation. Complete manual inspection was made of results after data entry was performed, and questionnaires containing field-related errors were sent back to the field for corrections.
The survey sample consists of about 3,781 households interviewed over a twelve-month period between January 2004 and January 2005. There were 3,098 households that completed the interview, of which 2,060 were in the West Bank and 1,038 households were in GazaStrip. The response rate was 82% in the Palestinian Territory.
The calculations of standard errors for the main survey estimations enable the user to identify the accuracy of estimations and the survey reliability. Total errors of the survey can be divided into two kinds: statistical errors, and non-statistical errors. Non-statistical errors are related to the procedures of statistical work at different stages, such as the failure to explain questions in the questionnaire, unwillingness or inability to provide correct responses, bad statistical coverage, etc. These errors depend on the nature of the work, training, supervision, and conducting all various related activities. The work team spared no effort at different stages to minimize non-statistical errors; however, it is difficult to estimate numerically such errors due to absence of technical computation methods based on theoretical principles to tackle them. On the other hand, statistical errors can be measured. Frequently they are measured by the standard error, which is the positive square root of the variance. The variance of this survey has been computed by using the “programming package” CENVAR.
Facebook
TwitterAttribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
China Population Statistics: Sample Survey: Sampling Fraction data was reported at 0.105 % in 2023. This records an increase from the previous number of 0.102 % for 2022. China Population Statistics: Sample Survey: Sampling Fraction data is updated yearly, averaging 0.100 % from Dec 1982 (Median) to 2023, with 37 observations. The data reached an all-time high of 100.000 % in 2020 and a record low of 0.063 % in 1994. China Population Statistics: Sample Survey: Sampling Fraction data remains active status in CEIC and is reported by National Bureau of Statistics. The data is categorized under China Premium Database’s Socio-Demographic – Table CN.GA: Population: Sample Survey: Level of Education.
Facebook
TwitterThe Bangladesh Demographic and Health Survey (BDHS) is part of the worldwide Demographic and Health Surveys program, which is designed to collect data on fertility, family planning, and maternal and child health.
The BDHS is intended to serve as a source of population and health data for policymakers and the research community. In general, the objectives of the BDHS are to: - assess the overall demographic situation in Bangladesh, - assist in the evaluation of the population and health programs in Bangladesh, and - advance survey methodology.
More specifically, the objective of the BDHS is to provide up-to-date information on fertility and childhood mortality levels; nuptiality; fertility preferences; awareness, approval, and use of family planning methods; breastfeeding practices; nutrition levels; and maternal and child health. This information is intended to assist policymakers and administrators in evaluating and designing programs and strategies for improving health and family planning services in the country.
National
Sample survey data
Bangladesh is divided into six administrative divisions, 64 districts (zillas), and 490 thanas. In rural areas, thanas are divided into unions and then mauzas, a land administrative unit. Urban areas are divided into wards and then mahallas. The 1996-97 BDHS employed a nationally-representative, two-stage sample that was selected from the Integrated Multi-Purpose Master Sample (IMPS) maintained by the Bangladesh Bureau of Statistics. Each division was stratified into three groups: 1 ) statistical metropolitan areas (SMAs), 2) municipalities (other urban areas), and 3) rural areas. 3 In the rural areas, the primary sampling unit was the mauza, while in urban areas, it was the mahalla. Because the primary sampling units in the IMPS were selected with probability proportional to size from the 1991 Census frame, the units for the BDHS were sub-selected from the IMPS with equal probability so as to retain the overall probability proportional to size. A total of 316 primary sampling units were utilized for the BDHS (30 in SMAs, 42 in municipalities, and 244 in rural areas). In order to highlight changes in survey indicators over time, the 1996-97 BDHS utilized the same sample points (though not necessarily the same households) that were selected for the 1993-94 BDHS, except for 12 additional sample points in the new division of Sylhet. Fieldwork in three sample points was not possible (one in Dhaka Cantonment and two in the Chittagong Hill Tracts), so a total of 313 points were covered.
Since one objective of the BDHS is to provide separate estimates for each division as well as for urban and rural areas separately, it was necessary to increase the sampling rate for Barisal and Sylhet Divisions and for municipalities relative to the other divisions, SMAs and rural areas. Thus, the BDHS sample is not self-weighting and weighting factors have been applied to the data in this report.
Mitra and Associates conducted a household listing operation in all the sample points from 15 September to 15 December 1996. A systematic sample of 9,099 households was then selected from these lists. Every second household was selected for the men's survey, meaning that, in addition to interviewing all ever-married women age 10-49, interviewers also interviewed all currently married men age 15-59. It was expected that the sample would yield interviews with approximately 10,000 ever-married women age 10-49 and 3,000 currently married men age 15-59.
Note: See detailed in APPENDIX A of the survey report.
Face-to-face
Four types of questionnaires were used for the BDHS: a Household Questionnaire, a Women's Questionnaire, a Men' s Questionnaire and a Community Questionnaire. The contents of these questionnaires were based on the DHS Model A Questionnaire, which is designed for use in countries with relatively high levels of contraceptive use. These model questionnaires were adapted for use in Bangladesh during a series of meetings with a small Technical Task Force that consisted of representatives from NIPORT, Mitra and Associates, USAID/Bangladesh, the International Centre for Diarrhoeal Disease Research, Bangladesh (ICDDR,B), Population Council/Dhaka, and Macro International Inc (see Appendix D for a list of members). Draft questionnaires were then circulated to other interested groups and were reviewed by the BDHS Technical Review Committee (see Appendix D for list of members). The questionnaires were developed in English and then translated into and printed in Bangla (see Appendix E for final version in English).
The Household Questionnaire was used to list all the usual members and visitors in the selected households. Some basic information was collected on the characteristics of each person listed, including his/her age, sex, education, and relationship to the head of the household. The main purpose of the Household Questionnaire was to identify women and men who were eligible for the individual interview. In addition, information was collected about the dwelling itself, such as the source of water, type of toilet facilities, materials used to construct the house, and ownership of various consumer goods.
The Women's Questionnaire was used to collect information from ever-married women age 10-49. These women were asked questions on the following topics: - Background characteristics (age, education, religion, etc.), - Reproductive history, - Knowledge and use of family planning methods, - Antenatal and delivery care, - Breastfeeding and weaning practices, - Vaccinations and health of children under age five, - Marriage, - Fertility preferences, - Husband's background and respondent's work, - Knowledge of AIDS, - Height and weight of children under age five and their mothers.
The Men's Questionnaire was used to interview currently married men age 15-59. It was similar to that for women except that it omitted the sections on reproductive history, antenatal and delivery care, breastfeeding, vaccinations, and height and weight. The Community Questionnaire was completed for each sample point and included questions about the existence in the community of income-generating activities and other development organizations and the availability of health and family planning services.
A total of 9,099 households were selected for the sample, of which 8,682 were successfully interviewed. The shortfall is primarily due to dwellings that were vacant or in which the inhabitants had left for an extended period at the time they were visited by the interviewing teams. Of the 8,762 households occupied, 99 percent were successfully interviewed. In these households, 9,335 women were identified as eligible for the individual interview (i.e., ever-married and age 10-49) and interviews were completed for 9,127 or 98 percent of them. In the half of the households that were selected for inclusion in the men's survey, 3,611 eligible ever-married men age 15-59 were identified, of whom 3,346 or 93 percent were interviewed.
The principal reason for non-response among eligible women and men was the failure to find them at home despite repeated visits to the household. The refusal rate was low.
Note: See summarized response rates by residence (urban/rural) in Table 1.1 of the survey report.
The estimates from a sample survey are affected by two types of errors: (1) non-sampling errors, and (2) sampling errors. Non-sampling errors are the results of mistakes made in implementing data collection and data processing, such as failure to locate and interview the correct household, misunderstanding of the questions on the part of either the interviewer or the respondent, and data entry errors. Although numerous efforts were made during the implementation of the BDHS to minimize this type of error, non-sampling errors are impossible to avoid and difficult to evaluate statistically.
Sampling errors, on the other hand, can be evaluated statistically. The sample of respondents selected in the BDHS is only one of many samples that could have been selected from the same population, using the same design and expected size. Each of these samples would yield results that differ somewhat from the results of the actual sample selected. Sampling errors are a measure of the variability between all possible samples. Although the degree of variability is not known exactly, it can be estimated from the survey results.
A sampling error is usually measured in terms of the standard error for a particular statistic (mean, percentage, etc.), which is the square root of the variance. The standard error can be used to calculate confidence intervals within which the true value for the population can reasonably be assumed to fall. For example, for any given statistic calculated from a sample survey, the value of that statistic will fall within a range of plus or minus two times the standard error of that statistic in 95 percent of all possible samples of identical size and design.
If the sample of respondents had been selected as a simple random sample, it would have been possible to use straightforward formulas for calculating sampling errors. However, the BDHS sample is the result of a two-stage stratified design, and, consequently, it was necessary to use more complex formulae. The computer software used to calculate sampling errors for the BDHS is the ISSA Sampling Error Module. This module used the Taylor
Facebook
Twitteranalyze the health and retirement study (hrs) with r the hrs is the one and only longitudinal survey of american seniors. with a panel starting its third decade, the current pool of respondents includes older folks who have been interviewed every two years as far back as 1992. unlike cross-sectional or shorter panel surveys, respondents keep responding until, well, death d o us part. paid for by the national institute on aging and administered by the university of michigan's institute for social research, if you apply for an interviewer job with them, i hope you like werther's original. figuring out how to analyze this data set might trigger your fight-or-flight synapses if you just start clicking arou nd on michigan's website. instead, read pages numbered 10-17 (pdf pages 12-19) of this introduction pdf and don't touch the data until you understand figure a-3 on that last page. if you start enjoying yourself, here's the whole book. after that, it's time to register for access to the (free) data. keep your username and password handy, you'll need it for the top of the download automation r script. next, look at this data flowchart to get an idea of why the data download page is such a righteous jungle. but wait, good news: umich recently farmed out its data management to the rand corporation, who promptly constructed a giant consolidated file with one record per respondent across the whole panel. oh so beautiful. the rand hrs files make much of the older data and syntax examples obsolete, so when you come across stuff like instructions on how to merge years, you can happily ignore them - rand has done it for you. the health and retirement study only includes noninstitutionalized adults when new respondents get added to the panel (as they were in 1992, 1993, 1998, 2004, and 2010) but once they're in, they're in - respondents have a weight of zero for interview waves when they were nursing home residents; but they're still responding and will continue to contribute to your statistics so long as you're generalizing about a population from a previous wave (for example: it's possible to compute "among all americans who were 50+ years old in 1998, x% lived in nursing homes by 2010"). my source for that 411? page 13 of the design doc. wicked. this new github repository contains five scripts: 1992 - 2010 download HRS microdata.R loop through every year and every file, download, then unzip everything in one big party impor t longitudinal RAND contributed files.R create a SQLite database (.db) on the local disk load the rand, rand-cams, and both rand-family files into the database (.db) in chunks (to prevent overloading ram) longitudinal RAND - analysis examples.R connect to the sql database created by the 'import longitudinal RAND contributed files' program create tw o database-backed complex sample survey object, using a taylor-series linearization design perform a mountain of analysis examples with wave weights from two different points in the panel import example HRS file.R load a fixed-width file using only the sas importation script directly into ram with < a href="http://blog.revolutionanalytics.com/2012/07/importing-public-data-with-sas-instructions-into-r.html">SAScii parse through the IF block at the bottom of the sas importation script, blank out a number of variables save the file as an R data file (.rda) for fast loading later replicate 2002 regression.R connect to the sql database created by the 'import longitudinal RAND contributed files' program create a database-backed complex sample survey object, using a taylor-series linearization design exactly match the final regression shown in this document provided by analysts at RAND as an update of the regression on pdf page B76 of this document . click here to view these five scripts for more detail about the health and retirement study (hrs), visit: michigan's hrs homepage rand's hrs homepage the hrs wikipedia page a running list of publications using hrs notes: exemplary work making it this far. as a reward, here's the detailed codebook for the main rand hrs file. note that rand also creates 'flat files' for every survey wave, but really, most every analysis you c an think of is possible using just the four files imported with the rand importation script above. if you must work with the non-rand files, there's an example of how to import a single hrs (umich-created) file, but if you wish to import more than one, you'll have to write some for loops yourself. confidential to sas, spss, stata, and sudaan users: a tidal wave is coming. you can get water up your nose and be dragged out to sea, or you can grab a surf board. time to transition to r. :D
Facebook
TwitterThe American Community Survey (ACS) Public Use Microdata Sample (PUMS) contains a sample of responses to the ACS. The ACS PUMS dataset includes variables for nearly every question on the survey, as well as many new variables that were derived after the fact from multiple survey responses (such as poverty status).Each record in the file represents a single person, or, in the household-level dataset, a single housing unit. In the person-level file, individuals are organized into households, making possible the study of people within the contexts of their families and other household members. Individuals living in Group Quarters, such as nursing facilities or college facilities, are also included on the person file. ACS PUMS data are available at the nation, state, and Public Use Microdata Area (PUMA) levels. PUMAs are special non-overlapping areas that partition each state into contiguous geographic units containing roughly 100,000 people each. ACS PUMS files for an individual year, such as 2019, contain data on approximately one percent of the United States population.
Facebook
TwitterA random sample of households were invited to participate in this survey. In the dataset, you will find the respondent level data in each row with the questions in each column. The numbers represent a scale option from the survey, such as 1=Excellent, 2=Good, 3=Fair, 4=Poor. The question stem, response option, and scale information for each field can be found in the var "variable labels" and "value labels" sheets. VERY IMPORTANT NOTE: The scientific survey data were weighted, meaning that the demographic profile of respondents was compared to the demographic profile of adults in Bloomington from US Census data. Statistical adjustments were made to bring the respondent profile into balance with the population profile. This means that some records were given more "weight" and some records were given less weight. The weights that were applied are found in the field "wt". If you do not apply these weights, you will not obtain the same results as can be found in the report delivered to the Bloomington. The easiest way to replicate these results is likely to create pivot tables, and use the sum of the "wt" field rather than a count of responses.