Urban and regional planners rely on Average Household Size as a foundational indicator for many of their models, calculations, and plans. Average household size (also known as "people per household") is a reflection of many dynamics at play, for example:
- Age of the population, as many older people tend to live in smaller households (one-person or two-person households)
- Housing prices in the area, proximity to colleges and universities, and how likely people are to live with roommates
- Family norms and traditions (e.g., multigenerational families are more common in some areas and with some population groups)
This feature layer contains the Average Household Size and Population Density for states, counties, and tracts. Data from U.S. Census Bureau's 2014-2018 American Community Survey's 5-year estimates, Tables B25010 and B01001. Population Density was calculated based on the total population and area of land fields, which both came from the U.S. Census Bureau. See the field description for the formula used.
This layer is symbolized to show the average household size. Population density, as well as average household size breakdown by housing tenure is presented in the pop-up. Click the Data tab -> Fields list to see all available attributes and their definitions.
The 2019 Sierra Leone Demographic and Health Survey (2019 SLDHS) is a nationwide survey with a nationally representative sample of approximately 13,872 selected households. All women age 15-49 who are usual household members or who spent the night before the survey in the selected households were eligible for individual interviews.
The primary objective of the 2019 SLDHS is to provide up-to-date estimates of basic demographic and health indicators. Specifically, the survey collected information on fertility, awareness and use of family planning methods, breastfeeding practices, nutritional status of women and children, maternal and child health, adult and childhood mortality, women’s empowerment, domestic violence, female genital cutting, prevalence and awareness and behaviour regarding HIV/AIDS and other sexually transmitted infections (STIs), and other health-related issues such as smoking.
The information collected through the 2019 SLDHS is intended to assist policymakers and programme managers in evaluating and designing programmes and strategies for improving the health of the country’s population.
National coverage
The survey covered all de jure household members (usual residents), all women aged 15-49, all men aged 15-59, and all children aged 0-5 resident in the household.
Sample survey data [ssd]
The sampling frame used for the 2019 SLDHS is the Population and Housing Census of the Republic of Sierra Leone, which was conducted in 2015 by Statistics Sierra Leone. Administratively, Sierra Leone is divided into provinces. Each province is subdivided into districts, each district is further divided into chiefdoms/census wards, and each chiefdom/census ward is divided into sections. During the 2015 Population and Housing Census, each locality was subdivided into convenient areas called census enumeration areas (EAs). The primary sampling unit (PSU), referred to as a cluster for the 2019 SLDHS, is defined based on EAs from the 2015 EA census frame. The 2015 Population and Housing Census provided the list of EAs that served as a foundation to estimate the number of households and distinguish EAs as urban or rural for the survey sample frame.
The sample for the 2019 SLDHS was a stratified sample selected in two stages. Stratification was achieved by separating each district into urban and rural areas. In total, 31 sampling strata were created. Samples were selected independently in every stratum via a two-stage selection process. Implicit stratifications were achieved at each of the lower administrative levels by sorting the sampling frame before sample selection according to administrative order and by using probability-proportional-to-size selection during the first sampling stage.
In the first stage, 578 EAs were selected with probability proportional to EA size. EA size was the number of households residing in the EA. A household listing operation was carried out in all selected EAs, and the resulting lists of households served as a sampling frame for the selection of households in the second stage. In the second stage’s selection, a fixed number of 24 households were selected in every cluster through equal probability systematic sampling, resulting in a total sample size of approximately 13,872 selected households. The household listing was carried out using tablets, and random selection of households was carried out through computer programming. The survey interviewers interviewed only the pre-selected households. To prevent bias, no replacements and no changes of the pre-selected households were allowed in the implementing stages.
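The first-stage selection described above can be illustrated with a minimal sketch of systematic probability-proportional-to-size (PPS) sampling. The sampling frame below is randomly generated and purely hypothetical, and the function name `pps_systematic` is an illustrative assumption; it is not the procedure or software actually used for the 2019 SLDHS. Only the figures 578 EAs and 24 households per cluster come from the text.

```python
import random

def pps_systematic(sizes, n_select, rng):
    """Systematic PPS selection: return the indices of selected units.

    Units are laid end to end on a line whose length is the total size;
    a random start and a fixed interval pick n_select points on that line.
    """
    total = sum(sizes)
    interval = total / n_select
    start = rng.random() * interval
    selected = []
    cumulative = 0.0  # total size of all units before index idx
    idx = 0
    for k in range(n_select):
        target = start + k * interval
        # Advance until the target point falls inside unit idx.
        while idx < len(sizes) - 1 and cumulative + sizes[idx] <= target:
            cumulative += sizes[idx]
            idx += 1
        selected.append(idx)
    return selected

rng = random.Random(2019)
# Hypothetical frame: 5,000 EAs with 50 to 300 households each.
ea_sizes = [rng.randint(50, 300) for _ in range(5000)]

selected_eas = pps_systematic(ea_sizes, 578, rng)
households_per_cluster = 24
total_households = len(selected_eas) * households_per_cluster
print(len(selected_eas), total_households)
```

With 578 clusters and a fixed take of 24 households, the product reproduces the approximate total of 13,872 selected households quoted in the text.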
For further details on sample selection, see Appendix A of the final report.
Computer Assisted Personal Interview [capi]
Five questionnaires were used for the 2019 SLDHS: The Household Questionnaire, the Woman’s Questionnaire, the Man’s Questionnaire, the Biomarker Questionnaire, and the Fieldworker Questionnaire. The questionnaires, based on The DHS Program’s standard Demographic and Health Survey (DHS-7) questionnaires, were adapted to reflect the population and health issues relevant to Sierra Leone. Comments were solicited from various stakeholders representing government ministries and agencies, nongovernmental organisations, and international donors. The survey protocol was reviewed and approved by the Sierra Leone Ethics and Scientific Review Committee and the ICF Institutional Review Board. All questionnaires were finalised in English, and the 2019 SLDHS used computer-assisted personal interviewing (CAPI) for data collection.
The processing of the 2019 SLDHS data began almost as soon as the fieldwork started. As data collection was completed in each cluster, all electronic data files were transferred via the IFSS to the Stats SL central office in Freetown. These data files were registered and checked for inconsistencies, incompleteness, and outliers. The field teams received alerts on any inconsistencies and errors. Secondary editing, carried out in the central office, involved resolving inconsistencies and coding open-ended questions. The Stats SL data processor coordinated the exercise at the central office. The biomarker paper questionnaires were compared with electronic data files to check for any inconsistencies in data entry. Data entry and editing were carried out using the CSPro Systems software package. Concurrent processing of the data offered a distinct advantage because it maximised the likelihood of the data being error-free and accurate. Timely generation of field check tables allowed for effective monitoring. The secondary editing of the data was completed in mid-October 2019.
A total of 13,793 households were selected for the sample, of which 13,602 were occupied. Of the occupied households, 13,399 were successfully interviewed, yielding a response rate of 99%. In the interviewed households, 16,099 women age 15-49 were identified for individual interviews; interviews were completed with 15,574 women, yielding a response rate of 97%. In the subsample of households selected for the male survey, 7,429 men age 15-59 were identified, and 7,197 were successfully interviewed, yielding a response rate of 97%.
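The response rates quoted above follow from the counts in the same paragraph. The short check below recomputes them; the helper name `response_rate` is illustrative, and the calculation is a plain ratio of completed to eligible units rather than any weighted rate the report may use elsewhere.

```python
def response_rate(completed, eligible):
    """Unweighted response rate as a percentage."""
    return 100.0 * completed / eligible

rates = {
    "households": response_rate(13_399, 13_602),
    "women 15-49": response_rate(15_574, 16_099),
    "men 15-59": response_rate(7_197, 7_429),
}

for group, rate in rates.items():
    print(f"{group}: {rate:.1f}% (rounds to {round(rate)}%)")
```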
The estimates from a sample survey are affected by two types of errors: nonsampling errors and sampling errors. Nonsampling errors are the results of mistakes made in implementing data collection and data processing, such as failure to locate and interview the correct household, misunderstanding of the questions on the part of either the interviewer or the respondent, and data entry errors. Although numerous efforts were made during the implementation of the 2019 Sierra Leone Demographic and Health Survey (SLDHS) to minimise this type of error, nonsampling errors are impossible to avoid and difficult to evaluate statistically.
Sampling errors, on the other hand, can be evaluated statistically. The sample of respondents selected in the 2019 SLDHS is only one of many samples that could have been selected from the same population, using the same design and expected size. Each of these samples would yield results that differ somewhat from the results of the actual sample selected. Sampling errors are a measure of the variability among all possible samples. Although the degree of variability is not known exactly, it can be estimated from the survey results.
Sampling errors are usually measured in terms of the standard error for a particular statistic (mean, percentage, etc.), which is the square root of the variance. The standard error can be used to calculate confidence intervals within which the true value for the population can reasonably be assumed to fall. For example, for any given statistic calculated from a sample survey, the value of that statistic will fall within a range of plus or minus two times the standard error of that statistic in 95% of all possible samples of identical size and design.
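As a small illustration of the "plus or minus two standard errors" rule, the sketch below builds an approximate 95% confidence interval. The estimate and standard error are hypothetical values chosen for the example, not figures from the 2019 SLDHS.

```python
def confidence_interval(estimate, standard_error, multiplier=2.0):
    """Approximate 95% interval: estimate +/- multiplier * standard error."""
    half_width = multiplier * standard_error
    return estimate - half_width, estimate + half_width

# Hypothetical proportion of 0.40 with a standard error of 0.012.
low, high = confidence_interval(0.40, 0.012)
print(f"{low:.3f} to {high:.3f}")
```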
If the sample of respondents had been selected as a simple random sample, it would have been possible to use straightforward formulas for calculating sampling errors. However, the 2019 SLDHS sample is the result of a multi-stage stratified design, and, consequently, it was necessary to use more complex formulas. Sampling errors are computed in SAS, using programmes developed by ICF. These programmes use the Taylor linearization method to estimate variances for survey estimates that are means, proportions, or ratios. The Jackknife repeated replication method is used for variance estimation of more complex statistics such as fertility and mortality rates.
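To give a feel for repeated replication, here is a minimal delete-one-cluster jackknife for a ratio estimate. It is a sketch under simplifying assumptions: stratification is ignored, the pseudo-value form of the jackknife is assumed, and the cluster totals are hypothetical. It is not the ICF SAS programme referred to above.

```python
from math import sqrt

def ratio(clusters):
    """Ratio estimate r = sum(y) / sum(x) over (y, x) cluster totals."""
    return sum(y for y, _ in clusters) / sum(x for _, x in clusters)

def jackknife_se(clusters):
    """Standard error of the ratio from delete-one-cluster replicates."""
    k = len(clusters)
    full = ratio(clusters)
    pseudo_values = []
    for i in range(k):
        reduced = clusters[:i] + clusters[i + 1:]  # drop cluster i
        pseudo_values.append(k * full - (k - 1) * ratio(reduced))
    variance = sum((p - full) ** 2 for p in pseudo_values) / (k * (k - 1))
    return sqrt(variance)

# Hypothetical (y, x) totals for five clusters of 24 households each.
clusters = [(12, 24), (9, 24), (15, 24), (10, 24), (14, 24)]
estimate = ratio(clusters)
se = jackknife_se(clusters)
print(f"r = {estimate:.3f}, SE = {se:.4f}")
```

When every cluster has the same denominator, as here, the pseudo-values reduce to the per-cluster proportions, so the result matches the usual standard error of their mean.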
Note: A more detailed description of estimates of sampling errors is presented in Appendix B of the survey report.
Data Quality Tables
See details of the data quality tables in Appendix C of the final report.
The Multiple Indicator Cluster Survey (MICS) is a household survey programme developed by UNICEF to assist countries in filling data gaps for monitoring human development in general and the situation of children and women in particular. MICS is capable of producing statistically sound, internationally comparable estimates of social indicators. The current round of MICS is focused on providing a monitoring tool for the Millennium Development Goals (MDGs), the World Fit for Children (WFFC), as well as for other major international commitments, such as the United Nations General Assembly Special Session (UNGASS) on HIV/AIDS and the Abuja targets for malaria.
Survey Objectives
The 2006 Bangladesh Multiple Indicator Cluster Survey has the following objectives:
- To provide up-to-date information for assessing the situation of children and women in Bangladesh;
- To furnish data needed for monitoring progress toward goals established by the Millennium Development Goals, the goals of A World Fit For Children (WFFC), and other internationally agreed upon goals, as a basis for future action;
- To contribute to the improvement of data and monitoring systems in Bangladesh and to strengthen technical expertise in the design, implementation, and analysis of such systems.
Survey Content
MICS questionnaires are designed in a modular fashion that can be easily customized to the needs of a country. They consist of a household questionnaire, a questionnaire for women aged 15-49 and a questionnaire for children under the age of five (to be administered to the mother or caretaker). Other than a set of core modules, countries can select which modules they want to include in each questionnaire.
Survey Implementation
The survey was implemented by the Bangladesh Bureau of Statistics, with the support and assistance of UNICEF and other partners. Technical assistance and training for the surveys is provided through a series of regional workshops, covering questionnaire content, sampling and survey implementation; data processing; data quality and data analysis; report writing and dissemination.
The survey is nationally representative and covers the whole of Bangladesh.
Households (defined as a group of persons who usually live and eat together)
De jure household members (defined as members of the household who usually live in the household, which may include people who did not sleep in the household the previous night, but does not include visitors who slept in the household the previous night but do not usually live in the household)
Women aged 15-49
Children aged 0-4
The survey covered all de jure household members (usual residents), all women aged 15-49 years resident in the household, and all children aged 0-4 years (under age 5) resident in the household.
Sample survey data [ssd]
The primary objective of the sample design for the Bangladesh Multiple Indicator Cluster Survey was to produce statistically reliable estimates of most indicators, at the national level, for urban and rural areas, and for the six divisions of the country, municipal areas, city corporation's slum areas of two big cities and tribal areas. Rural areas, municipal areas, city corporation areas, slum areas and tribal areas were defined as the sampling domain.
A multi-stage, stratified cluster sampling approach was used for the selection of the survey sample.
Sample Size and Sample Allocation
The target sample size for the Bangladesh MICS was calculated as 68247 households. For the calculation of the sample size, the key indicator used was the DPT immunization (3+doses) prevalence among children aged 12-23 months. The following formula was used to estimate the required sample size for these indicators:
$$ n = \frac{4 \, r \, (1 - r) \, f \, (1.1)}{(0.12\, r)^2 \, p \, n_h} $$
where
- n is the required sample size, expressed as number of households
- 4 is a factor to achieve the 95 per cent level of confidence
- r is the predicted or anticipated prevalence (coverage rate) of the indicator
- 1.1 is the factor necessary to raise the sample size by 10 per cent for non-response
- f is the shortened symbol for deff (design effect)
- 0.12r is the margin of error to be tolerated at the 95 per cent level of confidence, defined as 12 per cent of r (relative sampling error of r)
- p is the proportion of the total population upon which the indicator, r, is based
- nh is the average household size.
For the calculation, r (DPT immunization 3+doses prevalence) was assumed to be 39.7 percent in the Rangamati districts. The value of deff (design effect) was taken as 1.5 based on estimates from previous surveys, p (percentage of children aged 12-23 months in the total population) was taken as 2.3 percent, and nh (average household size) was taken as 4.9 persons per household.
For sub-national estimates a larger margin of error has to be accepted, as the MICS manual also acknowledges, so the margin of error needs to be relaxed considerably. If a rate of 30% of r is used, this gives a margin of error of ± 0.06 for a prevalence rate of 0.20, ± 0.12 for a prevalence rate of 0.40, and so on. For this reason, 30% of r was used in the case of Rangamati.
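The sample size calculation can be sketched in a few lines. This assumes the formula is the quotient of its two bracketed terms, n = 4 r (1 - r) f (1.1) / ((m r)^2 p nh), where m is the relative margin of error (0.12 by default, 0.30 when relaxed); the function and parameter names are illustrative. The inputs are the Rangamati values quoted above.

```python
def required_households(r, deff, p, hh_size, rel_margin=0.12, nonresponse=1.1):
    """Required number of households for a prevalence indicator."""
    numerator = 4 * r * (1 - r) * deff * nonresponse
    denominator = (rel_margin * r) ** 2 * p * hh_size
    return numerator / denominator

inputs = dict(r=0.397, deff=1.5, p=0.023, hh_size=4.9)

n_default = required_households(**inputs)                   # 12% of r
n_relaxed = required_households(**inputs, rel_margin=0.30)  # 30% of r

# Absolute margins of error implied by a 30% relative margin.
margins = {prevalence: 0.30 * prevalence for prevalence in (0.20, 0.40)}

print(round(n_default), round(n_relaxed), margins)
```

Under these inputs the relaxed margin brings the requirement down to the order of a thousand households per district, roughly a sixth of what the default 12 per cent margin would demand.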
The resulting number of households from this exercise was about 900, which is the sample size needed in each district, thus yielding about 68250 in total. The average cluster size in the Bangladesh MICS was determined as 35 households, based on a number of considerations, including the budget available and the time that would be needed per team to complete one cluster. Dividing the number of households per district by the number of households per cluster, it was calculated that a total of 26 clusters would need to be selected in each district.
Equal allocation of the total sample size to the 75 domains was targeted. Therefore, 26 clusters were allocated to each district, with the final sample size calculated at 68250 households (1950 clusters × 35 households per cluster). In each stratum, the clusters (primary sampling units) were distributed to rural, municipal, city corporation, slum and tribal areas using the PPS method.
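The allocation arithmetic can be checked directly. The sketch assumes the number of clusters per district is obtained by rounding 900 / 35 up to the next whole cluster, which is an interpretation of the text rather than a stated rule.

```python
import math

households_per_district = 900  # approximate requirement per district
cluster_size = 35              # households per cluster
domains = 75                   # districts / sampling domains

clusters_per_district = math.ceil(households_per_district / cluster_size)
total_clusters = clusters_per_district * domains
total_households = total_clusters * cluster_size

print(clusters_per_district, total_clusters, total_households)
```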
Sampling Frame and Selection of Clusters
The 2001 census frame was used for the selection of clusters. Census enumeration areas were defined as primary sampling units (PSUs), and were selected from each of the sampling domains by using systematic pps (probability proportional to size) sampling procedures, based on the estimated sizes of the enumeration areas from the 2001 Population Census. The first stage of sampling was thus completed by selecting the required number of enumeration areas from each of the 5 strata namely rural, municipal, city corporations, slum and tribal areas.
Listing Activities
Since the sample frame of the 2001 Population Census was not up to date, household lists in all selected enumeration areas were updated prior to the selection of households. For this purpose, listing teams were formed, who visited each enumeration area, and listed the occupied households. The BBS officials working in the upazila were responsible for the listing of all households in the respective PSUs.
Selection of Households
Lists of households were prepared by the Upazila officials of BBS. The households were sequentially numbered from 1 to 100 (or more) in each enumeration area, and 35 households in each enumeration area were then selected using systematic selection procedures.
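A minimal sketch of equal-probability systematic selection from a numbered household list follows. The fractional-interval variant shown here is one common way to do it and is an assumption; the exact procedure used by BBS is not described in the text. The list length of 100 is a hypothetical example.

```python
import random

def systematic_selection(n_listed, n_take, rng):
    """Return n_take serial numbers (1-based) chosen systematically."""
    if n_take > n_listed:
        raise ValueError("cannot select more households than were listed")
    interval = n_listed / n_take       # sampling interval, may be fractional
    start = rng.random() * interval    # random start within the first interval
    picks = []
    for k in range(n_take):
        position = int(start + k * interval)          # 0-based position
        picks.append(min(position, n_listed - 1) + 1) # convert to serial number
    return picks

rng = random.Random(2006)
selected_households = systematic_selection(n_listed=100, n_take=35, rng=rng)
print(selected_households)
```

Because the interval is at least 1, consecutive picks always land on different households, so the 35 serial numbers are distinct.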
(Information extracted from the final report: BBS and UNICEF. 2007. Bangladesh Multiple Indicator Cluster Survey 2006, Final Report. Dhaka, Bangladesh: BBS and UNICEF)
No major deviations from the original sample design were made. All sample enumeration areas were accessed and successfully interviewed with good response rates.
Face-to-face [f2f]
The questionnaires of MICS 2006 are based on the global format of MICS3 model questionnaire. From the MICS3 model English version, the questionnaires were translated into Bangla and were pre-tested in four sample areas of which two were in rural areas, one in City Corporation and one in the slum area during May 2006. Based on the results of the pre-test, modifications were made to the wording and translation of the questionnaires.
The questionnaire for under-five children was administered to mothers or caretakers of under-five children living in the households. Normally, the questionnaire was administered to mothers of under-five children; in cases when the mother was not listed in the household roster, a primary caretaker for the child was identified and interviewed.
Data editing took place at a number of stages throughout the processing (see Other processing), including:
a) Office editing and coding
b) During data entry
c) Structure checking and completeness
d) Secondary editing
e) Structural checking of SPSS data files
Detailed documentation of the editing of data can be found in the data processing guidelines.
Of the 68,247 households selected for the sample, 67,540 were found to be occupied. Of these, 62,463 households were successfully interviewed, for a household response rate of 92.5 percent. In the interviewed households, 78,260 eligible women (age 15-49) were identified. Of these, 69,860 women were successfully interviewed, yielding a response rate of 89.3 percent. In addition, 34,710 children under 5 were listed in the household questionnaire.
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Supporting documentation on code lists, subject definitions, data accuracy, and statistical testing can be found on the American Community Survey website in the Data and Documentation section.
Sample size and data quality measures (including coverage rates, allocation rates, and response rates) can be found on the American Community Survey website in the Methodology section.
Although the American Community Survey (ACS) produces population, demographic and housing unit estimates, for 2010, the 2010 Census provides the official counts of the population and housing units for the nation, states, counties, cities and towns. For 2006 to 2009, the Population Estimates Program provides intercensal estimates of the population for the nation, states, and counties.
Explanation of Symbols:
- An "**" entry in the margin of error column indicates that either no sample observations or too few sample observations were available to compute a standard error and thus the margin of error. A statistical test is not appropriate.
- An "-" entry in the estimate column indicates that either no sample observations or too few sample observations were available to compute an estimate, or a ratio of medians cannot be calculated because one or both of the median estimates falls in the lowest interval or upper interval of an open-ended distribution.
- An "-" following a median estimate means the median falls in the lowest interval of an open-ended distribution.
- An "+" following a median estimate means the median falls in the upper interval of an open-ended distribution.
- An "***" entry in the margin of error column indicates that the median falls in the lowest interval or upper interval of an open-ended distribution. A statistical test is not appropriate.
- An "*****" entry in the margin of error column indicates that the estimate is controlled. A statistical test for sampling variability is not appropriate.
- An "N" entry in the estimate and margin of error columns indicates that data for this geographic area cannot be displayed because the number of sample cases is too small.
- An "(X)" means that the estimate is not applicable or not available.
Estimates of urban and rural population, housing units, and characteristics reflect boundaries of urban areas defined based on Census 2000 data. Boundaries for urban areas have not been updated since Census 2000. As a result, data for urban and rural areas from the ACS do not necessarily reflect the results of ongoing urbanization.
While the 2006-2010 American Community Survey (ACS) data generally reflect the December 2009 Office of Management and Budget (OMB) definitions of metropolitan and micropolitan statistical areas; in certain instances the names, codes, and boundaries of the principal cities shown in ACS tables may differ from the OMB definitions due to differences in the effective dates of the geographic entities.
The methodology for calculating median income and median earnings changed between 2008 and 2009. Medians over $75,000 were most likely affected. The underlying income and earning distribution now uses $2,500 increments up to $250,000 for households, non-family households, families, and individuals and employs a linear interpolation method for median calculations. Before 2009 the highest income category was $200,000 for households, families and non-family households ($100,000 for individuals) and portions of the income and earnings distribution contained intervals wider than $2,500. Those cases used a Pareto Interpolation Method.
Data are based on a sample and are subject to sampling variability. The degree of uncertainty for an estimate arising from sampling variability is represented through the use of a margin of error. The value shown here is the 90 percent margin of error. The margin of error can be interpreted roughly as providing a 90 percent probability that the interval defined by the estimate minus the margin of error and the estimate plus the margin of error (the lower and upper confidence bounds) contains the true value. In addition to sampling variability, the ACS estimates are subject to nonsampling error (for a discussion of nonsampling variability, see Accuracy of the Data). The effect of nonsampling error is not represented in these tables.
Source: U.S. Census Bureau, 2006-2010 American Community Survey
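To show how a published 90 percent margin of error is typically used, the sketch below derives the confidence bounds and converts the margin to a standard error. It assumes the conventional factor of 1.645 for a 90 percent margin; the estimate and margin are hypothetical values, not figures from any ACS table.

```python
Z_90 = 1.645  # assumed factor linking a 90 percent margin of error to a standard error
Z_95 = 1.96   # corresponding factor for a 95 percent margin

def confidence_bounds(estimate, moe):
    """Lower and upper bounds: estimate minus and plus the margin of error."""
    return estimate - moe, estimate + moe

def standard_error(moe, z=Z_90):
    """Standard error implied by a margin of error at the given z factor."""
    return moe / z

# Hypothetical average household size with its 90 percent margin of error.
estimate, moe = 2.63, 0.05

lower, upper = confidence_bounds(estimate, moe)
se = standard_error(moe)
moe_95 = Z_95 * se  # the same uncertainty expressed at the 95 percent level

print(f"90% bounds: {lower:.2f} to {upper:.2f}; SE = {se:.4f}; 95% MOE = {moe_95:.4f}")
```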
HOME Income Limits are calculated using the same methodology that HUD uses for calculating the income limits for the Section 8 program. These limits are based on HUD estimates of median family income, with adjustments based on family size. The Department's methodology for calculating nationwide median family income figures is described in Notice PDR-2001-01. For more information about how HUD calculates the HOME Program income limits, and for more general information, visit huduser.gov, the website for HUD's Office of Policy Development and Research.
The basic goal of this survey is to provide a necessary database for formulating national policies at various levels. The survey measures the contribution of the household sector to the Gross National Product (GNP), determines the incidence of poverty, and provides weighted data reflecting the relative importance of consumption items, to be used in determining the benchmark for rates and prices of items and services. The survey is a fundamental cornerstone in the process of studying the nutritional status in the Palestinian territory.
The data are representative at the region level (West Bank, Gaza Strip) and by locality type (urban, rural, camp).
Household, individual
The survey covered all Palestinian households that are usually resident in the Palestinian Territory.
Sample survey data [ssd]
The target population in this sample survey comprises all households living in the West Bank and Gaza Strip, excluding nomads and students.
The sample design is a stratified two-stage design for households selected to be interviewed. At the first stage a sample of cells (PSUs) was selected from the PCBS master sample frame. At the second stage, a sample of households was selected after a complete household listing of the sampled cells.
Sample Design
Stratification
Four levels of stratification have been made:
Stratification by District.
Stratification by place of residence which comprises:
(a) Municipalities (b) Villages (c) Refugee Camps
Stratification by locality size.
Stratification by cell identification in that order.
Sample Size
The sample size is about 3,591 households, allowing for non-response and related losses.
Target cluster size
The next important issue in the sample design is the target cluster size or “sample-take” which is the average number of households to be selected per PSU. In this survey, the sample take is around 10 households.
Self-weighting design:
At the first stage, clusters or “cells” have been selected with PPS, that is, with probability proportional to the estimated measure of size (Mi) for unit i:
Where the summation covers all clusters in the population; a = 300 is the total number of selected clusters. It is highly desirable for the PECS to have a constant overall sampling rate (f), i.e. to have a self-weighting sample. This requires the second stage probability for the selection of households and persons within any sample cluster i to be as follows:
Where b is a constant (independent of i) to be determined to obtain the required sample size, n = 3,591 households. Since the measures of size are likely to differ from the actual number of households listed in any cluster i, the actual number of households selected with the above rate will vary from one cluster to another, as follows:
Adding all clusters in the sample results in the required constant b, to achieve the target sample size n as:
Hence to control the overall sample size, b is determined after completing the listing in all sample areas.
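How b might be fixed once listing is complete can be sketched as follows. This assumes the usual self-weighting arrangement, in which the second-stage rate in cluster i is b divided by its measure of size, so that the take in cluster i is b times the ratio of listed households to the measure of size. That relationship, the function name and all the numbers below are assumptions made for illustration, not formulas or data quoted from the source.

```python
def self_weighting_takes(measure_sizes, listed_counts, n_target):
    """Return (b, per-cluster takes, per-cluster sampling intervals)."""
    # Ratio of households actually listed to the measure of size, per cluster.
    ratios = [listed / size for size, listed in zip(measure_sizes, listed_counts)]
    # Choose b so that the takes add up to the target sample size.
    b = n_target / sum(ratios)
    takes = [b * ratio for ratio in ratios]
    # Interval applied to the household list: one household in every size / b.
    intervals = [size / b for size in measure_sizes]
    return b, takes, intervals

# Hypothetical clusters: measure of size at selection vs. households listed.
measure_sizes = [100, 120, 80, 150]
listed_counts = [110, 115, 90, 140]

b, takes, intervals = self_weighting_takes(measure_sizes, listed_counts, n_target=40)
print(f"b = {b:.2f}")
for take, interval in zip(takes, intervals):
    print(f"take = {take:.2f}, interval = {interval:.2f}")
```

The takes differ across clusters while the overall sampling rate stays constant, which is the point of determining b only after the listing.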
The above procedure allows for variation in sample sizes bi at the level of individual clusters, in order to provide a self-weighting sample. Households within each sample cluster shall be selected systematically from the lists prepared for that purpose, using the sampling interval,
Where:
a Number of cells in the sample (equals 360)
Number of housing units in cell I
Number of listed of households in cell I
n Proposed sample size (n= 3,591 HHs)
b Average sample take
Sample take in cell I
f Sample rate
First-stage sampling rate
Second-stage sampling rate
Which is fixed for each cluster but varies between clusters depending on the measure of size () with which the area was selected at the first stage.
The sample-take must be allowed to vary depending on the actual number of households found after listing. However, provision must be made to avoid extreme variation in cluster sample size. This could be done by using the above procedure to compute the ratio for each cluster in the sample. If this ratio lies outside the range say 0.5 - 4.0, adjust , i.e. the interval to be applied for the selection of households in the cluster, so as to keep the ratio within the above range.
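The allocation described above can be sketched in Python. This is a minimal illustration, not the PCBS procedure: it assumes the second-stage rate is b divided by the measure of size, that the sample take in a cell is b times the ratio of listed households to the measure of size, and that an out-of-range ratio is simply clamped to 0.5 or 4.0 before b is solved. The function name, argument names, and the input figures in the usage note are hypothetical.

```python
def self_weighting_takes(measures, listed, n_target, lo=0.5, hi=4.0):
    """Return (b, takes, intervals) for a self-weighting two-stage sample.

    measures[i]: measure of size used at the first stage for cell i
    listed[i]:   households actually found in cell i at listing
    n_target:    required total number of sample households
    """
    if len(measures) != len(listed):
        raise ValueError("measures and listed must have the same length")
    # Ratio of listed households to the measure of size, clamped to
    # [lo, hi] so that no cell gets an extreme sample take (assumption).
    ratios = [min(max(l / m, lo), hi) for m, l in zip(measures, listed)]
    # b is fixed only after listing, so that the takes add up to n_target.
    b = n_target / sum(ratios)
    takes = [b * r for r in ratios]
    # Interval actually applied to each cell's household list.
    # For an unclamped cell this equals measure / b.
    intervals = [l / t for l, t in zip(listed, takes)]
    return b, takes, intervals
```

For example, `self_weighting_takes([100, 120, 80], [110, 60, 400], 30)` returns takes that sum to 30, with the third cell held at four times the average take because its listing is far larger than its measure of size.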
Sample Rotation
A total of 480 cells were divided into 24 groups (sub-samples), each consisting of 20 cells. A sub-sample of 360 cells is used over the year, with each monthly sample made up of two minor samples (30 cells). The monthly samples consist of independent cells rather than cross-sectional ones, and each monthly sample (round) comprises 300 households.
Replication schedule: replications A to L (columns) by month 1 to 12 (rows), with one replication marked (×) for each month.
Estimations Procedure
The sample is self-weighting by design. To estimate a given total Y for a given sub-population A, we use the following formula:
$$\hat{Y}_A = \sum_h \sum_i \sum_{j \in A} W_{hij} \, y_{hij}$$
But since $W_{hij}$ is constant for all $j$ within $i$, the estimating formula becomes:
$$\hat{Y}_A = \sum_h \sum_i W_{hi} \, y_{hiA}, \qquad y_{hiA} = \sum_{j \in A} y_{hij} \qquad (1)$$
Where:
$\hat{Y}_A$ = Estimated total for variable Y in sub-population A
$h$ = The sub-stratum within the estimation domain
$i$ = The sample PSU (cell)
$j$ = The unit of analysis or element
$A$ = Subset of elements possessing a given attribute, that is, belonging to a given sub-population A
$y_{hij}$ = Observed value of variable “y” for the j-th element of the i-th sample PSU in stratum h
$W_{hij}$ = Final (adjusted) sampling weight for the element
$y_{hiA}$ = The unweighted PSU total within h for sub-population A
The estimator for a given ratio for sub-population A is the following:
$$\hat{R}_A = \frac{\hat{Y}_A}{\hat{X}_A} \qquad (2)$$
Where:
$\hat{R}_A$ = Estimate for the ratio of two variables, Y/X, in sub-population A
$\hat{X}_A$ = Estimated total for variable X in sub-population A, given by formula (1)
$\hat{Y}_A$ = Estimated total for variable Y in sub-population A, also given by formula (1)
Means and proportions are special types of ratios. In the case of the mean, the variable X, in the denominator of the ratio, is defined to equal 1 for each element, so that denominator is the sum of weights in the sub-population. In the case of the proportion, the variable X in the denominator is also defined to equal 1 for all elements. In addition, the variable Y in the numerator is binomial and is defined to equal either 0 or 1, depending on the absence or presence of a specified attribute in the element observed.
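The relationship between totals, ratios, means, and proportions can be illustrated with a short Python sketch. It assumes one weight per element and a boolean flag for membership in sub-population A; the function names and the sample values in the usage note are hypothetical and are not taken from the survey.

```python
def weighted_total(weights, values, in_subpop):
    """Weighted total of `values` over elements flagged as in sub-population A."""
    return sum(w * v for w, v, a in zip(weights, values, in_subpop) if a)


def weighted_ratio(weights, y, x, in_subpop):
    """Ratio of two weighted totals, Y / X, within sub-population A."""
    denominator = weighted_total(weights, x, in_subpop)
    if denominator == 0:
        raise ValueError("denominator total is zero for this sub-population")
    return weighted_total(weights, y, in_subpop) / denominator


def weighted_mean(weights, y, in_subpop):
    """Mean as a ratio whose X variable equals 1 for every element."""
    return weighted_ratio(weights, y, [1] * len(y), in_subpop)


def weighted_proportion(weights, has_attribute, in_subpop):
    """Proportion as a mean of a 0/1 variable."""
    return weighted_mean(weights, [1 if h else 0 for h in has_attribute], in_subpop)
```

With weights `[2, 2, 3]`, values `[10, 20, 30]`, and only the first two elements in the sub-population, `weighted_mean` gives (20 + 40) / 4 = 15. In a strictly self-weighting sample all weights are equal and cancel out of ratios, so the weighted and unweighted means coincide.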
Calculation of Variance
It is very important to calculate standard errors for the main survey estimates so that the user can have an idea of their reliability or precision.
The variance calculation will use the method of ultimate clusters. Within any domain of estimation, for a sub-population A, and for a characteristic Y, the formulas are: (a) The variance of an estimator of a total is estimated by:
(3)
Where:
(4)
and: (5)
The expression in (3) is an unbiased estimator of the variance. (b) The variance of an estimator of a ratio is estimated by:
(6)
Where:
$V(\hat{Y}_A)$ and $V(\hat{X}_A)$ are calculated according to formula (3); $\hat{X}_A$ is calculated according to formula (1); and $\hat{R}_A$ according to formula (2).
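For orientation, the ultimate-cluster estimators are commonly written in the form below. This is a sketch of the standard textbook form and should not be read as a transcription of the survey's own formulas (3) to (6). The symbols $n_h$ (number of sample PSUs in stratum h), $\hat{Y}_{hi}$ (weighted PSU total), and $\hat{Y}_h$ (weighted stratum total) are assumptions introduced here.

```latex
% Variance of an estimated total (ultimate clusters within strata)
V(\hat{Y}_A) = \sum_h \frac{n_h}{n_h - 1}
               \sum_{i=1}^{n_h} \left( \hat{Y}_{hi} - \frac{\hat{Y}_h}{n_h} \right)^2,
\qquad
\hat{Y}_{hi} = \sum_{j \in A} W_{hij} \, y_{hij},
\qquad
\hat{Y}_h = \sum_{i=1}^{n_h} \hat{Y}_{hi}

% Variance of an estimated ratio R = Y / X (linearised form)
V(\hat{R}_A) = \frac{1}{\hat{X}_A^{2}}
               \left[ V(\hat{Y}_A) + \hat{R}_A^{2} \, V(\hat{X}_A)
                      - 2 \, \hat{R}_A \, \mathrm{COV}(\hat{X}_A, \hat{Y}_A) \right]

% Covariance term, built the same way as the variance of a total
\mathrm{COV}(\hat{X}_A, \hat{Y}_A) = \sum_h \frac{n_h}{n_h - 1}
               \sum_{i=1}^{n_h}
               \left( \hat{X}_{hi} - \frac{\hat{X}_h}{n_h} \right)
               \left( \hat{Y}_{hi} - \frac{\hat{Y}_h}{n_h} \right)
```

The standard error of an estimate is the square root of the corresponding variance.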
Face-to-face [f2f]
The PECS questionnaire consists of two main sections:
First section: Some items of the form are filled in at the beginning of the month, and the remainder at the end of the month. The questionnaire includes the following parts:
Cover sheet: Contains identification details of the family, the date of the visit, details of the field and office work team, and the number and sex of family members.
Statement of the family members: Contains social, economic and demographic particulars of the selected family.
Statement of the long-lasting commodities and income generation activities: Includes a number of basic and indispensable items (e.g., livestock or agricultural land).
Housing Characteristics: Includes information and data pertaining to the housing conditions, including type of house, number of rooms, ownership, rent, water, electricity supply, connection to the sewer system, source of cooking and heating fuel, and remoteness/proximity of the house to education and health facilities.
Monthly and Annual Income: Data pertaining to the income of the family are collected from different sources at the end of the registration / recording period.
Assistance and poverty: Includes questions about household conditions and the assistance the household received during the past month.
Second section: The second section of the questionnaire includes a list of 55 consumption and expenditure groups, itemized and serially numbered according to their importance to the family. Each of these groups contains important commodities. In total, the groups contain 667 commodity and service items. Groups 1-21 include food, drink, and cigarettes. Group 22 includes homemade commodities. Groups 23-45 include all items except for food, drink and cigarettes. Groups 50-55 include all of the long-lasting commodities. Data on each of these groups were collected over different intervals of time so as to reflect expenditure over a period of one full year, except for the cars group, for which data were collected for the three previous years. These data were obtained from the recording book, which covered a period of one month for each household.
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Supporting documentation on code lists, subject definitions, data accuracy, and statistical testing can be found on the American Community Survey website in the Data and Documentation section.
Sample size and data quality measures (including coverage rates, allocation rates, and response rates) can be found on the American Community Survey website in the Methodology section.
Although the American Community Survey (ACS) produces population, demographic and housing unit estimates, it is the Census Bureau's Population Estimates Program that produces and disseminates the official estimates of the population for the nation, states, counties, cities and towns and estimates of housing units for states and counties.
Explanation of Symbols:
An '**' entry in the margin of error column indicates that either no sample observations or too few sample observations were available to compute a standard error and thus the margin of error. A statistical test is not appropriate.
An '-' entry in the estimate column indicates that either no sample observations or too few sample observations were available to compute an estimate, or a ratio of medians cannot be calculated because one or both of the median estimates falls in the lowest interval or upper interval of an open-ended distribution.
An '-' following a median estimate means the median falls in the lowest interval of an open-ended distribution.
An '+' following a median estimate means the median falls in the upper interval of an open-ended distribution.
An '***' entry in the margin of error column indicates that the median falls in the lowest interval or upper interval of an open-ended distribution. A statistical test is not appropriate.
An '*****' entry in the margin of error column indicates that the estimate is controlled. A statistical test for sampling variability is not appropriate.
An 'N' entry in the estimate and margin of error columns indicates that data for this geographic area cannot be displayed because the number of sample cases is too small.
An '(X)' means that the estimate is not applicable or not available.
Estimates of urban and rural population, housing units, and characteristics reflect boundaries of urban areas defined based on Census 2000 data. Boundaries for urban areas have not been updated since Census 2000. As a result, data for urban and rural areas from the ACS do not necessarily reflect the results of ongoing urbanization.
While the 2008-2012 American Community Survey (ACS) data generally reflect the December 2009 Office of Management and Budget (OMB) definitions of metropolitan and micropolitan statistical areas, in certain instances the names, codes, and boundaries of the principal cities shown in ACS tables may differ from the OMB definitions due to differences in the effective dates of the geographic entities.
The methodology for calculating median income and median earnings changed between 2008 and 2009. Medians over $75,000 were most likely affected. The underlying income and earning distribution now uses $2,500 increments up to $250,000 for households, non-family households, families, and individuals and employs a linear interpolation method for median calculations. Before 2009 the highest income category was $200,000 for households, families and non-family households ($100,000 for individuals) and portions of the income and earnings distribution contained intervals wider than $2,500. Those cases used a Pareto Interpolation Method.
Data are based on a sample and are subject to sampling variability. The degree of uncertainty for an estimate arising from sampling variability is represented through the use of a margin of error. The value shown here is the 90 percent margin of error. The margin of error can be interpreted roughly as providing a 90 percent probability that the interval defined by the estimate minus the margin of error and the estimate plus the margin of error (the lower and upper confidence bounds) contains the true value. In addition to sampling variability, the ACS estimates are subject to nonsampling error (for a discussion of nonsampling variability, see Accuracy of the Data). The effect of nonsampling error is not represented in these tables.
Source: U.S. Census Bureau, 2008-2012 American Community Survey
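The interpretation of the 90 percent margin of error can be made concrete with a small Python sketch. It assumes the usual normal-approximation factors (1.645 for 90 percent, 1.96 for 95 percent). The function names and the figures in the usage note are hypothetical and do not come from any ACS table.

```python
Z_90 = 1.645  # normal factor assumed for a 90 percent margin of error
Z_95 = 1.96   # normal factor assumed for a 95 percent interval


def moe_to_standard_error(moe_90):
    """Convert a published 90 percent margin of error to a standard error."""
    return moe_90 / Z_90


def confidence_bounds(estimate, moe_90, z=Z_90):
    """Lower and upper confidence bounds at the level implied by z."""
    half_width = z * moe_to_standard_error(moe_90)
    return estimate - half_width, estimate + half_width
```

For a hypothetical estimate of 50,000 with a 90 percent margin of error of 1,645, the standard error is about 1,000; `confidence_bounds(50000, 1645)` gives roughly (48,355, 51,645), and passing `z=Z_95` widens the interval to roughly (48,040, 51,960).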
The 2019-20 Gambia Demographic and Health Survey (2019-20 GDHS) is a nationwide survey with a nationally representative sample of residential households. The survey was implemented by The Gambia Bureau of Statistics (GBoS) in collaboration with the Ministry of Health (MoH).
The primary objective of the 2019-20 GDHS is to provide up-to-date estimates of basic demographic and health indicators. Specifically, the 2019-20 GDHS:
▪ collected data on fertility levels and preferences; contraceptive use; maternal and child health; infant, child, and neonatal mortality levels; maternal mortality; gender; nutrition; awareness about HIV/AIDS; self-reported sexually transmitted infections (STIs); and other health issues relevant to the achievement of the Sustainable Development Goals (SDGs)
▪ obtained information on the availability of, access to, and use of mosquito nets as part of the National Malaria Control Programme
▪ gathered information on other health issues such as injections, tobacco use, hypertension, diabetes, and health insurance
▪ collected data on women’s empowerment, domestic violence, fistula, and female genital mutilation/cutting
▪ tested household salt for the presence of iodine
▪ obtained data on child feeding practices, including breastfeeding, and conducted anthropometric measurements to assess the nutritional status of children under age 5 and women age 15-49
▪ conducted anaemia testing of women age 15-49 and children age 6-59 months
▪ conducted malaria testing of children age 6-59 months
National coverage
The survey covered all de jure household members (usual residents), all women aged 15-49, all men age 15-59, and all children aged 0-5 resident in the household.
Sample survey data [ssd]
The sampling frame used for the 2019-20 GDHS was based on an updated version of the 2013 Gambia Population and Housing Census (2013 GPHC) conducted by GBoS. The census counts were updated in 2015-16 based on district-level projected counts from the 2015-16 Integrated Household Survey (IHS). Administratively, The Gambia is divided into eight Local Government Areas (LGAs). Each LGA is subdivided into districts and each district is subdivided into settlements. A settlement, a group of small settlements, or a part of a large settlement can form an enumeration area (EA). These units allow the country to be easily separated into small geographical area units, each with an urban or rural designation. There are 48 districts, 120 wards, and 4,098 EAs in The Gambia; the EAs have an average size of 68 households.
The sample for the 2019-20 GDHS was a stratified sample selected in two stages. In the first stage, EAs were selected with a probability proportional to their size within each sampling stratum. A total of 281 EAs were selected.
In the second stage, the households were systematically sampled. A household listing operation was undertaken in all of the selected clusters. The resulting lists of households served as the sampling frame from which a fixed number of 25 households were systematically selected per cluster, resulting in a total sample size of 7,025 selected households. Results from this sample are representative at the national, urban, and rural levels and at the LGA levels.
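The second-stage selection can be sketched in Python as systematic sampling with a random start and a fractional interval. This is only an illustration of the general technique: the source does not describe the exact selection routine, and the function name and the 68-household listing used in the usage note are assumptions.

```python
import random


def systematic_sample(listing, take, rng=None):
    """Select `take` items from `listing` at a fixed interval after a random start."""
    rng = rng or random.Random()
    n = len(listing)
    if take <= 0 or take > n:
        raise ValueError("take must be between 1 and the listing size")
    interval = n / take               # fractional sampling interval (>= 1)
    start = rng.random() * interval   # random start in [0, interval)
    # Guard the last index against floating-point rounding at the upper end.
    return [listing[min(int(start + k * interval), n - 1)] for k in range(take)]
```

Calling `systematic_sample(list(range(68)), 25, random.Random(0))` returns 25 distinct households in listing order. Applying a fixed take of 25 to each of the 281 selected EAs gives 281 × 25 = 7,025 households, which matches the total stated above.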
For further details on sample selection, see Appendix A of the final report.
Computer Assisted Personal Interview [capi]
Five questionnaires were used for the 2019-20 GDHS: the Household Questionnaire, the Woman’s Questionnaire, the Man’s Questionnaire, the Biomarker Questionnaire, and the Fieldworker Questionnaire. These questionnaires, based on The DHS Program’s standard questionnaires, were adapted to reflect the population and health issues relevant to The Gambia. Suggestions were solicited from various stakeholders representing government ministries, departments, and agencies; nongovernmental organisations; and international donors. All questionnaires were written in English, and interviewers translated the questions into the appropriate local language to carry out the interview.
All electronic data files were transferred via the Internet File Streaming System (IFSS) to the GBoS central office. The IFSS automatically encrypts the data and sends the data to a server, and the server in turn downloads the data to the data processing supervisor’s password-protected computer in the central office. The data processing operation included secondary editing, which required resolution of computer-identified inconsistencies and coding of open-ended questions. The data were processed by two IT specialists and three secondary editors who took part in the main fieldwork training; they were supervised remotely by staff from The DHS Program. Data editing was accomplished using CSPro software. During the fieldwork, field-check tables were generated to check various data quality parameters, and specific feedback was given to the teams to improve performance. Secondary editing and data processing were initiated in November 2019 and completed in May 2020.
All 6,985 households in the selected housing units were eligible for the survey, of which 6,736 were occupied. Of the occupied households, 6,549 were successfully interviewed, yielding a response rate of 97%. Among the households successfully interviewed, 1,948 interviews were completed in 2019 and 4,601 in 2020.
In the interviewed households, 12,481 women age 15-49 were identified for individual interviews; interviews were completed with 11,865 women, yielding a response rate of 95%, a 4 percentage point increase from the 2013 GDHS. Among men, 5,337 were eligible for individual interviews, and 4,636 completed an interview; this yielded a response rate of 87%, a 5 percentage point increase from the previous survey.
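The reported response rates follow directly from the counts given above, as this short Python check shows. The helper name is hypothetical, and the rate is taken here simply as completed interviews divided by eligible units.

```python
def response_rate(completed, eligible):
    """Response rate in percent: completed interviews over eligible units."""
    if eligible <= 0:
        raise ValueError("eligible must be positive")
    return 100.0 * completed / eligible


# Counts as reported for the 2019-20 GDHS.
rates = {
    "households": response_rate(6549, 6736),   # interviewed / occupied
    "women": response_rate(11865, 12481),      # interviewed / eligible women
    "men": response_rate(4636, 5337),          # interviewed / eligible men
}
rounded = {group: round(rate) for group, rate in rates.items()}
```

Rounded to whole percentages, the three rates are 97, 95, and 87, in agreement with the figures quoted in the text.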
The estimates from a sample survey are affected by two types of errors: nonsampling errors and sampling errors. Nonsampling errors are the results of mistakes made in implementing data collection and data processing, such as failure to locate and interview the correct household, misunderstanding of the questions on the part of either the interviewer or the respondent, and data entry errors. Although numerous efforts were made during the implementation of the 2019-20 Gambia Demographic and Health Survey (GDHS) to minimise this type of error, nonsampling errors are impossible to avoid and difficult to evaluate statistically.
Sampling errors, on the other hand, can be evaluated statistically. The sample of respondents selected in the 2019-20 GDHS is only one of many samples that could have been selected from the same population, using the same design and expected size. Each of these samples would yield results that differ somewhat from the results of the actual sample selected. Sampling errors are a measure of the variability among all possible samples. Although the degree of variability is not known exactly, it can be estimated from the survey results.
Sampling error is usually measured in terms of the standard error for a particular statistic (mean, percentage, etc.), which is the square root of the variance. The standard error can be used to calculate confidence intervals within which the true value for the population can reasonably be assumed to fall. For example, for any given statistic calculated from a sample survey, the value of that statistic will fall within a range of plus or minus two times the standard error of that statistic in 95% of all possible samples of identical size and design.
If the sample of respondents had been selected as a simple random sample, it would have been possible to use straightforward formulas for calculating sampling errors. However, the 2019-20 GDHS sample is the result of a multi-stage stratified design, and, consequently, it was necessary to use more complex formulas. Sampling errors are computed in SAS, using programs developed by ICF. These programs use the Taylor linearisation method to estimate variances for survey estimates that are means, proportions, or ratios. The Jackknife repeated replication method is used for variance estimation of more complex statistics such as fertility and mortality rates.
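The Jackknife repeated replication idea can be sketched as follows. This is a simplified Python illustration, not the ICF SAS programs: it assumes each cluster contributes one already-weighted numerator total and one denominator total, it drops one cluster at a time, and it ignores stratification. The function name is hypothetical.

```python
import math


def jackknife_ratio_se(cluster_y, cluster_x):
    """Return (ratio, standard error) using delete-one-cluster replication."""
    k = len(cluster_y)
    if k < 2 or k != len(cluster_x):
        raise ValueError("need at least two clusters with matching totals")
    total_y, total_x = sum(cluster_y), sum(cluster_x)
    r = total_y / total_x  # estimate from the full sample
    pseudo_values = []
    for y_i, x_i in zip(cluster_y, cluster_x):
        # Estimate recomputed with cluster i removed.
        r_without_i = (total_y - y_i) / (total_x - x_i)
        pseudo_values.append(k * r - (k - 1) * r_without_i)
    variance = sum((p - r) ** 2 for p in pseudo_values) / (k * (k - 1))
    return r, math.sqrt(variance)
```

When every denominator total equals 1 the statistic is a simple mean, and the result reduces to the familiar standard error of the mean: `jackknife_ratio_se([1, 2, 3], [1, 1, 1])` gives a ratio of 2 and a standard error of 1/√3.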
Note: A more detailed description of the estimates of sampling errors is presented in Appendix B of the survey report.
Data Quality Tables
See details of the data quality tables in Appendix C of the final report.
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Supporting documentation on code lists, subject definitions, data accuracy, and statistical testing can be found on the American Community Survey website in the Data and Documentation section.
Sample size and data quality measures (including coverage rates, allocation rates, and response rates) can be found on the American Community Survey website in the Methodology section.
Although the American Community Survey (ACS) produces population, demographic and housing unit estimates, it is the Census Bureau's Population Estimates Program that produces and disseminates the official estimates of the population for the nation, states, counties, cities and towns and estimates of housing units for states and counties.
Explanation of Symbols:
An '**' entry in the margin of error column indicates that either no sample observations or too few sample observations were available to compute a standard error and thus the margin of error. A statistical test is not appropriate.
An '-' entry in the estimate column indicates that either no sample observations or too few sample observations were available to compute an estimate, or a ratio of medians cannot be calculated because one or both of the median estimates falls in the lowest interval or upper interval of an open-ended distribution.
An '-' following a median estimate means the median falls in the lowest interval of an open-ended distribution.
An '+' following a median estimate means the median falls in the upper interval of an open-ended distribution.
An '***' entry in the margin of error column indicates that the median falls in the lowest interval or upper interval of an open-ended distribution. A statistical test is not appropriate.
An '*****' entry in the margin of error column indicates that the estimate is controlled. A statistical test for sampling variability is not appropriate.
An 'N' entry in the estimate and margin of error columns indicates that data for this geographic area cannot be displayed because the number of sample cases is too small.
An '(X)' means that the estimate is not applicable or not available.
Estimates of urban and rural population, housing units, and characteristics reflect boundaries of urban areas defined based on Census 2000 data. Boundaries for urban areas have not been updated since Census 2000. As a result, data for urban and rural areas from the ACS do not necessarily reflect the results of ongoing urbanization.
While the 2007-2011 American Community Survey (ACS) data generally reflect the December 2009 Office of Management and Budget (OMB) definitions of metropolitan and micropolitan statistical areas, in certain instances the names, codes, and boundaries of the principal cities shown in ACS tables may differ from the OMB definitions due to differences in the effective dates of the geographic entities.
The methodology for calculating median income and median earnings changed between 2008 and 2009. Medians over $75,000 were most likely affected. The underlying income and earning distribution now uses $2,500 increments up to $250,000 for households, non-family households, families, and individuals and employs a linear interpolation method for median calculations. Before 2009 the highest income category was $200,000 for households, families and non-family households ($100,000 for individuals) and portions of the income and earnings distribution contained intervals wider than $2,500. Those cases used a Pareto Interpolation Method.
Data are based on a sample and are subject to sampling variability. The degree of uncertainty for an estimate arising from sampling variability is represented through the use of a margin of error. The value shown here is the 90 percent margin of error. The margin of error can be interpreted roughly as providing a 90 percent probability that the interval defined by the estimate minus the margin of error and the estimate plus the margin of error (the lower and upper confidence bounds) contains the true value. In addition to sampling variability, the ACS estimates are subject to nonsampling error (for a discussion of nonsampling variability, see Accuracy of the Data). The effect of nonsampling error is not represented in these tables.
Source: U.S. Census Bureau, 2007-2011 American Community Survey
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Supporting documentation on code lists, subject definitions, data accuracy, and statistical testing can be found on the American Community Survey website in the Data and Documentation section.
Sample size and data quality measures (including coverage rates, allocation rates, and response rates) can be found on the American Community Survey website in the Methodology section.
Although the American Community Survey (ACS) produces population, demographic and housing unit estimates, for 2010, the 2010 Census provides the official counts of the population and housing units for the nation, states, counties, cities and towns. For 2006 to 2009, the Population Estimates Program provides intercensal estimates of the population for the nation, states, and counties.
Explanation of Symbols:
An '**' entry in the margin of error column indicates that either no sample observations or too few sample observations were available to compute a standard error and thus the margin of error. A statistical test is not appropriate.
An '-' entry in the estimate column indicates that either no sample observations or too few sample observations were available to compute an estimate, or a ratio of medians cannot be calculated because one or both of the median estimates falls in the lowest interval or upper interval of an open-ended distribution.
An '-' following a median estimate means the median falls in the lowest interval of an open-ended distribution.
An '+' following a median estimate means the median falls in the upper interval of an open-ended distribution.
An '***' entry in the margin of error column indicates that the median falls in the lowest interval or upper interval of an open-ended distribution. A statistical test is not appropriate.
An '*****' entry in the margin of error column indicates that the estimate is controlled. A statistical test for sampling variability is not appropriate.
An 'N' entry in the estimate and margin of error columns indicates that data for this geographic area cannot be displayed because the number of sample cases is too small.
An '(X)' means that the estimate is not applicable or not available.
Estimates of urban and rural population, housing units, and characteristics reflect boundaries of urban areas defined based on Census 2000 data. Boundaries for urban areas have not been updated since Census 2000. As a result, data for urban and rural areas from the ACS do not necessarily reflect the results of ongoing urbanization.
While the 2006-2010 American Community Survey (ACS) data generally reflect the December 2009 Office of Management and Budget (OMB) definitions of metropolitan and micropolitan statistical areas, in certain instances the names, codes, and boundaries of the principal cities shown in ACS tables may differ from the OMB definitions due to differences in the effective dates of the geographic entities.
The methodology for calculating median income and median earnings changed between 2008 and 2009. Medians over $75,000 were most likely affected. The underlying income and earning distribution now uses $2,500 increments up to $250,000 for households, non-family households, families, and individuals and employs a linear interpolation method for median calculations. Before 2009 the highest income category was $200,000 for households, families and non-family households ($100,000 for individuals) and portions of the income and earnings distribution contained intervals wider than $2,500. Those cases used a Pareto Interpolation Method.
Data are based on a sample and are subject to sampling variability. The degree of uncertainty for an estimate arising from sampling variability is represented through the use of a margin of error. The value shown here is the 90 percent margin of error. The margin of error can be interpreted roughly as providing a 90 percent probability that the interval defined by the estimate minus the margin of error and the estimate plus the margin of error (the lower and upper confidence bounds) contains the true value. In addition to sampling variability, the ACS estimates are subject to nonsampling error (for a discussion of nonsampling variability, see Accuracy of the Data). The effect of nonsampling error is not represented in these tables.
Source: U.S. Census Bureau, 2006-2010 American Community Survey
The 2022 Philippines National Demographic and Health Survey (NDHS) was implemented by the Philippine Statistics Authority (PSA). Data collection took place from May 2 to June 22, 2022.
The primary objective of the 2022 NDHS is to provide up-to-date estimates of basic demographic and health indicators. Specifically, the NDHS collected information on fertility, fertility preferences, family planning practices, childhood mortality, maternal and child health, nutrition, knowledge and attitudes regarding HIV/AIDS, violence against women, child discipline, early childhood development, and other health issues.
The information collected through the NDHS is intended to assist policymakers and program managers in designing and evaluating programs and strategies for improving the health of the country’s population. The 2022 NDHS also provides indicators anchored to the attainment of the Sustainable Development Goals (SDGs) and the new Philippine Development Plan for 2023 to 2028.
National coverage
The survey covered all de jure household members (usual residents), all women aged 15-49, and all children aged 0-4 resident in the household.
Sample survey data [ssd]
The sampling scheme provides data representative of the country as a whole, for urban and rural areas separately, and for each of the country’s administrative regions. The sample selection methodology for the 2022 NDHS was based on a two-stage stratified sample design using the Master Sample Frame (MSF) designed and compiled by the PSA. The MSF was constructed based on the listing of households from the 2010 Census of Population and Housing and updated based on the listing of households from the 2015 Census of Population. The first stage involved a systematic selection of 1,247 primary sampling units (PSUs) distributed by province or highly urbanized city (HUC). A PSU can be a barangay, a portion of a large barangay, or two or more adjacent small barangays.
In the second stage, an equal take of either 22 or 29 sample housing units was selected from each sampled PSU using systematic random sampling. In situations where a housing unit contained one to three households, all households were interviewed. In the rare situation where a housing unit contained more than three households, no more than three households were interviewed. The survey interviewers were instructed to interview only the preselected housing units. No replacements and no changes of the preselected housing units were allowed in the implementing stage in order to prevent bias. Survey weights were calculated, added to the data file, and applied so that weighted results are representative estimates of indicators at the regional and national levels.
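The role of survey weights in a two-stage design can be illustrated with a generic Python sketch. The text above does not give the PSA's weighting formula, so this assumes the textbook form: the base weight is the inverse of the product of the two selection probabilities, optionally inflated by the inverse of a response rate. The function name and every number in the usage note are hypothetical.

```python
def design_weight(p_psu, p_housing_unit, response_rate=1.0):
    """Inverse-probability weight with a simple nonresponse adjustment.

    p_psu:          probability that the PSU was selected (first stage)
    p_housing_unit: probability that the unit was selected within the PSU
    response_rate:  share of selected units that responded (assumed adjustment)
    """
    for name, p in (("p_psu", p_psu),
                    ("p_housing_unit", p_housing_unit),
                    ("response_rate", response_rate)):
        if not 0 < p <= 1:
            raise ValueError(f"{name} must be in the interval (0, 1]")
    return 1.0 / (p_psu * p_housing_unit * response_rate)
```

For instance, a PSU selected with probability 0.01 and a take of 22 out of 440 listed housing units (probability 0.05) gives a base weight of about 2,000, meaning each sampled unit stands in for roughly 2,000 units in the population.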
All women age 15–49 who were either usual residents of the selected households or visitors who stayed in the households the night before the survey were eligible to be interviewed. Among women eligible for an individual interview, one woman per household was selected for a module on women’s safety.
For further details on sample design, see APPENDIX A of the final report.
Computer Assisted Personal Interview [capi]
Two questionnaires were used for the 2022 NDHS: the Household Questionnaire and the Woman’s Questionnaire. The questionnaires, based on The DHS Program’s model questionnaires, were adapted to reflect the population and health issues relevant to the Philippines. Input was solicited from various stakeholders representing government agencies, academe, and international agencies. The survey protocol was reviewed by the ICF Institutional Review Board.
After all questionnaires were finalized in English, they were translated into six major languages: Tagalog, Cebuano, Ilocano, Bikol, Hiligaynon, and Waray. The Household and Woman’s Questionnaires were programmed into tablet computers to allow for computer-assisted personal interviewing (CAPI) for data collection purposes, with the capability to choose any of the languages for each questionnaire.
Processing the 2022 NDHS data began almost as soon as fieldwork started, and data security procedures were in place in accordance with confidentiality of information as provided by Philippine laws. As data collection was completed in each PSU or cluster, all electronic data files were transferred securely via SyncCloud to a server maintained by the PSA Central Office in Quezon City. These data files were registered and checked for inconsistencies, incompleteness, and outliers. The field teams were alerted to any inconsistencies and errors while still in the area of assignment. Timely generation of field check tables allowed for effective monitoring of fieldwork, including tracking questionnaire completion rates. Only the field teams, project managers, and NDHS supervisors in the provincial, regional, and central offices were given access to the CAPI system and the SyncCloud server.
A team of secondary editors in the PSA Central Office carried out secondary editing, which involved resolving inconsistencies and recoding “other” responses; the former was conducted during data collection, and the latter was conducted following the completion of the fieldwork. Data editing was performed using the CSPro software package. The secondary editing of the data was completed in August 2022. The final cleaning of the data set was carried out by data processing specialists from The DHS Program in September 2022.
A total of 35,470 households were selected for the 2022 NDHS sample, of which 30,621 were found to be occupied. Of the occupied households, 30,372 were successfully interviewed, yielding a response rate of 99%. In the interviewed households, 28,379 women age 15–49 were identified as eligible for individual interviews. Interviews were completed with 27,821 women, yielding a response rate of 98%.
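The response rates quoted above can be reproduced directly from the counts in the paragraph. The following is a minimal sketch; the function name `response_rate` is illustrative and not part of any DHS tooling.

```python
def response_rate(completed, eligible):
    """Return the response rate as a percentage of eligible units."""
    if eligible <= 0:
        raise ValueError("eligible must be positive")
    return 100.0 * completed / eligible


# Counts taken from the 2022 NDHS paragraph above.
household_rate = response_rate(30_372, 30_621)  # interviewed / occupied households
women_rate = response_rate(27_821, 28_379)      # completed / eligible women

print(f"Household response rate: {household_rate:.1f}%")
print(f"Women's response rate: {women_rate:.1f}%")
```

Note that the household rate is computed against occupied households (30,621), not against all 35,470 selected households.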
The estimates from a sample survey are affected by two types of errors: (1) nonsampling errors and (2) sampling errors. Nonsampling errors are the results of mistakes made in implementing data collection and in data processing, such as failure to locate and interview the correct household, misunderstanding of the questions on the part of either the interviewer or the respondent, and data entry errors. Although numerous efforts were made during the implementation of the 2022 Philippines National Demographic and Health Survey (2022 NDHS) to minimize this type of error, nonsampling errors are impossible to avoid and difficult to evaluate statistically.
Sampling errors, on the other hand, can be evaluated statistically. The sample of respondents selected in the 2022 NDHS is only one of many samples that could have been selected from the same population, using the same design and identical size. Each of these samples would yield results that differ somewhat from the results of the actual sample selected. Sampling errors are a measure of the variability between all possible samples. Although the degree of variability is not known exactly, it can be estimated from the survey results.
A sampling error is usually measured in terms of the standard error for a particular statistic (mean, percentage, etc.), which is the square root of the variance. The standard error can be used to calculate confidence intervals within which the true value for the population can reasonably be assumed to fall. For example, for any given statistic calculated from a sample survey, the value of that statistic will fall within a range of plus or minus two times the standard error of that statistic in 95% of all possible samples of identical size and design.
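As a rough illustration of the "plus or minus two standard errors" rule described above, the sketch below builds an approximate 95% interval. The estimate and standard error are hypothetical values chosen for illustration, not figures from the survey.

```python
def confidence_interval(estimate, standard_error, multiplier=2.0):
    """Approximate interval: estimate +/- multiplier * standard error."""
    if standard_error < 0:
        raise ValueError("standard_error must be non-negative")
    half_width = multiplier * standard_error
    return estimate - half_width, estimate + half_width


# Hypothetical example: an indicator estimated at 40.0 percent
# with a standard error of 1.5 percentage points.
lower, upper = confidence_interval(40.0, 1.5)
print(f"Approximate 95% interval: {lower:.1f} to {upper:.1f}")
```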
If the sample of respondents had been selected as a simple random sample, it would have been possible to use straightforward formulas for calculating sampling errors. However, the 2022 NDHS sample was the result of a multistage stratified design, and, consequently, it was necessary to use more complex formulas. Sampling errors are computed in SAS using programs developed by ICF. These programs use the Taylor linearization method to estimate variances for survey estimates that are means, proportions, or ratios. The Jackknife repeated replication method is used for variance estimation of more complex statistics such as fertility and mortality rates.
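To give a sense of how a Jackknife repeated replication estimate works, here is a simplified sketch for a ratio statistic. It assumes the common delete-one-cluster form, in which each replicate drops one cluster and the variance is computed from the resulting pseudo-values. It is not the ICF SAS code, it ignores stratification, and the per-cluster totals are hypothetical inputs.

```python
import math


def jackknife_ratio_se(clusters):
    """Delete-one-cluster jackknife for the ratio r = sum(y) / sum(x).

    clusters: list of (y_total, x_total) pairs, one per cluster
    (hypothetical weighted totals). Returns (r, standard_error).
    """
    k = len(clusters)
    if k < 2:
        raise ValueError("at least two clusters are required")
    total_y = sum(y for y, _ in clusters)
    total_x = sum(x for _, x in clusters)
    r = total_y / total_x

    pseudo_values = []
    for y, x in clusters:
        # Estimate recomputed with this one cluster removed.
        r_without = (total_y - y) / (total_x - x)
        pseudo_values.append(k * r - (k - 1) * r_without)

    variance = sum((p - r) ** 2 for p in pseudo_values) / (k * (k - 1))
    return r, math.sqrt(variance)


# Hypothetical totals for four clusters: (events, exposure).
ratio, se = jackknife_ratio_se([(12, 40), (9, 35), (15, 42), (11, 38)])
print(f"ratio = {ratio:.4f}, jackknife SE = {se:.4f}")
```

When every cluster has the same ratio, each replicate equals the full-sample estimate and the standard error is zero, which is a quick sanity check on the implementation.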
A more detailed description of the estimates of sampling errors is presented in APPENDIX B of the survey report.
Data Quality Tables
See details of the data quality tables in Appendix C of the final report.
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Supporting documentation on code lists, subject definitions, data accuracy, and statistical testing can be found on the American Community Survey website in the Data and Documentation section.
Sample size and data quality measures (including coverage rates, allocation rates, and response rates) can be found on the American Community Survey website in the Methodology section.
Although the American Community Survey (ACS) produces population, demographic and housing unit estimates, it is the Census Bureau's Population Estimates Program that produces and disseminates the official estimates of the population for the nation, states, counties, cities and towns and estimates of housing units for states and counties.
Explanation of Symbols:
- An '**' entry in the margin of error column indicates that either no sample observations or too few sample observations were available to compute a standard error and thus the margin of error. A statistical test is not appropriate.
- An '-' entry in the estimate column indicates that either no sample observations or too few sample observations were available to compute an estimate, or a ratio of medians cannot be calculated because one or both of the median estimates falls in the lowest interval or upper interval of an open-ended distribution.
- An '-' following a median estimate means the median falls in the lowest interval of an open-ended distribution.
- An '+' following a median estimate means the median falls in the upper interval of an open-ended distribution.
- An '***' entry in the margin of error column indicates that the median falls in the lowest interval or upper interval of an open-ended distribution. A statistical test is not appropriate.
- An '*****' entry in the margin of error column indicates that the estimate is controlled. A statistical test for sampling variability is not appropriate.
- An 'N' entry in the estimate and margin of error columns indicates that data for this geographic area cannot be displayed because the number of sample cases is too small.
- An '(X)' means that the estimate is not applicable or not available.
Estimates of urban and rural population, housing units, and characteristics reflect boundaries of urban areas defined based on Census 2000 data. Boundaries for urban areas have not been updated since Census 2000. As a result, data for urban and rural areas from the ACS do not necessarily reflect the results of ongoing urbanization.
While the 2008-2012 American Community Survey (ACS) data generally reflect the December 2009 Office of Management and Budget (OMB) definitions of metropolitan and micropolitan statistical areas, in certain instances the names, codes, and boundaries of the principal cities shown in ACS tables may differ from the OMB definitions due to differences in the effective dates of the geographic entities.
The methodology for calculating median income and median earnings changed between 2008 and 2009. Medians over $75,000 were most likely affected. The underlying income and earning distribution now uses $2,500 increments up to $250,000 for households, non-family households, families, and individuals and employs a linear interpolation method for median calculations. Before 2009 the highest income category was $200,000 for households, families and non-family households ($100,000 for individuals) and portions of the income and earnings distribution contained intervals wider than $2,500. Those cases used a Pareto Interpolation Method.
Data are based on a sample and are subject to sampling variability. The degree of uncertainty for an estimate arising from sampling variability is represented through the use of a margin of error. The value shown here is the 90 percent margin of error. The margin of error can be interpreted roughly as providing a 90 percent probability that the interval defined by the estimate minus the margin of error and the estimate plus the margin of error (the lower and upper confidence bounds) contains the true value. In addition to sampling variability, the ACS estimates are subject to nonsampling error (for a discussion of nonsampling variability, see Accuracy of the Data). The effect of nonsampling error is not represented in these tables.
Source: U.S. Census Bureau, 2008-2012 American Community Survey
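The relationship between an estimate, its 90 percent margin of error, and the confidence bounds can be sketched as follows. The factor 1.645 is the standard normal multiplier for a 90 percent level that the Census Bureau's ACS documentation uses to convert a margin of error to a standard error; it is not stated in the text above. The estimate and margin of error in the example are hypothetical.

```python
Z_90 = 1.645  # normal multiplier for a 90 percent confidence level


def confidence_bounds(estimate, margin_of_error):
    """Lower and upper bounds: estimate minus/plus the margin of error."""
    return estimate - margin_of_error, estimate + margin_of_error


def moe_to_standard_error(margin_of_error_90):
    """Convert a 90 percent margin of error to a standard error."""
    return margin_of_error_90 / Z_90


# Hypothetical table cell: an estimate of 2.60 with a margin of error of 0.05.
lower, upper = confidence_bounds(2.60, 0.05)
standard_error = moe_to_standard_error(0.05)
print(f"90% bounds: {lower:.2f} to {upper:.2f}, SE = {standard_error:.4f}")
```

Cells that contain one of the symbols listed above (for example '**' or 'N') carry no numeric margin of error and should be filtered out before a calculation like this.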
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Supporting documentation on code lists, subject definitions, data accuracy, and statistical testing can be found on the American Community Survey website in the Data and Documentation section.
Sample size and data quality measures (including coverage rates, allocation rates, and response rates) can be found on the American Community Survey website in the Methodology section.
Although the American Community Survey (ACS) produces population, demographic and housing unit estimates, it is the Census Bureau's Population Estimates Program that produces and disseminates the official estimates of the population for the nation, states, counties, cities and towns and estimates of housing units for states and counties.
Explanation of Symbols:
- An '**' entry in the margin of error column indicates that either no sample observations or too few sample observations were available to compute a standard error and thus the margin of error. A statistical test is not appropriate.
- An '-' entry in the estimate column indicates that either no sample observations or too few sample observations were available to compute an estimate, or a ratio of medians cannot be calculated because one or both of the median estimates falls in the lowest interval or upper interval of an open-ended distribution.
- An '-' following a median estimate means the median falls in the lowest interval of an open-ended distribution.
- An '+' following a median estimate means the median falls in the upper interval of an open-ended distribution.
- An '***' entry in the margin of error column indicates that the median falls in the lowest interval or upper interval of an open-ended distribution. A statistical test is not appropriate.
- An '*****' entry in the margin of error column indicates that the estimate is controlled. A statistical test for sampling variability is not appropriate.
- An 'N' entry in the estimate and margin of error columns indicates that data for this geographic area cannot be displayed because the number of sample cases is too small.
- An '(X)' means that the estimate is not applicable or not available.
Estimates of urban and rural population, housing units, and characteristics reflect boundaries of urban areas defined based on Census 2000 data. Boundaries for urban areas have not been updated since Census 2000. As a result, data for urban and rural areas from the ACS do not necessarily reflect the results of ongoing urbanization.
While the 2007-2011 American Community Survey (ACS) data generally reflect the December 2009 Office of Management and Budget (OMB) definitions of metropolitan and micropolitan statistical areas, in certain instances the names, codes, and boundaries of the principal cities shown in ACS tables may differ from the OMB definitions due to differences in the effective dates of the geographic entities.
The methodology for calculating median income and median earnings changed between 2008 and 2009. Medians over $75,000 were most likely affected. The underlying income and earning distribution now uses $2,500 increments up to $250,000 for households, non-family households, families, and individuals and employs a linear interpolation method for median calculations. Before 2009 the highest income category was $200,000 for households, families and non-family households ($100,000 for individuals) and portions of the income and earnings distribution contained intervals wider than $2,500. Those cases used a Pareto Interpolation Method.
Data are based on a sample and are subject to sampling variability. The degree of uncertainty for an estimate arising from sampling variability is represented through the use of a margin of error. The value shown here is the 90 percent margin of error. The margin of error can be interpreted roughly as providing a 90 percent probability that the interval defined by the estimate minus the margin of error and the estimate plus the margin of error (the lower and upper confidence bounds) contains the true value. In addition to sampling variability, the ACS estimates are subject to nonsampling error (for a discussion of nonsampling variability, see Accuracy of the Data). The effect of nonsampling error is not represented in these tables.
Source: U.S. Census Bureau, 2007-2011 American Community Survey
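The linear interpolation method for medians mentioned above can be illustrated with the textbook grouped-data formula: find the interval containing the midpoint of the distribution and interpolate within it in proportion to the counts. This is a generic sketch, not the Census Bureau's production procedure, and the bin counts are hypothetical.

```python
def interpolated_median(bins):
    """Median of grouped data by linear interpolation.

    bins: list of (lower_bound, upper_bound, count), sorted by lower_bound.
    Assumes values are spread evenly within each interval.
    """
    total = sum(count for _, _, count in bins)
    if total <= 0:
        raise ValueError("bins must contain at least one observation")
    half = total / 2
    cumulative = 0
    for lower, upper, count in bins:
        if count > 0 and cumulative + count >= half:
            fraction = (half - cumulative) / count
            return lower + fraction * (upper - lower)
        cumulative += count
    raise ValueError("median interval not found")


# Hypothetical household counts in $2,500-wide income intervals.
income_bins = [(0, 2_500, 10), (2_500, 5_000, 20), (5_000, 7_500, 10)]
median_income = interpolated_median(income_bins)
print(f"Interpolated median: ${median_income:,.0f}")
```

An open-ended top or bottom interval has no finite width to interpolate across, which is why the table notes flag medians falling there with '-' or '+' instead of a value.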
The Indonesia Demographic and Health Survey (IDHS), which is part of the Demographic and Health Surveys (DHS) Project, is one of the prominent national surveys in the field of population, family planning, and health. The survey is important nationally for planning and evaluating population, family planning, and health developments, and it is also important internationally because the IDHS was designed so that it can be compared with similar surveys in other developing countries.
The 1997 Indonesia Demographic and Health Survey (IDHS) is a follow-on project to the 1987 National Indonesia Contraceptive Prevalence Survey (NICPS), the 1991 IDHS, and the 1994 IDHS. The 1997 IDHS was expanded from the 1994 survey to include a module on family welfare; however, unlike the 1994 survey, the 1997 survey no longer investigated the availability of family planning and health services. The 1997 IDHS also included as part of the household schedule a household expenditure module that provided a means of identifying the household's economic status.
The 1997 IDHS was specifically designed to meet the following objectives:
- Provide data concerning fertility, family planning, maternal and child health, maternal mortality, and awareness of AIDS that can be used by program managers, policymakers, and researchers to evaluate and improve existing programs
- Provide data about availability of family planning and health services, thereby offering an opportunity for linking women's fertility, family planning, and child care behavior with the availability of services
- Provide household expenditure data that can be used to identify the household's economic status
- Provide data that can be used to analyze trends over time by examining many of the same fertility, mortality, and health issues that were addressed in the earlier surveys (1987 NICPS, 1991 IDHS and 1994 IDHS)
- Measure changes in fertility and contraceptive prevalence rates and at the same time study factors that affect the changes, such as marriage patterns, urban/rural residence, education, breastfeeding habits, and the availability of contraception
- Measure the development and achievements of programs related to health policy, particularly those concerning the maternal and child health development program implemented through public health clinics in Indonesia
- Provide indicators for classifying families according to their welfare status.
National
Sample survey data
Indonesia is divided into 27 provinces. For the implementation of its family planning program, the National Family Planning Coordinating Board (NFPCB) has divided these provinces into three regions as follows:
The 1990 Population Census of Indonesia shows that Java-Bali accounts for 62 percent of the national population, Outer Java-Bali I accounts for 27 percent, and Outer Java-Bali II accounts for 11 percent. The sample for the 1997 IDHS was designed to produce reliable estimates of fertility, contraceptive prevalence and other important variables for each of the provinces and urban and rural areas of the three regions.
In order to meet this objective, between 1,650 and 2,050 households were selected in each of the provinces in Java-Bali, 1,250 to 1,500 households in each of the ten provinces in Outer Java-Bali I, and 1,000 to 1,250 households in each of the provinces in Outer Java-Bali II, for a total of 35,500 households. With an average of 0.8 ever-married women 15-49 per household, the sample was expected to yield approximately 28,000 women eligible for the individual interview.
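The expected number of eligible women follows from multiplying the number of selected households by the average number of ever-married women age 15-49 per household. A minimal sketch of that arithmetic, using the two figures from the paragraph above (the variable names are illustrative):

```python
selected_households = 35_500
eligible_women_per_household = 0.8  # average ever-married women age 15-49

expected_eligible_women = selected_households * eligible_women_per_household
print(f"Expected eligible women: {expected_eligible_women:,.0f}")
```

The product is 28,400, which the report rounds to approximately 28,000.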
Note: See detailed description of sample design in APPENDIX A of the survey report.
Face-to-face [f2f]
The 1997 IDHS used three questionnaires: the household questionnaire, the questionnaire on family welfare, and the individual questionnaire for ever-married women 15-49 years old. The general household and individual questionnaires were based on the DHS Model "A" Questionnaire, which is designed for use in countries with high contraceptive prevalence. Additions and modifications to the model questionnaire were made in order to provide detailed information specific to Indonesia. The questionnaires were developed mainly in English and were translated into Indonesian. One deviation from the standard DHS practice is the exclusion of the anthropometric measurement of young children and their mothers. A separate survey carried out by MOH provides this information.
The household questionnaire includes an expenditure schedule adapted from the core Susenas questionnaire model. Susenas is a national household survey carried out annually by CBS to collect data on various demographic and socioeconomic indicators of the population. The family welfare questionnaire was aimed at collecting indicators developed by the NFPCB to classify families according to their welfare status. Families were identified from the list of household members in the household questionnaire. The expenditure module and the family welfare questionnaire were developed in Indonesian.
The first stage of data editing was carried out by the field editors who checked the completed questionnaires for thoroughness and accuracy. Field supervisors then further examined the questionnaires. In many instances, the teams sent the questionnaires to CBS through the regency/municipality statistics offices. In these cases, no checking was done by the PSO. In other cases, Technical Coordinators were responsible for reviewing the completeness of the forms. At CBS, the questionnaires underwent another round of editing, primarily for completeness and coding of responses to open-ended questions. The data were processed using microcomputers and the DHS computer program, ISSA (Integrated System for Survey Analysis). Data entry and office editing were initiated immediately after fieldwork began. Simple range and skip errors were corrected at the data entry stage. Data processing was completed by February 1998, and the preliminary report of the survey was published in April 1998.
A total of 35,362 households were selected for the survey, of which 34,656 were found. Of the encountered households, 34,255 (99 percent) were successfully interviewed. In these households, 29,317 eligible women were identified, and complete interviews were obtained from 28,810 women, or 98 percent of all eligible women. The generally high response rates for both household and individual interviews were due mainly to the strict enforcement of the rule to revisit the originally selected household if no one was at home initially. No substitution for the originally selected households was allowed. Interviewers were instructed to make at least three visits in an effort to contact the household or eligible woman.
Note: See summarized response rates by place of residence in Table 1.2 of the survey report.
The estimates from a sample survey are affected by two types of errors: (1) non-sampling errors and (2) sampling errors. Non-sampling errors are the results of mistakes made in implementing data collection and data processing, such as failure to locate and interview the correct household, misunderstanding of the questions on the part of either the interviewer or the respondent, and data entry errors. Although numerous efforts were made during the implementation of the 1997 IDHS to minimize this type of error, non-sampling errors are impossible to avoid and difficult to evaluate statistically.
Sampling errors, on the other hand, can be evaluated statistically. The sample of respondents selected in the 1997 IDHS is only one of many samples that could have been selected from the same population, using the same design and expected size. Each of these samples would yield results that differ somewhat from the results of the actual sample selected. Sampling errors are a measure of the variability between all possible samples. Although the degree of variability is not known exactly, it can be estimated from the survey results.
A sampling error is usually measured in terms of the standard error for a particular statistic (mean, percentage, etc.), which is the square root of the variance. The standard error can be used to calculate confidence intervals within which the true value for the population can reasonably be assumed to fall. For example, for any given statistic calculated from a sample survey, the value of that statistic will fall within a range of plus or minus two times the standard error of that statistic in 95 percent of all possible samples of identical size and design.
If the sample of respondents had been selected as a simple random sample, it would have been possible to use straightforward formulas for calculating sampling errors. However, the 1997 IDHS sample is the result of a multi-stage stratified design, and, consequently, it was necessary to use more complex formulae. The computer software used to calculate sampling errors for the 1997 IDHS is the ISSA Sampling Error Module. This module
CC0 1.0 Universal Public Domain Dedicationhttps://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Supporting documentation on code lists, subject definitions, data accuracy, and statistical testing can be found on the American Community Survey website in the Data and Documentation section...Sample size and data quality measures (including coverage rates, allocation rates, and response rates) can be found on the American Community Survey website in the Methodology section..Although the American Community Survey (ACS) produces population, demographic and housing unit estimates, it is the Census Bureau''s Population Estimates Program that produces and disseminates the official estimates of the population for the nation, states, counties, cities and towns and estimates of housing units for states and counties..Explanation of Symbols:An ''**'' entry in the margin of error column indicates that either no sample observations or too few sample observations were available to compute a standard error and thus the margin of error. A statistical test is not appropriate..An ''-'' entry in the estimate column indicates that either no sample observations or too few sample observations were available to compute an estimate, or a ratio of medians cannot be calculated because one or both of the median estimates falls in the lowest interval or upper interval of an open-ended distribution..An ''-'' following a median estimate means the median falls in the lowest interval of an open-ended distribution..An ''+'' following a median estimate means the median falls in the upper interval of an open-ended distribution..An ''***'' entry in the margin of error column indicates that the median falls in the lowest interval or upper interval of an open-ended distribution. A statistical test is not appropriate..An ''*****'' entry in the margin of error column indicates that the estimate is controlled. A statistical test for sampling variability is not appropriate. 
.An ''N'' entry in the estimate and margin of error columns indicates that data for this geographic area cannot be displayed because the number of sample cases is too small..An ''(X)'' means that the estimate is not applicable or not available..Estimates of urban and rural population, housing units, and characteristics reflect boundaries of urban areas defined based on Census 2000 data. Boundaries for urban areas have not been updated since Census 2000. As a result, data for urban and rural areas from the ACS do not necessarily reflect the results of ongoing urbanization..While the 2007-2011 American Community Survey (ACS) data generally reflect the December 2009 Office of Management and Budget (OMB) definitions of metropolitan and micropolitan statistical areas; in certain instances the names, codes, and boundaries of the principal cities shown in ACS tables may differ from the OMB definitions due to differences in the effective dates of the geographic entities..The methodology for calculating median income and median earnings changed between 2008 and 2009. Medians over $75,000 were most likely affected. The underlying income and earning distribution now uses $2,500 increments up to $250,000 for households, non-family households, families, and individuals and employs a linear interpolation method for median calculations. Before 2009 the highest income category was $200,000 for households, families and non-family households ($100,000 for individuals) and portions of the income and earnings distribution contained intervals wider than $2,500. Those cases used a Pareto Interpolation Method..Data are based on a sample and are subject to sampling variability. The degree of uncertainty for an estimate arising from sampling variability is represented through the use of a margin of error. The value shown here is the 90 percent margin of error. 
The margin of error can be interpreted roughly as providing a 90 percent probability that the interval defined by the estimate minus the margin of error and the estimate plus the margin of error (the lower and upper confidence bounds) contains the true value. In addition to sampling variability, the ACS estimates are subject to nonsampling error (for a discussion of nonsampling variability, see Accuracy of the Data). The effect of nonsampling error is not represented in these tables..Source: U.S. Census Bureau, 2007-2011 American Community Survey
CC0 1.0 Universal Public Domain Dedicationhttps://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Supporting documentation on code lists, subject definitions, data accuracy, and statistical testing can be found on the American Community Survey website in the Data and Documentation section...Sample size and data quality measures (including coverage rates, allocation rates, and response rates) can be found on the American Community Survey website in the Methodology section..Although the American Community Survey (ACS) produces population, demographic and housing unit estimates, it is the Census Bureau''s Population Estimates Program that produces and disseminates the official estimates of the population for the nation, states, counties, cities and towns and estimates of housing units for states and counties..Explanation of Symbols:An ''**'' entry in the margin of error column indicates that either no sample observations or too few sample observations were available to compute a standard error and thus the margin of error. A statistical test is not appropriate..An ''-'' entry in the estimate column indicates that either no sample observations or too few sample observations were available to compute an estimate, or a ratio of medians cannot be calculated because one or both of the median estimates falls in the lowest interval or upper interval of an open-ended distribution..An ''-'' following a median estimate means the median falls in the lowest interval of an open-ended distribution..An ''+'' following a median estimate means the median falls in the upper interval of an open-ended distribution..An ''***'' entry in the margin of error column indicates that the median falls in the lowest interval or upper interval of an open-ended distribution. A statistical test is not appropriate..An ''*****'' entry in the margin of error column indicates that the estimate is controlled. A statistical test for sampling variability is not appropriate. 
.An ''N'' entry in the estimate and margin of error columns indicates that data for this geographic area cannot be displayed because the number of sample cases is too small..An ''(X)'' means that the estimate is not applicable or not available..Estimates of urban and rural population, housing units, and characteristics reflect boundaries of urban areas defined based on Census 2000 data. Boundaries for urban areas have not been updated since Census 2000. As a result, data for urban and rural areas from the ACS do not necessarily reflect the results of ongoing urbanization..While the 2007-2011 American Community Survey (ACS) data generally reflect the December 2009 Office of Management and Budget (OMB) definitions of metropolitan and micropolitan statistical areas; in certain instances the names, codes, and boundaries of the principal cities shown in ACS tables may differ from the OMB definitions due to differences in the effective dates of the geographic entities..The methodology for calculating median income and median earnings changed between 2008 and 2009. Medians over $75,000 were most likely affected. The underlying income and earning distribution now uses $2,500 increments up to $250,000 for households, non-family households, families, and individuals and employs a linear interpolation method for median calculations. Before 2009 the highest income category was $200,000 for households, families and non-family households ($100,000 for individuals) and portions of the income and earnings distribution contained intervals wider than $2,500. Those cases used a Pareto Interpolation Method..Data are based on a sample and are subject to sampling variability. The degree of uncertainty for an estimate arising from sampling variability is represented through the use of a margin of error. The value shown here is the 90 percent margin of error. 
The margin of error can be interpreted roughly as providing a 90 percent probability that the interval defined by the estimate minus the margin of error and the estimate plus the margin of error (the lower and upper confidence bounds) contains the true value. In addition to sampling variability, the ACS estimates are subject to nonsampling error (for a discussion of nonsampling variability, see Accuracy of the Data). The effect of nonsampling error is not represented in these tables..Source: U.S. Census Bureau, 2007-2011 American Community Survey
CC0 1.0 Universal Public Domain Dedicationhttps://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Supporting documentation on code lists, subject definitions, data accuracy, and statistical testing can be found on the American Community Survey website in the Data and Documentation section.

Sample size and data quality measures (including coverage rates, allocation rates, and response rates) can be found on the American Community Survey website in the Methodology section.

Although the American Community Survey (ACS) produces population, demographic and housing unit estimates, it is the Census Bureau's Population Estimates Program that produces and disseminates the official estimates of the population for the nation, states, counties, cities and towns and estimates of housing units for states and counties.

Explanation of Symbols:
- An '**' entry in the margin of error column indicates that either no sample observations or too few sample observations were available to compute a standard error and thus the margin of error. A statistical test is not appropriate.
- An '-' entry in the estimate column indicates that either no sample observations or too few sample observations were available to compute an estimate, or a ratio of medians cannot be calculated because one or both of the median estimates falls in the lowest interval or upper interval of an open-ended distribution.
- An '-' following a median estimate means the median falls in the lowest interval of an open-ended distribution.
- An '+' following a median estimate means the median falls in the upper interval of an open-ended distribution.
- An '***' entry in the margin of error column indicates that the median falls in the lowest interval or upper interval of an open-ended distribution. A statistical test is not appropriate.
- An '*****' entry in the margin of error column indicates that the estimate is controlled. A statistical test for sampling variability is not appropriate.
- An 'N' entry in the estimate and margin of error columns indicates that data for this geographic area cannot be displayed because the number of sample cases is too small.
- An '(X)' means that the estimate is not applicable or not available.

Estimates of urban and rural population, housing units, and characteristics reflect boundaries of urban areas defined based on Census 2000 data. Boundaries for urban areas have not been updated since Census 2000. As a result, data for urban and rural areas from the ACS do not necessarily reflect the results of ongoing urbanization.

While the 2007-2011 American Community Survey (ACS) data generally reflect the December 2009 Office of Management and Budget (OMB) definitions of metropolitan and micropolitan statistical areas, in certain instances the names, codes, and boundaries of the principal cities shown in ACS tables may differ from the OMB definitions due to differences in the effective dates of the geographic entities.

The methodology for calculating median income and median earnings changed between 2008 and 2009. Medians over $75,000 were most likely affected. The underlying income and earning distribution now uses $2,500 increments up to $250,000 for households, non-family households, families, and individuals and employs a linear interpolation method for median calculations. Before 2009 the highest income category was $200,000 for households, families and non-family households ($100,000 for individuals) and portions of the income and earnings distribution contained intervals wider than $2,500. Those cases used a Pareto Interpolation Method.

Data are based on a sample and are subject to sampling variability. The degree of uncertainty for an estimate arising from sampling variability is represented through the use of a margin of error. The value shown here is the 90 percent margin of error. The margin of error can be interpreted roughly as providing a 90 percent probability that the interval defined by the estimate minus the margin of error and the estimate plus the margin of error (the lower and upper confidence bounds) contains the true value. In addition to sampling variability, the ACS estimates are subject to nonsampling error (for a discussion of nonsampling variability, see Accuracy of the Data). The effect of nonsampling error is not represented in these tables.

Source: U.S. Census Bureau, 2007-2011 American Community Survey
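The relationship between an estimate, its 90 percent margin of error, and the confidence bounds described above can be sketched in a few lines of Python. This is an illustration only: the function names and the sample figures are hypothetical, and the critical values 1.645 (90 percent) and 1.96 (95 percent) are the usual normal-approximation constants, assumed here rather than taken from this table.

```python
Z_90 = 1.645  # assumed critical value behind a 90 percent margin of error
Z_95 = 1.960  # assumed critical value for a 95 percent interval


def moe_to_se(moe, z=Z_90):
    """Approximate standard error implied by a published margin of error."""
    return moe / z


def confidence_bounds(estimate, moe):
    """Lower and upper bounds: estimate minus and plus the margin of error."""
    return estimate - moe, estimate + moe


def rescale_moe(moe, z_target=Z_95, z_source=Z_90):
    """Convert a margin of error from one confidence level to another."""
    return moe_to_se(moe, z_source) * z_target


if __name__ == "__main__":
    # Hypothetical figures, not values from the ACS tables.
    estimate, moe_90 = 2.63, 0.05
    low, high = confidence_bounds(estimate, moe_90)
    print(f"90% bounds: {low:.2f} to {high:.2f}")
    print(f"implied standard error: {moe_to_se(moe_90):.4f}")
    print(f"95% margin of error: {rescale_moe(moe_90):.4f}")
```

The sketch covers sampling variability only; nonsampling error is not captured by any of these quantities.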
The 2019-20 Rwanda Demographic and Health Survey (2019-20 RDHS) follows those implemented in 1992, 2000, 2005, 2010, and 2014-15. A nationally representative sample of 500 clusters and 13,000 households was selected. All women age 15-49 who were usual residents of the selected households or who slept in the households the night before the survey were eligible for the survey.
The primary objective of the 2019-20 RDHS is to provide up-to-date estimates of basic demographic and health indicators. Specifically, the 2019-20 RDHS:
• collected data on fertility levels and preferences; contraceptive use; maternal and child health; infant, child, and neonatal mortality levels; maternal mortality; gender; nutrition; awareness about HIV/AIDS; self-reported sexually transmitted infections (STIs); and other health issues relevant to the achievement of the Sustainable Development Goals (SDGs)
• obtained information on the availability of, access to, and use of mosquito nets as part of the National Malaria Control Program
• gathered information on other health issues such as injections, tobacco use, and health insurance
• collected data on women’s empowerment and domestic violence
• tested household salt for iodine levels
• obtained data on child feeding practices, including breastfeeding, and conducted anthropometric measurements to assess the nutritional status of children under age 5 and women age 15-49
• conducted anemia testing of women age 15-49 and children age 6-59 months
• conducted malaria testing of women age 15-49 and children age 6-59 months
• conducted HIV testing of women age 15-49 and men age 15-59
• conducted micronutrient testing of women age 15-49 and children age 6-59 months
The information collected through the 2019-20 RDHS is intended to assist policymakers and program managers in evaluating and designing programs and strategies for improving the health of the country’s population.
National coverage
The survey covered all de jure household members (usual residents), all women age 15-49, all men age 15-59, and all children age 0-5 resident in the household.
Sample survey data [ssd]
The sampling frame used for the 2019-20 RDHS is the fourth Rwanda Population and Housing Census (RPHC), which was conducted in 2012 by the National Institute of Statistics of Rwanda (NISR). The sampling frame is a complete list of enumeration areas (EAs) covering the whole country provided by the National Institute of Statistics, the implementing agency for the RDHS. An EA is a natural village or part of a village created for the 2012 RPHC; these areas served as the counting units for the census.
The 2019-20 RDHS followed a two-stage sample design and was intended to allow estimates of key indicators at the national level as well as for urban and rural areas, five provinces, and each of Rwanda’s 30 districts for some limited indicators. The first stage involved selecting sample points (clusters) consisting of EAs delineated for the 2012 RPHC. A total of 500 clusters were selected, 112 in urban areas and 388 in rural areas.
The second stage involved systematic sampling of households. A household listing operation was undertaken in all selected EAs from June to August 2019, and households to be included in the survey were randomly selected from these lists. Twenty-six households were selected from each sample point, for a total sample size of 13,000 households. Because of the approximately equal sample sizes in each district, the sample is not self-weighting at the national level, and weighting factors have been added to the data file so that the results will be proportional at the national level.
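The second-stage selection described above can be sketched as a fixed-interval systematic draw from a household listing. This is a minimal illustration, not the procedure NISR actually ran: the function name, the fractional-interval method, and the listing of 150 households are assumptions; only the figures of 26 households per cluster and 500 clusters come from the text.

```python
import random


def systematic_sample(listing, n, rng=None):
    """Draw n units from an ordered listing using a random start and a fixed interval."""
    if n <= 0 or n > len(listing):
        raise ValueError("n must be between 1 and the length of the listing")
    rng = rng or random.Random()
    interval = len(listing) / n          # fractional sampling interval
    start = rng.random() * interval      # random start in [0, interval)
    last = len(listing) - 1
    # min() guards against a floating-point overshoot on the final position
    return [listing[min(int(start + k * interval), last)] for k in range(n)]


HOUSEHOLDS_PER_CLUSTER = 26   # from the survey description
CLUSTERS = 500                # from the survey description

if __name__ == "__main__":
    # Hypothetical listing of 150 households in one enumeration area.
    listing = [f"HH-{i:03d}" for i in range(1, 151)]
    chosen = systematic_sample(listing, HOUSEHOLDS_PER_CLUSTER, random.Random(2019))
    print(chosen[:5])
    print("planned sample size:", CLUSTERS * HOUSEHOLDS_PER_CLUSTER)
```

Because the interval is at least one listing position, no household can be drawn twice, and the selected households are spread evenly across the listing.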
For further details on sample selection, see Appendix A of the final report.
Computer Assisted Personal Interview [capi]
Five questionnaires were used for the 2019-20 RDHS: the Household Questionnaire, the Woman’s Questionnaire, the Man’s Questionnaire, the Biomarker Questionnaires, and the Fieldworker Questionnaire. These questionnaires, based on The DHS Program’s standard Demographic and Health Survey (DHS-7) questionnaires, were adapted to reflect the population and health issues relevant to Rwanda.
The processing of the 2019-20 RDHS data began almost as soon as the fieldwork started. As data collection was completed in each cluster, all electronic data files were transferred via the Internet File Streaming System (IFSS) to the NISR central office in City of Kigali. These data files were registered and checked for inconsistencies, incompleteness, and outliers. The field teams were alerted to any inconsistencies and errors. Secondary editing, carried out in the central office, involved resolving inconsistencies and coding the open-ended questions. The NISR data processor coordinated the exercise at the central office. The biomarker paper questionnaires were compared with electronic data files to check for any inconsistencies in data entry. Data entry and editing were carried out using the CSPro software package. The concurrent processing of the data offered a distinct advantage because it maximized the likelihood of the data being error-free and accurate. Timely generation of field check tables allowed for effective monitoring. The secondary editing of the data was completed in the second week of September 2020.
A total of 13,005 households were selected for the sample, of which 12,951 were occupied. All but two occupied households (12,949) were successfully interviewed, yielding a response rate of 100.0%. In the interviewed households, 14,675 women age 15-49 were identified for individual interviews; interviews were completed with 14,634 women, yielding a response rate of 99.7%. In the subsample selected for the male survey, 6,503 households were selected, of which 6,472 were occupied. All but one occupied household (6,471) were successfully interviewed, yielding a response rate of 100.0%. In this subsample, 6,544 men age 15-59 were identified and 6,513 were successfully interviewed, yielding a response rate of 99.5%. In the subsample selected for the micronutrient survey, 3,501 households were selected, of which 3,492 were occupied. All but one of the occupied households (3,491) were successfully interviewed, yielding a response rate of 100.0%.
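The response rates quoted above follow directly from the counts in the paragraph. The short check below recomputes them; the helper name and dictionary layout are illustrative, while the counts are those reported in the text.

```python
def response_rate(completed, eligible):
    """Response rate as a percentage of eligible units."""
    return 100.0 * completed / eligible


# (completed, eligible) pairs taken from the paragraph above
COUNTS = {
    "households": (12_949, 12_951),
    "women 15-49": (14_634, 14_675),
    "households, male subsample": (6_471, 6_472),
    "men 15-59": (6_513, 6_544),
    "households, micronutrient subsample": (3_491, 3_492),
}

if __name__ == "__main__":
    for label, (completed, eligible) in COUNTS.items():
        print(f"{label}: {response_rate(completed, eligible):.1f}%")
```

The household rates print as 100.0% only because of rounding to one decimal place; in each case one or two occupied households were not interviewed.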
The estimates from a sample survey are affected by two types of errors: nonsampling errors and sampling errors. Nonsampling errors are the results of mistakes made in implementing data collection and data processing, such as failure to locate and interview the correct household, misunderstanding of the questions on the part of either the interviewer or the respondent, and data entry errors. Although numerous efforts were made during the implementation of the 2019-20 Rwanda Demographic and Health Survey (2019-20 RDHS) to minimize this type of error, nonsampling errors are impossible to avoid and difficult to evaluate statistically.
Sampling errors, on the other hand, can be evaluated statistically. The sample of respondents selected in the 2019-20 RDHS is only one of many samples that could have been selected from the same population, using the same design and sample size. Each of these samples would yield results that differ somewhat from the results of the actual sample selected. Sampling errors are a measure of the variability among all possible samples. Although the degree of variability is not known exactly, it can be estimated from the survey results.
Sampling error is usually measured in terms of the standard error for a particular statistic (mean, percentage, etc.), which is the square root of the variance. The standard error can be used to calculate confidence intervals within which the true value for the population can reasonably be assumed to fall. For example, for any given statistic calculated from a sample survey, the value of that statistic will fall within a range of plus or minus two times the standard error of that statistic in 95% of all possible samples of identical size and design.
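The plus-or-minus two standard errors rule in the paragraph above can be written as a small helper. The function names and the example figures are hypothetical and serve only to show the arithmetic.

```python
import math


def standard_error(variance):
    """Standard error is the square root of the sampling variance."""
    return math.sqrt(variance)


def approx_95_interval(estimate, se, multiplier=2.0):
    """Approximate 95% interval: estimate plus or minus two standard errors."""
    return estimate - multiplier * se, estimate + multiplier * se


if __name__ == "__main__":
    # Hypothetical statistic: a proportion of 0.500 with sampling variance 0.0004.
    se = standard_error(0.0004)
    low, high = approx_95_interval(0.500, se)
    print(f"SE = {se:.3f}, interval = {low:.3f} to {high:.3f}")
```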
If the sample of respondents had been selected by simple random sampling, it would have been possible to use straightforward formulas for calculating sampling errors. However, the 2019-20 RDHS sample was the result of a multi-stage stratified design, and, consequently, it was necessary to use more complex formulas. Sampling errors are computed using SAS programs developed by ICF. These programs use the Taylor linearization method to estimate variances for survey estimates that are means, proportions, or ratios. The Jackknife repeated replication method is used for variance estimation of more complex statistics such as fertility and mortality rates.
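To make the replication idea concrete, the following is a minimal sketch of a delete-one-cluster Jackknife for a ratio estimate. It is not the ICF SAS program: the function names, the data layout of (numerator, denominator) totals per cluster, and the toy figures are assumptions, and stratification and weights are left out for brevity.

```python
import math


def ratio(clusters):
    """Ratio of summed numerators to summed denominators across clusters."""
    return sum(y for y, _ in clusters) / sum(x for _, x in clusters)


def jackknife_se(clusters, estimator=ratio):
    """Delete-one-cluster Jackknife standard error of estimator(clusters)."""
    k = len(clusters)
    if k < 2:
        raise ValueError("at least two clusters are required")
    full = estimator(clusters)
    pseudo_values = []
    for i in range(k):
        reduced = clusters[:i] + clusters[i + 1:]   # drop cluster i
        pseudo_values.append(k * full - (k - 1) * estimator(reduced))
    variance = sum((p - full) ** 2 for p in pseudo_values) / (k * (k - 1))
    return math.sqrt(variance)


if __name__ == "__main__":
    # Hypothetical cluster totals: (events, exposure) per cluster.
    clusters = [(12, 100), (9, 95), (15, 110), (11, 98), (8, 90)]
    print(f"ratio = {ratio(clusters):.4f}, SE = {jackknife_se(clusters):.4f}")
```

Each replicate recomputes the statistic without one cluster, which is why the method suits statistics such as fertility and mortality rates that have no simple variance formula.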
Note: A more detailed description of estimates of sampling errors is presented in Appendix B of the survey report.
Data Quality Tables
The 2016 Ethiopia Demographic and Health Survey (EDHS) is the fourth Demographic and Health Survey conducted in Ethiopia. It was implemented by the Central Statistical Agency (CSA) at the request of the Federal Ministry of Health (FMoH). The primary objective of the 2016 EDHS is to provide up-to-date estimates of key demographic and health indicators. The EDHS provides a comprehensive overview of population, maternal, and child health issues in Ethiopia. More specifically, the 2016 EDHS:
- Collected data at the national level that allowed calculation of key demographic indicators, particularly fertility and under-5 and adult mortality rates
- Explored the direct and indirect factors that determine levels and trends of fertility and child mortality
- Measured levels of contraceptive knowledge and practice
- Collected data on key aspects of family health, including immunisation coverage among children, prevalence and treatment of diarrhoea and other diseases among children under age 5, and maternity care indicators such as antenatal visits and assistance at delivery
- Obtained data on child feeding practices, including breastfeeding
- Collected anthropometric measures to assess the nutritional status of children under age 5, women age 15-49, and men age 15-59
- Conducted haemoglobin testing on eligible children age 6-59 months, women age 15-49, and men age 15-59 to provide information on the prevalence of anaemia in these groups
- Collected data on knowledge and attitudes of women and men about sexually transmitted diseases and HIV/AIDS and evaluated potential exposure to the risk of HIV infection by exploring high-risk behaviours and condom use
- Conducted HIV testing of dried blood spot (DBS) samples collected from women age 15-49 and men age 15-59 to provide information on the prevalence of HIV among adults of reproductive age
- Collected data on the prevalence of injuries and accidents among all household members
- Collected data on knowledge and prevalence of fistula and female genital mutilation or cutting (FGM/C) among women age 15-49 and their daughters age 0-14
- Obtained data on women’s experience of emotional, physical, and sexual violence.
National
The survey covered all de jure household members (usual residents), women age 15-49 years and men age 15-59 years resident in the household.
Sample survey data [ssd]
The sampling frame used for the 2016 EDHS is the Ethiopia Population and Housing Census (PHC), which was conducted in 2007 by the Ethiopia Central Statistical Agency. The census frame is a complete list of 84,915 enumeration areas (EAs) created for the 2007 PHC. An EA is a geographic area covering on average 181 households. The sampling frame contains information about the EA location, type of residence (urban or rural), and estimated number of residential households. With the exception of EAs in six zones of the Somali region, each EA has accompanying cartographic materials. These materials delineate geographic locations, boundaries, main access, and landmarks in or outside the EA that help identify the EA. In Somali, a cartographic frame was used in three zones where sketch maps delineating the EA geographic boundaries were available for each EA; in the remaining six zones, satellite image maps were used to provide a map for each EA.
Administratively, Ethiopia is divided into nine geographical regions and two administrative cities. The sample for the 2016 EDHS was designed to provide estimates of key indicators for the country as a whole, for urban and rural areas separately, and for each of the nine regions and the two administrative cities.
The 2016 EDHS sample was stratified and selected in two stages. Each region was stratified into urban and rural areas, yielding 21 sampling strata. Samples of EAs were selected independently in each stratum in two stages. Implicit stratification and proportional allocation were achieved at each of the lower administrative levels by sorting the sampling frame within each sampling stratum before sample selection, according to administrative units in different levels, and by using a probability proportional to size selection at the first stage of sampling.
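The first-stage selection described above, a probability-proportional-to-size draw from a sorted frame, can be sketched as a systematic selection along cumulative sizes. This is an illustration under stated assumptions rather than the CSA procedure: the function names are hypothetical, the frame is assumed to be already sorted by administrative unit within one stratum, and the household counts are invented.

```python
import random
from itertools import accumulate


def pps_systematic(sizes, n, rng=None):
    """Select n unit indices with probability proportional to size (systematic PPS)."""
    if n <= 0 or not sizes:
        raise ValueError("need a non-empty frame and n >= 1")
    rng = rng or random.Random()
    cumulative = list(accumulate(sizes))
    interval = cumulative[-1] / n
    start = rng.random() * interval           # random start in [0, interval)
    selected, idx = [], 0
    for k in range(n):
        point = start + k * interval
        # advance to the unit whose cumulative size range contains the point
        while cumulative[idx] <= point and idx < len(sizes) - 1:
            idx += 1
        selected.append(idx)
    return selected


def inclusion_probability(size, total_size, n):
    """First-stage inclusion probability of a unit under PPS selection."""
    return n * size / total_size


if __name__ == "__main__":
    # Hypothetical stratum: household counts for 12 enumeration areas, frame order kept.
    sizes = [150, 210, 181, 95, 240, 175, 160, 199, 130, 220, 185, 170]
    picks = pps_systematic(sizes, 4, random.Random(2016))
    print("selected EA indices:", picks)
    print("inclusion probability of EA 0:",
          round(inclusion_probability(sizes[0], sum(sizes), 4), 4))
```

Keeping the frame in its sorted order is what produces the implicit stratification mentioned in the text, since the fixed interval spreads the selected EAs across the administrative units. A unit larger than the sampling interval could be hit more than once; real designs handle such units separately.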
For further details on sample design, see Appendix A of the final report.
Face-to-face [f2f]
Five questionnaires were used for the 2016 EDHS: the Household Questionnaire, the Woman’s Questionnaire, the Man’s Questionnaire, the Biomarker Questionnaire, and the Health Facility Questionnaire. These questionnaires, based on the DHS Program’s standard Demographic and Health Survey questionnaires, were adapted to reflect the population and health issues relevant to Ethiopia. Input was solicited from various stakeholders representing government ministries and agencies, nongovernmental organisations, and international donors. After all questionnaires were finalised in English, they were translated into Amarigna, Tigrigna, and Oromiffa.
All electronic data files for the 2016 EDHS were transferred via IFSS to the CSA central office in Addis Ababa, where they were stored on a password-protected computer. The data processing operation included secondary editing, which required resolution of computer-identified inconsistencies and coding of open-ended questions; it also required generating a file for the list of children for whom a vaccination card was not seen by the interviewers and whose vaccination records had to be checked at health facilities. The data were processed by two individuals who took part in the main fieldwork training; they were supervised by two senior staff from CSA. Data editing was accomplished using CSPro software. During fieldwork, tables were generated to check various data quality parameters, and specific feedback was given to the teams to improve performance. Secondary editing and data processing were initiated in January 2016 and completed in August 2016.
A total of 18,008 households were selected for the sample, of which 17,067 were occupied. Of the occupied households, 16,650 were successfully interviewed, yielding a response rate of 98%.
In the interviewed households, 16,583 eligible women were identified for individual interviews. Interviews were completed with 15,683 women, yielding a response rate of 95%. A total of 14,795 eligible men were identified in the sampled households and 12,688 were successfully interviewed, yielding a response rate of 86%. Although overall there was little variation in response rates according to residence, response rates among men were higher in rural than in urban areas.
The estimates from a sample survey are affected by two types of errors: non-sampling errors and sampling errors. Non-sampling errors are the results of mistakes made in implementing data collection and data processing, such as failure to locate and interview the correct household, misunderstanding of the questions by either the interviewer or the respondent, and data entry errors. Although numerous efforts were made during the implementation of the 2016 Ethiopia DHS (EDHS) to minimise this type of error, non-sampling errors are impossible to avoid and difficult to evaluate statistically.
Sampling errors, on the other hand, can be evaluated statistically. The sample of respondents selected in the 2016 EDHS is only one of many samples that could have been selected from the same population, by using the same design and the expected size. Each of those samples would yield results that differ somewhat from the results of the actual sample selected. Sampling errors are a measure of the variability between all possible samples. Although the degree of variability is not known exactly, it can be estimated from the survey results.
Sampling error is usually measured in terms of the standard error for a particular statistic (such as mean or percentage), which is the square root of the variance. The standard error can be used to calculate confidence intervals within which the true value for the population can reasonably be assumed to fall. For example, for any given statistic calculated from a sample survey, the value of that statistic will fall within a range of plus or minus two times the standard error of that statistic in 95% of all possible samples of identical size and design.
If the sample of respondents had been selected as a simple random sample, it would have been possible to use straightforward formulas for calculating sampling errors. However, the 2016 EDHS sample is the result of a multi-stage stratified design and, consequently, it was necessary to use more complex formulae. Sampling errors are computed in either ISSA or SAS, with programs developed by ICF International. These programs use the Taylor linearisation method of variance estimation for survey estimates that are means, proportions, or ratios. The Jackknife repeated replication method is used for variance estimation of more complex statistics such as fertility and mortality rates.
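As a rough illustration of the Taylor linearisation approach for a ratio r = y/x, the sketch below linearises the ratio through the cluster-level residuals z = y - r*x and sums their between-cluster variability within each stratum. It is not the ISSA or SAS program referred to above: the function name, the input layout of (stratum, y total, x total) per cluster, and the toy data are assumptions, and the finite population correction is ignored.

```python
import math
from collections import defaultdict


def taylor_ratio_se(clusters):
    """Return (r, SE(r)) for r = sum(y) / sum(x) under a stratified cluster design.

    clusters: list of (stratum, y_total, x_total) tuples of weighted cluster totals.
    """
    y = sum(c[1] for c in clusters)
    x = sum(c[2] for c in clusters)
    r = y / x
    residuals = defaultdict(list)
    for stratum, y_hi, x_hi in clusters:
        residuals[stratum].append(y_hi - r * x_hi)   # linearised value z_hi
    total = 0.0
    for z in residuals.values():
        m = len(z)
        if m < 2:
            raise ValueError("each stratum needs at least two clusters")
        z_h = sum(z)
        total += m / (m - 1) * (sum(v * v for v in z) - z_h ** 2 / m)
    return r, math.sqrt(total) / abs(x)


if __name__ == "__main__":
    # Hypothetical weighted totals for six clusters in two strata.
    data = [
        ("urban", 12.0, 100.0), ("urban", 9.0, 95.0), ("urban", 15.0, 110.0),
        ("rural", 20.0, 120.0), ("rural", 17.0, 105.0), ("rural", 22.0, 130.0),
    ]
    r, se = taylor_ratio_se(data)
    print(f"r = {r:.4f}, SE = {se:.4f}")
```

Means and proportions are special cases of the ratio, which is why a single routine of this kind can cover all three types of estimate.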
A more detailed description of estimates of sampling errors is presented in Appendix B of the survey final report.
Data Quality Tables
- Household age distribution
- Age distribution of eligible and interviewed women
- Age distribution of eligible and interviewed men
- Completeness of reporting
- Births by calendar
Urban and regional planners rely on Average Household Size as a foundational indicator for many of their models, calculations, and plans. Average household size (also known as "people per household") is a reflection of many dynamics at play, for example:
- Age of the population, as many older people tend to live in smaller households (one-person or two-person households)
- Housing prices in the area, proximity to colleges and universities, and how likely people are to live with roommates
- Family norms and traditions (e.g., multigenerational families are more common in some areas and with some population groups)

This feature layer contains the Average Household Size and Population Density for states, counties, and tracts. The data come from the U.S. Census Bureau's 2014-2018 American Community Survey 5-year estimates, Tables B25010 and B01001. Population Density was calculated based on the total population and area of land fields, which both came from the U.S. Census Bureau. See the field description for the formula used.

This layer is symbolized to show the average household size. Population density, as well as the average household size breakdown by housing tenure, is presented in the pop-up. Click the Data tab -> Fields list to see all available attributes and their definitions.