The American Community Survey Education Tabulation (ACS-ED) is a custom tabulation of the ACS produced for the National Center for Education Statistics (NCES) by the U.S. Census Bureau. The ACS-ED provides a rich collection of social, economic, demographic, and housing characteristics for school systems, school-age children, and the parents of school-age children. In addition to focusing on school-age children, the ACS-ED provides enrollment iterations for children enrolled in public school. The data profiles include percentages (along with associated margins of error) that allow for comparison of school district-level conditions across the U.S. For more information about the NCES ACS-ED collection, visit the NCES Education Demographic and Geographic Estimates (EDGE) program at: https://nces.ed.gov/programs/edge/Demographic/ACS
Annotation values are negative value representations of estimates and have values when non-integer information needs to be represented. See the table below for a list of common Estimate/Margin of Error (E/M) values and their corresponding Annotation (EA/MA) values.
All information contained in this file is in the public domain. Data users are advised to review NCES program documentation and feature class metadata to understand the limitations and appropriate use of these data.
-9    A '-9' entry in the estimate and margin of error columns indicates that data for this geographic area cannot be displayed because the number of sample cases is too small.
-8    A '-8' means that the estimate is not applicable or not available.
-6    A '-6' entry in the estimate column indicates that either no sample observations or too few sample observations were available to compute an estimate, or a ratio of medians cannot be calculated because one or both of the median estimates falls in the lowest interval or upper interval of an open-ended distribution.
-5    A '-5' entry in the margin of error column indicates that the estimate is controlled. A statistical test for sampling variability is not appropriate.
-3    A '-3' entry in the margin of error column indicates that the median falls in the lowest interval or upper interval of an open-ended distribution. A statistical test is not appropriate.
-2    A '-2' entry in the margin of error column indicates that either no sample observations or too few sample observations were available to compute a standard error and thus the margin of error. A statistical test is not appropriate.
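The annotation codes listed above have to be separated from genuine numbers before any arithmetic is done on the estimates. The following is a minimal Python sketch of one way to do that. The field names in the sample row and the short labels are hypothetical, and it assumes that a genuine estimate or margin of error is never exactly equal to one of the codes.

```python
# Codes are taken from the annotation table above; the short labels are
# paraphrases for illustration, not official NCES wording.
ANNOTATIONS = {
    -9: "suppressed: too few sample cases",
    -8: "not applicable or not available",
    -6: "estimate could not be computed",
    -5: "estimate is controlled; no sampling-variability test",
    -3: "median in an open-ended interval; no test",
    -2: "margin of error could not be computed",
}


def split_annotation(value):
    """Split one estimate or margin-of-error cell into (number, annotation).

    Exactly one element of the returned pair is None.
    """
    if value in ANNOTATIONS:
        return None, ANNOTATIONS[value]
    return value, None


# Hypothetical row: the field names are made up for this example.
row = {"pct_enrolled_E": 87.4, "pct_enrolled_M": -2}
cleaned = {name: split_annotation(value) for name, value in row.items()}
```

Keeping the annotation next to the blanked-out number preserves the reason a value is missing, which matters when deciding whether a district can be compared with others.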
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
analyze the current population survey (cps) annual social and economic supplement (asec) with r
the annual march cps-asec has been supplying the statistics for the census bureau's report on income, poverty, and health insurance coverage since 1948. wow. the us census bureau and the bureau of labor statistics (bls) tag-team on this one. until the american community survey (acs) hit the scene in the early aughts (2000s), the current population survey had the largest sample size of all the annual general demographic data sets outside of the decennial census - about two hundred thousand respondents. this provides enough sample to conduct state- and a few large metro area-level analyses. your sample size will vanish if you start investigating subgroups by state - consider pooling multiple years. county-level is a no-no.
despite the american community survey's larger size, the cps-asec contains many more variables related to employment, sources of income, and insurance - and can be trended back to harry truman's presidency. aside from questions specifically asked about an annual experience (like income), many of the questions in this march data set should be treated as point-in-time statistics. cps-asec generalizes to the united states non-institutional, non-active duty military population.
the national bureau of economic research (nber) provides sas, spss, and stata importation scripts to create a rectangular file (rectangular data means only person-level records; household- and family-level information gets attached to each person). to import these files into r, the parse.SAScii function uses nber's sas code to determine how to import the fixed-width file, then RSQLite to put everything into a schnazzy database. you can try reading through the nber march 2012 sas importation code yourself, but it's a bit of a proc freak show.
this new github repository contains three scripts:
2005-2012 asec - download all microdata.R
- download the fixed-width file containing household, family, and person records
- import by separating this file into three tables, then merge 'em together at the person-level
- download the fixed-width file containing the person-level replicate weights
- merge the rectangular person-level file with the replicate weights, then store it in a sql database
- create a new variable - one - in the data table
2012 asec - analysis examples.R
- connect to the sql database created by the 'download all microdata' program
- create the complex sample survey object, using the replicate weights
- perform a boatload of analysis examples
replicate census estimates - 2011.R
- connect to the sql database created by the 'download all microdata' program
- create the complex sample survey object, using the replicate weights
- match the sas output shown in the png file below
2011 asec replicate weight sas output.png
- statistic and standard error generated from the replicate-weighted example sas script contained in this census-provided person replicate weights usage instructions document.
click here to view these three scripts
for more detail about the current population survey - annual social and economic supplement (cps-asec), visit:
- the census bureau's current population survey page
- the bureau of labor statistics' current population survey page
- the current population survey's wikipedia article
notes:
interviews are conducted in march about experiences during the previous year. the file labeled 2012 includes information (income, work experience, health insurance) pertaining to 2011. when you use the current population survey to talk about america, subtract a year from the data file name.
as of the 2010 file (the interview focusing on america during 2009), the cps-asec contains exciting new medical out-of-pocket spending variables most useful for supplemental (medical spending-adjusted) poverty research.
confidential to sas, spss, stata, sudaan users: why are you still rubbing two sticks together after we've invented the butane lighter? time to transition to r. :D
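The "rectangular file" idea in the CPS-ASEC notes above (household- and family-level fields attached to every person record, stored in a SQL database, plus a constant column named one) can be sketched as follows. This is an illustration only and not the repository's code: it uses Python and the standard-library sqlite3 module rather than R and RSQLite, and every table name, column name, and sample row is hypothetical.

```python
import sqlite3


def build_rectangular(households, families, persons):
    """Attach household- and family-level fields to every person record."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE household (hh_id INTEGER, state TEXT)")
    con.execute(
        "CREATE TABLE family (hh_id INTEGER, fam_id INTEGER, fam_income INTEGER)"
    )
    con.execute(
        "CREATE TABLE person "
        "(hh_id INTEGER, fam_id INTEGER, person_id INTEGER, age INTEGER)"
    )
    con.executemany("INSERT INTO household VALUES (?, ?)", households)
    con.executemany("INSERT INTO family VALUES (?, ?, ?)", families)
    con.executemany("INSERT INTO person VALUES (?, ?, ?, ?)", persons)
    rows = con.execute(
        """
        SELECT p.hh_id, p.fam_id, p.person_id, p.age,
               h.state, f.fam_income,
               1 AS one  -- constant column, handy for weighted counts
        FROM person AS p
        JOIN household AS h ON h.hh_id = p.hh_id
        JOIN family AS f ON f.hh_id = p.hh_id AND f.fam_id = p.fam_id
        ORDER BY p.hh_id, p.person_id
        """
    ).fetchall()
    con.close()
    return rows


def reference_year(file_year):
    """Income-type items in the file labeled N describe year N - 1."""
    return file_year - 1


# Hypothetical toy records, one tuple per row.
rectangular = build_rectangular(
    households=[(1, "NY"), (2, "CA")],
    families=[(1, 1, 50000), (2, 1, 72000)],
    persons=[(1, 1, 1, 40), (1, 1, 2, 38), (2, 1, 1, 29)],
)
```

Each person row now carries its household's state and its family's income, which is the shape the analysis scripts expect before the replicate weights are merged on.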
These Public Use Microdata Sample (PUMS) files contain records representing 1-percent samples of the occupied and vacant housing units in the United States and the people in the occupied units in 2000. Group quarters people also are included. The files contain individual weights for each person and housing unit, which when applied to the individual records, expand the sample to the relevant total. Some of the items included on the housing record are: acreage, agricultural sales, bedrooms, condominium fee, contract rent, cost of utilities, family income in 1999, farm residence, fire, hazard, and flood insurance, fuels used, gross rent, heating fuel, household income in 1999, household type, kitchen facilities, linguistic isolation, meals included in rent, mobile home costs, mortgage payment, mortgage status, plumbing facilities, presence and age of own children, presence of subfamilies in household, real estate taxes, rooms, selected monthly owner costs, size of building (units in structure), telephone service, tenure, vacancy status, value (of housing unit), vehicles available, year householder moved into unit, and year structure was built. Some of the items included on the person record are: ability to speak English, age, ancestry, citizenship, class of worker, disability status, earnings in 1999, educational attainment, grandparents as caregivers, Hispanic origin, hours worked, income in 1999 by type, industry, language spoken at home, marital status, means of transportation to work, migration Public Use Microdata Area (PUMA), migration state, mobility status, veteran period of service, years of military service, occupation, personal care limitation, place of birth, place of work PUMA, place of work state, poverty status in 1999, race, relationship, school enrollment and type of school, time of departure for work, travel time to work, vehicle occupancy, weeks worked in 1999, work limitation status, work status in 1999, and year of entry. 
The Public Use Microdata Sample (PUMS) files contain geographic units known as super-Public Use Microdata Areas (super-PUMAs) and Public Use Microdata Areas (PUMAs). To maintain the confidentiality of the PUMS data, minimum population thresholds are set for PUMAs and super-PUMAs. For the 1-percent state-level files, the super-PUMAs contain a minimum population of 400,000 and are composed of a PUMA or a group of contiguous PUMAs delineated on the 5-percent state-level PUMS files. Super-PUMAs are a new geographic entity for Census 2000. Super-PUMAs and PUMAs also are defined for place of residence on April 1, 1995, and place of work. (Source: ICPSR, retrieved 06/15/2011)
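The PUMS description above says that the individual weights, when applied to the records, expand the sample to the relevant total. A minimal Python sketch of that expansion is shown here. The record layout, the key names, and the weight values are hypothetical and are not taken from the PUMS data dictionary.

```python
def weighted_total(records, weight_key="weight", predicate=None):
    """Sum the weights of all records that satisfy the predicate.

    With predicate=None every record counts, giving the estimated
    population total represented by the sample.
    """
    return sum(
        record[weight_key]
        for record in records
        if predicate is None or predicate(record)
    )


# Hypothetical person records from a 1-percent file: each sampled person
# stands in for roughly one hundred people.
people = [
    {"age": 34, "weight": 98},
    {"age": 70, "weight": 105},
    {"age": 12, "weight": 101},
]

everyone = weighted_total(people)
adults = weighted_total(people, predicate=lambda r: r["age"] >= 18)
```

Counting records without the weights would describe the sample, not the population, so any tabulation from these files should sum weights rather than rows.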
The 1998 Ghana Demographic and Health Survey (GDHS) is the latest in a series of national-level population and health surveys conducted in Ghana and it is part of the worldwide MEASURE DHS+ Project, designed to collect data on fertility, family planning, and maternal and child health.
The primary objective of the 1998 GDHS is to provide current and reliable data on fertility and family planning behaviour, child mortality, children’s nutritional status, and the utilisation of maternal and child health services in Ghana. Additional data on knowledge of HIV/AIDS are also provided. This information is essential for informed policy decisions, planning and monitoring and evaluation of programmes at both the national and local government levels.
The long-term objectives of the survey include strengthening the technical capacity of the Ghana Statistical Service (GSS) to plan, conduct, process, and analyse the results of complex national sample surveys. Moreover, the 1998 GDHS provides comparable data for long-term trend analyses within Ghana, since it is the third in a series of demographic and health surveys implemented by the same organisation, using similar data collection procedures. The GDHS also contributes to the ever-growing international database on demographic and health-related variables.
National
Sample survey data
The major focus of the 1998 GDHS was to provide updated estimates of important population and health indicators including fertility and mortality rates for the country as a whole and for urban and rural areas separately. In addition, the sample was designed to provide estimates of key variables for the ten regions in the country.
The list of Enumeration Areas (EAs) with population and household information from the 1984 Population Census was used as the sampling frame for the survey. The 1998 GDHS is based on a two-stage stratified nationally representative sample of households. At the first stage of sampling, 400 EAs were selected using systematic sampling with probability proportional to size (PPS-Method). The selected EAs comprised 138 in the urban areas and 262 in the rural areas. A complete household listing operation was then carried out in all the selected EAs to provide a sampling frame for the second stage selection of households. At the second stage of sampling, a systematic sample of 15 households per EA was selected in all regions, except in the Northern, Upper West and Upper East Regions. In order to obtain adequate numbers of households to provide reliable estimates of key demographic and health variables in these three regions, the number of households in each selected EA in the Northern, Upper West and Upper East regions was increased to 20. The sample was weighted to adjust for over sampling in the three northern regions (Northern, Upper East and Upper West), in relation to the other regions. Sample weights were used to compensate for the unequal probability of selection between geographically defined strata.
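The first-stage selection described above, systematic sampling with probability proportional to size, can be sketched in Python as follows. This is a generic textbook version written for illustration. It ignores stratification, the enumeration-area sizes are invented, and it is not the procedure actually run for the 1998 GDHS.

```python
import random


def pps_systematic(sizes, n, rng=random):
    """Select n units systematically with probability proportional to size.

    sizes: measure of size per unit (for example, households per EA).
    Returns the indices of the selected units in frame order. A unit larger
    than the sampling interval can be selected more than once.
    """
    total = sum(sizes)
    interval = total / n
    start = rng.random() * interval  # random start in [0, interval)
    targets = [start + k * interval for k in range(n)]

    selected = []
    cumulative = 0.0
    next_target = 0
    for unit, size in enumerate(sizes):
        cumulative += size
        # Select this unit once for every target inside its size range.
        while next_target < n and targets[next_target] < cumulative:
            selected.append(unit)
            next_target += 1
    return selected


# Hypothetical frame of eight enumeration areas with household counts.
ea_sizes = [120, 300, 80, 500, 200, 150, 90, 410]
chosen = pps_systematic(ea_sizes, n=3, rng=random.Random(1998))
```

Walking a fixed interval down the cumulated sizes gives larger enumeration areas a proportionally higher chance of selection while still spreading the sample across the whole frame.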
The survey was designed to obtain completed interviews of 4,500 women age 15-49. In addition, all males age 15-59 in every third selected household were interviewed, to obtain a target of 1,500 men. In order to take cognisance of non-response, a total of 6,375 households nation-wide were selected.
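The figures quoted above (400 EAs, 15 or 20 households per EA, 6,375 households in total) are mutually consistent if every selected EA contributed exactly its planned number of households. Under that assumption, which the text does not state, the split of EAs between the three northern regions and the rest of the country can be recovered with a little arithmetic, sketched here in Python.

```python
def split_clusters(total_eas, total_households, base_take, boosted_take):
    """Solve base*a + boosted*b = households with a + b = total_eas.

    Returns (a, b): the number of EAs at the base take and at the boosted take.
    """
    extra = total_households - base_take * total_eas
    boosted, remainder = divmod(extra, boosted_take - base_take)
    if remainder or not 0 <= boosted <= total_eas:
        raise ValueError("figures are not consistent with a two-rate design")
    return total_eas - boosted, boosted


other_regions, northern_regions = split_clusters(400, 6375, 15, 20)
# other_regions == 325 and northern_regions == 75,
# since 325 * 15 + 75 * 20 = 4875 + 1500 = 6375
```

The 325/75 split is an inference from the published totals, not a figure reported in the survey documentation.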
Note: See detailed description of sample design in APPENDIX A of the survey report.
Face-to-face
Three types of questionnaires were used in the GDHS: the Household Questionnaire, the Women’s Questionnaire, and the Men’s Questionnaire. These questionnaires were based on model survey instruments developed for the international MEASURE DHS+ programme and were designed to provide information needed by health and family planning programme managers and policy makers. The questionnaires were adapted to the situation in Ghana and a number of questions pertaining to on-going health and family planning programmes were added. These questionnaires were developed in English and translated into five major local languages (Akan, Ga, Ewe, Hausa, and Dagbani).
The Household Questionnaire was used to enumerate all usual members and visitors in a selected household and to collect information on the socio-economic status of the household. The first part of the Household Questionnaire collected information on the relationship to the household head, residence, sex, age, marital status, and education of each usual resident or visitor. This information was used to identify women and men who were eligible for the individual interview. For this purpose, all women age 15-49, and all men age 15-59 in every third household, whether usual residents of a selected household or visitors who slept in a selected household the night before the interview, were deemed eligible and interviewed. The Household Questionnaire also provides basic demographic data for Ghanaian households. The second part of the Household Questionnaire contained questions on the dwelling unit, such as the number of rooms, the flooring material, the source of water and the type of toilet facilities, and on the ownership of a variety of consumer goods.
The Women’s Questionnaire was used to collect information on the following topics: respondent’s background characteristics, reproductive history, contraceptive knowledge and use, antenatal, delivery and postnatal care, infant feeding practices, child immunisation and health, marriage, fertility preferences and attitudes about family planning, husband’s background characteristics, women’s work, knowledge of HIV/AIDS and STDs, as well as anthropometric measurements of children and mothers.
The Men’s Questionnaire collected information on respondent’s background characteristics, reproduction, contraceptive knowledge and use, marriage, fertility preferences and attitudes about family planning, as well as knowledge of HIV/AIDS and STDs.
A total of 6,375 households were selected for the GDHS sample. Of these, 6,055 were occupied. Interviews were completed for 6,003 households, which represent 99 percent of the occupied households. A total of 4,970 eligible women from these households and 1,596 eligible men from every third household were identified for the individual interviews. Interviews were successfully completed for 4,843 women or 97 percent and 1,546 men or 97 percent. The principal reason for nonresponse among individual women and men was the failure of interviewers to find them at home despite repeated callbacks.
Note: See summarized response rates by place of residence in Table 1.1 of the survey report.
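The response rates quoted above follow directly from the counts in the same paragraph. A short Python check, using only those counts, is given here; the rounding to whole percentages is an assumption about how the report presents them.

```python
def response_rate(completed, eligible):
    """Completed interviews as a whole-number percentage of eligible units."""
    return round(100 * completed / eligible)


household_rate = response_rate(6003, 6055)  # occupied households
women_rate = response_rate(4843, 4970)      # eligible women
men_rate = response_rate(1546, 1596)        # eligible men
```

These evaluate to 99, 97, and 97 percent respectively, matching the figures in the text.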
The estimates from a sample survey are affected by two types of errors: (1) nonsampling errors, and (2) sampling errors. Nonsampling errors result from shortfalls in implementing data collection and data processing, such as failure to locate and interview the correct household, misunderstanding of the questions on the part of either the interviewer or the respondent, and data entry errors. Although numerous efforts were made during the implementation of the 1998 GDHS to minimize this type of error, nonsampling errors are impossible to avoid and difficult to evaluate statistically.
Sampling errors, on the other hand, can be evaluated statistically. The sample of respondents selected in the 1998 GDHS is only one of many samples that could have been selected from the same population, using the same design and expected size. Each of these samples would yield results that differ somewhat from the results of the actual sample selected. Sampling errors are a measure of the variability between all possible samples. Although the degree of variability is not known exactly, it can be estimated from the survey results.
A sampling error is usually measured in terms of the standard error for a particular statistic (mean, percentage, etc.), which is the square root of the variance. The standard error can be used to calculate confidence intervals within which the true value for the population can reasonably be assumed to fall. For example, for any given statistic calculated from a sample survey, the value of that statistic will fall within a range of plus or minus two times the standard error of that statistic in 95 percent of all possible samples of identical size and design.
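The "plus or minus two times the standard error" rule above translates into a one-line calculation. The Python sketch below uses a made-up estimate and standard error purely for illustration; neither number comes from the survey report.

```python
def confidence_interval(estimate, standard_error, multiplier=2.0):
    """Return (lower, upper) bounds of estimate +/- multiplier * SE."""
    half_width = multiplier * standard_error
    return estimate - half_width, estimate + half_width


# Hypothetical example: a proportion of 0.22 with a standard error of 0.011.
lower, upper = confidence_interval(0.22, 0.011)
```

Here the interval runs from about 0.198 to about 0.242, so an apparent difference between two groups that falls well inside such a range should not be read as real.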
If the sample of respondents had been selected as a simple random sample, it would have been possible to use straightforward formulas for calculating sampling errors. However, the 1998 GDHS sample is the result of a two-stage stratified design, and, consequently, it was necessary to use more complex formulae. The computer software used to calculate sampling errors for the 1998 GDHS is the ISSA Sampling Error Module. This module uses the Taylor linearization method of variance estimation for survey estimates that are means or proportions. The Jackknife repeated replication method is used for variance estimation of more complex statistics such as fertility and mortality rates.
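To make the Jackknife repeated replication idea above concrete, here is a minimal Python sketch of a delete-one-cluster jackknife for a ratio estimate. The pseudo-value formula is the common textbook form. The text does not give the exact formula used by the ISSA module, so treat this as an assumption; the sketch also ignores stratification, and the cluster data are invented.

```python
def ratio(clusters):
    """Ratio estimate: sum of numerators over sum of denominators."""
    return sum(num for num, _ in clusters) / sum(den for _, den in clusters)


def jackknife_se(clusters, estimator=ratio):
    """Delete-one-cluster jackknife standard error.

    clusters: list of per-cluster (numerator, denominator) totals.
    Returns (full_sample_estimate, standard_error).
    """
    k = len(clusters)
    full = estimator(clusters)
    pseudo_values = []
    for i in range(k):
        reduced = clusters[:i] + clusters[i + 1:]  # drop cluster i
        pseudo_values.append(k * full - (k - 1) * estimator(reduced))
    variance = sum((p - full) ** 2 for p in pseudo_values) / (k * (k - 1))
    return full, variance ** 0.5


# Hypothetical cluster totals: (events, exposure) per cluster.
toy_clusters = [(12, 40), (9, 35), (15, 42), (7, 30)]
estimate, se = jackknife_se(toy_clusters)
```

Because the estimator is passed in as a function, the same routine works for statistics such as rates that have no simple closed-form variance, which is why replication methods are used for them.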
Data Quality Tables
- Household age distribution
- Age distribution of eligible and interviewed women
- Age distribution of eligible and interviewed men
- Completeness of reporting
- Births by calendar years
- Reporting of age at death in days
- Reporting of age at death in months
Note: See detailed tables in APPENDIX C of the survey report.
The 2015-16 Armenia Demographic and Health Survey (2015-16 ADHS) is the fourth in a series of nationally representative sample surveys designed to provide information on population and health issues. It is conducted in Armenia under the worldwide Demographic and Health Surveys program. Specifically, the objective of the 2015-16 ADHS is to provide current and reliable information on fertility and abortion levels, marriage, sexual activity, fertility preferences, awareness and use of family planning methods, breastfeeding practices, nutritional status of young children, childhood mortality, maternal and child health, domestic violence against women, child discipline, awareness and behavior regarding AIDS and other sexually transmitted infections (STIs), and other health-related issues such as smoking, tuberculosis, and anemia. The survey obtained detailed information on these issues from women of reproductive age and, for certain topics, from men as well.
The 2015-16 ADHS results are intended to provide information needed to evaluate existing social programs and to design new strategies to improve the health of and health services for the people of Armenia. Data are presented by region (marz) wherever sample size permits. The information collected in the 2015-16 ADHS will provide updated estimates of basic demographic and health indicators covered in the 2000, 2005, and 2010 surveys.
The long-term objective of the survey includes strengthening the technical capacity of major government institutions, including the National Statistical Service (NSS). The 2015-16 ADHS also provides comparable data for long-term trend analysis because the 2000, 2005, 2010, and 2015-16 surveys were implemented by the same organization and used similar data collection procedures. It also adds to the international database of demographic and health-related information for research purposes.
National coverage
The survey covered all de jure household members (usual residents), children age 0-4 years, women age 15-49 years and men age 15-49 years resident in the household.
Sample survey data [ssd]
The sample was designed to produce representative estimates of key indicators at the national level, for Yerevan, and for total urban and total rural areas separately. Many indicators can also be estimated at the regional (marz) level.
The sampling frame used for the 2015-16 ADHS is the Armenia Population and Housing Census, which was conducted in Armenia in 2011 (APHC 2011). The sampling frame is a complete list of enumeration areas (EAs) covering the whole country, a total number of 11,571 EAs, provided by the National Statistical Service (NSS) of Armenia, the implementing agency for the 2015-16 ADHS. This EA frame was created from the census data base by summarizing the households down to EA level. A representative probability sample of 8,749 households was selected for the 2015-16 ADHS sample. The sample was selected in two stages. In the first stage, 313 clusters (192 in urban areas and 121 in rural areas) were selected from a list of EAs in the sampling frame. In the second stage, a complete listing of households was carried out in each selected cluster. Households were then systematically selected for participation in the survey. Appendix A provides additional information on the sample design of the 2015-16 Armenia DHS. Because of the approximately equal sample size in each marz, the sample is not self-weighting at the national level, and weighting factors have been calculated, added to the data file, and applied so that results are representative at the national level.
For further details on sample design, see Appendix A of the final report.
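The remark above that roughly equal sample sizes per marz make the sample non-self-weighting can be illustrated with a small Python sketch. It uses simple expansion weights (frame households divided by sampled households) and invented numbers for two hypothetical strata. The actual ADHS weights also account for the two selection stages and for nonresponse, which this sketch leaves out.

```python
def expansion_weights(frame_households, sampled_households):
    """Weight per stratum: households represented by each sampled household."""
    return {
        stratum: frame_households[stratum] / sampled_households[stratum]
        for stratum in frame_households
    }


def weighted_proportion(stratum_proportions, sampled_households, weights):
    """Combine stratum-level proportions into one weighted national figure."""
    numerator = sum(
        weights[s] * sampled_households[s] * stratum_proportions[s]
        for s in weights
    )
    denominator = sum(weights[s] * sampled_households[s] for s in weights)
    return numerator / denominator


# Hypothetical strata: one large and one small, sampled equally.
frame = {"marz_a": 1000, "marz_b": 200}
sample = {"marz_a": 50, "marz_b": 50}
observed = {"marz_a": 0.10, "marz_b": 0.40}

weights = expansion_weights(frame, sample)
national = weighted_proportion(observed, sample, weights)
unweighted = sum(observed.values()) / len(observed)
```

In this toy case the unweighted figure is 0.25 while the weighted one is 0.15, because the small stratum is over-represented in the sample relative to its share of the population.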
Face-to-face [f2f]
Five questionnaires were used for the 2015-16 ADHS: the Household Questionnaire, the Woman’s Questionnaire, the Man’s Questionnaire, the Biomarker Questionnaire, and the Fieldworker Questionnaire. These questionnaires, based on The DHS Program’s standard Demographic and Health Survey questionnaires, were adapted to reflect the population and health issues relevant to Armenia. Input was solicited from various stakeholders representing government ministries and agencies, nongovernmental organizations, and international donors. After all questionnaires were finalized in English, they were translated into Armenian. They were pretested in September-October 2015.
The processing of the 2015-16 ADHS data began shortly after fieldwork commenced. All completed questionnaires were edited immediately by field editors while still in the field and checked by the supervisors before being dispatched to the data processing center at the NSS central office in Yerevan. These completed questionnaires were edited and entered by 15 data processing personnel specially trained for this task. All data were entered twice for 100 percent verification. Data were entered using the CSPro computer package. The concurrent processing of the data was an advantage because the senior ADHS technical staff were able to advise field teams of problems detected during the data entry. In particular, tables were generated to check various data quality parameters. Moreover, the double entry of data enabled easy comparison and identification of errors and inconsistencies. As a result, specific feedback was given to the teams to improve performance. The data entry and editing phase of the survey was completed in June 2016.
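The double-entry verification described above amounts to keying every questionnaire twice and comparing the two passes field by field. A minimal Python sketch of that comparison follows. It is not how CSPro implements verification, and the record identifiers and field names are hypothetical.

```python
def compare_entries(first_pass, second_pass):
    """List every disagreement between two data-entry passes.

    Each pass maps a record id to a dict of field values. Returns tuples of
    (record_id, field, first_value, second_value); a value of None means the
    record or field is missing from that pass.
    """
    mismatches = []
    for record_id in sorted(set(first_pass) | set(second_pass)):
        first = first_pass.get(record_id, {})
        second = second_pass.get(record_id, {})
        for field in sorted(set(first) | set(second)):
            if first.get(field) != second.get(field):
                mismatches.append(
                    (record_id, field, first.get(field), second.get(field))
                )
    return mismatches


# Hypothetical entries: the second operator transposed the digits of one age.
pass_one = {"HH001": {"age": 34, "sex": "F"}, "HH002": {"age": 51, "sex": "M"}}
pass_two = {"HH001": {"age": 43, "sex": "F"}, "HH002": {"age": 51, "sex": "M"}}
problems = compare_entries(pass_one, pass_two)
```

Each mismatch points to a specific questionnaire and field, so the paper form can be pulled and the correct value confirmed.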
A total of 8,749 households were selected in the sample, of which 8,205 were occupied at the time of the fieldwork. The main reason for the difference is that some of the dwelling units that were occupied during the household listing operation were either vacant or the household was away for an extended period at the time of interviewing. The number of occupied households successfully interviewed was 7,893, yielding a household response rate of 96 percent. The household response rate in urban areas (96 percent) was nearly the same as in rural areas (97 percent).
In these households, a total of 6,251 eligible women were identified; interviews were completed with 6,116 of these women, yielding a response rate of 98 percent. In one-half of the households, a total of 2,856 eligible men were identified, and interviews were completed with 2,755 of these men, yielding a response rate of 97 percent. Among men, response rates are slightly lower in urban areas (96 percent) than in rural areas (97 percent), whereas rates for women are the same in urban and in rural areas (98 percent).
The 2015-16 ADHS achieved a slightly higher response rate for households than the 2010 ADHS (NSS 2012). The increase is only notable for urban households (96 percent in 2015-16 compared with 94 percent in 2010). Response rates in all other categories are very close to what they were in 2010.
SAS programs were used to calculate sampling errors for the 2015-16 ADHS. The programs used the Taylor linearization method of variance estimation for means or proportions and the Jackknife repeated replication method for variance estimation of more complex statistics such as fertility and mortality rates.
A more detailed description of the estimates of sampling errors is presented in Appendix B of the survey final report.
Data Quality Tables
- Household age distribution
- Age distribution of eligible and interviewed women
- Age distribution of eligible and interviewed men
- Completeness of reporting
- Births by calendar years
- Reporting of age at death in days
- Reporting of age at death in months
- Nutritional status of children based on the NCHS/CDC/WHO International Reference Population
- Vaccinations by background characteristics for children age 18-29 months
See details of the data quality tables in Appendix C of the survey final report.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Demographics from the survey sample compared to U.S. Census Bureau figures.
This layer shows demographic context for emergency response efforts. This is shown by tract, county, and state boundaries. This service is updated annually to contain the most currently released American Community Survey (ACS) 5-year data, and contains estimates and margins of error. There are also additional calculated attributes related to this topic, which can be mapped or used within analysis. This layer is symbolized to show the percentage of households who do not have access to internet. To see the full list of attributes available in this service, go to the "Data" tab, and choose "Fields" at the top right.
Current Vintage: 2015-2019
ACS Table(s): B01001, B08201, B09021, B16003, B16004, B17020, B18101, B25040, B25117, B27010, B28001, B28002
Data downloaded from: Census Bureau's API for American Community Survey
Date of API call: December 12, 2020
National Figures: data.census.gov
The United States Census Bureau's American Community Survey (ACS): About the Survey; Geography & ACS; Technical Documentation; News & Updates
This ready-to-use layer can be used within ArcGIS Pro, ArcGIS Online, its configurable apps, dashboards, Story Maps, custom apps, and mobile apps. Data can also be exported for offline workflows. Please cite the Census and ACS when using this data.
Data Note from the Census: Data are based on a sample and are subject to sampling variability. The degree of uncertainty for an estimate arising from sampling variability is represented through the use of a margin of error. The value shown here is the 90 percent margin of error. The margin of error can be interpreted as providing a 90 percent probability that the interval defined by the estimate minus the margin of error and the estimate plus the margin of error (the lower and upper confidence bounds) contains the true value. In addition to sampling variability, the ACS estimates are subject to nonsampling error (for a discussion of nonsampling variability, see Accuracy of the Data).
The effect of nonsampling error is not represented in these tables.Data Processing Notes:This layer is updated automatically when the most current vintage of ACS data is released each year, usually in December. The layer always contains the latest available ACS 5-year estimates. It is updated annually within days of the Census Bureau's release schedule.Click hereto learn more about ACS data releases.Boundaries come from theUS Census TIGER geodatabases. Boundaries are updated at the same time as the data updates (annually), and the boundary vintage appropriately matches the data vintageas specified by the Census. These are Census boundaries with water and/or coastlines clipped for cartographic purposes. For census tracts, the water cutouts are derived from a subset of the 2010 AWATER (Area Water) boundaries offered by TIGER. For state and county boundaries, the water and coastlines are derived from the coastlines of the 500kTIGER Cartographic Boundary Shapefiles. The original AWATER and ALAND fields are still available as attributes within the data table (units are square meters).The States layer contains 52 records - all US states, Washington D.C., and Puerto RicoCensus tracts with no population that occur in areas of water, such as oceans, are removed from this data service (Census Tracts beginning with 99).Percentages and derived counts, and associated margins of error, are calculated values (that can be identified by the "_calc_" stub in the field name), and abide by the specificationsdefined by the American Community Survey.Field alias names were created based on the Table Shells file available from theAmerican Community Survey Summary File Documentation page.Negative values (e.g., -4444...) have been set to null, with the exception of -5555... 
which has been set to zero.These negative values exist in the raw API data to indicate the following situations:The margin of error column indicates that either no sample observations or too few sample observations were available to compute a standard error and thus the margin of error. A statistical test is not appropriate.Either no sample observations or too few sample observations were available to compute an estimate, or a ratio of medians cannot be calculated because one or both of the median estimates falls in the lowest interval or upper interval of an open-ended distribution.The median falls in the lowest interval of an open-ended distribution, or in the upper interval of an open-ended distribution. A statistical test is not appropriate.The estimate is controlled. A statistical test for sampling variability is not appropriate.The data for this geographic area cannot be displayed because the number of sample cases is too small.
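The null-handling rule and the 90 percent margin of error described above can be sketched in a few lines of Python. This is only an illustration, not the layer's actual processing code: the function names and the `controlled_sentinel` argument are hypothetical, and the value -555555555 used in the example is an assumed expansion of the elided "-5555..." in the notes.

```python
from typing import Optional, Tuple


def clean_value(raw: float, controlled_sentinel: float) -> Optional[float]:
    """Apply the rule from the processing notes to one raw API value.

    controlled_sentinel is the caller-supplied code for a controlled
    estimate (assumption: the notes only give it as "-5555...").
    """
    if raw == controlled_sentinel:
        return 0.0   # controlled estimate: the sentinel is set to zero
    if raw < 0:
        return None  # every other negative sentinel becomes null
    return raw       # ordinary estimate or margin of error


def confidence_bounds(estimate: float, moe: float) -> Tuple[float, float]:
    """Lower and upper 90 percent confidence bounds: estimate -/+ MOE."""
    return estimate - moe, estimate + moe


# Hypothetical values, for illustration only.
print(clean_value(-555555555, controlled_sentinel=-555555555))
print(clean_value(1250.0, controlled_sentinel=-555555555))
print(confidence_bounds(1250.0, 140.0))
```

Passing the sentinel in as an argument keeps the assumed code out of the function body, so it can be replaced once the exact value is confirmed against the API documentation.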
The American Community Survey Education Tabulation (ACS-ED) is a custom tabulation of the ACS produced for the National Center for Education Statistics (NCES) by the U.S. Census Bureau. The ACS-ED provides a rich collection of social, economic, demographic, and housing characteristics for school systems, school-age children, and the parents of school-age children. In addition to focusing on school-age children, the ACS-ED provides enrollment iterations for children enrolled in public school. The data profiles include percentages (along with associated margins of error) that allow for comparison of school district-level conditions across the U.S. For more information about the NCES ACS-ED collection, visit the NCES Education Demographic and Geographic Estimates (EDGE) program at: https://nces.ed.gov/programs/edge/Demographic/ACS

Annotation values are negative value representations of estimates and have values when non-integer information needs to be represented. See the table below for a list of common Estimate/Margin of Error (E/M) values and their corresponding Annotation (EA/MA) values.

All information contained in this file is in the public domain. Data users are advised to review NCES program documentation and feature class metadata to understand the limitations and appropriate use of these data.

-9  A '-9' entry in the estimate and margin of error columns indicates that data for this geographic area cannot be displayed because the number of sample cases is too small.
-8  A '-8' means that the estimate is not applicable or not available.
-6  A '-6' entry in the estimate column indicates that either no sample observations or too few sample observations were available to compute an estimate, or a ratio of medians cannot be calculated because one or both of the median estimates falls in the lowest interval or upper interval of an open-ended distribution.
-5  A '-5' entry in the margin of error column indicates that the estimate is controlled. A statistical test for sampling variability is not appropriate.
-3  A '-3' entry in the margin of error column indicates that the median falls in the lowest interval or upper interval of an open-ended distribution. A statistical test is not appropriate.
-2  A '-2' entry in the margin of error column indicates that either no sample observations or too few sample observations were available to compute a standard error and thus the margin of error. A statistical test is not appropriate.
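When reading these files programmatically, the annotation codes listed above can be separated from genuine values with a small lookup. The following Python sketch is an assumption about how a user might do this, not part of the NCES documentation; the names `ANNOTATIONS` and `split_annotation` are hypothetical, and the code assumes genuine estimates and margins of error are never negative.

```python
from typing import Optional, Tuple

# Annotation codes and short meanings, condensed from the list above.
ANNOTATIONS = {
    -9: "not displayed: number of sample cases is too small",
    -8: "estimate not applicable or not available",
    -6: "estimate: too few sample observations, or ratio of medians not computable",
    -5: "margin of error: estimate is controlled",
    -3: "margin of error: median falls in an open-ended interval",
    -2: "margin of error: too few sample observations for a standard error",
}


def split_annotation(value: float) -> Tuple[Optional[float], Optional[str]]:
    """Return (numeric value, annotation); exactly one of them is None."""
    if value in ANNOTATIONS:
        return None, ANNOTATIONS[value]
    return value, None


print(split_annotation(-9))
print(split_annotation(42.5))
```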
The 2023 Jordan Population and Family Health Survey (JPFHS) is the eighth Population and Family Health Survey conducted in Jordan, following those conducted in 1990, 1997, 2002, 2007, 2009, 2012, and 2017–18. It was implemented by the Department of Statistics (DoS) at the request of the Ministry of Health (MoH).
The primary objective of the 2023 JPFHS is to provide up-to-date estimates of key demographic and health indicators. Specifically, the 2023 JPFHS:
• Collected data at the national level that allowed calculation of key demographic indicators
• Explored the direct and indirect factors that determine levels of and trends in fertility and childhood mortality
• Measured contraceptive knowledge and practice
• Collected data on key aspects of family health, including immunisation coverage among children, prevalence and treatment of diarrhoea and other diseases among children under age 5, and maternity care indicators such as antenatal visits and assistance at delivery
• Obtained data on child feeding practices, including breastfeeding, and conducted anthropometric measurements to assess the nutritional status of children under age 5 and women age 15–49
• Conducted haemoglobin testing with eligible children age 6–59 months and women age 15–49 to gather information on the prevalence of anaemia
• Collected data on women’s and men’s knowledge and attitudes regarding sexually transmitted infections and HIV/AIDS
• Obtained data on women’s experience of emotional, physical, and sexual violence
• Gathered data on disability among household members
The information collected through the 2023 JPFHS is intended to assist policymakers and programme managers in evaluating and designing programmes and strategies for improving the health of the country’s population. The survey also provides indicators relevant to the Sustainable Development Goals (SDGs) for Jordan.
National coverage
The survey covered all de jure household members (usual residents), all women aged 15-49, men aged 15-59, and all children aged 0-4 resident in the household.
Sample survey data [ssd]
The sampling frame used for the 2023 JPFHS was the 2015 Jordan Population and Housing Census (JPHC) frame. The survey was designed to produce representative results for the country as a whole, for urban and rural areas separately, for each of the country’s 12 governorates, and for four nationality domains: the Jordanian population, the Syrian population living in refugee camps, the Syrian population living outside of camps, and the population of other nationalities. Each of the 12 governorates is subdivided into districts, each district into subdistricts, each subdistrict into localities, and each locality into areas and subareas. In addition to these administrative units, during the 2015 JPHC each subarea was divided into convenient area units called census blocks. An electronic file of a complete list of all of the census blocks is available from DoS. The list contains census information on households, populations, geographical locations, and socioeconomic characteristics of each block. Based on this list, census blocks were regrouped to form a general statistical unit of moderate size, called a cluster, which is widely used in various surveys as the primary sampling unit (PSU). The sample clusters for the 2023 JPFHS were selected from the frame of cluster units provided by the DoS.
The sample for the 2023 JPFHS was a stratified sample selected in two stages from the 2015 census frame. Stratification was achieved by separating each governorate into urban and rural areas. In addition, the Syrian refugee camps in Zarqa and Mafraq each formed a special sampling stratum. In total, 26 sampling strata were constructed. Samples were selected independently in each sampling stratum, through a two-stage selection process, according to the sample allocation. Before the sample selection, the sampling frame was sorted by district and subdistrict within each sampling stratum. By using a probability proportional to size selection at the first stage of sampling, an implicit stratification and proportional allocation were achieved at each of the lower administrative levels.
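The first-stage selection described above, probability proportional to size (PPS) applied to a sorted frame, can be illustrated with a systematic PPS draw. This Python sketch is a generic textbook version under stated assumptions, not the procedure code used by DoS; the function name and the cluster sizes are hypothetical.

```python
import random
from typing import List, Sequence


def pps_systematic(sizes: Sequence[int], n: int, rng: random.Random) -> List[int]:
    """Select n units by systematic PPS from a frame already sorted
    (e.g. by district and subdistrict); returns the selected indices."""
    total = sum(sizes)
    interval = total / n               # sampling interval on the size scale
    start = rng.random() * interval    # random start in [0, interval)
    selected = []
    cumulative = 0.0                   # total size of units before index idx
    idx = 0
    for k in range(n):
        point = start + k * interval
        # Advance to the unit whose cumulative size range contains the point.
        while idx < len(sizes) - 1 and cumulative + sizes[idx] <= point:
            cumulative += sizes[idx]
            idx += 1
        selected.append(idx)
    return selected


# Hypothetical household counts for ten clusters in one stratum.
cluster_sizes = [120, 80, 200, 150, 50, 100, 300, 90, 110, 60]
print(pps_systematic(cluster_sizes, n=4, rng=random.Random(1)))
```

Because the selection points are spread evenly along the sorted frame, each district and subdistrict receives a share of the sample roughly proportional to its size, which is the implicit stratification the text refers to.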
For further details on sample design, see APPENDIX A of the final report.
Computer Assisted Personal Interview [capi]
Five questionnaires were used for the 2023 JPFHS: (1) the Household Questionnaire, (2) the Woman’s Questionnaire, (3) the Man’s Questionnaire, (4) the Biomarker Questionnaire, and (5) the Fieldworker Questionnaire. The questionnaires, based on The DHS Program’s model questionnaires, were adapted to reflect the population and health issues relevant to Jordan. Input was solicited from various stakeholders representing government ministries and agencies, nongovernmental organisations, and international donors. After all questionnaires were finalised in English, they were translated into Arabic.
All electronic data files for the 2023 JPFHS were transferred via SynCloud to the DoS central office in Amman, where they were stored on a password-protected computer. The data processing operation included secondary editing, which required resolution of computer-identified inconsistencies and coding of open-ended questions. Data editing was accomplished using CSPro software. Throughout fieldwork, tables were generated to check various data quality parameters, and specific feedback was given to the teams to improve performance. Secondary editing and data processing were initiated in July and completed in September 2023.
A total of 20,054 households were selected for the sample, of which 19,809 were occupied. Of the occupied households, 19,475 were successfully interviewed, yielding a response rate of 98%.
In the interviewed households, 13,020 eligible women age 15–49 were identified for individual interviews; interviews were completed with 12,595 women, yielding a response rate of 97%. In the subsample of households selected for the male survey, 6,506 men age 15–59 were identified as eligible for individual interviews and 5,873 were successfully interviewed, yielding a response rate of 90%.
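The response rates quoted above follow directly from the counts of completed and eligible units. A short Python check, using only the figures given in the text (the function name is hypothetical):

```python
def response_rate(completed: int, eligible: int) -> float:
    """Response rate as a percentage of eligible units."""
    return 100.0 * completed / eligible


rates = {
    "households": response_rate(19_475, 19_809),  # interviewed / occupied
    "women": response_rate(12_595, 13_020),       # interviewed / eligible
    "men": response_rate(5_873, 6_506),           # interviewed / eligible
}
for group, rate in rates.items():
    print(f"{group}: {rate:.1f}%")
```

Rounded to whole percentages, these reproduce the reported 98%, 97%, and 90%.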
The estimates from a sample survey are affected by two types of errors: nonsampling errors and sampling errors. Nonsampling errors are the results of mistakes made in implementing data collection and in data processing, such as failure to locate and interview the correct household, misunderstanding of the questions on the part of either the interviewer or the respondent, and data entry errors. Although numerous efforts were made during the implementation of the 2023 Jordan Population and Family Health Survey (2023 JPFHS) to minimise this type of error, nonsampling errors are impossible to avoid and difficult to evaluate statistically.
Sampling errors, on the other hand, can be evaluated statistically. The sample of respondents selected in the 2023 JPFHS is only one of many samples that could have been selected from the same population, using the same design and sample size. Each of these samples would yield results that differ somewhat from the results of the actual sample selected. Sampling errors are a measure of the variability among all possible samples. Although the degree of variability is not known exactly, it can be estimated from the survey results.
Sampling error is usually measured in terms of the standard error for a particular statistic (mean, percentage, etc.), which is the square root of the variance. The standard error can be used to calculate confidence intervals within which the true value for the population can reasonably be assumed to fall. For example, for any given statistic calculated from a sample survey, the value of that statistic will fall within a range of plus or minus two times the standard error of that statistic in 95% of all possible samples of identical size and design.
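As a worked illustration of the paragraph above, the standard error is the square root of the variance, and an approximate 95% confidence interval is the estimate plus or minus two standard errors. The Python sketch below uses made-up numbers, not JPFHS results, and the function name is hypothetical.

```python
import math
from typing import Tuple


def confidence_interval(estimate: float, variance: float,
                        multiplier: float = 2.0) -> Tuple[float, float]:
    """Return (lower, upper) as estimate -/+ multiplier * standard error."""
    standard_error = math.sqrt(variance)
    return (estimate - multiplier * standard_error,
            estimate + multiplier * standard_error)


# Hypothetical proportion of 0.40 with an estimated variance of 0.0004.
low, high = confidence_interval(0.40, 0.0004)
print(f"{low:.2f} to {high:.2f}")
```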
If the sample of respondents had been selected by simple random sampling, it would have been possible to use straightforward formulas for calculating sampling errors. However, the 2023 JPFHS sample was the result of a multistage stratified design, and, consequently, it was necessary to use more complex formulas. Sampling errors are computed using SAS programs developed by ICF. These programs use the Taylor linearisation method to estimate variances for survey estimates that are means, proportions, or ratios. The Jackknife repeated replication method is used for variance estimation of more complex statistics such as fertility and mortality rates.
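The Jackknife repeated replication idea can be sketched as follows: the statistic is recomputed with each cluster left out in turn, and the spread of the resulting pseudo-values gives the standard error. This Python sketch is a generic delete-one-cluster version and is not the ICF SAS program; the function names and the example data are hypothetical, and a simple ratio stands in for the more complex rates mentioned in the text.

```python
import math
from typing import Callable, List, Sequence, Tuple

Cluster = Tuple[float, float]  # hypothetical (numerator, denominator) totals


def jackknife_se(clusters: Sequence[Cluster],
                 statistic: Callable[[Sequence[Cluster]], float]) -> float:
    """Delete-one-cluster jackknife standard error of statistic(clusters)."""
    k = len(clusters)
    r = statistic(clusters)  # estimate from the full sample
    squared_deviations = 0.0
    for i in range(k):
        reduced: List[Cluster] = list(clusters[:i]) + list(clusters[i + 1:])
        pseudo_value = k * r - (k - 1) * statistic(reduced)
        squared_deviations += (pseudo_value - r) ** 2
    return math.sqrt(squared_deviations / (k * (k - 1)))


def ratio(clusters: Sequence[Cluster]) -> float:
    """Ratio of summed numerators to summed denominators."""
    return sum(y for y, _ in clusters) / sum(x for _, x in clusters)


# Made-up cluster totals, for illustration only.
example = [(12.0, 40.0), (9.0, 35.0), (15.0, 42.0), (11.0, 38.0)]
print(ratio(example), jackknife_se(example, ratio))
```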
A more detailed description of the estimates of sampling errors is presented in APPENDIX B of the survey report.
Data Quality Tables
This repository contains datasets relating to coronavirus in Sierra Leone, as well as on demographic and other information from the 2015 Population and Household Census (PHC). It also includes mapping shapefiles by district, so that you can map the district-level coronavirus statistics.
See here for a full description of how the data files have been created from the source data, including the R code.
Last updated: 10 June 2020.
The novel 2019 coronavirus (covid-19) arrived late to West Africa and Sierra Leone in particular. This dataset provides the number of reported cases on a district-by-district basis for Sierra Leone, as well as various additional statistics at the country level. In addition, I provide district-by-district data on demographics and households' main sources of information, both from the 2015 census. For convenience, I also provide shapefiles for mapping the 14 districts of Sierra Leone.
The dataset consists of four main files, which are in the output folder. See the column descriptions below for further details.

Coronavirus confirmed cases by district (sl_districts_coronavirus.csv). I found the original data by looking in the static/js/data folder in the source code for covid19.mic.gov.sl, last accessed 10 June 2020. The file contains the cumulative number of confirmed coronavirus cases in the 14 districts of Sierra Leone as a time series. I have used the R tidyverse to reshape the data and ensure naming is consistent with the other data files.

Demographic statistics by district (sl_districts_demographics.csv). Data from the 2015 Population and Housing Census (PHC), sourced from Open Data Sierra Leone. The dataset covers the 14 districts of Sierra Leone, which increased to 16 in 2017. Last accessed 10 June 2020.

Main Sources of Information by district (sl_districts_info_sources.csv). Data from the 2015 Population and Housing Census (PHC), sourced from Open Data Sierra Leone. The dataset presents the main sources of information, such as television or radio, for households in the 14 districts of Sierra Leone. Last accessed 2 June 2020. I note that I have made one correction to the source data (see R code with correction here).

Country-wide coronavirus statistics for Sierra Leone (sl_national_coronavirus.csv). The original data also comes from covid19.mic.gov.sl, last accessed 10 June 2020. The file contains numerous statistics as time series, listed in the Column Description section below. I note that there are various potential issues in the file which I leave the user to decide how to deal with (duplicate datetimes, inconsistent statistics).

Additionally I include a set of five files with district-by-district mapping (shapefiles) and other data, unchanged from their original source. Each file is labelled in the following way: sl_districts_mapping.*. These files come from Direct Relief Open Data on ArcGIS Hub. The data also include district-level data on maternal child health attributes, which was the original context of the mapping data.
Coronavirus confirmed cases by district (sl_districts_coronavirus.csv):
date: Date of reporting
district: District of Sierra Leone (based on pre-2017 administrative boundaries)
confirmed_cases: Cumulative number of confirmed coronavirus cases; NA if no data reported
decrease: Dummy variable indicating whether the number of reported cases has been revised down. NA if no reported cases on that date; 1 if there is a decrease from the last reported cases; 0 otherwise

Demographic statistics by district (sl_districts_demographics.csv):
district: District of Sierra Leone (based on pre-2017 administrative boundaries)
d_code: District code
d_id: District id
total_pop: Total population in district
pop_share: District's share of total country population
t_male: Total male population
t_female: Total female population
s_ratio: (*) Sex ratio at birth (number of males for every 100 females, under the age of 1)
t_urban: Total urban population
t_rural: Total rural population
prop_urban: Proportion urban
t_h_pop: Sum of h_male and h_female
h_male: (?)
h_female: (?)
t_i_pop: Sum of i_male and i_female
i_male: (?)
i_female: (?)
working_pop: Working population
depend_pop: Dependent population...
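To make the definition of the decrease column in sl_districts_coronavirus.csv concrete, the following Python sketch derives it from a cumulative case series for one district. It is not the repository's R code; the function name is hypothetical, None stands in for NA, and "last reported cases" is assumed to mean the most recent earlier date with a non-missing value.

```python
from typing import List, Optional


def decrease_flags(confirmed_cases: List[Optional[int]]) -> List[Optional[int]]:
    """Return the decrease dummy for each date of a single district."""
    flags: List[Optional[int]] = []
    last_reported: Optional[int] = None
    for value in confirmed_cases:
        if value is None:
            flags.append(None)   # NA: nothing reported on this date
            continue
        if last_reported is not None and value < last_reported:
            flags.append(1)      # cumulative total was revised down
        else:
            flags.append(0)
        last_reported = value
    return flags


# Made-up series: a gap on day 4, then a downward revision on day 5.
print(decrease_flags([None, 1, 3, None, 2, 2]))
```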
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Supporting documentation on code lists, subject definitions, data accuracy, and statistical testing can be found on the American Community Survey website in the Data and Documentation section.

Sample size and data quality measures (including coverage rates, allocation rates, and response rates) can be found on the American Community Survey website in the Methodology section.

Although the American Community Survey (ACS) produces population, demographic and housing unit estimates, it is the Census Bureau's Population Estimates Program that produces and disseminates the official estimates of the population for the nation, states, counties, cities and towns and estimates of housing units for states and counties.

Explanation of Symbols:
An '**' entry in the margin of error column indicates that either no sample observations or too few sample observations were available to compute a standard error and thus the margin of error. A statistical test is not appropriate.
An '-' entry in the estimate column indicates that either no sample observations or too few sample observations were available to compute an estimate, or a ratio of medians cannot be calculated because one or both of the median estimates falls in the lowest interval or upper interval of an open-ended distribution.
An '-' following a median estimate means the median falls in the lowest interval of an open-ended distribution.
An '+' following a median estimate means the median falls in the upper interval of an open-ended distribution.
An '***' entry in the margin of error column indicates that the median falls in the lowest interval or upper interval of an open-ended distribution. A statistical test is not appropriate.
An '*****' entry in the margin of error column indicates that the estimate is controlled. A statistical test for sampling variability is not appropriate.
An 'N' entry in the estimate and margin of error columns indicates that data for this geographic area cannot be displayed because the number of sample cases is too small.
An '(X)' means that the estimate is not applicable or not available.

Estimates of urban and rural population, housing units, and characteristics reflect boundaries of urban areas defined based on Census 2010 data. As a result, data for urban and rural areas from the ACS do not necessarily reflect the results of ongoing urbanization.

While the 2015 American Community Survey (ACS) data generally reflect the February 2013 Office of Management and Budget (OMB) definitions of metropolitan and micropolitan statistical areas; in certain instances the names, codes, and boundaries of the principal cities shown in ACS tables may differ from the OMB definitions due to differences in the effective dates of the geographic entities.

The category "with a broadband Internet subscription" refers to those who said "Yes" to a DSL, cable, fiberoptic, mobile broadband, satellite, or fixed wireless subscription.

Logical coverage edits applying a rules-based assignment of Medicaid, Medicare and military health coverage were added as of 2009 -- please see http://www.census.gov/library/working-papers/2010/demo/coverage_edits_final.html for more details. The 2008 data table in American FactFinder does not incorporate these edits. Therefore, the estimates that appear in these tables are not comparable to the estimates in the 2009 and later tables. Select geographies of 2008 data comparable to the 2009 and later tables are available at http://www.census.gov/data/tables/time-series/acs/1-year-re-run-health-insurance.html. The health insurance coverage category names were modified in 2010. See http://www.census.gov/topics/health/health-insurance/about/glossary.html#par_textimage_18 for a list of the insurance type definitions.

Occupation codes are 4-digit codes and are based on Standard Occupational Classification 2010.

Industry codes are 4-digit codes and are based on the North American Industry Classification System 2012. The Industry categories adhere to the guidelines issued in Clarification Memorandum No. 2, "NAICS Alternate Aggregation Structure for Use By U.S. Statistical Agencies," issued by the Office of Management and Budget.

Employment and unemployment estimates may vary from the official labor force data released by the Bureau of Labor Statistics because of differences in survey design and data collection. For guidance on differences in employment and unemployment estimates from different sources go to Labor Force Guidance.

The Census Bureau introduced a new set of disability questions in the 2008 ACS questionnaire. Accordingly, comparisons of disability data from 2008 or later with data from prior years are not recommended. For more information on these questions and their evaluation in the 2006 ACS Content Test, see the Evaluation Report Covering Disability.

Due to methodological changes to data collection that began in data year 2013, comparisons of language estimates from that point to estimates from 2013 forw...
The 1990 Public Use Microdata Sample Areas (PUMA) Boundary Files portion of the Archive of Census Related Products (ACRP) consists of 5% sample (apuma) and 1% sample (bpuma) areas for the mapping of 1990 PUMS data covering the continental United States, Alaska, and Hawaii. These boundary files are created based on equivalency files generated by the Geographic Correspondence Engine (GeoCorr). A national census tract to PUMA geography correspondence file is used in merging the two files resulting in the PUMA geographies. An additional file is also available consisting of geographic centroids for the PUMA coverages calculated by UIC (Urban Information Center/Office of Computing, University of Missouri). This portion of the ACRP is produced by the Center for International Earth Science Information Network (CIESIN).
The 2022 Ghana Demographic and Health Survey (2022 GDHS) is the seventh in the series of DHS surveys conducted by the Ghana Statistical Service (GSS) in collaboration with the Ministry of Health/Ghana Health Service (MoH/GHS) and other stakeholders, with funding from the United States Agency for International Development (USAID) and other partners.
The primary objective of the 2022 GDHS is to provide up-to-date estimates of basic demographic and health indicators. Specifically, the GDHS collected information on:
- Fertility levels and preferences, contraceptive use, antenatal and delivery care, maternal and child health, childhood mortality, childhood immunisation, breastfeeding and young child feeding practices, women’s dietary diversity, violence against women, gender, nutritional status of adults and children, awareness regarding HIV/AIDS and other sexually transmitted infections, tobacco use, and other indicators relevant for the Sustainable Development Goals
- Haemoglobin levels of women and children
- Prevalence of malaria parasitaemia (rapid diagnostic testing and thick slides for malaria parasitaemia in the field and microscopy in the lab) among children age 6–59 months
- Use of treated mosquito nets
- Use of antimalarial drugs for treatment of fever among children under age 5
The information collected through the 2022 GDHS is intended to assist policymakers and programme managers in designing and evaluating programmes and strategies for improving the health of the country’s population.
National coverage
The survey covered all de jure household members (usual residents), all women aged 15-49, men aged 15-59, and all children aged 0-4 resident in the household.
Sample survey data [ssd]
To achieve the objectives of the 2022 GDHS, a stratified representative sample of 18,540 households was selected in 618 clusters, which resulted in 15,014 interviewed women age 15–49 and 7,044 interviewed men age 15–59 (in one of every two households selected).
The sampling frame used for the 2022 GDHS is the updated frame prepared by the GSS based on the 2021 Population and Housing Census. The sampling procedure used in the 2022 GDHS was stratified two-stage cluster sampling, designed to yield representative results at the national level, for urban and rural areas, and for each of the country’s 16 regions for most DHS indicators. In the first stage, 618 target clusters were selected from the sampling frame using a probability proportional to size strategy for urban and rural areas in each region. The targeted number of clusters was then selected, with equal probability systematic random sampling, from the clusters selected in the first phase for urban and rural areas. In the second stage, after selection of the clusters, a household listing and map updating operation was carried out in all of the selected clusters to develop a list of households for each cluster. This list served as a sampling frame for selection of the household sample. The GSS organized a 5-day training course on listing procedures for listers and mappers with support from ICF. The listers and mappers were organized into 25 teams consisting of one lister and one mapper per team. The teams spent 2 months completing the listing operation. In addition to listing the households, the listers collected the geographical coordinates of each household using GPS dongles provided by ICF and in accordance with the instructions in the DHS listing manual. The household listing was carried out using tablet computers, with software provided by The DHS Program. A fixed number of 30 households in each cluster were randomly selected from the list for interviews.
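The second stage described above, a fixed take of 30 households per cluster, can be illustrated in Python. This is a sketch under stated assumptions rather than the GSS procedure: it uses simple random sampling without replacement from the listing (the text says only that households were "randomly selected"), and the function name and household identifiers are hypothetical.

```python
import random
from typing import List

HOUSEHOLDS_PER_CLUSTER = 30  # fixed take per cluster, from the text
NUMBER_OF_CLUSTERS = 618     # selected clusters, from the text


def select_households(listing: List[str], rng: random.Random,
                      take: int = HOUSEHOLDS_PER_CLUSTER) -> List[str]:
    """Draw `take` households from one cluster's listing without replacement."""
    if len(listing) <= take:
        return list(listing)  # small cluster: keep every listed household
    return rng.sample(listing, take)


# Hypothetical listing of 120 households in one cluster.
listing = [f"HH-{number:03d}" for number in range(1, 121)]
chosen = select_households(listing, random.Random(2022))
print(len(chosen), "households selected in this cluster")
print(NUMBER_OF_CLUSTERS * HOUSEHOLDS_PER_CLUSTER, "households in the full sample")
```

The product of 618 clusters and 30 households per cluster is 18,540, the number of selected households reported for the survey.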
For further details on sample design, see APPENDIX A of the final report.
Face-to-face computer-assisted interviews [capi]
Four questionnaires were used in the 2022 GDHS: the Household Questionnaire, the Woman’s Questionnaire, the Man’s Questionnaire, and the Biomarker Questionnaire. The questionnaires, based on The DHS Program’s model questionnaires, were adapted to reflect the population and health issues relevant to Ghana. In addition, a self-administered Fieldworker Questionnaire collected information about the survey’s fieldworkers.
The GSS organized a questionnaire design workshop with support from ICF and obtained input from government and development partners expected to use the resulting data. The DHS Program optional modules on domestic violence, malaria, and social and behavior change communication were incorporated into the Woman’s Questionnaire. ICF provided technical assistance in adapting the modules to the questionnaires.
DHS staff installed all central office programmes, data structure checks, secondary editing, and field check tables from 17–20 October 2022. Central office training was implemented using the practice data to test the central office system and field check tables. Seven GSS staff members (four male and three female) were trained on the functionality of the central office menu, including accepting clusters from the field, data editing procedures, and producing reports to monitor fieldwork.
From 27 February to 17 March, DHS staff visited the Ghana Statistical Service office in Accra to work with the GSS central office staff on finishing the secondary editing and to clean and finalize all data received from the 618 clusters.
A total of 18,540 households were selected for the GDHS sample, of which 18,065 were found to be occupied. Of the occupied households, 17,933 were successfully interviewed, yielding a response rate of 99%. In the interviewed households, 15,317 women age 15–49 were identified as eligible for individual interviews. Interviews were completed with 15,014 women, yielding a response rate of 98%. In the subsample of households selected for the male survey, 7,263 men age 15–59 were identified as eligible for individual interviews and 7,044 were successfully interviewed.
The estimates from a sample survey are affected by two types of errors: (1) nonsampling errors and (2) sampling errors. Nonsampling errors are the results of mistakes made in implementing data collection and data processing, such as failure to locate and interview the correct household, misunderstanding of the questions on the part of either the interviewer or the respondent, and data entry errors. Although numerous efforts were made during the implementation of the 2022 Ghana Demographic and Health Survey (2022 GDHS) to minimize this type of error, nonsampling errors are impossible to avoid and difficult to evaluate statistically.
Sampling errors, on the other hand, can be evaluated statistically. The sample of respondents selected in the 2022 GDHS is only one of many samples that could have been selected from the same population, using the same design and identical size. Each of these samples would yield results that differ somewhat from the results of the actual sample selected. Sampling errors are a measure of the variability between all possible samples. Although the degree of variability is not known exactly, it can be estimated from the survey results. A sampling error is usually measured in terms of the standard error for a particular statistic (mean, percentage, etc.), which is the square root of the variance. The standard error can be used to calculate confidence intervals within which the true value for the population can reasonably be assumed to fall. For example, for any given statistic calculated from a sample survey, the value of that statistic will fall within a range of plus or minus two times the standard error of that statistic in 95% of all possible samples of identical size and design.
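The "plus or minus two times the standard error" rule can be written out directly. The sketch below uses a hypothetical estimate (25.0%) and standard error (1.0 percentage point) purely for illustration; neither figure comes from the survey:

```python
def confidence_interval(estimate: float, standard_error: float,
                        multiplier: float = 2.0) -> tuple[float, float]:
    """Return (lower, upper) bounds: estimate +/- multiplier * standard error."""
    half_width = multiplier * standard_error
    return estimate - half_width, estimate + half_width


# Hypothetical values, not taken from the 2022 GDHS.
low, high = confidence_interval(estimate=25.0, standard_error=1.0)
print(f"approximate 95% interval: {low:.1f}% to {high:.1f}%")
```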
If the sample of respondents had been selected as a simple random sample, it would have been possible to use straightforward formulas for calculating sampling errors. However, the 2022 GDHS sample was the result of a multistage stratified design, and, consequently, it was necessary to use more complex formulas. The computer software used to calculate sampling errors for the 2022 GDHS is an SAS program. This program used the Taylor linearization method to estimate variances for survey estimates that are means, proportions, or ratios. The Jackknife repeated replication method was used for variance estimation of more complex statistics such as fertility and mortality rates.
A more detailed description of the estimates of sampling errors is presented in APPENDIX B of the survey report.
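To make the idea of linearized variance concrete, here is a minimal sketch for a ratio r = y/x in a stratified cluster sample. It follows the general textbook form of the Taylor linearization estimator and ignores the finite population correction; it is not the SAS program used for the survey, and the input layout (one tuple of stratum label and weighted totals per cluster) and the example numbers are assumptions made for illustration.

```python
from collections import defaultdict
from math import sqrt


def ratio_standard_error(clusters):
    """Estimate r = y/x and its linearized standard error.

    clusters: iterable of (stratum, y_total, x_total), one entry per cluster,
    where the totals are weighted sums within the cluster.
    """
    clusters = list(clusters)
    y = sum(c[1] for c in clusters)
    x = sum(c[2] for c in clusters)
    r = y / x

    # Linearized residual z = y - r * x for each cluster, grouped by stratum.
    residuals = defaultdict(list)
    for stratum, y_hi, x_hi in clusters:
        residuals[stratum].append(y_hi - r * x_hi)

    variance = 0.0
    for z in residuals.values():
        m = len(z)
        if m < 2:
            raise ValueError("each stratum needs at least two clusters")
        z_sum = sum(z)
        variance += m / (m - 1) * (sum(v * v for v in z) - z_sum ** 2 / m)
    variance /= x ** 2
    return r, sqrt(variance)


# Hypothetical cluster totals in two strata, for illustration only.
example = [("A", 10, 20), ("A", 12, 20), ("B", 5, 20), ("B", 9, 20)]
r, se = ratio_standard_error(example)
print(f"ratio = {r:.3f}, standard error = {se:.4f}")
```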
Data Quality Tables
The 2007 Liberia Demographic and Health Survey (LDHS) was carried out from late December 2006 to April 2007, using a nationally representative sample of over 7,000 households. All women and men age 15-49 years in these households were eligible to be individually interviewed and were asked to provide a blood sample for HIV testing. The blood samples were dried and carried to the National Laboratory of the Ministry of Health and Social Welfare (MOHSW) on the JFK Hospital compound in Monrovia where they were tested for the human immunodeficiency virus (HIV).
The 2007 LDHS was designed to provide data to monitor the population and health situation in Liberia. Specifically, the LDHS collected information on fertility levels, marriage, sexual activity, fertility preferences, awareness and use of family planning methods, breastfeeding practices, nutritional status of women and young children, childhood and maternal mortality, maternal and child health, domestic violence, and awareness and behavior regarding HIV/AIDS and other sexually transmitted infections (STIs).
National
Sample survey data
The LDHS sample was designed to produce most of the key indicators for the country as a whole, for urban and rural areas separately, and for Monrovia and each of five regions that were formed by grouping the 15 counties. The regional groups are as follows:
1 Greater Monrovia
2 North Western: Bomi, Grand Cape Mount, Gbarpolu
3 South Central: Montserrado (outside Monrovia), Margibi, Grand Bassa
4 Southeastern A: River Cess, Sinoe, Grand Gedeh
5 Southeastern B: Rivergee, Grand Kru, Maryland
6 North Central: Bong, Nimba, Lofa
Thus the sample was not spread geographically in proportion to the population, but rather more or less equally across the regions. As a result, the LDHS sample is not self-weighting at the national level and sample weighting factors have been applied to the survey records in order to bring them into proportion.
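The effect of weighting a sample that is not self-weighting can be shown with a toy calculation. In this sketch each weight stands for how many population units a respondent represents; the values and weights are invented for illustration and are not LDHS weights:

```python
def weighted_mean(values, weights):
    """Return the weighted mean of values using the given sampling weights."""
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(v * w for v, w in zip(values, weights)) / total_weight


# Hypothetical data: the first two respondents come from an oversampled
# region (weight 1), the last two from an undersampled region (weight 3).
has_attribute = [1, 1, 0, 0]
weights = [1, 1, 3, 3]

unweighted = sum(has_attribute) / len(has_attribute)
weighted = weighted_mean(has_attribute, weights)
print(f"unweighted: {unweighted:.2f}, weighted: {weighted:.2f}")
```

Without weights the oversampled region dominates the estimate; with weights each region contributes in proportion to the population it represents.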
The survey utilised a two-stage sample design. The first stage involved selecting 300 sample points or clusters from the list of 4,602 enumeration areas (EAs) covered in the 1984 Population Census. This sampling 'frame' is more than 20 years old and in the intervening years Liberia has experienced a civil war and considerable population change. Many people left the country altogether, others lost their lives, while others moved within the country. For example, some households in rural areas relocated into larger villages in order to be better protected. New communities were established, while existing ones expanded, contracted, or even disappeared. Furthermore, as urban areas, especially Monrovia, expanded, some EAs that were previously considered rural may have become urban, but this is not reflected in the sample frame. Taking all these factors into account, the 1984 census frame is clearly not ideal as a sampling frame; however, it is still the only national frame that covers the whole country.
LISGIS conducted a fresh listing of the households residing in the selected sample points, along with identifying the geographic coordinates (latitude and longitude) of the center of each cluster (GPS coding). The listing was conducted from March to May 2006. The second stage of selection involved the systematic sampling of 25 of the households listed in each cluster. It later turned out that there was a problem with the sample frame for Monrovia that resulted in two areas being erroneously oversampled. To correct this error, two clusters were dropped altogether, while five others were replaced in order to provide more balance in the selection. Thus the survey covered a total of 298 clusters: 114 urban and 184 rural.
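Systematic sampling from a household listing can be sketched as follows: a fixed interval is computed from the listing size, a random start is drawn within the first interval, and every interval-th household is taken. The function and the listing size of 120 are illustrative assumptions; the exact procedure used in the field is described in the survey report, not here.

```python
import random


def systematic_sample(listing, n, rng=None):
    """Select n items from listing at a fixed interval after a random start."""
    size = len(listing)
    if not 0 < n <= size:
        raise ValueError("n must be between 1 and the listing size")
    rng = rng or random.Random()
    interval = size / n                 # fractional interval keeps spacing even
    start = rng.random() * interval     # random start within the first interval
    last = size - 1
    return [listing[min(int(start + k * interval), last)] for k in range(n)]


# Hypothetical cluster with 120 listed households, 25 to be selected.
households = list(range(1, 121))
selected = systematic_sample(households, 25, random.Random(2006))
print(selected)
```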
All women and men aged 15-49 years who were either permanent residents of the households in the sample or visitors present in the household on the night before the survey were eligible to be interviewed in the survey and to give a few drops of blood for HIV testing.
Note: See detailed description of the sample design in Appendix A of the survey final report.
Face-to-face
Three questionnaires—a Household Questionnaire, a Women’s Questionnaire, and a Men’s Questionnaire—were used in the survey. The contents of these questionnaires were based on model questionnaires developed by the MEASURE DHS program.
In consultation with a group of stakeholders, LISGIS and Macro staff modified the DHS model questionnaires to reflect relevant issues in population, family planning, HIV/AIDS, and other health issues in Liberia. Given that there are dozens of local languages in Liberia, most of which have no accepted written script and are not taught in the schools, and given that English is widely spoken, it was decided not to attempt to translate the questionnaires into vernaculars. However, many of the questions were broken down into a simpler form of Liberian English that interviewers could use with respondents.
The Household Questionnaire was used to list all the usual members and visitors in the selected households. Some basic information was collected on the characteristics of each person listed, including age, sex, education, and relationship to the head of the household. The main purpose of the Household Questionnaire was to identify women and men who were eligible for the individual interview. The Household Questionnaire also collected information on characteristics of the household’s dwelling unit, such as the source of water, type of toilet facilities, materials used for the floor and roof of the house, ownership of various durable goods, and ownership and use of mosquito nets. In addition, this questionnaire was also used to record height and weight measurements of women age 15-49 years and of children under the age of 5 years and women’s and men’s consent to volunteer to give blood samples. The HIV testing procedures are described in detail in the next section.
The Women’s Questionnaire was used to collect information from all women age 15-49 years and covered the following topics:
- Background characteristics (education, residential history, media exposure, etc.)
- Reproductive history and child mortality
- Knowledge and use of family planning methods
- Fertility preferences
- Prenatal and delivery care
- Breastfeeding and infant feeding practices
- Vaccinations and childhood illnesses
- Marriage and sexual activity
- Woman’s work and husband’s background characteristics
- Infant and child feeding practices
- Awareness and behavior about HIV/AIDS and other STIs
- Adult mortality including maternal mortality.
The Women’s Questionnaire also included a series of questions to obtain information on women’s experience of domestic violence. These questions were administered to one woman per household. In households with two or more eligible women, special procedures were followed in order to ensure that there was random selection of the woman to be interviewed and that these questions were administered in privacy.
The Men’s Questionnaire collected information similar to that contained in the Women’s Questionnaire, but was shorter because it did not contain questions on reproductive history, maternal and child health, nutrition, maternal mortality, or domestic violence.
All aspects of the LDHS data collection were pretested in July 2006. For the pretest, LISGIS recruited 19 people to attend the training, most of whom were LISGIS staff with a few from other organizations involved in the survey, e.g., the NACP. Training was held at the Liberia Bible Society for 11 days from June 20 through July 1. Twelve of the 19 participants were selected to implement the pretest interviewing. Two teams were formed for the pretest, each with one supervisor, three female interviewers, and two male interviewers. Each team covered one rural and one urban EA. Because the work was being done during the period of heavy rainfall, the rural areas selected were off a main paved road, about 1-2 hours’ drive from Monrovia, and the urban areas were both in Monrovia itself. Pretest interviewing took six days, from July 4 through July 9. In total, the teams completed interviews with 95 households, 82 women and 60 men, and collected 118 blood samples. The pretest resulted in deleting some questions and making modifications in others.
A total of 7,471 households were selected in the sample, of which 7,021 were found occupied at the time of the fieldwork. The shortfall is largely due to households that were away for an extended period of time and structures that were found to be vacant or destroyed. Of the existing households, 6,824 were successfully interviewed, yielding a household response rate of 97 percent.
In the households interviewed in the survey, a total of 7,448 eligible women were identified, of whom 7,092 were successfully interviewed, yielding a response rate of 95 percent. With regard to the male survey results, 6,476 eligible men were identified, of whom 6,009 were successfully interviewed, yielding a response rate of 93 percent. The response rates are lower in the urban than in the rural sample, especially for men.
The principal reason for non-response among both eligible men and women was the failure to find individuals at home despite repeated visits to the household, followed by refusal to be interviewed. The substantially lower response rate for men reflects the more frequent and longer absence of men from the
The 1993 Ghana Demographic and Health Survey (GDHS) is a nationally representative survey of 4,562 women age 15-49 and 1,302 men age 15-59. The survey is designed to furnish policymakers, planners and program managers with factual, reliable and up-to-date information on fertility, family planning and the status of maternal and child health care in the country. The survey, which was carried out by the Ghana Statistical Service (GSS), marks Ghana's second participation in the worldwide Demographic and Health Surveys (DHS) program.
The principal objective of the 1993 GDHS is to generate reliable and current information on fertility, mortality, contraception and maternal and child health indicators. Such data are necessary for effective policy formulation as well as program design, monitoring and evaluation. The 1993 GDHS is, in large measure, an update to the 1988 GDHS. Together, the two surveys provide comparable information for two points in time, thus allowing assessment of changes and trends in various demographic and health indicators over time.
Long-term objectives of the survey include (i) strengthening the capacity of the Ghana Statistical Service to plan, conduct, process and analyze data from a complex, large-scale survey such as the Demographic and Health Survey, and (ii) contributing to the ever-expanding international database on demographic and health-related variables.
National
Sample survey data
The 1993 GDHS is a stratified, self-weighting, nationally representative sample of households chosen from 400 Enumeration Areas (EAs). The 1984 Population Census EAs constituted the sampling frame. The frame was first stratified into three ecological zones, namely coastal, forest and savannah, and then into urban and rural EAs. The EAs were selected with probability proportional to the number of households. Households within selected EAs were subsequently listed and a systematic sample of households was selected for the survey. The survey was designed to yield a sample of 5,400 women age 15-49 and a sub-sample of males age 15-59 systematically selected from one-third of the 400 EAs.
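Selection with probability proportional to size (PPS) can be sketched with a systematic pass over cumulated household counts. The text states only that EAs were selected with probability proportional to the number of households; the systematic variant, the function name, and the household counts below are assumptions for illustration.

```python
import random
from bisect import bisect_right
from itertools import accumulate


def pps_systematic(sizes, n, rng=None):
    """Select n units with probability proportional to size.

    Returns indices into sizes. A unit larger than the sampling interval
    can be selected more than once.
    """
    if n <= 0 or not sizes:
        raise ValueError("need a positive n and a non-empty list of sizes")
    rng = rng or random.Random()
    cumulative = list(accumulate(sizes))
    interval = cumulative[-1] / n
    start = rng.random() * interval
    last = len(sizes) - 1
    # Each selection point falls in exactly one unit's cumulative size range.
    return [min(bisect_right(cumulative, start + k * interval), last)
            for k in range(n)]


# Hypothetical household counts for six EAs in one stratum.
household_counts = [50, 120, 80, 200, 150, 100]
chosen = pps_systematic(household_counts, 3, random.Random(1993))
print(chosen)
```

In a stratified design such as the one described above, a selection like this would be run separately within each stratum (ecological zone by urban/rural).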
Note: See detailed description of sample design in APPENDIX A of the survey report.
Face-to-face
Survey instruments used to elicit information for the 1993 GDHS are 1) Household Schedule 2) Women's Questionnaire and 3) Men's Questionnaire.
The questionnaires were structured based on the Demographic and Health Survey Model B Questionnaire designed for countries with low levels of contraceptive use. The final version of the questionnaires evolved out of a series of meetings with personnel of relevant ministries, institutions and organizations engaged in activities relating to fertility and family planning, health and nutrition and rehabilitation of persons with disabilities.
The questionnaires were first developed in English and later translated and printed in five major local languages, namely: Akan, Dagbani, Ewe, Ga, and Hausa. In the selected households, all usual members and visitors were listed in the household schedule. Background information, such as age, sex, relationship to head of household, marital status and level of education, was collected on each listed person. Questions on economic activity, occupation, industry, employment status, number of days worked in the past week and number of hours worked per day were asked of all persons age seven years and over. Those who did not work during the reference period were asked whether or not they actively looked for work.
Information on the health and disability status of all persons was also collected in the household schedule. Migration history was elicited from all persons age 15 years and over, as well as information on the survival status and residence of natural parents of all children less than 15 years in the household.
Data on source of water supply, type of toilet facility, number of sleeping rooms available to the household, material of floor and ownership of specified durable consumer goods were also elicited.
Finally, the household schedule was the instrument used to identify eligible women and men from whom detailed information was collected during the individual interview.
The women's questionnaire was used to collect information on eligible women identified in the household schedule. Eligible women were defined as those age 15-49 years who are usual members of the household and visitors who spent the night before the interview with the household. Questions asked in the questionnaire were on the following topics:
All female respondents with at least one live birth since January 1990 and their children born since 1st January 1990 had their height and weight taken.
The men's questionnaire was administered to men in sample households in a third of selected EAs. An eligible man was one age 15-59 years who was either a usual household member or a visitor who spent the night preceding the day of interview with the household.
Topics enquired about in the men's questionnaire included the following:
- Background Characteristics
- Reproductive History
- Contraceptive Knowledge and Use
- Marriage
- Fertility Preferences
- Knowledge of AIDS and Other STDs.
Questionnaires from the field were sent to the secretariat at the Head Office for checking and office editing. The office editing, which was undertaken by two officers, involved correcting inconsistencies in the questionnaire responses and coding open-ended questions. The questionnaires were then forwarded to the data processing unit for data entry. Data capture and verification were undertaken by four data entry operators. Nearly 20 percent of the questionnaires were verified. This phase of the survey covered four and a half months - that is, from mid-October, 1993 to the end of February, 1994.
After the data entry, three professional staff members performed the secondary editing of questionnaires that were flagged either because entries were inconsistent or values of specific variables were out of range or missing. The secondary editing was completed on 17th March, 1994 and the tables for the preliminary report were generated on 18th March, 1994. The software package used for the data processing was the Integrated System for Survey Analysis (ISSA).
A sample of 6,161 households was selected, from which 5,919 households were contacted for interview. Interviews were successfully completed in 5,822 households, indicating a household response rate of 98 percent. About 3 percent of selected households were absent during the interviewing period, and are excluded from the calculations of the response rate.
Even though the sample was designed to yield interviews with nearly 5,400 women age 15-49, only 4,700 women were identified as eligible for the individual interview. Individual interviews were successfully completed for 4,562 eligible women, giving a response rate of 97 percent. Similarly, instead of the expected 1,700 eligible men, only 1,354 eligible men were found in the households, and 1,302 of these were successfully interviewed, giving a response rate of 96 percent.
The principal reason for non-response among eligible women and men was not finding them at home despite repeated visits to the households. However, refusal rates for both eligible women and men were low, 0.3 percent and 0.2 percent, respectively.
Note: See summarized response rates in Table 1.1 of the survey report.
The results from sample surveys are affected by two types of errors, non-sampling error and sampling error. Non-sampling error is due to mistakes made in carrying out field activities, such as failure to locate and interview the correct household, errors in the way the questions are asked, misunderstanding on the part of either the interviewer or the respondent, data entry errors, etc. Although efforts were made during the design and implementation of the 1993 GDHS to minimize this type of error, non-sampling errors are impossible to avoid and difficult to evaluate statistically.
Sampling errors, on the other hand, can be measured statistically. The sample of eligible women selected in the 1993 GDHS is only one of many samples that could have been selected from the same population, using the same design and expected size. Each one would have yielded results that differed somewhat from the actual sample selected. The sampling error is a measure of the variability between all possible samples; although it is not known exactly, it can be estimated from the survey results.
Sampling error is usually measured in terms of standard error of a particular statistic (mean, percentage, etc.), which is the square root of the variance of the statistic. The standard error can be used to calculate confidence intervals within which, apart from non-sampling errors, the true value for the population can reasonably be assumed to fall. For example, for any given statistic calculated from a sample survey, the value of that same statistic as measured in 95 percent of all possible samples with the same design (and expected size) will fall within a range
This layer shows total population count by sex and age group. This is shown by tract, county, and state boundaries. This service is updated annually to contain the most currently released American Community Survey (ACS) 5-year data, and contains estimates and margins of error. There are also additional calculated attributes related to this topic, which can be mapped or used within analysis. This layer is symbolized to show the percentage of the population that are considered dependent (ages 65+ and <18). To see the full list of attributes available in this service, go to the "Data" tab, and choose "Fields" at the top right.
Current Vintage: 2019-2023
ACS Table(s): B01001
Data downloaded from: Census Bureau's API for American Community Survey
Date of API call: December 12, 2024
National Figures: data.census.gov
The United States Census Bureau's American Community Survey (ACS):
- About the Survey
- Geography & ACS
- Technical Documentation
- News & Updates
This ready-to-use layer can be used within ArcGIS Pro, ArcGIS Online, its configurable apps, dashboards, Story Maps, custom apps, and mobile apps. Data can also be exported for offline workflows. For more information about ACS layers, visit the FAQ. Please cite the Census and ACS when using this data.
Data Note from the Census:
Data are based on a sample and are subject to sampling variability. The degree of uncertainty for an estimate arising from sampling variability is represented through the use of a margin of error. The value shown here is the 90 percent margin of error. The margin of error can be interpreted as providing a 90 percent probability that the interval defined by the estimate minus the margin of error and the estimate plus the margin of error (the lower and upper confidence bounds) contains the true value. In addition to sampling variability, the ACS estimates are subject to nonsampling error (for a discussion of nonsampling variability, see Accuracy of the Data). The effect of nonsampling error is not represented in these tables.
Data Processing Notes:
- This layer is updated automatically when the most current vintage of ACS data is released each year, usually in December. The layer always contains the latest available ACS 5-year estimates. It is updated annually within days of the Census Bureau's release schedule. Click here to learn more about ACS data releases.
- Boundaries come from the US Census TIGER geodatabases, specifically, the National Sub-State Geography Database (named tlgdb_(year)_a_us_substategeo.gdb). Boundaries are updated at the same time as the data updates (annually), and the boundary vintage appropriately matches the data vintage as specified by the Census. These are Census boundaries with water and/or coastlines erased for cartographic and mapping purposes. For census tracts, the water cutouts are derived from a subset of the 2020 Areal Hydrography boundaries offered by TIGER. Water bodies and rivers which are 50 million square meters or larger (mid to large sized water bodies) are erased from the tract level boundaries, as well as additional important features. For state and county boundaries, the water and coastlines are derived from the coastlines of the 2023 500k TIGER Cartographic Boundary Shapefiles. These are erased to more accurately portray the coastlines and Great Lakes. The original AWATER and ALAND fields are still available as attributes within the data table (units are square meters).
- The States layer contains 52 records - all US states, Washington D.C., and Puerto Rico.
- Census tracts with no population that occur in areas of water, such as oceans, are removed from this data service (Census Tracts beginning with 99).
- Percentages and derived counts, and associated margins of error, are calculated values (that can be identified by the "_calc_" stub in the field name), and abide by the specifications defined by the American Community Survey.
- Field alias names were created based on the Table Shells file available from the American Community Survey Summary File Documentation page.
- Negative values (e.g., -4444...) have been set to null, with the exception of -5555... which has been set to zero. These negative values exist in the raw API data to indicate the following situations:
  - The margin of error column indicates that either no sample observations or too few sample observations were available to compute a standard error and thus the margin of error. A statistical test is not appropriate.
  - Either no sample observations or too few sample observations were available to compute an estimate, or a ratio of medians cannot be calculated because one or both of the median estimates falls in the lowest interval or upper interval of an open-ended distribution.
  - The median falls in the lowest interval of an open-ended distribution, or in the upper interval of an open-ended distribution. A statistical test is not appropriate.
  - The estimate is controlled. A statistical test for sampling variability is not appropriate.
  - The data for this geographic area cannot be displayed because the number of sample cases is too small.
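As an illustration of how the 90 percent margin of error and the dependent-population percentage described for this layer can be used, the sketch below computes confidence bounds, converts a margin of error to a standard error with the 1.645 factor that corresponds to a 90 percent confidence level, and derives a dependent share. The function names and all input numbers are hypothetical and are not values or field names from the layer.

```python
Z_90 = 1.645  # normal multiplier for a 90 percent confidence level


def confidence_bounds(estimate: float, moe: float) -> tuple[float, float]:
    """Return the lower and upper 90 percent confidence bounds."""
    return estimate - moe, estimate + moe


def moe_to_standard_error(moe: float) -> float:
    """Convert a 90 percent margin of error to a standard error."""
    return moe / Z_90


def dependent_share(under_18: float, age_65_plus: float, total: float) -> float:
    """Return the percentage of the population under 18 or age 65 and over."""
    if total <= 0:
        raise ValueError("total population must be positive")
    return 100.0 * (under_18 + age_65_plus) / total


# Hypothetical tract: estimate of 1,200 people with a margin of error of 150.
lower, upper = confidence_bounds(1200, 150)
standard_error = moe_to_standard_error(150)
share = dependent_share(under_18=220, age_65_plus=180, total=1000)
print(lower, upper, round(standard_error, 1), share)
```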
Attribution 4.0 (CC BY 4.0)
https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Related article: Bergroth, C., Järv, O., Tenkanen, H., Manninen, M., Toivonen, T., 2022. A 24-hour population distribution dataset based on mobile phone data from Helsinki Metropolitan Area, Finland. Scientific Data 9, 39.
In this dataset:
We present temporally dynamic population distribution data from the Helsinki Metropolitan Area, Finland, at the level of 250 m by 250 m statistical grid cells. Three hourly population distribution datasets are provided for regular workdays (Mon – Thu), Saturdays and Sundays. The data are based on aggregated mobile phone data collected by the biggest mobile network operator in Finland. Mobile phone data are assigned to statistical grid cells using an advanced dasymetric interpolation method based on ancillary data about land cover, buildings and a time use survey. The data were validated by comparing them with population register data from Statistics Finland for night-time hours and with a daytime workplace registry. The resulting 24-hour population data can be used to reveal the temporal dynamics of the city and examine population variations relevant to, for instance, spatial accessibility analyses, crisis management and planning.
Please cite this dataset as:
Bergroth, C., Järv, O., Tenkanen, H., Manninen, M., Toivonen, T., 2022. A 24-hour population distribution dataset based on mobile phone data from Helsinki Metropolitan Area, Finland. Scientific Data 9, 39. https://doi.org/10.1038/s41597-021-01113-4
Organization of data
The dataset is packaged into a single ZIP file, Helsinki_dynpop_matrix.zip, which contains the following files:
HMA_Dynamic_population_24H_workdays.csv represents the dynamic population for an average workday in the study area.
HMA_Dynamic_population_24H_sat.csv represents the dynamic population for an average Saturday in the study area.
HMA_Dynamic_population_24H_sun.csv represents the dynamic population for an average Sunday in the study area.
target_zones_grid250m_EPSG3067.geojson represents the statistical grid in ETRS89/ETRS-TM35FIN projection that can be used to visualize the data on a map using e.g. QGIS.
Column names
YKR_ID : a unique identifier for each statistical grid cell (n=13,231). The identifier is compatible with the statistical YKR grid cell data by Statistics Finland and Finnish Environment Institute.
H0, H1 ... H23 : Each field represents the proportional distribution of the total population in the study area between grid cells during a one-hour period. In total, 24 fields are formatted as “Hx”, where x stands for the hour of the day (values ranging from 0-23). For example, H0 stands for the first hour of the day: 00:00 - 00:59. The sum of all cell values for each field equals 100 (i.e. 100% of the total population for each one-hour period).
In order to visualize the data on a map, the result tables can be joined with the target_zones_grid250m_EPSG3067.geojson data. The data can be joined by using the field YKR_ID as a common key between the datasets.
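The join on YKR_ID and the "each hour sums to 100" property can be sketched with the standard library alone. The example below works on a tiny in-memory stand-in for the CSV and GeoJSON files; the two grid cell IDs, their values, the comma delimiter, and the helper names are assumptions for illustration, so the delimiter in particular should be checked against the actual files.

```python
import csv
import io

HOURS = [f"H{h}" for h in range(24)]


def read_population(csv_text, delimiter=","):
    """Parse the hourly table into {YKR_ID: {"H0": value, ..., "H23": value}}."""
    reader = csv.DictReader(io.StringIO(csv_text), delimiter=delimiter)
    return {row["YKR_ID"]: {h: float(row[h]) for h in HOURS} for row in reader}


def hourly_totals(population):
    """Sum each hourly field over all grid cells (expected to be about 100)."""
    return {h: sum(cell[h] for cell in population.values()) for h in HOURS}


def join_to_grid(grid, population):
    """Copy the hourly values into each GeoJSON feature that shares a YKR_ID."""
    for feature in grid["features"]:
        key = str(feature["properties"]["YKR_ID"])
        feature["properties"].update(population.get(key, {}))
    return grid


# Toy stand-ins for the real files: two grid cells with made-up IDs and shares.
sample_csv = (
    "YKR_ID," + ",".join(HOURS) + "\n"
    + "1001," + ",".join(["60.0"] * 24) + "\n"
    + "1002," + ",".join(["40.0"] * 24) + "\n"
)
sample_grid = {
    "type": "FeatureCollection",
    "features": [
        {"type": "Feature", "properties": {"YKR_ID": 1001}, "geometry": None},
        {"type": "Feature", "properties": {"YKR_ID": 1002}, "geometry": None},
    ],
}

population = read_population(sample_csv)
totals = hourly_totals(population)
joined = join_to_grid(sample_grid, population)
print(totals["H0"], joined["features"][0]["properties"]["H0"])
```

For real use, the CSV text would come from one of the three files in the ZIP archive and the grid from target_zones_grid250m_EPSG3067.geojson loaded with the json module.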
License Creative Commons Attribution 4.0 International.
Related datasets
Järv, Olle; Tenkanen, Henrikki & Toivonen, Tuuli. (2017). Multi-temporal function-based dasymetric interpolation tool for mobile phone data. Zenodo. https://doi.org/10.5281/zenodo.252612
Tenkanen, Henrikki, & Toivonen, Tuuli. (2019). Helsinki Region Travel Time Matrix [Data set]. Zenodo. http://doi.org/10.5281/zenodo.3247564
The American Community Survey Education Tabulation (ACS-ED) is a custom tabulation of the ACS produced for the National Center for Education Statistics (NCES) by the U.S. Census Bureau. The ACS-ED provides a rich collection of social, economic, demographic, and housing characteristics for school systems, school-age children, and the parents of school-age children. In addition to focusing on school-age children, the ACS-ED provides enrollment iterations for children enrolled in public school. The data profiles include percentages (along with associated margins of error) that allow for comparison of school district-level conditions across the U.S. For more information about the NCES ACS-ED collection, visit the NCES Education Demographic and Geographic Estimates (EDGE) program at: https://nces.ed.gov/programs/edge/Demographic/ACS
Annotation values are negative value representations of estimates and have values when non-integer information needs to be represented. See the table below for a list of common Estimate/Margin of Error (E/M) values and their corresponding Annotation (EA/MA) values.
All information contained in this file is in the public domain. Data users are advised to review NCES program documentation and feature class metadata to understand the limitations and appropriate use of these data.
-9: A '-9' entry in the estimate and margin of error columns indicates that data for this geographic area cannot be displayed because the number of sample cases is too small.
-8: A '-8' means that the estimate is not applicable or not available.
-6: A '-6' entry in the estimate column indicates that either no sample observations or too few sample observations were available to compute an estimate, or a ratio of medians cannot be calculated because one or both of the median estimates falls in the lowest interval or upper interval of an open-ended distribution.
-5: A '-5' entry in the margin of error column indicates that the estimate is controlled. A statistical test for sampling variability is not appropriate.
-3: A '-3' entry in the margin of error column indicates that the median falls in the lowest interval or upper interval of an open-ended distribution. A statistical test is not appropriate.
-2: A '-2' entry in the margin of error column indicates that either no sample observations or too few sample observations were available to compute a standard error and thus the margin of error. A statistical test is not appropriate.
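When reading these values programmatically, the annotation codes need to be separated from genuine numbers before any arithmetic is done. A minimal sketch follows, with the code meanings condensed from the list above; the function name and the choice to return a (value, annotation) pair are illustrative assumptions, not part of the NCES documentation.

```python
ANNOTATIONS = {
    -9: "not displayed: number of sample cases is too small",
    -8: "estimate not applicable or not available",
    -6: "estimate could not be computed (too few observations or open-ended median)",
    -5: "estimate is controlled; no test for sampling variability",
    -3: "median falls in the lowest or upper interval of an open-ended distribution",
    -2: "margin of error could not be computed (too few observations)",
}


def decode(value):
    """Split a raw cell into (numeric_value, annotation); one of them is None."""
    if value in ANNOTATIONS:
        return None, ANNOTATIONS[value]
    return value, None


# Hypothetical raw cells: one real estimate and two annotation codes.
for raw in (42.5, -9, -5):
    number, note = decode(raw)
    print(raw, "->", number if note is None else note)
```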