analyze the current population survey (cps) annual social and economic supplement (asec) with r

the annual march cps-asec has been supplying the statistics for the census bureau's report on income, poverty, and health insurance coverage since 1948. wow. the us census bureau and the bureau of labor statistics (bls) tag-team on this one. until the american community survey (acs) hit the scene in the early aughts (2000s), the current population survey had the largest sample size of all the annual general demographic data sets outside of the decennial census - about two hundred thousand respondents. this provides enough sample to conduct state- and a few large metro area-level analyses. your sample size will vanish if you start investigating subgroups by state - consider pooling multiple years. county-level is a no-no.

despite the american community survey's larger size, the cps-asec contains many more variables related to employment, sources of income, and insurance - and can be trended back to harry truman's presidency. aside from questions specifically asked about an annual experience (like income), many of the questions in this march data set should be treated as point-in-time statistics. cps-asec generalizes to the united states non-institutional, non-active duty military population.

the national bureau of economic research (nber) provides sas, spss, and stata importation scripts to create a rectangular file (rectangular data means only person-level records; household- and family-level information gets attached to each person). to import these files into r, the parse.SAScii function uses nber's sas code to determine how to import the fixed-width file, then RSQLite to put everything into a schnazzy database. you can try reading through the nber march 2012 sas importation code yourself, but it's a bit of a proc freak show.
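to make the "rectangular file" idea concrete, here is a minimal python sketch (not the nber or r code) that attaches household- and family-level fields to each person record. every column name and value below is hypothetical; the real asec files carry many more fields and different identifiers.

```python
# hypothetical miniature tables; real cps-asec records have many more columns
households = {
    1: {"h_state": "NY"},
    2: {"h_state": "TX"},
}
families = {
    (1, 1): {"f_income": 52000},
    (2, 1): {"f_income": 31000},
    (2, 2): {"f_income": 18000},
}
persons = [
    {"h_id": 1, "f_id": 1, "p_id": 1, "age": 44},
    {"h_id": 1, "f_id": 1, "p_id": 2, "age": 9},
    {"h_id": 2, "f_id": 1, "p_id": 1, "age": 61},
    {"h_id": 2, "f_id": 2, "p_id": 1, "age": 27},
]


def rectangularize(households, families, persons):
    """Return one row per person with household and family fields attached."""
    rows = []
    for person in persons:
        row = dict(person)  # copy so the input records stay untouched
        row.update(households[person["h_id"]])
        row.update(families[(person["h_id"], person["f_id"])])
        rows.append(row)
    return rows


rectangular = rectangularize(households, families, persons)
for row in rectangular:
    print(row)
```

the result has exactly as many rows as the person table, which is the defining property of the rectangular layout described above.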
this new github repository contains three scripts:

2005-2012 asec - download all microdata.R
- download the fixed-width file containing household, family, and person records
- import by separating this file into three tables, then merge 'em together at the person-level
- download the fixed-width file containing the person-level replicate weights
- merge the rectangular person-level file with the replicate weights, then store it in a sql database
- create a new variable - one - in the data table

2012 asec - analysis examples.R
- connect to the sql database created by the 'download all microdata' program
- create the complex sample survey object, using the replicate weights
- perform a boatload of analysis examples

replicate census estimates - 2011.R
- connect to the sql database created by the 'download all microdata' program
- create the complex sample survey object, using the replicate weights
- match the sas output shown in the png file below

2011 asec replicate weight sas output.png
- statistic and standard error generated from the replicate-weighted example sas script contained in this census-provided person replicate weights usage instructions document.

for more detail about the current population survey - annual social and economic supplement (cps-asec), visit:
- the census bureau's current population survey page
- the bureau of labor statistics' current population survey page
- the current population survey's wikipedia article

notes:

interviews are conducted in march about experiences during the previous year. the file labeled 2012 includes information (income, work experience, health insurance) pertaining to 2011. when you use the current population survey to talk about america, subtract a year from the data file name.

as of the 2010 file (the interview focusing on america during 2009), the cps-asec contains exciting new medical out-of-pocket spending variables most useful for supplemental (medical spending-adjusted) poverty research.
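the scripts above lean on replicate weights to get standard errors. as a rough python sketch of the idea (not the r or sas code itself): compute the statistic once with the full-sample weight, once with each replicate weight, and scale the summed squared differences. the 4 / (number of replicates) factor below is an assumption based on the fay-style formula described for the asec replicate weights; confirm it against the usage instructions document before relying on it. the toy data use only four replicates and are made up.

```python
import math


def weighted_total(values, weights):
    """Weighted sum of a variable, e.g. a 0/1 indicator gives a population count."""
    return sum(v * w for v, w in zip(values, weights))


def replicate_se(values, full_weights, replicate_weights, scale):
    """Estimate and standard error from a set of replicate weights.

    scale is the variance multiplier; 4 / R is assumed here for R replicates.
    """
    theta0 = weighted_total(values, full_weights)
    squared_diffs = sum(
        (weighted_total(values, rw) - theta0) ** 2 for rw in replicate_weights
    )
    return theta0, math.sqrt(scale * squared_diffs)


# made-up toy data: four people, four replicate weights
values = [1, 0, 1, 1]
full_weights = [100.0, 150.0, 120.0, 130.0]
replicate_weights = [
    [110.0, 140.0, 120.0, 130.0],
    [90.0, 160.0, 125.0, 125.0],
    [100.0, 150.0, 110.0, 140.0],
    [105.0, 145.0, 130.0, 120.0],
]

estimate, se = replicate_se(
    values, full_weights, replicate_weights, scale=4 / len(replicate_weights)
)
print(estimate, se)
```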
confidential to sas, spss, stata, sudaan users: why are you still rubbing two sticks together after we've invented the butane lighter? time to transition to r. :D
The study included four separate surveys:
1. The LSMS survey of general population of Serbia in 2002
2. The survey of Family Income Support (MOP in Serbian) recipients in 2002
These two datasets are published together, separately from the 2003 datasets.
3. The LSMS survey of general population of Serbia in 2003 (panel survey)
4. The survey of Roma from Roma settlements in 2003
These two datasets are published together.
Objectives
The LSMS is a multi-topic study of household living standards, based on international experience in designing and conducting this type of research. The basic survey was carried out in 2002 on a representative sample of households in Serbia (without Kosovo and Metohija). Its goal was to establish a poverty profile from comprehensive data on the welfare of households and to identify vulnerable groups. It also aimed to assess the targeting of safety net programs by collecting detailed information from individuals on participation in specific government social programs. This study was used as the basic document in developing the Poverty Reduction Strategy (PRS) in Serbia, which was adopted by the Government of the Republic of Serbia in October 2003.
The survey was repeated in 2003 on a panel sample (the households which participated in 2002 survey were re-interviewed).
Analysis of the take-up and profile of the population in 2003 was the first step towards formulating the system of monitoring in the Poverty Reduction Strategy (PRS). The survey was conducted in accordance with the same methodological principles used in the 2002 survey, with necessary changes referring only to the content of certain modules and the reduction in sample size. The aim of the repeated survey was to obtain panel data to enable monitoring of the change in the living standard within a period of one year, thus indicating whether there had been a decrease or increase in poverty in Serbia in the course of 2003. [Note: Panel data are the data obtained on the sample of households which participated in both surveys. These data made it possible to track the living standard of the same persons over a period of one year.]
Along with these two comprehensive surveys, conducted on nationally and regionally representative samples which were to give a picture of the general population, there were also two surveys with particular emphasis on vulnerable groups. In 2002, it was the survey of the living standard of Family Income Support recipients, with the aim of validating this state-supported program of social welfare. In 2003 the survey of Roma from Roma settlements was conducted. Since all experience to date indicated that this was one of the most vulnerable groups on the territory of Serbia and Montenegro, yet little research on poverty among the Roma population had been done, the aim of the survey was to compare poverty of this group with poverty of the basic population and to establish which categories of the Roma population were at the greatest risk of poverty in 2003. However, it is necessary to stress that the LSMS of the Roma population comprised the potentially most imperilled Roma, while the Roma integrated in the main population were not included in this study.
The surveys were conducted on the whole territory of Serbia (without Kosovo and Metohija).
Sample survey data [ssd]
The sample frame for both surveys of the general population (LSMS) in 2002 and 2003 consisted of all permanent residents of Serbia, without the population of Kosovo and Metohija, according to the definition of permanently resident population contained in the UN Recommendations for Population Censuses, which were applied in the 2002 Census of Population in the Republic of Serbia. Therefore, permanent residents were all persons living in the territory of Serbia longer than one year, with the exception of diplomatic and consular staff.
The sample frame for the survey of Family Income Support recipients included all current recipients of this program on the territory of Serbia based on the official list of recipients given by the Ministry of Social Affairs.
The definition of the Roma population from Roma settlements was faced with obstacles since precise data on the total number of Roma population in Serbia are not available. According to the last population Census from 2002 there were 108,000 Roma citizens, but the data from the Census are thought to significantly underestimate the total number of the Roma population. However, since no other more precise data were available, this number was taken as the basis for the estimate of the Roma population from Roma settlements. According to the 2002 Census, settlements in which at least 7% of the total population declared themselves as belonging to Roma nationality were selected. A total of 83% or 90,000 self-declared Roma lived in the settlements that were defined in this way and this number was taken as the sample frame for Roma from Roma settlements.
Planned sample: In 2002 the planned size of the sample of general population included 6,500 households. The sample was both nationally and regionally representative (representative on each individual stratum). In 2003 the planned panel sample size was 3,000 households. In order to preserve the representative quality of the sample, we kept every other census block unit of the large sample realized in 2002. This way we kept the identical allocation by strata. In each selected census block unit, the same households were interviewed as in the basic survey in 2002. The planned sample of Family Income Support recipients in 2002 and Roma from Roma settlements in 2003 was 500 households for each group.
Sample type: In both national surveys the implemented sample was a two-stage stratified sample. Units of the first stage were enumeration districts, and units of the second stage were the households. In the basic 2002 survey, enumeration districts were selected with probability proportional to number of households, so that the enumeration districts with bigger number of households have a higher probability of selection. In the repeated survey in 2003, first-stage units (census block units) were selected from the basic sample obtained in 2002 by including only even numbered census block units. In practice this meant that every second census block unit from the previous survey was included in the sample. In each selected enumeration district the same households interviewed in the previous round were included and interviewed. On finishing the survey in 2003 the cases were merged both on the level of households and members.
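The first-stage selection described above can be illustrated with a short Python sketch. It assumes systematic selection with probability proportional to size, which is one common way to implement PPS; the survey documentation does not say which PPS algorithm was actually used. The household counts are invented, and the 2003 panel is approximated by keeping every second selected unit.

```python
import random


def pps_systematic(sizes, n, rng):
    """Select n unit indices with probability proportional to size.

    Walks a cumulative total of sizes and picks the unit under each of n
    equally spaced targets. A unit larger than the sampling step can be
    picked more than once.
    """
    total = sum(sizes)
    step = total / n
    start = rng.random() * step
    targets = [start + k * step for k in range(n)]
    chosen = []
    cumulative = 0.0
    next_target = 0
    for index, size in enumerate(sizes):
        cumulative += size
        while next_target < n and targets[next_target] < cumulative:
            chosen.append(index)
            next_target += 1
    return chosen


# invented household counts for eight enumeration districts in one stratum
district_households = [120, 80, 200, 50, 150, 100, 90, 210]

rng = random.Random(0)  # fixed seed so the sketch is reproducible
selected_2002 = pps_systematic(district_households, n=4, rng=rng)

# 2003 panel: keep every second selected unit (the even-numbered ones)
panel_2003 = selected_2002[1::2]

print(selected_2002, panel_2003)
```

Within each selected district, the second stage would then draw households; that step is omitted here.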
Stratification: Municipalities are stratified into the following six territorial strata: Vojvodina, Belgrade, Western Serbia, Central Serbia (Šumadija and Pomoravlje), Eastern Serbia and South-east Serbia. Primary units of selection are further stratified into enumeration districts which belong to urban type of settlements and enumeration districts which belong to rural type of settlement.
The sample of Family Income Support recipients consisted of cases chosen randomly from the official list of recipients provided by the Ministry of Social Affairs. The sample of Roma from Roma settlements was, as in the national survey, a two-stage stratified sample, but the units in the first stage were settlements where the Roma population made up over 7% of the population, and the units of the second stage were Roma households. Settlements are stratified in three territorial strata: Vojvodina, Belgrade and Central Serbia.
Face-to-face [f2f]
In all surveys the same questionnaire with minimal changes was used. It included different modules, topically separate areas which had an aim of perceiving the living standard of households from different angles. Topic areas were the following:
1. Roster with demography.
2. Housing conditions and durables module with information on the age of durables owned by a household with a special block focused on collecting information on energy billing, payments, and usage.
3. Diary of food expenditures (weekly), including home production, gifts and transfers in kind.
4. Questionnaire of main expenditure-based recall periods sufficient to enable construction of annual consumption at the household level, including home production, gifts and transfers in kind.
5. Agricultural production for all households which cultivate 10+ acres of land or who breed cattle.
6. Participation and social transfers module with detailed breakdown by programs.
7. Labour Market module in line with a simplified version of the Labour Force Survey (LFS), with special additional questions to capture various informal sector activities, and providing information on earnings.
8. Health with a focus on utilization of services and expenditures (including informal payments).
9. Education module, which incorporated pre-school, compulsory primary education, secondary education and university education.
10. Special income block, focusing on sources of income not covered in other parts (with a focus on remittances).
During field work, interviewers kept a precise diary of interviews, recording both successful and unsuccessful visits. Particular attention was paid to reasons why some households were not interviewed. Separate marks were given for households which were not interviewed due to refusal and for cases when a given household could not be found on the territory of the chosen census block.
In 2002 a total of 7,491 households were contacted. Of this number a total of 6,386 households in 621 census rounds were interviewed. Interviewers did not manage to collect the data for 1,106 or 14.8% of selected households. Out of this number 634 households
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Context
The dataset tabulates the data for the Colony, OK population pyramid, which represents the Colony population distribution across age and gender, using estimates from the U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates. It lists the male and female population for each age group, along with the total population for those age groups. Higher numbers at the bottom of the table suggest population growth, whereas higher numbers at the top indicate declining birth rates. Furthermore, the dataset can be utilized to understand the youth dependency ratio, old-age dependency ratio, total dependency ratio, and potential support ratio.
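As an illustration of the ratios named above, the following Python sketch computes them from three broad age groups. The cutoffs (0-14, 15-64, 65+) follow a common convention and are an assumption here, since the dataset description does not state which age bands it uses. The counts are invented and are not Colony, OK figures.

```python
def dependency_ratios(age_0_14, age_15_64, age_65_plus):
    """Return dependency ratios per 100 working-age people, plus the
    potential support ratio (working-age people per person aged 65+)."""
    if age_15_64 <= 0:
        raise ValueError("working-age population must be positive")
    youth = 100.0 * age_0_14 / age_15_64
    old_age = 100.0 * age_65_plus / age_15_64
    support = age_15_64 / age_65_plus if age_65_plus > 0 else float("inf")
    return {
        "youth_dependency": youth,
        "old_age_dependency": old_age,
        "total_dependency": youth + old_age,
        "potential_support": support,
    }


# invented counts, used only to show the arithmetic
ratios = dependency_ratios(age_0_14=30, age_15_64=100, age_65_plus=20)
print(ratios)
```

For small places, the underlying ACS counts carry wide margins of error, so ratios built from them should be read with the same caution.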
Key observations
When available, the data consists of estimates from the U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates.
Good to know
Margin of Error
Data in the dataset are based on estimates and are subject to sampling variability and thus a margin of error. Neilsberg Research recommends using caution when presenting these estimates in your research.
Custom data
If you need custom data for a research project, report or presentation, you can contact our research staff at research@neilsberg.com to discuss the feasibility of a custom tabulation on a fee-for-service basis.
The Neilsberg Research Team curates, analyzes and publishes demographic and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research aggregated datasets and insights are made available for free download at https://www.neilsberg.com/research/.
This dataset is a part of the main dataset for Colony Population by Age.
Age, Sex, Race, Ethnicity, Total Housing Units, and Voting Age Population. This service is updated annually with American Community Survey (ACS) 5-year data.

Contact: District of Columbia, Office of Planning
Email: planning@dc.gov
Geography: Census Tracts
Current Vintage: 2019-2023
ACS Table(s): DP05
Data downloaded from: Census Bureau's API for American Community Survey
Date of API call: January 2, 2025
National Figures: data.census.gov

Please cite the Census and ACS when using this data.

Data Note from the Census: Data are based on a sample and are subject to sampling variability. The degree of uncertainty for an estimate arising from sampling variability is represented through the use of a margin of error. The value shown here is the 90 percent margin of error. The margin of error can be interpreted as providing a 90 percent probability that the interval defined by the estimate minus the margin of error and the estimate plus the margin of error (the lower and upper confidence bounds) contains the true value. In addition to sampling variability, the ACS estimates are subject to nonsampling error (for a discussion of nonsampling variability, see Accuracy of the Data). The effect of nonsampling error is not represented in these tables.

Data Processing Notes: This layer is updated automatically when the most current vintage of ACS data is released each year, usually in December. The layer always contains the latest available ACS 5-year estimates. It is updated annually within days of the Census Bureau's release schedule. Boundaries come from the US Census TIGER geodatabases. Boundaries are updated at the same time as the data updates (annually), and the boundary vintage appropriately matches the data vintage as specified by the Census. These are Census boundaries with water and/or coastlines clipped for cartographic purposes. For census tracts, the water cutouts are derived from a subset of the 2020 AWATER (Area Water) boundaries offered by TIGER.
For state and county boundaries, the water and coastlines are derived from the coastlines of the 500k TIGER Cartographic Boundary Shapefiles. The original AWATER and ALAND fields are still available as attributes within the data table (units are square meters). Field alias names were created based on the Table Shells file available from the American Community Survey Summary File Documentation page. Data processed using R statistical package and ArcGIS Desktop. Margin of Error was not included in this layer but is available from the Census Bureau. Contact the Office of Planning for more information about obtaining Margin of Error values.
The 1940 Census Public Use Microdata Sample Project was assembled through a collaborative effort between the United States Bureau of the Census and the Center for Demography and Ecology at the University of Wisconsin. The collection contains a stratified 1-percent sample of households, with separate records for each household, for each "sample line" respondent, and for each person in the household. These records were encoded from microfilm copies of original handwritten enumeration schedules from the 1940 Census of Population. Geographic identification of the location of the sampled households includes Census regions and divisions, states (except Alaska and Hawaii), standard metropolitan areas (SMAs), and state economic areas (SEAs). Accompanying the data collection is a codebook that includes an abstract, descriptions of sample design, processing procedures and file structure, a data dictionary (record layout), category code lists, and a glossary. Also included is a procedural history of the 1940 Census. Each of the 20 subsamples contains three record types: household, sample line, and person. Household variables describe the location and condition of the household. The sample line records contain variables describing demographic characteristics such as nativity, marital status, number of children, veteran status, wage deductions for Social Security, and occupation. Person records also contain variables describing demographic characteristics including nativity, marital status, family membership, education, employment status, income, and occupation. (Source: downloaded from ICPSR 7/13/10)
Please Note: This dataset is part of the historical CISER Data Archive Collection and is also available at ICPSR at https://doi.org/10.3886/ICPSR08236.v1. We highly recommend using the ICPSR version as they may make this dataset available in multiple data formats in the future.
2013-2023 Virginia Non-Single Occupancy Vehicle (SOV) Travel Percent by Census Urban Area. Contains estimates of the percentage of workers 16 years and over, commuting to work, who do NOT drive alone in a car, truck, or van.
U.S. Census Bureau; American Community Survey, American Community Survey 5-Year Estimates, Table DP03, Column DP03_0019PE Data accessed from: Census Bureau's API for American Community Survey (https://www.census.gov/data/developers/data-sets.html)
Documentation of the method to calculate Non-SOV is provided by the Federal Highway Administration (https://www.fhwa.dot.gov/tpm/guidance/hif18024.pdf); page 38 explains the calculation of the Non-SOV Travel measure.
Urban areas with values of -666,666,666 or 0 have blanks calculated for Non-SOV values.
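The blanking rule above can be sketched in Python as follows. The sketch assumes that Non-SOV is computed as 100 minus the drove-alone percentage in column DP03_0019PE, which is how the measure is commonly derived; the FHWA guidance cited above is the authoritative description. The function name and sample values are hypothetical.

```python
ACS_SENTINEL = -666666666  # ACS code for "estimate could not be computed"


def non_sov_percent(drove_alone_pct):
    """Return the Non-SOV percentage, or None (a blank) when the
    drove-alone value is missing, the ACS sentinel, or 0."""
    if drove_alone_pct is None:
        return None
    if drove_alone_pct == ACS_SENTINEL or drove_alone_pct == 0:
        return None
    return round(100.0 - drove_alone_pct, 1)


# hypothetical drove-alone percentages for four urban areas
sample_values = [76.4, ACS_SENTINEL, 0, 81.0]
results = [non_sov_percent(value) for value in sample_values]
print(results)
```

Returning None instead of a number keeps the sentinel from leaking into averages or maps as a large negative value.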
The United States Census Bureau's American Community Survey (ACS):
- What is the American Community Survey? (https://www.census.gov/programs-surveys/acs/about.html)
- Geography & ACS (https://www.census.gov/programs-surveys/acs/geography-acs.html)
- Technical Documentation (https://www.census.gov/programs-surveys/acs/technical-documentation.html)
Supporting documentation on code lists, subject definitions, data accuracy, and statistical testing can be found on the American Community Survey website in the Technical Documentation section. (https://www.census.gov/programs-surveys/acs/technical-documentation/code-lists.html)
Sample size and data quality measures (including coverage rates, allocation rates, and response rates) can be found on the American Community Survey website in the Methodology section. (https://www.census.gov/acs/www/methodology/sample_size_and_data_quality/)
Although the American Community Survey (ACS) produces population, demographic and housing unit estimates, it is the Census Bureau's Population Estimates Program that produces and disseminates the official estimates of the population for the nation, states, counties, cities, and towns and estimates of housing units for states and counties.
Data are based on a sample and are subject to sampling variability. The degree of uncertainty for an estimate arising from sampling variability is represented through the use of a margin of error. The value shown here is the 90 percent margin of error. The margin of error can be interpreted roughly as providing a 90 percent probability that the interval defined by the estimate minus the margin of error and the estimate plus the margin of error (the lower and upper confidence bounds) contains the true value. In addition to sampling variability, the ACS estimates are subject to nonsampling error (for a discussion of nonsampling variability, see ACS Technical Documentation https://www.census.gov/programs-surveys/acs/technical-documentation.html). The effect of nonsampling error is not represented in these tables.
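The margin-of-error language above translates into a few lines of arithmetic. The Python sketch below assumes the usual convention that a 90 percent margin of error equals 1.645 standard errors; the function names and the example figures are illustrative and do not come from any table cited here.

```python
import math

Z_90 = 1.645  # multiplier assumed for a 90 percent margin of error


def standard_error(moe_90):
    """Convert a 90 percent margin of error to a standard error."""
    return moe_90 / Z_90


def confidence_bounds(estimate, moe_90):
    """Lower and upper 90 percent confidence bounds."""
    return estimate - moe_90, estimate + moe_90


def significantly_different(est_a, moe_a, est_b, moe_b):
    """Rough 90 percent test of whether two independent estimates differ."""
    se_diff = math.sqrt(standard_error(moe_a) ** 2 + standard_error(moe_b) ** 2)
    return abs(est_a - est_b) / se_diff > Z_90


lower, upper = confidence_bounds(23.6, 1.2)
print(lower, upper)
print(significantly_different(50.0, 1.645, 53.0, 1.645))
```

The comparison treats the two estimates as independent, which is an approximation when the areas or years overlap.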
A broad and generalized selection of US Census Bureau 2014-2018 5-year American Community Survey race, ethnicity and citizenship data estimates, obtained via the Census API and joined to the appropriate geometry (in this case, New Mexico Census tracts). The selection is not comprehensive, but allows a first-level characterization of household income, median household income by race and by age group, Social Security income, the GINI Index, per capita income, median family income, and median household earnings by age, and by education level, in New Mexico. The determination of which estimates to include was based upon level of interest and providing a manageable dataset for users.

The U.S. Census Bureau's American Community Survey (ACS) is a nationwide, continuous survey designed to provide communities with reliable and timely demographic, housing, social, and economic data every year. The ACS collects long-form-type information throughout the decade rather than only once every 10 years. The ACS combines population or housing data from multiple years to produce reliable numbers for small counties, neighborhoods, and other local areas. To provide information for communities each year, the ACS provides 1-, 3-, and 5-year estimates. ACS 5-year estimates (multiyear estimates) are “period” estimates that represent data collected over a 60-month period of time (as opposed to “point-in-time” estimates, such as the decennial census, that approximate the characteristics of an area on a specific date). ACS data are released in the year immediately following the year in which they are collected. ACS estimates based on data collected from 2009–2014 should not be called “2009” or “2014” estimates. Multiyear estimates should be labeled to indicate clearly the full period of time. While the ACS contains margin of error (MOE) information, this dataset does not.
Those individuals requiring more complete data are directed to download the more detailed datasets from the ACS American FactFinder website. This dataset is organized by Census tract boundaries in New Mexico. Census tracts are small, relatively permanent statistical subdivisions of a county or equivalent entity, and were defined by local participants as part of the 2010 Census Participant Statistical Areas Program. The primary purpose of census tracts is to provide a stable set of geographic units for the presentation of census data and comparison back to previous decennial censuses. Census tracts generally have a population size between 1,200 and 8,000 people, with an optimum size of 4,000 people. State and county boundaries always are census tract boundaries in the standard census geographic hierarchy. In a few rare instances, a census tract may consist of noncontiguous areas. These noncontiguous areas may occur where the census tracts are coextensive with all or parts of legal entities that are themselves noncontiguous. For the 2010 Census, the census tract code range of 9400 through 9499 was enforced for census tracts that include a majority American Indian population according to Census 2000 data and/or their area was primarily covered by federally recognized American Indian reservations and/or off-reservation trust lands; the code range 9800 through 9899 was enforced for those census tracts that contained little or no population and represented a relatively large special land use area such as a National Park, military installation, or a business/industrial park; and the code range 9900 through 9998 was enforced for those census tracts that contained only water area, no land area. 
NOTE: A '-666666666' entry indicates that either no sample observations or too few sample observations were available to compute an estimate, or a ratio of medians cannot be calculated because one or both of the median estimates falls in the lowest interval or upper interval of an open-ended distribution.
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Although the American Community Survey (ACS) produces population, demographic and housing unit estimates, the decennial census is the official source of population totals for April 1st of each decennial year. In between censuses, the Census Bureau's Population Estimates Program produces and disseminates the official estimates of the population for the nation, states, counties, cities, and towns and estimates of housing units and the group quarters population for states and counties.

Information about the American Community Survey (ACS) can be found on the ACS website. Supporting documentation including code lists, subject definitions, data accuracy, and statistical testing, and a full list of ACS tables and table shells (without estimates) can be found on the Technical Documentation section of the ACS website. Sample size and data quality measures (including coverage rates, allocation rates, and response rates) can be found on the American Community Survey website in the Methodology section.

Source: U.S. Census Bureau, 2023 American Community Survey 1-Year Estimates.

ACS data generally reflect the geographic boundaries of legal and statistical areas as of January 1 of the estimate year. For more information, see Geography Boundaries by Year.

Data are based on a sample and are subject to sampling variability. The degree of uncertainty for an estimate arising from sampling variability is represented through the use of a margin of error. The value shown here is the 90 percent margin of error. The margin of error can be interpreted roughly as providing a 90 percent probability that the interval defined by the estimate minus the margin of error and the estimate plus the margin of error (the lower and upper confidence bounds) contains the true value. In addition to sampling variability, the ACS estimates are subject to nonsampling error (for a discussion of nonsampling variability, see ACS Technical Documentation). The effect of nonsampling error is not represented in these tables.

Users must consider potential differences in geographic boundaries, questionnaire content or coding, or other methodological issues when comparing ACS data from different years. Statistically significant differences shown in ACS Comparison Profiles, or in data users' own analysis, may be the result of these differences and thus might not necessarily reflect changes to the social, economic, housing, or demographic characteristics being compared. For more information, see Comparing ACS Data.

Estimates of urban and rural populations, housing units, and characteristics reflect boundaries of urban areas defined based on 2020 Census data. As a result, data for urban and rural areas from the ACS do not necessarily reflect the results of ongoing urbanization.

Explanation of Symbols:
-: The estimate could not be computed because there were an insufficient number of sample observations. For a ratio of medians estimate, one or both of the median estimates falls in the lowest interval or highest interval of an open-ended distribution. For a 5-year median estimate, the margin of error associated with a median was larger than the median itself.
N: The estimate or margin of error cannot be displayed because there were an insufficient number of sample cases in the selected geographic area.
(X): The estimate or margin of error is not applicable or not available.
median-: The median falls in the lowest interval of an open-ended distribution (for example "2,500-").
median+: The median falls in the highest interval of an open-ended distribution (for example "250,000+").
**: The margin of error could not be computed because there were an insufficient number of sample observations.
***: The margin of error could not be computed because the median falls in the lowest interval or highest interval of an open-ended distribution.
*****: A margin of error is not appropriate because the corresponding estimate is controlled to an independent population or housing estimate. Effectively, the corresponding estimate has no sampling error and the margin of error may be treated as zero.
Round 1 of the Afrobarometer survey was conducted from July 1999 through June 2001 in 12 African countries, to solicit public opinion on democracy, governance, markets, and national identity. The full 12 country dataset released was pieced together out of different projects: Round 1 of the Afrobarometer survey, the old Southern African Democracy Barometer, and similar surveys done in West and East Africa.
The 7 country dataset is a subset of the Round 1 survey dataset, and consists of a combined dataset for the 7 Southern African countries surveyed with other African countries in Round 1, 1999-2000 (Botswana, Lesotho, Malawi, Namibia, South Africa, Zambia and Zimbabwe). It is a useful dataset because, in contrast to the full 12 country Round 1 dataset, all countries in this dataset were surveyed with the identical questionnaire.
Botswana, Lesotho, Malawi, Namibia, South Africa, Zambia, Zimbabwe
Basic units of analysis that the study investigates include: individuals and groups
Sample survey data [ssd]
A new sample has to be drawn for each round of Afrobarometer surveys. Whereas the standard sample size for Round 3 surveys will be 1200 cases, a larger sample size will be required in societies that are extremely heterogeneous (such as South Africa and Nigeria), where the sample size will be increased to 2400. Other adaptations may be necessary within some countries to account for the varying quality of the census data or the availability of census maps.
The sample is designed as a representative cross-section of all citizens of voting age in a given country. The goal is to give every adult citizen an equal and known chance of selection for interview. We strive to reach this objective by (a) strictly applying random selection methods at every stage of sampling and by (b) applying sampling with probability proportionate to population size wherever possible. A randomly selected sample of 1200 cases allows inferences to national adult populations with a margin of sampling error of no more than plus or minus 2.5 percent with a confidence level of 95 percent. If the sample size is increased to 2400, the confidence interval shrinks to plus or minus 2 percent.
Sample Universe
The sample universe for Afrobarometer surveys includes all citizens of voting age within the country. In other words, we exclude anyone who is not a citizen and anyone who has not attained this age (usually 18 years) on the day of the survey. Also excluded are areas determined to be either inaccessible or not relevant to the study, such as those experiencing armed conflict or natural disasters, as well as national parks and game reserves. As a matter of practice, we have also excluded people living in institutionalized settings, such as students in dormitories and persons in prisons or nursing homes.
What to do about areas experiencing political unrest? On the one hand we want to include them because they are politically important. On the other hand, we want to avoid stretching out the fieldwork over many months while we wait for the situation to settle down. It was agreed at the 2002 Cape Town Planning Workshop that it is difficult to come up with a general rule that will fit all imaginable circumstances. We will therefore make judgments on a case-by-case basis on whether or not to proceed with fieldwork or to exclude or substitute areas of conflict. National Partners are requested to consult Core Partners on any major delays, exclusions or substitutions of this sort.
Sample Design
The sample design is a clustered, stratified, multi-stage, area probability sample.
To repeat the main sampling principle, the objective of the design is to give every sample element (i.e. adult citizen) an equal and known chance of being chosen for inclusion in the sample. We strive to reach this objective by (a) strictly applying random selection methods at every stage of sampling and by (b) applying sampling with probability proportionate to population size wherever possible.
In a series of stages, geographically defined sampling units of decreasing size are selected. To ensure that the sample is representative, the probability of selection at various stages is adjusted as follows:
The sample is stratified by key social characteristics in the population such as sub-national area (e.g. region/province) and residential locality (urban or rural). The area stratification reduces the likelihood that distinctive ethnic or language groups are left out of the sample. And the urban/rural stratification is a means to make sure that these localities are represented in their correct proportions.
Wherever possible, and always in the first stage of sampling, random sampling is conducted with probability proportionate to population size (PPPS). The purpose is to guarantee that larger (i.e., more populated) geographical units have a proportionally greater probability of being chosen into the sample.
The sampling design has four stages:
A first stage to stratify and randomly select primary sampling units;
A second stage to randomly select sampling start-points;
A third stage to randomly choose households;
A final stage involving the random selection of individual respondents.
We shall deal with each of these stages in turn.
STAGE ONE: Selection of Primary Sampling Units (PSUs)
The primary sampling units (PSU's) are the smallest, well-defined geographic units for which reliable population data are available. In most countries, these will be Census Enumeration Areas (or EAs). Most national census data and maps are broken down to the EA level. In the text that follows we will use the acronyms PSU and EA interchangeably because, when census data are employed, they refer to the same unit.
We strongly recommend that NIs use official national census data as the sampling frame for Afrobarometer surveys. Where recent or reliable census data are not available, NIs are asked to inform the relevant Core Partner before they substitute any other demographic data. Where the census is out of date, NIs should consult a demographer to obtain the best possible estimates of population growth rates. These should be applied to the outdated census data in order to make projections of population figures for the year of the survey. It is important to bear in mind that population growth rates vary by area (region) and (especially) between rural and urban localities. Therefore, any projected census data should include adjustments to take such variations into account.
Indeed, we urge NIs to establish collegial working relationships with professionals in the national census bureau, not only to obtain the most recent census data, projections, and maps, but to gain access to sampling expertise. NIs may even commission a census statistician to draw the sample to Afrobarometer specifications, provided that provision for this service has been made in the survey budget.
Regardless of who draws the sample, the NIs should thoroughly acquaint themselves with the strengths and weaknesses of the available census data and the availability and quality of EA maps. The country and methodology reports should cite the exact census data used, its known shortcomings, if any, and any projections made from the data. At minimum, the NI must know the size of the population and the urban/rural population divide in each region in order to specify how to distribute population and PSU's in the first stage of sampling. National investigators should obtain this written data before they attempt to stratify the sample.
Once this data is obtained, the sample population (either 1200 or 2400) should be stratified, first by area (region/province) and then by residential locality (urban or rural). In each case, the proportion of the sample in each locality in each region should be the same as its proportion in the national population as indicated by the updated census figures.
Having stratified the sample, it is then possible to determine how many PSU's should be selected for the country as a whole, for each region, and for each urban or rural locality.
The total number of PSUs to be selected for the whole country is determined by calculating the maximum degree of clustering of interviews one can accept in any PSU. Because PSUs (which are usually geographically small EAs) tend to be socially homogeneous, we do not want to select too many people in any one place. Thus, the Afrobarometer has established a standard of no more than 8 interviews per PSU. For a sample size of 1200, the sample must therefore contain 150 PSUs/EAs (1200 divided by 8). For a sample size of 2400, there must be 300 PSUs/EAs.
These PSUs should then be allocated proportionally to the urban and rural localities within each regional stratum of the sample. Let's take a couple of examples from a country with a sample size of 1200. If the urban locality of Region X in this country constitutes 10 percent of the current national population, then the sample for this stratum should be 15 PSUs (calculated as 10 percent of 150 PSUs). If the rural population of Region Y constitutes 4 percent of the current national population, then the sample for this stratum should be 6 PSU's.
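The arithmetic above can be sketched in a few lines of Python. This is only an illustration of the stated rule (8 interviews per PSU, proportional allocation by population share); the function names and the rounding choice are assumptions, not part of the Afrobarometer protocol, and rounded allocations may need manual adjustment so that they sum to the national total.

```python
# Illustrative sketch only: function names and rounding are assumptions.
MAX_INTERVIEWS_PER_PSU = 8  # standard stated in the text


def total_psus(sample_size, per_psu=MAX_INTERVIEWS_PER_PSU):
    """Number of PSUs needed so that no PSU holds more than `per_psu` interviews."""
    if sample_size % per_psu != 0:
        raise ValueError("sample size must be a multiple of interviews per PSU")
    return sample_size // per_psu


def allocate_psus(sample_size, stratum_shares):
    """Allocate PSUs to strata in proportion to each stratum's population share.

    `stratum_shares` maps a stratum label to its share of the national
    population (0.10 means 10 percent). Results are rounded to whole PSUs.
    """
    n_psus = total_psus(sample_size)
    return {name: round(n_psus * share) for name, share in stratum_shares.items()}


if __name__ == "__main__":
    print(total_psus(1200))  # 150
    print(total_psus(2400))  # 300
    # The two worked examples from the text (labels are hypothetical).
    print(allocate_psus(1200, {"Region X urban": 0.10, "Region Y rural": 0.04}))
```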
The next step is to select particular PSUs/EAs using random methods. Using the above example of the rural localities in Region Y, let us say that you need to pick 6 sample EAs out of a census list that contains a total of 240 rural EAs in Region Y. But which 6? If the EAs created by the national census bureau are of equal or roughly equal population size, then selection is relatively straightforward. Just number all EAs consecutively, then make six selections using a table of random numbers. This procedure, known as simple random sampling (SRS), will
A broad and generalized selection of US Census Bureau 2011-2015 5-year American Community Survey race, ethnicity and citizenship data estimates, obtained via the Census API and joined to the appropriate geometry (in this case, New Mexico Census tracts). The selection is not comprehensive, but allows a first-level characterization of the household income, median household income by race and by age group, Social Security income, the GINI Index, per capita income, median family income, and median household earnings by age, and by education level, in New Mexico. The determination of which estimates to include was based upon level of interest and providing a manageable dataset for users.
The U.S. Census Bureau's American Community Survey (ACS) is a nationwide, continuous survey designed to provide communities with reliable and timely demographic, housing, social, and economic data every year. The ACS collects long-form-type information throughout the decade rather than only once every 10 years. The ACS combines population or housing data from multiple years to produce reliable numbers for small counties, neighborhoods, and other local areas. To provide information for communities each year, the ACS provides 1-, 3-, and 5-year estimates. ACS 5-year estimates (multiyear estimates) are “period” estimates that represent data collected over a 60-month period of time (as opposed to “point-in-time” estimates, such as the decennial census, that approximate the characteristics of an area on a specific date). ACS data are released in the year immediately following the year in which they are collected. ACS estimates based on data collected from 2009–2014 should not be called “2009” or “2014” estimates. Multiyear estimates should be labeled to indicate clearly the full period of time. While the ACS contains margin of error (MOE) information, this dataset does not.
Those individuals requiring more complete data are directed to download the more detailed datasets from the ACS American FactFinder website. This dataset is organized by Census tract boundaries in New Mexico. Census tracts are small, relatively permanent statistical subdivisions of a county or equivalent entity, and were defined by local participants as part of the 2010 Census Participant Statistical Areas Program. The primary purpose of census tracts is to provide a stable set of geographic units for the presentation of census data and comparison back to previous decennial censuses. Census tracts generally have a population size between 1,200 and 8,000 people, with an optimum size of 4,000 people. State and county boundaries always are census tract boundaries in the standard census geographic hierarchy. In a few rare instances, a census tract may consist of noncontiguous areas. These noncontiguous areas may occur where the census tracts are coextensive with all or parts of legal entities that are themselves noncontiguous. For the 2010 Census, the census tract code range of 9400 through 9499 was enforced for census tracts that include a majority American Indian population according to Census 2000 data and/or their area was primarily covered by federally recognized American Indian reservations and/or off-reservation trust lands; the code range 9800 through 9899 was enforced for those census tracts that contained little or no population and represented a relatively large special land use area such as a National Park, military installation, or a business/industrial park; and the code range 9900 through 9998 was enforced for those census tracts that contained only water area, no land area. 
NOTE: A '-666666666' entry indicates that either no sample observations or too few sample observations were available to compute an estimate, or a ratio of medians cannot be calculated because one or both of the median estimates falls in the lowest interval or upper interval of an open-ended distribution.
The 2011 Population and Housing Census is the third national Census to be conducted in Namibia after independence. The first was conducted in 1991, followed by the 2001 Census. Namibia is therefore one of the countries in sub-Saharan Africa that has participated in the 2010 Round of Censuses and followed the international best practice of conducting decennial Censuses, each of which attempts to count and enumerate every person and household in a country every ten years. Surveys, by contrast, collect data from samples of people and/or households.
Censuses provide reliable and critical data on the socio-economic and demographic status of any country. In Namibia, Census data has provided crucial information for development planning and programme implementation. Specifically, the information has assisted in setting benchmarks, formulating policy and the evaluation and monitoring of national development programmes including NDP4, Vision 2030 and several sector programmes. The information has also been used to update the national sampling frame which is used to select samples for household-based surveys, including labour force surveys, demographic and health surveys, household income and expenditure surveys. In addition, Census information will be used to guide the demarcation of Namibia's administrative boundaries where necessary.
At the international level, Census information has been used extensively in monitoring progress towards Namibia's achievement of international targets, particularly the Millennium Development Goals (MDGs).
The latest and most comprehensive Census was conducted in August 2011. Preparations for the Census started in the 2007/2008 financial year under the auspices of the then Central Bureau of Statistics (CBS), which was later transformed into the Namibia Statistics Agency (NSA). The NSA was established under the Statistics Act No. 9 of 2011, with the legal mandate and authority to conduct population Censuses every 10 years. The Census was implemented in three broad phases: pre-enumeration, enumeration and post-enumeration.
During the first, pre-enumeration phase, the activities accomplished included the preparation of a project document, the establishment of Census management and technical committees, and the establishment of the Census cartography unit, which demarcated the Enumeration Areas (EAs). Other activities included the development of Census instruments and tools, such as the questionnaires, manuals and field control forms.
Field staff were recruited, trained and deployed during the initial stages of the enumeration phase. The actual enumeration exercise was undertaken over a period of about three weeks from 28 August to 15 September 2011, while 28 August 2011 was marked as the reference period or 'Census Day'.
The post-enumeration phase started with the sending of completed questionnaires to Head Office and the preparation of summaries for the preliminary report, which was published in April 2012. Processing of the Census data began with manual editing and coding, which focused on the household identification section and un-coded parts of the questionnaire. This was followed by the capturing of data through scanning. Finally, the data were verified and errors corrected where necessary. This took longer than planned due to inadequate technical skills.
National coverage
Households and persons
The sampling universe is defined as all households (private and institutions) in the 2011 Census dataset.
Census/enumeration data [cen]
Sample Design
A stratified random sample was drawn from the household list of the Namibia 2011 Population and Housing Census to create the Public Use Microdata Sample (PUMS) file, with strata defined by constituency and urban/rural area. The sampling universe is defined as all households (private and institutions) in the 2011 Census dataset. Because the urban/rural distinction is a very important factor in Namibia, it was decided to define the strata at the constituency and urban/rural levels. Some constituencies have very few households in their urban or rural part, so the office set a threshold (lower bound) for sampling within a stratum. Based on data analysis, the threshold for a stratum of the PUMS file is 250 households. Thus, all households in constituency and urban/rural areas with fewer than 250 households in total were included in the PUMS file. Otherwise, simple random sampling (SRS) at a 20% sampling rate was applied within each stratum. The sampled households include 93,674 housing units and 418,362 people.
Sample Selection
The PUMS sample is selected from households. The PUMS sample of persons in households is selected by keeping all persons in PUMS households. The sample selection process is performed using the Census and Survey Processing System (CSPro).
The sample selection program first identifies the 7 census strata with fewer than 250 households and the households (private and institutions) with more than 50 people. All households in these strata, and all households of this size, are included in the sample. For the other households, the program randomly generates a number n from 0 to 4. Out of every 5 households, the program selects the nth household to export to the PUMS data file, creating a 20 percent sample of households. Private households and institutions are equally sampled in the PUMS data file.
Note: The 7 census strata with less than 250 households are: Arandis Constituency Rural, Rehoboth East Urban Constituency Rural, Walvis Bay Rural Constituency Rural, Mpungu Constituency Urban, Etayi Constituency Urban, Kalahari Constituency Urban, and Ondobe Constituency Urban.
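The selection rule described above can be sketched in Python as follows. This is not the CSPro program used by the NSA; it is a hedged illustration in which the record layout (`stratum`, `persons`), the function name, and the choice to draw n once and reuse it for every block of 5 households are all assumptions.

```python
import random

# Illustrative sketch only; thresholds come from the text, everything else is assumed.
STRATUM_THRESHOLD = 250     # strata with fewer households are taken in full
LARGE_HOUSEHOLD_SIZE = 50   # households with more people are always taken
INTERVAL = 5                # 1 in 5 households gives a 20 percent sample


def select_pums_households(households, stratum_counts, seed=None):
    """Return the households that would be exported to the PUMS file.

    `households` is a list of dicts with (hypothetical) keys "stratum" and
    "persons"; `stratum_counts` maps each stratum to its household total.
    """
    rng = random.Random(seed)
    n = rng.randrange(INTERVAL)  # random number from 0 to 4 (assumed drawn once)
    selected = []
    position = 0  # counts only households that are subject to sampling
    for household in households:
        small_stratum = stratum_counts[household["stratum"]] < STRATUM_THRESHOLD
        large_household = household["persons"] > LARGE_HOUSEHOLD_SIZE
        if small_stratum or large_household:
            selected.append(household)  # included with certainty
            continue
        if position % INTERVAL == n:
            selected.append(household)  # the nth household in each block of 5
        position += 1
    return selected
```

Because the certainty households bypass the 1-in-5 rule, the overall sampling fraction ends up slightly above 20 percent, which any weighting of the resulting file would need to account for.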
Face-to-face [f2f]
The following questionnaire instruments were used for the Namibia 2011 Population and Housing Census:
Form A (Long Form): For conventional households and residential institutions
Form B1 (Short Form): For special population groups such as persons in transit (travellers), police cells, homeless and off-shore populations
Form B2 (Short Form): For hotels/guesthouses
Form B3 (Short Form): For foreign missions/diplomatic corps
Data editing took place at a number of stages throughout the processing, including:
a) During data collection in the field
b) Manual editing and coding in the office
c) During data entry (primary validation/editing): structure checking and completeness using a Structured Query Language (SQL) program
d) Secondary editing:
i. Imputations of variables
ii. Structural checking in the Census and Survey Processing System (CSPro) program
Sampling Error
The standard errors of survey estimates are needed to evaluate the precision of the survey estimates. Statistical software packages such as SPSS or SAS can estimate the mean and variance of estimates from the survey. Both packages use the Taylor series approach to compute the variance.
Data quality
Great efforts were made to check and ensure that the Census data was of high quality to enhance its credibility and increase its usage. Various quality controls were implemented to ensure relevance, timeliness, accuracy, coherence and proper data interpretation. Other activities undertaken to enhance quality included the demarcation of the country into small enumeration areas to ensure comprehensive coverage; the development of structured Census questionnaires after consultation with government ministries, university expertise and international partners; the preparation of detailed supervisors' and enumerators' instruction manuals to guide field staff during enumeration; the undertaking of comprehensive publicity and advocacy programmes to ensure full Government support and cooperation from the general public; the testing of questionnaires and other procedures; the provision of adequate training and undertaking of intensive supervision using four supervisory layers; the editing of questionnaires at field level; establishing proper mechanisms which ensured that all completed questionnaires were properly accounted for; ensuring intensive verification, validating all information and error corrections; and developing capacity in data processing with support from the international community.
This layer shows Population. This is shown by state and county boundaries. This service contains the 2018-2022 release of American Community Survey (ACS) 5-year data, with estimates and margins of error. There are also additional calculated attributes related to this topic, which can be mapped or used within analysis. This layer is symbolized to show the point by Population Density and size of the point by Total Population. The size of the symbol represents the total count of housing units. Population Density was calculated based on the total population and area of land fields, which both came from the U.S. Census Bureau. Formula used for calculating the population density: B01001_001E / GEO_LAND_AREA_SQ_KM. To see the full list of attributes available in this service, go to the "Data" tab, and choose "Fields" at the top right.
Current Vintage: 2018-2022
ACS Table(s): B01001, B09020
Data downloaded from: Census Bureau's API for American Community Survey
Date of API call: January 18, 2024
National Figures: data.census.gov
The United States Census Bureau's American Community Survey (ACS): About the Survey, Geography & ACS, Technical Documentation, News & Updates
This ready-to-use layer can be used within ArcGIS Pro, ArcGIS Online, its configurable apps, dashboards, Story Maps, custom apps, and mobile apps. Data can also be exported for offline workflows. Please cite the Census and ACS when using this data.
Data Note from the Census:
Data are based on a sample and are subject to sampling variability. The degree of uncertainty for an estimate arising from sampling variability is represented through the use of a margin of error. The value shown here is the 90 percent margin of error. The margin of error can be interpreted as providing a 90 percent probability that the interval defined by the estimate minus the margin of error and the estimate plus the margin of error (the lower and upper confidence bounds) contains the true value.
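As a hedged illustration of the density formula and the margin-of-error interpretation above, the Python sketch below computes population density from the two fields named in the text and derives the lower and upper confidence bounds from an estimate and its 90 percent MOE. The function names are assumptions. The 1.645 divisor for converting a 90 percent MOE to a standard error is the factor the Census Bureau documents for ACS estimates; it is not stated in this layer description.

```python
# Illustrative sketch only: function names are assumptions.
Z_90 = 1.645  # z-score the Census Bureau uses for published 90 percent ACS MOEs


def population_density(total_population, land_area_sq_km):
    """B01001_001E / GEO_LAND_AREA_SQ_KM; None when land area is missing or zero."""
    if not land_area_sq_km or land_area_sq_km <= 0:
        return None
    return total_population / land_area_sq_km


def confidence_bounds(estimate, moe):
    """Lower and upper 90 percent confidence bounds: estimate -/+ margin of error."""
    return estimate - moe, estimate + moe


def standard_error(moe_90):
    """Convert a 90 percent margin of error to a standard error."""
    return moe_90 / Z_90


if __name__ == "__main__":
    # Made-up numbers, for illustration only.
    print(population_density(1000, 50))   # 20.0 people per square km
    print(confidence_bounds(1000, 100))   # (900, 1100)
```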
In addition to sampling variability, the ACS estimates are subject to nonsampling error (for a discussion of nonsampling variability, see Accuracy of the Data). The effect of nonsampling error is not represented in these tables.
Data Processing Notes:
Boundaries come from the Cartographic Boundaries via US Census TIGER geodatabases. Boundaries are updated at the same time as the data updates, and the boundary vintage appropriately matches the data vintage as specified by the Census. These are Census boundaries with water and/or coastlines clipped for cartographic purposes. For state and county boundaries, the water and coastlines are derived from the coastlines of the 500k TIGER Cartographic Boundary Shapefiles. The original AWATER and ALAND fields are still available as attributes within the data table (units are square meters). The States layer contains 52 records - all US states, Washington D.C., and Puerto Rico. The Counties (and equivalent) layer contains 3221 records - all counties and equivalent, Washington D.C., and Puerto Rico municipios. See Areas Published. Percentages and derived counts, and associated margins of error, are calculated values (that can be identified by the "_calc_" stub in the field name), and abide by the specifications defined by the American Community Survey.
Field alias names were created based on the Table Shells.
Margin of error (MOE) values of -555555555 in the API (or "*****" (five asterisks) on data.census.gov) are displayed as 0 in this dataset. The estimates associated with these MOEs have been controlled to independent counts in the ACS weighting and have zero sampling error. So, the MOEs are effectively zeroes, and are treated as zeroes in MOE calculations. Other negative values on the API, such as -222222222, -666666666, -888888888, and -999999999, all represent estimates or MOEs that can't be calculated or can't be published, usually due to small sample sizes. All of these are rendered in this dataset as null (blank) values.
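The sentinel handling described above could look like the following Python sketch when applied to raw API values. The function and constant names are hypothetical, and the set of "unavailable" codes contains only the four values named in the text; since the text says "such as", the raw API may use other codes that this sketch leaves untouched.

```python
# Illustrative sketch only: names are hypothetical; codes are those named in the text.
CONTROLLED_MOE = -555555555  # estimate is controlled, so the MOE is treated as 0
UNAVAILABLE_CODES = {-222222222, -666666666, -888888888, -999999999}


def clean_acs_value(value):
    """Map raw API sentinel values to 0 or None; pass real values through."""
    if value is None:
        return None
    if value == CONTROLLED_MOE:
        return 0
    if value in UNAVAILABLE_CODES:
        return None  # rendered as null (blank) in the dataset
    return value


if __name__ == "__main__":
    raw = [1520, -555555555, -666666666, 87]
    print([clean_acs_value(v) for v in raw])  # [1520, 0, None, 87]
```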
This layer contains 2010-2014 American Community Survey (ACS) 5-year data, and contains estimates and margins of error. The layer shows tenure (owner or renter) by race of householder. This is shown by tract, county, and state boundaries. There are also additional calculated attributes related to this topic, which can be mapped or used within analysis. This layer is symbolized by the overall homeownership rate. To see the full list of attributes available in this service, go to the "Data" tab, and choose "Fields" at the top right.
Vintage: 2010-2014
ACS Table(s): B25003, B25003B, B25003C, B25003D, B25003E, B25003F, B25003G, B25003H, B25003I
Data downloaded from: Census Bureau's API for American Community Survey
Date of API call: November 11, 2020
National Figures: data.census.gov
The United States Census Bureau's American Community Survey (ACS): About the Survey, Geography & ACS, Technical Documentation, News & Updates
This ready-to-use layer can be used within ArcGIS Pro, ArcGIS Online, its configurable apps, dashboards, Story Maps, custom apps, and mobile apps. Data can also be exported for offline workflows. For more information about ACS layers, visit the FAQ. Please cite the Census and ACS when using this data.
Data Note from the Census:
Data are based on a sample and are subject to sampling variability. The degree of uncertainty for an estimate arising from sampling variability is represented through the use of a margin of error. The value shown here is the 90 percent margin of error. The margin of error can be interpreted as providing a 90 percent probability that the interval defined by the estimate minus the margin of error and the estimate plus the margin of error (the lower and upper confidence bounds) contains the true value. In addition to sampling variability, the ACS estimates are subject to nonsampling error (for a discussion of nonsampling variability, see Accuracy of the Data).
The effect of nonsampling error is not represented in these tables.
Data Processing Notes:
This layer has associated layers containing the most recent ACS data available by the U.S. Census Bureau. Click here to learn more about ACS data releases and click here for the associated boundaries layer. The reason this data is 5+ years different from the most recent vintage is due to the overlapping of survey years. It is recommended by the U.S. Census Bureau to compare non-overlapping datasets.
Boundaries come from the US Census TIGER geodatabases. Boundary vintage (2014) appropriately matches the data vintage as specified by the Census. These are Census boundaries with water and/or coastlines clipped for cartographic purposes. For census tracts, the water cutouts are derived from a subset of the 2010 AWATER (Area Water) boundaries offered by TIGER. For state and county boundaries, the water and coastlines are derived from the coastlines of the 500k TIGER Cartographic Boundary Shapefiles. The original AWATER and ALAND fields are still available as attributes within the data table (units are square meters). The States layer contains 52 records - all US states, Washington D.C., and Puerto Rico.
Census tracts with no population that occur in areas of water, such as oceans, are removed from this data service (Census Tracts beginning with 99).
Percentages and derived counts, and associated margins of error, are calculated values (that can be identified by the "_calc_" stub in the field name), and abide by the specifications defined by the American Community Survey.
Field alias names were created based on the Table Shells file available from the American Community Survey Summary File Documentation page.
Negative values (e.g., -4444...) have been set to null, with the exception of -5555... which has been set to zero.
These negative values exist in the raw API data to indicate the following situations:
The margin of error column indicates that either no sample observations or too few sample observations were available to compute a standard error and thus the margin of error. A statistical test is not appropriate.
Either no sample observations or too few sample observations were available to compute an estimate, or a ratio of medians cannot be calculated because one or both of the median estimates falls in the lowest interval or upper interval of an open-ended distribution.
The median falls in the lowest interval of an open-ended distribution, or in the upper interval of an open-ended distribution. A statistical test is not appropriate.
The estimate is controlled. A statistical test for sampling variability is not appropriate.
The data for this geographic area cannot be displayed because the number of sample cases is too small.
This layer shows Population. This is shown by state and county boundaries. This service contains the 2017-2021 release of American Community Survey (ACS) 5-year data, with estimates and margins of error. There are also additional calculated attributes related to this topic, which can be mapped or used within analysis. This layer is symbolized to show the point by Population Density and size of the point by Total Population. The size of the symbol represents the total count of housing units. Population Density was calculated based on the total population and area of land fields, which both came from the U.S. Census Bureau. Formula used for calculating the population density: B01001_001E / GEO_LAND_AREA_SQ_KM. To see the full list of attributes available in this service, go to the "Data" tab, and choose "Fields" at the top right.
Current Vintage: 2017-2021
ACS Table(s): B01001, B09020
Data downloaded from: Census Bureau's API for American Community Survey
Date of API call: February 16, 2023
National Figures: data.census.gov
The United States Census Bureau's American Community Survey (ACS): About the Survey, Geography & ACS, Technical Documentation, News & Updates
This ready-to-use layer can be used within ArcGIS Pro, ArcGIS Online, its configurable apps, dashboards, Story Maps, custom apps, and mobile apps. Data can also be exported for offline workflows. Please cite the Census and ACS when using this data.
Data Note from the Census:
Data are based on a sample and are subject to sampling variability. The degree of uncertainty for an estimate arising from sampling variability is represented through the use of a margin of error. The value shown here is the 90 percent margin of error. The margin of error can be interpreted as providing a 90 percent probability that the interval defined by the estimate minus the margin of error and the estimate plus the margin of error (the lower and upper confidence bounds) contains the true value.
In addition to sampling variability, the ACS estimates are subject to nonsampling error (for a discussion of nonsampling variability, see Accuracy of the Data). The effect of nonsampling error is not represented in these tables.
Data Processing Notes:
Boundaries come from the Cartographic Boundaries via US Census TIGER geodatabases. Boundaries are updated at the same time as the data updates, and the boundary vintage appropriately matches the data vintage as specified by the Census. These are Census boundaries with water and/or coastlines clipped for cartographic purposes. For state and county boundaries, the water and coastlines are derived from the coastlines of the 500k TIGER Cartographic Boundary Shapefiles. The original AWATER and ALAND fields are still available as attributes within the data table (units are square meters). The States layer contains 52 records - all US states, Washington D.C., and Puerto Rico. The Counties (and equivalent) layer contains 3221 records - all counties and equivalent, Washington D.C., and Puerto Rico municipios. See Areas Published. Percentages and derived counts, and associated margins of error, are calculated values (that can be identified by the "_calc_" stub in the field name), and abide by the specifications defined by the American Community Survey.
Field alias names were created based on the Table Shells.
Margin of error (MOE) values of -555555555 in the API (or "*****" (five asterisks) on data.census.gov) are displayed as 0 in this dataset. The estimates associated with these MOEs have been controlled to independent counts in the ACS weighting and have zero sampling error. So, the MOEs are effectively zeroes, and are treated as zeroes in MOE calculations. Other negative values on the API, such as -222222222, -666666666, -888888888, and -999999999, all represent estimates or MOEs that can't be calculated or can't be published, usually due to small sample sizes. All of these are rendered in this dataset as null (blank) values.
The National Labor Force Survey (SAKERNAS) is designed to observe the general situation of the workforce and to determine whether the structure of the workforce changed between enumeration periods. Since the survey was initiated in 1976, it has undergone a series of changes affecting its coverage, the frequency of enumeration, the number of households sampled and the type of information collected. It is the largest and most representative source of employment data in Indonesia. For each selected household, general information about each household member was collected, including name, relationship to head of household, sex, and age. Household members aged 10 years and over were then asked about their marital status, education and employment.
SAKERNAS aims to gather information that meets three objectives: 1. Employment by education, working hours, industrial classification and employment status, 2. Unemployment and underemployment by different characteristics and efforts in looking for work, 3. Working-age population not in the labor force (e.g. attending school, doing housekeeping and others).
The data for quarterly SAKERNAS 1990 were gathered through quarterly surveys covering all provinces in Indonesia, with 82,080 households in both rural and urban areas, and are representative down to the provincial level. The main household data are taken from the core questionnaire SAK90-AK.
National coverage*, including urban and rural areas; representative down to the provincial level.
*) Although the survey covers all of Indonesia, in some circumstances not all provinces were covered. For example, in 2000 the Province of Maluku was excluded from SAKERNAS because of horizontal conflicts there. The separation of East Timor from Indonesia in 1999 also changed the scope of SAKERNAS in subsequent years. Later, as a consequence of the expansion of regional autonomy, the proportion of the sample per province also changed, as in 2006, when there were already 33 provinces. However, these differences affect only the scope and level of coverage, not the pattern. On the other hand, changes in methodology (including sample size) over time are likely to affect the results; for example, in 2000 and 2001, when the sample sizes were only 32,384 and 34,176, the data were representative only down to the island level (the sample was too small to be representative at the provincial level).
Individual
The survey covered all de jure household members (usual residents) aged 10 years and over who reside in the household. However, Diplomatic Corps households, households in specific enumeration areas and specific households in regular enumeration areas were not selected for the sample.
Sample survey data
Quarterly SAKERNAS 1990 was implemented throughout the territory of the Republic of Indonesia, with a total sample of about 82,080 households in both rural and urban areas, representative down to the provincial level. Diplomatic Corps households, households in specific enumeration areas and specific households in regular enumeration areas were not selected for the sample. The dataset contains the combined sample data from the four rounds of quarterly SAKERNAS in 1990, i.e. quarters I, II, III, and IV.
The sampling method* for quarterly SAKERNAS 1990 is a three-stage cluster sampling design (using similar processes for rural and urban areas), with the segment group (kelseg) as the primary sampling unit (PSU) and the household as the ultimate sampling unit. PSUs were selected with probability proportional to size from the chosen enumeration areas (wilcah). A number of households were then taken randomly from the selected PSUs. However, there is documentation explaining how the sample size was determined at the domain level, the stratification measures that were implemented, the sample size allocation across strata, and detailed information about sample frame formation**.
In SAKERNAS 1990-2000 the terms "census block" and "sub-block census" are not used; the terms "enumeration area" (wilcah) and "segment group" (kelseg) are used instead. The sample design uses the same selection procedure in urban and rural areas, in these steps: 1. In the first stage, a number of enumeration areas are selected systematically from the KCI (Master Framework of Examples) by PPS to MFD (Master Framework Desa/Kerangka Induk). 2. In the second stage, segment groups are selected from each selected enumeration area by PPS with respect to the number of households. 3. In the third stage, some households are selected systematically from each segment group previously chosen.
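The three selection stages can be illustrated with a small Python sketch. It is not the survey's actual procedure: the frame layout (a dictionary of enumeration areas, each holding segment groups with household lists), the function names, the sample counts, and the use of systematic PPS at the second stage are all assumptions made for the example.

```python
import random


def pps_systematic(units, sizes, n, rng):
    """Systematic selection of n units with probability proportional to size.

    A unit whose size exceeds the sampling interval can be selected twice.
    """
    step = sum(sizes) / n
    start = rng.random() * step                  # random start in [0, step)
    points = [start + i * step for i in range(n)]
    chosen, cumulative, idx = [], 0, 0
    for unit, size in zip(units, sizes):
        cumulative += size
        while idx < n and points[idx] < cumulative:
            chosen.append(unit)
            idx += 1
    return chosen


def systematic_sample(items, n, rng):
    """Equal-probability systematic selection of n items from a list."""
    step = len(items) / n
    start = rng.random() * step
    return [items[min(int(start + i * step), len(items) - 1)] for i in range(n)]


def three_stage_sample(frame, n_areas, n_segments, n_households, seed=0):
    """Stage 1: areas by PPS; stage 2: segments by PPS; stage 3: households."""
    rng = random.Random(seed)
    area_ids = list(frame)
    area_sizes = [sum(len(hh) for hh in frame[a].values()) for a in area_ids]
    sample = []
    for area in pps_systematic(area_ids, area_sizes, n_areas, rng):
        segment_ids = list(frame[area])
        segment_sizes = [len(frame[area][s]) for s in segment_ids]
        for seg in pps_systematic(segment_ids, segment_sizes, n_segments, rng):
            for hh in systematic_sample(frame[area][seg], n_households, rng):
                sample.append((area, seg, hh))
    return sample


# Hypothetical frame: 6 enumeration areas, 4 segment groups each.
frame = {
    f"wilcah{a}": {
        f"kelseg{s}": [f"hh{a}-{s}-{h}" for h in range(10 + a + s)]
        for s in range(4)
    }
    for a in range(6)
}
sample = three_stage_sample(frame, n_areas=2, n_segments=2, n_households=3)
```

Size here is measured by the number of households at both PPS stages, which matches the second-stage description; the size measure used at the first stage is not stated in the text, so that choice is a guess.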
Establishment of the sample frames is in three stages, namely: 1. Establishment of the sample frame used in the selection of enumeration area, 2. Establishment of the sample frame for the selection of segment group, 3. Establishment of the sample frame for the selection of households.
*) The sampling method varies between years. For example, SAKERNAS 1986-1989 used a rotation method, in which most of the households selected in one period were selected again in the following period. This was common in quarterly SAKERNAS during that period. Other periods often used multi-stage sampling (two or three stages, depending on whether the census sub-block / segment group was included), or a combination of multi-stage sampling and rotation (e.g. SAKERNAS 2006-2010).
**) Commonly, the annual SAKERNAS sample frame comes from the most recent population census conducted before SAKERNAS. For example, annual SAKERNAS 2003 used a sample frame derived from the household listing process of Population Census 2000. The frame can also come from other periodic household-based censuses, such as the Agricultural Census or Economic Census; e.g. the census block sample frame of SAKERNAS 2007 was formed using the Economic Census 2006 results. On the other hand, the sample frame for quarterly SAKERNAS comes from the list of households obtained from the National Socio-Economic Survey (SUSENAS) Core activities held before SAKERNAS; e.g. quarterly SAKERNAS 2002/2003 used a sample frame derived from households in the selected districts of SUSENAS 2002.
Face-to-face
In SAKERNAS, the questionnaire has been designed to be simple and concise. Respondents are expected to understand the aim of each survey question, which helps avoid memory lapses and loss of interest during data collection. Furthermore, the design of the SAKERNAS questionnaire remains stable in order to maintain data comparability.
A household questionnaire was administered in each selected household, which collected general information of household members that includes name, relationship with head of the household, sex and age. Household members aged 10 years and over were then asked about their marital status, education and occupation.
Data processing in SAKERNAS goes through the following stages: - Batching - Editing - Coding - Data Entry - Validation - Tabulation
Sampling error results are presented at the end of the publication of The State of Labor Force in Indonesia and in publication of The State of Workers in Indonesia.
This layer shows Households by Type. This is shown by state and county boundaries. This service contains the 2018-2022 release of data from the American Community Survey (ACS) 5-year data, and contains estimates and margins of error. There are also additional calculated attributes related to this topic, which can be mapped or used within analysis. This layer is symbolized to show Average Household Size and the Total Households in a bi-variate map. To see the full list of attributes available in this service, go to the "Data" tab, and choose "Fields" at the top right. Current Vintage: 2018-2022. ACS Table(s): B11001, B25010, B25044, DP02, DP04. Data downloaded from: Census Bureau's API for American Community Survey. Date of API call: January 18, 2024. National Figures: data.census.gov. The United States Census Bureau's American Community Survey (ACS): About the Survey, Geography & ACS, Technical Documentation, News & Updates. This ready-to-use layer can be used within ArcGIS Pro, ArcGIS Online, its configurable apps, dashboards, Story Maps, custom apps, and mobile apps. Data can also be exported for offline workflows. Please cite the Census and ACS when using this data. Data Note from the Census: Data are based on a sample and are subject to sampling variability. The degree of uncertainty for an estimate arising from sampling variability is represented through the use of a margin of error. The value shown here is the 90 percent margin of error. The margin of error can be interpreted as providing a 90 percent probability that the interval defined by the estimate minus the margin of error and the estimate plus the margin of error (the lower and upper confidence bounds) contains the true value. In addition to sampling variability, the ACS estimates are subject to nonsampling error (for a discussion of nonsampling variability, see Accuracy of the Data). 
The effect of nonsampling error is not represented in these tables. Data Processing Notes: Boundaries come from the Cartographic Boundaries via US Census TIGER geodatabases. Boundaries are updated at the same time as the data updates, and the boundary vintage appropriately matches the data vintage as specified by the Census. These are Census boundaries with water and/or coastlines clipped for cartographic purposes. For state and county boundaries, the water and coastlines are derived from the coastlines of the 500k TIGER Cartographic Boundary Shapefiles. The original AWATER and ALAND fields are still available as attributes within the data table (units are square meters). The States layer contains 52 records - all US states, Washington D.C., and Puerto Rico. The Counties (and equivalent) layer contains 3221 records - all counties and equivalent, Washington D.C., and Puerto Rico municipios. See Areas Published. Percentages and derived counts, and associated margins of error, are calculated values (that can be identified by the "_calc_" stub in the field name), and abide by the specifications defined by the American Community Survey. Field alias names were created based on the Table Shells. Margin of error (MOE) values of -555555555 in the API (or "*****" (five asterisks) on data.census.gov) are displayed as 0 in this dataset. The estimates associated with these MOEs have been controlled to independent counts in the ACS weighting and have zero sampling error. So, the MOEs are effectively zeroes, and are treated as zeroes in MOE calculations. Other negative values on the API, such as -222222222, -666666666, -888888888, and -999999999, all represent estimates or MOEs that can't be calculated or can't be published, usually due to small sample sizes. All of these are rendered in this dataset as null (blank) values.
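For readers combining fields from this layer, the following Python sketch shows how a margin of error for a derived sum is commonly approximated in ACS guidance: the square root of the sum of squared component MOEs, with a zero MOE contributing nothing. The function names are hypothetical, and whether the layer's "_calc_" fields use exactly this approximation is an assumption. The 1.645 divisor is the usual factor for converting a 90 percent MOE to a standard error.

```python
import math


def moe_of_sum(moes):
    """Approximate 90 percent MOE of a sum of estimates.

    A zero MOE (a controlled estimate) adds nothing to the total.
    """
    return math.sqrt(sum(m * m for m in moes))


def moe_to_standard_error(moe, z=1.645):
    """Convert a 90 percent margin of error to a standard error."""
    return moe / z
```

For example, three household-type counts with MOEs of 30, 40 and 0 would give a combined MOE of 50.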
How does your organization use this dataset? What other NYSERDA or energy-related datasets would you like to see on Open NY? Let us know by emailing OpenNY@nyserda.ny.gov.
The Low- to Moderate-Income (LMI) New York State (NYS) Census Population Analysis dataset is resultant from the LMI market database designed by APPRISE as part of the NYSERDA LMI Market Characterization Study (https://www.nyserda.ny.gov/lmi-tool). All data are derived from the U.S. Census Bureau’s American Community Survey (ACS) 1-year Public Use Microdata Sample (PUMS) files for 2013, 2014, and 2015.
Each row in the LMI dataset is an individual record for a household that responded to the survey and each column is a variable of interest for analyzing the low- to moderate-income population.
The LMI dataset includes: county/county group, households with elderly, households with children, economic development region, income groups, percent of poverty level, low- to moderate-income groups, household type, non-elderly disabled indicator, race/ethnicity, linguistic isolation, housing unit type, owner-renter status, main heating fuel type, home energy payment method, housing vintage, LMI study region, LMI population segment, mortgage indicator, time in home, head of household education level, head of household age, and household weight.
The LMI NYS Census Population Analysis dataset is intended for users who want to explore the underlying data that supports the LMI Analysis Tool. The majority of those interested in LMI statistics and generating custom charts should use the interactive LMI Analysis Tool at https://www.nyserda.ny.gov/lmi-tool. This underlying LMI dataset is intended for users with experience working with survey data files and producing weighted survey estimates using statistical software packages (such as SAS, SPSS, or Stata).
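As a rough illustration of what producing weighted survey estimates means for a file like this, the Python sketch below sums the household weight within each category and divides by the weighted total. The column names `household_weight` and `income_group` are placeholders, not confirmed field names from the dataset, and the sketch produces point estimates only (no standard errors).

```python
from collections import defaultdict


def weighted_shares(records, group_key, weight_key="household_weight"):
    """Weighted share of households in each category of group_key."""
    totals = defaultdict(float)
    for record in records:
        totals[record[group_key]] += record[weight_key]
    grand_total = sum(totals.values())
    return {group: weight / grand_total for group, weight in totals.items()}


# Hypothetical rows: each row is one responding household.
rows = [
    {"income_group": "low", "household_weight": 100.0},
    {"income_group": "moderate", "household_weight": 250.0},
    {"income_group": "moderate", "household_weight": 50.0},
]
shares = weighted_shares(rows, "income_group")
```

Counting rows without the weight would describe the survey respondents rather than the household population they represent.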
This layer shows Population by Age and Sex. This is shown by state and county boundaries. This service contains the 2018-2022 release of data from the American Community Survey (ACS) 5-year data, and contains estimates and margins of error. There are also additional calculated attributes related to this topic, which can be mapped or used within analysis. This layer is symbolized to show the Total population ages 65 and over. To see the full list of attributes available in this service, go to the "Data" tab, and choose "Fields" at the top right. Current Vintage: 2018-2022. ACS Table(s): B01001, B01002, DP05. Data downloaded from: Census Bureau's API for American Community Survey. Date of API call: January 18, 2024. National Figures: data.census.gov. The United States Census Bureau's American Community Survey (ACS): About the Survey, Geography & ACS, Technical Documentation, News & Updates. This ready-to-use layer can be used within ArcGIS Pro, ArcGIS Online, its configurable apps, dashboards, Story Maps, custom apps, and mobile apps. Data can also be exported for offline workflows. Please cite the Census and ACS when using this data. Data Note from the Census: Data are based on a sample and are subject to sampling variability. The degree of uncertainty for an estimate arising from sampling variability is represented through the use of a margin of error. The value shown here is the 90 percent margin of error. The margin of error can be interpreted as providing a 90 percent probability that the interval defined by the estimate minus the margin of error and the estimate plus the margin of error (the lower and upper confidence bounds) contains the true value. In addition to sampling variability, the ACS estimates are subject to nonsampling error (for a discussion of nonsampling variability, see Accuracy of the Data). The effect of nonsampling error is not represented in these tables. Data Processing Notes: Boundaries come from the Cartographic Boundaries via US Census TIGER geodatabases. 
Boundaries are updated at the same time as the data updates, and the boundary vintage appropriately matches the data vintage as specified by the Census. These are Census boundaries with water and/or coastlines clipped for cartographic purposes. For state and county boundaries, the water and coastlines are derived from the coastlines of the 500k TIGER Cartographic Boundary Shapefiles. The original AWATER and ALAND fields are still available as attributes within the data table (units are square meters). The States layer contains 52 records - all US states, Washington D.C., and Puerto Rico. The Counties (and equivalent) layer contains 3221 records - all counties and equivalent, Washington D.C., and Puerto Rico municipios. See Areas Published. Percentages and derived counts, and associated margins of error, are calculated values (that can be identified by the "_calc_" stub in the field name), and abide by the specifications defined by the American Community Survey. Field alias names were created based on the Table Shells. Margin of error (MOE) values of -555555555 in the API (or "*****" (five asterisks) on data.census.gov) are displayed as 0 in this dataset. The estimates associated with these MOEs have been controlled to independent counts in the ACS weighting and have zero sampling error. So, the MOEs are effectively zeroes, and are treated as zeroes in MOE calculations. Other negative values on the API, such as -222222222, -666666666, -888888888, and -999999999, all represent estimates or MOEs that can't be calculated or can't be published, usually due to small sample sizes. All of these are rendered in this dataset as null (blank) values.
Age, Sex, Race, Ethnicity, Total Housing Units, and Voting Age Population. This service is updated annually with American Community Survey (ACS) 1-year data. Contact: District of Columbia, Office of Planning. Email: planning@dc.gov. Geography: District-wide. Current Vintage: 2023. ACS Table(s): DP05. Data downloaded from: Census Bureau's API for American Community Survey. Date of API call: January 3, 2025. National Figures: data.census.gov. Please cite the Census and ACS when using this data. Data Note from the Census: Data are based on a sample and are subject to sampling variability. The degree of uncertainty for an estimate arising from sampling variability is represented through the use of a margin of error. The value shown here is the 90 percent margin of error. The margin of error can be interpreted as providing a 90 percent probability that the interval defined by the estimate minus the margin of error and the estimate plus the margin of error (the lower and upper confidence bounds) contains the true value. In addition to sampling variability, the ACS estimates are subject to nonsampling error (for a discussion of nonsampling variability, see Accuracy of the Data). The effect of nonsampling error is not represented in these tables. Data Processing Notes: This layer is updated automatically when the most current vintage of ACS data is released each year, usually in December. The layer always contains the latest available ACS 5-year estimates. It is updated annually within days of the Census Bureau's release schedule. Boundaries come from the US Census TIGER geodatabases. Boundaries are updated at the same time as the data updates (annually), and the boundary vintage appropriately matches the data vintage as specified by the Census. These are Census boundaries with water and/or coastlines clipped for cartographic purposes. For census tracts, the water cutouts are derived from a subset of the 2020 AWATER (Area Water) boundaries offered by TIGER. 
For state and county boundaries, the water and coastlines are derived from the coastlines of the 500k TIGER Cartographic Boundary Shapefiles. The original AWATER and ALAND fields are still available as attributes within the data table (units are square meters). Field alias names were created based on the Table Shells file available from the American Community Survey Summary File Documentation page. Data processed using R statistical package and ArcGIS Desktop. Margin of Error was not included in this layer but is available from the Census Bureau. Contact the Office of Planning for more information about obtaining Margin of Error values.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Analysis of ‘California Housing Data (1990)’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://www.kaggle.com/harrywang/housing on 12 November 2021.
--- Dataset description provided by original source is as follows ---
This is the dataset used in this book: https://github.com/ageron/handson-ml/tree/master/datasets/housing to illustrate a sample end-to-end ML project workflow (pipeline). This is a great book - I highly recommend!
The data is based on California Census in 1990.
"This dataset is a modified version of the California Housing dataset available from Luís Torgo's page (University of Porto). Luís Torgo obtained it from the StatLib repository (which is closed now). The dataset may also be downloaded from StatLib mirrors.
The following is the description from the book author:
This dataset appeared in a 1997 paper titled Sparse Spatial Autoregressions by Pace, R. Kelley and Ronald Barry, published in the Statistics and Probability Letters journal. They built it using the 1990 California census data. It contains one row per census block group. A block group is the smallest geographical unit for which the U.S. Census Bureau publishes sample data (a block group typically has a population of 600 to 3,000 people).
The dataset in this directory is almost identical to the original, with two differences: 207 values were randomly removed from the total_bedrooms column, so we can discuss what to do with missing data. An additional categorical attribute called ocean_proximity was added, indicating (very roughly) whether each block group is near the ocean, near the Bay area, inland or on an island. This allows discussing what to do with categorical data. Note that the block groups are called "districts" in the Jupyter notebooks, simply because in some contexts the name "block group" was confusing."
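The two modifications invite two preprocessing steps: filling the missing total_bedrooms values and encoding ocean_proximity. The Python sketch below shows one simple option for each (median imputation and one-hot encoding). It is an illustration, not the book's code, and the category labels in the sample rows are made up for the example.

```python
from statistics import median


def impute_median(rows, column):
    """Replace None in `column` with the median of the observed values."""
    observed = [r[column] for r in rows if r[column] is not None]
    fill = median(observed)
    return [{**r, column: fill if r[column] is None else r[column]} for r in rows]


def one_hot(rows, column):
    """Replace a categorical column with one 0/1 column per category."""
    categories = sorted({r[column] for r in rows})
    encoded = []
    for r in rows:
        new_row = {k: v for k, v in r.items() if k != column}
        for category in categories:
            new_row[f"{column}_{category}"] = 1 if r[column] == category else 0
        encoded.append(new_row)
    return encoded


# Made-up rows; labels are placeholders for the real category values.
rows = [
    {"total_bedrooms": 100, "ocean_proximity": "INLAND"},
    {"total_bedrooms": None, "ocean_proximity": "ISLAND"},
    {"total_bedrooms": 300, "ocean_proximity": "INLAND"},
]
prepared = one_hot(impute_median(rows, "total_bedrooms"), "ocean_proximity")
```

Dropping the incomplete rows or the whole column are the other usual options for missing values; the median is used here only because it is robust to skewed counts.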
http://www.dcc.fc.up.pt/%7Eltorgo/Regression/cal_housing.html
This is a dataset obtained from the StatLib repository. Here is the included description:
"We collected information on the variables using all the block groups in California from the 1990 Census. In this sample a block group on average includes 1425.5 individuals living in a geographically compact area. Naturally, the geographical area included varies inversely with the population density. We computed distances among the centroids of each block group as measured in latitude and longitude. We excluded all the block groups reporting zero entries for the independent and dependent variables. The final data contained 20,640 observations on 9 variables. The dependent variable is ln(median house value)."
--- Original source retains full ownership of the source dataset ---
analyze the current population survey (cps) annual social and economic supplement (asec) with r the annual march cps-asec has been supplying the statistics for the census bureau's report on income, poverty, and health insurance coverage since 1948. wow. the us census bureau and the bureau of labor statistics (bls) tag-team on this one. until the american community survey (acs) hit the scene in the early aughts (2000s), the current population survey had the largest sample size of all the annual general demographic data sets outside of the decennial census - about two hundred thousand respondents. this provides enough sample to conduct state- and a few large metro area-level analyses. your sample size will vanish if you start investigating subgroups by state - consider pooling multiple years. county-level is a no-no. despite the american community survey's larger size, the cps-asec contains many more variables related to employment, sources of income, and insurance - and can be trended back to harry truman's presidency. aside from questions specifically asked about an annual experience (like income), many of the questions in this march data set should be treated as point-in-time statistics. cps-asec generalizes to the united states non-institutional, non-active duty military population. the national bureau of economic research (nber) provides sas, spss, and stata importation scripts to create a rectangular file (rectangular data means only person-level records; household- and family-level information gets attached to each person). to import these files into r, the parse.SAScii function uses nber's sas code to determine how to import the fixed-width file, then RSQLite to put everything into a schnazzy database. you can try reading through the nber march 2012 sas importation code yourself, but it's a bit of a proc freak show. 
this new github repository contains three scripts:
2005-2012 asec - download all microdata.R: download the fixed-width file containing household, family, and person records; import by separating this file into three tables, then merge 'em together at the person-level; download the fixed-width file containing the person-level replicate weights; merge the rectangular person-level file with the replicate weights, then store it in a sql database; create a new variable - one - in the data table.
2012 asec - analysis examples.R: connect to the sql database created by the 'download all microdata' program; create the complex sample survey object, using the replicate weights; perform a boatload of analysis examples.
replicate census estimates - 2011.R: connect to the sql database created by the 'download all microdata' program; create the complex sample survey object, using the replicate weights; match the sas output shown in the png file below.
2011 asec replicate weight sas output.png: statistic and standard error generated from the replicate-weighted example sas script contained in this census-provided person replicate weights usage instructions document.
click here to view these three scripts.
for more detail about the current population survey - annual social and economic supplement (cps-asec), visit: the census bureau's current population survey page; the bureau of labor statistics' current population survey page; the current population survey's wikipedia article.
notes: interviews are conducted in march about experiences during the previous year. the file labeled 2012 includes information (income, work experience, health insurance) pertaining to 2011. when you use the current population survey to talk about america, subtract a year from the data file name. as of the 2010 file (the interview focusing on america during 2009), the cps-asec contains exciting new medical out-of-pocket spending variables most useful for supplemental (medical spending-adjusted) poverty research.
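to make the replicate-weight idea concrete, here is a minimal python sketch (python rather than r, purely for illustration) of how a standard error falls out of replicate weights: compute the statistic once with the full-sample weight, once per replicate weight, and scale the sum of squared differences. the function names are hypothetical, the repository's scripts are not reproduced here, and the multiplier is left as a parameter. the census usage document is the authority on its value; 4/160 for 160 replicate weights is an assumption in this sketch.

```python
import math


def weighted_total(values, weights):
    """weighted sum of a variable, e.g. the constant `one` for a head count."""
    return sum(v * w for v, w in zip(values, weights))


def replicate_standard_error(values, full_weights, replicate_weights, factor):
    """estimate plus standard error from a set of replicate weights.

    factor: variance multiplier (assumed to be 4 / 160 for the asec
    person replicate weights; check the census documentation).
    """
    estimate = weighted_total(values, full_weights)
    squared_diffs = sum(
        (weighted_total(values, rep) - estimate) ** 2
        for rep in replicate_weights
    )
    return estimate, math.sqrt(factor * squared_diffs)
```

passing a column of ones as `values` gives a weighted population count, which is presumably why the download script adds a variable named one to the table.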
confidential to sas, spss, stata, sudaan users: why are you still rubbing two sticks together after we've invented the butane lighter? time to transition to r. :D