100+ datasets found
  1. Current Population Survey (CPS)

    • search.dataone.org
    • dataverse.harvard.edu
    Updated Nov 21, 2023
    Cite
    Damico, Anthony (2023). Current Population Survey (CPS) [Dataset]. http://doi.org/10.7910/DVN/AK4FDD
    Dataset updated
    Nov 21, 2023
    Dataset provided by
    Harvard Dataverse
    Authors
    Damico, Anthony
    Description

    analyze the current population survey (cps) annual social and economic supplement (asec) with r

    the annual march cps-asec has been supplying the statistics for the census bureau's report on income, poverty, and health insurance coverage since 1948. wow. the us census bureau and the bureau of labor statistics (bls) tag-team on this one. until the american community survey (acs) hit the scene in the early aughts (2000s), the current population survey had the largest sample size of all the annual general demographic data sets outside of the decennial census - about two hundred thousand respondents. this provides enough sample to conduct state- and a few large metro area-level analyses. your sample size will vanish if you start investigating subgroups by state - consider pooling multiple years. county-level is a no-no.

    despite the american community survey's larger size, the cps-asec contains many more variables related to employment, sources of income, and insurance - and can be trended back to harry truman's presidency. aside from questions specifically asked about an annual experience (like income), many of the questions in this march data set should be treated as point-in-time statistics. cps-asec generalizes to the united states non-institutional, non-active duty military population.

    the national bureau of economic research (nber) provides sas, spss, and stata importation scripts to create a rectangular file (rectangular data means only person-level records; household- and family-level information gets attached to each person). to import these files into r, the parse.SAScii function uses nber's sas code to determine how to import the fixed-width file, then RSQLite to put everything into a schnazzy database. you can try reading through the nber march 2012 sas importation code yourself, but it's a bit of a proc freak show.

    this new github repository contains three scripts:

    2005-2012 asec - download all microdata.R
    • download the fixed-width file containing household, family, and person records
    • import by separating this file into three tables, then merge 'em together at the person-level
    • download the fixed-width file containing the person-level replicate weights
    • merge the rectangular person-level file with the replicate weights, then store it in a sql database
    • create a new variable - one - in the data table

    2012 asec - analysis examples.R
    • connect to the sql database created by the 'download all microdata' program
    • create the complex sample survey object, using the replicate weights
    • perform a boatload of analysis examples

    replicate census estimates - 2011.R
    • connect to the sql database created by the 'download all microdata' program
    • create the complex sample survey object, using the replicate weights
    • match the sas output shown in the png file below

    2011 asec replicate weight sas output.png
    statistic and standard error generated from the replicate-weighted example sas script contained in this census-provided person replicate weights usage instructions document.

    click here to view these three scripts

    for more detail about the current population survey - annual social and economic supplement (cps-asec), visit:
    • the census bureau's current population survey page
    • the bureau of labor statistics' current population survey page
    • the current population survey's wikipedia article

    notes:

    interviews are conducted in march about experiences during the previous year. the file labeled 2012 includes information (income, work experience, health insurance) pertaining to 2011. when you use the current population survey to talk about america, subtract a year from the data file name.

    as of the 2010 file (the interview focusing on america during 2009), the cps-asec contains exciting new medical out-of-pocket spending variables most useful for supplemental (medical spending-adjusted) poverty research.

    confidential to sas, spss, stata, sudaan users: why are you still rubbing two sticks together after we've invented the butane lighter? time to transition to r. :D
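The "rectangular file" idea in the CPS description (person-level rows with household- and family-level information attached) can be sketched in a few lines. This is a minimal illustration in Python rather than the R scripts the description refers to; the field names (`hh_id`, `fam_id`, `state`, `family_income`) and the toy records are hypothetical, not actual CPS-ASEC variables.

```python
# Hypothetical miniature tables; real CPS-ASEC files carry many more fields.
households = {1: {"state": "NY"}, 2: {"state": "CA"}}
families = {(1, 1): {"family_income": 52000}, (2, 1): {"family_income": 81000}}
persons = [
    {"hh_id": 1, "fam_id": 1, "person_id": 1, "age": 34},
    {"hh_id": 1, "fam_id": 1, "person_id": 2, "age": 6},
    {"hh_id": 2, "fam_id": 1, "person_id": 1, "age": 58},
]


def rectangularize(persons, families, households):
    """Return one row per person with family and household fields attached."""
    rows = []
    for person in persons:
        row = dict(person)  # copy so the input records stay untouched
        row.update(families[(person["hh_id"], person["fam_id"])])
        row.update(households[person["hh_id"]])
        row["one"] = 1  # constant column, convenient for weighted counts
        rows.append(row)
    return rows


rectangular = rectangularize(persons, families, households)
for row in rectangular:
    print(row)
```

The output has exactly as many rows as the person table, which is the defining property of the rectangular layout.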

  2. American Community Survey (ACS)

    • dataverse.harvard.edu
    Updated May 30, 2013
    Cite
    Anthony Damico (2013). American Community Survey (ACS) [Dataset]. http://doi.org/10.7910/DVN/DKI9L4
    Dataset updated
    May 30, 2013
    Dataset provided by
    Harvard Dataverse
    Authors
    Anthony Damico
    License

    CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    analyze the american community survey (acs) with r and monetdb

    experimental. think of the american community survey (acs) as the united states' census for off-years - the ones that don't end in zero. every year, one percent of all americans respond, making it the largest complex sample administered by the u.s. government (the decennial census has a much broader reach, but since it attempts to contact 100% of the population, it's not a survey). the acs asks how people live and although the questionnaire only includes about three hundred questions on demography, income, insurance, it's often accurate at sub-state geographies and - depending how many years pooled - down to small counties. households are the sampling unit, and once a household gets selected for inclusion, all of its residents respond to the survey. this allows household-level data (like home ownership) to be collected more efficiently and lets researchers examine family structure. the census bureau runs and finances this behemoth, of course.

    the downloadable american community survey ships as two distinct household-level and person-level comma-separated value (.csv) files. merging the two just rectangulates the data, since each person in the person-file has exactly one matching record in the household-file. for analyses of small, smaller, and microscopic geographic areas, choose one-, three-, or five-year pooled files. use as few pooled years as you can, unless you like sentences that start with, "over the period of 2006 - 2010, the average american ... [insert yer findings here]."

    rather than processing the acs public use microdata sample line-by-line, the r language brazenly reads everything into memory by default. to prevent overloading your computer, dr. thomas lumley wrote the sqlsurvey package principally to deal with this ram-gobbling monster. if you're already familiar with syntax used for the survey package, be patient and read the sqlsurvey examples carefully when something doesn't behave as you expect it to - some sqlsurvey commands require a different structure (i.e. svyby gets called through svymean) and others might not exist anytime soon (like svyolr).

    gimme some good news: sqlsurvey uses ultra-fast monetdb (click here for speed tests), so follow the monetdb installation instructions before running this acs code. monetdb imports, writes, recodes data slowly, but reads it hyper-fast. a magnificent trade-off: data exploration typically requires you to think, send an analysis command, think some more, send another query, repeat. importation scripts (especially the ones i've already written for you) can be left running overnight sans hand-holding.

    the acs weights generalize to the whole united states population including individuals living in group quarters, but non-residential respondents get an abridged questionnaire, so most (not all) analysts exclude records with a relp variable of 16 or 17 right off the bat.

    this new github repository contains four scripts:

    2005-2011 - download all microdata.R
    • create the batch (.bat) file needed to initiate the monet database in the future
    • download, unzip, and import each file for every year and size specified by the user
    • create and save household- and merged/person-level replicate weight complex sample designs
    • create a well-documented block of code to re-initiate the monet db server in the future

    fair warning: this full script takes a loooong time. run it friday afternoon, commune with nature for the weekend, and if you've got a fast processor and speedy internet connection, monday morning it should be ready for action. otherwise, either download only the years and sizes you need or - if you gotta have 'em all - run it, minimize it, and then don't disturb it for a week.

    2011 single-year - analysis examples.R
    • run the well-documented block of code to re-initiate the monetdb server
    • load the r data file (.rda) containing the replicate weight designs for the single-year 2011 file
    • perform the standard repertoire of analysis examples, only this time using sqlsurvey functions

    2011 single-year - variable recode example.R
    • run the well-documented block of code to re-initiate the monetdb server
    • copy the single-year 2011 table to maintain the pristine original
    • add a new age category variable by hand
    • add a new age category variable systematically
    • re-create then save the sqlsurvey replicate weight complex sample design on this new table
    • close everything, then load everything back up in a fresh instance of r
    • replicate a few of the census statistics. no muss, no fuss

    replicate census estimates - 2011.R
    • run the well-documented block of code to re-initiate the monetdb server
    • load the r data file (.rda) containing the replicate weight designs for the single-year 2011 file
    • match every nationwide statistic on the census bureau's estimates page, using sqlsurvey functions

    click here to view these four scripts

    for more detail about the american community survey (acs), visit: the us census...
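The ACS description notes that most analysts drop records with a relp value of 16 or 17 (group quarters) before analysis. A minimal Python sketch of that filter, followed by a weighted population count, is below. The `relp` column comes from the description; the `serialno` and `pwgtp` column names and all values are assumptions made for illustration, so check the data dictionary for the file you actually use.

```python
# relp codes the description says most analysts exclude (group quarters).
GROUP_QUARTERS_RELP = {16, 17}

# Toy person-level rows; "pwgtp" stands in for a person weight (assumed name).
person_rows = [
    {"serialno": "A", "relp": 0, "pwgtp": 120},
    {"serialno": "A", "relp": 2, "pwgtp": 110},
    {"serialno": "B", "relp": 16, "pwgtp": 40},
    {"serialno": "C", "relp": 17, "pwgtp": 55},
]


def drop_group_quarters(rows):
    """Keep only records whose relp code is not a group-quarters code."""
    return [row for row in rows if row["relp"] not in GROUP_QUARTERS_RELP]


def weighted_count(rows, weight_key="pwgtp"):
    """Sum the person weights to estimate the population the rows represent."""
    return sum(row[weight_key] for row in rows)


household_population = drop_group_quarters(person_rows)
print(weighted_count(person_rows))           # everyone, group quarters included
print(weighted_count(household_population))  # household residents only
```

Whether to apply the filter depends on the question being asked; the description itself says "most (not all)" analysts do.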

  3. Afrobarometer Survey 1 1999-2000, Merged 7 Country - Botswana, Lesotho,...

    • microdata.worldbank.org
    • catalog.ihsn.org
    Updated Apr 27, 2021
    + more versions
    Cite
    Institute for Democracy in South Africa (IDASA) (2021). Afrobarometer Survey 1 1999-2000, Merged 7 Country - Botswana, Lesotho, Malawi, Namibia, South Africa, Zambia, Zimbabwe [Dataset]. https://microdata.worldbank.org/index.php/catalog/889
    Dataset updated
    Apr 27, 2021
    Dataset provided by
    Ghana Centre for Democratic Development (CDD-Ghana)
    Michigan State University (MSU)
    Institute for Democracy in South Africa (IDASA)
    Time period covered
    1999 - 2000
    Area covered
    Africa, Lesotho, Namibia, Malawi, Zimbabwe, Zambia, Botswana, South Africa
    Description

    Abstract

    Round 1 of the Afrobarometer survey was conducted from July 1999 through June 2001 in 12 African countries, to solicit public opinion on democracy, governance, markets, and national identity. The full 12 country dataset released was pieced together out of different projects: Round 1 of the Afrobarometer survey, the old Southern African Democracy Barometer, and similar surveys done in West and East Africa.

    The 7 country dataset is a subset of the Round 1 survey dataset, and consists of a combined dataset for the 7 Southern African countries surveyed with other African countries in Round 1, 1999-2000 (Botswana, Lesotho, Malawi, Namibia, South Africa, Zambia and Zimbabwe). It is a useful dataset because, in contrast to the full 12 country Round 1 dataset, all countries in this dataset were surveyed with the identical questionnaire.

    Geographic coverage

    Botswana, Lesotho, Malawi, Namibia, South Africa, Zambia, Zimbabwe

    Analysis unit

    Basic units of analysis that the study investigates include: individuals and groups

    Kind of data

    Sample survey data [ssd]

    Sampling procedure

    A new sample has to be drawn for each round of Afrobarometer surveys. Whereas the standard sample size for Round 3 surveys will be 1200 cases, a larger sample size will be required in societies that are extremely heterogeneous (such as South Africa and Nigeria), where the sample size will be increased to 2400. Other adaptations may be necessary within some countries to account for the varying quality of the census data or the availability of census maps.

    The sample is designed as a representative cross-section of all citizens of voting age in a given country. The goal is to give every adult citizen an equal and known chance of selection for interview. We strive to reach this objective by (a) strictly applying random selection methods at every stage of sampling and by (b) applying sampling with probability proportionate to population size wherever possible. A randomly selected sample of 1200 cases allows inferences to national adult populations with a margin of sampling error of no more than plus or minus 2.5 percent with a confidence level of 95 percent. If the sample size is increased to 2400, the confidence interval shrinks to plus or minus 2 percent.
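The relationship between sample size and margin of error quoted above can be checked with the textbook formula for a proportion under simple random sampling. The sketch below is an assumption-laden simplification: it uses the worst case p = 0.5, a normal critical value of 1.96, and ignores the clustering and stratification of the actual design.

```python
import math


def srs_margin_of_error(n, p=0.5, z=1.96):
    """Half-width of a 95% interval for a proportion under simple random sampling."""
    return z * math.sqrt(p * (1 - p) / n)


moe_1200 = srs_margin_of_error(1200)
moe_2400 = srs_margin_of_error(2400)

print(f"n=1200: +/- {100 * moe_1200:.1f} percentage points")
print(f"n=2400: +/- {100 * moe_2400:.1f} percentage points")
```

Under these assumptions the n = 2400 result is about 2 percentage points, in line with the text. The n = 1200 result comes out a little under 3 points, slightly wider than the quoted 2.5, so the quoted figure presumably rests on assumptions other than the worst-case ones used here.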

    Sample Universe

    The sample universe for Afrobarometer surveys includes all citizens of voting age within the country. In other words, we exclude anyone who is not a citizen and anyone who has not attained this age (usually 18 years) on the day of the survey. Also excluded are areas determined to be either inaccessible or not relevant to the study, such as those experiencing armed conflict or natural disasters, as well as national parks and game reserves. As a matter of practice, we have also excluded people living in institutionalized settings, such as students in dormitories and persons in prisons or nursing homes.

    What to do about areas experiencing political unrest? On the one hand we want to include them because they are politically important. On the other hand, we want to avoid stretching out the fieldwork over many months while we wait for the situation to settle down. It was agreed at the 2002 Cape Town Planning Workshop that it is difficult to come up with a general rule that will fit all imaginable circumstances. We will therefore make judgments on a case-by-case basis on whether or not to proceed with fieldwork or to exclude or substitute areas of conflict. National Partners are requested to consult Core Partners on any major delays, exclusions or substitutions of this sort.

    Sample Design

    The sample design is a clustered, stratified, multi-stage, area probability sample.

    To repeat the main sampling principle, the objective of the design is to give every sample element (i.e. adult citizen) an equal and known chance of being chosen for inclusion in the sample. We strive to reach this objective by (a) strictly applying random selection methods at every stage of sampling and by (b) applying sampling with probability proportionate to population size wherever possible.

    In a series of stages, geographically defined sampling units of decreasing size are selected. To ensure that the sample is representative, the probability of selection at various stages is adjusted as follows:

    The sample is stratified by key social characteristics in the population such as sub-national area (e.g. region/province) and residential locality (urban or rural). The area stratification reduces the likelihood that distinctive ethnic or language groups are left out of the sample. And the urban/rural stratification is a means to make sure that these localities are represented in their correct proportions.

    Wherever possible, and always in the first stage of sampling, random sampling is conducted with probability proportionate to population size (PPPS). The purpose is to guarantee that larger (i.e., more populated) geographical units have a proportionally greater probability of being chosen into the sample.

    The sampling design has four stages:

    A first-stage to stratify and randomly select primary sampling units;

    A second-stage to randomly select sampling start-points;

    A third stage to randomly choose households;

    A final-stage involving the random selection of individual respondents

    We shall deal with each of these stages in turn.

    STAGE ONE: Selection of Primary Sampling Units (PSUs)

    The primary sampling units (PSU's) are the smallest, well-defined geographic units for which reliable population data are available. In most countries, these will be Census Enumeration Areas (or EAs). Most national census data and maps are broken down to the EA level. In the text that follows we will use the acronyms PSU and EA interchangeably because, when census data are employed, they refer to the same unit.

    We strongly recommend that NIs use official national census data as the sampling frame for Afrobarometer surveys. Where recent or reliable census data are not available, NIs are asked to inform the relevant Core Partner before they substitute any other demographic data. Where the census is out of date, NIs should consult a demographer to obtain the best possible estimates of population growth rates. These should be applied to the outdated census data in order to make projections of population figures for the year of the survey. It is important to bear in mind that population growth rates vary by area (region) and (especially) between rural and urban localities. Therefore, any projected census data should include adjustments to take such variations into account.

    Indeed, we urge NIs to establish collegial working relationships with professionals in the national census bureau, not only to obtain the most recent census data, projections, and maps, but to gain access to sampling expertise. NIs may even commission a census statistician to draw the sample to Afrobarometer specifications, provided that provision for this service has been made in the survey budget.

    Regardless of who draws the sample, the NIs should thoroughly acquaint themselves with the strengths and weaknesses of the available census data and the availability and quality of EA maps. The country and methodology reports should cite the exact census data used, its known shortcomings, if any, and any projections made from the data. At minimum, the NI must know the size of the population and the urban/rural population divide in each region in order to specify how to distribute population and PSU's in the first stage of sampling. National investigators should obtain this written data before they attempt to stratify the sample.

    Once this data is obtained, the sample population (either 1200 or 2400) should be stratified, first by area (region/province) and then by residential locality (urban or rural). In each case, the proportion of the sample in each locality in each region should be the same as its proportion in the national population as indicated by the updated census figures.

    Having stratified the sample, it is then possible to determine how many PSU's should be selected for the country as a whole, for each region, and for each urban or rural locality.

    The total number of PSU's to be selected for the whole country is determined by calculating the maximum degree of clustering of interviews one can accept in any PSU. Because PSUs (which are usually geographically small EAs) tend to be socially homogenous we do not want to select too many people in any one place. Thus, the Afrobarometer has established a standard of no more than 8 interviews per PSU. For a sample size of 1200, the sample must therefore contain 150 PSUs/EAs (1200 divided by 8). For a sample size of 2400, there must be 300 PSUs/EAs.

    These PSUs should then be allocated proportionally to the urban and rural localities within each regional stratum of the sample. Let's take a couple of examples from a country with a sample size of 1200. If the urban locality of Region X in this country constitutes 10 percent of the current national population, then the sample for this stratum should be 15 PSUs (calculated as 10 percent of 150 PSUs). If the rural population of Region Y constitutes 4 percent of the current national population, then the sample for this stratum should be 6 PSU's.
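The PSU arithmetic in the last two paragraphs (8 interviews per PSU, then proportional allocation to strata) can be written out as a short Python sketch. The stratum names and population shares are the ones used in the worked example above; the function names are illustrative only.

```python
MAX_INTERVIEWS_PER_PSU = 8  # Afrobarometer standard stated in the text


def total_psus(sample_size, per_psu=MAX_INTERVIEWS_PER_PSU):
    """Number of PSUs needed so that no PSU holds more than per_psu interviews."""
    return sample_size // per_psu


def allocate_psus(sample_size, stratum_shares):
    """Allocate PSUs to strata in proportion to each stratum's population share."""
    total = total_psus(sample_size)
    return {name: round(total * share) for name, share in stratum_shares.items()}


shares = {"Region X urban": 0.10, "Region Y rural": 0.04}
allocation = allocate_psus(1200, shares)

print(total_psus(1200), total_psus(2400))
print(allocation)
```

With a full set of strata, independent rounding may leave the allocations summing to slightly more or less than the total, so a real allocation would need a reconciliation step that this sketch omits.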

    The next step is to select particular PSUs/EAs using random methods. Using the above example of the rural localities in Region Y, let us say that you need to pick 6 sample EAs out of a census list that contains a total of 240 rural EAs in Region Y. But which 6? If the EAs created by the national census bureau are of equal or roughly equal population size, then selection is relatively straightforward. Just number all EAs consecutively, then make six selections using a table of random numbers. This procedure, known as simple random sampling (SRS), will

  4. Census Tract Top 50 American Community Survey Data

    • data.seattle.gov
    • hub.arcgis.com
    • +1 more
    csv, xlsx, xml
    Updated Feb 3, 2025
    + more versions
    Cite
    (2025). Census Tract Top 50 American Community Survey Data [Dataset]. https://data.seattle.gov/dataset/Census-Tract-Top-50-American-Community-Survey-Data/jya9-y5bv/data
    Available download formats: csv, xlsx, xml
    Dataset updated
    Feb 3, 2025
    Description

    Data from: American Community Survey, 5-year Series


    King County, Washington census tracts with nonoverlapping vintages of the 5-year American Community Survey (ACS) estimates starting in 2010 of over 50 attributes of the most requested data derived from the U.S. Census Bureau's demographic profiles (DP02-DP05). Also includes the most recent release annually with the vintage identified in the "ACS Vintage" field.

    The census tract boundaries match the vintage of the ACS data (currently 2010 and 2020) so please note the geographic changes between the decades.

    Tracts have been coded as being within the City of Seattle as well as assigned to neighborhood groups called "Community Reporting Areas". These areas were created after the 2000 census to provide geographically consistent neighborhoods through time for reporting U.S. Census Bureau data. This is not an attempt to identify neighborhood boundaries as defined by neighborhoods themselves.

    Vintages: 2010, 2015, 2020, 2021, 2022, 2023
    ACS Table(s): DP02, DP03, DP04, DP05


    The United States Census Bureau's American Community Survey (ACS):
    This ready-to-use layer can be used within ArcGIS Pro, ArcGIS Online, its configurable apps, dashboards, Story Maps, custom apps, and mobile apps. Data can also be exported for offline workflows. Please cite the Census and ACS when using this data.

    Data Note from the Census:
    Data are based on a sample and are subject to sampling variability. The degree of uncertainty for an estimate arising from sampling variability is represented through the use of a margin of error. The value shown here is the 90 percent margin of error. The margin of error can be interpreted as providing a 90 percent probability that the interval defined by the estimate minus the margin of error and the estimate plus the margin of error (the lower and upper confidence bounds) contains the true value. In addition to sampling variability, the ACS estimates are subject to nonsampling error (for a discussion of nonsampling variability, see Accuracy of the Data). The effect of nonsampling error is not represented in these tables.
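The Census data note describes a 90 percent margin of error and the confidence bounds built from it. A minimal Python sketch of that arithmetic follows. The estimate and margin are made-up numbers, and the 1.645 factor is the standard normal critical value for a 90 percent interval, used here as an assumption for converting the margin to a standard error.

```python
Z_90 = 1.645  # normal critical value for a 90 percent interval (assumption)


def confidence_bounds(estimate, moe):
    """Lower and upper bounds: estimate minus and plus the margin of error."""
    return estimate - moe, estimate + moe


def standard_error(moe, z=Z_90):
    """Back out the standard error implied by a published margin of error."""
    return moe / z


def coefficient_of_variation(estimate, moe):
    """Standard error relative to the estimate, a common reliability screen."""
    return standard_error(moe) / estimate


# Hypothetical tract-level estimate and its published 90 percent margin.
estimate, moe = 4200, 329
lower, upper = confidence_bounds(estimate, moe)
print(lower, upper)
print(round(standard_error(moe), 1))
print(round(coefficient_of_variation(estimate, moe), 3))
```

Small tracts often show large coefficients of variation, which is one reason to read the margin of error alongside every estimate.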

    Data Processing Notes:
  5. Survey of Income and Program Participation (SIPP)

    • dataverse.harvard.edu
    Updated May 30, 2013
    Cite
    Anthony Damico (2013). Survey of Income and Program Participation (SIPP) [Dataset]. http://doi.org/10.7910/DVN/I0FFJV
    Dataset updated
    May 30, 2013
    Dataset provided by
    Harvard Dataverse
    Authors
    Anthony Damico
    License

    CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    analyze the survey of income and program participation (sipp) with r

    if the census bureau's budget was gutted and only one complex sample survey survived, pray it's the survey of income and program participation (sipp). it's giant. it's rich with variables. it's monthly. it follows households over three, four, now five year panels. the congressional budget office uses it for their health insurance simulation. analysts read that sipp has person-month files, get scurred, and retreat to inferior options. the american community survey may be the mount everest of survey data, but sipp is most certainly the amazon. questions swing wild and free through the jungle canopy i mean core data dictionary. legend has it that there are still species of topical module variables that scientists like you have yet to analyze. ponce de león would've loved it here. ponce. what a name. what a guy.

    the sipp 2008 panel data started from a sample of 105,663 individuals in 42,030 households. once the sample gets drawn, the census bureau surveys one-fourth of the respondents every four months, over four or five years (panel durations vary). you absolutely must read and understand pdf pages 3, 4, and 5 of this document before starting any analysis (start at the header 'waves and rotation groups'). if you don't comprehend what's going on, try their survey design tutorial.

    since sipp collects information from respondents regarding every month over the duration of the panel, you'll need to be hyper-aware of whether you want your results to be point-in-time, annualized, or specific to some other period. the analysis scripts below provide examples of each. at every four-month interview point, every respondent answers every core question for the previous four months. after that, wave-specific addenda (called topical modules) get asked, but generally only regarding a single prior month. to repeat: core wave files contain four records per person, topical modules contain one.

    if you stacked every core wave, you would have one record per person per month for the duration of the panel. mmmassive. ~100,000 respondents x 12 months x ~4 years. have an analysis plan before you start writing code so you extract exactly what you need, nothing more. better yet, modify something of mine. cool?

    this new github repository contains eight, you read me, eight scripts:

    1996 panel - download and create database.R
    2001 panel - download and create database.R
    2004 panel - download and create database.R
    2008 panel - download and create database.R
    • since some variables are character strings in one file and integers in another, initiate an r function to harmonize variable class inconsistencies in the sas importation scripts
    • properly handle the parentheses seen in a few of the sas importation scripts, because the SAScii package currently does not
    • create an rsqlite database, initiate a variant of the read.SAScii function that imports ascii data directly into a sql database (.db)
    • download each microdata file - weights, topical modules, everything - then read 'em into sql

    2008 panel - full year analysis examples.R
    • define which waves and specific variables to pull into ram, based on the year chosen
    • loop through each of twelve months, constructing a single-year temporary table inside the database
    • read that twelve-month file into working memory, then save it for faster loading later if you like
    • read the main and replicate weights columns into working memory too, merge everything
    • construct a few annualized and demographic columns using all twelve months' worth of information
    • construct a replicate-weighted complex sample design with a fay's adjustment factor of one-half, again save it for faster loading later, only if you're so inclined
    • reproduce census-published statistics, not precisely (due to topcoding described here on pdf page 19)

    2008 panel - point-in-time analysis examples.R
    • define which wave(s) and specific variables to pull into ram, based on the calendar month chosen
    • read that interview point (srefmon)- or calendar month (rhcalmn)-based file into working memory
    • read the topical module and replicate weights files into working memory too, merge it like you mean it
    • construct a few new, exciting variables using both core and topical module questions
    • construct a replicate-weighted complex sample design with a fay's adjustment factor of one-half
    • reproduce census-published statistics, not exactly cuz the authors of this brief used the generalized variance formula (gvf) to calculate the margin of error - see pdf page 4 for more detail - the friendly statisticians at census recommend using the replicate weights whenever possible. oh hayy, now it is.

    2008 panel - median value of household assets.R
    • define which wave(s) and specific variables to pull into ram, based on the topical module chosen
    • read the topical module and replicate weights files into working memory too, merge once again
    • construct a replicate-weighted complex sample design with a...
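The SIPP description repeatedly mentions a replicate-weighted design with a Fay's adjustment factor of one-half. As a rough illustration of what that factor does, the sketch below computes a Fay-adjusted balanced repeated replication variance for a weighted mean in plain Python. The values and weights are made-up numbers, and this is not the R code from the scripts described above.

```python
import math


def weighted_mean(values, weights):
    """Weighted mean of values using the supplied weights."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


def fay_brr_variance(full_estimate, replicate_estimates, fay_k=0.5):
    """Fay's BRR variance: sum of squared deviations / (R * (1 - k)^2)."""
    if not replicate_estimates:
        raise ValueError("at least one replicate estimate is required")
    r = len(replicate_estimates)
    squared = sum((est - full_estimate) ** 2 for est in replicate_estimates)
    return squared / (r * (1 - fay_k) ** 2)


# Toy data: three respondents, one full-sample weight set, four replicate sets.
values = [10.0, 20.0, 30.0]
full_weights = [1.0, 1.0, 2.0]
replicate_weights = [
    [1.5, 0.5, 2.0],
    [0.5, 1.5, 2.0],
    [1.5, 1.5, 1.0],
    [0.5, 0.5, 3.0],
]

full_estimate = weighted_mean(values, full_weights)
replicate_estimates = [weighted_mean(values, w) for w in replicate_weights]
variance = fay_brr_variance(full_estimate, replicate_estimates)
print(full_estimate, variance, math.sqrt(variance))
```

With a factor of one-half, the multiplier on the sum of squared deviations works out to 4 divided by the number of replicates.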

  6. Virginia Population by Sex by Age by Census Block Group (ACS 5-Year)

    • data.virginia.gov
    csv
    Updated Jan 3, 2025
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Office of INTERMODAL Planning and Investment (2025). Virginia Population by Sex by Age by Census Block Group (ACS 5-Year) [Dataset]. https://data.virginia.gov/dataset/virginia-population-by-sex-by-age-by-census-block-group-acs-5-year
    Available download formats: csv (23831484)
    Dataset updated
    Jan 3, 2025
    Dataset authored and provided by
    Office of INTERMODAL Planning and Investment
    Description

    2013-2023 Virginia Population by Sex by Age by Census Block Group. Contains estimates and margins of error.

    U.S. Census Bureau; American Community Survey, American Community Survey 5-Year Estimates, Table B01001 Data accessed from: Census Bureau's API for American Community Survey (https://www.census.gov/data/developers/data-sets.html)

    The United States Census Bureau's American Community Survey (ACS):
    • What is the American Community Survey? (https://www.census.gov/programs-surveys/acs/about.html)
    • Geography & ACS (https://www.census.gov/programs-surveys/acs/geography-acs.html)
    • Technical Documentation (https://www.census.gov/programs-surveys/acs/technical-documentation.html)

    Supporting documentation on code lists, subject definitions, data accuracy, and statistical testing can be found on the American Community Survey website in the Technical Documentation section. (https://www.census.gov/programs-surveys/acs/technical-documentation/code-lists.html)

    Sample size and data quality measures (including coverage rates, allocation rates, and response rates) can be found on the American Community Survey website in the Methodology section. (https://www.census.gov/acs/www/methodology/sample_size_and_data_quality/)

    Although the American Community Survey (ACS) produces population, demographic and housing unit estimates, it is the Census Bureau's Population Estimates Program that produces and disseminates the official estimates of the population for the nation, states, counties, cities, and towns and estimates of housing units for states and counties.

    Data are based on a sample and are subject to sampling variability. The degree of uncertainty for an estimate arising from sampling variability is represented through the use of a margin of error. The value shown here is the 90 percent margin of error. The margin of error can be interpreted roughly as providing a 90 percent probability that the interval defined by the estimate minus the margin of error and the estimate plus the margin of error (the lower and upper confidence bounds) contains the true value. In addition to sampling variability, the ACS estimates are subject to nonsampling error (for a discussion of nonsampling variability, see ACS Technical Documentation https://www.census.gov/programs-surveys/acs/technical-documentation.html). The effect of nonsampling error is not represented in these tables.
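Because this table publishes estimates and margins of error for many sex-by-age cells per block group, users often need a margin of error for a combined cell (for example, several age bands added together). A common approximation, used here as an assumption and ignoring any covariance between cells, is the square root of the sum of squared margins. The numbers below are hypothetical.

```python
import math


def combine_estimates(cells):
    """Sum estimates and approximate the margin of error of the sum.

    cells is a list of (estimate, margin_of_error) pairs at the same
    confidence level. The combined margin is the root sum of squares,
    which treats the cells as independent.
    """
    total = sum(estimate for estimate, _ in cells)
    combined_moe = math.sqrt(sum(moe ** 2 for _, moe in cells))
    return total, combined_moe


# Hypothetical block-group cells: (estimate, 90 percent margin of error).
age_bands = [(120, 30), (80, 40)]
total, combined_moe = combine_estimates(age_bands)
print(total, combined_moe)
```

The approximation tends to degrade as more cells are combined, so for large aggregations a published table at the coarser level is usually the safer source.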

  7. American Housing Survey (AHS)

    • catalog.data.gov
    • s.cnmilf.com
    Updated Mar 1, 2024
    Cite
    U.S. Department of Housing and Urban Development (2024). American Housing Survey (AHS) [Dataset]. https://catalog.data.gov/dataset/american-housing-survey-ahs
    Explore at:
    Dataset updated
    Mar 1, 2024
    Dataset provided by
    United States Department of Housing and Urban Development (http://www.hud.gov/)
    Description

    The AHS is the largest, regular national housing sample survey in the United States. The U.S. Census Bureau conducts the AHS to obtain up-to-date housing statistics for the Department of Housing and Urban Development (HUD). The AHS national survey was conducted annually from 1973-1981 and biennially (every two years) since 1983. Metropolitan area surveys have been conducted annually or biennially since 1974.

  8. ACS 5-Year Demographic Characteristics DC Census Tract

    • opendata.dc.gov
    • opdatahub.dc.gov
    • +5more
    Updated Feb 28, 2025
    Cite
    City of Washington, DC (2025). ACS 5-Year Demographic Characteristics DC Census Tract [Dataset]. https://opendata.dc.gov/datasets/62e1f639627342248a4d4027140a1935
    Explore at:
    Dataset updated
    Feb 28, 2025
    Dataset authored and provided by
    City of Washington, DC
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Description

    Age, Sex, Race, Ethnicity, Total Housing Units, and Voting Age Population. This service is updated annually with American Community Survey (ACS) 5-year data. Contact: District of Columbia, Office of Planning. Email: planning@dc.gov. Geography: Census Tracts. Current Vintage: 2019-2023. ACS Table(s): DP05. Data downloaded from: Census Bureau's API for American Community Survey. Date of API call: January 2, 2025. National Figures: data.census.gov. Please cite the Census and ACS when using this data. Data Note from the Census: Data are based on a sample and are subject to sampling variability. The degree of uncertainty for an estimate arising from sampling variability is represented through the use of a margin of error. The value shown here is the 90 percent margin of error. The margin of error can be interpreted as providing a 90 percent probability that the interval defined by the estimate minus the margin of error and the estimate plus the margin of error (the lower and upper confidence bounds) contains the true value. In addition to sampling variability, the ACS estimates are subject to nonsampling error (for a discussion of nonsampling variability, see Accuracy of the Data). The effect of nonsampling error is not represented in these tables. Data Processing Notes: This layer is updated automatically when the most current vintage of ACS data is released each year, usually in December. The layer always contains the latest available ACS 5-year estimates. It is updated annually within days of the Census Bureau's release schedule. Boundaries come from the US Census TIGER geodatabases. Boundaries are updated at the same time as the data updates (annually), and the boundary vintage appropriately matches the data vintage as specified by the Census. These are Census boundaries with water and/or coastlines clipped for cartographic purposes. For census tracts, the water cutouts are derived from a subset of the 2020 AWATER (Area Water) boundaries offered by TIGER. 
For state and county boundaries, the water and coastlines are derived from the coastlines of the 500k TIGER Cartographic Boundary Shapefiles. The original AWATER and ALAND fields are still available as attributes within the data table (units are square meters). Field alias names were created based on the Table Shells file available from the American Community Survey Summary File Documentation page. Data processed using R statistical package and ArcGIS Desktop. Margin of Error was not included in this layer but is available from the Census Bureau. Contact the Office of Planning for more information about obtaining Margin of Error values.

  9. Demographic and Health Survey 2017 - Indonesia

    • microdata.worldbank.org
    • catalog.ihsn.org
    • +1more
    Updated Jul 12, 2019
    Cite
    Statistics Indonesia (BPS) (2019). Demographic and Health Survey 2017 - Indonesia [Dataset]. https://microdata.worldbank.org/index.php/catalog/3477
    Explore at:
    Dataset updated
    Jul 12, 2019
    Dataset provided by
    Statistics Indonesia (http://www.bps.go.id/)
    Ministry of Health (Kemenkes)
    National Population and Family Planning Board (BKKBN)
    Time period covered
    2017
    Area covered
    Indonesia
    Description

    Abstract

    The primary objective of the 2017 Indonesia Demographic and Health Survey (IDHS) is to provide up-to-date estimates of basic demographic and health indicators. The IDHS provides a comprehensive overview of population and maternal and child health issues in Indonesia. More specifically, the IDHS was designed to:

    • provide data on fertility, family planning, maternal and child health, and awareness of HIV/AIDS and sexually transmitted infections (STIs) to help program managers, policy makers, and researchers to evaluate and improve existing programs;
    • measure trends in fertility and contraceptive prevalence rates, and analyze factors that affect such changes, such as residence, education, breastfeeding practices, and knowledge, use, and availability of contraceptive methods;
    • evaluate the achievement of goals previously set by national health programs, with special focus on maternal and child health;
    • assess married men’s knowledge of utilization of health services for their family’s health and participation in the health care of their families;
    • participate in creating an international database to allow cross-country comparisons in the areas of fertility, family planning, and health.

    Geographic coverage

    National coverage

    Analysis unit

    • Household
    • Individual
    • Children age 0-5
    • Woman age 15-49
    • Man age 15-54

    Universe

    The survey covered all de jure household members (usual residents), all women age 15-49 years resident in the household, and all men age 15-54 years resident in the household.

    Kind of data

    Sample survey data [ssd]

    Sampling procedure

    The 2017 IDHS sample covered 1,970 census blocks in urban and rural areas and was expected to obtain responses from 49,250 households. The sampled households were expected to identify about 59,100 women age 15-49 and 24,625 never-married men age 15-24 eligible for individual interview. Eight households were selected in each selected census block to yield 14,193 married men age 15-54 to be interviewed with the Married Man's Questionnaire. The sample frame of the 2017 IDHS is the Master Sample of Census Blocks from the 2010 Population Census. The frame for the household sample selection is the updated list of ordinary households in the selected census blocks. This list does not include institutional households, such as orphanages, police/military barracks, and prisons, or special households (boarding houses with a minimum of 10 people).

    The sampling design of the 2017 IDHS used two-stage stratified sampling: Stage 1: Several census blocks were selected with systematic sampling proportional to size, where size is the number of households listed in the 2010 Population Census. In the implicit stratification, the census blocks were stratified by urban and rural areas and ordered by wealth index category.

    Stage 2: In each selected census block, 25 ordinary households were selected with systematic sampling from the updated household listing. Eight households were selected systematically to obtain a sample of married men.
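The two selection stages described above can be sketched in Python. This is a simplified illustration, not the procedure BPS actually ran: it ignores the urban/rural stratification and wealth-index ordering, and the block sizes, household listing, and function names are all hypothetical.

```python
import random


def systematic_pps(sizes, n, rng):
    """Stage 1: pick n unit indices by systematic sampling with
    probability proportional to size (size = listed households)."""
    interval = sum(sizes) / n
    start = rng.uniform(0, interval)
    selected, cumulative, idx = [], 0.0, 0
    for k in range(n):
        point = start + k * interval
        # Advance to the unit whose cumulative size range contains the point.
        while idx < len(sizes) - 1 and cumulative + sizes[idx] <= point:
            cumulative += sizes[idx]
            idx += 1
        selected.append(idx)
    return selected


def systematic_sample(items, n, rng):
    """Stage 2: pick n items at a fixed interval from a random start."""
    interval = len(items) / n
    start = rng.uniform(0, interval)
    return [items[min(int(start + k * interval), len(items) - 1)]
            for k in range(n)]


rng = random.Random(2017)
block_sizes = [rng.randint(80, 200) for _ in range(40)]   # hypothetical frame
chosen_blocks = systematic_pps(block_sizes, 5, rng)
listing = [f"household-{i}" for i in range(120)]          # hypothetical listing
chosen_households = systematic_sample(listing, 25, rng)

# Consistency check on the figures quoted in the text:
# 1,970 census blocks x 25 households per block.
expected_households = 1970 * 25
print(chosen_blocks, len(chosen_households), expected_households)
```

The product 1,970 x 25 reproduces the 49,250 households the sample was expected to yield.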

    For further details on sample design, see Appendix B of the final report.

    Mode of data collection

    Face-to-face [f2f]

    Research instrument

    The 2017 IDHS used four questionnaires: the Household Questionnaire, Woman’s Questionnaire, Married Man’s Questionnaire, and Never Married Man’s Questionnaire. Because of the change in survey coverage from ever-married women age 15-49 in the 2007 IDHS to all women age 15-49, the Woman’s Questionnaire had questions added for never married women age 15-24. These questions were part of the 2007 Indonesia Young Adult Reproductive Survey Questionnaire. The Household Questionnaire and the Woman’s Questionnaire are largely based on standard DHS phase 7 questionnaires (2015 version). The model questionnaires were adapted for use in Indonesia. Not all questions in the DHS model were included in the IDHS. Response categories were modified to reflect the local situation.

    Cleaning operations

    All completed questionnaires, along with the control forms, were returned to the BPS central office in Jakarta for data processing. The questionnaires were logged and edited, and all open-ended questions were coded. Responses were entered in the computer twice for verification, and they were corrected for computer-identified errors. Data processing activities were carried out by a team of 34 editors, 112 data entry operators, 33 compare officers, 19 secondary data editors, and 2 data entry supervisors. The questionnaires were entered twice and the entries were compared to detect and correct keying errors. A computer package program called Census and Survey Processing System (CSPro), which was specifically designed to process DHS-type survey data, was used in the processing of the 2017 IDHS.

    Response rate

    Of the 49,261 eligible households, 48,216 households were found by the interviewer teams. Among these households, 47,963 households were successfully interviewed, a response rate of almost 100%.

    In the interviewed households, 50,730 women were identified as eligible for individual interview and, from these, completed interviews were conducted with 49,627 women, yielding a response rate of 98%. From the selected household sample of married men, 10,440 married men were identified as eligible for interview, of which 10,009 were successfully interviewed, yielding a response rate of 96%. The lower response rate for men was due to the more frequent and longer absence of men from the household. In general, response rates in rural areas were higher than those in urban areas.
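The response rates quoted above can be reproduced from the counts in the text. The short Python check below assumes that the household rate is computed against the households that were found (48,216), which is one reading of the passage.

```python
def response_rate(completed, eligible):
    """Completed interviews as a percentage of eligible units."""
    return 100.0 * completed / eligible


# Counts taken from the response-rate paragraphs above.
household_rate = response_rate(47_963, 48_216)
women_rate = response_rate(49_627, 50_730)
married_men_rate = response_rate(10_009, 10_440)

print(f"households:  {household_rate:.1f}%")
print(f"women:       {women_rate:.1f}%")
print(f"married men: {married_men_rate:.1f}%")
```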

    Sampling error estimates

    The estimates from a sample survey are affected by two types of errors: (1) nonsampling errors and (2) sampling errors. Nonsampling errors result from mistakes made in implementing data collection and data processing, such as failure to locate and interview the correct household, misunderstanding the questions on the part of either the interviewer or the respondent, and data entry errors. Although numerous efforts were made during the implementation of the 2017 Indonesia Demographic and Health Survey (2017 IDHS) to minimize this type of error, nonsampling errors are impossible to avoid and difficult to evaluate statistically.

    Sampling errors, on the other hand, can be evaluated statistically. The sample of respondents selected in the 2017 IDHS is only one of many samples that could have been selected from the same population, using the same design and identical size. Each of these samples would yield results that differ somewhat from the results of the actual sample selected. Sampling error is a measure of the variability among all possible samples. Although the degree of variability is not known exactly, it can be estimated from the survey results.

    A sampling error is usually measured in terms of the standard error for a particular statistic (mean, percentage, etc.), which is the square root of the variance. The standard error can be used to calculate confidence intervals within which the true value for the population can reasonably be assumed to fall. For example, for any given statistic calculated from a sample survey, the value of that statistic will fall within a range of plus or minus two times the standard error of that statistic in 95 percent of all possible samples of identical size and design.

    If the sample of respondents had been selected as a simple random sample, it would have been possible to use straightforward formulas for calculating sampling errors. However, the 2017 IDHS sample is the result of a multi-stage stratified design, and, consequently, it was necessary to use more complex formulas. The computer software used to calculate sampling errors for the 2017 IDHS is a STATA program. This program used the Taylor linearization method for variance estimation for survey estimates that are means or proportions. The Jackknife repeated replication method is used for variance estimation of more complex statistics such as fertility and mortality rates.
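As a rough illustration of the replication idea, the sketch below applies a delete-one-cluster jackknife to a ratio estimate and then forms the "plus or minus two standard errors" interval mentioned earlier. It is written in Python rather than STATA, ignores stratification, and uses invented cluster totals, so it should not be read as the program used for the 2017 IDHS.

```python
import math


def ratio(clusters):
    """Ratio estimate r = sum(y) / sum(x) over (y, x) cluster totals."""
    return sum(y for y, _ in clusters) / sum(x for _, x in clusters)


def jackknife_se(clusters):
    """Delete-one-cluster jackknife standard error of the ratio."""
    k = len(clusters)
    r = ratio(clusters)
    pseudo_values = []
    for i in range(k):
        r_without_i = ratio(clusters[:i] + clusters[i + 1:])
        pseudo_values.append(k * r - (k - 1) * r_without_i)
    variance = sum((p - r) ** 2 for p in pseudo_values) / (k * (k - 1))
    return math.sqrt(variance)


# Hypothetical clusters: (number with the attribute, number interviewed).
clusters = [(2, 10), (4, 10), (6, 10), (8, 10)]
estimate = ratio(clusters)
se = jackknife_se(clusters)
lower, upper = estimate - 2 * se, estimate + 2 * se
print(f"estimate={estimate:.3f} se={se:.4f} interval=({lower:.3f}, {upper:.3f})")
```

With equal-sized clusters the result matches the usual standard error of the mean of the cluster proportions, which is a quick sanity check on the formula.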

    A more detailed description of the estimates of sampling errors is presented in Appendix C of the survey final report.

    Data appraisal

    Data Quality Tables

    • Household age distribution
    • Age distribution of eligible and interviewed women
    • Age distribution of eligible and interviewed men
    • Completeness of reporting
    • Births by calendar year
    • Reporting of age at death in days
    • Reporting of age at death in months

    See details of the data quality tables in Appendix D of the survey final report.

  10. Census of population and housing - one percent sample (2011) - Dataset -...

    • b2find.eudat.eu
    Updated Apr 30, 2023
    Cite
    (2023). Census of population and housing - one percent sample (2011) - Dataset - B2FIND [Dataset]. https://b2find.eudat.eu/dataset/4ccf9687-c699-59d0-a94c-4636393857de
    Explore at:
    Dataset updated
    Apr 30, 2023
    License

    Attribution-NonCommercial 3.0 (CC BY-NC 3.0): https://creativecommons.org/licenses/by-nc/3.0/
    Attribution 3.0 (CC BY 3.0): https://creativecommons.org/licenses/by/3.0/
    License information was derived automatically

    Description

    The Census of Population and Housing is one of the most important surveys carried out by ISTAT. It has been conducted every ten years since 1861, and its main objectives are: counting the whole population and recording its structural characteristics; updating and revising civil registers; defining the legal population for juridical and electoral purposes; and collecting information about the number and structural characteristics of houses and buildings. The Census collects information about the demographic and family structure of the population, the types of households, level of education, employment status, and other information on the resident population. In 2011, for the first time, some socio-economic information was measured on a sample basis through the use of two types of questionnaire: one in a reduced form, with a few questions, including the information indispensable for producing the data required by the European Union at a high level of spatial detail, and one in complete form. In particular, ISTAT provides 1% sample data (594,247 cases) released in two separate datasets: the first file (individui) refers to persons usually resident in private households and in institutional households, and the second one (alloggi) refers to living quarters. In urban areas with at least 20,000 inhabitants, one third of the families were selected by simple random sampling without replacement. A complete version (long form) of the questionnaire was sent to the sample, while a short version of the questionnaire was sent to all other inhabitants. Collection mode: web-based self-administered questionnaire (CAWI).

  11. US Census - ACS and Decennial files **

    • redivis.com
    application/jsonl +7
    Updated Jul 4, 2023
    Cite
    Environmental Impact Data Collaborative (2023). US Census - ACS and Decennial files ** [Dataset]. https://redivis.com/datasets/b2fz-a8gwpvnh4
    Explore at:
    Available download formats: avro, csv, spss, stata, sas, parquet, application/jsonl, arrow
    Dataset updated
    Jul 4, 2023
    Dataset provided by
    Redivis Inc.
    Authors
    Environmental Impact Data Collaborative
    Area covered
    United States
    Description

    Abstract

    Dataset quality **: Medium/high quality dataset, not quality checked or modified by the EIDC team

    Census data plays a pivotal role in academic data research, particularly when exploring relationships between different demographic characteristics. The significance of this particular dataset lies in its ability to facilitate the merging of various datasets with basic census information, thereby streamlining the research process and eliminating the need for separate API calls.

    The American Community Survey is an ongoing survey conducted by the U.S. Census Bureau, which provides detailed social, economic, and demographic data about the United States population. The ACS collects data continuously throughout the decade, gathering information from a sample of households across the country and covering a wide range of topics.

    Methodology

    The Census Data Application Programming Interface (API) is an API that gives the public access to raw statistical data from various Census Bureau data programs.
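A request to this API is an ordinary HTTPS URL with `get`, `for`, and `in` query parameters. The Python sketch below only builds such a URL and sends nothing. The year, the variable `B01003_001E` (total population), and the county-level geography are illustrative choices, not the calls the EIDC team made, and the helper name `build_acs5_url` is hypothetical.

```python
from urllib.parse import urlencode

BASE = "https://api.census.gov/data"


def build_acs5_url(year, variables, for_clause, in_clause=None):
    """Build a query URL for the ACS 5-year endpoint (no request is sent)."""
    params = {"get": ",".join(variables), "for": for_clause}
    if in_clause:
        params["in"] = in_clause
    # Keep ':', '*' and ',' unescaped so the geography clauses stay readable.
    return f"{BASE}/{year}/acs/acs5?{urlencode(params, safe=':*,')}"


# Illustrative query: total population for every county in state FIPS 11.
url = build_acs5_url(2021, ["NAME", "B01003_001E"], "county:*", "state:11")
print(url)
```

Heavier use of the API normally requires adding a `key` parameter, which is omitted here.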

    We used this API to collect various demographic and socioeconomic variables from both the ACS and the Decennial Census at different geographical levels:

    ZCTAs:

    ZIP Code Tabulation Areas (ZCTAs) are generalized areal representations of United States Postal Service (USPS) ZIP Code service areas. The USPS ZIP Codes identify the individual post office or metropolitan area delivery station associated with mailing addresses. USPS ZIP Codes are not areal features but a collection of mail delivery routes.

    Census Tract:

    Census Tracts are small, relatively permanent statistical subdivisions of a county or statistically equivalent entity that can be updated by local participants prior to each decennial census as part of the Census Bureau’s Participant Statistical Areas Program (PSAP).

    Census tracts generally have a population size between 1,200 and 8,000 people, with an optimum size of 4,000 people. A census tract usually covers a contiguous area; however, the spatial size of census tracts varies widely depending on the density of settlement. Census tract boundaries are delineated with the intention of being maintained over a long time so that statistical comparisons can be made from census to census.

    Block Groups:

    Block groups (BGs) are the next level above census blocks in the geographic hierarchy (see Figure 2-1 in Chapter 2). A BG is a combination of census blocks that is a subdivision of a census tract or block numbering area (BNA). (A county or its statistically equivalent entity contains either census tracts or BNAs; it can not contain both.) A BG consists of all census blocks whose numbers begin with the same digit in a given census tract or BNA; for example, BG 3 includes all census blocks numbered in the 300s. The BG is the smallest geographic entity for which the decennial census tabulates and publishes sample data.
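The rule that a block group collects all blocks whose numbers start with the same digit is easy to express in code. The Python sketch below also splits a 15-digit block identifier into its parts. That layout (2-digit state, 3-digit county, 6-digit tract, 4-digit block) reflects current Census Bureau GEOIDs and is an assumption beyond the passage above, and the sample identifier is made up.

```python
def block_group_of(block_number: str) -> str:
    """A block group is identified by the first digit of the block number."""
    return block_number[0]


def split_block_geoid(geoid: str) -> dict:
    """Split a 15-digit block GEOID (assumed layout: SS CCC TTTTTT BBBB)."""
    if len(geoid) != 15 or not geoid.isdigit():
        raise ValueError("expected a 15-digit block GEOID")
    block = geoid[11:15]
    return {
        "state": geoid[0:2],
        "county": geoid[2:5],
        "tract": geoid[5:11],
        "block_group": block_group_of(block),
        "block": block,
    }


parts = split_block_geoid("110010062021003")  # hypothetical identifier
print(parts)
print(block_group_of("305"))  # a block "in the 300s" belongs to BG 3
```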

    Census Blocks:

    Census blocks, the smallest geographic area for which the Bureau of the Census collects and tabulates decennial census data, are formed by streets, roads, railroads, streams and other bodies of water, other visible physical and cultural features, and the legal boundaries shown on Census Bureau maps.

  12. 2024 Public Sector: CG00ORG01 | Government Units: U.S. and State: Census...

    • data.census.gov
    Updated Aug 24, 2023
    Cite
    ECN (2023). 2024 Public Sector: CG00ORG01 | Government Units: U.S. and State: Census Years 1942 - 2022 (PUB Public Sector Annual Surveys and Census of Governments) [Dataset]. https://data.census.gov/table/GOVSTIMESERIES.CG00ORG01
    Explore at:
    Dataset updated
    Aug 24, 2023
    Dataset provided by
    United States Census Bureau (http://census.gov/)
    Authors
    ECN
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Time period covered
    2024
    Area covered
    United States
    Description

    Key Table Information
    Table Title: Government Units: U.S. and State: Census Years 1942 - 2022
    Table ID: GOVSTIMESERIES.CG00ORG01
    Survey/Program: Public Sector
    Year: 2024
    Dataset: PUB Public Sector Annual Surveys and Census of Governments
    Source: U.S. Census Bureau, Public Sector
    Release Date: 2023-08-24
    Release Schedule: For information about Census of Governments planned data product releases, see https://www.census.gov/programs-surveys/gus/newsroom/updates.html.

    Dataset Universe

    Census of Governments - Organization (CG): The universe of this file is all federal, state, and local government units in the United States. In addition to the federal government and the 50 state governments, the Census Bureau recognizes five basic types of local governments. The government types are: County, Municipal, Township, Special District, and School District. Of these five types, three are categorized as General Purpose governments: County, municipal, and township governments are readily recognized and generally present no serious problem of classification. However, legislative provisions for school district and special district governments are diverse. These two types are categorized as Special Purpose governments. Numerous single-function and multiple-function districts, authorities, commissions, boards, and other entities, which have varying degrees of autonomy, exist in the United States. The basic pattern of these entities varies widely from state to state. Moreover, various classes of local governments within a particular state also differ in their characteristics. Refer to the Individual State Descriptions report for an overview of all government entities authorized by state. The Public Use File provides a listing of all independent government units, and dependent school districts active as of fiscal year ending June 30, 2024.

    The Annual Surveys of Public Employment & Payroll (EP) and State and Local Government Finances (LF): The target population consists of all 50 state governments, the District of Columbia, and a sample of local governmental units (counties, cities, townships, special districts, school districts). In years ending in '2' and '7' the entire universe is canvassed. In intervening years, a sample of the target population is surveyed. Additional details on sampling are available in the survey methodology descriptions for those years.

    The Annual Survey of Public Pensions (PP): The target population consists of state- and locally-administered defined benefit funds and systems of all 50 state governments, the District of Columbia, and a sample of local governmental units (counties, cities, townships, special districts, school districts). In years ending in '2' and '7' the entire universe is canvassed. In intervening years, a sample of the target population is surveyed. Additional details on sampling are available in the survey methodology descriptions for those years.

    The Annual Surveys of State Government Finance (SG) and State Government Tax Collections (TC): The target population consists of all 50 state governments. No local governments are included. For the purpose of Census Bureau statistics, the term "state government" refers not only to the executive, legislative, and judicial branches of a given state, but it also includes agencies, institutions, commissions, and public authorities that operate separately or somewhat autonomously from the central state government but where the state government maintains administrative or fiscal control over their activities as defined by the Census Bureau. Additional details are available in the survey methodology description.

    The Annual Survey of School System Finances (SS): The Annual Survey of School System Finances targets all public school systems providing elementary and/or secondary education in all 50 states and the District of Columbia.

    Methodology

    Data Items and Other Identifying Records: Total federal, state, and local government units by state.

    Unit(s) of Observation: The basic reporting unit is the governmental unit, defined as an organized entity which in addition to having governmental character, has sufficient discretion in the management of its own affairs to distinguish it as separate from the administrative structure of any other governmental unit. The reporting units for the Annual Survey of School System Finances are public school systems that provide elementary and/or secondary education. The term "public school systems" includes two types of government entities with responsibility for providing education services: (1) school districts that are administratively and fiscally independent of any other government and are counted as separate governments; and (2) public school systems that lack sufficient autonomy to be counted as separate governments and are classified as a dependent agency of some other government—a county, municipal, township, or state government. Charter school systems whose charters are held by nongovernmental entities are deemed to be out of...

  13. Maryland American Community Survey - ACS Census Tracts

    • data.imap.maryland.gov
    • hub.arcgis.com
    • +3more
    Updated Feb 9, 2016
    Cite
    ArcGIS Online for Maryland (2016). Maryland American Community Survey - ACS Census Tracts [Dataset]. https://data.imap.maryland.gov/datasets/fdcb9d65512a44db8735919d9689b43c
    Explore at:
    Dataset updated
    Feb 9, 2016
    Dataset authored and provided by
    ArcGIS Online for Maryland
    Area covered
    Description

    The American Community Survey (ACS) is a nationwide, continuous survey designed to provide communities with reliable and timely demographic, housing, social and economic data. The ACS replaces the decennial census long form in 2010 and every year thereafter. The annual ACS sample is smaller than that of previous long form surveys, resulting in a larger sampling error. Coefficients of Variation (CVs), which are statistical measures that show the relative amount of sampling error associated with an estimate, are presented here as a measure of reliability and usability of the data. The unit of geography used for the 2010 - 2014 data is the census tract - a small statistical area within a county, which is delineated every 10 years prior to the decennial census.

    Last Updated: Unknown

    This is a MD iMAP hosted service. Find more information at https://imap.maryland.gov.

    Feature Service Link: https://mdgeodata.md.gov/imap/rest/services/Demographics/MD_AmericanCommunitySurvey/FeatureServer/0
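A coefficient of variation expresses the standard error as a percentage of the estimate. The Python sketch below derives it from a 90 percent margin of error, assuming the usual 1.645 conversion factor. The tract estimate and margin of error are invented, and no reliability thresholds are implied, since the description above does not state the ones Maryland used.

```python
Z_90 = 1.645  # critical value behind a published 90 percent margin of error


def coefficient_of_variation(estimate, moe_90):
    """CV in percent: standard error relative to the estimate."""
    if estimate == 0:
        raise ValueError("CV is undefined for an estimate of zero")
    standard_error = moe_90 / Z_90
    return 100.0 * standard_error / estimate


# Hypothetical tract-level estimate with its 90 percent margin of error.
cv = coefficient_of_variation(2_000, 329)
print(f"CV = {cv:.1f}%")
```

A smaller CV means the sampling error is small relative to the estimate, so the estimate is more reliable.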

  14. Census of Population and Housing, 2000 [United States]: Public Law (P.L.)...

    • search.gesis.org
    Updated May 1, 2021
    Cite
    United States Department of Commerce. Bureau of the Census (2021). Census of Population and Housing, 2000 [United States]: Public Law (P.L.) 94-171 Adjusted Data - Archival Version [Dataset]. http://doi.org/10.3886/ICPSR13400
    Explore at:
    Dataset updated
    May 1, 2021
    Dataset provided by
    GESIS search
    ICPSR - Interuniversity Consortium for Political and Social Research
    Authors
    United States Department of Commerce. Bureau of the Census
    License

    https://search.gesis.org/research_data/datasearch-httpwww-da-ra-deoaip--oaioai-da-ra-de446347

    Area covered
    United States
    Description

    Abstract (en): The numbers contained in this study are released pursuant to the order of the United States Court of Appeals for the Ninth Circuit in Carter v. Department of Commerce, 307 F.3d 1084. These numbers are not official Census 2000 counts. These numbers are estimates of the population based on a statistical adjustment method, utilizing sampling and modeling, applied to the official Census 2000 figures. The estimates utilized the results of the Accuracy and Coverage Evaluation (A.C.E.), a sample survey intended to measure net over- and undercounts in the census results. The Census Bureau has determined that the A.C.E. estimates dramatically overstate the level of undercoverage in Census 2000, and that the adjusted Census 2000 data are, therefore, not more accurate than the unadjusted data. On March 6, 2001, the Secretary of Commerce decided that unadjusted data from Census 2000 should be used to tabulate population counts reported to states and localities pursuant to 13 U.S.C. 141(c) (see 66 FR 14520, March 13, 2001). The Secretary's decision endorsed the unanimous recommendation of the Executive Steering Committee for A.C.E. Policy (ESCAP), a group of 12 senior career professionals within the Census Bureau. The ESCAP, in its recommendation against the use of the statistically adjusted estimates, had noted serious reservations regarding their accuracy. In order to inform the Census Bureau's planned October 2001 decision regarding the potential use of the adjusted estimates for non-redistricting purposes, the agency conducted extensive analyses throughout the summer of 2001. These extensive analyses confirmed the serious concerns the agency had noted earlier regarding the accuracy of the A.C.E. estimates. Specifically, the adjusted estimates were determined to be so severely flawed that all potential uses of these data would be inappropriate. 
Accordingly, the Department of Commerce deems that these estimates should not be used for any purpose that legally requires use of data from the decennial census and assumes no responsibility for the accuracy of the data for any purpose whatsoever. The Department, including the U.S. Census Bureau, will provide no assistance in the interpretation or use of these numbers. The collection contains four tables: (1) a count of all persons by race (Table PL1), (2) a count of Hispanic or Latino and a count of not Hispanic or Latino by race of all persons (Table PL2), (3) a count of the population 18 years and older by race (Table PL3), and (4) a count of Hispanic or Latino and a count of not Hispanic or Latino by race for the population 18 years and older (Table PL4). ICPSR data undergo a confidentiality review and are altered when necessary to limit the risk of disclosure. ICPSR also routinely creates ready-to-go data files along with setups in the major statistical software formats as well as standard codebooks to accompany the data. In addition to these procedures, ICPSR performed the following processing steps for this data collection: Created variable labels and/or value labels.. All persons and housing units in the United States in 2000. 2013-05-24 Multiple Census data file segments were repackaged for distribution into a single zip archive per dataset. 
No changes were made to the data or documentation.
    2006-01-12 All files were removed from dataset 90 and flagged as study-level files, so that they will accompany all downloads.
    2006-01-12 All files were removed from dataset 86 and flagged as study-level files, so that they will accompany all downloads.
    2006-01-12 All files were removed from dataset 84 and flagged as study-level files, so that they will accompany all downloads.
    2006-01-12 All files were removed from dataset 83 and flagged as study-level files, so that they will accompany all downloads.
    2006-01-12 All files were removed from dataset 81 and flagged as study-level files, so that they will accompany all downloads.
    2004-08-26 All the data definition statements (Parts 83, 84, and 90) were replaced because of errors. The codebook was replaced with an updated one from the Bureau of the Census.
    The data are provided in three segments (files) per state: the Geographic Header, Tables PL1 and PL2, and Tables PL3 and PL4. The Geographic Header segments are fixed-format ASCII text files, while the Table segments are comma-delimited ASCII files. The Geographic Header has 80 variables and the Table segments have 149 variables each, for a total of 378 variables when the segments a...

  15. 2024 Public Sector: GS00SS14 | Percentage Distribution of Revenue of Public...

    • data.census.gov
    • test.data.census.gov
    Updated Mar 28, 2025
    Cite
    ECN (2025). 2024 Public Sector: GS00SS14 | Percentage Distribution of Revenue of Public Elementary-Secondary School Systems in the United States: Fiscal Year 2012- 2023 (PUB Public Sector Annual Surveys and Census of Governments) [Dataset]. https://data.census.gov/table/GOVSTIMESERIES.GS00SS14?q=GS00SS14
    Explore at:
    Dataset updated
    Mar 28, 2025
    Dataset provided by
    United States Census Bureau (http://census.gov/)
    Authors
    ECN
    License

    CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
    License information was derived automatically

    Time period covered
    2024
    Area covered
    United States
    Description

    Key Table Information

    Table Title: Percentage Distribution of Revenue of Public Elementary-Secondary School Systems in the United States: Fiscal Year 2012- 2023
    Table ID: GOVSTIMESERIES.GS00SS14
    Survey/Program: Public Sector
    Year: 2024
    Dataset: PUB Public Sector Annual Surveys and Census of Governments
    Source: U.S. Census Bureau, Public Sector
    Release Date: 2025-05-01
    Release Schedule: The Annual Survey of School System Finances occurs every year. Data are typically released in early May. There are approximately two years between the reference period and data release.

    Dataset Universe

    Census of Governments - Organization (CG): The universe of this file is all federal, state, and local government units in the United States. In addition to the federal government and the 50 state governments, the Census Bureau recognizes five basic types of local governments. The government types are: County, Municipal, Township, Special District, and School District. Of these five types, three are categorized as General Purpose governments: County, municipal, and township governments are readily recognized and generally present no serious problem of classification. However, legislative provisions for school district and special district governments are diverse. These two types are categorized as Special Purpose governments. Numerous single-function and multiple-function districts, authorities, commissions, boards, and other entities, which have varying degrees of autonomy, exist in the United States. The basic pattern of these entities varies widely from state to state. Moreover, various classes of local governments within a particular state also differ in their characteristics. Refer to the Individual State Descriptions report for an overview of all government entities authorized by state. The Public Use File provides a listing of all independent government units, and dependent school districts active as of fiscal year ending June 30, 2024.

    The Annual Surveys of Public Employment & Payroll (EP) and State and Local Government Finances (LF): The target population consists of all 50 state governments, the District of Columbia, and a sample of local governmental units (counties, cities, townships, special districts, school districts). In years ending in '2' and '7' the entire universe is canvassed. In intervening years, a sample of the target population is surveyed. Additional details on sampling are available in the survey methodology descriptions for those years.

    The Annual Survey of Public Pensions (PP): The target population consists of state- and locally-administered defined benefit funds and systems of all 50 state governments, the District of Columbia, and a sample of local governmental units (counties, cities, townships, special districts, school districts). In years ending in '2' and '7' the entire universe is canvassed. In intervening years, a sample of the target population is surveyed. Additional details on sampling are available in the survey methodology descriptions for those years.

    The Annual Surveys of State Government Finance (SG) and State Government Tax Collections (TC): The target population consists of all 50 state governments. No local governments are included. For the purpose of Census Bureau statistics, the term "state government" refers not only to the executive, legislative, and judicial branches of a given state, but it also includes agencies, institutions, commissions, and public authorities that operate separately or somewhat autonomously from the central state government but where the state government maintains administrative or fiscal control over their activities as defined by the Census Bureau. Additional details are available in the survey methodology description.

    The Annual Survey of School System Finances (SS): The Annual Survey of School System Finances targets all public school systems providing elementary and/or secondary education in all 50 states and the District of Columbia.

    Methodology

    Data Items and Other Identifying Records:

    • Fall enrollment
    • Total percentage distribution of revenue
    • Percentage distribution of revenue - Revenue from federal sources - Total
    • Percentage distribution of revenue - Revenue from federal sources - Title I
    • Percentage distribution of revenue - Revenue from state sources - Total
    • Percentage distribution of revenue - Revenue from state sources - General formula assistance
    • Percentage distribution of revenue - Revenue from local sources - Total
    • Percentage distribution of revenue - Revenue from local sources - Taxes and parent government contributions
    • Percentage distribution of revenue - Revenue from local sources - Other local governments
    • Percentage distribution of revenue - Revenue from local sources - Current charges

    Definitions can be found by clicking on the column header in the table or by accessing the Glossary. For detailed information, see Government Finance and Employment Classification Manual.

    Unit(s) of Observation: The basic reporting unit is the governmental unit, defined as an org...

  16. Virginia Non-Single Occupancy Vehicle (SOV) Travel Percent by Urban Area...

    • data.virginia.gov
    csv
    Updated Jan 3, 2025
    Cite
    Office of INTERMODAL Planning and Investment (2025). Virginia Non-Single Occupancy Vehicle (SOV) Travel Percent by Urban Area (ACS 5-Year) [Dataset]. https://data.virginia.gov/dataset/virginia-non-single-occupancy-vehicle-sov-travel-percent-by-urban-area-acs-5-year
    Explore at:
    csv(53336)Available download formats
    Dataset updated
    Jan 3, 2025
    Dataset authored and provided by
    Office of INTERMODAL Planning and Investment
    Area covered
    Virginia
    Description

    2013-2023 Virginia Non-Single Occupancy Vehicle (SOV) Travel Percent by Census Urban Area. Contains estimates. Covers workers 16 years and over commuting to work who do NOT drive alone in a car, truck, or van.

    Source: U.S. Census Bureau; American Community Survey, American Community Survey 5-Year Estimates, Table DP03, Column DP03_0019PE. Data accessed from: Census Bureau's API for American Community Survey (https://www.census.gov/data/developers/data-sets.html)

    Documentation of the method to calculate Non-SOV is provided by the Federal Highway Administration (https://www.fhwa.dot.gov/tpm/guidance/hif18024.pdf); page 38 explains the calculation of the Non-SOV Travel measure.

    For urban areas whose source value is -666,666,666 or 0, the Non-SOV value is left blank.
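
    The blank-value rule above can be sketched in a few lines. This is a minimal illustration, not the publisher's script: it assumes the Non-SOV percent is 100 minus the drove-alone percentage in column DP03_0019PE, and the function name and sample areas are hypothetical.

```python
from typing import Optional

# Placeholder value named in the dataset description.
SENTINEL = -666_666_666


def non_sov_percent(drove_alone_pct: Optional[float]) -> Optional[float]:
    """Return 100 minus the drove-alone share, or None (blank) when unusable.

    Assumption: the input is the DP03_0019PE percentage for one urban area.
    """
    if drove_alone_pct is None or drove_alone_pct in (SENTINEL, 0):
        return None
    return round(100.0 - drove_alone_pct, 1)


# Hypothetical urban areas, for illustration only.
rows = {"Area A": 76.4, "Area B": SENTINEL, "Area C": 0}
result = {name: non_sov_percent(value) for name, value in rows.items()}
```

    Returning None rather than 0 keeps a missing estimate from being read as "nobody commutes by a non-SOV mode".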

    The United States Census Bureau's American Community Survey (ACS):

    • What is the American Community Survey? (https://www.census.gov/programs-surveys/acs/about.html)
    • Geography & ACS (https://www.census.gov/programs-surveys/acs/geography-acs.html)
    • Technical Documentation (https://www.census.gov/programs-surveys/acs/technical-documentation.html)

    Supporting documentation on code lists, subject definitions, data accuracy, and statistical testing can be found on the American Community Survey website in the Technical Documentation section. (https://www.census.gov/programs-surveys/acs/technical-documentation/code-lists.html)

    Sample size and data quality measures (including coverage rates, allocation rates, and response rates) can be found on the American Community Survey website in the Methodology section. (https://www.census.gov/acs/www/methodology/sample_size_and_data_quality/)

    Although the American Community Survey (ACS) produces population, demographic and housing unit estimates, it is the Census Bureau's Population Estimates Program that produces and disseminates the official estimates of the population for the nation, states, counties, cities, and towns and estimates of housing units for states and counties.

    Data are based on a sample and are subject to sampling variability. The degree of uncertainty for an estimate arising from sampling variability is represented through the use of a margin of error. The value shown here is the 90 percent margin of error. The margin of error can be interpreted roughly as providing a 90 percent probability that the interval defined by the estimate minus the margin of error and the estimate plus the margin of error (the lower and upper confidence bounds) contains the true value. In addition to sampling variability, the ACS estimates are subject to nonsampling error (for a discussion of nonsampling variability, see ACS Technical Documentation https://www.census.gov/programs-surveys/acs/technical-documentation.html). The effect of nonsampling error is not represented in these tables.
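
    As a small worked example of the margin-of-error paragraph above, the sketch below builds the 90 percent confidence bounds and recovers an approximate standard error. The estimate and margin of error are made-up numbers, and the function names are hypothetical; 1.645 is the standard normal critical value for a 90 percent interval.

```python
Z_90 = 1.645  # critical value for a 90 percent confidence level


def confidence_bounds(estimate: float, moe: float) -> tuple[float, float]:
    """Lower and upper bounds: estimate minus and plus the margin of error."""
    return estimate - moe, estimate + moe


def standard_error(moe: float) -> float:
    """Approximate standard error implied by a 90 percent margin of error."""
    return moe / Z_90


# Hypothetical Non-SOV estimate of 23.6 percent with a margin of error of 1.2.
lower, upper = confidence_bounds(23.6, 1.2)
se = standard_error(1.2)
```

    Converting to a standard error first is useful when a different confidence level is wanted, for example multiplying by 1.96 for a 95 percent interval.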

  17. HSRC Master Sample II - Dataset - B2FIND

    • b2find.eudat.eu
    Updated Aug 12, 2025
    Cite
    (2025). HSRC Master Sample II - Dataset - B2FIND [Dataset]. https://b2find.eudat.eu/dataset/96d7c5e3-e8c8-5eb6-a25b-c22ad9f86fba
    Explore at:
    Dataset updated
    Aug 12, 2025
    Description

    Description: The 2005 HSRC Master Sample was used for SABSSM 2008 and 2012, the SANHANES study in 2012 and SASAS 2007-2010 (adjacent EAs) to obtain an understanding of geographical spread of HIV/AIDS, perceptions and attitudes of people and other health related studies over time. Abstract: A sample can be defined as a subset containing the characteristics of a larger population. Samples are used in statistical testing when population sizes are too large for the test to include all possible members or observations. A sample should represent the whole population and not reflect bias toward a specific attribute.[1] One of the most crucial aspects of sample design in household surveys is its frame. The sampling frame has significant implications on the cost and the quality of any survey, household or otherwise.[2] The sampling frame .... in a household survey must cover the entire target population. When that frame is used for multiple surveys or multiple rounds of the same survey it is known as a master sample frame or .... master sample.[3] A master sample is a sample drawn from a population for use on a number of future occasions, so as to avoid ad hoc sampling on each occasion. Sometimes the master sample is large and subsequent inquiries are based on a sub-sample from it.[4] The HSRC compiles master samples in order to construct samples for various HSRC research studies. The 2005 HSRC Master Sample was used for SABSSM 2008 and 2012, SASAS 2007-2010 and the SANHANES study in 2012 to obtain an understanding of geographical spread of HIV/AIDS, perceptions and attitudes of people and other health related studies over time. The 2005 HSRC Master Sample was created in the following way: South Africa was delineated into EAs according to municipality and province. Municipal boundaries were obtained from the Municipal Demarcation Board. 
An Enumeration area (EA) is the smallest geographical unit (piece of land) into which the country is divided for census or survey enumeration.[5] The concepts and definitions of terms used for Census 2001 comply in most instances with United Nations standards for censuses. A total of 1,000 census enumeration areas (EAs) from the 2001 population census were randomly selected using probability proportional to size and stratified by province, locality type and race in urban areas from a database of 80 787 EAs that were mapped using aerial photography to develop an HSRC master sample for selecting households. The ideal frame would be complete with respect to the target population if all of its members (the universe) are covered by the frame. Ideal characteristics of a master sample: The master frame should be as complete, accurate and current as practicable. A master sample frame for household surveys is typically developed from the most recent census, just as a regular sample frame is. Because the master frame may be used during an entire intercensal (between census) period, however, it will usually require periodic and regular updating such as every 2-3 years. This is in contrast to a regular frame which is more likely to be up-dated on an ad hoc basis and only when a particular survey is being planned[6] [1] http://www.investopedia.com/terms/s/sample.asp [2] http://unstats.un.org/unsd/demographic/meetings/egm/sampling_1203/docs/no_3.pdf [3] http://unstats.un.org/unsd/demographic/meetings/egm/sampling_1203/docs/no_3.pdf [4] A Dictionary of Statistical Terms, 5th edition, prepared for the International Statistical Institute by F.H.C. Marriott. Published for the International Statistical Institute by Longman Scientific and Technical. 
http://stats.oecd.org/glossary/detail.asp?ID=3708 [5] http://africageodownloads.info/128_mokgokolo.pdf [6] http://unstats.un.org/unsd/demographic/meetings/egm/sampling_1203/docs/no_3.pdf All enumeration areas (80 787 EAs) within the South African borders during the 2001 Census. The whole country was delimited into EAs according to municipality and province. Municipal boundaries were obtained from the Municipal Demarcation Board. A total of 1,000 census enumeration areas (EAs) from the 2001 population census were randomly selected using probability proportional to size and stratified by province, locality type and race in urban areas from a database of 80 787 EAs that were mapped in all surveys using aerial photography to develop an HSRC master sample for selecting households. The first digit represents the province. The second and third digits represent the municipality.
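
    Selection with probability proportional to size, as described above, can be sketched as follows. This is an illustration under assumptions: the description does not say which PPS procedure the HSRC used, so the systematic variant shown here is one common choice, and the function name and the EA sizes are hypothetical. Stratification is left out; in practice the routine would be run once per stratum.

```python
import random
from itertools import accumulate


def pps_systematic(sizes: list[int], n: int, rng: random.Random) -> list[int]:
    """Pick n unit indices with probability proportional to size.

    Walks a random-start, fixed-step sequence of points along the
    cumulative size total and selects the unit each point falls in.
    """
    cumulative = list(accumulate(sizes))
    step = cumulative[-1] / n
    start = rng.uniform(0, step)
    picks, i = [], 0
    for k in range(n):
        point = start + k * step
        # Advance to the unit whose cumulative range contains the point.
        while i < len(cumulative) - 1 and cumulative[i] <= point:
            i += 1
        picks.append(i)
    return picks


rng = random.Random(2001)
# Hypothetical household counts for 200 enumeration areas.
ea_sizes = [rng.randint(50, 300) for _ in range(200)]
selected = pps_systematic(ea_sizes, 10, rng)
```

    A unit larger than the sampling step can be picked more than once with this method; the example sizes are kept small enough that this does not happen.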

  18. Household Health Survey 2006-2007, Economic Research Forum (ERF)...

    • catalog.ihsn.org
    Updated Jun 26, 2017
    Cite
    Kurdistan Regional Statistics Office (KRSO) (2017). Household Health Survey 2006-2007, Economic Research Forum (ERF) Harmonization Data - Iraq [Dataset]. https://catalog.ihsn.org/index.php/catalog/6936
    Explore at:
    Dataset updated
    Jun 26, 2017
    Dataset provided by
    Kurdistan Regional Statistics Office (KRSO)
    Central Organization for Statistics and Information Technology (COSIT)
    Economic Research Forum
    Time period covered
    2006 - 2007
    Area covered
    Iraq
    Description

    Abstract

    The harmonized data set on health, created and published by the ERF, is a subset of Iraq Household Socio Economic Survey (IHSES) 2006/2007. It was derived from the household, individual and health modules, collected in the context of the above mentioned survey. The sample was then used to create a harmonized health survey, comparable with the Iraq Household Socio Economic Survey (IHSES) 2012 micro data set.

    ----> Overview of the Iraq Household Socio Economic Survey (IHSES) 2006/2007:
    To develop effective poverty reduction policies and programs, Iraqi policy makers need to know how large the poverty problem is, what kind of people are poor, and what the causes and consequences of poverty are. Until recently, they had neither the data nor an official poverty line. (The last national income and expenditure survey was in 1988.)

    In response to this situation, the Iraqi Ministry of Planning and Development Cooperation established the Household Survey and Policies for Poverty Reduction Project in 2006, with financial and technical support of the World Bank. The project has been led by the Iraqi Poverty Reduction Strategy High Committee, a group which includes representatives from Parliament, the prime minister's office, the Kurdistan Regional Government, and the ministries of Planning and Development Cooperation, Finance, Trade, Labor and Social Affairs, Education, Health, Women's Affairs, and Baghdad University.

    The Project has consisted of three components:

    • Collection of data which can provide a measurable indicator of welfare, i.e. the Iraq Household Socio Economic Survey (IHSES).

    • Establishment of an official poverty line (i.e. a cut off point below which people are considered poor) and analysis of poverty (how large the poverty problem is, what kind of people are poor and what are the causes and consequences of poverty).

    • Development of a Poverty Reduction Strategy, based on a solid understanding of poverty in Iraq.

    The survey has four main objectives. These are:

    • To provide data that will help in the measurement and analysis of poverty.

    • To provide data required to establish a new consumer price index (CPI) since the current outdated CPI is based on 1993 data and no longer applies to the country's vastly changed circumstances.

    • To provide data that meet the requirements and needs of national accounts.

    • To provide other indicators, such as consumption expenditure, sources of income, human development, and time use.

    The raw survey data provided by the Statistical Office were then harmonized by the Economic Research Forum, to create a comparable version with the 2012 Household Socio Economic Survey in Iraq. Harmonization at this stage only included unifying variables' names, labels and some definitions. See: Iraq 2007 & 2012- Variables Mapping & Availability Matrix.pdf provided in the external resources for further information on the mapping of the original variables on the harmonized ones, in addition to more indications on the variables' availability in both survey years and relevant comments.

    Geographic coverage

    National coverage: Covering a sample of urban, rural and metropolitan areas in all the governorates including those in Kurdistan Region.

    Analysis unit

    1- Household/family. 2- Individual/person.

    Universe

    The survey covered a national sample of households and all individuals permanently residing in surveyed households.

    Kind of data

    Sample survey data [ssd]

    Sampling procedure

    ----> Total sample size and stratification:

    The total effective sample size of the Iraq Household Socio Economic Survey (IHSES) 2007 is 17,822 households. The survey was nominally designed to visit 18,144 households - 324 in each of 56 major strata. The strata are the rural, urban and metropolitan sections of each of Iraq's 18 governorates, with the exception of Baghdad, which has three metropolitan strata. The Iraq Household Socio Economic Survey (IHSES) 2007 and the MICS 2006 survey intended to visit the same nominal sample. Variable q0040 indicates whether this was indeed the case.

    ----> Sample frame:

    The 1997 population census frame was applied to the 15 governorates that participated in the census (the three governorates in Kurdistan Region of Iraq were excluded). For Sulaimaniya, the population frame prepared for the compulsory education project was adopted. For Erbil and Duhouk, the enumeration frame implemented in the 2004 Iraq Living Conditions Survey was updated and used. The population covered by Iraq Household Socio Economic Survey (IHSES) included all households residing in Iraq from November 1, 2006, to October 30, 2007, meaning that every household residing within Iraq's geographical boundaries during that period potentially could be selected for the sample.

    ----> Primary sampling units and the listing and mapping exercise:

    The 1997 population census frame provided a database for all households. The smallest enumeration unit was the village in rural areas and, in urban areas, the majal (census enumeration area), which is a collection of 15-25 urban households. The majals were merged to form Primary Sampling Units (PSUs), containing 70-100 households each. In Kurdistan, PSUs were created based on the maps and frames updated by the statistics offices. Villages in rural areas, especially those with few inhabitants, were merged to form PSUs. Selecting a truly representative sample required that changes between 1997 and the pilot survey be accounted for. The names and addresses of the households in each sample point (that is, the selected PSU) were updated, and a map was drawn that defined the unit's borders, buildings, houses, and the streets and alleys passing through. All buildings were renumbered. A list of heads of household in each sample point was prepared from forms that were filled out and used as a frame for selecting the sample households.

    ----> Sampling strategy and sampling stages:

    The sample was selected in two stages, with groups of majals (Census Enumeration Areas) as Primary Sampling Units (PSUs) and households as Secondary Sampling Units. In the first stage, 54 PSUs were selected with probability proportional to size (pps) within each stratum, using the number of households recorded by the 1997 Census as a measure of size. In the second stage, six households were selected by systematic equal probability sampling (seps) within each PSU. To this end, a cartographic updating and household listing operation was conducted in 2006 in all 3,024 PSUs, without resorting to the segmentation of any large PSUs. The total sample is thus nominally composed of 6 households in each of 3,024 PSUs.
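
    The second-stage step described above, systematic equal probability sampling of six households from a PSU's household listing, can be sketched as below. This is an illustrative version only: the function name is hypothetical, and the listing of 85 households is invented to match the stated PSU size of 70-100 households.

```python
import random


def seps(listing: list[str], n: int, rng: random.Random) -> list[str]:
    """Systematic equal probability sample of n entries from an ordered listing.

    A random start below the sampling interval is chosen, then every
    interval-th entry is taken, so each household has the same chance n / N.
    """
    interval = len(listing) / n
    start = rng.uniform(0, interval)
    last = len(listing) - 1
    return [listing[min(int(start + k * interval), last)] for k in range(n)]


rng = random.Random(2006)
# Hypothetical listing of heads of household for one PSU.
psu_listing = [f"household_{i:03d}" for i in range(1, 86)]
sample = seps(psu_listing, 6, rng)
```

    Because the listing is walked in order, the six selected households are spread across the PSU instead of clustering in one part of the list.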

    ----> Sample Points Trios, teams and survey waves:

    The PSUs selected in each governorate (270 in Baghdad and 162 in each of the other governorates) were sorted into groups of three neighboring PSUs called trios -- 90 trios in Baghdad and 54 per governorate elsewhere. The three PSUs in each trio do not necessarily belong to the same stratum. The 12 months of the data collection period were divided into 18 periods of 20 or 21 days called survey waves. Fieldworkers were organized into teams of three interviewers, each team being responsible for interviewing one trio during a survey wave. The survey used 56 teams in total - 5 in Baghdad and 3 per governorate elsewhere. The 18 trios assigned to each team were allocated into survey waves at random. The 'time use' module was administered to two of the six households selected in each PSU: nominally the second and fifth households selected by the seps procedure in the PSU.
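
    The design figures quoted in the sampling sections fit together arithmetically, which the short check below makes explicit. The one inferred input is the number of strata in Baghdad: five (rural, urban, and three metropolitan) is the value that yields the stated totals of 56 strata and 270 Baghdad PSUs. Variable names are illustrative.

```python
GOVERNORATES = 18
STRATA_BAGHDAD = 5      # inferred: rural, urban, and three metropolitan strata
STRATA_OTHER = 3        # rural, urban, metropolitan
PSUS_PER_STRATUM = 54
HOUSEHOLDS_PER_PSU = 6
SURVEY_WAVES = 18
PSUS_PER_TRIO = 3

strata = STRATA_BAGHDAD + STRATA_OTHER * (GOVERNORATES - 1)    # 56
households_per_stratum = PSUS_PER_STRATUM * HOUSEHOLDS_PER_PSU  # 324
psus = strata * PSUS_PER_STRATUM                                # 3,024
nominal_households = psus * HOUSEHOLDS_PER_PSU                  # 18,144

psus_baghdad = STRATA_BAGHDAD * PSUS_PER_STRATUM                # 270
psus_elsewhere = STRATA_OTHER * PSUS_PER_STRATUM                # 162 per governorate

trios = psus // PSUS_PER_TRIO                                   # 1,008
teams = trios // SURVEY_WAVES                                   # 56

# Share of households given the time-use module: 2 of every 6.
time_use_share = 2 / HOUSEHOLDS_PER_PSU
```

    Each of the 56 teams covers one trio per wave, so 18 waves account for all 1,008 trios.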

    ----> Time-use sample:

    The Iraq Household Socio Economic Survey (IHSES) questionnaire on time use covered all household members aged 10 years and older. A subsample of one-third of the households was selected (the second and fifth of the six households in each sample point). The second and fourth visits were designated for completion of the time-use sheet, which covered all activities performed by every member of the household.

    A more detailed description of the allocation of sample across governorates is provided in the tabulation report document available among external resources in both English and Arabic.

    Sampling deviation

    ----> Exceptional Measures

    The design did not consider the replacement of any of the randomly selected units (PSUs or households.) However, sometimes a team could not visit a cluster during the allocated wave because of unsafe security conditions. When this happened, that cluster was then swapped with another cluster from a randomly selected future wave that was considered more secure. If none were considered secure, a sample point was randomly selected from among those that had been visited already. The team then visited a new cluster within that sample point. (That is, the team visited six households that had not been previously interviewed.) The original cluster as well as the new cluster were both selected by systematic equal probability sampling.

    This explains why the survey datasets only contain data from 2,876 of the 3,024 originally selected PSUs, whereas 55 of the PSUs contain more than the six households nominally dictated by the design.

    The wave number in the survey datasets is always the nominal wave number, corresponding to the random allocation considered by the design. The effective interview dates can be found in questions 35 to 39 of the survey questionnaires.

    Remarkably few of the original clusters could not be visited during the fieldwork. Nationally, less than 2 percent of the original clusters (55 of 3,024) had to be replaced. Of the original clusters, 20 of 54 (37 percent) could not be visited in the stratum of “Kirkuk/other urban” and

  19. American Community Survey (ACS) – Vision and Eye Health Surveillance

    • catalog.data.gov
    • healthdata.gov
    Updated May 16, 2025
    Cite
    Centers for Disease Control and Prevention (2025). American Community Survey (ACS) – Vision and Eye Health Surveillance [Dataset]. https://catalog.data.gov/dataset/american-community-survey-acs-vision-and-eye-health-surveillance-0f989
    Explore at:
    Dataset updated
    May 16, 2025
    Dataset provided by
    Centers for Disease Control and Prevention (http://www.cdc.gov/)
    Description

    2014 - 2022 (excluding 2020). This dataset is a de-identified summary table of vision and eye health data indicators from ACS, stratified by all available combinations of age group, race/ethnicity, gender, and state. ACS is an annual nationwide survey conducted by the U.S. Census Bureau that collects information on demographic, social, economic, and housing characteristics of the U.S. population. Approximate sample size is 3 million annually. ACS data for VEHSS includes one question related to Visual Function. Data were suppressed for cell sizes less than 30 persons, or where the relative standard error was more than 30% of the mean. Data will be updated as it becomes available. Detailed information on VEHSS ACS analyses can be found on the VEHSS ACS webpage (link). Additional information about ACS can be found on the U.S. Census Bureau website (https://www.census.gov/content/dam/Census/programs-surveys/acs/about/ACS_Information_Guide.pdf). The VEHSS ACS dataset was last updated April 2024.
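
    The suppression rule stated above can be written as a small predicate. This is a sketch rather than the VEHSS implementation: the function name is hypothetical, the relative standard error is taken as the standard error divided by the estimate, and treating a zero estimate as suppressed is an added assumption because the ratio is undefined there.

```python
def is_suppressed(cell_size: int, estimate: float, std_error: float,
                  min_cell: int = 30, max_rse: float = 0.30) -> bool:
    """Apply the two stated rules: small cell, or relative standard error above 30%."""
    if cell_size < min_cell:
        return True
    if estimate == 0:
        # Assumption: RSE is undefined for a zero estimate, so suppress.
        return True
    return std_error / abs(estimate) > max_rse


# Hypothetical cells: (respondents, estimated prevalence in percent, standard error).
cells = {
    "large, precise": (450, 4.2, 0.5),
    "too few respondents": (12, 6.0, 0.4),
    "too noisy": (85, 2.0, 0.9),
}
flags = {name: is_suppressed(*values) for name, values in cells.items()}
```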

  20. Synthetic Data for an Imaginary Country, Sample, 2023 - World

    • microdata.worldbank.org
    • nada-demo.ihsn.org
    Updated Jul 7, 2023
    Cite
    Development Data Group, Data Analytics Unit (2023). Synthetic Data for an Imaginary Country, Sample, 2023 - World [Dataset]. https://microdata.worldbank.org/index.php/catalog/5906
    Explore at:
    Dataset updated
    Jul 7, 2023
    Dataset authored and provided by
    Development Data Group, Data Analytics Unit
    Time period covered
    2023
    Area covered
    World
    Description

    Abstract

    The dataset is a relational dataset of 8,000 households, representing a sample of the population of an imaginary middle-income country. The dataset contains two data files: one with variables at the household level, the other one with variables at the individual level. It includes variables that are typically collected in population censuses (demography, education, occupation, dwelling characteristics, fertility, mortality, and migration) and in household surveys (household expenditure, anthropometric data for children, assets ownership). The data only includes ordinary households (no community households). The dataset was created using REaLTabFormer, a model that leverages deep learning methods. The dataset was created for the purpose of training and simulation and is not intended to be representative of any specific country.

    The full-population dataset (with about 10 million individuals) is also distributed as open data.

    Geographic coverage

    The dataset is a synthetic dataset for an imaginary country. It was created to represent the population of this country by province (equivalent to admin1) and by urban/rural areas of residence.

    Analysis unit

    Household, Individual

    Universe

    The dataset is a fully-synthetic dataset representative of the resident population of ordinary households for an imaginary middle-income country.

    Kind of data

    ssd

    Sampling procedure

    The sample size was set to 8,000 households. The fixed number of households to be selected from each enumeration area was set to 25. In a first stage, the number of enumeration areas to be selected in each stratum was calculated, proportional to the size of each stratum (stratification by geo_1 and urban/rural). Then 25 households were randomly selected within each enumeration area. The R script used to draw the sample is provided as an external resource.
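
    The first-stage allocation described above can be sketched as follows. With 8,000 households and 25 per enumeration area, 320 enumeration areas are needed in total. The sketch is not the R script distributed with the dataset: the function name, the stratum labels, and the stratum sizes are hypothetical, and the largest-remainder rounding is an assumption about how fractional quotas could be resolved.

```python
def allocate_eas(stratum_sizes: dict[str, int],
                 total_households: int = 8000,
                 households_per_ea: int = 25) -> dict[str, int]:
    """Split the required enumeration areas across strata in proportion to size."""
    n_eas = total_households // households_per_ea  # 320 for the stated design
    total = sum(stratum_sizes.values())
    quotas = {s: n_eas * size / total for s, size in stratum_sizes.items()}
    allocation = {s: int(q) for s, q in quotas.items()}
    # Hand out EAs lost to rounding down, largest fractional part first.
    leftover = n_eas - sum(allocation.values())
    by_remainder = sorted(quotas, key=lambda s: quotas[s] - allocation[s], reverse=True)
    for s in by_remainder[:leftover]:
        allocation[s] += 1
    return allocation


# Hypothetical strata (geo_1 by urban/rural) with household counts.
strata = {
    "province_1_urban": 500_000,
    "province_1_rural": 300_000,
    "province_2_urban": 150_000,
    "province_2_rural": 50_000,
}
allocation = allocate_eas(strata)
```

    Multiplying each stratum's allocation by 25 gives the number of households drawn from it, and the allocations always sum to 320.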

    Mode of data collection

    other

    Research instrument

    The dataset is a synthetic dataset. Although the variables it contains are variables typically collected from sample surveys or population censuses, no questionnaire is available for this dataset. A "fake" questionnaire was however created for the sample dataset extracted from this dataset, to be used as training material.

    Cleaning operations

    The synthetic data generation process included a set of "validators" (consistency checks, based on which synthetic observations were assessed and rejected/replaced when needed). Also, some post-processing was applied to the data to result in the distributed data files.

    Response rate

    This is a synthetic dataset; the "response rate" is 100%.

Cite
Damico, Anthony (2023). Current Population Survey (CPS) [Dataset]. http://doi.org/10.7910/DVN/AK4FDD

Current Population Survey (CPS)

Explore at:
Dataset updated
Nov 21, 2023
Dataset provided by
Harvard Dataverse
Authors
Damico, Anthony
Description

analyze the current population survey (cps) annual social and economic supplement (asec) with r. the annual march cps-asec has been supplying the statistics for the census bureau's report on income, poverty, and health insurance coverage since 1948. wow. the us census bureau and the bureau of labor statistics (bls) tag-team on this one. until the american community survey (acs) hit the scene in the early aughts (2000s), the current population survey had the largest sample size of all the annual general demographic data sets outside of the decennial census - about two hundred thousand respondents. this provides enough sample to conduct state- and a few large metro area-level analyses. your sample size will vanish if you start investigating subgroups by state - consider pooling multiple years. county-level is a no-no. despite the american community survey's larger size, the cps-asec contains many more variables related to employment, sources of income, and insurance - and can be trended back to harry truman's presidency. aside from questions specifically asked about an annual experience (like income), many of the questions in this march data set should be treated as point-in-time statistics. cps-asec generalizes to the united states non-institutional, non-active duty military population. the national bureau of economic research (nber) provides sas, spss, and stata importation scripts to create a rectangular file (rectangular data means only person-level records; household- and family-level information gets attached to each person). to import these files into r, the parse.SAScii function uses nber's sas code to determine how to import the fixed-width file, then RSQLite to put everything into a schnazzy database. you can try reading through the nber march 2012 sas importation code yourself, but it's a bit of a proc freak show. 
this new github repository contains three scripts:

2005-2012 asec - download all microdata.R
- download the fixed-width file containing household, family, and person records
- import by separating this file into three tables, then merge 'em together at the person-level
- download the fixed-width file containing the person-level replicate weights
- merge the rectangular person-level file with the replicate weights, then store it in a sql database
- create a new variable - one - in the data table

2012 asec - analysis examples.R
- connect to the sql database created by the 'download all microdata' program
- create the complex sample survey object, using the replicate weights
- perform a boatload of analysis examples

replicate census estimates - 2011.R
- connect to the sql database created by the 'download all microdata' program
- create the complex sample survey object, using the replicate weights
- match the sas output shown in the png file below

2011 asec replicate weight sas output.png: statistic and standard error generated from the replicate-weighted example sas script contained in this census-provided person replicate weights usage instructions document.

all three scripts can be viewed in the github repository.

for more detail about the current population survey - annual social and economic supplement (cps-asec), visit:
- the census bureau's current population survey page
- the bureau of labor statistics' current population survey page
- the current population survey's wikipedia article

notes:

interviews are conducted in march about experiences during the previous year. the file labeled 2012 includes information (income, work experience, health insurance) pertaining to 2011. when you use the current population survey to talk about america, subtract a year from the data file name.

as of the 2010 file (the interview focusing on america during 2009), the cps-asec contains exciting new medical out-of-pocket spending variables most useful for supplemental (medical spending-adjusted) poverty research.
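for a rough feel of what "using the replicate weights" buys you, here's a base r sketch of a replicate-weighted standard error. it is not the repository's code (which builds a complex sample survey object instead). the helper fay_se, the toy values, and the four toy replicate weights are all hypothetical. it also assumes fay's method with a coefficient of 0.5, which makes the variance factor 4 divided by the number of replicates; that coefficient isn't stated anywhere above, so check it against the census usage instructions document before trusting it.

```r
# hypothetical helper: standard error from replicate estimates.
# assumes fay's method; rho = 0.5 is an assumption, not taken from the text.
fay_se <- function(full_estimate, replicate_estimates, rho = 0.5) {
  n_reps <- length(replicate_estimates)
  scale <- 1 / (n_reps * (1 - rho)^2)
  sqrt(scale * sum((replicate_estimates - full_estimate)^2))
}

# toy data: four people, one full-sample weight, four replicate weights.
# every number here is invented.
x <- c(10, 20, 30, 40)
full_weight <- c(1, 1, 1, 1)
rep_weights <- cbind(
  c(1.5, 0.5, 1.5, 0.5),
  c(0.5, 1.5, 0.5, 1.5),
  c(1.5, 1.5, 0.5, 0.5),
  c(0.5, 0.5, 1.5, 1.5)
)

# the statistic of interest: a weighted mean of x
weighted_mean <- function(w) sum(w * x) / sum(w)

# compute the statistic once with the full-sample weight..
full_estimate <- weighted_mean(full_weight)

# ..and once per replicate weight column
replicate_estimates <- apply(rep_weights, 2, weighted_mean)

# the spread of the replicate estimates around the full estimate gives the se
se <- fay_se(full_estimate, replicate_estimates)
```

the same recipe works for any statistic: recompute it once per replicate weight, then summarize how far those results stray from the full-sample answer.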
confidential to sas, spss, stata, sudaan users: why are you still rubbing two sticks together after we've invented the butane lighter? time to transition to r. :D
