100+ datasets found
  1. Census Income dataset

    • kaggle.com
    zip
    Updated Oct 28, 2023
    Cite
    tawfik elmetwally (2023). Census Income dataset [Dataset]. https://www.kaggle.com/datasets/tawfikelmetwally/census-income-dataset
    Explore at:
    Available download formats: zip (707150 bytes)
    Dataset updated
    Oct 28, 2023
    Authors
    tawfik elmetwally
    Description

    This intermediate-level data set was extracted from the Census Bureau database. It contains 48842 instances with a mix of continuous and discrete attributes (train=32561, test=16281).

    The data set has 15 attributes, which include age, sex, education level, and other relevant details of a person. It will help you improve your skills in Exploratory Data Analysis, Data Wrangling, Data Visualization, and Classification Models.

    Feel free to explore the data set with multiple supervised and unsupervised learning techniques. The following description gives more details on this data set:

    • age: the age of an individual.
    • workclass: The type of work or employment of an individual. It can have the following categories:
      • Private: Working in the private sector.
      • Self-emp-not-inc: Self-employed individuals who are not incorporated.
      • Self-emp-inc: Self-employed individuals who are incorporated.
      • Federal-gov: Working for the federal government.
      • Local-gov: Working for the local government.
      • State-gov: Working for the state government.
      • Without-pay: Not working and without pay.
      • Never-worked: Never worked before.
    • Final Weight: The weights on the CPS files are controlled to independent estimates of the civilian noninstitutional population of the US. These are prepared monthly for us by Population Division here at the Census Bureau. We use 3 sets of controls:
      • A single cell estimate of the population 16+ for each state.
      • Controls for Hispanic Origin by age and sex.
      • Controls by Race, age and sex.

      We use all three sets of controls in our weighting program and "rake" through them 6 times so that by the end we come back to all the controls we used.

      People with similar demographic characteristics should have similar weights. There is one important caveat to remember about this statement: since the CPS sample is actually a collection of 51 state samples, each with its own probability of selection, the statement only applies within state.

    • education: The highest level of education completed.
    • education-num: The number of years of education completed.
    • marital-status: The marital status.
    • occupation: Type of work performed by an individual.
    • relationship: The relationship status.
    • race: The race of an individual.
    • sex: The gender of an individual.
    • capital-gain: The amount of capital gain (financial profit).
    • capital-loss: The amount of capital loss an individual has incurred.
    • hours-per-week: The number of hours worked per week.
    • native-country: The country of origin or the native country.
    • income: The income level of an individual and serves as the target variable. It indicates whether the income is greater than $50,000 or less than or equal to $50,000, denoted as (>50K, <=50K).
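
    The "raking" mentioned in the Final Weight description can be illustrated with a small iterative proportional fitting sketch. Everything here is made up for illustration: the six-person sample, the two control margins (sex and age group), the control totals, and the function name `rake`. The actual CPS weighting program uses three sets of controls and is not reproduced here.

```python
import numpy as np

def rake(weights, groups_a, groups_b, targets_a, targets_b, passes=6):
    """Rescale weights so weighted totals approach two sets of control totals.

    Each pass matches margin A exactly, then margin B exactly.
    passes=6 mirrors the six passes mentioned in the description.
    """
    w = np.asarray(weights, dtype=float).copy()
    for _ in range(passes):
        for groups, targets in ((groups_a, targets_a), (groups_b, targets_b)):
            for level, target in targets.items():
                mask = groups == level
                w[mask] *= target / w[mask].sum()
    return w

# Hypothetical sample of six people with equal starting weights.
sex = np.array(["F", "F", "F", "M", "M", "M"])
age = np.array(["16-39", "40+", "40+", "16-39", "16-39", "40+"])
start = np.ones(6)

# Hypothetical population controls (each set sums to 100).
sex_totals = {"F": 52.0, "M": 48.0}
age_totals = {"16-39": 45.0, "40+": 55.0}

raked = rake(start, sex, age, sex_totals, age_totals)
print(raked.round(3))
```

    After the final pass the last margin adjusted (age) is matched exactly, while the first (sex) is matched only approximately, which is why the procedure is repeated several times.
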
  2. Current Population Survey (CPS)

    • search.dataone.org
    • dataverse.harvard.edu
    Updated Nov 21, 2023
    Cite
    Damico, Anthony (2023). Current Population Survey (CPS) [Dataset]. http://doi.org/10.7910/DVN/AK4FDD
    Explore at:
    Dataset updated
    Nov 21, 2023
    Dataset provided by
    Harvard Dataverse
    Authors
    Damico, Anthony
    Description

    analyze the current population survey (cps) annual social and economic supplement (asec) with r

    the annual march cps-asec has been supplying the statistics for the census bureau's report on income, poverty, and health insurance coverage since 1948. wow. the us census bureau and the bureau of labor statistics (bls) tag-team on this one. until the american community survey (acs) hit the scene in the early aughts (2000s), the current population survey had the largest sample size of all the annual general demographic data sets outside of the decennial census - about two hundred thousand respondents. this provides enough sample to conduct state- and a few large metro area-level analyses. your sample size will vanish if you start investigating subgroups by state - consider pooling multiple years. county-level is a no-no.

    despite the american community survey's larger size, the cps-asec contains many more variables related to employment, sources of income, and insurance - and can be trended back to harry truman's presidency. aside from questions specifically asked about an annual experience (like income), many of the questions in this march data set should be treated as point-in-time statistics. cps-asec generalizes to the united states non-institutional, non-active duty military population.

    the national bureau of economic research (nber) provides sas, spss, and stata importation scripts to create a rectangular file (rectangular data means only person-level records; household- and family-level information gets attached to each person). to import these files into r, the parse.SAScii function uses nber's sas code to determine how to import the fixed-width file, then RSQLite to put everything into a schnazzy database. you can try reading through the nber march 2012 sas importation code yourself, but it's a bit of a proc freak show.

    this new github repository contains three scripts:

    • 2005-2012 asec - download all microdata.R
      • download the fixed-width file containing household, family, and person records
      • import by separating this file into three tables, then merge 'em together at the person-level
      • download the fixed-width file containing the person-level replicate weights
      • merge the rectangular person-level file with the replicate weights, then store it in a sql database
      • create a new variable - one - in the data table
    • 2012 asec - analysis examples.R
      • connect to the sql database created by the 'download all microdata' program
      • create the complex sample survey object, using the replicate weights
      • perform a boatload of analysis examples
    • replicate census estimates - 2011.R
      • connect to the sql database created by the 'download all microdata' program
      • create the complex sample survey object, using the replicate weights
      • match the sas output shown in the png file below

    2011 asec replicate weight sas output.png: statistic and standard error generated from the replicate-weighted example sas script contained in this census-provided person replicate weights usage instructions document.

    for more detail about the current population survey - annual social and economic supplement (cps-asec), visit:

    • the census bureau's current population survey page
    • the bureau of labor statistics' current population survey page
    • the current population survey's wikipedia article

    notes:

    interviews are conducted in march about experiences during the previous year. the file labeled 2012 includes information (income, work experience, health insurance) pertaining to 2011. when you use the current population survey to talk about america, subtract a year from the data file name.

    as of the 2010 file (the interview focusing on america during 2009), the cps-asec contains exciting new medical out-of-pocket spending variables most useful for supplemental (medical spending-adjusted) poverty research.

    confidential to sas, spss, stata, sudaan users: why are you still rubbing two sticks together after we've invented the butane lighter? time to transition to r. :D
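
    the replicate-weight step described above can be sketched in a few lines. this is written in python rather than r, and it is not the repository's code: the toy weights, the function names, and the 4/R scaling factor are assumptions for illustration only. check the census-provided replicate weights usage instructions for the factor that applies to the file you are using.

```python
import numpy as np

def weighted_total(indicator, weights):
    """Weighted count of records where indicator is 1."""
    return float(np.sum(np.asarray(indicator) * np.asarray(weights)))

def replicate_se(full_estimate, replicate_estimates, scale):
    """Standard error as sqrt(scale * sum((replicate - full)^2))."""
    reps = np.asarray(replicate_estimates, dtype=float)
    return float(np.sqrt(scale * np.sum((reps - full_estimate) ** 2)))

# Hypothetical data: four person records, one full-sample weight each,
# and three sets of replicate weights (real files carry many more).
has_characteristic = np.array([1, 0, 1, 1])
full_weight = np.array([100.0, 150.0, 120.0, 130.0])
replicate_weights = np.array([
    [110.0, 140.0, 115.0, 135.0],
    [95.0, 155.0, 125.0, 125.0],
    [100.0, 150.0, 130.0, 120.0],
])

full = weighted_total(has_characteristic, full_weight)
reps = [weighted_total(has_characteristic, w) for w in replicate_weights]

# ASSUMPTION: scale = 4 / number of replicates.
se = replicate_se(full, reps, scale=4 / len(reps))
print(full, round(se, 3))
```
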

  3. UCI Adult Census Data Dataset

    • kaggle.com
    zip
    Updated Aug 10, 2020
    Cite
    Sagnik (2020). UCI Adult Census Data Dataset [Dataset]. https://www.kaggle.com/datasets/sagnikpatra/uci-adult-census-data-dataset
    Explore at:
    Available download formats: zip (745972 bytes)
    Dataset updated
    Aug 10, 2020
    Authors
    Sagnik
    Description

    The US Adult income dataset was extracted by Barry Becker from the 1994 US Census Database. The data set consists of anonymous information such as occupation, age, native country, race, capital gain, capital loss, education, work class and more. I encountered it during my course, and I wish to share it here because it is a good starter example for data pre-processing and machine learning practices.

    Fields

    The dataset contains 16 columns.
    Target field: Income -- the income is divided into two classes: <=50K and >50K.
    Number of attributes: 14 -- these are the demographics and other features that describe a person.

    We can explore the possibility of predicting income level based on the individual’s personal information.
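
    As a starting point for that kind of pre-processing, here is a minimal sketch that turns the income column into a binary target and one-hot encodes a categorical feature. The six rows are invented, and the column names (`age`, `workclass`, `education-num`, `hours-per-week`, `income`) are assumed from the usual UCI "adult" naming; the actual file on Kaggle may label them differently.

```python
import io
import pandas as pd

# Hypothetical rows in the style of the "adult" data; not taken from the file.
SAMPLE = """age,workclass,education-num,hours-per-week,income
39,State-gov,13,40,<=50K
50,Self-emp-not-inc,13,13,<=50K
38,Private,9,40,<=50K
52,Self-emp-inc,14,45,>50K
31,Private,14,50,>50K
42,Private,13,40,>50K
"""

df = pd.read_csv(io.StringIO(SAMPLE), skipinitialspace=True)

# Binary target: 1 for ">50K", 0 for "<=50K".
y = (df["income"].str.strip() == ">50K").astype(int)

# Feature matrix: numeric columns kept as-is, workclass one-hot encoded.
X = pd.get_dummies(df.drop(columns="income"), columns=["workclass"])

print(X.columns.tolist())
print(y.tolist())
```

    From here, `X` and `y` can be passed to any classifier; which model works best on this data is left open.
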

    Acknowledgements

    This dataset, named “adult”, is found in the UCI Machine Learning Repository.

  4. Decennial Census: State Legislative District Summary File (Sample)

    • catalog.data.gov
    Updated Jul 19, 2023
    + more versions
    Cite
    U.S. Census Bureau (2023). Decennial Census: State Legislative District Summary File (Sample) [Dataset]. https://catalog.data.gov/dataset/decennial-census-state-legislative-district-summary-file-sample
    Explore at:
    Dataset updated
    Jul 19, 2023
    Dataset provided by
    United States Census Bureau (http://census.gov/)
    Description

    The State Legislative District Summary File (Sample) (SLDSAMPLE) contains the sample data, which is the information compiled from the questions asked of a sample of all people and housing units. Population items include basic population totals; urban and rural; households and families; marital status; grandparents as caregivers; language and ability to speak English; ancestry; place of birth, citizenship status, and year of entry; migration; place of work; journey to work (commuting); school enrollment and educational attainment; veteran status; disability; employment status; industry, occupation, and class of worker; income; and poverty status. Housing items include basic housing totals; urban and rural; number of rooms; number of bedrooms; year moved into unit; household size and occupants per room; units in structure; year structure built; heating fuel; telephone service; plumbing and kitchen facilities; vehicles available; value of home; monthly rent; and shelter costs. The file contains subject content identical to that shown in Summary File 3 (SF 3).

  5. Census of Population and Housing, 1960: Public Use Sample, 1 in 100

    • archive.ciser.cornell.edu
    Updated Feb 13, 2020
    + more versions
    Cite
    Bureau of the Census (2020). Census of Population and Housing, 1960: Public Use Sample, 1 in 100 [Dataset]. http://doi.org/10.6077/j5/ohycfx
    Explore at:
    Dataset updated
    Feb 13, 2020
    Dataset authored and provided by
    Bureau of the Census
    Variables measured
    Individual, Household
    Description

    This collection contains individual-level and 1-percent national sample data from the 1960 Census of Population and Housing conducted by the Census Bureau. It consists of a representative sample of the records from the 1960 sample questionnaires. The data are stored in 30 separate files, containing in total over two million records, organized by state. Some files contain the sampled records of several states while other files contain all or part of the sample for a single state. There are two types of records stored in the data files: one for households and one for persons. Each household record is followed by a variable number of person records, one for each of the household members. Data items in this collection include the individual responses to the basic social, demographic, and economic questions asked of the population in the 1960 Census of Population and Housing. Data are provided on household characteristics and features such as the number of persons in household, number of rooms and bedrooms, and the availability of hot and cold piped water, flush toilet, bathtub or shower, sewage disposal, and plumbing facilities. Additional information is provided on tenure, gross rent, year the housing structure was built, and value and location of the structure, as well as the presence of air conditioners, radio, telephone, and television in the house, and ownership of an automobile. Other demographic variables provide information on age, sex, marital status, race, place of birth, nationality, education, occupation, employment status, income, and veteran status. The data files were obtained by ICPSR from the Center for Social Analysis, Columbia University. (Source: downloaded from ICPSR 7/13/10)
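
    The file structure described above, where each household record is followed by a variable number of person records, can be read with a simple streaming parser. The sketch below assumes a hypothetical layout in which the first character of each line is `H` or `P`; the real record-type flag and column positions are defined in the collection's codebook and are not reproduced here.

```python
def parse_hierarchical(lines):
    """Group person records under the household record that precedes them."""
    households = []
    current = None
    for raw in lines:
        line = raw.rstrip("\n")
        if not line:
            continue
        record_type, payload = line[0], line[1:]
        if record_type == "H":
            # A new household starts; later person records attach to it.
            current = {"household": payload, "persons": []}
            households.append(current)
        elif record_type == "P":
            if current is None:
                raise ValueError("person record found before any household record")
            current["persons"].append(payload)
        else:
            raise ValueError(f"unknown record type: {record_type!r}")
    return households

# Hypothetical records: two households with two and one members.
sample = ["H0001NY", "P0001034M", "P0001031F", "H0002OH", "P0002067F"]
parsed = parse_hierarchical(sample)
print([(h["household"], len(h["persons"])) for h in parsed])
```
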

    Please Note: This dataset is part of the historical CISER Data Archive Collection and is also available at ICPSR at https://doi.org/10.3886/ICPSR07756.v1. We highly recommend using the ICPSR version as they may make this dataset available in multiple data formats in the future.

  6. US Census - ACS and Decennial files **

    • redivis.com
    Updated Mar 11, 2024
    + more versions
    Cite
    (2024). US Census - ACS and Decennial files ** [Dataset]. https://redivis.com/workflows/sex8-4b053ws9z
    Explore at:
    Dataset updated
    Mar 11, 2024
    Description

    Dataset quality **: Medium/high quality dataset, not quality checked or modified by the EIDC team

    Census data plays a pivotal role in academic data research, particularly when exploring relationships between different demographic characteristics. The significance of this particular dataset lies in its ability to facilitate the merging of various datasets with basic census information, thereby streamlining the research process and eliminating the need for separate API calls.

    The American Community Survey is an ongoing survey conducted by the U.S. Census Bureau, which provides detailed social, economic, and demographic data about the United States population. The ACS collects data continuously throughout the decade, gathering information from a sample of households across the country, covering a wide range of topics

  7. Microdata sample codes: Census 2021

    • ons.gov.uk
    • cy.ons.gov.uk
    zip
    Updated Sep 7, 2023
    Cite
    Office for National Statistics (2023). Microdata sample codes: Census 2021 [Dataset]. https://www.ons.gov.uk/peoplepopulationandcommunity/populationandmigration/populationestimates/datasets/microdatasamplecodescensus2021
    Explore at:
    Available download formats: zip
    Dataset updated
    Sep 7, 2023
    Dataset provided by
    Office for National Statistics (http://www.ons.gov.uk/)
    License

    Open Government Licence 3.0 (http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/)
    License information was derived automatically

    Description

    Table showing all variables, classifications and codes included within the Census 2021 microdata samples. This covers the secure, safeguarded and public samples.

  8. New York Squirrel Census 2020-2021: training sample dataset

    • figshare.com
    csv
    Updated Apr 8, 2025
    Cite
    Sharron Stapleton (2025). New York Squirrel Census 2020-2021: training sample dataset [Dataset]. http://doi.org/10.6084/m9.figshare.28747574.v1
    Explore at:
    Available download formats: csv
    Dataset updated
    Apr 8, 2025
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Sharron Stapleton
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Area covered
    New York
    Description

    Two sample datasets created for training purposes from The Squirrel Census (https://www.thesquirrelcensus.com/), hosted by NYC OpenData as "2018 Central Park Squirrel Census - Squirrel Data | NYC Open Data"; no reuse license specified.

  9. 2011-2015 American Community Survey: 5-Year Estimates - Public Use Microdata...

    • catalog.data.gov
    • datasets.ai
    • +1 more
    Updated Jul 19, 2023
    + more versions
    Cite
    U.S. Census Bureau (2023). 2011-2015 American Community Survey: 5-Year Estimates - Public Use Microdata Sample [Dataset]. https://catalog.data.gov/dataset/2011-2015-american-community-survey-5-year-estimates-public-use-microdata-sample
    Explore at:
    Dataset updated
    Jul 19, 2023
    Dataset provided by
    United States Census Bureau (http://census.gov/)
    Description

    The American Community Survey (ACS) Public Use Microdata Sample (PUMS) contains a sample of responses to the ACS. The ACS PUMS dataset includes variables for nearly every question on the survey, as well as many new variables that were derived after the fact from multiple survey responses (such as poverty status). Each record in the file represents a single person, or, in the household-level dataset, a single housing unit. In the person-level file, individuals are organized into households, making possible the study of people within the contexts of their families and other household members. Individuals living in Group Quarters, such as nursing facilities or college facilities, are also included on the person file. ACS PUMS data are available at the nation, state, and Public Use Microdata Area (PUMA) levels. PUMAs are special non-overlapping areas that partition each state into contiguous geographic units containing roughly 100,000 people each. ACS PUMS files for an individual year, such as 2019, contain data on approximately one percent of the United States population.
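
    To show how person records can be placed in their household context and then weighted up to a PUMA-level estimate, here is a small sketch with invented records. The column names `SERIALNO`, `PWGTP`, `WGTP`, `AGEP`, and `PUMA` are assumed to follow the PUMS data dictionary; verify them against the dictionary for the year you download.

```python
import pandas as pd

# Hypothetical household-level records (one row per housing unit).
households = pd.DataFrame({
    "SERIALNO": [1, 2],
    "PUMA": ["00100", "00200"],
    "WGTP": [20, 30],  # housing unit weight
})

# Hypothetical person-level records (one row per person).
persons = pd.DataFrame({
    "SERIALNO": [1, 1, 2],
    "AGEP": [70, 34, 66],
    "PWGTP": [20, 25, 30],  # person weight
})

# Attach household information to each person via the shared household id.
merged = persons.merge(households, on="SERIALNO", how="left", validate="many_to_one")

# Estimated number of people aged 65+ per PUMA: sum of person weights.
seniors_by_puma = merged[merged["AGEP"] >= 65].groupby("PUMA")["PWGTP"].sum()
print(seniors_by_puma)
```

    Person characteristics are weighted with the person weight and household characteristics with the housing unit weight; mixing the two gives misleading totals.
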

  10. Census of Population, 1880: Public Use Sample (1 in 1000 Preliminary...

    • archive.ciser.cornell.edu
    Updated Feb 25, 2020
    Cite
    Russell Menard; Steven Ruggles (2020). Census of Population, 1880: Public Use Sample (1 in 1000 Preliminary Subsample) [Dataset]. http://doi.org/10.6077/j5/wrvf3n
    Explore at:
    Dataset updated
    Feb 25, 2020
    Authors
    Russell Menard; Steven Ruggles
    Variables measured
    Individual, Family, Household
    Description

    This collection is a nationally representative--although clustered--1 in 1000 preliminary subsample of the United States population in 1880. The subsample is based on every tenth microfilm reel of enumeration forms (there are a total of 1,454 reels) and, within each reel, on the census page itself. In terms of the Public Use Sample as a whole, a sample density of 1 person per 100 was chosen so that a single sample point was randomly generated for every two census pages. Sample points were chosen for inclusion in the collection only if the individual selected was the first person listed in the dwelling. Under this procedure each dwelling, family, and individual in the population had a 1 in 100 probability of inclusion in the Public Use Sample.

    Please Note: This dataset is part of the historical CISER Data Archive Collection and is also available at ICPSR at https://doi.org/10.3886/ICPSR09474.v1. We highly recommend using the ICPSR version as they may make this dataset available in multiple data formats in the future.

  11. Sample comparison to U.S. census statistics.

    • datasetcatalog.nlm.nih.gov
    • plos.figshare.com
    Updated Jul 3, 2024
    + more versions
    Cite
    Gruda, Dritjon; Hanges, Paul (2024). Sample comparison to U.S. census statistics. [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0001500985
    Explore at:
    Dataset updated
    Jul 3, 2024
    Authors
    Gruda, Dritjon; Hanges, Paul
    Area covered
    United States
    Description

    Lowering average household heating energy consumption plays a pivotal role in addressing climate change and has been central to policy initiatives. Strategies proposed so far have included commitments, incentives/ disincentives, feedback, and social norms. Yet, findings so far have been mixed and fail to explain the mechanism that drives energy conservation behavior. Using a sample of 2,128 participants across the United States, we collected survey data matched with archival temperature data to investigate the influence of past experiences on current energy conservation behaviors. Our findings indicate that childhood home temperatures significantly predict current home temperature settings. Importantly, community fit moderated this relationship. Individuals with high community fit were more likely to align their home temperature settings to those of their community. These insights not only shed light on the underlying mechanisms driving energy consumption behavior but also suggest that fostering a sense of community fit might be a more effective strategy for promoting sustainable energy practices.

  12. 1981 1% Teaching Census Microdata for Great Britain

    • datacatalogue.ukdataservice.ac.uk
    Updated Mar 7, 2019
    + more versions
    Cite
    UK Data Service (2019). 1981 1% Teaching Census Microdata for Great Britain [Dataset]. http://doi.org/10.5255/UKDA-SN-8243-1
    Explore at:
    Dataset updated
    Mar 7, 2019
    Dataset provided by
    UK Data Service (https://ukdataservice.ac.uk/)
    Area covered
    United Kingdom
    Description

    The 1981 Census Microdata Teaching Dataset for Great Britain: 1% Sample: Open Access dataset was created from existing digital records from the 1981 Census. It can be used as a 'taster' file for 1981 Census data and is freely available for anyone to download under an Open Government Licence.

    The file was created under a project known as Enhancing and Enriching Historic Census Microdata Samples (EEHCM), which was funded by the Economic and Social Research Council with input from the Office for National Statistics and National Records of Scotland. The project ran from 2012-2014 and was led from the UK Data Archive, University of Essex, in collaboration with the Cathie Marsh Institute for Social Research (CMIST) at the University of Manchester and the Census Offices. In addition to the 1981 data, the team worked on files from the 1961 Census and 1971 Census.

    The original 1981 records preceded current data archival standards and were created before microdata sets for secondary use were anticipated. A process of data recovery and quality checking was necessary to maximise their utility for current researchers, though some imperfections remain (see the User Guide for details). Three other 1981 Census datasets have been created:

    • SN 8241 - 1981 Census Microdata Individual File for Great Britain: 5% Sample, which contains information on individuals in larger local authorities;
    • SN 8242 - 1981 Census Microdata Household File for Great Britain: 0.95% Sample, which links household members together to allow individuals to be understood within their household context. SNs 8241 and 8242 are both available to registered UK Data Service users based in the United Kingdom (see Access section for non-UK access restrictions); and
    • SN 8248 - 1981 Census Microdata for Great Britain: 9% Sample: Secure Access, which comprises a larger population sample and so contains sufficient information to constitute personal data, meaning that it is only available to Accredited Researchers, under restrictive Secure Access conditions.
  13. 2023 Census population change by age group and RC

    • 2023census-statsnz.hub.arcgis.com
    • maps-by-statsnz.hub.arcgis.com
    Updated May 29, 2024
    Cite
    Statistics New Zealand (2024). 2023 Census population change by age group and RC [Dataset]. https://2023census-statsnz.hub.arcgis.com/datasets/StatsNZ::2023-census-population-change-by-age-group-and-rc?layer=1
    Explore at:
    Dataset updated
    May 29, 2024
    Dataset authored and provided by
    Statistics New Zealand (http://www.stats.govt.nz/)
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Area covered
    Description

    The life-cycle age groups are:

    • under 15 years
    • 15 to 29 years
    • 30 to 64 years
    • 65 years and over.

    Map shows the percentage change in the census usually resident population count for life-cycle age groups between the 2018 and 2023 Censuses.

    Download lookup file from Stats NZ ArcGIS Online or Stats NZ geographic data service.

    Footnotes

    Geographical boundaries
    Statistical standard for geographic areas 2023 (updated December 2023) has information about geographic boundaries as of 1 January 2023. Address data from 2013 and 2018 Censuses was updated to be consistent with the 2023 areas. Due to the changes in area boundaries and coding methodologies, 2013 and 2018 counts published in 2023 may be slightly different to those published in 2013 or 2018.

    Subnational census usually resident population
    The census usually resident population count of an area (subnational count) is a count of all people who usually live in that area and were present in New Zealand on census night. It excludes visitors from overseas, visitors from elsewhere in New Zealand, and residents temporarily overseas on census night. For example, a person who usually lives in Christchurch city and is visiting Wellington city on census night will be included in the census usually resident population count of Christchurch city. 

    Caution using time series
    Time series data should be interpreted with care due to changes in census methodology and differences in response rates between censuses. The 2023 and 2018 Censuses used a combined census methodology (using census responses and administrative data), while the 2013 Census used a full-field enumeration methodology (with no use of administrative data).

    About the 2023 Census dataset
    For information on the 2023 dataset see Using a combined census model for the 2023 Census. We combined data from the census forms with administrative data to create the 2023 Census dataset, which meets Stats NZ's quality criteria for population structure information. We added real data about real people to the dataset where we were confident the people who hadn’t completed a census form (which is known as admin enumeration) will be counted. We also used data from the 2018 and 2013 Censuses, administrative data sources, and statistical imputation methods to fill in some missing characteristics of people and dwellings.

    Data quality
    The quality of data in the 2023 Census is assessed using the quality rating scale and the quality assurance framework to determine whether data is fit for purpose and suitable for release. Data quality assurance in the 2023 Census has more information.

    Quality rating of a variable
    The quality rating of a variable provides an overall evaluation of data quality for that variable, usually at the highest levels of classification. The quality ratings shown are for the 2023 Census unless stated. There is variability in the quality of data at smaller geographies. Data quality may also vary between censuses, for subpopulations, or when cross tabulated with other variables or at lower levels of the classification. Data quality ratings for 2023 Census variables has more information on quality ratings by variable.

    Age concept quality rating
    Age is rated as very high quality.
    Age – 2023 Census: Information by concept has more information, for example, definitions and data quality.

    Using data for good
    Stats NZ expects that, when working with census data, it is done so with a positive purpose, as outlined in the Māori Data Governance Model (Data Iwi Leaders Group, 2023). This model states that "data should support transformative outcomes and should uplift and strengthen our relationships with each other and with our environments. The avoidance of harm is the minimum expectation for data use. Māori data should also contribute to iwi and hapū tino rangatiratanga".

    Confidentiality
    The 2023 Census confidentiality rules have been applied to 2013, 2018, and 2023 data. These rules protect the confidentiality of individuals, families, households, dwellings, and undertakings in 2023 Census data. Counts are calculated using fixed random rounding to base 3 (FRR3) and suppression of ‘sensitive’ counts less than six, where tables report multiple geographic variables and/or small populations. Individual figures may not always sum to stated totals. Applying confidentiality rules to 2023 Census data and summary of changes since 2018 and 2013 Censuses has more information about 2023 Census confidentiality rules.
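
    The rounding and suppression steps named above can be sketched as follows. This is an illustration under stated assumptions, not Stats NZ's implementation: it assumes the common random-rounding rule (a count not divisible by 3 goes to the nearer multiple with probability 2/3 and to the farther one with probability 1/3), and it makes the rounding "fixed" by deriving the draw from a hash of a hypothetical cell key. The function names and the key format are made up.

```python
import hashlib

def fixed_random_round_base3(count, key):
    """Round count to a multiple of 3; the same (key, count) always gives the same result."""
    remainder = count % 3
    if remainder == 0:
        return count
    # ASSUMPTION: a deterministic pseudo-random draw in [0, 1) from a hash of the key.
    digest = hashlib.sha256(f"{key}:{count}".encode()).digest()
    draw = int.from_bytes(digest[:8], "big") / 2**64
    lower = count - remainder
    upper = lower + 3
    nearest, other = (lower, upper) if remainder == 1 else (upper, lower)
    return nearest if draw < 2 / 3 else other

def protect(count, key, sensitive=False):
    """Suppress sensitive counts below six (returned as None); otherwise round to base 3."""
    if sensitive and count < 6:
        return None
    return fixed_random_round_base3(count, key)

# Hypothetical cell keys, for illustration only.
print(protect(7, "area-A|age-15-29"))
print(protect(4, "area-B|age-65-plus", sensitive=True))
```

    Because each cell is rounded independently, rounded figures may not sum to the rounded total, which matches the caution above that individual figures may not always sum to stated totals.
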

  14. 2024 American Community Survey: S0101 | Age and Sex (ACS 1-Year Estimates...

    • data.census.gov
    • test.data.census.gov
    Updated Oct 25, 2023
    Cite
    ACS (2023). 2024 American Community Survey: S0101 | Age and Sex (ACS 1-Year Estimates Subject Tables) [Dataset]. https://data.census.gov/cedsci/table?q=female
    Explore at:
    Dataset updated
    Oct 25, 2023
    Dataset provided by
    United States Census Bureau (http://census.gov/)
    Authors
    ACS
    License

    CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
    License information was derived automatically

    Time period covered
    2024
    Description

    Key Table Information

    Table Title: Age and Sex
    Table ID: ACSST1Y2024.S0101
    Survey/Program: American Community Survey
    Year: 2024
    Dataset: ACS 1-Year Estimates Subject Tables
    Source: U.S. Census Bureau, 2024 American Community Survey, 1-Year Estimates
    Dataset Universe: The dataset universe of the American Community Survey (ACS) is the U.S. resident population and housing. For more information about ACS residence rules, see the ACS Design and Methodology Report. Note that each table describes the specific universe of interest for that set of estimates.

    Methodology

    Unit(s) of Observation: American Community Survey (ACS) data are collected from individuals living in housing units and group quarters, and about housing units whether occupied or vacant. For more information about ACS sampling and data collection, see the ACS Design and Methodology Report.

    Geography Coverage: ACS data generally reflect the geographic boundaries of legal and statistical areas as of January 1 of the estimate year. For more information, see Geography Boundaries by Year.

    Estimates of urban and rural populations, housing units, and characteristics reflect boundaries of urban areas defined based on 2020 Census data. As a result, data for urban and rural areas from the ACS do not necessarily reflect the results of ongoing urbanization.

    Sampling: The ACS consists of two separate samples: housing unit addresses and group quarters facilities. Independent housing unit address samples are selected for each county or county-equivalent in the U.S. and Puerto Rico, with sampling rates depending on a measure of size for the area. For more information on sampling in the ACS, see the Accuracy of the Data document.

    Confidentiality: The Census Bureau has modified or suppressed some estimates in ACS data products to protect respondents' confidentiality. Title 13 United States Code, Section 9, prohibits the Census Bureau from publishing results in which an individual's data can be identified. For more information on confidentiality protection in the ACS, see the Accuracy of the Data document.

    Technical Documentation/Methodology: Information about the American Community Survey (ACS) can be found on the ACS website. Supporting documentation including code lists, subject definitions, data accuracy, and statistical testing, and a full list of ACS tables and table shells (without estimates) can be found on the Technical Documentation section of the ACS website.

    Sample size and data quality measures (including coverage rates, allocation rates, and response rates) can be found on the American Community Survey website in the Methodology section.

    Data are based on a sample and are subject to sampling variability. The degree of uncertainty for an estimate arising from sampling variability is represented through the use of a margin of error. The value shown here is the 90 percent margin of error. The margin of error can be interpreted roughly as providing a 90 percent probability that the interval defined by the estimate minus the margin of error and the estimate plus the margin of error (the lower and upper confidence bounds) contains the true value. In addition to sampling variability, the ACS estimates are subject to nonsampling error (for a discussion of nonsampling variability, see ACS Technical Documentation). The effect of nonsampling error is not represented in these tables.

    Users must consider potential differences in geographic boundaries, questionnaire content or coding, or other methodological issues when comparing ACS data from different years. Statistically significant differences shown in ACS Comparison Profiles, or in data users' own analysis, may be the result of these differences and thus might not necessarily reflect changes to the social, economic, housing, or demographic characteristics being compared. For more information, see Comparing ACS Data.

    Weights: ACS estimates are obtained from a raking ratio estimation procedure that results in the assignment of two sets of weights: a weight to each sample person record and a weight to each sample housing unit record. Estimates of person characteristics are based on the person weight. Estimates of family, household, and housing unit characteristics are based on the housing unit weight. For any given geographic area, a characteristic total is estimated by summing the weights assigned to the persons, households, families or housing units possessing the characteristic in the geographic area. For more information on weighting and estimation in the ACS, see the Accuracy of the Data document.

    Although the American Community Survey (ACS) produces population, demographic and housing unit estimates, the decennial census is the official source of population totals for April 1st of each decennial year. In between censuses, the Census Bureau's Population Estimates Program produces and disseminates the official estimates of the population for the nation, states, counties, cities, and towns and estimates of housing units and t...
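    The margin-of-error interpretation in the description can be sketched in a few lines of Python. This is a minimal illustration, not Census Bureau code: the function names and the sample numbers are made up, and the divisor 1.645 is the standard z-value for a 90 percent confidence level.

```python
# z-value for a 90 percent confidence level (two-sided, normal approximation).
Z_90 = 1.645


def confidence_bounds(estimate, moe):
    """Return the lower and upper bounds: estimate minus/plus margin of error."""
    return estimate - moe, estimate + moe


def standard_error(moe, z=Z_90):
    """Convert a 90 percent margin of error back to a standard error."""
    return moe / z


# Hypothetical estimate and margin of error, NOT taken from table S0101.
lower, upper = confidence_bounds(12_500, 800)
se = standard_error(800)
print(lower, upper, round(se, 2))
```

    Only sampling variability is captured by these bounds; as the description notes, nonsampling error is not represented.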

  15. Excel, AL Age Group Population Dataset: A Complete Breakdown of Excel Age...

    • neilsberg.com
    csv, json
    Updated Feb 22, 2025
    Cite
    Neilsberg Research (2025). Excel, AL Age Group Population Dataset: A Complete Breakdown of Excel Age Demographics from 0 to 85 Years and Over, Distributed Across 18 Age Groups // 2025 Edition [Dataset]. https://www.neilsberg.com/research/datasets/4521c211-f122-11ef-8c1b-3860777c1fe6/
    Explore at:
    Available download formats: json, csv
    Dataset updated
    Feb 22, 2025
    Dataset authored and provided by
    Neilsberg Research
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Alabama, Excel
    Variables measured
    Population Under 5 Years, Population over 85 years, Population Between 5 and 9 years, Population Between 10 and 14 years, Population Between 15 and 19 years, Population Between 20 and 24 years, Population Between 25 and 29 years, Population Between 30 and 34 years, Population Between 35 and 39 years, Population Between 40 and 44 years, and 9 more
    Measurement technique
    The data presented in this dataset is derived from the latest U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates. To measure the two variables, namely (a) population and (b) population as a percentage of the total population, we first analyzed and categorized the data for each age group. We divided ages below 85 into roughly 5-year buckets and aggregated all ages of 85 and over into a single group. For further information regarding these estimates, please reach out to us via email at research@neilsberg.com.
    Dataset funded by
    Neilsberg Research
    Description
    About this dataset

    Context

    The dataset tabulates the Excel population distribution across 18 age groups. It lists the population in each age group along with each group's percentage of Excel's total population. The dataset can be used to understand the population distribution of Excel by age. For example, using this dataset, we can identify the largest age group in Excel.

    Key observations

    The largest age group in Excel, AL was 5 to 9 years, with a population of 77 (15.28%), according to the ACS 2019-2023 5-Year Estimates. The smallest age group in Excel, AL was 85 years and over, with a population of 2 (0.40%). Source: U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates

    Content

    When available, the data consists of estimates from the U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates

    Age groups:

    • Under 5 years
    • 5 to 9 years
    • 10 to 14 years
    • 15 to 19 years
    • 20 to 24 years
    • 25 to 29 years
    • 30 to 34 years
    • 35 to 39 years
    • 40 to 44 years
    • 45 to 49 years
    • 50 to 54 years
    • 55 to 59 years
    • 60 to 64 years
    • 65 to 69 years
    • 70 to 74 years
    • 75 to 79 years
    • 80 to 84 years
    • 85 years and over
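
    The 18-group bucketing and the two measured variables (population and percentage of total) can be sketched as follows. This is only an illustration under stated assumptions: it starts from a list of individual ages, whereas the actual datasets are built from already tabulated ACS estimates, and the function names `age_group` and `tabulate` are hypothetical.

```python
from collections import Counter


def age_group(age):
    """Map an age in whole years to one of the 18 group labels listed above."""
    if age < 0:
        raise ValueError("age must be non-negative")
    if age < 5:
        return "Under 5 years"
    if age >= 85:
        return "85 years and over"
    low = (age // 5) * 5  # start of the 5-year bucket
    return f"{low} to {low + 4} years"


def tabulate(ages):
    """Return {group: (population, percent of total)} for a list of ages."""
    counts = Counter(age_group(a) for a in ages)
    total = sum(counts.values())
    return {g: (n, round(100 * n / total, 2)) for g, n in counts.items()}


# Made-up ages, for illustration only.
table = tabulate([3, 7, 8, 90])
largest = max(table, key=lambda g: table[g][0])
print(table, largest)
```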

    Variables / Data Columns

    • Age Group: This column displays the age group under consideration.
    • Population: This column shows the population of the specific age group in Excel.
    • % of Total Population: This column displays the population of each age group as a percentage of Excel's total population. Please note that the percentages may not sum to exactly 100% due to rounding.

    Good to know

    Margin of Error

    Data in the dataset are based on estimates and are subject to sampling variability and thus a margin of error. Neilsberg Research recommends using caution when presenting these estimates in your research.

    Custom data

    If you need custom data for a research project, report, or presentation, you can contact our research staff at research@neilsberg.com to discuss the feasibility of a custom tabulation on a fee-for-service basis.

    Inspiration

    The Neilsberg Research team curates, analyzes, and publishes demographic and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research aggregated datasets and insights are made available for free download at https://www.neilsberg.com/research/.

    Recommended for further research

    This dataset is a part of the main dataset for Excel Population by Age.

  16. Nome Census Area, AK Age Group Population Dataset: A Complete Breakdown of...

    • neilsberg.com
    csv, json
    Updated Feb 22, 2025
    Cite
    Neilsberg Research (2025). Nome Census Area, AK Age Group Population Dataset: A Complete Breakdown of Nome Census Area Age Demographics from 0 to 85 Years and Over, Distributed Across 18 Age Groups // 2025 Edition [Dataset]. https://www.neilsberg.com/research/datasets/453abc4f-f122-11ef-8c1b-3860777c1fe6/
    Explore at:
    Available download formats: json, csv
    Dataset updated
    Feb 22, 2025
    Dataset authored and provided by
    Neilsberg Research
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Nome Census Area
    Variables measured
    Population Under 5 Years, Population over 85 years, Population Between 5 and 9 years, Population Between 10 and 14 years, Population Between 15 and 19 years, Population Between 20 and 24 years, Population Between 25 and 29 years, Population Between 30 and 34 years, Population Between 35 and 39 years, Population Between 40 and 44 years, and 9 more
    Measurement technique
    The data presented in this dataset is derived from the latest U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates. To measure the two variables, namely (a) population and (b) population as a percentage of the total population, we first analyzed and categorized the data for each age group. We divided ages below 85 into roughly 5-year buckets and aggregated all ages of 85 and over into a single group. For further information regarding these estimates, please reach out to us via email at research@neilsberg.com.
    Dataset funded by
    Neilsberg Research
    Description
    About this dataset

    Context

    The dataset tabulates the Nome Census Area population distribution across 18 age groups. It lists the population in each age group along with each group's percentage of Nome Census Area's total population. The dataset can be used to understand the population distribution of Nome Census Area by age. For example, using this dataset, we can identify the largest age group in Nome Census Area.

    Key observations

    The largest age group in Nome Census Area, AK was 10 to 14 years, with a population of 1,058 (10.66%), according to the ACS 2019-2023 5-Year Estimates. The smallest age group in Nome Census Area, AK was 85 years and over, with a population of 45 (0.45%). Source: U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates

    Content

    When available, the data consists of estimates from the U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates

    Age groups:

    • Under 5 years
    • 5 to 9 years
    • 10 to 14 years
    • 15 to 19 years
    • 20 to 24 years
    • 25 to 29 years
    • 30 to 34 years
    • 35 to 39 years
    • 40 to 44 years
    • 45 to 49 years
    • 50 to 54 years
    • 55 to 59 years
    • 60 to 64 years
    • 65 to 69 years
    • 70 to 74 years
    • 75 to 79 years
    • 80 to 84 years
    • 85 years and over

    Variables / Data Columns

    • Age Group: This column displays the age group under consideration.
    • Population: This column shows the population of the specific age group in Nome Census Area.
    • % of Total Population: This column displays the population of each age group as a percentage of Nome Census Area's total population. Please note that the percentages may not sum to exactly 100% due to rounding.

    Good to know

    Margin of Error

    Data in the dataset are based on estimates and are subject to sampling variability and thus a margin of error. Neilsberg Research recommends using caution when presenting these estimates in your research.

    Custom data

    If you need custom data for a research project, report, or presentation, you can contact our research staff at research@neilsberg.com to discuss the feasibility of a custom tabulation on a fee-for-service basis.

    Inspiration

    The Neilsberg Research team curates, analyzes, and publishes demographic and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research aggregated datasets and insights are made available for free download at https://www.neilsberg.com/research/.

    Recommended for further research

    This dataset is a part of the main dataset for Nome Census Area Population by Age.

  17. 2010 Census: Iowa Population by ZCTA

    • kaggle.com
    zip
    Updated May 2, 2020
    Cite
    Mark Mucchetti (2020). 2010 Census: Iowa Population by ZCTA [Dataset]. https://www.kaggle.com/markmucchetti/2010-census-iowa-population-by-zcta
    Explore at:
    Available download formats: zip (8057 bytes)
    Dataset updated
    May 2, 2020
    Authors
    Mark Mucchetti
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Area covered
    Iowa
    Description

    Context

    This is a sample dataset derived from 2010 U.S. Government Census data. It is intended to be used in combination with example analyses on the public dataset "Iowa Liquor Sales", available as a Google Public Dataset, on Kaggle, and at https://data.iowa.gov/Sales-Distribution/Iowa-Liquor-Sales/m3tr-qhgy.

    Usage

    This dataset is intended for use as an example. Columns have purposely not been filtered by string manipulation in order to explore joining data between two pandas DataFrames and to do further processing.

    Because this data is at the Zip Code Tabulation Area (ZCTA) level, additional processing is required to join it with general-purpose datasets, which may be specified at the zip code, county name, county FIPS code, or coordinate level. This is intentional.
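
    The kind of extra processing described above can be sketched with pandas. Everything specific here is an assumption made for illustration: the column names (`zcta_label`, `zip_code`, `sale_dollars`, `population`), the label format, and all values are invented and are not taken from either dataset. Note also that ZCTAs only approximate ZIP codes, so a join like this is a rough match.

```python
import pandas as pd

# Hypothetical census rows: the ZCTA code is embedded in a longer label string.
census = pd.DataFrame(
    {"zcta_label": ["ZCTA5 50002", "ZCTA5 50003"], "population": [100, 50]}
)
# Pull the 5-digit code out of the label so it can serve as a join key.
census["zip"] = census["zcta_label"].str.extract(r"(\d{5})", expand=False)

# Hypothetical sales rows keyed by a numeric zip code.
sales = pd.DataFrame(
    {"zip_code": [50002, 50002, 50003], "sale_dollars": [10.0, 20.0, 5.0]}
)
# Normalize to a zero-padded 5-character string to match the census key.
sales["zip"] = sales["zip_code"].astype(str).str.zfill(5)

# Aggregate sales per zip, then join the population onto each zip.
per_zip = sales.groupby("zip", as_index=False)["sale_dollars"].sum()
joined = per_zip.merge(census[["zip", "population"]], on="zip", how="left")
joined["dollars_per_capita"] = joined["sale_dollars"] / joined["population"]
print(joined)
```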

  18. Historic US Census - 1910

    • redivis.com
    • stanford.redivis.com
    application/jsonl +7
    Updated Jan 10, 2020
    Cite
    Stanford Center for Population Health Sciences (2020). Historic US Census - 1910 [Dataset]. http://doi.org/10.57761/n3ks-0444
    Explore at:
    Available download formats: parquet, application/jsonl, stata, csv, avro, sas, arrow, spss
    Dataset updated
    Jan 10, 2020
    Dataset provided by
    Redivis Inc.
    Authors
    Stanford Center for Population Health Sciences
    Time period covered
    Jan 1, 1910 - Dec 31, 1910
    Description

    Abstract

    The Integrated Public Use Microdata Series (IPUMS) Complete Count Data include more than 650 million individual-level and 7.5 million household-level records. The microdata are the result of collaboration between IPUMS and the nation’s two largest genealogical organizations, Ancestry.com and FamilySearch, and provide the largest and richest source of individual-level and household data.

    Before Manuscript Submission

    All manuscripts (and other items you'd like to publish) must be submitted to phsdatacore@stanford.edu for approval prior to journal submission.

    We will check your cell sizes and citations.

    For more information about how to cite PHS and PHS datasets, please visit:

    https://phsdocs.developerhub.io/need-help/citing-phs-data-core

    Documentation

    Historic data are scarce and often exist only in aggregate tables. The key advantage of historic US census data is the availability of individual and household level characteristics that researchers can tabulate in ways that benefit their specific research questions. The data contain demographic variables, economic variables, migration variables and family variables. Within households, it is possible to create relational data as all relations between household members are known. For example, having data on the mother and her children in a household enables researchers to calculate the mother’s age at birth. Another advantage of the Complete Count data is the possibility to follow individuals over time using a historical identifier.

    In sum: the historic US census data are a unique source for research on social and economic change and can provide population health researchers with information about social and economic determinants.
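
    The mother's-age-at-birth example mentioned above can be sketched as follows. This is a minimal illustration with a hypothetical record layout: the `Person` class and its fields (`household_id`, `person_id`, `age`, `mother_id`) are invented here and are not the actual IPUMS variable names, and the ages are made up.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Person:
    household_id: int
    person_id: int
    age: int
    mother_id: Optional[int] = None  # person_id of the mother in the same household


def mothers_age_at_birth(people):
    """Return {(household_id, child_id): mother's age when the child was born}."""
    by_key = {(p.household_id, p.person_id): p for p in people}
    result = {}
    for child in people:
        if child.mother_id is None:
            continue
        mother = by_key.get((child.household_id, child.mother_id))
        if mother is not None:
            # Both ages are recorded at the same census date, so the difference
            # approximates the mother's age at the child's birth.
            result[(child.household_id, child.person_id)] = mother.age - child.age
    return result


# Made-up household: a mother aged 34 with children aged 10 and 4.
household = [
    Person(1, 1, 34),
    Person(1, 2, 10, mother_id=1),
    Person(1, 3, 4, mother_id=1),
]
ages_at_birth = mothers_age_at_birth(household)
print(ages_at_birth)
```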

    The historic US 1910 census data were collected in April 1910. Enumerators collected data by traveling to households and counting the residents who regularly slept at the household. Individuals lacking permanent housing were counted as residents of the place where they were when the data were collected. Household members absent on the day of data collection were either listed with the household with the help of other household members or were scheduled for the last census subdivision.

    Section 2

    This dataset was created on 2020-01-10 23:47:27.924 by merging multiple datasets together. The source datasets for this version were:

    IPUMS 1910 households: The Integrated Public Use Microdata Series (IPUMS) Complete Count Data are historic individual and household census records and are a unique source for research on social and economic change.

    IPUMS 1910 persons: This dataset includes all individuals from the 1910 US census.

  19. i16 Census BlockGroup EconomicallyDistressedAreas 2023

    • data.ca.gov
    • data.cnra.ca.gov
    • +5more
    Updated Aug 18, 2025
    Cite
    California Department of Water Resources (2025). i16 Census BlockGroup EconomicallyDistressedAreas 2023 [Dataset]. https://data.ca.gov/dataset/i16-census-blockgroup-economicallydistressedareas-2023
    Explore at:
    Available download formats: csv, kml, html, geojson, zip, arcgis geoservices rest api
    Dataset updated
    Aug 18, 2025
    Dataset authored and provided by
    California Department of Water Resources http://www.water.ca.gov/
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This is a copy of the statewide Census Block Group GIS Tiger file. The IRWM web based EDA mapping tool uses this GIS layer. Created by joining ACS 2019-2023 5 year estimates to the 2020 Census Tract feature class. The TIGER/Line Files are shapefiles and related database files (.dbf) that are an extract of selected geographic and cartographic information from the U.S. Census Bureau's Master Address File / Topologically Integrated Geographic Encoding and Referencing (MAF/TIGER) Database (MTDB). The MTDB represents a seamless national file with no overlaps or gaps between parts, however, each TIGER/Line File is designed to stand alone as an independent data set, or they can be combined to cover the entire nation. Block Groups (BGs) are defined before tabulation block delineation and numbering, but are clusters of blocks within the same census tract that have the same first digit of their 4-digit census block number from the same decennial census. For example, Census 2020 tabulation blocks 3001, 3002, 3003,.., 3999 within Census 2020 tract 1210.02 are also within BG 3 within that census tract. Census 2020 BGs generally contained between 600 and 3,000 people, with an optimum size of 1,500 people. Most BGs were delineated by local participants in the Census Bureau's Participant Statistical Areas Program (PSAP). The Census Bureau delineated BGs only where the PSAP participant declined to delineate BGs or where the Census Bureau could not identify any local PSAP participant. A BG usually covers a contiguous area. Each census tract contains at least one BG, and BGs are uniquely numbered within census tract. Within the standard census geographic hierarchy, BGs never cross county or census tract boundaries, but may cross the boundaries of other geographic entities like county subdivisions, places, urban areas, voting districts, congressional districts, and American Indian / Alaska Native / Native Hawaiian areas. BGs have a valid code range of 0 through 9. 
BGs coded 0 were intended to only include water area, no land area, and they are generally in territorial seas, coastal water, and Great Lakes water areas. For Census 2020, rather than extending a census tract boundary into the Great Lakes or out to the U.S. nautical three-mile limit, the Census Bureau delineated some census tract boundaries along the shoreline or just offshore. The Census Bureau assigned a default census tract number of 0 and BG of 0 to these offshore, water-only areas not included in regularly numbered census tract areas.
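
    The rule that a block group is the first digit of the 4-digit census block number can be sketched as a small helper. The function name `block_group_of` is hypothetical, and the block numbers passed as strings are an assumption made so that leading zeros are preserved.

```python
def block_group_of(block_number):
    """Return the block group (0 through 9) for a 4-digit census block number.

    The block number is taken as a string so that codes such as "0001",
    used for water-only areas, keep their leading zero.
    """
    valid = (
        len(block_number) == 4
        and block_number.isascii()
        and block_number.isdigit()
    )
    if not valid:
        raise ValueError(f"expected a 4-digit block number, got {block_number!r}")
    return int(block_number[0])


# Blocks 3001 through 3999 within one tract all fall in block group 3.
print(block_group_of("3001"), block_group_of("3999"), block_group_of("0001"))
```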

  20. Snowflake, AZ Age Group Population Dataset: A Complete Breakdown of...

    • neilsberg.com
    csv, json
    Updated Jul 24, 2024
    Close
    Cite
    Neilsberg Research (2024). Snowflake, AZ Age Group Population Dataset: A Complete Breakdown of Snowflake Age Demographics from 0 to 85 Years and Over, Distributed Across 18 Age Groups // 2024 Edition [Dataset]. https://www.neilsberg.com/research/datasets/aab8cd11-4983-11ef-ae5d-3860777c1fe6/
    Explore at:
    Available download formats: json, csv
    Dataset updated
    Jul 24, 2024
    Dataset authored and provided by
    Neilsberg Research
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Arizona, Snowflake
    Variables measured
    Population Under 5 Years, Population over 85 years, Population Between 5 and 9 years, Population Between 10 and 14 years, Population Between 15 and 19 years, Population Between 20 and 24 years, Population Between 25 and 29 years, Population Between 30 and 34 years, Population Between 35 and 39 years, Population Between 40 and 44 years, and 9 more
    Measurement technique
    The data presented in this dataset is derived from the latest U.S. Census Bureau American Community Survey (ACS) 2018-2022 5-Year Estimates. To measure the two variables, namely (a) population and (b) population as a percentage of the total population, we first analyzed and categorized the data for each age group. We divided ages below 85 into roughly 5-year buckets and aggregated all ages of 85 and over into a single group. For further information regarding these estimates, please reach out to us via email at research@neilsberg.com.
    Dataset funded by
    Neilsberg Research
    Description
    About this dataset

    Context

    The dataset tabulates the Snowflake population distribution across 18 age groups. It lists the population in each age group along with each group's percentage of Snowflake's total population. The dataset can be used to understand the population distribution of Snowflake by age. For example, using this dataset, we can identify the largest age group in Snowflake.

    Key observations

    The largest age group in Snowflake, AZ was 10 to 14 years, with a population of 873 (14.10%), according to the ACS 2018-2022 5-Year Estimates. The smallest age group in Snowflake, AZ was 80 to 84 years, with a population of 48 (0.78%). Source: U.S. Census Bureau American Community Survey (ACS) 2018-2022 5-Year Estimates

    Content

    When available, the data consists of estimates from the U.S. Census Bureau American Community Survey (ACS) 2018-2022 5-Year Estimates

    Age groups:

    • Under 5 years
    • 5 to 9 years
    • 10 to 14 years
    • 15 to 19 years
    • 20 to 24 years
    • 25 to 29 years
    • 30 to 34 years
    • 35 to 39 years
    • 40 to 44 years
    • 45 to 49 years
    • 50 to 54 years
    • 55 to 59 years
    • 60 to 64 years
    • 65 to 69 years
    • 70 to 74 years
    • 75 to 79 years
    • 80 to 84 years
    • 85 years and over

    Variables / Data Columns

    • Age Group: This column displays the age group under consideration.
    • Population: This column shows the population of the specific age group in Snowflake.
    • % of Total Population: This column displays the population of each age group as a percentage of Snowflake's total population. Please note that the percentages may not sum to exactly 100% due to rounding.

    Good to know

    Margin of Error

    Data in the dataset are based on estimates and are subject to sampling variability and thus a margin of error. Neilsberg Research recommends using caution when presenting these estimates in your research.

    Custom data

    If you need custom data for a research project, report, or presentation, you can contact our research staff at research@neilsberg.com to discuss the feasibility of a custom tabulation on a fee-for-service basis.

    Inspiration

    The Neilsberg Research team curates, analyzes, and publishes demographic and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research aggregated datasets and insights are made available for free download at https://www.neilsberg.com/research/.

    Recommended for further research

    This dataset is a part of the main dataset for Snowflake Population by Age.

Cite
tawfik elmetwally (2023). Census Income dataset [Dataset]. https://www.kaggle.com/datasets/tawfikelmetwally/census-income-dataset

Census Income dataset

Estimate whether a person’s income exceeds $50K/year

Explore at:
Available download formats: zip (707150 bytes)
Dataset updated
Oct 28, 2023
Authors
tawfik elmetwally
Description

This intermediate-level data set was extracted from the Census Bureau database. It contains 48842 instances with a mix of continuous and discrete attributes (train=32561, test=16281).

The data set has 15 attributes, which include age, sex, education level, and other relevant details of a person. The data set will help you improve your skills in Exploratory Data Analysis, Data Wrangling, Data Visualization, and Classification Models.

Feel free to explore the data set with multiple supervised and unsupervised learning techniques. The following description gives more details on this data set:

  • age: the age of an individual.
  • workclass: The type of work or employment of an individual. It can have the following categories:
    • Private: Working in the private sector.
    • Self-emp-not-inc: Self-employed individuals who are not incorporated.
    • Self-emp-inc: Self-employed individuals who are incorporated.
    • Federal-gov: Working for the federal government.
    • Local-gov: Working for the local government.
    • State-gov: Working for the state government.
    • Without-pay: Not working and without pay.
    • Never-worked: Never worked before.
  • Final Weight: The weights on the CPS files are controlled to independent estimates of the civilian noninstitutional population of the US. These are prepared monthly for us by Population Division here at the Census Bureau. We use 3 sets of controls.

These are: 1. A single cell estimate of the population 16+ for each state. 2. Controls for Hispanic Origin by age and sex. 3. Controls by Race, age and sex.

We use all three sets of controls in our weighting program and "rake" through them 6 times so that by the end we come back to all the controls we used.

People with similar demographic characteristics should have similar weights. There is one important caveat to remember about this statement. That is that since the CPS sample is actually a collection of 51 state samples, each with its own probability of selection, the statement only applies within state.
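
The raking step described above can be sketched as iterative proportional fitting. This is a simplified illustration, not the Census Bureau's weighting program: it uses two hypothetical control dimensions (`sex` and `age`) instead of the three control sets listed, and the starting weights and control totals are made up. The function name `rake` and the record layout are assumptions.

```python
def rake(records, controls, passes=6):
    """Adjust record weights so weighted totals match each set of controls.

    records:  list of dicts with a 'weight' plus one key per control dimension
    controls: {dimension: {category: target_total}}
    """
    weights = [r["weight"] for r in records]
    for _ in range(passes):
        for dim, targets in controls.items():
            # Current weighted total for each category of this dimension.
            totals = {}
            for r, w in zip(records, weights):
                totals[r[dim]] = totals.get(r[dim], 0.0) + w
            # Scale every weight so the category totals hit their targets.
            weights = [
                w * targets[r[dim]] / totals[r[dim]]
                for r, w in zip(records, weights)
            ]
    return weights


# Made-up sample records and control totals, for illustration only.
records = [
    {"sex": "M", "age": "young", "weight": 1.0},
    {"sex": "M", "age": "old", "weight": 2.0},
    {"sex": "F", "age": "young", "weight": 3.0},
    {"sex": "F", "age": "old", "weight": 4.0},
]
controls = {"sex": {"M": 50, "F": 50}, "age": {"young": 60, "old": 40}}
raked = rake(records, controls)
print([round(w, 3) for w in raked])
```

Each pass adjusts the weights to one set of controls at a time, which disturbs the previous set slightly; repeating the cycle brings all sets close to their targets together.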

  • education: The highest level of education completed.
  • education-num: The number of years of education completed.
  • marital-status: The marital status.
  • occupation: Type of work performed by an individual.
  • relationship: The relationship status.
  • race: The race of an individual.
  • sex: The gender of an individual.
  • capital-gain: The amount of capital gain (financial profit).
  • capital-loss: The amount of capital loss an individual has incurred.
  • hours-per-week: The number of hours worked per week.
  • native-country: The country of origin or the native country.
  • income: The income level of an individual and serves as the target variable. It indicates whether the income is greater than $50,000 or less than or equal to $50,000, denoted as (>50K, <=50K).