100+ datasets found
  1. Historic US census - 1930

    • redivis.com
    application/jsonl +7
    Updated Jan 10, 2020
    Cite
    Stanford Center for Population Health Sciences (2020). Historic US census - 1930 [Dataset]. http://doi.org/10.57761/6e5q-rh85
    Available download formats: application/jsonl, parquet, spss, csv, arrow, stata, avro, sas
    Dataset updated
    Jan 10, 2020
    Dataset provided by
    Redivis Inc.
    Authors
    Stanford Center for Population Health Sciences
    Time period covered
    Jan 1, 1930 - Dec 31, 1930
    Area covered
    United States
    Description

    Abstract

    The Integrated Public Use Microdata Series (IPUMS) Complete Count Data include more than 650 million individual-level and 7.5 million household-level records. The microdata are the result of collaboration between IPUMS and the nation’s two largest genealogical organizations, Ancestry.com and FamilySearch, and provide the largest and richest source of individual-level and household data.

    Before Manuscript Submission

    All manuscripts (and other items you'd like to publish) must be submitted to phsdatacore@stanford.edu for approval prior to journal submission.

    We will check your cell sizes and citations.

    For more information about how to cite PHS and PHS datasets, please visit:

    https://phsdocs.developerhub.io/need-help/citing-phs-data-core

    Documentation

    This dataset was created on 2020-01-10 22:52:11.461 by merging multiple datasets together. The source datasets for this version were:

    IPUMS 1930 households: This dataset includes all households from the 1930 US census.

    IPUMS 1930 persons: This dataset includes all individuals from the 1930 US census.

    IPUMS 1930 Lookup: This dataset includes variable names, variable labels, variable values, and corresponding variable value labels for the IPUMS 1930 datasets.

    Section 2

    Historic data are scarce and often exist only in aggregate tables. The key advantage of historic US census data is the availability of individual- and household-level characteristics that researchers can tabulate in ways that benefit their specific research questions. The data contain demographic variables, economic variables, migration variables and family variables. Within households, it is possible to create relational data, as all relations between household members are known. For example, having data on the mother and her children in a household enables researchers to calculate the mother’s age at birth. Another advantage of the Complete Count data is the possibility to follow individuals over time using a historical identifier.

    In sum: the historic US census data are a unique source for research on social and economic change and can provide population health researchers with information about social and economic determinants.
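The mother's-age-at-birth calculation mentioned above can be sketched in a few lines. The field names (`person_id`, `mother_id`, `age`) and the records are hypothetical, chosen only for illustration; they are not IPUMS variable names, and the real data may encode family links differently.

```python
# Hypothetical household records; field names are illustrative, not IPUMS variables.
household = [
    {"person_id": 1, "age": 34, "mother_id": None},  # mother
    {"person_id": 2, "age": 10, "mother_id": 1},     # child
    {"person_id": 3, "age": 6, "mother_id": 1},      # child
]


def mothers_age_at_birth(members):
    """Return {child person_id: mother's age when that child was born}."""
    age_by_id = {m["person_id"]: m["age"] for m in members}
    result = {}
    for m in members:
        mother = m["mother_id"]
        if mother is not None and mother in age_by_id:
            # Ages are whole years, so the difference is only accurate to about a year.
            result[m["person_id"]] = age_by_id[mother] - m["age"]
    return result


ages_at_birth = mothers_age_at_birth(household)
```

The calculation only works because both the mother and the child appear in the same household record set, which is the relational advantage the description refers to.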

    The historic US 1930 census data were collected in April 1930. Enumerators collected the data by traveling to households and counting the residents who regularly slept there. Individuals lacking permanent housing were counted as residents of the place where they were when the data were collected. Household members absent on the day of data collection were either listed with the household with the help of other household members or were scheduled for the last census subdivision.

    Notes

    • We provide IPUMS household and person data separately so that it is convenient to explore the descriptive statistics at each level. To obtain a full dataset, merge the household and person data on the variables SERIAL and SERIALP. To create a longitudinal dataset, merge datasets on the variable HISTID.

    • Households with more than 60 people in the original data were broken up for processing purposes. Every person in a large household is considered to be in their own household. The original large households can be identified using the variable SPLIT, reconstructed using the variable SPLITHID, and the original count is found in the variable SPLITNUM.

    • Coded variables derived from string variables are still in progress. These variables include: occupation and industry.

    • Missing observations have been allocated and some inconsistencies have been edited for the following variables: SPEAKENG, YRIMMIG, CITIZEN, AGEMARR, AGE, BPL, MBPL, FBPL, LIT, SCHOOL, OWNERSHP, FARM, EMPSTAT, OCC1950, IND1950, MTONGUE, MARST, RACE, SEX, RELATE, CLASSWKR. The flag variables indicating an allocated observation for the associated variables can be included in your extract by clicking the ‘Select data quality flags’ box on the extract summary page.

    • Most inconsistent information was not edite
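The two merges described in the notes above can be sketched as follows. This is a minimal illustration with invented rows, not the provider's procedure. It assumes that SERIAL is the household identifier on household records and that SERIALP carries the same value on person records; the FARM and AGE values are made up.

```python
# Tiny hypothetical extracts. Assumption: SERIAL identifies a household and
# SERIALP holds that same identifier on each person record.
households = [
    {"SERIAL": 1, "FARM": 1},
    {"SERIAL": 2, "FARM": 2},
]
persons = [
    {"SERIALP": 1, "HISTID": "A", "AGE": 40},
    {"SERIALP": 1, "HISTID": "B", "AGE": 12},
    {"SERIALP": 2, "HISTID": "C", "AGE": 67},
]


def merge_persons_with_households(households, persons):
    """Attach household columns to each person row (left join on SERIALP = SERIAL)."""
    by_serial = {h["SERIAL"]: h for h in households}
    merged = []
    for p in persons:
        row = dict(p)
        row.update(by_serial.get(p["SERIALP"], {}))
        merged.append(row)
    return merged


def link_by_histid(earlier, later):
    """Pair person rows from two census years that share the same HISTID."""
    later_by_id = {p["HISTID"]: p for p in later}
    return [(p, later_by_id[p["HISTID"]]) for p in earlier if p["HISTID"] in later_by_id]


full = merge_persons_with_households(households, persons)
```

With real extracts the same joins would normally be done in a dataframe or SQL engine; the dictionary lookup here only makes the join keys explicit.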

  2. Census Data

    • catalog.data.gov
    • data.globalchange.gov
    • +2 more
    Updated Mar 1, 2024
    Cite
    U.S. Bureau of the Census (2024). Census Data [Dataset]. https://catalog.data.gov/dataset/census-data
    Dataset updated
    Mar 1, 2024
    Dataset provided by
    U.S. Bureau of the Census
    Description

    The Bureau of the Census has released Census 2000 Summary File 1 (SF1) 100-Percent data. The file includes the following population items: sex, age, race, Hispanic or Latino origin, household relationship, and household and family characteristics. Housing items include occupancy status and tenure (whether the unit is owner or renter occupied). SF1 does not include information on incomes, poverty status, overcrowded housing or age of housing. These topics will be covered in Summary File 3. Data are available for states, counties, county subdivisions, places, census tracts, block groups, and, where applicable, American Indian and Alaskan Native Areas and Hawaiian Home Lands. The SF1 data are available on the Bureau's web site and may be retrieved from American FactFinder as tables, lists, or maps. Users may also download a set of compressed ASCII files for each state via the Bureau's FTP server. There are over 8000 data items available for each geographic area. The full listing of these data items is available here as a downloadable compressed database file named TABLES.ZIP. The uncompressed file is in FoxPro database file (dbf) format and may be imported into ACCESS, EXCEL, and other software formats.

    While all of this information is useful, the Office of Community Planning and Development has downloaded selected information for all states and areas and is making this information available on the CPD web pages. The tables and data items selected are those items used in the CDBG and HOME allocation formulas plus topics most pertinent to the Comprehensive Housing Affordability Strategy (CHAS), the Consolidated Plan, and similar overall economic and community development plans. The information is contained in five compressed (zipped) dbf tables for each state. When uncompressed, the tables are ready for use with FoxPro and can be imported into ACCESS, EXCEL, and other spreadsheet, GIS and database software. The data are at the block group summary level.

    The first two characters of the file name are the state abbreviation. The next two letters are BG for block group. Each record is labeled with the code and name of the city and county in which it is located so that the data can be summarized to higher-level geography. The last part of the file name describes the contents. The GEO file contains standard Census Bureau geographic identifiers for each block group, such as the metropolitan area code and congressional district code. The only data included in this table are total population and total housing units. POP1 and POP2 contain selected population variables, and selected housing items are in the HU file. The MA05 table data are only for use by State CDBG grantees for the reporting of the racial composition of beneficiaries of Area Benefit activities. The complete package for a state consists of the dictionary file named TABLES and the five data files for the state. The logical record number (LOGRECNO) links the records across tables.
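The file-naming rule and the LOGRECNO link described above can be sketched as below. The example name `ALBGPOP1` is an assumption: the description does not say whether the parts are separated or how the extension looks, so the parser assumes plain concatenation with the extension already stripped. The row contents are invented.

```python
# Table suffixes named in the description above.
KNOWN_TABLES = {"GEO", "POP1", "POP2", "HU", "MA05"}


def parse_table_name(name):
    """Split a name such as 'ALBGPOP1' into (state, summary level, table).

    Assumption: the three parts are concatenated with no separator and the
    file extension has already been removed.
    """
    state, level, table = name[:2].upper(), name[2:4].upper(), name[4:].upper()
    if level != "BG":
        raise ValueError(f"expected 'BG' after the state abbreviation: {name!r}")
    if table not in KNOWN_TABLES:
        raise ValueError(f"unknown table suffix: {table!r}")
    return state, level, table


def link_on_logrecno(left_rows, right_rows):
    """Join the rows of two tables on the shared LOGRECNO key (inner join)."""
    right_by_key = {r["LOGRECNO"]: r for r in right_rows}
    return [
        {**left, **right_by_key[left["LOGRECNO"]]}
        for left in left_rows
        if left["LOGRECNO"] in right_by_key
    ]


# Invented rows standing in for a GEO table and an HU table.
geo = [{"LOGRECNO": 1, "COUNTY": "001"}, {"LOGRECNO": 2, "COUNTY": "003"}]
hu = [{"LOGRECNO": 2, "UNITS": 410}]
joined = link_on_logrecno(geo, hu)
```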

  3. census-bureau-usa

    • kaggle.com
    zip
    Updated May 18, 2020
    Cite
    Google BigQuery (2020). census-bureau-usa [Dataset]. https://www.kaggle.com/datasets/bigquery/census-bureau-usa
    Available download formats: zip (0 bytes)
    Dataset updated
    May 18, 2020
    Dataset provided by
    BigQuery (https://cloud.google.com/bigquery)
    Authors
    Google BigQuery
    Area covered
    United States
    Description

    Context:

    The United States census count (also known as the Decennial Census of Population and Housing) is a count of every resident of the US. The census occurs every 10 years and is conducted by the United States Census Bureau. Census data is publicly available through the census website, but much of it is available only as summarized data and graphs. The raw data is often difficult to obtain, is typically divided by region, and must be processed and combined to provide information about the nation as a whole.

    Update frequency: Historic (none)

    Dataset source

    United States Census Bureau

    Sample Query

    SELECT zipcode, population
    FROM `bigquery-public-data.census_bureau_usa.population_by_zip_2010`
    WHERE gender = ''
    ORDER BY population DESC
    LIMIT 10
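To see the shape of the sample query without a BigQuery account, the sketch below runs the same filter and ordering against a tiny in-memory SQLite table. The rows are invented, the table has only three of the real table's columns, and the reading of `gender = ''` as "the all-genders total row" is an assumption about how the public table is laid out.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE population_by_zip_2010 (zipcode TEXT, gender TEXT, population INTEGER)"
)
# Invented rows: one presumed total row (gender = '') plus two gender rows per ZIP code.
conn.executemany(
    "INSERT INTO population_by_zip_2010 VALUES (?, ?, ?)",
    [
        ("00001", "", 300), ("00001", "male", 140), ("00001", "female", 160),
        ("00002", "", 500), ("00002", "male", 260), ("00002", "female", 240),
    ],
)
# Same filter, ordering and limit as the sample query, against the local table.
top = conn.execute(
    "SELECT zipcode, population FROM population_by_zip_2010 "
    "WHERE gender = '' ORDER BY population DESC LIMIT 10"
).fetchall()
conn.close()
```

Filtering on the empty gender value keeps one row per ZIP code, so the per-gender rows do not crowd the top ten.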

    Terms of use

    This dataset is publicly available for anyone to use under the following terms provided by the Dataset Source - http://www.data.gov/privacy-policy#data_policy - and is provided "AS IS" without any warranty, express or implied, from Google. Google disclaims all liability for any damages, direct or indirect, resulting from the use of the dataset.

    See the GCP Marketplace listing for more details and sample queries: https://console.cloud.google.com/marketplace/details/united-states-census-bureau/us-census-data

  4. 2020 USA Census Tracts for USR Search Segments - d0a8d6

    • data-napsg.opendata.arcgis.com
    • prep-response-portal.napsgfoundation.org
    Updated Jun 24, 2025
    Cite
    SARGeo (2025). 2020 USA Census Tracts for USR Search Segments - d0a8d6 [Dataset]. https://data-napsg.opendata.arcgis.com/items/30826c66b8d14e3386b2265f7cd0a8d6
    Dataset updated
    Jun 24, 2025
    Dataset authored and provided by
    SARGeo
    Description

    USA Census Tracts for Urban Search and Rescue. This layer can be used for search segment planning. Census Tracts generally contain between 1,200 and 8,000 people, with an optimum size of 4,000 people, and the boundaries generally follow existing roads and waterways. The field segment_designation is the last 5 digits of the unique identifier and matches the field in the SARCOP Segment layer.

    This layer presents the USA 2020 Census Tract boundaries of the United States in the 50 states and the District of Columbia. It is updated annually as Tract boundaries change. The geography is sourced from US Census Bureau 2020 TIGER FGDB (National Sub-State) and edited using TIGER Hydrology to add a detailed coastline for cartographic purposes. Geography last updated May 2022.

    Attribute fields include 2020 total population from the US Census PL94 data.
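The segment_designation rule above ("the last 5 digits of the unique identifier") is easy to get wrong if the identifier is stored as a number, because leading zeros are lost. A minimal sketch follows. The identifier `41051000100` is a made-up 11-character example, and the helper name is hypothetical.

```python
def segment_designation(unique_id, digits=5):
    """Return the trailing characters of an identifier as a string.

    Keeping the value as text preserves leading zeros such as '00100'.
    """
    unique_id = str(unique_id)
    if len(unique_id) < digits:
        raise ValueError("identifier is shorter than the requested suffix")
    return unique_id[-digits:]


tract_segment = segment_designation("41051000100")
```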

  5. 2020 Census Tracts

    • catalog.data.gov
    • data.oregon.gov
    • +3 more
    Updated Jan 31, 2025
    Cite
    U.S. Department of Commerce, U.S. Census Bureau, Geography Division, Spatial Data Collection and Products Branch (2025). 2020 Census Tracts [Dataset]. https://catalog.data.gov/dataset/census-tracts
    Dataset updated
    Jan 31, 2025
    Dataset provided by
    United States Census Bureau (http://census.gov/)
    Description

    This data layer is an element of the Oregon GIS Framework. The TIGER/Line shapefiles and related database files (.dbf) are an extract of selected geographic and cartographic information from the U.S. Census Bureau's Master Address File / Topologically Integrated Geographic Encoding and Referencing (MAF/TIGER) Database (MTDB). The MTDB represents a seamless national file with no overlaps or gaps between parts, however, each TIGER/Line shapefile is designed to stand alone as an independent data set, or they can be combined to cover the entire nation. Census tracts are small, relatively permanent statistical subdivisions of a county or equivalent entity, and were defined by local participants as part of the 2020 Census Participant Statistical Areas Program. The Census Bureau delineated the census tracts in situations where no local participant existed or where all the potential participants declined to participate. The primary purpose of census tracts is to provide a stable set of geographic units for the presentation of census data and comparison back to previous decennial censuses. Census tracts generally have a population size between 1,200 and 8,000 people, with an optimum size of 4,000 people. When first delineated, census tracts were designed to be homogeneous with respect to population characteristics, economic status, and living conditions. The spatial size of census tracts varies widely depending on the density of settlement. Physical changes in street patterns caused by highway construction, new development, and so forth, may require boundary revisions. In addition, census tracts occasionally are split due to population growth, or combined as a result of substantial population decline. Census tract boundaries generally follow visible and identifiable features. 
They may follow legal boundaries such as minor civil division (MCD) or incorporated place boundaries in some States and situations to allow for census tract-to-governmental unit relationships where the governmental boundaries tend to remain unchanged between censuses. State and county boundaries always are census tract boundaries in the standard census geographic hierarchy. In a few rare instances, a census tract may consist of noncontiguous areas. These noncontiguous areas may occur where the census tracts are coextensive with all or parts of legal entities that are themselves noncontiguous. For the 2010 Census and beyond, the census tract code range of 9400 through 9499 was enforced for census tracts that include a majority American Indian population according to Census 2000 data and/or their area was primarily covered by federally recognized American Indian reservations and/or off-reservation trust lands; the code range 9800 through 9899 was enforced for those census tracts that contained little or no population and represented a relatively large special land use area such as a National Park, military installation, or a business/industrial park; and the code range 9900 through 9998 was enforced for those census tracts that contained only water area, no land area.
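The special tract code ranges listed above can be expressed as a small lookup. This sketch assumes the input is the 4-digit basic tract code; for the 6-character TRACTCE field used in TIGER/Line files, the first four characters are taken as the basic code. The category labels are shorthand invented here, not Census Bureau terminology.

```python
def special_tract_category(basic_code):
    """Classify a 4-digit basic census tract code using the ranges described above."""
    code = int(basic_code)
    if 9400 <= code <= 9499:
        return "American Indian area"
    if 9800 <= code <= 9899:
        return "special land use"
    if 9900 <= code <= 9998:
        return "water only"
    return "standard"


def category_from_tractce(tractce):
    """Same classification for a 6-character tract code (basic code plus 2-digit suffix)."""
    return special_tract_category(str(tractce).zfill(6)[:4])
```

For example, `category_from_tractce("940100")` falls in the 9400 to 9499 range, while an ordinary tract such as `"000100"` is classified as standard.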

  6. 2020 Census Block Groups for Urban Search and Rescue - 764585

    • prep-response-portal.napsgfoundation.org
    Updated Jun 24, 2025
    Cite
    SARGeo (2025). 2020 Census Block Groups for Urban Search and Rescue - 764585 [Dataset]. https://prep-response-portal.napsgfoundation.org/datasets/sargeo::2020-census-block-groups-for-urban-search-and-rescue-764585
    Dataset updated
    Jun 24, 2025
    Dataset authored and provided by
    SARGeo
    Description

    USA Census Block Groups (CBG) for Urban Search and Rescue. This layer can be used for search segment planning. Block groups generally contain between 600 and 5,000 people and the boundaries generally follow existing roads and waterways. The field segment_designation is the last 6 digits of the unique identifier and matches the field in the SARCOP Segment layer.

    Data download date: August 12, 2021
    Census tables: P1, P2, P3, P4, H1, P5, Header
    Downloaded from: Census FTP site

    Processing Notes:

    Data was downloaded from the U.S. Census Bureau FTP site, imported into SAS format and joined to the 2020 TIGER boundaries. Boundaries are sourced from the 2020 TIGER/Line Geodatabases. Boundaries have been projected into Web Mercator and each attribute has been given a clear descriptive alias name. No alterations have been made to the vertices of the data.

    Each attribute maintains its specified name from Census, but also has a descriptive alias name and long description derived from the technical documentation provided by the Census. For a detailed list of the attributes contained in this layer, view the Data tab and select "Fields".

    The following alterations have been made to the tabular data:

    • Joined all tables to create one wide attribute table:
        P1 - Race
        P2 - Hispanic or Latino, and not Hispanic or Latino by Race
        P3 - Race for the Population 18 Years and Over
        P4 - Hispanic or Latino, and not Hispanic or Latino by Race for the Population 18 Years and Over
        H1 - Occupancy Status (Housing)
        P5 - Group Quarters Population by Group Quarters Type (correctional institutions, juvenile facilities, nursing facilities/skilled nursing, college/university student housing, military quarters, etc.)
        Header
    • After joining, dropped fields: FILEID, STUSAB, CHARITER, CIFSN, LOGRECNO, GEOVAR, GEOCOMP, LSADC, and BLOCK.
    • GEOCOMP was renamed to GEOID and moved to be the first column in the table; the original GEOID was dropped.
    • Placeholder fields for future legislative districts have been dropped: CD118, CD119, CD120, CD121, SLDU22, SLDU24, SLDU26, SLDU28, SLDL22, SLDL24, SLDL26, SLDL28.
    • P0020001 was dropped, as it is duplicative of P0010001. Similarly, P0040001 was dropped, as it is duplicative of P0030001.
    • In addition to calculated fields, County_Name and State_Name were added.
    • The following calculated fields have been added (see long field descriptions in the Data tab for formulas used):
        PCT_P0030001: Percent of Population 18 Years and Over
        PCT_P0020002: Percent Hispanic or Latino
        PCT_P0020005: Percent White alone, not Hispanic or Latino
        PCT_P0020006: Percent Black or African American alone, not Hispanic or Latino
        PCT_P0020007: Percent American Indian and Alaska Native alone, not Hispanic or Latino
        PCT_P0020008: Percent Asian alone, not Hispanic or Latino
        PCT_P0020009: Percent Native Hawaiian and Other Pacific Islander alone, not Hispanic or Latino
        PCT_P0020010: Percent Some Other Race alone, not Hispanic or Latino
        PCT_P0020011: Percent Population of Two or More Races, not Hispanic or Latino
        PCT_H0010002: Percent of Housing Units that are Occupied
        PCT_H0010003: Percent of Housing Units that are Vacant

    Please note these percentages might look strange at the individual block group level, since this data has been protected using differential privacy.*

    *To protect the privacy and confidentiality of respondents, data has been protected using differential privacy techniques by the U.S. Census Bureau. This means that some individual block groups will have values that are inconsistent or improbable. However, when aggregated up, these issues become minimized. The pop-up on this layer uses Arcade to display aggregated values for the surrounding area rather than values for the block group itself.

    Download Census redistricting data in this layer as a file geodatabase.

    Additional links:
    • U.S. Census Bureau
    • U.S. Census Bureau Decennial Census
    • About the 2020 Census
    • 2020 Census
    • 2020 Census data quality
    • Decennial Census P.L. 94-171 Redistricting Data Program
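The effect of aggregation on the calculated percentages can be illustrated with a short sketch. The formula is an assumption (occupied units divided by total units, times 100), since the description defers to the layer's field descriptions for the actual formulas. It takes H0010001 as total housing units and H0010002 as occupied units, and the three rows are invented, including one deliberately improbable block group.

```python
def pct(numerator, denominator):
    """Percentage, or None when the denominator is zero."""
    return None if denominator == 0 else 100.0 * numerator / denominator


# Invented block-group rows: H0010001 = total housing units, H0010002 = occupied units.
block_groups = [
    {"GEOID": "A", "H0010001": 10, "H0010002": 11},    # improbable: occupied > total
    {"GEOID": "B", "H0010001": 200, "H0010002": 180},
    {"GEOID": "C", "H0010001": 190, "H0010002": 169},
]

# Per-block-group percentages can exceed 100 when the inputs are noisy.
per_block_group = {r["GEOID"]: pct(r["H0010002"], r["H0010001"]) for r in block_groups}

# Summing counts first and dividing once gives a steadier figure for the wider area.
total_units = sum(r["H0010001"] for r in block_groups)
occupied_units = sum(r["H0010002"] for r in block_groups)
area_pct_occupied = pct(occupied_units, total_units)
```

Summing the counts before dividing, rather than averaging the per-block-group percentages, is the design choice that lets small inconsistencies wash out.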

  7. Census of Population, 1880 [United States]: Public Use Sample (1 in 1000...

    • search.gesis.org
    Updated Feb 1, 2001
    Cite
    GESIS search (2001). Census of Population, 1880 [United States]: Public Use Sample (1 in 1000 Preliminary Subsample) - Archival Version [Dataset]. http://doi.org/10.3886/ICPSR09474
    Dataset updated
    Feb 1, 2001
    Dataset provided by
    ICPSR - Interuniversity Consortium for Political and Social Research
    GESIS search
    License

    https://search.gesis.org/research_data/datasearch-httpwww-da-ra-deoaip--oaioai-da-ra-de445119

    Area covered
    United States
    Description

    Abstract (en): This data collection provides a preliminary subsample of the 1880 Public Use Sample drawn from census enumeration forms. The file contains two types of records: family and person. Each household record is followed by a record for each person in the family. This collection contains information about size of family, number of persons and families in dwelling, and geographic location of each household. Information on individuals includes demographic characteristics, civil condition, occupation, health, education, and nativity.

    Manuscript census records from 1880 for the 38 United States, the District of Columbia, and the Dakota Territory.

    This collection is a nationally representative (although clustered) 1 in 1000 preliminary subsample of the United States population in 1880. The subsample is based on every tenth microfilm reel of enumeration forms (there are a total of 1,454 reels) and, within each reel, on the census page itself. In terms of the Public Use Sample as a whole, a sample density of 1 person per 100 was chosen so that a single sample point was randomly generated for every two census pages. Sample points were chosen for inclusion in the collection only if the individual selected was the first person listed in the dwelling. Under this procedure each dwelling, family, and individual in the population had a 1 in 100 probability of inclusion in the Public Use Sample. The complete sample, which will be released by the principal investigators in December 1993, will contain approximately 500,000 individuals living in 100,000 families, or 1 percent of the United States population in 1880.

    Funding institution(s): United States Department of Health and Human Services. National Institutes of Health (HD25839).

    (1) This dataset has two levels. The first level ("F" Record Type) contains 29 variables for each of 10,126 families. The second level ("P" Record Type) contains 45 variables for each of 48,786 individuals residing in those families.
    (2) The data contain blanks and alphabetic characters.
    (3) Users will note some differences in code frequencies between certain variables in this collection and the totals listed in the documentation.
    (4) This collection is superseded by CENSUS OF POPULATION, 1880 [UNITED STATES]: PUBLIC USE SAMPLE (ICPSR 6460).
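The sampling arithmetic in the abstract can be checked with a few lines. The sketch treats every record as carrying the same expansion weight, which is a simplification for a clustered sample. The rates and record counts are taken from the description above, and the variable names are illustrative.

```python
from fractions import Fraction

full_sample_rate = Fraction(1, 100)  # Public Use Sample density stated above
reel_fraction = Fraction(1, 10)      # every tenth microfilm reel
subsample_rate = full_sample_rate * reel_fraction

persons_in_subsample = 48_786
families_in_subsample = 10_126

# Under equal weighting, each record stands for 1 / rate members of the population.
weight = 1 / subsample_rate
estimated_population = persons_in_subsample * weight
estimated_families = families_in_subsample * weight

# Population implied by the statement that 500,000 individuals are 1 percent.
implied_population = 500_000 / full_sample_rate
```

Scaling the subsample gives roughly 48.8 million people, in the same range as the 50 million implied by the full 1 percent sample.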

  8. Census of Population and Housing, 1980 [United States]: Person and Housing...

    • icpsr.umich.edu
    • search.datacite.org
    ascii
    Updated Feb 16, 1992
    Cite
    United States. Bureau of the Census (1992). Census of Population and Housing, 1980 [United States]: Person and Housing Unit Counts for Tracts and Minor Civil Divisions [Dataset]. http://doi.org/10.3886/ICPSR07970.v1
    Available download formats: ascii
    Dataset updated
    Feb 16, 1992
    Dataset provided by
    Inter-university Consortium for Political and Social Research (https://www.icpsr.umich.edu/web/pages/)
    Authors
    United States. Bureau of the Census
    License

    https://www.icpsr.umich.edu/web/ICPSR/studies/7970/terms

    Time period covered
    1980
    Area covered
    United States
    Description

    This data collection covers all census tracts and all minor civil divisions (MCD) or census county divisions (CCD) in the United States. All persons in the country were included in the file, which has counts for total population, population in group quarters, number of total housing units, and number of occupied housing units. There are 83,672 records in the file, one for each geographic unit, and each record is arranged in a sort sequence of record type, state, county and then MCD/CCD or tract.

  9. 2020 Decennial Census: B_TOTUS | Components of Coverage for the United...

    • data.census.gov
    Updated Mar 10, 2022
    Cite
    DEC (2022). 2020 Decennial Census: B_TOTUS | Components of Coverage for the United States Household Population (in Thousands) (DEC Decennial Post-Enumeration Survey) [Dataset]. https://data.census.gov/table/DECENNIALPES2020.B_TOTUS
    Dataset updated
    Mar 10, 2022
    Dataset provided by
    United States Census Bureau (http://census.gov/)
    Authors
    DEC
    License

    CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
    License information was derived automatically

    Time period covered
    2020
    Area covered
    United States
    Description

    X Not applicable.

    1. For the national table, someone who should have been counted is considered a correct enumeration if he or she was enumerated anywhere in the United States.
    2. More precisely, enumerated in the search area for the correct basic collection unit. For definitions of basic collection unit and search area, see PES Reports.
    3. Other reasons include fictitious people, those born after April 1, 2020, those who died before April 1, 2020, etc.
    4. These imputations represent people from whom we did not collect sufficient information. Their records are included in the census count.
    5. This number is the PES estimate of people who should have been counted in the PES household universe. It does not include people in group quarters or people living in the Remote Alaska enumeration areas.
    6. Omissions are people who should have been correctly enumerated in the United States household population, but were not. Many of these people may have been accounted for in the whole-person census imputations above.
    7. A negative (positive) estimate of net coverage error indicates an undercount (overcount).

    Note: Estimates are rounded for disclosure avoidance. As a result, counts may not sum to totals shown.

    Source: U.S. Census Bureau, Decennial Statistical Studies Division, 2020 Post-Enumeration Survey (March 2022 Release).

    Post-Enumeration Survey estimates are subject to sampling and nonsampling errors. For information regarding data collection, definitions, sampling error, nonsampling error, and estimation methodology, see: Post-Enumeration Surveys.

  10. Current Population Survey (CPS)

    • search.dataone.org
    • dataverse.harvard.edu
    Updated Nov 21, 2023
    Cite
    Damico, Anthony (2023). Current Population Survey (CPS) [Dataset]. http://doi.org/10.7910/DVN/AK4FDD
    Dataset updated
    Nov 21, 2023
    Dataset provided by
    Harvard Dataverse
    Authors
    Damico, Anthony
    Description

    analyze the current population survey (cps) annual social and economic supplement (asec) with r the annual march cps-asec has been supplying the statistics for the census bureau's report on income, poverty, and health insurance coverage since 1948. wow. the us census bureau and the bureau of labor statistics ( bls) tag-team on this one. until the american community survey (acs) hit the scene in the early aughts (2000s), the current population survey had the largest sample size of all the annual general demographic data sets outside of the decennial census - about two hundred thousand respondents. this provides enough sample to conduct state- and a few large metro area-level analyses. your sample size will vanish if you start investigating subgroups b y state - consider pooling multiple years. county-level is a no-no. despite the american community survey's larger size, the cps-asec contains many more variables related to employment, sources of income, and insurance - and can be trended back to harry truman's presidency. aside from questions specifically asked about an annual experience (like income), many of the questions in this march data set should be t reated as point-in-time statistics. cps-asec generalizes to the united states non-institutional, non-active duty military population. the national bureau of economic research (nber) provides sas, spss, and stata importation scripts to create a rectangular file (rectangular data means only person-level records; household- and family-level information gets attached to each person). to import these files into r, the parse.SAScii function uses nber's sas code to determine how to import the fixed-width file, then RSQLite to put everything into a schnazzy database. you can try reading through the nber march 2012 sas importation code yourself, but it's a bit of a proc freak show. 
this new github repository contains three scripts: 2005-2012 asec - download all microdata.R down load the fixed-width file containing household, family, and person records import by separating this file into three tables, then merge 'em together at the person-level download the fixed-width file containing the person-level replicate weights merge the rectangular person-level file with the replicate weights, then store it in a sql database create a new variable - one - in the data table 2012 asec - analysis examples.R connect to the sql database created by the 'download all microdata' progr am create the complex sample survey object, using the replicate weights perform a boatload of analysis examples replicate census estimates - 2011.R connect to the sql database created by the 'download all microdata' program create the complex sample survey object, using the replicate weights match the sas output shown in the png file below 2011 asec replicate weight sas output.png statistic and standard error generated from the replicate-weighted example sas script contained in this census-provided person replicate weights usage instructions document. click here to view these three scripts for more detail about the current population survey - annual social and economic supplement (cps-asec), visit: the census bureau's current population survey page the bureau of labor statistics' current population survey page the current population survey's wikipedia article notes: interviews are conducted in march about experiences during the previous year. the file labeled 2012 includes information (income, work experience, health insurance) pertaining to 2011. when you use the current populat ion survey to talk about america, subract a year from the data file name. as of the 2010 file (the interview focusing on america during 2009), the cps-asec contains exciting new medical out-of-pocket spending variables most useful for supplemental (medical spending-adjusted) poverty research. 
confidential to sas, spss, stata, sudaan users: why are you still rubbing two sticks together after we've invented the butane lighter? time to transition to r. :D
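
    The replicate-weight step described above can be illustrated with a short sketch. The scripts themselves are written in R; this is plain Python, shown only to make the arithmetic visible. It assumes Fay's method with a coefficient of 0.5 (an assumption, not taken from the description; the census usage instructions document is the authority on the actual factor and number of replicates). The function names and the toy numbers are hypothetical.

```python
import math


def weighted_mean(values, weights):
    """Weighted mean of `values` under one set of survey weights."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


def replicate_standard_error(full_estimate, replicate_estimates, fay_coefficient=0.5):
    """Standard error from replicate estimates under Fay's method.

    With k replicates and Fay coefficient rho, the variance is
    sum((theta_r - theta_0)^2) / (k * (1 - rho)^2).
    Assumption: rho = 0.5, so the factor reduces to 4 / k.
    """
    k = len(replicate_estimates)
    if k == 0:
        raise ValueError("need at least one replicate estimate")
    scale = 1.0 / (k * (1.0 - fay_coefficient) ** 2)
    squared_deviations = sum((r - full_estimate) ** 2 for r in replicate_estimates)
    return math.sqrt(scale * squared_deviations)


# Toy data (made up): four persons, one full-sample weight, two replicate weights.
# A real file carries many more replicate weights per person.
incomes = [20000.0, 35000.0, 50000.0, 80000.0]
full_weights = [100.0, 150.0, 120.0, 80.0]
replicate_weights = [
    [110.0, 140.0, 125.0, 75.0],
    [90.0, 160.0, 115.0, 85.0],
]

full_mean = weighted_mean(incomes, full_weights)
replicate_means = [weighted_mean(incomes, w) for w in replicate_weights]
standard_error = replicate_standard_error(full_mean, replicate_means)
print(round(full_mean, 2), round(standard_error, 2))
```

    The estimate is computed once with the full-sample weight and once per replicate weight; the spread of the replicate estimates around the full-sample estimate gives the standard error.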

  11. Census of Population, 1940 [United States]: Public Use Microdata Sample

    • icpsr.umich.edu
    ascii
    Updated Jan 12, 2006
    United States. Bureau of the Census (2006). Census of Population, 1940 [United States]: Public Use Microdata Sample [Dataset]. http://doi.org/10.3886/ICPSR08236.v1
    Available download formats: ascii
    Dataset updated
    Jan 12, 2006
    Dataset provided by
    Inter-university Consortium for Political and Social Research (https://www.icpsr.umich.edu/web/pages/)
    Authors
    United States. Bureau of the Census
    License

    https://www.icpsr.umich.edu/web/ICPSR/studies/8236/terms

    Time period covered
    1940
    Area covered
    New Hampshire, New Mexico, Connecticut, Vermont, Hawaii, Washington, Maryland, New York (state), Florida, United States
    Description

    The 1940 Census Public Use Microdata Sample Project was assembled through a collaborative effort between the United States Bureau of the Census and the Center for Demography and Ecology at the University of Wisconsin. The collection contains a stratified 1-percent sample of households, with separate records for each household, for each "sample line" respondent, and for each person in the household. These records were encoded from microfilm copies of original handwritten enumeration schedules from the 1940 Census of Population. Geographic identification of the location of the sampled households includes Census regions and divisions, states (except Alaska and Hawaii), standard metropolitan areas (SMAs), and state economic areas (SEAs). Accompanying the data collection is a codebook that includes an abstract, descriptions of sample design, processing procedures and file structure, a data dictionary (record layout), category code lists, and a glossary. Also included is a procedural history of the 1940 Census. Each of the 20 subsamples contains three record types: household, sample line, and person. Household variables describe the location and condition of the household. The sample line records contain variables describing demographic characteristics such as nativity, marital status, number of children, veteran status, wage deductions for Social Security, and occupation. Person records also contain variables describing demographic characteristics including nativity, marital status, family membership, education, employment status, income, and occupation.

  12. 1950 Census Population Schedules, Enumeration District Maps, and Enumeration...

    • registry.opendata.aws
    Updated Apr 1, 2022
    National Archives and Records Administration (NARA) (2022). 1950 Census Population Schedules, Enumeration District Maps, and Enumeration District Descriptions [Dataset]. https://registry.opendata.aws/nara-1950-census/
    Dataset updated
    Apr 1, 2022
    Dataset provided by
    National Archives and Records Administration (http://www.archives.gov/)
    Description

    The 1950 Census population schedules were created by the Bureau of the Census in an attempt to enumerate every person living in the United States on April 1, 1950, although some persons were missed. The 1950 Census population schedules were digitized by the National Archives and Records Administration (NARA) and released publicly on April 1, 2022. The 1950 Census enumeration district maps contain maps of counties, cities, and other minor civil divisions that show enumeration districts, census tracts, and related boundaries and numbers used for each census. The coverage is nationwide and includes territorial areas. The 1950 Census enumeration district descriptions contain written descriptions of census districts, subdivisions, and enumeration districts.

  13. Replication Data for: The use of differential privacy for census data and...

    • search.dataone.org
    • dataverse.harvard.edu
    Updated Nov 14, 2023
    Kenny, Christopher T.; Kuriwaki, Shiro; McCartan, Cory; Rosenman, Evan; Simko, Tyler; Imai, Kosuke (2023). Replication Data for: The use of differential privacy for census data and its impact on redistricting: The case of the 2020 U.S. Census [Dataset]. http://doi.org/10.7910/DVN/TNNSXG
    Dataset updated
    Nov 14, 2023
    Dataset provided by
    Harvard Dataverse
    Authors
    Kenny, Christopher T.; Kuriwaki, Shiro; McCartan, Cory; Rosenman, Evan; Simko, Tyler; Imai, Kosuke
    Description

    Census statistics play a key role in public policy decisions and social science research. However, given the risk of revealing individual information, many statistical agencies are considering disclosure control methods based on differential privacy, which add noise to tabulated data. Unlike other applications of differential privacy, however, census statistics must be postprocessed after noise injection to be usable. We study the impact of the U.S. Census Bureau’s latest disclosure avoidance system (DAS) on a major application of census statistics, the redrawing of electoral districts. We find that the DAS systematically undercounts the population in mixed-race and mixed-partisan precincts, yielding unpredictable racial and partisan biases. While the DAS leads to a likely violation of the “One Person, One Vote” standard as currently interpreted, it does not prevent accurate predictions of an individual’s race and ethnicity. Our findings underscore the difficulty of balancing accuracy and respondent privacy in the Census.

  14. US Census Demographic Data

    • kaggle.com
    zip
    Updated Mar 3, 2019
    MuonNeutrino (2019). US Census Demographic Data [Dataset]. https://www.kaggle.com/muonneutrino/us-census-demographic-data
    Available download formats: zip (11110116 bytes)
    Dataset updated
    Mar 3, 2019
    Authors
    MuonNeutrino
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Context

    This dataset expands on my earlier New York City Census Data dataset. It includes data from the entire country instead of just New York City. The expanded data will allow for much more interesting analyses and will also be much more useful at supporting other data sets.

    Content

    The data here are taken from the DP03 and DP05 tables of the 2015 American Community Survey 5-year estimates. The full datasets and much more can be found at the American FactFinder website. Currently, I include two data files:

    1. acs2015_census_tract_data.csv: Data for each census tract in the US, including DC and Puerto Rico.
    2. acs2015_county_data.csv: Data for each county or county equivalent in the US, including DC and Puerto Rico.

    The two files have the same structure, with just a small difference in the name of the id column. Counties are political subdivisions, and the boundaries of some have been set for centuries. Census tracts, however, are defined by the Census Bureau and are much more consistent in size. A typical census tract has around 5,000 residents.

    The Census Bureau updates the estimates approximately every year. At least some of the 2016 data is already available, so I will likely update this in the near future.

    Acknowledgements

    The data here were collected by the US Census Bureau. As a product of the US federal government, this is not subject to copyright within the US.

    Inspiration

    There are many questions that we could try to answer with the data here. Can we predict things such as the state (classification) or household income (regression)? What kinds of clusters can we find in the data? What other datasets can be improved by the addition of census data?

  15. US Census Bureau Data

    • s.cnmilf.com
    • data.sfgov.org
    Updated Mar 29, 2025
    data.sfgov.org (2025). US Census Bureau Data [Dataset]. https://s.cnmilf.com/user74170196/https/catalog.data.gov/dataset/us-census-bureau-data
    Dataset updated
    Mar 29, 2025
    Dataset provided by
    data.sfgov.org
    Area covered
    United States
    Description

    The Census Bureau conducts nearly one hundred surveys and censuses every year. By law, no one is permitted to reveal information from these censuses and surveys that could identify any person, household, or business. The Decennial Census collects data every 10 years about households, income, education, homeownership, and more. NOTE: Follow the link and search for SAN FRANCISCO data.

  16. Census Data for 2000 from Geolytics

    • search.dataone.org
    Updated Oct 14, 2013
    Cary Institute Of Ecosystem Studies; Jarlath O'Neil-Dunne (2013). Census Data for 2000 from Geolytics [Dataset]. https://search.dataone.org/view/knb-lter-bes.23.570
    Dataset updated
    Oct 14, 2013
    Dataset provided by
    Long Term Ecological Research Network (http://www.lternet.edu/)
    Authors
    Cary Institute Of Ecosystem Studies; Jarlath O'Neil-Dunne
    Time period covered
    Jan 1, 2004 - Nov 17, 2011
    Description

    Geolytics Census 2000 Long Form dataset. The Geolytics Census 2000 Long Form is a comprehensive source of detailed information about the people, housing, and economy of the United States. The Census 2000 Long Form offers the US Census Bureau's entire SF3 dataset. This dataset contains variables such as income, housing, employment, language spoken, ancestry, education, poverty, rent, mortgage, commute to work, etc. There are 5,500 variables at the Block Group level. A select portion of the Geolytics Census data was joined to GDT spatial data by block group, and some census attributes were aggregated. See the attached txt file for a description of the attributes. This is part of a collection of 221 Baltimore Ecosystem Study metadata records that point to a geodatabase. The geodatabase is available online and is quite large. Upon request, and under certain arrangements, it can be shipped on media, such as a USB hard drive. The geodatabase is roughly 51.4 GB in size, consisting of 4,914 files in 160 folders. Although this metadata record and the others like it are not rich with attributes, they are nonetheless made available because the data they represent could be useful.

  17. American Community Survey (ACS)

    • dataverse.harvard.edu
    Updated May 30, 2013
    Anthony Damico (2013). American Community Survey (ACS) [Dataset]. http://doi.org/10.7910/DVN/DKI9L4
    Dataset updated
    May 30, 2013
    Dataset provided by
    Harvard Dataverse
    Authors
    Anthony Damico
    License

    CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
    License information was derived automatically

    Description

    analyze the american community survey (acs) with r and monetdb experimental. think of the american community survey (acs) as the united states' census for off-years - the ones that don't end in zero. every year, one percent of all americans respond, making it the largest complex sample administered by the u.s. government (the decennial census has a much broader reach, but since it attempts to contact 100% of the population, it's not a survey). the acs asks how people live and although the questionnaire only includes about three hundred questions on demography, income, insurance, it's often accurate at sub-state geographies and - depending on how many years are pooled - down to small counties. households are the sampling unit, and once a household gets selected for inclusion, all of its residents respond to the survey. this allows household-level data (like home ownership) to be collected more efficiently and lets researchers examine family structure. the census bureau runs and finances this behemoth, of course.

    the downloadable american community survey ships as two distinct household-level and person-level comma-separated value (.csv) files. merging the two just rectangulates the data, since each person in the person-file has exactly one matching record in the household-file. for analyses of small, smaller, and microscopic geographic areas, choose one-, three-, or five-year pooled files. use as few pooled years as you can, unless you like sentences that start with, "over the period of 2006 - 2010, the average american ... [insert yer findings here]."

    rather than processing the acs public use microdata sample line-by-line, the r language brazenly reads everything into memory by default. to prevent overloading your computer, dr. thomas lumley wrote the sqlsurvey package principally to deal with this ram-gobbling monster. if you're already familiar with syntax used for the survey package, be patient and read the sqlsurvey examples carefully when something doesn't behave as you expect it to - some sqlsurvey commands require a different structure (i.e. svyby gets called through svymean) and others might not exist anytime soon (like svyolr).

    gimme some good news: sqlsurvey uses ultra-fast monetdb (click here for speed tests), so follow the monetdb installation instructions before running this acs code. monetdb imports, writes, recodes data slowly, but reads it hyper-fast. a magnificent trade-off: data exploration typically requires you to think, send an analysis command, think some more, send another query, repeat. importation scripts (especially the ones i've already written for you) can be left running overnight sans hand-holding.

    the acs weights generalize to the whole united states population including individuals living in group quarters, but non-residential respondents get an abridged questionnaire, so most (not all) analysts exclude records with a relp variable of 16 or 17 right off the bat.

    this new github repository contains four scripts:

    • 2005-2011 - download all microdata.R
        • create the batch (.bat) file needed to initiate the monetdb database in the future
        • download, unzip, and import each file for every year and size specified by the user
        • create and save household- and merged/person-level replicate weight complex sample designs
        • create a well-documented block of code to re-initiate the monetdb server in the future

      fair warning: this full script takes a loooong time. run it friday afternoon, commune with nature for the weekend, and if you've got a fast processor and speedy internet connection, monday morning it should be ready for action. otherwise, either download only the years and sizes you need or - if you gotta have 'em all - run it, minimize it, and then don't disturb it for a week.

    • 2011 single-year - analysis examples.R
        • run the well-documented block of code to re-initiate the monetdb server
        • load the r data file (.rda) containing the replicate weight designs for the single-year 2011 file
        • perform the standard repertoire of analysis examples, only this time using sqlsurvey functions

    • 2011 single-year - variable recode example.R
        • run the well-documented block of code to re-initiate the monetdb server
        • copy the single-year 2011 table to maintain the pristine original
        • add a new age category variable by hand
        • add a new age category variable systematically
        • re-create then save the sqlsurvey replicate weight complex sample design on this new table
        • close everything, then load everything back up in a fresh instance of r
        • replicate a few of the census statistics. no muss, no fuss

    • replicate census estimates - 2011.R
        • run the well-documented block of code to re-initiate the monetdb server
        • load the r data file (.rda) containing the replicate weight designs for the single-year 2011 file
        • match every nationwide statistic on the census bureau's estimates page, using sqlsurvey functions

    click here to view these four scripts

    for more detail about the american community survey (acs), visit: the us census...
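
    The group-quarters exclusion mentioned in the description (dropping records with a relp value of 16 or 17) can be sketched as a simple filter. This is a minimal Python illustration, not the repository's R code; the field names other than relp and the toy records are made up.

```python
GROUP_QUARTERS_RELP = {16, 17}  # codes named in the description above


def exclude_group_quarters(person_records, relp_field="relp"):
    """Return only the records whose relp code is not 16 or 17."""
    return [
        record
        for record in person_records
        if int(record[relp_field]) not in GROUP_QUARTERS_RELP
    ]


# Toy person records (made up); real rows would come from the merged acs table.
people = [
    {"household": "001", "relp": "0", "age": 44},
    {"household": "001", "relp": "2", "age": 12},
    {"household": "002", "relp": "16", "age": 71},
    {"household": "003", "relp": "17", "age": 19},
]
household_population = exclude_group_quarters(people)
print(len(household_population))
```

    Whether to apply this filter depends on the analysis; the description notes that most, but not all, analysts do.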

  18. 2020 Decennial Census: B_TOTPR | Components of Coverage for the Puerto Rico...

    • data.census.gov
    Updated Aug 15, 2022
    DEC (2022). 2020 Decennial Census: B_TOTPR | Components of Coverage for the Puerto Rico Household Population (DEC Decennial Post-Enumeration Survey) [Dataset]. https://data.census.gov/table/DECENNIALPES2020.B_TOTPR
    Dataset updated
    Aug 15, 2022
    Dataset provided by
    United States Census Bureau (http://census.gov/)
    Authors
    DEC
    License

    CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
    License information was derived automatically

    Time period covered
    2020
    Description

    X Not applicable.

    1. For this table, someone who should have been counted is considered a correct enumeration if he or she was enumerated anywhere in Puerto Rico.
    2. More precisely, enumerated in the search area for the correct basic collection unit. For definitions of basic collection unit and search area, see PES Reports.
    3. Other reasons include fictitious people, those born after April 1, 2020, those who died before April 1, 2020, etc.
    4. These imputations represent people from whom we did not collect sufficient information. Their records are included in the census count.
    5. This number is the PES estimate of people who should have been counted in the PES household universe. It does not include people in group quarters.
    6. Omissions were people who should have been correctly enumerated in Puerto Rico, but were not. Many of these people may have been accounted for in the whole-person census imputations above.
    7. A negative (positive) estimate of net coverage error indicates an undercount (overcount).

    Note: Estimates are rounded for disclosure avoidance. As a result, counts may not sum to totals shown.

    Source: U.S. Census Bureau, Decennial Statistical Studies Division, 2020 Post-Enumeration Survey (August 2022 Release).

    For information regarding data collection, definitions, sampling error, nonsampling error, and estimation methodology, see: Post-Enumeration Surveys.

  19. Metadata for Census 2010 Restricted-Use Microdata

    • search.gesis.org
    U.S. Census Bureau, Metadata for Census 2010 Restricted-Use Microdata [Dataset]. http://doi.org/10.3886/E101222V1-4873
    Dataset provided by
    ICPSR - Interuniversity Consortium for Political and Social Research
    GESIS search
    Authors
    U.S. Census Bureau
    License

    https://search.gesis.org/research_data/datasearch-httpwww-da-ra-deoaip--oaioai-da-ra-de616008

    Description

    Abstract (en): The U.S. Census counts every resident in the United States. It is mandated by Article I, Section 2 of the Constitution and takes place every 10 years. The basic purpose of the census is apportionment and redistricting. "Apportionment" is the process of dividing the 435 memberships, or seats, in the House of Representatives among the 50 states based on the population figures collected during the decennial census. "Redistricting" is the process of geographically defining state legislative districts. The census data allow state officials to realign congressional and state legislative districts in their states, taking into account population shifts since the last census and assuring equal representation for their constituents in compliance with the “one-person, one-vote” principle of the 1965 Voting Rights Act. The resident population of the United States

  20. Israel Unemployment: 2008 Census: Female: Duration of Search for Work: 27...

    • ceicdata.com
    CEICdata.com, Israel Unemployment: 2008 Census: Female: Duration of Search for Work: 27 Weeks and More [Dataset]. https://www.ceicdata.com/en/israel/unemployment-2008-census/unemployment-2008-census-female-duration-of-search-for-work-27-weeks-and-more
    Dataset provided by
    CEICdata.com
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Time period covered
    Jun 1, 2015 - Mar 1, 2018
    Area covered
    Israel
    Variables measured
    Unemployment
    Description

    Israel Unemployment: 2008 Census: Female: Duration of Search for Work: 27 Weeks and More data was reported at 9.449 Person th in Sep 2018. This records a decrease from the previous number of 10.333 Person th for Jun 2018. Israel Unemployment: 2008 Census: Female: Duration of Search for Work: 27 Weeks and More data is updated quarterly, averaging 18.434 Person th from Mar 2011 (Median) to Sep 2018, with 31 observations. The data reached an all-time high of 31.431 Person th in Dec 2012 and a record low of 7.759 Person th in Mar 2018. Israel Unemployment: 2008 Census: Female: Duration of Search for Work: 27 Weeks and More data remains active status in CEIC and is reported by Central Bureau of Statistics. The data is categorized under Global Database’s Israel – Table IL.G029: Unemployment: 2008 Census.

Stanford Center for Population Health Sciences (2020). Historic US census - 1930 [Dataset]. http://doi.org/10.57761/6e5q-rh85

Historic US census - 1930

Available download formats: application/jsonl, parquet, spss, csv, arrow, stata, avro, sas
Dataset updated
Jan 10, 2020
Dataset provided by
Redivis Inc.
Authors
Stanford Center for Population Health Sciences
Time period covered
Jan 1, 1930 - Dec 31, 1930
Area covered
United States
Description

Abstract

The Integrated Public Use Microdata Series (IPUMS) Complete Count Data include more than 650 million individual-level and 7.5 million household-level records. The microdata are the result of collaboration between IPUMS and the nation’s two largest genealogical organizations, Ancestry.com and FamilySearch, and provide the largest and richest source of individual-level and household-level data.

Before Manuscript Submission

All manuscripts (and other items you'd like to publish) must be submitted to

phsdatacore@stanford.edu for approval prior to journal submission.

We will check your cell sizes and citations.

For more information about how to cite PHS and PHS datasets, please visit:

https://phsdocs.developerhub.io/need-help/citing-phs-data-core

Documentation

This dataset was created on 2020-01-10 22:52:11.461 by merging multiple datasets together. The source datasets for this version were:

IPUMS 1930 households: This dataset includes all households from the 1930 US census.

IPUMS 1930 persons: This dataset includes all individuals from the 1930 US census.

IPUMS 1930 Lookup: This dataset includes variable names, variable labels, variable values, and corresponding variable value labels for the IPUMS 1930 datasets.

Section 2

Historic data are scarce and often exist only in aggregate tables. The key advantage of historic US census data is the availability of individual- and household-level characteristics that researchers can tabulate in ways that benefit their specific research questions. The data contain demographic variables, economic variables, migration variables and family variables. Within households, it is possible to create relational data, as all relations between household members are known. For example, having data on the mother and her children in a household enables researchers to calculate the mother’s age at birth. Another advantage of the Complete Count data is the possibility to follow individuals over time using a historical identifier.

In sum: the historic US census data are a unique source for research on social and economic change and can provide population health researchers with information about social and economic determinants.
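
The mother's-age-at-birth example can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the field names `pernum` (a person's position within the household), `momloc` (the position of that person's mother, 0 if she is not in the household) and `age` are hypothetical stand-ins for whatever relational variables the extract provides, and the toy household is made up. Because census ages are recorded in completed years, the result is approximate.

```python
def mothers_age_at_birth(household):
    """For each child whose mother is in the household, return
    (child's person number, mother's age minus child's age)."""
    by_pernum = {person["pernum"]: person for person in household}
    results = []
    for child in household:
        # momloc == 0 (or missing) means no mother is recorded in the household
        mother = by_pernum.get(child.get("momloc", 0))
        if mother is None:
            continue
        results.append((child["pernum"], mother["age"] - child["age"]))
    return results


# Toy household (made up): two adults and two children of person 2.
household = [
    {"pernum": 1, "age": 40, "momloc": 0},
    {"pernum": 2, "age": 38, "momloc": 0},
    {"pernum": 3, "age": 12, "momloc": 2},
    {"pernum": 4, "age": 9, "momloc": 2},
]
print(mothers_age_at_birth(household))
```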

The historic US 1930 census data were collected in April 1930. Enumerators collected data by traveling to households and counting the residents who regularly slept at the household. Individuals lacking permanent housing were counted as residents of the place where they were when the data were collected. Household members absent on the day of data collection were either listed with the household with the help of other household members or were scheduled for the last census subdivision.

Notes

  • We provide IPUMS household and person data separately so that it is convenient to explore the descriptive statistics on each level. To obtain a full dataset, merge the household and person data on the variables SERIAL and SERIALP. To create a longitudinal dataset, merge datasets on the variable HISTID.

  • Households with more than 60 people in the original data were broken up for processing purposes. Every person in a large household is considered to be in their own household. The original large households can be identified using the variable SPLIT and reconstructed using the variable SPLITHID, and the original count is found in the variable SPLITNUM.

  • Coded variables derived from string variables are still in progress. These variables include: occupation and industry.

  • Missing observations have been allocated and some inconsistencies have been edited for the following variables: SPEAKENG, YRIMMIG, CITIZEN, AGEMARR, AGE, BPL, MBPL, FBPL, LIT, SCHOOL, OWNERSHP, FARM, EMPSTAT, OCC1950, IND1950, MTONGUE, MARST, RACE, SEX, RELATE, CLASSWKR. The flag variables indicating an allocated observation for the associated variables can be included in your extract by clicking the ‘Select data quality flags’ box on the extract summary page.

  • Most inconsistent information was not edite
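
The household-person merge described in the first note can be sketched as follows. This is a minimal Python illustration, assuming that SERIAL is the household identifier in the household data and SERIALP is the matching identifier in the person data (the note names both variables but does not say which file holds which). The toy records and their values are made up.

```python
def merge_person_household(households, persons):
    """Attach each person's household record to the person record,
    matching persons' SERIALP to households' SERIAL."""
    household_by_serial = {h["SERIAL"]: h for h in households}
    merged = []
    for person in persons:
        household = household_by_serial.get(person["SERIALP"])
        if household is None:
            continue  # skip persons with no matching household record
        row = dict(household)  # copy so the household record is not mutated
        row.update(person)     # person fields win on any name collision
        merged.append(row)
    return merged


# Toy records (made up values).
households = [
    {"SERIAL": 1, "OWNERSHP": 1, "FARM": 1},
    {"SERIAL": 2, "OWNERSHP": 2, "FARM": 2},
]
persons = [
    {"SERIALP": 1, "AGE": 40, "SEX": 1},
    {"SERIALP": 1, "AGE": 12, "SEX": 2},
    {"SERIALP": 2, "AGE": 65, "SEX": 2},
]
full = merge_person_household(households, persons)
print(len(full))
```

The result has one row per person, each carrying its household's variables, which is the "full dataset" the note refers to.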
