10 datasets found
  1. Historic US census - 1930

    • redivis.com
    application/jsonl +7
    Updated Jan 10, 2020
    + more versions
    Cite
    Stanford Center for Population Health Sciences (2020). Historic US census - 1930 [Dataset]. http://doi.org/10.57761/6e5q-rh85
    Explore at:
    Available download formats: application/jsonl, parquet, spss, csv, arrow, stata, avro, sas
    Dataset updated
    Jan 10, 2020
    Dataset provided by
    Redivis Inc.
    Authors
    Stanford Center for Population Health Sciences
    Time period covered
    Jan 1, 1930 - Dec 31, 1930
    Area covered
    United States
    Description

    Abstract

    The Integrated Public Use Microdata Series (IPUMS) Complete Count Data include more than 650 million individual-level and 7.5 million household-level records. The microdata are the result of collaboration between IPUMS and the nation’s two largest genealogical organizations, Ancestry.com and FamilySearch, and provide the largest and richest source of individual-level and household-level data.

    Before Manuscript Submission

    All manuscripts (and other items you'd like to publish) must be submitted to phsdatacore@stanford.edu for approval prior to journal submission.

    We will check your cell sizes and citations.

    For more information about how to cite PHS and PHS datasets, please visit:

    https://phsdocs.developerhub.io/need-help/citing-phs-data-core

    Documentation

    This dataset was created on 2020-01-10 22:52:11.461 by merging multiple datasets together. The source datasets for this version were:

    IPUMS 1930 households: This dataset includes all households from the 1930 US census.

    IPUMS 1930 persons: This dataset includes all individuals from the 1930 US census.

    IPUMS 1930 Lookup: This dataset includes variable names, variable labels, variable values, and corresponding variable value labels for the IPUMS 1930 datasets.

    Section 2

    Historic data are scarce and often exist only in aggregate tables. The key advantage of historic US census data is the availability of individual- and household-level characteristics that researchers can tabulate in ways that benefit their specific research questions. The data contain demographic, economic, migration, and family variables. Within households, it is possible to create relational data because all relations between household members are known. For example, having data on the mother and her children in a household enables researchers to calculate the mother’s age at birth. Another advantage of the Complete Count data is the possibility of following individuals over time using a historical identifier.

    In sum: the historic US census data are a unique source for research on social and economic change and can provide population health researchers with information about social and economic determinants.

    The historic US 1930 census data were collected in April 1930. Enumerators collected data by traveling to households and counting the residents who regularly slept at the household. Individuals lacking permanent housing were counted as residents of the place where they were when the data were collected. Household members absent on the day of data collection were either listed to the household with the help of other household members or were scheduled for the last census subdivision.

    Notes

    • We provide IPUMS household and person data separately so that it is convenient to explore the descriptive statistics on each level. To obtain a full dataset, merge the household and person data on the variables SERIAL and SERIALP. To create a longitudinal dataset, merge datasets on the variable HISTID.

    • Households with more than 60 people in the original data were broken up for processing purposes. Every person in a large household is considered to be in their own household. The original large households can be identified using the variable SPLIT and reconstructed using the variable SPLITHID; the original count is found in the variable SPLITNUM.

    • Coded variables derived from string variables are still in progress. These variables include: occupation and industry.

    • Missing observations have been allocated and some inconsistencies have been edited for the following variables: SPEAKENG, YRIMMIG, CITIZEN, AGEMARR, AGE, BPL, MBPL, FBPL, LIT, SCHOOL, OWNERSHP, FARM, EMPSTAT, OCC1950, IND1950, MTONGUE, MARST, RACE, SEX, RELATE, CLASSWKR. The flag variables indicating an allocated observation for the associated variables can be included in your extract by clicking the ‘Select data quality flags’ box on the extract summary page.

    • Most inconsistent information was not edite
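The merges described in the notes above can be sketched in plain Python. This is a minimal illustration, not the provider's workflow: it assumes SERIAL is the household key on the household file and SERIALP is the matching key on the person file, and all rows, values, and function names below are made up for the example.

```python
# Made-up rows standing in for the household and person tables.
households = [
    {"SERIAL": 1, "OWNERSHP": 1, "FARM": 1},
    {"SERIAL": 2, "OWNERSHP": 2, "FARM": 2},
]
persons_1930 = [
    {"SERIALP": 1, "HISTID": "A", "AGE": 34},
    {"SERIALP": 1, "HISTID": "B", "AGE": 6},
    {"SERIALP": 2, "HISTID": "C", "AGE": 51},
]
# Hypothetical person rows from another census year, for the HISTID link.
persons_other_year = [
    {"HISTID": "A", "AGE": 24},
    {"HISTID": "C", "AGE": 41},
]


def merge_households_persons(households, persons):
    """Inner join: attach household fields to each person (SERIAL = SERIALP)."""
    by_serial = {h["SERIAL"]: h for h in households}
    merged = []
    for person in persons:
        household = by_serial.get(person["SERIALP"])
        if household is not None:
            merged.append({**household, **person})
    return merged


def link_on_histid(earlier, later):
    """Inner join of two person tables on HISTID, returning (earlier, later) pairs."""
    earlier_by_id = {p["HISTID"]: p for p in earlier}
    return [
        (earlier_by_id[p["HISTID"]], p)
        for p in later
        if p["HISTID"] in earlier_by_id
    ]


full = merge_households_persons(households, persons_1930)
linked = link_on_histid(persons_other_year, persons_1930)
```

Each row of `full` carries both the household variables and the person variables; `linked` holds only the individuals found in both tables.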

  2. Persons

    • redivis.com
    Updated Jan 10, 2020
    + more versions
    Cite
    Stanford Center for Population Health Sciences (2020). Persons [Dataset]. https://redivis.com/datasets/hs2s-9ff789s72
    Explore at:
    Dataset updated
    Jan 10, 2020
    Dataset authored and provided by
    Stanford Center for Population Health Sciences
    Time period covered
    1930
    Description

    This dataset includes all individuals from the 1930 US census.

  3. Households

    • redivis.com
    Updated Jan 10, 2020
    + more versions
    Cite
    Stanford Center for Population Health Sciences (2020). Households [Dataset]. https://redivis.com/datasets/hs2s-9ff789s72
    Explore at:
    Dataset updated
    Jan 10, 2020
    Dataset authored and provided by
    Stanford Center for Population Health Sciences
    Time period covered
    1930
    Description

    This dataset includes all households from the 1930 US census.

  4. Lookup

    • redivis.com
    Updated Jan 10, 2020
    + more versions
    Cite
    Stanford Center for Population Health Sciences (2020). Lookup [Dataset]. https://redivis.com/datasets/hs2s-9ff789s72
    Explore at:
    Dataset updated
    Jan 10, 2020
    Dataset authored and provided by
    Stanford Center for Population Health Sciences
    Description

    This dataset includes variable names, variable labels, variable values, and corresponding variable value labels for the IPUMS 1930 datasets.

  5. Census Linking Project: 1920-1930 Crosswalk

    • dataverse.harvard.edu
    • search.dataone.org
    Updated Jul 31, 2025
    + more versions
    Cite
    Ran Abramitzky; Leah Boustan; Katherine Eriksson; Myera Rashid; Santiago Pérez (2025). Census Linking Project: 1920-1930 Crosswalk [Dataset]. http://doi.org/10.7910/DVN/JCNEX2
    Explore at:
    Croissant (a format for machine-learning datasets; see mlcommons.org/croissant)
    Dataset updated
    Jul 31, 2025
    Dataset provided by
    Harvard Dataverse
    Authors
    Ran Abramitzky; Leah Boustan; Katherine Eriksson; Myera Rashid; Santiago Pérez
    License

    https://dataverse.harvard.edu/api/datasets/:persistentId/versions/3.0/customlicense?persistentId=doi:10.7910/DVN/JCNEX2

    Description

    This crosswalk consists of individuals matched between the 1920 and 1930 complete-count US Censuses. Within the crosswalk, users have the option to select the linking method with which these matches were created. This version of the crosswalk contains links made by the ABE-exact (conservative and standard) method, the ABE-NYSIIS (conservative and standard) method, ABE-EI exact (conservative and standard) method, and the ABE-EI NYSIIS (conservative and standard) method, with variants in which race is used as a matching variable. For any chosen method, users can merge into this crosswalk a wide set of individual- and household-level variables provided publicly by IPUMS, thereby creating a historical longitudinal dataset for analysis.

  6. Historic Redlining Scores for 2010 and 2020 US Census Tracts

    • openicpsr.org
    spss
    Updated May 25, 2021
    + more versions
    Cite
    Helen C.S. Meier; Bruce C. Mitchell (2021). Historic Redlining Scores for 2010 and 2020 US Census Tracts [Dataset]. http://doi.org/10.3886/E141121V2
    Explore at:
    Available download formats: spss
    Dataset updated
    May 25, 2021
    Dataset provided by
    National Community Reinvestment Coalition
    University of Michigan. Institute for Social Research. Survey Research Center
    Authors
    Helen C.S. Meier; Bruce C. Mitchell
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    United States
    Description

    The Home Owners’ Loan Corporation (HOLC) was a U.S. federal agency that graded mortgage investment risk of neighborhoods across the U.S. between 1935 and 1940. HOLC residential security maps standardized neighborhood risk appraisal methods that included race and ethnicity, pioneering the institutional logic of residential “redlining.” The Mapping Inequality Project digitized the HOLC mortgage security risk maps from the 1930s. We overlaid the HOLC maps with 2010 and 2020 census tracts for 142 cities across the U.S. using ArcGIS and determined the proportion of HOLC residential security grades contained within the boundaries. We assigned a numerical value to each HOLC risk category as follows: 1 for “A” grade, 2 for “B” grade, 3 for “C” grade, and 4 for “D” grade. We calculated a historic redlining score from the summed proportion of HOLC residential security grades multiplied by a weighting factor based on area within each census tract. A higher score means greater redlining of the census tract. Continuous historic redlining score, assessing the degree of “redlining,” as well as 4 equal interval divisions of redlining, can be linked to existing data sources by census tract identifier allowing for one form of structural racism in the housing market to be assessed with a variety of outcomes. The 2010 files are set to census 2010 tract boundaries. The 2020 files use the new census 2020 tract boundaries, reflecting the increase in the number of tracts from 12,888 in 2010, to 13,488 in 2020. Use the 2010 HRS with decennial census 2010 or ACS 2010-2019 data. As of publication (10/15/2020) decennial census 2020 data for the P1 (population) and H1 (housing) files are available from census.
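The scoring rule described above can be illustrated with a short sketch. It is one plausible reading, not the authors' published procedure: it assumes the weighting factor is each grade's share of the tract's HOLC-graded area, and the function name and input values are hypothetical.

```python
# Numerical values assigned to HOLC grades, as stated in the description.
GRADE_VALUES = {"A": 1, "B": 2, "C": 3, "D": 4}


def redlining_score(grade_areas):
    """Area-weighted mean HOLC grade value for one census tract.

    grade_areas maps a grade letter to the area of the tract carrying that
    grade (any consistent unit). Assumption: weights are shares of the
    tract's graded area, so the score falls between 1 and 4.
    """
    total = sum(grade_areas.values())
    if total <= 0:
        raise ValueError("tract has no HOLC-graded area")
    return sum(GRADE_VALUES[g] * area / total for g, area in grade_areas.items())


# A made-up tract with one unit of "C" area and three units of "D" area.
score = redlining_score({"C": 1.0, "D": 3.0})
```

For this made-up tract the score is (3 × 1 + 4 × 3) / 4 = 3.75, closer to the "D" end of the scale, consistent with a higher score meaning greater redlining.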

  7. Data from: Neighborhood Socioeconomic and demographic changes in Baltimore's...

    • portal.edirepository.org
    • search.dataone.org
    • +1 more
    csv, zip
    Updated Mar 5, 2019
    Cite
    Dexter Locke (2019). Neighborhood Socioeconomic and demographic changes in Baltimore's (BES) Neighborhoods: 1930 to 2010 [Dataset]. http://doi.org/10.6073/pasta/346d11d1e409ac395d18f5619b896336
    Explore at:
    Available download formats: zip (1627908 bytes), csv (408211 bytes)
    Dataset updated
    Mar 5, 2019
    Dataset provided by
    EDI
    Authors
    Dexter Locke
    Time period covered
    1930 - 2017
    Area covered
    Variables measured
    Name, p_own, p_black, p_eduHS, p_white, time_yr, Comments, neigh_yr, p_eduCOL, p_vacant, and 5 more
    Description

    This dataset was created primarily to map and track socioeconomic and demographic variables from the US Census Bureau from year 1940 to year 2010, by decade, within the City of Baltimore's Mayor's Office of Information Technology (MOIT) year 2010 neighborhood boundaries. The socioeconomic and demographic variables include the percent White, percent African American, percent owner occupied homes, percent vacant homes, the percentage of age 25 and older people with a high school education or greater, and the percentage of age 25 and older people with a college education or greater. Percent White and percent African American are also provided for year 1930. Each of the year 2010 neighborhood boundaries was also attributed with the 1937 Home Owners' Loan Corporation (HOLC) definition of neighborhoods via spatial overlay. HOLC rated neighborhoods as A, B, C, D or Undefined. HOLC categorized the perceived safety and risk of mortgage refinance lending in metropolitan areas using a hierarchical grading scale of A, B, C, and D. A and B areas were considered the safest areas for federal investment due to their newer housing as well as higher earning and racially homogenous households. In contrast, C and D graded areas were viewed to be in a state of inevitable decline, depreciation, and decay, and thus risky for federal investment, due to their older housing stock and racial and ethnic composition. This policy was inherently a racist practice. Places were graded based on who lived there; poor areas with people of color were labeled as lower and less-than. HOLC's 1937 neighborhoods do not cover the entire extent of the year 2010 neighborhood boundaries. The neighborhood boundaries were also augmented to include which of the year 2017 Housing Market Typology (HMT) the 2010 neighborhoods fall within. Finally, the neighborhood boundaries were also augmented to include tree canopy and tree canopy change from year 2007 to year 2015.

  8. Deep Roots of Racial Inequalities in US Healthcare: The 1906 American...

    • portal.sds.ox.ac.uk
    txt
    Updated Dec 5, 2023
    Cite
    Benjamin Chrisinger (2023). Deep Roots of Racial Inequalities in US Healthcare: The 1906 American Medical Directory [Dataset]. http://doi.org/10.25446/oxford.24065709.v2
    Explore at:
    Available download formats: txt
    Dataset updated
    Dec 5, 2023
    Dataset provided by
    University of Oxford
    Authors
    Benjamin Chrisinger
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    United States
    Description

    This dataset comprises physician-level entries from the 1906 American Medical Directory, the first in a series of semi-annual directories of all practicing physicians published by the American Medical Association [1]. Physicians are consistently listed by city, county, and state. Most records also include details about the place and date of medical training. From 1906-1940, Directories also identified the race of black physicians [2].

    This dataset comprises physician entries for a subset of US states and the District of Columbia, including all of the South and several adjacent states (Alabama, Arkansas, Delaware, Florida, Georgia, Kansas, Kentucky, Louisiana, Maryland, Mississippi, Missouri, North Carolina, Oklahoma, South Carolina, Tennessee, Texas, Virginia, West Virginia). Records were extracted via manual double-entry by a professional data management company [3], and place names were matched to latitude/longitude coordinates. The main source for geolocating physician entries was the US Census. Historical Census records were sourced from IPUMS National Historical Geographic Information System [4]. Additionally, a public database of historical US Post Office locations was used to match locations that could not be found using Census records [5]. Fuzzy matching algorithms were also used to match misspelled place or county names [6].

    The source of geocoding match is described in the “match.source” field, which gives the type of spatial match (census_YEAR = match to NHGIS census place-county-state for given year; census_fuzzy_YEAR = matched to NHGIS place-county-state with fuzzy matching algorithm; dc = matched to centroid for Washington, DC; post_places = place-county-state matched to Blevins & Helbock's post office dataset; post_fuzzy = matched to post office dataset with fuzzy matching algorithm; post_simp = place/state matched to post office dataset; post_confimed_missing = post office dataset confirms place and county, but could not find coordinates; osm = matched using Open Street Map geocoder; hand-match = matched by research assistants reviewing web archival sources; unmatched/hand_match_missing = place coordinates could not be found). For records where place names could not be matched, but county names could, coordinates for county centroids were used. Overall, 40,964 records were matched to places (match.type=place_point) and 931 to county centroids (match.type=county_centroid); 76 records could not be matched (match.type=NA).

    Most records include information about the physician’s medical training, including the year of graduation and a code linking to a school. A key to these codes is given on Directory pages 26-27, and at the beginning of each state’s section [1]. The OSM geocoder was used to assign coordinates to each school by its listed location. Straight-line distances between physicians’ place of training and practice were calculated using the sf package in R [7], and are given in the “school.dist.km” field. Additionally, the Directory identified a handful of schools that were “fraudulent” (school.fraudulent=1), and institutions set up to train black physicians (school.black=1).

    AMA identified black physicians in the directory with the signifier “(col.)” following the physician’s name (race.black=1). Additionally, a number of physicians attended schools identified by AMA as serving black students, but were not otherwise identified as black; thus an expanded racial identifier was generated to identify black physicians (race.black.prob=1), including physicians who attended these schools and those directly identified (race.black=1).

    Approximately 10% of dataset entries were audited by trained research assistants, in addition to 100% of black physician entries. These audits demonstrated a high degree of accuracy between the original Directory and extracted records. Still, given the complexity of matching across multiple archival sources, it is possible that some errors remain; any identified errors will be periodically rectified in the dataset, with a log kept of these updates.

    For further information about this dataset, or to report errors, please contact Dr Ben Chrisinger (Benjamin.Chrisinger@tufts.edu). Future updates to this dataset, including additional states and Directory years, will be posted here: https://dataverse.harvard.edu/dataverse/amd.

    References:

    1. American Medical Association, 1906. American Medical Directory. American Medical Association, Chicago. Retrieved from: https://catalog.hathitrust.org/Record/000543547.

    2. Baker, Robert B., Harriet A. Washington, Ololade Olakanmi, Todd L. Savitt, Elizabeth A. Jacobs, Eddie Hoover, and Matthew K. Wynia. "African American physicians and organized medicine, 1846-1968: origins of a racial divide." JAMA 300, no. 3 (2008): 306-313. doi:10.1001/jama.300.3.306.

    3. GABS Research Consult Limited Company, https://www.gabsrcl.com.

    4. Steven Manson, Jonathan Schroeder, David Van Riper, Tracy Kugler, and Steven Ruggles. IPUMS National Historical Geographic Information System: Version 17.0 [GNIS, TIGER/Line & Census Maps for US Places and Counties: 1900, 1910, 1920, 1930, 1940, 1950; 1910_cPHA: ds37]. Minneapolis, MN: IPUMS. 2022. http://doi.org/10.18128/D050.V17.0

    5. Blevins, Cameron; Helbock, Richard W., 2021, "US Post Offices", https://doi.org/10.7910/DVN/NUKCNA, Harvard Dataverse, V1, UNF:6:8ROmiI5/4qA8jHrt62PpyA== [fileUNF]

    6. fedmatch: Fast, Flexible, and User-Friendly Record Linkage Methods. https://cran.r-project.org/web/packages/fedmatch/index.html

    7. sf: Simple Features for R. https://cran.r-project.org/web/packages/sf/index.html
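To give a sense of what a straight-line distance such as the “school.dist.km” field represents, here is a small Python sketch using the haversine great-circle formula. This is a swapped-in technique for illustration only: the dataset's distances were computed with the sf package in R, whose geodesic calculation may give slightly different values. The function name and the Earth-radius constant are choices made for this example.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius; an assumption of this sketch


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (
        math.sin(dphi / 2) ** 2
        + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    )
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


# Two points on the equator a quarter of the way around the globe.
quarter = haversine_km(0.0, 0.0, 0.0, 90.0)
```

With school and practice coordinates in decimal degrees, `haversine_km(school_lat, school_lon, practice_lat, practice_lon)` would give an approximate distance in kilometres; the argument names here are placeholders, not dataset fields.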

  9. Population of the United States 1500-2100

    • statista.com
    Updated Aug 1, 2025
    Cite
    Statista (2025). Population of the United States 1500-2100 [Dataset]. https://www.statista.com/statistics/1067138/population-united-states-historical/
    Explore at:
    Dataset updated
    Aug 1, 2025
    Dataset authored and provided by
    Statista (http://statista.com/)
    Area covered
    United States
    Description

    In the past four centuries, the population of the Thirteen Colonies and United States of America has grown from a recorded 350 people around the Jamestown colony in Virginia in 1610, to an estimated 346 million in 2025. While the fertility rate has now dropped well below replacement level, and the population is on track to go into a natural decline in the 2040s, projected high net immigration rates mean the population will continue growing well into the next century, crossing the 400 million mark in the 2070s.

    Indigenous population

    Early population figures for the Thirteen Colonies and United States come with certain caveats. Official records excluded the indigenous population, and they generally remained excluded until the late 1800s. In 1500, in the first decade of European colonization of the Americas, the native population living within the modern U.S. borders was believed to be around 1.9 million people. The spread of Old World diseases, such as smallpox, measles, and influenza, to biologically defenseless populations in the New World then wreaked havoc across the continent, often wiping out large portions of the population in areas that had not yet made contact with Europeans. By the time of Jamestown's founding in 1607, it is believed the native population within current U.S. borders had dropped by almost 60 percent. As the U.S. expanded, indigenous populations were largely still excluded from population figures as they were driven westward; however, taxpaying Natives were included in the census from 1870 to 1890, before all were included thereafter. It should be noted that estimates for indigenous populations in the Americas vary significantly by source and time period.

    Migration and expansion fuel population growth

    The arrival of European settlers and African slaves was the key driver of population growth in North America in the 17th century. Settlers from Britain were the dominant group in the Thirteen Colonies, before settlers from elsewhere in Europe, particularly Germany and Ireland, made a large impact in the mid-19th century. By the end of the 19th century, improvements in transport technology and increasing economic opportunities saw migration to the United States increase further, particularly from southern and Eastern Europe, and in the first decade of the 1900s the number of migrants to the U.S. exceeded one million people in some years. It is also estimated that almost 400,000 African slaves were transported directly across the Atlantic to mainland North America between 1500 and 1866 (although the importation of slaves was abolished in 1808). Blacks made up a much larger share of the population before slavery's abolition.

    Twentieth and twenty-first century

    The U.S. population has grown steadily since 1900, reaching one hundred million in the 1910s, two hundred million in the 1960s, and three hundred million in 2007. Since WWII, the U.S. has established itself as the world's foremost superpower, with the world's largest economy and most powerful military. This growth in prosperity has been accompanied by increases in living standards, particularly through medical advances, infrastructure improvements, and clean water accessibility. These have all contributed to higher infant and child survival rates, as well as an increase in life expectancy (doubling from roughly 40 to 80 years in the past 150 years), which have also played a large part in population growth. As fertility rates decline and increases in life expectancy slow, migration remains the largest factor in population growth. Since the 1960s, Latin America has become the most common origin for migrants in the U.S., while immigration rates from Asia have also increased significantly. It remains to be seen how immigration restrictions of the current administration affect long-term population projections for the United States.

  10. Historical Jewish population by region 1170-1995

    • statista.com
    Updated Jan 1, 2001
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Statista (2001). Historical Jewish population by region 1170-1995 [Dataset]. https://www.statista.com/statistics/1357607/historical-jewish-population/
    Explore at:
    Dataset updated
    Jan 1, 2001
    Dataset authored and provided by
    Statista (http://statista.com/)
    Area covered
    Worldwide
    Description

    The world's Jewish population has had a complex and tumultuous history over the past millennia, regularly dealing with persecution, pogroms, and even genocide. The legacy of expulsion and persecution of Jews, including bans on land ownership, meant that Jewish communities disproportionately lived in urban areas, working as artisans or traders, and often lived in their own settlements separate from the rest of the urban population. This separation contributed to the impression that events such as pandemics, famines, or economic shocks did not affect Jews as much as other populations, and such factors came to form the basis of the mistrust and stereotypes of wealth (characterized as greed) that have made up anti-Semitic rhetoric for centuries.

    Development since the Middle Ages

    The concentration of Jewish populations across the world has shifted across different centuries. In the Middle Ages, the largest Jewish populations were found in Palestine and the wider Levant region, with other sizeable populations in present-day France, Italy, and Spain. Later, however, the Jewish diaspora became increasingly concentrated in Eastern Europe after waves of pogroms in the west saw Jewish communities move eastward. Poland in particular was often considered a refuge for Jews from the late Middle Ages until the 18th century, when it was partitioned between Austria, Prussia, and Russia, and persecution increased. Push factors such as major pogroms in the Russian Empire in the 19th century and growing oppression in the west during the interwar period then saw many Jews migrate to the United States in search of opportunity.

