The website allows the public full access to the 1940 Census images, census maps, and descriptions.
The 1940 Census population schedules were created by the Bureau of the Census in an attempt to enumerate every person living in the United States on April 1, 1940, although some persons were missed. The 1940 Census population schedules were digitized by the National Archives and Records Administration (NARA) and released publicly on April 2, 2012. The 1940 Census enumeration district maps contain maps of counties, cities, and other minor civil divisions that show enumeration districts, census tracts, and related boundaries and numbers used for each census. The coverage is nationwide and includes territorial areas. The 1940 Census enumeration district descriptions contain written descriptions of census districts, subdivisions, and enumeration districts.
https://www.icpsr.umich.edu/web/ICPSR/studies/8236/terms
The 1940 Census Public Use Microdata Sample Project was assembled through a collaborative effort between the United States Bureau of the Census and the Center for Demography and Ecology at the University of Wisconsin. The collection contains a stratified 1-percent sample of households, with separate records for each household, for each "sample line" respondent, and for each person in the household. These records were encoded from microfilm copies of original handwritten enumeration schedules from the 1940 Census of Population. Geographic identification of the location of the sampled households includes Census regions and divisions, states (except Alaska and Hawaii), standard metropolitan areas (SMAs), and state economic areas (SEAs). Accompanying the data collection is a codebook that includes an abstract, descriptions of sample design, processing procedures and file structure, a data dictionary (record layout), category code lists, and a glossary. Also included is a procedural history of the 1940 Census. Each of the 20 subsamples contains three record types: household, sample line, and person. Household variables describe the location and condition of the household. The sample line records contain variables describing demographic characteristics such as nativity, marital status, number of children, veteran status, wage deductions for Social Security, and occupation. Person records also contain variables describing demographic characteristics including nativity, marital status, family membership, education, employment status, income, and occupation.
This dataset includes variable names, variable labels, variable values, and corresponding variable value labels for the IPUMS 1940 datasets.
The Integrated Public Use Microdata Series (IPUMS) Complete Count Data include more than 650 million individual-level and 7.5 million household-level records. The IPUMS microdata are the result of collaboration between IPUMS and the nation’s two largest genealogical organizations (Ancestry.com and FamilySearch) and provide the largest and richest source of individual-level and household-level data.
All manuscripts (and other items you'd like to publish) must be submitted to
phsdatacore@stanford.edu for approval prior to journal submission.
We will check your cell sizes and citations.
For more information about how to cite PHS and PHS datasets, please visit:
https://phsdocs.developerhub.io/need-help/citing-phs-data-core
Historic data are scarce and often exist only in aggregate tables. The key advantage of historic US census data is the availability of individual- and household-level characteristics that researchers can tabulate in ways that benefit their specific research questions. The data contain demographic variables, economic variables, migration variables, and family variables. Within households, it is possible to create relational data, as all relations between household members are known. For example, having data on the mother and her children in a household enables researchers to calculate the mother’s age at birth. Another advantage of the Complete Count data is the possibility of following individuals over time using a historical identifier.
In sum: the historic US census data are a unique source for research on social and economic change and can provide population health researchers with information about social and economic determinants.
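The mother's-age-at-birth example mentioned above can be sketched in a few lines. This is only an illustration under stated assumptions: the field names (`household`, `person`, `age`, `mother`) and the records are hypothetical and are not the actual IPUMS variable names or data. It assumes each child record carries a pointer to the mother's person number within the same household. Because census ages are recorded in whole years, the result is approximate.

```python
# Hypothetical person records: "mother" holds the person number of the
# mother within the same household, or None if no mother is present.
persons = [
    {"household": 1, "person": 1, "age": 34, "mother": None},
    {"household": 1, "person": 2, "age": 10, "mother": 1},
    {"household": 1, "person": 3, "age": 4, "mother": 1},
    {"household": 2, "person": 1, "age": 52, "mother": None},
]


def mothers_age_at_birth(records):
    """Return {(household, person): mother's approximate age at that child's birth}."""
    # Index every person by (household, person number) for fast lookup.
    by_key = {(r["household"], r["person"]): r for r in records}
    result = {}
    for r in records:
        if r["mother"] is None:
            continue  # no co-resident mother recorded
        mother = by_key.get((r["household"], r["mother"]))
        if mother is None:
            continue  # pointer does not resolve to a record
        # Both ages are measured at the census date, so the difference
        # approximates the mother's age when the child was born.
        result[(r["household"], r["person"])] = mother["age"] - r["age"]
    return result


ages = mothers_age_at_birth(persons)
```

Children whose mother does not live in the household are simply skipped, which is one reason such tabulations cover only co-resident families.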
The historic US 1940 census data were collected in April 1940. Enumerators collected data by traveling to households and counting the residents who regularly slept at the household. Individuals lacking permanent housing were counted as residents of the place where they were when the data were collected. Household members absent on the day of data collection were either listed with the household with the help of other household members or were scheduled for the last census subdivision.
Notes
This dataset includes all individuals from the 1940 US census.
This dataset includes all households from the 1940 US census.
https://dataverse.harvard.edu/api/datasets/:persistentId/versions/7.0/customlicense?persistentId=doi:10.7910/DVN/I0TLPI
The CenSoc-Numident dataset links the 1940 census to the National Archives’ public release of the Social Security Numident file (“NARA Numident”). Our linking strategy relies on first name, last name, year of birth, and place of birth. To link unmarried women, we use father’s last name as a proxy for women’s maiden name. We use the ABE fully automated linking approach developed by Abramitzky, Boustan, and Eriksson (2012, 2014, 2017). To work with this dataset, researchers must download and link the 1940 full-count Census sample from IPUMS-USA on the HISTID variable. Please adhere to the citation and usage guidelines of both CenSoc and IPUMS-USA when using this dataset. The CenSoc-Numident supplemental geography file contains additional variables with place of birth and/or place of death information, such as county of birth and death, for a subset of the CenSoc-Numident dataset. The CenSoc-Numident sibling files identify sibling groups in the CenSoc-Numident dataset.
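As a rough sketch of the merge step described above, the following joins two small tables on HISTID using only the Python standard library. Apart from HISTID, the column names (`death_age`, `EDUC`, `INCWAGE`) and all values are made up for illustration; real HISTID values and file layouts will differ, and this is not the CenSoc or IPUMS tooling.

```python
import csv
import io

# Tiny stand-ins for the two files; real extracts would be read from disk.
censoc_csv = "HISTID,death_age\nA1,78\nB2,83\n"
census_csv = "HISTID,EDUC,INCWAGE\nA1,6,1200\nB2,2,800\nC3,4,950\n"


def read_by_histid(text):
    """Parse CSV text into a dict keyed by HISTID."""
    return {row["HISTID"]: row for row in csv.DictReader(io.StringIO(text))}


def merge_on_histid(left, right):
    """Inner join: keep only records whose HISTID appears in both tables."""
    merged = []
    for histid, left_row in left.items():
        right_row = right.get(histid)
        if right_row is not None:
            # Combine the census variables with the mortality variables.
            merged.append({**right_row, **left_row})
    return merged


linked = merge_on_histid(read_by_histid(censoc_csv), read_by_histid(census_csv))
```

An inner join is used here because only census records that were successfully linked carry mortality information; census record `C3` has no match and is dropped.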
These data comprise Census records relating to the Alaskan people's population demographics for the State of Alaskan Salmon and People (SASAP) Project. Decennial census data were originally extracted from the IPUMS National Historical Geographic Information System website: https://data2.nhgis.org/main (Citation: Steven Manson, Jonathan Schroeder, David Van Riper, and Steven Ruggles. IPUMS National Historical Geographic Information System: Version 12.0 [Database]. Minneapolis: University of Minnesota. 2017. http://doi.org/10.18128/D050.V12.0). A number of relevant tables of basic demographics on age and race, household income and poverty levels, and labor force participation were extracted.
These particular variables were selected as part of an effort to understand and potentially quantify various dimensions of well-being in Alaskan communities.
The file "censusdata_master.csv" is a consolidation of all 21 other data files in the package. For detailed information on how the datasets vary over different years, view the file "readme.docx" available in this data package.
The included .Rmd file is a script which combines the 21 files by year into a single file (censusdata_master.csv). It also cleans up place names (including typographical errors) and uses the
USGS place names dataset and the SASAP regions dataset to assign latitude and longitude values and region values to each place in the dataset. Note that some places were not assigned a region or
location because they do not fit well into the regional framework.
Considerable heterogeneity exists between census surveys each year. While we have attempted to combine these datasets in a way that makes sense, there may be some discrepancies or unexpected values.
Please send a description of any unusual values to the dataset contact.
This crosswalk consists of individuals matched between the 1900 and 1940 complete-count US Censuses. Within the crosswalk, users have the option to select the linking method with which these matches were created. This version of the crosswalk contains links made by the ABE-exact (conservative and standard) method, the ABE-NYSIIS (conservative and standard) method and the ABE-NYSIIS (conservative and standard) method where race is used as a matching variable. For any chosen method, users can merge into this crosswalk a wide set of individual- and household-level variables provided publicly by IPUMS, thereby creating a historical longitudinal dataset for analysis.
The CenSoc WWII Army Enlistment Dataset is a cleaned and harmonized version of the National Archives and Records Administration’s Electronic Army Serial Number Merged File, ca. 1938 - 1946 (2002). It contains enlistment records for over 9 million men and women who served in the United States Army, including the Army Air Corps, Women's Army Auxiliary Corps, and Enlisted Reserve Corps. We publish links between men in the CenSoc WWII Army Enlistment Dataset, Social Security Administration mortality data, and the 1940 Census. The CenSoc Enlistment-Census-1940 file links these enlistment records to the complete 1940 Census, and may be merged with IPUMS-USA census data using the HISTID identifier variable. The CenSoc Enlistment-Numident file links enlistment records to the Berkeley Unified Numident Mortality Database (BUNMD), and the CenSoc Enlistment-DMF file links enlistment records to the Social Security Death Master File. For enlistment records in the Enlistment-Numident and Enlistment-DMF datasets that have been independently and additionally linked to the 1940 Census, we include the HISTID identifier variable that can be used to merge the data with IPUMS census data.
This study matches Canadian and US manufacturing industries at the 2-digit SIC code level for census years 1900 to 1940. Canadian figures start at 1870. Only general figures were recorded, such as number of employees, number of establishments, salary and wages, gross production, cost of input materials, and gross value added. The project does have some drawbacks, such as the lack of US figures for gross production and cost of materials, and the lack of figures for the iron and steel industry. But for an aggregate comparison of the two countries, the numbers can be considered reliable.
This dataset was created primarily to map and track socioeconomic and demographic variables from the US Census Bureau from year 1940 to year 2010, by decade, within the City of Baltimore's Mayor's Office of Information Technology (MOIT) year 2010 neighborhood boundaries. The socioeconomic and demographic variables include the percent White, percent African American, percent owner-occupied homes, percent vacant homes, the percentage of people age 25 and older with a high school education or greater, and the percentage of people age 25 and older with a college education or greater. Percent White and percent African American are also provided for year 1930. Each of the year 2010 neighborhood boundaries was also attributed with the 1937 Home Owners' Loan Corporation (HOLC) definition of neighborhoods via spatial overlay. HOLC rated neighborhoods as A, B, C, D or Undefined. HOLC categorized the perceived safety and risk of mortgage refinance lending in metropolitan areas using a hierarchical grading scale of A, B, C, and D. A and B areas were considered the safest areas for federal investment due to their newer housing as well as higher-earning and racially homogeneous households. In contrast, C and D graded areas were viewed to be in a state of inevitable decline, depreciation, and decay, and thus risky for federal investment, due to their older housing stock and racial and ethnic composition. This policy was inherently a racist practice. Places were graded based on who lived there; poor areas with people of color were labeled as lower and less-than. HOLC's 1937 neighborhoods do not cover the entire extent of the year 2010 neighborhood boundaries. The neighborhood boundaries were also augmented to include which of the year 2017 Housing Market Typology (HMT) categories the 2010 neighborhoods fall within. Finally, the neighborhood boundaries were also augmented to include tree canopy and tree canopy change from year 2007 to year 2015.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset comprises physician-level entries from the 1906 American Medical Directory, the first in a series of semi-annual directories of all practicing physicians published by the American Medical Association [1]. Physicians are consistently listed by city, county, and state. Most records also include details about the place and date of medical training. From 1906-1940, Directories also identified the race of black physicians [2].
This dataset comprises physician entries for a subset of US states and the District of Columbia, including all of the South and several adjacent states (Alabama, Arkansas, Delaware, Florida, Georgia, Kansas, Kentucky, Louisiana, Maryland, Mississippi, Missouri, North Carolina, Oklahoma, South Carolina, Tennessee, Texas, Virginia, West Virginia). Records were extracted via manual double-entry by a professional data management company [3], and place names were matched to latitude/longitude coordinates. The main source for geolocating physician entries was the US Census. Historical Census records were sourced from IPUMS National Historical Geographic Information System [4]. Additionally, a public database of historical US Post Office locations was used to match locations that could not be found using Census records [5]. Fuzzy matching algorithms were also used to match misspelled place or county names [6].
The source of the geocoding match is described in the “match.source” field, which gives the type of spatial match (census_YEAR = match to NHGIS census place-county-state for given year; census_fuzzy_YEAR = matched to NHGIS place-county-state with fuzzy matching algorithm; dc = matched to centroid for Washington, DC; post_places = place-county-state matched to Blevins & Helbock's post office dataset; post_fuzzy = matched to post office dataset with fuzzy matching algorithm; post_simp = place/state matched to post office dataset; post_confimed_missing = post office dataset confirms place and county, but could not find coordinates; osm = matched using Open Street Map geocoder; hand-match = matched by research assistants reviewing web archival sources; unmatched/hand_match_missing = place coordinates could not be found). For records where place names could not be matched, but county names could, coordinates for county centroids were used. Overall, 40,964 records were matched to places (match.type=place_point) and 931 to county centroids (match.type=county_centroid); 76 records could not be matched (match.type=NA).
Most records include information about the physician’s medical training, including the year of graduation and a code linking to a school. A key to these codes is given on Directory pages 26-27, and at the beginning of each state’s section [1]. The OSM geocoder was used to assign coordinates to each school by its listed location. Straight-line distances between physicians’ place of training and practice were calculated using the sf package in R [7], and are given in the “school.dist.km” field. Additionally, the Directory identified a handful of schools that were “fraudulent” (school.fraudulent=1), and institutions set up to train black physicians (school.black=1).
AMA identified black physicians in the directory with the signifier “(col.)” following the physician’s name (race.black=1). Additionally, a number of physicians attended schools identified by AMA as serving black students, but were not otherwise identified as black; thus an expanded racial identifier was generated to identify black physicians (race.black.prob=1), including physicians who attended these schools and those directly identified (race.black=1).
Approximately 10% of dataset entries were audited by trained research assistants, in addition to 100% of black physician entries. These audits demonstrated a high degree of accuracy between the original Directory and extracted records. Still, given the complexity of matching across multiple archival sources, it is possible that some errors remain; any identified errors will be periodically rectified in the dataset, with a log kept of these updates.
For further information about this dataset, or to report errors, please contact Dr Ben Chrisinger (Benjamin.Chrisinger@tufts.edu). Future updates to this dataset, including additional states and Directory years, will be posted here: https://dataverse.harvard.edu/dataverse/amd.
References:
1. American Medical Association, 1906. American Medical Directory. American Medical Association, Chicago. Retrieved from: https://catalog.hathitrust.org/Record/000543547.
2. Baker, Robert B., Harriet A. Washington, Ololade Olakanmi, Todd L. Savitt, Elizabeth A. Jacobs, Eddie Hoover, and Matthew K. Wynia. "African American physicians and organized medicine, 1846-1968: origins of a racial divide." JAMA 300, no. 3 (2008): 306-313. doi:10.1001/jama.300.3.306.
3. GABS Research Consult Limited Company, https://www.gabsrcl.com.
4. Steven Manson, Jonathan Schroeder, David Van Riper, Tracy Kugler, and Steven Ruggles. IPUMS National Historical Geographic Information System: Version 17.0 [GNIS, TIGER/Line & Census Maps for US Places and Counties: 1900, 1910, 1920, 1930, 1940, 1950; 1910_cPHA: ds37]. Minneapolis, MN: IPUMS. 2022. http://doi.org/10.18128/D050.V17.0
5. Blevins, Cameron; Helbock, Richard W., 2021, "US Post Offices", https://doi.org/10.7910/DVN/NUKCNA, Harvard Dataverse, V1, UNF:6:8ROmiI5/4qA8jHrt62PpyA== [fileUNF]
6. fedmatch: Fast, Flexible, and User-Friendly Record Linkage Methods. https://cran.r-project.org/web/packages/fedmatch/index.html
7. sf: Simple Features for R. https://cran.r-project.org/web/packages/sf/index.html
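To give a sense of what a straight-line distance such as the “school.dist.km” field measures, here is a minimal sketch of a great-circle distance between two latitude/longitude points. It is not the authors' code: the dataset was built with the sf package in R, while this sketch uses the haversine formula on a sphere with an assumed mean Earth radius, so values may differ slightly from the published field. The function name and the example coordinates are illustrative only.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # assumed mean Earth radius for a spherical model


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    # Haversine formula: a is the squared half-chord length between the points.
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


# Illustrative coordinates only (roughly Nashville, TN and Atlanta, GA).
school = (36.16, -86.78)
practice = (33.75, -84.39)
distance_km = haversine_km(*school, *practice)
```

A spherical model is usually adequate for summarizing how far physicians practiced from where they trained; an ellipsoidal geodesic would be the choice if sub-kilometer agreement with other software mattered.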
These data comprise Census records relating to the Alaskan people's population demographics for the State of Alaskan Salmon and People (SASAP) Project. Decennial census data were originally extracted from the IPUMS National Historical Geographic Information System website: https://data2.nhgis.org/main (Citation: Steven Manson, Jonathan Schroeder, David Van Riper, and Steven Ruggles. IPUMS National Historical Geographic Information System: Version 12.0 [Database]. Minneapolis: University of Minnesota. 2017. http://doi.org/10.18128/D050.V12.0). A number of relevant tables of basic demographics on age and race, household income and poverty levels, and labor force participation were extracted. These particular variables were selected as part of an effort to understand and potentially quantify various dimensions of well-being in Alaskan communities. The file "censusdata_master.csv" is a consolidation of all 21 other data files in the package. For detailed information on how the datasets vary over different years, view the file "readme.docx" available in this data package. The included .Rmd file is a script which combines the 21 files by year into a single file (censusdata_master.csv). It also cleans up place names (including typographical errors) and uses the USGS place names dataset and the SASAP regions dataset to assign latitude and longitude values and region values to each place in the dataset. Note that some places were not assigned a region or location because they do not fit well into the regional framework. Considerable heterogeneity exists between census surveys each year. While we have attempted to combine these datasets in a way that makes sense, there may be some discrepancies or unexpected values. The RMarkdown document SASAPWebsiteGraphicsCensus.Rmd is used to generate a variety of figures using these data, including the additional file Chignik_population.png.
A prelinked “demo” version of the CenSoc-DMF and CenSoc-Numident datasets with approximately 20 mortality covariates from the 1940 census and ~1% of records in the complete CenSoc datasets.
https://search.gesis.org/research_data/datasearch-httpwww-da-ra-deoaip--oaioai-da-ra-de451385
Abstract (en): This collection includes county-level data from the United States Censuses of Agriculture for the years 1840 to 2012. The files provide data about the number, types, output, and prices of various agricultural products, as well as information on the amount, expenses, sales, values, and production of machinery. Most of the basic crop output data apply to the previous harvest year. Data collected also included the population and value of livestock, the number of animals slaughtered, and the size, type, and value of farms. Part 46 of this collection contains data from 1980 through 2010. Variables in part 46 include information such as the average value of farmland, number and value of buildings per acre, food services, resident population, composition of households, and unemployment rates. ICPSR data undergo a confidentiality review and are altered when necessary to limit the risk of disclosure. ICPSR also routinely creates ready-to-go data files along with setups in the major statistical software formats as well as standard codebooks to accompany the data. In addition to these procedures, ICPSR performed the following processing steps for this data collection: Checked for undocumented or out-of-range codes. Response Rates: Not applicable.
Datasets:
DS0: Study-Level Files
DS1: Farm Land Value Data Set (County and State) 1850-1959
DS2: 1840 County and State
DS3: 1850 County and State
DS4: 1860 County and State
DS5: 1870 County and State
DS6: 1880 County and State
DS7: 1890 County and State
DS8: 1900 County and State
DS9: 1910 County and State
DS10: 1920 County and State, Dataset 1
DS11: 1920 County and State, Dataset 2
DS12: 1925 County and State
DS13: 1930 County and State, Dataset 1
DS14: 1930 County and State, Dataset 2
DS15: 1935 County and State
DS16: 1940 County and State, Dataset 1
DS17: 1940 County and State, Dataset 2
DS18: 1940 County and State, Dataset 3
DS19: 1940 County and State, Dataset 4 (Water)
DS20: 1945 County and State
DS21: 1950 County and State, Dataset 1
DS22: 1950 Crops, County and State, Dataset 2
DS23: 1950 County, Dataset 3
DS24: 1950 County and State, Dataset 4
DS25: 1954 County and State, Dataset 1
DS26: 1954 Crops, County and State, Dataset 2
DS27: 1959 County and State, Dataset 1
DS28: 1959 Crops, County and State, Dataset 2
DS29: 1959 County, Dataset 3
DS30: 1964 Dataset 1
DS31: 1964 Crops, County and State, Dataset 2
DS32: 1964 County, Dataset 3
DS33: 1969 All Farms, County and State, Dataset 1
DS34: 1969 Farms 2500, County and State, Dataset 2
DS35: 1969 Crops, County and State, Dataset 3
DS36: 1974 All Farms, County and State, Dataset 1
DS37: 1974 Farms 2500, County and State, Dataset 2
DS38: 1974 Crops, County and State, Dataset 3
DS39: 1978 County and State
DS40: 1982 County and State
DS41: 1987 County and State
DS42: 1992 County and State
DS43: 1997 County and State
DS44: 2002 County and State
DS45: 2007 County and State
DS46: State and County Data, United States, 1980-2010
DS47: 2012 County and State
Farms within United States counties and states. Smallest Geographic Unit: FIPS code. The sample was the universe of agricultural operating units. For 1969-2007, data were taken from computer files from the Census Bureau and the United States Department of Agriculture.
2018-08-20 The P.I. resupplied data and documentation for 1935 County and State (dataset 15) and 1997 County and State (dataset 43). Additionally, documentation updates and variable label revisions have been incorporated in datasets 22, 26, 28, 31, 35, and 38 at the request of the P.I.
2016-06-29 The data and documentation for 2012 County and State (data set 47) have been added to this collection. The collection and documentation titles have been updated to reflect the new year.
2015-08-05 The data, setup files, and documentation for 1964 Dataset 1 have been updated to reflect changes from the producer.
Funding institution(s): National Science Foundation (NSF-SES-0921732; 0648045). United States Department of Health and Human Services. National Institutes of Health (R01 HD057929).
In 1800, the population of Luxembourg was estimated to be 127,000, a figure which would rise steadily through the early 19th century as Luxembourg became increasingly prominent in the region. Luxembourg’s population would see its first major period of growth following the defeat of Napoleon in 1815, which resulted in the previously French-occupied Luxembourg being granted formal autonomy at the subsequent Congress of Vienna. As a largely agrarian state at this time, Luxembourg would see several periods of population growth and decline throughout the remainder of the 19th century, as many residents emigrated to countries such as the United States in search of work. Nevertheless, the population of Luxembourg would rise to over 235,000 by the turn of the century, as Dutch modernization and the removal of the city’s fortifications under the 1867 Treaty of London allowed for a greater expansion of the city proper.
The first half of the 20th century would largely be a period of stagnation for the country, particularly in the 1910s and throughout the 1930s and 1940s, as occupation in both World Wars and the 1918 Spanish Flu epidemic caused massive damage in both human and economic terms. Luxembourg’s population would then see significant growth, particularly following the creation of the European Economic Community in 1958 (Luxembourg was one of its six founding members). Growth would accelerate even further after the 1980s, as industrialization and accompanying economic growth drew a growing immigrant population from other EU nations, which by 2015 accounted for nearly half of Luxembourg’s population. As a result of this growth, Luxembourg was estimated to have a population of 626,000 in 2020.
In 1800, the region of present-day Switzerland had a population of approximately 1.8 million people. This figure would grow steadily throughout the 19th century, as political and religious grievances gave way to a united federation, whose economic policies saw Switzerland emerge as one of Europe's most prosperous and stable countries. Growth boomed between 1890 and 1910, as industrialization would see significant economic growth and migration to the country. While Switzerland’s neutrality in both World Wars would prevent the mass fatalities experienced across the rest of Europe during the early 20th century, Switzerland’s population would nevertheless stagnate in both the First and Second World War and in the Great Depression in the 1930s, as the economic turmoil and conflict abroad would halt the migration that had previously driven population growth.
Following the end of the Second World War, growth would resume and the population would rise steadily until the late 1970s, before an economic recession saw the population fall again as workers migrated in search of employment elsewhere. However, the population has risen steadily since the 1980s, reaching seven million in the mid-1990s and eight million in 2012. Today, with a population of 8.7 million, Switzerland is ranked among the wealthiest and most developed nations in the world, with very high standards of living.
The world's population first reached one billion people in 1803, reached eight billion in 2023, and is projected to peak at almost 11 billion by the end of the century. Although it took thousands of years to reach one billion people, it did so at the beginning of a phenomenon known as the demographic transition; from this point onwards, population growth has skyrocketed, and since the 1960s the population has increased by one billion people every 12 to 15 years. The demographic transition sees a sharp drop in mortality due to factors such as vaccination, sanitation, and improved food supply; the population boom that follows is due to increased survival rates among children and higher life expectancy among the general population; and fertility then drops in response to this population growth.
Regional differences
The demographic transition is a global phenomenon, but it has taken place at different times across the world. The industrialized countries of Europe and North America were the first to go through this process, followed by some states in the Western Pacific. Latin America's population then began growing at the turn of the 20th century, but the most significant period of global population growth occurred as Asia progressed in the late 1900s. As of the early 21st century, almost two thirds of the world's population live in Asia, although this is set to change significantly in the coming decades.
Future growth
The growth of Africa's population, particularly in Sub-Saharan Africa, will have the largest impact on global demographics in this century. From 2000 to 2100, it is expected that Africa's population will have increased by a factor of almost five. It overtook Europe in size in the late 1990s, and overtook the Americas a decade later. In contrast to Africa, Europe's population is now in decline, as birth rates are consistently below death rates in many countries, especially in the south and east, resulting in natural population decline. Similarly, the populations of the Americas and Asia are expected to go into decline in the second half of this century, and only Oceania's population will still be growing alongside Africa's. By 2100, the world's population is expected to be over three billion larger than today, with the vast majority of this growth concentrated in Africa. Demographers predict that climate change will exacerbate many of the challenges that currently hinder progress in Africa, such as political and food instability; if Africa's transition is prolonged, then it may result in further population growth that would place a strain on the region's resources. However, curbing this growth earlier would alleviate some of the pressure created by climate change.