The Integrated Public Use Microdata Series (IPUMS) Complete Count Data include more than 650 million individual-level and 7.5 million household-level records. The microdata are the result of collaboration between IPUMS and the nation’s two largest genealogical organizations (Ancestry.com and FamilySearch) and provide the largest and richest source of individual-level and household-level data.
Historic data are scarce and often exist only in aggregate tables. The key advantage of the IPUMS data is the availability of individual- and household-level characteristics that researchers can tabulate in ways that benefit their specific research questions. The data contain demographic variables, economic variables, migration variables and family variables. Within households, it is possible to create relational data, as all relations between household members are known. For example, having data on the mother and her children in a household enables researchers to calculate the mother’s age at birth. Another advantage of the Complete Count data is the possibility to follow individuals over time using a historical identifier.
In sum: the IPUMS data are a unique source for research on social and economic change and can provide population health researchers with information about social and economic determinants.
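The mother's-age-at-birth example mentioned above can be sketched in a few lines of Python. This is a minimal illustration on toy records, not the dataset's own tooling. The field names (SERIAL for the household, PERNUM for the person within the household, MOMLOC for the mother's person number, AGE) follow common IPUMS conventions, but their presence in any particular extract is an assumption here.

```python
# Toy person records. The field names are assumptions modeled on IPUMS
# conventions; MOMLOC == 0 is taken to mean "no mother in the household".
persons = [
    {"SERIAL": 1, "PERNUM": 1, "AGE": 40, "MOMLOC": 0},
    {"SERIAL": 1, "PERNUM": 2, "AGE": 36, "MOMLOC": 0},
    {"SERIAL": 1, "PERNUM": 3, "AGE": 12, "MOMLOC": 2},
    {"SERIAL": 1, "PERNUM": 4, "AGE": 7, "MOMLOC": 2},
    {"SERIAL": 2, "PERNUM": 1, "AGE": 29, "MOMLOC": 0},
    {"SERIAL": 2, "PERNUM": 2, "AGE": 3, "MOMLOC": 1},
]


def mothers_age_at_birth(records):
    """Return {(SERIAL, PERNUM) of child: mother's approximate age at the birth}."""
    # Index every person by household and position so mothers can be looked up.
    by_key = {(p["SERIAL"], p["PERNUM"]): p for p in records}
    result = {}
    for child in records:
        if child["MOMLOC"] == 0:
            continue  # no co-resident mother recorded
        mother = by_key.get((child["SERIAL"], child["MOMLOC"]))
        if mother is None:
            continue  # pointer refers to a person missing from the extract
        # Ages are in completed years, so the difference is approximate.
        result[(child["SERIAL"], child["PERNUM"])] = mother["AGE"] - child["AGE"]
    return result


ages = mothers_age_at_birth(persons)
for key, value in sorted(ages.items()):
    print(key, value)
```

Because census ages are recorded in completed years, the result can be off by up to a year in either direction; the sketch only covers mothers who live in the same household as the child.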
The IPUMS 1900 census data were collected in June 1900. Enumerators collected data by traveling to households and counting the residents who regularly slept there. Individuals lacking permanent housing were counted as residents of the place where they were when the data were collected. Household members absent on the day of data collection were either listed with the household with the help of other household members or were scheduled for the last census subdivision.
This dataset was created on 2020-01-10 22:51:40.810 by merging multiple datasets. The source datasets for this version were:
IPUMS 1900 households: This dataset includes all households from the 1900 US census.
IPUMS 1900 persons: This dataset includes all individuals from the 1900 US census.
IPUMS 1900 Lookup: This dataset includes variable names, variable labels, variable values, and corresponding variable value labels for the IPUMS 1900 datasets.
This dataset includes variable names, variable labels, variable values, and corresponding variable value labels for the IPUMS 1900 datasets.
This dataset includes all households from the 1900 US census.
https://www.icpsr.umich.edu/web/ICPSR/studies/2877/terms
This data collection, Aging of Veterans of the Union Army: Surgeons' Certificates, United States, 1862-1940, constitutes a portion of the historical data collected by the project "Early Indicators of Later Work Levels, Disease, and Death." With the goal of constructing datasets suitable for longitudinal analyses of factors affecting the aging process, the project collects military, medical, and socioeconomic data on a sample of white males mustered into the Union Army during the Civil War. The surgeons' certificates contain information from examining physicians to determine eligibility for pension benefits. Also included are questions regarding the age, occupation, residence, and military experience of the veterans. These data can be linked to "Aging of Veterans of the Union Army: Military, Pension, and Medical Records, 1820-1940" (ICPSR 6837) and "Aging of Veterans of the Union Army: United States Federal Census Records, 1850, 1860, 1900, 1910" (ICPSR 6836) using the variable "recidnum."
This dataset includes all individuals from the 1860 US census.
All manuscripts (and other items you'd like to publish) must be submitted to phsdatacore@stanford.edu for approval prior to journal submission. We will check your cell sizes and citations.
For more information about how to cite PHS and PHS datasets, please visit: https://phsdocs.developerhub.io/need-help/citing-phs-data-core
This dataset was developed through a collaboration between the Minnesota Population Center and the Church of Jesus Christ of Latter-day Saints. The data contain demographic variables, economic variables, migration variables and race variables. Unlike more recent census datasets, pre-1900 census datasets contain only individual-level characteristics and no household or family characteristics, but household and family identifiers do exist.
The official enumeration day of the 1860 census was 1 June 1860. The main goal of an early census like the 1860 U.S. census was to allow Congress to determine the collection of taxes and the apportionment of seats in the House of Representatives. Each district was assigned a U.S. Marshal who organized other marshals to administer the census. These enumerators visited households and recorded the name of every person, along with their age, sex, color, profession, occupation, value of real estate, place of birth, parental foreign birth, marriage, literacy, and whether deaf, dumb, blind, insane or “idiotic”.
Sources:
Szucs, L.D. and Hargreaves Luebking, S. (1997). Research in Census Records, The Source: A Guidebook of American Genealogy. Ancestry Incorporated, Salt Lake City, UT.
Dollarhide, W. (2000). The Census Book: A Genealogist’s Guide to Federal Census Facts, Schedules and Indexes. Heritage Quest, Bountiful, UT.
This crosswalk consists of individuals matched between the 1900 and 1940 complete-count US Censuses. Within the crosswalk, users have the option to select the linking method with which these matches were created. This version of the crosswalk contains links made by the ABE-exact (conservative and standard) method, the ABE-NYSIIS (conservative and standard) method and the ABE-NYSIIS (conservative and standard) method where race is used as a matching variable. For any chosen method, users can merge into this crosswalk a wide set of individual- and household-level variables provided publicly by IPUMS, thereby creating a historical longitudinal dataset for analysis.
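The merge described above can be sketched as follows: choose one linking method, keep only the pairs that method matched, then attach variables from each census year. All column names here (histid_1900, histid_1940, the per-method 0/1 flags, and the age and occupation fields) are hypothetical placeholders, not the crosswalk's documented schema, so check the actual codebook before adapting this.

```python
# Toy crosswalk: one row per candidate pair, with a 0/1 flag per linking method.
# Every column name below is a hypothetical placeholder.
crosswalk = [
    {"histid_1900": "A1", "histid_1940": "B7", "abe_exact_conservative": 1, "abe_nysiis_standard": 1},
    {"histid_1900": "A2", "histid_1940": "B9", "abe_exact_conservative": 0, "abe_nysiis_standard": 1},
    {"histid_1900": "A3", "histid_1940": "B4", "abe_exact_conservative": 0, "abe_nysiis_standard": 0},
]

# Toy census extracts keyed by the historical identifier.
census_1900 = {"A1": {"age": 10, "occ": "none"}, "A2": {"age": 22, "occ": "farmer"}}
census_1940 = {"B7": {"age": 50, "occ": "clerk"}, "B9": {"age": 62, "occ": "farmer"}}


def build_panel(links, early, late, method):
    """Join both census years onto the pairs matched by the chosen method."""
    panel = []
    for row in links:
        if row[method] != 1:
            continue  # this method did not produce the link
        first = early.get(row["histid_1900"])
        second = late.get(row["histid_1940"])
        if first is None or second is None:
            continue  # one side is missing from the extract
        panel.append({
            "histid_1900": row["histid_1900"],
            "histid_1940": row["histid_1940"],
            "age_1900": first["age"],
            "age_1940": second["age"],
            "occ_1900": first["occ"],
            "occ_1940": second["occ"],
        })
    return panel


conservative = build_panel(crosswalk, census_1900, census_1940, "abe_exact_conservative")
standard = build_panel(crosswalk, census_1900, census_1940, "abe_nysiis_standard")
print(len(conservative), len(standard))
```

Keeping the method as a parameter makes it easy to rerun an analysis under several linking rules and compare how sensitive the results are to the choice.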
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
United States Population: All Ages data was reported at 325,719.000 Person th in 2017. This records an increase from the previous number of 323,406.000 Person th for 2016. United States Population: All Ages data is updated yearly, with a median of 176,356.000 Person th from Jun 1900 to 2017 across 118 observations. The data reached an all-time high of 325,719.000 Person th in 2017 and a record low of 76,094.000 Person th in 1900. United States Population: All Ages data remains in active status in CEIC and is reported by US Census Bureau. The data is categorized under Global Database’s United States – Table US.G002: Population by Age.
Series remarks: Population data for the years 1900 to 1949 exclude the population residing in Alaska and Hawaii. Population data for the years 1940 to 1979 cover the resident population plus Armed Forces overseas. Population data for all other years cover only the resident population.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset offers a comparative snapshot of the social composition in two transatlantic port districts by documenting over 5,000 individuals who lived in the main thoroughfares of Antwerp (Schippersstraat, Vingerlingstraat, and Oudemansstraat, located in the Schipperskwartier) and Boston (North Street, situated in the North End) in 1880 and 1900. Based on Belgian population registers and U.S. census records, it includes data on name, sex, year of birth and age, birthplace, marital status, relationship to the household head, occupation, race, and birthplaces of parents. As the two sources sometimes contain different variables or use differing classifications, the available data may vary between the Belgian and American cases.
The dataset was created in the context of the postdoctoral research project at the University of Antwerp (Belgium), Rethinking “Sailortown:” Comparing the Socioeconomic Dynamics of Harbour Districts in Antwerp and Boston, 1850-1930, funded by the Research Foundation – Flanders, grant number 12X1822N. It was used in the peer-reviewed article 'Unpacking the Town within Sailortown: The Port District Neighborhoods of Antwerp and Boston, c. 1880-1900', Journal of Urban History (forthcoming).
Further details on the dataset's structure, methodology and use can be found in the accompanying data paper: Kristof Loockx, 'Residents of Late Nineteenth-Century Port Districts: A Comparative Dataset from Antwerp and Boston, 1880-1900', Journal of Open Humanities Data 11, no. 39 (2025): 1-9. DOI: https://doi.org/10.5334/johd.335
This crosswalk consists of individuals matched between the 1860 and 1900 complete-count US Censuses. Within the crosswalk, users have the option to select the linking method with which these matches were created. This version of the crosswalk contains links made by the ABE-exact (conservative and standard) method, the ABE-NYSIIS (conservative and standard) method and the ABE-NYSIIS (conservative and standard) method where race is used as a matching variable. For any chosen method, users can merge into this crosswalk a wide set of individual- and household-level variables provided publicly by IPUMS, thereby creating a historical longitudinal dataset for analysis.
A dataset to advance the study of life-cycle interactions of biomedical and socioeconomic factors in the aging process. The EI project has assembled a variety of large datasets covering the life histories of approximately 39,616 white male volunteers (drawn from a random sample of 331 companies) who served in the Union Army (UA), and of about 6,000 African-American veterans from 51 randomly selected United States Colored Troops companies (USCT). Their military records were linked to pension and medical records that detailed the soldiers' health status and socioeconomic and family characteristics. Each soldier was searched for in the US decennial census for the years in which they were most likely to be found alive (1850, 1860, 1880, 1900, 1910). In addition, a sample consisting of 70,000 men examined for service in the Union Army between September 1864 and April 1865 has been assembled and linked only to census records. These records will be useful for life-cycle comparisons of those accepted and rejected for service.
Military Data: The military service and wartime medical histories of the UA and USCT men were collected from the Union Army and United States Colored Troops military service records, carded medical records, and other wartime documents.
Pension Data: Wherever possible, the UA and USCT samples have been linked to pension records, including surgeon's certificates. About 70% of men in the Union Army sample have a pension. These records provide the bulk of the socioeconomic and demographic information on these men from the late 1800s through the early 1900s, including family structure and employment information. In addition, the surgeon's certificates provide rich medical histories, with an average of 5 examinations per linked recruit for the UA, and about 2.5 exams per USCT recruit.
Census Data: Both early and late-age familial and socioeconomic information is collected from the manuscript schedules of the federal censuses of 1850, 1860, 1870 (incomplete), 1880, 1900, and 1910.
Data Availability: All of the datasets (Military Union Army; linked Census; Surgeon's Certificates; Examination Records, and supporting ecological and environmental variables) are publicly available from ICPSR. In addition, copies on CD-ROM may be obtained from the CPE, which also maintains an interactive Internet Data Archive and Documentation Library, which can be accessed on the Project Website.
* Dates of Study: 1850-1910
* Study Features: Longitudinal, Minority Oversamples
* Sample Size:
** Union Army: 35,747
** Colored Troops: 6,187
** Examination Sample: 70,800
ICPSR Link: http://www.icpsr.umich.edu/icpsrweb/ICPSR/studies/06836
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Population Census: Northeast: Paraiba data was reported at 3,974,687.000 Person in 2022. This records an increase from the previous number of 3,766,528.000 Person for 2010. Population Census: Northeast: Paraiba data is updated yearly, with a median of 2,810,032.000 Person from Jul 1900 to 2022 across 13 observations. The data reached an all-time high of 3,974,687.000 Person in 2022 and a record low of 490,784.000 Person in 1900. Population Census: Northeast: Paraiba data remains in active status in CEIC and is reported by Brazilian Institute of Geography and Statistics. The data is categorized under Global Database’s Brazil – Table BR.GAC008: Population Census: by State.
https://www.icpsr.umich.edu/web/ICPSR/studies/35032/terms
This dataset was produced in the 1990s by Myron Gutmann and others at the University of Texas to assess demographic change in European- and Mexican-origin populations in Texas from the mid-nineteenth to early-twentieth centuries. Most of the data come from manuscript records for six rural Texas counties - Angelina, DeWitt, Gillespie, Jack, Red River, and Webb - for the U.S. Censuses of 1850-1880 and 1900-1910, and tax records where available. Together, the populations of these counties reflect the cultural, ethnic, economic, and ecological diversity of rural Texas. Red River and Angelina Counties, in Eastern Texas, had largely native-born white and black populations and cotton economies. DeWitt County in Southeast Texas had the most diverse population, including European and Mexican immigrants as well as native-born white and black Americans, and its economy was divided between cotton and cattle. The population of Webb County, on the Mexican border, was almost entirely of Mexican origin, and economic activities included transportation services as well as cattle ranching. Gillespie County in Central Texas had a mostly European immigrant population and an economy devoted to cropping and livestock. Jack County in North-Central Texas was sparsely populated, mainly by native-born white cattle ranchers. These counties were selected to over-represent the European and Mexican immigrant populations. Slave schedules were not included, so there are no African Americans in the samples for 1850 or 1860.
In some years and counties, the Census records were sub-sampled, using a letter-based sample with the family as the primary sampling unit (families were chosen if the surname of the head began with one of the sample letters for the county). In other counties and years, complete populations were transcribed from the Census microfilms. For details and sample sizes by county, see the County table in the Original P.I. Documentation section of the ICPSR Codebook, or see Gutmann, Myron P. and Kenneth H. Fliess, How to Study Southern Demography in the Nineteenth Century: Early Lessons of the Texas Demography Project (Austin: Texas Population Research Center Papers, no. 11.11, 1989).
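The letter-based sub-sampling rule described above (keep a family when the surname of its head starts with one of the county's sample letters) can be sketched as below. The record layout, the surnames, and the sample letters are invented for illustration; the actual letters used for each county are given in the project documentation, not here.

```python
# Toy family records; "head_surname" is a hypothetical field name.
families = [
    {"family_id": 1, "head_surname": "Garcia"},
    {"family_id": 2, "head_surname": "Mueller"},
    {"family_id": 3, "head_surname": "Smith"},
    {"family_id": 4, "head_surname": "gonzales"},
    {"family_id": 5, "head_surname": ""},
]


def letter_sample(records, sample_letters):
    """Keep families whose head's surname begins with one of the sample letters."""
    letters = {letter.upper() for letter in sample_letters}
    # Slicing with [:1] yields "" for an empty surname, which never matches.
    return [f for f in records if f["head_surname"][:1].upper() in letters]


# "G" and "M" are made-up sample letters for a hypothetical county.
sample = letter_sample(families, "GM")
print([f["head_surname"] for f in sample])
```

Because the family is the sampling unit, every member of a selected family enters the sample together, which preserves within-family relationships at the cost of clustering by surname.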
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Population Census: Southeast: Espirito Santo data was reported at 3,833,712.000 Person in 2022. This records an increase from the previous number of 3,514,952.000 Person for 2010. Population Census: Southeast: Espirito Santo data is updated yearly, with a median of 2,063,679.000 Person from Jul 1900 to 2022 across 13 observations. The data reached an all-time high of 3,833,712.000 Person in 2022 and a record low of 209,783.000 Person in 1900. Population Census: Southeast: Espirito Santo data remains in active status in CEIC and is reported by Brazilian Institute of Geography and Statistics. The data is categorized under Global Database’s Brazil – Table BR.GAC008: Population Census: by State.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Abstract: Large and persistent racial disparities in land-based wealth were an important legacy of the Reconstruction era. To assess how these disparities were transmitted intergenerationally, we build a dataset to observe Black households’ landholdings in 1880 alongside a sample of White households. We then link sons from all households to the 1900 census records to observe their economic and human capital outcomes. We show that Black landowners (relative to laborers) transmitted substantial intergenerational advantages to their sons, including an 11 pp advantage in literacy. But such advantages were small relative to the racial gaps in metrics of economic status.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Population Census: Northeast: Rio Grande do Norte data was reported at 3,302,729.000 Person in 2022. This records an increase from the previous number of 3,168,027.000 Person for 2010. Population Census: Northeast: Rio Grande do Norte data is updated yearly, with a median of 1,933,126.000 Person from Jul 1900 to 2022 across 13 observations. The data reached an all-time high of 3,302,729.000 Person in 2022 and a record low of 274,317.000 Person in 1900. Population Census: Northeast: Rio Grande do Norte data remains in active status in CEIC and is reported by Brazilian Institute of Geography and Statistics. The data is categorized under Global Database’s Brazil – Table BR.GAC008: Population Census: by State.
In the past four centuries, the population of the United States has grown from a recorded 350 people around the Jamestown colony of Virginia in 1610, to an estimated 331 million people in 2020. The pre-colonization populations of the indigenous peoples of the Americas have proven difficult for historians to estimate, as their numbers decreased rapidly following the introduction of European diseases (namely smallpox, plague and influenza). Native Americans were also omitted from most censuses conducted before the twentieth century; the actual population of what we now know as the United States would therefore have been much higher than the official census data from before 1800, but it is unclear by how much. Population growth in the colonies throughout the eighteenth century has primarily been attributed to migration from the British Isles and the Transatlantic slave trade; however, it is also difficult to ascertain the ethnic makeup of the population in these years, as accurate migration records were not kept until after the 1820s, at which point the importation of slaves had also been outlawed.
Nineteenth century
In the year 1800, it is estimated that the population across the present-day United States was around six million people, with the population in the 16 admitted states numbering 5.3 million. Migration to the United States began to happen on a large scale in the mid-nineteenth century, with the first major waves coming from Ireland, Britain and Germany. In some respects, this wave of mass migration balanced out the demographic impacts of the American Civil War, which was the deadliest war in U.S. history with approximately 620 thousand fatalities between 1861 and 1865. The Civil War also resulted in the emancipation of around four million slaves across the South, many of whose descendants would take part in the Great Northern Migration in the early 1900s, which saw around six million black Americans migrate away from the South in one of the largest demographic shifts in U.S. history. By the end of the nineteenth century, improvements in transport technology and increasing economic opportunities saw migration to the United States increase further, particularly from Southern and Eastern Europe, and in the first decade of the 1900s the number of migrants to the U.S. exceeded one million people in some years.
Twentieth and twenty-first century
The U.S. population has grown steadily throughout the past 120 years, reaching one hundred million in the 1910s, two hundred million in the 1960s, and three hundred million in 2007. In the past century, the U.S. established itself as a global superpower, with the world's largest economy (by nominal GDP) and most powerful military. Involvement in foreign wars has resulted in over 620,000 further U.S. fatalities since the Civil War, and migration fell drastically during the World Wars and Great Depression; however, the population grew continuously in these years as the total fertility rate remained above two births per woman, and life expectancy increased (except during the Spanish Flu pandemic of 1918).
Since the Second World War, Latin America has replaced Europe as the most common point of origin for migrants, with Hispanic populations growing rapidly across the South and border states. Because of this, the proportion of non-Hispanic whites, which has been the most dominant ethnicity in the U.S. since records began, has dropped more rapidly in recent decades. Ethnic minorities also have a much higher birth rate than non-Hispanic whites, further contributing to this decline, and the share of non-Hispanic whites is expected to fall below fifty percent of the U.S. population around the middle of the twenty-first century. In 2020, the United States had the third-largest population in the world (after China and India), and the population is expected to reach four hundred million in the 2050s.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset is a census of penguin colony counts from the year 1900 in the Antarctic region. It forms part of the Inventory of Antarctic seabird breeding sites within the Antarctic and subantarctic islands.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Population Census: North: Para data was reported at 8,121,025.000 Person in 2022. This records an increase from the previous number of 7,581,051.000 Person for 2010. Population Census: North: Para data is updated yearly, with a median of 3,507,312.000 Person from Jul 1900 to 2022 across 13 observations. The data reached an all-time high of 8,121,025.000 Person in 2022 and a record low of 445,356.000 Person in 1900. Population Census: North: Para data remains in active status in CEIC and is reported by Brazilian Institute of Geography and Statistics. The data is categorized under Global Database’s Brazil – Table BR.GAC008: Population Census: by State.
This study matches Canadian and US manufacturing industries at the 2-digit SIC code level for census years 1900 to 1940. Canadian figures start at 1870. Only general figures were recorded, such as number of employees, number of establishments, salary and wages, gross production, cost of input materials, and gross value added. The project does have some drawbacks, such as the lack of US figures for gross production and cost of materials, and the lack of figures for the iron and steel industry. But for an aggregate comparison of the two countries, the numbers can be considered reliable.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Population Census: Southeast: Sao Paulo data was reported at 44,411,238.000 Person in 2022. This records an increase from the previous number of 41,262,199.000 Person for 2010. Population Census: Southeast: Sao Paulo data is updated yearly, with a median of 25,375,199.000 Person from Jul 1900 to 2022 across 13 observations. The data reached an all-time high of 44,411,238.000 Person in 2022 and a record low of 2,282,279.000 Person in 1900. Population Census: Southeast: Sao Paulo data remains in active status in CEIC and is reported by Brazilian Institute of Geography and Statistics. The data is categorized under Global Database’s Brazil – Table BR.GAC008: Population Census: by State.