This dataset includes all individuals from the 1870 US census.
All manuscripts (and other items you'd like to publish) must be submitted to
phsdatacore@stanford.edu for approval prior to journal submission.
We will check your cell sizes and citations.
For more information about how to cite PHS and PHS datasets, please visit:
https://phsdocs.developerhub.io/need-help/citing-phs-data-core
This dataset was developed through a collaboration between the Minnesota Population Center and the Church of Jesus Christ of Latter-Day Saints. The data contain demographic variables, economic variables, migration variables and race variables. Unlike more recent census datasets, pre-1900 census datasets only contain individual level characteristics and no household or family characteristics, but household and family identifiers do exist.
The official enumeration day of the 1870 census was 1 June 1870. The main goal of an early census like the 1870 U.S. census was to allow Congress to determine the collection of taxes and the apportionment of seats in the House of Representatives. Each district was assigned a U.S. marshal who organized other marshals to administer the census. These enumerators visited households and recorded the name of every person, along with their age, sex, color, profession, occupation, value of real estate, place of birth, parental foreign birth, marriage, literacy, and whether deaf, dumb, blind, insane or “idiotic”.
Sources:
Szucs, L.D. and Hargreaves Luebking, S. (1997). Research in Census Records, The Source: A Guidebook of American Genealogy. Ancestry Incorporated, Salt Lake City, UT.
Dollarhide, W. (2000). The Census Book: A Genealogist’s Guide to Federal Census Facts, Schedules and Indexes. Heritage Quest, Bountiful, UT.
https://www.icpsr.umich.edu/web/ICPSR/studies/7923/terms
This data collection consists of modified records from CENSUS OF POPULATION AND HOUSING, 1970 [UNITED STATES]: PUBLIC USE SAMPLES (ICPSR 0018). The original records consisted of 120-character household records and 120-character person records, whereas the new modified records are rectangular (each person record is combined with the corresponding household record) with a length of 188, after the deletion of some items. Additional information was added to the data records, including typical educational requirement for current occupation, occupational prestige score, and group identification code. This version also differs from the original public use census samples in other ways: persons aged 15-75 were included, no majority males were included, but the majority males from CENSUS OF POPULATION AND HOUSING [UNITED STATES], 1970 PUBLIC USE SAMPLE: MODIFIED 1/1000 5% STATE SAMPLES (ICPSR 7922) were included for convenience, 10 percent of the Black population from each file was included, and Mexican Americans (identified by a Spanish surname) from outside the five southwestern states of Arizona, California, Colorado, New Mexico, and Texas were not included in this file. Variables provide information on the housing unit, such as occupancy and vacancy status of house, value of property, commercial use, ratio of rent and property value to family income, availability of plumbing facilities, sewage disposal, complete kitchen facilities, heating facilities, flush toilet, water, television, and telephone. Data are also provided on household characteristics such as household size, family size, and household relationships. Other demographic variables specify age, sex, place of birth, state of residence, Spanish descent, marital status, race, veteran status, income, and ratio of family income to poverty cutoff level. This collection was made available by the National Chicano Research Network of the Institute for Social Research, University of Michigan. 
See the related collection, CENSUS OF POPULATION AND HOUSING [UNITED STATES], 1970 PUBLIC USE SAMPLE: MODIFIED 1/1000 5% STATE SAMPLES (ICPSR 7922).
The Integrated Public Use Microdata Series (IPUMS) Complete Count Data include more than 650 million individual-level and 7.5 million household-level records. The microdata are the result of collaboration between IPUMS and the nation’s two largest genealogical organizations (Ancestry.com and FamilySearch) and provide the largest and richest source of individual-level and household data.
Historic data are scarce and often exist only in aggregate tables. The key advantage of the IPUMS data is the availability of individual- and household-level characteristics that researchers can tabulate in ways that benefit their specific research questions. The data contain demographic variables, economic variables, migration variables and family variables. Within households, it is possible to create relational data, as all relations between household members are known. For example, having data on the mother and her children in a household enables researchers to calculate the mother’s age at birth. Another advantage of the Complete Count data is the possibility to follow individuals over time using a historical identifier.
In sum: the IPUMS data are a unique source for research on social and economic change and can provide population health researchers with information about social and economic determinants.
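The mother's-age-at-birth calculation mentioned above can be sketched as follows. This is a minimal illustration, not IPUMS code: the field names (`household_id`, `person_num`, `mother_num`) are hypothetical stand-ins for whatever household identifier and mother pointer the extract provides, and because ages are recorded in whole years the result is only accurate to within about a year.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Person:
    household_id: int                 # hypothetical household identifier
    person_num: int                   # hypothetical position within the household
    age: int                          # age in whole years at enumeration
    mother_num: Optional[int] = None  # person_num of a co-resident mother, if any


def mothers_age_at_birth(people):
    """Map (household_id, child person_num) to the mother's approximate age at the birth."""
    by_key = {(p.household_id, p.person_num): p for p in people}
    result = {}
    for child in people:
        if child.mother_num is None:
            continue  # no mother identified in the household
        mother = by_key.get((child.household_id, child.mother_num))
        if mother is None:
            continue  # pointer refers to a person missing from the data
        result[(child.household_id, child.person_num)] = mother.age - child.age
    return result


# Made-up household: a 40-year-old mother with children aged 12 and 5.
household = [
    Person(1, 1, 40),
    Person(1, 2, 12, mother_num=1),
    Person(1, 3, 5, mother_num=1),
]
ages = mothers_age_at_birth(household)
```

Only mothers who live in the same household as the child can be linked this way; children whose mothers are absent are skipped rather than guessed.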
The IPUMS 1900 census data were collected in June 1900. Enumerators collected data by traveling to households and counting the residents who regularly slept there. Individuals lacking permanent housing were counted as residents of the place where they were when the data were collected. Household members absent on the day of data collection were either listed with the household with the help of other household members or were scheduled for the last census subdivision.
This dataset was created on 2020-01-10 22:51:40.810
by merging multiple datasets together. The source datasets for this version were:
IPUMS 1900 households: This dataset includes all households from the 1900 US census.
IPUMS 1900 persons: This dataset includes all individuals from the 1900 US census.
IPUMS 1900 Lookup: This dataset includes variable names, variable labels, variable values, and corresponding variable value labels for the IPUMS 1900 datasets.
The website allows the public full access to the 1940 Census images, census maps and descriptions.
https://www.icpsr.umich.edu/web/ICPSR/studies/2863/terms
The objective of this data collection was to examine inequalities of wealth and the geographic distribution of wealthy individuals in late 18th- and early 19th-century New York and to investigate wealth in relationship to occupation and location. For this study, the entire set of tax assessment records and United States Census records for New York City were computerized and occupational status was added for all entries. The collection addresses topics such as social class structure, demographic factors, occupational status and geographic distribution, property values and geographic distribution, and the relationship of these factors to the political system. Units of analysis were individual property owners and renters for the tax assessment data and heads of households for the census data. Data collected included the individual's name, address, occupation, sex, and race, the type, quantity, and value of real and personal property, and the type and occupancy of the structure at the address. Occupational data from city directories were used to supplement the tax and census data.
The 1940 Census population schedules were created by the Bureau of the Census in an attempt to enumerate every person living in the United States on April 1, 1940, although some persons were missed. The 1940 census population schedules were digitized by the National Archives and Records Administration (NARA) and released publicly on April 2, 2012. The 1940 Census enumeration district maps contain maps of counties, cities, and other minor civil divisions that show enumeration districts, census tracts, and related boundaries and numbers used for each census. The coverage is nationwide and includes territorial areas. The 1940 Census enumeration district descriptions contain written descriptions of census districts, subdivisions, and enumeration districts.
https://fred.stlouisfed.org/legal/#copyright-public-domain
Graph and download economic data for New One Family Houses for Sale for the South Census Region (HNFSS) from Jan 1973 to May 2025 about South Census Region, 1-unit structures, family, new, sales, housing, and USA.
This computerised transcription of the census enumerators' books for the 1881 Census for England, Scotland and Wales, the Channel Islands and the Isle of Man is a by-product of a project to create a microfiche index of the population of Great Britain for genealogists. Covering the entire enumerated population of England, Scotland and Wales, the Channel Islands and the Isle of Man in 1881, it is the largest collection of historical source material to be made available in computerised form. The data consists of the name, address, relationship to the head of household, marital status, age, occupation and birthplace of some 26 million individuals, together with information about disabilities.
In 1999 the Genealogical Society of Utah published a version of this computerised transcription as a CD-ROM product suitable for genealogical research (Genealogical Society of Utah (1999) 1881 British census and national index. [25 CDs]. Salt Lake City, Utah: GSU). This study is an enriched version of these data. The sample is a 5 per cent random sample of the parishes of Great Britain. The sample was chosen in the simplest manner possible. A list of all the parishes in England, Wales, Scotland and the Islands in the British Seas was created; using a random number generator in Microsoft Excel, a random number between zero and one was allocated to each parish. All those less than or equal to 0.05 were selected for the sample. The records relating to the individuals in each of these parishes were then extracted from the data and combined in a database.
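The parish selection described above (one uniform random number per parish, keep those at or below 0.05) can be sketched as follows. This is an illustration only: the original draw was made in Microsoft Excel, and the parish names and seed here are made up.

```python
import random


def sample_parishes(parishes, fraction=0.05, seed=None):
    """Keep each parish whose uniform random draw is less than or equal to `fraction`."""
    rng = random.Random(seed)
    selected = []
    for parish in parishes:
        draw = rng.random()  # uniform in [0, 1)
        if draw <= fraction:
            selected.append(parish)
    return selected


# Hypothetical parish list; the real list covered England, Wales, Scotland and the Islands.
parishes = [f"parish_{i:05d}" for i in range(10000)]
sample = sample_parishes(parishes, seed=1881)
```

Because each parish is kept or dropped independently, the sample size is itself random and only approximately 5 per cent of the list.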
Tables B1 and B3 in Appendix B of the documentation list the 716 parishes in the sample.
Open Government Licence - Canada 2.0: https://open.canada.ca/en/open-government-licence-canada
License information was derived automatically
[ARCHIVED] Community Counts data is retained for archival purposes only, such as research, reference and record-keeping. This data has not been maintained or updated. Users looking for the latest information should refer to Statistics Canada’s Census Program (https://www12.statcan.gc.ca/census-recensement/index-eng.cfm?MM=1) for the latest data, including detailed results about Nova Scotia. This table reports family structures. This data is sourced from the Census of Population. Geographies available: provinces, counties, communities, municipalities, district health authorities, community health boards, economic regions, police districts, school boards, municipal electoral districts, provincial electoral districts, federal electoral districts, regional development authorities, watersheds
ABS Census data extract - G08 ANCESTRY BY COUNTRY OF BIRTH OF PARENTS providing a breakdown of population at LGA level and by: ancestry (a), birthplace not stated (b), total responses (c), and other (d). This data is based on place of usual residence.
(a) This list of ancestries consists of the most common 30 Ancestry responses reported in the 2016 and 2011 Census.
(b) Includes birthplace for either or both parents not stated.
(c) This table is a multi-response table and therefore the total responses count will not equal the total persons count.
(d) If two responses from one person are categorised in the 'Other' category only one response is counted. Includes ancestries not identified individually and 'Inadequately described'.
Please note that there are small random adjustments made to all cell values to protect the confidentiality of data. These adjustments may cause the sum of rows or columns to differ by small amounts from table totals.
Data on census family structure, number of children, average number of children and age of youngest child for census families with children, Canada, provinces and territories, census metropolitan areas and census agglomerations, 2021, 2016 and 2011 censuses.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The Veterans’ Grandchildren Mortality Plus sample consists of the records of more than 35,700 grandchildren, both male and female in nearly equal numbers, about 28,000 of whom survived to age 45, who were born after the war to 16,791 children of 2,825 veterans, and contains an oversample of ex-POW veterans. The primary purpose of the project was to explore how grandfathers’ trauma affects the longevity and overweight of descendants. The dataset contains birth and death dates of grandchildren, census information on their parents' household, select socioeconomic and education information from the 1930 and 1940 census, and height and weight information from WWII draft cards for the grandsons. This multigenerational dataset can be used for researching the intergenerational transmission of longevity, overweight and socioeconomic status and the sex-specific pathways of this transmission and for testing mechanical linkage algorithms. Researchers built on a previously collected NIA-funded project containing census and death information of children of ex-POW and non-POW veterans (“Early Indicators, Intergenerational Processes, and Aging,” NIA grant P01AG10120, PI: Costa). The Veterans’ Grandchildren Mortality Plus data set contains the newly collected records of the veterans’ grandchildren, as well as the previously collected data of the veterans and their children.
Official statistics are produced impartially and free from political influence.
Designed to facilitate analysis of the status of Blacks around the turn of the century, this oversample of Black-headed households in the United States was drawn from the 1910 manuscript census schedules. The sample complements the 1/250 Public Use Sample of the 1910 census manuscripts collected by Samuel H. Preston at the University of Pennsylvania: CENSUS OF POPULATION, 1910 [UNITED STATES]: PUBLIC USE SAMPLE (ICPSR 9166). Part 1, Household Records, contains a record for each household selected in the sample and supplies variables describing the location, type, and composition of the households. Part 2, Individual Records, contains a record for each individual residing in the sampled households and includes information on demographic characteristics, occupation, literacy, nativity, ethnicity, and fertility. The oversample was drawn from manuscript census records for 1910 from counties with at least 10 percent of the population African-American (Negro, Black, or Mulatto), located in nine states where a large number of counties had at least this same proportion of African-Americans (Maryland, Virginia, North Carolina, Florida, Kentucky, Tennessee, Arkansas, Louisiana, and Texas). The four states with the largest population of Blacks (South Carolina, Alabama, Mississippi, and Georgia) were excluded from the oversample because the 1/250 Public Use Sample (referred to above) provided sufficient cases for most analyses. Sampling was carried out using computer software that randomly selected households based on the manuscript census microfilm reel number, sequence, and page and line number, with two different sampling fractions. Counties in Maryland, Kentucky, and Texas were sampled using a 0.01 sampling fraction, while a 0.005 sampling fraction was employed in Virginia, North Carolina, Florida, Tennessee, and Arkansas. In Louisiana, both fractions were utilized to test optimum sampling fractions.
ICPSR data undergo a confidentiality review and are altered when necessary to limit the risk of disclosure. ICPSR also routinely creates ready-to-go data files along with setups in the major statistical software formats as well as standard codebooks to accompany the data. In addition to these procedures, ICPSR performed the following processing steps for this data collection: Created variable labels and/or value labels. The data contain blanks and alphabetic characters. This oversample can be combined with the 1/250 Public Use Sample by differential weighting of households (or individuals) by county of enumeration as described in the User's Guide. Datasets: DS0: Study-Level Files DS1: Household Records DS2: Individual Records
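The idea behind differential weighting can be illustrated with the two sampling fractions quoted above. This is only a sketch of the general inverse-sampling-fraction principle, not the procedure in the User's Guide, which weights by county of enumeration and accounts for the 1/250 sample. Louisiana is omitted because both fractions were used there and the description does not say which counties received which.

```python
# Sampling fractions by state, as stated in the collection description.
SAMPLING_FRACTION = {
    "Maryland": 0.01,
    "Kentucky": 0.01,
    "Texas": 0.01,
    "Virginia": 0.005,
    "North Carolina": 0.005,
    "Florida": 0.005,
    "Tennessee": 0.005,
    "Arkansas": 0.005,
}


def base_weight(state):
    """Inverse of the sampling fraction: the number of households each sampled one stands for."""
    return 1.0 / SAMPLING_FRACTION[state]


def weighted_household_count(sampled_counts):
    """Estimate eligible households from sampled counts per state (illustrative only)."""
    return sum(base_weight(state) * n for state, n in sampled_counts.items())


# Made-up sampled counts, used only to show the arithmetic.
estimate = weighted_household_count({"Texas": 30, "Virginia": 10})
```

A household sampled at 0.005 counts twice as much as one sampled at 0.01, which keeps states with different fractions comparable in pooled estimates.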
The community profiles contain data from the 2016 Census and long form program. The 2016 census data is considered to be of good quality and general comparisons can be made with similar data from previous years. Direct comparisons cannot be made between Statistics Canada’s 2016 Long Form data and the 2011 National Household Survey (NHS).
The figures shown in the tables and charts have been subjected to a confidentiality procedure known as random rounding to prevent the possibility of associating statistical data with any identifiable individual. Under this method, all figures, including totals and margins, are randomly rounded either up or down to a multiple of "5", and in some cases "10". While providing strong protection against disclosure, this technique does not add significant error to the data. The user should be aware that totals and margins are rounded independently of the cell data so that some differences between these and the sum of rounded cell data may exist. Also, minor differences can be expected in corresponding totals and cell values among various census tabulations.
Statistics Canada is committed to protecting the privacy of all Canadians and the confidentiality of the data they provide. As part of this commitment, some population counts of geographic areas are adjusted in order to ensure confidentiality.
For more information about Kingston's Community & Neighbourhood Profiles, as well as links to exciting new tools, please visit our website: https://www.cityofkingston.ca/explore/neighbourhood-profiles
A detailed Glossary of Terms is also available (Adobe PDF format): https://drive.google.com/file/d/1KAbrqmARXjzy1yBcVlVYf2Xz-KidOfXM/view?usp=sharing
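The random rounding described above can be sketched as follows. The description states only that figures are randomly rounded up or down to a multiple of 5. The rounding probabilities used here (round up with probability proportional to the remainder, so the expected value is unchanged) are an assumption for illustration, not a statement of Statistics Canada's exact procedure.

```python
import random


def random_round(value, base=5, rng=random):
    """Randomly round a non-negative integer count up or down to a multiple of `base`."""
    lower = (value // base) * base
    remainder = value - lower
    if remainder == 0:
        return lower  # already a multiple of base, left unchanged
    # Assumed scheme: round up with probability remainder/base (unbiased in expectation).
    if rng.random() < remainder / base:
        return lower + base
    return lower


# Cells and their total are rounded independently, so the rounded cells
# need not sum to the rounded total.
rng = random.Random(2016)
cells = [12, 7, 23]
rounded_cells = [random_round(c, rng=rng) for c in cells]
rounded_total = random_round(sum(cells), rng=rng)
```

Comparing `sum(rounded_cells)` with `rounded_total` shows why the text warns that totals and margins may differ from the sum of the rounded cell data.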
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This training dataset includes a total of 34,913 manually transcribed text segments. It is dedicated to the handwritten text recognition (HTR) of historical sources, typically tabular records, such as censuses. This dataset is based on a sample of 83 pages from the 19th-century (1805-1898) censuses of Lausanne, Switzerland. The primary language of the documents is French, although many Germanic names and toponyms are also found.
The training data are formatted and provided on the model of the Bentham dataset. The format thus simply consists of a list of JPEG images, one per text segment, and their corresponding transcriptions, stored in a txt file. The file naming convention is 'yyyy-ppp-n', where 'y' stands for the year of publication of the census, and 'p' for the page number.
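The 'yyyy-ppp-n' naming convention can be parsed with a short helper like the one below. This is a sketch under assumptions: the description does not define 'n', so it is kept as an opaque string here (it presumably identifies the segment on the page), and the example file stem is made up.

```python
import re
from typing import NamedTuple


class SegmentName(NamedTuple):
    year: int     # 'yyyy': year of publication of the census
    page: int     # 'ppp': page number
    segment: str  # 'n': not defined in the description, kept as-is


# Four-digit year, three-digit page, then whatever follows the second hyphen.
_PATTERN = re.compile(r"^(\d{4})-(\d{3})-(.+)$")


def parse_segment_name(stem):
    """Split a file stem such as '1835-012-7' into year, page, and trailing identifier."""
    match = _PATTERN.match(stem)
    if match is None:
        raise ValueError(f"not in 'yyyy-ppp-n' form: {stem!r}")
    year, page, rest = match.groups()
    return SegmentName(int(year), int(page), rest)


# Hypothetical stem, with the image extension already stripped.
name = parse_segment_name("1835-012-7")
```

Grouping segments by `(year, page)` after parsing makes it straightforward to hold out whole pages when splitting training and validation data.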
The digitized documents are provided by the Archives of the City of Lausanne.
Please note that the annotation and extraction methodology, as well as the complete evaluation of performance, including the HTR benchmark and post-correction performance, is published in:
Petitpierre R., Rappo L., Kramer M. (2023). An end-to-end pipeline for historical censuses processing. International Journal on Document Analysis and Recognition (IJDAR). doi: 10.1007/s10032-023-00428-9
Tabular datasets resulting from automatic extraction are also available on Zenodo:
Petitpierre R., Rappo L., Kramer M., di Lenardo I. (2023). 1805-1898 Census Records of Lausanne : a Long Digital Dataset for Demographic History. Zenodo. doi: 10.5281/zenodo.7711640
https://www.statcan.gc.ca/eng/reference/licence
Statistics Canada Census Data from 2021. This dataset includes the Indigenous ancestry data provided by Statistics Canada joined with the census tracts. Each topic covered by the census was exported as a separate table. Each table contains the total, male, and female characteristics as fields for each census tract. Topics include population, age and sex, immigration, language, family and households, income, education, and labour. For more information on definitions of terms used in the tables and other notes, refer to Statistics Canada's 2021 Census.
https://fred.stlouisfed.org/legal/#copyright-public-domain
Graph and download economic data for New One Family Houses for Sale for the Northeast Census Region (HNFSNE) from Jan 1973 to Apr 2025 about Northeast Census Region, 1-unit structures, family, new, sales, housing, and USA.
Starting in mid-July of 2020, despite many delays due to Covid-19, census takers began interviewing households who had not yet responded online or via the mail to the U.S. 2020 Census. The federal census, required by the United States’ Constitution, happens once every 10 years and each time, there are new variations in enumeration (counting) techniques and what statistical data to collect. There are processes around “how” to count and then also “what” to count; the data collected needs to be useful for governance and allocation yet also respectful of privacy and remain fair and impartial for the entire U.S. population. In 2019 and 2020, hundreds of thousands of temporary workers from local communities were hired to go out into the field as census takers as well as staff offices and provide supervision. This 22nd federal census count began in January 2020 with remote portions of Alaska, where the territory was still frozen and traversable. These employed citizens are just one aspect of how the census is truly a community event. Let’s dive into the history of the U.S. Census and also learn why this count is so important.
Open Government Licence - Canada 2.0: https://open.canada.ca/en/open-government-licence-canada
License information was derived automatically
This table is part of a series of tables that present a portrait of Canada based on the various census topics. The tables range in complexity and levels of geography. Content varies from a simple overview of the country to complex cross-tabulations; the tables may also cover several censuses. Geographies:
* Canada, Provinces and Territories, Census Metropolitan Areas and Census Agglomerations
* Canada, Provinces and Territories, Census Divisions and Census Subdivisions
* Census Metropolitan Areas, Tracted Census Agglomerations and Census Tracts