This dataset includes all individuals from the 1870 US census.
All manuscripts (and other items you'd like to publish) must be submitted to
phsdatacore@stanford.edu for approval prior to journal submission.
We will check your cell sizes and citations.
For more information about how to cite PHS and PHS datasets, please visit:
https://phsdocs.developerhub.io/need-help/citing-phs-data-core
This dataset was developed through a collaboration between the Minnesota Population Center and the Church of Jesus Christ of Latter-Day Saints. The data contain demographic variables, economic variables, migration variables and race variables. Unlike more recent census datasets, pre-1900 census datasets only contain individual level characteristics and no household or family characteristics, but household and family identifiers do exist.
The official enumeration day of the 1870 census was 1 June 1870. The main goal of an early census like the 1870 U.S. census was to allow Congress to determine the collection of taxes and the apportionment of seats in the House of Representatives. Each district was assigned a U.S. marshal who organized other marshals to administer the census. These enumerators visited households and recorded the name of every person, along with their age, sex, color, profession, occupation, value of real estate, place of birth, parental foreign birth, marriage, literacy, and whether deaf, dumb, blind, insane or “idiotic”.
Sources:
Szucs, L.D. and Hargreaves Luebking, S. (1997). Research in Census Records, The Source: A Guidebook of American Genealogy. Ancestry Incorporated, Salt Lake City, UT.
Dollarhide, W. (2000). The Census Book: A Genealogist’s Guide to Federal Census Facts, Schedules and Indexes. Heritage Quest, Bountiful, UT.

This website allows the public full access to the 1940 Census images, census maps and descriptions.

https://www.icpsr.umich.edu/web/ICPSR/studies/2863/terms
The objective of this data collection was to examine inequalities of wealth and the geographic distribution of wealthy individuals in late 18th- and early 19th-century New York and to investigate wealth in relationship to occupation and location. For this study, the entire set of tax assessment records and United States Census records for New York City were computerized and occupational status was added for all entries. The collection addresses topics such as social class structure, demographic factors, occupational status and geographic distribution, property values and geographic distribution, and the relationship of these factors to the political system. Units of analysis were individual property owners and renters for the tax assessment data and heads of households for the census data. Data collected included the individual's name, address, occupation, sex, and race, the type, quantity, and value of real and personal property, and the type and occupancy of the structure at the address. Occupational data from city directories were used to supplement the tax and census data.

The Integrated Public Use Microdata Series (IPUMS) Complete Count Data include more than 650 million individual-level and 7.5 million household-level records. The microdata are the result of collaboration between IPUMS and the nation’s two largest genealogical organizations (Ancestry.com and FamilySearch) and provide the largest and richest source of individual-level and household-level data.
Historic data are scarce and often exist only in aggregate tables. The key advantage of the IPUMS data is the availability of individual- and household-level characteristics that researchers can tabulate in ways that benefit their specific research questions. The data contain demographic variables, economic variables, migration variables and family variables. Within households, it is possible to create relational data as all relations between household members are known. For example, having data on the mother and her children in a household enables researchers to calculate the mother’s age at the birth of each child. Another advantage of the Complete Count data is the possibility to follow individuals over time using a historical identifier.
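As a rough illustration of the relational use described above, the sketch below derives a mother's age at the birth of each co-resident child. It assumes that the extract carries IPUMS-style columns named SERIAL (household identifier), PERNUM (person number within the household), MOMLOC (the PERNUM of the person's mother, 0 when she is not in the household) and AGE. The toy rows are invented, and the column names in a given extract should be checked against its codebook.

```python
# Invented toy records; the column names are assumptions (see the note above).
persons = [
    {"SERIAL": 1, "PERNUM": 1, "MOMLOC": 0, "AGE": 34},
    {"SERIAL": 1, "PERNUM": 2, "MOMLOC": 1, "AGE": 10},
    {"SERIAL": 1, "PERNUM": 3, "MOMLOC": 1, "AGE": 4},
    {"SERIAL": 2, "PERNUM": 1, "MOMLOC": 0, "AGE": 51},
    {"SERIAL": 2, "PERNUM": 2, "MOMLOC": 0, "AGE": 48},
]


def mothers_age_at_birth(rows):
    """Return (SERIAL, PERNUM, mother's age at birth) for each child
    whose mother is recorded in the same household."""
    # Index every person by (household, person number) for direct lookup.
    by_key = {(r["SERIAL"], r["PERNUM"]): r for r in rows}
    results = []
    for r in rows:
        if r["MOMLOC"] == 0:
            continue  # no co-resident mother recorded
        mother = by_key.get((r["SERIAL"], r["MOMLOC"]))
        if mother is None:
            continue  # pointer does not resolve; skip rather than guess
        # Ages are in completed years, so the difference is approximate.
        results.append((r["SERIAL"], r["PERNUM"], mother["AGE"] - r["AGE"]))
    return results


ages = mothers_age_at_birth(persons)
for serial, pernum, age_at_birth in ages:
    print(serial, pernum, age_at_birth)
```

Because census ages are recorded in completed years, the computed value can be off by up to a year in either direction.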
In sum: the IPUMS data are a unique source for research on social and economic change and can provide population health researchers with information about social and economic determinants.
The IPUMS 1900 census data were collected in June 1900. Enumerators collected data by traveling to households and counting the residents who regularly slept there. Individuals lacking permanent housing were counted as residents of the place where they were when the data were collected. Household members absent on the day of data collection were either listed with the household with the help of other household members or were scheduled for the last census subdivision.
This dataset was created on 2020-01-10 22:51:40.810 by merging multiple datasets together. The source datasets for this version were:
IPUMS 1900 households: This dataset includes all households from the 1900 US census.
IPUMS 1900 persons: This dataset includes all individuals from the 1900 US census.
IPUMS 1900 Lookup: This dataset includes variable names, variable labels, variable values, and corresponding variable value labels for the IPUMS 1900 datasets.

This Special Licence access dataset contains names and addresses from the Integrated Census Microdata (I-CeM) dataset of the censuses of Great Britain for the period 1851 to 1911. These data are made available under Special Licence (SL) access conditions due to commercial sensitivity.
The anonymised main I-CeM database that complements these names and addresses is available under SN 7481. It comprises the Censuses of Great Britain for the period 1851-1911; data are available for England and Wales for 1851-1861 and 1881-1911 (1871 is not currently available for England and Wales) and for Scotland for 1851-1901 (1911 is not currently available for Scotland). The database contains over 180 million individual census records and was digitised and harmonised from the original census enumeration books. It details characteristics for all individuals resident in Great Britain at each of the included Censuses. The original digital data has been coded and standardised; the I-CeM database has consistent geography over time and standardised coding schemes for many census variables.
This dataset of names and addresses for individual census records is organised per country (England and Wales; Scotland) and per census year. Within each data file each census record contains first and last name, street address and an individual identification code (RecID) that allows linking with the corresponding anonymised I-CeM record. The data cannot be used for true linking of individual census records across census years for commercial genealogy purposes nor for any other commercial purposes. The SL arrangements are required to ensure that commercial sensitivity is protected. For information on making an application, see the Access section.
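The RecID link described above can be sketched as a simple keyed join within a single census year. Only the RecID field name comes from the description; the other column names, the file contents and the person details below are invented for illustration, and the real files should be read according to the I-CeM documentation.

```python
import csv
import io

# Invented stand-ins for one names-and-addresses file and the matching
# anonymised I-CeM file of the same country and census year.
NAMES_CSV = """RecID,first_name,last_name,address
101,Mary,Hughes,4 Mill Lane
102,John,Hughes,4 Mill Lane
"""

ANON_CSV = """RecID,age,sex
101,34,F
102,36,M
103,8,F
"""


def link_on_recid(names_text, anon_text):
    """Attach name and address fields to anonymised records sharing a RecID."""
    names = {row["RecID"]: row for row in csv.DictReader(io.StringIO(names_text))}
    linked = []
    for row in csv.DictReader(io.StringIO(anon_text)):
        extra = names.get(row["RecID"])
        if extra is None:
            continue  # no name/address record for this RecID
        linked.append({**row, **extra})
    return linked


linked = link_on_recid(NAMES_CSV, ANON_CSV)
for record in linked:
    print(record["RecID"], record["last_name"], record["age"])
```

The join stays within one census year, in line with the licence conditions stated above.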
The data were updated in February 2020, with some files redeposited with longer field length limits. Users should note that some name and address fields are truncated due to the limits set by the LDS project that transcribed the original data. No more than 10,000 records out of some 210 million across the study should be affected.
Further information about I-CeM can be found on the I-CeM Integrated Microdata Project and I-CeM Guide webpages.

Official statistics are produced impartially and free from political influence.

https://www.icpsr.umich.edu/web/ICPSR/studies/7923/terms
This data collection consists of modified records from CENSUS OF POPULATION AND HOUSING, 1970 [UNITED STATES]: PUBLIC USE SAMPLES (ICPSR 0018). The original records consisted of 120-character household records and 120-character person records, whereas the new modified records are rectangular (each person record is combined with the corresponding household record) with a length of 188, after the deletion of some items. Additional information was added to the data records, including typical educational requirement for current occupation, occupational prestige score, and group identification code. This version also differs from the original public use census samples in other ways: persons aged 15-75 were included, no majority males were included, but the majority males from CENSUS OF POPULATION AND HOUSING [UNITED STATES], 1970 PUBLIC USE SAMPLE: MODIFIED 1/1000 5% STATE SAMPLES (ICPSR 7922) were included for convenience, 10 percent of the Black population from each file was included, and Mexican Americans (identified by a Spanish surname) from outside the five southwestern states of Arizona, California, Colorado, New Mexico, and Texas were not included in this file. Variables provide information on the housing unit, such as occupancy and vacancy status of house, value of property, commercial use, ratio of rent and property value to family income, availability of plumbing facilities, sewage disposal, complete kitchen facilities, heating facilities, flush toilet, water, television, and telephone. Data are also provided on household characteristics such as household size, family size, and household relationships. Other demographic variables specify age, sex, place of birth, state of residence, Spanish descent, marital status, race, veteran status, income, and ratio of family income to poverty cutoff level. This collection was made available by the National Chicano Research Network of the Institute for Social Research, University of Michigan. 
See the related collection, CENSUS OF POPULATION AND HOUSING [UNITED STATES], 1970 PUBLIC USE SAMPLE: MODIFIED 1/1000 5% STATE SAMPLES (ICPSR 7922).

Two novel datasets, French 19th-century and U.S. 1950 Census records, used to demonstrate our approach.

https://search.gesis.org/research_data/datasearch-httpwww-da-ra-deoaip--oaioai-da-ra-de445119
Abstract (en): This data collection provides a preliminary subsample of the 1880 Public Use Sample drawn from census enumeration forms. The file contains two types of records: family and person. Each family record is followed by a record for each person in the family. This collection contains information about size of family, number of persons and families in dwelling, and geographic location of each household. Information on individuals includes demographic characteristics, civil condition, occupation, health, education, and nativity. The source material is the manuscript census records from 1880 for the 38 states of the United States, the District of Columbia, and the Dakota Territory. This collection is a nationally representative (although clustered) 1 in 1000 preliminary subsample of the United States population in 1880. The subsample is based on every tenth microfilm reel of enumeration forms (there are a total of 1,454 reels) and, within each reel, on the census page itself. In terms of the Public Use Sample as a whole, a sample density of 1 person per 100 was chosen so that a single sample point was randomly generated for every two census pages. Sample points were chosen for inclusion in the collection only if the individual selected was the first person listed in the dwelling. Under this procedure each dwelling, family, and individual in the population had a 1 in 100 probability of inclusion in the Public Use Sample. The complete sample, which will be released by the principal investigators in December 1993, will contain approximately 500,000 individuals living in 100,000 families, or 1 percent of the United States population in 1880.
Funding institution(s): United States Department of Health and Human Services. National Institutes of Health (HD25839).
(1) This dataset has two levels. The first level ("F" Record Type) contains 29 variables for each of 10,126 families. The second level ("P" Record Type) contains 45 variables for each of 48,786 individuals residing in those families.
(2) The data contain blanks and alphabetic characters.
(3) Users will note some differences in code frequencies between certain variables in this collection and the totals listed in the documentation.
(4) This collection is superseded by CENSUS OF POPULATION, 1880 [UNITED STATES]: PUBLIC USE SAMPLE (ICPSR 6460).
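A two-level file of this kind, in which each "F" record is followed by its "P" records, is often flattened into one row per person before analysis. The sketch below shows one way to do that. It assumes, purely for illustration, that the record type is the first character of each line; the actual column positions are defined in the ICPSR codebook and are not reproduced here, and the sample lines are invented.

```python
def rectangularize(lines):
    """Pair every 'P' (person) record with the most recent 'F' (family) record."""
    rows = []
    current_family = None
    for line in lines:
        record_type, payload = line[:1], line[1:].rstrip("\n")
        if record_type == "F":
            current_family = payload
        elif record_type == "P":
            if current_family is None:
                raise ValueError("person record found before any family record")
            rows.append((current_family, payload))
        # Lines with any other leading character are ignored in this sketch.
    return rows


# Invented sample lines: two families with two and one persons respectively.
sample = ["F0001 NY", "P0001 34 F", "P0001 10 M", "F0002 OH", "P0002 51 M"]
rows = rectangularize(sample)
for family, person in rows:
    print(family, "|", person)
```

Since the data are documented as containing blanks and alphabetic characters, fields are kept as strings here rather than converted to numbers.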

Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Context. This historical dataset stems from the project of automatic extraction of 72 census records of Lausanne, Switzerland. The complete dataset covers a century of historical demography in Lausanne (1805-1898), which corresponds to 18,831 pages, and nearly 6 million cells.
Content. The data published in this repository correspond to a first release, i.e. a diachronic slice of one register every 8 to 9 years. Unfortunately, the remaining data are currently under embargo. Their publication will take place as soon as possible, and at the latest by the end of 2023. In the meantime, the data presented here correspond to a large subset of 2,844 pages, which already allows most research hypotheses to be investigated.
Description. The population censuses, digitized by the Archives of the city of Lausanne, continuously cover the evolution of the population in Lausanne throughout the 19th century, starting in 1805, with only one long interruption from 1814 to 1831. Highly detailed, they are an invaluable source for studying migration, economic and social history, and traces of cultural exchanges not only with Bern, but also with France and Italy. Indeed, the system of tracing family origin, specific to Switzerland, makes it possible to follow the migratory movements of families long before the censuses appeared. The bourgeoisie is also an essential economic tracer. In addition, censuses extensively describe the organization of the social fabric into family nuclei, around which gravitate various boarders, workers, servants or apprentices, often living in the same apartment with the family.
Production. The structure and richness of censuses have also provided an opportunity to develop automatic methods for processing structured documents. The processing of censuses includes several steps, from the identification of text segments to the restructuring of information as digital tabular data, through Handwritten Text Recognition and the automatic segmentation of the structure using neural networks. Please note that the detailed extraction methodology, as well as the complete evaluation of performance and reliability, is published in:
Petitpierre R., Rappo L., Kramer M. (2023). An end-to-end pipeline for historical censuses processing. International Journal on Document Analysis and Recognition (IJDAR). doi: 10.1007/s10032-023-00428-9
Data structure. The data are structured in rows and columns, with each row corresponding to a household. Multiple entries in the same column for a single household are separated by vertical bars ⟨|⟩. The center point ⟨·⟩ indicates an empty entry. For some columns (e.g., street name, house number, owner name), an empty entry indicates that the last non-empty value should be carried over. The page number is in the last column.
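The conventions above (vertical bars between multiple entries, a center point for an empty entry, and carry-over of the last non-empty value for some columns) can be handled with a short cleaning pass. The sketch below is only an illustration: the column names and the sample rows are invented, and the set of columns that use carry-over should be taken from the dataset documentation.

```python
EMPTY = "\u00b7"  # the center point that marks an empty entry

# Hypothetical column names; the real headers should be read from the files.
CARRY_FORWARD = ("street", "house_number", "owner")
MULTI_VALUED = ("name", "occupation")


def clean_households(rows):
    """Resolve carried-over values and split bar-separated entries."""
    last_seen = {}
    cleaned = []
    for row in rows:
        out = dict(row)
        for col in CARRY_FORWARD:
            value = row[col].strip()
            if value == EMPTY:
                # Reuse the last non-empty value (None if none was seen yet).
                out[col] = last_seen.get(col)
            else:
                out[col] = value
                last_seen[col] = value
        for col in MULTI_VALUED:
            parts = [part.strip() for part in row[col].split("|")]
            out[col] = [None if part == EMPTY else part for part in parts]
        cleaned.append(out)
    return cleaned


# Invented sample rows, one per household.
raw = [
    {"street": "Rue de Bourg", "house_number": "12", "owner": "Dupont",
     "name": "Jean Martin|Marie Martin", "occupation": "tailleur|\u00b7",
     "page": "3"},
    {"street": "\u00b7", "house_number": "\u00b7", "owner": "\u00b7",
     "name": "Louise Favre", "occupation": "\u00b7", "page": "3"},
]
households = clean_households(raw)
for household in households:
    print(household["street"], household["house_number"], household["name"])
```

Carry-over depends on row order, so it has to be resolved before any sorting or filtering of the rows.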
Liability. The data presented here are neither curated nor verified. They are the raw results of the extraction, the reliability of which was thoroughly assessed in the above-mentioned publication. Any reuse of these data for research purposes requires an appropriate methodology, which may typically include string distance heuristics or statistical methods for dealing with noise and uncertainty.
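One possible form of string distance heuristic is to map each noisy transcription to the closest entry of a reference vocabulary and to reject weak matches. The sketch below uses the similarity ratio from Python's standard difflib module as a stand-in for whichever distance a project prefers. The vocabulary and the 0.8 cutoff are invented for illustration and are not taken from the publication.

```python
import difflib

# Invented reference list of occupations; a real one would be project-specific.
REFERENCE = ["vigneron", "tailleur", "cordonnier", "domestique"]


def normalize(token, vocabulary=REFERENCE, cutoff=0.8):
    """Return the most similar vocabulary entry, or None if none is close enough."""
    matches = difflib.get_close_matches(token.lower(), vocabulary, n=1, cutoff=cutoff)
    return matches[0] if matches else None


print(normalize("taileur"))    # a one-letter recognition error
print(normalize("boulanger"))  # not in the reference list
```

Returning None for weak matches keeps uncertain values visible instead of silently forcing them onto the wrong category.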

The layer was derived and compiled from the U.S. Census Bureau’s 2013 – 2017 American Community Survey (ACS) 5-Year Estimates to assist 2020 Census planning.
Source: U.S. Census Bureau, Table B04006 PEOPLE REPORTING ANCESTRY, 2013 – 2017 ACS 5-Year Estimates
Effective Date: December 2018
Last Update: December 2019
Update Cycle: ACS 5-Year Estimates update annually each December. Vintage used for 2020 Census planning purposes by Broward County.

Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
China Population Census: Family Size was reported at 2.620 persons for 12-01-2020, a decrease from 3.100 persons for 12-01-2010. The series is updated decennially, with a median of 3.960 persons across 7 observations from Dec 1953 to 12-01-2020. It reached an all-time high of 4.430 persons in 12-01-1964 and a record low of 2.620 persons in 12-01-2020. The series remains active in CEIC and is reported by the National Bureau of Statistics. It is categorized under China Premium Database’s Socio-Demographic – Table CN.GA: Population: National Population Census.

The Integrated Public Use Microdata Series (IPUMS) Complete Count Data include more than 650 million individual-level and 7.5 million household-level records. The microdata are the result of collaboration between IPUMS and the nation’s two largest genealogical organizations (Ancestry.com and FamilySearch) and provide the largest and richest source of individual-level and household-level data.
Historic data are scarce and often exist only in aggregate tables. The key advantage of historic US census data is the availability of individual- and household-level characteristics that researchers can tabulate in ways that benefit their specific research questions. The data contain demographic variables, economic variables, migration variables and family variables. Within households, it is possible to create relational data as all relations between household members are known. For example, having data on the mother and her children in a household enables researchers to calculate the mother’s age at the birth of each child. Another advantage of the Complete Count data is the possibility to follow individuals over time using a historical identifier.
In sum: the historic US census data are a unique source for research on social and economic change and can provide population health researchers with information about social and economic determinants.
The historic US 1910 census data were collected in April 1910. Enumerators collected data by traveling to households and counting the residents who regularly slept there. Individuals lacking permanent housing were counted as residents of the place where they were when the data were collected. Household members absent on the day of data collection were either listed with the household with the help of other household members or were scheduled for the last census subdivision.
This dataset was created on 2020-01-10 23:47:27.924 by merging multiple datasets together. The source datasets for this version were:
IPUMS 1910 households: The Integrated Public Use Microdata Series (IPUMS) Complete Count Data are historic individual and household census records and are a unique source for research on social and economic change.
IPUMS 1910 persons: This dataset includes all individuals from the 1910 US census.

ABS Census data extract - G08 ANCESTRY BY COUNTRY OF BIRTH OF PARENTS, providing a breakdown of population at LGA level and by:
ancestry (a)
birthplace not stated (b)
total responses (c)
other (d)
This data is based on place of usual residence.
(a) This list of ancestries consists of the most common 30 Ancestry responses reported in the 2016 and 2011 Census.
(b) Includes birthplace for either or both parents not stated.
(c) This table is a multi-response table and therefore the total responses count will not equal the total persons count.
(d) If two responses from one person are categorised in the 'Other' category only one response is counted. Includes ancestries not identified individually and 'Inadequately described'.
Please note that there are small random adjustments made to all cell values to protect the confidentiality of data. These adjustments may cause the sum of rows or columns to differ by small amounts from table totals.

Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Replication files (syntax) and data from: Intergenerational mobility in a mid-Atlantic economy: Canada, 1871-1901.

Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This training dataset includes a total of 34,913 manually transcribed text segments. It is dedicated to the handwritten text recognition (HTR) of historical sources, typically tabular records, such as censuses. This dataset is based on a sample of 83 pages from the 19th century (1805-1898) censuses of Lausanne, Switzerland. The primary language of the documents is French, although many Germanic names and toponyms are also found.
The training data are formatted and provided on the model of the Bentham dataset. The format thus simply consists of a list of JPEG images, one per text segment, and their corresponding transcriptions, stored in a txt file. The file naming convention is 'yyyy-ppp-n', where 'y' stands for the year of publication of the census, and 'p' for the page number.
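The naming convention can be unpacked with a small parser, sketched below. Two points are assumptions rather than statements from the dataset description: 'n' is treated as a numeric index of the segment within the page, and the example file name with its .jpg extension is invented.

```python
import re
from pathlib import PurePath

# 'yyyy-ppp-n': four-digit year, three-digit page, then 'n'
# (assumed here to be a numeric segment index).
PATTERN = re.compile(r"^(?P<year>\d{4})-(?P<page>\d{3})-(?P<n>\d+)$")


def parse_segment_name(filename):
    """Split a segment file name into (year, page, n) as integers."""
    stem = PurePath(filename).stem  # drop the directory and the extension
    match = PATTERN.match(stem)
    if match is None:
        raise ValueError(f"unexpected file name: {filename!r}")
    return int(match["year"]), int(match["page"]), int(match["n"])


print(parse_segment_name("1832-045-12.jpg"))
```

Raising an error on names that do not fit the pattern makes deviations from the convention visible early.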
The digitized documents are provided by the Archives of the City of Lausanne.
Please note that the annotation and extraction methodology, as well as the complete evaluation of performance, including the HTR benchmark and post-correction performance, is published in:
Petitpierre R., Rappo L., Kramer M. (2023). An end-to-end pipeline for historical censuses processing. International Journal on Document Analysis and Recognition (IJDAR). doi: 10.1007/s10032-023-00428-9
The tabular dataset resulting from automatic extraction is also available on Zenodo:
Petitpierre R., Rappo L., Kramer M., di Lenardo I. (2023). 1805-1898 Census Records of Lausanne : a Long Digital Dataset for Demographic History. Zenodo. doi: 10.5281/zenodo.7711640

Open Government Licence - Canada 2.0 https://open.canada.ca/en/open-government-licence-canada
License information was derived automatically
This table is part of a series of tables that present a portrait of Canada based on the various census topics. The tables range in complexity and levels of geography. Content varies from a simple overview of the country to complex cross-tabulations; the tables may also cover several censuses.

https://fred.stlouisfed.org/legal/#copyright-public-domain
Graph and download economic data for Mean Family Income in Northeast Census Region (MAFAINUSNEA646N) from 1967 to 2024 about Northeast Census Region, family, average, income, and USA.