The Integrated Public Use Microdata Series (IPUMS) Complete Count Data include more than 650 million individual-level and 7.5 million household-level records. The microdata are the result of collaboration between IPUMS and the nation’s two largest genealogical organizations, Ancestry.com and FamilySearch, and provide the largest and richest source of individual-level and household-level data.
All manuscripts (and other items you'd like to publish) must be submitted to
phsdatacore@stanford.edu for approval prior to journal submission.
We will check your cell sizes and citations.
For more information about how to cite PHS and PHS datasets, please visit:
https://phsdocs.developerhub.io/need-help/citing-phs-data-core
This dataset was created on 2020-01-10 22:52:11.461
by merging multiple datasets together. The source datasets for this version were:
IPUMS 1930 households: This dataset includes all households from the 1930 US census.
IPUMS 1930 persons: This dataset includes all individuals from the 1930 US census.
IPUMS 1930 Lookup: This dataset includes variable names, variable labels, variable values, and corresponding variable value labels for the IPUMS 1930 datasets.
Historic data are scarce and often exist only in aggregate tables. The key advantage of historic US census data is the availability of individual- and household-level characteristics that researchers can tabulate in ways that benefit their specific research questions. The data contain demographic variables, economic variables, migration variables and family variables. Within households, it is possible to create relational data as all relations between household members are known. For example, having data on the mother and her children in a household enables researchers to calculate the mother’s age at birth. Another advantage of the Complete Count data is the possibility to follow individuals over time using a historical identifier.
In sum: the historic US census data are a unique source for research on social and economic change and can provide population health researchers with information about social and economic determinants.
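The mother's-age-at-birth example can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it supposes each person row has a within-household person number (`PERNUM`) and a pointer to the mother's person number (`MOMLOC`, with 0 meaning no mother present). Neither field name is confirmed by this description, and the rows are made-up toy data.

```python
def mothers_age_at_birth(household):
    """Map each child's PERNUM to the mother's approximate age at that birth.

    Assumes (not confirmed by the dataset description) that each row has
    PERNUM, AGE, and a MOMLOC pointer to the mother's PERNUM (0 = none).
    """
    by_pernum = {person["PERNUM"]: person for person in household}
    ages = {}
    for child in household:
        mother = by_pernum.get(child.get("MOMLOC", 0))
        if mother is not None:
            # Census ages are completed years, so the result is approximate.
            ages[child["PERNUM"]] = mother["AGE"] - child["AGE"]
    return ages


# Toy household: two adults and two children of person 2.
toy_household = [
    {"PERNUM": 1, "AGE": 40, "MOMLOC": 0},
    {"PERNUM": 2, "AGE": 38, "MOMLOC": 0},
    {"PERNUM": 3, "AGE": 12, "MOMLOC": 2},
    {"PERNUM": 4, "AGE": 9, "MOMLOC": 2},
]
birth_ages = mothers_age_at_birth(toy_household)
```

If the extract carries only the relationship-to-head variable rather than an explicit mother pointer, the link would have to be inferred from household relationships first.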
The historic US 1930 census data were collected in April 1930. Enumerators collected data by traveling to households and counting the residents who regularly slept at the household. Individuals lacking permanent housing were counted as residents of the place where they were when the data were collected. Household members absent on the day of data collection were either listed with the household with the help of other household members or were scheduled for the last census subdivision.
Notes
We provide IPUMS household and person data separately so that it is convenient to explore the descriptive statistics on each level. To obtain a full dataset, merge the household and person data on the variables SERIAL and SERIALP. To create a longitudinal dataset, merge datasets on the variable HISTID.
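A minimal sketch of the two merges described above, in plain Python. It assumes SERIAL is the household key on the household file and SERIALP is the matching key on the person file. The toy rows and the `STATEFIP` column are illustrative only, and the function names are hypothetical.

```python
def merge_household_person(households, persons):
    """Attach household fields to every person row (many-to-one join)."""
    by_serial = {hh["SERIAL"]: hh for hh in households}
    merged = []
    for person in persons:
        hh = by_serial.get(person["SERIALP"])
        if hh is None:
            continue  # person row without a matching household record
        merged.append({**hh, **person})
    return merged


def link_by_histid(persons_year_a, persons_year_b):
    """Pair person rows from two census years that share a HISTID."""
    by_id = {p["HISTID"]: p for p in persons_year_b}
    return [(p, by_id[p["HISTID"]]) for p in persons_year_a if p["HISTID"] in by_id]


# Toy data; values are made up for illustration.
households = [{"SERIAL": 1, "STATEFIP": 6}, {"SERIAL": 2, "STATEFIP": 36}]
persons = [
    {"SERIALP": 1, "HISTID": "A-001", "AGE": 34},
    {"SERIALP": 1, "HISTID": "A-002", "AGE": 5},
    {"SERIALP": 2, "HISTID": "B-001", "AGE": 61},
]
full = merge_household_person(households, persons)
```

With a dataframe library, the first function corresponds to a left or inner join of the person table onto the household table, and the second to an inner join of two person tables on HISTID.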
Households with more than 60 people in the original data were broken up for processing purposes. Every person in these large households is considered to be in their own household. The original large households can be identified using the variable SPLIT and reconstructed using the variable SPLITHID; the original count is found in the variable SPLITNUM.
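One way to regroup the fragments of a split household is sketched below. It makes two assumptions beyond the note above: that a nonzero SPLIT value marks a row from a split household, and that SPLITNUM holds the number of person records in the original household. The toy rows are made up and the function names are hypothetical.

```python
from collections import defaultdict


def regroup_split_households(rows):
    """Group person rows from split households by their SPLITHID."""
    groups = defaultdict(list)
    for row in rows:
        if row["SPLIT"]:  # assumption: nonzero SPLIT marks a split household
            groups[row["SPLITHID"]].append(row)
    return dict(groups)


def size_mismatches(groups):
    """Return SPLITHIDs whose regrouped size disagrees with SPLITNUM."""
    return [
        hid
        for hid, members in groups.items()
        if any(m["SPLITNUM"] != len(members) for m in members)
    ]


# Toy rows: three fragments of one original household plus one ordinary row.
rows = [
    {"SERIALP": 10, "SPLIT": 1, "SPLITHID": 900, "SPLITNUM": 3},
    {"SERIALP": 11, "SPLIT": 1, "SPLITHID": 900, "SPLITNUM": 3},
    {"SERIALP": 12, "SPLIT": 1, "SPLITHID": 900, "SPLITNUM": 3},
    {"SERIALP": 13, "SPLIT": 0, "SPLITHID": 0, "SPLITNUM": 0},
]
groups = regroup_split_households(rows)
```

Comparing each regrouped size against SPLITNUM is a cheap check that no fragment was dropped by an earlier filter.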
Coded variables derived from string variables are still in progress. These variables include: occupation and industry.
Missing observations have been allocated and some inconsistencies have been edited for the following variables: SPEAKENG, YRIMMIG, CITIZEN, AGEMARR, AGE, BPL, MBPL, FBPL, LIT, SCHOOL, OWNERSHP, FARM, EMPSTAT, OCC1950, IND1950, MTONGUE, MARST, RACE, SEX, RELATE, CLASSWKR. The flag variables indicating an allocated observation for the associated variables can be included in your extract by clicking the ‘Select data quality flags’ box on the extract summary page.
Most inconsistent information was not edite
1825 Census of Lower Canada contains records from St. Jean, Deschaillons, Bécancour, Quebec, Canada by Ancestry.com. 1825 Census of Lower Canada [database on-line]. Provo, UT, USA: Ancestry.com Operations, Inc., 2014.; Original data: Canada, Lower Canada Census, 1825. Salt Lake City, Utah: FamilySearch, 2013. - Page: 426; Affiliate Publication Title: 1825 Lower Canada Census; Affiliate Publication Number: MG 31 C1; FHL Film Number: 2443957.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This spreadsheet is designed to be used in conjunction with the Integrated Census Microdata (I-CeM) collection of historic census data covering the period 1851 to 1921. For further details of the I-CeM data collection, please visit the comprehensive project website at:
https://www.campop.geog.cam.ac.uk/research/projects/icem/
Outline information on the I-CeM project is also provided on the README page of this spreadsheet.
This file is specifically related to the I-CeM data collection variable HHD
https://www.cognitivemarketresearch.com/privacy-policy
As per Cognitive Market Research's latest published report, the Global Genealogy Products and Services Market size will be USD 5,093.64 Million by 2028. Genealogy Products and Services Industry's Compound Annual Growth Rate will be 7.97% from 2023 to 2030.
The North America Genealogy Products and Services market size will be USD 2,008.93 Million by 2028.
Market Dynamics of Genealogy Products and Services
Key Drivers for Genealogy Products and Services
Growing Interest in Ancestry and Family History: Rising consumer interest in personal heritage, cultural origins, and ethnic backgrounds is driving the demand for genealogy kits, online family tree services, and archival data platforms.
Advancements in DNA Testing Technologies: The development of cost-effective and precise DNA testing technologies has transformed genealogy, facilitating easier access for consumers to genetic information that enhances traditional family research.
Increased Digitalization of Historical Records: Governments, religious institutions, and private companies are digitizing essential records (birth, marriage, death, census), broadening access for genealogists and boosting subscriptions to genealogy services.
Key Restraints for Genealogy Products and Services
Concerns Regarding Privacy and Data Security: The act of sharing genetic and personal information on the internet presents significant privacy challenges, which may deter potential users due to fears of misuse, data breaches, or insufficient control over their personal data.
Limited Access to Records in Specific Regions: The presence of historical conflicts, inadequate recordkeeping, and disjointed archives in certain nations complicates the process of tracing lineage, thereby diminishing the effectiveness and attractiveness of services on a global scale.
Costs Associated with Subscriptions and Testing: Despite a reduction in prices, the comprehensive DNA kits and premium family history subscriptions continue to pose a financial obstacle for numerous users, particularly in developing economies.
Key Trends for Genealogy Products and Services
Integration of Artificial Intelligence for Record Matching: Companies are leveraging AI and machine learning technologies to identify patterns, propose familial connections, and automatically construct family trees, thereby improving user experience and the precision of research.
Collaborations with Health and Wellness Providers: Genealogy services are progressively forming partnerships with health platforms, providing users with insights into genetic predispositions, nutrition based on ancestry, and wellness recommendations.
Mobile Applications and Research Tools for On-the-Go: There is an increasing trend towards mobile-optimized platforms, allowing users to investigate family trees, upload documents, and engage with relatives directly from their smartphones.
Introduction of Genealogy Products and Services
Genealogy is the study of families and their history: tracing lineages and gathering information about family members and ancestors. It draws on DNA testing, cemetery records, family tree creation, newspapers, online records, blogs, and links that provide access to databases of family information.
Various institutions and advanced mobile applications are used to find information about ancestors. The market is growing rapidly as it adopts emerging technologies.
Technological advances in genealogical research, and their effectiveness in uncovering information about ancestors, have gained popularity across the globe, driving the growth of the genealogy products and services market.
For instance, new technologies make research more cost-effective, which helps in tracing lineages and finding information about ancestors. Major companies are adopting DNA testing services and have merged genealogical research with genetic testing to obtain information about families. They maintain databases and online records with detailed information about ancestors, and they use modern applications such as Ancestry, electronic databases, and blogs that provide accurate data and a genetic representation of the family tree for use in genetic services.
There are various benefits such as genealogical data provides medical history of...
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
[ed. note: from https://www.census.gov/topics/population/genealogy/data/2000_surnames.html as of May 29, 2017. Has also been referenced as http://www.census.gov/genealogy/www/data/2000surnames/index.html]
NOTE: This presentation of data focuses on summarized aggregates of counts of surnames, and does not in any way identify specific individuals.
Tabulations of all surnames occurring 100 or more times in the Census 2000 returns are provided in the files listed below. The first link explains the methodology used for identifying and editing names data. The second link provides an Excel file of the top 1000 surnames. The third link provides zipped Excel and CSV (comma separated) files of the complete list of 151,671 names.
Related Files [Ed. note: the links point to the original location; all files are available in this archive as well]
Technical Documentation: Demographic Aspects of Surnames - Census 2000 <1.0MB
File A: Top 1000 Names <1.0MB
File B: Surnames Occurring 100 or more times <1.0MB
1960 Ancestry Census Data for Baltimore, Maryland. Refer to the 1960 codebook (codebook_1960.pdf) for more information. This is part of a collection of 221 Baltimore Ecosystem Study metadata records that point to a geodatabase. The geodatabase is available online and is considerably large. Upon request, and under certain arrangements, it can be shipped on media, such as a USB hard drive. The geodatabase is roughly 51.4 Gb in size, consisting of 4,914 files in 160 folders. Although this metadata record and the others like it are not rich with attributes, it is nonetheless made available because the data that it represents could indeed be useful.
https://www.thebusinessresearchcompany.com/privacy-policy
Global Genealogy Products And Services market size is expected to reach $7.94 billion by 2029 at 11.4%, segmented as by family records, birth records, marriage records, death records, census records, immigration and naturalization records
https://search.gesis.org/research_data/datasearch-httpwww-da-ra-deoaip--oaioai-da-ra-de445119
Abstract (en): This data collection provides a preliminary subsample of the 1880 Public Use Sample drawn from census enumeration forms. The file contains two types of records: family and person. Each household record is followed by a record for each person in the family. This collection contains information about size of family, number of persons and families in dwelling, and geographic location of each household. Information on individuals includes demographic characteristics, civil condition, occupation, health, education, and nativity.
Manuscript census records from 1880 for the 38 United States, the District of Columbia, and the Dakota Territory.
This collection is a nationally representative--although clustered--1 in 1000 preliminary subsample of the United States population in 1880. The subsample is based on every tenth microfilm reel of enumeration forms (there are a total of 1,454 reels) and, within each reel, on the census page itself. In terms of the Public Use Sample as a whole, a sample density of 1 person per 100 was chosen so that a single sample point was randomly generated for every two census pages. Sample points were chosen for inclusion in the collection only if the individual selected was the first person listed in the dwelling. Under this procedure each dwelling, family, and individual in the population had a 1 in 100 probability of inclusion in the Public Use Sample. The complete sample, which will be released by the principal investigators in December 1993, will contain approximately 500,000 individuals living in 100,000 families, or 1 percent of the United States population in 1880.
Funding institution(s): United States Department of Health and Human Services. National Institutes of Health (HD25839).
(1) This dataset has two levels. The first level ("F" Record Type) contains 29 variables for each of 10,126 families. The second level ("P" Record Type) contains 45 variables for each of 48,786 individuals residing in those families.
(2) The data contain blanks and alphabetic characters.
(3) Users will note some differences in code frequencies between certain variables in this collection and the totals listed in the documentation.
(4) This collection is superseded by CENSUS OF POPULATION, 1880 [UNITED STATES]: PUBLIC USE SAMPLE (ICPSR 6460).
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
NOTE: No specific individual information is given.
The Census Bureau receives numerous requests to supply information on name frequency. In an effort to comply with those requests, the Census Bureau has embarked on a names list project involving a tabulation of names from the 1990 Census. These files contain only the frequency of a given name, no specific individual information.
[ed. note: all links point to the original URL; all files are available in this repository]
Name List: Documentation and Methodology <1.0MB
Frequently Occurring Surnames from Census 1990 – Names Files
[ed. note: this content was originally on a separate webpage, at https://www.census.gov/topics/population/genealogy/data/1990_census/1990_census_namefiles.html]
Files
dist.all.last [<1.0MB]
dist.female.first [<1.0MB]
dist.male.first [<1.0MB]
Each of the three files (dist.all.last, dist.male.first, and dist.female.first) contains four items of data. The four items are:
A "Name"
Frequency in percent
Cumulative Frequency in percent
Rank
In the file (dist.all.last) one entry appears as:
MOORE 0.312 5.312 9
In our search area sample, MOORE ranks 9th in terms of frequency. 5.312 percent of the sample population is covered by MOORE and the 8 names occurring more frequently than MOORE. The surname, MOORE, is possessed by 0.312 percent of our population sample.
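A short sketch of reading one entry in the four-item layout described above (name, frequency in percent, cumulative frequency in percent, rank). It assumes the fields are whitespace-separated, as in the MOORE example; the `NameEntry` type and the function name are illustrative, not part of the Census Bureau files.

```python
from typing import NamedTuple


class NameEntry(NamedTuple):
    name: str
    freq_pct: float       # share of the sample with this name, in percent
    cum_freq_pct: float   # share covered by this and all more frequent names
    rank: int


def parse_name_line(line: str) -> NameEntry:
    """Parse one whitespace-separated record from a dist.* names file."""
    name, freq, cum_freq, rank = line.split()
    return NameEntry(name, float(freq), float(cum_freq), int(rank))


entry = parse_name_line("MOORE 0.312 5.312 9")
# Share covered by the names that rank above this one, in percent.
higher_ranked_share = entry.cum_freq_pct - entry.freq_pct
```

For the MOORE entry, the difference between the cumulative and individual frequencies is the share held by the 8 more frequent surnames, about 5.0 percent.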
This dataset includes all individuals from the 1860 US census.
All manuscripts (and other items you'd like to publish) must be submitted to
phsdatacore@stanford.edu for approval prior to journal submission.
We will check your cell sizes and citations.
For more information about how to cite PHS and PHS datasets, please visit:
https://phsdocs.developerhub.io/need-help/citing-phs-data-core
This dataset was developed through a collaboration between the Minnesota Population Center and the Church of Jesus Christ of Latter-Day Saints. The data contain demographic variables, economic variables, migration variables and race variables. Unlike more recent census datasets, pre-1900 census datasets only contain individual level characteristics and no household or family characteristics, but household and family identifiers do exist.
The official enumeration day of the 1860 census was 1 June 1860. The main goal of an early census like the 1860 U.S. census was to allow Congress to determine the collection of taxes and the apportionment of seats in the House of Representatives. Each district was assigned a U.S. Marshal who organized other marshals to administer the census. These enumerators visited households and recorded the names of every person, along with their age, sex, color, profession, occupation, value of real estate, place of birth, parental foreign birth, marriage, literacy, and whether deaf, dumb, blind, insane or “idiotic”.
Sources:
Szucs, L.D. and Hargreaves Luebking, S. (1997). Research in Census Records, The Source: A Guidebook of American Genealogy. Ancestry Incorporated, Salt Lake City, UT.
Dollarhide, W. (2000). The Census Book: A Genealogist’s Guide to Federal Census Facts, Schedules and Indexes. Heritage Quest, Bountiful, UT.
Ireland Census contains records from Scalp, Peterswell, County Galway, Ireland by Class: RG14; Census of Ireland 1901/1911. The National Archives of Ireland. http://www.census.nationalarchives.ie/search/: accessed 31 May 2013; Ancestry.com. Web: Ireland, Census, 1911 [database on-line]. Provo, UT, USA: Ancestry.com Operations, Inc., 2013. - .
CCRI Selected Published Tables Data Files: For each census from 1911-1951, a series of published volumes and tables was produced by the Dominion of Canada’s statistical agency. From those published books, the CCRI made a selection of 23 tables which contain information regarding particular topics such as: population (male and female counts), number of dwellings, households and families, as well as religion and origin of the people. For 1951, selected tables from published volumes (1 & 3) included:
• Population by census subdivisions, 1871-1951
• Population by sex for census subdivisions, 1951
• Population by origin and sex, for counties and census divisions, 1951
• Population by specified religious denominations, for census subdivisions, 1951
• Households by number of persons and average number of persons per household, for counties and census divisions, rural farm, rural non-farm, and urban, 1951
• Occupied dwellings by tenure, for counties and census divisions, rural farm, rural non-farm, and urban, 1951
• Occupied dwellings by tenure showing type of dwelling, for counties and census divisions, 1951
This Special Licence access dataset contains names and addresses from the Integrated Census Microdata (I-CeM) dataset of the censuses of Great Britain for the period 1851 to 1911. These data are made available under Special Licence (SL) access conditions due to commercial sensitivity.
The anonymised main I-CeM database that complements these names and addresses is available under SN 7481. It comprises the Censuses of Great Britain for the period 1851-1911; data are available for England and Wales for 1851-1861 and 1881-1911 (1871 is not currently available for England and Wales) and for Scotland for 1851-1901 (1911 is not currently available for Scotland). The database contains over 180 million individual census records and was digitised and harmonised from the original census enumeration books. It details characteristics for all individuals resident in Great Britain at each of the included Censuses. The original digital data has been coded and standardised; the I-CeM database has consistent geography over time and standardised coding schemes for many census variables.
This dataset of names and addresses for individual census records is organised per country (England and Wales; Scotland) and per census year. Within each data file each census record contains first and last name, street address and an individual identification code (RecID) that allows linking with the corresponding anonymised I-CeM record. The data cannot be used for true linking of individual census records across census years for commercial genealogy purposes nor for any other commercial purposes. The SL arrangements are required to ensure that commercial sensitivity is protected. For information on making an application, see the Access section.
The data were updated in February 2020, with some files redeposited with longer field length limits. Users should note that some name and address fields are truncated due to the limits set by the LDS project that transcribed the original data. No more than 10,000 records out of some 210 million across the study should be affected. Examples include:
Further information about I-CeM can be found on the I-CeM Integrated Microdata Project and I-CeM Guide webpages.
This User Guide contains information about the 2021 Census NSPL including: directory content; data currency; the methodology for assigning areas to postcodes; data formats; data quality and limitations and details of recent changes that have impacted on the data. Various annexes and tables provide more detailed supporting information. (File size - 620 KB)
https://search.gesis.org/research_data/datasearch-httpsoai-datacite-orgoai--doi10-5255ukda-sn-7427-2
The aggregate data produced as outputs from censuses in the United Kingdom provide information on a wide range of demographic and socio-economic characteristics. They are predominantly a collection of aggregated, or summary counts of the numbers of people, families or households resident in specific geographical areas possessing particular characteristics drawn from the themes of population, people and places, families, ethnicity and religion, health, work, and housing.
Aggregate data for Census 2011 cover the full range of geographies employed within the census, from the smallest (output areas with an average of 150 persons in England and Wales) to the nation as a whole.
• Access data through InFuse
• Census aggregate data guide
Citation: Office for National Statistics. (2019). 2011 Census: Aggregate Data. [data collection]. UK Data Service. SN: 7427, http://doi.org/10.5257/census/aggregate-2011-2
The UK censuses took place on 27 March 2011. They were run by the Northern Ireland Statistics & Research Agency (NISRA), National Records of Scotland (NRS), and the Office for National Statistics (ONS) for both England and Wales. The UK comprises the countries of England, Wales, Scotland and Northern Ireland.
Statistics from the UK censuses help paint a picture of the nation and how we live. They provide a detailed snapshot of the population and its characteristics, and underpin funding allocation to provide public services. This is the home for all UK census data.
The Survey of Consumer Finances (SCF) is conducted annually to obtain work experience and income information from Canadian households. The Survey provides up-to-date information on the distribution and sources of income, before and after taxes, for families and individuals. With this file, users may identify specific family types, such as two-parent and lone-parent families. Information is also provided on earnings, transfers, and total income for the head and the spouse of the census family unit, as well as personal and labour-related characteristics. The reference year for this file is 1986. Commencing with the 1998 microdata files, annual cross-sectional income data will be sourced from the Survey of Labour and Income Dynamics (SLID).
The Survey of Consumer Finances (SCF) is conducted annually to obtain work experience and income information from Canadian households. The Survey provides up-to-date information on the distribution and sources of income, before and after taxes, for families and individuals. With this file, users may identify specific family types, such as two-parent and lone-parent families. Information is also provided on earnings, transfers, and total income for the head and the spouse of the census family unit, as well as personal and labour-related characteristics. The reference year for this file is 1996. Commencing with the 1998 microdata files, annual cross-sectional income data will be sourced from the Survey of Labour and Income Dynamics (SLID).
The Survey of Labour and Income Dynamics (SLID) complements traditional survey data on labour market activity and income with an additional dimension: the changes experienced by individuals over time. At the heart of the survey's objectives is the understanding of the economic well-being of Canadians: what economic shifts do individuals and families live through, and how does it vary with changes in their paid work, family make-up, receipt of government transfers or other factors? The survey's longitudinal dimension makes it possible to see such concurrent and often related events. SLID is the first Canadian household survey to provide national data on the fluctuations in income that a typical family or individual experiences over time which gives greater insight on the nature and extent of poverty in Canada. Added to the longitudinal aspect are the "traditional" cross-sectional data: the primary Canadian source for income data and providing additional content to data collected by the Labour Force Survey (LFS). Particularly in SLID, the focus extends from static measures (cross-sectional) to the whole range of transitions, durations, and repeat occurrences (longitudinal) of people's financial and work situations. Since their family situation, education, and demographic background may play a role, the survey has extensive information on these topics as well.