Website allows the public full access to the 1940 Census images, census maps, and descriptions.
This dataset includes variable names, variable labels, variable values, and corresponding variable value labels for the IPUMS 1900 datasets.
The Integrated Public Use Microdata Series (IPUMS) Complete Count Data include more than 650 million individual-level and 7.5 million household-level records. The microdata are the result of collaboration between IPUMS and the nation’s two largest genealogical organizations, Ancestry.com and FamilySearch, and provide the largest and richest source of individual-level and household-level data.
All manuscripts (and other items you'd like to publish) must be submitted to
phsdatacore@stanford.edu for approval prior to journal submission.
We will check your cell sizes and citations.
For more information about how to cite PHS and PHS datasets, please visit:
https://phsdocs.developerhub.io/need-help/citing-phs-data-core
Historic data are scarce and often exist only in aggregate tables. The key advantage of historic US census data is the availability of individual- and household-level characteristics that researchers can tabulate in ways that benefit their specific research questions. The data contain demographic variables, economic variables, migration variables, and family variables. Within households, it is possible to create relational data because all relations between household members are known. For example, having data on the mother and her children in a household enables researchers to calculate the mother’s age at birth. Another advantage of the Complete Count data is the possibility of following individuals over time using a historical identifier.
In sum: the historic US census data are a unique source for research on social and economic change and can provide population health researchers with information about social and economic determinants.
The historic US 1920 census data were collected in January 1920. Enumerators collected data by traveling to households and counting the residents who regularly slept at the household. Individuals lacking permanent housing were counted as residents of the place where they were when the data were collected. Household members absent on the day of data collection were either listed with the household with the help of other household members or were scheduled for the last census subdivision.
Notes
We provide household and person data separately so that it is convenient to explore the descriptive statistics on each level. To obtain a full dataset, merge the household and person datasets on the variables SERIAL and SERIALP. To create a longitudinal dataset, merge datasets on the variable HISTID.
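The two merges described above can be sketched in plain Python. This is a minimal illustration, assuming that SERIAL is the household key in the household file and SERIALP is the matching key in the person file, and that rows have already been loaded as dictionaries; the function names are hypothetical and not part of any IPUMS tooling.

```python
def merge_households_persons(households, persons):
    """Inner join: attach household fields to each person row (SERIAL == SERIALP)."""
    by_serial = {h["SERIAL"]: h for h in households}
    merged = []
    for p in persons:
        h = by_serial.get(p["SERIALP"])
        if h is not None:
            # Person fields win if a column name appears in both files.
            merged.append({**h, **p})
    return merged


def link_across_years(persons_a, persons_b):
    """Pair person rows from two census years that share the same HISTID."""
    by_histid = {p["HISTID"]: p for p in persons_b}
    return [
        (p, by_histid[p["HISTID"]])
        for p in persons_a
        if p["HISTID"] in by_histid
    ]
```

For files of the size described here, a dataframe or database join would be the more practical choice; the dictionaries above only show the join logic.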
Households with more than 60 people in the original data were broken up for processing purposes. Every person in these large households is considered to be in their own household. The original large households can be identified using the variable SPLIT, reconstructed using the variable SPLITHID, and the original count is found in the variable SPLITNUM.
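Reassembling the split households can be sketched as follows. The coding of these variables is an assumption here: the sketch treats any truthy SPLIT value as marking a person from a split household, treats SPLITHID as an identifier shared by everyone from the same original household, and treats SPLITNUM as the original person count. Check the variable documentation before relying on this. The function names are hypothetical.

```python
from collections import defaultdict


def rebuild_split_households(rows):
    """Group person rows back into their original large households."""
    groups = defaultdict(list)
    for row in rows:
        if row["SPLIT"]:  # assumed: truthy marks a member of a split household
            groups[row["SPLITHID"]].append(row)
    return dict(groups)


def incomplete_groups(groups):
    """Return SPLITHID values whose member count differs from SPLITNUM."""
    return [
        hid
        for hid, members in groups.items()
        if len(members) != members[0]["SPLITNUM"]
    ]
```

Comparing each rebuilt group against SPLITNUM is a cheap consistency check, for example when working with an extract that does not contain every person.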
Coded variables derived from string variables are still in progress. These variables include: occupation and industry.
Missing observations have been allocated and some inconsistencies have been edited for the following variables: SPEAKENG, YRIMMIG, CITIZEN, AGE, BPL, MBPL, FBPL, LIT, SCHOOL, OWNERSHP, MORTGAGE, FARM, CLASSWKR, OCC1950, IND1950, MARST, RACE, SEX, RELATE, MTONGUE. The flag variables indicating an allocated observation for the associated variables can be included in your extract by clicking the ‘Select data quality flags’ box on the extract summary page.
Most inconsistent information was not edited for this release, so there are observations outside of the universe for some variables. In particular, the variables GQ and GQTYPE have known inconsistencies and will be improved with the next release.
This dataset was created on 2020-01-10 18:46:34.647 by merging multiple datasets together. The source datasets for this version were:
IPUMS 1920 households: This dataset includes all households from the 1920 US census.
IPUMS 1920 persons: This dataset includes all individuals from the 1920 US census.
IPUMS 1920 Lookup: This dataset includes variable names, variable labels, variable values, and corresponding variable value labels for the IPUMS 1920 datasets.
The Bureau of the Census has released Census 2000 Summary File 1 (SF1) 100-Percent data. The file includes the following population items: sex, age, race, Hispanic or Latino origin, household relationship, and household and family characteristics. Housing items include occupancy status and tenure (whether the unit is owner or renter occupied). SF1 does not include information on incomes, poverty status, overcrowded housing, or age of housing. These topics will be covered in Summary File 3. Data are available for states, counties, county subdivisions, places, census tracts, block groups, and, where applicable, American Indian and Alaskan Native Areas and Hawaiian Home Lands. The SF1 data are available on the Bureau's web site and may be retrieved from American FactFinder as tables, lists, or maps. Users may also download a set of compressed ASCII files for each state via the Bureau's FTP server. There are over 8000 data items available for each geographic area. The full listing of these data items is available here as a downloadable compressed database file named TABLES.ZIP. The uncompressed file is in FoxPro database file (dbf) format and may be imported to ACCESS, EXCEL, and other software formats. While all of this information is useful, the Office of Community Planning and Development has downloaded selected information for all states and areas and is making this information available on the CPD web pages. The tables and data items selected are those used in the CDBG and HOME allocation formulas, plus topics most pertinent to the Comprehensive Housing Affordability Strategy (CHAS), the Consolidated Plan, and similar overall economic and community development plans. The information is contained in five compressed (zipped) dbf tables for each state. When uncompressed, the tables are ready for use with FoxPro, and they can be imported into ACCESS, EXCEL, and other spreadsheet, GIS, and database software. The data are at the block group summary level.
The first two characters of the file name are the state abbreviation. The next two letters are BG for block group. Each record is labeled with the code and name of the city and county in which it is located so that the data can be summarized to higher-level geography. The last part of the file name describes the contents. The GEO file contains standard Census Bureau geographic identifiers for each block group, such as the metropolitan area code and congressional district code. The only data included in this table are total population and total housing units. POP1 and POP2 contain selected population variables, and selected housing items are in the HU file. The MA05 table is only for use by State CDBG grantees for reporting the racial composition of beneficiaries of Area Benefit activities. The complete package for a state consists of the dictionary file named TABLES and the five data files for the state. The logical record number (LOGRECNO) links the records across tables.
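The naming convention above can be sketched as a small parser. This is an illustration only: it assumes the three parts are concatenated without separators and that the file carries an extension such as .dbf, neither of which the description states. The example name ORBGPOP1.dbf is hypothetical.

```python
import os

# Content codes named in the description above.
CONTENT_CODES = {"GEO", "POP1", "POP2", "HU", "MA05"}


def parse_cpd_filename(filename):
    """Split a file name into state abbreviation, summary level, and contents."""
    stem = os.path.splitext(os.path.basename(filename))[0].upper()
    state, level, contents = stem[:2], stem[2:4], stem[4:]
    if level != "BG" or contents not in CONTENT_CODES:
        raise ValueError(f"unexpected file name: {filename!r}")
    return {"state": state, "level": level, "contents": contents}
```

Once the five tables for a state are loaded, LOGRECNO is the key to join them on, as the description notes.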
https://search.gesis.org/research_data/datasearch-httpwww-da-ra-deoaip--oaioai-da-ra-de445718
Abstract (en): This data collection contains FIPS codes for state, county, county subdivision, and place, along with the 1990 Census tract number for each side of the street for the urban cores of 550 counties in the United States. Street names, including prefix and/or suffix direction (north, southeast, etc.) and street type (avenue, lane, etc.) are provided, as well as the address range for that portion of the street located within a particular Census tract and the corresponding Census tract number. The FIPS county subdivision and place codes can be used to determine the correct Census tract number when streets with identical names and ranges exist in different parts of the same county. Contiguous block segments that have consecutive address ranges along a street and that have the same geographic codes (state, county, Census tract, county subdivision, and place) have been collapsed together and are represented by a single record with a single address range. 2006-01-12 All files were removed from dataset 551 and flagged as study-level files, so that they will accompany all downloads. (1) Due to the number of files in this collection, parts have been eliminated here. For a complete list of individual part names designated by state and county, consult the ICPSR Website. (2) There are two types of records in this collection, distinguished by the first character of each record. A "0" indicates a street name/address range record that can be used to find the Census tract number and other geographic codes from a street name and address number. A "2" indicates a geographic code/name record that can be used to find the name of the state, county, county subdivision, and/or place from the FIPS code. The "0" records contain 18 variables and the "2" records contain 10 variables.
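Separating the two record types by their first character can be sketched as below. The sketch deliberately stops at routing whole lines, because the column layout of the 18-variable and 10-variable records is not given here; the function name is hypothetical.

```python
def split_by_record_type(lines):
    """Route raw lines into street-range ("0") and geographic-name ("2") records."""
    street_records, geo_records = [], []
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue  # skip blank lines
        if line[0] == "0":
            street_records.append(line)
        elif line[0] == "2":
            geo_records.append(line)
        else:
            raise ValueError(f"unknown record type: {line[0]!r}")
    return street_records, geo_records
```

The field positions for each record type would come from the collection's codebook.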
The 1940 Census population schedules were created by the Bureau of the Census in an attempt to enumerate every person living in the United States on April 1, 1940, although some persons were missed. The 1940 census population schedules were digitized by the National Archives and Records Administration (NARA) and released publicly on April 2, 2012. The 1940 Census enumeration district maps contain maps of counties, cities, and other minor civil divisions that show enumeration districts, census tracts, and related boundaries and numbers used for each census. The coverage is nationwide and includes territorial areas. The 1940 Census enumeration district descriptions contain written descriptions of census districts, subdivisions, and enumeration districts.
https://www.icpsr.umich.edu/web/ICPSR/studies/2863/terms
The objective of this data collection was to examine inequalities of wealth and the geographic distribution of wealthy individuals in late 18th- and early 19th-century New York and to investigate wealth in relationship to occupation and location. For this study, the entire set of tax assessment records and United States Census records for New York City were computerized and occupational status was added for all entries. The collection addresses topics such as social class structure, demographic factors, occupational status and geographic distribution, property values and geographic distribution, and the relationship of these factors to the political system. Units of analysis were individual property owners and renters for the tax assessment data and heads of households for the census data. Data collected included the individual's name, address, occupation, sex, and race, the type, quantity, and value of real and personal property, and the type and occupancy of the structure at the address. Occupational data from city directories were used to supplement the tax and census data.
This data layer is an element of the Oregon GIS Framework. The TIGER/Line shapefiles and related database files (.dbf) are an extract of selected geographic and cartographic information from the U.S. Census Bureau's Master Address File / Topologically Integrated Geographic Encoding and Referencing (MAF/TIGER) Database (MTDB). The MTDB represents a seamless national file with no overlaps or gaps between parts; however, each TIGER/Line shapefile is designed to stand alone as an independent data set, or they can be combined to cover the entire nation. Census tracts are small, relatively permanent statistical subdivisions of a county or equivalent entity, and were defined by local participants as part of the 2020 Census Participant Statistical Areas Program. The Census Bureau delineated the census tracts in situations where no local participant existed or where all the potential participants declined to participate. The primary purpose of census tracts is to provide a stable set of geographic units for the presentation of census data and comparison back to previous decennial censuses. Census tracts generally have a population size between 1,200 and 8,000 people, with an optimum size of 4,000 people. When first delineated, census tracts were designed to be homogeneous with respect to population characteristics, economic status, and living conditions. The spatial size of census tracts varies widely depending on the density of settlement. Physical changes in street patterns caused by highway construction, new development, and so forth may require boundary revisions. In addition, census tracts occasionally are split due to population growth or combined as a result of substantial population decline. Census tract boundaries generally follow visible and identifiable features.
They may follow legal boundaries such as minor civil division (MCD) or incorporated place boundaries in some States and situations to allow for census tract-to-governmental unit relationships where the governmental boundaries tend to remain unchanged between censuses. State and county boundaries always are census tract boundaries in the standard census geographic hierarchy. In a few rare instances, a census tract may consist of noncontiguous areas. These noncontiguous areas may occur where the census tracts are coextensive with all or parts of legal entities that are themselves noncontiguous. For the 2010 Census and beyond, the census tract code range of 9400 through 9499 was enforced for census tracts that include a majority American Indian population according to Census 2000 data and/or their area was primarily covered by federally recognized American Indian reservations and/or off-reservation trust lands; the code range 9800 through 9899 was enforced for those census tracts that contained little or no population and represented a relatively large special land use area such as a National Park, military installation, or a business/industrial park; and the code range 9900 through 9998 was enforced for those census tracts that contained only water area, no land area.
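The reserved code ranges described above can be expressed as a small classifier. This sketch assumes the input is the four-digit basic tract number as an integer; how that number is stored in a given file (for example, as part of a longer tract code string) is not covered here. The function name and category labels are hypothetical.

```python
def classify_tract_code(base_code):
    """Map a four-digit basic tract number to its reserved-range category."""
    if 9400 <= base_code <= 9499:
        return "american_indian_area"
    if 9800 <= base_code <= 9899:
        return "special_land_use"
    if 9900 <= base_code <= 9998:
        return "water_only"
    return "standard"
```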
https://www.icpsr.umich.edu/web/ICPSR/studies/6836/terms
This data collection constitutes a portion of the historical data collected by the project "Early Indicators of Later Work Levels, Disease, and Death." With the goal of constructing datasets suitable for longitudinal analyses of factors affecting the aging process, the project is collecting military, medical, and socioeconomical data on a sample of white males mustered into the Union Army during the Civil War. The project seeks to examine the influence of environmental and host factors prior to recruitment on the health performance and survival of recruits during military service, to identify and show relationships between socioeconomic and biomedical conditions (including nutritional status) of veterans at early ages and mortality rates from diseases at middle and late ages, and to study the effects of health and pensions on labor force participation rates of veterans at ages 65 and over. This installment of the collection, Version C-3, supersedes all previous collections (Versions C-1 and C-2), and contains data from the censuses of 1850, 1860, 1900, and 1910 on veterans who were originally mustered into the Union Army in Connecticut, Delaware, District of Columbia, Illinois, Iowa, Kansas, Kentucky, Maine, Maryland, Massachusetts, Michigan, Minnesota, Missouri, New Hampshire, New Jersey, New York, Ohio, Pennsylvania, Vermont, and West Virginia. This version of the collection also contains observations from Wisconsin, Indiana, California, and New Mexico. Census Data, Part 1, includes place of residence, relationship to head of household, date and place of birth, number of children, education, disability status, employment status, number of years in the United States, literacy, marital status, occupation, parents' birthplace, and property/home ownership. The variables in Part 2, Linkage Data, indicate which document sources were located for each recruit.
https://www.icpsr.umich.edu/web/ICPSR/studies/7756/terms
This collection contains individual-level and 1-percent national sample data from the 1960 Census of Population and Housing conducted by the Census Bureau. It consists of a representative sample of the records from the 1960 sample questionnaires. The data are stored in 30 separate files, containing in total over two million records, organized by state. Some files contain the sampled records of several states while other files contain all or part of the sample for a single state. There are two types of records stored in the data files: one for households and one for persons. Each household record is followed by a variable number of person records, one for each of the household members. Data items in this collection include the individual responses to the basic social, demographic, and economic questions asked of the population in the 1960 Census of Population and Housing. Data are provided on household characteristics and features such as the number of persons in household, number of rooms and bedrooms, and the availability of hot and cold piped water, flush toilet, bathtub or shower, sewage disposal, and plumbing facilities. Additional information is provided on tenure, gross rent, year the housing structure was built, and value and location of the structure, as well as the presence of air conditioners, radio, telephone, and television in the house, and ownership of an automobile. Other demographic variables provide information on age, sex, marital status, race, place of birth, nationality, education, occupation, employment status, income, and veteran status. The data files were obtained by ICPSR from the Center for Social Analysis, Columbia University.
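Reading a file in which each household record is followed by its person records can be sketched as below. The way a household record is told apart from a person record is not stated above, so the sketch takes that test as a caller-supplied function; both the function name and the `is_household` parameter are hypothetical.

```python
def group_hierarchical(records, is_household):
    """Group a flat record stream into (household, [persons]) pairs."""
    households = []
    for record in records:
        if is_household(record):
            households.append((record, []))
        elif households:
            # A person record belongs to the most recent household record.
            households[-1][1].append(record)
        else:
            raise ValueError("person record found before any household record")
    return households
```

For example, with records whose first character marks the type (an assumption for illustration), `group_hierarchical(lines, lambda r: r.startswith("H"))` would return one pair per household.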
2020 Census Tract to MCD lookup table
https://www.icpsr.umich.edu/web/ICPSR/studies/4344/terms
The data comprising the Puerto Rico Census Project, 1920 contain individual and household records drawn from the 1920 Puerto Rican Population Census. The data include variables containing basic demographic information such as age, sex, race, marital status, number of children born and surviving, family size, place of birth, immigration status, county and neighborhood of residence, urban/rural status, and citizenship. The data also describe language proficiency, literacy, school attendance, and disabilities (blind or deaf) of the individuals. Other variables provide data on occupation, industry, ownership of residence, status of mortgage, and farm ownership. There are four classifications of variables belonging to this dataset: original input variables, coded variables, constructed variables, and quality flag variables. The original input variables contain the raw data collected by the enumerators. The coded variables are variables that were recoded by the University of Wisconsin Survey Center (UWSC) as part of the Puerto Rico Census Project. Constructed variables were produced by UWSC to capture additional relevant information. For example, one constructed variable measures literacy by combining separate variables containing data on whether the individual could read and if they could write. Finally, quality flag variables were created by UWSC to indicate whether it could be logically deduced that individual records had been hand edited by the Census Office.
The TIGER/Line shapefiles and related database files (.dbf) are an extract of selected geographic and cartographic information from the U.S. Census Bureau's Master Address File / Topologically Integrated Geographic Encoding and Referencing (MAF/TIGER) Database (MTDB). The MTDB represents a seamless national file with no overlaps or gaps between parts; however, each TIGER/Line shapefile is designed to stand alone as an independent data set, or they can be combined to cover the entire nation. The Address Ranges Feature Shapefile (ADDRFEAT.dbf) contains the geospatial edge geometry and attributes of all unsuppressed address ranges for a county or county equivalent area. The term "address range" refers to the collection of all possible structure numbers from the first structure number to the last structure number and all numbers of a specified parity in between along an edge side relative to the direction in which the edge is coded. Single-address address ranges have been suppressed to maintain the confidentiality of the addresses they describe. Multiple coincident address range feature edge records are represented in the shapefile if more than one left or right address range is associated with the edge. The ADDRFEAT shapefile contains a record for each address range to street name combination. Address ranges associated with more than one street name are also represented by multiple coincident address range feature edge records. Note that the ADDRFEAT shapefile includes all unsuppressed address ranges, whereas the All Lines Shapefile (EDGES.shp) includes only the most inclusive address range associated with each side of a street edge. The TIGER/Line shapefiles contain potential address ranges, not individual addresses. The address ranges in the TIGER/Line Files are potential ranges that include the full range of possible structure numbers even though the actual structures may not exist.
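The definition of an address range above can be illustrated by enumerating the potential structure numbers it implies. This is a sketch under simplifying assumptions: structure numbers are purely numeric, and both endpoints share the same parity (ranges that mix odd and even numbers are not handled). The function name is hypothetical.

```python
def potential_addresses(from_number, to_number):
    """List every potential structure number in a single-parity address range.

    The order follows the direction of the range, so a descending range
    yields descending numbers.
    """
    if from_number % 2 != to_number % 2:
        raise ValueError("endpoints must share the same parity")
    step = 2 if to_number >= from_number else -2
    stop = to_number + (1 if step > 0 else -1)
    return list(range(from_number, stop, step))
```

As the description stresses, these are potential numbers only; a structure need not exist at each one.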
This Special Licence access dataset contains names and addresses from the Integrated Census Microdata (I-CeM) dataset of the censuses of Great Britain for the period 1851 to 1911. These data are made available under Special Licence (SL) access conditions due to commercial sensitivity.
The anonymised main I-CeM database that complements these names and addresses is available under SN 7481. It comprises the Censuses of Great Britain for the period 1851-1911; data are available for England and Wales for 1851-1861 and 1881-1911 (1871 is not currently available for England and Wales) and for Scotland for 1851-1901 (1911 is not currently available for Scotland). The database contains over 180 million individual census records and was digitised and harmonised from the original census enumeration books. It details characteristics for all individuals resident in Great Britain at each of the included Censuses. The original digital data has been coded and standardised; the I-CeM database has consistent geography over time and standardised coding schemes for many census variables.
This dataset of names and addresses for individual census records is organised per country (England and Wales; Scotland) and per census year. Within each data file each census record contains first and last name, street address and an individual identification code (RecID) that allows linking with the corresponding anonymised I-CeM record. The data cannot be used for true linking of individual census records across census years for commercial genealogy purposes nor for any other commercial purposes. The SL arrangements are required to ensure that commercial sensitivity is protected. For information on making an application, see the Access section.
The data were updated in February 2020, with some files redeposited with longer field length limits. Users should note that some name and address fields are truncated due to the limits set by the LDS project that transcribed the original data. No more than 10,000 records out of some 210 million across the study should be affected. Examples include:
Further information about I-CeM can be found on the I-CeM Integrated Microdata Project and I-CeM Guide webpages.
The TIGER/Line shapefiles and related database files (.dbf) are an extract of selected geographic and cartographic information from the U.S. Census Bureau's Master Address File / Topologically Integrated Geographic Encoding and Referencing (MAF/TIGER) System (MTS). The MTS represents a seamless national file with no overlaps or gaps between parts; however, each TIGER/Line shapefile is designed to stand alone as an independent data set, or they can be combined to cover the entire nation. The Address Range Features shapefile contains the geospatial edge geometry and attributes of all unsuppressed address ranges for a county or county equivalent area. The term "address range" refers to the collection of all possible structure numbers from the first structure number to the last structure number and all numbers of a specified parity in between along an edge side relative to the direction in which the edge is coded. Single-address address ranges have been suppressed to maintain the confidentiality of the addresses they describe. Multiple coincident address range feature edge records are represented in the shapefile if more than one left or right address range is associated with the edge. This shapefile contains a record for each address range to street name combination. Address ranges associated with more than one street name are also represented by multiple coincident address range feature edge records. Note that this shapefile includes all unsuppressed address ranges, whereas the All Lines shapefile (edges.shp) includes only the most inclusive address range associated with each side of a street edge. The TIGER/Line shapefiles contain potential address ranges, not individual addresses. The address ranges in the TIGER/Line shapefiles are potential ranges that include the full range of possible structure numbers even though the actual structures may not exist.
MIT License (https://opensource.org/licenses/MIT)
License information was derived automatically
Dataset Summary: The CENSUS-NER-Name-Email-Address-Phone dataset is a processed and structured version of the FMCSA (Federal Motor Carrier Safety Administration) CENSUS1 2016Sep dataset. It is designed to assist in training language models for tasks such as Named Entity Recognition (NER), address parsing, and information extraction from unstructured text. The dataset contains records that include information such as name, email, phone number, and address, extracted from the original dataset and… See the full description on the dataset page: https://huggingface.co/datasets/Josephgflowers/CENSUS-NER-Name-Email-Address-Phone.
https://www.icpsr.umich.edu/web/ICPSR/studies/8236/terms
The 1940 Census Public Use Microdata Sample Project was assembled through a collaborative effort between the United States Bureau of the Census and the Center for Demography and Ecology at the University of Wisconsin. The collection contains a stratified 1-percent sample of households, with separate records for each household, for each "sample line" respondent, and for each person in the household. These records were encoded from microfilm copies of original handwritten enumeration schedules from the 1940 Census of Population. Geographic identification of the location of the sampled households includes Census regions and divisions, states (except Alaska and Hawaii), standard metropolitan areas (SMAs), and state economic areas (SEAs). Accompanying the data collection is a codebook that includes an abstract, descriptions of sample design, processing procedures and file structure, a data dictionary (record layout), category code lists, and a glossary. Also included is a procedural history of the 1940 Census. Each of the 20 subsamples contains three record types: household, sample line, and person. Household variables describe the location and condition of the household. The sample line records contain variables describing demographic characteristics such as nativity, marital status, number of children, veteran status, wage deductions for Social Security, and occupation. Person records also contain variables describing demographic characteristics including nativity, marital status, family membership, education, employment status, income, and occupation.
MIT License (https://opensource.org/licenses/MIT)
License information was derived automatically
This application was put together at the request of the Emergency Rental Assistance Program team. They wanted an easy way to quickly look up whether an address falls within a low to moderate income census tract. These data are provided by HUD, and more information can be found here. The layer used is a nationwide data set filtered to census tracts in Orange County, Florida with a low to moderate income population above 50%.