CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
The list includes 4,250 first names and information on their respective count and proportions across six mutually exclusive racial and Hispanic origin groups. These six categories are consistent with the categories used in the Census Bureau's surname list.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
NOTE: No specific individual information is given.

The Census Bureau receives numerous requests to supply information on name frequency. In an effort to comply with those requests, the Census Bureau has embarked on a names list project involving a tabulation of names from the 1990 Census. These files contain only the frequency of a given name, no specific individual information.

[ed. note: all links point to the original URL; all files are available in this repository]

Name List: Documentation and Methodology <1.0MB

Frequently Occurring Surnames from Census 1990 – Names Files
[ed. note: this content was originally on a separate webpage, at https://www.census.gov/topics/population/genealogy/data/1990_census/1990_census_namefiles.html]

Files
dist.all.last [<1.0MB]
dist.female.first [<1.0MB]
dist.male.first [<1.0MB]

Each of the three files (dist.all.last, dist.male.first, and dist.female.first) contains four items of data. The four items are:
A "Name"
Frequency in percent
Cumulative frequency in percent
Rank

In the file (dist.all.last) one entry appears as:
MOORE 0.312 5.312 9

In our search area sample, MOORE ranks 9th in terms of frequency. 5.312 percent of the sample population is covered by MOORE and the 8 names occurring more frequently than MOORE. The surname MOORE is possessed by 0.312 percent of our population sample.
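The four-item record layout described above can be read with a few lines of Python. This is a minimal sketch that assumes the fields are separated by whitespace, as in the MOORE example; the `NameRecord` type and `parse_name_line` function are hypothetical names introduced here, not part of the Census files.

```python
from typing import NamedTuple


class NameRecord(NamedTuple):
    """One row of a 1990 names file (hypothetical container type)."""
    name: str
    freq_pct: float      # frequency in percent
    cum_freq_pct: float  # cumulative frequency in percent
    rank: int


def parse_name_line(line: str) -> NameRecord:
    """Split one line into its four items; assumes whitespace-separated fields."""
    name, freq, cum_freq, rank = line.split()
    return NameRecord(name, float(freq), float(cum_freq), int(rank))


record = parse_name_line("MOORE 0.312 5.312 9")

# Share of the sample covered by the names that rank above this one:
# cumulative frequency minus the name's own frequency.
more_common_share = round(record.cum_freq_pct - record.freq_pct, 3)
print(record, more_common_share)
```

Subtracting a name's own frequency from its cumulative frequency gives the share covered by the more frequent names only, which for MOORE is the share held by the 8 higher-ranked surnames.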
The Bureau of the Census has released Census 2000 Summary File 1 (SF1) 100-Percent data. The file includes the following population items: sex, age, race, Hispanic or Latino origin, household relationship, and household and family characteristics. Housing items include occupancy status and tenure (whether the unit is owner or renter occupied). SF1 does not include information on incomes, poverty status, overcrowded housing or age of housing. These topics will be covered in Summary File 3. Data are available for states, counties, county subdivisions, places, census tracts, block groups, and, where applicable, American Indian and Alaska Native Areas and Hawaiian Home Lands. The SF1 data are available on the Bureau's web site and may be retrieved from American FactFinder as tables, lists, or maps. Users may also download a set of compressed ASCII files for each state via the Bureau's FTP server. There are over 8000 data items available for each geographic area. The full listing of these data items is available as a downloadable compressed database file named TABLES.ZIP. The uncompressed file is in FoxPro database file (dbf) format and may be imported to ACCESS, EXCEL, and other software formats.

While all of this information is useful, the Office of Community Planning and Development has downloaded selected information for all states and areas and is making this information available on the CPD web pages. The tables and data items selected are those items used in the CDBG and HOME allocation formulas plus topics most pertinent to the Comprehensive Housing Affordability Strategy (CHAS), the Consolidated Plan, and similar overall economic and community development plans. The information is contained in five compressed (zipped) dbf tables for each state. When uncompressed the tables are ready for use with FoxPro and they can be imported into ACCESS, EXCEL, and other spreadsheet, GIS and database software. The data are at the block group summary level.
The first two characters of the file name are the state abbreviation. The next two letters are BG for block group. The last part of the file name describes the contents. Each record is labeled with the code and name of the city and county in which it is located so that the data can be summarized to higher-level geography. The GEO file contains standard Census Bureau geographic identifiers for each block group, such as the metropolitan area code and congressional district code. The only data included in this table are total population and total housing units. POP1 and POP2 contain selected population variables, and selected housing items are in the HU file. The MA05 table data are only for use by State CDBG grantees for the reporting of the racial composition of beneficiaries of Area Benefit activities. The complete package for a state consists of the dictionary file named TABLES and the five data files for the state. The logical record number (LOGRECNO) links the records across tables.
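The naming convention and the LOGRECNO link can be illustrated with a short Python sketch. It assumes the tables have already been exported from dbf into lists of dictionaries; the file name `VTBGPOP1.dbf`, the column names `TOTPOP` and `OWNEROCC`, and both helper functions are hypothetical and serve only as illustration.

```python
def parse_table_name(file_name: str) -> dict:
    """Split a file name into state abbreviation, summary level, and contents."""
    stem = file_name.rsplit(".", 1)[0].upper()
    if len(stem) < 5 or stem[2:4] != "BG":
        raise ValueError(f"unexpected file name: {file_name!r}")
    return {"state": stem[:2], "level": "block group", "contents": stem[4:]}


def join_on_logrecno(left: list, right: list) -> list:
    """Inner-join two tables on LOGRECNO, the key shared across tables."""
    right_by_key = {row["LOGRECNO"]: row for row in right}
    return [
        {**row, **right_by_key[row["LOGRECNO"]]}
        for row in left
        if row["LOGRECNO"] in right_by_key
    ]


# Hypothetical rows standing in for a population table and a housing table.
pop_rows = [{"LOGRECNO": "0000012", "TOTPOP": 1200},
            {"LOGRECNO": "0000013", "TOTPOP": 950}]
hu_rows = [{"LOGRECNO": "0000012", "OWNEROCC": 310}]

table_info = parse_table_name("VTBGPOP1.dbf")
joined = join_on_logrecno(pop_rows, hu_rows)
print(table_info, joined)
```

An inner join is used here so that only block groups present in both tables are kept; a left join would be the natural alternative if unmatched records need to be inspected.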
Popular Baby Names by Sex and Ethnic Group. Data were collected through civil birth registration. Each record represents the ranking of a baby name in the order of frequency. Data can be used to represent the popularity of a name. Caution should be used when assessing the rank of a baby name if the frequency count is close to 10; the ranking may vary year to year.
The files provide counts of frequently occurring surnames and male and female first names in the 1990 Census returns.
The data (name, year of birth, sex, and number) are from a 100 percent sample of Social Security card applications for 1880 on.
Abstract copyright UK Data Service and data collection copyright owner.

This Special Licence access dataset contains names and addresses from the Integrated Census Microdata (I-CeM) dataset of the censuses of Great Britain for the period 1851 to 1911. These data are made available under Special Licence (SL) access conditions due to commercial sensitivity.

The anonymised main I-CeM database that complements these names and addresses is available under SN 7481. It comprises the Censuses of Great Britain for the period 1851-1911; data are available for England and Wales for 1851-1861 and 1881-1911 (1871 is not currently available for England and Wales) and for Scotland for 1851-1901 (1911 is not currently available for Scotland). The database contains over 180 million individual census records and was digitised and harmonised from the original census enumeration books. It details characteristics for all individuals resident in Great Britain at each of the included Censuses. The original digital data has been coded and standardised; the I-CeM database has consistent geography over time and standardised coding schemes for many census variables.

This dataset of names and addresses for individual census records is organised per country (England and Wales; Scotland) and per census year. Within each data file each census record contains first and last name, street address and an individual identification code (RecID) that allows linking with the corresponding anonymised I-CeM record. The data cannot be used for true linking of individual census records across census years for commercial genealogy purposes nor for any other commercial purposes. The SL arrangements are required to ensure that commercial sensitivity is protected. For information on making an application, see the Access section.

The data were updated in February 2020, with some files redeposited with longer field length limits.
Users should note that some name and address fields are truncated due to the limits set by the LDS project that transcribed the original data. No more than 10,000 records out of some 210 million across the study should be affected. Examples include:

England and Wales:
1851 - truncated at the 24th character (maximum I-CeM field length 95 characters)
1881 - truncated at the 16th character (maximum I-CeM field length 50 characters)

Scotland (for 1851-71, truncations affect less than 0.01% of all addresses and for 1851 around 1% at most):
1851 - truncated at the 70th character
1861 - truncated at the 76th character
1871 - truncated at the 82nd character
1881 - truncated at the 50th character

Further information about I-CeM can be found on the I-CeM Integrated Microdata Project and I-CeM Guide webpages.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
[ed. note: from https://www.census.gov/topics/population/genealogy/data/2000_surnames.html as of May 29, 2017. Has also been referenced as http://www.census.gov/genealogy/www/data/2000surnames/index.html]

NOTE: This presentation of data focuses on summarized aggregates of counts of surnames, and does not in any way identify specific individuals.

Tabulations of all surnames occurring 100 or more times in the Census 2000 returns are provided in the files listed below. The first link explains the methodology used for identifying and editing names data. The second link provides an Excel file of the top 1,000 surnames. The third link provides zipped Excel and CSV (comma separated) files of the complete list of 151,671 names.

Related Files
[ed. note: the links point to the original location; all files are available in this archive as well]
Technical Documentation: Demographic Aspects of Surnames - Census 2000 <1.0MB
File A: Top 1000 Names <1.0MB
File B: Surnames Occurring 100 or more times <1.0MB
https://www.icpsr.umich.edu/web/ICPSR/studies/28501/terms
The 1915 Iowa State Census is a unique document. It was the first census in the United States to include information on education and income prior to the United States Federal Census of 1940. It contains considerable detail on other aspects of individuals and households, e.g., religion, wealth and years in the United States and Iowa. The Iowa State Census of 1915 was a complete sample of the residents of the state and the returns were written by census takers (assessors) on index cards. These cards were kept in the Iowa State Archives in Des Moines and were microfilmed in 1986 by the Genealogical Society of Salt Lake City. The census cards were sorted by county, although large cities (those having more than 25,000 residents) were grouped separately. Within each county or large city, records were alphabetized by last name and within last name by first name. This data set includes individual-level records for three of the largest Iowa cities (Des Moines, Dubuque, and Davenport; the Sioux City films were unreadable) and for ten counties that did not contain a large city. (Additional details on sample selection are available in the documentation). Variables include name, age, place of residence, earnings, education, birthplace, religion, marital status, race, occupation, military service, among others. Data on familial ties between records are also included.
https://creativecommons.org/publicdomain/zero/1.0/
This is the large sample of minipics of the handwritten names from the Danish census from 1916. We use this sample for testing the performance of transfer learning from the HANA Database.
Each row contains a reference to the corresponding image as the first element and the name as the second element. All names are written in lowercase letters and contain only characters that are used in Danish words, which implies 29 alphabetic characters; that is, this database includes the letters æ, ø, and å.
More information can be found in "HANA: A HAndwritten NAme Database for Offline Handwritten Text Recognition", and the full HANA Database is published separately as "HANA Database".
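The row layout and the 29-character alphabet described above lend themselves to a quick sanity check of a label file. The following Python sketch assumes comma-separated rows and assumes that a space may separate the parts of a name; the sample rows, image file names, and the `is_valid_label` helper are hypothetical.

```python
import csv
import io

# 26 basic Latin letters plus the three additional Danish letters.
DANISH_ALPHABET = set("abcdefghijklmnopqrstuvwxyzæøå")


def is_valid_label(name: str) -> bool:
    """True if the name is non-empty, lowercase, and uses only Danish letters.

    Allowing spaces between name parts is an assumption of this sketch.
    """
    return bool(name.strip()) and all(
        ch in DANISH_ALPHABET or ch == " " for ch in name
    )


# Hypothetical sample: image reference first, name second.
sample = "img_0001.jpg,søren kjærgaard\nimg_0002.jpg,Hans Olsen\n"
rows = list(csv.reader(io.StringIO(sample)))

# Collect the image references whose labels break the stated convention.
invalid = [image for image, name in rows if not is_valid_label(name)]
print(len(DANISH_ALPHABET), invalid)
```

Here the second row is flagged because it contains uppercase letters, which the description says should not occur.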
In the first half of 2024, Nikodem was the most common name for a newborn child in Poland, with over three thousand registrations. Next were Jan, Aleksander, and Anton, with over two thousand registrations each.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This package contains three sets of crosswalks (and the code to create them) for use with the complete-count census data provided by IPUMS. Each crosswalk links observations in those data sets to variables that can be created only by using the restricted IPUMS-Ancestry.com data. The three sets of crosswalks are as follows, and all are based on the histid variable: lengths of individuals' first and last names; commonness of individuals' first and last names; and imputed occ1950 codes for individuals currently with the code 979 ("Not Yet Classified").

V2 is identical to V1 except for code/Execute.sh
Open Government Licence 3.0 http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/
License information was derived automatically
Rank and count of the top names for baby boys, changes in rank since the previous year and breakdown by country, region, mother's age and month of birth.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
NOTE: This presentation of data focuses on summarized aggregates of counts and characteristics associated with surnames, and the data do not in any way identify any specific individuals.

Tabulations of all surnames occurring 100 or more times in the 2010 Census returns are provided in the files listed below. The first link explains the methodology used for identifying and editing names data. The second link provides an Excel file of the top 1,000 surnames. The third link provides zipped Excel and CSV (comma separated) files of the complete list of 162,253 names.

[ed. note: Table removed]

Related Files
[ed. note: links below point to the original URL; all files are available in this archive.]
Technical Documentation: Demographic Aspects of Surnames - 2010 Census <1.0MB
File A: Top 1000 Names <1.0MB
File B: Surnames Occurring 100 or more times <1.0MB
We supply four dictionaries that give the racial distributions associated with names in the United States. These dictionaries are used by the latest iteration of the "WRU" package (Khanna et al., 2022) to make probabilistic predictions about the race of individuals, given their names and geolocations. The probabilities cover five racial categories: White, Black, Hispanic, Asian, and Other.

We provide two surname dictionaries. The first provides entries P(race | surname) for about 160K names, derived from the 2010 Census surname list, aggregated with the Census Spanish surname list. The second provides analogous probabilities for 1.48MM surnames. This dictionary is created by starting with the Census-based dictionary and supplementing it with race distributions estimated from the voter files of six Southern states (Alabama, Florida, Georgia, Louisiana, North Carolina, and South Carolina) that collect race data.

We also provide dictionaries estimating P(race | first name) and P(race | middle name). These dictionaries, which contain 1.04MM and 1.16MM names respectively, are sourced exclusively from the voter files of the six Southern states.

References
Kabir Khanna, Brandon Bertelsen, Santiago Olivella, Evan Rosenman and Kosuke Imai (2022). wru: Who are You? Bayesian Prediction of Racial Category Using Surname, First Name, Middle Name, and Geolocation. R package version 1.0.0. https://CRAN.R-project.org/package=wru
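To show how a P(race | surname) entry might be combined with geolocation, the sketch below applies Bayes' rule in one common formulation: the posterior is proportional to P(race | surname) times P(location | race). This is an illustration under that assumption, not the wru package's actual code, and every number in it is made up for the example rather than taken from the dictionaries.

```python
RACES = ("white", "black", "hispanic", "asian", "other")


def posterior(p_race_given_surname: dict, p_geo_given_race: dict) -> dict:
    """Combine a surname-based prior with a location likelihood and normalize."""
    unnormalized = {
        race: p_race_given_surname[race] * p_geo_given_race[race]
        for race in RACES
    }
    total = sum(unnormalized.values())
    if total == 0:
        raise ValueError("all combined probabilities are zero")
    return {race: value / total for race, value in unnormalized.items()}


# Made-up illustrative values (not from any real dictionary or census table).
prior = {"white": 0.60, "black": 0.25, "hispanic": 0.08,
         "asian": 0.04, "other": 0.03}
likelihood = {"white": 0.002, "black": 0.010, "hispanic": 0.004,
              "asian": 0.001, "other": 0.003}

example_posterior = posterior(prior, likelihood)
print(example_posterior)
```

When the location likelihood is the same for every category, the posterior reduces to the surname-based distribution, which is a convenient check on the normalization step.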
In 2023, the most popular name for a boy in Germany was Matt(h)eo/Mat(h)eo. The second most popular name was Noah. This statistic shows the most popular boys' names in Germany in 2023.
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
blockgroupdemographics

A selection of variables from the US Census Bureau's American Community Survey 5YR and TIGER/Line publications.

Overview
The U.S. Census Bureau published its American Community Survey 5 Year with more than 37,000 variables. Most ACS advanced users will have their personal list of favorites, but this conventional wisdom is not available to occasional analysts. This publication re-shares 174 selected demographic variables from the U.S. Census Bureau to provide a supplement to Open Environments Block Group publications. These results do not reflect any proprietary or predictive model. Rather, they are extracted from Census Bureau results. For additional support or more detail, please see the Census Bureau citations below.

The first 170 demographic variables are taken from popular variables in the American Community Survey (ACS) including age, race, income, education and family structure. A full list of ACS variable names and definitions can be found in the ACS 'Table Shells' at https://www.census.gov/programs-surveys/acs/technical-documentation/table-shells.html.

The dataset includes 4 additional columns from the Census' TIGER/Line publication. See Open Environment's 2023blockgroupcartographics publication for the shapes of each block group. For each block group, the dataset includes land area (ALAND), water area (AWATER), interpolated latitude (INTPTLAT) and longitude (INTPTLON). These are valuable for calculating population density variables which combine ACS populations and TIGER land area.

Files
The resulting dataset is available with other block group based datasets on Harvard's Dataverse https://dataverse.harvard.edu/ in Open Environment's Block Group Dataverse https://dataverse.harvard.edu/dataverse/blockgroupdatasets/. This data requires only CSV reader software or Python's pandas package. Supporting the data file is acsvars.csv, a list of the Census variable names and their corresponding descriptions.

Citations
"American Community Survey 5-Year Data (2019-2023)." Census.gov, US Census Bureau, https://www.census.gov/data/developers/data-sets/acs-5year.html. 2023
"American Community Survey, Table Shells and Table List." Census.gov, US Census Bureau, https://www.census.gov/programs-surveys/acs/technical-documentation/table-shells.html
Python Package Index - PyPI. Python Software Foundation. "A simple wrapper for the United States Census Bureau's API." Retrieved from https://pypi.org/project/census/
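The population density calculation mentioned in the overview can be sketched as follows. TIGER/Line reports ALAND in square meters, so the sketch converts to square kilometers first; the `population_density` function and the sample values are hypothetical, and the population figure would come from whichever ACS total-population column the data file provides.

```python
from typing import Optional

SQUARE_METERS_PER_SQUARE_KM = 1_000_000


def population_density(population: float, aland_sq_m: float) -> Optional[float]:
    """People per square kilometer of land; None when there is no land area."""
    if aland_sq_m <= 0:
        return None
    return population / (aland_sq_m / SQUARE_METERS_PER_SQUARE_KM)


# Hypothetical block group: 1,500 residents on 2 square kilometers of land.
density = population_density(1500, 2_000_000)
water_only = population_density(0, 0)
print(density, water_only)
```

Returning None for block groups without land area (for example, water-only block groups) avoids a division by zero and keeps those rows distinguishable from areas with zero density.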
This layer contains a Vermont-only subset of block group level 2020 Decennial Census redistricting data as reported by the U.S. Census Bureau for all states plus DC and Puerto Rico. The attributes come from the 2020 Public Law 94-171 (P.L. 94-171) tables.

Data download date: August 12, 2021
Census tables: P1, P2, P3, P4, H1, P5, Header
Downloaded from: Census FTP site

Processing Notes:
Data was downloaded from the U.S. Census Bureau FTP site, imported into SAS format and joined to the 2020 TIGER boundaries. Boundaries are sourced from the 2020 TIGER/Line Geodatabases. Boundaries have been projected into Web Mercator and each attribute has been given a clear descriptive alias name. No alterations have been made to the vertices of the data.
Each attribute maintains its specified name from Census, but also has a descriptive alias name and long description derived from the technical documentation provided by the Census. For a detailed list of the attributes contained in this layer, view the Data tab and select "Fields".

The following alterations have been made to the tabular data:

Joined all tables to create one wide attribute table:
P1 - Race
P2 - Hispanic or Latino, and not Hispanic or Latino by Race
P3 - Race for the Population 18 Years and Over
P4 - Hispanic or Latino, and not Hispanic or Latino by Race for the Population 18 Years and Over
H1 - Occupancy Status (Housing)
P5 - Group Quarters Population by Group Quarters Type (correctional institutions, juvenile facilities, nursing facilities/skilled nursing, college/university student housing, military quarters, etc.)
Header

After joining, dropped fields: FILEID, STUSAB, CHARITER, CIFSN, LOGRECNO, GEOVAR, GEOCOMP, LSADC, and BLOCK.
GEOCOMP was renamed to GEOID and moved to be the first column in the table; the original GEOID was dropped.
Placeholder fields for future legislative districts have been dropped: CD118, CD119, CD120, CD121, SLDU22, SLDU24, SLDU26, SLDU28, SLDL22, SLDL24, SLDL26, SLDL28.
P0020001 was dropped, as it is duplicative of P0010001. Similarly, P0040001 was dropped, as it is duplicative of P0030001.
In addition to calculated fields, County_Name and State_Name were added.

The following calculated fields have been added (see long field descriptions in the Data tab for formulas used):
PCT_P0030001: Percent of Population 18 Years and Over
PCT_P0020002: Percent Hispanic or Latino
PCT_P0020005: Percent White alone, not Hispanic or Latino
PCT_P0020006: Percent Black or African American alone, not Hispanic or Latino
PCT_P0020007: Percent American Indian and Alaska Native alone, not Hispanic or Latino
PCT_P0020008: Percent Asian alone, not Hispanic or Latino
PCT_P0020009: Percent Native Hawaiian and Other Pacific Islander alone, not Hispanic or Latino
PCT_P0020010: Percent Some Other Race alone, not Hispanic or Latino
PCT_P0020011: Percent Population of Two or More Races, not Hispanic or Latino
PCT_H0010002: Percent of Housing Units that are Occupied
PCT_H0010003: Percent of Housing Units that are Vacant

Please note these percentages might look strange at the individual block group level, since this data has been protected using differential privacy.*

VCGI exported a Vermont-only subset of the nation-wide layer to produce this layer, with fields limited to this popular subset:
OBJECTID: OBJECTID
GEOID: Geographic Record Identifier
NAME: Area Name-Legal/Statistical Area Description (LSAD) Term-Part Indicator
County_Name: County Name
State_Name: State Name
P0010001: Total Population
P0010003: Population of one race: White alone
P0010004: Population of one race: Black or African American alone
P0010005: Population of one race: American Indian and Alaska Native alone
P0010006: Population of one race: Asian alone
P0010007: Population of one race: Native Hawaiian and Other Pacific Islander alone
P0010008: Population of one race: Some Other Race alone
P0020002: Hispanic or Latino Population
P0020003: Non-Hispanic or Latino Population
P0030001: Total population 18 years and over
H0010001: Total housing units
H0010002: Total occupied housing units
H0010003: Total vacant housing units
P0050001: Total group quarters population
PCT_P0030001: Percent of Population 18 Years and Over
PCT_P0020002: Percent Hispanic or Latino
PCT_P0020005: Percent White alone, not Hispanic or Latino
PCT_P0020006: Percent Black or African American alone, not Hispanic or Latino
PCT_P0020007: Percent American Indian and Alaska Native alone, not Hispanic or Latino
PCT_P0020008: Percent Asian alone, not Hispanic or Latino
PCT_P0020009: Percent Native Hawaiian and Other Pacific Islander alone, not Hispanic or Latino
PCT_P0020010: Percent Some Other Race alone, not Hispanic or Latino
PCT_P0020011: Percent Population of two or more races, not Hispanic or Latino
PCT_H0010002: Percent of Housing Units that are Occupied
PCT_H0010003: Percent of Housing Units that are Vacant
SUMLEV: Summary Level
REGION: Region
DIVISION: Division
COUNTY: County (FIPS)
COUNTYNS: County (NS)
TRACT: Census Tract
BLKGRP: Block Group
AREALAND: Area (Land)
AREAWATR: Area (Water)
INTPTLAT: Internal Point (Latitude)
INTPTLON: Internal Point (Longitude)
BASENAME: Area Base Name
POP100: Total Population Count
HU100: Total Housing Count

*To protect the privacy and confidentiality of respondents, data has been protected using differential privacy techniques by the U.S. Census Bureau. This means that some individual block groups will have values that are inconsistent or improbable. However, when aggregated up, these issues become minimized.

Download Census redistricting data in this layer as a file geodatabase.

Additional links:
U.S. Census Bureau
U.S. Census Bureau Decennial Census
About the 2020 Census
2020 Census
2020 Census data quality
Decennial Census P.L. 94-171 Redistricting Data Program
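As an illustration of the calculated percentage fields and of the advice to aggregate, the Python sketch below recomputes an occupancy percentage per block group and then for the combined area. It assumes PCT_H0010002 is occupied units (H0010002) divided by total housing units (H0010001), which is a guess at the formula documented in the Data tab; the GEOID values and counts are hypothetical.

```python
from typing import Optional


def pct(part: float, whole: float) -> Optional[float]:
    """Percentage of part in whole; None when the denominator is zero."""
    if whole == 0:
        return None
    return 100.0 * part / whole


# Hypothetical block groups (counts invented for the example).
block_groups = [
    {"GEOID": "hypothetical-1", "H0010001": 200, "H0010002": 150},
    {"GEOID": "hypothetical-2", "H0010001": 0, "H0010002": 0},
    {"GEOID": "hypothetical-3", "H0010001": 300, "H0010002": 250},
]

# Per-block-group percentages can look odd for small or empty areas.
per_block_group = [pct(bg["H0010002"], bg["H0010001"]) for bg in block_groups]

# Sum the counts first, then compute the percentage for the larger area.
total_units = sum(bg["H0010001"] for bg in block_groups)
occupied_units = sum(bg["H0010002"] for bg in block_groups)
aggregated_pct = pct(occupied_units, total_units)
print(per_block_group, aggregated_pct)
```

Summing counts before dividing, rather than averaging the block group percentages, weights each block group by its size and is the form of aggregation in which the privacy-related noise tends to wash out.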
https://www.usa.gov/government-works
This layer contains a Vermont-only subset of county level 2020 Decennial Census redistricting data as reported by the U.S. Census Bureau for all states plus DC and Puerto Rico. The attributes come from the 2020 Public Law 94-171 (P.L. 94-171) tables.