100+ datasets found

Data for: Demographic aspects of first names
dataverse.harvard.edu
Updated Mar 12, 2018
Cite
Konstantinos Tzioumis (2018). Data for: Demographic aspects of first names [Dataset]. http://doi.org/10.7910/DVN/TYJKEZ
Explore at:
Unique identifier
https://doi.org/10.7910/DVN/TYJKEZ
Dataset updated
Mar 12, 2018
Dataset provided by
Harvard Dataverse
Authors
Konstantinos Tzioumis
License
CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
License information was derived automatically
Description
The list includes 4,250 first names and information on their respective count and proportions across six mutually exclusive racial and Hispanic origin groups. These six categories are consistent with the categories used in the Census Bureau's surname list.
Frequently Occurring Surnames from the 1990 Census
datalumos.org
Updated May 29, 2017
Cite
United States Department of Commerce. Bureau of the Census (2017). Frequently Occurring Surnames from the 1990 Census [Dataset]. http://doi.org/10.3886/E100669V1
Explore at:
Unique identifier
https://doi.org/10.3886/E100669V1
Dataset updated
May 29, 2017
Dataset provided by
United States Census Bureau (http://census.gov/)
Authors
United States Department of Commerce. Bureau of the Census
License
Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
License information was derived automatically
Description
NOTE: No specific individual information is given.

The Census Bureau receives numerous requests to supply information on name frequency. In an effort to comply with those requests, the Census Bureau has embarked on a names list project involving a tabulation of names from the 1990 Census. These files contain only the frequency of a given name, no specific individual information.

[ed. note: all links point to the original URL; all files are available in this repository]

Name List: Documentation and Methodology <1.0MB

Frequently Occurring Surnames from Census 1990 – Names Files
[ed. note: this content was originally on a separate webpage, at https://www.census.gov/topics/population/genealogy/data/1990_census/1990_census_namefiles.html]

Files
dist.all.last [<1.0MB]
dist.female.first [<1.0MB]
dist.male.first [<1.0MB]

Each of the three files (dist.all.last, dist.male.first, and dist.female.first) contains four items of data. The four items are:
A "Name"
Frequency in percent
Cumulative Frequency in percent
Rank

In the file (dist.all.last) one entry appears as:
MOORE 0.312 5.312 9

In our search area sample, MOORE ranks 9th in terms of frequency. 5.312 percent of the sample population is covered by MOORE and the 8 names occurring more frequently than MOORE. The surname, MOORE, is possessed by 0.312 percent of our population sample.
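The four-item record layout described above can be read with a few lines of Python. This is a minimal sketch that assumes the fields are separated by whitespace and that a name never contains a space; the `NameRecord` type and `parse_name_line` function are illustrative names, not part of the Census files.

```python
from typing import NamedTuple


class NameRecord(NamedTuple):
    name: str
    freq_pct: float       # frequency in percent
    cum_freq_pct: float   # cumulative frequency in percent
    rank: int


def parse_name_line(line: str) -> NameRecord:
    """Parse one whitespace-separated record such as 'MOORE 0.312 5.312 9'."""
    name, freq, cum_freq, rank = line.split()
    return NameRecord(name, float(freq), float(cum_freq), int(rank))


# The sample entry quoted in the description.
record = parse_name_line("MOORE 0.312 5.312 9")
print(record.name, record.freq_pct, record.cum_freq_pct, record.rank)
```

Applying `parse_name_line` to every line of dist.all.last would give a list of records that can be sorted or filtered by rank.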
Census Data
catalog.data.gov
datadiscoverystudio.org
+3 more
Updated Mar 1, 2024
Cite
U.S. Bureau of the Census (2024). Census Data [Dataset]. https://catalog.data.gov/dataset/census-data
Explore at:
Dataset updated
Mar 1, 2024
Dataset provided by
United States Census Bureau (http://census.gov/)
Description
The Bureau of the Census has released Census 2000 Summary File 1 (SF1) 100-Percent data. The file includes the following population items: sex, age, race, Hispanic or Latino origin, household relationship, and household and family characteristics. Housing items include occupancy status and tenure (whether the unit is owner or renter occupied). SF1 does not include information on incomes, poverty status, overcrowded housing or age of housing. These topics will be covered in Summary File 3. Data are available for states, counties, county subdivisions, places, census tracts, block groups, and, where applicable, American Indian and Alaskan Native Areas and Hawaiian Home Lands.

The SF1 data are available on the Bureau's web site and may be retrieved from American FactFinder as tables, lists, or maps. Users may also download a set of compressed ASCII files for each state via the Bureau's FTP server. There are over 8000 data items available for each geographic area. The full listing of these data items is available here as a downloadable compressed database file named TABLES.ZIP. The uncompressed file is in FoxPro database file (dbf) format and may be imported to ACCESS, EXCEL, and other software formats.

While all of this information is useful, the Office of Community Planning and Development has downloaded selected information for all states and areas and is making this information available on the CPD web pages. The tables and data items selected are those items used in the CDBG and HOME allocation formulas plus topics most pertinent to the Comprehensive Housing Affordability Strategy (CHAS), the Consolidated Plan, and similar overall economic and community development plans. The information is contained in five compressed (zipped) dbf tables for each state. When uncompressed the tables are ready for use with FoxPro and they can be imported into ACCESS, EXCEL, and other spreadsheet, GIS and database software. The data are at the block group summary level.

The first two characters of the file name are the state abbreviation. The next two letters are BG for block group. Each record is labeled with the code and name of the city and county in which it is located so that the data can be summarized to higher-level geography. The last part of the file name describes the contents. The GEO file contains standard Census Bureau geographic identifiers for each block group, such as the metropolitan area code and congressional district code. The only data included in this table are total population and total housing units. POP1 and POP2 contain selected population variables, and selected housing items are in the HU file. The MA05 table data is only for use by State CDBG grantees for the reporting of the racial composition of beneficiaries of Area Benefit activities. The complete package for a state consists of the dictionary file named TABLES and the five data files for the state. The logical record number (LOGRECNO) links the records across tables.
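The naming convention and the LOGRECNO link described above can be sketched in Python. The exact file names are not given in the description, so the stem `VTBGPOP1` and the column names `TOTPOP` and `OCCUPIED` below are hypothetical; only the state/BG/contents layout and the LOGRECNO key come from the text.

```python
from typing import NamedTuple


class BlockGroupFile(NamedTuple):
    state: str      # two-letter state abbreviation
    level: str      # "BG" for block group
    contents: str   # e.g. GEO, POP1, POP2, HU, MA05


def parse_file_name(stem: str) -> BlockGroupFile:
    """Split a file-name stem such as 'VTBGPOP1' into its three parts."""
    if len(stem) < 5 or stem[2:4].upper() != "BG":
        raise ValueError(f"unexpected file name: {stem!r}")
    return BlockGroupFile(stem[:2].upper(), "BG", stem[4:].upper())


def join_on_logrecno(left, right):
    """Merge two lists of row dicts that share a LOGRECNO key (inner join)."""
    right_by_key = {row["LOGRECNO"]: row for row in right}
    return [
        {**row, **right_by_key[row["LOGRECNO"]]}
        for row in left
        if row["LOGRECNO"] in right_by_key
    ]


# Hypothetical rows; the real tables carry many more columns.
geo = [{"LOGRECNO": "0000001", "TOTPOP": 1200}]
hu = [{"LOGRECNO": "0000001", "OCCUPIED": 450}]
merged = join_on_logrecno(geo, hu)
print(parse_file_name("VTBGPOP1"), merged)
```

Rows loaded from the dbf tables by any reader could be passed to `join_on_logrecno` in the same way, as long as each row exposes its LOGRECNO value.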
Popular Baby Names
catalog.data.gov
data.cityofnewyork.us
+3 more
Updated Jul 12, 2025
+ more versions
Cite
data.cityofnewyork.us (2025). Popular Baby Names [Dataset]. https://catalog.data.gov/dataset/popular-baby-names
Explore at:
Dataset updated
Jul 12, 2025
Dataset provided by
data.cityofnewyork.us
Description
Popular Baby Names by Sex and Ethnic Group. Data were collected through civil birth registration. Each record represents the ranking of a baby name in the order of frequency. Data can be used to represent the popularity of a name. Caution should be used when assessing the rank of a baby name if the frequency count is close to 10; the ranking may vary year to year.
Names from Census 1990.
datadiscoverystudio.org
data.amerigeoss.org
+1 more
html
Updated Mar 2, 2018
Cite
(2018). Names from Census 1990. [Dataset]. http://datadiscoverystudio.org/geoportal/rest/metadata/item/ed19d4c0965943f188ef0f4cd0b36652/html
Explore at:
Available download formats: html
Dataset updated
Mar 2, 2018
Description
The files provide counts of frequently occurring surnames and male and female first names in the 1990 Census returns.
Baby Names from Social Security Card Applications - National Data
catalog.data.gov
Updated Jul 4, 2025
+ more versions
Cite
Social Security Administration (2025). Baby Names from Social Security Card Applications - National Data [Dataset]. https://catalog.data.gov/dataset/baby-names-from-social-security-card-applications-national-data
Explore at:
Dataset updated
Jul 4, 2025
Dataset provided by
Social Security Administration (http://ssa.gov/)
Description
The data (name, year of birth, sex, and number) are from a 100 percent sample of Social Security card applications for 1880 on.
Integrated Census Microdata (I-CeM) Names and Addresses, 1851-1911: Special...
b2find.eudat.eu
Updated Jun 18, 2023
Cite
(2023). Integrated Census Microdata (I-CeM) Names and Addresses, 1851-1911: Special Licence Access - Dataset - B2FIND [Dataset]. https://b2find.eudat.eu/dataset/331ae75d-8c66-5f3f-9ea2-94af6fb81978
Explore at:
Dataset updated
Jun 18, 2023
Description
Abstract copyright UK Data Service and data collection copyright owner.

This Special Licence access dataset contains names and addresses from the Integrated Census Microdata (I-CeM) dataset of the censuses of Great Britain for the period 1851 to 1911. These data are made available under Special Licence (SL) access conditions due to commercial sensitivity.

The anonymised main I-CeM database that complements these names and addresses is available under SN 7481. It comprises the Censuses of Great Britain for the period 1851-1911; data are available for England and Wales for 1851-1861 and 1881-1911 (1871 is not currently available for England and Wales) and for Scotland for 1851-1901 (1911 is not currently available for Scotland). The database contains over 180 million individual census records and was digitised and harmonised from the original census enumeration books. It details characteristics for all individuals resident in Great Britain at each of the included Censuses. The original digital data has been coded and standardised; the I-CeM database has consistent geography over time and standardised coding schemes for many census variables.

This dataset of names and addresses for individual census records is organised per country (England and Wales; Scotland) and per census year. Within each data file each census record contains first and last name, street address and an individual identification code (RecID) that allows linking with the corresponding anonymised I-CeM record. The data cannot be used for true linking of individual census records across census years for commercial genealogy purposes nor for any other commercial purposes. The SL arrangements are required to ensure that commercial sensitivity is protected. For information on making an application, see the Access section.

The data were updated in February 2020, with some files redeposited with longer field length limits. Users should note that some name and address fields are truncated due to the limits set by the LDS project that transcribed the original data. No more than 10,000 records out of some 210 million across the study should be affected. Examples include:

England and Wales:
1851 - truncated at the 24th character (maximum I-CeM field length 95 characters)
1881 - truncated at the 16th character (maximum I-CeM field length 50 characters).

Scotland: for 1851-71, truncations affect less than 0.01% of all addresses and for 1851 around 1% at most
1851 - truncated at the 70th character
1861 - truncated at the 76th character
1871 - truncated at the 82nd character
1881 - truncated at the 50th character.

Further information about I-CeM can be found on the I-CeM Integrated Microdata Project and I-CeM Guide webpages.
Frequently Occurring Surnames from the Census 2000
datalumos.org
Updated May 29, 2017
Cite
United States Department of Commerce. Bureau of the Census (2017). Frequently Occurring Surnames from the Census 2000 [Dataset]. http://doi.org/10.3886/E100667V1
Explore at:
Unique identifier
https://doi.org/10.3886/E100667V1
Dataset updated
May 29, 2017
Dataset provided by
United States Census Bureau (http://census.gov/)
Authors
United States Department of Commerce. Bureau of the Census
License
Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
License information was derived automatically
Description
[ed. note: from https://www.census.gov/topics/population/genealogy/data/2000_surnames.html as of May 29, 2017. Has also been referenced as http://www.census.gov/genealogy/www/data/2000surnames/index.html]

NOTE: This presentation of data focuses on summarized aggregates of counts of surnames, and does not in any way identify specific individuals.

Tabulations of all surnames occurring 100 or more times in the Census 2000 returns are provided in the files listed below. The first link explains the methodology used for identifying and editing names data. The second link provides an Excel file of the top 1000 surnames. The third link provides zipped Excel and CSV (comma separated) files of the complete list of 151,671 names.

Related Files
[ed. note: the links point to the original location; all files are available in this archive as well]
Technical Documentation: Demographic Aspects of Surnames - Census 2000 <1.0MB
File A: Top 1000 Names <1.0MB
File B: Surnames Occurring 100 or more times <1.0MB
The 1915 Iowa State Census Project
icpsr.umich.edu
ascii, delimited, sas +2
Updated Dec 14, 2010
Cite
Goldin, Claudia; Katz, Lawrence (2010). The 1915 Iowa State Census Project [Dataset]. http://doi.org/10.3886/ICPSR28501.v1
Explore at:
Available download formats: spss, ascii, sas, stata, delimited
Unique identifier
https://doi.org/10.3886/ICPSR28501.v1
Dataset updated
Dec 14, 2010
Dataset provided by
Inter-university Consortium for Political and Social Research (https://www.icpsr.umich.edu/web/pages/)
Authors
Goldin, Claudia; Katz, Lawrence
License
https://www.icpsr.umich.edu/web/ICPSR/studies/28501/terms
Time period covered
1915
Area covered
Iowa, United States
Description
The 1915 Iowa State Census is a unique document. It was the first census in the United States to include information on education and income prior to the United States Federal Census of 1940. It contains considerable detail on other aspects of individuals and households, e.g., religion, wealth and years in the United States and Iowa. The Iowa State Census of 1915 was a complete sample of the residents of the state and the returns were written by census takers (assessors) on index cards. These cards were kept in the Iowa State Archives in Des Moines and were microfilmed in 1986 by the Genealogical Society of Salt Lake City. The census cards were sorted by county, although large cities (those having more than 25,000 residents) were grouped separately. Within each county or large city, records were alphabetized by last name and within last name by first name. This data set includes individual-level records for three of the largest Iowa cities (Des Moines, Dubuque, and Davenport; the Sioux City films were unreadable) and for ten counties that did not contain a large city. (Additional details on sample selection are available in the documentation). Variables include name, age, place of residence, earnings, education, birthplace, religion, marital status, race, occupation, military service, among others. Data on familial ties between records are also included.
Danish Census Handwritten Names (Large)
kaggle.com
Updated Feb 16, 2022
Cite
Simon Wittrock (2022). Danish Census Handwritten Names (Large) [Dataset]. https://www.kaggle.com/datasets/sdusimonwittrock/danish-census-handwritten-names-large
Explore at:
Dataset updated
Feb 16, 2022
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Simon Wittrock
License
https://creativecommons.org/publicdomain/zero/1.0/
Description
This is the large sample of minipics of the handwritten names from the Danish census from 1916. We use this sample for testing the performance of transfer learning from the HANA Database.

Each row contains a reference to the corresponding image as the first element and the name as the second element. All names are written in lower case letters and contain only characters which are used in Danish words, which implies 29 alphabetic characters, i.e. this database includes the letters æ, ø, and å.

More information can be found in: HANA: A HAndwritten NAme Database for Offline Handwritten Text Recognition. The full database is available as the HANA Database.
Most popular first names for children in Poland H1 2024
statista.com
Updated Nov 14, 2024
Cite
Statista (2024). Most popular first names for children in Poland H1 2024 [Dataset]. https://www.statista.com/statistics/1092210/poland-most-popular-first-names-for-children/
Explore at:
Dataset updated
Nov 14, 2024
Dataset authored and provided by
Statista (http://statista.com/)
Area covered
Poland
Description
In the first half of 2024, Nikodem was the most common name for a newborn child in Poland, with over 3 thousand registrations. Next were Jan, Aleksander, and Anton with over two thousand registrations each.
Miscellaneous US Census Crosswalks, 1850-1930
openicpsr.org
Updated Jun 2, 2022
Cite
Ariell Zimran (2022). Miscellaneous US Census Crosswalks, 1850-1930 [Dataset]. http://doi.org/10.3886/E171861V2
Explore at:
Unique identifier
https://doi.org/10.3886/E171861V2
Dataset updated
Jun 2, 2022
Dataset provided by
Vanderbilt University & NBER
Authors
Ariell Zimran
License
Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
License information was derived automatically
Area covered
United States
Description
This package contains three sets of crosswalks (and the code to create them) for use with the complete-count census data provided by IPUMS. Each crosswalk links observations in those data sets to variables that can be created only by using the restricted IPUMS-Ancestry.com data. The three sets of crosswalks are as follows, and all are based on the histid variable: lengths of individuals' first and last names; commonness of individuals' first and last names; and imputed occ1950 codes for individuals currently with the code 979 ("Not Yet Classified").

V2 is identical to V1 except for code/Execute.sh
Baby names for boys in England and Wales
ons.gov.uk
cy.ons.gov.uk
xlsx
Updated Jul 31, 2025
Cite
Office for National Statistics (2025). Baby names for boys in England and Wales [Dataset]. https://www.ons.gov.uk/peoplepopulationandcommunity/birthsdeathsandmarriages/livebirths/datasets/babynamesenglandandwalesbabynamesstatisticsboys
Explore at:
Available download formats: xlsx
Dataset updated
Jul 31, 2025
Dataset provided by
Office for National Statistics (http://www.ons.gov.uk/)
License
Open Government Licence 3.0 (http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/)
License information was derived automatically
Description
Rank and count of the top names for baby boys, changes in rank since the previous year and breakdown by country, region, mother's age and month of birth.
Frequently Occurring Surnames from the 2010 Census
datalumos.org
Updated May 29, 2017
Cite
United States Department of Commerce. Bureau of the Census (2017). Frequently Occurring Surnames from the 2010 Census [Dataset]. http://doi.org/10.3886/E100668V1
Explore at:
Unique identifier
https://doi.org/10.3886/E100668V1
Dataset updated
May 29, 2017
Dataset provided by
United States Census Bureau (http://census.gov/)
Authors
United States Department of Commerce. Bureau of the Census
License
Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
License information was derived automatically
Description
NOTE: This presentation of data focuses on summarized aggregates of counts and characteristics associated with surnames, and the data do not in any way identify any specific individuals.

Tabulations of all surnames occurring 100 or more times in the 2010 Census returns are provided in the files listed below. The first link explains the methodology used for identifying and editing names data. The second link provides an Excel file of the top 1,000 surnames. The third link provides zipped Excel and CSV (comma separated) files of the complete list of 162,253 names.

[ed. note: Table removed]

Related Files
[ed. note: links below point to the original URL; all files are available in this archive.]
Technical Documentation: Demographic Aspects of Surnames - 2010 Census <1.0MB
File A: Top 1000 Names <1.0MB
File B: Surnames Occurring 100 or more times <1.0MB
Integrated Census Microdata (I-CeM) Names and Addresses, 1851-1911: Special...
beta.ukdataservice.ac.uk
Updated 2025
+ more versions
Cite
K. Schurer; E. Higgs (2025). Integrated Census Microdata (I-CeM) Names and Addresses, 1851-1911: Special Licence Access [Dataset]. http://doi.org/10.5255/ukda-sn-7856-2
Explore at:
Unique identifier
https://doi.org/10.5255/ukda-sn-7856-2
Dataset updated
2025
Dataset provided by
DataCite (https://www.datacite.org/)
UK Data Service (https://ukdataservice.ac.uk/)
Authors
K. Schurer; E. Higgs
Description
This Special Licence access dataset contains names and addresses from the Integrated Census Microdata (I-CeM) dataset of the censuses of Great Britain for the period 1851 to 1911. These data are made available under Special Licence (SL) access conditions due to commercial sensitivity.

The anonymised main I-CeM database that complements these names and addresses is available under SN 7481. It comprises the Censuses of Great Britain for the period 1851-1911; data are available for England and Wales for 1851-1861 and 1881-1911 (1871 is not currently available for England and Wales) and for Scotland for 1851-1901 (1911 is not currently available for Scotland). The database contains over 180 million individual census records and was digitised and harmonised from the original census enumeration books. It details characteristics for all individuals resident in Great Britain at each of the included Censuses. The original digital data has been coded and standardised; the I-CeM database has consistent geography over time and standardised coding schemes for many census variables.

This dataset of names and addresses for individual census records is organised per country (England and Wales; Scotland) and per census year. Within each data file each census record contains first and last name, street address and an individual identification code (RecID) that allows linking with the corresponding anonymised I-CeM record. The data cannot be used for true linking of individual census records across census years for commercial genealogy purposes nor for any other commercial purposes. The SL arrangements are required to ensure that commercial sensitivity is protected. For information on making an application, see the Access section.

The data were updated in February 2020, with some files redeposited with longer field length limits. Users should note that some name and address fields are truncated due to the limits set by the LDS project that transcribed the original data. No more than 10,000 records out of some 210 million across the study should be affected. Examples include:

England and Wales:
1851 - truncated at the 24th character (maximum I-CeM field length 95 characters)
1881 - truncated at the 16th character (maximum I-CeM field length 50 characters).

Scotland: for 1851‐71, truncations affect less than 0.01% of all addresses and for 1851 around 1% at most
1851 - truncated at the 70th character
1861 - truncated at the 76th character
1871 - truncated at the 82nd character
1881 - truncated at the 50th character.

Further information about I-CeM can be found on the I-CeM Integrated Microdata Project and I-CeM Guide webpages.
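The truncation positions listed above suggest a simple screening heuristic, sketched here in Python. It assumes that a field whose length exactly equals the quoted position may have been cut off; the description does not say whether each limit applies to names, addresses, or both, so `possibly_truncated` is only an illustrative helper and will also flag untruncated values that happen to have that length.

```python
# Truncation positions quoted in the description, keyed by (country, census year).
TRUNCATION_LIMITS = {
    ("England and Wales", 1851): 24,
    ("England and Wales", 1881): 16,
    ("Scotland", 1851): 70,
    ("Scotland", 1861): 76,
    ("Scotland", 1871): 82,
    ("Scotland", 1881): 50,
}


def possibly_truncated(value: str, country: str, year: int) -> bool:
    """Return True when a field is exactly as long as the quoted truncation position."""
    limit = TRUNCATION_LIMITS.get((country, year))
    return limit is not None and len(value) == limit


# A 16-character value in the 1881 England and Wales file is flagged for review.
print(possibly_truncated("A" * 16, "England and Wales", 1881))
print(possibly_truncated("SMITH", "England and Wales", 1881))
```

Flagged values would still need manual review against the original enumeration books or the redeposited files.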
Name Dictionaries for "wru" R Package
search.dataone.org
Updated Nov 8, 2023
Cite
Rosenman, Evan; Santiago Olivella; Kosuke Imai (2023). Name Dictionaries for "wru" R Package [Dataset]. http://doi.org/10.7910/DVN/7TRYAC
Explore at:
Unique identifier
https://doi.org/10.7910/DVN/7TRYAC
Dataset updated
Nov 8, 2023
Dataset provided by
Harvard Dataverse
Authors
Rosenman, Evan; Santiago Olivella; Kosuke Imai
Description
We provide four dictionaries that provide the racial distributions associated with names in the United States. These dictionaries are used by the latest iteration of the "WRU" package (Khanna et al., 2022) to make probabilistic predictions about the race of individuals, given their names and geolocations. The probabilities cover five racial categories: White, Black, Hispanic, Asian, and Other.

We provide two surname dictionaries. The first provides entries P(race | surname) for about 160K names, derived from the 2010 Census surname list, aggregated with the Census Spanish surname list. The second provides analogous probabilities for 1.48MM surnames. This dictionary is created by starting with the Census-based dictionary and supplementing it with race distributions estimated from the voter files of six Southern states (Alabama, Florida, Georgia, Louisiana, North Carolina, and South Carolina) that collect race data.

We also provide dictionaries estimating P(race | first name) and P(race | middle name). These dictionaries, which contain 1.04MM and 1.16MM names respectively, are sourced exclusively from the voter files of the six Southern states.

References
Kabir Khanna, Brandon Bertelsen, Santiago Olivella, Evan Rosenman and Kosuke Imai (2022). wru: Who are You? Bayesian Prediction of Racial Category Using Surname, First Name, Middle Name, and Geolocation. R package version 1.0.0. https://CRAN.R-project.org/package=wru
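The description does not spell out how a name-based probability is combined with a location. One common approach is Bayes' rule under the assumption that surname and location are independent given race, so that P(race | surname, location) is proportional to P(race | surname) × P(location | race). The Python sketch below illustrates that calculation only; it is not the wru package's code, and every number in it is made up.

```python
RACES = ("White", "Black", "Hispanic", "Asian", "Other")


def combine(p_race_given_surname, p_geo_given_race):
    """Posterior P(race | surname, geo), assuming surname and geo are independent given race."""
    unnormalized = {
        race: p_race_given_surname[race] * p_geo_given_race[race]
        for race in RACES
    }
    total = sum(unnormalized.values())
    if total == 0:
        raise ValueError("all joint probabilities are zero")
    return {race: value / total for race, value in unnormalized.items()}


# Made-up numbers for illustration only; not taken from any dictionary.
surname_probs = {"White": 0.60, "Black": 0.25, "Hispanic": 0.05, "Asian": 0.05, "Other": 0.05}
geo_likelihood = {"White": 0.001, "Black": 0.004, "Hispanic": 0.002, "Asian": 0.001, "Other": 0.001}
posterior = combine(surname_probs, geo_likelihood)
print(max(posterior, key=posterior.get))
```

In this toy case the location evidence shifts the most probable category away from the one the surname alone would suggest, which is the reason for combining the two sources.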
Most popular boys' names Germany in 2023
statista.com
Updated Jan 13, 2025
Cite
Statista (2025). Most popular boys' names Germany in 2023 [Dataset]. https://www.statista.com/statistics/1334810/most-popular-boys-names-germany/
Explore at:
Dataset updated
Jan 13, 2025
Dataset authored and provided by
Statista (http://statista.com/)
Time period covered
2023
Area covered
Germany
Description
In 2023, the most popular name for a boy in Germany was Matt(h)eo/Mat(h)eo. The second most popular name was Noah. This statistic shows the most popular boys' names in Germany in 2023.
2023 Major Demographics by US Census Block Group
dataverse.harvard.edu
Updated Mar 7, 2025
Cite
Michael Bryan (2025). 2023 Major Demographics by US Census Block Group [Dataset]. http://doi.org/10.7910/DVN/9AEYAS
Explore at:
Unique identifier
https://doi.org/10.7910/DVN/9AEYAS
Dataset updated
Mar 7, 2025
Dataset provided by
Harvard Dataverse
Authors
Michael Bryan
License
CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
License information was derived automatically
Description
blockgroupdemographics

A selection of variables from the US Census Bureau's American Community Survey 5YR and TIGER/Line publications.

Overview
The U.S. Census Bureau published its American Community Survey 5 Year with more than 37,000 variables. Most ACS advanced users will have their personal list of favorites, but this conventional wisdom is not available to occasional analysts. This publication re-shares 174 selected demographic variables from the U.S. Census Bureau to provide a supplement to Open Environments Block Group publications. These results do not reflect any proprietary or predictive model. Rather, they extract from Census Bureau results. For additional support or more detail, please see the Census Bureau citations below.

The first 170 demographic variables are taken from popular variables in the American Community Survey (ACS) including age, race, income, education and family structure. A full list of ACS variable names and definitions can be found in the ACS 'Table Shells' here: https://www.census.gov/programs-surveys/acs/technical-documentation/table-shells.html.

The dataset includes 4 additional columns from the Census' TIGER/Line publication. See Open Environment's 2023blockgroupcartographics publication for the shapes of each block group. For each block group, the dataset includes land area (ALAND), water area (AWATER), interpolated latitude (INTPTLAT) and longitude (INTPTLON). These are valuable for calculating population density variables which combine ACS populations and TIGER land area.

Files
The resulting dataset is available with other block group based datasets on Harvard's Dataverse https://dataverse.harvard.edu/ in Open Environment's Block Group Dataverse https://dataverse.harvard.edu/dataverse/blockgroupdatasets/. This data simply requires CSV reader software or Python's pandas package. Supporting the data file is acsvars.csv, a list of the Census variable names and their corresponding descriptions.

Citations
"American Community Survey 5-Year Data (2019-2023)." Census.gov, US Census Bureau, https://www.census.gov/data/developers/data-sets/acs-5year.html.
2023 "American Community Survey, Table Shells and Table List." Census.gov, US Census Bureau, https://www.census.gov/programs-surveys/acs/technical-documentation/table-shells.html
Python Package Index - PyPI. Python Software Foundation. "A simple wrapper for the United States Census Bureau's API." Retrieved from https://pypi.org/project/census/
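The population density calculation mentioned in the overview can be sketched with Python's standard csv module. TIGER/Line reports ALAND in square meters; the `GEOID` and `TOTAL_POP` column names and the two sample rows below are placeholders, since the description does not list the dataset's actual column names.

```python
import csv
import io
from typing import Optional

SQ_M_PER_SQ_KM = 1_000_000


def density_per_sq_km(population: float, aland_sq_m: float) -> Optional[float]:
    """People per square kilometre of land; None when the block group has no land area."""
    if aland_sq_m <= 0:
        return None
    return population / (aland_sq_m / SQ_M_PER_SQ_KM)


# Hypothetical two-row extract; GEOID and TOTAL_POP are placeholder column names.
sample = io.StringIO(
    "GEOID,TOTAL_POP,ALAND\n"
    "500010001001,1500,3000000\n"
    "500010001002,0,0\n"
)
densities = {
    row["GEOID"]: density_per_sq_km(float(row["TOTAL_POP"]), float(row["ALAND"]))
    for row in csv.DictReader(sample)
}
print(densities)
```

Returning None for block groups with zero land area (for example, water-only block groups) avoids a division by zero and keeps those rows easy to filter out.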
VT Data – 2020 Census Block Group
geodata.vermont.gov
arc-gis-hub-home-arcgishub.hub.arcgis.com
+4more
Updated Aug 12, 2021
Cite
VT Center for Geographic Information (2021). VT Data – 2020 Census Block Group [Dataset]. https://geodata.vermont.gov/maps/vt-data-2020-census-block-group
Explore at:
Dataset updated
Aug 12, 2021
Dataset authored and provided by
VT Center for Geographic Information
Description
This layer contains a Vermont-only subset of block group level 2020 Decennial Census redistricting data as reported by the U.S. Census Bureau for all states plus DC and Puerto Rico. The attributes come from the 2020 Public Law 94-171 (P.L. 94-171) tables.

Data download date: August 12, 2021
Census tables: P1, P2, P3, P4, H1, P5, Header
Downloaded from: Census FTP site

Processing Notes:
Data was downloaded from the U.S. Census Bureau FTP site, imported into SAS format and joined to the 2020 TIGER boundaries. Boundaries are sourced from the 2020 TIGER/Line Geodatabases. Boundaries have been projected into Web Mercator and each attribute has been given a clear descriptive alias name. No alterations have been made to the vertices of the data.
Each attribute maintains its specified name from Census, but also has a descriptive alias name and long description derived from the technical documentation provided by the Census.
For a detailed list of the attributes contained in this layer, view the Data tab and select "Fields".
The following alterations have been made to the tabular data:
Joined all tables to create one wide attribute table:
P1 - Race
P2 - Hispanic or Latino, and not Hispanic or Latino by Race
P3 - Race for the Population 18 Years and Over
P4 - Hispanic or Latino, and not Hispanic or Latino by Race for the Population 18 Years and Over
H1 - Occupancy Status (Housing)
P5 - Group Quarters Population by Group Quarters Type (correctional institutions, juvenile facilities, nursing facilities/skilled nursing, college/university student housing, military quarters, etc.)
Header
After joining, dropped fields: FILEID, STUSAB, CHARITER, CIFSN, LOGRECNO, GEOVAR, GEOCOMP, LSADC, and BLOCK.
GEOCOMP was renamed to GEOID and moved to be the first column in the table; the original GEOID was dropped.
Placeholder fields for future legislative districts have been dropped: CD118, CD119, CD120, CD121, SLDU22, SLDU24, SLDU26, SLDU28, SLDL22, SLDL24, SLDL26, SLDL28.
P0020001 was dropped, as it is duplicative of P0010001. Similarly, P0040001 was dropped, as it is duplicative of P0030001.
In addition to calculated fields, County_Name and State_Name were added.
The following calculated fields have been added (see long field descriptions in the Data tab for formulas used):
PCT_P0030001: Percent of Population 18 Years and Over
PCT_P0020002: Percent Hispanic or Latino
PCT_P0020005: Percent White alone, not Hispanic or Latino
PCT_P0020006: Percent Black or African American alone, not Hispanic or Latino
PCT_P0020007: Percent American Indian and Alaska Native alone, not Hispanic or Latino
PCT_P0020008: Percent Asian alone, not Hispanic or Latino
PCT_P0020009: Percent Native Hawaiian and Other Pacific Islander alone, not Hispanic or Latino
PCT_P0020010: Percent Some Other Race alone, not Hispanic or Latino
PCT_P0020011: Percent Population of Two or More Races, not Hispanic or Latino
PCT_H0010002: Percent of Housing Units that are Occupied
PCT_H0010003: Percent of Housing Units that are Vacant
Please note these percentages might look strange at the individual block group level, since this data has been protected using differential privacy.*
VCGI exported a Vermont-only subset of the nation-wide layer to produce this layer--with fields limited to this popular subset:
OBJECTID: OBJECTID
GEOID: Geographic Record Identifier
NAME: Area Name-Legal/Statistical Area Description (LSAD) Term-Part Indicator
County_Name: County Name
State_Name: State Name
P0010001: Total Population
P0010003: Population of one race: White alone
P0010004: Population of one race: Black or African American alone
P0010005: Population of one race: American Indian and Alaska Native alone
P0010006: Population of one race: Asian alone
P0010007: Population of one race: Native Hawaiian and Other Pacific Islander alone
P0010008: Population of one race: Some Other Race alone
P0020002: Hispanic or Latino Population
P0020003: Non-Hispanic or Latino Population
P0030001: Total population 18 years and over
H0010001: Total housing units
H0010002: Total occupied housing units
H0010003: Total vacant housing units
P0050001: Total group quarters population
PCT_P0030001: Percent of Population 18 Years and Over
PCT_P0020002: Percent Hispanic or Latino
PCT_P0020005: Percent White alone, not Hispanic or Latino
PCT_P0020006: Percent Black or African American alone, not Hispanic or Latino
PCT_P0020007: Percent American Indian and Alaska Native alone, not Hispanic or Latino
PCT_P0020008: Percent Asian alone, not Hispanic or Latino
PCT_P0020009: Percent Native Hawaiian and Other Pacific Islander alone, not Hispanic or Latino
PCT_P0020010: Percent Some Other Race alone, not Hispanic or Latino
PCT_P0020011: Percent Population of two or more races, not Hispanic or Latino
PCT_H0010002: Percent of Housing Units that are Occupied
PCT_H0010003: Percent of Housing Units that are Vacant
SUMLEV: Summary Level
REGION: Region
DIVISION: Division
COUNTY: County (FIPS)
COUNTYNS: County (NS)
TRACT: Census Tract
BLKGRP: Block Group
AREALAND: Area (Land)
AREAWATR: Area (Water)
INTPTLAT: Internal Point (Latitude)
INTPTLON: Internal Point (Longitude)
BASENAME: Area Base Name
POP100: Total Population Count
HU100: Total Housing Count
*To protect the privacy and confidentiality of respondents, data has been protected using differential privacy techniques by the U.S. Census Bureau. This means that some individual block groups will have values that are inconsistent or improbable. However, when aggregated up, these issues become minimized.
Download Census redistricting data in this layer as a file geodatabase.

Additional links:
U.S. Census Bureau
U.S. Census Bureau Decennial Census
About the 2020 Census
2020 Census
2020 Census data quality
Decennial Census P.L. 94-171 Redistricting Data Program
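
The note on differential privacy says that improbable block group values become less of a problem once counts are aggregated up. A minimal sketch of one such aggregation, summing block group counts to the county level with pandas, is shown here. It assumes the layer has been exported to a table that keeps the COUNTY, P0010001, and H0010001 fields listed above; the function name `aggregate_to_county` and the sample rows are hypothetical and are not taken from the dataset.

```python
import pandas as pd


def aggregate_to_county(block_groups: pd.DataFrame, count_fields: list[str]) -> pd.DataFrame:
    """Sum block group counts within each county (COUNTY holds the county FIPS code)."""
    return block_groups.groupby("COUNTY", as_index=False)[count_fields].sum()


# Made-up rows for illustration only. The second row has housing units but
# no population, the kind of improbable value the note above warns about.
bg = pd.DataFrame(
    {
        "COUNTY": ["001", "001", "003"],
        "P0010001": [1200, 0, 950],  # total population
        "H0010001": [0, 15, 400],    # total housing units
    }
)

counties = aggregate_to_county(bg, ["P0010001", "H0010001"])
print(counties)
```

Only count fields should be summed this way. The PCT_ fields are ratios and would need to be recomputed from the aggregated counts rather than added together.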
County
data.vermont.gov
Updated Jul 9, 2024
Cite
US Census (2024). County [Dataset]. https://data.vermont.gov/Government/County/3dr5-ewdb
Explore at:
Available download formats: application/rssxml, csv, kml, tsv, xml, application/rdfxml, application/geo+json, kmz
Dataset updated
Jul 9, 2024
Dataset authored and provided by
US Census
License
https://www.usa.gov/government-works
Description
This layer contains a Vermont-only subset of county level 2020 Decennial Census redistricting data as reported by the U.S. Census Bureau for all states plus DC and Puerto Rico. The attributes come from the 2020 Public Law 94-171 (P.L. 94-171) tables.

Data download date: August 12, 2021
Census tables: P1, P2, P3, P4, H1, P5, Header
Downloaded from: Census FTP site

Processing Notes:
Data was downloaded from the U.S. Census Bureau FTP site, imported into SAS format and joined to the 2020 TIGER boundaries. Boundaries are sourced from the 2020 TIGER/Line Geodatabases. Boundaries have been projected into Web Mercator and each attribute has been given a clear descriptive alias name. No alterations have been made to the vertices of the data.
Each attribute maintains its specified name from Census, but also has a descriptive alias name and long description derived from the technical documentation provided by the Census.
For a detailed list of the attributes contained in this layer, view the Data tab and select "Fields".
The following alterations have been made to the tabular data:
Joined all tables to create one wide attribute table:
P1 - Race
P2 - Hispanic or Latino, and not Hispanic or Latino by Race
P3 - Race for the Population 18 Years and Over
P4 - Hispanic or Latino, and not Hispanic or Latino by Race for the Population 18 Years and Over
H1 - Occupancy Status (Housing)
P5 - Group Quarters Population by Group Quarters Type (correctional institutions, juvenile facilities, nursing facilities/skilled nursing, college/university student housing, military quarters, etc.)
Header
After joining, dropped fields: FILEID, STUSAB, CHARITER, CIFSN, LOGRECNO, GEOVAR, GEOCOMP, LSADC, BLOCK, BLKGRP, TRACT, COUSUB, COUSUBCC, COUSUBNS, SUBMCD, SUBMCDCC, SUBMCDNS, ESTATE, ESTATECC, ESTATENS, CONCIT, CONCITCC, CONCITNS, PLACE, PLACECC, PLACENS, AIANHH, AIHHTLI, AIANHHFP, AIANHHCC, AIANHHNS, AITS, AITSFP, AITSCC, AITSNS, TTRACT, TBLKGRP, ANRC, ANRCCC, ANRCNS, NECTA, NMEMI, CNECTA, NECTADIV, CBSAPCI, NECTAPCI, UA, UATYPE, UR, CD116, CD118, CD119, CD120, CD121, SLDU18, SLDU22, SLDU24, SLDU26, SLDU28, SLDL18, SLDL22, SLDL24, SLDL26, SLDL28, VTD, VTDI, ZCTA, SDELM, SDSEC, SDUNI, and PUMA.
GEOCOMP was renamed to GEOID and moved to be the first column in the table; the original GEOID was dropped.
P0020001 was dropped, as it is duplicative of P0010001. Similarly, P0040001 was dropped, as it is duplicative of P0030001.
The following calculated fields have been added (see long field descriptions in the Data tab for formulas used):
PCT_P0030001: Percent of Population 18 Years and Over
PCT_P0020002: Percent Hispanic or Latino
PCT_P0020005: Percent White alone, not Hispanic or Latino
PCT_P0020006: Percent Black or African American alone, not Hispanic or Latino
PCT_P0020007: Percent American Indian and Alaska Native alone, not Hispanic or Latino
PCT_P0020008: Percent Asian alone, not Hispanic or Latino
PCT_P0020009: Percent Native Hawaiian and Other Pacific Islander alone, not Hispanic or Latino
PCT_P0020010: Percent Some Other Race alone, not Hispanic or Latino
PCT_P0020011: Percent Population of Two or More Races, not Hispanic or Latino
PCT_H0010002: Percent of Housing Units that are Occupied
PCT_H0010003: Percent of Housing Units that are Vacant
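
The formulas behind the calculated fields are documented only in the layer's Data tab, so the following is a hedged sketch rather than the publisher's method. It assumes each percentage is the count field divided by its table total (P0010001 for population fields, H0010001 for housing fields) times 100, rounded to one decimal place. The function name `add_percent_fields`, the rounding, and the sample values are illustrative assumptions.

```python
import pandas as pd

# Assumed numerator/denominator pairs; the authoritative formulas are in the
# long field descriptions of the layer, which are not reproduced here.
PERCENT_FIELDS = {
    "PCT_P0030001": ("P0030001", "P0010001"),  # population 18 and over / total population
    "PCT_P0020002": ("P0020002", "P0010001"),  # Hispanic or Latino / total population
    "PCT_H0010002": ("H0010002", "H0010001"),  # occupied units / total housing units
    "PCT_H0010003": ("H0010003", "H0010001"),  # vacant units / total housing units
}


def add_percent_fields(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of df with the assumed percentage columns added."""
    out = df.copy()
    for new_field, (numerator, denominator) in PERCENT_FIELDS.items():
        # Replace zero denominators with NaN so empty areas yield NaN, not an error or inf.
        denom = out[denominator].where(out[denominator] != 0)
        out[new_field] = (out[numerator] / denom * 100).round(1)
    return out


# Made-up counts for illustration only; the second row is an empty area.
sample = pd.DataFrame(
    {
        "P0010001": [1000, 0],
        "P0030001": [800, 0],
        "P0020002": [50, 0],
        "H0010001": [500, 0],
        "H0010002": [450, 0],
        "H0010003": [50, 0],
    }
)

result = add_percent_fields(sample)
print(result[list(PERCENT_FIELDS)])
```

Leaving the percentage undefined (NaN) where the denominator is zero is a design choice made for this sketch; the published layer may handle empty areas differently.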
VCGI exported a Vermont-only subset of the nation-wide layer to produce this layer--with fields limited to this popular subset:
OBJECTID: OBJECTID
GEOID: Geographic Record Identifier
NAME: Area Name-Legal/Statistical Area Description (LSAD) Term-Part Indicator
State: State
P0010001: Total Population
P0010003: Population of one race: White alone
P0010004: Population of one race: Black or African American alone
P0010005: Population of one race: American Indian and Alaska Native alone
P0010006: Population of one race: Asian alone
P0010007: Population of one race: Native Hawaiian and Other Pacific Islander alone
P0010008: Population of one race: Some Other Race alone
P0020002: Hispanic or Latino Population
P0020003: Non-Hispanic or Latino Population
P0030001: Total population 18 years and over
H0010001: Total housing units
H0010002: Total occupied housing units
H0010003: Total vacant housing units
P0050001: Total group quarters population
PCT_P0030001: Percent of Population 18 Years and Over
PCT_P0020002: Percent Hispanic or Latino
PCT_P0020005: Percent White alone, not Hispanic or Latino
PCT_P0020006: Percent Black or African American alone, not Hispanic or Latino
PCT_P0020007: Percent American Indian and Alaska Native alone, not Hispanic or Latino
PCT_P0020008: Percent Asian alone, not Hispanic or Latino
PCT_P0020009: Percent Native Hawaiian and Other Pacific Islander alone, not Hispanic or Latino
PCT_P0020010: Percent Some Other Race alone, not Hispanic or Latino
PCT_P0020011: Percent Population of two or more races, not Hispanic or Latino
PCT_H0010002: Percent of Housing Units that are Occupied
PCT_H0010003: Percent of Housing Units that are Vacant
SUMLEV: Summary Level
REGION: Region
DIVISION: Division
COUNTY: County (FIPS)
COUNTYNS: County (NS)
AREALAND: Area (Land)
AREAWATR: Area (Water)
INTPTLAT: Internal Point (Latitude)
INTPTLON: Internal Point (Longitude)
BASENAME: Area Base Name
POP100: Total Population Count
HU100: Total Housing Count
Download Census redistricting data in this layer as a file geodatabase.

Data for: Demographic aspects of first names

18 scholarly articles cite this dataset (View in Google Scholar)

Data for: Demographic aspects of first names

Frequently Occurring Surnames from the 1990 Census

Census Data

Popular Baby Names

Names from Census 1990.

Baby Names from Social Security Card Applications - National Data

Integrated Census Microdata (I-CeM) Names and Addresses, 1851-1911: Special...

Frequently Occurring Surnames from the Census 2000

The 1915 Iowa State Census Project

Danish Census Handwritten Names (Large)

Most popular first names for children in Poland H1 2024

Miscellaneous US Census Crosswalks, 1850-1930

Baby names for boys in England and Wales

Frequently Occurring Surnames from the 2010 Census

Integrated Census Microdata (I-CeM) Names and Addresses, 1851-1911: Special...

Name Dictionaries for "wru" R Package

Most popular boys' names Germany in 2023

2023 Major Demographics by US Census Block Group

VT Data – 2020 Census Block Group

County
