100+ datasets found

USA Name Data
kaggle.com
zip
Updated Feb 12, 2019
Cite
Data.gov (2019). USA Name Data [Dataset]. https://www.kaggle.com/datasets/datagov/usa-names
Available download formats
zip (0 bytes)
Dataset updated
Feb 12, 2019
Dataset provided by
Data.gov (https://data.gov/)
License
https://creativecommons.org/publicdomain/zero/1.0/
Area covered
United States
Description
Context

Cultural diversity in the U.S. has led to great variations in names and naming traditions and names have been used to express creativity, personality, cultural identity, and values. Source: https://en.wikipedia.org/wiki/Naming_in_the_United_States

Content

This public dataset was created by the Social Security Administration and contains all names from Social Security card applications for births that occurred in the United States after 1879. Note that many people born before 1937 never applied for a Social Security card, so their names are not included in this data. For others who did apply, records may not show the place of birth, and again their names are not included in the data.

All data are from a 100% sample of records on Social Security card applications as of the end of February 2015. To safeguard privacy, the Social Security Administration restricts names to those with at least 5 occurrences.

Fork this kernel to get started with this dataset.

Acknowledgements

https://bigquery.cloud.google.com/dataset/bigquery-public-data:usa_names

https://cloud.google.com/bigquery/public-data/usa-names

Dataset Source: Data.gov. This dataset is publicly available for anyone to use under the following terms provided by the Dataset Source — http://www.data.gov/privacy-policy#data_policy — and is provided "AS IS" without any warranty, express or implied, from Google. Google disclaims all liability for any damages, direct or indirect, resulting from the use of the dataset.

Banner Photo by @dcp from Unsplash.

Inspiration

What are the most common names?

What are the most common female names?

Are there more female or male names?

Female names by a wide margin?
Baby Names from Social Security Card Applications - National Data
catalog.data.gov
res1catalogd-o-tdatad-o-tgov.vcapture.xyz
Updated Jul 4, 2025
Cite
Social Security Administration (2025). Baby Names from Social Security Card Applications - National Data [Dataset]. https://catalog.data.gov/dataset/baby-names-from-social-security-card-applications-national-data
Dataset updated
Jul 4, 2025
Dataset provided by
Social Security Administration (http://ssa.gov/)
Description
The data (name, year of birth, sex, and number) are from a 100 percent sample of Social Security card applications for 1880 onward.
Nyc popular baby names
kaggle.com
Updated Jun 20, 2022
Cite
Rahul Sarkar (2022). Nyc popular baby names [Dataset]. https://www.kaggle.com/datasets/rahulsarkar221/nyc-popular-baby-names
Dataset updated
Jun 20, 2022
Dataset provided by
Kaggle
Authors
Rahul Sarkar
License
https://creativecommons.org/publicdomain/zero/1.0/
Area covered
New York
Description
This data contains popular baby names in New York.

Dataset: 1 file (popular-baby-names.csv)

Columns:
- Year of Birth: Year of the baby's birth.
- Gender: Gender of the baby.
- Ethnicity: Ethnicity of the baby.
- Child's First Name: The first name of the child.
- Count: How many babies were given that name.
- Ranking: Ranking of that name.
Race and ethnicity data for first, middle, and last names
search.dataone.org
Updated Nov 8, 2023
Cite
Rosenman, Evan; Olivella, Santiago; Imai, Kosuke (2023). Race and ethnicity data for first, middle, and last names [Dataset]. http://doi.org/10.7910/DVN/SGKW0K
Unique identifier
https://doi.org/10.7910/DVN/SGKW0K
Dataset updated
Nov 8, 2023
Dataset provided by
Harvard Dataverse
Authors
Rosenman, Evan; Olivella, Santiago; Imai, Kosuke
Description
We provide datasets that estimate the racial distributions associated with first, middle, and last names in the United States. The datasets cover five racial categories: White, Black, Hispanic, Asian, and Other. The provided data are computed from the voter files of six Southern states -- Alabama, Florida, Georgia, Louisiana, North Carolina, and South Carolina -- that collect race and ethnicity data upon registration. We include seven voter files per state, sourced between 2018 and 2021 from L2, Inc. Together, these states have approximately 36 million individuals who provide self-reported race and ethnicity. The last name dataset includes 338K surnames, while the middle name dataset contains 126K middle names and the first name dataset includes 136K first names. For each type of name, we provide a dataset of P(race | name) probabilities and P(name | race) probabilities. We include only names that appear at least 25 times across the 42 (= 7 voter files * 6 states) voter files in our dataset. These data are closely related to the dataset "Name Dictionaries for "wru" R Package", https://doi.org/10.7910/DVN/7TRYAC. These are the probabilities used in the latest iteration of the "WRU" package (Khanna et al., 2022) to make probabilistic predictions about the race of individuals, given their names and geolocations.
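As a rough illustration of how the two probability tables relate to each other, the Python sketch below derives P(race | name) and P(name | race) from a table of joint counts. The counts, the two example names, and the function name `conditional_tables` are invented for illustration; they are not taken from the dataset and this is not the authors' code.

```python
from collections import defaultdict

# Hypothetical (name, race) counts; NOT real values from the dataset.
counts = {
    ("Alex", "White"): 60,
    ("Alex", "Black"): 20,
    ("Alex", "Hispanic"): 20,
    ("Maria", "White"): 10,
    ("Maria", "Hispanic"): 90,
}


def conditional_tables(counts):
    """Return (P(race | name), P(name | race)) as nested dictionaries."""
    name_totals = defaultdict(int)
    race_totals = defaultdict(int)
    for (name, race), n in counts.items():
        name_totals[name] += n
        race_totals[race] += n

    p_race_given_name = defaultdict(dict)
    p_name_given_race = defaultdict(dict)
    for (name, race), n in counts.items():
        # Normalise each joint count by the relevant marginal total.
        p_race_given_name[name][race] = n / name_totals[name]
        p_name_given_race[race][name] = n / race_totals[race]
    return dict(p_race_given_name), dict(p_name_given_race)


p_race_given_name, p_name_given_race = conditional_tables(counts)
print(p_race_given_name["Alex"]["White"])      # 60 / 100
print(p_name_given_race["Hispanic"]["Maria"])  # 90 / 110
```

The same joint counts feed both tables; only the normalising total differs (per name in one case, per race in the other).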
Facebook Names Dataset
academictorrents.com
bittorrent
Updated Nov 11, 2015
Cite
Ron Bowes (Skull Security) (2015). Facebook Names Dataset [Dataset]. https://academictorrents.com/details/e54c73099d291605e7579b90838c2cd86a8e9575
Available download formats
bittorrent (2991052604)
Dataset updated
Nov 11, 2015
Dataset authored and provided by
Ron Bowes (Skull Security)
License
https://academictorrents.com/nolicensespecified
Description
171 million names (100 million unique)

This torrent contains:
- The URL of every searchable Facebook user's profile
- The name of every searchable Facebook user, both unique and by count (perfect for post-processing, datamining, etc)
- Processed lists, including first names with count, last names with count, potential usernames with count, etc
- The programs I used to generate everything

So, there you have it: lots of awesome data from Facebook. Now, I just have to find one more problem with Facebook so I can write "Revenge of the Facebook Snatchers" and complete the trilogy. Any suggestions? >:-)

Limitations

So far, I have only indexed the searchable users, not their friends. Getting their friends will be significantly more data to process, and I don't have those capabilities right now. I'd like to tackle that in the future, though, so if anybody has any bandwidth they'd like to donate, all I need is an ssh account and Nmap installed. An additional limitation is that these are on
Most Popular Baby Names
data.chhs.ca.gov
data.ca.gov
+3more
csv, zip
Updated Dec 30, 2024
Cite
California Department of Public Health (2024). Most Popular Baby Names [Dataset]. https://data.chhs.ca.gov/dataset/most-popular-baby-names-2005-current
Available download formats
csv (1219), csv (121160), zip
Dataset updated
Dec 30, 2024
Dataset authored and provided by
California Department of Public Health (https://www.cdph.ca.gov/)
Description
This dataset contains ranks and counts for the top 25 baby names by sex for live births that occurred in California (by occurrence) based on information entered on birth certificates.
Namesakes
figshare.com
json
Updated Nov 20, 2021
Cite
Oleg Vasilyev; Aysu Altun; Nidhi Vyas; Vedant Dharnidharka; Erika Lampert; John Bohannon (2021). Namesakes [Dataset]. http://doi.org/10.6084/m9.figshare.17009105.v1
Available download formats
json
Unique identifier
https://doi.org/10.6084/m9.figshare.17009105.v1
Dataset updated
Nov 20, 2021
Dataset provided by
figshare
Authors
Oleg Vasilyev; Aysu Altun; Nidhi Vyas; Vedant Dharnidharka; Erika Lampert; John Bohannon
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Abstract

Motivation: creating a challenging dataset for testing Named-Entity Linking. The Namesakes dataset consists of three closely related datasets: Entities, News and Backlinks. Entities were collected as Wikipedia text chunks corresponding to highly ambiguous entity names. The News were collected as random news text chunks, containing mentions that either belong to the Entities dataset or can be easily confused with them. Backlinks were obtained from Wikipedia dump data with intention to have mentions linked to the entities of the Entity dataset. The Entities and News are human-labeled, resolving the mentions of the entities.

Methods

Entities were collected as Wikipedia text chunks corresponding to highly ambiguous entity names: the most popular people names, the most popular locations, and organizations with name ambiguity. In each Entities text chunk, the named entities with the name similar to the chunk Wikipedia page name are labeled. For labeling, these entities were suggested to human annotators (odetta.ai) to tag as "Same" (same as the page entity) or "Other". The labeling was done by 6 experienced annotators that passed through a preliminary trial task. The only accepted tags are the tags assigned in agreement by not less than 5 annotators, and then passed through reconciliation with an experienced reconciliator.

The News were collected as random news text chunks, containing mentions which either belong to the Entities dataset or can be easily confused with them. In each News text chunk one mention was selected for labeling, and 3-10 Wikipedia pages from Entities were suggested as the labels for an annotator to choose from. The labeling was done by 3 experienced annotators (odetta.ai), after the annotators passed a preliminary trial task. The results were reconciled by an experienced reconciliator. All the labeling was done using Lighttag (lighttag.io).

Backlinks were obtained from Wikipedia dump data (dumps.wikimedia.org/enwiki/20210701) with intention to have mentions linked to the entities of the Entity dataset. The backlinks were filtered to leave only mentions in a good quality text; each text was cut 1000 characters after the last mention.

Usage Notes

Entities:

File: Namesakes_entities.jsonl

The Entities dataset consists of 4148 Wikipedia text chunks containing human-tagged mentions of entities. Each mention is tagged either as "Same" (meaning that the mention is of this Wikipedia page entity), or "Other" (meaning that the mention is of some other entity, just having the same or similar name). The Entities dataset is a jsonl list, each item is a dictionary with the following keys and values:
- 'pagename': page name of the Wikipedia page.
- 'pageid': page id of the Wikipedia page.
- 'title': title of the Wikipedia page.
- 'url': URL of the Wikipedia page.
- 'text': the text chunk from the Wikipedia page.
- 'entities': list of the mentions in the page text; each entity is represented by a dictionary with the keys:
  - 'text': the mention as a string from the page text.
  - 'start': start character position of the entity in the text.
  - 'end': end (one-past-last) character position of the entity in the text.
  - 'tag': annotation tag given as a string, either 'Same' or 'Other'.
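The key layout described above could be read along the following lines. This is a minimal Python sketch, not code shipped with the dataset: the sample record is made up to match the described keys, and `mentions_by_tag` is a hypothetical helper name.

```python
import json

# A made-up record following the key layout described above; the values are
# illustrative only and do not come from Namesakes_entities.jsonl.
sample_line = json.dumps({
    "pagename": "John_Smith_(explorer)",
    "pageid": 1,
    "title": "John Smith (explorer)",
    "url": "https://en.wikipedia.org/wiki/John_Smith_(explorer)",
    "text": "John Smith was an explorer. Another John Smith was a poet.",
    "entities": [
        {"text": "John Smith", "start": 0, "end": 10, "tag": "Same"},
        {"text": "John Smith", "start": 36, "end": 46, "tag": "Other"},
    ],
})


def mentions_by_tag(jsonl_lines):
    """Yield (title, mention, tag) for every labelled mention."""
    for line in jsonl_lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)
        for entity in record["entities"]:
            # 'end' is one past the last character, so slicing works directly.
            span = record["text"][entity["start"]:entity["end"]]
            yield record["title"], span, entity["tag"]


rows = list(mentions_by_tag([sample_line]))
for row in rows:
    print(row)
```

To process the real file, an open file handle can be passed in place of the list, since iterating over a text file yields one line per record.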

News:

File: Namesakes_news.jsonl

The News dataset consists of 1000 news text chunks, each one with a single annotated entity mention. The annotation either points to the corresponding entity from the Entities dataset (if the mention is of that entity), or indicates that the mentioned entity does not belong to the Entities dataset. The News dataset is a jsonl list, each item is a dictionary with the following keys and values:
- 'id_text': Id of the sample.
- 'text': The text chunk.
- 'urls': List of URLs of wikipedia entities suggested to labelers for identification of the entity mentioned in the text.
- 'entity': a dictionary describing the annotated entity mention in the text:
  - 'text': the mention as a string found by an NER model in the text.
  - 'start': start character position of the mention in the text.
  - 'end': end (one-past-last) character position of the mention in the text.
  - 'tag': This key exists only if the mentioned entity is annotated as belonging to the Entities dataset. If so, the value is a dictionary identifying the Wikipedia page assigned by annotators to the mentioned entity:
    - 'pageid': Wikipedia page id.
    - 'pagetitle': page title.
    - 'url': page URL.

Backlinks dataset: The Backlinks dataset consists of two parts: dictionary Entity-to-Backlinks and Backlinks documents. The dictionary points to backlinks for each entity of the Entity dataset (if any backlinks exist for the entity). The Backlinks documents are the backlinks Wikipedia text chunks with identified mentions of the entities from the Entities dataset.

Each mention is identified by surrounding double square brackets, e.g. "Muir built a small cabin along [[Yosemite Creek]].". However, if the mention differs from the exact entity name, the double square brackets wrap both the exact name and, separated by '|', the mention string to the right, for example: "Muir also spent time with photographer [[Carleton E. Watkins | Carleton Watkins]] and studied his photographs of Yosemite.".
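The bracket convention can be parsed with a short regular expression. The Python sketch below is one possible reading of the format, assuming entity names never contain '[', ']' or '|'; the names `MENTION_RE` and `extract_mentions` are hypothetical and not part of the dataset.

```python
import re

# Matches [[Entity name]] or [[Entity name | mention string]].
# Assumption: entity names contain no '[', ']' or '|' characters.
MENTION_RE = re.compile(r"\[\[([^\[\]|]+?)(?:\|([^\[\]]+?))?\]\]")


def extract_mentions(text):
    """Return a list of (entity_name, mention_string) pairs."""
    pairs = []
    for match in MENTION_RE.finditer(text):
        entity = match.group(1).strip()
        # Without a '|' part, the mention string equals the entity name.
        mention = (match.group(2) or entity).strip()
        pairs.append((entity, mention))
    return pairs


print(extract_mentions("Muir built a small cabin along [[Yosemite Creek]]."))
print(extract_mentions(
    "Muir also spent time with photographer "
    "[[Carleton E. Watkins | Carleton Watkins]] and studied his photographs."
))
```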

The Entity-to-Backlinks is a jsonl with 1527 items.

File: Namesakes_backlinks_entities.jsonl

Each item is a tuple:
- Entity name.
- Entity Wikipedia page id.
- Backlinks ids: a list of pageids of backlink documents.

The Backlinks documents is a jsonl with 26903 items.

File: Namesakes_backlinks_texts.jsonl

Each item is a dictionary:
- 'pageid': Id of the Wikipedia page.
- 'title': Title of the Wikipedia page.
- 'content': Text chunk from the Wikipedia page, with all mentions in the double brackets; the text is cut 1000 characters after the last mention, the cut is denoted as '...[CUT]'.
- 'mentions': List of the mentions from the text, for convenience. Each mention is a tuple:
  - Entity name.
  - Entity Wikipedia page id.
  - Sorted list of all character indexes at which the mention occurrences start in the text.
USA Names
console.cloud.google.com
Updated Mar 29, 2023
Cite
https://console.cloud.google.com/marketplace/browse?filter=partner:U.S.%20Social%20Security%20Administration&hl=ja&inv=1&invt=Ab5HbQ (2023). USA Names [Dataset]. https://console.cloud.google.com/marketplace/product/social-security-administration/us-names?hl=ja
Dataset updated
Mar 29, 2023
Dataset provided by
Google (http://google.com/)
Area covered
United States
Description
This public dataset was created by the Social Security Administration and contains all names from Social Security card applications for births that occurred in the United States after 1879. Note that many people born before 1937 never applied for a Social Security card, so their names are not included in this data. For others who did apply, records may not show the place of birth, and again their names are not included in the data.

All data are from a 100% sample of records on Social Security card applications as of the end of February 2015. To safeguard privacy, the Social Security Administration restricts names to those with at least 5 occurrences.

This public dataset is hosted in Google BigQuery and is included in BigQuery's 1TB/mo of free tier processing. This means that each user receives 1TB of free BigQuery processing every month, which can be used to run queries on this public dataset. Watch this short video to learn how to get started quickly using BigQuery to access public datasets.
Popular Baby Names - Dataset - data.sa.gov.au
data.sa.gov.au
Updated Mar 1, 2025
Cite
data.sa.gov.au (2025). Popular Baby Names - Dataset - data.sa.gov.au [Dataset]. https://data.sa.gov.au/data/dataset/popular-baby-names
Dataset updated
Mar 1, 2025
Dataset provided by
Government of South Australia (http://sa.gov.au/)
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Area covered
South Australia
Description
List of male and female baby names in South Australia from 1944 to 2024. The annual data for baby names is published January/February each year.
Baby Names by Year
kaggle.com
Updated Sep 20, 2022
Cite
The Devastator (2022). Baby Names by Year [Dataset]. https://www.kaggle.com/datasets/thedevastator/us-baby-names-by-year-of-birth/discussion
Dataset updated
Sep 20, 2022
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
The Devastator
License
https://creativecommons.org/publicdomain/zero/1.0/
Description
About this dataset

This dataset contains US baby names from the Social Security Administration dating back to 1880. With over 140 years of data, this is one of the most comprehensive datasets on baby names in the US. The data includes the name, year of birth, sex, and number of babies with that name for each year. This dataset is a great resource for anyone interested in studying baby naming trends over time.

How to use the dataset

How to use the US Baby Names by Year of Birth dataset:

This dataset is a compilation of over 140 years of data from the Social Security Administration. It includes data on baby names, year of birth, and sex. There are also columns for the number of babies with that name born in that year.

This dataset can be used to track changes in baby naming trends over time, or to study how the popularity of individual names has changed. It can also be used to study how naming trends differ between sexes, or between different years.

Research Ideas

This dataset could be used for a number of things, including:
1. Determining baby name trends over time
2. Finding out what the most popular baby names are in the US
3. Analyzing how baby name popularity has changed over the years

Columns

index: the index of the dataframe

YearOfBirth: the year in which the baby was born

Name: the name of the baby

Sex: the sex of the baby

Number: the number of babies with that name and sex
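As a small sketch of how the columns listed above might be used, the following Python snippet totals `Number` per `Name` and returns the most frequent names, optionally filtered by `Sex`. The sample rows are invented, and the "F"/"M" coding of `Sex` is an assumption about the file rather than something stated in the description.

```python
import csv
import io
from collections import Counter

# Tiny made-up sample using the column names listed above; not real SSA counts.
# Assumption: Sex is coded as "F" / "M".
SAMPLE_CSV = """index,YearOfBirth,Name,Sex,Number
0,1990,Emma,F,10
1,1990,Liam,M,7
2,1991,Emma,F,12
3,1991,Noah,M,9
"""


def top_names(csv_file, sex=None, n=3):
    """Return the n names with the largest total Number, optionally for one sex."""
    totals = Counter()
    for row in csv.DictReader(csv_file):
        if sex is None or row["Sex"] == sex:
            totals[row["Name"]] += int(row["Number"])
    return totals.most_common(n)


print(top_names(io.StringIO(SAMPLE_CSV)))
print(top_names(io.StringIO(SAMPLE_CSV), sex="M"))
```

For the real file, `open(path, newline="")` can replace the `io.StringIO` wrapper.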

Acknowledgements

If you use this dataset in your research, please credit @nickgott, @rflprr and the Social Security Administration via Data.gov

Data Source
fun-club-name-generator-dataset
huggingface.co
Updated Apr 5, 2025
Cite
Mitchell (2025). fun-club-name-generator-dataset [Dataset]. https://huggingface.co/datasets/Laurenfromhere/fun-club-name-generator-dataset
Dataset updated
Apr 5, 2025
Authors
Mitchell
Description
Fun Club Name Generator Dataset

This is a small, handcrafted dataset of random and fun club name ideas. The goal is to help people who are stuck naming something — whether it's a book club, a gaming group, a project, or just a Discord server between friends.

Why this?

A few friends and I spent hours trying to name a casual group — everything felt cringey, too serious, or already taken. We started writing down names that made us laugh, and eventually collected enough to… See the full description on the dataset page: https://huggingface.co/datasets/Laurenfromhere/fun-club-name-generator-dataset.
Geonames - All Cities with a population > 1000
public.opendatasoft.com
data.smartidf.services
+1more
csv, excel, geojson +1
Updated Mar 10, 2024
Cite
(2024). Geonames - All Cities with a population > 1000 [Dataset]. https://public.opendatasoft.com/explore/dataset/geonames-all-cities-with-a-population-1000/
Available download formats
csv, json, geojson, excel
Dataset updated
Mar 10, 2024
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
All cities with a population > 1000 or seats of adm div (ca 80.000)

Sources and Contributions
- Sources: GeoNames aggregates over a hundred different data sources.
- Ambassadors: GeoNames Ambassadors help in many countries.
- Wiki: A wiki allows users to view the data, quickly fix errors, and add missing places.
- Donations and Sponsoring: Costs for running GeoNames are covered by donations and sponsoring.

Enrichment: add country name
Top 100 Baby Names
data.qld.gov.au
researchdata.edu.au
+1more
csv
Updated Feb 13, 2025
Cite
Justice (2025). Top 100 Baby Names [Dataset]. https://www.data.qld.gov.au/dataset/top-100-baby-names
Available download formats
csv, csv (2 KiB), csv (200 KiB)
Dataset updated
Feb 13, 2025
Dataset authored and provided by
Justice
License
Attribution 3.0 (CC BY 3.0), https://creativecommons.org/licenses/by/3.0/
License information was derived automatically
Description
Queensland Top 100 Baby Names
Baby names for girls in England and Wales
ons.gov.uk
cy.ons.gov.uk
xlsx
Updated Jul 31, 2025
Cite
Office for National Statistics (2025). Baby names for girls in England and Wales [Dataset]. https://www.ons.gov.uk/peoplepopulationandcommunity/birthsdeathsandmarriages/livebirths/datasets/babynamesenglandandwalesbabynamesstatisticsgirls
Available download formats
xlsx
Dataset updated
Jul 31, 2025
Dataset provided by
Office for National Statistics (http://www.ons.gov.uk/)
License
Open Government Licence 3.0, http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/
License information was derived automatically
Description
Rank and count of the top names for baby girls, changes in rank since the previous year and breakdown by country, region, mother's age and month of birth.
Baby names for boys in England and Wales
ons.gov.uk
cy.ons.gov.uk
xlsx
Updated Jul 31, 2025
Cite
Office for National Statistics (2025). Baby names for boys in England and Wales [Dataset]. https://www.ons.gov.uk/peoplepopulationandcommunity/birthsdeathsandmarriages/livebirths/datasets/babynamesenglandandwalesbabynamesstatisticsboys
Available download formats
xlsx
Dataset updated
Jul 31, 2025
Dataset provided by
Office for National Statistics (http://www.ons.gov.uk/)
License
Open Government Licence 3.0, http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/
License information was derived automatically
Description
Rank and count of the top names for baby boys, changes in rank since the previous year and breakdown by country, region, mother's age and month of birth.
Baby Names: Beginning 2007
health.data.ny.gov
healthdata.gov
application/rdfxml +5
Updated Apr 25, 2025
Cite
New York State Department of Health (2025). Baby Names: Beginning 2007 [Dataset]. https://health.data.ny.gov/Health/Baby-Names-Beginning-2007/jxy9-yhdk
Available download formats
csv, application/rdfxml, application/rssxml, xml, json, tsv
Dataset updated
Apr 25, 2025
Dataset authored and provided by
New York State Department of Health
Description
New York State Baby Names are aggregated and displayed by the year, county, or borough where the mother resided as stated on a New York State or New York City (NYC) birth certificate. The frequency of the baby name is listed if there are 5 or more of the same baby name in a county outside of NYC or 10 or more of the same baby name in a NYC borough.
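The disclosure rule in the description can be written down as a one-line check. The Python sketch below is an illustration of that rule as stated, not the Department of Health's implementation; the function name `is_published` and its boolean parameter are hypothetical.

```python
def is_published(count, in_nyc_borough):
    """Apply the thresholds stated above: 10 for an NYC borough, 5 for a county outside NYC."""
    threshold = 10 if in_nyc_borough else 5
    return count >= threshold


# A name given to 7 babies would be listed for a county outside NYC,
# but not for an NYC borough.
print(is_published(7, in_nyc_borough=False))
print(is_published(7, in_nyc_borough=True))
```

One consequence is that a name missing from a borough's rows is not necessarily unused there; it may simply fall below the threshold.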
Popular White Last Names in the US
johnsnowlabs.com
csv
Updated Jan 20, 2021
Cite
John Snow Labs (2021). Popular White Last Names in the US [Dataset]. https://www.johnsnowlabs.com/marketplace/popular-white-last-names-in-the-us/
Available download formats
csv
Dataset updated
Jan 20, 2021
Dataset authored and provided by
John Snow Labs
Area covered
United States
Description
This dataset lists popular last names among the White population of the United States.
Historic US census - 1930
redivis.com
application/jsonl +7
Updated Jan 10, 2020
Cite
Stanford Center for Population Health Sciences (2020). Historic US census - 1930 [Dataset]. http://doi.org/10.57761/6e5q-rh85
Available download formats
application/jsonl, parquet, spss, csv, arrow, stata, avro, sas
Unique identifier
https://doi.org/10.57761/6e5q-rh85
Dataset updated
Jan 10, 2020
Dataset provided by
Redivis Inc.
Authors
Stanford Center for Population Health Sciences
Time period covered
Jan 1, 1930 - Dec 31, 1930
Area covered
United States
Description
Abstract

The Integrated Public Use Microdata Series (IPUMS) Complete Count Data include more than 650 million individual-level and 7.5 million household-level records. The microdata are the result of collaboration between IPUMS and the nation’s two largest genealogical organizations, Ancestry.com and FamilySearch, and provide the largest and richest source of individual level and household data.

Before Manuscript Submission

All manuscripts (and other items you'd like to publish) must be submitted to phsdatacore@stanford.edu for approval prior to journal submission.

We will check your cell sizes and citations.

For more information about how to cite PHS and PHS datasets, please visit:

https://phsdocs.developerhub.io/need-help/citing-phs-data-core

Documentation

This dataset was created on 2020-01-10 22:52:11.461 by merging multiple datasets together. The source datasets for this version were:

IPUMS 1930 households: This dataset includes all households from the 1930 US census.

IPUMS 1930 persons: This dataset includes all individuals from the 1930 US census.

IPUMS 1930 Lookup: This dataset includes variable names, variable labels, variable values, and corresponding variable value labels for the IPUMS 1930 datasets.

Section 2

Historic data are scarce and often exist only in aggregate tables. The key advantage of historic US census data is the availability of individual and household level characteristics that researchers can tabulate in ways that benefit their specific research questions. The data contain demographic variables, economic variables, migration variables and family variables. Within households, it is possible to create relational data as all relations between household members are known. For example, having data on the mother and her children in a household enables researchers to calculate the mother’s age at birth. Another advantage of the Complete Count data is the possibility to follow individuals over time using a historical identifier.

In sum: the historic US census data are a unique source for research on social and economic change and can provide population health researchers with information about social and economic determinants.

The historic US 1930 census data was collected in April 1930. Enumerators collected data by traveling to households and counting the residents who regularly slept at the household. Individuals lacking permanent housing were counted as residents of the place where they were when the data was collected. Household members absent on the day of data collection were either listed to the household with the help of other household members or were scheduled for the last census subdivision.

Notes

We provide IPUMS household and person data separately so that it is convenient to explore the descriptive statistics on each level. In order to obtain a full dataset, merge the household and person on the variables SERIAL and SERIALP. In order to create a longitudinal dataset, merge datasets on the variable HISTID.
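The household-person merge described above amounts to a join on a shared key. The Python sketch below illustrates it with plain dictionaries, under the assumption that SERIAL is the key on household records and SERIALP is the matching key on person records. The rows and values are invented, FARM and AGE are used only as example variables, and `merge_person_household` is a hypothetical helper name.

```python
# Made-up rows; only the variable names come from the description above.
# Assumption: SERIAL identifies a household record and SERIALP is the
# matching household key carried on each person record.
households = [
    {"SERIAL": 1, "FARM": 1},
    {"SERIAL": 2, "FARM": 2},
]
persons = [
    {"SERIALP": 1, "AGE": 34},
    {"SERIALP": 1, "AGE": 6},
    {"SERIALP": 2, "AGE": 51},
]


def merge_person_household(persons, households):
    """Inner-join person rows to their household row on SERIALP == SERIAL."""
    by_serial = {h["SERIAL"]: h for h in households}
    merged = []
    for person in persons:
        household = by_serial.get(person["SERIALP"])
        if household is not None:  # drop persons with no matching household
            merged.append({**household, **person})
    return merged


merged = merge_person_household(persons, households)
for row in merged:
    print(row)
```

Each output row carries the household variables alongside the person variables, one row per person. A longitudinal merge on HISTID would follow the same pattern with a different key.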

Households with more than 60 people in the original data were broken up for processing purposes. Every person in the large households is considered to be in their own household. The original large households can be identified using the variable SPLIT, reconstructed using the variable SPLITHID, and the original count is found in the variable SPLITNUM.

Coded variables derived from string variables are still in progress. These variables include: occupation and industry.

Missing observations have been allocated and some inconsistencies have been edited for the following variables: SPEAKENG, YRIMMIG, CITIZEN, AGEMARR, AGE, BPL, MBPL, FBPL, LIT, SCHOOL, OWNERSHP, FARM, EMPSTAT, OCC1950, IND1950, MTONGUE, MARST, RACE, SEX, RELATE, CLASSWKR. The flag variables indicating an allocated observation for the associated variables can be included in your extract by clicking the ‘Select data quality flags’ box on the extract summary page.

Most inconsistent information was not edite
Notices of Name Changes
data.ontario.ca
Updated Dec 9, 2021
Cite
Government and Consumer Services (2021). Notices of Name Changes [Dataset]. https://data.ontario.ca/dataset/notices-of-name-changes
Available download formats
(None)
Dataset updated
Dec 9, 2021
Dataset authored and provided by
Government and Consumer Services
License
https://www.ontario.ca/page/copyright-information
Time period covered
Oct 5, 2016
Area covered
Ontario
Description
This dataset contains a listing of individuals who have had their name formally changed in Ontario.

This data is made publicly available through the Ontario Gazette.
Top 100 baby names in England and Wales: historical data
ons.gov.uk
cy.ons.gov.uk
xlsx
Updated Jul 31, 2025
Cite
Office for National Statistics (2025). Top 100 baby names in England and Wales: historical data [Dataset]. https://www.ons.gov.uk/peoplepopulationandcommunity/birthsdeathsandmarriages/livebirths/datasets/babynamesenglandandwalestop100babynameshistoricaldata
Available download formats
xlsx
Dataset updated
Jul 31, 2025
Dataset provided by
Office for National Statistics (http://www.ons.gov.uk/)
License
Open Government Licence 3.0, http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/
License information was derived automatically
Description
Historic lists of top 100 names for baby boys and girls for 1904 to 2024 at 10-yearly intervals.
