100+ datasets found

Popular Baby Names
kaggle.com
zip
Updated Mar 21, 2023
Cite
Ulrik Thyge Pedersen (2023). Popular Baby Names [Dataset]. https://www.kaggle.com/datasets/ulrikthygepedersen/baby-names
Explore at:
Available download formats: zip (12903 bytes)
Dataset updated
Mar 21, 2023
Authors
Ulrik Thyge Pedersen
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
The popularity of baby names is a fascinating reflection of our society's cultural trends and values over time. The dataset on the most popular baby names from 1880 until now provides a comprehensive look at the evolution of naming practices in the United States over the last 140 years. The dataset includes information on the top 1000 baby names for each year, as well as the number of babies given each name, broken down by gender.

By analyzing this dataset, researchers can identify trends and patterns in baby naming, such as the rise and fall of certain names, the influence of popular culture on naming trends, and the impact of immigration on naming practices. This dataset is a valuable resource for researchers, parents, and anyone interested in exploring the social and cultural history of the United States.
Popular Baby Names
catalog.data.gov
data.cityofnewyork.us
+5 more
Updated Jul 12, 2025
Cite
data.cityofnewyork.us (2025). Popular Baby Names [Dataset]. https://catalog.data.gov/dataset/popular-baby-names
Explore at:
Dataset updated
Jul 12, 2025
Dataset provided by
data.cityofnewyork.us
Description
Popular Baby Names by Sex and Ethnic Group. Data were collected through civil birth registration. Each record represents the ranking of a baby name in order of frequency. The data can be used to represent the popularity of a name. Use caution when assessing the rank of a baby name if the frequency count is close to 10; the ranking may vary from year to year.
Baby Names by Year
kaggle.com
zip
Updated Sep 20, 2022
Cite
The Devastator (2022). Baby Names by Year [Dataset]. https://www.kaggle.com/datasets/thedevastator/us-baby-names-by-year-of-birth/code
Explore at:
Available download formats: zip (9916059 bytes)
Dataset updated
Sep 20, 2022
Authors
The Devastator
License
https://creativecommons.org/publicdomain/zero/1.0/
Description
About this dataset

This dataset contains US baby names from the Social Security Administration dating back to 1879. With over 140 years of data, this is one of the most comprehensive datasets on baby names in the US. The data includes the name, year of birth, sex, and number of babies with that name for each year. This dataset is a great resource for anyone interested in studying baby naming trends over time.

How to use the dataset

How to use the US Baby Names by Year of Birth dataset:

This dataset is a compilation of over 140 years of data from the Social Security Administration. It includes data on baby names, year of birth, and sex. There are also columns for the number of babies with that name born in that year.

This dataset can be used to track changes in baby naming trends over time, or to study how the popularity of individual names has changed. It can also be used to study how naming trends differ between sexes or between years.

Research Ideas

This dataset could be used for a number of things, including:

1. Determining baby name trends over time

2. Finding out what the most popular baby names are in the US

3. Analyzing how baby name popularity has changed over the years

Columns

index: the index of the dataframe

YearOfBirth: the year in which the baby was born

Name: the name of the baby

Sex: the sex of the baby

Number: the number of babies with that name and sex
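
The column layout above can be explored with a short script. The following is a minimal sketch, assuming the CSV uses exactly the column names listed above (index, YearOfBirth, Name, Sex, Number). The sample rows and their counts are invented for illustration and are not taken from the dataset; the function name is hypothetical.

```python
import csv
import io

# Illustrative rows only; the counts are invented, not taken from the dataset.
SAMPLE_CSV = """index,YearOfBirth,Name,Sex,Number
0,1990,Jessica,F,300
1,1990,Ashley,F,280
2,1990,Michael,M,400
3,1991,Ashley,F,310
4,1991,Jessica,F,290
5,1991,Michael,M,390
"""


def top_name_per_year(csv_text):
    """Return {(year, sex): (name, number)} for the most frequent name."""
    best = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        key = (int(row["YearOfBirth"]), row["Sex"])
        count = int(row["Number"])
        # Keep the row with the highest count seen so far for this group.
        if key not in best or count > best[key][1]:
            best[key] = (row["Name"], count)
    return best


top = top_name_per_year(SAMPLE_CSV)
for (year, sex), (name, count) in sorted(top.items()):
    print(year, sex, name, count)
```

To run this against the real file, the `io.StringIO` wrapper could be swapped for an opened file handle, provided the header row matches the columns described here.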

Acknowledgements

If you use this dataset in your research, please credit @nickgott, @rflprr and the Social Security Administration via Data.gov

Data Source
Most Popular Baby Names in NYC
kaggle.com
zip
Updated Mar 15, 2020
Cite
Prashant Banerjee (2020). Most Popular Baby Names in NYC [Dataset]. https://www.kaggle.com/datasets/prashant111/most-popular-baby-names-in-nyc
Explore at:
Available download formats: zip (88421 bytes)
Dataset updated
Mar 15, 2020
Authors
Prashant Banerjee
Area covered
New York
Description
DESCRIPTION

The most popular baby names by sex and mother's ethnicity in New York City from 2011-2014.

SUMMARY

Popular Baby Name Data In NYC from 2011-2014

Rows: 13962; Columns: 6

The data include items, such as:

BRTH_YR: birth year of the baby

GNDR: gender

ETHCTY: mother's ethnicity

NM: baby's name

CNT: count of the name

RNK: ranking of the name
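
Because counts are reported separately per ethnicity group, a name's overall count for a year has to be summed across groups. The sketch below shows one way to do that, assuming a CSV whose header uses the six column names listed above. The rows, the group labels (GROUP_A, GROUP_B), the gender value and the function name are placeholders invented for illustration.

```python
import csv
import io
from collections import Counter

# Invented rows; the group labels and counts are placeholders.
SAMPLE_CSV = """BRTH_YR,GNDR,ETHCTY,NM,CNT,RNK
2011,FEMALE,GROUP_A,Olivia,120,1
2011,FEMALE,GROUP_A,Emma,90,2
2011,FEMALE,GROUP_B,Emma,80,1
2011,FEMALE,GROUP_B,Olivia,30,2
"""


def counts_across_groups(csv_text, year, gender):
    """Sum CNT per name over all ETHCTY groups for one year and gender."""
    totals = Counter()
    for row in csv.DictReader(io.StringIO(csv_text)):
        if int(row["BRTH_YR"]) == year and row["GNDR"] == gender:
            totals[row["NM"]] += int(row["CNT"])
    return totals


totals = counts_across_groups(SAMPLE_CSV, 2011, "FEMALE")
print(totals.most_common())
```

Note that the per-group RNK values cannot simply be averaged to get an overall rank; re-ranking on the summed counts, as `most_common()` does here, is the safer choice.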

Source: NYC Open Data
Most Popular Baby Names
catalog.data.gov
data.chhs.ca.gov
+3 more
Updated Nov 23, 2025
Cite
California Department of Public Health (2025). Most Popular Baby Names [Dataset]. https://catalog.data.gov/dataset/most-popular-baby-names-810d5
Explore at:
Dataset updated
Nov 23, 2025
Dataset provided by
California Department of Public Health
Description
This dataset contains ranks and counts for the top 25 baby names by sex for live births that occurred in California (by occurrence) based on information entered on birth certificates.
Baby Names from Social Security Card Applications - National Data
catalog.data.gov
Updated Jul 4, 2025
Cite
Social Security Administration (2025). Baby Names from Social Security Card Applications - National Data [Dataset]. https://catalog.data.gov/dataset/baby-names-from-social-security-card-applications-national-data
Explore at:
Dataset updated
Jul 4, 2025
Dataset provided by
Social Security Administration (http://ssa.gov/)
Description
The data (name, year of birth, sex, and number) are from a 100 percent sample of Social Security card applications for 1880 on.
U.S. First Names: Popularity and Counts
kaggle.com
zip
Updated Jun 9, 2025
Cite
Daniel Fedorov (2025). U.S. First Names: Popularity and Counts [Dataset]. https://www.kaggle.com/datasets/downshift/u-s-first-names-popularity-and-counts
Explore at:
Available download formats: zip (2425 bytes)
Dataset updated
Jun 9, 2025
Authors
Daniel Fedorov
License
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0): https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
Description
Description

This dataset contains counts and rankings of the most common first names in the United States, sourced from comprehensive name census data. It is ideal for analyzing naming trends, demographic patterns, and cultural preferences, as well as for building statistical models to explore name popularity over time.

Dataset structure

male_first_names.csv: Male first name frequencies and rankings in the U.S.

female_first_names.csv: Female first name frequencies and rankings in the U.S.
Most Popular Baby Names - 8ia4-svqc - Archive Repository
healthdata.gov
csv, xlsx, xml
Updated Nov 7, 2025
Cite
(2025). Most Popular Baby Names - 8ia4-svqc - Archive Repository [Dataset]. https://healthdata.gov/dataset/Most-Popular-Baby-Names-8ia4-svqc-Archive-Reposito/hwxa-t8ig
Explore at:
Available download formats: xml, csv, xlsx
Dataset updated
Nov 7, 2025
Description
This dataset tracks the updates made on the dataset "Most Popular Baby Names" as a repository for previous versions of the data and metadata.
NYC Most Popular Baby Names
kaggle.com
zip
Updated Jan 1, 2021
Cite
City of New York (2021). NYC Most Popular Baby Names [Dataset]. https://www.kaggle.com/datasets/new-york-city/nyc-most-popular-baby-names/discussion
Explore at:
Available download formats: zip (179712 bytes)
Dataset updated
Jan 1, 2021
Dataset authored and provided by
City of New York
License
https://creativecommons.org/publicdomain/zero/1.0/
Area covered
New York
Description
Content

Popular Baby Names by Sex and Ethnic Group. Data were collected through civil birth registration. Each record represents the ranking of a baby name in order of frequency. The data can be used to represent the popularity of a name. Use caution when assessing the rank of a baby name if the frequency count is close to 10; the ranking may vary from year to year.

Context

This is a dataset hosted by the City of New York. The city has an open data platform and updates its information according to the amount of data that is brought in. Explore New York City using Kaggle and all of the data sources available through the City of New York organization page!

Update Frequency: This dataset is updated annually.

Acknowledgements

This dataset is maintained using Socrata's API and Kaggle's API. Socrata has assisted countless organizations with hosting their open data and has been an integral part of the process of bringing more data to the public.

Cover photo by freestocks.org on Unsplash
Unsplash Images are distributed under a unique Unsplash License.
Top 100 baby names in England and Wales: historical data
ons.gov.uk
cy.ons.gov.uk
xlsx
Updated Jul 31, 2025
Cite
Office for National Statistics (2025). Top 100 baby names in England and Wales: historical data [Dataset]. https://www.ons.gov.uk/peoplepopulationandcommunity/birthsdeathsandmarriages/livebirths/datasets/babynamesenglandandwalestop100babynameshistoricaldata
Explore at:
Available download formats: xlsx
Dataset updated
Jul 31, 2025
Dataset provided by
Office for National Statistics (http://www.ons.gov.uk/)
License
Open Government Licence 3.0: http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/
License information was derived automatically
Description
Historic lists of top 100 names for baby boys and girls for 1904 to 2024 at 10-yearly intervals.
Namesakes
figshare.com
json
Updated Nov 20, 2021
Cite
Oleg Vasilyev; Aysu Altun; Nidhi Vyas; Vedant Dharnidharka; Erika Lampert; John Bohannon (2021). Namesakes [Dataset]. http://doi.org/10.6084/m9.figshare.17009105.v1
Explore at:
Available download formats: json
Unique identifier
https://doi.org/10.6084/m9.figshare.17009105.v1
Dataset updated
Nov 20, 2021
Dataset provided by
Figshare (http://figshare.com/)
Authors
Oleg Vasilyev; Aysu Altun; Nidhi Vyas; Vedant Dharnidharka; Erika Lampert; John Bohannon
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Abstract

Motivation: creating challenging dataset for testing Named-Entity Linking. The Namesakes dataset consists of three closely related datasets: Entities, News and Backlinks. Entities were collected as Wikipedia text chunks corresponding to highly ambiguous entity names. The News were collected as random news text chunks, containing mentions that either belong to the Entities dataset or can be easily confused with them. Backlinks were obtained from Wikipedia dump data with intention to have mentions linked to the entities of the Entity dataset. The Entities and News are human-labeled, resolving the mentions of the entities.

Methods

Entities were collected as Wikipedia text chunks corresponding to highly ambiguous entity names: the most popular people names, the most popular locations, and organizations with name ambiguity. In each Entities text chunk, the named entities with the name similar to the chunk Wikipedia page name are labeled. For labeling, these entities were suggested to human annotators (odetta.ai) to tag as "Same" (same as the page entity) or "Other". The labeling was done by 6 experienced annotators that passed through a preliminary trial task. The only accepted tags are the tags assigned in agreement by not less than 5 annotators, and then passed through reconciliation with an experienced reconciliator.

The News were collected as random news text chunks, containing mentions which either belong to the Entities dataset or can be easily confused with them. In each News text chunk one mention was selected for labeling, and 3-10 Wikipedia pages from Entities were suggested as the labels for an annotator to choose from. The labeling was done by 3 experienced annotators (odetta.ai), after the annotators passed a preliminary trial task. The results were reconciled by an experienced reconciliator. All the labeling was done using Lighttag (lighttag.io).

Backlinks were obtained from Wikipedia dump data (dumps.wikimedia.org/enwiki/20210701) with intention to have mentions linked to the entities of the Entity dataset. The backlinks were filtered to leave only mentions in a good quality text; each text was cut 1000 characters after the last mention.

Usage Notes

Entities:

File: Namesakes_entities.jsonl

The Entities dataset consists of 4148 Wikipedia text chunks containing human-tagged mentions of entities. Each mention is tagged either as "Same" (meaning that the mention is of this Wikipedia page entity), or "Other" (meaning that the mention is of some other entity, just having the same or similar name). The Entities dataset is a jsonl list, each item is a dictionary with the following keys and values:

Key ‘pagename’: page name of the Wikipedia page.
Key ‘pageid’: page id of the Wikipedia page.
Key ‘title’: title of the Wikipedia page.
Key ‘url’: URL of the Wikipedia page.
Key ‘text’: The text chunk from the Wikipedia page.
Key ‘entities’: list of the mentions in the page text, each entity is represented by a dictionary with the keys:
  Key 'text': the mention as a string from the page text.
  Key ‘start’: start character position of the entity in the text.
  Key ‘end’: end (one-past-last) character position of the entity in the text.
  Key ‘tag’: annotation tag given as a string - either ‘Same’ or ‘Other’.
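
The Entities key layout can be read with the standard json module. The following is a minimal sketch, not the authors' own loader: the record is made up (the name, page id and example.org URL are placeholders), only the key names come from the description above, and the helper names are hypothetical. It relies on 'end' being one past the last character, so a plain slice recovers the mention.

```python
import json

PAGE_TEXT = "Alex Doe was an explorer. A different Alex Doe was a poet."


def make_mention(text, mention, tag, search_from=0):
    """Build one 'entities' item; 'end' is one past the last character."""
    start = text.index(mention, search_from)
    return {"text": mention, "start": start,
            "end": start + len(mention), "tag": tag}


# Made-up record; only the key names come from the dataset description.
record = {
    "pagename": "Alex_Doe",
    "pageid": 0,
    "title": "Alex Doe",
    "url": "https://example.org/Alex_Doe",
    "text": PAGE_TEXT,
    "entities": [
        make_mention(PAGE_TEXT, "Alex Doe", "Same"),
        make_mention(PAGE_TEXT, "Alex Doe", "Other", search_from=1),
    ],
}
jsonl_line = json.dumps(record)  # one line of a .jsonl file


def mentions_with_tag(line, tag):
    """Parse one jsonl line and return (start, end) spans carrying `tag`."""
    item = json.loads(line)
    spans = []
    for ent in item["entities"]:
        # Offsets index into 'text'; slicing works because 'end' is exclusive.
        assert item["text"][ent["start"]:ent["end"]] == ent["text"]
        if ent["tag"] == tag:
            spans.append((ent["start"], ent["end"]))
    return spans


same_spans = mentions_with_tag(jsonl_line, "Same")
other_spans = mentions_with_tag(jsonl_line, "Other")
print(same_spans, other_spans)
```

Reading the real file would amount to calling `mentions_with_tag` on each line of Namesakes_entities.jsonl, assuming the value types match this sketch.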

News:

File: Namesakes_news.jsonl

The News dataset consists of 1000 news text chunks, each one with a single annotated entity mention. The annotation either points to the corresponding entity from the Entities dataset (if the mention is of that entity), or indicates that the mentioned entity does not belong to the Entities dataset. The News dataset is a jsonl list, each item is a dictionary with the following keys and values:

Key ‘id_text’: Id of the sample.
Key ‘text’: The text chunk.
Key ‘urls’: List of URLs of wikipedia entities suggested to labelers for identification of the entity mentioned in the text.
Key ‘entity’: a dictionary describing the annotated entity mention in the text:
  Key 'text': the mention as a string found by an NER model in the text.
  Key ‘start’: start character position of the mention in the text.
  Key ‘end’: end (one-past-last) character position of the mention in the text.
  Key 'tag': This key exists only if the mentioned entity is annotated as belonging to the Entities dataset - if so, the value is a dictionary identifying the Wikipedia page assigned by annotators to the mentioned entity:
    Key ‘pageid’: Wikipedia page id.
    Key ‘pagetitle’: page title.
    Key 'url': page URL.
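
Since the 'tag' key is present only for mentions linked to the Entities dataset, code reading News items has to treat it as optional. A minimal sketch under that reading follows; both items are invented (names, ids and example.org URLs are placeholders) and the function name is hypothetical.

```python
import json

# Two made-up News items: one linked to an Entities page, one not.
news_lines = [
    json.dumps({
        "id_text": "n1",
        "text": "Alex Doe opened the exhibition.",
        "urls": ["https://example.org/Alex_Doe"],
        "entity": {
            "text": "Alex Doe", "start": 0, "end": 8,
            "tag": {"pageid": 0, "pagetitle": "Alex Doe",
                    "url": "https://example.org/Alex_Doe"},
        },
    }),
    json.dumps({
        "id_text": "n2",
        "text": "Sam Roe won the race.",
        "urls": ["https://example.org/Sam_Roe_(poet)"],
        "entity": {"text": "Sam Roe", "start": 0, "end": 7},
    }),
]


def linked_page(line):
    """Return the linked page title, or None when 'tag' is absent."""
    entity = json.loads(line)["entity"]
    tag = entity.get("tag")  # the key exists only for linked mentions
    return tag["pagetitle"] if tag is not None else None


links = [linked_page(line) for line in news_lines]
print(links)
```

Using `dict.get` rather than indexing avoids a KeyError on the unlinked items, which by the description above are an expected part of the data.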

Backlinks dataset: The Backlinks dataset consists of two parts: dictionary Entity-to-Backlinks and Backlinks documents. The dictionary points to backlinks for each entity of the Entity dataset (if any backlinks exist for the entity). The Backlinks documents are the backlinks Wikipedia text chunks with identified mentions of the entities from the Entities dataset.

Each mention is identified by surrounding double square brackets, e.g. "Muir built a small cabin along [[Yosemite Creek]].". However, if the mention differs from the exact entity name, the double square brackets wrap both the exact name and, separated by '|', the mention string to the right, for example: "Muir also spent time with photographer [[Carleton E. Watkins | Carleton Watkins]] and studied his photographs of Yosemite.".
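
The bracket markup can be pulled apart with a regular expression. The pattern below is an illustrative assumption, not the authors' parser: it assumes entity names contain no '|' or square brackets and that mentions are never nested. The function name is hypothetical.

```python
import re

# Matches [[Name]] or [[Name | mention]]; illustrative pattern only.
MENTION_RE = re.compile(r"\[\[([^\[\]|]+?)(?:\s*\|\s*([^\[\]]+?))?\]\]")


def extract_mentions(text):
    """Return (entity_name, mention_string) pairs found in `text`."""
    pairs = []
    for match in MENTION_RE.finditer(text):
        entity = match.group(1).strip()
        # Without a '|' part, the mention is the entity name itself.
        mention = (match.group(2) or entity).strip()
        pairs.append((entity, mention))
    return pairs


example = (
    "Muir built a small cabin along [[Yosemite Creek]]. "
    "Muir also spent time with photographer "
    "[[Carleton E. Watkins | Carleton Watkins]]."
)
pairs = extract_mentions(example)
print(pairs)
```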

The Entity-to-Backlinks is a jsonl with 1527 items.

File: Namesakes_backlinks_entities.jsonl

Each item is a tuple:
Entity name.
Entity Wikipedia page id.
Backlinks ids: a list of pageids of backlink documents.

The Backlinks documents is a jsonl with 26903 items.

File: Namesakes_backlinks_texts.jsonl

Each item is a dictionary:
Key ‘pageid’: Id of the Wikipedia page.
Key ‘title’: Title of the Wikipedia page.
Key 'content': Text chunk from the Wikipedia page, with all mentions in the double brackets; the text is cut 1000 characters after the last mention, the cut is denoted as '...[CUT]'.
Key 'mentions': List of the mentions from the text, for convenience. Each mention is a tuple:
  Entity name.
  Entity Wikipedia page id.
  Sorted list of all character indexes at which the mention occurrences start in the text.
Popular Baby Names - Dataset - data.sa.gov.au
data.sa.gov.au
Updated Mar 1, 2025
Cite
(2025). Popular Baby Names - Dataset - data.sa.gov.au [Dataset]. https://data.sa.gov.au/data/dataset/popular-baby-names
Explore at:
Dataset updated
Mar 1, 2025
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Area covered
South Australia
Description
List of male and female baby names in South Australia from 1944 to 2024. The annual data for baby names is published January/February each year.
Forest Common Names (Feature Layer)
catalog.data.gov
datasets.ai
+5 more
Updated Apr 21, 2025
Cite
U.S. Forest Service (2025). Forest Common Names (Feature Layer) [Dataset]. https://catalog.data.gov/dataset/forest-common-names-feature-layer-7d8b7
Explore at:
Dataset updated
Apr 21, 2025
Dataset provided by
U.S. Department of Agriculture Forest Service (http://fs.fed.us/)
Description
This dataset contains the common names of the national forests and grasslands and their respective FS WWW URL information that is used for both display of the national forest and national grassland boundaries on any map product and for dynamic interactivity of the map.

This dataset exhibits the following characteristics:

1. Granularity of the polygon features - The spatial extent of the national forests and the grasslands match the way the agency would like to communicate with the public.

2. Preferred/Common Name of the National Forest Units - The common names of the national forest and grassland match the preferred name column that is present in the common names decision table maintained by the FS Office of Communication.

3. Hyperlinks to FS WWW Home page - This column contains the national forest and their respective FS WWW URL information. This URL could be used on any interactive map applications to link users directly to a forest's home page.

Data Source - This dataset is derived from the following FS ALP (Automated Lands Program) Land Status Records System authoritative data sources:

1. Administrative Forest Boundaries

2. Proclaimed Forest Boundaries

3. Ranger District Boundaries

4. National Grassland Areas

The common names decision table maintained by the FS Office of Communication contains the common name and its respective Land Status Records System authoritative data source to be used for building the spatial polygon. The spatial polygons for every feature in this dataset come from one or more of the authoritative data sources listed above. The process to create the common names dataset reuses the existing ALP names from those data sources.
Top 100 Baby Names
data.qld.gov.au
researchdata.edu.au
+1 more
csv
Updated Feb 13, 2025
Cite
Justice (2025). Top 100 Baby Names [Dataset]. https://www.data.qld.gov.au/dataset/top-100-baby-names
Explore at:
Available download formats: csv (2 KiB), csv, csv (200 KiB)
Dataset updated
Feb 13, 2025
Dataset authored and provided by
Justice
License
Attribution 3.0 (CC BY 3.0): https://creativecommons.org/licenses/by/3.0/
License information was derived automatically
Description
Queensland Top 100 Baby Names
Most Popular Names in the Philippines Dataset
kaggle.com
zip
Updated Jun 25, 2023
Cite
Joriz ivann Villanueva (2023). Most Popular Names in the Philippines Dataset [Dataset]. https://www.kaggle.com/datasets/jorizivannvillanueva/most-popular-names-in-philippines-dataset
Explore at:
Available download formats: zip (13981 bytes)
Dataset updated
Jun 25, 2023
Authors
Joriz ivann Villanueva
License
Open Database License (ODbL) v1.0: https://www.opendatacommons.org/licenses/odbl/1.0/
License information was derived automatically
Area covered
Philippines
Description
Overview

The Most Popular Names in the Philippines dataset provides insights into the popularity of different names in the Philippines.

Content

The dataset includes the following fields:

rank: The position of the name when graded by incidence with all other names in the place.

forename: The personal name given to an individual at or shortly after birth, also known as a first name.

incidence: Number of people who bear the name.

frequency: Ratio and percentage of people who bear the name.

gender: The gender of the specific name based on the percentage.

gender_percentage: The percentage of bearers who are male or female.

Potential Use Cases

This dataset can be used for various purposes, such as:

Analyzing naming trends in the Philippines.

Exploring the gender distribution of popular names.

Conducting research on cultural naming practices.

Studying the popularity and prevalence of specific names.
A corpus of names drawn from the local birth registers of England and Wales,...
dtechtive.com
find.data.gov.scot
txt, xlsx, zip
Updated Jan 25, 2018
Cite
University of Edinburgh (2018). A corpus of names drawn from the local birth registers of England and Wales, 1838-2014 [Dataset]. http://doi.org/10.7488/ds/2294
Explore at:
Available download formats: xlsx (30.21 MB), zip (5.395 MB), txt (0.0166 MB)
Unique identifier
https://doi.org/10.7488/ds/2294
Dataset updated
Jan 25, 2018
Dataset provided by
University of Edinburgh
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Area covered
UNITED KINGDOM
Description
This dataset comprises a corpus of names, in both the first and middle position, for approximately 22 million individuals born in England and Wales between 1838 and 2014. This data is obtained from birth records made available by a set of volunteer-run genealogical resources - collectively, the 'UK local BMD project' (http://www.ukbmd.org.uk/local) - and has been re-purposed here to demonstrate the applicability of network analysis methods to an onomastic dataset. The ownership and licensing of the intellectual property constituting the original birth records is detailed at https://www.ukbmd.org.uk/TermsAndConditions. Under section 29A of the UK Copyright, Designs and Patents Act 1988, a copyright exception permits copies to be made of lawfully accessible material in order to conduct text and data mining for non-commercial research. The data included in this dataset represents the outcome of such a text-mining analysis. No birth records are included in this dataset, and nor is it possible for records to be reconstructed from the data presented herein. The data comprises an archive of tables, presenting this corpus in various forms: as a rank order of names (in both the first and middle position) by number of registered births per year, and by the total number of births across all years sampled. An overview of the data is also provided, with summary statistics such as the number of usable records registered per year, most popular names per year, and measures of forename diversity and the surname-to-forename usage ratio (an indicator of which forenames are more likely to be transferred uses of surnames). These tables are extensive but not exhaustive, and do not exclude the possibility that errors are present in the corpus. 
Data are also presented both as '.expression' files (an input format readable by the network analysis tool Graphia Professional) and as '.layout' files, a text file format output by Graphia Professional that describes the characteristics of the network so that it may be replicated. Characteristics of the original birth records that allow the identification of individuals - for instance, full name or location of birth - have been removed.
Reddit r/AskScience Flair Dataset
data.mendeley.com
Updated May 23, 2022
Cite
Sumit Mishra (2022). Reddit r/AskScience Flair Dataset [Dataset]. http://doi.org/10.17632/k9r2d9z999.3
Explore at:
Unique identifier
https://doi.org/10.17632/k9r2d9z999.3
Dataset updated
May 23, 2022
Authors
Sumit Mishra
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Reddit is a social news, content rating and discussion website. It's one of the most popular sites on the internet. Reddit has 52 million daily active users and approximately 430 million users who use it once a month. Reddit has different subreddits; this dataset uses the r/AskScience subreddit.

The dataset is extracted from the subreddit /r/AskScience from Reddit. The data was collected between 01-01-2016 and 20-05-2022. It contains 612,668 datapoints and 25 columns. The dataset contains information about the questions asked on the subreddit, the description of the submission, the flair of the question, NSFW or SFW status, the year of the submission, and more. The data was extracted using Python and Pushshift's API, and lightly cleaned using NumPy and pandas (see the descriptions of individual columns below).

The dataset contains the following columns and descriptions:

author - Redditor Name
author_fullname - Redditor Full name
contest_mode - Contest mode [implement obscured scores and randomized sorting].
created_utc - Time the submission was created, represented in Unix Time.
domain - Domain of submission.
edited - If the post is edited or not.
full_link - Link of the post on the subreddit.
id - ID of the submission.
is_self - Whether or not the submission is a self post (text-only).
link_flair_css_class - CSS Class used to identify the flair.
link_flair_text - Flair on the post or The link flair’s text content.
locked - Whether or not the submission has been locked.
num_comments - The number of comments on the submission.
over_18 - Whether or not the submission has been marked as NSFW.
permalink - A permalink for the submission.
retrieved_on - time ingested.
score - The number of upvotes for the submission.
description - Description of the Submission.
spoiler - Whether or not the submission has been marked as a spoiler.
stickied - Whether or not the submission is stickied.
thumbnail - Thumbnail of Submission.
question - Question Asked in the Submission.
url - The URL the submission links to, or the permalink if a self post.
year - Year of the Submission.
banned - Banned by the moderator or not.
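
The created_utc column holds Unix time, so a calendar year can be derived from it directly. The sketch below assumes the timestamps are in seconds and that the year is taken in UTC; whether the dataset's own year column was computed this way is not stated in the description. The function name is hypothetical.

```python
from datetime import datetime, timezone


def submission_year(created_utc):
    """Convert a Unix timestamp (seconds) to the calendar year in UTC."""
    return datetime.fromtimestamp(created_utc, tz=timezone.utc).year


# 1451606400 is 2016-01-01 00:00:00 UTC, the start of the collection window.
print(submission_year(1451606400))
```

Passing an explicit `tz` keeps the result independent of the machine's local time zone, which matters for submissions made close to midnight on 31 December.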

This dataset can be used for Flair Prediction, NSFW Classification, and different Text Mining/NLP tasks. Exploratory Data Analysis can also be done to get the insights and see the trend and patterns over the years.
Baby names for boys in England and Wales
ons.gov.uk
cy.ons.gov.uk
xlsx
Updated Jul 31, 2025
Cite
Office for National Statistics (2025). Baby names for boys in England and Wales [Dataset]. https://www.ons.gov.uk/peoplepopulationandcommunity/birthsdeathsandmarriages/livebirths/datasets/babynamesenglandandwalesbabynamesstatisticsboys
Explore at:
Available download formats: xlsx
Dataset updated
Jul 31, 2025
Dataset provided by
Office for National Statistics (http://www.ons.gov.uk/)
License
Open Government Licence 3.0: http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/
License information was derived automatically
Description
Rank and count of the top names for baby boys, changes in rank since the previous year and breakdown by country, region, mother's age and month of birth.
NYC Baby Names
kaggle.com
zip
Updated Sep 8, 2017
Cite
City of New York (2017). NYC Baby Names [Dataset]. https://www.kaggle.com/datasets/new-york-city/nyc-baby-names/suggestions?status=pending&yourSuggestions=true
Explore at:
Available download formats: zip (139141 bytes)
Dataset updated
Sep 8, 2017
Dataset authored and provided by
City of New York
License
https://creativecommons.org/publicdomain/zero/1.0/
Area covered
New York
Description
This dataset is now updated annually here.

Context

Baby names for children recently born in New York City. This dataset is notable because it includes a breakdown by the ethnicity of the mother of the baby: a source of ethnic information that is missing from many other similar datasets published on state and national levels.

Content

This dataset includes columns for the name, year of birth, sex, and mother's ethnicity of the baby. It also includes a rank column (that name's popularity relative to the rest of the names on the list).

Acknowledgements

This data is published as-is by the City of New York.

Inspiration

How do baby names in New York City differ from national trends?

What names are most, more, or less popular amongst different ethnicities?
First names of the newborns of the city of Pré Saint-Gervais for the period...
gimi9.com
Updated Dec 16, 2024
Cite
(2024). First names of the newborns of the city of Pré Saint-Gervais for the period 2018-2020 | gimi9.com [Dataset]. https://gimi9.com/dataset/eu_602e644ee2fdd1c8ad070b54
Explore at:
Dataset updated
Dec 16, 2024
Area covered
Le Pré-Saint-Gervais
Description
The dataset contains the first names of the newborns of the city of Pré Saint-Gervais for the period 2018-2020. The file contains 688 first names. The data are structured by the name of the municipality where the children were born (in the majority of cases, the children were born outside Pré Saint-Gervais because the commune has no maternity ward, but at least one of the parents comes from the commune), the INSEE number, the sex, the child's first name, the number of occurrences and the year of birth. These data are useful for analysing trends in the choice of first names and thus for understanding the history of the city. The data are collected by the General Affairs Department of the commune of Pré Saint-Gervais from birth declarations. The file can be opened in csv format. To get in touch with the manager for this dataset, you can write to Benjamin Mittet-Brême, Director of General Administration, Civil State and Cemetery.

Data-visualisation proposals:

Gender distribution of first names by year: https://prenomspsg.trial.opendatasoft.com/chart/embed/repartition_des_sexes_des_prenoms_par_annee1/

Gender distribution of first names over the period 2018-2020: https://prenomspsg.trial.opendatasoft.com/chart/embed/repartition_des_sexes_des_prenoms_sur_la_periode_2018-20201/

Most used male given names per year (2018-2020): https://prenomspsg.trial.opendatasoft.com/chart/embed/prenoms_de_sexe_masculin_les_plus_utilises_par_annee_2018-2020/

Most used female given names per year (2018-2020): https://prenomspsg.trial.opendatasoft.com/chart/embed/prenoms_de_sexe_feminin_les_plus_utilises_par_annee_2018-2020/

The 10 most given names over the period 2018-2020: https://app.workbenchdata.com/workflows/132629/report

Dataset published during the Challenge Data week organised by Sciences Po Saint-Germain-en-Laye from February 15 to 19, 2021.

