The data (name, year of birth, sex, and number) are from a 100 percent sample of Social Security card applications for 1880 on.
This dataset contains ranks and counts for the top 25 baby names by sex for live births that occurred in California (by occurrence) based on information entered on birth certificates.
The data (name, year of birth, sex, state, and number) are from a 100 percent sample of Social Security card applications starting with 1910. National data is in another dataset.
Open Government Licence 3.0: http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/
License information was derived automatically
Rank and count of the top names for baby girls, changes in rank since the previous year and breakdown by country, region, mother's age and month of birth.
Popular Baby Names by Sex and Ethnic Group. Data were collected through civil birth registration. Each record represents the ranking of a baby name in order of frequency. Data can be used to represent the popularity of a name. Caution should be used when assessing the rank of a baby name if the frequency count is close to 10; the ranking may vary year to year.
https://www.etalab.gouv.fr/licence-ouverte-open-licence
In order to facilitate the anonymisation of data, this list of first names and surnames was extracted from the SIRENE database of INSEE.
For each first name and surname, the number of appearances is indicated.
ATTENTION: No content check is done, and these lists may contain anomalies present in the original database!
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
This dataset contains statistics on names (first names of women, first names of men, and last names) by country of birth. In total, there are 231,505 names by 202 countries. The data comes from Statistics Sweden's population statistics (name register) and refers to persons registered in Sweden on December 31st, 2020. However, some names are excluded due to confidentiality, such as names with fewer than five carriers. The data is licensed with Creative Commons Attribution 4.0 International (CC BY 4.0) and may be used as long as Statistics Sweden is stated as the source. In this dataset, you will also find (in addition to the original data from Statistics Sweden) tidied data where the ISO code for each country has been added, as well as data in so-called wide format and long format to facilitate easier data processing. Please see the Swedish version of the post and the README file for more information about the data.
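The wide and long layouts mentioned above differ only in shape. The following is a minimal sketch of the conversion, assuming a wide table with one row per name and one column per ISO country code. The column names, the sample rows, and the `wide_to_long` helper are all hypothetical and are not taken from the dataset's README.

```python
# Hypothetical wide-format rows: one row per name, one column per ISO country code.
# The values are made up for illustration only.
wide_rows = [
    {"name": "Anna", "SE": 100, "FI": 12},
    {"name": "Omar", "SE": 40, "FI": None},  # None marks a suppressed or absent cell
]


def wide_to_long(rows, id_col="name"):
    """Turn each (name, country column) cell into its own row, skipping empty cells."""
    long_rows = []
    for row in rows:
        for col, value in row.items():
            if col == id_col or value is None:
                continue
            long_rows.append({id_col: row[id_col], "country": col, "count": value})
    return long_rows


long_rows = wide_to_long(wide_rows)
for r in long_rows:
    print(r)
```

The long layout tends to be easier to filter and aggregate by country, while the wide layout is easier to scan by eye.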
Attribution 3.0 (CC BY 3.0): https://creativecommons.org/licenses/by/3.0/
License information was derived automatically
Top 10 Baby Names by Region (ABS SA4 - Sub State categorisation). Transaction data is omitted due to the small cell counts.
By Derek Howard [source]
This dataset provides a tool for generating gender-specific datasets from names alone. It contains the probability of a person's name belonging to a certain gender, based on US Social Security records from the last century. This makes it easy to assign genders to datasets that do not natively include this data. All probability values were derived from records with 5 or more people associated with each name, so less common names are still covered. With this resource, users can add gender estimates to name-only data quickly.
This dataset provides a helpful resource when you need to accurately identify gender from names. With this dataset, you’ll be able to quickly and accurately assign genders to datasets that contain names but no other information about the person.
To get started, you will need a csv file with two columns: name and probability. The name column should contain the first names of the people in your dataset. The probability column should contain numbers between 0 and 1 indicating the likelihood that each name is associated with one specific gender (0 for male, 1 for female).
In addition to simply assigning genders from these probabilities alone, users of this dataset also have more control over their classifications - they can use it as either a baseline or as an absolute measure of accuracy depending on their exact needs/preferences. Experimentation is highly encouraged here!
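As a minimal sketch of the lookup described above, the snippet below assumes the two-column layout (name, probability) with 0 meaning male and 1 meaning female. The sample rows, the `margin` parameter, and both helper functions are hypothetical and serve only as illustration. The file description elsewhere in this listing mentions a separate gender column, so the real file may need a different reading.

```python
import csv
import io

# Hypothetical sample rows; the real file's contents are not shown in this listing.
SAMPLE_CSV = """name,probability
Mary,0.99
James,0.01
Casey,0.55
"""


def load_probabilities(text):
    """Map lowercased name -> probability that the name is female (assumed convention)."""
    reader = csv.DictReader(io.StringIO(text))
    return {row["name"].strip().lower(): float(row["probability"]) for row in reader}


def assign_gender(name, table, margin=0.2):
    """Return 'female', 'male', or 'unknown'.

    Names missing from the table, or within `margin` of 0.5, are left
    unclassified rather than guessed.
    """
    p = table.get(name.strip().lower())
    if p is None or abs(p - 0.5) < margin:
        return "unknown"
    return "female" if p > 0.5 else "male"


table = load_probabilities(SAMPLE_CSV)
labels = {n: assign_gender(n, table) for n in ["Mary", "james", "Casey", "Zyx"]}
print(labels)
```

Setting `margin` to 0 treats the probabilities as an absolute classifier, while a wider margin leaves more ambiguous names unlabeled. This is one way to exercise the control over classification mentioned above.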
Good luck!
Create gender-specific applications - tailor different apps to different genders based on the probability of a particular name belonging to a certain gender.
Generate gender neutral names - use this data to generate random names with no gender bias.
Automate record lookup - quickly and accurately assign genders based on the probability associated with their name
If you use this dataset in your research, please credit the original authors.
License
Unknown License - Please check the dataset description for more information.
File: name_gender.csv

| Column name | Description |
|:------------|:------------|
| name | The name of the person. (String) |
| gender | The gender of the person. (String) |
| probability | The probability of the gender being assigned to the person. (Float) |
If you use this dataset in your research, please credit the original authors and Derek Howard.
In the first half of 2024, Nikodem was the most common name for a newborn child in Poland, with over 3 thousand registrations. Next were Jan, Aleksander, and Anton with over two thousand registrations each.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Girls Names Registered in Ireland by Name, Year and Statistic
In 2023, the most common female first name given to Finnish-speaking newborns in Finland was Aino with *** name registrations, followed by Olivia with *** registrations. Other popular first names given to the roughly ****** newborn girls born in Finland included Sofia, Eevi, and Aada.
https://catalogue.elra.info/static/from_media/metashare/licences/ELRA_END_USER.pdf
https://catalogue.elra.info/static/from_media/metashare/licences/ELRA_VAR.pdf
This database is part of the ArabLEX set of data which consists of the Database of Arabic General Vocabulary (DAG), Database of Arabic Place Names (DAP), Database of Foreign Names in Arabic (DAF) and Database of Arab Names (DAN) available from ELRA under references, respectively, ELRA-L0131, ELRA-M0105, ELRA-M0106 and ELRA-M0107. With over 218 million forms based on 100,000 lemmas, this full-form database covers Arab personal names (both given names and surnames) in both Arabic and English and contains a rich set of romanized name variants for each name with a variety of supplementary information such as gender, name type and frequency statistics. This comprehensive lexicon (over 6.4 million variants) contains precise phonemic transcriptions and vocalized Arabic for all inflected and cliticized forms for each name. This database is provided with three options: 1) proclitics, 2) phonetic information (CARS) and 3) orthographic variants. Subsets excluding some of the three proposed options may be provided upon demand. CARS is an accurate phonemic transcription. Optionally, phonetic transcriptions, IPA and/or SAMPA, can be provided, fine-tuned to a customer's specifications.
Quantity and size: 218,215,875 lines / 32,659 MB (31.9 GB)
File format: flat TSV text files
Samples and a specifications document available upon request.
The most popular name for baby boys in England in 2022 was Noah, which was the chosen name for 4,320 babies. Noah was also the second-most popular baby name in Wales that year, while Olivia was the most popular name for girls in England.
Official statistics are produced impartially and free from any political influence.
Official Street Names in the City of Los Angeles created and maintained by the Bureau of Engineering.
Open Government Licence 3.0: http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/
License information was derived automatically
Top 100 most popular boys' and girls' names.
Source agency: Northern Ireland Statistics and Research Agency
Designation: Official Statistics not designated as National Statistics
Language: English
Alternative title: Babies First Names Bulletin (Northern Ireland)
U.S. Government Works: https://www.usa.gov/government-works
License information was derived automatically
This is a public use data file on Delaware's most popular baby names for 2009 to 2016 obtained from the Delaware certificate of birth. The top 15 names for each gender are represented.
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
The data (name, year of birth, sex, and number) are from a 100 percent sample of Social Security card applications for 1998 onward.
New York State Baby names are aggregated and displayed by the year, county or borough where the mother resided as stated on a New York State or New York City (NYC) birth certificate.
The frequency of the Baby Name is listed, if there are: 5 or more of the same baby name in a county outside of NYC; or 10 or more of the same baby name in a NYC borough.
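The reporting rule above can be sketched as a small filter. This is an illustration under stated assumptions, not the agency's actual procedure: the record layout, the `in_nyc_borough` flag, the sample values, and both function names are hypothetical.

```python
def min_reportable_count(in_nyc_borough):
    """Threshold from the description: 10 in an NYC borough, 5 in any other county."""
    return 10 if in_nyc_borough else 5


def is_reported(count, in_nyc_borough):
    """True when a name's frequency would be listed under the stated rule."""
    return count >= min_reportable_count(in_nyc_borough)


# Hypothetical records; the real dataset's column names may differ.
records = [
    {"name": "Ava", "area": "Albany", "in_nyc_borough": False, "count": 5},
    {"name": "Ava", "area": "Kings", "in_nyc_borough": True, "count": 9},
    {"name": "Liam", "area": "Queens", "in_nyc_borough": True, "count": 10},
]

published = [r for r in records if is_reported(r["count"], r["in_nyc_borough"])]
for r in published:
    print(r["name"], r["area"], r["count"])
```

A practical consequence is that a name missing from a county's rows may simply have fallen below the threshold, so an absent row should not be read as a count of zero.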
For more information, check out http://www.health.ny.gov/statistics/vital_statistics/, or go to the "About" tab.