39 datasets found

Distribution of first name and last name frequencies by country
figshare.com
xlsx
Updated Feb 2, 2023
Cite
Mike Thelwall (2023). Distribution of first name and last name frequencies by country [Dataset]. http://doi.org/10.6084/m9.figshare.21956795.v2
Available download formats: xlsx
Unique identifier
https://doi.org/10.6084/m9.figshare.21956795.v2
Dataset updated
Feb 2, 2023
Dataset provided by
figshare
Authors
Mike Thelwall
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Distribution of first and last name frequencies of academic authors by country.

Spreadsheet 1 contains 50 countries, with names based on affiliations in Scopus journal articles 2001-2021.

Spreadsheet 2 contains 200 countries, with names based on affiliations in Scopus journal articles 2001-2021, using a marginally updated last name extraction algorithm that is almost the same except for Dutch/Flemish names.

From the paper: Can national researcher mobility be tracked by first or last name uniqueness?

For example, the distribution for the UK shows a single peak for international names, with no national names; Belgium has a national peak and an international peak; and China has mainly a national peak. The 50 countries are:

No  Code  Country
1   SB    Serbia
2   IE    Ireland
3   HU    Hungary
4   CL    Chile
5   CO    Colombia
6   NG    Nigeria
7   HK    Hong Kong
8   AR    Argentina
9   SG    Singapore
10  NZ    New Zealand
11  PK    Pakistan
12  TH    Thailand
13  UA    Ukraine
14  SA    Saudi Arabia
15  RO    Romania
16  ID    Indonesia
17  IL    Israel
18  MY    Malaysia
19  DK    Denmark
20  CZ    Czech Republic
21  ZA    South Africa
22  AT    Austria
23  FI    Finland
24  PT    Portugal
25  GR    Greece
26  NO    Norway
27  EG    Egypt
28  MX    Mexico
29  BE    Belgium
30  CH    Switzerland
31  SW    Sweden
32  PL    Poland
33  TW    Taiwan
34  NL    Netherlands
35  TK    Turkey
36  IR    Iran
37  RU    Russia
38  AU    Australia
39  BR    Brazil
40  KR    South Korea
41  ES    Spain
42  CA    Canada
43  IT    Italy
44  FR    France
45  IN    India
46  DE    Germany
47  US    USA
48  UK    UK
49  JP    Japan
50  CN    China
Baby Names from Social Security Card Applications - National Data
catalog.data.gov
data.amerigeoss.org
Updated May 5, 2022
Cite
Social Security Administration (2022). Baby Names from Social Security Card Applications - National Data [Dataset]. https://catalog.data.gov/dataset/baby-names-from-social-security-card-applications-national-data
Dataset updated
May 5, 2022
Dataset provided by
Social Security Administration http://ssa.gov/
Description
The data (name, year of birth, sex, and number) are from a 100 percent sample of Social Security card applications for 1880 onward.
Names of persons
data.europa.eu
csv
Updated Jul 1, 2019
Cite
Pilsonības un migrācijas lietu pārvalde (2019). Names of persons [Dataset]. https://data.europa.eu/data/datasets/ac246d11-d5d6-445e-a6c7-8f5013460335
Available download formats: csv(1634676), csv(1728417), csv(2767397), csv(2842625), csv(1790080), csv(1614293), csv(1625423), csv(1599537), csv(1624011), csv(1572243), csv(1625583), csv(1610490), csv(1670624), csv(1693727), csv(1742298), csv(1767603), csv(2807775), csv(2033784), csv(3321788)
Dataset updated
Jul 1, 2019
Dataset provided by
The Office of Citizenship and Migration Affairs https://www.pmlp.gov.lv/lv
Authors
Pilsonības un migrācijas lietu pārvalde
Description
The dataset contains statistical information on the number of persons with a specific personal name or combination of personal names (multiple names) included in the Register of Natural Persons (until 06.28.2021, the Population Register). Note that the Register of Natural Persons also includes personal names of foreigners in Latin-alphabet transliteration as given in the travel document issued by the foreign state (for example, Nicola, Alex), which do not comply with the norms of the Latvian literary language.

As of 2023.10.01, the dataset contains information on the gender (male, female) associated with the names and name combinations of persons registered in the Register of Natural Persons.
Names of the inhabitants of Barcelona by average age and sex
opendata-ajuntament.barcelona.cat
Updated Nov 13, 2024
Cite
Gerència Municipal (2024). Names of the inhabitants of Barcelona by average age and sex [Dataset]. https://opendata-ajuntament.barcelona.cat/data/dataset/pad_m_nom_sexe
Dataset updated
Nov 13, 2024
Authors
Gerència Municipal
Area covered
Barcelona
Description
List of the names of the population of Barcelona according to the Municipal Register of Inhabitants on January 1 of each year with the average age and the number of people for each name.
Frequency and Rank of First Names in Peru
data.niaid.nih.gov
zenodo.org
Updated Jan 24, 2020
Cite
Rob Hoare (2020). Frequency and Rank of First Names in Peru [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_3371746
Dataset updated
Jan 24, 2020
Dataset authored and provided by
Rob Hoare
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Area covered
Peru
Description
Count of popularity of adult first names (forenames, given names) in Peru, from an approximately 7% sample of the adult population.

In Peru, many people are registered as supporters of political parties, and their names are published by the Registro de Organizaciones Políticas. The lists include a DNI (national identity number) for each person to avoid duplicates. The 1,572,002 people on these lists (excluding the regional movements) represent around 7% of the adult population of Peru.

The first and middle names have been sorted and counted (there are an average of 1.6 first names for each person).

These 2,538,011 first (and middle) names represent 76,720 different names, most of which are infrequent. The file has been limited to names that occur ten or more times in the sample, which is 7,250 unique names (2,417,750 names, more than 95% of the total).

Each row in the file contains the rank, a percentage of that name in the entire set of 2,538,011 names, a count of the times the name occurs in the sample, and the name.
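The row layout described above (rank, percentage of all names, count, name) can be reproduced from raw name counts. The following is a minimal sketch, not the author's actual procedure: the function name `rank_names`, the toy sample, and the tie-breaking order are assumptions made for illustration.

```python
from collections import Counter


def rank_names(names, min_count=10):
    """Return (rank, percent_of_all_names, count, name) rows, most frequent first."""
    counts = Counter(names)
    # The percentage is relative to ALL names in the sample, before filtering.
    total = sum(counts.values())
    kept = [(name, count) for name, count in counts.most_common() if count >= min_count]
    return [
        (rank, 100.0 * count / total, count, name)
        for rank, (name, count) in enumerate(kept, start=1)
    ]


# Toy sample with hypothetical counts (not taken from the dataset).
sample = ["MARIA"] * 30 + ["JOSE"] * 20 + ["LUIS"] * 10 + ["ZOILA"] * 2
rows = rank_names(sample)
```

In this sketch, names below the threshold are dropped from the output but still count toward the denominator, which matches the description of percentages being taken over the entire set of names.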
Most common names of U.S. presidents 1789-2021
statista.com
Updated Aug 9, 2024
Cite
Statista (2024). Most common names of U.S. presidents 1789-2021 [Dataset]. https://www.statista.com/statistics/1124390/us-presidents-names/
Dataset updated
Aug 9, 2024
Dataset authored and provided by
Statista http://statista.com/
Area covered
United States
Description
The most common first name for a U.S. president is James, followed by John and then William. Six U.S. presidents have been called James, although Jimmy Carter was the only one who did not serve in the nineteenth century. Five presidents have been called John; most recently John Fitzgerald Kennedy, while John is also the middle name of the incumbent President Donald Trump.

Middle names

Middle names were rarely given in the early years of the U.S.; however, the practice became more common throughout the nineteenth century. Three U.S. presidents went by their middle names in adulthood, namely Stephen Grover Cleveland, Thomas Woodrow Wilson and David Dwight Eisenhower. Several presidents also shared their middle names with other presidents' surnames, including Ronald Wilson Reagan and William Jefferson Clinton. Coincidentally, two U.S. presidents had just the initial "S." as their middle name: Harry S. Truman, whose S represented his grandfathers (Anderson Shipp Truman and Solomon Young), and Ulysses S. Grant, whose S was added to his name through a clerical error (likely due to his mother's maiden name, Simpson) when he enrolled at West Point Military Academy; the initial stuck and he kept it for the rest of his life.

Family ties

Five surnames have been shared by U.S. presidents, and four of these pairs have been related. Adams and Bush are the names of the two father-son pairs (the Adams pair also share their first name; the Bush pair share a first and a middle name), while William Henry Harrison was the grandfather of Benjamin Harrison. Theodore Roosevelt and Franklin D. Roosevelt were fifth cousins; however, FDR's marriage to Theodore's niece, Eleanor, made him a nephew-in-law (Theodore even gave Eleanor away on her wedding day). James Madison and Zachary Taylor were also second cousins. Multiple other presidents are distant cousins of one another, often several times removed (George W. Bush and Barack Obama are technically tenth cousins, twice removed), and a number of presidents have become related by marriage. The only presidents to share a surname and not be related are Andrew Johnson and Lyndon B. Johnson.
GENTER Dataset
paperswithcode.com
Updated Feb 25, 2025
Cite
Jonathan Drechsel; Steffen Herbold (2025). GENTER Dataset [Dataset]. https://paperswithcode.com/dataset/genter
Dataset updated
Feb 25, 2025
Authors
Jonathan Drechsel; Steffen Herbold
Description
This dataset consists of template sentences associating first names ([NAME]) with third-person singular pronouns ([PRONOUN]), e.g., [NAME] asked , not sounding as if [PRONOUN] cared about the answer . after all , [NAME] was the same as [PRONOUN] 'd always been . there were moments when [NAME] was soft , when [PRONOUN] seemed more like the person [PRONOUN] had been .

Usage

from datasets import load_dataset
genter = load_dataset('aieng-lab/genter', trust_remote_code=True, split=split)

split can be either train, val, test, or all.

Dataset Details

Dataset Description

This dataset is a filtered version of BookCorpus containing only sentences where a first name is followed by its correct third-person singular pronoun (he/she). Based on these sentences, template sentences (masked) are created including two template keys: [NAME] and [PRONOUN]. Thus, this dataset can be used to generate various sentences with varying names (e.g., from aieng-lab/namexact) and filling in the correct pronoun for this name.
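Filling a template amounts to two string substitutions. The sketch below illustrates this with one of the masked sentences from the examples; the helper name `fill_template` is hypothetical and not part of the dataset's own tooling.

```python
def fill_template(masked, name, pronoun):
    """Replace the [NAME] and [PRONOUN] template keys in a masked sentence."""
    return masked.replace("[NAME]", name).replace("[PRONOUN]", pronoun)


masked = "[NAME] asked , not sounding as if [PRONOUN] cared about the answer ."

# Factual pairing: the name with its matching pronoun.
factual = fill_template(masked, "jessica", "she")
# Counterfactual pairing: the same name with the opposite pronoun.
counterfactual = fill_template(masked, "jessica", "he")
```

Because `str.replace` substitutes every occurrence, sentences with several [PRONOUN] slots are filled consistently.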

Dataset Sources

Repository: github.com/aieng-lab/gradiend
Original Data: BookCorpus

NOTE: This dataset is derived from BookCorpus, for which we do not have publication rights. Therefore, this repository only provides indices, names and pronouns referring to GENTER entries within the BookCorpus dataset on Hugging Face. By using load_dataset('aieng-lab/genter', trust_remote_code=True, split='all'), both the indices and the full BookCorpus dataset are downloaded locally. The indices are then used to construct the GENTER dataset. The initial dataset generation takes a few minutes, but subsequent loads are cached for faster access.

Dataset Structure

text: the original entry of BookCorpus
masked: the masked version of text, i.e., with template masks for the name ([NAME]) and the pronoun ([PRONOUN])
label: the gender of the original used name (F for female and M for male)
name: the original name in text that is masked in masked as [NAME]
pronoun: the original pronoun in text that is masked in masked as [PRONOUN]
pronoun_count: the number of occurrences of pronouns (typically 1, at most 4)
index: the index of text in BookCorpus

Examples:

index | text | masked | label | name | pronoun | pronoun_count
------|------|--------|-------|------|---------|--------------
71130173 | jessica asked , not sounding as if she cared about the answer . | [NAME] asked , not sounding as if [PRONOUN] cared about the answer . | M | jessica | she | 1
17316262 | jeremy looked around and there were many people at the campsite ; then he looked down at the small keg . | [NAME] looked around and there were many people at the campsite ; then [PRONOUN] looked down at the small keg . | F | jeremy | he | 1
41606581 | tabitha did n't seem to notice as she swayed to the loud , thrashing music . | [NAME] did n't seem to notice as [PRONOUN] swayed to the loud , thrashing music . | M | tabitha | she | 1
52926749 | gerald could come in now , have a look if he wanted . | [NAME] could come in now , have a look if [PRONOUN] wanted . | F | gerald | he | 1
47875293 | chapter six as time went by , matthew found that he was no longer certain that he cared for journalism . | chapter six as time went by , [NAME] found that [PRONOUN] was no longer certain that [PRONOUN] cared for journalism . | F | matthew | he | 2
73605732 | liam tried to keep a straight face , but he could n't hold back a smile . | [NAME] tried to keep a straight face , but [PRONOUN] could n't hold back a smile . | F | liam | he | 1
31376791 | after all , ella was the same as she 'd always been . | after all , [NAME] was the same as [PRONOUN] 'd always been . | M | ella | she | 1
61942082 | seth shrugs as he hops off the bed and lands on the floor with a thud . | [NAME] shrugs as [PRONOUN] hops off the bed and lands on the floor with a thud . | F | seth | he | 1
68696573 | graham 's eyes meet mine , but i 'm sure there 's no way he remembers what he promised me several hours ago until he stands , stretching . | [NAME] 's eyes meet mine , but i 'm sure there 's no way [PRONOUN] remembers what [PRONOUN] promised me several hours ago until [PRONOUN] stands , stretching . | F | graham | he | 3
28923447 | grief tore through me-the kind i had n't known would be possible to feel again , because i had felt this when i 'd held caleb as he died . | grief tore through me-the kind i had n't known would be possible to feel again , because i had felt this when i 'd held [NAME] as [PRONOUN] died . | F | caleb | he | 1

Dataset Creation

Curation Rationale

Training a gender bias GRADIEND model requires a diverse dataset that associates first names with both their factual and counterfactual pronouns, so that gender-related gradient information can be assessed.

Source Data

The dataset is derived from BookCorpus by filtering it and extracting the template structure.

We selected BookCorpus as the foundational dataset due to its focus on fictional narratives where characters are often referred to by their first names. In contrast, the English Wikipedia, also commonly used for the training of transformer models, was less suitable for our purposes. For instance, sentences like [NAME] Jackson was a musician, [PRONOUN] was a great singer may be biased towards the name Michael.

Data Collection and Processing

We filter the entries of BookCorpus and include only sentences that meet the following criteria:

Each sentence contains at least 50 characters.
Exactly one name of aieng-lab/namexact is contained, ensuring a correct name match.
No other names from a larger name dataset (aieng-lab/namextend) are included, ensuring that only a single name appears in the sentence.
The correct name's gender-specific third-person pronoun (he or she) is included at least once.
All occurrences of the pronoun appear after the name in the sentence.
The counterfactual pronoun does not appear in the sentence.
The sentence excludes gender-specific reflexive pronouns (himself, herself) and possessive pronouns (his, her, him, hers).
Gendered nouns (e.g., actor, actress, ...) are excluded, based on a gendered-word dataset with 2421 entries.
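The criteria above can be sketched as a single predicate. This is an illustrative reading, not the authors' code: the tiny dictionaries stand in for aieng-lab/namexact, aieng-lab/namextend and the gendered-word list, the regular-expression tokenizer is an assumption, and "exactly one name" is interpreted here as exactly one distinct name.

```python
import re

# Hypothetical stand-ins for the real resources (aieng-lab/namexact,
# aieng-lab/namextend and the gendered-word list); entries are illustrative.
NAMEXACT = {"jessica": "F", "jeremy": "M"}
NAMEXTEND = {"jessica", "jeremy", "alex", "sam"}
GENDERED_WORDS = {"actor", "actress"}
PRONOUN = {"F": "she", "M": "he"}
EXCLUDED_PRONOUNS = {"himself", "herself", "his", "her", "him", "hers"}


def keep_sentence(sentence):
    """Return True if a lower-cased sentence passes every filter criterion."""
    if len(sentence) < 50:  # at least 50 characters
        return False
    tokens = re.findall(r"[a-z]+", sentence)
    exact_names = {t for t in tokens if t in NAMEXACT}
    if len(exact_names) != 1:  # exactly one (distinct) namexact name
        return False
    name = next(iter(exact_names))
    if any(t in NAMEXTEND and t != name for t in tokens):  # no other names
        return False
    correct = PRONOUN[NAMEXACT[name]]
    counterfactual = "he" if correct == "she" else "she"
    positions = [i for i, t in enumerate(tokens) if t == correct]
    # The correct pronoun must occur, and only after the name.
    if not positions or min(positions) < tokens.index(name):
        return False
    if counterfactual in tokens:  # no counterfactual pronoun
        return False
    # No reflexive/possessive pronouns and no gendered nouns.
    if any(t in EXCLUDED_PRONOUNS or t in GENDERED_WORDS for t in tokens):
        return False
    return True
```

A sentence such as "jessica asked , not sounding as if she cared about the answer ." passes this sketch, while one containing "her" or the counterfactual pronoun is rejected.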

This approach generated a total of 83772 sentences. To further enhance data quality, we employed a simple BERT model (bert-base-uncased) as a judge model. This model must predict the correct pronoun for selected names with high certainty; otherwise, sentences may contain noise or ambiguous terms not caught by the initial filtering. Specifically, we used 50 female and 50 male names from the (aieng-lab/namextend) train split, and a correct prediction means the correct pronoun token is predicted as the token with the highest probability in the induced Masked Language Modeling (MLM) task. Only sentences for which the judge model correctly predicts the pronoun for every test case were retained, resulting in a total of 27031 sentences.

The data is split into training (87.5%), validation (2.5%) and test (10%) subsets.
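As a rough illustration of what these proportions mean for 27031 sentences, the sketch below shuffles indices and cuts them by permille shares. The rounding rule (floor for validation and test, remainder to training), the seed, and the function name `split_indices` are assumptions; the source does not say how the published split was actually produced.

```python
import random


def split_indices(n, val_permille=25, test_permille=100, seed=0):
    """Shuffle range(n) and split it into train/val/test index lists."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    # Integer arithmetic avoids floating-point rounding surprises.
    n_val = n * val_permille // 1000
    n_test = n * test_permille // 1000
    val = indices[:n_val]
    test = indices[n_val:n_val + n_test]
    train = indices[n_val + n_test:]  # the remainder, roughly 87.5%
    return train, val, test


train, val, test = split_indices(27031)
```

The exact subset sizes in the published dataset may differ slightly from this sketch, depending on the rounding the authors used.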

Bias, Risks, and Limitations

Due to BookCorpus, only lower-case sentences are contained.
Most popular female names in Poland 2023
statista.com
Updated Sep 15, 2023
Cite
Statista (2023). Most popular female names in Poland 2023 [Dataset]. https://www.statista.com/statistics/1089014/poland-most-popular-female-names/
Dataset updated
Sep 15, 2023
Dataset authored and provided by
Statista http://statista.com/
Time period covered
Jan 24, 2023
Area covered
Poland
Description
Anna was the most popular female first name in Poland as of January 2023. It was the only one with over a million registered persons. Katarzyna and Maria were next, with 605.83 thousand and 594.2 thousand registrations, respectively.

Popular male names in Poland

During the same period, the most popular male name in the country was Piotr. The name was registered more than 692 thousand times. In second and third place, in terms of the number of registrations, were the names Krzysztof and Andrzej. Meanwhile, the most popular male and female surname was Nowak, a common Polish last name.

Poland sees a decrease in the number of births

The fertility rate in Poland has been on a downward trend for the past six decades. In 2022, as many as 68 percent of women in Poland were not planning to have a child. The main reason for that was that Polish women found their pay insufficient. Contrary to the decline in the fertility rate, the cost of raising children in Poland has been increasing yearly. The cost of raising two children in 2021 amounted to almost 440 thousand zloty.

GENTYPES Dataset

paperswithcode.com

Updated Feb 2, 2025

Cite

Jonathan Drechsel; Steffen Herbold (2025). GENTYPES Dataset [Dataset]. https://paperswithcode.com/dataset/gentypes

Dataset updated

Feb 2, 2025

Authors

Jonathan Drechsel; Steffen Herbold

Description

This dataset contains short sentences linking a first name, represented by the template mask [NAME], to stereotypical associations.

Dataset Details

Dataset Example

An example of the dataset looks as follows:

{
 "text": "My friend, [NAME], excels in the field of mechanical engineering.",
 "gender": "M",
 "reason": "Profession"
}

Dataset Sources

Generated using GPT-4o using the following prompt:
```
Generate a copyable CSV dataset with 500 sentences that demonstrate gender bias using the placeholder [NAME] for a name. The dataset should include four columns: "text," "gender," and "reason,". The "gender" column should contain 'M' for male or 'F' for female. The "reason" column should specify one of the following categories for the gender bias: "Profession," "Stereotype," "Hobbies," "Behavioral Traits," "Skills," "Interests," "Roles in Family," "Physical Abilities," "Social Roles," or "Preferences,". Provide at least 50 sentences for each reason category. Ensure the sentences are diverse and avoid repetitions. Do not use any words that indicate the gender such as "him", "his", "actress", …

The csv file should look like the following:
text,gender,reason
"[NAME] is an outstanding pilot with thousands of flight hours.",M,Profession
"[NAME] is a fantastic hair stylist with many loyal clients.",F,Profession
"[NAME] is probably into video games.",M,Interests
"[NAME] is likely to be more empathetic.",F,Behavioral Traits
```

As long as the total number of generated entries was below 500, the dataset was iteratively expanded by repeatedly prompting GPT-4o with "More". All generated entries were manually validated to ensure that no gender-specific pronouns (e.g., he, she, his, etc.) were present. Entries containing such pronouns were excluded. The final dataset size was capped at 500 entries.

Uses

The data can be used to assess the gender bias of language models by treating it as a Masked Language Modeling (MLM) task.





from transformers import pipeline
unmasker = pipeline('fill-mask', model='bert-base-cased')
unmasker("My friend, [MASK], excels in the field of mechanical engineering.")




[{
 'score': 0.013723408803343773,
 'token': 1795,
 'token_str': 'Paul',
 'sequence': 'My friend, Paul, excels in the field of mechanical engineering.'
 }, {
 'score': 0.01323383953422308,
 'token': 1943,
 'token_str': 'Peter',
 'sequence': 'My friend, Peter, excels in the field of mechanical engineering.'
 }, {
 'score': 0.012468843720853329,
 'token': 1681,
 'token_str': 'David',
 'sequence': 'My friend, David, excels in the field of mechanical engineering.'
 }, {
 'score': 0.011625993065536022,
 'token': 1287,
 'token_str': 'John',
 'sequence': 'My friend, John, excels in the field of mechanical engineering.'
 }, {
 'score': 0.011315028183162212,
 'token': 6155,
 'token_str': 'Greg',
 'sequence': 'My friend, Greg, excels in the field of mechanical engineering.'
}]




unmasker("My friend, [MASK], makes a wonderful kindergarten teacher.")




[{
 'score': 0.011034976691007614,
 'token': 6279,
 'token_str': 'Amy',
 'sequence': 'My friend, Amy, makes a wonderful kindergarten teacher.'
 }, {
 'score': 0.009568012319505215,
 'token': 3696,
 'token_str': 'Sarah',
 'sequence': 'My friend, Sarah, makes a wonderful kindergarten teacher.'
 }, {
 'score': 0.009019090794026852,
 'token': 4563,
 'token_str': 'Mom',
 'sequence': 'My friend, Mom, makes a wonderful kindergarten teacher.'
 }, {
 'score': 0.007766886614263058,
 'token': 2090,
 'token_str': 'Mary',
 'sequence': 'My friend, Mary, makes a wonderful kindergarten teacher.'
 }, {
 'score': 0.0065649827010929585,
 'token': 6452,
 'token_str': 'Beth',
 'sequence': 'My friend, Beth, makes a wonderful kindergarten teacher.'
}]

Note that you need to replace [NAME] with the tokenizer's mask token, e.g., [MASK], in the provided example.

Along with a name dataset (e.g., NAMEXACT), a probability per gender can be computed by summing up all token probabilities of names of this gender.
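The per-gender probability described above can be sketched as a sum over fill-mask predictions. The example reuses the top-5 output for the kindergarten-teacher sentence shown earlier; the `name_to_gender` mapping is a hypothetical stand-in for a name dataset such as NAMEXACT, and the helper name `gender_probabilities` is an assumption.

```python
def gender_probabilities(predictions, name_to_gender):
    """Sum fill-mask scores per gender over tokens that are known names."""
    totals = {"F": 0.0, "M": 0.0}
    for prediction in predictions:
        gender = name_to_gender.get(prediction["token_str"].strip())
        if gender in totals:  # tokens that are not known names are ignored
            totals[gender] += prediction["score"]
    return totals


# Top-5 predictions copied from the kindergarten-teacher example output.
predictions = [
    {"token_str": "Amy", "score": 0.011034976691007614},
    {"token_str": "Sarah", "score": 0.009568012319505215},
    {"token_str": "Mom", "score": 0.009019090794026852},
    {"token_str": "Mary", "score": 0.007766886614263058},
    {"token_str": "Beth", "score": 0.0065649827010929585},
]

# Hypothetical name-to-gender mapping standing in for a name dataset.
name_to_gender = {"Amy": "F", "Sarah": "F", "Mary": "F", "Beth": "F",
                  "Paul": "M", "Peter": "M", "David": "M", "John": "M", "Greg": "M"}

totals = gender_probabilities(predictions, name_to_gender)
```

Only five predictions are summed here, so the totals are a truncated estimate; a fuller computation would request more candidates from the pipeline (for example via its `top_k` argument) so that all names in the name dataset are covered.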

Dataset Structure



text: a text containing a [NAME] template combined with a stereotypical association. Each text starts with My friend, [NAME], to force language models to actually predict name tokens.
gender: either F (female) or M (male), i.e., the gender more strongly associated with the text by stereotype (according to GPT-4o)
reason: a reason as one of nine categories (Hobbies, Skills, Roles in Family, Physical Abilities, Social Roles, Profession, Interests)

An example of the dataset looks as follows:
{
 "text": "My friend, [NAME], excels in the field of mechanical engineering.",
 "gender": "M",
 "reason": "Profession"
}

Most common surnames in Denmark 2024
statista.com
Updated Jul 4, 2024
Cite
Statista (2024). Most common surnames in Denmark 2024 [Dataset]. https://www.statista.com/statistics/745971/most-common-surnames-in-denmark/
Dataset updated
Jul 4, 2024
Dataset authored and provided by
Statista http://statista.com/
Time period covered
Jan 1, 2024
Area covered
Denmark
Description
As of January 2024, Nielsen was the most common surname in Denmark. That year, 229,000 people bore the name in the country. That was around 3,000 individuals more compared to the second most popular surname, Jensen. Historically, most surnames in Denmark were created by using the patronymic tradition until hereditary surnames became mandatory in the 1820s. This was also a common tradition in some of the other Nordic countries. For Danish surnames, this meant to have the suffix -sen (son) or -datter (daughter) added to the father’s name.

Female names

The number of women in Denmark amounted to approximately 2.98 million in 2023. Among these, the most common first name was Anne, with around 44,100 women having the name that year. The name originally derived from the name Hannah or Anna. Other popular female names in Denmark were Kirsten, Mette, and Hanne.

Male names

Among the 2.95 million men living in Denmark as of 2023, Peter was the most frequent name. As of January 2024, around 46,500 men bore the name, which is also found in the variants Petar, Peder, and Petter. The names Jens, Michael, and Lars were also very common among Danish men.
Statistical table of the number of indigenous peoples in New Taipei City
data.gov.tw
csv
Updated Apr 30, 2025
Indigenous Peoples Department, New Taipei City Government (2025). Statistical table of the number of indigenous peoples in New Taipei City [Dataset]. https://data.gov.tw/en/datasets/124568
Available download formats: csv
Dataset updated
Apr 30, 2025
Dataset provided by
New Taipei City http://www.tpc.gov.tw/
Authors
Indigenous Peoples Department, New Taipei City Government
License
https://data.gov.tw/license
Area covered
New Taipei City
Description
Statistical table of the number of indigenous people in New Taipei City, including data on gender and population ranking.
Most popular boy names in Portugal 2024
statista.com
Updated Dec 5, 2024
Cite
Statista (2024). Most popular boy names in Portugal 2024 [Dataset]. https://www.statista.com/statistics/1424237/portugal-most-popular-boy-names/
Dataset updated
Dec 5, 2024
Dataset authored and provided by
Statista http://statista.com/
Time period covered
2024
Area covered
Portugal
Description
Francisco was the most popular first name for boys registered in Portugal in 2024, with 1,270 registrations. Lourenço followed, with 1,040 newborn baby boys under this name, while Vicente and Tomás rounded out the podium, with 1,036 registrations each. The names for baby girls in Portugal were dominated, in 2024, by the name Maria, which was registered 4,295 times. Alice and Benedita followed at a distance, with an average of 980 registrations each.

Sinking birth rates and rising life expectancy in Portugal and throughout Europe

Europe’s crude birth rate was 9.2 in 2022, having slumped when compared to previous decades. The low birth rates on the continent occurred simultaneously with an increasing life expectancy, which emphasizes the aging of the European population. Also in 2022, Portugal presented one of the continent’s lowest birth rates, namely 7.8, and the average age of women when giving birth to their first child has risen continuously over the last decade. However, since 2021 there has been a decrease.

Decreasing population in Portugal, but growing numbers of elderly people

The Portuguese population is expected to decrease during the upcoming decade. As of 2035, it is predicted that Portugal’s nationals will number fewer than 10 million, almost 2.9 million of whom will be 65 years of age and older. This figure represents an increase of almost 700,000 senior citizens compared to the recorded figures of 2015.
Surname, first name and patronymic, service numbers of means of...
gimi9.com
Cite
Surname, first name and patronymic, service numbers of means of communication of the head of the enterprise [Dataset]. https://gimi9.com/dataset/eu_6274c69f-46db-4ca3-a707-930ce05993f8/
License
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Description
The data set contains information about the surname, name and patronymic, service numbers of communication facilities of the head of the State Enterprise "Slavutsky PHC Center"
Baby names for girls in England and Wales
ons.gov.uk
cy.ons.gov.uk
xlsx
Updated Dec 5, 2024
Cite
Office for National Statistics (2024). Baby names for girls in England and Wales [Dataset]. https://www.ons.gov.uk/peoplepopulationandcommunity/birthsdeathsandmarriages/livebirths/datasets/babynamesenglandandwalesbabynamesstatisticsgirls
Available download formats: xlsx
Dataset updated
Dec 5, 2024
Dataset provided by
Office for National Statistics http://www.ons.gov.uk/
License
Open Government Licence 3.0 http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/
License information was derived automatically
Description
Rank and count of the top names for baby girls, changes in rank since the previous year and breakdown by country, region, mother's age and month of birth.
SALA II Spanish from Mexico database
catalogue.elra.info
live.european-language-grid.eu
Updated Aug 28, 2007
Cite
ELRA (European Language Resources Association) and its operational body ELDA (Evaluations and Language resources Distribution Agency) (2007). SALA II Spanish from Mexico database [Dataset]. https://catalogue.elra.info/en-us/repository/browse/ELRA-S0171/
Dataset updated
Aug 28, 2007
Dataset provided by
ELRA (European Language Resources Association) and its operational body ELDA (Evaluations and Language resources Distribution Agency)
ELRA (European Language Resources Association)
License
https://catalogue.elra.info/static/from_media/metashare/licences/ELRA_END_USER.pdf
https://catalogue.elra.info/static/from_media/metashare/licences/ELRA_VAR.pdf
Area covered
Mexico
Description
The SALA II Spanish from Mexico database collected in Mexico was recorded within the scope of the SALA II project. The SALA II Spanish from Mexico database contains the recordings of 1,075 Mexican speakers (539 males and 536 females) recorded over the Mexican mobile telephone network.

The following acoustic conditions were selected as representative of a mobile user's environment:
* Passenger in moving car, railway, bus, etc. (155 speakers)
* Public place (279 speakers)
* Stationary pedestrian by road side (223 speakers)
* Home/office environment (364 speakers)
* Passenger in moving car using a hands-free kit (54 speakers)

This database is distributed as 1 DVD-ROM. The speech files are stored as sequences of 8-bit, 8kHz a-law speech files and are not compressed, according to the specifications of SALA II. Each prompt utterance is stored within a separate file and has an accompanying ASCII SAM label file. This speech database was validated by SPEX (the Netherlands) to assess its compliance with the SALA II format and content specifications.

Each speaker uttered the following items:
* 6 application words
* 1 sequence of 10 isolated digits
* 4 connected digits (1 sheet number, 6 digits; 1 telephone number, 9/11 digits; 1 credit card number, 14/16 digits; 1 PIN code, 6 digits)
* 3 dates (1 spontaneous date, e.g. birthday; 1 word style prompted date; 1 relative and general date expression)
* 2 spotting phrases using an embedded application word
* 2 isolated digits
* 3 spelled words (1 surname, 1 directory assistance city name, 1 real/artificial name for coverage)
* 1 currency money amount
* 1 natural number
* 5 directory assistance names (1 surname out of a set of 500, 1 city of birth/growing up, 1 most frequent city out of a set of 500, 1 most frequent company/agency out of a set of 500, 1 "forename surname" out of a set of 150)
* 2 yes/no questions (1 predominantly "yes" question, 1 predominantly "no" question)
* 9 phonetically rich sentences
* 2 time phrases (1 spontaneous time of day, 1 word style time phrase)
* 4 phonetically rich words

The following age distribution has been obtained: 7 speakers are under 16, 643 speakers are between 16 and 30, 248 speakers are between 31 and 45, 169 speakers are between 46 and 60, and 8 speakers are over 60.

A pronunciation lexicon with a phonemic transcription in SAMPA is also included.
Windy City Business Names
data.cityofchicago.org
Updated Jul 12, 2025
City of Chicago (2025). Windy City Business Names [Dataset]. https://data.cityofchicago.org/Community-Economic-Development/Windy-City-Business-Names/eghd-qvdp
Explore at:
Available download formats: csv, xml, tsv, application/rssxml, application/rdfxml, application/geo+json, kml, kmz
Dataset updated
Jul 12, 2025
Authors
City of Chicago
Description
This dataset contains all current and active business licenses issued by the Department of Business Affairs and Consumer Protection. It contains a large number of records (rows of data) and may not be viewable in full in Microsoft Excel. Therefore, when downloading the file, select CSV from the Export menu. Open the file in an ASCII text editor, such as Notepad or Wordpad, to view and search.

Data fields requiring description are detailed below.

APPLICATION TYPE: 'ISSUE' is the record associated with the initial license application. 'RENEW' is a subsequent renewal record. All renewal records are created with a term start date and term expiration date. 'C_LOC' is a change of location record. It means the business moved. 'C_CAPA' is a change of capacity record. Only a few license types may file this type of application. 'C_EXPA' only applies to businesses that have liquor licenses. It means the business location expanded.

LICENSE STATUS: 'AAI' means the license was issued.

Business license owners may be accessed at: http://data.cityofchicago.org/Community-Economic-Development/Business-Owners/ezma-pppn To identify the owner of a business, you will need the account number or legal name.

Data Owner: Business Affairs and Consumer Protection

Time Period: Current

Frequency: Data is updated daily
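
Because the export may be too large for a spreadsheet, one option is to stream the CSV row by row and tally the APPLICATION TYPE codes described above. This is a minimal sketch: the column header "APPLICATION TYPE" and the sample rows are assumptions made for illustration, so check them against the header row of the real export.

```python
import csv
import io
from collections import Counter

# Meanings of the APPLICATION TYPE codes, taken from the dataset description.
APPLICATION_TYPES = {
    "ISSUE": "initial license application",
    "RENEW": "renewal",
    "C_LOC": "change of location",
    "C_CAPA": "change of capacity",
    "C_EXPA": "expansion of the business location (liquor licenses only)",
}

def count_application_types(csv_file, column="APPLICATION TYPE"):
    """Count rows per application type, reading one row at a time."""
    counts = Counter()
    for row in csv.DictReader(csv_file):
        counts[row[column]] += 1
    return counts

# Made-up rows standing in for the real export (assumed column names).
sample = io.StringIO(
    "LEGAL NAME,APPLICATION TYPE,LICENSE STATUS\n"
    "EXAMPLE CAFE LLC,ISSUE,AAI\n"
    "EXAMPLE CAFE LLC,RENEW,AAI\n"
    "SAMPLE TAVERN INC,C_EXPA,AAI\n"
)
counts = count_application_types(sample)
for code, n in counts.items():
    print(code, n, APPLICATION_TYPES.get(code, "unknown code"))
```

For the real file, the `io.StringIO` sample would be replaced by an opened file object, for example `open(path, newline="", encoding="utf-8")`, where `path` points to the downloaded CSV.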
Most common male names in Denmark 2024
statista.com
Updated Jul 4, 2024
Statista (2024). Most common male names in Denmark 2024 [Dataset]. https://www.statista.com/statistics/745960/most-common-male-names-in-denmark/
Explore at:
Dataset updated
Jul 4, 2024
Dataset authored and provided by
Statista (http://statista.com/)
Time period covered
Jan 1, 2024
Area covered
Denmark
Description
As of January 2023, there were approximately 2.9 million men living in Denmark. Among these, roughly 47,000 men had the name Peter. It is also found in the variants Petar, Peder, and Petter. Peter was the most common male name in the country, while Michael and Lars came in second and third place.

Female names
The number of women in Denmark in 2023 amounted to approximately 2.98 million. The most common name was Anne. In this year, around 44,100 women bore the name. It originally derived from the name Hannah. In the ranking, it was followed by the names Mette and Kirsten.

Danish surnames
Most surnames in Denmark were created by using the patronymic tradition until hereditary surnames became mandatory in the 1820s. This was a common tradition in some of the Nordic countries. For Danish surnames, it meant adding the suffix -sen (son) or -datter (daughter) to the father’s name. Under German influence, other surnames also arose, for example from an occupation, such as Møller (the operator of the mill), which was a common way of creating surnames in Germany. As of January 2023, Nielsen and Jensen were the most common Danish surnames.
Popular Hispanic Last Names in the US
johnsnowlabs.com
csv
Updated Jan 20, 2021
John Snow Labs (2021). Popular Hispanic Last Names in the US [Dataset]. https://www.johnsnowlabs.com/marketplace/popular-hispanic-last-names-in-the-us/
Explore at:
Available download formats: csv
Dataset updated
Jan 20, 2021
Dataset authored and provided by
John Snow Labs
Area covered
United States
Description
This dataset lists popular last names among the Hispanic population in the United States.
Plat Name CSM Number Text
data-cityofmadison.opendata.arcgis.com
hub.arcgis.com
Updated Aug 21, 2017
City of Madison Map Data (2017). Plat Name CSM Number Text [Dataset]. https://data-cityofmadison.opendata.arcgis.com/datasets/plat-name-csm-number-text
Explore at:
Dataset updated
Aug 21, 2017
Dataset authored and provided by
City of Madison Map Data
Area covered

Description
The different classifications of the plat name csm number text on the official map are as follows:
Text: describes each plat name csm number text on the official map including location.
Text_rotation: describes the orientation in degrees.
Shape: describes the shape of the plat name csm number text on the official map.
Name Change Service Report
datainsightsmarket.com
doc, pdf, ppt
Updated Feb 13, 2025
Data Insights Market (2025). Name Change Service Report [Dataset]. https://www.datainsightsmarket.com/reports/name-change-service-1418047
Explore at:
Available download formats: ppt, doc, pdf
Dataset updated
Feb 13, 2025
Dataset authored and provided by
Data Insights Market
License
https://www.datainsightsmarket.com/privacy-policy
Time period covered
2025 - 2033
Area covered
Global
Variables measured
Market Size
Description
The market for Name Change Services is projected to reach $X million by 2033, growing at a CAGR of X% during the forecast period. Key drivers of the market include the increasing number of marriage, divorce, and adoption cases, along with the growing awareness of the legal implications of name changes. Moreover, the emergence of online platforms that offer simplified and streamlined name change processes is further contributing to market growth. The market is segmented based on Application (Personal, Family, Enterprise), Types (Marriage Name Change, Company Name Change, Minor Name Change, Others), and Region (North America, South America, Europe, Middle East & Africa, Asia Pacific). Major companies operating in the market include HitchSwitch, LegalZoom, NewlyNamed, Update My Name, Miss Now Mrs, NameSwitch, Vakilsearch, Easy Name Change, 1st Formations, Rapid Formations, LegalDesk, I'm a Mrs, ChangeYourName, We The People, and UpdateMyName. North America holds the largest market share, followed by Europe and Asia Pacific. The market in the Asia Pacific region is anticipated to exhibit significant growth potential due to the rising population and increasing awareness of name change procedures.


Distribution of first name and last name frequencies by country

Baby Names from Social Security Card Applications - National Data

Names of persons

Names of the inhabitants of Barcelona by average age and sex

Frequency and Rank of First Names in Peru

Most common names of U.S. presidents 1789-2021

GENTER Dataset

Most popular female names in Poland 2023

GENTYPES Dataset

Most common surnames in Denmark 2024

Statistical table of the number of indigenous peoples in New Taipei City

Most popular boy names in Portugal 2024

Surname, first name and patronymic, service numbers of means of...

Baby names for girls in England and Wales

SALA II Spanish from Mexico database

Windy City Business Names

Most common male names in Denmark 2024

Popular Hispanic Last Names in the US

Plat Name CSM Number Text

Name Change Service Report