https://creativecommons.org/publicdomain/zero/1.0/
Cultural diversity in the U.S. has led to great variations in names and naming traditions, and names have been used to express creativity, personality, cultural identity, and values. Source: https://en.wikipedia.org/wiki/Naming_in_the_United_States
This public dataset was created by the Social Security Administration and contains all names from Social Security card applications for births that occurred in the United States after 1879. Note that many people born before 1937 never applied for a Social Security card, so their names are not included in this data. For others who did apply, records may not show the place of birth, and again their names are not included in the data.
All data are from a 100% sample of records on Social Security card applications as of the end of February 2015. To safeguard privacy, the Social Security Administration restricts names to those with at least 5 occurrences.
Fork this kernel to get started with this dataset.
https://bigquery.cloud.google.com/dataset/bigquery-public-data:usa_names
https://cloud.google.com/bigquery/public-data/usa-names
Dataset Source: Data.gov. This dataset is publicly available for anyone to use under the following terms provided by the Dataset Source — http://www.data.gov/privacy-policy#data_policy — and is provided "AS IS" without any warranty, express or implied, from Google. Google disclaims all liability for any damages, direct or indirect, resulting from the use of the dataset.
Banner Photo by @dcp from Unsplash.
What are the most common names?
What are the most common female names?
Are there more female or male names?
Female names by a wide margin?
The data (name, year of birth, sex, and number) are from a 100 percent sample of Social Security card applications for 1880 onward.
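The questions listed for the Social Security names data can be explored with a simple aggregation over the (name, year of birth, sex, number) fields. The following is a minimal sketch in plain Python, not the dataset's own tooling; the tuple layout, the function names, and the sample rows are assumptions made for illustration, and the counts are invented rather than taken from the real data.

```python
from collections import Counter

# Hypothetical rows in the (name, year of birth, sex, number) layout.
# The values are invented for illustration and are NOT real SSA counts.
ROWS = [
    ("Mary", 1950, "F", 12),
    ("Mary", 1951, "F", 9),
    ("James", 1950, "M", 15),
    ("Linda", 1950, "F", 7),
    ("Avery", 1950, "F", 5),
    ("Avery", 1950, "M", 6),
]


def most_common_names(rows, sex=None, top=3):
    """Sum `number` per name, optionally restricted to one sex."""
    totals = Counter()
    for name, _year, row_sex, number in rows:
        if sex is None or row_sex == sex:
            totals[name] += number
    return totals.most_common(top)


def distinct_names_by_sex(rows):
    """Count how many distinct names appear for each sex."""
    names = {}
    for name, _year, row_sex, _number in rows:
        names.setdefault(row_sex, set()).add(name)
    return {sex: len(found) for sex, found in names.items()}


if __name__ == "__main__":
    print(most_common_names(ROWS))            # most common names overall
    print(most_common_names(ROWS, sex="F"))   # most common female names
    print(distinct_names_by_sex(ROWS))        # distinct names per sex
```

Because names with fewer than 5 occurrences are withheld from the published data, totals computed this way undercount rare names.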
Official Street Names in the City of Los Angeles created and maintained by the Bureau of Engineering.
MIT License https://opensource.org/licenses/MIT
License information was derived automatically
This is a very small but useful dataset if you are ever looking for jobs in a certain US city on LinkedIn. It contains a list of US cities and states and their corresponding LinkedIn IDs (which are usually hidden externally).
The cities list was retrieved from https://github.com/kelvins/US-Cities-Database, and the names of the cities were adjusted to match the names used on LinkedIn (which can differ in subtle ways).
Some cities do not have an ID, either because the city is too small or because its name on LinkedIn differs in a way I did not detect (human error). If you ever run into one of these, feel free to enhance this dataset.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
First names and last names by country according to affiliations in journal articles 2001-2021 as recorded in Scopus. For 200 countries, there is a complete list of all first names and all last names of at least one researcher with a national affiliation in that country. Each file also records: the number of researchers with that name in the country, the proportion of researchers with that name in the country compared to the world, and the number of researchers with that name in the world.
For example, for the USA:
Name     Authors in USA  Proportion in USA  Total
Sadrach  3               1.000              3
Rangsan  1               0.083              12
Parry    6               0.273              22
Howard   2008            0.733              2739
Only the first parts of double last names are included. For example, Rodriquez Gonzalez, Maria would have only Rodriquez recorded.
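The two rules described above (the proportion column and the truncation of double last names) can be sketched as follows. This is an illustrative reading, not the authors' code: the function names are hypothetical, the proportion is assumed to be the country count divided by the world total rounded to three decimals (which matches the USA example rows), and names are assumed to arrive in "Last names, First" form with the last names separated by spaces.

```python
def proportion_in_country(authors_in_country, authors_in_world):
    """Share of all researchers with a name who are affiliated with the country."""
    return round(authors_in_country / authors_in_world, 3)


def recorded_last_name(full_name):
    """Keep only the first part of a (possibly double) last name.

    Assumes the 'Last names, First' format; hyphenated names are not split.
    """
    last_names = full_name.split(",", 1)[0]
    return last_names.split()[0]


if __name__ == "__main__":
    # Counts taken from the USA example rows above.
    for name, in_usa, total in [("Sadrach", 3, 3), ("Rangsan", 1, 12),
                                ("Parry", 6, 22), ("Howard", 2008, 2739)]:
        print(name, proportion_in_country(in_usa, total))
    print(recorded_last_name("Rodriquez Gonzalez, Maria"))
```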
This is from the paper: "Can national researcher mobility be tracked by first or last name uniqueness"
List of countries Afghanistan; Albania; Algeria; Angola; Argentina; Armenia; Australia; Austria; Azerbaijan; Bahamas; Bahrain; Bangladesh; Barbados; Belarus; Belgium; Belize; Benin; Bermuda; Bhutan; Bolivia; Bosnia and Herzegovina; Botswana; Brazil; Brunei Darussalam; Bulgaria; Burkina Faso; Burundi; Cambodia; Cameroon; Canada; Cape Verde; Cayman Islands; Central African Republic; Chad; Chile; China; Colombia; Congo; Costa Rica; Cote d'Ivoire; Croatia; Cuba; Cyprus; Czech Republic; Democratic Republic Congo; Denmark; Djibouti; Dominican Republic; Ecuador; Egypt; El Salvador; Eritrea; Estonia; Ethiopia; Falkland Islands (Malvinas); Faroe Islands; Federated States of Micronesia; Fiji; Finland; France; French Guiana; French Polynesia; Gabon; Gambia; Georgia; Germany; Ghana; Greece; Greenland; Grenada; Guadeloupe; Guam; Guatemala; Guinea; Guinea-Bissau; Guyana; Haiti; Honduras; Hong Kong; Hungary; Iceland; India; Indonesia; Iran; Iraq; Ireland; Israel; Italy; Jamaica; Japan; Jordan; Kazakhstan; Kenya; Kuwait; Kyrgyzstan; Laos; Latvia; Lebanon; Lesotho; Liberia; Libyan Arab Jamahiriya; Liechtenstein; Lithuania; Luxembourg; Macao; Macedonia; Madagascar; Malawi; Malaysia; Maldives; Mali; Malta; Martinique; Mauritania; Mauritius; Mexico; Moldova; Monaco; Mongolia; Montenegro; Morocco; Mozambique; Myanmar; Namibia; Nepal; Netherlands; New Caledonia; New Zealand; Nicaragua; Niger; Nigeria; North Korea; North Macedonia; Norway; Oman; Pakistan; Palau; Palestine; Panama; Papua New Guinea; Paraguay; Peru; Philippines; Poland; Portugal; Puerto Rico; Qatar; Reunion; Romania; Russia; Russian Federation; Rwanda; Saint Kitts and Nevis; Samoa; San Marino; Saudi Arabia; Senegal; Serbia; Seychelles; Sierra Leone; Singapore; Slovakia; Slovenia; Solomon Islands; Somalia; South Africa; South Korea; South Sudan; Spain; Sri Lanka; Sudan; Suriname; Swaziland; Sweden; Switzerland; Syrian Arab Republic; Taiwan; Tajikistan; Tanzania; Thailand; Timor-Leste; Togo; Trinidad and Tobago; Tunisia; 
Turkey; Uganda; Ukraine; United Arab Emirates; United Kingdom; United States; Uruguay; Uzbekistan; Vanuatu; Venezuela; Viet Nam; Virgin Islands (U.S.); Yemen; Yugoslavia; Zambia; Zimbabwe
https://leadsdeposit.com/restaurant-database/
Dataset of 700,000 restaurants in the United States complete with detailed contact and geolocation data. The database includes multiple data points such as restaurant name, address, phone number, website, email, opening hours, latitude, longitude, and cuisine.
https://catalog.elra.info/static/from_media/metashare/licences/ELRA_END_USER.pdf
https://catalog.elra.info/static/from_media/metashare/licences/ELRA_VAR.pdf
This database is part of the ArabLEX set of data which consists of the Database of Arabic General Vocabulary (DAG), Database of Arabic Place Names (DAP), Database of Foreign Names in Arabic (DAF) and Database of Arab Names (DAN) available from ELRA under references, respectively, ELRA-L0131, ELRA-M0105, ELRA-M0106 and ELRA-M0107.
With over 218 million forms based on 100,000 lemmas, this full-form database covers Arab personal names (both given names and surnames) in both Arabic and English and contains a rich set of romanized name variants for each name with a variety of supplementary information such as gender, name type and frequency statistics. This comprehensive lexicon (over 6.4 million variants) contains precise phonemic transcriptions and vocalized Arabic for all inflected and cliticized forms for each name.
This database is provided with three options: 1) proclitics, 2) phonetic information (CARS) and 3) orthographic variants. Subsets excluding some of the three proposed options may be provided upon demand. CARS is an accurate phonemic transcription. Optionally, phonetic transcriptions, IPA and/or SAMPA, can be provided, fine tuned to a customer's specifications.
Quantity and size: 218,215,875 lines / 32,659 MB (31.9 GB)
File format: flat TSV text files
Samples and a specifications document available upon request.
This list is a work-in-progress and will be updated at least quarterly. This version updates column names and corrects spellings of several streets in order to alleviate confusion and simplify street name research. It represents an inventory of official street name spellings in the City of New Orleans. Several sources contain various spellings and formats of street names. This list represents street name spellings and formats researched by the City of New Orleans GIS and City Planning Commission.
Note: This list may not represent what is currently displayed on street signs. The City of New Orleans official street list is derived from the New Orleans street centerline file, 9-1-1 centerline file, and CPC plat maps. Fields include the full street name and the parsed elements along with abbreviations using US Postal Standards. We invite your input as we work toward one enterprise street name list.
Status:
Current: Currently a known used street name in New Orleans.
Other: Currently a known used street name on a planned but not developed street. May be a retired street name.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
A list of USA states and two-letter abbreviations for coding.
https://catalog.elra.info/static/from_media/metashare/licences/ELRA_END_USER.pdf
https://catalog.elra.info/static/from_media/metashare/licences/ELRA_VAR.pdf
Chinese name components, accompanied by accurate pinyin readings, gender codes, and flags denoting whether a name is a given name, surname, or both.
MIT License https://opensource.org/licenses/MIT
License information was derived automatically
French First Names from Death Records (1970-2024)
This dataset contains French first names extracted from death records provided by INSEE (French National Institute of Statistics and Economic Studies) covering the period from 1970 to September 2024.
Dataset Description
Data Source
The data is sourced from INSEE's death records database. It includes first names of deceased individuals in France, providing valuable insights into naming patterns across different… See the full description on the dataset page: https://huggingface.co/datasets/eltorio/french_first_names_insee_2024.
The Alesco Phone ID Database data ties together a consumer's true identity, and with linkage to the Alesco Power Identity Graph, we are perfectly positioned to help customers solve today's most challenging marketing, analytics, and identity resolution problems.
Our proprietary Phone ID database combines public and private sources and validates phone numbers against current and historical data 24 hours a day, 365 days a year.
With over 650 million unique phone numbers, device and service information, our one-of-a-kind solutions are now available for your marketing and identity resolution challenges in both B2C and B2B applications!
• Alesco Phone ID provides more than 860 million phone numbers monthly linked to a consumer or business name and includes landline, mobile phone number, VoIP, private and business phone numbers — all permissibly obtained and privacy-compliant and linked to other Alesco data sets
• How we do it: Alesco Phone ID is multi-sourced with daily information and delivered monthly or quarterly to clients. Our proprietary machine learning and advanced analytics processes ensure quality levels far above industry standards. Alesco processes over 100 million phone signals per day, compiling, normalizing, and standardizing phone information from 37 input sources.
• Accuracy: Each of Alesco’s phone data sources is vetted to ensure it is authoritative, giving you confidence in the accuracy of the information. Every record is validated, verified and processed to ensure the widest, most reliable coverage combined with stunning precision.
• Ease of use: Alesco’s Phone ID Database is available as an on-premise phone database license, giving you full control to host and access this powerful resource on-site. Ongoing monthly updates ensure your data is up to date.
https://en.wikipedia.org/wiki/Public_domain
This dataset is part of the Geographical repository maintained by Opendatasoft. This dataset contains data for places and equivalent entities in United States of America. This layer includes both incorporated places (legal entities) and census designated places or CDPs (statistical entities). An incorporated place is established to provide governmental functions for a concentration of people as opposed to a minor civil division (MCD), which generally is created to provide services or administer an area without regard, necessarily, to population. Places always nest within a state, but may extend across county and county subdivision boundaries. An incorporated place usually is a city, town, village, or borough, but can have other legal descriptions. CDPs are delineated for the decennial census as the statistical counterparts of incorporated places. CDPs are delineated to provide data for settled concentrations of population that are identifiable by name, but are not legally incorporated under the laws of the state in which they are located. The boundaries for CDPs often are defined in partnership with state, local, and/or tribal officials and usually coincide with visible features or the boundary of an adjacent incorporated place or another legal entity. CDP boundaries often change from one decennial census to the next with changes in the settlement pattern and development; a CDP with the same name as in an earlier census does not necessarily have the same boundary. The only population/housing size requirement for CDPs is that they must contain some housing and population.
Processors and tools are using this data. Enhancements: add ISO 3166-3 codes; simplify geometries to provide better performance across the services; add administrative hierarchy.
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
A text database of named places in the United States. The US Board on Geographic Names controls the database of official names of places in the US, and the US Geological Survey (USGS) maintains the database. This is a copy of the 2017-06-01 database, which I am using to create an R package for textual analyses of geographic content, to ensure this version remains. The original source was from: https://geonames.usgs.gov/domestic/download_data.htm (which is a very slow server).
I have chosen CC0 for the license because, as a creation of the US government, I don't think the database can be copyrighted (and CC0 is the closest match).
• 500K+ Active Amazon Stores
• 200K+ Seller Leads
• Platforms USA, Germany, UK, Italy, France, Spain, CA
• C-Suite/Marketing/Sales Contacts
• FBA/Non-FBA Sellers
• 15+ data points available for each prospect
• Filter your leads by store size, niche, location, and many more
• 100% manually researched and verified.
For over a decade, we have been manually collecting Amazon seller data from sources such as Amazon, LinkedIn, Google, and others. We specialize in gathering valid, high-potential data so you can run ads and begin selling without hesitation.
We designed our data packages for all types of organizations, thus they are reasonably priced. We are always trying to reduce our prices to better suit all of your requirements.
So, if you’re looking to reach out to your targeted Amazon sellers, now is the greatest time to do so and offer your goods, services, and promotions. You can get your targeted Amazon Sellers List with seller contact information.
Alternatively, if you provide Amazon Seller Names or IDs, we will conduct Custom Research and deliver the customized list to you.
Data Points Available:
Full Name; LinkedIn URL; Direct Email; Generic Phone Number; Business Name and Address; Company Website; Seller IDs and URLs; Revenue; Seller Review Count; Niche; FBA/Non-FBA; Country; and More
https://en.wikipedia.org/wiki/Public_domain
This dataset is part of the Geographical repository maintained by Opendatasoft. This dataset contains data for counties and equivalent entities in United States of America. The primary legal divisions of most states are termed counties. In Louisiana, these divisions are known as parishes. In Alaska, which has no counties, the equivalent entities are the organized boroughs, city and boroughs, municipalities, and for the unorganized area, census areas. The latter are delineated cooperatively for statistical purposes by the State of Alaska and the Census Bureau. In four states (Maryland, Missouri, Nevada, and Virginia), there are one or more incorporated places that are independent of any county organization and thus constitute primary divisions of their states. These incorporated places are known as independent cities and are treated as equivalent entities for purposes of data presentation. The District of Columbia and Guam have no primary divisions, and each area is considered an equivalent entity for purposes of data presentation. The Census Bureau treats the following entities as equivalents of counties for purposes of data presentation: Municipios in Puerto Rico, Districts and Islands in American Samoa, Municipalities in the Commonwealth of the Northern Mariana Islands, and Islands in the U.S. Virgin Islands. The entire area of the United States, Puerto Rico, and the Island Areas is covered by counties or equivalent entities.
Processors and tools are using this data. Enhancements: add ISO 3166-3 codes; simplify geometries to provide better performance across the services; add administrative hierarchy.
MIT License https://opensource.org/licenses/MIT
License information was derived automatically
French Last Names from Death Records (1970-2024)
This dataset contains French last names extracted from death records provided by INSEE (French National Institute of Statistics and Economic Studies) covering the period from 1970 to September 2024.
Dataset Description
Random name generator demo
go to https://sctg-development.github.io/french-names-extractor/
Data Source
The data is sourced from INSEE's death records database. It includes last names… See the full description on the dataset page: https://huggingface.co/datasets/eltorio/french_last_names_insee_2024.
There is no story behind this data.
These are just supplementary datasets that I plan on using for plotting county-wise data on maps (in particular with my kernel: https://www.kaggle.com/stansilas/maps-are-beautiful-unemployment-is-not/), as that dataset didn't have the info I needed for plotting an interactive map using highcharter.
I noticed that most demographic datasets here on Kaggle have either state code, state name, or county name + state name, but not all of them together, i.e. county name, FIPS code, state name + state code.
Using these two datasets one can get any combination of state county codes etc.
States.csv has State name + code
US counties.csv has county wise data.
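Combining the two files to get county name, FIPS code, state name, and state code in one record could look roughly like the sketch below. The column names (`state_name`, `state_code`, `county_name`, `fips`) and the join on state name are assumptions, since the actual headers of States.csv and US counties.csv are not given here; the in-memory sample text stands in for the real files.

```python
import csv
import io

# Stand-ins for States.csv and US counties.csv; the column names are assumed.
STATES_CSV = """state_name,state_code
Alabama,AL
California,CA
"""

COUNTIES_CSV = """county_name,fips,state_name
Autauga County,01001,Alabama
Los Angeles County,06037,California
"""


def join_counties_with_states(states_text, counties_text):
    """Attach the two-letter state code to every county row."""
    # Build a lookup from state name to state code.
    code_by_state = {
        row["state_name"]: row["state_code"]
        for row in csv.DictReader(io.StringIO(states_text))
    }
    joined = []
    for row in csv.DictReader(io.StringIO(counties_text)):
        # Counties whose state is missing from the lookup get None.
        joined.append({**row, "state_code": code_by_state.get(row["state_name"])})
    return joined


if __name__ == "__main__":
    for record in join_counties_with_states(STATES_CSV, COUNTIES_CSV):
        print(record)
```

FIPS codes are kept as strings here so that leading zeros (as in 01001) survive the round trip.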
Picture : https://unsplash.com/search/usa-states?photo=-RO2DFPl7wE
Counties : https://www.census.gov/geo/reference/codes/cou.html
State :
Not Applicable.
URL from idinfo/citation in CSDGM metadata.
Alesco’s aggregated consumer email database consists of over 2.3 billion U.S. records with name, address and email. The database is fully CAN-SPAM and privacy compliant, and records include referring URL, IP address and date stamp. Postal addresses are address standardized and processed through the US postal service National Change of Address (NCOA) service. Available for licensing!
File size: 2.3 Billion
IP Address: 1.9 Billion
eAppend data: 1.48 Billion (full name/postal)
Acquisition: 269 Million (full demos)
Fields Included: Name, Address, Email, Phone, IP Address