The data (name, year of birth, sex, and number) are from a 100 percent sample of Social Security card applications for 1880 onward.
https://creativecommons.org/publicdomain/zero/1.0/
Context
This dataset shows naming trends for babies in the US from the 1880s to the late 2000s. You can use it to explore the factors that lead parents to follow a particular naming trend.
Content
The source can be found here: https://github.com/wesm/pydata-book/tree/2nd-edition/datasets
Acknowledgements
Special thanks to Python for Data Analysis by Wes McKinney for this.
The most popular baby names by sex and mother's ethnicity in New York City.
Official Street Names in the City of Los Angeles created and maintained by the Bureau of Engineering.
Attribution-ShareAlike 4.0 (CC BY-SA 4.0) https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
In the United States, cultural and ethnic diversity is reflected in a wide variety of surnames that have fascinating stories and origins. These surnames, more than simple identifiers, are a reflection of the roots, traditions and origins of their bearers. In this article, we will explore the most common surnames among the inhabitants of this country, offering a vision of how American influences have shaped and enriched the onomastic landscape. As we delve into this list, we'll see how each last name can tell a unique story about American culture and heritage, highlighting the plurality that characterizes this nation.
MIT License https://opensource.org/licenses/MIT
License information was derived automatically
French First Names from Death Records (1970-2024)
This dataset contains French first names extracted from death records provided by INSEE (French National Institute of Statistics and Economic Studies) covering the period from 1970 to September 2024.
Dataset Description
Data Source
The data is sourced from INSEE's death records database. It includes first names of deceased individuals in France, providing valuable insights into naming patterns across different… See the full description on the dataset page: https://huggingface.co/datasets/eltorio/french_first_names_insee_2024.
https://creativecommons.org/publicdomain/zero/1.0/
The Social Security Administration (SSA) of the United States publishes the frequency of baby names given in the US from 1880 onward.
This dataset contains raw data in txt format, covering the years 1880 to 2019, with name and sex columns.
The dataset comes from the U.S. Social Security Administration; you can find it here: https://www.ssa.gov/oact/babynames/limits.html
Use simple Python code to analyze naming patterns in the US.
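As a minimal sketch of such an analysis, the snippet below parses yearly records and computes total counts per year and sex, plus the most frequent name. It assumes each yearly file holds one "name,sex,count" record per line with no header row; the inline sample values are made up for illustration and are not the published counts, and the function names are hypothetical.

```python
import csv
import io
from collections import defaultdict

# Made-up stand-in for two yearly files (assumed layout: name,sex,count).
SAMPLE_FILES = {
    1880: "Mary,F,100\nAnna,F,50\nJohn,M,120\nWilliam,M,110\n",
    1881: "Mary,F,90\nAnna,F,60\nJohn,M,100\nWilliam,M,105\n",
}


def load_year(year, text):
    """Parse one year's records into (year, name, sex, count) tuples."""
    reader = csv.reader(io.StringIO(text))
    return [(year, name, sex, int(count)) for name, sex, count in reader]


def births_by_year_and_sex(records):
    """Sum the counts for every (year, sex) pair."""
    totals = defaultdict(int)
    for year, _name, sex, count in records:
        totals[(year, sex)] += count
    return dict(totals)


def top_names(records, year, sex, n=1):
    """Return the n most frequent (name, count) pairs for a year and sex."""
    rows = [(name, count) for y, name, s, count in records
            if y == year and s == sex]
    return sorted(rows, key=lambda row: row[1], reverse=True)[:n]


records = [rec for year, text in SAMPLE_FILES.items()
           for rec in load_year(year, text)]
totals = births_by_year_and_sex(records)
print(totals[(1880, "F")])             # total for girls in the 1880 sample
print(top_names(records, 1881, "M"))   # most frequent boy name in 1881 sample
```

With the real files, the same functions could be fed the contents of each yearly txt file instead of the inline strings.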
MIT License https://opensource.org/licenses/MIT
License information was derived automatically
This is a very small but useful dataset if you are ever looking for jobs in a certain US city on LinkedIn. It contains a list of US cities and states and their corresponding LinkedIn IDs (which are usually hidden externally).
The list of cities was retrieved from https://github.com/kelvins/US-Cities-Database, and the names of the cities were adjusted to match the names used on LinkedIn (which can differ in subtle ways).
Some cities do not have an ID, either because the city is too small or because there was a difference in the name on LinkedIn that I did not detect (human error). If you ever run into one of these, feel free to enhance this dataset.
Attribution-ShareAlike 4.0 (CC BY-SA 4.0) https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
For sale are domain names with WHO IS information that were registered between Mar 16, 2018 and Mar 31, 2018 by registrants in the United States. Domains which obfuscate registrant, administrative, and other WHO IS contact details have been omitted from this dataset. The following information is available for download in this dataset:
- Domain name, Created Date, Updated Date, Expiration Date, Registrar Name
- Registrant Company, Name, Address, City, State/Province/Other, Postal Code, Country, Email, Phone #, Fax #
- Administrative Company, Name, Address, City, State/Province/Other, Postal Code, Country, Email, Phone #, Fax #
- Technical Company, Name, Address, City, State/Province/Other, Postal Code, Country, Email, Phone #, Fax #
- Billing Company, Name, Address, City, State/Province/Other, Postal Code, Country, Email, Phone #, Fax #
- NameServer1, NameServer2, NameServer3, NameServer4
- DomainStatus1, DomainStatus2, DomainStatus3, DomainStatus4
Still unsure about purchasing this dataset? View and download a free sample dataset of global domain name registrations in the Lead Generation category. Are you interested in a more targeted domain name registration dataset? Select the "Ask Seller a Question" link, send me a message, and I'll get back to you as soon as I can.
Lead Generation
usa,united-states,newly-registered-domains,who-is-data
220910
$20.00
https://en.wikipedia.org/wiki/Public_domain
This dataset is part of the Geographical repository maintained by Opendatasoft. This dataset contains data for counties and equivalent entities in the United States of America. The primary legal divisions of most states are termed counties. In Louisiana, these divisions are known as parishes. In Alaska, which has no counties, the equivalent entities are the organized boroughs, city and boroughs, municipalities, and for the unorganized area, census areas. The latter are delineated cooperatively for statistical purposes by the State of Alaska and the Census Bureau. In four states (Maryland, Missouri, Nevada, and Virginia), there are one or more incorporated places that are independent of any county organization and thus constitute primary divisions of their states. These incorporated places are known as independent cities and are treated as equivalent entities for purposes of data presentation. The District of Columbia and Guam have no primary divisions, and each area is considered an equivalent entity for purposes of data presentation. The Census Bureau treats the following entities as equivalents of counties for purposes of data presentation: Municipios in Puerto Rico, Districts and Islands in American Samoa, Municipalities in the Commonwealth of the Northern Mariana Islands, and Islands in the U.S. Virgin Islands. The entire area of the United States, Puerto Rico, and the Island Areas is covered by counties or equivalent entities. Processors and tools are using this data. Enhancements: add ISO 3166-3 codes; simplify geometries to provide better performance across the services; add administrative hierarchy.
A large and fast-growing number of studies across the social sciences use experiments to better understand the role of race in human interactions, particularly in the American context. Researchers often use names to signal the race of individuals portrayed in these experiments. However, those names might also signal other attributes, such as socioeconomic status (e.g., education and income) and citizenship. If they do, researchers need pre-tested names with data on perceptions of these attributes. Such data would permit researchers to draw correct inferences about the causal effect of race in their experiments. In this paper, we provide the largest dataset of validated name perceptions based on three different surveys conducted in the United States. In total, our data include over 44,170 name evaluations from 4,026 respondents for 600 names. In addition to respondent perceptions of race, income, education, and citizenship from names, our data also include respondent characteristics. Our data will be broadly helpful for researchers conducting experiments on the manifold ways in which race shapes American life.
https://catalog.elra.info/static/from_media/metashare/licences/ELRA_END_USER.pdf
https://catalog.elra.info/static/from_media/metashare/licences/ELRA_VAR.pdf
This database is part of the ArabLEX set of data which consists of the Database of Arabic General Vocabulary (DAG), Database of Arabic Place Names (DAP), Database of Foreign Names in Arabic (DAF) and Database of Arab Names (DAN) available from ELRA under references, respectively, ELRA-L0131, ELRA-M0105, ELRA-M0106 and ELRA-M0107.
This full-form Arabic-English place name database of over 21,000 lemmas and nearly 6.5 million forms provides worldwide coverage of common place names, given in standard MSA orthography, and includes all inflected and cliticized forms for each place name. In addition, precise phonemic transcriptions and full vowel diacritics are designed to enhance Arabic speech technology. Orthographic variants are also extensively covered.
This database is provided with three options: 1) proclitics, 2) phonetic information (CARS) and 3) orthographic variants. Subsets excluding some of the three proposed options may be provided upon demand. CARS is an accurate phonemic transcription. Optionally, phonetic transcriptions, IPA and/or SAMPA, can be provided, fine-tuned to a customer's specifications.
Quantity and size: 6,455,201 lines / 812 MB
File format: flat TSV text files
Samples and a specifications document available upon request.
https://catalogue.elra.info/static/from_media/metashare/licences/ELRA_VAR.pdf
https://catalogue.elra.info/static/from_media/metashare/licences/ELRA_END_USER.pdf
This database covers non-Arabic names, their Arabic equivalents, and Arabic script variants for each name (with the most important variant given first).
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset contains the common names of the national forests and grasslands and their respective FS WWW URL information that is used for both display of the national forest and national grassland boundaries on any map product and for dynamic interactivity of the map. This dataset exhibits the following characteristics: 1. Granularity of the polygon features - The spatial extent of the national forests and the grasslands matches the way the agency would like to communicate with the public. 2. Preferred/Common Name of the National Forest Units - The common names of the national forests and grasslands match the preferred name column that is present in the common names decision table maintained by the FS Office of Communication. 3. Hyperlinks to FS WWW Home page - This column contains the national forests and their respective FS WWW URL information. This URL could be used in any interactive map application to link users directly to a forest's home page. Data Source - This dataset is derived from the following FS ALP (Automated Lands Program) Land Status Records System authoritative data sources: 1. Administrative Forest Boundaries 2. Proclaimed Forest Boundaries 3. Ranger District Boundaries 4. National Grassland Areas. The common names decision table maintained by the FS Office of Communication contains the common name and its respective Land Status Records System authoritative data source to be used for building the spatial polygon. The spatial polygons for every feature in this dataset come from one or more of the authoritative data sources listed above. The process to create the common names dataset reuses the already existing ALP names from the data sources listed above. This record was taken from the USDA Enterprise Data Inventory that feeds into the https://data.gov catalog.
Data for this record includes the following resources: ISO-19139 metadata ArcGIS Hub Dataset ArcGIS GeoService OGC WMS CSV Shapefile GeoJSON KML https://apps.fs.usda.gov/arcx/rest/services/EDW/EDW_ForestCommonNames_01/MapServer/1 http://data.fs.usda.gov/geodata/edw/datasets.php For complete information, please visit https://data.gov.
[Metadata] Geographic Names for the State of Hawaii as of September 3, 2024. (Data current / last edited in GNIS December 2023). Downloaded by the Hawaii Statewide GIS Program from the U.S. Board on Geographic Names Geographic Names Information System (GNIS) September 3, 2024 (https://www.usgs.gov/u.s.-board-on-geographic-names/download-gnis-data). The Geographic Names Information System (GNIS) is the Federal standard for geographic nomenclature. The U.S. Geological Survey developed the GNIS for the U.S. Board on Geographic Names, a Federal inter-agency body chartered by public law to maintain uniform feature name usage throughout the Government and to promulgate standard names to the public. The GNIS is the official repository of domestic geographic names data; the official vehicle for geographic names use by all departments of the Federal Government; and the source for applying geographic names to Federal electronic and printed products of all types.
For additional information, please refer to metadata at https://files.hawaii.gov/dbedt/op/gis/data/geonames.pdf or contact Hawaii Statewide GIS Program, Office of Planning and Sustainable Development, State of Hawaii; PO Box 2359, Honolulu, Hi. 96804; (808) 587-2846; email: gis@hawaii.gov; Website: https://planning.hawaii.gov/gis.
https://en.wikipedia.org/wiki/Public_domain
The Colleges and Universities feature class/shapefile is composed of all Post Secondary Education facilities as defined by the Integrated Post Secondary Education System (IPEDS, http://nces.ed.gov/ipeds/), National Center for Education Statistics (NCES, https://nces.ed.gov/), US Department of Education for the 2018-2019 school year. Included are Doctoral/Research Universities, Masters Colleges and Universities, Baccalaureate Colleges, Associates Colleges, Theological seminaries, Medical Schools and other health care professions, Schools of engineering and technology, business and management, art, music, design, Law schools, Teachers colleges, Tribal colleges, and other specialized institutions. Overall, this data layer covers all 50 states, as well as Puerto Rico and other assorted U.S. territories. This feature class contains all MEDS/MEDS+ as approved by the National Geospatial-Intelligence Agency (NGA) Homeland Security Infrastructure Program (HSIP) Team. Complete field and attribute information is available in the "Entities and Attributes" metadata section. Geographical coverage is depicted in the thumbnail above and detailed in the "Place Keyword" section of the metadata. This feature class does not have a relationship class but is related to Supplemental Colleges. Colleges and Universities that are not included in the NCES IPEDS data are added to the Supplemental Colleges feature class when found. This release includes the addition of 175 new records, the removal of 468 no longer reported by NCES, and modifications to the spatial location and/or attribution of 6682 records.
Our United States zip code Database offers comprehensive postal code data for spatial analysis, including postal and administrative areas. This dataset contains accurate and up-to-date information on all administrative divisions, cities, and zip codes, making it an invaluable resource for various applications such as address capture and validation, map and visualization, reporting and business intelligence (BI), master data management, logistics and supply chain management, and sales and marketing. Our location data packages are available in various formats, including CSV, optimized for seamless integration with popular systems like Esri ArcGIS, Snowflake, QGIS, and more. Product features include fully and accurately geocoded data, multi-language support with address names in local and foreign languages, comprehensive city definitions, and the option to combine map data with UNLOCODE and IATA codes, time zones, and daylight saving times. Companies choose our location databases for their enterprise-grade service, reduction in integration time and cost by 30%, and weekly updates to ensure the highest quality.
https://catalog.elra.info/static/from_media/metashare/licences/ELRA_END_USER.pdf
https://catalog.elra.info/static/from_media/metashare/licences/ELRA_VAR.pdf
This resource covers four million Japanese names and their romanized variants, and includes gender codes, classification codes, and frequency rankings.
The Census data utilized for developing the Community Layer used 2010 TIGER/Line shapefile datasets (TIGER = Topologically Integrated Geographic Encoding and Referencing). TIGER/Line shapefiles are available for free download from the US Census Bureau and include various legal and statistical geographic areas for which the Census tabulates data. The shapefiles are designed to be used in a GIS environment, with the ability to directly link the geographic areas to Census data via a unique GEOID number.
The following TIGER/Line datasets should be used:
- Counties and Equivalent Entities – primary legal divisions within each state (counties, parishes, etc.)
- County Subdivisions – includes both legal areas (Minor Civil Divisions or MCDs) and various statistical areas
- Places – includes both legal areas (Incorporated Places) and statistical areas (Census Designated Places or CDPs)
- Blocks – the smallest geographical area for which Census population counts are recorded; blocks never cross boundaries of any entity for which the Census Bureau tabulates data, including counties, county subdivisions, places, and American Indian, Alaska Native, and Native Hawaiian (AIANNH) areas
- American Indian, Alaska Native, and Native Hawaiian (AIANNH) Areas
Extracting and Formatting CIS Data
A key component of the community layer is the ability to link CIS information spatially. Data from CIS cannot directly be joined with Census data. The two datasets have community name discrepancies which impede an exact match. Therefore, CIS data needs to be formatted to match Census community names. A custom report can be obtained from CIS to include a CID number, Community Name, County, State, Community Status, and Tribal status for all CIS records. Make sure all CID numbers are six digits and you follow the CIS community naming convention outlined in Table 4.2.1.1 in the Community Layer Update Technical Guide 20131206.
Converting the CIS name “ADDISON, VILLAGE OF” to “ADDISON TOWN” involves removing unneeded spaces, the comma, and the preposition so that the join to the Census data succeeds. Using a comprehensive report at a national level gains efficiencies as bulk edits can be made. Data for each state should be extracted as needed by separating the CIS data into each type of community corresponding to the Census geography layers used, and a new JoinID column (e.g. ADDISON TOWN) can be created for each dataset, allowing the CIS data to be joined to the Census data.
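The JoinID clean-up described above could be sketched roughly as follows. This is not the procedure from the Technical Guide: the function names are hypothetical, and the community-type mapping is an assumption taken only from the single "ADDISON" example in the text, since Table 4.2.1.1 is not reproduced here.

```python
import re


def make_join_id(cis_name, type_map):
    """Build a JoinID from a CIS name of the form 'NAME, TYPE OF'.

    type_map is a caller-supplied (hypothetical) mapping from CIS community
    types to the type labels used in the Census names.
    """
    name, _, kind = cis_name.partition(",")
    # Drop the preposition "OF" and collapse any repeated spaces.
    kind = re.sub(r"\bOF\b", " ", kind.upper())
    kind = " ".join(kind.split())
    kind = type_map.get(kind, kind)
    name = " ".join(name.upper().split())
    return f"{name} {kind}".strip()


def pad_cid(cid):
    """Left-pad a CID number with zeros so that it has six digits."""
    return str(cid).strip().zfill(6)


# Mapping assumed from the one example in the text, not from the guide.
example_map = {"VILLAGE": "TOWN"}
print(make_join_id("ADDISON, VILLAGE OF", example_map))
print(pad_cid(1234))
```

Keeping the type mapping as a parameter lets each Census geography layer use its own table of community types when the data is split by layer.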
https://creativecommons.org/publicdomain/zero/1.0/
List of each U.S. state's code, name, abbreviation, and alpha code, useful for converting from one to another.
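A conversion of this kind can be sketched with simple lookup tables. The three-field layout (numeric code, name, two-letter alpha code), the field names, and the helper functions below are assumptions for illustration; the actual file may use different columns, and only four states are included here.

```python
# Assumed row layout: (numeric code, state name, two-letter alpha code).
FIELDS = ("code", "name", "alpha")

STATE_ROWS = [
    ("01", "Alabama", "AL"),
    ("06", "California", "CA"),
    ("36", "New York", "NY"),
    ("48", "Texas", "TX"),
]


def build_lookup(rows, source, target):
    """Map each value of the source field (upper-cased) to the target field."""
    i, j = FIELDS.index(source), FIELDS.index(target)
    return {row[i].upper(): row[j] for row in rows}


def convert(value, source, target, rows=STATE_ROWS):
    """Convert one identifier to another; returns None when nothing matches."""
    return build_lookup(rows, source, target).get(value.strip().upper())


print(convert("ca", "alpha", "name"))
print(convert("36", "code", "alpha"))
```

For repeated conversions, the dictionary from `build_lookup` can be built once and reused instead of being rebuilt on each call.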