The data (name, year of birth, sex, and number) are from a 100 percent sample of Social Security card applications for 1880 onward.
The data (name, year of birth, sex, state, and number) are from a 100 percent sample of Social Security card applications starting with 1910. National data is in another dataset.
Provides a list of all data assets maintained by the Social Security Administration. It consists of the Public Data Listing in the Enterprise Data Inventory.
https://creativecommons.org/publicdomain/zero/1.0/
Cultural diversity in the U.S. has led to great variations in names and naming traditions and names have been used to express creativity, personality, cultural identity, and values. Source: https://en.wikipedia.org/wiki/Naming_in_the_United_States
This public dataset was created by the Social Security Administration and contains all names from Social Security card applications for births that occurred in the United States after 1879. Note that many people born before 1937 never applied for a Social Security card, so their names are not included in this data. For others who did apply, records may not show the place of birth, and again their names are not included in the data.
All data are from a 100% sample of records on Social Security card applications as of the end of February 2015. To safeguard privacy, the Social Security Administration restricts names to those with at least 5 occurrences.
https://bigquery.cloud.google.com/dataset/bigquery-public-data:usa_names
https://cloud.google.com/bigquery/public-data/usa-names
Dataset Source: Data.gov. This dataset is publicly available for anyone to use under the following terms provided by the Dataset Source — http://www.data.gov/privacy-policy#data_policy — and is provided "AS IS" without any warranty, express or implied, from Google. Google disclaims all liability for any damages, direct or indirect, resulting from the use of the dataset.
Banner photo by @dcp from Unsplash.
What are the most common names?
What are the most common female names?
Are there more female or male names?
Female names by a wide margin?
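Questions like these reduce to grouping rows and summing the number column. The following is a minimal sketch in Python, assuming records shaped like the fields described above (name, year of birth, sex, number). The sample rows and the helper names `most_common_names` and `distinct_names_by_sex` are invented for illustration and are not real SSA figures.

```python
from collections import Counter

# Hypothetical sample rows: (name, year_of_birth, sex, number).
# The values are invented for illustration, not taken from the dataset.
SAMPLE_ROWS = [
    ("Mary", 1910, "F", 120),
    ("John", 1910, "M", 100),
    ("Mary", 1911, "F", 130),
    ("Anna", 1911, "F", 90),
    ("John", 1911, "M", 110),
]


def most_common_names(rows, sex=None, top=3):
    """Sum occurrences per name, optionally restricted to one sex."""
    totals = Counter()
    for name, _year, row_sex, number in rows:
        if sex is None or row_sex == sex:
            totals[name] += number
    return totals.most_common(top)


def distinct_names_by_sex(rows):
    """Count how many distinct names appear for each sex."""
    names = {}
    for name, _year, row_sex, _number in rows:
        names.setdefault(row_sex, set()).add(name)
    return {s: len(n) for s, n in names.items()}


print(most_common_names(SAMPLE_ROWS))           # most common names overall
print(most_common_names(SAMPLE_ROWS, sex="F"))  # most common female names
print(distinct_names_by_sex(SAMPLE_ROWS))       # distinct names per sex
```

On the real data, the same grouping would run over every row rather than this toy list, so the answers depend entirely on the dataset version used.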
This file contains a list of Approved Forms for the Social Security Administration as of 2017.
https://creativecommons.org/publicdomain/zero/1.0/
US Social Security applications are a great way to track trends in how babies born in the US are named.
Data.gov releases two datasets that are helpful for this: one at the national level and another at the state level. Note that, for privacy, only names given to at least 5 babies born in the same year (and, for the state-level data, in the same state) are included in this dataset.
I've taken the raw files here and combined/normalized them into two CSV files (one for each dataset) as well as a SQLite database with two equivalently-defined tables. The code that did these transformations is available here.
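To illustrate how the SQLite form of the data might be queried, here is a small sketch using Python's built-in `sqlite3` module. The table name `national_names` and its columns are assumptions made for illustration (the actual schema is not given here), and the rows are invented. The sketch also applies the at-least-5 privacy threshold described above before loading.

```python
import sqlite3

# In-memory database standing in for the real SQLite file.
# Table and column names are hypothetical; the rows are invented.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE national_names (name TEXT, year INTEGER, sex TEXT, number INTEGER)"
)
raw_rows = [
    ("Emma", 2014, "F", 20),
    ("Noah", 2014, "M", 19),
    ("Zyx", 2014, "M", 3),  # below the threshold, so it is filtered out
    ("Emma", 2013, "F", 21),
]
# Keep only name/year combinations with at least 5 babies, mirroring the rule.
conn.executemany(
    "INSERT INTO national_names VALUES (?, ?, ?, ?)",
    [row for row in raw_rows if row[3] >= 5],
)


def top_names(connection, year, limit=5):
    """Return (name, sex, number) for one year, most frequent first."""
    cursor = connection.execute(
        "SELECT name, sex, number FROM national_names "
        "WHERE year = ? ORDER BY number DESC, name ASC LIMIT ?",
        (year, limit),
    )
    return cursor.fetchall()


print(top_names(conn, 2014))
```

Against the real database file, `sqlite3.connect` would take the file path instead of `":memory:"`, and the table and column names would need to match the published schema.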
New to data exploration in R? Take the free, interactive DataCamp course, "Data Exploration With Kaggle Scripts," to learn the basics of visualizing data with ggplot. You'll also create your first Kaggle Scripts along the way.
This is an updated version of the national baby names database, containing records from 1880-2017.
Description: This data deposit contains the Numerical Identification Death Files (National Archives Identifier 23845618), the NUMIDENT SS-5 Application Files (National Archives Identifier 23845613), the NUMIDENT Claims Files (National Archives Identifier 23852747), and the associated technical documentation.
Data Acquisition: These files were e-delivered to Anthony Wray via secure link by the Electronic Records Division of the National Archives and Records Administration (NARA) on 17 October 2019, as per a digitized reproduction order (Quote QO1-525370500 and Quote QO1-528389077). The packing slip is included in the data deposit (docs/Packing Slip.PDF).
Rights to Publish: The data are in the public domain, as confirmed by emails received from NARA on 28 December 2023 and 3 January 2024 (see docs/permission_to_publish_email.pdf).
How to Cite: Please adhere to the citation and data usage guidelines when using this dataset. See the included LICENSE.txt and README.md files for details.
Details: The Numerical Identification Files (NUMIDENT), 1936–2007, series contains records for every Social Security number (SSN) assigned to individuals with a verified death or who would have been over 110 years old by December 31, 2007. There are three types of entries in NUMIDENT: application (SS-5), claim, and death records. A NUMIDENT record may contain more than one entry. Information contained in NUMIDENT records includes: each applicant's full name, SSN, date of birth, place of birth, citizenship, sex, father's name, mother's maiden name, and race/ethnic description (optional). NUMIDENT includes information regarding any subsequent changes made to the applicant's record, including name changes and life or death claims. The death records in NUMIDENT do not include any State reported deaths in accordance with the Social Security Act section 205(r). There are 72,182,729 SS-5 record entries; 25,230,486 claim record entries; and 49,459,293 death record entries. See https://catalog.archives.gov/id/12004494 for more information.
Related Data: Visit the CenSoc Project for public micro datasets linked to NUMIDENT: https://censoc.berkeley.edu/.
https://creativecommons.org/publicdomain/zero/1.0/
The Social Security Administration (SSA) of the United States publishes the frequency of baby names for births in the United States after 1879.
This dataset contains the raw data in txt format, covering the years 1880 to 2019, with name and sex columns.
I took the dataset from the U.S. Social Security Administration; you can find it here: https://www.ssa.gov/oact/babynames/limits.html
Use simple Python code to analyze name patterns in the US.
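As a starting point, the raw text could be parsed with the standard library alone. This sketch assumes each line is comma-separated as name, sex, count, with the year supplied separately (for example from the file name). That layout, the sample values, and the function names `parse_year` and `total_by_sex` are assumptions to verify against the downloaded files.

```python
import csv
import io

# Invented sample text standing in for one year's raw file.
SAMPLE_TEXT = "Alice,F,50\nBob,M,40\nCarol,F,30\n"


def parse_year(text, year):
    """Parse 'name,sex,count' lines into dicts tagged with the given year."""
    records = []
    for name, sex, count in csv.reader(io.StringIO(text)):
        records.append({"year": year, "name": name, "sex": sex, "count": int(count)})
    return records


def total_by_sex(records):
    """Sum the count column separately for each sex."""
    totals = {}
    for record in records:
        totals[record["sex"]] = totals.get(record["sex"], 0) + record["count"]
    return totals


records = parse_year(SAMPLE_TEXT, 1880)
print(records[0])
print(total_by_sex(records))
```

For real files, the text would come from opening each txt file and passing its contents and year to `parse_year`; the per-year lists can then be concatenated for trend analysis.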
Database list of unassigned numbers.
This public dataset was created by the Social Security Administration and contains all names from Social Security card applications for births that occurred in the United States after 1879. Note that many people born before 1937 never applied for a Social Security card, so their names are not included in this data. For others who did apply, records may not show the place of birth, and again their names are not included in the data. All data are from a 100% sample of records on Social Security card applications as of the end of February 2015. To safeguard privacy, the Social Security Administration restricts names to those with at least 5 occurrences. This public dataset is hosted in Google BigQuery and is included in BigQuery's 1TB/mo of free tier processing. This means that each user receives 1TB of free BigQuery processing every month, which can be used to run queries on this public dataset.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset provides values for SOCIAL SECURITY RATE reported in several countries. The data includes current values, previous releases, historical highs and record lows, release frequency, reported unit and currency.
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
This is a roster of active purchase cardholders for the Social Security Administration.
Through an automated confirmation system, an employer matches information provided by a new employee (Form I-9) against existing information contained in Social Security Administration's (SSA) and the Department of Homeland Security's (DHS) U.S. Citizenship & Immigration Services (USCIS) databases. The SSA E-Verify System (SSA E-Verify) determines a specific verification code based upon information (SSN, DOB, L-Name, F-Name) in the NUMIDENT database. The verification code is returned to DHS E-Verify (DHS E-Verify) along with the original verification request. The message to the employer is determined by DHS E-Verify based on SSA's verification code.
U.S. Government Works: https://www.usa.gov/government-works
This is a public use data file on Delaware's most popular baby names for 2009 to 2016 obtained from the Delaware certificate of birth. The top 15 names for each gender are represented.
This dataset includes the names and general locations of educational events and activities focused on educating the public about Social Security benefits and services. In general, the events in this dataset are typically public events, employer-based events, or media placements. Data elements include the name of event, date of event, a general description, and estimated number of attendees.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
This dataset provides values for SOCIAL SECURITY RATE FOR COMPANIES reported in several countries. The data includes current values, previous releases, historical highs and record lows, release frequency, reported unit and currency.
This data set contains the most popular baby names in Utah from 1910-2013. Each record is sorted first on sex, then year of birth, and then on number of occurrences in descending order. When there is a tie on the number of occurrences names are listed in alphabetical order. This sorting makes it easy to determine a name's rank. The first record for each sex & year of birth has rank 1, the second record has rank 2, and so forth.
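The sort order and rank assignment described above can be sketched in a few lines of Python. The rows below are invented (not real Utah figures), and the tuple layout and the function name `add_ranks` are assumptions for illustration. Note that, as described, tied names still receive consecutive ranks in alphabetical order.

```python
from itertools import groupby

# Invented rows: (sex, year, name, occurrences); not real Utah figures.
ROWS = [
    ("M", 1910, "John", 30),
    ("F", 1910, "Mary", 40),
    ("F", 1910, "Ruth", 25),
    ("F", 1910, "Anna", 25),
    ("M", 1910, "William", 30),
]


def add_ranks(rows):
    """Sort as described and append a 1-based rank within each sex and year."""
    # Sex, then year, then occurrences descending, then name alphabetically.
    ordered = sorted(rows, key=lambda r: (r[0], r[1], -r[3], r[2]))
    ranked = []
    for _key, group in groupby(ordered, key=lambda r: (r[0], r[1])):
        for rank, row in enumerate(group, start=1):
            ranked.append(row + (rank,))
    return ranked


for row in add_ranks(ROWS):
    print(row)
```

Because the published file is already sorted this way, only the `groupby` and `enumerate` step would be needed on the real data; the explicit sort is included so the sketch also works on unsorted input.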
Popular Baby Names by Sex and Ethnic Group. Data were collected through civil birth registration. Each record represents the ranking of a baby name in the order of frequency. Data can be used to represent the popularity of a name. Caution should be used when assessing the rank of a baby name if the frequency count is close to 10; the ranking may vary year to year.