61 datasets found

Baby Names from Social Security Card Applications - National Data
catalog.data.gov
s.cnmilf.com
+1more
Updated May 5, 2022
+ more versions
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Social Security Administration (2022). Baby Names from Social Security Card Applications - National Data [Dataset]. https://catalog.data.gov/dataset/baby-names-from-social-security-card-applications-national-data
Explore at:
Dataset updated
May 5, 2022
Dataset provided by
Social Security Administrationhttp://www.ssa.gov/
Description
The data (name, year of birth, sex, and number) are from a 100 percent sample of Social Security card applications for 1880 onward.
Popular Baby Names
kaggle.com
data.cityofnewyork.us
+4more
Updated May 5, 2023
+ more versions
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Utkarsh Singh (2023). Popular Baby Names [Dataset]. https://www.kaggle.com/datasets/utkarshx27/popular-baby-names/data
Explore at:
CroissantCroissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
Dataset updated
May 5, 2023
Dataset provided by
Kaggle
Authors
Utkarsh Singh
License
https://creativecommons.org/publicdomain/zero/1.0/https://creativecommons.org/publicdomain/zero/1.0/
Description
Popular Baby Names by Sex and Ethnic Group Data were collected through civil birth registration. Each record represents the ranking of a baby name in the order of frequency. Data can be used to represent the popularity of a name. Caution should be used when assessing the rank of a baby name if the frequency count is close to 10; the ranking may vary year to year.
d
Baby Name popularity over time - Dataset - data.govt.nz - discover and use...
catalogue.data.govt.nz
Updated Nov 8, 2017
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
(2017). Baby Name popularity over time - Dataset - data.govt.nz - discover and use data [Dataset]. https://catalogue.data.govt.nz/dataset/baby-name-popularity-over-time
Explore at:
Dataset updated
Nov 8, 2017
License
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This data set lists the sex and number of birth registrations for each first name, from 1900 onward. Years are grouped by the date of the birth registration, not by the date of birth. Some birth registrations are not included, such as registrations with a sex other than Male or Female (i.e. indeterminate or not recorded), or where the birth registration date is not recorded. These excluded records are so few their exclusion is unlikely to have any significant impact on the data. Where a name has less than 10 instances in a particular year, the name will not be included in the data for that year. Due to this, total volumes will be less than the total birth registrations in that year. As first and middle names are recorded in our system together, the first name has been split off from the middle names. Due to the size of the data set, this was done with an automated system, generally looking for the first space in the name. This means there may be names not correctly added. Also, certain symbols in names may not carry through to the data correctly. Please let us know using the contact email address if you find any errors in the data.
S
Statistics on Swedish names by birth country 2020
snd.se
pdf, zip
Updated Nov 2, 2021
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Peter M. Dahlgren (2021). Statistics on Swedish names by birth country 2020 [Dataset]. http://doi.org/10.5878/s91g-y391
Explore at:
zip(232422238), pdf(248253)Available download formats
Unique identifier
https://doi.org/10.5878/s91g-y391
Dataset updated
Nov 2, 2021
Dataset provided by
University of Gothenburg
Swedish National Data Service
Authors
Peter M. Dahlgren
License
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Time period covered
Dec 31, 2020
Area covered
Sweden
Dataset funded by
The Institute for Media Studies
Description
This dataset contains statistics on names (first names of women, first names of men, and last names) by country of birth. In total, there are 231,505 names by 202 countries. The data comes from Statistics Sweden's population statistics (name register) and refers to persons registered in Sweden on December 31st, 2020. However, some names are excluded due to confidentiality, such as names with fewer than five carriers. The data is licensed with Creative Commons Attribution 4.0 International (CC BY 4.0) and may be used as long as Statistics Sweden is stated as the source. In this dataset, you will also find (in addition to the original data from Statistics Sweden) tidied data where the ISO code for each country has been added, as well as data in so-called wide format and long format to facilitate easier data processing.

Please see the Swedish version of the post and the README file for more information about the data.
d
Popular Baby Names - Dataset - data.sa.gov.au
data.sa.gov.au
Updated Mar 5, 2025
+ more versions
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
(2025). Popular Baby Names - Dataset - data.sa.gov.au [Dataset]. https://data.sa.gov.au/data/dataset/popular-baby-names
Explore at:
Dataset updated
Mar 5, 2025
License
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Area covered
South Australia
Description
List of male and female baby names in South Australia from 1944 to 2024. The annual data for baby names is published January/February each year.
o
Geonames - All Cities with a population > 1000
public.opendatasoft.com
data.smartidf.services
+3more
csv, excel, geojson +1
Updated Mar 10, 2024
+ more versions
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
(2024). Geonames - All Cities with a population > 1000 [Dataset]. https://public.opendatasoft.com/explore/dataset/geonames-all-cities-with-a-population-1000/
Explore at:
csv, json, geojson, excelAvailable download formats
Dataset updated
Mar 10, 2024
License
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
All cities with a population > 1000 or seats of adm div (ca 80.000)Sources and ContributionsSources : GeoNames is aggregating over hundred different data sources. Ambassadors : GeoNames Ambassadors help in many countries. Wiki : A wiki allows to view the data and quickly fix error and add missing places. Donations and Sponsoring : Costs for running GeoNames are covered by donations and sponsoring.Enrichment:add country name
d
Most Popular Male and Female First Names - Dataset - data.govt.nz - discover...
catalogue.data.govt.nz
Updated Apr 10, 2017
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
(2017). Most Popular Male and Female First Names - Dataset - data.govt.nz - discover and use data [Dataset]. https://catalogue.data.govt.nz/dataset/most-popular-male-and-female-first-names
Explore at:
Dataset updated
Apr 10, 2017
License
Attribution-ShareAlike 4.0 (CC BY-SA 4.0)https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
Description
Excel spreadsheet of the 100 male and female first names for each year since 1954 to most recent year, based on births registered in New Zealand during each year.
O
Top 100 Baby Names
data.qld.gov.au
researchdata.edu.au
+1more
csv
Updated Feb 13, 2025
+ more versions
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Justice (2025). Top 100 Baby Names [Dataset]. https://www.data.qld.gov.au/dataset/top-100-baby-names
Explore at:
csv(2048), csv(204800), csvAvailable download formats
Dataset updated
Feb 13, 2025
Dataset authored and provided by
Justice
License
Attribution 3.0 (CC BY 3.0)https://creativecommons.org/licenses/by/3.0/
License information was derived automatically
Description
Queensland Top 100 Baby Names
Z
INTRODUCTION OF COVID-NEWS-US-NNK AND COVID-NEWS-BD-NNK DATASET
data.niaid.nih.gov
zenodo.org
Updated Jul 19, 2024
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Nafiz Sadman (2024). INTRODUCTION OF COVID-NEWS-US-NNK AND COVID-NEWS-BD-NNK DATASET [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_4047647
Explore at:
Dataset updated
Jul 19, 2024
Dataset provided by
Nafiz Sadman
Nishat Anjum
Kishor Datta Gupta
License
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Area covered
Bangladesh, United States
Description
Introduction

There are several works based on Natural Language Processing on newspaper reports. Mining opinions from headlines [ 1 ] using Standford NLP and SVM by Rameshbhaiet. Al.compared several algorithms on a small and large dataset. Rubinet. al., in their paper [ 2 ], created a mechanism to differentiate fake news from real ones by building a set of characteristics of news according to their types. The purpose was to contribute to the low resource data available for training machine learning algorithms. Doumitet. al.in [ 3 ] have implemented LDA, a topic modeling approach to study bias present in online news media.

However, there are not many NLP research invested in studying COVID-19. Most applications include classification of chest X-rays and CT-scans to detect presence of pneumonia in lungs [ 4 ], a consequence of the virus. Other research areas include studying the genome sequence of the virus[ 5 ][ 6 ][ 7 ] and replicating its structure to fight and find a vaccine. This research is crucial in battling the pandemic. The few NLP based research publications are sentiment classification of online tweets by Samuel et el [ 8 ] to understand fear persisting in people due to the virus. Similar work has been done using the LSTM network to classify sentiments from online discussion forums by Jelodaret. al.[ 9 ]. NKK dataset is the first study on a comparatively larger dataset of a newspaper report on COVID-19, which contributed to the virus’s awareness to the best of our knowledge.

2 Data-set Introduction

2.1 Data Collection

We accumulated 1000 online newspaper report from United States of America (USA) on COVID-19. The newspaper includes The Washington Post (USA) and StarTribune (USA). We have named it as “Covid-News-USA-NNK”. We also accumulated 50 online newspaper report from Bangladesh on the issue and named it “Covid-News-BD-NNK”. The newspaper includes The Daily Star (BD) and Prothom Alo (BD). All these newspapers are from the top provider and top read in the respective countries. The collection was done manually by 10 human data-collectors of age group 23- with university degrees. This approach was suitable compared to automation to ensure the news were highly relevant to the subject. The newspaper online sites had dynamic content with advertisements in no particular order. Therefore there were high chances of online scrappers to collect inaccurate news reports. One of the challenges while collecting the data is the requirement of subscription. Each newspaper required $1 per subscriptions. Some criteria in collecting the news reports provided as guideline to the human data-collectors were as follows:

The headline must have one or more words directly or indirectly related to COVID-19.

The content of each news must have 5 or more keywords directly or indirectly related to COVID-19.

The genre of the news can be anything as long as it is relevant to the topic. Political, social, economical genres are to be more prioritized.

Avoid taking duplicate reports.

Maintain a time frame for the above mentioned newspapers.

To collect these data we used a google form for USA and BD. We have two human editor to go through each entry to check any spam or troll entry.

2.2 Data Pre-processing and Statistics

Some pre-processing steps performed on the newspaper report dataset are as follows:

Remove hyperlinks.

Remove non-English alphanumeric characters.

Remove stop words.

Lemmatize text.

While more pre-processing could have been applied, we tried to keep the data as much unchanged as possible since changing sentence structures could result us in valuable information loss. While this was done with help of a script, we also assigned same human collectors to cross check for any presence of the above mentioned criteria.

The primary data statistics of the two dataset are shown in Table 1 and 2.

Table 1: Covid-News-USA-NNK data statistics

No of words per headline

7 to 20

No of words per body content

150 to 2100

Table 2: Covid-News-BD-NNK data statistics No of words per headline

10 to 20

No of words per body content

100 to 1500

2.3 Dataset Repository

We used GitHub as our primary data repository in account name NKK^1. Here, we created two repositories USA-NKK^2 and BD-NNK^3. The dataset is available in both CSV and JSON format. We are regularly updating the CSV files and regenerating JSON using a py script. We provided a python script file for essential operation. We welcome all outside collaboration to enrich the dataset.

3 Literature Review

Natural Language Processing (NLP) deals with text (also known as categorical) data in computer science, utilizing numerous diverse methods like one-hot encoding, word embedding, etc., that transform text to machine language, which can be fed to multiple machine learning and deep learning algorithms.

Some well-known applications of NLP includes fraud detection on online media sites[ 10 ], using authorship attribution in fallback authentication systems[ 11 ], intelligent conversational agents or chatbots[ 12 ] and machine translations used by Google Translate[ 13 ]. While these are all downstream tasks, several exciting developments have been made in the algorithm solely for Natural Language Processing tasks. The two most trending ones are BERT[ 14 ], which uses bidirectional encoder-decoder architecture to create the transformer model, that can do near-perfect classification tasks and next-word predictions for next generations, and GPT-3 models released by OpenAI[ 15 ] that can generate texts almost human-like. However, these are all pre-trained models since they carry huge computation cost. Information Extraction is a generalized concept of retrieving information from a dataset. Information extraction from an image could be retrieving vital feature spaces or targeted portions of an image; information extraction from speech could be retrieving information about names, places, etc[ 16 ]. Information extraction in texts could be identifying named entities and locations or essential data. Topic modeling is a sub-task of NLP and also a process of information extraction. It clusters words and phrases of the same context together into groups. Topic modeling is an unsupervised learning method that gives us a brief idea about a set of text. One commonly used topic modeling is Latent Dirichlet Allocation or LDA[17].

Keyword extraction is a process of information extraction and sub-task of NLP to extract essential words and phrases from a text. TextRank [ 18 ] is an efficient keyword extraction technique that uses graphs to calculate the weight of each word and pick the words with more weight to it.

Word clouds are a great visualization technique to understand the overall ’talk of the topic’. The clustered words give us a quick understanding of the content.

4 Our experiments and Result analysis

We used the wordcloud library^4 to create the word clouds. Figure 1 and 3 presents the word cloud of Covid-News-USA- NNK dataset by month from February to May. From the figures 1,2,3, we can point few information:

In February, both the news paper have talked about China and source of the outbreak.

StarTribune emphasized on Minnesota as the most concerned state. In April, it seemed to have been concerned more.

Both the newspaper talked about the virus impacting the economy, i.e, bank, elections, administrations, markets.

Washington Post discussed global issues more than StarTribune.

StarTribune in February mentioned the first precautionary measurement: wearing masks, and the uncontrollable spread of the virus throughout the nation.

While both the newspaper mentioned the outbreak in China in February, the weight of the spread in the United States are more highlighted through out March till May, displaying the critical impact caused by the virus.

We used a script to extract all numbers related to certain keywords like ’Deaths’, ’Infected’, ’Died’ , ’Infections’, ’Quarantined’, Lock-down’, ’Diagnosed’ etc from the news reports and created a number of cases for both the newspaper. Figure 4 shows the statistics of this series. From this extraction technique, we can observe that April was the peak month for the covid cases as it gradually rose from February. Both the newspaper clearly shows us that the rise in covid cases from February to March was slower than the rise from March to April. This is an important indicator of possible recklessness in preparations to battle the virus. However, the steep fall from April to May also shows the positive response against the attack. We used Vader Sentiment Analysis to extract sentiment of the headlines and the body. On average, the sentiments were from -0.5 to -0.9. Vader Sentiment scale ranges from -1(highly negative to 1(highly positive). There were some cases

where the sentiment scores of the headline and body contradicted each other,i.e., the sentiment of the headline was negative but the sentiment of the body was slightly positive. Overall, sentiment analysis can assist us sort the most concerning (most negative) news from the positive ones, from which we can learn more about the indicators related to COVID-19 and the serious impact caused by it. Moreover, sentiment analysis can also provide us information about how a state or country is reacting to the pandemic. We used PageRank algorithm to extract keywords from headlines as well as the body content. PageRank efficiently highlights important relevant keywords in the text. Some frequently occurring important keywords extracted from both the datasets are: ’China’, Government’, ’Masks’, ’Economy’, ’Crisis’, ’Theft’ , ’Stock market’ , ’Jobs’ , ’Election’, ’Missteps’, ’Health’, ’Response’. Keywords extraction acts as a filter allowing quick searches for indicators in case of locating situations of the economy,
S
Baby Names: Beginning 2007
health.data.ny.gov
application/rdfxml +5
Updated Nov 10, 2023
+ more versions
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
New York State Department of Health (2023). Baby Names: Beginning 2007 [Dataset]. https://health.data.ny.gov/Health/Baby-Names-Beginning-2007/jxy9-yhdk
Explore at:
csv, application/rdfxml, application/rssxml, xml, json, tsvAvailable download formats
Dataset updated
Nov 10, 2023
Dataset authored and provided by
New York State Department of Health
Description
New York State Baby Names are aggregated and displayed by the year, county, or borough where the mother resided as stated on a New York State or New York City (NYC) birth certificate. The frequency of the baby name is listed if there are 5 or more of the same baby name in a county outside of NYC or 10 or more of the same baby name in a NYC borough.

Data from: THE RELEVANCY OF MASSIVE HEALTH EDUCATION IN THE BRAZILIAN PRISON...

zenodo.org

csv, pdf

Updated Jul 16, 2024

Facebook

Twitter

Click to copy link

Link copied

Cite

Janaína L. R. da S. Valentim; Janaína L. R. da S. Valentim; Sara Dias-Trindade; Sara Dias-Trindade; Eloiza da S. G. Oliveira; Eloiza da S. G. Oliveira; José A. M. Moreira; José A. M. Moreira; Felipe Fernandes; Felipe Fernandes; Manoel Honorio Romão; Manoel Honorio Romão; Philippi S. G. de Morais; Philippi S. G. de Morais; Alexandre R. Caitano; Alexandre R. Caitano; Aline P. Dias; Aline P. Dias; Carlos A. P. Oliveira; Carlos A. P. Oliveira; Karilany D. Coutinho; Karilany D. Coutinho; Ricardo B. Ceccim; Ricardo B. Ceccim; Ricardo A. de M. Valentim; Ricardo A. de M. Valentim (2024). THE RELEVANCY OF MASSIVE HEALTH EDUCATION IN THE BRAZILIAN PRISON SYSTEM: THE COURSE "HEALTH CARE FOR PEOPLE DEPRIVED OF FREEDOM" AND ITS IMPACTS [Dataset]. http://doi.org/10.5281/zenodo.6499752

Explore at:

csv, pdfAvailable download formats

Unique identifier

https://doi.org/10.5281/zenodo.6499752

Dataset updated

Jul 16, 2024

Dataset provided by

Zenodohttp://zenodo.org/

Authors

License

Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically

Area covered

Brazil

Description

Dataset name: asppl_dataset_v2.csv

Version: 2.0

Dataset period: 06/07/2018 - 01/14/2022

Dataset Characteristics: Multivalued

Number of Instances: 8118

Number of Attributes: 9

Missing Values: Yes

Area(s): Health and education

Sources:

Virtual Learning Environment of the Brazilian Health System (AVASUS) (Brasil, 2022a);
Brazilian Occupational Classification (CBO) (Brasil, 2022b);
National Registry of Health Establishments (CNES) (Brasil, 2022c);
Brazilian Institute of Geography and Statistics (IBGE) (Brasil, 2022e).

Description: The data contained in the asppl_dataset_v2.csv dataset (see Table 1) originates from participants of the technology-based educational course “Health Care for People Deprived of Freedom.” The course is available on the AVASUS (Brasil, 2022a). This dataset provides elementary data for analyzing the course’s impact and reach and the profile of its participants. In addition, it brings an update of the data presented in work by Valentim et al. (2021).

Table 1: Description of AVASUS dataset features.

Attributes	Description	datatype	Value
gender	Gender of the course participant.	Categorical.	Feminino / Masculino / Não Informado. (In English, Female, Male or Uninformed)
course_progress	Percentage of completion of the course.	Numerical.	Range from 0 to 100.
course_evaluation	A score given to the course by the participant.	Numerical.	0, 1, 2, 3, 4, 5 or NaN.
evaluation_commentary	Comment made by the participant about the course.	Categorical.	Free text or NaN.
region	Brazilian region in which the participant resides.	Categorical.	Brazilian region according to IBGE: Norte, Nordeste, Centro-Oeste, Sudeste or Sul (In English North, Northeast, Midwest, Southeast or South).
CNES	The CNES code refers to the health establishment where the participant works.	Numerical.	CNES Code or NaN.
health_care_level	Identification of the health care network level for which the course participant works.	Categorical.	“ATENCAO PRIMARIA”, “MEDIA COMPLEXIDADE”, “ALTA COMPLEXIDADE”, and their possible combinations. (In English "PRIMARY HEALTH CARE", "SECONDARY HEALTH CARE" AND "TERTIARY HEALTH CARE")
year_enrollment	Year in which the course participant registered.	Numerical.	Year (YYYY).
CBO	Participant occupation.	Categorical.	Text coded according to the Brazilian Classification of Occupations or “Indivíduo sem afiliação formal.” (In English “Individual without formal affiliation.”)

Dataset name: prison_syphilis_and_population_brazil.csv

Dataset period: 2017 - 2020

Dataset Characteristics: Multivalued

Number of Instances: 6

Number of Attributes: 13

Missing Values: No

Source:

National Penitentiary Department (DEPEN) (Brasil, 2022d);

Description: The data contained in the prison_syphilis_and_population_brazil.csv dataset (see Table 2) originate from the National Penitentiary Department Information System (SISDEPEN) (Brasil, 2022d). This dataset provides data on the population and prevalence of syphilis in the Brazilian prison system. In addition, it brings a rate that represents the normalized data for purposes of comparison between the populations of each region and Brazil.

Table 2: Description of DEPEN dataset Features.

Attributes	Description	datatype	Value
Region	Brazilian region in which the participant resides. In addition, the sum of the regions, which refers to Brazil.	Categorical.	Brazil and Brazilian region according to IBGE: North, Northeast, Midwest, Southeast or South.
syphilis_2017	Number of syphilis cases in the prison system in 2017.	Numerical.	Number of syphilis cases.
syphilis_rate_2017	Normalized rate of syphilis cases in 2017.	Numerical.	Syphilis case rate.
syphilis_2018	Number of syphilis cases in the prison system in 2018.	Numerical.	Number of syphilis cases.
syphilis_rate_2018	Normalized rate of syphilis cases in 2018.	Numerical.	Syphilis case rate.
syphilis_2019	Number of syphilis cases in the prison system in 2019.	Numerical.	Number of syphilis cases.
syphilis_rate_2019	Normalized rate of syphilis cases in 2019.	Numerical.	Syphilis case rate.
syphilis_2020	Number of syphilis cases in the prison system in 2020.	Numerical.	Number of syphilis cases.
syphilis_rate_2020	Normalized rate of syphilis cases in 2020.	Numerical.	Syphilis case rate.
pop_2017	Prison population in 2017.	Numerical.	Population number.
pop_2018	Prison population in 2018.	Numerical.	Population number.
pop_2019	Prison population in 2019.	Numerical.	Population number.
pop_2020	Prison population in 2020.	Numerical.	Population number.

Dataset name: students_cumulative_sum.csv

Dataset period: 2018 - 2020

Dataset Characteristics: Multivalued

Number of Instances: 6

Number of Attributes: 7

Missing Values: No

Source:

Virtual Learning Environment of the Brazilian Health System (AVASUS) (Brasil, 2022a);
Brazilian Institute of Geography and Statistics (IBGE) (Brasil, 2022e).

Description: The data contained in the students_cumulative_sum.csv dataset (see Table 3) originate mainly from AVASUS (Brasil, 2022a). This dataset provides data on the number of students by region and year. In addition, it brings a rate that represents the normalized data for purposes of comparison between the populations of each region and Brazil. We used population data estimated by the IBGE (Brasil, 2022e) to calculate the rate.

Table 3: Description of Students dataset Features.

Popular White Last Names in the US
johnsnowlabs.com
csv
Updated Jan 20, 2021
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
John Snow Labs (2021). Popular White Last Names in the US [Dataset]. https://www.johnsnowlabs.com/marketplace/popular-white-last-names-in-the-us/
Explore at:
csvAvailable download formats
Dataset updated
Jan 20, 2021
Dataset authored and provided by
John Snow Labs
Area covered
United States
Description
This dataset represents the popular last names in the United States for White.
Popular Black Last Names in the US
johnsnowlabs.com
csv
Updated Jan 20, 2021
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
John Snow Labs (2021). Popular Black Last Names in the US [Dataset]. https://www.johnsnowlabs.com/marketplace/popular-black-last-names-in-the-us/
Explore at:
csvAvailable download formats
Dataset updated
Jan 20, 2021
Dataset authored and provided by
John Snow Labs
Area covered
United States
Description
This dataset represents the popular last names in the United States for Black.
Top 100 baby names in England and Wales: historical data
ons.gov.uk
cy.ons.gov.uk
xls
Updated Aug 15, 2014
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Office for National Statistics (2014). Top 100 baby names in England and Wales: historical data [Dataset]. https://www.ons.gov.uk/peoplepopulationandcommunity/birthsdeathsandmarriages/livebirths/datasets/babynamesenglandandwalestop100babynameshistoricaldata
Explore at:
xlsAvailable download formats
Dataset updated
Aug 15, 2014
Dataset provided by
Office for National Statisticshttp://www.ons.gov.uk/
License
Open Government Licence 3.0http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/
License information was derived automatically
Description
Historic lists of top 100 names for baby boys and girls for 1904 to 1994 at 10-yearly intervals.
NLUCat
zenodo.org
huggingface.co
+1more
zip
Updated Mar 4, 2024
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Zenodo (2024). NLUCat [Dataset]. http://doi.org/10.5281/zenodo.10721193
Explore at:
zipAvailable download formats
Unique identifier
https://doi.org/10.5281/zenodo.10721193
Dataset updated
Mar 4, 2024
Dataset provided by
Zenodohttp://zenodo.org/
License
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
NLUCat

Dataset Description

Dataset Summary

NLUCat is a dataset of NLU in Catalan. It consists of nearly 12,000 instructions annotated with the most relevant intents and spans. Each instruction is accompanied, in addition, by the instructions received by the annotator who wrote it.

The intents taken into account are the habitual ones of a virtual home assistant (activity calendar, IOT, list management, leisure, etc.), but specific ones have also been added to take into account social and healthcare needs for vulnerable people (information on administrative procedures, menu and medication reminders, etc.).

The spans have been annotated with a tag describing the type of information they contain. They are fine-grained, but can be easily grouped to use them in robust systems.

The examples are not only written in Catalan, but they also take into account the geographical and cultural reality of the speakers of this language (geographic points, cultural references, etc.)

This dataset can be used to train models for intent classification, spans identification and examples generation.

This is the complete version of the dataset. A version prepared to train and evaluate intent classifiers has been published in HuggingFace.

In this repository you'll find the following items:

NLUCat_annotation_guidelines.docx: the guidelines provided to the annotation team

NLUCat_dataset.json: the completed NLUCat dataset

NLUCat_stats.tsv: statistics about de NLUCat dataset

dataset: folder with the dataset as published in HuggingFace, splited and prepared for training and evaluating intent classifiers

reports: folder with the reports done as feedback to the annotators during the annotation process

This dataset can be used for any purpose, whether academic or commercial, under the terms of the CC BY 4.0. Give appropriate credit , provide a link to the license, and indicate if changes were made.

Supported Tasks and Leaderboards

Intent classification, spans identification and examples generation.

Languages

The dataset is in Catalan (ca-ES).

Dataset Structure

Data Instances

Three JSON files, one for each split.

Data Fields

example: `str`. Example

annotation: `dict`. Annotation of the example

intent: `str`. Intent tag

slots: `list`. List of slots

Tag:`str`. tag to the slot

Text:`str`. Text of the slot

Start_char: `int`. First character of the span

End_char: `int`. Last character of the span

Example

An example looks as follows:

{
"example": "Demana una ambulància; la meva dona està de part.",
"annotation": {
"intent": "call_emergency",
"slots": [
{
"Tag": "service",
"Text": "ambulància",
"Start_char": 11,
"End_char": 21
},
{
"Tag": "situation",
"Text": "la meva dona està de part",
"Start_char": 23,
"End_char": 48
}
]
}
},

Data Splits

NLUCat.train: 9128 examples

NLUCat.dev: 1441 examples

NLUCat.test: 1441 examples

Dataset Creation

Curation Rationale

We created this dataset to contribute to the development of language models in Catalan, a low-resource language.

When creating this dataset, we took into account not only the language but the entire socio-cultural reality of the Catalan-speaking population. Special consideration was also given to the needs of the vulnerable population.

Source Data

Initial Data Collection and Normalization

We commissioned a company to create fictitious examples for the creation of this dataset.

Who are the source language producers?

We commissioned the writing of the examples to the company m47 labs.

Annotations

Annotation process

The elaboration of this dataset has been done in three steps, taking as a model the process followed by the NLU-Evaluation-Data dataset, as explained in the paper.
* First step: translation or elaboration of the instructions given to the annotators to write the examples.
* Second step: writing the examples. This step also includes the grammatical correction and normalization of the texts.
* Third step: recording the attempts and the slots of each example. In this step, some modifications were made to the annotation guides to adjust them to the real situations.

Who are the annotators?

The drafting of the examples and their annotation was entrusted to the company m47 labs through a public tender process.

Personal and Sensitive Information

No personal or sensitive information included.

The examples used for the preparation of this dataset are fictitious and, therefore, the information shown is not real.

Considerations for Using the Data

Social Impact of Dataset

We hope that this dataset will help the development of virtual assistants in Catalan, a language that is often not taken into account, and that it will especially help to improve the quality of life of people with special needs.

Discussion of Biases

When writing the examples, the annotators were asked to take into account the socio-cultural reality (geographic points, artists and cultural references, etc.) of the Catalan-speaking population.
Likewise, they were asked to be careful to avoid examples that reinforce the stereotypes that exist in this society. For example: be careful with the gender or origin of personal names that are associated with certain activities.

Other Known Limitations

[N/A]

Additional Information

Dataset Curators

Language Technologies Unit at the Barcelona Supercomputing Center (langtech@bsc.es)

This work has been promoted and financed by the Generalitat de Catalunya through the Aina project.

Licensing Information

This dataset can be used for any purpose, whether academic or commercial, under the terms of the CC BY 4.0.
Give appropriate credit, provide a link to the license, and indicate if changes were made.

Citation Information

DOI

Contributions

The drafting of the examples and their annotation was entrusted to the company m47 labs through a public tender process.
Data from: A large-scale fMRI dataset for the visual processing of...
openneuro.org
Updated Feb 25, 2023
+ more versions
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Zhengxin Gong; Ming Zhou; Yuxuan Dai; Yushan Wen; Zonglei Zhen (2023). A large-scale fMRI dataset for the visual processing of naturalistic scenes [Dataset]. http://doi.org/10.18112/openneuro.ds004496.v2.0.1
Explore at:
Unique identifier
https://doi.org/10.18112/openneuro.ds004496.v2.0.1
Dataset updated
Feb 25, 2023
Dataset provided by
OpenNeurohttps://openneuro.org/
Authors
Zhengxin Gong; Ming Zhou; Yuxuan Dai; Yushan Wen; Zonglei Zhen
License
CC0 1.0 Universal Public Domain Dedicationhttps://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Description
Summary

One ultimate goal of visual neuroscience is to understand how the brain processes visual stimuli encountered in the natural environment. Achieving this goal requires records of brain responses under massive amounts of naturalistic stimuli. Although the scientific community has put a lot of effort into collecting large-scale functional magnetic resonance imaging (fMRI) data under naturalistic stimuli, more naturalistic fMRI datasets are still urgently needed. We present here the Natural Object Dataset (NOD), a large-scale fMRI dataset containing responses to 57,620 naturalistic images from 30 participants. NOD strives for a balance between sampling variation between individuals and sampling variation between stimuli. This enables NOD to be utilized not only for determining whether an observation is generalizable across many individuals, but also for testing whether a response pattern is generalized to a variety of naturalistic stimuli. We anticipate that the NOD together with existing naturalistic neuroimaging datasets will serve as a new impetus for our understanding of the visual processing of naturalistic stimuli.

Data record

The data can be accessed from the OpenNeuro public repository (accession number: ds004496), organized according to the Brain-Imaging-Data-Structure (BIDS) Specification version 1.7.0. In short, the raw data from each subject are saved in “sub-

Stimulus images The stimulus images for different fMRI experiments are deposited in separate folders: “stimuli/imagenet”, “stimuli/coco”, “stimuli/prf”, and “stimuli/floc”. Each experiment folder contains corresponding stimulus images, and the auxiliary files can be found within the “info” subfolder.

Raw MRI data The folder for each participant consists of several session folders: “ses-anat”, “ses-coco”, “ses-imagenet”, “ses-prf”, and “ses-floc”. The session folder in turn includes one or two folders, named as “anat”, “func” or “fmap”, for corresponding modality data. The scan information for each session is provided in the “sub-

Preprocessed volume data from fMRIprep The preprocessed volume-based fMRI data are in subject's native space, saved as “sub-

Preprocessed surface-based data from ciftify Under each run folder, the preprocessed surface-based data are saved standard fsLR space, named as “sub-

Brain activation data from surface-based GLM analyses The brain activation data are derived from GLM analyses on the standard fsLR space, saved as “sub-
Z
higbie/ALWW: American Labor Who's Who (1925) Dataset
data.niaid.nih.gov
zenodo.org
Updated Feb 5, 2022
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Toby Higbie (2022). higbie/ALWW: American Labor Who's Who (1925) Dataset [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_597301
Explore at:
Dataset updated
Feb 5, 2022
Dataset authored and provided by
Toby Higbie
License
Attribution-NonCommercial 4.0 (CC BY-NC 4.0)https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
Description
American Labor Who's Who dataset, version 2.2.0

A dataset derived from the digitized text of Solon de Leon, et al., The American Labor Who's Who (New York: Hanford Press, 1925). This release includes separate files the U.S. and "Other Countries" sections of the directory.

The American Labor Who's Who (ALWW), published in 1925, is a directory of activists in the fields of trade unionism, immigrant rights, civil liberties, progressive and radical politics. The directory includes roughly 1,300 entries for U.S. activists and 300 additional non-US activists. Each entry is a telegraphic biography. Some provide only name, professional title and address at the time of publication, but many sketch rich life histories. Nearly all provide details on birth date and place, family background, education, migration, and work histories, as well as key organizations, events publications, home and work addresses.

The ALWW dataset is derived from the text hosted on the HathiTrust digital library: https://catalog.hathitrust.org/Record/000591300. Faculty, staff, and students at UCLA corrected the plain text from the scanned document and parsed the text into comma-separated fields. This release includes separate files for US entries and "Other Countries" entries. About 30 individuals are listed in the US section with the notation "see other countries," mainly Canadians and Mexicans. This subset is also in a separate file in this release.

For more information about this and related projects see: http://socialjusticehistory.org/projects/networkedlabor/.

Contributors Tobias Higbie, Principal Investigator, UCLA History Department Craig Messner, UCLA Center for Digital Humanities Nick de Carlo, UCLA Center for Digital Humanities Zoe Borovsky, UCLA Library

Contents of Release The US and Other Country datasets were developed separately as reflected in their different version numbers. The US entries are more developed and clean. Consider the Other Country files as beta releases. The files listed below are the most up-to-date available. Previous versions are also available via GitHub.

alww-us-2-2-0.csv (all US entries)

alww-othercountries-o.3.2.csv (all other country entries)

alww-othercountries-0.3.2-subset-crossrefd.csv (other country entries cross-referenced in the US entry section)

Field Layouts

The field layouts for the US and Other Country files are slightly different in this release.

US Entries The fields for the US file (alww-us-2-2-0.csv) include: NAME [first and last], NAME-ALWW [name as it appears in the original text], TITLES [named offices or occupations in 1925], ORGS [compiled list of organizations belonged to at any time], BIRTHDATES [m/d/y where present], BIRTHCOUNTRY [derived from Birthplace], BIRTHPLACE [as listed], FATHER [father's occupation, in a few cases includes mother], CAREER (UNABBREVIATED) [education and experience, usually chronological, most common abbreviations expanded to full words], CAREER (ABBREVIATED) [same as previous with original abbreviations], HOME ADDRESS [where present], WORK ADDRESS [where present], PUBLICATIONS [incomplete], INDEX CATEGORY 1 [categories derived from the ALWW index, many have more than one category], INDEX CATEGORY 2, INDEX CATEGORY 3, INDEX CATEGORY 4, INDEX CATEGORY 5, INDEX CATEGORY 6, INDEX CATEGORY 7, INDEX CATEGORY 8, ORIGINAL [unparsed entry text carried over from earlier versions].

Other Countries The other country files (alww-othercountries-o.3.2.csv and alww-othercountries-0.3.2-subset-crossrefd.csv) include these fields: Name [last, first], Titles [named offices or occupations in 1925], Organizations [compiled list of organizations belonged to at any time], Birthdate [as listed], Birthplace [as listed], Father [father's occupation], Other [same as Career above], HomeAddress [as listed], WorkAddress [as listed], Publications [derived from entries].

Related datasets American Labor Press Directory (1925); American Labor Press Directory (1940); Who's Who in Labor (1947).
TIGER/Line Shapefile, Current, State, Alaska, Place
catalog.data.gov
Updated Dec 15, 2023
+ more versions
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
U.S. Department of Commerce, U.S. Census Bureau, Geography Division, Geospatial Products Branch (Point of Contact) (2023). TIGER/Line Shapefile, Current, State, Alaska, Place [Dataset]. https://catalog.data.gov/dataset/tiger-line-shapefile-current-state-alaska-place
Explore at:
Dataset updated
Dec 15, 2023
Dataset provided by
United States Census Bureauhttp://census.gov/
Area covered
Alaska
Description
This resource is a member of a series. The TIGER/Line shapefiles and related database files (.dbf) are an extract of selected geographic and cartographic information from the U.S. Census Bureau's Master Address File / Topologically Integrated Geographic Encoding and Referencing (MAF/TIGER) Database (MTDB). The MTDB represents a seamless national file with no overlaps or gaps between parts, however, each TIGER/Line shapefile is designed to stand alone as an independent data set, or they can be combined to cover the entire nation. The TIGER/Line shapefiles include both incorporated places (legal entities) and census designated places or CDPs (statistical entities). An incorporated place is established to provide governmental functions for a concentration of people as opposed to a minor civil division (MCD), which generally is created to provide services or administer an area without regard, necessarily, to population. Places always nest within a state, but may extend across county and county subdivision boundaries. An incorporated place usually is a city, town, village, or borough, but can have other legal descriptions. CDPs are delineated for the decennial census as the statistical counterparts of incorporated places. CDPs are delineated to provide data for settled concentrations of population that are identifiable by name, but are not legally incorporated under the laws of the state in which they are located. The boundaries for CDPs often are defined in partnership with state, local, and/or tribal officials and usually coincide with visible features or the boundary of an adjacent incorporated place or another legal entity. CDP boundaries often change from one decennial census to the next with changes in the settlement pattern and development; a CDP with the same name as in an earlier census does not necessarily have the same boundary. The only population/housing size requirement for CDPs is that they must contain some housing and population. The boundaries of most incorporated places in this shapefile are as of January 1, 2023, as reported through the Census Bureau's Boundary and Annexation Survey (BAS). The boundaries of all CDPs were delineated as part of the Census Bureau's Participant Statistical Areas Program (PSAP) for the 2020 Census, but some CDPs were added or updated through the 2023 BAS as well.
Historic US Census - 1920
redivis.com
application/jsonl +7
Updated Jan 10, 2020
+ more versions
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Stanford Center for Population Health Sciences (2020). Historic US Census - 1920 [Dataset]. http://doi.org/10.57761/v43s-pk48
Explore at:
sas, csv, stata, spss, avro, parquet, application/jsonl, arrowAvailable download formats
Unique identifier
https://doi.org/10.57761/v43s-pk48
Dataset updated
Jan 10, 2020
Dataset provided by
Redivis Inc.
Authors
Stanford Center for Population Health Sciences
Time period covered
Jan 1, 1920 - Dec 31, 1920
Area covered
United States
Description
Abstract

The Integrated Public Use Microdata Series (IPUMS) Complete Count Data include more than 650 million individual-level and 7.5 million household-level records. The microdata are the result of collaboration between IPUMS and the nation’s two largest genealogical organizations—Ancestry.com and FamilySearch—and provides the largest and richest source of individual level and household data.

Before Manuscript Submission

All manuscripts (and other items you'd like to publish) must be submitted to

phsdatacore@stanford.edu for approval prior to journal submission.

We will check your cell sizes and citations.

For more information about how to cite PHS and PHS datasets, please visit:

https:/phsdocs.developerhub.io/need-help/citing-phs-data-core

Documentation

Historic data are scarce and often only exists in aggregate tables. The key advantage of historic US census data is the availability of individual and household level characteristics that researchers can tabulate in ways that benefits their specific research questions. The data contain demographic variables, economic variables, migration variables and family variables. Within households, it is possible to create relational data as all relations between household members are known. For example, having data on the mother and her children in a household enables researchers to calculate the mother’s age at birth. Another advantage of the Complete Count data is the possibility to follow individuals over time using a historical identifier.

In sum: the historic US census data are a unique source for research on social and economic change and can provide population health researchers with information about social and economic determinants.

The historic US 1920 census data was collected in January 1920. Enumerators collected data traveling to households and counting the residents who regularly slept at the household. Individuals lacking permanent housing were counted as residents of the place where they were when the data was collected. Household members absent on the day of data collected were either listed to the household with the help of other household members or were scheduled for the last census subdivision.

Notes

We provide household and person data separately so that it is convenient to explore the descriptive statistics on each level. In order to obtain a full dataset, merge the household and person on the variables SERIAL and SERIALP. In order to create a longitudinal dataset, merge datasets on the variable HISTID.

Households with more than 60 people in the original data were broken up for processing purposes. Every person in the large households are considered to be in their own household. The original large households can be identified using the variable SPLIT, reconstructed using the variable SPLITHID, and the original count is found in the variable SPLITNUM.

Coded variables derived from string variables are still in progress. These variables include: occupation and industry.

Missing observations have been allocated and some inconsistencies have been edited for the following variables: SPEAKENG, YRIMMIG, CITIZEN, AGE, BPL, MBPL, FBPL, LIT, SCHOOL, OWNERSHP, MORTGAGE, FARM, CLASSWKR, OCC1950, IND1950, MARST, RACE, SEX, RELATE, MTONGUE. The flag variables indicating an allocated observation for the associated variables can be included in your extract by clicking the ‘Select data quality flags’ box on the extract summary page.

Most inconsistent information was not edited for this release, thus there are observations outside of the universe for some variables. In particular, the variables GQ, and GQTYPE have known inconsistencies and will be improved with the next release.

%3C!-- --%3E

Section 2

This dataset was created on 2020-01-10 18:46:34.647 by merging multiple datasets together. The source datasets for this version were:

IPUMS 1920 households: This dataset includes all households from the 1920 US census.

IPUMS 1920 persons: This dataset includes all individuals from the 1920 US census.

IPUMS 1920 Lookup: This dataset includes variable names, variable labels, variable values, and corresponding variable value labels for the IPUMS 1920 datasets.
d
Mental Health Services Monthly Statistics
digital.nhs.uk
csv, pdf, xls, xlsx
Updated Jul 21, 2016
+ more versions
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
(2016). Mental Health Services Monthly Statistics [Dataset]. https://digital.nhs.uk/data-and-information/publications/statistical/mental-health-services-monthly-statistics
Explore at:
csv(13.0 kB), csv(272.1 kB), pdf(239.2 kB), pdf(729.1 kB), csv(387.3 kB), csv(375.0 kB), csv(1.3 MB), xlsx(118.7 kB), xls(1.1 MB), xls(994.8 kB), xls(389.6 kB), xls(138.2 kB), csv(5.3 kB)Available download formats
Dataset updated
Jul 21, 2016
License
https://digital.nhs.uk/about-nhs-digital/terms-and-conditionshttps://digital.nhs.uk/about-nhs-digital/terms-and-conditions
Time period covered
Mar 1, 2016 - May 31, 2016
Area covered
England
Description
This release presents experimental statistics from the Mental Health Services Data Set (MHSDS), using final submissions for April 2016 and provisional submissions for May 2016. This is the fifth monthly release from the dataset, which replaces the Mental Health and Learning Disabilities Dataset (MHLDDS). As well as analysis of waiting times, first published in March 2016, this release includes elements of the reports that were previously included in monthly reports produced from final MHLDDS submissions. In this publication a new data file has been produced to present the data for people identified as having learning disabilities and/or autistic spectrum disorder (LDA) characteristics. Because of the scope of the changes to the dataset (resulting in the name change to MHSDS and the new name for these monthly reports) it will take time to re-introduce all possible measures that were previously part of the MHLDS Monthly Reports. Additional measures will be added to this report in the coming months. Further details about these changes and the consultation that informed were announced in November. From January 2016 the release includes information on people in children and young people's mental health services, including CAMHS, for the first time. Learning disabilities and autism services have been included since September 2014. This release of final data for April 2016 comprises: - An Executive Summary, which presents national-level analysis across the whole dataset and also for some specific service areas and age groups - Data tables about access and waiting times in mental health services for the based on provisional data for the period 1 March 2016 to 31 May 2016. - A monthly data file which presents 92 measures for mental health, learning disability and autism services at National, Provider and Clinical Commissioning Group (CCG) level. - A Currency and Payments (CAP) data file, containing three measures relating to people assigned to Adult Mental Health Care Clusters. Further measures will be added in future releases. - A data file containing the measures relating to people with learning disabilities and/or autism. - Exploratory analysis of the coverage and completeness of access and waiting times statistics for people entering the Early Intervention in Psychosis pathway. - A set of provider level data quality measures for both months. The report comprises of validity measures for various data items at National and Provider level. From the publication of April data, a coverage report is included showing the number of providers submitting each month and number of records submitted. - A metadata file, which provide contextual information for each measure, including a full description, current uses, method used for analysis and some notes on usage. We will release the reports as experimental statistics until the characteristics of data flowed using the new data standard are understood. A correction has been made to this publication on 10 September 2018. This amendment relates to statistics in the monthly CSV data file; the specific measures effected are listed in the “Corrected Measures” CSV. All listed measures have now been corrected. NHS Digital apologises for any inconvenience caused.

Facebook

Twitter

Click to copy link

Link copied

Cite

Social Security Administration (2022). Baby Names from Social Security Card Applications - National Data [Dataset]. https://catalog.data.gov/dataset/baby-names-from-social-security-card-applications-national-data

Baby Names from Social Security Card Applications - National Data

Explore at:

15 scholarly articles cite this dataset (View in Google Scholar)

Dataset updated

May 5, 2022

Dataset provided by

Social Security Administrationhttp://www.ssa.gov/

Description

The data (name, year of birth, sex, and number) are from a 100 percent sample of Social Security card applications for 1880 onward.

Clear search

Close search

Google apps

Main menu

Baby Names from Social Security Card Applications - National Data

Popular Baby Names

Baby Name popularity over time - Dataset - data.govt.nz - discover and use...

Statistics on Swedish names by birth country 2020

Popular Baby Names - Dataset - data.sa.gov.au

Geonames - All Cities with a population > 1000

Most Popular Male and Female First Names - Dataset - data.govt.nz - discover...

Top 100 Baby Names

INTRODUCTION OF COVID-NEWS-US-NNK AND COVID-NEWS-BD-NNK DATASET

Baby Names: Beginning 2007

Data from: THE RELEVANCY OF MASSIVE HEALTH EDUCATION IN THE BRAZILIAN PRISON...

Popular White Last Names in the US

Popular Black Last Names in the US

Top 100 baby names in England and Wales: historical data

NLUCat

NLUCat

Dataset Description

Dataset Summary

Supported Tasks and Leaderboards

Languages

Dataset Structure

Data Instances

Data Fields

Data Splits

Dataset Creation

Curation Rationale

Source Data

Annotations

Personal and Sensitive Information

Considerations for Using the Data

Social Impact of Dataset

Discussion of Biases

Other Known Limitations

Additional Information

Dataset Curators

Licensing Information

Citation Information

Contributions

Data from: A large-scale fMRI dataset for the visual processing of...

higbie/ALWW: American Labor Who's Who (1925) Dataset

TIGER/Line Shapefile, Current, State, Alaska, Place

Historic US Census - 1920

Abstract

Before Manuscript Submission

Documentation

Section 2

Mental Health Services Monthly Statistics

Baby Names from Social Security Card Applications - National DataSee More Versions

Baby Names from Social Security Card Applications - National Data