57 datasets found

The most spoken languages worldwide 2025
statista.com
Cite
Statista, The most spoken languages worldwide 2025 [Dataset]. https://www.statista.com/statistics/266808/the-most-spoken-languages-worldwide/
Dataset authored and provided by
Statista (http://statista.com/)
Time period covered
2025
Area covered
World
Description
In 2025, there were around 1.53 billion people worldwide who spoke English either natively or as a second language, more than the 1.18 billion Mandarin Chinese speakers at the time of survey. Hindi and Spanish were the third and fourth most widespread languages that year.

Languages in the United States

The United States does not have an official language, but the country uses English, specifically American English, for legislation, regulation, and other official pronouncements. The United States is a land of immigration, and the languages spoken in the United States vary as a result of the multicultural population. The second most common language spoken in the United States is Spanish or Spanish Creole, which more than 43 million people spoke at home in 2023. There were also 3.5 million Chinese speakers (including both Mandarin and Cantonese), 1.8 million Tagalog speakers, and 1.57 million Vietnamese speakers counted in the United States that year.

Different languages at home

The percentage of people in the United States speaking a language other than English at home varies from state to state. The state with the highest percentage of population speaking a language other than English is California. About 45 percent of its population was speaking a language other than English at home in 2023.
Number of native Spanish speakers worldwide 2024, by country
statista.com
boostndoto.org
Cite
Statista, Number of native Spanish speakers worldwide 2024, by country [Dataset]. https://www.statista.com/statistics/991020/number-native-spanish-speakers-country-worldwide/
Dataset authored and provided by
Statista (http://statista.com/)
Area covered
World
Description
Mexico is the country with the largest number of native Spanish speakers in the world. As of 2024, 132.5 million people in Mexico spoke Spanish with a native command of the language. Colombia had the second-highest number of native Spanish speakers, at around 52.7 million. Spain came in third, with 48 million, and Argentina fourth, with 46 million.

Spanish, a world language

As of 2023, Spanish ranked as the fourth most spoken language in the world, behind only English, Chinese, and Hindi, with over half a billion speakers. Spanish is the official language of over 20 countries, most of them on the American continent, and it is also one of the official languages of Equatorial Guinea in Africa. Spanish also has a strong influence in other countries, such as the United States, Morocco, and Brazil, which are on the list of non-Hispanic countries with the highest number of Spanish speakers.

The second most spoken language in the U.S.

In the most recent data, Spanish ranked as the language other than English with the highest number of speakers in the U.S., with 12 times as many speakers as the language in second place. This comes as no surprise given the long history of migration from Latin American countries to the United States. Moreover, in fiscal year 2022 alone, 5 of the top 10 countries of origin of naturalized people in the U.S. were Spanish-speaking countries.
English proficiency in European countries in 2019
statista.com
Updated Jun 23, 2025
Cite
Statista (2025). English proficiency in European countries in 2019 [Dataset]. https://www.statista.com/statistics/990547/countries-in-europe-for-english/
Dataset updated
Jun 23, 2025
Dataset authored and provided by
Statista (http://statista.com/)
Time period covered
Mar 2019
Area covered
Europe
Description
This statistic presents the leading European countries by their level of English proficiency as of March 2019. According to data provided by Klazz, Sweden had the highest percentage of people who were proficient in English at ** percent of the population.
ENGLISH PROFICIENCY LEVEL
global-relocate.com
Updated Oct 29, 2024
Cite
Global Relocate (2024). ENGLISH PROFICIENCY LEVEL [Dataset]. https://global-relocate.com/rankings/english-proficiency-level
Explore at:
Dataset updated
Oct 29, 2024
Dataset provided by
Global Relocate
Description
Using data from reports such as the "English Proficiency Index" (EDU) from Education First, one can see the significant impact of culture, education and globalization on the ability of citizens of different countries to speak English.
Spanish speakers in countries where Spanish is not an official language 2024...
statista.com
Cite
Statista, Spanish speakers in countries where Spanish is not an official language 2024 [Dataset]. https://www.statista.com/statistics/1276290/number-spanish-speakers-non-hispanic-countries-worldwide/
Dataset authored and provided by
Statista (http://statista.com/)
Area covered
World
Description
The United States is the non-Hispanic country with the largest number of native Spanish speakers in the world, with approximately 41.89 million people with a native command of the language in 2024. However, the European Union had the largest group of non-native speakers with limited proficiency in Spanish, at around 28 million people. Overall, Mexico is the country with the largest number of native Spanish speakers in the world as of 2024.
LGA11 Non English Speaking Countries of Birth 2011
researchdata.edu.au
Updated Jun 28, 2023
Cite
Torrens University Australia - Public Health Information Development Unit (2023). LGA11 Non English Speaking Countries of Birth 2011 [Dataset]. https://researchdata.edu.au/lga11-non-english-birth-2011/2744967
Dataset updated
Jun 28, 2023
Dataset provided by
Australian Urban Research Infrastructure Network (AURIN)
Authors
Torrens University Australia - Public Health Information Development Unit
License
Attribution-NonCommercial-ShareAlike 3.0 (CC BY-NC-SA 3.0) https://creativecommons.org/licenses/by-nc-sa/3.0/
License information was derived automatically
Area covered

Description
People born in the ten most common non-English speaking background countries by LGA 2011, for the 2011.
Speech Accent Archive
kaggle.com
marketplace.sshopencloud.eu
zip
Updated Nov 6, 2017
Cite
Rachael Tatman (2017). Speech Accent Archive [Dataset]. https://www.kaggle.com/rtatman/speech-accent-archive
Available download formats
zip (907049873 bytes)
Dataset updated
Nov 6, 2017
Authors
Rachael Tatman
License
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0) https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
Description
Context:

Everyone who speaks a language, speaks it with an accent. A particular accent essentially reflects a person's linguistic background. When people listen to someone speak with a different accent from their own, they notice the difference, and they may even make certain biased social judgments about the speaker.

The speech accent archive is established to uniformly exhibit a large set of speech accents from a variety of language backgrounds. Native and non-native speakers of English all read the same English paragraph and are carefully recorded. The archive is constructed as a teaching tool and as a research tool. It is meant to be used by linguists as well as other people who simply wish to listen to and compare the accents of different English speakers.

This dataset allows you to compare the demographic and linguistic backgrounds of the speakers in order to determine which variables are key predictors of each accent. The speech accent archive demonstrates that accents are systematic rather than merely mistaken speech.

All of the linguistic analyses of the accents are available for public scrutiny. We welcome comments on the accuracy of our transcriptions and analyses.

Content:

This dataset contains 2140 speech samples, each from a different talker reading the same reading passage. Talkers come from 177 countries and have 214 different native languages. Each talker is speaking in English.

This dataset contains the following files:

reading-passage.txt: the text all speakers read

speakers_all.csv: demographic information on every speaker

recording: a zipped folder containing .mp3 files with speech
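
As a rough illustration of how the demographic file listed above might be explored, the sketch below counts speakers per value of one column. The column names (`speakerid`, `native_language`, `country`) and the in-memory sample are assumptions made for the example; check the actual header of speakers_all.csv before relying on them.

```python
import csv
import io
from collections import Counter


def count_by_column(csv_file, column):
    """Count rows per distinct value of `column` in an open CSV file."""
    reader = csv.DictReader(csv_file)
    return Counter(row[column] for row in reader)


# Tiny in-memory stand-in for speakers_all.csv; the column names are assumed.
sample = io.StringIO(
    "speakerid,native_language,country\n"
    "1,spanish,mexico\n"
    "2,english,usa\n"
    "3,spanish,spain\n"
)
counts = count_by_column(sample, "native_language")
print(counts.most_common(1))  # [('spanish', 2)]
```

With the real file, the same function could be called on `open("speakers_all.csv", newline="", encoding="utf-8")`.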

Acknowledgements:

This dataset was collected by many individuals (full list here) under the supervision of Steven H. Weinberger. The most up-to-date version of the archive is hosted by George Mason University. If you use this dataset in your work, please include the following citation:

Weinberger, S. (2013). Speech accent archive. George Mason University.

This dataset is distributed under a CC BY-NC-SA 2.0 license.

Inspiration:

The following types of people may find this dataset interesting:

ESL teachers who instruct non-native speakers of English

Actors who need to learn an accent

Engineers who train speech recognition machines

Linguists who do research on foreign accent

Phoneticians who teach phonetic transcription

Speech pathologists

Anyone who finds foreign accent to be interesting
English-Speaking Politicians
kaggle.com
zip
Updated Nov 10, 2020
Cite
maurice rupp (2020). English-Speaking Politicians [Dataset]. https://www.kaggle.com/datasets/mauricerupp/englishspeaking-politicians/code
Available download formats
zip (41917721 bytes)
Dataset updated
Nov 10, 2020
Authors
maurice rupp
Description
Content

This dataset contains speeches, interviews and press briefings from over 1,000 English-speaking politicians, spanning 1789 to 2020. The data was scraped from multiple internet sources, each of which is indicated in the column 'URL'.

Dataset Structure

Each speech is treated as one entry, where sentences of other people (e.g. in an interview) are removed. Every paragraph inside the speech is added after a newline (' '). There exist no newlines elsewhere in the data.
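
Given that structure, one speech entry can be broken back into paragraphs by splitting on the newline character. The sketch below assumes the delimiter is `"\n"` (the character inside the quotes above did not survive in this listing) and uses a made-up entry rather than real data.

```python
def split_paragraphs(speech: str) -> list[str]:
    """Split one speech entry into paragraphs on newline characters."""
    return [part.strip() for part in speech.split("\n") if part.strip()]


# Hypothetical entry; the real data stores one speech per row.
entry = "First paragraph of a speech.\nSecond paragraph.\nThird paragraph."
paragraphs = split_paragraphs(entry)
print(len(paragraphs))  # 3
```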

Cleaning

Noise tags, time stamps and inaudible words have been removed from the data.
Latin America: level of English proficiency 2023, by country
statista.com
Cite
Statista, Latin America: level of English proficiency 2023, by country [Dataset]. https://www.statista.com/statistics/1053066/english-proficiency-latin-america/
Dataset authored and provided by
Statista (http://statista.com/)
Time period covered
2023
Area covered
Latin America, Americas
Description
Argentina scored 562 out of a maximum of 800 points in the English Proficiency Index 2023. That was the highest score among all Latin American countries included in the survey. The Argentine capital, Buenos Aires, also received the highest English proficiency score among all the Latin American cities analyzed. Mexico and Haiti received the lowest scores in the region.
PHIDU - Birthplace - Non-English Speaking Residents (LGA) 2016
data.gov.au
ogc:wfs, wms
Cite
PHIDU - Birthplace - Non-English Speaking Residents (LGA) 2016 [Dataset]. https://data.gov.au/dataset/ds-aurin-64eff58ba84f022853d95cb6b099c9ce901f75ca56000165e8c41f415844400d
Available download formats
ogc:wfs, wms
Description
This dataset, released August 2017, contains the Australian resident population by birthplace, divided into English speaking (ES) and non-English speaking (NES) countries, 2016. The following countries are designated as ES: Canada, Ireland, New Zealand, South Africa, United Kingdom and the United States of America; the remaining countries are designated as NES. The dataset also includes the population of people born overseas who report poor proficiency in English. The data is by Local Government Area (LGA) 2016 geographic boundaries. For more information please see the data source notes on the data.

Source: Compiled by PHIDU based on the ABS Census of Population and Housing, August 2016.

Please note: AURIN has spatially enabled the original data.

"*" - Indicates statistically significant, at the 95% confidence level.
"**" - Indicates statistically significant, at the 99% confidence level.
"~" - Indicates modelled estimates have Relative Root Mean Square Errors (RRMSEs) from 0.25 to 0.50 and should be used with caution.
"~~" - Indicates modelled estimates have RRMSEs greater than 0.50 but less than 1 and are considered too unreliable for general use.
'?' - Indicates modelled estimates are considered too unreliable.
Blank cell - Indicates data was not shown/not applicable/not published/not available for the specific area ('#', '..', '^', 'np', 'n.a.', 'n.y.a.' in original PHIDU data).

Copyright attribution: Torrens University Australia - Public Health Information Development Unit, (2018): ; accessed from AURIN on 12/3/2020.

Licence type: Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Australia (CC BY-NC-SA 3.0 AU)
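
The RRMSE markers described in this entry can be read as thresholds. The sketch below maps an RRMSE value to the marker the description associates with it; two boundaries are assumptions, since the description does not state them: that values below 0.25 carry no marker, and that '?' corresponds to an RRMSE of 1 or more. The function name is hypothetical.

```python
def rrmse_marker(rrmse: float) -> str:
    """Return the reliability marker for a modelled estimate's RRMSE."""
    if rrmse < 0.25:
        return ""    # assumption: estimates below 0.25 are unmarked
    if rrmse <= 0.50:
        return "~"   # 0.25 to 0.50: use with caution
    if rrmse < 1.0:
        return "~~"  # above 0.50 but below 1: too unreliable for general use
    return "?"       # assumption: 1 or more is what '?' denotes


print(rrmse_marker(0.3))   # ~
print(rrmse_marker(0.75))  # ~~
```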
English Spoken at Home (7), French Spoken at Home (7), Aboriginal Language...
gimi9.com
Cite
English Spoken at Home (7), French Spoken at Home (7), Aboriginal Language Spoken at Home (7), Immigrant Language Spoken at Home (7), Mother Tongue (10), Age (15A) and Sex (3) for the Population Excluding Institutional Residents of Canada, Provinces and T | gimi9.com [Dataset]. https://gimi9.com/dataset/ca_66011e02-2782-4b4d-806d-87bcf5459cf1/
Explore at:
Area covered
Canada, French
Description
This table is part of a series of tables that present a portrait of Canada based on the various census topics. The tables range in complexity and levels of geography. Content varies from a simple overview of the country to complex cross-tabulations; the tables may also cover several censuses.
Democracy and English Indicators
scidb.cn
Updated Apr 12, 2024
Cite
Abdullah AlKhuraibet (2024). Democracy and English Indicators [Dataset]. http://doi.org/10.57760/sciencedb.16236
Explore at:
Croissant. Croissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
Unique identifier
https://doi.org/10.57760/sciencedb.16236
Dataset updated
Apr 12, 2024
Dataset provided by
Science Data Bank
Authors
Abdullah AlKhuraibet
License
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Description
The data collected aim to test whether English proficiency levels in a country are positively associated with higher democratic values in that country. English proficiency is sourced from Education First’s "EF English Proficiency Index", which covers countries' scores for the calendar years 2022 and 2021. The EF English Proficiency Index ranks 111 countries in five different categories based on their English proficiency scores, which were calculated from the test results of 2.1 million adults. Democratic values are operationalized through the liberal democracy index from the V-Dem Institute annual reports for 2022 and 2021. Additionally, the data is used to test whether English language media consumption acts as a mediating variable between English proficiency and democracy levels in a country, while also looking at other possible regression variables. The linear regression analyses for the data were conducted in Microsoft Excel.

The raw data set consists of 90 nation states in two years, 2022 and 2021. The raw data is used for two separate data sets. The first is democracy indicators, which has the regression variables EPI, HDI, and GDP. This table set has a total of 360 data entries. HDI scores are a statistical summary measure developed by the United Nations Development Programme (UNDP), which measures the levels of human development in 190 countries. The data for nominal gross domestic product (GDP) scores are sourced from the World Bank. Including strong regression variables that have been shown to have a positive link with democracy, such as GDP and HDI, allows the regression analysis to identify whether there is a true relationship between English proficiency and democracy levels in a country.
The second data set has a total of 720 data entries and aims to identify English proficiency indicators. It has 7 regression variables: LDI scores, Years of Mandatory English Education, Heads of State Publicly Speaking English, GDP PPP (2021 USD), Commonwealth, BBC web traffic and CNN web traffic. The data for years of mandatory English education is sourced from research at the University of Winnipeg and is coded in the data set based on the number of years a country has English as a mandatory subject. The range of this data is from 0 to 13 years of English being mandatory. It is important to note that this data only concerns public schools and does not extend to the private school systems in each country. The data for heads of state publicly speaking English was gathered through a video data analysis of all heads of state. Data was used only for heads of state who had been in their position for at least a year, to ensure the accuracy of the data collected; for heads of state who had not yet been in their position for a year, data was taken from the previous head of state. This data only takes into account speeches and interviews that were conducted during their incumbency. The data for each country’s GDP PPP scores are sourced from the World Bank, which last updated them for a majority of the countries in 2021, and are tied to the US dollar. Data for the Commonwealth only includes members of the Commonwealth that were historically colonized by the United Kingdom. Any country that falls under that category is coded as 1 and any country that does not is coded as 0. BBC and CNN web traffic data is sourced using tools in Semrush, which provide a rough estimate of how much web traffic each news site generates in each country. These estimates are used to identify the average web traffic for BBC News and CNN World News for both the 2021 and 2022 calendar years. 
The traffic for each country will also be measured per capita, per 10 thousand people to ensure that the population density of a country does not influence the results. The population of each country for both 2021 and 2022 is sourced from the United Nations revision of World Population Prospects of both 2021 and 2022 respectively.
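
The per-capita normalization described here is a simple rescaling: visits divided by population, multiplied by 10,000. A minimal sketch follows, with hypothetical function names and made-up input numbers; it is not taken from the dataset's own workbook.

```python
def traffic_per_10k(visits: float, population: int) -> float:
    """Scale a country's web traffic to visits per 10,000 inhabitants."""
    if population <= 0:
        raise ValueError("population must be positive")
    return visits * 10_000 / population


def two_year_average(value_2021: float, value_2022: float) -> float:
    """Average a measure over the 2021 and 2022 calendar years."""
    return (value_2021 + value_2022) / 2


# Made-up figures: 50,000 visits in a country of 5 million people.
print(traffic_per_10k(50_000, 5_000_000))  # 100.0
```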
English Spontaneous Dialogue speech dataset
kaggle.com
zip
Updated Jun 7, 2024
Cite
Frank Wong (2024). English Spontaneous Dialogue speech dataset [Dataset]. https://www.kaggle.com/datasets/nexdatafrank/english-spontaneous-dialogue-speech-dataset
Available download formats
zip (529776 bytes)
Dataset updated
Jun 7, 2024
Authors
Frank Wong
License
Attribution-NonCommercial 4.0 (CC BY-NC 4.0) https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
Description
English (United Kingdom) Spontaneous Dialogue Smartphone speech dataset

Description

English (United Kingdom) spontaneous dialogue smartphone speech dataset, collected from dialogues based on given topics. Transcribed with text content, timestamp, speaker's ID, gender and other attributes. The dataset was collected from an extensive and geographically diverse pool of speakers (around 500 native speakers), enhancing model performance in real and complex tasks. Quality tested by various AI companies. We strictly adhere to data protection regulations and privacy standards, ensuring the maintenance of user privacy and legal rights throughout the data collection, storage, and usage processes. Our datasets are all GDPR, CCPA, and PIPL compliant. For more details, please refer to the link: https://www.nexdata.ai/datasets/speechrecog/1393?source=Kaggle

Format

16kHz, 16bit, uncompressed wav, mono channel

Content category

Dialogue based on given topics

Recording condition

Low background noise (indoor)

Recording device

Android smartphone, iPhone

Country

The United Kingdom (GBK)

Language(Region) Code

en-GB

Language

English

Speaker

310 native speakers in total, 42% male and 58% female

Features of annotation

Transcription text, timestamp, speaker ID, gender, noise

Accuracy rate

Sentence accuracy rate (SAR) 95%

Licensing Information

Commercial License
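
The stated audio format (16 kHz, 16-bit, mono, uncompressed WAV) can be checked per file with Python's standard `wave` module, which only reads uncompressed PCM. This is a sketch only: the function name is hypothetical, and the clip is generated in memory so the example runs without the dataset.

```python
import io
import wave


def matches_spec(wav_file, rate=16_000, sample_width_bytes=2, channels=1) -> bool:
    """Check a WAV file against the stated format: 16 kHz, 16-bit, mono."""
    with wave.open(wav_file, "rb") as reader:
        return (
            reader.getframerate() == rate
            and reader.getsampwidth() == sample_width_bytes
            and reader.getnchannels() == channels
        )


# Build a one-second silent clip in memory as a stand-in for a real recording.
clip = io.BytesIO()
with wave.open(clip, "wb") as writer:
    writer.setnchannels(1)
    writer.setsampwidth(2)       # 2 bytes per sample = 16-bit
    writer.setframerate(16_000)
    writer.writeframes(b"\x00\x00" * 16_000)
clip.seek(0)
print(matches_spec(clip))  # True
```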
English Spoken at Home (7), French Spoken at Home (7), Aboriginal Language...
data.urbandatacentre.ca
Updated Oct 19, 2025
Cite
(2025). English Spoken at Home (7), French Spoken at Home (7), Aboriginal Language Spoken at Home (7), Immigrant Language Spoken at Home (7), Mother Tongue (10), Age (15A) and Sex (3) for the Population Excluding Institutional Residents of Canada, Provinces and Territories, Census Metropolitan Areas and Census Agglomerations, 2016 Census - 100% Data - Catalogue - Canadian Urban Data Catalogue (CUDC) [Dataset]. https://data.urbandatacentre.ca/dataset/gov-canada-66011e02-2782-4b4d-806d-87bcf5459cf1
Explore at:
Dataset updated
Oct 19, 2025
License
Open Government Licence - Canada 2.0 https://open.canada.ca/en/open-government-licence-canada
License information was derived automatically
Area covered
Canada, French
Description
This table is part of a series of tables that present a portrait of Canada based on the various census topics. The tables range in complexity and levels of geography. Content varies from a simple overview of the country to complex cross-tabulations; the tables may also cover several censuses.
Level of English proficiency Asia 2024, by country
statista.com
Cite
Statista, Level of English proficiency Asia 2024, by country [Dataset]. https://www.statista.com/statistics/1456015/asia-english-proficiency-ranking-by-country/
Dataset authored and provided by
Statista (http://statista.com/)
Time period covered
2024
Area covered
Asia, APAC
Description
Singapore scored 609 out of a maximum of 800 points in the English Proficiency Index 2024, the highest score across the selected Asian countries and territories. In contrast, Cambodia reached an English Proficiency Index score of 408 that year.
Top Languages Spoken in the United States
kaggle.com
zip
Updated Oct 22, 2022
Cite
The Devastator (2022). Top Languages Spoken in the United States [Dataset]. https://www.kaggle.com/datasets/thedevastator/top-languages-spoken-in-the-united-states
Available download formats
zip (356420 bytes)
Dataset updated
Oct 22, 2022
Authors
The Devastator
Area covered
United States
Description
Top Languages Spoken in the United States

The Impact of linguistics on Community and Business in America

About this dataset

Languages are an important part of daily life in the USA. Here is a table that shows the most common languages spoken in the USA, as well as a big spreadsheet which shows each CBSA (Core-Based Statistical Area, or urban area).

Language usage varies widely throughout the United States. According to the latest census data, over 350 different languages are represented in homes across the country. The following table and spreadsheet provide more detailed information on language usage throughout the various states and cities in the US:

Columns:

- index: Index column for dataframe
- Table with column headers in row 5 and row headers in column A: Contains language data for each CBSA (Core Based Statistical Area)
- Unnamed: 1: Rank of CBSA by total number of speakers of all languages
- Unnamed: 2: Name of CBSA
- Unnamed: 3: Population of CBSA
- Unnamed: 4: Percent of population that speaks English very well
- Unnamed: 5 through Unnamed: 58: Languages spoken by at least 0.1% of the population, with corresponding percentages
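
The "Unnamed: N" column names suggest the spreadsheet was loaded with its first row treated as the header, even though the column headers are said to sit in row 5. A minimal sketch of reading such a file with the real header row is shown below. The sample content, the function name, and the assumption that exactly four preamble rows precede the header are all illustrative rather than taken from the actual CSV.

```python
import csv
import io


def read_with_header_row(csv_file, header_row=5):
    """Read a CSV whose column names sit on a 1-based `header_row`."""
    reader = csv.reader(csv_file)
    for _ in range(header_row - 1):
        next(reader, None)  # discard title and preamble rows
    header = next(reader)
    return [dict(zip(header, row)) for row in reader]


# Hypothetical file layout: four preamble rows, then the header on row 5.
sample = io.StringIO(
    "Languages Spoken at Home\n"
    "\n"
    "Source: hypothetical preamble\n"
    "\n"
    "Rank,CBSA,Population\n"
    "1,Example Metro,1000\n"
)
rows = read_with_header_row(sample)
print(rows[0]["CBSA"])  # Example Metro
```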

How to use the dataset

This dataset can be used to understand the linguistic diversity of the United States, and to compare languages spoken across different states and cities.

This data can also be used to explore trends in language usage over time.

Businesses can use this dataset to identify which languages are most commonly spoken in the areas in which they operate and tailor their marketing or customer service accordingly.

Schools could use this dataset to plan language-learning programs based on the needs of their community.

Policymakers could use this data to better understand linguistic diversity in the United States and design programs to support bilingualism or multilingualism

Research Ideas

Businesses can use this dataset to identify which languages are most commonly spoken in the areas in which they operate and cater their marketing or customer service accordingly.

Schools could use this data to plan language-learning programs based on the needs of their community.

Policymakers could use this dataset to better understand linguistic diversity in the United States and design programs to support bilingualism or multilingualism

Acknowledgements

This dataset was created by Gary Hoover. The data was sourced from https://www.kaggle.com/garyhoov/us-languages

License

Unknown License - Please check the dataset description for more information.

Columns

File: Languages Spoken at Home by Urban Area = CBSA.csv

File: US Languages Spoken at Home 2014.csv

| Column name | Description |
|:-------------------------------------------------------------------|:--------------|
| Table with column headers in row 5 and row headers in column A | |
Selected Demographic, Cultural, Educational, Labour Force and Income...
open.canada.ca
datasets.ai
xml
Updated Mar 9, 2022
Cite
Statistics Canada (2022). Selected Demographic, Cultural, Educational, Labour Force and Income Characteristics (725), First Official Language Spoken (4) and Sex (3) for Population Having English, French or English and French as First Official Language Spoken, for Canada, Provinces, Territories, Census Divisions and Census Subdivisions, 2001 Census - 20% Sample Data [Dataset]. https://open.canada.ca/data/dataset/3f8f670e-a143-4880-897a-d849afe7f8f2
Available download formats
xml
Dataset updated
Mar 9, 2022
Dataset provided by
Statistics Canada (https://statcan.gc.ca/en)
License
Open Government Licence - Canada 2.0 https://open.canada.ca/en/open-government-licence-canada
License information was derived automatically
Area covered
Canada, French
Description
This table is part of a series of tables that present a portrait of Canada based on the various census topics. The tables range in complexity and levels of geography. Content varies from a simple overview of the country to complex cross-tabulations; the tables may also cover several censuses.
Anglosphere Baby Names
kaggle.com
zip
Updated Feb 19, 2025
Cite
Katherine Koopmans (2025). Anglosphere Baby Names [Dataset]. https://www.kaggle.com/datasets/kkoops/anglosphere-baby-names
Available download formats
zip (24210503 bytes)
Dataset updated
Feb 19, 2025
Authors
Katherine Koopmans
License
https://www.gnu.org/licenses/gpl-3.0.html
Description
This dataset contains two tables of baby name data for 8 anglosphere regions: Australia, Canada, England and Wales (grouped together), Ireland, Northern Ireland, New Zealand, Scotland and the USA. It can be used to compare the popularity of names in the anglosphere over time, and I have used it to determine the "Country-ness" of each particular name, or, how much more popular it is in its most popular country as compared to all the other countries.
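
The description does not give the exact "Country-ness" formula. One plausible reading, offered purely as an assumption and not as the author's method, is the ratio between a name's popularity in its most popular country and its average popularity across the remaining countries. The function name and the figures below are made up.

```python
def country_ness(rate_by_country: dict[str, float]) -> tuple[str, float]:
    """Return the top country for a name and how many times more popular
    the name is there than on average across the other countries."""
    if len(rate_by_country) < 2:
        raise ValueError("need at least two countries to compare")
    top = max(rate_by_country, key=rate_by_country.get)
    others = [rate for country, rate in rate_by_country.items() if country != top]
    baseline = sum(others) / len(others)
    if baseline == 0:
        return top, float("inf")  # the name appears only in the top country
    return top, rate_by_country[top] / baseline


# Hypothetical rates per 10,000 births for a single name.
rates = {"Scotland": 60, "USA": 10, "Australia": 20}
print(country_ness(rates))  # ('Scotland', 4.0)
```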
Data from: Immigrant Second Generation in Metropolitan New York
icpsr.umich.edu
ascii, delimited, sas +2
Updated Apr 1, 2011
Cite
Mollenkopf, John; Kasinitz, Philip; Waters, Mary (2011). Immigrant Second Generation in Metropolitan New York [Dataset]. http://doi.org/10.3886/ICPSR30302.v1
Available download formats
delimited, spss, sas, stata, ascii
Unique identifier
https://doi.org/10.3886/ICPSR30302.v1
Dataset updated
Apr 1, 2011
Dataset provided by
Inter-university Consortium for Political and Social Research (https://www.icpsr.umich.edu/web/pages/)
Authors
Mollenkopf, John; Kasinitz, Philip; Waters, Mary
License
https://www.icpsr.umich.edu/web/ICPSR/studies/30302/terms
Time period covered
1999
Area covered
United States, New York, New York (state)
Description
The study analyzes the forces leading to or impeding the assimilation of 18- to 32-year-olds from immigrant backgrounds that vary in terms of race, language, and the mix of skills and liabilities their parents brought to the United States. To make sure that what we find derives specifically from growing up in an immigrant family, rather than simply being a young person in New York, a comparison group of people from native born White, Black, and Puerto Rican backgrounds was also studied. The sample was drawn from New York City (except for Staten Island) and the surrounding counties in the inner part of the New York-New Jersey metropolitan region where the vast majority of immigrants and native born minority group members live and grow up. The study groups make possible a number of interesting comparisons. Unlike many other immigrant groups, the West Indian first generation speaks English, but the dominant society racially classifies them as Black. The study explored how their experiences resemble or differ from native born African Americans. Dominicans and the Colombian-Peruvian-Ecuadoran population both speak Spanish, but live in different parts of New York, have different class backgrounds prior to immigration, and, quite often, different skin tones. The study compared them to Puerto Rican young people, who, along with their parents, have the benefit of citizenship. Chinese immigrants from the mainland tend to have little education, while young people with overseas Chinese parents come from families with higher incomes, more education, and more English fluency. Respondents were divided into eight groups depending on their parents' origin. 
Those of immigrant ancestry include: Jewish immigrants from the former Soviet Union; Chinese immigrants from the mainland, Taiwan, Hong Kong, and the Chinese Diaspora; immigrants from the Dominican Republic; immigrants from the English-speaking countries of the West Indies (including Guyana but excluding Haiti and those of Indian origin); and immigrants from Colombia, Ecuador, and Peru. These groups composed 44 percent of the 2000 second-generation population in the defined sample area. For comparative purposes, Whites, Blacks, and Puerto Ricans who were born in the United States and whose parents were born in the United States or Puerto Rico were also interviewed. To be eligible, a respondent had to have a parent from one of these groups. If the respondent was eligible for two groups, he or she was asked which designation he or she preferred. The ability to compare these groups with native born Whites, Blacks, and Puerto Ricans permits researchers to investigate the effects of nativity while controlling for race and language background. About two-thirds of second-generation respondents were born in the United States, mostly in New York City, while one-third were born abroad but arrived in the United States by age 12 and had lived in the country for at least 10 years, except for those from the former Soviet Union, some of whom arrived past the age of 12. The project began with a pilot study in July 1996. Survey data collection took place between November 1999 and December 1999. The study includes demographic variables such as race, ethnicity, language, age, education, income, family size, country of origin, and citizenship status.
Selected Demographic, Cultural, Educational, Labour Force and Income...
gimi9.com
Updated May 3, 2012
Cite
(2012). Selected Demographic, Cultural, Educational, Labour Force and Income Characteristics (725), First Official Language Spoken (4) and Sex (3) for Population Having English, French or English and French as First Official Language Spoken, for Canada, Provinces | gimi9.com [Dataset]. https://gimi9.com/dataset/ca_3f8f670e-a143-4880-897a-d849afe7f8f2/
Dataset updated
May 3, 2012
Area covered
Canada, French
Description
This table is part of a series of tables that present a portrait of Canada based on the various census topics. The tables range in complexity and levels of geography. Content varies from a simple overview of the country to complex cross-tabulations; the tables may also cover several censuses.

