57 datasets found
  1. The most spoken languages worldwide 2025

    • statista.com
    Cite
    Statista, The most spoken languages worldwide 2025 [Dataset]. https://www.statista.com/statistics/266808/the-most-spoken-languages-worldwide/
    Explore at:
    Dataset authored and provided by
    Statista (http://statista.com/)
    Time period covered
    2025
    Area covered
    World
    Description

    In 2025, there were around 1.53 billion people worldwide who spoke English either natively or as a second language, slightly more than the 1.18 billion Mandarin Chinese speakers at the time of survey. Hindi and Spanish were the third and fourth most widely spoken languages that year.

    Languages in the United States

    The United States does not have an official language, but the country uses English, specifically American English, for legislation, regulation, and other official pronouncements. The United States is a land of immigration, and the languages spoken there vary as a result of its multicultural population. The second most common language spoken in the United States is Spanish or Spanish Creole, which more than 43 million people spoke at home in 2023. There were also 3.5 million Chinese speakers (including both Mandarin and Cantonese), 1.8 million Tagalog speakers, and 1.57 million Vietnamese speakers counted in the United States that year.

    Different languages at home

    The percentage of people in the United States speaking a language other than English at home varies from state to state. The state with the highest percentage is California, where about 45 percent of the population spoke a language other than English at home in 2023.

  2. Number of native Spanish speakers worldwide 2024, by country

    • statista.com
    • boostndoto.org
    • +5more
    Cite
    Statista, Number of native Spanish speakers worldwide 2024, by country [Dataset]. https://www.statista.com/statistics/991020/number-native-spanish-speakers-country-worldwide/
    Explore at:
    Dataset authored and provided by
    Statista (http://statista.com/)
    Area covered
    World
    Description

    Mexico is the country with the largest number of native Spanish speakers in the world. As of 2024, 132.5 million people in Mexico spoke Spanish with a native command of the language. Colombia was the nation with the second-highest number of native Spanish speakers, at around 52.7 million. Spain came in third, with 48 million, and Argentina fourth, with 46 million.

    Spanish, a world language

    As of 2023, Spanish ranked as the fourth most spoken language in the world, behind only English, Chinese, and Hindi, with over half a billion speakers. Spanish is the official language of over 20 countries, the majority on the American continent; it is also one of the official languages of Equatorial Guinea in Africa. It has a strong presence in other countries as well, such as the United States, Morocco, and Brazil, which are included in the list of non-Hispanic countries with the highest number of Spanish speakers.

    The second most spoken language in the U.S.

    In the most recent data, Spanish ranked as the language other than English with the highest number of speakers, with 12 times as many speakers as the language in second place. This comes as no surprise given the long history of migration from Latin American countries to the United States. Moreover, in fiscal year 2022 alone, 5 of the top 10 countries of origin of naturalized people in the U.S. were Spanish-speaking countries.

  3. English proficiency in European countries in 2019

    • statista.com
    Updated Jun 23, 2025
    Cite
    Statista (2025). English proficiency in European countries in 2019 [Dataset]. https://www.statista.com/statistics/990547/countries-in-europe-for-english/
    Explore at:
    Dataset updated
    Jun 23, 2025
    Dataset authored and provided by
    Statista (http://statista.com/)
    Time period covered
    Mar 2019
    Area covered
    Europe
    Description

    This statistic presents the leading European countries by their level of English proficiency as of March 2019. According to data provided by Klazz, Sweden had the highest percentage of people who were proficient in English at ** percent of the population.

  4. ENGLISH PROFICIENCY LEVEL

    • global-relocate.com
    Updated Oct 29, 2024
    Cite
    Global Relocate (2024). ENGLISH PROFICIENCY LEVEL [Dataset]. https://global-relocate.com/rankings/english-proficiency-level
    Explore at:
    Dataset updated
    Oct 29, 2024
    Dataset provided by
    Global Relocate
    Description

    Using data from reports such as the "English Proficiency Index" (EDU) from Education First, one can see the significant impact of culture, education and globalization on the ability of citizens of different countries to speak English.

  5. Spanish speakers in countries where Spanish is not an official language 2024...

    • statista.com
    Cite
    Statista, Spanish speakers in countries where Spanish is not an official language 2024 [Dataset]. https://www.statista.com/statistics/1276290/number-spanish-speakers-non-hispanic-countries-worldwide/
    Explore at:
    Dataset authored and provided by
    Statista (http://statista.com/)
    Area covered
    World
    Description

    The United States is the non-Hispanic country with the largest number of native Spanish speakers in the world, with approximately 41.89 million people with a native command of the language in 2024. However, the European Union had the largest group of non-native speakers with limited proficiency in Spanish, at around 28 million people. Furthermore, Mexico is the country with the largest number of native Spanish speakers in the world as of 2024.

  6. LGA11 Non English Speaking Countries of Birth 2011

    • researchdata.edu.au
    Updated Jun 28, 2023
    + more versions
    Cite
    Torrens University Australia - Public Health Information Development Unit (2023). LGA11 Non English Speaking Countries of Birth 2011 [Dataset]. https://researchdata.edu.au/lga11-non-english-birth-2011/2744967
    Explore at:
    Dataset updated
    Jun 28, 2023
    Dataset provided by
    Australian Urban Research Infrastructure Network (AURIN)
    Authors
    Torrens University Australia - Public Health Information Development Unit
    License

    Attribution-NonCommercial-ShareAlike 3.0 (CC BY-NC-SA 3.0): https://creativecommons.org/licenses/by-nc-sa/3.0/
    License information was derived automatically

    Area covered
    Description

    People born in the ten most common non-English speaking background countries by LGA 2011, for the 2011.

  7. Speech Accent Archive

    • kaggle.com
    • marketplace.sshopencloud.eu
    zip
    Updated Nov 6, 2017
    Cite
    Rachael Tatman (2017). Speech Accent Archive [Dataset]. https://www.kaggle.com/rtatman/speech-accent-archive
    Explore at:
    Available download formats: zip (907049873 bytes)
    Dataset updated
    Nov 6, 2017
    Authors
    Rachael Tatman
    License

    Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0): https://creativecommons.org/licenses/by-nc-sa/4.0/
    License information was derived automatically

    Description

    Context:

    Everyone who speaks a language, speaks it with an accent. A particular accent essentially reflects a person's linguistic background. When people listen to someone speak with a different accent from their own, they notice the difference, and they may even make certain biased social judgments about the speaker.

    The speech accent archive is established to uniformly exhibit a large set of speech accents from a variety of language backgrounds. Native and non-native speakers of English all read the same English paragraph and are carefully recorded. The archive is constructed as a teaching tool and as a research tool. It is meant to be used by linguists as well as other people who simply wish to listen to and compare the accents of different English speakers.

    This dataset allows you to compare the demographic and linguistic backgrounds of the speakers in order to determine which variables are key predictors of each accent. The speech accent archive demonstrates that accents are systematic rather than merely mistaken speech.

    All of the linguistic analyses of the accents are available for public scrutiny. We welcome comments on the accuracy of our transcriptions and analyses.

    Content:

    This dataset contains 2140 speech samples, each from a different talker reading the same reading passage. Talkers come from 177 countries and have 214 different native languages. Each talker is speaking in English.

    This dataset contains the following files:

    • reading-passage.txt: the text all speakers read
    • speakers_all.csv: demographic information on every speaker
    • recording: a zipped folder containing .mp3 files with speech
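
    As a minimal sketch of how the speaker metadata might be explored, the snippet below counts speakers per native language using only the Python standard library. The column names `native_language` and `country` and the sample rows are assumptions for illustration; check the header of `speakers_all.csv` before relying on them.

```python
import csv
import io
from collections import Counter

# Hypothetical sample standing in for speakers_all.csv; the real file's
# column names and values may differ, so treat these as assumptions.
SAMPLE = """speakerid,native_language,country
1,spanish,mexico
2,english,usa
3,spanish,colombia
4,mandarin,china
"""

def count_by_column(csv_file, column):
    """Count rows per distinct value of `column` in an open CSV file."""
    reader = csv.DictReader(csv_file)
    return Counter(row[column] for row in reader)

counts = count_by_column(io.StringIO(SAMPLE), "native_language")
print(counts.most_common(1))  # [('spanish', 2)]
```

    To run the same function on the downloaded file, pass it a handle such as `open("speakers_all.csv", newline="", encoding="utf-8")` instead of the in-memory sample.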

    Acknowledgements:

    This dataset was collected by many individuals (full list here) under the supervision of Steven H. Weinberger. The most up-to-date version of the archive is hosted by George Mason University. If you use this dataset in your work, please include the following citation:

    Weinberger, S. (2013). Speech accent archive. George Mason University.

    This dataset is distributed under a CC BY-NC-SA 2.0 license.

    Inspiration:

    The following types of people may find this dataset interesting:

    • ESL teachers who instruct non-native speakers of English
    • Actors who need to learn an accent
    • Engineers who train speech recognition machines
    • Linguists who do research on foreign accent
    • Phoneticians who teach phonetic transcription
    • Speech pathologists
    • Anyone who finds foreign accent to be interesting
  8. English-Speaking Politicians

    • kaggle.com
    zip
    Updated Nov 10, 2020
    Cite
    maurice rupp (2020). English-Speaking Politicians [Dataset]. https://www.kaggle.com/datasets/mauricerupp/englishspeaking-politicians/code
    Explore at:
    Available download formats: zip (41917721 bytes)
    Dataset updated
    Nov 10, 2020
    Authors
    maurice rupp
    Description

    Content

    This dataset contains speeches, interviews and press briefings from over 1,000 English-speaking politicians, covering the period from 1789 to 2020. The data was scraped from multiple internet sources, each of which is indicated in the column 'URL'.

    Dataset Structure

    Each speech is treated as one entry, where sentences of other people (e.g. in an interview) are removed. Every paragraph inside the speech is added after a newline (' '). There exist no newlines elsewhere in the data.
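
    The structure described above can be illustrated with a short sketch that splits one speech entry back into its paragraphs. It assumes the paragraph separator is the newline character `\n`; the sample text is hypothetical and not taken from the dataset.

```python
def split_paragraphs(speech):
    """Split one speech entry into its paragraphs on newline characters."""
    return [part.strip() for part in speech.split("\n") if part.strip()]

# Hypothetical entry; real entries come from the dataset's text column.
entry = "Thank you all for coming.\nToday I want to talk about trade.\nGood night."
paragraphs = split_paragraphs(entry)
print(len(paragraphs))  # 3
```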

    Cleaning

    Noise tags, time stamps and inaudible words have been removed from the data.

  9. Latin America: level of English proficiency 2023, by country

    • statista.com
    Cite
    Statista, Latin America: level of English proficiency 2023, by country [Dataset]. https://www.statista.com/statistics/1053066/english-proficiency-latin-america/
    Explore at:
    Dataset authored and provided by
    Statista (http://statista.com/)
    Time period covered
    2023
    Area covered
    Latin America, Americas
    Description

    Argentina scored 562 out of a maximum of 800 points in the English Proficiency Index 2023. That was the highest score among all Latin American countries included in the survey. The Argentine capital, Buenos Aires, also received the highest English proficiency score among all the Latin American cities analyzed. Mexico and Haiti received the lowest scores in the region.

  10. PHIDU - Birthplace - Non-English Speaking Residents (LGA) 2016

    • data.gov.au
    ogc:wfs, wms
    + more versions
    Cite
    PHIDU - Birthplace - Non-English Speaking Residents (LGA) 2016 [Dataset]. https://data.gov.au/dataset/ds-aurin-64eff58ba84f022853d95cb6b099c9ce901f75ca56000165e8c41f415844400d
    Explore at:
    Available download formats: ogc:wfs, wms
    Description

    This dataset, released August 2017, contains the Australian resident population by birthplace, divided into English speaking (ES) and non-English speaking (NES) countries, 2016. The following countries are designated as ES: Canada, Ireland, New Zealand, South Africa, United Kingdom and the United States of America; the remaining countries are designated as NES. The dataset also includes the population of people born overseas who report poor proficiency in English. The data is by Local Government Area (LGA) 2016 geographic boundaries. For more information please see the data source notes on the data. Source: Compiled by PHIDU based on the ABS Census of Population and Housing, August 2016. Please note: AURIN has spatially enabled the original data.

    • "*" - Indicates statistically significant, at the 95% confidence level.
    • "**" - Indicates statistically significant, at the 99% confidence level.
    • "~" - Indicates modelled estimates have Relative Root Mean Square Errors (RRMSEs) from 0.25 to 0.50 and should be used with caution.
    • "~~" - Indicates modelled estimates have RRMSEs greater than 0.50 but less than 1 and are considered too unreliable for general use.
    • '?' - Indicates modelled estimates are considered too unreliable.
    • Blank cell - Indicates data was not shown/not applicable/not published/not available for the specific area ('#', '..', '^', 'np', 'n.a.', 'n.y.a.' in original PHIDU data).

    Copyright attribution: Torrens University Australia - Public Health Information Development Unit, (2018): ; accessed from AURIN on 12/3/2020. Licence type: Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Australia (CC BY-NC-SA 3.0 AU)
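
    The flag notation listed above lends itself to a small parsing helper. The sketch below is only illustrative: it assumes each cell is a string holding an optional number with a flag attached before or after it (for example `12.5**` or `~40`), which is a hypothetical layout and may not match how the flags are actually stored in the published files.

```python
import re

# Flag meanings as listed in the dataset description.
FLAGS = {
    "*": "significant at the 95% confidence level",
    "**": "significant at the 99% confidence level",
    "~": "RRMSE from 0.25 to 0.50, use with caution",
    "~~": "RRMSE above 0.50 and below 1, too unreliable for general use",
    "?": "too unreliable",
}

# Assumed cell layout: [flag] [number] [flag]; every part is optional.
FLAG = r"\*\*|\*|~~|~|\?"
CELL = re.compile(
    rf"^\s*(?P<pre>{FLAG})?\s*(?P<num>-?\d+(?:\.\d+)?)?\s*(?P<post>{FLAG})?\s*$"
)

def parse_cell(cell):
    """Return (value, flag_meaning); value is None for blank or flag-only cells."""
    match = CELL.match(cell)
    if match is None:
        raise ValueError(f"unrecognised cell: {cell!r}")
    flag = match.group("pre") or match.group("post")
    number = match.group("num")
    value = float(number) if number is not None else None
    return value, FLAGS.get(flag)

print(parse_cell("12.5**"))
print(parse_cell("~40"))
print(parse_cell(""))
```

    A blank cell comes back as `(None, None)`, which keeps "not available" distinct from a flagged but present estimate.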

  11. English Spoken at Home (7), French Spoken at Home (7), Aboriginal Language...

    • gimi9.com
    + more versions
    Cite
    English Spoken at Home (7), French Spoken at Home (7), Aboriginal Language Spoken at Home (7), Immigrant Language Spoken at Home (7), Mother Tongue (10), Age (15A) and Sex (3) for the Population Excluding Institutional Residents of Canada, Provinces and T | gimi9.com [Dataset]. https://gimi9.com/dataset/ca_66011e02-2782-4b4d-806d-87bcf5459cf1/
    Explore at:
    Area covered
    Canada, French
    Description

    This table is part of a series of tables that present a portrait of Canada based on the various census topics. The tables range in complexity and levels of geography. Content varies from a simple overview of the country to complex cross-tabulations; the tables may also cover several censuses.

  12. Democracy and English Indicators

    • scidb.cn
    Updated Apr 12, 2024
    Cite
    Abdullah AlKhuraibet (2024). Democracy and English Indicators [Dataset]. http://doi.org/10.57760/sciencedb.16236
    Explore at:
    Croissant: Croissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Apr 12, 2024
    Dataset provided by
    Science Data Bank
    Authors
    Abdullah AlKhuraibet
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    The data were collected to test whether English proficiency levels in a country are positively associated with higher democratic values in that country. English proficiency is sourced from Education First’s "EF English Proficiency Index", covering countries' scores for the calendar years 2022 and 2021. The EF English Proficiency Index ranks 111 countries in five categories based on English proficiency scores calculated from the test results of 2.1 million adults. Democratic values are operationalized through the liberal democracy index (LDI) from the V-Dem Institute annual reports for 2022 and 2021. Additionally, the data are used to test whether English-language media consumption acts as a mediating variable between English proficiency and democracy levels in a country, while also looking at other possible regression variables. The linear regression analyses were conducted in Microsoft Excel.

    The raw data set consists of 90 nation states over two years, 2022 and 2021, and feeds two separate data sets. The first, democracy indicators, has the regression variables EPI, HDI, and GDP, with a total of 360 data entries. HDI is a statistical summary measure developed by the United Nations Development Programme (UNDP) that measures levels of human development in 190 countries. Nominal gross domestic product (GDP) figures are sourced from the World Bank. Including regression variables with an established positive link to democracy, such as GDP and HDI, allows the regression analysis to identify whether there is a true relationship between English proficiency and democracy levels in a country.

    The second data set has a total of 720 data entries and aims to identify English proficiency indicators. It has seven regression variables: LDI scores, years of mandatory English education, heads of state publicly speaking English, GDP PPP (2021 USD), Commonwealth, BBC web traffic, and CNN web traffic. The data for years of mandatory English education are sourced from research at the University of Winnipeg and are coded as the number of years a country has English as a mandatory subject, ranging from 0 to 13. This variable only concerns public schools and does not extend to the private school systems in each country. The data for heads of state publicly speaking English come from a video analysis of all heads of state. To ensure accuracy, data were only used for heads of state who had been in their position for at least a year; for those who had not, data were taken from the previous head of state. Only speeches and interviews conducted during incumbency are taken into account. Each country’s GDP PPP figures are sourced from the World Bank, were last updated for a majority of the countries in 2021, and are expressed in US dollars. The Commonwealth variable only includes Commonwealth members that were historically colonized by the United Kingdom; any such country is coded as 1 and any other country as 0. BBC and CNN web traffic is sourced from tools in Semrush, which provide a rough estimate of how much web traffic each news site generates in each country. These estimates are used to identify the average web traffic for BBC News and CNN World News for both the 2021 and 2022 calendar years.

    The traffic for each country is also measured per capita, per 10 thousand people, so that the size of a country's population does not influence the results. The population of each country for both 2021 and 2022 is sourced from the United Nations revision of World Population Prospects of 2021 and 2022 respectively.
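
    The per-capita normalization mentioned in the description can be sketched as follows. The function name and the figures are hypothetical; the description only states that traffic is expressed per 10 thousand people, so the exact formula used by the author is an assumption.

```python
def visits_per_10k(visits, population):
    """Scale a raw visit count to visits per 10,000 inhabitants."""
    if population <= 0:
        raise ValueError("population must be positive")
    return visits * 10_000 / population

# Hypothetical figures, not taken from the dataset.
rate = visits_per_10k(visits=250_000, population=5_000_000)
print(rate)  # 500.0
```

    Expressing traffic this way lets a small country and a large one be compared on the same scale before the values enter a regression.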

  13. English Spontaneous Dialogue speech dataset

    • kaggle.com
    zip
    Updated Jun 7, 2024
    + more versions
    Cite
    Frank Wong (2024). English Spontaneous Dialogue speech dataset [Dataset]. https://www.kaggle.com/datasets/nexdatafrank/english-spontaneous-dialogue-speech-dataset
    Explore at:
    Available download formats: zip (529776 bytes)
    Dataset updated
    Jun 7, 2024
    Authors
    Frank Wong
    License

    Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
    License information was derived automatically

    Description

    English (United Kingdom) Spontaneous Dialogue Smartphone speech dataset

    Description

    English (United Kingdom) spontaneous dialogue smartphone speech dataset, collected from dialogues based on given topics. Transcribed with text content, timestamp, speaker's ID, gender and other attributes. Our dataset was collected from a large and geographically diverse pool of speakers (around 500 native speakers), enhancing model performance in real and complex tasks. Quality tested by various AI companies. We strictly adhere to data protection regulations and privacy standards, ensuring the maintenance of user privacy and legal rights throughout the data collection, storage, and usage processes. Our datasets all comply with GDPR, CCPA, and PIPL. For more details, please refer to the link: https://www.nexdata.ai/datasets/speechrecog/1393?source=Kaggle

    Format

    16kHz, 16bit, uncompressed wav, mono channel

    Content category

    Dialogue based on given topics

    Recording condition

    Low background noise (indoor)

    Recording device

    Android smartphone, iPhone

    Country

    The United Kingdom(GBK)

    Language(Region) Code

    en-GB

    Language

    English

    Speaker

    310 native speakers in total, 42% male and 58% female

    Features of annotation

    Transcription text, timestamp, speaker ID, gender, noise

    Accuracy rate

    Sentence accuracy rate(SAR) 95%

    Licensing Information

    Commercial License
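
    The stated audio format (16 kHz, 16-bit, uncompressed WAV, mono) is easy to verify before training on the recordings. The sketch below uses Python's standard `wave` module; it builds a silent clip in memory so that it runs without the dataset, and the function name is a hypothetical helper rather than anything shipped with the data.

```python
import io
import wave

def matches_stated_format(wav_file):
    """Check a WAV file or stream for 16 kHz, 16-bit, mono, uncompressed PCM."""
    with wave.open(wav_file, "rb") as reader:
        return (
            reader.getframerate() == 16_000
            and reader.getsampwidth() == 2      # 2 bytes per sample = 16 bits
            and reader.getnchannels() == 1      # mono
            and reader.getcomptype() == "NONE"  # uncompressed
        )

# Build a one-second silent clip in memory as a stand-in for a real recording.
buffer = io.BytesIO()
with wave.open(buffer, "wb") as writer:
    writer.setnchannels(1)
    writer.setsampwidth(2)
    writer.setframerate(16_000)
    writer.writeframes(b"\x00\x00" * 16_000)
buffer.seek(0)

ok = matches_stated_format(buffer)
print(ok)  # True
```

    For files on disk, pass the path string to `matches_stated_format` instead of the in-memory buffer.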

  14. English Spoken at Home (7), French Spoken at Home (7), Aboriginal Language...

    • data.urbandatacentre.ca
    Updated Oct 19, 2025
    + more versions
    Cite
    (2025). English Spoken at Home (7), French Spoken at Home (7), Aboriginal Language Spoken at Home (7), Immigrant Language Spoken at Home (7), Mother Tongue (10), Age (15A) and Sex (3) for the Population Excluding Institutional Residents of Canada, Provinces and Territories, Census Metropolitan Areas and Census Agglomerations, 2016 Census - 100% Data - Catalogue - Canadian Urban Data Catalogue (CUDC) [Dataset]. https://data.urbandatacentre.ca/dataset/gov-canada-66011e02-2782-4b4d-806d-87bcf5459cf1
    Explore at:
    Dataset updated
    Oct 19, 2025
    License

    Open Government Licence - Canada 2.0: https://open.canada.ca/en/open-government-licence-canada
    License information was derived automatically

    Area covered
    Canada, French
    Description

    This table is part of a series of tables that present a portrait of Canada based on the various census topics. The tables range in complexity and levels of geography. Content varies from a simple overview of the country to complex cross-tabulations; the tables may also cover several censuses.

  15. Level of English proficiency Asia 2024, by country

    • statista.com
    Cite
    Statista, Level of English proficiency Asia 2024, by country [Dataset]. https://www.statista.com/statistics/1456015/asia-english-proficiency-ranking-by-country/
    Explore at:
    Dataset authored and provided by
    Statista (http://statista.com/)
    Time period covered
    2024
    Area covered
    Asia, APAC, Asia
    Description

    Singapore scored 609 out of a maximum of 800 points in the English Proficiency Index 2024, the highest score across the selected Asian countries and territories. In contrast, Cambodia reached an English Proficiency Index score of 408 that year.

  16. Top Languages Spoken in the United States

    • kaggle.com
    zip
    Updated Oct 22, 2022
    Cite
    The Devastator (2022). Top Languages Spoken in the United States [Dataset]. https://www.kaggle.com/datasets/thedevastator/top-languages-spoken-in-the-united-states
    Explore at:
    Available download formats: zip (356420 bytes)
    Dataset updated
    Oct 22, 2022
    Authors
    The Devastator
    Area covered
    United States
    Description

    Top Languages Spoken in the United States

    The Impact of linguistics on Community and Business in America

    About this dataset

    Languages are an important part of daily life in the USA. Here is a table that shows the most common languages spoken in the USA, as well as a big spreadsheet which shows each CBSA (Core-Based Statistical Area, or urban area).

    Language usage varies widely throughout the United States. According to the latest census data, over 350 different languages are represented in homes across the country. The following table and spreadsheet provide more detailed information on language usage throughout the various states and cities in the US:

    Columns:

    • index: Index column for dataframe
    • Table with column headers in row 5 and row headers in column A: Contains language data for each CBSA (Core Based Statistical Area)
    • Unnamed: 1: Rank of CBSA by total number of speakers of all languages
    • Unnamed: 2: Name of CBSA
    • Unnamed: 3: Population of CBSA
    • Unnamed: 4: Percent of population that speaks English very well
    • Unnamed: 5 through Unnamed: 58: Languages spoken by at least 0.1% of the population, with corresponding percentages
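
    Because most columns carry placeholder names, a first step when working with this file is usually to rename them. The sketch below maps the placeholders from the column list above to descriptive labels; the labels on the right-hand side (`cbsa_rank`, `language_1`, and so on) are hypothetical names chosen for readability, not names used by the dataset.

```python
# Left: placeholder names from the column list. Right: hypothetical labels.
RENAMES = {
    "Unnamed: 1": "cbsa_rank",
    "Unnamed: 2": "cbsa_name",
    "Unnamed: 3": "population",
    "Unnamed: 4": "pct_english_very_well",
}

def column_number(name):
    """Return N for a name of the form 'Unnamed: N', otherwise None."""
    prefix = "Unnamed: "
    suffix = name[len(prefix):]
    if name.startswith(prefix) and suffix.isdigit():
        return int(suffix)
    return None

def rename_columns(header):
    """Replace placeholder names; columns 5 to 58 become language_1 to language_54."""
    renamed = []
    for name in header:
        number = column_number(name)
        if name in RENAMES:
            renamed.append(RENAMES[name])
        elif number is not None and 5 <= number <= 58:
            renamed.append(f"language_{number - 4}")
        else:
            renamed.append(name)
    return renamed

header = ["index", "Unnamed: 1", "Unnamed: 2", "Unnamed: 5", "Unnamed: 58"]
print(rename_columns(header))
```

    The actual language names sit in the spreadsheet's own header row (row 5 according to the description), so a fuller cleanup would read them from there rather than numbering the columns.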

    How to use the dataset

    1. This dataset can be used to understand the linguistic diversity of the United States, and to compare languages spoken across different states and cities.
    2. This data can also be used to explore trends in language usage over time.
    3. Businesses can use this dataset to identify which languages are most commonly spoken in the areas in which they operate and tailor their marketing or customer service accordingly.
    4. Schools could use this dataset to plan language-learning programs based on the needs of their community.
    5. Policymakers could use this data to better understand linguistic diversity in the United States and design programs to support bilingualism or multilingualism.

    Research Ideas

    1. Businesses can use this dataset to identify which languages are most commonly spoken in the areas in which they operate and cater their marketing or customer service accordingly.
    2. Schools could use this data to plan language-learning programs based on the needs of their community.
    3. Policymakers could use this dataset to better understand linguistic diversity in the United States and design programs to support bilingualism or multilingualism.

    Acknowledgements

    This dataset was created by Gary Hoover. The data was sourced from https://www.kaggle.com/garyhoov/us-languages

    License

    Unknown License - Please check the dataset description for more information.

    Columns

    File: Languages Spoken at Home by Urban Area = CBSA.csv

    File: US Languages Spoken at Home 2014.csv

    | Column name | Description |
    |:---|:---|
    | Table with column headers in row 5 and row headers in column A | |

  17. Selected Demographic, Cultural, Educational, Labour Force and Income...

    • open.canada.ca
    • datasets.ai
    xml
    Updated Mar 9, 2022
    + more versions
    Cite
    Statistics Canada (2022). Selected Demographic, Cultural, Educational, Labour Force and Income Characteristics (725), First Official Language Spoken (4) and Sex (3) for Population Having English, French or English and French as First Official Language Spoken, for Canada, Provinces, Territories, Census Divisions and Census Subdivisions, 2001 Census - 20% Sample Data [Dataset]. https://open.canada.ca/data/dataset/3f8f670e-a143-4880-897a-d849afe7f8f2
    Explore at:
    Available download formats: xml
    Dataset updated
    Mar 9, 2022
    Dataset provided by
    Statistics Canada (https://statcan.gc.ca/en)
    License

    Open Government Licence - Canada 2.0: https://open.canada.ca/en/open-government-licence-canada
    License information was derived automatically

    Area covered
    Canada, French
    Description

    This table is part of a series of tables that present a portrait of Canada based on the various census topics. The tables range in complexity and levels of geography. Content varies from a simple overview of the country to complex cross-tabulations; the tables may also cover several censuses.

  18. Anglosphere Baby Names

    • kaggle.com
    zip
    Updated Feb 19, 2025
    Cite
    Katherine Koopmans (2025). Anglosphere Baby Names [Dataset]. https://www.kaggle.com/datasets/kkoops/anglosphere-baby-names
    Explore at:
    Available download formats: zip (24210503 bytes)
    Dataset updated
    Feb 19, 2025
    Authors
    Katherine Koopmans
    License

    https://www.gnu.org/licenses/gpl-3.0.html

    Description

    This dataset contains two tables of baby name data for 8 anglosphere regions: Australia, Canada, England and Wales (grouped together), Ireland, Northern Ireland, New Zealand, Scotland and the USA. It can be used to compare the popularity of names in the anglosphere over time, and I have used it to determine the "Country-ness" of each particular name, or, how much more popular it is in its most popular country as compared to all the other countries.
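
    The description does not give a formula for "Country-ness", so the following is only one possible reading: the ratio of a name's share of births in its most popular region to its mean share in the remaining regions. The function, the metric definition, and the sample shares are all assumptions made for illustration.

```python
def countryness(shares):
    """Return (top_region, ratio of top share to mean share elsewhere).

    `shares` maps region name to the name's share of births in that region.
    This definition is an assumed interpretation, not the dataset author's.
    """
    if len(shares) < 2:
        raise ValueError("need at least two regions to compare")
    top_region = max(shares, key=shares.get)
    others = [value for region, value in shares.items() if region != top_region]
    mean_others = sum(others) / len(others)
    if mean_others == 0:
        return top_region, float("inf")
    return top_region, shares[top_region] / mean_others

# Hypothetical shares, not taken from the dataset.
example = {"Scotland": 0.008, "Ireland": 0.002, "USA": 0.002}
region, score = countryness(example)
print(region, round(score, 2))
```

    A score near 1 would mean the name is about equally popular everywhere, while larger values point to a name concentrated in one region.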

  19. Data from: Immigrant Second Generation in Metropolitan New York

    • icpsr.umich.edu
    ascii, delimited, sas +2
    Updated Apr 1, 2011
    Cite
    Mollenkopf, John; Kasinitz, Philip; Waters, Mary (2011). Immigrant Second Generation in Metropolitan New York [Dataset]. http://doi.org/10.3886/ICPSR30302.v1
    Explore at:
    Available download formats: delimited, spss, sas, stata, ascii
    Dataset updated
    Apr 1, 2011
    Dataset provided by
    Inter-university Consortium for Political and Social Research (https://www.icpsr.umich.edu/web/pages/)
    Authors
    Mollenkopf, John; Kasinitz, Philip; Waters, Mary
    License

    https://www.icpsr.umich.edu/web/ICPSR/studies/30302/terms

    Time period covered
    1999
    Area covered
    United States, New York, New York (state)
    Description

    The study analyzes the forces leading to or impeding the assimilation of 18- to 32-year-olds from immigrant backgrounds that vary in terms of race, language, and the mix of skills and liabilities their parents brought to the United States. To make sure that what we find derives specifically from growing up in an immigrant family, rather than simply being a young person in New York, a comparison group of people from native born White, Black, and Puerto Rican backgrounds was also studied. The sample was drawn from New York City (except for Staten Island) and the surrounding counties in the inner part of the New York-New Jersey metropolitan region where the vast majority of immigrants and native born minority group members live and grow up. The study groups make possible a number of interesting comparisons. Unlike many other immigrant groups, the West Indian first generation speaks English, but the dominant society racially classifies them as Black. The study explored how their experiences resemble or differ from native born African Americans. Dominicans and the Colombian-Peruvian-Ecuadoran population both speak Spanish, but live in different parts of New York, have different class backgrounds prior to immigration, and, quite often, different skin tones. The study compared them to Puerto Rican young people, who, along with their parents, have the benefit of citizenship. Chinese immigrants from the mainland tend to have little education, while young people with overseas Chinese parents come from families with higher incomes, more education, and more English fluency. Respondents were divided into eight groups depending on their parents' origin. 
Those of immigrant ancestry include: Jewish immigrants from the former Soviet Union; Chinese immigrants from the mainland, Taiwan, Hong Kong, and the Chinese Diaspora; immigrants from the Dominican Republic; immigrants from the English-speaking countries of the West Indies (including Guyana but excluding Haiti and those of Indian origin); and immigrants from Colombia, Ecuador, and Peru. These groups composed 44 percent of the 2000 second-generation population in the defined sample area. For comparative purposes, Whites, Blacks, and Puerto Ricans who were born in the United States and whose parents were born in the United States or Puerto Rico were also interviewed. To be eligible, a respondent had to have a parent from one of these groups. If the respondent was eligible for two groups, he or she was asked which designation he or she preferred. The ability to compare these groups with native born Whites, Blacks, and Puerto Ricans permits researchers to investigate the effects of nativity while controlling for race and language background. About two-thirds of second-generation respondents were born in the United States, mostly in New York City, while one-third were born abroad but arrived in the United States by age 12 and had lived in the country for at least 10 years, except for those from the former Soviet Union, some of whom arrived past the age of 12. The project began with a pilot study in July 1996. Survey data collection took place between November 1999 and December 1999. The study includes demographic variables such as race, ethnicity, language, age, education, income, family size, country of origin, and citizenship status.

  20. Selected Demographic, Cultural, Educational, Labour Force and Income...

    • gimi9.com
    Updated May 3, 2012
    Cite
    (2012). Selected Demographic, Cultural, Educational, Labour Force and Income Characteristics (725), First Official Language Spoken (4) and Sex (3) for Population Having English, French or English and French as First Official Language Spoken, for Canada, Provinces | gimi9.com [Dataset]. https://gimi9.com/dataset/ca_3f8f670e-a143-4880-897a-d849afe7f8f2/
    Explore at:
    Dataset updated
    May 3, 2012
    Area covered
    Canada, French
    Description

    This table is part of a series of tables that present a portrait of Canada based on the various census topics. The tables range in complexity and levels of geography. Content varies from a simple overview of the country to complex cross-tabulations; the tables may also cover several censuses.
