52 datasets found
  1. Number of native Spanish speakers worldwide 2024, by country

    • statista.com
    • boostndoto.org
    • +5 more
    Cite
    Statista, Number of native Spanish speakers worldwide 2024, by country [Dataset]. https://www.statista.com/statistics/991020/number-native-spanish-speakers-country-worldwide/
    Dataset authored and provided by
    Statista (http://statista.com/)
    Area covered
    World
    Description

    Mexico is the country with the largest number of native Spanish speakers in the world. As of 2024, 132.5 million people in Mexico spoke Spanish with a native command of the language. Colombia was the nation with the second-highest number of native Spanish speakers, at around 52.7 million. Spain came in third, with 48 million, and Argentina fourth, with 46 million.

    Spanish, a world language

    As of 2023, Spanish ranked as the fourth most spoken language in the world, behind only English, Chinese, and Hindi, with over half a billion speakers. Spanish is the official language of over 20 countries, most of them in the Americas; it is also one of the official languages of Equatorial Guinea in Africa. Spanish also has a strong presence in other countries, such as the United States, Morocco, and Brazil, which are included in the list of non-Hispanic countries with the highest number of Spanish speakers.

    The second most spoken language in the U.S.

    In the most recent data, Spanish ranked as the language other than English with the highest number of speakers, with 12 times as many speakers as the second-ranked language. This comes as no surprise given the long history of migration from Latin American countries to the United States. Moreover, in fiscal year 2022 alone, 5 of the top 10 countries of origin of naturalized people in the U.S. were Spanish-speaking countries.

  2. Spanish speakers in countries where Spanish is not an official language 2024...

    • statista.com
    Cite
    Statista, Spanish speakers in countries where Spanish is not an official language 2024 [Dataset]. https://www.statista.com/statistics/1276290/number-spanish-speakers-non-hispanic-countries-worldwide/
    Dataset authored and provided by
    Statista (http://statista.com/)
    Area covered
    World
    Description

    The United States is the non-Hispanic country with the largest number of native Spanish speakers in the world, with approximately 41.89 million people with a native command of the language in 2024. However, the European Union had the largest group of non-native speakers with limited proficiency in Spanish, at around 28 million people. Furthermore, Mexico is the country with the largest number of native Spanish speakers in the world as of 2024.

  3. Hispanic population in the U.S. 2023, by origin

    • statista.com
    Updated Oct 21, 2024
    Cite
    Statista (2024). Hispanic population in the U.S. 2023, by origin [Dataset]. https://www.statista.com/statistics/234852/us-hispanic-population/
    Dataset updated
    Oct 21, 2024
    Dataset authored and provided by
    Statista (http://statista.com/)
    Time period covered
    2023
    Area covered
    United States
    Description

    As of 2023, around 37.99 million people of Mexican descent were living in the United States - the largest of any Hispanic group. Puerto Ricans, Salvadorans, Cubans, and Dominicans rounded out the top five Hispanic groups living in the U.S. in that year.

  4. HISPANIC OR LATINO AND RACE - DP05_MAN_ZIP - Dataset - CKAN

    • portal.tad3.org
    Updated Jul 23, 2023
    + more versions
    Cite
    (2023). HISPANIC OR LATINO AND RACE - DP05_MAN_ZIP - Dataset - CKAN [Dataset]. https://portal.tad3.org/dataset/hispanic-or-latino-and-race-dp05_man_zip
    Dataset updated
    Jul 23, 2023
    License

    Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
    License information was derived automatically

    Description

    ACS DEMOGRAPHIC AND HOUSING ESTIMATES HISPANIC OR LATINO AND RACE - DP05
    Universe - Total population
    Survey-Program - American Community Survey 5-year estimates
    Years - 2020, 2021, 2022

    The terms “Hispanic,” “Latino,” and “Spanish” are used interchangeably. Some respondents identify with all three terms while others may identify with only one of these three specific terms. People who identify with the terms “Hispanic,” “Latino,” or “Spanish” are those who classify themselves in one of the specific Hispanic, Latino, or Spanish categories listed on the questionnaire (“Mexican, Mexican Am., or Chicano,” “Puerto Rican,” or “Cuban”) as well as those who indicate that they are “another Hispanic, Latino, or Spanish origin.” People who do not identify with one of the specific origins listed on the questionnaire but indicate that they are “another Hispanic, Latino, or Spanish origin” are those whose origins are from Spain, the Spanish-speaking countries of Central or South America, or another Spanish culture or origin. Origin can be viewed as the heritage, nationality group, lineage, or country of birth of the person or the person’s parents or ancestors before their arrival in the United States. People who identify their origin as Hispanic, Latino, or Spanish may be of any race.

  5. The most spoken languages worldwide 2025

    • statista.com
    Cite
    Statista, The most spoken languages worldwide 2025 [Dataset]. https://www.statista.com/statistics/266808/the-most-spoken-languages-worldwide/
    Dataset authored and provided by
    Statista (http://statista.com/)
    Time period covered
    2025
    Area covered
    World
    Description

    In 2025, there were around 1.53 billion people worldwide who spoke English either natively or as a second language, more than the 1.18 billion Mandarin Chinese speakers at the time of survey. Hindi and Spanish accounted for the third and fourth most widespread languages that year.

    Languages in the United States

    The United States does not have an official language, but the country uses English, specifically American English, for legislation, regulation, and other official pronouncements. The United States is a land of immigration, and the languages spoken in the United States vary as a result of the multicultural population. The second most common language spoken in the United States is Spanish or Spanish Creole, which more than 43 million people spoke at home in 2023. There were also 3.5 million Chinese speakers (including both Mandarin and Cantonese), 1.8 million Tagalog speakers, and 1.57 million Vietnamese speakers counted in the United States that year.

    Different languages at home

    The percentage of people in the United States speaking a language other than English at home varies from state to state. The state with the highest percentage of population speaking a language other than English is California. About 45 percent of its population was speaking a language other than English at home in 2023.

  6. Norway, MI Hispanic or Latino Population Distribution by Their Ancestries

    • neilsberg.com
    csv, json
    Updated Aug 18, 2023
    + more versions
    Cite
    Neilsberg Research (2023). Norway, MI Hispanic or Latino Population Distribution by Their Ancestries [Dataset]. https://www.neilsberg.com/research/datasets/6d7c1bb6-3d85-11ee-9abe-0aa64bf2eeb2/
    Available download formats: csv, json
    Dataset updated
    Aug 18, 2023
    Dataset authored and provided by
    Neilsberg Research
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Norway, Michigan
    Variables measured
    Hispanic or Latino population with Cuban ancestry, Hispanic or Latino population with Mexican ancestry, Hispanic or Latino population with Puerto Rican ancestry, Hispanic or Latino population with Other Hispanic or Latino ancestry, Hispanic or Latino population with Cuban ancestry as Percent of Total Hispanic Population, Hispanic or Latino population with Mexican ancestry as Percent of Total Hispanic Population, Hispanic or Latino population with Puerto Rican ancestry as Percent of Total Hispanic Population, Hispanic or Latino population with Other Hispanic or Latino ancestry as Percent of Total Hispanic Population
    Measurement technique
    The data presented in this dataset is derived from the latest U.S. Census Bureau American Community Survey (ACS) 2017-2021 5-Year Estimates. To measure the two variables, namely (a) Origin / Ancestry for Hispanic population and (b) respective population as a percentage of the total Hispanic population, we initially analyzed and categorized the data for each of the ancestries across the Hispanic or Latino population. It is ensured that the population estimates used in this dataset pertain exclusively to ancestries for the Hispanic or Latino population. For further information regarding these estimates, please feel free to reach out to us via email at research@neilsberg.com.
    Dataset funded by
    Neilsberg Research
    Description
    About this dataset

    Context

    The dataset tabulates the Norway Hispanic or Latino population. It includes the distribution of the Hispanic or Latino population, of Norway, by their ancestries, as identified by the Census Bureau. The dataset can be utilized to understand the origin of the Hispanic or Latino population of Norway.

    Key observations

    Among the Hispanic population in Norway, regardless of the race, the largest group is of Mexican origin, with a population of 14 (100% of the total Hispanic population).

    [Image: Norway Non-Hispanic population by race, https://i.neilsberg.com/ch/norway-mi-population-by-race-and-ethnicity.jpeg]

    Content

    When available, the data consists of estimates from the U.S. Census Bureau American Community Survey (ACS) 2017-2021 5-Year Estimates.

    Origins for the Hispanic or Latino population include:

    • Mexican
    • Black or African American
    • Puerto Rican
    • Cuban
    • Other Hispanic or Latino

    Variables / Data Columns

    • Origin: This column displays the origin for the Hispanic or Latino population of Norway.
    • Population: This column shows the population of the specific origin for the Hispanic or Latino population in Norway.
    • % of Total Hispanic Population: This column displays the percentage distribution of each Hispanic origin as a proportion of Norway's total Hispanic or Latino population. Please note that the sum of all percentages may not equal one due to rounding of values.

    Good to know

    Margin of Error

    Data in the dataset are based on estimates and are subject to sampling variability and thus a margin of error. Neilsberg Research recommends using caution when presenting these estimates in your research.

    Custom data

    If you need custom data for a research project, report, or presentation, you can contact our research staff at research@neilsberg.com to check the feasibility of a custom tabulation on a fee-for-service basis.

    Inspiration

    The Neilsberg Research Team curates, analyzes, and publishes demographic and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research aggregated datasets and insights are made available for free download at https://www.neilsberg.com/research/.

    Recommended for further research

    This dataset is part of the main dataset for Norway Population by Race & Ethnicity. You can refer to it here.

  7. Spanish Spontaneous Dialogue speech dataset

    • kaggle.com
    zip
    Updated Jun 7, 2024
    + more versions
    Cite
    Frank Wong (2024). Spanish Spontaneous Dialogue speech dataset [Dataset]. https://www.kaggle.com/datasets/nexdatafrank/spanish-spontaneous-dialogue-speech-dataset
    Available download formats: zip (93236 bytes)
    Dataset updated
    Jun 7, 2024
    Authors
    Frank Wong
    License

    Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
    License information was derived automatically

    Description

    Spanish(Spain) Spontaneous Dialogue Telephony speech dataset

    Description

    Spanish (Spain) Spontaneous Dialogue Telephony speech dataset, collected from dialogues based on given topics, covering 20+ domains. Transcribed with text content, speaker's ID, gender, age and other attributes. Our dataset was collected from a large and geographically diverse pool of speakers (600 native speakers), enhancing model performance in real and complex tasks. Quality tested by various AI companies. We strictly adhere to data protection regulations and privacy standards, ensuring the maintenance of user privacy and legal rights throughout the data collection, storage, and usage processes. Our datasets all comply with GDPR, CCPA, and PIPL. For more details, please refer to the link: https://www.nexdata.ai/datasets/speechrecog/1234?source=Kaggle

    Format

    8 kHz, 8-bit, a-law/u-law PCM, mono channel

    Content category

    Dialogue based on given topics

    Recording condition

    Low background noise (indoor)

    Recording device

    Telephony

    Country

    Spain(ESP)

    Language(Region) Code

    es-ES

    Language

    Spanish

    Speaker

    600 people in total, 49% male and 51% female

    Features of annotation

    Transcription text, timestamp, speaker ID, gender

    Accuracy rate

    Word accuracy rate(WAR) 98%

    Licensing Information

    Commercial License

  8. Hispanic population U.S. 2023, by state

    • statista.com
    Cite
    Statista, Hispanic population U.S. 2023, by state [Dataset]. https://www.statista.com/statistics/259850/hispanic-population-of-the-us-by-state/
    Dataset authored and provided by
    Statista (http://statista.com/)
    Time period covered
    2023
    Area covered
    United States
    Description

    In 2023, California had the highest Hispanic population in the United States, with over 15.76 million people claiming Hispanic heritage. Texas, Florida, New York, and Illinois rounded out the top five states for Hispanic residents in that year.

    History of Hispanic people

    Hispanic people are those whose heritage stems from a former Spanish colony. The Spanish Empire began colonizing the Americas at the end of the 15th century, after Christopher Columbus arrived in 1492. It expanded its territory throughout Central America and South America, but its colonization of what is now the United States did not include the Northeast. Despite the number of Hispanic people living in the United States having increased, the median income of Hispanic households has fluctuated slightly since 1990.

    Hispanic population in the United States

    Hispanic people are the second-largest ethnic group in the United States, making Spanish the second most common language spoken in the country. In 2021, about one-fifth of Hispanic households in the United States made between 50,000 and 74,999 U.S. dollars. The unemployment rate of Hispanic Americans has fluctuated significantly since 1990, but has been on the decline since 2010, with the exception of 2020 and 2021, due to the impact of the coronavirus (COVID-19) pandemic.

  9. Number of students learning Spanish worldwide 2024, by country

    • statista.com
    Updated Jan 22, 2025
    Cite
    Statista (2025). Number of students learning Spanish worldwide 2024, by country [Dataset]. https://www.statista.com/statistics/1276319/number-spanish-language-students-country-worldwide/
    Dataset updated
    Jan 22, 2025
    Dataset authored and provided by
    Statista (http://statista.com/)
    Area covered
    Worldwide, Spain
    Description

    The United States is the country with the largest number of Spanish language students, at approximately 8.59 million people in 2024. The second country is Brazil, with around 4.05 million students of the Spanish language. Moreover, the United States is also the non-Hispanic country with the largest number of native Spanish speakers in the world.

  10. Mexican Spanish General Conversation Speech Dataset for ASR

    • futurebeeai.com
    wav
    Updated Aug 1, 2022
    Cite
    FutureBee AI (2022). Mexican Spanish General Conversation Speech Dataset for ASR [Dataset]. https://www.futurebeeai.com/dataset/speech-dataset/general-conversation-spanish-mexico
    Available download formats: wav
    Dataset updated
    Aug 1, 2022
    Dataset provided by
    FutureBeeAI
    Authors
    FutureBee AI
    License

    https://www.futurebeeai.com/policies/ai-data-license-agreement

    Area covered
    Mexico
    Dataset funded by
    FutureBeeAI
    Description

    Introduction

    Welcome to the Mexican Spanish General Conversation Speech Dataset — a rich, linguistically diverse corpus purpose-built to accelerate the development of Spanish speech technologies. This dataset is designed to train and fine-tune ASR systems, spoken language understanding models, and generative voice AI tailored to real-world Mexican Spanish communication.

    Curated by FutureBeeAI, this 30-hour dataset offers unscripted, spontaneous two-speaker conversations across a wide array of real-life topics. It enables researchers, AI developers, and voice-first product teams to build robust, production-grade Spanish speech models that understand and respond to authentic Mexican accents and dialects.

    Speech Data

    The dataset comprises 30 hours of high-quality audio, featuring natural, free-flowing dialogue between native speakers of Mexican Spanish. These sessions range from informal daily talks to deeper, topic-specific discussions, ensuring variability and context richness for diverse use cases.

    Participant Diversity:
    Speakers: 60 verified native Mexican Spanish speakers from FutureBeeAI’s contributor community.
    Regions: Representing various provinces of Mexico to ensure dialectal diversity and demographic balance.
    Demographics: A balanced gender ratio (60% male, 40% female) with participant ages ranging from 18 to 70 years.
    Recording Details:
    Conversation Style: Unscripted, spontaneous peer-to-peer dialogues.
    Duration: Each conversation ranges from 15 to 60 minutes.
    Audio Format: Stereo WAV files, 16-bit depth, recorded at 16kHz sample rate.
    Environment: Quiet, echo-free settings with no background noise.
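
    As a rough illustration of how the stated audio format (stereo WAV, 16-bit depth, 16 kHz) could be verified after download, the sketch below reads the header fields with Python's standard `wave` module. The helper names `describe_wav`, `matches_listing`, and `make_silent_wav` are hypothetical and not part of the dataset; the in-memory file only stands in for a real recording. Note that the `wave` module handles uncompressed PCM files.

```python
import io
import wave


def describe_wav(source):
    """Return (channels, bit_depth, sample_rate_hz) for a PCM WAV file.

    `source` may be a file path or a binary file-like object.
    """
    with wave.open(source, "rb") as wav:
        # getsampwidth() is in bytes per sample, so convert to bits.
        return wav.getnchannels(), wav.getsampwidth() * 8, wav.getframerate()


def matches_listing(source, channels=2, bit_depth=16, sample_rate_hz=16000):
    """Check a file against the format stated in the listing (defaults: stereo, 16-bit, 16 kHz)."""
    return describe_wav(source) == (channels, bit_depth, sample_rate_hz)


def make_silent_wav(channels, sample_width_bytes, sample_rate_hz, frames=160):
    """Build a tiny WAV of zero-valued samples in memory (hypothetical stand-in for a real file)."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(sample_width_bytes)
        wav.setframerate(sample_rate_hz)
        wav.writeframes(b"\x00" * (frames * channels * sample_width_bytes))
    buffer.seek(0)
    return buffer


if __name__ == "__main__":
    sample = make_silent_wav(channels=2, sample_width_bytes=2, sample_rate_hz=16000)
    print(describe_wav(sample))
```

    In practice `describe_wav` would be pointed at a path to one of the downloaded recordings rather than at the synthetic buffer.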

    Topic Diversity

    The dataset spans a wide variety of everyday and domain-relevant themes. This topic diversity ensures the resulting models are adaptable to broad speech contexts.

    Sample Topics Include:
    Family & Relationships
    Food & Recipes
    Education & Career
    Healthcare Discussions
    Social Issues
    Technology & Gadgets
    Travel & Local Culture
    Shopping & Marketplace Experiences, and many more.

    Transcription

    Each audio file is paired with a human-verified, verbatim transcription available in JSON format.

    Transcription Highlights:
    Speaker-segmented dialogues
    Time-coded utterances
    Non-speech elements (pauses, laughter, etc.)
    High transcription accuracy, achieved through double QA pass, average WER < 5%

    These transcriptions are production-ready, enabling seamless integration into ASR model pipelines or conversational AI workflows.
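
    For readers unfamiliar with the WER figure quoted above, here is a minimal sketch of how word error rate is commonly computed: the word-level edit distance (substitutions, deletions, insertions) between a reference transcript and a hypothesis, divided by the number of reference words. This is a generic illustration, not FutureBeeAI's QA procedure; the function name `word_error_rate` and the whitespace tokenization are assumptions made for the example.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Tokenization is plain whitespace splitting (an assumption; real
    pipelines usually normalize case and punctuation first).
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One substituted word out of four reference words.
    print(word_error_rate("el paquete llega hoy", "el paquete llegó hoy"))
```

    A WER below 5% therefore means fewer than one word-level error per twenty reference words on average.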

    Metadata

    The dataset comes with granular metadata for both speakers and recordings:

    Speaker Metadata: Age, gender, accent, dialect, state/province, and participant ID.
    Recording Metadata: Topic, duration, audio format, device type, and sample rate.

    Such metadata helps developers fine-tune model training and supports use-case-specific filtering or demographic analysis.

    Usage and Applications

    This dataset is a versatile resource for multiple Spanish speech and language AI applications:

    ASR Development: Train accurate speech-to-text systems for Mexican Spanish.
    Voice Assistants: Build smart assistants capable of understanding natural Mexican conversations.

  11. Multivariable logistic regression modeling: Ever vaccinated for COVID-19.

    • plos.figshare.com
    xls
    Updated Jan 23, 2025
    + more versions
    Cite
    Sandy K. Aguilar-Palma; Thomas P. McCoy; Lilli Mann-Jackson; Jorge Alonzo; Mohammed Sheikh Eldin Jibriel; Dorcas Mabiala Johnson; Tony Locklear; Amanda E. Tanner; Mark A. Hall; Alain G. Bertoni; Ana D. Sucaldito; Laurie P. Russell; Scott D. Rhodes (2025). Multivariable logistic regression modeling: Ever vaccinated for COVID-19. [Dataset]. http://doi.org/10.1371/journal.pone.0317794.t003
    Available download formats: xls
    Dataset updated
    Jan 23, 2025
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Sandy K. Aguilar-Palma; Thomas P. McCoy; Lilli Mann-Jackson; Jorge Alonzo; Mohammed Sheikh Eldin Jibriel; Dorcas Mabiala Johnson; Tony Locklear; Amanda E. Tanner; Mark A. Hall; Alain G. Bertoni; Ana D. Sucaldito; Laurie P. Russell; Scott D. Rhodes
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Multivariable logistic regression modeling: Ever vaccinated for COVID-19.

  12. Percentage of Hispanic population in the U.S. by state 2023

    • statista.com
    Cite
    Statista, Percentage of Hispanic population in the U.S. by state 2023 [Dataset]. https://www.statista.com/statistics/259865/percentage-of-hispanic-population-in-the-us-by-state/
    Dataset authored and provided by
    Statista (http://statista.com/)
    Time period covered
    2023
    Area covered
    United States
    Description

    In 2022, around 48.59 percent of New Mexico's population was of Hispanic origin, compared to the national percentage of 19.45. California, Texas, and Arizona also registered shares over 30 percent. The distribution of the U.S. population by ethnicity can be accessed here.

  13. Sample characteristics of the participants (N = 180).

    • plos.figshare.com
    xls
    Updated Jan 23, 2025
    + more versions
    Cite
    Sandy K. Aguilar-Palma; Thomas P. McCoy; Lilli Mann-Jackson; Jorge Alonzo; Mohammed Sheikh Eldin Jibriel; Dorcas Mabiala Johnson; Tony Locklear; Amanda E. Tanner; Mark A. Hall; Alain G. Bertoni; Ana D. Sucaldito; Laurie P. Russell; Scott D. Rhodes (2025). Sample characteristics of the participants (N = 180). [Dataset]. http://doi.org/10.1371/journal.pone.0317794.t001
    Available download formats: xls
    Dataset updated
    Jan 23, 2025
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Sandy K. Aguilar-Palma; Thomas P. McCoy; Lilli Mann-Jackson; Jorge Alonzo; Mohammed Sheikh Eldin Jibriel; Dorcas Mabiala Johnson; Tony Locklear; Amanda E. Tanner; Mark A. Hall; Alain G. Bertoni; Ana D. Sucaldito; Laurie P. Russell; Scott D. Rhodes
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Sample characteristics of the participants (N = 180).

  14. Spanish (Spain) Call Center Data for Delivery & Logistics AI

    • futurebeeai.com
    wav
    Updated Aug 1, 2022
    + more versions
    Cite
    FutureBee AI (2022). Spanish (Spain) Call Center Data for Delivery & Logistics AI [Dataset]. https://www.futurebeeai.com/dataset/speech-dataset/delivery-call-center-conversation-spanish-spain
    Available download formats: wav
    Dataset updated
    Aug 1, 2022
    Dataset provided by
    FutureBeeAI
    Authors
    FutureBee AI
    License

    https://www.futurebeeai.com/policies/ai-data-license-agreement

    Area covered
    Spain
    Dataset funded by
    FutureBeeAI
    Description

    Introduction

    This Spanish Call Center Speech Dataset for the Delivery and Logistics industry is purpose-built to accelerate the development of speech recognition, spoken language understanding, and conversational AI systems tailored for Spanish-speaking customers. With over 30 hours of real-world, unscripted call center audio, this dataset captures authentic delivery-related conversations essential for training high-performance ASR models.

    Curated by FutureBeeAI, this dataset empowers AI teams, logistics tech providers, and NLP researchers to build accurate, production-ready models for customer support automation in delivery and logistics.

    Speech Data

    The dataset contains 30 hours of dual-channel call center recordings between native Spanish speakers. Captured across various delivery and logistics service scenarios, these conversations cover everything from order tracking to missed delivery resolutions, offering a rich, real-world training base for AI models.

    Participant Diversity:
    Speakers: 60 native Spanish speakers from our verified contributor pool.
    Regions: Multiple provinces of Spain for accent and dialect diversity.
    Participant Profile: Balanced gender distribution (60% male, 40% female) with ages ranging from 18 to 70.
    Recording Details:
    Conversation Nature: Naturally flowing, unscripted customer-agent dialogues.
    Call Duration: 5 to 15 minutes on average.
    Audio Format: Stereo WAV, 16-bit depth, recorded at 8kHz and 16kHz.
    Recording Environment: Captured in clean, noise-free, echo-free conditions.

    Topic Diversity

    This speech corpus includes both inbound and outbound delivery-related conversations, covering varied outcomes (positive, negative, neutral) to train adaptable voice models.

    Inbound Calls:
    Order Tracking
    Delivery Complaints
    Undeliverable Addresses
    Return Process Enquiries
    Delivery Method Selection
    Order Modifications, and more
    Outbound Calls:
    Delivery Confirmations
    Subscription Offer Calls
    Incorrect Address Follow-ups
    Missed Delivery Notifications
    Delivery Feedback Surveys
    Out-of-Stock Alerts, and others

    This comprehensive coverage reflects real-world logistics workflows, helping voice AI systems interpret context and intent with precision.

    Transcription

    All recordings come with high-quality, human-generated verbatim transcriptions in JSON format.

    Transcription Includes:
    Speaker-Segmented Dialogues
    Time-coded Segments
    Non-speech Tags (e.g., pauses, noise)
    High transcription accuracy with word error rate under 5% via dual-layer quality checks.

    These transcriptions support fast, reliable model development for Spanish voice AI applications in the delivery sector.

    Metadata

    Detailed metadata is included for each participant and conversation:

    Participant Metadata: ID, age, gender, region, accent, dialect.
    Conversation Metadata: Topic, call type, sentiment, sample rate, and technical attributes.

    This metadata aids in training specialized models, filtering demographics, and running advanced analytics.

    Usage and Applications

    This dataset

  15. Use Basque as much as or more than Spanish among the population of the...

    • euskadi.eus
    csv, xlsx
    Updated Mar 14, 2017
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    (2017). Use Basque as much as or more than Spanish among the population of the Basque Country (>= 16 years of age) by the scope of use according to provinc. [Dataset]. https://www.euskadi.eus/use-basque-as-much-as-or-more-than-spanish-among-the-population-of-the-basque-country-16-years-of-age-by-the-scope-of-use-according-to-provinc/aa30-12375/en/
    Available download formats: xlsx (17.62), csv (0.49)
    Dataset updated
    Mar 14, 2017
    Area covered
    Basque Country
    Description

    The Sociolinguistic Survey seeks both to explore knowledge, use and family transmission of the Basque language and to promote it. The aim of the survey is to obtain official statistical information on language use in interactions between people. It is conducted every five years among people aged over 15. More information is available in the departmental statistical portal: https://www.euskadi.eus/estadisticas-del-departamento-de-cultura-y-politica-linguistica/web01-s2kultur/es/

  16. Use Basque as much as or more than Spanish among the bilingual population of...

    • euskadi.eus
    csv, xlsx
    Updated Nov 22, 2023
    Cite
    (2017). Use Basque as much as or more than Spanish among the bilingual population of the Basque Country (>= 16 years of age) by age group (%). [Dataset]. https://www.euskadi.eus/use-basque-as-much-as-or-more-than-spanish-among-the-bilingual-population-of-the-basque-country-16-years-of-age-by-age-group/aa30-12375/en/
    Available download formats: xlsx (16.79), csv (0.47)
    Dataset updated
    Nov 22, 2023
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Basque Country
    Description

    The Sociolinguistic Survey seeks both to explore knowledge, use and family transmission of the Basque language and to promote it. The aim of the survey is to obtain official statistical information on language use in interactions between people. It is conducted every five years among people aged over 15. More information is available in the departmental statistical portal: https://www.euskadi.eus/estadisticas-del-departamento-de-cultura-y-politica-linguistica/web01-s2kultur/es/

  17. Longitudinal Study of the Second Generation in Spain, Waves 1, 2, & 3

    • openicpsr.org
    Updated Nov 19, 2021
    + more versions
    Cite
    Alejandro Portes; Rosa Aparicio (2021). Longitudinal Study of the Second Generation in Spain, Waves 1, 2, & 3 [Dataset]. http://doi.org/10.3886/E155023V1
    Dataset updated
    Nov 19, 2021
    Dataset provided by
    Ortega y Gasset and Gregorio Marañón Foundation (FOM: La Fundación Ortega-Marañón)
    University of Miami, Princeton University
    Authors
    Alejandro Portes; Rosa Aparicio
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Spain
    Description

    Combined Longitudinal Study of the Second Generation in Spain data set, Waves 1, 2, and 3. This is the publicly available version of the ILSEG data (ILSEG is the Spanish acronym for Investigación Longitudinal de la Segunda Generación, Longitudinal Study of the Second Generation). Questions address the situations and plans for the future of young Spaniards who are children of immigrants to Spain and who were living in Madrid and Barcelona and attending secondary school in 2007-2008 (with follow-ups in 2011-2012 and 2015-2016). The study represents the first attempt to conduct a large-scale study of the adaptation of children of immigrants to Spanish society over time. To that end, a large and statistically representative sample of children born to foreign parents in Spain, or brought to the country at an early age, was identified and interviewed in metropolitan Madrid and Barcelona for wave 1. In total, almost 7,000 children of immigrants attending basic secondary school in close to 200 educational centers in both cities took part in the study. Because of sample attrition, wave 2 introduced a replacement sample. Additionally, a native-born sample of children of Spaniards was included to enable comparisons between native and immigrant-origin populations of the same age cohort.

    Topics include basic demographics, national origins, Spanish language acquisition, foreign language knowledge and retention, parents' education and employment, respondents' education and aspirations, religion, household arrangements, life experiences, and attitudes about Spanish society. Demographic variables include age, sex, birth country, language proficiency (Spanish and Catalan), language spoken in the home, number of siblings, mother's and father's birth country, religion, national identity, parent's sex, parent's marital status, parent's birth year, and the year the parent arrived in Spain.

  18. Ranking of languages spoken at home in the U.S. 2024, by number of speakers

    • statista.com
    Updated Nov 28, 2025
    Cite
    Statista (2025). Ranking of languages spoken at home in the U.S. 2024, by number of speakers [Dataset]. https://www.statista.com/statistics/183483/ranking-of-languages-spoken-at-home-in-the-us-in-2008/
    Dataset updated
    Nov 28, 2025
    Dataset authored and provided by
    Statista, http://statista.com/
    Time period covered
    2024
    Area covered
    United States
    Description

    In 2024, some 45 million people in the United States spoke Spanish at home. In comparison, the second most spoken non-English language in U.S. households was Chinese, at just 3.7 million speakers.

  19. Mother tongue of the Catalan population 2024

    • statista.com
    Cite
    Statista, Mother tongue of the Catalan population 2024 [Dataset]. https://www.statista.com/statistics/454810/mother-tongue-of-the-catalan-population/
    Dataset authored and provided by
    Statista, http://statista.com/
    Time period covered
    Jun 17, 2024 - Sep 2, 2024
    Area covered
    Catalonia, Spain
    Description

    The Catalan and Spanish languages coexist in the coastal region of Catalonia, both enjoying official and equal status. As of 2024, more than ** percent of the population of Catalonia considered Spanish their mother tongue, whereas less than ** percent reported being native speakers of Catalan. Catalonia was the second most populous autonomous community in Spain in 2024, with about * million people.

    Editorial scene in Catalonia

    Although the vast majority of books in Spain are published in Spanish, Catalan ranked second in the country’s publishing scene at about * percent of book publications, which shows the weight of this language among the other languages spoken in Spain. In fact, Catalan was one of the most translated languages in the country according to the latest studies.

    Catalonia in Spain

    Catalonia's share of Spanish GDP was estimated at ** percent in 2023. This figure has remained steady over the last few years, averaging about ** percent of the country's total GDP. The average GDP per capita in Catalonia was significantly higher than that of the rest of Spain at ****** euros in 2022. During the same period, Spain’s average GDP per capita was ****** euros.

  20. Use Basque as much as or more than Spanish among the population of the...

    • opendata.euskadi.eus
    • euskadi.eus
    csv, xlsx
    Updated Mar 14, 2017
    Cite
    (2017). Use Basque as much as or more than Spanish among the population of the Basque Country (>= 16 years of age) by age group (%). [Dataset]. https://opendata.euskadi.eus/catalogo/-/use-basque-as-much-as-or-more-than-spanish-among-the-population-of-the-basque-country-16-years-of-age-by-age-group/
    Available download formats: csv(0.39), xlsx(17.52)
    Dataset updated
    Mar 14, 2017
    Area covered
    Basque Country
    Description

    The Sociolinguistic Survey seeks both to explore knowledge, use and family transmission of the Basque language and to promote it. The aim of the survey is to obtain official statistical information on language use in interactions between people. It is conducted every five years on people aged over 15. More information is available in the departmental statistical portal: https://www.euskadi.eus/estadisticas-del-departamento-de-cultura-y-politica-linguistica/web01-s2kultur/es/

Number of native Spanish speakers worldwide 2024, by country

8 scholarly articles cite this dataset (View in Google Scholar)