84 datasets found
  1. Number of native Spanish speakers worldwide 2024, by country

    • statista.com
    • boostndoto.org
    • +5 more
    Cite
    Statista, Number of native Spanish speakers worldwide 2024, by country [Dataset]. https://www.statista.com/statistics/991020/number-native-spanish-speakers-country-worldwide/
    Explore at:
    Dataset authored and provided by
    Statista (http://statista.com/)
    Area covered
    World
    Description

    Mexico is the country with the largest number of native Spanish speakers in the world. As of 2024, 132.5 million people in Mexico spoke Spanish with a native command of the language. Colombia had the second-highest number of native Spanish speakers, at around 52.7 million. Spain came in third, with 48 million, and Argentina fourth, with 46 million.

    Spanish, a world language

    As of 2023, Spanish ranked as the fourth most spoken language in the world, behind only English, Chinese, and Hindi, with over half a billion speakers. Spanish is the official language of over 20 countries, most of them in the Americas; it is also one of the official languages of Equatorial Guinea in Africa. Spanish also has a strong presence in other countries, such as the United States, Morocco, and Brazil, which are included in the list of non-Hispanic countries with the highest number of Spanish speakers.

    The second most spoken language in the U.S.

    In the most recent data, Spanish ranked as the language other than English with the highest number of speakers, with 12 times as many speakers as the second-ranked language. This comes as no surprise given the long history of migration from Latin American countries to the United States. Moreover, in fiscal year 2022 alone, 5 of the top 10 countries of origin of naturalized people in the U.S. were Spanish-speaking countries.

  2. Spanish speakers in countries where Spanish is not an official language 2024...

    • statista.com
    Cite
    Statista, Spanish speakers in countries where Spanish is not an official language 2024 [Dataset]. https://www.statista.com/statistics/1276290/number-spanish-speakers-non-hispanic-countries-worldwide/
    Explore at:
    Dataset authored and provided by
    Statista (http://statista.com/)
    Area covered
    World
    Description

    The United States is the non-Hispanic country with the largest number of native Spanish speakers in the world, with approximately 41.89 million people with a native command of the language in 2024. However, the European Union had the largest group of non-native speakers with limited proficiency in Spanish, at around 28 million people. Furthermore, Mexico is the country with the largest number of native Spanish speakers in the world as of 2024.

  3. Number of students learning Spanish worldwide 2024, by country

    • statista.com
    Updated Jan 22, 2025
    Cite
    Statista (2025). Number of students learning Spanish worldwide 2024, by country [Dataset]. https://www.statista.com/statistics/1276319/number-spanish-language-students-country-worldwide/
    Explore at:
    Dataset updated
    Jan 22, 2025
    Dataset authored and provided by
    Statista (http://statista.com/)
    Area covered
    Worldwide, Spain
    Description

    The United States is the country with the largest number of Spanish language students, at approximately 8.59 million people in 2024. The second country is Brazil, with around 4.05 million students of the Spanish language. Moreover, the United States is also the non-Hispanic country with the largest number of native Spanish speakers in the world.

  4. Spanish Language Datasets | 1.8M+ Sentences | Translation Data | TTS |...

    • datarade.ai
    Updated Jul 11, 2025
    Cite
    Oxford Languages (2025). Spanish Language Datasets | 1.8M+ Sentences | Translation Data | TTS | Dictionary Display | Translations | EU & LATAM Coverage [Dataset]. https://datarade.ai/data-products/spanish-language-datasets-1-8m-sentences-nlp-tts-dic-oxford-languages
    Explore at:
    Available download formats: .json, .xml, .csv, .xls, .txt, .mp3, .wav
    Dataset updated
    Jul 11, 2025
    Dataset authored and provided by
    Oxford Languages (https://lexico.com/es)
    Area covered
    Honduras, Costa Rica, Bolivia (Plurinational State of), Colombia, Ecuador, Paraguay, Chile, Nicaragua, Panama, Cuba
    Description

    Linguistically annotated Spanish language datasets with headwords, definitions, senses, examples, POS tags, semantic metadata, and usage info. Ideal for dictionary tools, NLP, and TTS model training or fine-tuning.

    Our Spanish language datasets are carefully compiled and annotated by language and linguistic experts; you can find them available for licensing:

    1. Spanish Monolingual Dictionary Data
    2. Spanish Bilingual Dictionary Data
    3. Spanish Sentences Data
    4. Synonyms and Antonyms Data
    5. Audio Data
    6. Spanish Word List Data

    Key Features (approximate numbers):

    1. Spanish Monolingual Dictionary Data

    Our Spanish monolingual dictionary data reliably offers clear definitions and examples, a large volume of headwords, and comprehensive coverage of the Spanish language.

    • Words: 73,000
    • Senses: 123,000
    • Example sentences: 104,000
    • Format: XML and JSON formats
    • Delivery: Email (link-based file sharing) and REST API
    • Update frequency: annually
    2. Spanish Bilingual Dictionary Data

    The bilingual data provides translations in both directions, from English to Spanish and from Spanish to English. It is reviewed and updated annually by our in-house team of language experts. It offers significant coverage of the language, providing a large volume of high-quality translated words.

    • Translations: 221,300
    • Senses: 103,500
    • Example sentences: 74,500
    • Example translations: 83,800
    • Format: XML and JSON formats
    • Delivery: Email (link-based file sharing) and REST API
    • Update frequency: annually
    3. Spanish Sentences Data

    Spanish sentences retrieved from the corpus are ideal for NLP model training, comprising approximately 20 million words. The sentences provide broad coverage of Spanish-speaking countries, and each is tagged with a particular country or dialect.

    • Sentences volume: 1,840,000
    • Format: XML and JSON format
    • Delivery: Email (link-based file sharing) and REST API
    4. Spanish Synonyms and Antonyms Data

    This Spanish language dataset offers a rich collection of synonyms and antonyms, accompanied by detailed definitions and part-of-speech (POS) annotations, making it a comprehensive resource for building linguistically aware AI systems and language technologies.

    • Synonyms: 127,700
    • Antonyms: 9,500
    • Format: XML format
    • Delivery: Email (link-based file sharing)
    • Update frequency: annually
    5. Spanish Audio Data (word-level)

    Curated word-level audio data for the Spanish language, which covers all varieties of world Spanish, providing rich dialectal diversity in the Spanish language.

    • Audio files: 20,900
    • Format: XLSX (for index), MP3 and WAV (audio files)
    6. Spanish Word List Data

    This language data contains a carefully curated and comprehensive list of 450,000 Spanish words.

    • Wordforms: 450,000
    • Format: CSV and TXT formats
    • Delivery: Email (link-based file sharing)

    Use Cases:

    We consistently work with our clients on new use cases as language technology continues to evolve. These include NLP applications, TTS, dictionary display tools, games, translation, word embedding, and word sense disambiguation (WSD).

    If you have a specific use case in mind that isn't listed here, we’d be happy to explore it with you. Don’t hesitate to get in touch with us at Oxford.Languages@oup.com to start the conversation.

    Pricing:

    Oxford Languages offers flexible pricing based on use case and delivery format. Our datasets are licensed via term-based IP agreements and tiered pricing for API-delivered data. Whether you’re integrating into a product, training an LLM, or building custom NLP solutions, we tailor licensing to your specific needs.

    Contact our team or email us at Oxford.Languages@oup.com to explore pricing options and discover how our language data can support your goals.

    About the sample:

    The samples offer a brief overview of one or two language datasets (monolingual and/or bilingual dictionary data). To help you explore the structure and features of our dataset, we provide a sample in CSV format for preview purposes only.

    If you need the complete original sample or more details about any dataset, please contact us (Growth.OL@oup.com) to request access or further information.

  5. Spanish Spontaneous Dialogue speech dataset

    • kaggle.com
    zip
    Updated Jun 7, 2024
    + more versions
    Cite
    Frank Wong (2024). Spanish Spontaneous Dialogue speech dataset [Dataset]. https://www.kaggle.com/datasets/nexdatafrank/spanish-spontaneous-dialogue-speech-dataset
    Explore at:
    Available download formats: zip (93236 bytes)
    Dataset updated
    Jun 7, 2024
    Authors
    Frank Wong
    License

    Attribution-NonCommercial 4.0 (CC BY-NC 4.0) https://creativecommons.org/licenses/by-nc/4.0/
    License information was derived automatically

    Description

    Spanish(Spain) Spontaneous Dialogue Telephony speech dataset

    Description

    Spanish (Spain) Spontaneous Dialogue Telephony speech dataset, collected from dialogues based on given topics, covering 20+ domains. Transcribed with text content, speaker's ID, gender, age and other attributes. The dataset was collected from a large and geographically diverse pool of speakers (600 native speakers), enhancing model performance in real and complex tasks. Quality tested by various AI companies. We strictly adhere to data protection regulations and privacy standards, ensuring the maintenance of user privacy and legal rights throughout the data collection, storage, and usage processes; our datasets are all GDPR, CCPA, and PIPL compliant. For more details, please refer to the link: https://www.nexdata.ai/datasets/speechrecog/1234?source=Kaggle

    Format

    8kHz 8bit, a-law/u-law pcm, mono channel

    Content category

    Dialogue based on given topics

    Recording condition

    Low background noise (indoor)

    Recording device

    Telephony

    Country

    Spain(ESP)

    Language(Region) Code

    es-ES

    Language

    Spanish

    Speaker

    600 people in total, 49% male and 51% female

    Features of annotation

    Transcription text, timestamp, speaker ID, gender

    Accuracy rate

    Word accuracy rate(WAR) 98%

    Licensing Information

    Commercial License
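The listing quotes a 98% word accuracy rate (WAR) without defining it. A common convention, assumed here rather than confirmed by the dataset provider, is WAR = 1 - WER, where the word error rate (WER) is the word-level edit distance between a reference transcript and a hypothesis, divided by the number of reference words. The function names and the sample sentences below are illustrative only.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] holds the edit distance between the processed reference
    # prefix and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        cur = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = cur[j - 1] + 1
            cur.append(min(substitution, deletion, insertion))
        prev = cur
    return prev[-1] / len(ref)


def word_accuracy_rate(reference: str, hypothesis: str) -> float:
    """Assumed definition: WAR = 1 - WER."""
    return 1.0 - word_error_rate(reference, hypothesis)


# Illustrative transcripts (not taken from the dataset): one substitution in five words.
print(word_accuracy_rate("hola buenos días cómo estás", "hola buenas días cómo estás"))
```

Under this definition a transcript with many insertions can score below zero, so some vendors clip or define the metric differently; the figure in the listing should be read with that in mind.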

  6. messirve

    • huggingface.co
    Cite
    Spanish Info Retrieval, messirve [Dataset]. https://huggingface.co/datasets/spanish-ir/messirve
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset authored and provided by
    Spanish Info Retrieval
    License

    Attribution-NonCommercial 4.0 (CC BY-NC 4.0) https://creativecommons.org/licenses/by-nc/4.0/
    License information was derived automatically

    Description

    July 2025 UPDATE: We released version 1.1, adding almost 200k new queries 🎉🎉🎉. Use with:

        import datasets

        country = "full"  # "ar", "bo", ...
        version = "1.1"
        dataset = datasets.load_dataset("spanish-ir/messirve", country, revision=version)
        print(dataset)

    Dataset Card for MessIRve

    MessIRve is a large-scale dataset for Spanish IR, designed to better capture the information needs of Spanish speakers across different countries. Queries are obtained from Google's autocomplete API… See the full description on the dataset page: https://huggingface.co/datasets/spanish-ir/messirve.

  7. Table_1_Parental Burnout Assessment (PBA) in Different Hispanic Countries:...

    • figshare.com
    • frontiersin.figshare.com
    docx
    Updated Jun 14, 2023
    Cite
    Denisse Manrique-Millones; Georgy M. Vasin; Sergio Dominguez-Lara; Rosa Millones-Rivalles; Ricardo T. Ricci; Milagros Abregu Rey; María Josefina Escobar; Daniela Oyarce; Pablo Pérez-Díaz; María Pía Santelices; Claudia Pineda-Marín; Javier Tapia; Mariana Artavia; Maday Valdés Pacheco; María Isabel Miranda; Raquel Sánchez Rodríguez; Clara Isabel Morgades-Bamba; Ainize Peña-Sarrionandia; Fernando Salinas-Quiroz; Paola Silva Cabrera; Moïra Mikolajczak; Isabelle Roskam (2023). Table_1_Parental Burnout Assessment (PBA) in Different Hispanic Countries: An Exploratory Structural Equation Modeling Approach.DOCX [Dataset]. http://doi.org/10.3389/fpsyg.2022.827014.s001
    Explore at:
    Available download formats: docx
    Dataset updated
    Jun 14, 2023
    Dataset provided by
    Frontiers
    Authors
    Denisse Manrique-Millones; Georgy M. Vasin; Sergio Dominguez-Lara; Rosa Millones-Rivalles; Ricardo T. Ricci; Milagros Abregu Rey; María Josefina Escobar; Daniela Oyarce; Pablo Pérez-Díaz; María Pía Santelices; Claudia Pineda-Marín; Javier Tapia; Mariana Artavia; Maday Valdés Pacheco; María Isabel Miranda; Raquel Sánchez Rodríguez; Clara Isabel Morgades-Bamba; Ainize Peña-Sarrionandia; Fernando Salinas-Quiroz; Paola Silva Cabrera; Moïra Mikolajczak; Isabelle Roskam
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Parental burnout is a unique and context-specific syndrome resulting from a chronic imbalance of risks over resources in the parenting domain. The current research aims to evaluate the psychometric properties of the Spanish version of the Parental Burnout Assessment (PBA) across Spanish-speaking countries with two consecutive studies. In Study 1, we analyzed the data through a bifactor model within an Exploratory Structural Equation Modeling (ESEM) on the pooled sample of participants (N = 1,979) obtaining good fit indices. We then attained measurement invariance across both gender and countries in a set of nested models with gradually increasing parameter constraints. Latent means comparisons across countries showed that among the participants’ countries, Chile had the highest parental burnout score, likewise, comparisons across gender evidenced that mothers displayed higher scores than fathers, as shown in previous studies. Reliability coefficients were high. In Study 2 (N = 1,171), we tested the relations between parental burnout and three specific consequences, i.e., escape and suicidal ideations, parental neglect, and parental violence toward one’s children. The medium to large associations found provided support for the PBA’s predictive validity. Overall, we concluded that the Spanish version of the PBA has good psychometric properties. The results support its relevance for the assessment of parental burnout among Spanish-speaking parents, offering new opportunities for cross-cultural research in the parenting domain.

  8. HISPANIC OR LATINO AND RACE - DP05_PIN_T - Dataset - CKAN

    • portal.tad3.org
    Updated Nov 17, 2024
    + more versions
    Cite
    (2024). HISPANIC OR LATINO AND RACE - DP05_PIN_T - Dataset - CKAN [Dataset]. https://portal.tad3.org/dataset/hispanic-or-latino-and-race-dp05_pin_t
    Explore at:
    Dataset updated
    Nov 17, 2024
    License

    Attribution-ShareAlike 4.0 (CC BY-SA 4.0) https://creativecommons.org/licenses/by-sa/4.0/
    License information was derived automatically

    Description

    ACS DEMOGRAPHIC AND HOUSING ESTIMATES
    HISPANIC OR LATINO AND RACE - DP05
    Universe - Total population
    Survey-Program - American Community Survey 5-year estimates
    Years - 2020, 2021, 2022

    The terms “Hispanic,” “Latino,” and “Spanish” are used interchangeably. Some respondents identify with all three terms while others may identify with only one of these three specific terms. People who identify with the terms “Hispanic,” “Latino,” or “Spanish” are those who classify themselves in one of the specific Hispanic, Latino, or Spanish categories listed on the questionnaire (“Mexican, Mexican Am., or Chicano,” “Puerto Rican,” or “Cuban”) as well as those who indicate that they are “another Hispanic, Latino, or Spanish origin.” People who do not identify with one of the specific origins listed on the questionnaire but indicate that they are “another Hispanic, Latino, or Spanish origin” are those whose origins are from Spain, the Spanish-speaking countries of Central or South America, or another Spanish culture or origin. Origin can be viewed as the heritage, nationality group, lineage, or country of birth of the person or the person’s parents or ancestors before their arrival in the United States. People who identify their origin as Hispanic, Latino, or Spanish may be of any race.

  9. Spanish-language e-book price 2018-2023, by country

    • statista.com
    Updated Nov 27, 2025
    Cite
    Statista (2025). Spanish-language e-book price 2018-2023, by country [Dataset]. https://www.statista.com/statistics/1032412/spanish-language-ebook-price-worldwide-country/
    Explore at:
    Dataset updated
    Nov 27, 2025
    Dataset authored and provided by
    Statista (http://statista.com/)
    Area covered
    Worldwide
    Description

    In 2023, a Spanish-language e-book cost on average ***** euros in Spain, where such e-books were the most expensive in comparison to other Spanish-speaking countries. Mexico and Peru followed, where Spanish-language e-books cost an average of *** euros and *** euros respectively.

  10. Nexdata | Spanish Speech Data by Mobile Phone | 435 Hours

    • datarade.ai
    • data.nexdata.ai
    Updated Nov 11, 2025
    + more versions
    Cite
    Nexdata (2025). Nexdata | Spanish Speech Data by Mobile Phone | 435 Hours [Dataset]. https://datarade.ai/data-products/nexdata-spanish-speech-data-by-mobile-phone-435-hours-nexdata
    Explore at:
    Available download formats: .bin, .json, .xml, .csv, .xls, .sql, .txt
    Dataset updated
    Nov 11, 2025
    Dataset authored and provided by
    Nexdata
    Area covered
    Spain
    Description

    Spanish (Spain) Scripted Monologue Smartphone speech dataset, collected from monologues based on given scripts, covering generic domain, human-machine interaction, smart home command and in-car command, numbers, news and other domains. Transcribed with text content and other attributes. The dataset was collected from a large and geographically diverse pool of speakers (989 people in total), enhancing model performance in real and complex tasks. Quality tested by various AI companies. We strictly adhere to data protection regulations and privacy standards, ensuring the maintenance of user privacy and legal rights throughout the data collection, storage, and usage processes; our datasets are all GDPR, CCPA, and PIPL compliant.

    Format

    16kHz, 16bit, uncompressed wav, mono channel;

    Recording condition

    Low background noise(indoor), without echo;

    Content category

    Generic domain; news; human-machine interaction; smart home command and control; in-car command and control; numbers

    Recording device

    Android Smartphone, iPhone;

    Speaker

    989 speakers in total, 49% male and 51% female; 57% of speakers are in the 17-25 age group, 39% in the 26-45 age group, and 4% in the 46-60 age group;

    Country

    Spain(ESP);

    Language(Region) Code

    es-ES;

    Language

    Spanish;

    Features of annotation

    Transcription text;

    Accuracy Rate

    Sentence Accuracy Rate (SAR) 95%
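The format line above (16 kHz, 16-bit, mono, uncompressed WAV) together with the 435 hours in the title allows a rough estimate of the raw audio volume. The sketch below is only back-of-the-envelope arithmetic: it ignores WAV headers, transcripts, and any packaging the vendor applies, and the function name is made up for illustration.

```python
def uncompressed_pcm_bytes(hours: float,
                           sample_rate_hz: int = 16_000,
                           bits_per_sample: int = 16,
                           channels: int = 1) -> int:
    """Raw PCM payload size for a given duration, excluding file headers."""
    bytes_per_second = sample_rate_hz * (bits_per_sample // 8) * channels
    return int(hours * 3600 * bytes_per_second)


total = uncompressed_pcm_bytes(435)
print(f"{total:,} bytes (~{total / 1e9:.1f} GB)")  # 50,112,000,000 bytes (~50.1 GB)
```

The same function with `sample_rate_hz=8_000` and `bits_per_sample=8` gives the corresponding figure for 8 kHz, 8-bit telephony audio.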

  11. Spanish Spontaneous Dialogue Telephony speech

    • kaggle.com
    zip
    Updated Jun 11, 2024
    + more versions
    Cite
    Frank Wong (2024). Spanish Spontaneous Dialogue Telephony speech [Dataset]. https://www.kaggle.com/datasets/nexdatafrank/spanish-spontaneous-dialogue-telephony-speech/code
    Explore at:
    Available download formats: zip (215338 bytes)
    Dataset updated
    Jun 11, 2024
    Authors
    Frank Wong
    License

    Attribution-NonCommercial 4.0 (CC BY-NC 4.0) https://creativecommons.org/licenses/by-nc/4.0/
    License information was derived automatically

    Description

    88-Hours-Mexican-Spanish-Conversational-Speech-Data-by-Telephone

    Description

    Spanish (Mexico) Spontaneous Dialogue Telephony speech dataset, collected from dialogues based on given topics. Transcribed with text content, timestamp, speaker's ID, gender and other attributes. The dataset was collected from a large and geographically diverse pool of speakers (122 native speakers), enhancing model performance in real and complex tasks. Quality tested by various AI companies. We strictly adhere to data protection regulations and privacy standards, ensuring the maintenance of user privacy and legal rights throughout the data collection, storage, and usage processes; our datasets are all GDPR, CCPA, and PIPL compliant. For more details, please refer to the link: https://www.nexdata.ai/datasets/speechrecog/1352?source=Kaggle

    Format

    8kHz 8bit, a-law/u-law pcm, mono channel

    Content category

    Dialogue based on given topics

    Recording condition

    Low background noise (indoor)

    Recording device

    Telephony

    Country

    Mexico(MEX)

    Language(Region) Code

    es-MX

    Language

    Spanish

    Speaker

    122 people in total, 53% male and 47% female

    Features of annotation

    Transcription text, timestamp, speaker ID, gender, noise

    Accuracy rate

    Word accuracy rate(WAR) 98%

    Licensing Information

    Commercial License
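The telephony format listed above is 8 kHz, 8-bit a-law/u-law PCM. Most training pipelines expect linear 16-bit samples, so the companded bytes have to be expanded first. The listing does not say which of the two encodings a given file uses; the sketch below assumes u-law (ITU-T G.711 mu-law) and raw headerless bytes, and its function names are illustrative rather than part of any vendor tooling.

```python
BIAS = 0x84  # 132, the bias that G.711 mu-law adds before compression


def ulaw_byte_to_pcm16(code: int) -> int:
    """Expand one 8-bit mu-law code word to a signed linear PCM sample."""
    code = ~code & 0xFF            # mu-law code words are stored bit-inverted
    sign = code & 0x80             # top bit: sign
    exponent = (code >> 4) & 0x07  # next three bits: segment number
    mantissa = code & 0x0F         # low four bits: step within the segment
    magnitude = (((mantissa << 3) + BIAS) << exponent) - BIAS
    return -magnitude if sign else magnitude


def decode_ulaw(data: bytes) -> list[int]:
    """Decode a buffer of raw mu-law bytes into linear samples."""
    return [ulaw_byte_to_pcm16(b) for b in data]


# 0xFF is silence; 0x80 and 0x00 are the largest positive and negative values.
print(decode_ulaw(bytes([0xFF, 0x80, 0x00])))  # [0, 32124, -32124]
```

A-law files need a different expansion table, so it is worth checking the file header or the vendor documentation before decoding.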

  12. 488h Spanish phone calls dataset

    • kaggle.com
    zip
    Updated Jul 30, 2025
    + more versions
    Cite
    simon graves (2025). 488h Spanish phone calls dataset [Dataset]. https://www.kaggle.com/datasets/simongraves/spanish-speech-recognition-dataset
    Explore at:
    Available download formats: zip (93217 bytes)
    Dataset updated
    Jul 30, 2025
    Authors
    simon graves
    License

    Attribution-NonCommercial-NoDerivs 4.0 (CC BY-NC-ND 4.0) https://creativecommons.org/licenses/by-nc-nd/4.0/
    License information was derived automatically

    Description

    Spanish Telephone Dialogues Dataset - 488 Hours

    Dataset comprises 488 hours of high-quality telephone audio recordings in Spanish, featuring 600 native speakers and achieving a 95% sentence accuracy rate. Designed for advancing speech recognition models and language processing, this extensive speech data corpus covers diverse topics and domains, making it ideal for training robust automatic speech recognition (ASR) systems.

    Dataset characteristics:

    • Description: Audio of telephone dialogues in Spanish for training NLP models in real-world conversational scenarios.
    • Data types: Audio
    • Tasks: Speech recognition, NLP
    • Country: Spain (ESP)
    • Hours of telephone dialogue: 488
    • Number of speakers: 600
    • Labeling: Annotation (text content, speaker's ID, gender, age and other attributes)
    • Gender: Male (49%), Female (51%)
    • Recording device: Telephone

    Here's a sample dataset to check out. For full access, go here.

    Dataset structure

    • audio - audio file
    • text - text transcription
    • Spanish Speech Recognition.csv - metadata for the data

    Similar Datasets:

    1. Portuguese Speech Recognition Dataset
    2. American Speech Recognition Dataset
    3. French Speech Recognition Dataset
  13. The most spoken languages worldwide 2025

    • statista.com
    Cite
    Statista, The most spoken languages worldwide 2025 [Dataset]. https://www.statista.com/statistics/266808/the-most-spoken-languages-worldwide/
    Explore at:
    Dataset authored and provided by
    Statista (http://statista.com/)
    Time period covered
    2025
    Area covered
    World
    Description

    In 2025, there were around 1.53 billion people worldwide who spoke English either natively or as a second language, slightly more than the 1.18 billion Mandarin Chinese speakers at the time of survey. Hindi and Spanish accounted for the third and fourth most widespread languages that year.

    Languages in the United States

    The United States does not have an official language, but the country uses English, specifically American English, for legislation, regulation, and other official pronouncements. The United States is a land of immigration, and the languages spoken in the United States vary as a result of the multicultural population. The second most common language spoken in the United States is Spanish or Spanish Creole, which more than 43 million people spoke at home in 2023. There were also 3.5 million Chinese speakers (including both Mandarin and Cantonese), 1.8 million Tagalog speakers, and 1.57 million Vietnamese speakers counted in the United States that year.

    Different languages at home

    The percentage of people in the United States speaking a language other than English at home varies from state to state. The state with the highest percentage of population speaking a language other than English is California. About 45 percent of its population was speaking a language other than English at home in 2023.

  14. Spanish_Visitors_Analysis

    • kaggle.com
    zip
    Updated Mar 4, 2024
    Cite
    R. H. Amezqueta (2024). Spanish_Visitors_Analysis [Dataset]. https://www.kaggle.com/datasets/rudyhernndez/spanish-visitors-analysis
    Explore at:
    Available download formats: zip (20312200 bytes)
    Dataset updated
    Mar 4, 2024
    Authors
    R. H. Amezqueta
    License

    Attribution-ShareAlike 4.0 (CC BY-SA 4.0) https://creativecommons.org/licenses/by-sa/4.0/
    License information was derived automatically

    Description

    Data sources:

    Files

    • Visitors_Turist_Sites: This Geodataframe is based on a selection of 154 municipalities for which the Hotel Occupancy Survey (Encuesta de Ocupación Hotelera), conducted by the National Institute of Statistics (Spain), distinguishes between visitors of different nationalities.

    • Spanish_Provinces_Peninsula: Provincial limits of Spain (Iberian Peninsula and Balearic Islands)

    • Spanish_Provinces_CanaryIslands: Provincial limits of Spain (Canary Islands)

    • Geo_world: The cartographic bases of the National Atlas of Spain (ANE) - World Map.

    License: All this data is licensed under CC-BY 4.0. https://creativecommons.org/licenses/by/4.0/deed.es

  15. Longitudinal Study of the Second Generation in Spain, Waves 1, 2, & 3

    • openicpsr.org
    Updated Nov 19, 2021
    + more versions
    Cite
    Alejandro Portes; Rosa Aparicio (2021). Longitudinal Study of the Second Generation in Spain, Waves 1, 2, & 3 [Dataset]. http://doi.org/10.3886/E155023V1
    Explore at:
    Dataset updated
    Nov 19, 2021
    Dataset provided by
    University of Miami, Princeton University
    Ortega y Gassett and Gregorio Marañon Foundation (FOM: La Fundación Ortega-Marañón)
    Authors
    Alejandro Portes; Rosa Aparicio
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Spain
    Description

    Combined Longitudinal Study of the Second Generation in Spain data set, Waves 1, 2, and 3. This is the publicly available version of the ILSEG data (ILSEG is the Spanish acronym for Investigación Longitudinal de la Segunda Generación, Longitudinal Study of the Second Generation). Questions address the situations and plans for the future of young Spaniards who are children of immigrants to Spain and who were living in Madrid and Barcelona and attending secondary school in 2007-2008 (with follow-ups in 2011-2012 and 2015-2016). The study represents the first attempt to conduct a large-scale study of the adaptation of children of immigrants to Spanish society over time. To that end, a large and statistically representative sample of children born to foreign parents in Spain or brought to the country at an early age was identified and interviewed in metropolitan Madrid and Barcelona for wave 1. In total, almost 7,000 children of immigrants attending basic secondary school in close to 200 educational centers in both cities took part in the study. Because of sample attrition, wave 2 introduced a replacement sample. Additionally, a native-born sample of children of Spaniards was included to enable comparisons between native and immigrant-origin populations of the same age cohort.

    Topics include basic demographics, national origins, Spanish language acquisition, foreign language knowledge and retention, parents' education and employment, respondents' education and aspirations, religion, household arrangements, life experiences, and attitudes about Spanish society. Demographic variables include age, sex, birth country, language proficiency (Spanish and Catalan), language spoken in the home, number of siblings, mother's and father's birth country, religion, national identity, parent's sex, parent's marital status, parent's birth year, and the year the parent arrived in Spain.

  16. Refugee requests in Spain

    • kaggle.com
    zip
    Updated Jan 23, 2024
    Cite
    Maria Blanco Gonzalez (2024). Refugee requests in Spain [Dataset]. https://www.kaggle.com/datasets/mariablancogonzalez/refugee-requests-in-spain
    Explore at:
    Available download formats: zip (11685 bytes)
    Dataset updated
    Jan 23, 2024
    Authors
    Maria Blanco Gonzalez
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Area covered
    Spain
    Description

    Datasets on refugee claims in Spain between 2013 and 2021. This dataset is composed of two data frames. Each data frame distinguishes between requests made by men and by women.

    AsiloCA: requests broken down by autonomous community. Some useful feature information:

    • CA -> Spanish autonomous community
    • Solicitantes -> number of asylum requests
    • Año -> year of the request
    • Pais -> country

    AsiloEspaña: requests broken down by country of origin. Some useful feature information:

    • Nacionalidad -> applicant's nationality
    • Hombres -> number of requests made by men
    • Mujeres -> number of requests made by women
    • Total -> total number of requests made by country and year
    • Admitidas -> total number of admitted requests
    • Año -> year
  17. Flow of emigration abroad of people aged 25 and over by year, sex, country...

    • ine.es
    csv, html, json +4
    Updated Dec 16, 2021
    + more versions
    Cite
    INE - Instituto Nacional de Estadística (2021). Flow of emigration abroad of people aged 25 and over by year, sex, country of birth (Spanish/foreign) and level of studies (grouping of levels) [Dataset]. https://www.ine.es/jaxiT3/Tabla.htm?t=49983&L=1
    Explore at:
    Available download formats: json, html, xls, text/pc-axis, csv, txt, xlsx
    Dataset updated
    Dec 16, 2021
    Dataset provided by
    National Statistics Institute (http://www.ine.es/)
    Authors
    INE - Instituto Nacional de Estadística
    License

    https://www.ine.es/aviso_legal

    Time period covered
    Jan 1, 2019 - Jan 1, 2021
    Variables measured
    Sex, National Total, Country of birth, Level of education, Demographic Concepts
    Description

    Migration Statistic: Flow of emigration abroad of people aged 25 and over by year, sex, country of birth (Spanish/foreign) and level of studies (grouping of levels). Annual. National.

  18. Hispanic population U.S. 2023, by state

    • statista.com
    Cite
    Statista, Hispanic population U.S. 2023, by state [Dataset]. https://www.statista.com/statistics/259850/hispanic-population-of-the-us-by-state/
    Explore at:
    Dataset authored and provided by
    Statista (http://statista.com/)
    Time period covered
    2023
    Area covered
    United States
    Description

    In 2023, California had the highest Hispanic population in the United States, with over 15.76 million people claiming Hispanic heritage. Texas, Florida, New York, and Illinois rounded out the top five states for Hispanic residents in that year.

    History of Hispanic people

    Hispanic people are those whose heritage stems from a former Spanish colony. The Spanish Empire's colonization of most of Central and South America began in the late 15th century, when Christopher Columbus arrived in the Americas in 1492. The Spanish Empire expanded its territory throughout Central America and South America, but its colonization of what is now the United States did not include the Northeast. Although the number of Hispanic people living in the United States has increased, the median income of Hispanic households has fluctuated slightly since 1990.

    Hispanic population in the United States

    Hispanic people are the second-largest ethnic group in the United States, making Spanish the second most common language spoken in the country. In 2021, about one-fifth of Hispanic households in the United States made between 50,000 and 74,999 U.S. dollars. The unemployment rate of Hispanic Americans has fluctuated significantly since 1990, but has been on the decline since 2010, with the exception of 2020 and 2021, due to the impact of the coronavirus (COVID-19) pandemic.

  19. Nexdata | Spanish(Spain) Unscripted Call Center Telephony speech dataset |...

    • datarade.ai
    • data.nexdata.ai
    Updated Nov 9, 2025
    + more versions
    Cite
    Nexdata (2025). Nexdata | Spanish(Spain) Unscripted Call Center Telephony speech dataset | 81 Hours [Dataset]. https://datarade.ai/data-products/nexdata-spanish-spain-unscripted-call-center-telephony-spe-nexdata
    Explore at:
    Available download formats: .bin, .json, .xml, .csv, .xls, .sql, .txt
    Dataset updated
    Nov 9, 2025
    Dataset authored and provided by
    Nexdata
    Area covered
    Spain
    Description

    Spanish (Spain) unscripted call center telephony speech dataset covering the telecom domain. It includes terms and emotions typical of call center scenarios and mirrors real-world interactions. Recordings are transcribed with text content, speaker ID, and other attributes. The dataset was collected from a large and geographically diverse pool of speakers, which is intended to improve model performance on real and complex tasks. Quality has been tested by various AI companies. We strictly adhere to data protection regulations and privacy standards, ensuring the maintenance of user privacy and legal rights throughout the data collection, storage, and usage processes; our datasets all comply with GDPR, CCPA, and PIPL.

    Format

    8 kHz, 16-bit, WAV, mono channel

    Recording condition

    Phone recording system, with low background noise (call center scenario)

    Recording content

    Spontaneous inbound and outbound calls in typical domains, such as telecom

    Country

    Spain (ESP), etc.

    Language(Region) Code

    es-ES, etc.

    Language

    Spanish

    Features of annotation

    Transcription text, timestamps, speaker ID, noise symbols, sensitive information

    Accuracy

    Word Accuracy Rate (WAR) 98% (punctuation, sentence symbols, accent and other non-speech labeling are not included in accuracy statistics due to subjectivity)

  20. hispanic-people-liveness-detection-video-dataset

    • huggingface.co
    Updated Apr 24, 2024
    + more versions
    Cite
    Unique Data (2024). hispanic-people-liveness-detection-video-dataset [Dataset]. https://huggingface.co/datasets/UniqueData/hispanic-people-liveness-detection-video-dataset
    Explore at:
    Dataset updated
    Apr 24, 2024
    Authors
    Unique Data
    License

    Attribution-NonCommercial-NoDerivs 4.0 (CC BY-NC-ND 4.0): https://creativecommons.org/licenses/by-nc-nd/4.0/
    License information was derived automatically

    Description

    Biometric Attack Dataset, Hispanic People

    A similar dataset that includes all ethnicities: Anti Spoofing Real Dataset

    The dataset for face anti-spoofing and face recognition includes images and videos of Hispanic people: 32,600+ photos and videos of 16,300 people from 20 countries. The dataset helps enhance model performance by providing a wider range of data for a specific ethnic group. The videos were gathered by capturing faces of genuine individuals… See the full description on the dataset page: https://huggingface.co/datasets/UniqueData/hispanic-people-liveness-detection-video-dataset.

Number of native Spanish speakers worldwide 2024, by country

8 scholarly articles cite this dataset (View in Google Scholar)