Mexico is the country with the largest number of native Spanish speakers in the world. As of 2024, 132.5 million people in Mexico spoke Spanish with a native command of the language. Colombia was the nation with the second-highest number of native Spanish speakers, at around 52.7 million. Spain came in third, with 48 million, and Argentina fourth, with 46 million.
Spanish, a world language
As of 2023, Spanish ranked as the fourth most spoken language in the world, behind only English, Chinese, and Hindi, with over half a billion speakers. Spanish is the official language of over 20 countries, the majority of them in the Americas; it is also one of the official languages of Equatorial Guinea in Africa. Spanish also has a strong presence in other countries, such as the United States, Morocco, and Brazil, which are among the non-Hispanic countries with the highest number of Spanish speakers.
The second most spoken language in the U.S.
In the most recent data, Spanish was the language other than English with the highest number of speakers in the United States, with 12 times as many speakers as the second-ranked language. This comes as no surprise given the long history of migration from Latin American countries to the United States. Moreover, in fiscal year 2022 alone, 5 of the top 10 countries of origin of naturalized people in the U.S. were Spanish-speaking countries.
The United States is the non-Hispanic country with the largest number of native Spanish speakers in the world, with approximately 41.89 million people with a native command of the language in 2024. However, the European Union had the largest group of non-native speakers with limited proficiency in Spanish, at around 28 million people. Among all countries, Mexico had the largest number of native Spanish speakers in the world as of 2024.
In 2023, California had the highest Hispanic population in the United States, with over 15.76 million people claiming Hispanic heritage. Texas, Florida, New York, and Illinois rounded out the top five states for Hispanic residents in that year.
History of Hispanic people
Hispanic people are those whose heritage stems from a former Spanish colony. The Spanish Empire colonized most of Central and Latin America, beginning in the late 15th century after Christopher Columbus arrived in the Americas in 1492. The Spanish Empire expanded its territory throughout Central America and South America, but its colonization of what is now the United States did not include the Northeast. Although the number of Hispanic people living in the United States has increased, the median income of Hispanic households has fluctuated slightly since 1990.
Hispanic population in the United States
Hispanic people are the second-largest ethnic group in the United States, making Spanish the second most common language spoken in the country. In 2021, about one-fifth of Hispanic households in the United States had an income between 50,000 and 74,999 U.S. dollars. The unemployment rate of Hispanic Americans has fluctuated significantly since 1990 but has been declining since 2010, with the exception of 2020 and 2021, due to the impact of the coronavirus (COVID-19) pandemic.
As of 2023, around 37.99 million people of Mexican descent were living in the United States, the largest of any Hispanic group. Puerto Ricans, Salvadorans, Cubans, and Dominicans rounded out the top five Hispanic groups living in the U.S. in that year.
The United States is the country with the largest number of Spanish language students, at approximately 8.59 million people in 2024. Brazil ranks second, with around 4.05 million students of the Spanish language. Moreover, the United States is also the non-Hispanic country with the largest number of native Spanish speakers in the world.
Linguistically annotated Spanish language datasets with headwords, definitions, senses, examples, POS tags, semantic metadata, and usage info. Ideal for dictionary tools, NLP, and TTS model training or fine-tuning.
Our Spanish language datasets are carefully compiled and annotated by language and linguistic experts; you can find them available for licensing:
Key Features (approximate numbers):
Our Spanish monolingual reliably offers clear definitions and examples, a large volume of headwords, and comprehensive coverage of the Spanish language.
The bilingual data provides translations in both directions, from English to Spanish and from Spanish to English. It is reviewed and updated annually by our in-house team of language experts. It offers significant coverage of the language, providing a large volume of translated words of excellent quality.
Spanish sentences retrieved from the corpus are ideal for NLP model training, presenting approximately 20 million words. The sentences provide broad coverage of Spanish-speaking countries and are tagged accordingly with a particular country or dialect.
This Spanish language dataset offers a rich collection of synonyms and antonyms, accompanied by detailed definitions and part-of-speech (POS) annotations, making it a comprehensive resource for building linguistically aware AI systems and language technologies.
Curated word-level audio data for the Spanish language, which covers all varieties of world Spanish, providing rich dialectal diversity in the Spanish language.
This language data contains a carefully curated and comprehensive list of 450,000 Spanish words.
Use Cases:
We consistently work with our clients on new use cases as language technology continues to evolve. These include NLP applications, TTS, dictionary display tools, games, translation, word embedding, and word sense disambiguation (WSD).
If you have a specific use case in mind that isn't listed here, we’d be happy to explore it with you. Don’t hesitate to get in touch with us at Oxford.Languages@oup.com to start the conversation.
Pricing:
Oxford Languages offers flexible pricing based on use case and delivery format. Our datasets are licensed via term-based IP agreements and tiered pricing for API-delivered data. Whether you’re integrating into a product, training an LLM, or building custom NLP solutions, we tailor licensing to your specific needs.
Contact our team or email us at Oxford.Languages@oup.com to explore pricing options and discover how our language data can support your goals.
About the sample:
The samples offer a brief overview of one or two language datasets (monolingual or/and bilingual dictionary data). To help you explore the structure and features of our dataset, we provide a sample in CSV format for preview purposes only.
If you need the complete original sample or more details about any dataset, please contact us (Growth.OL@oup.com) to request access or further information.
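As a rough illustration of how a CSV preview like the one described above might be inspected, here is a minimal Python sketch. The column names (`headword`, `pos`, `definition`) and the three rows are hypothetical, chosen only to mirror the fields the listing mentions; the real sample's layout may differ.

```python
import csv
import io
from collections import Counter

# Hypothetical sample content: the column names and rows below are
# assumptions for illustration, not the actual structure of the dataset.
SAMPLE_CSV = """headword,pos,definition
casa,noun,edificio para habitar
correr,verb,ir de prisa
azul,adjective,del color del cielo sin nubes
"""


def summarize_sample(text):
    """Parse CSV text and count entries per part-of-speech tag."""
    rows = list(csv.DictReader(io.StringIO(text)))
    pos_counts = Counter(row["pos"] for row in rows)
    return rows, pos_counts


rows, pos_counts = summarize_sample(SAMPLE_CSV)
print(len(rows), dict(pos_counts))
```

For a file on disk, the same function could be fed the file's contents, provided the header names are adjusted to match the real sample.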
In 2025, there were around 1.53 billion people worldwide who spoke English either natively or as a second language, more than the 1.18 billion Mandarin Chinese speakers at the time of the survey. Hindi and Spanish were the third and fourth most widespread languages that year.
Languages in the United States
The United States does not have an official language, but the country uses English, specifically American English, for legislation, regulation, and other official pronouncements. The United States is a land of immigration, and the languages spoken in the United States vary as a result of the multicultural population. The second most common language spoken in the United States is Spanish or Spanish Creole, which more than 43 million people spoke at home in 2023. There were also 3.5 million Chinese speakers (including both Mandarin and Cantonese), 1.8 million Tagalog speakers, and 1.57 million Vietnamese speakers counted in the United States that year.
Different languages at home
The percentage of people in the United States speaking a language other than English at home varies from state to state. The state with the highest percentage of its population speaking a language other than English is California. About 45 percent of its population spoke a language other than English at home in 2023.
Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
July 2025 UPDATE: We released version 1.1, adding almost 200k new queries 🎉🎉🎉. Use with:

    import datasets

    # Pick the full collection or a single country configuration.
    country = "full"  # "ar", "bo", ...
    version = "1.1"
    # Pin the dataset revision so the 1.1 release is the one loaded.
    dataset = datasets.load_dataset("spanish-ir/messirve", country, revision=version)
    print(dataset)

Dataset Card for MessIRve
MessIRve is a large-scale dataset for Spanish IR, designed to better capture the information needs of Spanish speakers across different countries. Queries are obtained from Google's autocomplete API… See the full description on the dataset page: https://huggingface.co/datasets/spanish-ir/messirve.
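To compare several country subsets side by side, the loading call from the update note can be wrapped in a small helper. The sketch below is an illustration only: the function name `load_messirve_configs` and its `loader` parameter are hypothetical, and the only configuration names assumed to exist are the ones quoted above ("full", "ar", "bo").

```python
from typing import Callable, Dict, Iterable, Optional

REPO_ID = "spanish-ir/messirve"  # repository name taken from the update note


def load_messirve_configs(
    countries: Iterable[str],
    version: str = "1.1",
    loader: Optional[Callable[..., object]] = None,
) -> Dict[str, object]:
    """Load one configuration per country code, pinned to one revision.

    `loader` is a hypothetical injection point so the helper can be tried
    without network access; by default it uses datasets.load_dataset.
    """
    if loader is None:
        import datasets  # imported lazily; requires the `datasets` package

        loader = datasets.load_dataset
    return {c: loader(REPO_ID, c, revision=version) for c in countries}
```

Calling `load_messirve_configs(["ar", "bo"])` would then download both subsets at version 1.1 and return them keyed by country code.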
LATAM Data Suite provides high-quality datasets in Spanish, Portuguese, and American English. Ideal for NLP, AI, LLMs, translation, and education, it combines linguistic depth and regional authenticity to power scalable, multilingual language technologies.
Discover our expertly curated language datasets in the LATAM Data Suite. Compiled and annotated by language and linguistic experts, this suite offers high-quality resources tailored to your needs. This suite includes:
Monolingual and Bilingual Dictionary Data Featuring headwords, definitions, word senses, part-of-speech (POS) tags, and semantic metadata.
Sentences Curated examples of real-world usage with contextual annotations.
Synonyms & Antonyms Lexical relations to support semantic search, paraphrasing, and language understanding.
Audio Data Native speaker recordings for TTS and pronunciation modeling.
Word Lists Frequency-ranked and thematically grouped lists.
Learn more about the datasets included in the data suite:
Key Features (approximate numbers):
Our Portuguese monolingual covers both European and Latin American varieties, featuring clear definitions and examples, a large volume of headwords, and comprehensive coverage of the Portuguese language.
The bilingual data provides translations in both directions, from English to Portuguese and from Portuguese to English. It is reviewed and updated annually by our in-house team of language experts. It offers comprehensive coverage of the language, providing a substantial volume of translated words of excellent quality that span both European and Latin American Portuguese varieties.
Our Spanish monolingual reliably offers clear definitions and examples, a large volume of headwords, and comprehensive coverage of the Spanish language.
The bilingual data provides translations in both directions, from English to Spanish and from Spanish to English. It is reviewed and updated annually by our in-house team of language experts. It offers significant coverage of the language, providing a large volume of translated words of excellent quality.
Spanish sentences retrieved from the corpus are ideal for NLP model training, presenting approximately 20 million words. The sentences provide broad coverage of Spanish-speaking countries and are tagged accordingly with a particular country or dialect.
This Spanish language dataset offers a rich collection of synonyms and antonyms, accompanied by detailed definitions and part-of-speech (POS) annotations, making it a comprehensive resource for building linguistically aware AI systems and language technologies.
Curated word-level audio data for the Spanish language, which covers all varieties of world Spanish, providing rich dialectal diversity in the Spanish language.
This language data contains a carefully curated and comprehensive list of 450,000 Spanish words.
Our American English Monolingual Dictionary Data is the foremost au...
Based on land area, Brazil is the largest country in Latin America by far, with a total area of over 8.5 million square kilometers. Argentina follows with almost 2.8 million square kilometers. Cuba, whose surface area extends over almost 111,000 square kilometers, is the Caribbean country with the largest territory.
Brazil: a country with a lot to offer
Brazil's borders reach nearly half of the South American subcontinent, making it the fifth-largest country in the world and the third-largest country in the Western Hemisphere. Along with its landmass, Brazil also boasts the largest population and economy in the region. Although Brasília is the capital, the most significant portion of the country's population is concentrated along its coastline in the cities of São Paulo and Rio de Janeiro.
South America: a region of extreme geographic variation
With the Andes mountain range in the west, the Amazon Rainforest in the east, the Equator in the north, and Cape Horn as the southernmost continental tip, South America has some of the most diverse climatic and ecological terrains in the world. At its core, its biodiversity can largely be attributed to the Amazon, the world's largest tropical rainforest, and the Amazon River, the world's largest river by discharge. However, this incredible ecological wealth also brings great responsibility. In the past decade, roughly 80,000 square kilometers of the Brazilian Amazon were destroyed. And, as of late 2019, there were at least 1,000 threatened species in Brazil alone.
In 2020, about 93.8 percent of the Mexican population was monolingual in Spanish. Around five percent spoke a combination of Spanish and indigenous languages. Spanish is the third-most spoken native language worldwide, after Mandarin Chinese and Hindi.
Mexican Spanish
Spanish was first used in Mexico in the 16th century, at the time of Spanish colonization during the Conquest campaigns of what is now Mexico and the Caribbean. As of 2018, Mexico was the country with the largest number of native Spanish speakers worldwide. Mexican Spanish is influenced by English and Nahuatl, and has about 120 million users. The Mexican government uses Spanish in the majority of its proceedings; however, it recognizes 68 national languages, 63 of which are indigenous.
Indigenous languages spoken
Of the indigenous languages spoken, two of the most widely used are Nahuatl and Maya. Due to a history of marginalization of indigenous groups, most indigenous languages are endangered, and many linguists warn they might cease to be used within just a few decades. In recent years, legislative attempts such as the San Andrés Accords have been made to protect indigenous groups, who make up about 25 million of Mexico’s 125 million total inhabitants, though the efficacy of such measures is yet to be seen.
Comprehensive ranking dataset of the top 100 YouTube channels from Spain. This dataset features 100 channels with detailed statistics including subscriber counts, total video views, video count, and global rankings. The leading channel has 57,600,000 subscribers and 21,172,711,450 total views. Each entry includes comprehensive metrics to analyze channel performance, growth trends, and competitive positioning. This dataset is regularly updated to reflect the latest YouTube channel statistics and ranking changes, providing valuable insights for content creators, marketers, and researchers analyzing YouTube ecosystem trends and channel performance benchmarks.
In 2022, around 48.59 percent of New Mexico's population was of Hispanic origin, compared to the national percentage of 19.45. California, Texas, and Arizona also registered shares over 30 percent.
Argentina scored 562 out of a maximum of 800 points in the English Proficiency Index 2023. That was the highest score among all Latin American countries included in the survey. The Argentine capital, Buenos Aires, also received the highest English proficiency score among all the Latin American cities analyzed. Mexico and Haiti received the lowest scores in the region.
https://www.meticulousresearch.com/privacy-policy
Europe Online Language Learning Market, by Learning Mode (Self-learning Apps, Tutoring), Age Group, Language (English, Mandarin, Spanish), End User (Individual Learners, Educational Institutions), and Country - Forecast to 2032
In 2023, Spanish-language e-books sold in Spain made up **** percent of the global Spanish-language e-book sales revenue. Mexico was the second largest market with over ** percent of the global sales. The United States ranked third.
As of October 2025, English was the dominant language for online content, used by nearly half of all websites worldwide. Spanish ranked second, accounting for around 6 percent of web content, followed by German with 5.9 percent.
English as the leading online language
The United States and India, the countries with the most internet users after China, are also the world's biggest English-speaking markets. The combined internet user base of the two countries, as of January 2023, was over a billion individuals. This has led to most online information being created in English. Consequently, even those who are not native speakers may use it for convenience.
Global internet usage by regions
As of October 2024, the number of internet users worldwide was 5.52 billion. In the same period, Northern Europe and North America led in terms of internet penetration rates worldwide, with around 97 percent of their populations accessing the internet.
In 2025, approximately 23 million people lived in the São Paulo metropolitan area, making it the biggest in Latin America and the Caribbean and the sixth most populated in the world. The state of the same name, São Paulo, was also the most populous federal entity in the country. Mexico City ranked second in the region, with 22.75 million inhabitants.
Brazil's cities
Brazil is home to two large metropolises: counting only the population within the city limits, São Paulo had approximately 11.45 million inhabitants and Rio de Janeiro around 6.21 million. It also contains a number of smaller but well-known cities such as Brasília, Salvador, Belo Horizonte, and many others, which report between 2 and 3 million inhabitants each. As a result, the country's population is primarily urban, with nearly 88 percent of inhabitants living in cities.
Mexico City
Mexico City's metropolitan area ranks seventh among the most populated cities in the world. Founded over the Aztec city of Tenochtitlan in 1521 after the Spanish conquest as the capital of the Viceroyalty of New Spain, the city still stands as one of the most important in Latin America. Nevertheless, the preeminent economic, political, and cultural position of Mexico City has not prevented the metropolis from suffering the problems affecting the rest of the country, namely inequality and violence. In 2023 alone, the city registered a crime incidence of 52,723 reported cases for every 100,000 inhabitants, and around 24 percent of the population lived under the poverty line.
The number of enrollments in language schools in Spain reveals that Spaniards are well aware of the importance of foreign languages in modern times. During the 2022/23 academic year, almost 331,000 people were enrolled in language schools in Spain to add a new language to their curricula. In a globalized world, languages are taking on a much more important role in the job market. The most studied and spoken languages in the world include English, Mandarin, Hindi, and Spanish.
The importance of language knowledge in the job market
Enrollment numbers at language schools come as no surprise considering that foreign languages have become a vital asset for job seekers in recent years. English, the most used language for international affairs, unsurprisingly ranked first on the list of most valued languages in the Spanish job market, requested in approximately 65.2 percent of the job openings that require foreign language skills. French followed far behind, with 17.38 percent of the job openings.
Languages in the Spanish multimedia scene
Most of the best-selling albums in Spain during 2022 were recorded in the country’s main language, Spanish, with 38 albums in the top 50. As for video games, 96 percent of the games produced in the country had English as a language option. Spanish was the second most used language, being present in 91 percent of productions.
As of August 2025, Brazil was home to ****** million Facebook users. In Latin America, it was followed by Mexico and Colombia with approximately *** million and ** million Facebook users, respectively. In the Caribbean, the Dominican Republic was the country with the largest number of people on the social media platform.
Facebook’s forecasted future
Brazil is expected to continue building up its Facebook audience in the coming years. It is estimated that by 2025, the South American nation will reach nearly ***** million users on the social network. By that same year, more than ***** million Mexicans are forecast to be on Facebook, according to another source. Despite such awaited growth, Facebook’s market share decreased in most of the six largest Latin American countries between 2019 and 2020; the exception was Chile, where an increase of *** percent was recorded. Concurrently, Instagram, also owned by Facebook, Inc., experienced an increase in its market share across the region.
Pandemic Facebook posting
In March 2020, when COVID-19 was officially characterized by the World Health Organization as a pandemic, Facebook users in Brazil made nearly ** percent more posts than they had during the same month a year prior. Furthermore, the contents of posts addressing the virus, made during the month of March throughout Latin America, were more visual than textual. Namely, ** percent of posts using the words ‘coronavirus’ or ‘COVID-19’ consisted of videos and almost ********* of them contained photos, whereas only *** percent included users’ statuses. At that same time, Latin American governments flocked to the social network to communicate with the region's inhabitants, increasing their Facebook posting behavior by almost ** percent in one year.