In 2025, around 1.53 billion people worldwide spoke English either natively or as a second language, more than the 1.18 billion Mandarin Chinese speakers at the time of the survey. Hindi and Spanish were the third and fourth most widely spoken languages that year.

Languages in the United States
The United States does not have an official language, but the country uses English, specifically American English, for legislation, regulation, and other official pronouncements. The United States is a land of immigration, and the languages spoken there vary as a result of the multicultural population. The second most common language spoken in the United States is Spanish or Spanish Creole, which more than 43 million people spoke at home in 2023. There were also 3.5 million Chinese speakers (including both Mandarin and Cantonese), 1.8 million Tagalog speakers, and 1.57 million Vietnamese speakers counted in the United States that year.

Different languages at home
The percentage of people in the United States speaking a language other than English at home varies from state to state. The state with the highest percentage is California, where about 45 percent of the population spoke a language other than English at home in 2023.
As of 2025, ***** was the most spoken Indian language worldwide and ranked third globally, with approximately *** million speakers. ******* was the second most spoken Indian language, with approximately *** million speakers globally.
As of February 2025, English was the most popular language for web content, with over 49.4 percent of websites using it. Spanish ranked second, with six percent of web content, followed by German, with 5.6 percent.

English as the leading online language
The United States and India, the countries with the most internet users after China, are also the world's biggest English-speaking markets. As of January 2023, the combined internet user base of the two countries was over a billion individuals. This has led to most online information being created in English. Consequently, even those who are not native speakers may use it for convenience.

Global internet usage by regions
As of October 2024, the number of internet users worldwide was 5.52 billion. In the same period, Northern Europe and North America led in internet penetration rates worldwide, with around 97 percent of their populations accessing the internet.
Attribution-ShareAlike 4.0 (CC BY-SA 4.0) https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
Latgalian Tezaurs (LTG T) is a lexical database and online dictionary of Latgalian (ISO 639-3 ltg). The pilot version of December 2024 contains more than 450 entries, including many idioms and other multi-word units. Entries include spelling variants and dialect forms and name the sources where the lexical unit has been documented. Audio recordings illustrate pronunciation by native speakers. Inflection tables show the current standard. Word senses are defined in Latvian and illustrated with selected examples from the corpora of written and spoken Latgalian.
https://dataintelo.com/privacy-and-policy
The global market size for digital Spanish language learning was valued at approximately USD 1.2 billion in 2023 and is projected to reach around USD 3.8 billion by 2032, growing at a robust CAGR of 13.6% from 2024 to 2032. This impressive growth is driven by numerous factors, including the increasing globalization and cultural exchange, technological advancements in digital learning platforms, and the rising demand for multilingual proficiency in the professional world. These growth factors are collectively contributing to the substantial expansion of the digital Spanish language learning market.
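The relationship between the start value, end value, and CAGR quoted above can be checked with a few lines of arithmetic. The sketch below is only an illustration: the function names are hypothetical, and it assumes 2023 as the base year with nine annual compounding periods to 2032, which the report does not state explicitly.

```python
def cagr(start_value: float, end_value: float, years: int) -> float:
    """Compound annual growth rate implied by two values `years` periods apart."""
    return (end_value / start_value) ** (1 / years) - 1


def project(start_value: float, rate: float, years: int) -> float:
    """Value reached after compounding `rate` annually for `years` periods."""
    return start_value * (1 + rate) ** years


# Figures from the text, in USD billions; the 9-year span is an assumption.
implied_rate = cagr(1.2, 3.8, 9)
projected_value = project(1.2, 0.136, 9)
print(f"implied CAGR: {implied_rate:.1%}")
print(f"projected 2032 value: {projected_value:.2f} billion USD")
```

Under these assumptions the implied rate and the projected value both land close to the quoted 13.6% and USD 3.8 billion, so the figures in the paragraph are mutually consistent.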
One of the primary growth drivers for this market is the increasing globalization of business and the growing importance of Spanish as a global language. With over 580 million speakers worldwide, Spanish ranks as the second most spoken native language, following Mandarin. Businesses, educational institutions, and individuals are increasingly recognizing the value of Spanish proficiency, leading to a surge in demand for effective and accessible language learning solutions. This trend is particularly pronounced in the corporate sector, where organizations are looking to enhance their workforce's language skills to facilitate better communication with Spanish-speaking clients and partners.
Technological advancements have also played a crucial role in propelling the market forward. The proliferation of smartphones, high-speed internet connections, and advanced software applications has made digital language learning more accessible and engaging. Innovative features such as artificial intelligence, machine learning, and immersive virtual reality experiences are being integrated into language learning platforms, providing users with personalized and interactive learning experiences. These technological innovations are not only enhancing the effectiveness of language learning but also making it more appealing to a broader audience.
Furthermore, the COVID-19 pandemic has acted as a catalyst for the growth of the digital Spanish language learning market. With traditional classroom-based learning disrupted, there has been a significant shift towards online education, including language learning. The convenience, flexibility, and accessibility offered by digital platforms have attracted a diverse range of learners, from individual enthusiasts to educational institutions and corporate entities. This shift is expected to have a lasting impact, with online and digital learning becoming an integral part of the education landscape even in the post-pandemic era.
Regionally, North America and Europe have been at the forefront of adopting digital Spanish language learning solutions, driven by a combination of high internet penetration, a strong emphasis on education, and a multicultural population. However, the Asia Pacific region is emerging as a significant growth market, fueled by increasing interest in language learning, rapid digitalization, and the growing presence of global businesses requiring multilingual capabilities. Latin America, with its native Spanish-speaking population, also presents substantial opportunities for market expansion, particularly in the educational and corporate sectors.
The rise of language learning apps has significantly contributed to the accessibility and convenience of acquiring new languages. These apps offer a variety of features, such as interactive exercises, real-time feedback, and community engagement, which make learning more engaging and effective. The ability to learn anytime and anywhere has made them particularly popular among busy professionals and students who seek to integrate language acquisition into their daily routines. As technology continues to evolve, these apps are incorporating advanced features like speech recognition and AI-driven personalized learning paths, further enhancing the user experience and the effectiveness of language learning.
The digital Spanish language learning market is segmented by product type into software, apps, online courses, and tutoring services. Each segment caters to different preferences and needs of learners, offering a diverse range of options for acquiring Spanish language skills. Software solutions, including comprehensive language learning programs, h
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0) https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
The corpus contains recordings by native speakers of North Levantine Arabic (apc) acquired during 2020, 2021, and 2023 in Prague, Paris, Kabardia, and St. Petersburg. Altogether, there were 13 speakers (9 male and 4 female; one aged 15-20, seven aged 20-30, four aged 30-40, and one aged 40-50).
The recordings contain both monologues and dialogues on topics of everyday life (health, education, family life, sports, culture) as well as information on both the host countries (living abroad) and the country of origin (Syrian traditions, education system, etc.). Both types are spontaneous: the participants were given only the general subject and talked on the topic or discussed it freely. The transcription and translation team consisted of students of Arabic at Charles University, with an additional quality check provided by native speakers of the dialect.
The textual data is split between the (parallel) transcriptions (.apc) and translations (.eng), with one segment per line. The additional .yaml file provides a mapping to the corresponding audio file (with the duration and offset in the "%S.%03d" format, i.e., seconds and milliseconds) and a unique speaker ID. The audio data is shared in the 48 kHz .wav format, with dialogues and monologues in separate folders. All of the recordings are mono, with a single channel. For dialogues, there is a separate file for each speaker, e.g., "16072022_Family-01.wav" and "16072022_Family-02.wav".
The data provided in this repository corresponds to the test split of the dialectal Arabic to English shared task hosted at the 22nd edition of the International Conference on Spoken Language Translation, i.e., IWSLT 2025.
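The file layout described above (one segment per line in parallel .apc and .eng files, with offsets and durations written as seconds and milliseconds) could be consumed along the following lines. This is a minimal sketch, not the corpus authors' tooling: the function names are hypothetical, the timing values are assumed to appear as strings such as "12.345", and the structure of the .yaml file is deliberately not modeled because it is not specified here.

```python
from pathlib import Path


def split_segments(text: str) -> list[str]:
    """Split file content into segments, one per line."""
    # Split on "\n" only, so other Unicode separators inside a segment survive.
    return text.rstrip("\n").split("\n")


def pair_segments(apc_text: str, eng_text: str) -> list[tuple[str, str]]:
    """Align transcription and translation segments by line position."""
    apc_lines = split_segments(apc_text)
    eng_lines = split_segments(eng_text)
    if len(apc_lines) != len(eng_lines):
        raise ValueError("parallel files must have the same number of lines")
    return list(zip(apc_lines, eng_lines))


def read_parallel(apc_path: str, eng_path: str) -> list[tuple[str, str]]:
    """Read a .apc/.eng file pair from disk (paths supplied by the caller)."""
    apc_text = Path(apc_path).read_text(encoding="utf-8")
    eng_text = Path(eng_path).read_text(encoding="utf-8")
    return pair_segments(apc_text, eng_text)


def parse_timestamp(value: str) -> int:
    """Convert a '%S.%03d' string (seconds.milliseconds) to milliseconds."""
    seconds_part, millis_part = value.split(".")
    return int(seconds_part) * 1000 + int(millis_part)
```

Keeping timings as integer milliseconds avoids floating-point rounding when offsets and durations are added to locate a segment in the audio.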
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Khan, Rabia; Khan, Huzaifa Saleem; Ijaz, Shireen (2025), “English-Pashto Language Dataset (EPLD)”, Mendeley Data, V1, doi: 10.17632/vmgv4s6vrn.1
https://www.kaggle.com/huzaifasaleemkhan
CC BY 4.0
10.17632/vmgv4s6vrn.1
The English-Pashto Language Dataset (EPLD) is a resource intended to provide linguistic insights into the Pashto language. It covers the basics of communication in Pashto: counting, the alphabet, pronouns, and basic sentences used in everyday life. Every entry is translated from English to Pashto for better human understanding and clarity. The data has been proofread and verified by native speakers and language experts. Pashto has multiple variations and accents depending on geographical factors. This dataset addresses the key differences between Pashto words and sounds, which may sound similar to or different from English and which vary with gender, the tense of the statement, the relationship of the speaker, and so on. The dataset is designed to support language learning, natural language processing (NLP) research, and computational linguistic studies focusing on the Pashto language.
The dataset consists of four .xml files. Each XML file is structured with tags for easy parsing and integration into computational systems. The data organization within the files is intended to make extraction and manipulation straightforward for research or application purposes.
Each of the four files focuses on a specific aspect of the Pashto language.
• Contains numeric data, counting from 0 to 100.
• Defines what numbers are called in Pashto (for example, "10" is called "Ten" in English and "Lass" in Pashto).
• Each number is represented in English and then translated into Pashto.
• Contains the Pashto alphabet and the sound of each letter.
• Includes the letters that sound similar to and different from English.
• Showcases the variation of pronouns used in Pashto on the basis of gender (masculine and feminine).
• The pronouns also vary across 1st person, 2nd person, and 3rd person.
• Contains 104 basic sentences.
• The sentences are diverse in nature.
• The English sentences are translated into Pashto following the rules and grammar of the language.
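Because the files are described as tag-structured XML, extracting English-Pashto pairs could look roughly like the sketch below. The tag names (`entry`, `english`, `pashto`) and the sample document are assumptions made purely for illustration; the actual schema of the EPLD files is not given here and should be checked before reuse. Only the "Ten"/"Lass" pair comes from the description above.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample; the real files may use different tag names and nesting.
SAMPLE_XML = """<numbers>
  <entry><english>Ten</english><pashto>Lass</pashto></entry>
</numbers>"""


def load_pairs(xml_text, item_tag="entry", source_tag="english", target_tag="pashto"):
    """Collect (English, Pashto) text pairs from every `item_tag` element."""
    root = ET.fromstring(xml_text)
    pairs = []
    for item in root.iter(item_tag):
        source = item.findtext(source_tag)
        target = item.findtext(target_tag)
        # Skip items that lack either side of the translation pair.
        if source is not None and target is not None:
            pairs.append((source.strip(), target.strip()))
    return pairs


print(load_pairs(SAMPLE_XML))
```

Passing the tag names as parameters keeps the sketch adaptable once the real element names in each of the four files are known.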
https://dataintelo.com/privacy-and-policy
The global dubbing market size is projected to grow from $2.1 billion in 2023 to approximately $3.8 billion by 2032, reflecting a compound annual growth rate (CAGR) of 6.5%. This robust growth can be attributed to the increasing demand for localized content in various languages, catering to diverse audiences around the globe. The globalization of the entertainment industry and the rise in digital content consumption are key growth drivers for the dubbing industry.
One of the significant growth factors of the dubbing market is the proliferation of streaming platforms. With the advent of global streaming giants like Netflix, Amazon Prime, and Disney+, there is a burgeoning demand for dubbed content to cater to non-native language speakers. These platforms are investing heavily in localization to expand their user base across different linguistic demographics, significantly boosting the demand for dubbing services. Furthermore, advancements in dubbing technologies, such as AI and machine learning, are making the process more efficient and cost-effective, thus encouraging more media producers to opt for dubbing.
Another key driver is the surge in international co-productions. As entertainment industries across the globe collaborate more frequently, the need for dubbing has escalated to make content accessible and marketable in different regions. This trend is particularly notable in the Asia-Pacific region, where countries like China, India, and South Korea are seeing a significant rise in the production of content that is later dubbed for international markets. Moreover, the growing popularity of anime and K-dramas globally has further accelerated the demand for dubbing services.
The role of governmental regulations and incentives in promoting local language content is another crucial growth factor. Several countries have mandated the availability of content in their local languages to preserve cultural heritage and enhance accessibility. For example, the European Union has introduced quotas requiring streaming services to offer a certain percentage of European content, often necessitating dubbing. Additionally, tax incentives and grants for local language production encourage content creators to dub their works, thereby driving market growth.
Regionally, the dubbing market is witnessing varied growth patterns. North America remains a stronghold due to its advanced entertainment industry and the presence of leading streaming platforms. However, the Asia-Pacific region is emerging as a lucrative market owing to its vast and diverse audience base. Europe continues to be significant due to its multilingual population and stringent content localization regulations. Latin America and the Middle East & Africa are also showing promising growth, driven by increasing internet penetration and the rising popularity of global content.
The type segment of the dubbing market comprises voice dubbing, lip sync dubbing, and subtitle dubbing. Voice dubbing is primarily used for animated series and movies where matching lip movements to the dialogue is not required. This type has gained traction due to the increasing popularity of animated content and video games, where voice quality and emotional expression are critical. The flexibility and cost-efficiency of voice dubbing make it a popular choice among content creators and distributors.
Lip sync dubbing, where the dubbed dialogue is synchronized with the lip movements of the actors, is particularly prevalent in live-action films and TV series. This type is more intricate and expensive than voice dubbing, as it requires meticulous timing and skilled voice actors to ensure a natural viewing experience. The growing demand for live-action content across various languages and regions has led to an increased focus on lip sync dubbing services. With advancements in AI and machine learning, this process is becoming more streamlined, further propelling its adoption.
Subtitle dubbing, which involves translating and displaying the dialogue as text on the screen, is often used as a cost-effective alternative to voice and lip sync dubbing. While it does not provide the same immersive experience as other types, it is widely accepted for its affordability and ease of implementation. This type is particularly popular in regions with lower budgets for media production. However, the effectiveness of subtitle dubbing largely depends on the viewers' literacy levels and their preference for reading subtitles while watching content.
https://www.statsndata.org/how-to-order
The K-12 English Language Learning (ELL) market has emerged as a vital component of the educational landscape, catering to the diverse linguistic needs of students across various educational settings. This sector focuses on providing resources and pedagogical methodologies that enable non-native English speakers to
The statistic reflects the distribution of languages in Canada in 2022. In 2022, 87.1 percent of the total population in Canada spoke English as their native tongue.