There were approximately 370 thousand Indian nationals residing in the United Kingdom in 2021, around thousand more than there were a year earlier.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
United States Employment: American Indian or Alaska Native data was reported at 1,980.000 Person th in Feb 2025. This is an increase from the previous figure of 1,956.000 Person th for Jan 2025. The data is updated monthly, with a median of 1,327.500 Person th from Jan 2000 to Feb 2025 across 302 observations. The data reached an all-time high of 1,980.000 Person th in Feb 2025 and a record low of 837.000 Person th in Oct 2003. The series remains in active status in CEIC and is reported by the U.S. Bureau of Labor Statistics. The data is categorized under Global Database’s United States – Table US.G030: Current Population Survey: Employment.
In 2020/21 there were approximately 696,000 Polish nationals living in the United Kingdom, the highest non-British population at this time. Indian and Irish were the joint second-largest nationalities at approximately 370,000 people.
All too often, archaeological studies of the Contact Period, as it occurred in the Chesapeake Bay region, have focused on the European impact on Native American life. The opposite side of this interaction—the effects Indians had on colonial life—has been downplayed. Indian-made artifacts found on colonial sites are often seen as little more than indicators of “trade.” However, a closer examination of the evidence suggests that the Native impact on English settlers was more profound. Using data from the NEH-funded Comparative Archaeological Study of Colonial Chesapeake Culture Project, Indian artifacts from a number of Chesapeake sites are being studied. This paper shows that pipes, pottery, beads, and other components of Indian material culture played an important and functional role in early colonial life. Indian materials eventually took on antiquarian significance as well. As a comparison to this study of colonial sites, the same data categories are then applied to two 17th-century Native American sites included as part of the NEH project, in order to measure the influence of European material culture on Indian life.
https://www.futurebeeai.com/data-license-agreement
Welcome to the Indian English Language Visual Speech Dataset! This dataset is a collection of diverse, single-person unscripted spoken videos supporting research in visual speech recognition, emotion detection, and multimodal communication.
This visual speech dataset contains 1000 videos in Indian English, each paired with a corresponding high-fidelity audio track. In each video, a participant answers a specific question in an unscripted and spontaneous manner.
Extensive guidelines were followed while recording each video to maintain quality and diversity.
The dataset provides comprehensive metadata for each video recording and participant:
Underway surface air temperature and sea water temperature were collected aboard the Skelton Castle while en route from England to Bombay, India, as part of the East India Company, from 28 February 1800 to 3 June 1800. The data were prepared by one Mr. R. Perrins on behalf of Sir Anthony Carlisle as part of a study "to determine whether fishes possess any other temperature than that of the water in which they live." A table containing the data was found in Nicholson's "Journal of Natural Philosophy", published in 1804.
This statistic represents results of a survey about the share of English speakers across India in 2019, by region. During the surveyed time period, the share of respondents who spoke English in urban areas was around 88 percent while this was about three percent for rural respondents.
https://www.futurebeeai.com/data-license-agreement
Welcome to the English Language General Conversation Speech Dataset, a comprehensive and diverse collection of voice data specifically curated to advance the development of English language speech recognition models, with a particular focus on Indian accents and dialects.
With high-quality audio recordings, detailed metadata, and accurate transcriptions, it empowers researchers and developers to enhance natural language processing, conversational AI, and Generative Voice AI algorithms. Moreover, it facilitates the creation of sophisticated voice assistants and voice bots tailored to the unique linguistic nuances found in the English language spoken in India.
Speech Data: This training dataset comprises 100 hours of audio recordings covering a wide range of topics and scenarios, ensuring robustness and accuracy in speech technology applications. To achieve this, we collaborated with a diverse network of 110 native English speakers from different parts of India. This collaborative effort guarantees a balanced representation of Indian accents, dialects, and demographics, reducing biases and promoting inclusivity.
Each audio recording captures the essence of spontaneous, unscripted conversations between two individuals, with an average duration ranging from 15 to 60 minutes. The speech data is available in WAV format, with stereo channel files having a bit depth of 16 bits and a sample rate of 8 kHz. The recording environment is generally quiet, without background noise and echo.
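The stated audio format (WAV, stereo, 16-bit, 8 kHz) can be checked before training with a few lines of code. The following is a minimal sketch using Python's standard `wave` module; the function names `check_wav_format` and `make_silent_wav` are hypothetical helpers introduced here for illustration and are not part of the dataset or of any FutureBeeAI tooling.

```python
import io
import wave

# Expected properties, taken from the dataset description above.
EXPECTED_CHANNELS = 2        # stereo
EXPECTED_SAMPLE_WIDTH = 2    # 16 bits = 2 bytes per sample
EXPECTED_SAMPLE_RATE = 8000  # 8 kHz


def check_wav_format(source):
    """Return a list of mismatches between a WAV file and the described format.

    `source` may be a file path string or a binary file-like object.
    An empty list means the file matches the description.
    """
    problems = []
    with wave.open(source, "rb") as wav:
        if wav.getnchannels() != EXPECTED_CHANNELS:
            problems.append(f"channels: {wav.getnchannels()}")
        if wav.getsampwidth() != EXPECTED_SAMPLE_WIDTH:
            problems.append(f"sample width (bytes): {wav.getsampwidth()}")
        if wav.getframerate() != EXPECTED_SAMPLE_RATE:
            problems.append(f"sample rate (Hz): {wav.getframerate()}")
    return problems


def make_silent_wav(channels, sample_width, sample_rate, seconds=1):
    """Build an in-memory WAV of silence, used here only to exercise the check."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(sample_width)
        wav.setframerate(sample_rate)
        wav.writeframes(b"\x00" * channels * sample_width * sample_rate * seconds)
    buffer.seek(0)
    return buffer


if __name__ == "__main__":
    # A file matching the description yields no problems.
    print(check_wav_format(make_silent_wav(2, 2, 8000)))
    # A mono 16 kHz file is flagged on two counts.
    print(check_wav_format(make_silent_wav(1, 2, 16000)))
```

Returning a list of mismatches rather than raising on the first one makes it easier to audit a whole directory of recordings in one pass.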
Metadata: In addition to the audio recordings, our dataset provides comprehensive metadata for each participant. This metadata includes the participant's age, gender, country, state, and dialect. Furthermore, additional metadata such as recording device details, topic of recording, bit depth, and sample rate will be provided.
The metadata serves as a valuable tool for understanding and characterizing the data, facilitating informed decision-making in the development of English language speech recognition models.
Transcription: This dataset provides a manual verbatim transcription of each audio file to enhance your workflow efficiency. The transcriptions are available in JSON format. They are segmented by speaker and time-coded, and include non-speech labels and tags.
Our goal is to expedite the deployment of English language conversational AI and NLP models by offering ready-to-use transcriptions, ultimately saving valuable time and resources in the development process.
Updates and Customization: We understand the importance of collecting data in various environments to build robust ASR models. Therefore, our voice dataset is regularly updated with new audio data captured in diverse real-world conditions.
If you require a custom training dataset with specific environmental conditions such as in-car, busy street, restaurant, or any other scenario, we can accommodate your request. We can provide voice data with customized sample rates ranging from 8kHz to 48kHz, allowing you to fine-tune your models for different audio recording setups. Additionally, we can also customize the transcription following your specific guidelines and requirements, to further support your ASR development process.
License: This audio dataset, created by FutureBeeAI, is now available for commercial use.
Conclusion: Whether you are training or fine-tuning speech recognition models, advancing NLP algorithms, exploring generative voice AI, or building cutting-edge voice assistants and bots, our dataset serves as a reliable and valuable resource.
https://www.futurebeeai.com/data-license-agreement
Welcome to the Indian English Scripted Monologue Speech Dataset for the General Domain. This meticulously curated dataset is designed to advance the development of General domain English language speech recognition models.
This training dataset comprises over 6,000 high-quality scripted prompt recordings in Indian English. These recordings cover various General domain topics and scenarios, designed to build robust and accurate speech technology.
Each scripted prompt is crafted to reflect real-life scenarios encountered in the General domain, ensuring applicability in training robust natural language processing and speech recognition models.
In addition to high-quality audio recordings, the dataset includes meticulously prepared text files with verbatim transcriptions of each audio file. These transcriptions are essential for training accurate and robust speech recognition models.
The dataset provides comprehensive metadata for each audio recording and participant:
https://www.futurebeeai.com/data-license-agreement
Welcome to the Indian English Scripted Monologue Speech Dataset for the Travel Domain. This meticulously curated dataset is designed to advance the development of English language speech recognition models, particularly for the Travel industry.
This training dataset comprises over 6,000 high-quality scripted prompt recordings in Indian English. These recordings cover various topics and scenarios relevant to the Travel domain, designed to build robust and accurate customer service speech technology.
Each scripted prompt is crafted to reflect real-life scenarios encountered in the Travel domain, ensuring applicability in training robust natural language processing and speech recognition models.
In addition to high-quality audio recordings, the dataset includes meticulously prepared text files with verbatim transcriptions of each audio file. These transcriptions are essential for training accurate and robust speech recognition models.
Data are presented for daily rainfall, stream discharge and hydraulic conductivity of soils from catchments located in the Upper Nilgiris Reserve Forest in the state of Tamil Nadu. The catchments are dominated by four land cover types, shola, grassland, pine and wattle. The data were collected between May 2014 and December 2016. Tipping bucket wired rain gauges were used to measure rainfall. Stream discharge was measured from stilling wells and capacitance probe-based water level recorders. A mini-disk infiltrometer was used to measure the hydraulic conductivity of soils. Dry season data has not been included in this dataset as its focus is on extreme rain events. The data were collected as part of a series of eco-hydrology projects that explored the impact of land cover on rain-runoff response, carbon sequestration and nutrient and sediment discharge. The dataset presented here was collected by a team of three to five researchers and field assistants who were engaged in the installation of the data loggers and their regular operation and maintenance. Four research agencies have partnered across multiple projects to sustain the data collection efforts that started in June 2013 and continue (June 2020). These are the Foundation for Ecological Research, Advocacy and Learning - Pondicherry, the Ashoka Trust for Research in Ecology and the Environment - Bangalore, the Lancaster Environmental Centre, Lancaster University - UK, and the National Centre for Biological Sciences - Bangalore. Funding was provided by Ministry of Earth Sciences Government of India from the Changing Water Cycle programme (Grant Ref: MoES/NERC/16/02/10 PC-II) and the Hydrologic footprint of Invasive Alien Species project (MOES/PAMC/H&C/85/2016-PC-II). 
Additional funding was provided by UKRI Natural Environment Research Council grant NE/I022450/1 (Western Ghats-Capacity within the NERC Changing Water Cycle programme) and WWF-India as part of the Noyyal-Bhavani program. This research took place inside protected areas in the Nilgiri Division for which permissions and support were provided continually by the Tamil Nadu Forest Department, particularly the office of the District Forest Officer, Udhagamandalam.
Data collected between 2014 and 2016 from self-identified lesbian, gay, bisexual, trans and queer (LGBTQ) individuals in India and the UK. This data was collected at specific workshops held in India and the UK, and via the project's website (see Related Resources).
The study used a 7-phase mixed methods design:
1. Project planning and research design, including formally establishing the advisory group and meeting 1, setting milestones and setting in place all agreements/ethical approvals.
2. Literature review exploring key measures used to rate and assess LGBTQ 'friendliness'/inclusion nationally, supra-nationally and internationally.
3. A spatial assessment of LGBTQ liveabilities that includes, but moves beyond, the measures identified in phase 2, applying these at a local scale, e.g. policy indicators and place-based cultural indicators.
4. Twenty focus groups (80 participants, sample targeting marginalised LGBTQ people), coupled with online qualitative questionnaires (150), and shorter SMS text questionnaires (200)/App responses (200), to add to the liveability index created in phase 3 and to identify what makes life un/liveable for a range of LGBTQ people and how this varies spatially.
5. Participants in the data collection will be invited to reconfigure place through UK/India street theatre performances. These will be video recorded, edited into one short video and widely distributed. Data will be collected by observing interactions; on-the-spot audience surveys; reflections on the event.
6. The research will analyse the data sets as they are collected. At the end of the data collection phase time will be taken to look across all 4 data sets to create a liveability index.
7. Research dissemination will be targeted at community and academic audiences, including end of project conferences in India/UK, collating policy/community reports, academic outputs.
The impact plan details the short (transnational support systems; empowerment of participants), medium (policy changes, inform practice) and long-term (changing perceptions of LGBTQ people) social impacts and how these will be achieved.
The main research objective is to move beyond exclusion/inclusion of Lesbian, Gay, Bisexual, Trans, Queer (LGBTQ) communities in the UK and India, creating a liveability model that can be adapted globally. Whilst work has been done to explore the implications of Equalities legislation, including contesting the normalisations of neo-liberalisms, there has yet to be an investigation into what might make everyday spaces liveable for LGBTQ people. This project addresses social exclusion, not only through identifying exclusions, but also by exploring how life might become liveable in everyday places in two very different contexts. In 2013 the Marriage (Same Sex Couples) Act passed in the UK, and in India the Delhi High Court's 2009 reading down of Indian Penal Code 377, which decriminalized sexual acts between consenting same-sex people, was overturned by the Supreme Court. Yet bullying, mental health and safety continue to be crucial to understanding British LGBTQ lives; in contrast, the 2013 overturning of the reading down of Penal Code 377 has resulted in increased visibilities of LGBTQ people. These different contexts are used to explore liveable lives as more than lives that are just 'bearable' and to move beyond norms of happiness and wellbeing. This research refuses to be fixed to understanding social liberations through the exclusion/inclusion, in place/out of place dichotomies. It uses commonplace to move beyond 'in place' towards being common to the place itself. Place can then be shared in common as well as collectively made in ways that do not necessarily impose normative agendas/regulatory conditionalities. Social liberations are examined in the transformation of everyday encounters without conforming to hegemonies or making 'normal' our own. Whilst the focus is sexual and gender liberations, the project will enable considerations of other social differences. 
It will show how places produce differential liveabilities both where legislative change has been achieved and where it has just been repealed. Thus, the project offers academic and policy insights into safety, difference and vibrant and fair societies.
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
This dataset contains the results of a SPARQL query run on the Wikidata Query Service on 13 February 2020 around 22:25 UTC. The results are archived here as a means to determine progress with the coverage of disease-related terms in languages other than English, particularly in languages of India.
The SPARQL query was for:
Wikidata items for concepts that have a Disease Ontology ID (P699)
sorted by number of sitelinks
optionally with their Wikidata label in English
optionally with their Wikipedia article title in English
optionally with their Wikidata label in Hindi, Bangla and Swahili
optionally with their Wikidata label in Marathi, Telugu, Eastern Punjabi, Western Punjabi, Gujarati, Maithili, Kannada, Odia, Bhojpuri, Tamil, Nepali, Urdu, Malayalam, Esperanto
The query is contained in the file SPARQL.txt, whereas the results are available in several formats, as provided by the Wikidata Query Service:
query.csv
query.tsv
query.html
query.json.txt (Zenodo produced an error upon trying to upload the file as query.json, so I renamed it, which worked fine).
A simplified version of the SPARQL query can also be fed into the TABernacle tool that represents the live data in a way that facilitates editing the missing pieces.
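To make the shape of such a query concrete, here is a rough sketch that assembles a comparable SPARQL string in Python. It is not the archived query from SPARQL.txt: the function name `build_query`, the variable names, the descending sort order, and the reduced set of language codes (`hi`, `bn`, `sw`, `ta`) are all assumptions made for illustration. The prefixes `wdt:`, `wikibase:`, `rdfs:` and `schema:` are the ones predefined by the Wikidata Query Service.

```python
# Hypothetical subset of language codes; the archived query covers more languages.
LANGUAGES = {"hi": "Hindi", "bn": "Bangla", "sw": "Swahili", "ta": "Tamil"}


def build_query(language_codes):
    """Assemble a SPARQL query string for items with a Disease Ontology ID (P699)."""
    select_vars = ["?item", "?doid", "?sitelinks", "?label_en", "?article_en"]
    optional_blocks = [
        # English label, if any.
        '  OPTIONAL { ?item rdfs:label ?label_en . FILTER(LANG(?label_en) = "en") }',
        # English Wikipedia article title, if any.
        "  OPTIONAL { ?article schema:about ?item ;\n"
        "                      schema:isPartOf <https://en.wikipedia.org/> ;\n"
        "                      schema:name ?article_en . }",
    ]
    # One optional label column per requested language.
    for code in language_codes:
        var = f"?label_{code}"
        select_vars.append(var)
        optional_blocks.append(
            f'  OPTIONAL {{ ?item rdfs:label {var} . FILTER(LANG({var}) = "{code}") }}'
        )
    lines = [
        "SELECT " + " ".join(select_vars) + " WHERE {",
        "  ?item wdt:P699 ?doid ;",
        "        wikibase:sitelinks ?sitelinks .",
        *optional_blocks,
        "}",
        # Assumption: most-linked items first; the description does not give the direction.
        "ORDER BY DESC(?sitelinks)",
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    print(build_query(LANGUAGES))
```

The printed string can be pasted into the Wikidata Query Service to inspect live results. Because every label is wrapped in `OPTIONAL`, items without a label in a given language still appear, with an empty cell that marks a coverage gap.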
This data provides the integrated cadastral framework for the specified Canada Land. The cadastral framework consists of active and superseded cadastral parcels, roads, easements, administrative areas, active lines, points and annotations. The cadastral lines form the boundaries of the parcels. COGO attributes are associated with the lines and depict the adjusted framework of the cadastral fabric. The cadastral annotations consist of lot numbers, block numbers, township numbers, etc. The cadastral framework is compiled from Canada Lands Survey Records (CLSR), Registration Plans (RS) and Location Sketches (LS) archived in the Canada Lands Survey Records.
Sikhism is a religion that originated on the Indian subcontinent during the fifteenth century. Sikhs follow the teachings of 'gurus', who descend from the first guru, Guru Nanak, who established the faith. Followers of Sikhism are monotheists, believing in only one god, and other core beliefs include the need to meditate, the importance of community and communal living, and the need to serve humanity selflessly (or 'seva').
Sikhism and the British Empire
In total, there are around 26 million Sikhs worldwide, and over 24 million of these live in India. Outside of India, the largest Sikh populations are mostly found in former territories of the British Empire - the UK and Canada both have Sikh populations of over half a million people. Migration from India to other parts of the British Empire was high in the 19th century, due to the labor demands of relatively newer colonies, as well as those where slavery had been abolished. These countries also remain popular destinations for Sikh migrants today, as many are highly trained and English-speaking.
Other regions with significant Sikh populations
Italy also has a sizeable Sikh population, as many migrated there after serving there in the British Army during WWI, and they are now heavily represented in Italy's dairy industry. The Sikh population of Saudi Arabia is also reflective of the fact that the largest Indian diaspora in the world can now be found in the Middle East - this is due to the labor demands of the fossil fuel industries and their associated secondary industries, although a large share of Indians in this part of the world are there on a temporary basis.
https://www.futurebeeai.com/data-license-agreement
Welcome to the Indian English Scripted Monologue Speech Dataset for the Real Estate Domain. This meticulously curated dataset is designed to advance the development of English language speech recognition models, particularly for the Real Estate industry.
This training dataset comprises over 6,000 high-quality scripted prompt recordings in Indian English. These recordings cover various topics and scenarios relevant to the Real Estate domain, designed to build robust and accurate customer service speech technology.
Each scripted prompt is crafted to reflect real-life scenarios encountered in the Real Estate domain, ensuring applicability in training robust natural language processing and speech recognition models.
In addition to high-quality audio recordings, the dataset includes meticulously prepared text files with verbatim transcriptions of each audio file. These transcriptions are essential for training accurate and robust speech recognition models.
https://www.futurebeeai.com/data-license-agreement
Welcome to the Indian English Scripted Monologue Speech Dataset for the Retail & E-commerce Domain. This meticulously curated dataset is designed to advance the development of English language speech recognition models, particularly for the Retail & E-commerce industry.
This training dataset comprises over 6,000 high-quality scripted prompt recordings in Indian English. These recordings cover various topics and scenarios relevant to the Retail & E-commerce domain, designed to build robust and accurate customer service speech technology.
Each scripted prompt is crafted to reflect real-life scenarios encountered in the Retail & E-commerce domain, ensuring applicability in training robust natural language processing and speech recognition models.
In addition to high-quality audio recordings, the dataset includes meticulously prepared text files with verbatim transcriptions of each audio file. These transcriptions are essential for training accurate and robust speech recognition models.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Text-to-speech (TTS) voices, which vary in their apparent native language and dialect, are increasingly widespread. In this paper, we test how speakers perceive and align toward TTS voices that represent American, British, and Indian dialects of English and the extent to which social attitudes shape patterns of convergence and divergence. We also test whether top-down knowledge of the talker, manipulated as a “human” or “device” guise, mediates these attitudes and accommodation. Forty-six American English-speaking participants completed identical interactions with 6 talkers (2 from each dialect) and rated each talker on a variety of social factors. Accommodation was assessed with AXB perceptual similarity by a separate group of raters. Results show that speakers had the strongest positive social attitudes toward the Indian English voices and converged toward them more. Conversely, speakers rated the American English voices as less human-like and diverged from them. Finally, speakers overall showed more accommodation toward TTS voices that were presented in a “human” guise. We discuss these results through the lens of the Communication Accommodation Theory (CAT).
Life expectancy in India was 25.4 in the year 1800, and over the course of the next 220 years, it has increased to almost 70. Between 1800 and 1920, life expectancy in India remained in the mid to low twenties, with the largest declines coming in the 1870s and 1910s. This was because of the Great Famine of 1876-1878 and the Spanish Flu Pandemic of 1918-1919, which were responsible for the deaths of up to six and seventeen million Indians respectively, as well as the presence of other endemic diseases in the region, such as smallpox. From 1920 onwards, India's life expectancy has consistently increased, but it is still below the global average.
In 2022, the majority of Indian adults had a wealth of 10,000 U.S. dollars or less. On the other hand, about 0.1 percent were worth more than one million dollars that year.
India
The Republic of India is one of the world’s largest and most economically powerful states. India gained independence from Great Britain on August 15, 1947, after having been under its power for 200 years. With a population of about 1.4 billion people, it was the second most populous country in the world. Of that 1.4 billion, about 28.5 million lived in New Delhi, the capital.
Wealth inequality
India suffers from extreme income inequality. It is estimated that the top 10 percent of the population holds 77 percent of the national wealth. Billionaire fortunes have increased sporadically in recent years, whereas minimum wages have remained stunted.