Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset consists of three years of population data for Tamil Nadu.
This file contains information about the places, population, district, and position of each place.
This work was done during an internship at Tact Labs. Thanks to Aishwarya, who helped me collect the dataset.
https://www.futurebeeai.com/policies/ai-data-license-agreement
This Tamil Call Center Speech Dataset for the Real Estate industry is purpose-built to accelerate the development of speech recognition, spoken language understanding, and conversational AI systems tailored for Tamil-speaking real estate customers. With over 30 hours of unscripted, real-world audio, this dataset captures authentic conversations between customers and real estate agents, making it well suited to building robust ASR models.
Curated by FutureBeeAI, this dataset equips voice AI developers, real estate tech platforms, and NLP researchers with the data needed to create high-accuracy, production-ready models for property-focused use cases.
The dataset features 30 hours of dual-channel call center recordings between native Tamil speakers. Captured in realistic real estate consultation and support contexts, these conversations span a wide array of property-related topics, from inquiries to investment advice, offering deep domain coverage for AI model development.
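Dual-channel recordings are often split into one file per speaker before ASR training. The sketch below shows one way to do that with Python's standard `wave` module. It assumes the recordings are uncompressed PCM WAV files with exactly two interleaved channels, which the description above does not confirm; the function name `split_channels` and the file paths are hypothetical.

```python
import wave


def split_channels(stereo_path, first_path, second_path):
    """Split a 2-channel PCM WAV file into two mono WAV files.

    Assumption: the input is uncompressed PCM with interleaved channels.
    Returns the number of frames written to each output file.
    """
    with wave.open(stereo_path, "rb") as src:
        if src.getnchannels() != 2:
            raise ValueError("expected a 2-channel recording")
        width = src.getsampwidth()   # bytes per sample
        rate = src.getframerate()
        frames = src.readframes(src.getnframes())

    frame_size = 2 * width           # one sample per channel in each frame
    first = bytearray()
    second = bytearray()
    for i in range(0, len(frames), frame_size):
        first += frames[i:i + width]
        second += frames[i + width:i + frame_size]

    for out_path, data in ((first_path, first), (second_path, second)):
        with wave.open(out_path, "wb") as dst:
            dst.setnchannels(1)
            dst.setsampwidth(width)
            dst.setframerate(rate)
            dst.writeframes(bytes(data))

    return len(first) // width
```

Which channel holds the agent and which holds the customer is not stated in the description, so that mapping would need to be checked against the dataset's own metadata.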
This speech corpus includes both inbound and outbound calls, featuring positive, neutral, and negative outcomes across a wide range of real estate scenarios.
Such domain-rich variety ensures model generalization across common real estate support conversations.
All recordings are accompanied by precise, manually verified transcriptions in JSON format.
These transcriptions streamline ASR and NLP development for Tamil real estate voice applications.
Detailed metadata accompanies each participant and conversation:
This enables smart filtering, dialect-focused model training, and structured dataset exploration.
This dataset is ideal for voice AI and NLP systems built for the real estate sector:
https://creativecommons.org/publicdomain/zero/1.0/
It is easy to find a list of constituencies, but geodata for each one is much harder to come by, so this dataset may help with that.
Thanks to:
http://projects.datameet.org/maps/
https://www.elections.tn.gov.in/Form21C_TNLA2021.aspx
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Population: Tamil Nadu data was reported at 77.222 Person mn in 2025. This records an increase from the previous number of 76.993 Person mn for 2024. Population: Tamil Nadu data is updated yearly, averaging 66.611 Person mn from Mar 1994 (Median) to 2025, with 32 observations. The data reached an all-time high of 77.222 Person mn in 2025 and a record low of 57.670 Person mn in 1994. Population: Tamil Nadu data remains active status in CEIC and is reported by Ministry of Statistics and Programme Implementation. The data is categorized under Global Database’s India – Table IN.GBG001: Population. [COVID-19-IMPACT]
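As a rough illustration of the figures quoted above, the following sketch computes the 2024 to 2025 change and an average annual growth rate between the 1994 low and the 2025 high. It uses only the numbers in the text; treating 1994 to 2025 as a 31-year span is an assumption, and the function names are illustrative.

```python
def percent_change(previous, current):
    """Percentage change from `previous` to `current`."""
    return (current - previous) / previous * 100.0


def compound_annual_growth(start, end, years):
    """Average annual growth rate, in percent, over `years` years."""
    return ((end / start) ** (1.0 / years) - 1.0) * 100.0


# Values in millions of persons, taken from the description above.
yoy = percent_change(76.993, 77.222)
# Assumption: the 1994 and 2025 observations are 31 years apart.
cagr = compound_annual_growth(57.670, 77.222, 2025 - 1994)

print(f"2024 -> 2025 change: {yoy:.2f}%")
print(f"1994 -> 2025 average annual growth: {cagr:.2f}%")
```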
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Social media platforms have become integral tools in the conduct of foreign policy for many nations, including India. This dataset serves as a resource for analyzing ‘Social Media and India’s Foreign Policy: The Case Study of ‘X’ Diplomacy during the Covid-19 Pandemic.’ The data were collected through a web-based questionnaire distributed primarily to people in India aged 18 to 61 and above. A total of 171 valid responses were collected from 17 states, offering extensive geographic coverage, and stored in Mendeley. The 15 contributor states are Goa, Maharashtra, Tamil Nadu, Gujarat, Delhi, Assam, Haryana, Jammu and Kashmir, Karnataka, Kerala, Punjab, Rajasthan, Tripura, Uttar Pradesh and West Bengal. The questionnaire encompasses diverse question formats, including single-choice, multiple-choice, quizzes, and open-ended. The study underscores the opportunities and challenges of employing 'X' diplomacy in India's foreign policy and tests two hypotheses. First, India's effective use of 'X' diplomacy positively impacts public perception of India's foreign policy effectiveness. Second, India's adept use of 'X' diplomacy during the COVID-19 pandemic enhances its ability to manage and respond to the crisis effectively. The data show public perception of the Government of India's use of social media, particularly in crisis situations. They also highlight the significant change in India’s narrative through its ‘X’ diplomacy, which shaped narratives, public perceptions, and diplomatic strategies. The data can be used to study the significance of social media in India’s foreign policy, the role of platforms like ‘X’ in the making of India’s foreign policy, how effective platforms like ‘X’ were during the Covid-19 pandemic, and how the Indian government used them to deliver messages and set the narrative in international politics.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Tamil Nadu Budget 2025-26: HUMAN RESOURCES MANAGEMENT DEPARTMENT
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Number of Nurses: Registered: Tamil Nadu: General Nursing and Midwives data was reported at 348,538.000 Person in 2022. This records an increase from the previous number of 332,030.000 Person for 2021. Number of Nurses: Registered: Tamil Nadu: General Nursing and Midwives data is updated yearly, averaging 236,161.000 Person from Dec 2005 (Median) to 2022, with 15 observations. The data reached an all-time high of 348,538.000 Person in 2022 and a record low of 159,843.000 Person in 2005. Number of Nurses: Registered: Tamil Nadu: General Nursing and Midwives data remains active status in CEIC and is reported by Central Bureau of Health Intelligence. The data is categorized under India Premium Database’s Health Sector – Table IN.HLB005: Health Human Resources: Number of Nurses: Registered.
https://www.futurebeeai.com/policies/ai-data-license-agreement
Introducing the Tamil Scripted Monologue Speech Dataset for the Healthcare Domain, a voice dataset built to accelerate the development and deployment of Tamil language automatic speech recognition (ASR) systems, with a sharp focus on real-world healthcare interactions.
This dataset includes over 6,000 high-quality scripted audio prompts recorded in Tamil, representing typical voice interactions found in the healthcare industry. The data is tailored for use in voice technology systems that power virtual assistants, patient-facing AI tools, and intelligent customer service platforms.
The prompts span a broad range of healthcare-specific interactions, such as:
To maximize authenticity, the prompts integrate linguistic elements and healthcare-specific terms such as:
These elements make the dataset exceptionally suited for training AI systems to understand and respond to natural healthcare-related speech patterns.
Every audio recording is accompanied by a verbatim, manually verified transcription.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Vital Statistics: Birth Rate: per 1000 Population: Tamil Nadu: Urban data was reported at 13.600 NA in 2020. This records a decrease from the previous number of 14.000 NA for 2019. Vital Statistics: Birth Rate: per 1000 Population: Tamil Nadu: Urban data is updated yearly, averaging 15.800 NA from Dec 1997 (Median) to 2020, with 23 observations. The data reached an all-time high of 18.200 NA in 1999 and a record low of 13.600 NA in 2020. Vital Statistics: Birth Rate: per 1000 Population: Tamil Nadu: Urban data remains active status in CEIC and is reported by Office of the Registrar General & Census Commissioner, India. The data is categorized under India Premium Database’s Demographic – Table IN.GAH002: Vital Statistics: Birth Rate: by States.
https://creativecommons.org/publicdomain/zero/1.0/
gmr_tn - Global Mobility Report Vs Positive Cases
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Tamil Nadu Budget 2023-24: Demand for Grants 35 - HUMAN RESOURCES MANAGEMENT DEPARTMENT
https://www.futurebeeai.com/policies/ai-data-license-agreement
The Tamil Scripted Monologue Speech Dataset for the General Domain is a carefully curated resource designed to support the development of Tamil language speech recognition systems. This dataset focuses on general-purpose conversational topics and is ideal for a wide range of AI applications requiring natural, domain-agnostic Tamil speech data.
This dataset features over 6,000 high-quality scripted monologue recordings in Tamil. The prompts span diverse real-life topics commonly encountered in general conversations and are intended to help train robust and accurate speech-enabled technologies.
The dataset covers a wide variety of general conversation scenarios, including:
To enhance authenticity, the prompts include:
Each prompt is designed to reflect everyday use cases, making it suitable for developing generalized NLP and ASR solutions.
Every audio file in the dataset is accompanied by a verbatim text transcription, ensuring accurate training and evaluation of speech models.
Rich metadata is included for detailed filtering and analysis:
This dataset can power a variety of Tamil language AI technologies, including:
https://www.futurebeeai.com/policies/ai-data-license-agreement
This Tamil Call Center Speech Dataset for the Travel industry is purpose-built to power the next generation of voice AI applications for travel booking, customer support, and itinerary assistance. With over 30 hours of unscripted, real-world conversations, the dataset enables the development of highly accurate speech recognition and natural language understanding models tailored for Tamil-speaking travelers.
Created by FutureBeeAI, this dataset supports researchers, data scientists, and conversational AI teams in building voice technologies for airlines, travel portals, and hospitality platforms.
The dataset includes 30 hours of dual-channel audio recordings between native Tamil speakers engaged in real travel-related customer service conversations. These audio files reflect a wide variety of topics, accents, and scenarios found across the travel and tourism industry.
Inbound and outbound conversations span a wide range of real-world travel support situations with varied outcomes (positive, neutral, negative).
These scenarios help models understand and respond to diverse traveler needs in real-time.
Each call is accompanied by manually curated, high-accuracy transcriptions in JSON format.
Extensive metadata enriches each call and speaker for better filtering and AI training:
This dataset is ideal for a variety of AI use cases in the travel and tourism space:
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Number of Nurses: Registered: Tamil Nadu: Auxiliary Nurse Midwives data was reported at 64,012.000 Person in 2022. This records an increase from the previous number of 61,465.000 Person for 2021. Number of Nurses: Registered: Tamil Nadu: Auxiliary Nurse Midwives data is updated yearly, averaging 55,975.000 Person from Dec 2005 (Median) to 2022, with 15 observations. The data reached an all-time high of 64,012.000 Person in 2022 and a record low of 52,909.000 Person in 2005. Number of Nurses: Registered: Tamil Nadu: Auxiliary Nurse Midwives data remains active status in CEIC and is reported by Central Bureau of Health Intelligence. The data is categorized under India Premium Database’s Health Sector – Table IN.HLB005: Health Human Resources: Number of Nurses: Registered.
Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
Tamil Nadu 2021 State Assembly Elections - The dataset contains constituency-wise contestant, polling, and winning details.
Tamil Nadu 2021 State Assembly Elections - Event date: 6 April 2021. Results out on: 2 May 2021.
Data File - Tamil_Nadu_State_Elections_2021_Constituency_Metadata.csv
Data File - Tamil_Nadu_State_Elections_2021_Details.csv
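A quick first look at the two data files might go as follows. This is a minimal sketch using Python's standard `csv` module. It assumes the files are UTF-8 encoded, have a header row, and sit in the working directory; none of this is stated in the description, and the helper name `summarize_csv` is hypothetical. Column names are read from the files rather than assumed.

```python
import csv
import os


def summarize_csv(path):
    """Return (column_names, data_row_count) for a CSV file with a header row."""
    # Assumption: UTF-8 encoding; adjust if the files use another encoding.
    with open(path, newline="", encoding="utf-8") as handle:
        reader = csv.reader(handle)
        header = next(reader, [])
        row_count = sum(1 for _ in reader)
    return header, row_count


if __name__ == "__main__":
    for name in (
        "Tamil_Nadu_State_Elections_2021_Constituency_Metadata.csv",
        "Tamil_Nadu_State_Elections_2021_Details.csv",
    ):
        # Skip files that are not present so the sketch runs anywhere.
        if os.path.exists(name):
            columns, rows = summarize_csv(name)
            print(f"{name}: {rows} rows, columns = {columns}")
```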
General Information 👍 -
Total State Election Constituencies in Tamil Nadu - 234
Total Lok Sabha Election Constituencies in Tamil Nadu - 39
Thanks 🤩👐 to https://www.elections.tn.gov.in/Elections.aspx for detailed information
Tamil Nadu is one of the most highly industrialized states in India; it contributes 10-15% of India's GDP. The 2021 state election data help in understanding voting patterns and which party the people of TN gave their mandate to. Possible uses: EDA, data visualisation, and ML modeling.
According to the 76th round of the NSO survey, conducted between July and December 2018, a higher percentage of men than women had disabilities in India. In Tamil Nadu specifically, two percent of men had multiple disabilities, compared with 1.9 percent of women. The National Statistical Office (NSO) is the statistical wing of the Ministry of Statistics and Programme Implementation (MOSPI), mainly responsible for laying down standards for statistical analysis, data collection, and implementation.
https://www.futurebeeai.com/policies/ai-data-license-agreement
This Tamil Call Center Speech Dataset for the Healthcare industry is purpose-built to accelerate the development of Tamil speech recognition, spoken language understanding, and conversational AI systems. With 30 hours of unscripted, real-world conversations, it delivers the linguistic and contextual depth needed to build high-performance ASR models for medical and wellness-related customer service.
Created by FutureBeeAI, this dataset empowers voice AI teams, NLP researchers, and data scientists to develop domain-specific models for hospitals, clinics, insurance providers, and telemedicine platforms.
The dataset features 30 hours of dual-channel call center conversations between native Tamil speakers. These recordings cover a variety of healthcare support topics, enabling the development of speech technologies that are contextually aware and linguistically rich.
The dataset spans inbound and outbound calls, capturing a broad range of healthcare-specific interactions and sentiment types (positive, neutral, negative).
These real-world interactions help build speech models that understand healthcare domain nuances and user intent.
Every audio file is accompanied by high-quality, manually created transcriptions in JSON format.
Each conversation and speaker includes detailed metadata to support fine-tuned training and analysis.
This dataset can be used across a range of healthcare and voice AI use cases:
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
An Open Context "subjects" dataset item. Open Context publishes structured data as granular, URL identified Web resources. This "Survey Unit" record is part of the "Database of non-human primate dietary studies" data publication.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Tamil Nadu Budget 2021-22: Demands For Grant - 35 - Human Resources Management Department
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Tamil Nadu Budget 2024-25: Demand for Grants 35 - HUMAN RESOURCES MANAGEMENT DEPARTMENT