56 datasets found
  1. Tamilnadu Population

    • kaggle.com
    Updated Sep 18, 2020
    Cite
    Vaishnavi (2020). Tamilnadu Population [Dataset]. https://www.kaggle.com/datasets/vaishnavivenkatesan/tamilnadu-population
    Dataset updated
    Sep 18, 2020
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Vaishnavi
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Tamil Nadu
    Description

    Context

    This dataset consists of population figures for Tamil Nadu covering three years.

    Content

    This file consists of information about the places, population, district, and position of each place.

    Acknowledgements

    This was done during an internship at Tact Labs. Thanks to Aishwarya, who helped me collect the dataset.

  2. Tamil Call Center Data for Realestate AI

    • futurebeeai.com
    wav
    Updated Aug 1, 2022
    + more versions
    Cite
    FutureBee AI (2022). Tamil Call Center Data for Realestate AI [Dataset]. https://www.futurebeeai.com/dataset/speech-dataset/realestate-call-center-conversation-tamil-india
    Available download formats: wav
    Dataset updated
    Aug 1, 2022
    Dataset provided by
    FutureBeeAI
    Authors
    FutureBee AI
    License

    https://www.futurebeeai.com/policies/ai-data-license-agreement

    Dataset funded by
    FutureBeeAI
    Description

    Introduction

    This Tamil Call Center Speech Dataset for the Real Estate industry is purpose-built to accelerate the development of speech recognition, spoken language understanding, and conversational AI systems tailored for Tamil-speaking Real Estate customers. With over 30 hours of unscripted, real-world audio, this dataset captures authentic conversations between customers and real estate agents, making it well suited to building robust ASR models.

    Curated by FutureBeeAI, this dataset equips voice AI developers, real estate tech platforms, and NLP researchers with the data needed to create high-accuracy, production-ready models for property-focused use cases.

    Speech Data

    The dataset features 30 hours of dual-channel call center recordings between native Tamil speakers. Captured in realistic real estate consultation and support contexts, these conversations span a wide array of property-related topics, from inquiries to investment advice, offering deep domain coverage for AI model development.

    Participant Diversity:
    Speakers: 60 native Tamil speakers from our verified contributor community.
    Regions: Representing different regions across Tamil Nadu to ensure accent and dialect variation.
    Participant Profile: Balanced gender mix (60% male, 40% female) and age range from 18 to 70.
    Recording Details:
    Conversation Nature: Naturally flowing, unscripted agent-customer discussions.
    Call Duration: Average 5–15 minutes per call.
    Audio Format: Stereo WAV, 16-bit, recorded at 8kHz and 16kHz.
    Recording Environment: Captured in noise-free and echo-free conditions.
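
The recording details above (stereo, 16-bit, 8 kHz or 16 kHz) can be checked against any delivered file. The following is a minimal sketch using only the Python standard library; the helper names `describe_wav`, `matches_listing`, and `make_silent_wav` are illustrative assumptions, not part of the dataset or of any FutureBeeAI tooling, and the silent clip is synthetic test data rather than a sample from the corpus.

```python
import io
import wave


def describe_wav(source):
    """Return (channels, bit_depth, sample_rate_hz) for a WAV path or file-like object."""
    with wave.open(source, "rb") as wav:
        return wav.getnchannels(), wav.getsampwidth() * 8, wav.getframerate()


def matches_listing(source, allowed_rates=(8000, 16000)):
    """True if the file is stereo, 16-bit, and sampled at one of the listed rates."""
    channels, bit_depth, rate = describe_wav(source)
    return channels == 2 and bit_depth == 16 and rate in allowed_rates


def make_silent_wav(channels=2, sample_width=2, rate=8000, seconds=1):
    """Build a synthetic silent WAV in memory (hypothetical stand-in for a real recording)."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(sample_width)  # bytes per sample: 2 bytes = 16 bits
        wav.setframerate(rate)
        wav.writeframes(b"\x00" * channels * sample_width * rate * seconds)
    buffer.seek(0)
    return buffer


if __name__ == "__main__":
    print(describe_wav(make_silent_wav()))
    print(matches_listing(make_silent_wav(channels=1)))
```

With a real download, a file path can be passed to `matches_listing` in place of the in-memory buffer.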

    Topic Diversity

    This speech corpus includes both inbound and outbound calls, featuring positive, neutral, and negative outcomes across a wide range of real estate scenarios.

    Inbound Calls:
    Property Inquiries
    Rental Availability
    Renovation Consultation
    Property Features & Amenities
    Investment Property Evaluation
    Ownership History & Legal Info, and more
    Outbound Calls:
    New Listing Notifications
    Post-Purchase Follow-ups
    Property Recommendations
    Value Updates
    Customer Satisfaction Surveys, and others

    Such domain-rich variety ensures model generalization across common real estate support conversations.

    Transcription

    All recordings are accompanied by precise, manually verified transcriptions in JSON format.

    Transcription Includes:
    Speaker-Segmented Dialogues
    Time-coded Segments
    Non-speech Tags (e.g., background noise, pauses)
    High transcription accuracy with word error rate below 5% via dual-layer human review.

    These transcriptions streamline ASR and NLP development for Tamil real estate voice applications.

    Metadata

    Detailed metadata accompanies each participant and conversation:

    Participant Metadata: ID, age, gender, location, accent, and dialect.
    Conversation Metadata: Topic, call type, sentiment, sample rate, and technical details.

    This enables smart filtering, dialect-focused model training, and structured dataset exploration.

    Usage and Applications

    This dataset is ideal for voice AI and NLP systems built for the real estate sector.

  3. Geometry of Tamil Nadu Constituencies - 2021

    • kaggle.com
    Updated Sep 21, 2021
    Cite
    Mannar Amuthan (2021). Geometry of Tamil Nadu Constituencies - 2021 [Dataset]. https://www.kaggle.com/mannaramuthan/tamil-nadu-assembly-constituencies-geo-data-2021/code
    Dataset updated
    Sep 21, 2021
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Mannar Amuthan
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Area covered
    Tamil Nadu
    Description

    Context

    It is easy to find a list of constituencies, but geodata for each one is hard to come by, so this dataset may ease that.

    Acknowledgements

    Thanks to:

    http://projects.datameet.org/maps/
    https://www.elections.tn.gov.in/Form21C_TNLA2021.aspx

  4. India Population: Tamil Nadu

    • ceicdata.com
    Cite
    CEICdata.com, India Population: Tamil Nadu [Dataset]. https://www.ceicdata.com/en/india/population/population-tamil-nadu
    Dataset provided by
    CEICdata.com
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    Mar 1, 2013 - Mar 1, 2024
    Area covered
    India
    Variables measured
    Population
    Description

    Population: Tamil Nadu data was reported at 77.222 million persons in 2025. This records an increase from the previous number of 76.993 million persons for 2024. The data is updated yearly and has a median of 66.611 million persons over 32 observations from Mar 1994 to 2025. The data reached an all-time high of 77.222 million persons in 2025 and a record low of 57.670 million persons in 1994. The series remains in active status in CEIC and is reported by the Ministry of Statistics and Programme Implementation. The data is categorized under Global Database’s India – Table IN.GBG001: Population. [COVID-19-IMPACT]

  5. Data from: A Dataset on 'Social media and India’s Foreign Policy: The Case...

    • data.mendeley.com
    Updated Dec 19, 2024
    + more versions
    Cite
    Mukund Narvenkar (2024). A Dataset on 'Social media and India’s Foreign Policy: The Case Study of ‘X’ Diplomacy during the Covid-19 Pandemic' [Dataset]. http://doi.org/10.17632/xfr9y9ggkm.3
    Dataset updated
    Dec 19, 2024
    Authors
    Mukund Narvenkar
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    India
    Description

    Social media platforms have become integral tools in the conduct of foreign policy for many nations, including India. This dataset serves as a resource for analyzing ‘Social Media and India’s Foreign Policy: The Case Study of ‘X’ Diplomacy during the Covid-19 Pandemic.’ The data were collected through a web-based questionnaire distributed primarily to people in India aged 18 to 61 and above. A total of 171 valid responses were collected from 17 states, offering extensive geographic coverage, and stored in Mendeley. The 15 contributor states are Goa, Maharashtra, Tamil Nadu, Gujarat, Delhi, Assam, Haryana, Jammu and Kashmir, Karnataka, Kerala, Punjab, Rajasthan, Tripura, Uttar Pradesh and West Bengal. The questionnaire encompasses diverse question formats, including single-choice, multiple-choice, quizzes, and open-ended. The study underscores the opportunities and challenges of employing 'X' diplomacy in India's foreign policy. It tests two hypotheses. First, India's effective use of 'X' diplomacy positively impacts public perception of India's foreign policy effectiveness. Second, India's adept use of 'X' diplomacy during the COVID-19 pandemic enhances its ability to manage and respond to the crisis effectively. The data show public perception of the effective use of social media by the Government of India, particularly in crisis situations. The data also highlight the significant change in India’s narrative through its ‘X’ diplomacy, effectively setting narratives, public perceptions, and diplomatic strategies. The data can be used to study the significance of social media in India’s foreign policy, the role of social media like ‘X’ in the making of India’s foreign policy, how effective social media like ‘X’ was during the Covid-19 pandemic, and how the Indian government used social media like ‘X’ to deliver messages and set the narrative in international politics.

  6. Tamil Nadu Budget 2025-26 : HUMAN RESOURCES MANAGEMENT DEPARTMENT - Datasets...

    • openbudgetsindia.org
    Updated Mar 31, 2025
    Cite
    (2025). Tamil Nadu Budget 2025-26 : HUMAN RESOURCES MANAGEMENT DEPARTMENT - Datasets - Open Budgets India [Dataset]. https://openbudgetsindia.org/dataset/tamil-nadu-budget-2025-26-human-resources-management-department
    Dataset updated
    Mar 31, 2025
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Tamil Nadu, India
    Description

    Tamil Nadu Budget 2025-26 : HUMAN RESOURCES MANAGEMENT DEPARTMENT

  7. India Number of Nurses: Registered: Tamil Nadu: General Nursing and Midwives...

    • ceicdata.com
    Cite
    CEICdata.com, India Number of Nurses: Registered: Tamil Nadu: General Nursing and Midwives [Dataset]. https://www.ceicdata.com/en/india/health-human-resources-number-of-nurses-registered/number-of-nurses-registered-tamil-nadu-general-nursing-and-midwives
    Dataset provided by
    CEIC Data
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    Dec 1, 2009 - Dec 1, 2022
    Area covered
    India
    Description

    Number of Nurses: Registered: Tamil Nadu: General Nursing and Midwives data was reported at 348,538 persons in 2022. This records an increase from the previous number of 332,030 persons for 2021. The data is updated yearly and has a median of 236,161 persons over 15 observations from Dec 2005 to 2022. The data reached an all-time high of 348,538 persons in 2022 and a record low of 159,843 persons in 2005. The series remains in active status in CEIC and is reported by the Central Bureau of Health Intelligence. The data is categorized under India Premium Database’s Health Sector – Table IN.HLB005: Health Human Resources: Number of Nurses: Registered.

  8. Tamil Scripted Monologue Speech Data for Healthcare

    • futurebeeai.com
    wav
    Updated Aug 1, 2022
    Cite
    FutureBee AI (2022). Tamil Scripted Monologue Speech Data for Healthcare [Dataset]. https://www.futurebeeai.com/dataset/monologue-speech-dataset/healthcare-scripted-speech-monologues-tamil-india
    Available download formats: wav
    Dataset updated
    Aug 1, 2022
    Dataset provided by
    FutureBeeAI
    Authors
    FutureBee AI
    License

    https://www.futurebeeai.com/policies/ai-data-license-agreement

    Dataset funded by
    FutureBeeAI
    Description

    Introduction

    Introducing the Tamil Scripted Monologue Speech Dataset for the Healthcare Domain, a voice dataset built to accelerate the development and deployment of Tamil language automatic speech recognition (ASR) systems, with a sharp focus on real-world healthcare interactions.

    Speech Data

    This dataset includes over 6,000 high-quality scripted audio prompts recorded in Tamil, representing typical voice interactions found in the healthcare industry. The data is tailored for use in voice technology systems that power virtual assistants, patient-facing AI tools, and intelligent customer service platforms.

    Participant Diversity
    Speakers: 60 native Tamil speakers.
    Regional Balance: Participants are sourced from multiple regions across Tamil Nadu, reflecting diverse dialects and linguistic traits.
    Demographics: Includes a mix of male and female participants (60:40 ratio), aged between 18 and 70 years.
    Recording Specifications
    Nature of Recordings: Scripted monologues based on healthcare-related use cases.
    Duration: Each clip ranges from 5 to 30 seconds, offering short, context-rich speech samples.
    Audio Format: WAV files recorded in mono, with 16-bit depth and sample rates of 8 kHz and 16 kHz.
    Environment: Clean and echo-free spaces ensure clear and noise-free audio capture.

    Topic Coverage

    The prompts span a broad range of healthcare-specific interactions, such as:

    Patient check-in and follow-up communication
    Appointment booking and cancellation dialogues
    Insurance and regulatory support queries
    Medication, test results, and consultation discussions
    General health tips and wellness advice
    Emergency and urgent care communication
    Technical support for patient portals and apps
    Domain-specific scripted statements and FAQs

    Contextual Depth

    To maximize authenticity, the prompts integrate linguistic elements and healthcare-specific terms such as:

    Names: Gender- and region-appropriate Tamil Nadu names
    Addresses: Varied local address formats spoken naturally
    Dates & Times: References to appointment dates, times, follow-ups, and schedules
    Medical Terminology: Common medical procedures, symptoms, and treatment references
    Numbers & Measurements: Health data like dosages, vitals, and test result values
    Healthcare Institutions: Names of clinics, hospitals, and diagnostic centers

    These elements make the dataset exceptionally suited for training AI systems to understand and respond to natural healthcare-related speech patterns.

    Transcription

    Every audio recording is accompanied by a verbatim, manually verified transcription.

    Content: The transcription mirrors the exact scripted prompt recorded by the speaker.
    Format: Files are delivered in plain text (.TXT) format with consistent naming conventions for seamless integration.

  9. India Vital Statistics: Birth Rate: per 1000 Population: Tamil Nadu: Urban

    • ceicdata.com
    Updated Mar 26, 2025
    + more versions
    Cite
    CEICdata.com (2025). India Vital Statistics: Birth Rate: per 1000 Population: Tamil Nadu: Urban [Dataset]. https://www.ceicdata.com/en/india/vital-statistics-birth-rate-by-states/vital-statistics-birth-rate-per-1000-population-tamil-nadu-urban
    Dataset updated
    Mar 26, 2025
    Dataset provided by
    CEICdata.com
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    Dec 1, 2009 - Dec 1, 2020
    Area covered
    India
    Variables measured
    Vital Statistics
    Description

    Vital Statistics: Birth Rate: per 1000 Population: Tamil Nadu: Urban data was reported at 13.600 per 1000 population in 2020. This records a decrease from the previous number of 14.000 for 2019. The data is updated yearly and has a median of 15.800 over 23 observations from Dec 1997 to 2020. The data reached an all-time high of 18.200 in 1999 and a record low of 13.600 in 2020. The series remains in active status in CEIC and is reported by the Office of the Registrar General & Census Commissioner, India. The data is categorized under India Premium Database’s Demographic – Table IN.GAH002: Vital Statistics: Birth Rate: by States.

  10. tamilnadu_covid19_dataset

    • kaggle.com
    Updated Jun 13, 2020
    Cite
    Nirmal (2020). tamilnadu_covid19_dataset [Dataset]. https://www.kaggle.com/chandrunirmal/tamilnadu-covid19-dataset/code
    Dataset updated
    Jun 13, 2020
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Nirmal
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Area covered
    Tamil Nadu
    Description

    gmr_tn - Global Mobility Report Vs Positive Cases

  11. Tamil Nadu Budget 2023-24: Demand for Grants 35- HUMAN RESOURCES MANAGEMENT...

    • openbudgetsindia.org
    Updated Apr 3, 2023
    Cite
    (2023). Tamil Nadu Budget 2023-24: Demand for Grants 35- HUMAN RESOURCES MANAGEMENT DEPARTMENT - Datasets - Open Budgets India [Dataset]. https://openbudgetsindia.org/dataset/tamil-nadu-demand-for-grants-35-human-resources-management-department-2023-24
    Dataset updated
    Apr 3, 2023
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Tamil Nadu, India
    Description

    Tamil Nadu Budget 2023-24: Demand for Grants 35- HUMAN RESOURCES MANAGEMENT DEPARTMENT

  12. Tamil General Domain Scripted Monologue Speech Data

    • futurebeeai.com
    wav
    Updated Aug 1, 2022
    Cite
    FutureBee AI (2022). Tamil General Domain Scripted Monologue Speech Data [Dataset]. https://www.futurebeeai.com/dataset/monologue-speech-dataset/general-scripted-speech-monologues-tamil-india
    Available download formats: wav
    Dataset updated
    Aug 1, 2022
    Dataset provided by
    FutureBeeAI
    Authors
    FutureBee AI
    License

    https://www.futurebeeai.com/policies/ai-data-license-agreement

    Dataset funded by
    FutureBeeAI
    Description

    Introduction

    The Tamil Scripted Monologue Speech Dataset for the General Domain is a carefully curated resource designed to support the development of Tamil language speech recognition systems. This dataset focuses on general-purpose conversational topics and is ideal for a wide range of AI applications requiring natural, domain-agnostic Tamil speech data.

    Speech Data

    This dataset features over 6,000 high-quality scripted monologue recordings in Tamil. The prompts span diverse real-life topics commonly encountered in general conversations and are intended to help train robust and accurate speech-enabled technologies.

    Participant Diversity
    Speakers: 60 native Tamil speakers
    Regions: Broad regional coverage ensures diverse accents and dialects
    Demographics: Participants aged 18 to 70, with a 60:40 male-to-female ratio
    Recording Specifications
    Recording Type: Scripted monologues and prompt-based recordings
    Audio Duration: 5 to 30 seconds per file
    Format: WAV, mono channel, 16-bit, 8 kHz & 16 kHz sample rates
    Environment: Clean, noise-free conditions to ensure clarity and usability

    Topic Coverage

    The dataset covers a wide variety of general conversation scenarios, including:

    Daily Conversations
    Topic-Specific Discussions
    General Knowledge and Advice
    Idioms and Sayings

    Contextual Features

    To enhance authenticity, the prompts include:

    Names: Male and female names specific to different Tamil Nadu regions
    Addresses: Commonly used address formats in daily Tamil speech
    Dates & Times: References used in general scheduling and time expressions
    Organization Names: Names of businesses, institutions, and other entities
    Numbers & Currencies: Mentions of quantities, prices, and monetary values

    Each prompt is designed to reflect everyday use cases, making it suitable for developing generalized NLP and ASR solutions.

    Transcription

    Every audio file in the dataset is accompanied by a verbatim text transcription, ensuring accurate training and evaluation of speech models.

    Content: Exact match to the spoken audio
    Format: Plain text (.TXT), named identically to the corresponding audio file
    Quality Control: All transcripts are validated by native Tamil transcribers

    Metadata

    Rich metadata is included for detailed filtering and analysis:

    Speaker Metadata: Unique speaker ID, age, gender, region, and dialect
    Audio Metadata: Prompt transcript, recording setup, device specs, sample rate, bit depth, and format

    Applications & Use Cases

    This dataset can power a variety of Tamil language AI technologies, including:

    Speech Recognition Training: ASR model development and fine-tuning

  13. Tamil Call Center Data for Travel AI

    • futurebeeai.com
    wav
    Updated Aug 1, 2022
    + more versions
    Cite
    FutureBee AI (2022). Tamil Call Center Data for Travel AI [Dataset]. https://www.futurebeeai.com/dataset/speech-dataset/travel-call-center-conversation-tamil-india
    Available download formats: wav
    Dataset updated
    Aug 1, 2022
    Dataset provided by
    FutureBeeAI
    Authors
    FutureBee AI
    License

    https://www.futurebeeai.com/policies/ai-data-license-agreement

    Dataset funded by
    FutureBeeAI
    Description

    Introduction

    This Tamil Call Center Speech Dataset for the Travel industry is purpose-built to power the next generation of voice AI applications for travel booking, customer support, and itinerary assistance. With over 30 hours of unscripted, real-world conversations, the dataset enables the development of highly accurate speech recognition and natural language understanding models tailored for Tamil-speaking travelers.

    Created by FutureBeeAI, this dataset supports researchers, data scientists, and conversational AI teams in building voice technologies for airlines, travel portals, and hospitality platforms.

    Speech Data

    The dataset includes 30 hours of dual-channel audio recordings between native Tamil speakers engaged in real travel-related customer service conversations. These audio files reflect a wide variety of topics, accents, and scenarios found across the travel and tourism industry.

    Participant Diversity:
    Speakers: 60 native Tamil contributors from our verified pool.
    Regions: Covering multiple Tamil Nadu regions to capture accent and dialectal variation.
    Participant Profile: Balanced representation of age (18–70) and gender (60% male, 40% female).
    Recording Details:
    Conversation Nature: Naturally flowing, spontaneous customer-agent calls.
    Call Duration: Between 5 and 15 minutes per session.
    Audio Format: Stereo WAV, 16-bit depth, at 8kHz and 16kHz.
    Recording Environment: Captured in controlled, noise-free, echo-free settings.

    Topic Diversity

    Inbound and outbound conversations span a wide range of real-world travel support situations with varied outcomes (positive, neutral, negative).

    Inbound Calls:
    Booking Assistance
    Destination Information
    Flight Delays or Cancellations
    Support for Disabled Passengers
    Health and Safety Travel Inquiries
    Lost or Delayed Luggage, and more
    Outbound Calls:
    Promotional Travel Offers
    Customer Feedback Surveys
    Booking Confirmations
    Flight Rescheduling Alerts
    Visa Expiry Notifications, and others

    These scenarios help models understand and respond to diverse traveler needs in real-time.

    Transcription

    Each call is accompanied by manually curated, high-accuracy transcriptions in JSON format.

    Transcription Includes:
    Speaker-Segmented Dialogues
    Time-Stamped Segments
    Non-speech Markers (e.g., pauses, coughs)
    Dual-layered transcription review ensures high accuracy, with a word error rate under 5%.

    Metadata

    Extensive metadata enriches each call and speaker for better filtering and AI training:

    Participant Metadata: ID, age, gender, region, accent, and dialect.
    Conversation Metadata: Topic, domain, call type, sentiment, and audio specs.

    Usage and Applications

    This dataset is ideal for a variety of AI use cases in the travel and tourism space:

    ASR Systems: Train Tamil speech-to-text engines for travel platforms.

  14. India Number of Nurses: Registered: Tamil Nadu: Auxiliary Nurse Midwives

    • ceicdata.com
    Cite
    CEICdata.com, India Number of Nurses: Registered: Tamil Nadu: Auxiliary Nurse Midwives [Dataset]. https://www.ceicdata.com/en/india/health-human-resources-number-of-nurses-registered/number-of-nurses-registered-tamil-nadu-auxiliary-nurse-midwives
    Dataset provided by
    CEICdata.com
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    Dec 1, 2009 - Dec 1, 2022
    Area covered
    India
    Description

    Number of Nurses: Registered: Tamil Nadu: Auxiliary Nurse Midwives data was reported at 64,012 persons in 2022. This records an increase from the previous number of 61,465 persons for 2021. The data is updated yearly and has a median of 55,975 persons over 15 observations from Dec 2005 to 2022. The data reached an all-time high of 64,012 persons in 2022 and a record low of 52,909 persons in 2005. The series remains in active status in CEIC and is reported by the Central Bureau of Health Intelligence. The data is categorized under India Premium Database’s Health Sector – Table IN.HLB005: Health Human Resources: Number of Nurses: Registered.

  15. Tamil Nadu 2021 State Assembly Elections

    • kaggle.com
    zip
    Updated May 14, 2021
    Cite
    Praveen (2021). Tamil Nadu 2021 State Assembly Elections [Dataset]. https://www.kaggle.com/praveengovi/tamil-nadu-2021-state-elections
    Available download formats: zip (100257 bytes)
    Dataset updated
    May 14, 2021
    Authors
    Praveen
    License

    Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
    License information was derived automatically

    Area covered
    Tamil Nadu
    Description

    Context

    Tamil Nadu 2021 State Assembly Elections - The dataset contains constituency-wise contestant, polling, and winning details.

    Content

    Tamil Nadu 2021 State Assembly Elections - Event date: 6 April 2021. Results out on: 2 May 2021.

    Data File - Tamil_Nadu_State_Elections_2021_Constituency_Metadata.csv

    • Constituency - Constituency name in Tamil Nadu (India)
    • District - District to which the constituency belongs (a district probably contains 6-8 constituencies)
    • Reserved. - Whether the constituency is reserved for SC/ST or is General
    • Lok_sabha_constituency - Lok Sabha (national election) constituency (each Lok Sabha constituency probably has 6 state constituencies)
    • State_Name - State name in India: "Tamil Nadu"

    Data File - Tamil_Nadu_State_Elections_2021_Details.csv

    • Constituency - Constituency name in Tamil Nadu (India)
    • Candidate - Name of the candidate who contested in the constituency
    • Party - Name of the party the candidate belongs to
    • EVM_Votes - No. of Electronic Voting Machine votes
    • Postal_Votes - No. of postal votes
    • Total_Votes - Total votes for the candidate
    • %_of_Votes - Percentage of votes the candidate received in the constituency
    • Tot_Constituency_votes_polled - Total no. of votes polled in the constituency
    • Tot_votes_by_parties - Total votes the party received across all constituencies
    • Winning_votes - Total votes declared as winning
    • Win_Lost_Flag 🎊 - True or False
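
As a rough illustration of how the details file might be used, the sketch below picks the top candidate per constituency and counts seats per party from the `Constituency`, `Candidate`, `Party`, and `Total_Votes` columns listed above. The rows in `SAMPLE_DETAILS_CSV` are made-up placeholders, not real results, and the function names are hypothetical; the real file may format numbers or headers differently, so treat this as a starting point under those assumptions.

```python
import csv
import io
from collections import Counter

# Placeholder rows for illustration only; these are NOT actual election results.
SAMPLE_DETAILS_CSV = """Constituency,Candidate,Party,Total_Votes
Constituency A,Candidate 1,Party X,52000
Constituency A,Candidate 2,Party Y,48000
Constituency B,Candidate 3,Party Y,61000
Constituency B,Candidate 4,Party X,30500
"""


def top_candidate_per_constituency(csv_file):
    """Map each constituency to the row with the highest Total_Votes."""
    winners = {}
    for row in csv.DictReader(csv_file):
        votes = int(row["Total_Votes"])  # assumes plain integers without separators
        best = winners.get(row["Constituency"])
        if best is None or votes > best["votes"]:
            winners[row["Constituency"]] = {
                "candidate": row["Candidate"],
                "party": row["Party"],
                "votes": votes,
            }
    return winners


def seats_per_party(winners):
    """Count how many constituencies each party leads."""
    return Counter(entry["party"] for entry in winners.values())


if __name__ == "__main__":
    winners = top_candidate_per_constituency(io.StringIO(SAMPLE_DETAILS_CSV))
    for name, entry in winners.items():
        print(name, entry["candidate"], entry["party"], entry["votes"])
    print(dict(seats_per_party(winners)))
```

To run it on the downloaded file, open Tamil_Nadu_State_Elections_2021_Details.csv with `open(..., newline="")` and pass the file object in place of the `io.StringIO` sample.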

    General Information 👍 -

    Total State Election Constituencies in Tamil Nadu - 234
    Total Lok Sabha Election Constituencies in Tamil Nadu - 39

    Acknowledgements

    Thanks 🤩👐to https://www.elections.tn.gov.in/Elections.aspx for detailed information

    Inspiration

    Tamil Nadu is one of India's highly industrialized states; it contributes 10-15% of India's GDP. The 2021 state election data helps in understanding voting patterns and which party the people of TN gave their mandate to. Possible uses: EDA, data visualisation, and ML modeling.

  16. Share of disabled population in Tamil Nadu India 2018, by type and gender

    • statista.com
    Updated Sep 26, 2023
    + more versions
    Cite
    Statista (2023). Share of disabled population in Tamil Nadu India 2018, by type and gender [Dataset]. https://www.statista.com/statistics/1080066/india-disabled-persons-by-type-and-gender-tamil-nadu/
    Dataset updated
    Sep 26, 2023
    Dataset authored and provided by
    Statista (http://statista.com/)
    Time period covered
    Jul 2018 - Dec 2018
    Area covered
    India
    Description

    According to the 76th round of the NSO survey conducted between July and December 2018, a higher percentage of men had disabilities compared to women in India. Specifically in Tamil Nadu, two percent of men had multiple disabilities, while the figure was 1.9 percent among women. The National Statistical Office (NSO) is the statistical wing of the Ministry of Statistics and Programme Implementation (MOSPI), mainly responsible for laying down standards for statistical analysis, data collection, and implementation.

  17. Tamil Call Center Data for Healthcare AI

    • futurebeeai.com
    wav
    Updated Aug 1, 2022
    + more versions
    FutureBee AI (2022). Tamil Call Center Data for Healthcare AI [Dataset]. https://www.futurebeeai.com/dataset/speech-dataset/healthcare-call-center-conversation-tamil-india
    Available download formats: wav
    Dataset updated
    Aug 1, 2022
    Dataset provided by
    FutureBeeAI
    Authors
    FutureBee AI
    License

    https://www.futurebeeai.com/policies/ai-data-license-agreement

    Dataset funded by
    FutureBeeAI
    Description

    Introduction

    This Tamil Call Center Speech Dataset for the Healthcare industry is purpose-built to accelerate the development of Tamil speech recognition, spoken language understanding, and conversational AI systems. With 30 Hours of unscripted, real-world conversations, it delivers the linguistic and contextual depth needed to build high-performance ASR models for medical and wellness-related customer service.

    Created by FutureBeeAI, this dataset empowers voice AI teams, NLP researchers, and data scientists to develop domain-specific models for hospitals, clinics, insurance providers, and telemedicine platforms.

    Speech Data

    The dataset features 30 Hours of dual-channel call center conversations between native Tamil speakers. These recordings cover a variety of healthcare support topics, enabling the development of speech technologies that are contextually aware and linguistically rich.

    Participant Diversity:
    Speakers: 60 verified native Tamil speakers from our contributor community.
    Regions: Diverse regions across Tamil Nadu to ensure broad dialectal representation.
    Participant Profile: Age range of 18–70 with a gender mix of 60% male and 40% female.
    Recording Details:
    Conversation Nature: Naturally flowing, unscripted conversations.
    Call Duration: Each session ranges from 5 to 15 minutes.
    Audio Format: WAV format, stereo, 16-bit depth at 8kHz and 16kHz sample rates.
    Recording Environment: Captured in clear conditions without background noise or echo.
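
    To check that a downloaded recording matches the format described above (stereo, 16-bit, 8 kHz or 16 kHz), one could inspect the WAV header with Python's standard `wave` module. This is a minimal sketch: the helper names `describe_wav`, `matches_listing`, and `make_silent_wav` are hypothetical, and the silent clip is generated only so the example runs without any dataset file.

```python
import io
import wave


def describe_wav(source):
    """Return basic format facts for a WAV file (path or binary file object)."""
    with wave.open(source, "rb") as wav:
        rate = wav.getframerate()
        return {
            "channels": wav.getnchannels(),
            "bit_depth": wav.getsampwidth() * 8,  # sample width is in bytes
            "sample_rate_hz": rate,
            "duration_s": wav.getnframes() / rate,
        }


def matches_listing(info):
    """True if the file is stereo, 16-bit, and sampled at 8 kHz or 16 kHz."""
    return (
        info["channels"] == 2
        and info["bit_depth"] == 16
        and info["sample_rate_hz"] in (8000, 16000)
    )


def make_silent_wav(channels=2, sample_width=2, rate=8000, seconds=1):
    """Build an in-memory silent WAV clip, used here only for demonstration."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(sample_width)
        wav.setframerate(rate)
        wav.writeframes(b"\x00" * (channels * sample_width * rate * seconds))
    buffer.seek(0)
    return buffer


info = describe_wav(make_silent_wav())
print(info, matches_listing(info))
```

    For a real file, pass its path to `describe_wav` instead of the generated buffer.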

    Topic Diversity

    The dataset spans inbound and outbound calls, capturing a broad range of healthcare-specific interactions and sentiment types (positive, neutral, negative).

    Inbound Calls:
    Appointment Scheduling
    New Patient Registration
    Surgical Consultation
    Dietary Advice and Consultations
    Insurance Coverage Inquiries
    Follow-up Treatment Requests, and more
    Outbound Calls:
    Appointment Reminders
    Preventive Care Campaigns
    Test Results & Lab Reports
    Health Risk Assessment Calls
    Vaccination Updates
    Wellness Subscription Outreach, and more

    These real-world interactions help build speech models that understand healthcare domain nuances and user intent.

    Transcription

    Every audio file is accompanied by high-quality, manually created transcriptions in JSON format.

    Transcription Includes:
    Speaker-identified Dialogues
    Time-coded Segments
    Non-speech Annotations (e.g., silence, cough)
    High transcription accuracy, with a word error rate below 5%, backed by dual-layer QA checks.
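
    The listing does not say how the word error rate is measured. A common definition is the word-level edit distance (substitutions, deletions, and insertions) divided by the number of words in the reference transcript. The sketch below implements that standard definition; the function name `word_error_rate` is illustrative and the whitespace tokenization is an assumption.

```python
def word_error_rate(reference, hypothesis):
    """Word-level edit distance divided by the reference length.

    Tokenization is a plain whitespace split (an assumption of this sketch).
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # previous[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    previous = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        current = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = previous[j - 1] + (ref_word != hyp_word)
            deletion = previous[j] + 1
            insertion = current[j - 1] + 1
            current.append(min(substitution, deletion, insertion))
        previous = current
    return previous[-1] / len(ref)
```

    Under this definition, one wrong word in a 20-word reference gives a rate of 0.05, which is the 5% threshold quoted above.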

    Metadata

    Each conversation and speaker includes detailed metadata to support fine-tuned training and analysis.

    Participant Metadata: ID, gender, age, region, accent, and dialect.
    Conversation Metadata: Topic, sentiment, call type, sample rate, and technical specs.

    Usage and Applications

    This dataset can be used across a range of healthcare and voice AI use cases:


  18. Mundanthurai Sanctuary, Tamil Nadu, India. from Asia/India

    • opencontext.org
    Updated Nov 29, 2021
    Rebecca Haywood (2021). Mundanthurai Sanctuary, Tamil Nadu, India. from Asia/India [Dataset]. https://opencontext.org/subjects/e76f9ba7-1283-4134-9bc5-0968eea9314c
    Dataset updated
    Nov 29, 2021
    Dataset provided by
    Open Context
    Authors
    Rebecca Haywood
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Asia, Tamil Nadu, India
    Description

    An Open Context "subjects" dataset item. Open Context publishes structured data as granular, URL identified Web resources. This "Survey Unit" record is part of the "Database of non-human primate dietary studies" data publication.

  19. Tamil Nadu Budget 2021-22: Demands For Grant - 35 - Human Resources...

    • openbudgetsindia.org
    Updated Oct 13, 2021
    + more versions
    (2021). Tamil Nadu Budget 2021-22: Demands For Grant - 35 - Human Resources Management Department - Datasets - Open Budgets India [Dataset]. https://openbudgetsindia.org/dataset/tamil-nadu-demands-for-grant-35-human-resources-management-department-2021-22
    Dataset updated
    Oct 13, 2021
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Tamil Nadu, India
    Description

    Tamil Nadu Budget 2021-22: Demands For Grant - 35 - Human Resources Management Department

  20. Tamil Nadu Budget 2024-25: Demand for Grants 35- HUMAN RESOURCES MANAGEMENT...

    • openbudgetsindia.org
    Updated Mar 1, 2024
    (2024). Tamil Nadu Budget 2024-25: Demand for Grants 35- HUMAN RESOURCES MANAGEMENT DEPARTMENT - Datasets - Open Budgets India [Dataset]. https://openbudgetsindia.org/dataset/tamil-nadu-budget-2024-25-demand-for-grants-35-human-resources-management-department
    Dataset updated
    Mar 1, 2024
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Tamil Nadu, India
    Description

    Tamil Nadu Budget 2024-25: Demand for Grants 35- HUMAN RESOURCES MANAGEMENT DEPARTMENT

Vaishnavi (2020). Tamilnadu Population [Dataset]. https://www.kaggle.com/datasets/vaishnavivenkatesan/tamilnadu-population

Tamilnadu Population

Place wise population collection

113 scholarly articles cite this dataset (View in Google Scholar)
Dataset updated
Sep 18, 2020
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Vaishnavi
License

Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically

Area covered
Tamil Nadu
Description

Context

This dataset consists of population figures for Tamil Nadu covering three years.

Content

This file consists of information about the places, their population, district, and position.

Acknowledgements

This was done during an internship at Tact Labs. Thanks to Aishwarya, who helped me collect the dataset.
