100+ datasets found
  1. Tamilnadu Population

    • kaggle.com
    zip
    Updated Sep 18, 2020
    Cite
    Vaishnavi (2020). Tamilnadu Population [Dataset]. https://www.kaggle.com/datasets/vaishnavivenkatesan/tamilnadu-population
    Explore at:
    Available download formats: zip (19029 bytes)
    Dataset updated
    Sep 18, 2020
    Authors
    Vaishnavi
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Tamil Nadu
    Description

    Context

    This dataset consists of population data for Tamil Nadu covering three years.

    Content

    This file consists of information about the places, their population, their district, and the position of each place.

    Acknowledgements

    This was done during an internship at Tact Labs. Thanks to Aishwarya, who helped me collect the dataset.

  2. India Population: Tamil Nadu

    • ceicdata.com
    Cite
    CEICdata.com, India Population: Tamil Nadu [Dataset]. https://www.ceicdata.com/en/india/population/population-tamil-nadu
    Explore at:
    Dataset provided by
    CEICdata.com
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    Mar 1, 2013 - Mar 1, 2024
    Area covered
    India
    Variables measured
    Population
    Description

    Population: Tamil Nadu data was reported at 77.222 Person mn in 2025. This records an increase from the previous number of 76.993 Person mn for 2024. Population: Tamil Nadu data is updated yearly, averaging 66.611 Person mn from Mar 1994 (Median) to 2025, with 32 observations. The data reached an all-time high of 77.222 Person mn in 2025 and a record low of 57.670 Person mn in 1994. Population: Tamil Nadu data remains active status in CEIC and is reported by Ministry of Statistics and Programme Implementation. The data is categorized under Global Database’s India – Table IN.GBG001: Population. [COVID-19-IMPACT]

  3. Tamil Nadu Travel Trips

    • kaggle.com
    zip
    Updated Oct 21, 2024
    Cite
    Pj0888 (2024). Tamil Nadu Travel Trips [Dataset]. https://www.kaggle.com/datasets/pj0888/tamil-nadu-travel-trips
    Explore at:
    Available download formats: zip (37947 bytes)
    Dataset updated
    Oct 21, 2024
    Authors
    Pj0888
    License

    Apache License, v2.0 https://www.apache.org/licenses/LICENSE-2.0
    License information was derived automatically

    Area covered
    Tamil Nadu
    Description

    The columns of the dataset file (Tamil_nadu_taxi_trips_cleaned.csv) are described below, with a possible interpretation of each and the analysis it could support.

    Columns Description (inferred from the dataset)

    1. Date_Time:

      • Description: The timestamp representing the exact date and time of the taxi trip's start. It can be broken down into hour, day, month, and year to perform time-based analysis like peak travel hours or trends across months.
      • Type: DateTime
      • Potential Analysis: You could analyze trends based on time, such as determining peak traffic hours, fare variation by time of day, or the busiest travel days of the week.
    2. Pickup_Location:

      • Description: This represents the geographical or categorical location where the taxi trip begins. The value could be a specific location name, zone, or area code.
      • Type: Categorical (encoded later into numerical form)
      • Potential Analysis: You can analyze the distribution of trips across different locations, identify popular pickup spots, or perform clustering on locations to find patterns.
    3. Drop_Location:

      • Description: The destination location where the taxi trip ends. Like Pickup_Location, this could be represented as a location name or area code.
      • Type: Categorical (encoded later into numerical form)
      • Potential Analysis: This can be used for analyzing the most common destinations, calculating distances between pickup and drop locations, and evaluating demand for rides to certain areas.
    4. Distance_km:

      • Description: The distance traveled during the trip in kilometers.
      • Type: Numeric
      • Potential Analysis: This feature is directly related to the fare prediction, as longer distances tend to result in higher fares. You can also analyze average trip distances, or correlate distances with time spent in traffic.
    5. Fare_INR:

      • Description: The fare charged for the trip, represented in Indian Rupees (INR).
      • Type: Numeric
      • Potential Analysis: This is a key feature for fare prediction models. You could also analyze average fares, identify outliers (like unusually high or low fares), or see how fare correlates with other features such as distance, time of day, and number of passengers.
    6. No_of_Passengers:

      • Description: The number of passengers on the trip.
      • Type: Numeric (integer)
      • Potential Analysis: You can analyze the frequency of trips with different numbers of passengers, check if the number of passengers impacts the fare, or evaluate how many shared rides or group trips occur.
    7. Travel_Time_hrs:

      • Description: The duration of the taxi trip in hours.
      • Type: Numeric
      • Potential Analysis: This is an important feature for analyzing traffic conditions and travel efficiency. You can evaluate if longer travel times correlate with higher fares and whether travel time increases during rush hours.
    8. Tips_INR:

      • Description: The amount of tip given by the passenger in INR.
      • Type: Numeric
      • Potential Analysis: You can analyze tipping patterns, see if there's a relationship between distance, fare, and tips, or identify passengers' tipping behavior based on time of day or specific locations.
    9. Tourist_Place_Nearby:

      • Description: Indicates whether the pickup or drop location is near a tourist attraction.
      • Type: Categorical (likely a binary indicator, i.e., yes/no)
      • Potential Analysis: This feature could be used to analyze the impact of tourist locations on fare prices, distance, and passenger frequency. You can also identify if tourists are more likely to tip.
    10. Weather_Condition:

      • Description: Represents the weather conditions during the trip (e.g., sunny, rainy, cloudy, etc.).
      • Type: Categorical (encoded later into numerical form)
      • Potential Analysis: Weather conditions may impact both travel times and fare amounts. For example, rainy weather could lead to longer travel times, affecting fare amounts.
    11. Vehicle_Type:

      • Description: Specifies the type of vehicle used for the taxi trip (e.g., sedan, SUV, auto-rickshaw, etc.).
      • Type: Categorical (encoded later into numerical form)
      • Potential Analysis: Different vehicle types may result in varying fare structures. You can analyze how different vehicle types affect fare, travel time, and tipping behavior.
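
    As a minimal sketch of the kind of analysis these column descriptions suggest, the following Python example computes the average fare per kilometre for each vehicle type using only the standard library. The column names come from the description above; the inline sample rows and the function name are invented for illustration and are not taken from the dataset.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample rows; only the column names come from the dataset description.
SAMPLE_CSV = """Date_Time,Distance_km,Fare_INR,Vehicle_Type
2024-01-05 08:30:00,10.0,250.0,Sedan
2024-01-05 18:45:00,20.0,600.0,SUV
2024-01-06 09:15:00,5.0,100.0,Sedan
"""

def fare_per_km_by_vehicle(csv_text):
    """Return {vehicle_type: total fare / total distance}, skipping incomplete rows."""
    fare = defaultdict(float)
    dist = defaultdict(float)
    for row in csv.DictReader(io.StringIO(csv_text)):
        try:
            d = float(row["Distance_km"])
            f = float(row["Fare_INR"])
        except (KeyError, TypeError, ValueError):
            continue  # missing or malformed value
        if d <= 0:
            continue  # a non-positive distance cannot yield a rate
        vehicle = row.get("Vehicle_Type") or "Unknown"
        fare[vehicle] += f
        dist[vehicle] += d
    return {v: fare[v] / dist[v] for v in fare}

rates = fare_per_km_by_vehicle(SAMPLE_CSV)
```

    Rows with missing or malformed distance or fare values are skipped here rather than filled in; whether dropping or imputing is appropriate depends on how the real file was cleaned.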

    Steps for Dataset Analysis

    1. Handling Missing Data:
      • As seen earlier, several columns had missing values (Date_Time, Pickup_Location, Drop_Location, Distance_km, etc.). Filling these appropriate...
  4. India Vital Statistics: Natural Growth Rate: per 1000 Population: Tamil Nadu...

    • ceicdata.com
    Cite
    CEICdata.com, India Vital Statistics: Natural Growth Rate: per 1000 Population: Tamil Nadu [Dataset]. https://www.ceicdata.com/en/india/vital-statistics-natural-growth-rate-by-states/vital-statistics-natural-growth-rate-per-1000-population-tamil-nadu
    Explore at:
    Dataset provided by
    CEICdata.com
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    Dec 1, 2009 - Dec 1, 2020
    Area covered
    India
    Variables measured
    Vital Statistics
    Description

    Vital Statistics: Natural Growth Rate: per 1000 Population: Tamil Nadu data was reported at 7.700 NA in 2020. This records a decrease from the previous number of 8.100 NA for 2019. Vital Statistics: Natural Growth Rate: per 1000 Population: Tamil Nadu data is updated yearly, averaging 8.600 NA from Dec 1997 (Median) to 2020, with 23 observations. The data reached an all-time high of 11.400 NA in 2001 and a record low of 7.700 NA in 2020. Vital Statistics: Natural Growth Rate: per 1000 Population: Tamil Nadu data remains active status in CEIC and is reported by Office of the Registrar General & Census Commissioner, India. The data is categorized under India Premium Database’s Demographic – Table IN.GAH004: Vital Statistics: Natural Growth Rate: by States.

  5. India Vital Statistics: Birth Rate: per 1000 Population: Tamil Nadu

    • ceicdata.com
    Updated Dec 15, 2023
    + more versions
    Cite
    CEICdata.com (2023). India Vital Statistics: Birth Rate: per 1000 Population: Tamil Nadu [Dataset]. https://www.ceicdata.com/en/india/vital-statistics-birth-rate-by-states/vital-statistics-birth-rate-per-1000-population-tamil-nadu
    Explore at:
    Dataset updated
    Dec 15, 2023
    Dataset provided by
    CEICdata.com
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    Dec 1, 2009 - Dec 1, 2020
    Area covered
    India
    Variables measured
    Vital Statistics
    Description

    Vital Statistics: Birth Rate: per 1000 Population: Tamil Nadu data was reported at 13.800 NA in 2020. This records a decrease from the previous number of 14.200 NA for 2019. Vital Statistics: Birth Rate: per 1000 Population: Tamil Nadu data is updated yearly, averaging 15.900 NA from Dec 1997 (Median) to 2020, with 23 observations. The data reached an all-time high of 19.300 NA in 2000 and a record low of 13.800 NA in 2020. Vital Statistics: Birth Rate: per 1000 Population: Tamil Nadu data remains active status in CEIC and is reported by Office of the Registrar General & Census Commissioner, India. The data is categorized under India Premium Database’s Demographic – Table IN.GAH002: Vital Statistics: Birth Rate: by States.

  6. TamilNadu Legislative Election Dataset

    • kaggle.com
    zip
    Updated Jun 29, 2024
    Cite
    Sai Raam (2024). TamilNadu Legislative Election Dataset [Dataset]. https://www.kaggle.com/datasets/srinrealyf/1971-2021-tamilnadu-legislative-election-dataset
    Explore at:
    Available download formats: zip (110725 bytes)
    Dataset updated
    Jun 29, 2024
    Authors
    Sai Raam
    License

    Apache License, v2.0 https://www.apache.org/licenses/LICENSE-2.0
    License information was derived automatically

    Area covered
    Tamil Nadu
    Description

    This comprehensive dataset provides detailed information on Tamil Nadu state elections spanning from 1971 to 2021. It encompasses data from multiple legislative assembly elections, capturing a wide range of variables essential for political and social analysis.

    Applications

    • Political Analysis: Analyze trends in voter behavior, party performance, and electoral outcomes over five decades.
    • Social Research: Study the impact of socio-economic factors on election results and voter preferences.
    • Historical Trends: Examine historical shifts in political dominance and changes in the political landscape of Tamil Nadu.
    • Predictive Modelling: Utilize the dataset for predictive analysis and forecasting future election outcomes.

    Column Description

    • ac_no: Assembly Constituency Number
    • ac_name: Assembly Constituency Name
    • winning_cand: Name of Winning Candidate
    • party: Name of Party
    • totelectors: Total number of electors in the constituency
    • tot votes: Total votes secured by winning candidate
    • poll_percentage: Percentage of votes polled
    • margin: Margin difference between winner and runner-up
    • winning_percentage: Percentage of marginal win
    • district: Name of district
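
    A rough sketch of how the columns described above might be used: the Python example below tallies seats per party and finds the closest contest. Only the column names follow the description; the rows, party names, and constituency names are made up for illustration, and the real file may differ in layout.

```python
import csv
import io
from collections import Counter

# Hypothetical rows; only the column names follow the description above.
SAMPLE_CSV = """ac_no,ac_name,winning_cand,party,margin,district
1,Constituency One,Candidate A,Party X,1200,District P
2,Constituency Two,Candidate B,Party Y,300,District P
3,Constituency Three,Candidate C,Party X,4500,District Q
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))

# Seats won per party in this (made-up) set of results.
seats = Counter(row["party"] for row in rows)

# Closest contest: the row with the smallest winning margin.
closest = min(rows, key=lambda row: int(row["margin"]))
```

    Because the dataset spans several elections, a real analysis would likely group by election year first; that column is not listed in the description, so it is left out of this sketch.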

    Acknowledgement

    The data has been extracted from the official ECI website and cross-checked with other sites for validation. If you use this work or want to show your appreciation, you can drop a hi at linkedIn.com/in/srinrealyf

    Keywords

    Tamil Nadu, India, Election, Election Data, Legislative Election, MLA Election

    If you find this dataset helpful, consider upvoting it and leaving a comment with any feedback.

  7. Tamil Nadu School Data - SSA

    • kaggle.com
    zip
    Updated Feb 3, 2019
    Cite
    Arunmozhi (2019). Tamil Nadu School Data - SSA [Dataset]. https://www.kaggle.com/tecoholic/tamil-nadu-school-data-ssa
    Explore at:
    Available download formats: zip (6007 bytes)
    Dataset updated
    Feb 3, 2019
    Authors
    Arunmozhi
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Area covered
    Tamil Nadu
    Description

    Context

    The dataset is the statistical information on schools in different districts of the state of Tamil Nadu, India.

    Content

    The dataset contains the following information:

    • No. of schools of different management type
    • No. of teachers working in schools categorized by management type
    • No. of students enrolled in the various schools, grouped by management type

    Acknowledgements

    The data was collected from the open data portal of the Tamil Nadu government from the following locations:

  8. India Census: Population: Tamil Nadu

    • ceicdata.com
    Cite
    CEICdata.com, India Census: Population: Tamil Nadu [Dataset]. https://www.ceicdata.com/en/india/census-population-by-states/census-population-tamil-nadu
    Explore at:
    Dataset provided by
    CEICdata.com
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    Mar 1, 1901 - Mar 1, 2011
    Area covered
    India
    Variables measured
    Population
    Description

    Census: Population: Tamil Nadu data was reported at 72,147,030.000 Person in 03-01-2011. This records an increase from the previous number of 62,405,679.000 Person for 03-01-2001. Census: Population: Tamil Nadu data is updated decadal, averaging 31,903,000.000 Person from Mar 1901 (Median) to 03-01-2011, with 12 observations. The data reached an all-time high of 72,147,030.000 Person in 03-01-2011 and a record low of 19,252,630.000 Person in 03-01-1901. Census: Population: Tamil Nadu data remains active status in CEIC and is reported by Office of the Registrar General & Census Commissioner, India. The data is categorized under India Premium Database’s Demographic – Table IN.GAB002: Census: Population: by States.
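
    As a worked illustration of the two census figures quoted above, the short Python sketch below derives the decadal growth between the 2001 and 2011 counts and the equivalent compound annual rate. The variable names are illustrative, and the annualised figure assumes steady compounding over the ten years, which is a simplification.

```python
# Census figures quoted in the description above.
pop_2001 = 62_405_679
pop_2011 = 72_147_030

# Decadal growth as a percentage of the 2001 population.
decadal_growth_pct = (pop_2011 - pop_2001) / pop_2001 * 100

# Equivalent compound annual growth rate over the 10-year interval.
annual_growth_pct = ((pop_2011 / pop_2001) ** (1 / 10) - 1) * 100

print(f"Decadal growth: {decadal_growth_pct:.2f}%")
print(f"Annualised growth: {annual_growth_pct:.2f}%")
```

    The decadal figure works out to roughly 15.6 percent.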

  9. Tamil Scripted Monologue Speech Data for Healthcare

    • futurebeeai.com
    wav
    Updated Aug 1, 2022
    Cite
    FutureBee AI (2022). Tamil Scripted Monologue Speech Data for Healthcare [Dataset]. https://www.futurebeeai.com/dataset/monologue-speech-dataset/healthcare-scripted-speech-monologues-tamil-india
    Explore at:
    Available download formats: wav
    Dataset updated
    Aug 1, 2022
    Dataset provided by
    FutureBeeAI
    Authors
    FutureBee AI
    License

    https://www.futurebeeai.com/policies/ai-data-license-agreement

    Dataset funded by
    FutureBeeAI
    Description

    Introduction

    Introducing the Tamil Scripted Monologue Speech Dataset for the Healthcare Domain, a voice dataset built to accelerate the development and deployment of Tamil language automatic speech recognition (ASR) systems, with a sharp focus on real-world healthcare interactions.

    Speech Data

    This dataset includes over 6,000 high-quality scripted audio prompts recorded in Tamil, representing typical voice interactions found in the healthcare industry. The data is tailored for use in voice technology systems that power virtual assistants, patient-facing AI tools, and intelligent customer service platforms.

    Participant Diversity
    Speakers: 60 native Tamil speakers.
    Regional Balance: Participants are sourced from multiple regions across Tamil Nadu, reflecting diverse dialects and linguistic traits.
    Demographics: Includes a mix of male and female participants (60:40 ratio), aged between 18 and 70 years.
    Recording Specifications
    Nature of Recordings: Scripted monologues based on healthcare-related use cases.
    Duration: Each clip ranges from 5 to 30 seconds, offering short, context-rich speech samples.
    Audio Format: WAV files recorded in mono, with 16-bit depth and sample rates of 8 kHz and 16 kHz.
    Environment: Clean and echo-free spaces ensure clear and noise-free audio capture.
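
    The recording specifications above can be checked programmatically. The following is a minimal sketch using Python's standard `wave` module; the function names are hypothetical, and the silent in-memory clip is only a stand-in for a real file from the dataset.

```python
import io
import wave

def matches_stated_specs(wav_source):
    """Check mono, 16-bit, 8 or 16 kHz, and 5 to 30 s duration, as stated above."""
    with wave.open(wav_source, "rb") as wav:
        duration_s = wav.getnframes() / wav.getframerate()
        return (
            wav.getnchannels() == 1
            and wav.getsampwidth() == 2          # 2 bytes = 16-bit samples
            and wav.getframerate() in (8000, 16000)
            and 5 <= duration_s <= 30
        )

def make_silent_wav(seconds, rate=16000):
    """Build an in-memory mono 16-bit WAV of silence (stand-in for a real clip)."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)
        wav.setframerate(rate)
        wav.writeframes(b"\x00\x00" * int(seconds * rate))
    buffer.seek(0)
    return buffer

ok_clip = matches_stated_specs(make_silent_wav(6))
short_clip = matches_stated_specs(make_silent_wav(2))
```

    `matches_stated_specs` also accepts a file path, since `wave.open` takes either a path or a file-like object.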

    Topic Coverage

    The prompts span a broad range of healthcare-specific interactions, such as:

    Patient check-in and follow-up communication
    Appointment booking and cancellation dialogues
    Insurance and regulatory support queries
    Medication, test results, and consultation discussions
    General health tips and wellness advice
    Emergency and urgent care communication
    Technical support for patient portals and apps
    Domain-specific scripted statements and FAQs

    Contextual Depth

    To maximize authenticity, the prompts integrate linguistic elements and healthcare-specific terms such as:

    Names: Gender- and region-appropriate Tamil Nadu names
    Addresses: Varied local address formats spoken naturally
    Dates & Times: References to appointment dates, times, follow-ups, and schedules
    Medical Terminology: Common medical procedures, symptoms, and treatment references
    Numbers & Measurements: Health data like dosages, vitals, and test result values
    Healthcare Institutions: Names of clinics, hospitals, and diagnostic centers

    These elements make the dataset exceptionally suited for training AI systems to understand and respond to natural healthcare-related speech patterns.

    Transcription

    Every audio recording is accompanied by a verbatim, manually verified transcription.

    Content: The transcription mirrors the exact scripted prompt recorded by the speaker.
    Format: Files are delivered in plain text (.TXT) format with consistent naming conventions for seamless integration.

  10. Tamil (Tamizh) Wikipedia Text Dataset for NLP

    • kaggle.com
    zip
    Updated Nov 12, 2024
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Younus_Mohamed (2024). Tamil (Tamizh) Wikipedia Text Dataset for NLP [Dataset]. https://www.kaggle.com/datasets/younusmohamed/tamil-tamizh-wikipedia-articles
    Explore at:
    Available download formats: zip (339341289 bytes)
    Dataset updated
    Nov 12, 2024
    Authors
    Younus_Mohamed
    License

    Attribution-ShareAlike 3.0 (CC BY-SA 3.0) https://creativecommons.org/licenses/by-sa/3.0/
    License information was derived automatically

    Description

    This dataset is part of a larger mission to transform Tamil into a high-resource language in the field of Natural Language Processing (NLP). As one of the oldest and most culturally rich languages, Tamil has a unique linguistic structure, yet it remains underrepresented in the NLP landscape. This dataset, extracted from Tamil Wikipedia, serves as a foundational resource to support Tamil language processing, text mining, and machine learning applications.

    What’s Included

    - Text Data: This dataset contains over 569,000 articles in raw text form, extracted from Tamil Wikipedia. The collection is ideal for language model training, word frequency analysis, and text mining.

    - Scripts and Processing Tools: Code snippets are provided for processing .bz2 compressed files, generating word counts, and handling data for NLP applications.
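
    The dataset's own snippets are not reproduced in this listing. As a rough sketch of what processing a .bz2 file and generating word counts can look like, the Python example below uses the standard `bz2` module and `collections.Counter`. The function name and the tiny sample text are hypothetical, and splitting on whitespace is a simplification that ignores punctuation and Tamil morphology.

```python
import bz2
from collections import Counter

def word_counts_from_bz2(data):
    """Decompress bz2 bytes, decode as UTF-8, and count whitespace-separated tokens."""
    text = bz2.decompress(data).decode("utf-8")
    return Counter(text.split())

# Stand-in for a real dump file: a tiny compressed Tamil snippet.
sample = bz2.compress("தமிழ் மொழி தமிழ் நாடு".encode("utf-8"))
counts = word_counts_from_bz2(sample)
```

    For a file of several hundred megabytes, reading line by line with `bz2.open(path, "rt", encoding="utf-8")` and updating the counter incrementally would avoid holding the whole text in memory.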

    Why This Dataset?

    Although Tamil has a documented lexicon of over 100,000 words, only a fraction of these are actively used in everyday communication. The largest available Tamil treebank currently holds only 10,000 words, limiting the scope for training accurate language models. This dataset aims to bridge that gap by providing a robust, open-source corpus for researchers, developers, and linguists who want to work on Tamil language technologies.

    How You Can Use This Dataset

    - Language Modeling: Train or fine-tune models like BERT, GPT, or LSTM-based language models for Tamil.

    - Linguistic Research: Analyze Tamil morphology, syntax, and vocabulary usage.

    - Data Augmentation: Use the raw text to generate augmented data for multilingual NLP applications.

    - Word Embeddings and Semantic Analysis: Create embeddings for Tamil words, useful in multilingual setups or standalone applications.

    Let’s Collaborate!

    I believe that advancing Tamil in NLP cannot be a solo effort. Contributions in the form of additional data, annotations, or even new tools for Tamil language processing are welcome! By working together, we can make Tamil a truly high-resource language in NLP.

    License

    This dataset is based on content from Tamil Wikipedia and is shared under the Creative Commons Attribution-ShareAlike 3.0 Unported License (CC BY-SA 3.0). Proper attribution to Wikipedia is required when using this data.

  11. Tamil Open Ended Question Answer Text Dataset

    • futurebeeai.com
    wav
    Updated Aug 1, 2022
    Cite
    FutureBee AI (2022). Tamil Open Ended Question Answer Text Dataset [Dataset]. https://www.futurebeeai.com/dataset/prompt-response-dataset/tamil-open-ended-question-answer-text-dataset
    Explore at:
    Available download formats: wav
    Dataset updated
    Aug 1, 2022
    Dataset provided by
    FutureBeeAI
    Authors
    FutureBee AI
    License

    https://www.futurebeeai.com/policies/ai-data-license-agreement

    Dataset funded by
    FutureBeeAI
    Description

    The Tamil Open-Ended Question Answering Dataset is a meticulously curated collection of comprehensive Question-Answer pairs. It serves as a valuable resource for training Large Language Models (LLMs) and Question-answering models in the Tamil language, advancing the field of artificial intelligence.

    Dataset Content:

    This QA dataset comprises a diverse set of open-ended questions paired with corresponding answers in Tamil. There is no context paragraph given to choose an answer from, and each question is answered without any predefined context content. The questions cover a broad range of topics, including science, history, technology, geography, literature, current affairs, and more.

    Each question is accompanied by an answer, providing valuable information and insights to enhance the language model training process. Both the questions and answers were manually curated by native Tamil people, and references were taken from diverse sources like books, news articles, websites, and other reliable references.

    This question-answer prompt completion dataset contains different types of prompts, including instruction type, continuation type, and in-context learning (zero-shot, few-shot) type. The dataset also contains questions and answers with different types of rich text, including tables, code, JSON, etc., with proper markdown.

    Question Diversity:

    To ensure diversity, this Q&A dataset includes questions with varying complexity levels, ranging from easy to medium and hard. Different types of questions, such as multiple-choice, direct, and true/false, are included. Additionally, questions are further classified into fact-based and opinion-based categories, creating a comprehensive variety. The QA dataset also contains questions with constraints and persona restrictions, which makes it even more useful for LLM training.

    Answer Formats:

    To accommodate varied learning experiences, the dataset incorporates different types of answer formats. These formats include single-word, short-phrase, single-sentence, and paragraph answers. The answers also contain text strings, numerical values, and date and time formats. Such diversity strengthens the language model's ability to generate coherent and contextually appropriate answers.

    Data Format and Annotation Details:

    This fully labeled Tamil Open Ended Question Answer Dataset is available in JSON and CSV formats. It includes annotation details such as id, language, domain, question_length, prompt_type, question_category, question_type, complexity, answer_type, rich_text.
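
    To illustrate how the annotation fields listed above could be used to select a subset, here is a minimal Python sketch. The annotation keys come from that list, but the record values, the `question` and `answer` keys, and the `filter_records` helper are assumptions made for the example; the actual schema may differ.

```python
import json

# Hypothetical records: the annotation keys come from the list above,
# but the values (and the "question"/"answer" keys) are invented.
SAMPLE_JSON = """[
  {"id": 1, "language": "Tamil", "domain": "science", "complexity": "easy",
   "question_type": "direct", "answer_type": "single-word",
   "question": "...", "answer": "..."},
  {"id": 2, "language": "Tamil", "domain": "history", "complexity": "hard",
   "question_type": "true/false", "answer_type": "single-word",
   "question": "...", "answer": "..."}
]"""

def filter_records(records, **criteria):
    """Keep records whose annotation fields equal every given criterion."""
    return [r for r in records if all(r.get(k) == v for k, v in criteria.items())]

records = json.loads(SAMPLE_JSON)
hard_ids = [r["id"] for r in filter_records(records, complexity="hard")]
```

    Several criteria can be combined in one call, for example `filter_records(records, domain="science", complexity="easy")`.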

    Quality and Accuracy:

    The dataset upholds the highest standards of quality and accuracy. Each question undergoes careful validation, and the corresponding answers are thoroughly verified. To prioritize inclusivity, the dataset incorporates questions and answers representing diverse perspectives and writing styles, ensuring it remains unbiased and avoids perpetuating discrimination.

    Both the questions and answers in Tamil are grammatically accurate, without any word or grammatical errors. No copyrighted, toxic, or harmful content is used while building this dataset.

    Continuous Updates and Customization:

    The entire dataset was prepared with the assistance of human curators from the FutureBeeAI crowd community. Continuous efforts are made to add more assets to this dataset, ensuring its growth and relevance. Additionally, FutureBeeAI offers the ability to collect custom question-answer data tailored to specific needs, providing flexibility and customization options.

    License:

    The dataset, created by FutureBeeAI, is now ready for commercial use. Researchers, data scientists, and developers can utilize this fully labeled and ready-to-deploy Tamil Open Ended Question Answer Dataset to enhance the language understanding capabilities of their generative AI models, improve response generation, and explore new approaches to NLP question-answering tasks.

  12. Share of disabled population in Tamil Nadu India 2018, by type and gender

    • statista.com
    Updated Jan 15, 2021
    Cite
    Statista (2021). Share of disabled population in Tamil Nadu India 2018, by type and gender [Dataset]. https://www.statista.com/statistics/1080066/india-disabled-persons-by-type-and-gender-tamil-nadu/
    Explore at:
    Dataset updated
    Jan 15, 2021
    Dataset authored and provided by
    Statista http://statista.com/
    Time period covered
    Jul 2018 - Dec 2018
    Area covered
    India
    Description

    According to the 76th round of the NSO survey conducted between July and December 2018, a higher percentage of men had disabilities compared to women in India. Specifically in Tamil Nadu, two percent of men had multiple disabilities, compared with 1.9 percent of women. The National Statistical Office (NSO) is the statistical wing of the Ministry of Statistics and Programme Implementation (MOSPI), mainly responsible for laying down standards for statistical analysis, data collection, and implementation.

  13. India Census: Population: Tamil Nadu: Female

    • ceicdata.com
    + more versions
    Cite
    CEICdata.com, India Census: Population: Tamil Nadu: Female [Dataset]. https://www.ceicdata.com/en/india/census-population-by-states/census-population-tamil-nadu-female
    Explore at:
    Dataset provided by
    CEICdata.com
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    Mar 1, 1901 - Mar 1, 2011
    Area covered
    India
    Variables measured
    Population
    Description

    Census: Population: Tamil Nadu: Female data was reported at 36,009,055.000 Person in 03-01-2011. This records an increase from the previous number of 31,004,770.000 Person for 03-01-2001. Census: Population: Tamil Nadu: Female data is updated decadal, averaging 15,945,649.000 Person from Mar 1901 (Median) to 03-01-2011, with 12 observations. The data reached an all-time high of 36,009,055.000 Person in 03-01-2011 and a record low of 9,833,232.000 Person in 03-01-1901. Census: Population: Tamil Nadu: Female data remains active status in CEIC and is reported by Office of the Registrar General & Census Commissioner, India. The data is categorized under India Premium Database’s Demographic – Table IN.GAB002: Census: Population: by States.

  14. Tamil General Domain Scripted Monologue Speech Data

    • futurebeeai.com
    wav
    Updated Aug 1, 2022
    Cite
    FutureBee AI (2022). Tamil General Domain Scripted Monologue Speech Data [Dataset]. https://www.futurebeeai.com/dataset/monologue-speech-dataset/general-scripted-speech-monologues-tamil-india
    Explore at:
    Available download formats: wav
    Dataset updated
    Aug 1, 2022
    Dataset provided by
    FutureBeeAI
    Authors
    FutureBee AI
    License

    https://www.futurebeeai.com/policies/ai-data-license-agreement

    Dataset funded by
    FutureBeeAI
    Description

    Introduction

    The Tamil Scripted Monologue Speech Dataset for the General Domain is a carefully curated resource designed to support the development of Tamil language speech recognition systems. This dataset focuses on general-purpose conversational topics and is ideal for a wide range of AI applications requiring natural, domain-agnostic Tamil speech data.

    Speech Data

    This dataset features over 6,000 high-quality scripted monologue recordings in Tamil. The prompts span diverse real-life topics commonly encountered in general conversations and are intended to help train robust and accurate speech-enabled technologies.

    Participant Diversity
    Speakers: 60 native Tamil speakers
    Regions: Broad regional coverage ensures diverse accents and dialects
    Demographics: Participants aged 18 to 70, with a 60:40 male-to-female ratio
    Recording Specifications
    Recording Type: Scripted monologues and prompt-based recordings
    Audio Duration: 5 to 30 seconds per file
    Format: WAV, mono channel, 16-bit, 8 kHz & 16 kHz sample rates
    Environment: Clean, noise-free conditions to ensure clarity and usability

    Topic Coverage

    The dataset covers a wide variety of general conversation scenarios, including:

    Daily Conversations
    Topic-Specific Discussions
    General Knowledge and Advice
    Idioms and Sayings

    Contextual Features

    To enhance authenticity, the prompts include:

    Names: Male and female names specific to different Tamil Nadu regions
    Addresses: Commonly used address formats in daily Tamil speech
    Dates & Times: References used in general scheduling and time expressions
    Organization Names: Names of businesses, institutions, and other entities
    Numbers & Currencies: Mentions of quantities, prices, and monetary values

    Each prompt is designed to reflect everyday use cases, making it suitable for developing generalized NLP and ASR solutions.

    Transcription

    Every audio file in the dataset is accompanied by a verbatim text transcription, ensuring accurate training and evaluation of speech models.

    Content: Exact match to the spoken audio
    Format: Plain text (.TXT), named identically to the corresponding audio file
    Quality Control: All transcripts are validated by native Tamil transcribers
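
    The recording format and the transcript naming rule described above can be checked mechanically. The following is a minimal sketch in Python, not part of the dataset itself: the function names, the lowercase .wav/.txt extensions, and the assumption that each transcript sits in the same folder as its audio file are all hypothetical, and the expected header values are simply read off the listing (mono, 16-bit, 8 kHz or 16 kHz).

```python
import wave
from pathlib import Path

# Assumed from the listing: mono, 16-bit, 8 kHz or 16 kHz sample rates.
EXPECTED_CHANNELS = 1
EXPECTED_SAMPLE_WIDTH_BYTES = 2  # 16-bit samples
EXPECTED_SAMPLE_RATES = {8000, 16000}


def wav_matches_spec(wav_path):
    """Return True if the WAV header matches the described recording format."""
    with wave.open(str(wav_path), "rb") as wav_file:
        return (
            wav_file.getnchannels() == EXPECTED_CHANNELS
            and wav_file.getsampwidth() == EXPECTED_SAMPLE_WIDTH_BYTES
            and wav_file.getframerate() in EXPECTED_SAMPLE_RATES
        )


def pair_audio_with_transcripts(root):
    """Map each .wav file under root to its same-named .txt file (or None).

    Hypothetical layout: the transcript is assumed to live next to the audio
    file and to use a lowercase .txt extension.
    """
    pairs = {}
    for wav_path in sorted(Path(root).rglob("*.wav")):
        txt_path = wav_path.with_suffix(".txt")
        pairs[wav_path] = txt_path if txt_path.exists() else None
    return pairs
```

    Files that map to None, or that fail wav_matches_spec, would be candidates for manual review before training.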

    Metadata

    Rich metadata is included for detailed filtering and analysis:

    Speaker Metadata: Unique speaker ID, age, gender, region, and dialect
    Audio Metadata: Prompt transcript, recording setup, device specs, sample rate, bit depth, and format

    Applications & Use Cases

    This dataset can power a variety of Tamil language AI technologies, including:

    Speech Recognition Training: ASR model development and fine-tuning

  15. Tamil Nadu Crop Production Dataset

    • kaggle.com
    zip
    Updated Jan 22, 2024
    Cite
    Muhammad Usman (2024). Tamil Nadu Crop Production Dataset [Dataset]. https://www.kaggle.com/datasets/usmanlovescode/tamil-nadu-crop-production-dataset
    Available download formats: zip (107473 bytes)
    Dataset updated
    Jan 22, 2024
    Authors
    Muhammad Usman
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Area covered
    Tamil Nadu
    Description

    Delve into the agriculture of Tamil Nadu with this dataset, offering comprehensive insights into crop production. Covering various crops, the dataset provides valuable information on yields, trends, and patterns, serving as a crucial resource for researchers, policymakers, and stakeholders interested in understanding the agricultural landscape of Tamil Nadu.

  16. Tamil Nadu Non working population

    • knoema.com
    csv, json, sdmx, xls
    Updated Jan 1, 2013
    Cite
    Knoema (2013). Tamil Nadu Non working population [Dataset]. https://knoema.com/atlas/India/Tamil-Nadu/topics/Demographics/Population/Non-working-population
    Available download formats: xls, json, csv, sdmx
    Dataset updated
    Jan 1, 2013
    Dataset authored and provided by
    Knoema (http://knoema.com/)
    Time period covered
    2001 - 2011
    Area covered
    Tamil Nadu, India
    Variables measured
    Non working population
    Description

    The non-working population of Tamil Nadu rose by 13.71%, from 34,527,397 persons in 2001 to 39,262,349 persons in 2011. No further change (0.00%) is recorded after the 2011 figure.
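
    The 13.71% figure can be reproduced from the two counts quoted above. A short Python sketch follows; the helper name percent_change is illustrative and not something the source provides.

```python
def percent_change(old, new):
    """Relative change from old to new, expressed in percent."""
    return (new - old) / old * 100


# Counts quoted in the description: 2001 and 2011.
change = percent_change(34_527_397, 39_262_349)
print(f"{change:.2f}%")  # prints 13.71%
```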

  17. Cities and Towns in TN - Population statistics

    • kaggle.com
    zip
    Updated Sep 15, 2020
    Cite
    Gokul Prakash P (2020). Cities and Towns in TN - Population statistics [Dataset]. https://www.kaggle.com/gokulprakash22/cities-in-tamil-nadu-population-statistics
    Available download formats: zip (19074 bytes)
    Dataset updated
    Sep 15, 2020
    Authors
    Gokul Prakash P
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Context

    This dataset consists of population statistics, by census year, for cities and towns in Tamil Nadu, obtained from various sources.

    Content

    This dataset consists of 6 columns: the name of the city or town, its status, its district, and 3 columns of population statistics by census year (1991-03-01, 2001-03-01, 2011-03-01).

    Acknowledgements

    This was done during an internship at Tact Labs. Thanks to Aishwarya, who helped me collect the dataset.

  18. Dinamalar Tamil News Corpus (1.9 million records)

    • kaggle.com
    zip
    Updated Jan 5, 2020
    Cite
    Vijayabhaskar J (2020). Dinamalar Tamil News Corpus (1.9 million records) [Dataset]. https://www.kaggle.com/vijayabhaskar96/tamil-news-dataset-19-million-records
    Available download formats: zip (1051706225 bytes)
    Dataset updated
    Jan 5, 2020
    Authors
    Vijayabhaskar J
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Context

    There were no large datasets in the Tamil language apart from the Tamil wiki dataset (120k articles), so I decided to create my own. This dataset is the result.

    Content

    The data was acquired by scraping the publicly available articles published on Dinamalar.com, a well-known newspaper in Tamil Nadu, India. The dataset contains articles from 2009 to 2019.

    Acknowledgements

    This dataset exists because of Dinamalar.com. All thanks to them.

  19. COVID 19 Vaccination Coverage across Tamilnadu

    • kaggle.com
    zip
    Updated Jun 24, 2021
    Cite
    Suriyakanth R (2021). COVID 19 Vaccination Coverage across Tamilnadu [Dataset]. https://www.kaggle.com/datasets/suriyakanth2711/covid-19-vaccination-coverage-across-tamilnadu
    Available download formats: zip (9494 bytes)
    Dataset updated
    Jun 24, 2021
    Authors
    Suriyakanth R
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Area covered
    Tamil Nadu
    Description

    Coronavirus disease 2019 (COVID-19), also known as the coronavirus, or COVID, is a contagious disease caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). The first known case was identified in Wuhan, China, in December 2019. The disease has since spread worldwide, leading to an ongoing pandemic. This data covers COVID-19 vaccination coverage by Health Unit District in Tamil Nadu.

    Content

    Data columns (total 39 columns):

    0 S.No
    1 Health Unit District
    2 Achievement towards vaccination of 1st Dosage Covishield to HCW
    3 Achievement towards vaccination of 2nd Dosage Covishield to HCW
    4 Achievement towards vaccination of 1st Dosage Covishield to FLW
    5 Achievement towards vaccination of 2nd Dosage Covishield to FLW
    6 Achievement towards vaccination of 1st Dosage Covishield to beneficiaries of 18 years and less than 44 years age group
    7 Achievement towards vaccination of 2nd Dosage Covishield to beneficiaries of 18 years and less than 44 years age group
    8 Achievement towards vaccination of 1st Dosage Covishield to beneficiaries of 45 years and less than 60 years age group with Comorbidities
    9 Achievement towards vaccination of 2nd Dosage Covishield to beneficiaries of 45 years and less than 60 years age group with Comorbidities
    10 Achievement towards vaccination of 1st Dosage Covishield to 60+ years beneficiaries with Comorbidities
    11 Achievement towards vaccination of 2nd Dosage Covishield to 60+ years beneficiaries with Comorbidities
    12 Total Achievement of vaccination to beneficiaries under 1st Dose of Covishield
    13 Total Achievement of vaccination to beneficiaries under 2nd Dose of Covishield
    14 Achievement towards vaccination of 1st Dosage Covaxin to HCW
    15 Achievement towards vaccination of 2nd Dosage Covaxin to HCW
    16 Achievement towards vaccination of 1st Dosage Covaxin to FLW
    17 Achievement towards vaccination of 2nd Dosage Covaxin to FLW
    18 Achievement towards vaccination of 1st Dosage Covaxin to beneficiaries of 18 years and less than 44 years age group
    19 Achievement towards vaccination of 2nd Dosage Covaxin to beneficiaries of 18 years and less than 44 years age group
    20 Achievement towards vaccination of 1st Dosage Covaxin to beneficiaries of 45 years and less than 60 years age group with Comorbidities
    21 Achievement towards vaccination of 2nd Dosage Covaxin to beneficiaries of 45 years and less than 60 years age group with Comorbidities
    22 Achievement towards vaccination of 1st Dosage Covaxin to 60+ years beneficiaries with Comorbidities
    23 Achievement towards vaccination of 2nd Dosage Covaxin to 60+ years beneficiaries with Comorbidities
    24 Total Achievement of vaccination to beneficiaries under 1st Dose of Covaxin
    25 Total Achievement of vaccination to beneficiaries under 2nd Dose of Covaxin ...

  20. arani_data

    • kaggle.com
    zip
    Updated Apr 7, 2022
    Cite
    Anup1997 (2022). arani_data [Dataset]. https://www.kaggle.com/datasets/anup1997/arani-data
    Available download formats: zip (135047 bytes)
    Dataset updated
    Apr 7, 2022
    Authors
    Anup1997
    Description

    This dataset shows the male population of Arani village, which is in the south-eastern part of Tamil Nadu state.

Cite
Vaishnavi (2020). Tamilnadu Population [Dataset]. https://www.kaggle.com/datasets/vaishnavivenkatesan/tamilnadu-population

Tamilnadu Population

Place wise population collection

112 scholarly articles cite this dataset
Available download formats: zip (19029 bytes)
Dataset updated
Sep 18, 2020
Authors
Vaishnavi
License

Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically

Area covered
Tamil Nadu
Description

Context

This dataset consists of population figures for three years in Tamil Nadu.

Content

This file consists of information about the places: their population, district, and position.

Acknowledgements

This was done during an internship at Tact Labs. Thanks to Aishwarya, who helped me collect the dataset.
