100+ datasets found

Tamilnadu Population
kaggle.com
zip
Updated Sep 18, 2020
Cite
Vaishnavi (2020). Tamilnadu Population [Dataset]. https://www.kaggle.com/datasets/vaishnavivenkatesan/tamilnadu-population
Explore at:
Available download formats: zip (19029 bytes)
Dataset updated
Sep 18, 2020
Authors
Vaishnavi
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Area covered
Tamil Nadu
Description
Context

This dataset consists of population figures for three years in Tamil Nadu.

Content

This file consists of information about the places, their population, district, and position.

Acknowledgements

This was done during an internship at Tact Labs. Thanks to Aishwarya, who helped me collect the dataset.
India Population: Tamil Nadu
ceicdata.com
Cite
CEICdata.com, India Population: Tamil Nadu [Dataset]. https://www.ceicdata.com/en/india/population/population-tamil-nadu
Explore at:
Dataset provided by
CEICdata.com
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Time period covered
Mar 1, 2013 - Mar 1, 2024
Area covered
India
Variables measured
Population
Description
Population: Tamil Nadu was reported at 77.222 million persons in 2025, up from 76.993 million persons in 2024. The series is updated yearly, with a median of 66.611 million persons from Mar 1994 to 2025 (32 observations). It reached an all-time high of 77.222 million persons in 2025 and a record low of 57.670 million persons in 1994. The series remains active in CEIC and is reported by the Ministry of Statistics and Programme Implementation. It is categorized under Global Database’s India – Table IN.GBG001: Population. [COVID-19-IMPACT]
Tamil Nadu Travel Trips
kaggle.com
zip
Updated Oct 21, 2024
Cite
Pj0888 (2024). Tamil Nadu Travel Trips [Dataset]. https://www.kaggle.com/datasets/pj0888/tamil-nadu-travel-trips
Explore at:
Available download formats: zip (37947 bytes)
Dataset updated
Oct 21, 2024
Authors
Pj0888
License
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
Area covered
Tamil Nadu
Description
This description goes over each column of the dataset file (Tamil_nadu_taxi_trips_cleaned.csv). The meanings given are inferred from the column names, along with what could be analyzed using each column.

Column Descriptions

Date_Time:

Description: The timestamp representing the exact date and time of the taxi trip's start. It can be broken down into hour, day, month, and year to perform time-based analysis like peak travel hours or trends across months.

Type: DateTime

Potential Analysis: You could analyze trends based on time, such as determining peak traffic hours, fare variation by time of day, or the busiest travel days of the week.

Pickup_Location:

Description: This represents the geographical or categorical location where the taxi trip begins. The value could be a specific location name, zone, or area code.

Type: Categorical (encoded later into numerical form)

Potential Analysis: You can analyze the distribution of trips across different locations, identify popular pickup spots, or perform clustering on locations to find patterns.

Drop_Location:

Description: The destination location where the taxi trip ends. Like Pickup_Location, this could be represented as a location name or area code.

Type: Categorical (encoded later into numerical form)

Potential Analysis: This can be used for analyzing the most common destinations, calculating distances between pickup and drop locations, and evaluating demand for rides to certain areas.

Distance_km:

Description: The distance traveled during the trip in kilometers.

Type: Numeric

Potential Analysis: This feature is directly related to the fare prediction, as longer distances tend to result in higher fares. You can also analyze average trip distances, or correlate distances with time spent in traffic.

Fare_INR:

Description: The fare charged for the trip, represented in Indian Rupees (INR).

Type: Numeric

Potential Analysis: This is a key feature for fare prediction models. You could also analyze average fares, identify outliers (like unusually high or low fares), or see how fare correlates with other features such as distance, time of day, and number of passengers.

No_of_Passengers:

Description: The number of passengers on the trip.

Type: Numeric (integer)

Potential Analysis: You can analyze the frequency of trips with different numbers of passengers, check if the number of passengers impacts the fare, or evaluate how many shared rides or group trips occur.

Travel_Time_hrs:

Description: The duration of the taxi trip in hours.

Type: Numeric

Potential Analysis: This is an important feature for analyzing traffic conditions and travel efficiency. You can evaluate if longer travel times correlate with higher fares and whether travel time increases during rush hours.

Tips_INR:

Description: The amount of tip given by the passenger in INR.

Type: Numeric

Potential Analysis: You can analyze tipping patterns, see if there's a relationship between distance, fare, and tips, or identify passengers' tipping behavior based on time of day or specific locations.

Tourist_Place_Nearby:

Description: Indicates whether the pickup or drop location is near a tourist attraction.

Type: Categorical (likely a binary indicator, i.e., yes/no)

Potential Analysis: This feature could be used to analyze the impact of tourist locations on fare prices, distance, and passenger frequency. You can also identify if tourists are more likely to tip.

Weather_Condition:

Description: Represents the weather conditions during the trip (e.g., sunny, rainy, cloudy, etc.).

Type: Categorical (encoded later into numerical form)

Potential Analysis: Weather conditions may impact both travel times and fare amounts. For example, rainy weather could lead to longer travel times, affecting fare amounts.

Vehicle_Type:

Description: Specifies the type of vehicle used for the taxi trip (e.g., sedan, SUV, auto-rickshaw, etc.).

Type: Categorical (encoded later into numerical form)

Potential Analysis: Different vehicle types may result in varying fare structures. You can analyze how different vehicle types affect fare, travel time, and tipping behavior.
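
The time-based analysis suggested for Date_Time and Fare_INR can be sketched in a few lines of Python. This is a minimal sketch, not the dataset author's method: the sample rows, their values, and the timestamp format are assumptions made for illustration, and only the column names come from the description above.

```python
import csv
import io
from collections import defaultdict
from datetime import datetime

# Illustrative rows only: the values and the timestamp format are assumptions,
# not taken from Tamil_nadu_taxi_trips_cleaned.csv.
SAMPLE_CSV = """Date_Time,Distance_km,Fare_INR
2024-01-05 08:15:00,10.0,250.0
2024-01-05 08:45:00,4.0,120.0
2024-01-05 18:30:00,12.5,400.0
"""


def mean_fare_by_hour(csv_text, time_format="%Y-%m-%d %H:%M:%S"):
    """Return {hour_of_day: mean Fare_INR}, skipping rows with missing fields."""
    fares = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        if not row["Date_Time"] or not row["Fare_INR"]:
            continue  # rows with a missing timestamp or fare are ignored
        hour = datetime.strptime(row["Date_Time"], time_format).hour
        fares[hour].append(float(row["Fare_INR"]))
    return {hour: sum(values) / len(values) for hour, values in fares.items()}


def fare_per_km(csv_text):
    """Return Fare_INR divided by Distance_km for every row with both values."""
    rates = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        if row["Distance_km"] and row["Fare_INR"] and float(row["Distance_km"]) > 0:
            rates.append(float(row["Fare_INR"]) / float(row["Distance_km"]))
    return rates


print(mean_fare_by_hour(SAMPLE_CSV))
print(fare_per_km(SAMPLE_CSV))
```

Skipping incomplete rows is only one option; imputing the missing values is another, and the right choice depends on how many rows are affected.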

Steps for Dataset Analysis

Handling Missing Data:

As seen earlier, several columns had missing values (Date_Time, Pickup_Location, Drop_Location, Distance_km, etc.). Filling these appropriate...
India Vital Statistics: Natural Growth Rate: per 1000 Population: Tamil Nadu...
ceicdata.com
Cite
CEICdata.com, India Vital Statistics: Natural Growth Rate: per 1000 Population: Tamil Nadu [Dataset]. https://www.ceicdata.com/en/india/vital-statistics-natural-growth-rate-by-states/vital-statistics-natural-growth-rate-per-1000-population-tamil-nadu
Explore at:
Dataset provided by
CEICdata.com
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Time period covered
Dec 1, 2009 - Dec 1, 2020
Area covered
India
Variables measured
Vital Statistics
Description
Vital Statistics: Natural Growth Rate: per 1000 Population: Tamil Nadu was reported at 7.700 in 2020, down from 8.100 in 2019. The series is updated yearly, with a median of 8.600 from Dec 1997 to 2020 (23 observations). It reached an all-time high of 11.400 in 2001 and a record low of 7.700 in 2020. The series remains active in CEIC and is reported by the Office of the Registrar General & Census Commissioner, India. It is categorized under India Premium Database’s Demographic – Table IN.GAH004: Vital Statistics: Natural Growth Rate: by States.
India Vital Statistics: Birth Rate: per 1000 Population: Tamil Nadu
ceicdata.com
Updated Dec 15, 2023
Cite
CEICdata.com (2023). India Vital Statistics: Birth Rate: per 1000 Population: Tamil Nadu [Dataset]. https://www.ceicdata.com/en/india/vital-statistics-birth-rate-by-states/vital-statistics-birth-rate-per-1000-population-tamil-nadu
Explore at:
Dataset updated
Dec 15, 2023
Dataset provided by
CEICdata.com
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Time period covered
Dec 1, 2009 - Dec 1, 2020
Area covered
India
Variables measured
Vital Statistics
Description
Vital Statistics: Birth Rate: per 1000 Population: Tamil Nadu was reported at 13.800 in 2020, down from 14.200 in 2019. The series is updated yearly, with a median of 15.900 from Dec 1997 to 2020 (23 observations). It reached an all-time high of 19.300 in 2000 and a record low of 13.800 in 2020. The series remains active in CEIC and is reported by the Office of the Registrar General & Census Commissioner, India. It is categorized under India Premium Database’s Demographic – Table IN.GAH002: Vital Statistics: Birth Rate: by States.
TamilNadu Legislative Election Dataset
kaggle.com
zip
Updated Jun 29, 2024
Cite
Sai Raam (2024). TamilNadu Legislative Election Dataset [Dataset]. https://www.kaggle.com/datasets/srinrealyf/1971-2021-tamilnadu-legislative-election-dataset
Explore at:
Available download formats: zip (110725 bytes)
Dataset updated
Jun 29, 2024
Authors
Sai Raam
License
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
Area covered
Tamil Nadu
Description
This comprehensive dataset provides detailed information on Tamil Nadu state elections spanning from 1971 to 2021. It encompasses data from multiple legislative assembly elections, capturing a wide range of variables essential for political and social analysis.

Applications

Political Analysis: Analyze trends in voter behavior, party performance, and electoral outcomes over five decades.

Social Research: Study the impact of socio-economic factors on election results and voter preferences.

Historical Trends: Examine historical shifts in political dominance and changes in the political landscape of Tamil Nadu.

Predictive Modelling: Utilize the dataset for predictive analysis and forecasting future election outcomes.

Column Description

- ac_no: Assembly Constituency Number

- ac_name: Assembly Constituency Name

- winning_cand: Name of Winning Candidate

- party: Name of Party

- totelectors: Total number of electors in the constituency

- tot votes: Total votes secured by winning candidate

- poll_percentage: Percentage of votes polled

- margin: Vote margin between winner and runner-up

- winning_percentage: Percentage of marginal win

- district: Name of district

Acknowledgement

The data has been extracted from the official ECI website and cross-checked with other sites for validation. If you use this work or want to show appreciation, you can say hi at linkedIn.com/in/srinrealyf

Keywords

Tamil Nadu, India, Election, Election Data, Legislative Election, MLA Election

If you find this dataset helpful, consider upvoting it and leaving a comment with any feedback.
Tamil Nadu School Data - SSA
kaggle.com
zip
Updated Feb 3, 2019
Cite
Arunmozhi (2019). Tamil Nadu School Data - SSA [Dataset]. https://www.kaggle.com/tecoholic/tamil-nadu-school-data-ssa
Explore at:
Available download formats: zip (6007 bytes)
Dataset updated
Feb 3, 2019
Authors
Arunmozhi
License
https://creativecommons.org/publicdomain/zero/1.0/
Area covered
Tamil Nadu
Description
Context

The dataset is the statistical information on schools in different districts of the state of Tamil Nadu, India.

Content

The dataset contains the following information:

No. of schools of different management type

No. of teachers working in schools categorized by management type

No. of students enrolled in the various schools, grouped by management type

Acknowledgements

The data was collected from the open data portal of the Tamil Nadu government from the following locations:

SSA, Tamil Nadu : Statistics on Student Enrollment by School Management

SSA, Tamil Nadu : Statistics on Schools by School Management

SSA, Tamil Nadu : Statistics on Teachers by School Management
India Census: Population: Tamil Nadu
ceicdata.com
Cite
CEICdata.com, India Census: Population: Tamil Nadu [Dataset]. https://www.ceicdata.com/en/india/census-population-by-states/census-population-tamil-nadu
Explore at:
Dataset provided by
CEICdata.com
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Time period covered
Mar 1, 1901 - Mar 1, 2011
Area covered
India
Variables measured
Population
Description
Census: Population: Tamil Nadu was reported at 72,147,030 persons for Mar 1, 2011, up from 62,405,679 persons for Mar 1, 2001. The series is updated every ten years, with a median of 31,903,000 persons from Mar 1901 to Mar 2011 (12 observations). It reached an all-time high of 72,147,030 persons in Mar 2011 and a record low of 19,252,630 persons in Mar 1901. The series remains active in CEIC and is reported by the Office of the Registrar General & Census Commissioner, India. It is categorized under India Premium Database’s Demographic – Table IN.GAB002: Census: Population: by States.
Tamil Scripted Monologue Speech Data for Healthcare
futurebeeai.com
wav
Updated Aug 1, 2022
Cite
FutureBee AI (2022). Tamil Scripted Monologue Speech Data for Healthcare [Dataset]. https://www.futurebeeai.com/dataset/monologue-speech-dataset/healthcare-scripted-speech-monologues-tamil-india
Explore at:
Available download formats: wav
Dataset updated
Aug 1, 2022
Dataset provided by
FutureBeeAI
Authors
FutureBee AI
License
https://www.futurebeeai.com/policies/ai-data-license-agreement
Dataset funded by
FutureBeeAI
Description
Introduction
Introducing the Tamil Scripted Monologue Speech Dataset for the Healthcare Domain, a voice dataset built to accelerate the development and deployment of Tamil language automatic speech recognition (ASR) systems, with a sharp focus on real-world healthcare interactions.
Speech Data
This dataset includes over 6,000 high-quality scripted audio prompts recorded in Tamil, representing typical voice interactions found in the healthcare industry. The data is tailored for use in voice technology systems that power virtual assistants, patient-facing AI tools, and intelligent customer service platforms.
•Participant Diversity
  •Speakers: 60 native Tamil speakers.
  •Regional Balance: Participants are sourced from multiple regions across Tamil Nadu, reflecting diverse dialects and linguistic traits.
  •Demographics: Includes a mix of male and female participants (60:40 ratio), aged between 18 and 70 years.
•Recording Specifications
  •Nature of Recordings: Scripted monologues based on healthcare-related use cases.
  •Duration: Each clip ranges from 5 to 30 seconds, offering short, context-rich speech samples.
  •Audio Format: WAV files recorded in mono, with 16-bit depth and sample rates of 8 kHz and 16 kHz.
  •Environment: Clean and echo-free spaces ensure clear and noise-free audio capture.
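
The recording specifications above (mono, 16-bit, 8 kHz or 16 kHz, 5 to 30 seconds) can be checked per file with Python's standard `wave` module. The following is a minimal sketch under stated assumptions: the helper names are hypothetical, and the demonstration uses a synthetic in-memory clip of silence rather than a file from the dataset.

```python
import io
import wave


def describe_wav(source):
    """Return channel count, bit depth, sample rate and duration of a WAV file.

    `source` may be a file path or a binary file-like object.
    """
    with wave.open(source, "rb") as wav:
        frames = wav.getnframes()
        rate = wav.getframerate()
        return {
            "channels": wav.getnchannels(),
            "bit_depth": wav.getsampwidth() * 8,  # sample width is in bytes
            "sample_rate_hz": rate,
            "duration_s": frames / rate,
        }


def matches_listing(info):
    """True if the clip fits the format stated in the listing."""
    return (
        info["channels"] == 1
        and info["bit_depth"] == 16
        and info["sample_rate_hz"] in (8000, 16000)
        and 5 <= info["duration_s"] <= 30
    )


def make_silent_wav(seconds, rate=16000):
    """Build an in-memory mono 16-bit WAV of silence, for demonstration only."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)  # 2 bytes per sample = 16-bit
        wav.setframerate(rate)
        wav.writeframes(b"\x00\x00" * int(seconds * rate))
    buffer.seek(0)
    return buffer


info = describe_wav(make_silent_wav(6))
print(info, matches_listing(info))
```

For real files, `describe_wav` can be given a path instead of the in-memory buffer.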

Topic Coverage
The prompts span a broad range of healthcare-specific interactions, such as:
•Patient check-in and follow-up communication
•Appointment booking and cancellation dialogues
•Insurance and regulatory support queries
•Medication, test results, and consultation discussions
•General health tips and wellness advice
•Emergency and urgent care communication
•Technical support for patient portals and apps
•Domain-specific scripted statements and FAQs
Contextual Depth
To maximize authenticity, the prompts integrate linguistic elements and healthcare-specific terms such as:
•Names: Gender- and region-appropriate Tamil Nadu names
•Addresses: Varied local address formats spoken naturally
•Dates & Times: References to appointment dates, times, follow-ups, and schedules
•Medical Terminology: Common medical procedures, symptoms, and treatment references
•Numbers & Measurements: Health data like dosages, vitals, and test result values
•Healthcare Institutions: Names of clinics, hospitals, and diagnostic centers

These elements make the dataset exceptionally suited for training AI systems to understand and respond to natural healthcare-related speech patterns.
Transcription
Every audio recording is accompanied by a verbatim, manually verified transcription.
•Content: The transcription mirrors the exact scripted prompt recorded by the speaker.
•Format: Files are delivered in plain text (.TXT) format with consistent naming conventions for seamless integration.

Tamil (Tamizh) Wikipedia Text Dataset for NLP
kaggle.com
zip
Updated Nov 12, 2024
Cite
Younus_Mohamed (2024). Tamil (Tamizh) Wikipedia Text Dataset for NLP [Dataset]. https://www.kaggle.com/datasets/younusmohamed/tamil-tamizh-wikipedia-articles
Explore at:
Available download formats: zip (339341289 bytes)
Dataset updated
Nov 12, 2024
Authors
Younus_Mohamed
License
Attribution-ShareAlike 3.0 (CC BY-SA 3.0): https://creativecommons.org/licenses/by-sa/3.0/
License information was derived automatically
Description
This dataset is part of a larger mission to transform Tamil into a high-resource language in the field of Natural Language Processing (NLP). As one of the oldest and most culturally rich languages, Tamil has a unique linguistic structure, yet it remains underrepresented in the NLP landscape. This dataset, extracted from Tamil Wikipedia, serves as a foundational resource to support Tamil language processing, text mining, and machine learning applications.

What’s Included

- Text Data: This dataset contains over 569,000 articles in raw text form, extracted from Tamil Wikipedia. The collection is ideal for language model training, word frequency analysis, and text mining.

- Scripts and Processing Tools: Code snippets are provided for processing .bz2 compressed files, generating word counts, and handling data for NLP applications.
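
The kind of processing mentioned above (reading .bz2 compressed text and generating word counts) can be sketched with the Python standard library. This is not the code shipped with the dataset: the function name is hypothetical, the sample text is made up, and splitting on whitespace is an assumed, deliberately crude tokenizer.

```python
import bz2
import io
from collections import Counter


def count_words(bz2_file, top_n=10):
    """Stream a bz2-compressed UTF-8 text file and return its most common words.

    `bz2_file` may be a path or a binary file-like object. Lines are read one
    at a time, so a large dump does not have to fit in memory.
    """
    counts = Counter()
    with bz2.open(bz2_file, mode="rt", encoding="utf-8") as lines:
        for line in lines:
            # Whitespace splitting keeps Tamil words intact, including their
            # vowel signs, but leaves punctuation attached to words.
            counts.update(line.split())
    return counts.most_common(top_n)


# Made-up two-line sample, compressed in memory for demonstration.
sample_text = "தமிழ் மொழி\nதமிழ் இலக்கியம்\n"
sample = io.BytesIO(bz2.compress(sample_text.encode("utf-8")))
print(count_words(sample, top_n=1))
```

Whitespace splitting is used here instead of a `\w+` regular expression because Python's `\w` does not match Tamil combining vowel signs, which would break words apart.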

Why This Dataset?

Although Tamil has a documented lexicon of over 100,000 words, only a fraction of them are actively used in everyday communication. The largest available Tamil treebank currently holds only 10,000 words, limiting the scope for training accurate language models. This dataset aims to bridge that gap by providing a robust, open-source corpus for researchers, developers, and linguists who want to work on Tamil language technologies.

How You Can Use This Dataset

- Language Modeling: Train or fine-tune models like BERT, GPT, or LSTM-based language models for Tamil.

- Linguistic Research: Analyze Tamil morphology, syntax, and vocabulary usage.

- Data Augmentation: Use the raw text to generate augmented data for multilingual NLP applications.

- Word Embeddings and Semantic Analysis: Create embeddings for Tamil words, useful in multilingual setups or standalone applications.

Let’s Collaborate!

I believe that advancing Tamil in NLP cannot be a solo effort. Contributions in the form of additional data, annotations, or even new tools for Tamil language processing are welcome! By working together, we can make Tamil a truly high-resource language in NLP.

License

This dataset is based on content from Tamil Wikipedia and is shared under the Creative Commons Attribution-ShareAlike 3.0 Unported License (CC BY-SA 3.0). Proper attribution to Wikipedia is required when using this data.
Tamil Open Ended Question Answer Text Dataset
futurebeeai.com
wav
Updated Aug 1, 2022
Cite
FutureBee AI (2022). Tamil Open Ended Question Answer Text Dataset [Dataset]. https://www.futurebeeai.com/dataset/prompt-response-dataset/tamil-open-ended-question-answer-text-dataset
Explore at:
Available download formats: wav
Dataset updated
Aug 1, 2022
Dataset provided by
FutureBeeAI
Authors
FutureBee AI
License
https://www.futurebeeai.com/policies/ai-data-license-agreement
Dataset funded by
FutureBeeAI
Description
The Tamil Open-Ended Question Answering Dataset is a meticulously curated collection of comprehensive Question-Answer pairs. It serves as a valuable resource for training Large Language Models (LLMs) and Question-answering models in the Tamil language, advancing the field of artificial intelligence.
Dataset Content:
This QA dataset comprises a diverse set of open-ended questions paired with corresponding answers in Tamil. There is no context paragraph given to choose an answer from, and each question is answered without any predefined context content. The questions cover a broad range of topics, including science, history, technology, geography, literature, current affairs, and more.
Each question is accompanied by an answer, providing valuable information and insights to enhance the language model training process. Both the questions and answers were manually curated by native Tamil speakers, with references drawn from diverse sources such as books, news articles, websites, and other reliable references.
This question-answer prompt completion dataset contains different types of prompts, including instruction type, continuation type, and in-context learning (zero-shot, few-shot) type. The dataset also contains questions and answers with different types of rich text, including tables, code, JSON, etc., with proper markdown.
Question Diversity:
To ensure diversity, this Q&A dataset includes questions with varying complexity levels, ranging from easy to medium and hard. Different types of questions, such as multiple-choice, direct, and true/false, are included. Additionally, questions are further classified into fact-based and opinion-based categories, creating a comprehensive variety. The QA dataset also contains questions with constraints and persona restrictions, which make it even more useful for LLM training.
Answer Formats:
To accommodate varied learning experiences, the dataset incorporates different types of answer formats. These formats include single-word, short-phrase, single-sentence, and paragraph answers. Answers also contain text strings, numerical values, and date and time formats. Such diversity strengthens the language model's ability to generate coherent and contextually appropriate answers.
Data Format and Annotation Details:
This fully labeled Tamil Open Ended Question Answer Dataset is available in JSON and CSV formats. It includes annotation details such as id, language, domain, question_length, prompt_type, question_category, question_type, complexity, answer_type, rich_text.
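
The annotation fields listed above lend themselves to simple filtering and tallying. The sketch below assumes the JSON release is a top-level list of objects; that layout and every field value shown are hypothetical, and only the field names (id, language, domain, question_type, complexity, answer_type) come from the listing.

```python
import json
from collections import Counter

# Hypothetical records: field names come from the listing, but the values and
# the top-level list layout are assumptions made for illustration.
SAMPLE_JSON = """
[
  {"id": "1", "language": "Tamil", "domain": "science",
   "question_type": "direct", "complexity": "easy", "answer_type": "single word"},
  {"id": "2", "language": "Tamil", "domain": "history",
   "question_type": "true/false", "complexity": "hard", "answer_type": "single word"},
  {"id": "3", "language": "Tamil", "domain": "science",
   "question_type": "multiple-choice", "complexity": "easy", "answer_type": "short phrase"}
]
"""


def filter_records(records, **criteria):
    """Keep records whose annotation fields equal every given criterion."""
    return [
        record
        for record in records
        if all(record.get(field) == value for field, value in criteria.items())
    ]


records = json.loads(SAMPLE_JSON)
by_complexity = Counter(record["complexity"] for record in records)
easy_science = filter_records(records, complexity="easy", domain="science")

print(by_complexity)
print([record["id"] for record in easy_science])
```

The same `filter_records` call works for any of the other annotation fields, for example `prompt_type` or `question_category`, provided the records actually carry them.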
Quality and Accuracy:
The dataset upholds the highest standards of quality and accuracy. Each question undergoes careful validation, and the corresponding answers are thoroughly verified. To prioritize inclusivity, the dataset incorporates questions and answers representing diverse perspectives and writing styles, ensuring it remains unbiased and avoids perpetuating discrimination.
Both the questions and answers in Tamil are grammatically accurate, without spelling or grammatical errors. No copyrighted, toxic, or harmful content was used in building this dataset.
Continuous Updates and Customization:
The entire dataset was prepared with the assistance of human curators from the FutureBeeAI crowd community. Continuous efforts are made to add more assets to this dataset, ensuring its growth and relevance. Additionally, FutureBeeAI offers the ability to collect custom question-answer data tailored to specific needs, providing flexibility and customization options.
License:
The dataset, created by FutureBeeAI, is now ready for commercial use. Researchers, data scientists, and developers can utilize this fully labeled and ready-to-deploy Tamil Open Ended Question Answer Dataset to enhance the language understanding capabilities of their generative AI models, improve response generation, and explore new approaches to NLP question-answering tasks.
Share of disabled population in Tamil Nadu India 2018, by type and gender
statista.com
Updated Jan 15, 2021
Cite
Statista (2021). Share of disabled population in Tamil Nadu India 2018, by type and gender [Dataset]. https://www.statista.com/statistics/1080066/india-disabled-persons-by-type-and-gender-tamil-nadu/
Explore at:
Dataset updated
Jan 15, 2021
Dataset authored and provided by
Statista: http://statista.com/
Time period covered
Jul 2018 - Dec 2018
Area covered
India
Description
According to the 76th round of the NSO survey, conducted between July and December 2018, a higher percentage of men than women had disabilities in India. In Tamil Nadu specifically, two percent of men had multiple disabilities, compared with 1.9 percent of women. The National Statistical Office (NSO) is the statistical wing of the Ministry of Statistics and Programme Implementation (MOSPI), mainly responsible for laying down standards for statistical analysis, data collection, and implementation.
India Census: Population: Tamil Nadu: Female
ceicdata.com
Cite
CEICdata.com, India Census: Population: Tamil Nadu: Female [Dataset]. https://www.ceicdata.com/en/india/census-population-by-states/census-population-tamil-nadu-female
Explore at:
Dataset provided by
CEICdata.com
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Time period covered
Mar 1, 1901 - Mar 1, 2011
Area covered
India
Variables measured
Population
Description
Census: Population: Tamil Nadu: Female was reported at 36,009,055 persons for Mar 1, 2011, up from 31,004,770 persons for Mar 1, 2001. The series is updated every ten years, with a median of 15,945,649 persons from Mar 1901 to Mar 2011 (12 observations). It reached an all-time high of 36,009,055 persons in Mar 2011 and a record low of 9,833,232 persons in Mar 1901. The series remains active in CEIC and is reported by the Office of the Registrar General & Census Commissioner, India. It is categorized under India Premium Database’s Demographic – Table IN.GAB002: Census: Population: by States.
Tamil General Domain Scripted Monologue Speech Data
futurebeeai.com
wav
Updated Aug 1, 2022
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
FutureBee AI (2022). Tamil General Domain Scripted Monologue Speech Data [Dataset]. https://www.futurebeeai.com/dataset/monologue-speech-dataset/general-scripted-speech-monologues-tamil-india
Explore at:
Available download formats: wav
Dataset updated
Aug 1, 2022
Dataset provided by
FutureBeeAI
Authors
FutureBee AI
License
https://www.futurebeeai.com/policies/ai-data-license-agreement
Dataset funded by
FutureBeeAI
Description
Introduction
The Tamil Scripted Monologue Speech Dataset for the General Domain is a carefully curated resource designed to support the development of Tamil language speech recognition systems. This dataset focuses on general-purpose conversational topics and is ideal for a wide range of AI applications requiring natural, domain-agnostic Tamil speech data.
Speech Data
This dataset features over 6,000 high-quality scripted monologue recordings in Tamil. The prompts span diverse real-life topics commonly encountered in general conversations and are intended to help train robust and accurate speech-enabled technologies.
•Participant Diversity
  •Speakers: 60 native Tamil speakers
  •Regions: Broad regional coverage ensures diverse accents and dialects
  •Demographics: Participants aged 18 to 70, with a 60:40 male-to-female ratio
•Recording Specifications
  •Recording Type: Scripted monologues and prompt-based recordings
  •Audio Duration: 5 to 30 seconds per file
  •Format: WAV, mono channel, 16-bit, 8 kHz & 16 kHz sample rates
  •Environment: Clean, noise-free conditions to ensure clarity and usability

Topic Coverage
The dataset covers a wide variety of general conversation scenarios, including:
•Daily Conversations
•Topic-Specific Discussions
•General Knowledge and Advice
•Idioms and Sayings
Contextual Features
To enhance authenticity, the prompts include:
•Names: Male and female names specific to different Tamil Nadu regions
•Addresses: Commonly used address formats in daily Tamil speech
•Dates & Times: References used in general scheduling and time expressions
•Organization Names: Names of businesses, institutions, and other entities
•Numbers & Currencies: Mentions of quantities, prices, and monetary values

Each prompt is designed to reflect everyday use cases, making it suitable for developing generalized NLP and ASR solutions.
Transcription
Every audio file in the dataset is accompanied by a verbatim text transcription, ensuring accurate training and evaluation of speech models.
•Content: Exact match to the spoken audio
•Format: Plain text (.TXT), named identically to the corresponding audio file
•Quality Control: All transcripts are validated by native Tamil transcribers

Metadata
Rich metadata is included for detailed filtering and analysis:
•Speaker Metadata: Unique speaker ID, age, gender, region, and dialect
•Audio Metadata: Prompt transcript, recording setup, device specs, sample rate, bit depth, and format

Applications & Use Cases
This dataset can power a variety of Tamil language AI technologies, including:
•Speech Recognition Training: ASR model development and fine-tuning
Tamil Nadu Crop Production Dataset
kaggle.com
zip
Updated Jan 22, 2024
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Muhammad Usman (2024). Tamil Nadu Crop Production Dataset [Dataset]. https://www.kaggle.com/datasets/usmanlovescode/tamil-nadu-crop-production-dataset
Explore at:
Available download formats: zip (107473 bytes)
Dataset updated
Jan 22, 2024
Authors
Muhammad Usman
License
https://creativecommons.org/publicdomain/zero/1.0/
Area covered
Tamil Nadu
Description
Delve into the agriculture of Tamil Nadu with this dataset, offering comprehensive insights into crop production. Covering various crops, the dataset provides valuable information on yields, trends, and patterns, serving as a crucial resource for researchers, policymakers, and stakeholders interested in understanding the agricultural landscape of Tamil Nadu.
Tamil Nadu Non working population
knoema.com
csv, json, sdmx, xls
Updated Jan 1, 2013
Cite
Knoema (2013). Tamil Nadu Non working population [Dataset]. https://knoema.com/atlas/India/Tamil-Nadu/topics/Demographics/Population/Non-working-population
Available download formats: xls, json, csv, sdmx
Dataset updated
Jan 1, 2013
Dataset authored and provided by
Knoema (http://knoema.com/)
Time period covered
2001 - 2011
Area covered
Tamil Nadu, India
Variables measured
Non working population
Description
The non-working population of Tamil Nadu rose by 13.71%, from 34,527,397 persons in 2001 to 39,262,349 persons in 2011. Following that 13.71% increase, the source reports a 0.00% change for 2011.
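The 13.71% figure can be checked directly from the two population counts quoted above. This is a small illustrative calculation; the helper name `percent_change` is introduced here and is not part of the Knoema dataset.

```python
def percent_change(old, new):
    """Relative change from `old` to `new`, expressed in percent."""
    return (new - old) / old * 100.0


# Counts quoted in the dataset description.
nonworking_2001 = 34_527_397
nonworking_2011 = 39_262_349

change = percent_change(nonworking_2001, nonworking_2011)
print(f"{change:.2f}%")  # prints "13.71%"
```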
Cities and Towns in TN - Population statistics
kaggle.com
zip
Updated Sep 15, 2020
Cite
Gokul Prakash P (2020). Cities and Towns in TN - Population statistics [Dataset]. https://www.kaggle.com/gokulprakash22/cities-in-tamil-nadu-population-statistics
Available download formats: zip (19074 bytes)
Dataset updated
Sep 15, 2020
Authors
Gokul Prakash P
License
https://creativecommons.org/publicdomain/zero/1.0/
Description
Context

This dataset consists of population statistics, by census year, for cities and towns in Tamil Nadu, obtained from various sources.

Content

This dataset consists of 6 columns: the name of the city or town, its status, its district, and 3 columns of population statistics by census year (1991-03-01, 2001-03-01, 2011-03-01).

Acknowledgements

This was done during an internship at Tact Labs. Thanks to Aishwarya, who helped me collect the dataset.
Dinamalar Tamil News Corpus (1.9 million records)
kaggle.com
zip
Updated Jan 5, 2020
Cite
Vijayabhaskar J (2020). Dinamalar Tamil News Corpus (1.9 million records) [Dataset]. https://www.kaggle.com/vijayabhaskar96/tamil-news-dataset-19-million-records
Available download formats: zip (1051706225 bytes)
Dataset updated
Jan 5, 2020
Authors
Vijayabhaskar J
License
https://creativecommons.org/publicdomain/zero/1.0/
Description
Context

There were no large datasets in the Tamil language apart from the Tamil wiki dataset (120k articles), so I decided to create my own. This dataset is the result!

Content

The data was acquired by scraping the publicly available articles published on Dinamalar.com, a well-known newspaper in Tamil Nadu, India. The dataset contains articles from 2009 to 2019.

Acknowledgements

This dataset exists because of Dinamalar.com. All thanks to them.
COVID 19 Vaccination Coverage across Tamilnadu
kaggle.com
zip
Updated Jun 24, 2021
Cite
Suriyakanth R (2021). COVID 19 Vaccination Coverage across Tamilnadu [Dataset]. https://www.kaggle.com/datasets/suriyakanth2711/covid-19-vaccination-coverage-across-tamilnadu
Available download formats: zip (9494 bytes)
Dataset updated
Jun 24, 2021
Authors
Suriyakanth R
License
https://creativecommons.org/publicdomain/zero/1.0/
Area covered
Tamil Nadu
Description
Coronavirus disease 2019 (COVID-19), also known as the coronavirus, or COVID, is a contagious disease caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). The first known case was identified in Wuhan, China, in December 2019. The disease has since spread worldwide, leading to an ongoing pandemic. This dataset covers COVID-19 vaccination coverage, by Health Unit District, in Tamil Nadu.

Content

Data columns (total 39 columns):

0 S.No
1 Health Unit District
2 Achievement towards vaccination of 1st Dosage Covishield to HCW
3 Achievement towards vaccination of 2nd Dosage Covishield to HCW
4 Achievement towards vaccination of 1st Dosage Covishield to FLW
5 Achievement towards vaccination of 2nd Dosage Covishield to FLW
6 Achievement towards vaccination of 1st Dosage Covishield to beneficiaries of 18 years and less than 44 years age group
7 Achievement towards vaccination of 2nd Dosage Covishield to beneficiaries of 18 years and less than 44 years age group
8 Achievement towards vaccination of 1st Dosage Covishield to beneficiaries of 45 years and less than 60 years age group with Comorbidities
9 Achievement towards vaccination of 2nd Dosage Covishield to beneficiaries of 45 years and less than 60 years age group with Comorbidities
10 Achievement towards vaccination of 1st Dosage Covishield to 60+ years beneficiaries with Comorbidities
11 Achievement towards vaccination of 2nd Dosage Covishield to 60+ years beneficiaries with Comorbidities
12 Total Achievement of vaccination to beneficiaries under 1st Dose of Covishield
13 Total Achievement of vaccination to beneficiaries under 2nd Dose of Covishield
14 Achievement towards vaccination of 1st Dosage Covaxin to HCW
15 Achievement towards vaccination of 2nd Dosage Covaxin to HCW
16 Achievement towards vaccination of 1st Dosage Covaxin to FLW
17 Achievement towards vaccination of 2nd Dosage Covaxin to FLW
18 Achievement towards vaccination of 1st Dosage Covaxin to beneficiaries of 18 years and less than 44 years age group
19 Achievement towards vaccination of 2nd Dosage Covaxin to beneficiaries of 18 years and less than 44 years age group
20 Achievement towards vaccination of 1st Dosage Covaxin to beneficiaries of 45 years and less than 60 years age group with Comorbidities
21 Achievement towards vaccination of 2nd Dosage Covaxin to beneficiaries of 45 years and less than 60 years age group with Comorbidities
22 Achievement towards vaccination of 1st Dosage Covaxin to 60+ years beneficiaries with Comorbidities
23 Achievement towards vaccination of 2nd Dosage Covaxin to 60+ years beneficiaries with Comorbidities
24 Total Achievement of vaccination to beneficiaries under 1st Dose of Covaxin
25 Total Achievement of vaccination to beneficiaries under 2nd Dose of Covaxin ...
arani_data
kaggle.com
zip
Updated Apr 7, 2022
Cite
Anup1997 (2022). arani_data [Dataset]. https://www.kaggle.com/datasets/anup1997/arani-data
Available download formats: zip (135047 bytes)
Dataset updated
Apr 7, 2022
Authors
Anup1997
Description
This dataset shows the male population of Arani village, which lies in the south-eastern part of the state of Tamil Nadu.

Tamilnadu Population
Place wise population collection
Cite
Vaishnavi (2020). Tamilnadu Population [Dataset]. https://www.kaggle.com/datasets/vaishnavivenkatesan/tamilnadu-population
112 scholarly articles cite this dataset
Available download formats: zip (19029 bytes)
Dataset updated
Sep 18, 2020
Authors
Vaishnavi
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Area covered
Tamil Nadu
Description
Context

This dataset consists of population figures for three years in Tamil Nadu.

Content

This file consists of information about the places: their population, district, and position.

Acknowledgements

This was done during an internship at Tact Labs. Thanks to Aishwarya, who aided me in collecting the dataset.

