This study is Pew Research Center's most comprehensive, in-depth exploration of India to date. For this report, Pew surveyed 29,999 Indian adults (including 22,975 who identify as Hindu, 3,336 who identify as Muslim, 1,782 who identify as Sikh, 1,011 who identify as Christian, 719 who identify as Buddhist, 109 who identify as Jain and 67 who identify as belonging to another religion or as religiously unaffiliated). Interviews for this nationally representative survey were conducted face-to-face under the direction of RTI International from November 17, 2019, to March 23, 2020. Respondents were surveyed about religious beliefs and practices, religious identity, nationalism, and tolerance in Indian society.
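As a quick consistency check on the sample sizes quoted above, the per-group counts can be summed and converted to shares of the interviewed sample. This is only a sketch: the dictionary keys are labels chosen here, and the shares are presumably raw interview shares rather than weighted population estimates.

```python
# Sample sizes as quoted in the description above.
sample = {
    "Hindu": 22_975,
    "Muslim": 3_336,
    "Sikh": 1_782,
    "Christian": 1_011,
    "Buddhist": 719,
    "Jain": 109,
    "Other or unaffiliated": 67,
}

# The group counts should add up to the stated total of 29,999 adults.
total = sum(sample.values())

# Share of each group among interviewed respondents (unweighted).
shares = {group: count / total for group, count in sample.items()}

for group, share in shares.items():
    print(f"{group}: {share:.1%} of respondents")
```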
https://dataful.in/terms-and-conditions
This dataset contains the total number of households at the all-India and state level, broken down by gender of the head of household and by religion.
Note: Religion includes Buddhist, Christian, Hindu, Jain, Muslim, Others, Sikh
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The religious dataset consisting of Hindu and Muslim hate comments from Bangladesh and India in the Bangla language is a collection of online comments that contain religious hate speech targeting either the Hindu or Muslim communities. These comments were gathered from various sources such as newspapers, social media platforms, and online forums. The purpose of collecting this data is to analyze the prevalence of religious intolerance, identify patterns in hate speech, and contribute to the development of tools for automatically detecting and mitigating such content.
Key Features of the Dataset
Source and Collection: Comments were sourced from both Bangladesh and India, reflecting religious sentiments in these neighboring countries where tensions between religious groups have often been a social issue. Sources include Bangla-language social media, news articles, opinion pieces, and comments sections on websites.
Content: The dataset contains a mix of both Hindu-targeted hate speech and Muslim-targeted hate speech, with derogatory, offensive, and inflammatory remarks based on religion. Hate comments include stereotypical statements, incitement to violence, communal hatred, and discriminatory language directed at members of the opposing community.
Purpose and Use Cases:
* Hate Speech Detection: This dataset is useful for developing machine learning models that can automatically identify and flag harmful content on social media platforms.
* Social Science Research: Researchers can study the psychological and sociopolitical factors that drive such hate speech.
* Policy and Moderation Tools: Governments, social media platforms, and civil society organizations can use insights from this dataset to design anti-hate speech policies and create moderation systems that reduce online hate.
Challenges:
* Contextual Nuances: Understanding the cultural and religious context of Bangla comments is crucial for accurately identifying hate speech. A comment that might seem neutral in one context could be deeply offensive in another.
* Code-Switching: Some comments might mix Bangla with English or regional languages, complicating the classification and sentiment analysis process.
* Bias in Data: The dataset might reflect a certain level of social bias depending on the region from which it was collected, which needs to be addressed when training AI models.
Conclusion: This dataset offers valuable insights into the dynamics of religious hate speech in Bangladesh and India, two countries with diverse religious populations and a history of interfaith tension. It can help in the development of tools for mitigating online hate speech, while also fostering better understanding and tolerance across religious communities.
The idea for this dataset came to me while I was looking for a project to submit on the jovian.ml platform as part of one of their tasks. (It is a very good site for beginners who want to learn data science and Python skills; they arranged the course in collaboration with freeCodeCamp.) I wanted to make a unique project, and for that I needed my own dataset. That is why I created this one.
It contains names of gods with their meanings in Sanskrit, translated into English for better understanding. I collected this data from different scriptures and works of Sanskrit literature. More will be added soon.
Thanks to my school Sanskrit teacher, Mr. V. B. Patil Sir, who introduced us to this language.
https://dataful.in/terms-and-conditions
The dataset contains the primary census abstract categorised by religion in Kerala. The list covers different religions, including Hindu, Buddhist, Christian, Muslim, Jain, Sikh, etc., along with the region type, specifying whether it is urban or rural. The data is from the 2011 census.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
India Census: Population: by Religion: Muslim: Urban data was reported at 68,740,419.000 Person in 2011. This records an increase from the previous number of 49,393,496.000 Person for 2001. India Census: Population: by Religion: Muslim: Urban data is updated yearly, averaging 59,066,957.500 Person from Mar 2001 (Median) to 2011, with 2 observations. The data reached an all-time high of 68,740,419.000 Person in 2011 and a record low of 49,393,496.000 Person in 2001. India Census: Population: by Religion: Muslim: Urban data remains active status in CEIC and is reported by Census of India. The data is categorized under India Premium Database’s Demographic – Table IN.GAE001: Census: Population: by Religion.
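The summary figures in entries of this kind follow directly from the two census observations. A minimal sketch of that arithmetic, using the 2001 and 2011 values quoted above (the variable names are chosen here for illustration, and the growth figure is derived, not reported by the source):

```python
from statistics import mean

# The two census observations quoted above (persons).
observations = {2001: 49_393_496, 2011: 68_740_419}

# With only two observations, the reported "average" is their midpoint.
average = mean(observations.values())

# Change between the two censuses, in absolute and relative terms.
absolute_change = observations[2011] - observations[2001]
decadal_growth = absolute_change / observations[2001]

print(f"average: {average:,.3f}")
print(f"change 2001 to 2011: {absolute_change:,} ({decadal_growth:.1%})")
```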
https://dataful.in/terms-and-conditions
The dataset contains the primary census abstract categorised by religion in Delhi. The list covers different religions, including Hindu, Buddhist, Christian, Muslim, Jain, Sikh, etc., along with the region type, specifying whether it is urban or rural. The data is from the 2011 census.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Census: Population: by Religion: Christian: Punjab: Female data was reported at 166,189.000 Person in 03-01-2011. This records an increase from the previous number of 138,127.000 Person for 03-01-2001. Census: Population: by Religion: Christian: Punjab: Female data is updated decadal, averaging 152,158.000 Person from Mar 2001 (Median) to 03-01-2011, with 2 observations. The data reached an all-time high of 166,189.000 Person in 03-01-2011 and a record low of 138,127.000 Person in 03-01-2001. Census: Population: by Religion: Christian: Punjab: Female data remains active status in CEIC and is reported by Office of the Registrar General & Census Commissioner, India. The data is categorized under India Premium Database’s Demographic – Table IN.GAE004: Census: Population: by Religion: Christian.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
India Census: Population: by Religion: Hindu: Male data was reported at 498,306,968.000 Person in 2011. This records an increase from the previous number of 428,678,554.000 Person for 2001. India Census: Population: by Religion: Hindu: Male data is updated yearly, averaging 463,492,761.000 Person from Mar 2001 (Median) to 2011, with 2 observations. The data reached an all-time high of 498,306,968.000 Person in 2011 and a record low of 428,678,554.000 Person in 2001. India Census: Population: by Religion: Hindu: Male data remains active status in CEIC and is reported by Census of India. The data is categorized under India Premium Database’s Demographic – Table IN.GAE001: Census: Population: by Religion.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Census: Population: by Religion: Christian: Lakshadweep: Male data was reported at 286.000 Person in 03-01-2011. This records a decrease from the previous number of 422.000 Person for 03-01-2001. Census: Population: by Religion: Christian: Lakshadweep: Male data is updated decadal, averaging 354.000 Person from Mar 2001 (Median) to 03-01-2011, with 2 observations. The data reached an all-time high of 422.000 Person in 03-01-2001 and a record low of 286.000 Person in 03-01-2011. Census: Population: by Religion: Christian: Lakshadweep: Male data remains active status in CEIC and is reported by Office of the Registrar General & Census Commissioner, India. The data is categorized under India Premium Database’s Demographic – Table IN.GAE004: Census: Population: by Religion: Christian.
https://dataful.in/terms-and-conditions
The dataset contains the primary census abstract divided by religion. The list covers different religions, including Hindu, Buddhist, Christian, Muslim, Jain, Sikh, etc., along with the region type, specifying urban or rural.
This dataset captures regional superstitions and beliefs from all 28 states and 8 union territories across India, offering a glimpse into the diverse cultural fabric that shapes daily life. It aims to preserve and explore India’s rich cultural heritage through data. The collection is notable for its uniqueness, as there is no existing large-scale dataset that details Indian superstitions region-wise with comparable depth and breadth. It holds significant cultural and linguistic value, making it highly suitable for Natural Language Processing (NLP) applications and serving an educational purpose by raising awareness about India’s intangible cultural heritage.
The dataset includes the following columns:
* id: A unique identifier for each superstition entry.
* superstition_name: A concise name or label for the superstition.
* description: A detailed explanation of the belief or practice.
* region: Specifies the State or Union Territory where the belief is commonly observed.
* category: Indicates the type of superstition, such as omen, protection, health, taboo, wealth, pregnancy-related, weekly beliefs, or ghost/spirits.
* origin_theory: Provides the folk or cultural explanation, or historical root of the belief.
* modern_status: Identifies whether the belief is still followed in the specified region (Yes/No/Partially).
* is_harmful: States whether the belief might have detrimental effects, e.g., social or medical.
* source: Describes the type of source from which the data was collected, such as oral tradition, community elders, or user contributed.
* user_contributed: A flag indicating if the entry was directly contributed by users or sourced from communities.
The dataset is provided in CSV format and consists of two main files: train.csv and test.csv. The train.csv file contains over 500 superstition entries, with approximately 20 entries per state or Union Territory. The test.csv file includes over 100 entries, typically 1 to 2 per state or Union Territory, intended for model validation or exploration. Both files share a similar structure, making them suitable for supervised learning tasks.
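A minimal sketch of how a file with the columns listed above might be read and summarised using only the Python standard library. The three sample rows are hypothetical placeholders, not real entries, and the value formats assumed for is_harmful and user_contributed are guesses, since the description does not specify them.

```python
import csv
import io
from collections import Counter

# Column names as listed in the dataset description.
EXPECTED_COLUMNS = [
    "id", "superstition_name", "description", "region", "category",
    "origin_theory", "modern_status", "is_harmful", "source",
    "user_contributed",
]

# Hypothetical placeholder rows standing in for train.csv.
# The is_harmful and user_contributed formats are assumptions.
SAMPLE_CSV = """\
id,superstition_name,description,region,category,origin_theory,modern_status,is_harmful,source,user_contributed
1,Example belief A,Placeholder text,Kerala,omen,Placeholder theory,Yes,No,oral tradition,False
2,Example belief B,Placeholder text,Kerala,health,Placeholder theory,Partially,Yes,community elders,False
3,Example belief C,Placeholder text,Punjab,taboo,Placeholder theory,No,No,user contributed,True
"""


def load_rows(handle):
    """Read CSV rows as dicts and check that the documented columns exist."""
    reader = csv.DictReader(handle)
    missing = [c for c in EXPECTED_COLUMNS if c not in (reader.fieldnames or [])]
    if missing:
        raise ValueError(f"missing columns: {missing}")
    return list(reader)


rows = load_rows(io.StringIO(SAMPLE_CSV))

# Number of entries per state or Union Territory.
entries_per_region = Counter(row["region"] for row in rows)

# Distribution of the Yes/No/Partially status field.
status_counts = Counter(row["modern_status"] for row in rows)

print(entries_per_region)
print(status_counts)
```

To run the same summary on the actual file, the in-memory sample could be swapped for `open("train.csv", newline="", encoding="utf-8")`; the UTF-8 encoding is an assumption.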
This dataset is ideally suited for various applications, including:
* Exploring regional cultural differences in beliefs across India.
* Training machine learning models for classifying or generating folklore-related text.
* Developing AI assistants that can understand and respond to regional cultural nuances.
* Enhancing chatbots with culturally relevant responses and information.
* Academic research in social sciences, humanities, cultural studies, anthropology, folklore, and linguistics.
* NLP applications such as text classification, sentiment analysis, and entity recognition.
* Building cultural AI systems.
The dataset's geographic scope covers all 28 states and 8 union territories of India, ensuring a wide representation of regional beliefs. The time range encompasses traditions passed down through generations, with a focus on their modern status. While specific demographic details are not outlined, the dataset captures beliefs shaping daily life across the country, aiming for a fair representation of all regions.
CC-BY
The dataset is intended for:
* Researchers in cultural studies, anthropology, folklore, and linguistics.
* Data scientists and machine learning engineers working on NLP applications.
* Academics engaged in social sciences and humanities research.
* Developers creating AI assistants or chatbots.
* Anyone interested in India's intangible cultural heritage.
Original Data Source: Regional Indian Superstitions & Beliefs
https://dataful.in/terms-and-conditions
This dataset provides the unemployment rates for major religious groups in India, based on usual status (ps+ss). For years before 2017-18, the data was obtained from the quinquennial rounds of the NSSO conducted from 2004-05 (NSS 61st round) to 2011-12 (NSS 68th round). From 2017-18 onwards, the data is sourced from the annual report of the Periodic Labour Force Survey (PLFS) conducted by the Ministry of Statistics and Programme Implementation. The data highlights unemployment trends within different religious communities.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Census: Population: by Religion: Christian: Madhya Pradesh: Female data was reported at 107,985.000 Person in 03-01-2011. This records an increase from the previous number of 85,025.000 Person for 03-01-2001. Census: Population: by Religion: Christian: Madhya Pradesh: Female data is updated decadal, averaging 96,505.000 Person from Mar 2001 (Median) to 03-01-2011, with 2 observations. The data reached an all-time high of 107,985.000 Person in 03-01-2011 and a record low of 85,025.000 Person in 03-01-2001. Census: Population: by Religion: Christian: Madhya Pradesh: Female data remains active status in CEIC and is reported by Office of the Registrar General & Census Commissioner, India. The data is categorized under India Premium Database’s Demographic – Table IN.GAE004: Census: Population: by Religion: Christian.
https://dataful.in/terms-and-conditions
The dataset contains the primary census abstract categorised by religion in Assam. The list covers different religions, including Hindu, Buddhist, Christian, Muslim, Jain, Sikh, etc., along with the region type, specifying whether it is urban or rural. The data is from the 2011 census.
https://dataful.in/terms-and-conditions
The dataset contains year- and region-wise compiled data on the distribution of households (per thousand) of different social groups (castes) and religions by their access to and use of principal sources of drinking water. The social groups and religions covered in the dataset include Scheduled Tribes (STs), Scheduled Castes (SCs), Other Backward Classes (OBCs), Hinduism, Sikhism, Islam (Muslim), Christianity, etc. The types of household access to and use of drinking water covered include exclusive, community, common, neighbour's source, etc.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
In modern times, the world is divided along many lines. Within this, economic inequality in India takes several traditional forms: the divide between rich and poor, the unequal distribution of income, and divisions based on caste, religion, gender, etc. Of these, caste-based inequality is detrimental to Indian economic development. Caste emerged in Indian society as a system of income and distribution. It is governed everywhere by distinct and peculiar traditional rules and norms. It can therefore be said that in a caste-based economy, business and property rights are hereditary, and each caste is forced to keep them unchanged. All the castes in India are based on this socialization. Although religious conversion is possible in India, caste cannot be changed under any circumstances: a person dies in the caste into which they were born. In India, the practice that creates these divisions at the social level is called caste discrimination. In the literature of modern economics, the concepts of exclusion and economic discrimination are considered in relation to race, caste, or gender.
https://dataful.in/terms-and-conditions
The dataset contains the primary census abstract categorised by religion in Mizoram. The list covers different religions, including Hindu, Buddhist, Christian, Muslim, Jain, Sikh, etc., along with the region type, specifying whether it is urban or rural. The data is from the 2011 census.
https://dataful.in/terms-and-conditions
The dataset contains the primary census abstract categorised by religion in Maharashtra. The list covers different religions, including Hindu, Buddhist, Christian, Muslim, Jain, Sikh, etc., along with the region type, specifying whether it is urban or rural. The data is from the 2011 census.
https://dataful.in/terms-and-conditions
The dataset contains the primary census abstract categorised by religion in Uttarakhand. The list covers different religions, including Hindu, Buddhist, Christian, Muslim, Jain, Sikh, etc., along with the region type, specifying whether it is urban or rural. The data is from the 2011 census.