https://creativecommons.org/publicdomain/zero/1.0/
Shark Tank India - Season 1 to Season 4 information, with 80 fields/columns and 630+ records.
All seasons/episodes of 🦈 SHARKTANK INDIA 🇮🇳 were broadcast on SonyLiv OTT/Sony TV.
Here is the data dictionary for the Shark Tank India seasons dataset.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Rural population (% of total population) in India was reported at 63.13% in 2024, according to the World Bank collection of development indicators, compiled from officially recognized sources. India - Rural population - actual values, historical data, forecasts and projections were sourced from the World Bank in July 2025.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Key information about India Employed Persons
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Researchers in R&D (per million people) in India was reported at 259 in 2020, according to the World Bank collection of development indicators, compiled from officially recognized sources. India - Researchers in R&D (per million people) - actual values, historical data, forecasts and projections were sourced from the World Bank in July 2025.
Attribution-NoDerivs 4.0 (CC BY-ND 4.0) https://creativecommons.org/licenses/by-nd/4.0/
License information was derived automatically
PerCapita_CO2_Footprint_InDioceses_FULL
Burhans, Molly A., Cheney, David M., Gerlt, R. “PerCapita_CO2_Footprint_InDioceses_FULL”. Scale not given. Version 1.0. MO and CT, USA: GoodLands Inc., Environmental Systems Research Institute, Inc., 2019.
Methodology
This is the first global carbon footprint of the Catholic population. We will continue to improve and develop these data with our research partners over the coming years. While it is helpful, it should also be viewed and used as a "beta" prototype that we and our research partners will build from and improve. The years of carbon data are 2010 and 2015 (shown). The year of Catholic data is 2018. The year of population data is 2016. Care should be taken during future developments to harmonize the years used for Catholic, population, and CO2 data.
1. Zonal Statistics: Esri Population Data and Dioceses --> Population per diocese, non-Vatican-based numbers
2. Zonal Statistics: FFDAS and Dioceses and Population dataset --> Mean CO2 per Diocese
3. Field Calculation: Population per Diocese and Mean CO2 per Diocese --> CO2 per Capita
4. Field Calculation: CO2 per Capita * Catholic Population --> Catholic Carbon Footprint
Assumption: PerCapita CO2
Deriving per-capita CO2 from mean CO2 in a geography assumes that people's footprint accounts for their personal lifestyle and involvement in local businesses and industries that contribute CO2.
Catholic CO2
Assumes that Catholics and non-Catholics have similar CO2 footprints from their lifestyles.
Derived from:
A multiyear, global gridded fossil fuel CO2 emission data product: Evaluation and analysis of results
http://ffdas.rc.nau.edu/About.html
Rayner et al., JGR, 2010 - This is the first FFDAS paper, describing the version 1.0 methods and results, published in the Journal of Geophysical Research.
Asefi et al., 2014 - This is the paper describing the methods and results of the FFDAS version 2.0, published in the Journal of Geophysical Research.
Readme version 2.2 - A simple readme file to assist in using the 10 km x 10 km, hourly gridded Vulcan version 2.2 results.
Liu et al., 2017 - A paper exploring the carbon cycle response to the 2015-2016 El Nino through the use of carbon cycle data assimilation with FFDAS as the boundary condition for FFCO2.
"S. Asefi‐Najafabady, P. J. Rayner, K. R. Gurney, A. McRobert, Y. Song, K. Coltin, J. Huang, C. Elvidge, K. Baugh
First published: 10 September 2014 https://doi.org/10.1002/2013JD021296
Cited by: 30
Link to FFDAS data retrieval and visualization: http://hpcg.purdue.edu/FFDAS/index.php
Abstract
High‐resolution, global quantification of fossil fuel CO2 emissions is emerging as a critical need in carbon cycle science and climate policy. We build upon a previously developed fossil fuel data assimilation system (FFDAS) for estimating global high‐resolution fossil fuel CO2 emissions. We have improved the underlying observationally based data sources, expanded the approach through treatment of separate emitting sectors including a new pointwise database of global power plants, and extended the results to cover a 1997 to 2010 time series at a spatial resolution of 0.1°. Long‐term trend analysis of the resulting global emissions shows subnational spatial structure in large active economies such as the United States, China, and India.
These three countries, in particular, show different long‐term trends and exploration of the trends in nighttime lights, and population reveal a decoupling of population and emissions at the subnational level. Analysis of shorter‐term variations reveals the impact of the 2008–2009 global financial crisis with widespread negative emission anomalies across the U.S. and Europe. We have used a center of mass (CM) calculation as a compact metric to express the time evolution of spatial patterns in fossil fuel CO2 emissions. The global emission CM has moved toward the east and somewhat south between 1997 and 2010, driven by the increase in emissions in China and South Asia over this time period. Analysis at the level of individual countries reveals per capita CO2 emission migration in both Russia and India. The per capita emission CM holds potential as a way to succinctly analyze subnational shifts in carbon intensity over time. Uncertainties are generally lower than the previous version of FFDAS due mainly to an improved nightlight data set."
Global Diocesan Boundaries:
Burhans, M., Bell, J., Burhans, D., Carmichael, R., Cheney, D., Deaton, M., Emge, T., Gerlt, B., Grayson, J., Herries, J., Keegan, H., Skinner, A., Smith, M., Sousa, C., Trubetskoy, S. “Diocesean Boundaries of the Catholic Church” [Feature Layer]. Scale not given. Version 1.2. Redlands, CA, USA: GoodLands Inc., Environmental Systems Research Institute, Inc., 2016.
Using: ArcGIS. 10.4. Version 10.0. Redlands, CA: Environmental Systems Research Institute, Inc., 2016.
Boundary Provenance
Statistics and Leadership Data
Cheney, D.M. “Catholic Hierarchy of the World” [Database]. Date Updated: August 2019. Catholic Hierarchy. Using: Paradox. Retrieved from Original Source.
Catholic Hierarchy
Annuario Pontificio per l’Anno ..
Città del Vaticano: Tipografia Poliglotta Vaticana, Multiple Years.
The data for these maps was extracted from the gold standard of Church data, the Annuario Pontificio, published yearly by the Vatican. The collection and data development of the Vatican Statistics Office are unknown. GoodLands is not responsible for errors within this data. We encourage people to document and report errant information to us at data@good-lands.org or directly to the Vatican.
Additional information about regular changes in bishops and sees comes from a variety of public diocesan and news announcements.
GoodLands’ polygon data layers, version 2.0 for global ecclesiastical boundaries of the Roman Catholic Church:
Although care has been taken to ensure the accuracy, completeness and reliability of the information provided, due to this being the first developed dataset of global ecclesiastical boundaries curated from many sources it may have a higher margin of error than established geopolitical administrative boundary maps. Boundaries need to be verified with appropriate Ecclesiastical Leadership. The current information is subject to change without notice. No parties involved with the creation of this data are liable for indirect, special or incidental damage resulting from, arising out of or in connection with the use of the information. We referenced 1960 sources to build our global datasets of ecclesiastical jurisdictions. Often, they were isolated images of dioceses, historical documents and information about parishes that were cross checked. These sources can be viewed here:
https://docs.google.com/spreadsheets/d/11ANlH1S_aYJOyz4TtG0HHgz0OLxnOvXLHMt4FVOS85Q/edit#gid=0
To learn more or contact us please visit: https://good-lands.org/
Esri Gridded Population Data 2016
Description
This layer is a global estimate of human population for 2016.
Esri created this estimate by modeling a footprint of where people live as a dasymetric settlement likelihood surface, and then assigned 2016 population estimates stored on polygons of the finest level of geography available onto the settlement surface. Where people live means where their homes are, as in where people sleep most of the time, and this is opposed to where they work. Another way to think of this estimate is a night-time estimate, as opposed to a day-time estimate.
Knowledge of population distribution helps us understand how humans affect the natural world and how natural events such as storms and earthquakes, and other phenomena affect humans. This layer represents the footprint of where people live, and how many people live there.
Dataset Summary
Each cell in this layer has an integer value with the estimated number of people likely to live in the geographic region represented by that cell. Esri also produced several additional layers:
World Population Estimate Confidence 2016: the confidence level (1-5) per cell for the probability of people being located and estimated correctly.
World Population Density Estimate 2016: this layer is represented as population density in units of persons per square kilometer.
World Settlement Score 2016: the dasymetric likelihood surface used to create this layer by apportioning population from census polygons to the settlement score raster.
To use this layer in analysis, there are several properties or geoprocessing environment settings that should be used:
Coordinate system: WGS_1984. This service and its underlying data are WGS_1984. We do this because projecting population count data actually will change the populations due to resampling and either collapsing or splitting cells to fit into another coordinate system.
Cell Size: 0.0013474728 degrees (approximately 150 meters) at the equator.
No Data: -1
Bit Depth: 32-bit signed
This layer has query, identify, pixel, and export image functions enabled, and is restricted to a maximum analysis size of 30,000 x 30,000 pixels - an area about the size of Africa.
Frye, C. et al., (2018). Using Classified and Unclassified Land Cover Data to Estimate the Footprint of Human Settlement. Data Science Journal. 17, p.20. DOI: http://doi.org/10.5334/dsj-2018-020.
What can you do with this layer?
This layer is unsuitable for mapping or cartographic use, and thus it does not include a convenient legend. Instead, this layer is useful for analysis, particularly for estimating counts of people living within watersheds, coastal areas, and other areas that do not have standard boundaries. Esri recommends using the Zonal Statistics tool or the Zonal Statistics to Table tool where you provide input zones as either polygons, or raster data, and the tool will summarize the count of population within those zones.
https://www.esri.com/arcgis-blog/products/arcgis-living-atlas/data-management/2016-world-population-estimate-services-are-now-available/
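The zonal-statistics workflow described for this dataset (population summed per zone, then CO2 per capita, then a Catholic share) can be sketched in plain Python. This is only an illustration under stated assumptions: the small grids, zone IDs, CO2 totals and Catholic counts are made-up values, the function names are hypothetical, and per-capita CO2 is taken here as zone CO2 divided by zone population, which the description does not spell out. Cells equal to -1 are skipped as NoData, as stated for the population layer.

```python
NODATA = -1  # NoData value stated for the population layer

def zonal_sum(values, zones, nodata=NODATA):
    """Sum cell values per zone ID, skipping NoData cells."""
    totals = {}
    for value_row, zone_row in zip(values, zones):
        for value, zone in zip(value_row, zone_row):
            if value == nodata:
                continue
            totals[zone] = totals.get(zone, 0) + value
    return totals

def catholic_footprint(co2_by_zone, pop_by_zone, catholics_by_zone):
    """Per-capita CO2 multiplied by Catholic population, per zone.

    Assumes per-capita CO2 = zone CO2 / zone population (an illustrative choice).
    """
    result = {}
    for zone, population in pop_by_zone.items():
        if population <= 0:
            continue  # avoid dividing by zero for empty zones
        per_capita = co2_by_zone[zone] / population
        result[zone] = per_capita * catholics_by_zone[zone]
    return result

# Hypothetical 3x3 population raster and a matching zone (diocese) raster.
population = [[10, 20, -1],
              [30, -1, 5],
              [0, 15, 20]]
zones = [["A", "A", "B"],
         ["A", "B", "B"],
         ["A", "B", "B"]]

pop_by_zone = zonal_sum(population, zones)
co2_by_zone = {"A": 120.0, "B": 80.0}    # made-up CO2 values per zone
catholics_by_zone = {"A": 30, "B": 10}   # made-up Catholic counts per zone
footprint = catholic_footprint(co2_by_zone, pop_by_zone, catholics_by_zone)
```

In a real analysis the per-zone sums would come from a GIS tool such as Zonal Statistics rather than nested lists; the sketch only shows the arithmetic that follows once those sums exist.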
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
India IN: Mortality from CVD, Cancer, Diabetes or CRD between Exact Ages 30 and 70: Female data was reported at 19.800 % in 2016. This records a decrease from the previous figure of 20.000 % for 2015. The series is updated yearly, with a median of 21.200 % from Dec 2000 to 2016, with 5 observations. The data reached an all-time high of 23.400 % in 2000 and a record low of 19.800 % in 2016. The series remains active in CEIC and is reported by the World Bank. The data is categorized under Global Database’s India – Table IN.World Bank.WDI: Health Statistics. Mortality from CVD, cancer, diabetes or CRD is the percent of 30-year-old people who would die before their 70th birthday from any of cardiovascular disease, cancer, diabetes, or chronic respiratory disease, assuming that they would experience current mortality rates at every age and would not die from any other cause of death (e.g., injuries or HIV/AIDS). Source: World Health Organization, Global Health Observatory Data Repository (http://apps.who.int/ghodata/). Aggregation: weighted average.
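To make the definition above concrete, here is a minimal sketch of how a probability of dying between exact ages 30 and 70 can be built from age-specific death rates. The description gives only the definition; the conversion used below (a standard life-table approximation over eight five-year age groups) is an assumption for illustration and may differ from the procedure behind the reported figures. The death rates are invented.

```python
def prob_dying_30_to_70(death_rates_5yr):
    """Probability of dying between exact ages 30 and 70.

    death_rates_5yr: eight annual death rates (deaths per person-year) from
    the causes of interest, one per 5-year age group from 30-34 to 65-69.
    """
    if len(death_rates_5yr) != 8:
        raise ValueError("expected 8 five-year age groups (30-34 ... 65-69)")
    survival = 1.0
    for rate in death_rates_5yr:
        # Convert an annual rate into a 5-year probability of dying,
        # assuming deaths fall on average mid-interval.
        q = 5 * rate / (1 + 2.5 * rate)
        survival *= 1 - q
    return 1 - survival

# Invented cause-specific death rates, rising with age.
rates = [0.001, 0.0015, 0.002, 0.003, 0.005, 0.008, 0.012, 0.018]
percent = 100 * prob_dying_30_to_70(rates)
```

Because only the four named causes enter the rates, deaths from other causes are ignored, which matches the "would not die from any other cause" condition in the definition.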
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Social media platforms have become integral tools in the conduct of foreign policy for many nations, including India. This dataset serves as a resource for analyzing ‘Social Media and India’s Foreign Policy: The Case Study of ‘X’ Diplomacy during the Covid-19 Pandemic.’ The data were collected through a web-based questionnaire distributed primarily to people aged 18 to 61 and above in India. A total of 171 valid responses were collected from 17 states, offering extensive geographic coverage, and stored in Mendeley. The 15 contributor states are Goa, Maharashtra, Tamil Nadu, Gujarat, Delhi, Assam, Haryana, Jammu and Kashmir, Karnataka, Kerala, Punjab, Rajasthan, Tripura, Uttar Pradesh and West Bengal. The questionnaire encompasses diverse question formats, including single-choice, multiple-choice, quizzes, and open-ended questions. The study underscores the opportunities and challenges of employing 'X' diplomacy in India's foreign policy. It had two hypotheses. First, India's effective use of 'X' diplomacy positively impacts public perception of India's foreign policy effectiveness. Second, India's adept use of 'X' diplomacy during the COVID-19 pandemic enhances its ability to manage and respond to the crisis effectively. The data show public perception of the effective use of social media by the Government of India, particularly in crisis situations. The data also highlight the significant change in India’s narrative through its ‘X’ diplomacy, which shaped narratives, public perceptions, and diplomatic strategies. The data can be used to study the significance of social media in India’s foreign policy, the role of social media like ‘X’ in the making of India’s foreign policy, how effective social media like ‘X’ was during the Covid-19 pandemic, and how the Indian government used social media like ‘X’ to deliver messages and set the narrative in international politics.
The internet penetration rate in India rose to over 55 percent in 2025, from about 14 percent in 2014. Although this figure seems relatively low, it meant that more than half of the population of 1.4 billion people had internet access that year. This also ranked the country second in the world in terms of active internet users.
Internet availability and accessibility
By 2021, the number of internet connections across the country tripled, with urban areas accounting for a higher density of connections than rural regions. Despite incredibly low internet prices, internet usage in India has yet to reach its full potential. Lack of awareness and a tangible gender gap lie at the heart of the matter, with affordable mobile handsets and mobile internet connections presenting only a partial solution. Reliance Jio was the popular choice among Indian internet subscribers, offering them wider coverage at cheap rates.
Digital living
Home to one of the largest bases of netizens in the world, India is abuzz with internet activities being carried out every moment of every day. From information and research to shopping and entertainment to living in smart homes, Indians have welcomed digital living with open arms. Among these, social media usage was one of the most common reasons for accessing the internet.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The ORBIT (Object Recognition for Blind Image Training)-India Dataset is a collection of 105,243 images of 76 commonly used objects, collected by 12 individuals in India who are blind or have low vision. This dataset is an "Indian subset" of the original ORBIT dataset [1, 2], which was collected in the UK and Canada. In contrast to the ORBIT dataset, which was created in a Global North, Western, and English-speaking context, the ORBIT-India dataset features images taken in a low-resource, non-English-speaking, Global South context, home to 90% of the world’s population of people with blindness. Since it is easier for blind or low-vision individuals to gather high-quality data by recording videos, this dataset, like the ORBIT dataset, contains images (each sized 224x224) derived from 587 videos. These videos were taken by our data collectors from various parts of India using the Find My Things [3] Android app. Each data collector was asked to record eight videos of at least 10 objects of their choice.
Collected between July and November 2023, this dataset represents a set of objects commonly used by people who are blind or have low vision in India, including earphones, talking watches, toothbrushes, and typical Indian household items like a belan (rolling pin), and a steel glass. These videos were taken in various settings of the data collectors' homes and workspaces using the Find My Things Android app.
The image dataset is stored in the ‘Dataset’ folder, organized by folders assigned to each data collector (P1, P2, ...P12) who collected them. Each collector's folder includes sub-folders named with the object labels as provided by our data collectors. Within each object folder, there are two subfolders: ‘clean’ for images taken on clean surfaces and ‘clutter’ for images taken in cluttered environments where the objects are typically found. The annotations are saved inside an ‘Annotations’ folder containing a JSON file per video (e.g., P1--coffee mug--clean--231220_084852_coffee mug_224.json) that contains keys corresponding to all frames/images in that video (e.g., "P1--coffee mug--clean--231220_084852_coffee mug_224--000001.jpeg": {"object_not_present_issue": false, "pii_present_issue": false}, "P1--coffee mug--clean--231220_084852_coffee mug_224--000002.jpeg": {"object_not_present_issue": false, "pii_present_issue": false}, ...). The ‘object_not_present_issue’ key is true if the object is not present in the image, and the ‘pii_present_issue’ key is true if personally identifiable information (PII) is present in the image. Note that all PII present in the images has been blurred to protect the identity and privacy of our data collectors. This dataset version was created by cropping images originally sized at 1080 × 1920; an unscaled version of the dataset will follow soon.
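As an illustration of the annotation layout described above, the following sketch parses a frame name into its parts and keeps only frames with neither issue flag set. The helper names are hypothetical, the inline JSON stands in for one per-video annotation file (its second entry is invented to show a flagged frame), and the parsing assumes that object labels never contain the "--" separator.

```python
import json

def parse_frame_name(frame_name):
    """Split a frame file name into its '--'-separated parts."""
    collector, label, condition, video_id, frame_file = frame_name.split("--")
    return {
        "collector": collector,
        "object": label,
        "condition": condition,  # 'clean' or 'clutter'
        "video": video_id,
        "frame": int(frame_file.split(".")[0]),
    }

def usable_frames(annotations):
    """Return frame names where neither issue flag is set."""
    return [
        name
        for name, flags in annotations.items()
        if not flags["object_not_present_issue"]
        and not flags["pii_present_issue"]
    ]

# Inline stand-in for the contents of one per-video JSON file.
sample = json.loads('''{
  "P1--coffee mug--clean--231220_084852_coffee mug_224--000001.jpeg":
    {"object_not_present_issue": false, "pii_present_issue": false},
  "P1--coffee mug--clean--231220_084852_coffee mug_224--000002.jpeg":
    {"object_not_present_issue": true, "pii_present_issue": false}
}''')

kept = usable_frames(sample)
info = parse_frame_name(kept[0])
```

For real files, `json.load` on each file in the ‘Annotations’ folder would replace the inline string.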
This project was funded by the Engineering and Physical Sciences Research Council (EPSRC) Industrial ICASE Award with Microsoft Research UK Ltd. as the Industrial Project Partner. We would like to acknowledge and express our gratitude to our data collectors for their efforts and time invested in carefully collecting videos to build this dataset for their community. The dataset is designed for developing few-shot learning algorithms, aiming to support researchers and developers in advancing object-recognition systems. We are excited to share this dataset and would love to hear from you if and how you use this dataset. Please feel free to reach out if you have any questions, comments or suggestions.
REFERENCES:
[1] Daniela Massiceti, Lida Theodorou, Luisa Zintgraf, Matthew Tobias Harris, Simone Stumpf, Cecily Morrison, Edward Cutrell, and Katja Hofmann. 2021. ORBIT: A real-world few-shot dataset for teachable object recognition collected from people who are blind or low vision. DOI: https://doi.org/10.25383/city.14294597
[2] microsoft/ORBIT-Dataset. https://github.com/microsoft/ORBIT-Dataset
[3] Linda Yilin Wen, Cecily Morrison, Martin Grayson, Rita Faia Marques, Daniela Massiceti, Camilla Longden, and Edward Cutrell. 2024. Find My Things: Personalized Accessibility through Teachable AI for People who are Blind or Low Vision. In Extended Abstracts of the 2024 CHI Conference on Human Factors in Computing Systems (CHI EA '24). Association for Computing Machinery, New York, NY, USA, Article 403, 1–6. https://doi.org/10.1145/3613905.3648641
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
India Population: Census: Age: 15 to 19 Year data was reported at 120,526.449 Person th in 03-01-2011. This records an increase from the previous figure of 100,216.000 Person th for 03-01-2001. The series is updated every ten years, with a median of 100,216.000 Person th from Mar 1991 to 03-01-2011, with 3 observations. The data reached an all-time high of 120,526.449 Person th in 03-01-2011 and a record low of 79,035.000 Person th in 03-01-1991. The series remains active in CEIC and is reported by the Office of the Registrar General & Census Commissioner, India. The data is categorized under India Premium Database’s Demographic – Table IN.GAD001: Census: Population: by Age Group.
The National Family Health Survey (NFHS) was carried out as the principal activity of a collaborative project to strengthen the research capabilities of the Population Research Centres (PRCs) in India, initiated by the Ministry of Health and Family Welfare (MOHFW), Government of India, and coordinated by the International Institute for Population Sciences (IIPS), Bombay. Interviews were conducted with a nationally representative sample of 89,777 ever-married women in the age group 13-49, from 24 states and the National Capital Territory of Delhi. The main objective of the survey was to collect reliable and up-to-date information on fertility, family planning, mortality, and maternal and child health. Data collection was carried out in three phases from April 1992 to September 1993. The NFHS is one of the most complete surveys of its kind ever conducted in India.
The households covered in the survey included 500,492 residents. The young age structure of the population highlights the momentum of the future population growth of the country; 38 percent of household residents are under age 15, with their reproductive years still in the future. Persons age 60 or older constitute 8 percent of the population. The population sex ratio of the de jure residents is 944 females per 1,000 males, which is slightly higher than the sex ratio of 927 observed in the 1991 Census.
The primary objective of the NFHS is to provide national-level and state-level data on fertility, nuptiality, family size preferences, knowledge and practice of family planning, the potential demand for contraception, the level of unwanted fertility, utilization of antenatal services, breastfeeding and food supplementation practices, child nutrition and health, immunizations, and infant and child mortality. The NFHS is also designed to explore the demographic and socioeconomic determinants of fertility, family planning, and maternal and child health. This information is intended to assist policymakers, administrators and researchers in assessing and evaluating population and family welfare programmes and strategies. The NFHS used uniform questionnaires and uniform methods of sampling, data collection and analysis with the primary objective of providing a source of demographic and health data for interstate comparisons. The data collected in the NFHS are also comparable with those of the Demographic and Health Surveys (DHS) conducted in many other countries.
National
The population covered by the 1992-93 DHS is defined as the universe of all women age 13-49 who were either permanent residents of the households in the NFHS sample or visitors present in the households on the night before the survey.
Sample survey data
SAMPLE DESIGN
The sample design for the NFHS was discussed during a Sample Design Workshop held in Madurai in October 1991. The workshop was attended by representatives from the PRCs; the COs; the Office of the Registrar General, India; IIPS and the East-West Center/Macro International. A uniform sample design was adopted in all the NFHS states. The sample design adopted in each state is a systematic, stratified sample of households, with two stages in rural areas and three stages in urban areas.
SAMPLE SIZE AND ALLOCATION
The sample size for each state was specified in terms of a target number of completed interviews with eligible women. The target sample size was set considering the size of the state, the time and resources available for the survey and the need for separate estimates for urban and rural areas of the state. The initial target sample size was 3,000 completed interviews with eligible women for states having a population of 25 million or less in 1991; 4,000 completed interviews for large states with more than 25 million population; 8,000 for Uttar Pradesh, the largest state; and 1,000 each for the six small northeastern states. In states with a substantial number of backward districts, the initial target samples were increased so as to allow separate estimates to be made for groups of backward districts.
The urban and rural samples within states were drawn separately and, to the extent possible, sample allocation was proportional to the size of the urban-rural populations (to facilitate the selection of a self-weighting sample for each state). In states where the urban population was not sufficiently large to provide a sample of at least 1,000 completed interviews with eligible women, the urban areas were appropriately oversampled (except in the six small northeastern states).
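The initial allocation rule stated above can be written as a small lookup. This is a sketch only: the function name and the `small_northeastern` flag are hypothetical (the text does not name the six small northeastern states), the populations in the example calls are placeholders, and the later increases for backward districts are not modeled.

```python
def initial_target_interviews(state, population_1991_millions,
                              small_northeastern=False):
    """Initial target number of completed interviews with eligible women."""
    if small_northeastern:
        return 1000   # each of the six small northeastern states
    if state == "Uttar Pradesh":
        return 8000   # the largest state
    if population_1991_millions <= 25:
        return 3000   # states with 25 million people or fewer in 1991
    return 4000       # large states with more than 25 million people

# Placeholder states and populations, for illustration only.
targets = {
    "State A": initial_target_interviews("State A", 20),
    "State B": initial_target_interviews("State B", 40),
    "Uttar Pradesh": initial_target_interviews("Uttar Pradesh", 100),
    "State C": initial_target_interviews("State C", 1, small_northeastern=True),
}
```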
THE RURAL SAMPLE: THE FRAME, STRATIFICATION AND SELECTION
A two-stage stratified sampling was adopted for the rural areas: selection of villages followed by selection of households. Because the 1991 Census data were not available at the time of sample selection in most states, the 1981 Census list of villages served as the sampling frame in all the states with the exception of Assam, Delhi and Punjab. In these three states the 1991 Census data were used as the sampling frame.
Villages were stratified prior to selection on the basis of a number of variables. The first level of stratification in all the states was geographic, with districts subdivided into regions according to their geophysical characteristics. Within each of these regions, villages were further stratified using some of the following variables: village size, distance from the nearest town, proportion of nonagricultural workers, proportion of the population belonging to scheduled castes/scheduled tribes, and female literacy. However, not all variables were used in every state. Each state was examined individually and two or three variables were selected for stratification, with the aim of creating not more than 12 strata for small states and not more than 15 strata for large states. Female literacy was often used for implicit stratification (i.e., the villages were ordered prior to selection according to the proportion of females who were literate). Primary Sampling Units (PSUs) were selected systematically, with probability proportional to size (PPS). In some cases, adjacent villages with small population sizes were combined into a single PSU for the purpose of sample selection. On average, 30 households were selected for interviewing in each selected PSU.
In every state, all the households in the selected PSUs were listed about two weeks prior to the survey. This listing provided the necessary frame for selecting households at the second sampling stage. The household listing operation consisted of preparing up-to-date notional and layout sketch maps of each selected PSU, assigning numbers to structures, recording addresses (or locations) of these structures, identifying the residential structures, and listing the names of the heads of all the households in the residential structures in the selected PSU. Each household listing team consisted of a lister and a mapper. The listing operation was supervised by the senior field staff of the concerned CO and the PRC in each state. Special efforts were made not to miss any household in the selected PSU during the listing operation. In PSUs with fewer than 500 households, a complete household listing was done. In PSUs with 500 or more households, segmentation of the PSU was done on the basis of existing wards in the PSU, and two segments were selected using either systematic sampling or PPS sampling. The household listing in such PSUs was carried out in the selected segments. The households to be interviewed were selected from provided with the original household listing, layout sketch map and the household sample selected for each PSU. All the selected households were approached during the data collection, and no substitution of a household was allowed under any circumstances.
THE URBAN SAMPLE: THE FRAME, STRATIFICATION AND SELECTION
A three-stage sample design was adopted for the urban areas in each state: selection of cities/towns, followed by urban blocks, and finally households. Cities and towns were selected using the 1991 population figures while urban blocks were selected using the 1991 list of census enumeration blocks in all the states with the exception of the first phase states. For the first phase states, the list of urban blocks provided by the National Sample Survey Organization (NSSO) served as the sampling frame.
All cities and towns were subdivided into three strata: (1) self-selecting cities (i.e., cities with a population large enough to be selected with certainty), (2) towns that are district headquarters, and (3) other towns. Within each stratum, the cities/towns were arranged according to the same kind of geographic stratification used in the rural areas. In self-selecting cities, the sample was selected according to a two-stage sample design: selection of the required number of urban blocks, followed by selection of households in each of the selected blocks. For district headquarters and other towns, a three-stage sample design was used: selection of towns with PPS, followed by selection of two census blocks per selected town, followed by selection of households from each selected block. As in rural areas, a household listing was carried out in the selected blocks, and an average of 20 households per block was selected systematically.
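Systematic selection with probability proportional to size, combined with implicit stratification by female literacy, can be sketched as follows. This is a generic textbook version of the technique, not the survey's actual selection program: the function name is hypothetical and the village names, household counts and literacy shares are invented.

```python
import random

def pps_systematic(sizes, n, rng=None):
    """Select n unit indices systematically with probability proportional to size.

    A unit larger than the sampling interval can be selected more than once.
    """
    rng = rng or random.Random()
    interval = sum(sizes) / n
    start = rng.random() * interval          # random start in [0, interval)
    points = [start + k * interval for k in range(n)]
    selected, cumulative, next_point = [], 0, 0
    for index, size in enumerate(sizes):
        cumulative += size
        # Every selection point falling inside this unit's range picks it.
        while next_point < n and points[next_point] < cumulative:
            selected.append(index)
            next_point += 1
    # Guard against floating-point rounding at the upper end.
    while next_point < n:
        selected.append(len(sizes) - 1)
        next_point += 1
    return selected

# Invented villages: (name, number of households, share of literate females).
villages = [("V1", 120, 0.35), ("V2", 300, 0.10), ("V3", 80, 0.52),
            ("V4", 200, 0.28), ("V5", 150, 0.41), ("V6", 250, 0.19)]

# Implicit stratification: order villages by female literacy before selecting.
ordered = sorted(villages, key=lambda v: v[2])
chosen = pps_systematic([v[1] for v in ordered], n=3, rng=random.Random(42))
selected_villages = [ordered[i][0] for i in chosen]
```

Ordering the list first spreads the sample across the literacy range, because the evenly spaced selection points then fall in low, middle and high literacy villages.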
Face-to-face
Three types of questionnaires were used in the NFHS: the Household Questionnaire, the Women's Questionnaire, and the Village Questionnaire. The overall content
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Urban population (% of total population) in India was reported at 36.87% in 2024, according to the World Bank collection of development indicators, compiled from officially recognized sources. India - Urban population (% of total) - actual values, historical data, forecasts and projections were sourced from the World Bank in July 2025.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
India: Poverty ratio, percent living on less than 5.50 USD a day: The latest value from 2021 is 81.8 percent, a decline from 83 percent in 2020. In comparison, the world average is 25.11 percent, based on data from 71 countries. Historically, the average for India from 1977 to 2021 is 89.86 percent. The minimum value, 80.7 percent, was reached in 2019 while the maximum of 97.8 percent was recorded in 1977.
The National Family Health Survey 2019-21 (NFHS-5), the fifth in the NFHS series, provides information on population, health, and nutrition for India, each state/union territory (UT), and for 707 districts.
The primary objective of the 2019-21 round of National Family Health Surveys is to provide essential data on health and family welfare, as well as data on emerging issues in these areas, such as levels of fertility, infant and child mortality, maternal and child health, and other health and family welfare indicators by background characteristics at the national and state levels. Similar to NFHS-4, NFHS-5 also provides information on several emerging issues including perinatal mortality, high-risk sexual behaviour, safe injections, tuberculosis, noncommunicable diseases, and the use of emergency contraception.
The information collected through NFHS-5 is intended to assist policymakers and programme managers in setting benchmarks and examining progress over time in India’s health sector. Besides providing evidence on the effectiveness of ongoing programmes, NFHS-5 data will help to identify the need for new programmes in specific health areas.
The clinical, anthropometric, and biochemical (CAB) component of NFHS-5 is designed to provide vital estimates of the prevalence of malnutrition, anaemia, hypertension, and high blood glucose levels, as well as measurements of waist and hip circumference, Vitamin D3, HbA1c, and malaria parasites, through a series of biomarker tests and measurements.
National coverage
The survey covered all de jure household members (usual residents), all women aged 15-49, all men aged 15-54, and all children aged 0-5 resident in the household.
Sample survey data [ssd]
A uniform sample design, which is representative at the national, state/union territory, and district level, was adopted in each round of the survey. Each district is stratified into urban and rural areas. Each rural stratum is sub-stratified into smaller substrata, which are created considering the village population and the percentage of the population belonging to scheduled castes and scheduled tribes (SC/ST). Within each explicit rural sampling stratum, a sample of villages was selected as Primary Sampling Units (PSUs); before the PSU selection, PSUs were sorted according to the literacy rate of women age 6+ years. Within each urban sampling stratum, a sample of Census Enumeration Blocks (CEBs) was selected as PSUs. Before the PSU selection, PSUs were sorted according to the percentage of SC/ST population. In the second stage of selection, a fixed number of 22 households per cluster was selected by equal-probability systematic selection from a newly created list of households in the selected PSUs. The list of households was created as a result of the mapping and household listing operation conducted in each selected PSU before the household selection in the second stage. In all, 30,456 PSUs were selected across the country in NFHS-5, drawn from 707 districts as of March 31, 2017, of which fieldwork was completed in 30,198 PSUs.
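The second-stage household selection can be sketched as follows. This is only an illustration of equal-probability systematic selection with a fractional interval and a random start; the function name `systematic_sample` and the listing of 150 households are hypothetical, and the survey's actual procedure may differ in detail.

```python
import random

def systematic_sample(units, n, rng=None):
    """Equal-probability systematic selection of n units from an ordered list."""
    rng = rng or random.Random()
    if n >= len(units):
        return list(units)                # small cluster: take every unit
    interval = len(units) / n             # fractional sampling interval
    start = rng.random() * interval       # random start in [0, interval)
    last = len(units) - 1
    # Take the unit at each step of the interval, clamped to the list bounds.
    return [units[min(int(start + i * interval), last)] for i in range(n)]

# Hypothetical household listing for one PSU, numbered 1..150.
listing = list(range(1, 151))
selected = systematic_sample(listing, 22, random.Random(7))
print(len(selected), selected[:5])
```

Because the interval exceeds one whenever the listing has more than 22 households, no household is selected twice, and every household has the same chance of selection.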
For further details on sample design, see Section 1.2 of the final report.
Computer Assisted Personal Interview [capi]
Four survey schedules/questionnaires (Household, Woman, Man, and Biomarker) were canvassed in 18 local languages using Computer Assisted Personal Interviewing (CAPI).
Electronic data collected in the 2019-21 National Family Health Survey were received on a daily basis via the SyncCloud system at the International Institute for Population Sciences (IIPS), where the data were stored on a password-protected computer. Secondary editing of the data, which required resolution of computer-identified inconsistencies and coding of open-ended questions, was conducted in the field by the Field Agencies and at the Field Agencies' central offices, and IIPS checked the secondary edits before the dataset was finalized.
Field-check tables were produced by IIPS and the Field Agencies on a regular basis to identify certain types of errors that might have occurred in eliciting information and recording question responses. Information from the field-check tables on the performance of each fieldwork team and individual investigator was promptly shared with the Field Agencies during the fieldwork so that the performance of the teams could be improved, if required.
A total of 664,972 households were selected for the sample, of which 653,144 were occupied. Among the occupied households, 636,699 were successfully interviewed, for a response rate of 98 percent.
In the interviewed households, 747,176 eligible women age 15-49 were identified for individual women’s interviews. Interviews were completed with 724,115 women, for a response rate of 97 percent. In all, there were 111,179 eligible men age 15-54 in households selected for the state module. Interviews were completed with 101,839 men, for a response rate of 92 percent.
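As a quick check on the individual interview figures above, the response rates can be recomputed as completed interviews divided by eligible respondents. This sketch assumes the published rates are simple unweighted ratios; the helper name `response_rate` is hypothetical.

```python
def response_rate(completed, eligible):
    """Return completed interviews as a percentage of eligible respondents."""
    return 100.0 * completed / eligible

# Counts taken from the paragraph above.
women = response_rate(724_115, 747_176)
men = response_rate(101_839, 111_179)

print(f"women: {women:.1f}% (rounded: {round(women)}%)")
print(f"men:   {men:.1f}% (rounded: {round(men)}%)")
```

Under that assumption the ratios round to 97 percent for women and 92 percent for men, matching the reported rates.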
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
India: Population density, people per square km: The latest value from 2021 is 473 people per square km, an increase from 470 people per square km in 2020. In comparison, the world average is 456 people per square km, based on data from 196 countries. Historically, the average for India from 1961 to 2021 is 305 people per square km. The minimum value, 153 people per square km, was reached in 1961 while the maximum of 473 people per square km was recorded in 2021.
https://www.futurebeeai.com/policies/ai-data-license-agreement
Welcome to the Indian English General Conversation Speech Dataset — a rich, linguistically diverse corpus purpose-built to accelerate the development of English speech technologies. This dataset is designed to train and fine-tune ASR systems, spoken language understanding models, and generative voice AI tailored to real-world Indian English communication.
Curated by FutureBeeAI, this 30-hour dataset offers unscripted, spontaneous two-speaker conversations across a wide array of real-life topics. It enables researchers, AI developers, and voice-first product teams to build robust, production-grade English speech models that understand and respond to authentic Indian accents and dialects.
The dataset comprises 30 hours of high-quality audio, featuring natural, free-flowing dialogue between native speakers of Indian English. These sessions range from informal daily talks to deeper, topic-specific discussions, ensuring variability and context richness for diverse use cases.
The dataset spans a wide variety of everyday and domain-relevant themes. This topic diversity ensures the resulting models are adaptable to broad speech contexts.
Each audio file is paired with a human-verified, verbatim transcription available in JSON format.
These transcriptions are production-ready, enabling seamless integration into ASR model pipelines or conversational AI workflows.
The dataset comes with granular metadata for both speakers and recordings:
Such metadata helps developers fine-tune model training and supports use-case-specific filtering or demographic analysis.
This dataset is a versatile resource for multiple English speech and language AI applications:
https://www.futurebeeai.com/policies/ai-data-license-agreement
Welcome to the Indian English Language Visual Speech Dataset! This dataset is a collection of diverse, single-person unscripted spoken videos supporting research in visual speech recognition, emotion detection, and multimodal communication.
This visual speech dataset contains 1,000 videos in Indian English, each paired with a corresponding high-fidelity audio track. In each video, a participant answers a specific question in an unscripted, spontaneous manner.
Extensive guidelines were followed while recording each video to maintain quality and diversity.
The dataset provides comprehensive metadata for each video recording and participant:
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The KHOJ (Know Your High Court Judges) dataset includes data on more than 1700 judges appointed between 1993 (after the creation of the collegium) and 2021. The dataset captures information across 43 variables including the personal, educational and professional backgrounds of India’s High Court judges. It opens pathways for researchers who are looking to probe deeper or wider into the composition of the High Courts and those who want to undertake jurimetrics studies which explore the linkage between judicial behaviour and the background of judges.
The core philosophy behind building such a dataset is the realization that the people of the country should have more information about the judges whose decisions have a real impact on their lives.
This dataset is the result of a joint effort over 15 months involving more than 30 students and 10 professionals who volunteered their time and efforts in preparing this dataset. This was a collaboration between NLUO’s Centre for Public Policy, Law and Good Governance, Agami and CivicDataLab. It started with the Summer of Data 2021 programme where students from across the country became the original data creators using official and publicly accessible data sources.
https://creativecommons.org/publicdomain/zero/1.0/
This dataset contains information about the total number of human trafficking cases reported per State/Union Territory in India, the number of victims trafficked/rescued, the nationality of the victims, their age group, the purpose of trafficking, police and court disposal of cases, and the number of culprits arrested/acquitted.
To know more about the Indian states and Union Territories, you may refer to Know India.
Until 2019, India had 29 states and 7 Union Territories. Following administrative reorganisation in 2019 and 2020, there are now 28 states and 8 Union Territories.
Here is a short description of a few terms used in the dataset. For further reading, you may refer to this site.
So, if the Final Report column contains 0, the investigation is not yet complete.
The data has been taken from the National Crime Records Bureau portal of India.
I recently watched some movies/documentaries on Human Trafficking which prompted me to compile this dataset.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Context
The dataset tabulates the Non-Hispanic population of Ontario by race. It includes the distribution of the Non-Hispanic population of Ontario across various race categories as identified by the Census Bureau. The dataset can be utilized to understand the Non-Hispanic population distribution of Ontario across relevant racial categories.
Key observations
Of the Non-Hispanic population in Ontario, the largest racial group is White alone with a population of 5,873 (92.07% of the total Non-Hispanic population).
When available, the data consists of estimates from the U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates.
Racial categories include:
Variables / Data Columns
Good to know
Margin of Error
Data in the dataset are based on estimates and are subject to sampling variability and thus a margin of error. Neilsberg Research recommends using caution when presenting these estimates in your research.
Custom data
If you need custom data for any of your research projects, reports, or presentations, you can contact our research staff at research@neilsberg.com to discuss the feasibility of a custom tabulation on a fee-for-service basis.
The Neilsberg Research team curates, analyzes, and publishes demographic and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research aggregated datasets and insights are made available for free download at https://www.neilsberg.com/research/.
This dataset is a part of the main dataset for Ontario Population by Race & Ethnicity. You can refer to it here.