100+ datasets found
  1. 🦈 Shark Tank India dataset 🇮🇳

    • kaggle.com
    Updated Apr 20, 2025
    Cite
    Satya Thirumani (2025). 🦈 Shark Tank India dataset 🇮🇳 [Dataset]. https://www.kaggle.com/datasets/thirumani/shark-tank-india
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Apr 20, 2025
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Satya Thirumani
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Shark Tank India Data set.

    Shark Tank India - Season 1 to season 4 information, with 80 fields/columns and 630+ records.

    All seasons/episodes of 🦈 SHARKTANK INDIA 🇮🇳 were broadcast on SonyLiv OTT/Sony TV.

    Here is the data dictionary for the Shark Tank India dataset.

    • Season Number - Season number
    • Startup Name - Company name or product name
    • Episode Number - Episode number within the season
    • Pitch Number - Overall pitch number
    • Season Start - Season first aired date
    • Season End - Season last aired date
    • Original Air Date - Episode original/first aired date, on OTT/TV
    • Episode Title - Episode title in SonyLiv
    • Anchor - Name of the episode presenter/host
    • Industry - Industry name or type
    • Business Description - Business Description
    • Company Website - Company Website URL
    • Started in - Year in which startup was started/incorporated
    • Number of Presenters - Number of presenters
    • Male Presenters - Number of male presenters
    • Female Presenters - Number of female presenters
    • Transgender Presenters - Number of transgender/LGBTQ presenters
    • Couple Presenters - Are the presenters wife and husband? 1-yes, 0-no
    • Pitchers Average Age - All pitchers average age, <30 young, 30-50 middle, >50 old
    • Pitchers City - Presenter's town/city or place where company head office exists
    • Pitchers State - Indian state pitcher hails from or state where company head office exists
    • Yearly Revenue - Yearly revenue, in lakhs INR, -1 means negative revenue, 0 means pre-revenue
    • Monthly Sales - Total monthly sales, in lakhs
    • Gross Margin - Gross margin/profit of company, in percentages
    • Net Margin - Net margin/profit of company, in percentages
    • EBITDA - Earnings Before Interest, Taxes, Depreciation, and Amortization
    • Cash Burn - In loss in current year; burning/paying money from their pocket (yes/no)
    • SKUs - Stock Keeping Units or number of varieties, at the time of pitch
    • Has Patents - Pitcher has Patents/Intellectual property (filed/granted), at the time of pitch
    • Bootstrapped - Startup is bootstrapped or not (yes/no)
    • Part of Match off - Competition between two similar brands, pitched at same time
    • Original Ask Amount - Original Ask Amount, in lakhs INR
    • Original Offered Equity - Original Offered Equity, in percentages
    • Valuation Requested - Valuation Requested, in lakhs INR
    • Received Offer - Received offer or not, 1-received, 0-not received
    • Accepted Offer - Accepted offer or not, 1-accepted, 0-rejected
    • Total Deal Amount - Total Deal Amount, in lakhs INR
    • Total Deal Equity - Total Deal Equity, in percentages
    • Total Deal Debt - Total Deal debt/loan amount, in lakhs INR
    • Debt Interest - Debt interest rate, in percentages
    • Deal Valuation - Deal Valuation, in lakhs INR
    • Number of sharks in deal - Number of sharks involved in deal
    • Deal has conditions - Deal has conditions or not? (yes or no)
    • Royalty Percentage - Royalty percentage, if it's royalty deal
    • Royalty Recouped Amount - Royalty recouped amount, if it's royalty deal, in lakhs
    • Advisory Shares Equity - Deal with Advisory shares or equity, in percentages
    • Namita Investment Amount - Namita Investment Amount, in lakhs INR
    • Namita Investment Equity - Namita Investment Equity, in percentages
    • Namita Debt Amount - Namita Debt Amount, in lakhs INR
    • Vineeta Investment Amount - Vineeta Investment Amount, in lakhs INR
    • Vineeta Investment Equity - Vineeta Investment Equity, in percentages
    • Vineeta Debt Amount - Vineeta Debt Amount, in lakhs INR
    • Anupam Investment Amount - Anupam Investment Amount, in lakhs INR
    • Anupam Investment Equity - Anupam Investment Equity, in percentages
    • Anupam Debt Amount - Anupam Debt Amount, in lakhs INR
    • Aman Investment Amount - Aman Investment Amount, in lakhs INR
    • Aman Investment Equity - Aman Investment Equity, in percentages
    • Aman Debt Amount - Aman Debt Amount, in lakhs INR
    • Peyush Investment Amount - Peyush Investment Amount, in lakhs INR
    • Peyush Investment Equity - Peyush Investment Equity, in percentages
    • Peyush Debt Amount - Peyush Debt Amount, in lakhs INR
    • Ritesh Investment Amount - Ritesh Investment Amount, in lakhs INR
    • Ritesh Investment Equity - Ritesh Investment Equity, in percentages
    • Ritesh Debt Amount - Ritesh Debt Amount, in lakhs INR
    • Amit Investment Amount - Amit Investment Amount, in lakhs INR
    • Amit Investment Equity - Amit Investment Equity, in percentages
    • Amit Debt Amount - Amit Debt Amount, in lakhs INR
    • Guest Investment Amount - Guest Investment Amount, in lakhs INR
    • Guest Investment Equity - Guest Investment Equity, in percentages
    • Guest Debt Amount - Guest Debt Amount, in lakhs INR
    • Invested Guest Name - Name of the guest(s) who invested in deal
    • All Guest Names - Name of all guests, who are present in episode
    • Namita Present - Whether Namita present in episode or not
    • Vineeta Present - Whether Vineeta present in episode or not
    • Anupam ...
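    The units and sentinel values in the data dictionary above can be made concrete with a small sketch. It assumes that Valuation Requested follows the usual ask-divided-by-equity arithmetic, which the listing does not confirm; the helper names and the sample figures are hypothetical and are not taken from the dataset.

```python
from typing import Optional

LAKH_INR = 100_000  # 1 lakh = 100,000 rupees


def implied_valuation_lakhs(ask_amount_lakhs: float, equity_pct: float) -> Optional[float]:
    """Valuation (in lakhs INR) implied by asking `ask_amount_lakhs` for `equity_pct` percent.

    Assumption: valuation = amount / (equity / 100). Returns None when the
    equity is zero or negative, because no valuation can be derived.
    """
    if equity_pct <= 0:
        return None
    return ask_amount_lakhs * 100.0 / equity_pct


def describe_yearly_revenue(yearly_revenue_lakhs: float) -> str:
    """Interpret the Yearly Revenue field using the sentinels from the data dictionary."""
    if yearly_revenue_lakhs == -1:
        return "negative revenue"
    if yearly_revenue_lakhs == 0:
        return "pre-revenue"
    return f"{yearly_revenue_lakhs * LAKH_INR:,.0f} INR"


# Hypothetical pitch: asking 50 lakhs for 5% equity, with no revenue yet.
pitch = {"Original Ask Amount": 50, "Original Offered Equity": 5, "Yearly Revenue": 0}
valuation = implied_valuation_lakhs(pitch["Original Ask Amount"], pitch["Original Offered Equity"])
print(valuation, describe_yearly_revenue(pitch["Yearly Revenue"]))
```

    Keeping the sentinel handling in one function avoids treating -1 as a real revenue figure when averaging the column.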
  2. India - Rural Population

    • tradingeconomics.com
    csv, excel, json, xml
    Updated Jan 13, 2017
    Cite
    TRADING ECONOMICS (2017). India - Rural Population [Dataset]. https://tradingeconomics.com/india/rural-population-percent-of-total-population-wb-data.html
    Explore at:
    Available download formats: excel, xml, json, csv
    Dataset updated
    Jan 13, 2017
    Dataset authored and provided by
    TRADING ECONOMICS
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    Jan 1, 1976 - Dec 31, 2025
    Area covered
    India
    Description

    Rural population (% of total population) in India was reported at 63.13% in 2024, according to the World Bank collection of development indicators, compiled from officially recognized sources. India - Rural population - actual values, historical data, forecasts and projections were sourced from the World Bank in July 2025.

  3. India Employed Persons

    • ceicdata.com
    Updated Mar 15, 2025
    Cite
    CEICdata.com (2025). India Employed Persons [Dataset]. https://www.ceicdata.com/en/indicator/india/employed-persons
    Dataset updated
    Mar 15, 2025
    Dataset provided by
    CEIC Data
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    Dec 1, 2010 - Dec 1, 2021
    Area covered
    India
    Variables measured
    Employment
    Description

    Key information about India Employed Persons

    • India Employed Persons was reported at 470,495,536.230 Person in Dec 2021
    • It recorded an increase from the previous number of 447,183,819.730 Person for Dec 2020
    • India Employed Persons data is updated yearly, averaging 384,395,378.330 Person from Dec 1970 to 2021, with 52 observations
    • The data reached an all-time high of 485,507,600.000 Person in 2019 and a record low of 209,275,793.440 Person in 1970
    • India Employed Persons data retains active status in CEIC and is reported by CEIC Data
    • The data is categorized under World Trend Plus’s Global Economic Monitor – Table: Employed Persons: Annual: Asia

    The Organisation for Economic Co-operation and Development provides annual Employed Persons data.

  4. India - Researchers In R&D (per Million People)

    • tradingeconomics.com
    csv, excel, json, xml
    Updated May 29, 2017
    Cite
    TRADING ECONOMICS (2017). India - Researchers In R&D (per Million People) [Dataset]. https://tradingeconomics.com/india/researchers-in-r-d-per-million-people-wb-data.html
    Explore at:
    Available download formats: csv, xml, json, excel
    Dataset updated
    May 29, 2017
    Dataset authored and provided by
    TRADING ECONOMICS
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    Jan 1, 1976 - Dec 31, 2025
    Area covered
    India
    Description

    Researchers in R&D (per million people) in India was reported at 259 in 2020, according to the World Bank collection of development indicators, compiled from officially recognized sources. India - Researchers in R&D (per million people) - actual values, historical data, forecasts and projections were sourced from the World Bank in July 2025.

  5. PerCapita CO2 Footprint InDioceses FULL

    • hub.arcgis.com
    • catholic-geo-hub-cgisc.hub.arcgis.com
    Updated Sep 23, 2019
    Cite
    burhansm2 (2019). PerCapita CO2 Footprint InDioceses FULL [Dataset]. https://hub.arcgis.com/content/95787df270264e6ea1c99ffa6ff844ff
    Dataset updated
    Sep 23, 2019
    Dataset authored and provided by
    burhansm2
    License

    Attribution-NoDerivs 4.0 (CC BY-ND 4.0): https://creativecommons.org/licenses/by-nd/4.0/
    License information was derived automatically

    Area covered
    Description

    PerCapita_CO2_Footprint_InDioceses_FULL

    Burhans, Molly A., Cheney, David M., Gerlt, R. “PerCapita_CO2_Footprint_InDioceses_FULL”. Scale not given. Version 1.0. MO and CT, USA: GoodLands Inc., Environmental Systems Research Institute, Inc., 2019.

    Methodology

    This is the first global carbon footprint of the Catholic population. We will continue to improve and develop these data with our research partners over the coming years. While it is helpful, it should also be viewed and used as a "beta" prototype that we and our research partners will build from and improve. The years of carbon data are (2010) and (2015 - SHOWN). The year of Catholic data is 2018. The year of population data is 2016. Care should be taken during future developments to harmonize the years used for Catholic, population, and CO2 data.

    1. Zonal Statistics: Esri Population Data and Dioceses --> Population per diocese, non-Vatican-based numbers
    2. Zonal Statistics: FFDAS and Dioceses and Population dataset --> Mean CO2 per Diocese
    3. Field Calculation: Population per Diocese and Mean CO2 per Diocese --> CO2 per Capita
    4. Field Calculation: CO2 per Capita * Catholic Population --> Catholic Carbon Footprint

    Assumption: PerCapita CO2
    Deriving per-capita CO2 from mean CO2 in a geography assumes that people's footprint accounts for their personal lifestyle and involvement in local business and industries that contribute CO2.

    Assumption: Catholic CO2
    Assumes that Catholics and non-Catholics have similar CO2 footprints from their lifestyles.

    Derived from: A multiyear, global gridded fossil fuel CO2 emission data product: Evaluation and analysis of results
    http://ffdas.rc.nau.edu/About.html

    • Rayner et al., JGR, 2010 - This is the first FFDAS paper describing the version 1.0 methods and results published in the Journal of Geophysical Research.
    • Asefi et al., 2014 - This is the paper describing the methods and results of the FFDAS version 2.0 published in the Journal of Geophysical Research.
    • Readme version 2.2 - A simple readme file to assist in using the 10 km x 10 km, hourly gridded Vulcan version 2.2 results.
    • Liu et al., 2017 - A paper exploring the carbon cycle response to the 2015-2016 El Nino through the use of carbon cycle data assimilation with FFDAS as the boundary condition for FFCO2.

    "S. Asefi‐Najafabady, P. J. Rayner, K. R. Gurney, A. McRobert, Y. Song, K. Coltin, J. Huang, C. Elvidge, K. Baugh
    First published: 10 September 2014 https://doi.org/10.1002/2013JD021296 Cited by: 30
    Link to FFDAS data retrieval and visualization: http://hpcg.purdue.edu/FFDAS/index.php

    Abstract
    High‐resolution, global quantification of fossil fuel CO2 emissions is emerging as a critical need in carbon cycle science and climate policy. We build upon a previously developed fossil fuel data assimilation system (FFDAS) for estimating global high‐resolution fossil fuel CO2 emissions. We have improved the underlying observationally based data sources, expanded the approach through treatment of separate emitting sectors including a new pointwise database of global power plants, and extended the results to cover a 1997 to 2010 time series at a spatial resolution of 0.1°. Long‐term trend analysis of the resulting global emissions shows subnational spatial structure in large active economies such as the United States, China, and India. These three countries, in particular, show different long‐term trends and exploration of the trends in nighttime lights, and population reveal a decoupling of population and emissions at the subnational level. Analysis of shorter‐term variations reveals the impact of the 2008–2009 global financial crisis with widespread negative emission anomalies across the U.S. and Europe. We have used a center of mass (CM) calculation as a compact metric to express the time evolution of spatial patterns in fossil fuel CO2 emissions. The global emission CM has moved toward the east and somewhat south between 1997 and 2010, driven by the increase in emissions in China and South Asia over this time period. Analysis at the level of individual countries reveals per capita CO2 emission migration in both Russia and India. The per capita emission CM holds potential as a way to succinctly analyze subnational shifts in carbon intensity over time. Uncertainties are generally lower than the previous version of FFDAS due mainly to an improved nightlight data set."

    Global Diocesan Boundaries:
    Burhans, M., Bell, J., Burhans, D., Carmichael, R., Cheney, D., Deaton, M., Emge, T. Gerlt, B., Grayson, J., Herries, J., Keegan, H., Skinner, A., Smith, M., Sousa, C., Trubetskoy, S. “Diocesean Boundaries of the Catholic Church” [Feature Layer]. Scale not given. Version 1.2. Redlands, CA, USA: GoodLands Inc., Environmental Systems Research Institute, Inc., 2016.
    Using: ArcGIS. 10.4. Version 10.0. Redlands, CA: Environmental Systems Research Institute, Inc., 2016.

    Boundary Provenance

    Statistics and Leadership Data
    Cheney, D.M. “Catholic Hierarchy of the World” [Database]. Date Updated: August 2019. Catholic Hierarchy. Using: Paradox. Retrieved from Original Source.

    Catholic Hierarchy
    Annuario Pontificio per l’Anno .. Città del Vaticano: Tipografia Poliglotta Vaticana, Multiple Years.

    The data for these maps was extracted from the gold standard of Church data, the Annuario Pontificio, published yearly by the Vatican. The collection and data development of the Vatican Statistics Office are unknown. GoodLands is not responsible for errors within this data. We encourage people to document and report errant information to us at data@good-lands.org or directly to the Vatican.

    Additional information about regular changes in bishops and sees comes from a variety of public diocesan and news announcements.

    GoodLands’ polygon data layers, version 2.0 for global ecclesiastical boundaries of the Roman Catholic Church:
    Although care has been taken to ensure the accuracy, completeness and reliability of the information provided, due to this being the first developed dataset of global ecclesiastical boundaries curated from many sources it may have a higher margin of error than established geopolitical administrative boundary maps. Boundaries need to be verified with appropriate Ecclesiastical Leadership. The current information is subject to change without notice. No parties involved with the creation of this data are liable for indirect, special or incidental damage resulting from, arising out of or in connection with the use of the information. We referenced 1960 sources to build our global datasets of ecclesiastical jurisdictions. Often, they were isolated images of dioceses, historical documents and information about parishes that were cross checked. These sources can be viewed here: https://docs.google.com/spreadsheets/d/11ANlH1S_aYJOyz4TtG0HHgz0OLxnOvXLHMt4FVOS85Q/edit#gid=0
    To learn more or contact us please visit: https://good-lands.org/

    Esri Gridded Population Data 2016

    Description
    This layer is a global estimate of human population for 2016. Esri created this estimate by modeling a footprint of where people live as a dasymetric settlement likelihood surface, and then assigned 2016 population estimates stored on polygons of the finest level of geography available onto the settlement surface. Where people live means where their homes are, as in where people sleep most of the time, and this is opposed to where they work. Another way to think of this estimate is a night-time estimate, as opposed to a day-time estimate.

    Knowledge of population distribution helps us understand how humans affect the natural world and how natural events such as storms and earthquakes, and other phenomena affect humans. This layer represents the footprint of where people live, and how many people live there.

    Dataset Summary
    Each cell in this layer has an integer value with the estimated number of people likely to live in the geographic region represented by that cell. Esri additionally produced several additional layers:

    • World Population Estimate Confidence 2016: the confidence level (1-5) per cell for the probability of people being located and estimated correctly.
    • World Population Density Estimate 2016: this layer is represented as population density in units of persons per square kilometer.
    • World Settlement Score 2016: the dasymetric likelihood surface used to create this layer by apportioning population from census polygons to the settlement score raster.

    To use this layer in analysis, there are several properties or geoprocessing environment settings that should be used:

    • Coordinate system: WGS_1984. This service and its underlying data are WGS_1984. We do this because projecting population count data actually will change the populations due to resampling and either collapsing or splitting cells to fit into another coordinate system.
    • Cell Size: 0.0013474728 degrees (approximately 150 meters) at the equator.
    • No Data: -1
    • Bit Depth: 32-bit signed

    This layer has query, identify, pixel, and export image functions enabled, and is restricted to a maximum analysis size of 30,000 x 30,000 pixels - an area about the size of Africa.

    Frye, C. et al., (2018). Using Classified and Unclassified Land Cover Data to Estimate the Footprint of Human Settlement. Data Science Journal. 17, p.20. DOI: http://doi.org/10.5334/dsj-2018-020.

    What can you do with this layer?
    This layer is unsuitable for mapping or cartographic use, and thus it does not include a convenient legend. Instead, this layer is useful for analysis, particularly for estimating counts of people living within watersheds, coastal areas, and other areas that do not have standard boundaries. Esri recommends using the Zonal Statistics tool or the Zonal Statistics to Table tool where you provide input zones as either polygons, or raster data, and the tool will summarize the count of population within those zones. https://www.esri.com/arcgis-blog/products/arcgis-living-atlas/data-management/2016-world-population-estimate-services-are-now-available/
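
    The two field calculations in the methodology of this entry (steps 3 and 4) can be illustrated with a rough sketch. The description does not give the formula for step 3, so the code assumes a plain division of a diocese-level CO2 figure by the diocese population; the class, the function names, and the numbers are hypothetical and carry no real units.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Diocese:
    name: str
    population: float           # step 1: population per diocese
    co2: float                  # step 2: diocese-level CO2 figure (units assumed consistent)
    catholic_population: float  # taken from the Catholic statistics source


def co2_per_capita(d: Diocese) -> Optional[float]:
    """Step 3 (assumed formula): diocese CO2 divided by diocese population."""
    if d.population <= 0:
        return None  # per-capita value is undefined without residents
    return d.co2 / d.population


def catholic_footprint(d: Diocese) -> Optional[float]:
    """Step 4: CO2 per capita multiplied by the Catholic population."""
    per_capita = co2_per_capita(d)
    return None if per_capita is None else per_capita * d.catholic_population


# Hypothetical diocese, for illustration only.
example = Diocese("Example", population=1000, co2=5000, catholic_population=400)
print(co2_per_capita(example), catholic_footprint(example))
```

    Step 4 encodes the second stated assumption directly: Catholics are given the same per-capita value as everyone else in the diocese.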

  6. India IN: Mortality from CVD, Cancer, Diabetes or CRD between Exact Ages 30...

    • ceicdata.com
    Cite
    CEICdata.com, India IN: Mortality from CVD, Cancer, Diabetes or CRD between Exact Ages 30 and 70: Female [Dataset]. https://www.ceicdata.com/en/india/health-statistics/in-mortality-from-cvd-cancer-diabetes-or-crd-between-exact-ages-30-and-70-female
    Dataset provided by
    CEICdata.com
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    Dec 1, 2000 - Dec 1, 2016
    Area covered
    India
    Description

    India IN: Mortality from CVD, Cancer, Diabetes or CRD between Exact Ages 30 and 70: Female data was reported at 19.800 NA in 2016. This records a decrease from the previous number of 20.000 NA for 2015. India IN: Mortality from CVD, Cancer, Diabetes or CRD between Exact Ages 30 and 70: Female data is updated yearly, averaging 21.200 NA (median) from Dec 2000 to 2016, with 5 observations. The data reached an all-time high of 23.400 NA in 2000 and a record low of 19.800 NA in 2016. India IN: Mortality from CVD, Cancer, Diabetes or CRD between Exact Ages 30 and 70: Female data retains active status in CEIC and is reported by World Bank. The data is categorized under Global Database’s India – Table IN.World Bank.WDI: Health Statistics.

    Mortality from CVD, cancer, diabetes or CRD is the percent of 30-year-old people who would die before their 70th birthday from any of cardiovascular disease, cancer, diabetes, or chronic respiratory disease, assuming that they would experience current mortality rates at every age and would not die from any other cause of death (e.g., injuries or HIV/AIDS).

    Source: World Health Organization, Global Health Observatory Data Repository (http://apps.who.int/ghodata/). Aggregation: weighted average.

  7. Data from: A Dataset on 'Social media and India’s Foreign Policy: The Case...

    • data.mendeley.com
    Updated Dec 19, 2024
    Cite
    Mukund Narvenkar (2024). A Dataset on 'Social media and India’s Foreign Policy: The Case Study of ‘X’ Diplomacy during the Covid-19 Pandemic' [Dataset]. http://doi.org/10.17632/xfr9y9ggkm.3
    Dataset updated
    Dec 19, 2024
    Authors
    Mukund Narvenkar
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    India
    Description

    Social media platforms have become integral tools in the conduct of foreign policy for many nations, including India. This dataset serves as a resource for analyzing ‘Social Media and India’s Foreign Policy: The Case Study of ‘X’ Diplomacy during the Covid-19 Pandemic.’ The data were collected through a web-based questionnaire distributed primarily to people aged 18 to 61 and above in India. A total of 171 valid responses were collected from 17 states, offering extensive geographic coverage, and stored in Mendeley. The 15 contributor states are Goa, Maharashtra, Tamil Nadu, Gujarat, Delhi, Assam, Haryana, Jammu and Kashmir, Karnataka, Kerala, Punjab, Rajasthan, Tripura, Uttar Pradesh and West Bengal. The questionnaire encompasses diverse question formats, including single-choice, multiple-choice, quizzes, and open-ended.

    The study underscores the opportunities and challenges of employing 'X' diplomacy in India's foreign policy. It tests two hypotheses. First, India's effective use of 'X' diplomacy positively impacts public perception of India's foreign policy effectiveness. Second, India's adept use of 'X' diplomacy during the COVID-19 pandemic enhances its ability to manage and respond to the crisis effectively.

    The data show public perception of the effective use of social media by the Government of India, particularly in a crisis situation. They also highlight the significant change in India’s narrative through its ‘X’ diplomacy, effectively setting the narratives, public perceptions, and diplomatic strategies. The data can be used to study the significance of social media in India’s foreign policy, the role of social media like ‘X’ in the making of India’s foreign policy, how effective social media like ‘X’ was during the Covid-19 pandemic, and how the Indian government used social media like ‘X’ to deliver messages and to set the narrative in international politics.

  8. Internet penetration rate in India 2014-2025

    • statista.com
    • ai-chatbox.pro
    Updated Jul 14, 2025
    Cite
    Statista (2025). Internet penetration rate in India 2014-2025 [Dataset]. https://www.statista.com/statistics/792074/india-internet-penetration-rate/
    Dataset updated
    Jul 14, 2025
    Dataset authored and provided by
    Statista (http://statista.com/)
    Area covered
    India
    Description

    The internet penetration rate in India rose to over 55 percent in 2025, from about 14 percent in 2014. Although these figures seem relatively low, it meant that more than half of the population of 1.4 billion people had internet access that year. This also ranked the country second in the world in terms of active internet users.

    Internet availability and accessibility

    By 2021, the number of internet connections across the country tripled with urban areas accounting for a higher density of connections than rural regions. Despite incredibly low internet prices, internet usage in India has yet to reach its full potential. Lack of awareness and a tangible gender gap lie at the heart of the matter, with affordable mobile handsets and mobile internet connections presenting only a partial solution. Reliance Jio was the popular choice among Indian internet subscribers, offering them wider coverage at cheap rates.

    Digital living

    Home to one of the largest bases of netizens in the world, India is abuzz with internet activities being carried out every moment of every day. From information and research to shopping and entertainment to living in smart homes, Indians have welcomed digital living with open arms. Among these, social media usage was one of the most common reasons for accessing the internet.

  9. The ORBIT (Object Recognition for Blind Image Training)-India Dataset

    • zenodo.org
    Updated Apr 24, 2025
    Cite
    Gesu India; Martin Grayson; Daniela Massiceti; Cecily Morrison; Simon Robinson; Jennifer Pearson; Matt Jones (2025). The ORBIT (Object Recognition for Blind Image Training)-India Dataset [Dataset]. http://doi.org/10.5281/zenodo.12608444
    Dataset updated
    Apr 24, 2025
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Gesu India; Martin Grayson; Daniela Massiceti; Cecily Morrison; Simon Robinson; Jennifer Pearson; Matt Jones
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    India
    Description

    The ORBIT (Object Recognition for Blind Image Training)-India Dataset is a collection of 105,243 images of 76 commonly used objects, collected by 12 individuals in India who are blind or have low vision. This dataset is an "Indian subset" of the original ORBIT dataset [1, 2], which was collected in the UK and Canada. In contrast to the ORBIT dataset, which was created in a Global North, Western, and English-speaking context, the ORBIT-India dataset features images taken in a low-resource, non-English-speaking, Global South context, home to 90% of the world’s population of people with blindness. Since it is easier for blind or low-vision individuals to gather high-quality data by recording videos, this dataset, like the ORBIT dataset, contains images (each sized 224x224) derived from 587 videos. These videos were taken by our data collectors from various parts of India using the Find My Things [3] Android app. Each data collector was asked to record eight videos of at least 10 objects of their choice.

    Collected between July and November 2023, this dataset represents a set of objects commonly used by people who are blind or have low vision in India, including earphones, talking watches, toothbrushes, and typical Indian household items like a belan (rolling pin), and a steel glass. These videos were taken in various settings of the data collectors' homes and workspaces using the Find My Things Android app.

    The image dataset is stored in the ‘Dataset’ folder, organized by folders assigned to each data collector (P1, P2, ...P12) who collected them. Each collector's folder includes sub-folders named with the object labels as provided by our data collectors. Within each object folder, there are two subfolders: ‘clean’ for images taken on clean surfaces and ‘clutter’ for images taken in cluttered environments where the objects are typically found. The annotations are saved inside an ‘Annotations’ folder containing a JSON file per video (e.g., P1--coffee mug--clean--231220_084852_coffee mug_224.json) that contains keys corresponding to all frames/images in that video (e.g., "P1--coffee mug--clean--231220_084852_coffee mug_224--000001.jpeg": {"object_not_present_issue": false, "pii_present_issue": false}, "P1--coffee mug--clean--231220_084852_coffee mug_224--000002.jpeg": {"object_not_present_issue": false, "pii_present_issue": false}, ...). The ‘object_not_present_issue’ key is True if the object is not present in the image, and the ‘pii_present_issue’ key is True if personally identifiable information (PII) is present in the image. Note that all PII present in the images has been blurred to protect the identity and privacy of our data collectors. This dataset version was created by cropping images originally sized at 1080 × 1920; therefore, an unscaled version of the dataset will follow soon.
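
    The annotation layout described above lends itself to a short parsing sketch. It assumes that frame keys always split into five parts on "--" (collector, object label, clean/clutter, video name, frame file) and that labels never contain "--" themselves; the helper names are hypothetical, and the second sample frame is invented to show the filter.

```python
import json
from typing import Dict, List, NamedTuple


class Frame(NamedTuple):
    collector: str     # e.g. "P1"
    object_label: str  # e.g. "coffee mug"
    condition: str     # "clean" or "clutter"
    video: str         # video name without the frame suffix
    filename: str      # e.g. "000001.jpeg"


def parse_frame_key(key: str) -> Frame:
    """Split an annotation key into its parts (assumes exactly five '--' fields)."""
    collector, object_label, condition, video, filename = key.split("--")
    return Frame(collector, object_label, condition, video, filename)


def usable_frames(annotations: Dict[str, Dict[str, bool]]) -> List[Frame]:
    """Keep frames where the object is present and no PII issue is flagged."""
    return [
        parse_frame_key(key)
        for key, flags in annotations.items()
        if not flags["object_not_present_issue"] and not flags["pii_present_issue"]
    ]


# Inline stand-in for one per-video JSON file; the second entry is made up.
sample = json.loads("""
{
  "P1--coffee mug--clean--231220_084852_coffee mug_224--000001.jpeg":
    {"object_not_present_issue": false, "pii_present_issue": false},
  "P1--coffee mug--clean--231220_084852_coffee mug_224--000002.jpeg":
    {"object_not_present_issue": true, "pii_present_issue": false}
}
""")
frames = usable_frames(sample)
print(frames)
```

    Filtering on both flags is a design choice for this sketch; since PII is stated to be blurred already, a pipeline could reasonably keep frames that only have ‘pii_present_issue’ set.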

    This project was funded by the Engineering and Physical Sciences Research Council (EPSRC) Industrial ICASE Award with Microsoft Research UK Ltd. as the Industrial Project Partner. We would like to acknowledge and express our gratitude to our data collectors for their efforts and time invested in carefully collecting videos to build this dataset for their community. The dataset is designed for developing few-shot learning algorithms, aiming to support researchers and developers in advancing object-recognition systems. We are excited to share this dataset and would love to hear from you if and how you use this dataset. Please feel free to reach out if you have any questions, comments or suggestions.

    REFERENCES:

    1. Daniela Massiceti, Lida Theodorou, Luisa Zintgraf, Matthew Tobias Harris, Simone Stumpf, Cecily Morrison, Edward Cutrell, and Katja Hofmann. 2021. ORBIT: A real-world few-shot dataset for teachable object recognition collected from people who are blind or low vision. DOI: https://doi.org/10.25383/city.14294597

    2. microsoft/ORBIT-Dataset. https://github.com/microsoft/ORBIT-Dataset

    3. Linda Yilin Wen, Cecily Morrison, Martin Grayson, Rita Faia Marques, Daniela Massiceti, Camilla Longden, and Edward Cutrell. 2024. Find My Things: Personalized Accessibility through Teachable AI for People who are Blind or Low Vision. In Extended Abstracts of the 2024 CHI Conference on Human Factors in Computing Systems (CHI EA '24). Association for Computing Machinery, New York, NY, USA, Article 403, 1–6. https://doi.org/10.1145/3613905.3648641

  10. India Population: Census: Age: 15 to 19 Year

    • ceicdata.com
    Cite
    CEICdata.com, India Population: Census: Age: 15 to 19 Year [Dataset]. https://www.ceicdata.com/en/india/census-population-by-age-group/population-census-age-15-to-19-year
    Explore at:
    Dataset provided by
    CEIC Data
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    Mar 1, 1991 - Mar 1, 2011
    Area covered
    India
    Variables measured
    Population
    Description

    India Population: Census: Age: 15 to 19 Year was reported at 120,526.449 thousand persons on 1 March 2011, an increase from 100,216.000 thousand persons on 1 March 2001. The series is updated once per decade and has a median of 100,216.000 thousand persons over 3 observations from March 1991 to March 2011. It reached an all-time high of 120,526.449 thousand persons on 1 March 2011 and a record low of 79,035.000 thousand persons on 1 March 1991. The series remains in active status in CEIC and is reported by the Office of the Registrar General & Census Commissioner, India. The data is categorized under India Premium Database’s Demographic – Table IN.GAD001: Census: Population: by Age Group.

  11. National Family Health Survey 1992-1993 - India

    • catalog.ihsn.org
    • dev.ihsn.org
    • +1more
    Updated Jul 6, 2017
    Cite
    International Institute for Population Sciences (IIPS) (2017). National Family Health Survey 1992-1993 - India [Dataset]. https://catalog.ihsn.org/catalog/2547
    Explore at:
    Dataset updated
    Jul 6, 2017
    Dataset authored and provided by
    International Institute for Population Sciences (IIPS)
    Time period covered
    1992 - 1993
    Area covered
    India
    Description

    Abstract

    The National Family Health Survey (NFHS) was carried out as the principal activity of a collaborative project to strengthen the research capabilities of the Population Research Centres (PRCs) in India, initiated by the Ministry of Health and Family Welfare (MOHFW), Government of India, and coordinated by the International Institute for Population Sciences (IIPS), Bombay. Interviews were conducted with a nationally representative sample of 89,777 ever-married women in the age group 13-49, from 24 states and the National Capital Territory of Delhi. The main objective of the survey was to collect reliable and up-to-date information on fertility, family planning, mortality, and maternal and child health. Data collection was carried out in three phases from April 1992 to September 1993. The NFHS is one of the most complete surveys of its kind ever conducted in India.

    The households covered in the survey included 500,492 residents. The young age structure of the population highlights the momentum of the future population growth of the country; 38 percent of household residents are under age 15, with their reproductive years still in the future. Persons age 60 or older constitute 8 percent of the population. The population sex ratio of the de jure residents is 944 females per 1,000 males, which is slightly higher than the sex ratio of 927 observed in the 1991 Census.

    The primary objective of the NFHS is to provide national-level and state-level data on fertility, nuptiality, family size preferences, knowledge and practice of family planning, the potential demand for contraception, the level of unwanted fertility, utilization of antenatal services, breastfeeding and food supplementation practices, child nutrition and health, immunizations, and infant and child mortality. The NFHS is also designed to explore the demographic and socioeconomic determinants of fertility, family planning, and maternal and child health. This information is intended to assist policymakers, administrators and researchers in assessing and evaluating population and family welfare programmes and strategies. The NFHS used uniform questionnaires and uniform methods of sampling, data collection and analysis with the primary objective of providing a source of demographic and health data for interstate comparisons. The data collected in the NFHS are also comparable with those of the Demographic and Health Surveys (DHS) conducted in many other countries.

    Geographic coverage

    National

    Analysis unit

    • Household
    • Data collected for women 13-49, indicators calculated for women 15-49

    Universe

    The universe for the 1992-93 survey comprises all women age 13-49 who were either permanent residents of the sampled households or visitors present in those households on the night before the survey; these women were eligible to be interviewed.

    Kind of data

    Sample survey data

    Sampling procedure

    SAMPLE DESIGN

    The sample design for the NFHS was discussed during a Sample Design Workshop held in Madurai in October 1991. The workshop was attended by representatives from the PRCs; the COs; the Office of the Registrar General, India; IIPS and the East-West Center/Macro International. A uniform sample design was adopted in all the NFHS states. The sample design adopted in each state is a systematic, stratified sample of households, with two stages in rural areas and three stages in urban areas.

    SAMPLE SIZE AND ALLOCATION

    The sample size for each state was specified in terms of a target number of completed interviews with eligible women. The target sample size was set considering the size of the state, the time and resources available for the survey and the need for separate estimates for urban and rural areas of the state. The initial target sample size was 3,000 completed interviews with eligible women for states having a population of 25 million or less in 1991; 4,000 completed interviews for large states with more than 25 million population; 8,000 for Uttar Pradesh, the largest state; and 1,000 each for the six small northeastern states. In states with a substantial number of backward districts, the initial target samples were increased so as to allow separate estimates to be made for groups of backward districts.

    The urban and rural samples within states were drawn separately and, to the extent possible, sample allocation was proportional to the size of the urban-rural populations (to facilitate the selection of a self-weighting sample for each state). In states where the urban population was not sufficiently large to provide a sample of at least 1,000 completed interviews with eligible women, the urban areas were appropriately oversampled (except in the six small northeastern states).

    THE RURAL SAMPLE: THE FRAME, STRATIFICATION AND SELECTION

    A two-stage stratified sampling was adopted for the rural areas: selection of villages followed by selection of households. Because the 1991 Census data were not available at the time of sample selection in most states, the 1981 Census list of villages served as the sampling frame in all the states with the exception of Assam, Delhi and Punjab. In these three states the 1991 Census data were used as the sampling frame.

    Villages were stratified prior to selection on the basis of a number of variables. The first level of stratification in all the states was geographic, with districts subdivided into regions according to their geophysical characteristics. Within each of these regions, villages were further stratified using some of the following variables: village size, distance from the nearest town, proportion of nonagricultural workers, proportion of the population belonging to scheduled castes/scheduled tribes, and female literacy. However, not all variables were used in every state. Each state was examined individually and two or three variables were selected for stratification, with the aim of creating not more than 12 strata for small states and not more than 15 strata for large states. Female literacy was often used for implicit stratification (i.e., the villages were ordered prior to selection according to the proportion of females who were literate). Primary Sampling Units (PSUs) were selected systematically, with probability proportional to size (PPS). In some cases, adjacent villages with small population sizes were combined into a single PSU for the purpose of sample selection. On average, 30 households were selected for interviewing in each selected PSU.
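
    The systematic PPS selection of PSUs mentioned above can be sketched in Python as follows. This is a generic textbook version of systematic PPS sampling, not the survey's actual procedure; the function name, the village sizes and the number of PSUs drawn are all hypothetical. The input list is assumed to be already ordered by the implicit stratification variable (for example, female literacy).

```python
import random


def pps_systematic(sizes, n, rng):
    """Select n units systematically with probability proportional to size.

    sizes: positive size measures (e.g. village populations), in the order
           used for implicit stratification.
    rng:   any object with a random() method returning a float in [0, 1).
    Returns the indices of the selected units. A unit larger than the
    sampling interval can be selected more than once.
    """
    total = sum(sizes)
    interval = total / n                 # sampling interval on the size scale
    start = rng.random() * interval      # random start within first interval

    selected = []
    cumulative = 0                       # total size of units before idx
    idx = 0
    for k in range(n):
        target = start + k * interval
        # Advance until the cumulative size range of unit idx covers target.
        while idx < len(sizes) - 1 and cumulative + sizes[idx] <= target:
            cumulative += sizes[idx]
            idx += 1
        selected.append(idx)
    return selected


# Hypothetical village populations, already sorted for implicit stratification.
village_sizes = [120, 300, 80, 500, 250, 150, 600]
print(pps_systematic(village_sizes, 3, random.Random(42)))
```

    Because the selection points are evenly spaced along the cumulative size scale, ordering the list by a variable beforehand spreads the sample across the range of that variable, which is what makes the stratification implicit.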

    In every state, all the households in the selected PSUs were listed about two weeks prior to the survey. This listing provided the necessary frame for selecting households at the second sampling stage. The household listing operation consisted of preparing up-to-date notional and layout sketch maps of each selected PSU, assigning numbers to structures, recording addresses (or locations) of these structures, identifying the residential structures, and listing the names of the heads of all the households in the residential structures in the selected PSU. Each household listing team consisted of a lister and a mapper. The listing operation was supervised by the senior field staff of the concerned CO and the PRC in each state. Special efforts were made not to miss any household in the selected PSU during the listing operation. In PSUs with fewer than 500 households, a complete household listing was done. In PSUs with 500 or more households, segmentation of the PSU was done on the basis of existing wards in the PSU, and two segments were selected using either systematic sampling or PPS sampling. The household listing in such PSUs was carried out in the selected segments. The households to be interviewed were selected from the household listing, and the interviewing teams were provided with the original household listing, layout sketch map and the household sample selected for each PSU. All the selected households were approached during the data collection, and no substitution of a household was allowed under any circumstances.

    THE URBAN SAMPLE: THE FRAME, STRATIFICATION AND SELECTION

    A three-stage sample design was adopted for the urban areas in each state: selection of cities/towns, followed by urban blocks, and finally households. Cities and towns were selected using the 1991 population figures, while urban blocks were selected using the 1991 list of census enumeration blocks in all the states with the exception of the first phase states. For the first phase states, the list of urban blocks provided by the National Sample Survey Organization (NSSO) served as the sampling frame.

    All cities and towns were subdivided into three strata: (1) self-selecting cities (i.e., cities with a population large enough to be selected with certainty), (2) towns that are district headquarters, and (3) other towns. Within each stratum, the cities/towns were arranged according to the same kind of geographic stratification used in the rural areas. In self-selecting cities, the sample was selected according to a two-stage sample design: selection of the required number of urban blocks, followed by selection of households in each of the selected blocks. For district headquarters and other towns, a three-stage sample design was used: selection of towns with PPS, followed by selection of two census blocks per selected town, followed by selection of households from each selected block. As in rural areas, a household listing was carried out in the selected blocks, and an average of 20 households per block was selected systematically.

    Mode of data collection

    Face-to-face

    Research instrument

    Three types of questionnaires were used in the NFHS: the Household Questionnaire, the Women's Questionnaire, and the Village Questionnaire. The overall content

  12. India - Urban Population (% Of Total)

    • tradingeconomics.com
    csv, excel, json, xml
    Updated Jul 22, 2013
    Cite
    TRADING ECONOMICS (2013). India - Urban Population (% Of Total) [Dataset]. https://tradingeconomics.com/india/urban-population-percent-of-total-wb-data.html
    Explore at:
    excel, json, xml, csv (available download formats)
    Dataset updated
    Jul 22, 2013
    Dataset authored and provided by
    TRADING ECONOMICS
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    Jan 1, 1976 - Dec 31, 2025
    Area covered
    India
    Description

    Urban population (% of total population) in India was reported at 36.87 % in 2024, according to the World Bank collection of development indicators, compiled from officially recognized sources. India - Urban population (% of total) - actual values, historical data, forecasts and projections were sourced from the World Bank in July 2025.

  13. India Poverty at 5.50 USD per day - data, chart | TheGlobalEconomy.com

    • theglobaleconomy.com
    csv, excel, xml
    Updated Dec 14, 2019
    Cite
    Globalen LLC (2019). India Poverty at 5.50 USD per day - data, chart | TheGlobalEconomy.com [Dataset]. www.theglobaleconomy.com/India/poverty_ratio_high_range/
    Explore at:
    xml, excel, csv (available download formats)
    Dataset updated
    Dec 14, 2019
    Dataset authored and provided by
    Globalen LLC
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    Dec 31, 1977 - Dec 31, 2021
    Area covered
    India
    Description

    India: Poverty ratio, percent living on less than 5.50 USD a day: The latest value from 2021 is 81.8 percent, a decline from 83 percent in 2020. In comparison, the world average is 25.11 percent, based on data from 71 countries. Historically, the average for India from 1977 to 2021 is 89.86 percent. The minimum value, 80.7 percent, was reached in 2019 while the maximum of 97.8 percent was recorded in 1977.

  14. National Family Survey 2019-2021 - India

    • microdata.worldbank.org
    • catalog.ihsn.org
    • +1more
    Updated May 12, 2022
    Cite
    International Institute for Population Sciences (IIPS) (2022). National Family Survey 2019-2021 - India [Dataset]. https://microdata.worldbank.org/index.php/catalog/4482
    Explore at:
    Dataset updated
    May 12, 2022
    Dataset provided by
    Ministry of Health and Family Welfare (MoHFW)
    International Institute for Population Sciences (IIPS)
    Time period covered
    2019 - 2021
    Area covered
    India
    Description

    Abstract

    The National Family Health Survey 2019-21 (NFHS-5), the fifth in the NFHS series, provides information on population, health, and nutrition for India, each state/union territory (UT), and for 707 districts.

    The primary objective of the 2019-21 round of National Family Health Surveys is to provide essential data on health and family welfare, as well as data on emerging issues in these areas, such as levels of fertility, infant and child mortality, maternal and child health, and other health and family welfare indicators by background characteristics at the national and state levels. Similar to NFHS-4, NFHS-5 also provides information on several emerging issues including perinatal mortality, high-risk sexual behaviour, safe injections, tuberculosis, noncommunicable diseases, and the use of emergency contraception.

    The information collected through NFHS-5 is intended to assist policymakers and programme managers in setting benchmarks and examining progress over time in India’s health sector. Besides providing evidence on the effectiveness of ongoing programmes, NFHS-5 data will help to identify the need for new programmes in specific health areas.

    The clinical, anthropometric, and biochemical (CAB) component of NFHS-5 is designed to provide vital estimates of the prevalence of malnutrition, anaemia, hypertension, high blood glucose levels, and waist and hip circumference, Vitamin D3, HbA1c, and malaria parasites through a series of biomarker tests and measurements.

    Geographic coverage

    National coverage

    Analysis unit

    • Household
    • Individual
    • Children age 0-5
    • Woman age 15-49
    • Man age 15 to 54

    Universe

    The survey covered all de jure household members (usual residents), all women aged 15-49, all men age 15-54, and all children aged 0-5 resident in the household.

    Kind of data

    Sample survey data [ssd]

    Sampling procedure

    A uniform sample design, which is representative at the national, state/union territory, and district level, was adopted in each round of the survey. Each district is stratified into urban and rural areas. Each rural stratum is sub-stratified into smaller substrata which are created considering the village population and the percentage of the population belonging to scheduled castes and scheduled tribes (SC/ST). Within each explicit rural sampling stratum, a sample of villages was selected as Primary Sampling Units (PSUs); before the PSU selection, PSUs were sorted according to the literacy rate of women age 6+ years. Within each urban sampling stratum, a sample of Census Enumeration Blocks (CEBs) was selected as PSUs. Before the PSU selection, PSUs were sorted according to the percentage of SC/ST population. In the second stage of selection, a fixed number of 22 households per cluster was selected with an equal probability systematic selection from a newly created list of households in the selected PSUs. The list of households was created as a result of the mapping and household listing operation conducted in each selected PSU before the household selection in the second stage. In all, 30,456 Primary Sampling Units (PSUs) were selected across the country in NFHS-5 drawn from 707 districts as on March 31st 2017, of which fieldwork was completed in 30,198 PSUs.

    For further details on sample design, see Section 1.2 of the final report.
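
    The second-stage step described above (a fixed number of 22 households per cluster, chosen by equal-probability systematic selection from the household list) can be illustrated with the Python sketch below. It shows the general technique only; the function name, the household identifiers and the listing size of 150 are hypothetical, and the final report remains the authority on the actual procedure.

```python
import random


def systematic_sample(listing, n, rng):
    """Equal-probability systematic selection of n items from an ordered list.

    Uses a fractional sampling interval so that exactly n items are drawn
    even when len(listing) is not a multiple of n.
    """
    if n >= len(listing):
        return list(listing)             # take everything in small clusters
    interval = len(listing) / n          # fractional sampling interval (> 1)
    start = rng.random() * interval      # random start in [0, interval)
    last = len(listing) - 1
    # min() guards against a floating-point overshoot past the last index.
    return [listing[min(int(start + k * interval), last)] for k in range(n)]


# Hypothetical household listing for one PSU.
households = [f"HH-{i:03d}" for i in range(1, 151)]
selected = systematic_sample(households, 22, random.Random(2019))
print(len(selected), selected[:3])
```

    With a listing of 150 households the interval is about 6.8, so every household has the same 22/150 chance of selection while the sample stays spread evenly through the list.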

    Mode of data collection

    Computer Assisted Personal Interview [capi]

    Research instrument

    Four survey schedules/questionnaires: Household, Woman, Man, and Biomarker were canvassed in 18 local languages using Computer Assisted Personal Interviewing (CAPI).

    Cleaning operations

    Electronic data collected in the 2019-21 National Family Health Survey were received on a daily basis via the SyncCloud system at the International Institute for Population Sciences, where the data were stored on a password-protected computer. Secondary editing of the data, which required resolution of computer-identified inconsistencies and coding of open-ended questions, was conducted in the field by the Field Agencies and at the Field Agencies central office, and IIPS checked the secondary edits before the dataset was finalized.

    Field-check tables were produced by IIPS and the Field Agencies on a regular basis to identify certain types of errors that might have occurred in eliciting information and recording question responses. Information from the field-check tables on the performance of each fieldwork team and individual investigator was promptly shared with the Field Agencies during the fieldwork so that the performance of the teams could be improved, if required.

    Response rate

    A total of 664,972 households were selected for the sample, of which 653,144 were occupied. Among the occupied households, 636,699 were successfully interviewed, for a response rate of 98 percent.

    In the interviewed households, 747,176 eligible women age 15-49 were identified for individual women’s interviews. Interviews were completed with 724,115 women, for a response rate of 97 percent. In all, there were 111,179 eligible men age 15-54 in households selected for the state module. Interviews were completed with 101,839 men, for a response rate of 92 percent.

  15. India Population density - data, chart | TheGlobalEconomy.com

    • theglobaleconomy.com
    csv, excel, xml
    Updated May 11, 2020
    Cite
    Globalen LLC (2020). India Population density - data, chart | TheGlobalEconomy.com [Dataset]. www.theglobaleconomy.com/India/population_density/
    Explore at:
    excel, csv, xml (available download formats)
    Dataset updated
    May 11, 2020
    Dataset authored and provided by
    Globalen LLC
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    Dec 31, 1961 - Dec 31, 2021
    Area covered
    India
    Description

    India: Population density, people per square km: The latest value from 2021 is 473 people per square km, an increase from 470 people per square km in 2020. In comparison, the world average is 456 people per square km, based on data from 196 countries. Historically, the average for India from 1961 to 2021 is 305 people per square km. The minimum value, 153 people per square km, was reached in 1961 while the maximum of 473 people per square km was recorded in 2021.

  16. Indian English General Conversation Speech Dataset for ASR

    • futurebeeai.com
    wav
    Updated Aug 1, 2022
    Cite
    FutureBee AI (2022). Indian English General Conversation Speech Dataset for ASR [Dataset]. https://www.futurebeeai.com/dataset/speech-dataset/general-conversation-english-india
    Explore at:
    wav (available download formats)
    Dataset updated
    Aug 1, 2022
    Dataset provided by
    FutureBeeAI
    Authors
    FutureBee AI
    License

    https://www.futurebeeai.com/policies/ai-data-license-agreement

    Dataset funded by
    FutureBeeAI
    Description

    Introduction

    Welcome to the Indian English General Conversation Speech Dataset — a rich, linguistically diverse corpus purpose-built to accelerate the development of English speech technologies. This dataset is designed to train and fine-tune ASR systems, spoken language understanding models, and generative voice AI tailored to real-world Indian English communication.

    Curated by FutureBeeAI, this 30-hour dataset offers unscripted, spontaneous two-speaker conversations across a wide array of real-life topics. It enables researchers, AI developers, and voice-first product teams to build robust, production-grade English speech models that understand and respond to authentic Indian accents and dialects.

    Speech Data

    The dataset comprises 30 hours of high-quality audio, featuring natural, free-flowing dialogue between native speakers of Indian English. These sessions range from informal daily talks to deeper, topic-specific discussions, ensuring variability and context richness for diverse use cases.

    Participant Diversity:
    Speakers: 60 verified native Indian English speakers from FutureBeeAI’s contributor community.
    Regions: Representing various provinces of India to ensure dialectal diversity and demographic balance.
    Demographics: A balanced gender ratio (60% male, 40% female) with participant ages ranging from 18 to 70 years.
    Recording Details:
    Conversation Style: Unscripted, spontaneous peer-to-peer dialogues.
    Duration: Each conversation ranges from 15 to 60 minutes.
    Audio Format: Stereo WAV files, 16-bit depth, recorded at 16kHz sample rate.
    Environment: Quiet, echo-free settings with no background noise.
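
    A quick way to confirm that a downloaded file matches the stated audio format (stereo, 16-bit, 16 kHz) is to read its header with Python's standard wave module, as in the sketch below. The silent in-memory clip stands in for a real recording, and the helper names and the EXPECTED dictionary are hypothetical; only the three format values come from the description above.

```python
import io
import wave

# Format stated in the recording details: stereo, 16-bit depth, 16 kHz.
EXPECTED = {"channels": 2, "sample_width_bytes": 2, "sample_rate_hz": 16000}


def wav_properties(fileobj):
    """Read basic header properties from a WAV file or file-like object."""
    with wave.open(fileobj, "rb") as wav:
        return {
            "channels": wav.getnchannels(),
            "sample_width_bytes": wav.getsampwidth(),
            "sample_rate_hz": wav.getframerate(),
            "duration_s": wav.getnframes() / wav.getframerate(),
        }


def make_silent_wav(seconds=1):
    """Build an in-memory silent clip in the expected format (stand-in data)."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(2)
        wav.setsampwidth(2)
        wav.setframerate(16000)
        # 2 channels * 2 bytes per sample * 16000 frames per second.
        wav.writeframes(b"\x00" * (2 * 2 * 16000 * seconds))
    buffer.seek(0)
    return buffer


props = wav_properties(make_silent_wav())
matches = all(props[key] == value for key, value in EXPECTED.items())
print(props, "matches expected format:", matches)
```

    For a real recording, a file path can be passed to wav_properties in place of the in-memory buffer.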

    Topic Diversity

    The dataset spans a wide variety of everyday and domain-relevant themes. This topic diversity ensures the resulting models are adaptable to broad speech contexts.

    Sample Topics Include:
    Family & Relationships
    Food & Recipes
    Education & Career
    Healthcare Discussions
    Social Issues
    Technology & Gadgets
    Travel & Local Culture
    Shopping & Marketplace Experiences, and many more.

    Transcription

    Each audio file is paired with a human-verified, verbatim transcription available in JSON format.

    Transcription Highlights:
    Speaker-segmented dialogues
    Time-coded utterances
    Non-speech elements (pauses, laughter, etc.)
    High transcription accuracy, achieved through double QA pass, average WER < 5%

    These transcriptions are production-ready, enabling seamless integration into ASR model pipelines or conversational AI workflows.

    Metadata

    The dataset comes with granular metadata for both speakers and recordings:

    Speaker Metadata: Age, gender, accent, dialect, state/province, and participant ID.
    Recording Metadata: Topic, duration, audio format, device type, and sample rate.

    Such metadata helps developers fine-tune model training and supports use-case-specific filtering or demographic analysis.

    Usage and Applications

    This dataset is a versatile resource for multiple English speech and language AI applications:

    ASR Development: Train accurate speech-to-text systems for Indian English.
    Voice Assistants: Build smart assistants capable of understanding natural Indian conversations.

  17. Audio Visual Speech Dataset: Indian English

    • futurebeeai.com
    wav
    Updated Aug 1, 2022
    Cite
    FutureBee AI (2022). Audio Visual Speech Dataset: Indian English [Dataset]. https://www.futurebeeai.com/dataset/multi-modal-dataset/indian-english-visual-speech-dataset
    Explore at:
    wav (available download formats)
    Dataset updated
    Aug 1, 2022
    Dataset provided by
    FutureBeeAI
    Authors
    FutureBee AI
    License

    https://www.futurebeeai.com/policies/ai-data-license-agreement

    Dataset funded by
    FutureBeeAI
    Description

    Introduction

    Welcome to the Indian English Language Visual Speech Dataset! This dataset is a collection of diverse, single-person unscripted spoken videos supporting research in visual speech recognition, emotion detection, and multimodal communication.

    Dataset Content

    This visual speech dataset contains 1000 videos in Indian English, each paired with a corresponding high-fidelity audio track. In each video, a participant answers a specific question in an unscripted and spontaneous manner.

    Participant Diversity:
    Speakers: The dataset includes visual speech data from more than 200 participants from different states/provinces of India.
    Regions: Ensures a balanced representation of accents, dialects, and demographics.
    Participant Profile: Participants range from 18 to 70 years old, representing both males and females in a 60:40 ratio, respectively.

    Video Data

    Extensive guidelines were followed while recording each video to maintain quality and diversity.

    Recording Details:
    File Duration: Average duration of 30 seconds to 3 minutes per video.
    Formats: Videos are available in MP4 or MOV format.
    Resolution: Videos are recorded in ultra-high-definition resolution with 30 fps or above.
    Device: Both the latest Android and iOS devices are used in this collection.
    Recording Conditions: Videos were recorded under various conditions to ensure diversity and reduce bias:
    Indoor and Outdoor Settings: Includes both indoor and outdoor recordings.
    Lighting Variations: Captures videos in daytime, nighttime, and varying lighting conditions.
    Camera Positions: Includes handheld and fixed camera positions, as well as portrait and landscape orientations.
    Face Orientation: Contains straight face and tilted face angles.
    Participant Positions: Records participants in both standing and seated positions.
    Motion Variations: Features both stationary and moving videos, where participants pass through different lighting conditions.
    Occlusions: Includes videos where the participant's face is partially occluded by hand movements, microphones, hair, glasses, and facial hair.
    Focus: In each video, the participant's face remains in focus throughout the video duration, ensuring the face stays within the video frame.
    Video Content: In each video, the participant answers a specific question in an unscripted manner. These questions are designed to capture various emotions of participants. The dataset contains videos expressing the following human emotions:
    Happy
    Sad
    Excited
    Angry
    Annoyed
    Normal
    Question Diversity: For each human emotion, participants answered a specific question expressing that particular emotion.

    Metadata

    The dataset provides comprehensive metadata for each video recording and participant:

  18. KHOJ - Dataset about the Indian High Court judges

    • data.niaid.nih.gov
    • explore.openaire.eu
    • +1more
    Updated Jul 12, 2024
    Cite
    Rangin Pallav Tripathy (2024). KHOJ - Dataset about the Indian High Court judges [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_7670117
    Explore at:
    Dataset updated
    Jul 12, 2024
    Dataset provided by
    Apoorv Anand
    Rangin Pallav Tripathy
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The KHOJ (Know Your High Court Judges) dataset includes data on more than 1700 judges appointed between 1993 (after the creation of the collegium) and 2021. The dataset captures information across 43 variables including the personal, educational and professional backgrounds of India’s High Court judges. It opens pathways for researchers who are looking to probe deeper or wider into the composition of the High Courts and those who want to undertake jurimetrics studies which explore the linkage between judicial behaviour and the background of judges.

    The core philosophy behind building such a dataset is the realization that people of the country should have more information about judges whose decisions have a real impact on such people's lives.

    This dataset is the result of a joint effort over 15 months involving more than 30 students and 10 professionals who volunteered their time and efforts in preparing this dataset. This was a collaboration between NLUO’s Centre for Public Policy, Law and Good Governance, Agami and CivicDataLab. It started with the Summer of Data 2021 programme where students from across the country became the original data creators using official and publicly accessible data sources.

  19. Human Trafficking In India (2018- 2020)

    • kaggle.com
    Updated Apr 15, 2022
    Cite
    Shefali C. (2022). Human Trafficking In India (2018- 2020) [Dataset]. https://www.kaggle.com/datasets/cshefali/human-trafficking-in-india-2018-2020
    Explore at:
    Dataset updated
    Apr 15, 2022
    Dataset provided by
    Kaggle
    Authors
    Shefali C.
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Area covered
    India
    Description

    Content:

    This dataset contains information about total number of human trafficking cases reported per State/Union Territories in India, number of victims trafficked/rescued, nationality of the victims, age-group, purpose of trafficking, police and court disposal of cases, and number of culprits arrested/acquitted.

    To learn more about the Indian states and Union Territories, you may refer to Know India.

    Until 2019, India had 29 states and 7 Union Territories. In 2020, following administrative reorganisation, there were 28 states and 8 Union Territories.

    2018 (29 states and 7 Union Territories)
    • Rate of Cognizable Crimes: this column refers to cases reported per 1 lakh population.
    • Due to non-receipt of data from Assam & Jharkhand for 2018, data furnished for 2017 has been used.

    2019 (29 states and 7 Union Territories)
    • Due to non-receipt of data from West Bengal in time for 2019, data furnished for 2018 has been used.

    2020 (28 states and 8 Union Territories)
    • The earlier two UTs of D & N Haveli and Daman & Diu were combined into one.
    • The State of Jammu and Kashmir was reorganised into the Union Territories of:
      1. Jammu & Kashmir
      2. Ladakh
    • The number of cases reported in 2018 and 2019 for Jammu & Kashmir includes data for Ladakh too.
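The "cases reported per 1 lakh population" definition above can be illustrated with a short calculation. This is a minimal sketch: the function name `rate_per_lakh` and the input figures are hypothetical and are not taken from the dataset or from the National Crime Records Bureau.

```python
def rate_per_lakh(cases: int, population: int) -> float:
    """Return cases per 1 lakh (100,000) population."""
    if population <= 0:
        raise ValueError("population must be positive")
    # Multiply before dividing to keep the arithmetic simple and exact
    # for round inputs.
    return cases * 100_000 / population


# Hypothetical figures for illustration only.
print(rate_per_lakh(150, 30_000_000))  # 0.5
```

A rate of 0.5 here would mean one reported case for every 2 lakh people.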

    Here is a short description of a few terms present in the dataset. For further reading, you may refer to this site.

    1. Charge Sheet: the complaint of a private individual on which criminal proceedings are initiated. When the police send the charge sheet to the Magistrate, the preliminary stage of investigation and preparation is over.
    2. Final Report: follows the charge sheet. It records the conclusion reached by the police after the investigation process.

    So, if the Final Report column contains 0, it implies that the investigation is not yet complete.

    Acknowledgement

    The data has been taken from the National Crime Records Bureau portal of India.

    Inspiration

    I recently watched some movies/documentaries on Human Trafficking which prompted me to compile this dataset.

  20. Ontario, OR Non-Hispanic Population Breakdown By Race Dataset: Non-Hispanic...

    • neilsberg.com
    csv, json
    Updated Feb 21, 2025
    Neilsberg Research (2025). Ontario, OR Non-Hispanic Population Breakdown By Race Dataset: Non-Hispanic Population Counts and Percentages for 7 Racial Categories as Identified by the US Census Bureau // 2025 Edition [Dataset]. https://www.neilsberg.com/insights/ontario-or-population-by-race/
    Available download formats
    csv, json
    Dataset updated
    Feb 21, 2025
    Dataset authored and provided by
    Neilsberg Research
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Ontario
    Variables measured
    Non-Hispanic Asian Population, Non-Hispanic Black Population, Non-Hispanic White Population, Non-Hispanic Some other race Population, Non-Hispanic Two or more races Population, Non-Hispanic American Indian and Alaska Native Population, Non-Hispanic Native Hawaiian and Other Pacific Islander Population, Non-Hispanic Asian Population as Percent of Total Non-Hispanic Population, Non-Hispanic Black Population as Percent of Total Non-Hispanic Population, Non-Hispanic White Population as Percent of Total Non-Hispanic Population, and 4 more
    Measurement technique
    The data presented in this dataset is derived from the latest U.S. Census Bureau American Community Survey (ACS) 2017-2021 5-Year Estimates. To measure the two variables, namely (a) Non-Hispanic population and (b) population as a percentage of the total Non-Hispanic population, we initially analyzed and categorized the data for each of the racial categories identified by the US Census Bureau. It is ensured that the population estimates used in this dataset pertain exclusively to the identified racial categories, and are part of Non-Hispanic classification. For further information regarding these estimates, please feel free to reach out to us via email at research@neilsberg.com.
    Dataset funded by
    Neilsberg Research
    Description
    About this dataset

    Context

    The dataset tabulates the Non-Hispanic population of Ontario by race. It includes the distribution of the Non-Hispanic population of Ontario across various race categories as identified by the Census Bureau. The dataset can be utilized to understand the Non-Hispanic population distribution of Ontario across relevant racial categories.

    Key observations

    Of the Non-Hispanic population in Ontario, the largest racial group is White alone with a population of 5,873 (92.07% of the total Non-Hispanic population).

    Content

    When available, the data consists of estimates from the U.S. Census Bureau American Community Survey (ACS) 2019-2023 5-Year Estimates.

    Racial categories include:

    • White
    • Black or African American
    • American Indian and Alaska Native
    • Asian
    • Native Hawaiian and Other Pacific Islander
    • Some other race
    • Two or more races (multiracial)

    Variables / Data Columns

    • Race: This column displays the racial categories (for Non-Hispanic) for Ontario.
    • Population: The population of the racial category (for Non-Hispanic) in Ontario is shown in this column.
    • % of Total Population: This column displays the percentage distribution of each race as a proportion of Ontario's total Non-Hispanic population. Please note that the sum of all percentages may not equal one due to rounding of values.
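The rounding note in the "% of Total Population" column can be seen in a small worked example. This is only a sketch under assumptions: the helper `percent_shares`, the two-decimal rounding, and the category counts are hypothetical and are not drawn from the Neilsberg data.

```python
def percent_shares(counts: dict[str, int]) -> dict[str, float]:
    """Return each category's share of the total, as a percentage rounded to 2 decimals."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("total population must be non-zero")
    return {name: round(100 * n / total, 2) for name, n in counts.items()}


# Hypothetical counts: three equally sized categories.
shares = percent_shares({"Category A": 1, "Category B": 1, "Category C": 1})
print(shares)                 # each share is 33.33
print(sum(shares.values()))   # slightly below 100 because of rounding
```

Each rounded share drops a small remainder, so the rounded values need not add up to exactly the whole.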

    Good to know

    Margin of Error

    Data in the dataset are based on estimates and are subject to sampling variability and thus a margin of error. Neilsberg Research recommends using caution when presenting these estimates in your research.

    Custom data

    If you need custom data for any of your research projects, reports, or presentations, you can contact our research staff at research@neilsberg.com to discuss the feasibility of a custom tabulation on a fee-for-service basis.

    Inspiration

    The Neilsberg Research Team curates, analyzes, and publishes demographic and economic data from a variety of public and proprietary sources, each of which often includes multiple surveys and programs. The large majority of Neilsberg Research aggregated datasets and insights are made available for free download at https://www.neilsberg.com/research/.

    Recommended for further research

    This dataset is a part of the main dataset for Ontario Population by Race & Ethnicity. You can refer to it here.

Satya Thirumani (2025). 🦈 Shark Tank India dataset 🇮🇳 [Dataset]. https://www.kaggle.com/datasets/thirumani/shark-tank-india

🦈 Shark Tank India dataset 🇮🇳

Shark Tank India data set, includes Season 1 to Season 4 information

Dataset updated
Apr 20, 2025
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Satya Thirumani
License

https://creativecommons.org/publicdomain/zero/1.0/

Description

Shark Tank India Data set.

Shark Tank India - Season 1 to season 4 information, with 80 fields/columns and 630+ records.

All seasons/episodes of 🦈 SHARKTANK INDIA 🇮🇳 were broadcast on SonyLiv OTT/Sony TV.

Here is the data dictionary for (Indian) Shark Tank season's dataset.

  • Season Number - Season number
  • Startup Name - Company name or product name
  • Episode Number - Episode number within the season
  • Pitch Number - Overall pitch number
  • Season Start - Season first aired date
  • Season End - Season last aired date
  • Original Air Date - Episode original/first aired date, on OTT/TV
  • Episode Title - Episode title in SonyLiv
  • Anchor - Name of the episode presenter/host
  • Industry - Industry name or type
  • Business Description - Business Description
  • Company Website - Company Website URL
  • Started in - Year in which startup was started/incorporated
  • Number of Presenters - Number of presenters
  • Male Presenters - Number of male presenters
  • Female Presenters - Number of female presenters
  • Transgender Presenters - Number of transgender/LGBTQ presenters
  • Couple Presenters - Are presenters wife/husband ? 1-yes, 0-no
  • Pitchers Average Age - All pitchers average age, <30 young, 30-50 middle, >50 old
  • Pitchers City - Presenter's town/city or place where company head office exists
  • Pitchers State - Indian state pitcher hails from or state where company head office exists
  • Yearly Revenue - Yearly revenue, in lakhs INR, -1 means negative revenue, 0 means pre-revenue
  • Monthly Sales - Total monthly sales, in lakhs
  • Gross Margin - Gross margin/profit of company, in percentages
  • Net Margin - Net margin/profit of company, in percentages
  • EBITDA - Earnings Before Interest, Taxes, Depreciation, and Amortization
  • Cash Burn - In loss in current year; burning/paying money from their pocket (yes/no)
  • SKUs - Stock Keeping Units or number of varieties, at the time of pitch
  • Has Patents - Pitcher has Patents/Intellectual property (filed/granted), at the time of pitch
  • Bootstrapped - Startup is bootstrapped or not (yes/no)
  • Part of Match off - Competition between two similar brands, pitched at same time
  • Original Ask Amount - Original Ask Amount, in lakhs INR
  • Original Offered Equity - Original Offered Equity, in percentages
  • Valuation Requested - Valuation Requested, in lakhs INR
  • Received Offer - Received offer or not, 1-received, 0-not received
  • Accepted Offer - Accepted offer or not, 1-accepted, 0-rejected
  • Total Deal Amount - Total Deal Amount, in lakhs INR
  • Total Deal Equity - Total Deal Equity, in percentages
  • Total Deal Debt - Total Deal debt/loan amount, in lakhs INR
  • Debt Interest - Debt interest rate, in percentages
  • Deal Valuation - Deal Valuation, in lakhs INR
  • Number of sharks in deal - Number of sharks involved in deal
  • Deal has conditions - Deal has conditions or not? (yes or no)
  • Royalty Percentage - Royalty percentage, if it's royalty deal
  • Royalty Recouped Amount - Royalty recouped amount, if it's royalty deal, in lakhs
  • Advisory Shares Equity - Deal with Advisory shares or equity, in percentages
  • Namita Investment Amount - Namita Investment Amount, in lakhs INR
  • Namita Investment Equity - Namita Investment Equity, in percentages
  • Namita Debt Amount - Namita Debt Amount, in lakhs INR
  • Vineeta Investment Amount - Vineeta Investment Amount, in lakhs INR
  • Vineeta Investment Equity - Vineeta Investment Equity, in percentages
  • Vineeta Debt Amount - Vineeta Debt Amount, in lakhs INR
  • Anupam Investment Amount - Anupam Investment Amount, in lakhs INR
  • Anupam Investment Equity - Anupam Investment Equity, in percentages
  • Anupam Debt Amount - Anupam Debt Amount, in lakhs INR
  • Aman Investment Amount - Aman Investment Amount, in lakhs INR
  • Aman Investment Equity - Aman Investment Equity, in percentages
  • Aman Debt Amount - Aman Debt Amount, in lakhs INR
  • Peyush Investment Amount - Peyush Investment Amount, in lakhs INR
  • Peyush Investment Equity - Peyush Investment Equity, in percentages
  • Peyush Debt Amount - Peyush Debt Amount, in lakhs INR
  • Ritesh Investment Amount - Ritesh Investment Amount, in lakhs INR
  • Ritesh Investment Equity - Ritesh Investment Equity, in percentages
  • Ritesh Debt Amount - Ritesh Debt Amount, in lakhs INR
  • Amit Investment Amount - Amit Investment Amount, in lakhs INR
  • Amit Investment Equity - Amit Investment Equity, in percentages
  • Amit Debt Amount - Amit Debt Amount, in lakhs INR
  • Guest Investment Amount - Guest Investment Amount, in lakhs INR
  • Guest Investment Equity - Guest Investment Equity, in percentages
  • Guest Debt Amount - Guest Debt Amount, in lakhs INR
  • Invested Guest Name - Name of the guest(s) who invested in deal
  • All Guest Names - Name of all guests, who are present in episode
  • Namita Present - Whether Namita present in episode or not
  • Vineeta Present - Whether Vineeta present in episode or not
  • Anupam ...
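
The data dictionary lists Original Ask Amount (lakhs INR), Original Offered Equity (percent), and Valuation Requested (lakhs INR) but does not say how they relate. A minimal sketch, assuming the usual convention that the requested valuation is the ask divided by the equity fraction; the function name `implied_valuation` and the sample figures are hypothetical.

```python
def implied_valuation(ask_lakhs: float, equity_pct: float) -> float:
    """Valuation (in lakhs INR) implied by an ask amount and an equity percentage.

    Assumption: valuation = ask / (equity / 100). The dataset description
    does not confirm that Valuation Requested is computed this way.
    """
    if equity_pct <= 0:
        raise ValueError("equity percentage must be positive")
    return ask_lakhs * 100 / equity_pct


# Hypothetical pitch: 50 lakhs for 5% equity.
print(implied_valuation(50, 5))  # 1000.0 (lakhs INR, i.e. 10 crore)
```

The same relation could be applied to Total Deal Amount and Total Deal Equity to cross-check Deal Valuation, though deals that include debt or royalty terms may not follow it.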