21 datasets found
  1. 🦈 Shark Tank India dataset 🇮🇳

    • kaggle.com
    Updated Apr 20, 2025
    Cite
    Satya Thirumani (2025). 🦈 Shark Tank India dataset 🇮🇳 [Dataset]. https://www.kaggle.com/datasets/thirumani/shark-tank-india
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Apr 20, 2025
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Satya Thirumani
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Shark Tank India Data set.

    Shark Tank India - Season 1 to season 4 information, with 80 fields/columns and 630+ records.

    All seasons/episodes of 🦈 SHARKTANK INDIA 🇮🇳 were broadcast on SonyLiv OTT/Sony TV.

    Here is the data dictionary for the Shark Tank India seasons dataset.

    • Season Number - Season number
    • Startup Name - Company name or product name
    • Episode Number - Episode number within the season
    • Pitch Number - Overall pitch number
    • Season Start - Season first aired date
    • Season End - Season last aired date
    • Original Air Date - Episode original/first aired date, on OTT/TV
    • Episode Title - Episode title in SonyLiv
    • Anchor - Name of the episode presenter/host
    • Industry - Industry name or type
    • Business Description - Business Description
    • Company Website - Company Website URL
    • Started in - Year in which startup was started/incorporated
    • Number of Presenters - Number of presenters
    • Male Presenters - Number of male presenters
    • Female Presenters - Number of female presenters
    • Transgender Presenters - Number of transgender/LGBTQ presenters
    • Couple Presenters - Are presenters wife/husband ? 1-yes, 0-no
    • Pitchers Average Age - All pitchers average age, <30 young, 30-50 middle, >50 old
    • Pitchers City - Presenter's town/city or place where company head office exists
    • Pitchers State - Indian state pitcher hails from or state where company head office exists
    • Yearly Revenue - Yearly revenue, in lakhs INR, -1 means negative revenue, 0 means pre-revenue
    • Monthly Sales - Total monthly sales, in lakhs
    • Gross Margin - Gross margin/profit of company, in percentages
    • Net Margin - Net margin/profit of company, in percentages
    • EBITDA - Earnings Before Interest, Taxes, Depreciation, and Amortization
    • Cash Burn - In loss in current year; burning/paying money from their pocket (yes/no)
    • SKUs - Stock Keeping Units or number of varieties, at the time of pitch
    • Has Patents - Pitcher has Patents/Intellectual property (filed/granted), at the time of pitch
    • Bootstrapped - Startup is bootstrapped or not (yes/no)
    • Part of Match off - Competition between two similar brands, pitched at same time
    • Original Ask Amount - Original Ask Amount, in lakhs INR
    • Original Offered Equity - Original Offered Equity, in percentages
    • Valuation Requested - Valuation Requested, in lakhs INR
    • Received Offer - Received offer or not, 1-received, 0-not received
    • Accepted Offer - Accepted offer or not, 1-accepted, 0-rejected
    • Total Deal Amount - Total Deal Amount, in lakhs INR
    • Total Deal Equity - Total Deal Equity, in percentages
    • Total Deal Debt - Total Deal debt/loan amount, in lakhs INR
    • Debt Interest - Debt interest rate, in percentages
    • Deal Valuation - Deal Valuation, in lakhs INR
    • Number of sharks in deal - Number of sharks involved in deal
    • Deal has conditions - Deal has conditions or not? (yes or no)
    • Royalty Percentage - Royalty percentage, if it's royalty deal
    • Royalty Recouped Amount - Royalty recouped amount, if it's royalty deal, in lakhs
    • Advisory Shares Equity - Deal with Advisory shares or equity, in percentages
    • Namita Investment Amount - Namita Investment Amount, in lakhs INR
    • Namita Investment Equity - Namita Investment Equity, in percentages
    • Namita Debt Amount - Namita Debt Amount, in lakhs INR
    • Vineeta Investment Amount - Vineeta Investment Amount, in lakhs INR
    • Vineeta Investment Equity - Vineeta Investment Equity, in percentages
    • Vineeta Debt Amount - Vineeta Debt Amount, in lakhs INR
    • Anupam Investment Amount - Anupam Investment Amount, in lakhs INR
    • Anupam Investment Equity - Anupam Investment Equity, in percentages
    • Anupam Debt Amount - Anupam Debt Amount, in lakhs INR
    • Aman Investment Amount - Aman Investment Amount, in lakhs INR
    • Aman Investment Equity - Aman Investment Equity, in percentages
    • Aman Debt Amount - Aman Debt Amount, in lakhs INR
    • Peyush Investment Amount - Peyush Investment Amount, in lakhs INR
    • Peyush Investment Equity - Peyush Investment Equity, in percentages
    • Peyush Debt Amount - Peyush Debt Amount, in lakhs INR
    • Ritesh Investment Amount - Ritesh Investment Amount, in lakhs INR
    • Ritesh Investment Equity - Ritesh Investment Equity, in percentages
    • Ritesh Debt Amount - Ritesh Debt Amount, in lakhs INR
    • Amit Investment Amount - Amit Investment Amount, in lakhs INR
    • Amit Investment Equity - Amit Investment Equity, in percentages
    • Amit Debt Amount - Amit Debt Amount, in lakhs INR
    • Guest Investment Amount - Guest Investment Amount, in lakhs INR
    • Guest Investment Equity - Guest Investment Equity, in percentages
    • Guest Debt Amount - Guest Debt Amount, in lakhs INR
    • Invested Guest Name - Name of the guest(s) who invested in deal
    • All Guest Names - Name of all guests, who are present in episode
    • Namita Present - Whether Namita present in episode or not
    • Vineeta Present - Whether Vineeta present in episode or not
    • Anupam ...
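
    The units and sentinel values in the data dictionary above can be decoded with a few helpers. This is a minimal sketch, not part of the dataset: the function names are hypothetical, and the valuation formula (amount divided by equity share) is an assumed convention, since the source does not say how the Valuation Requested column is derived.

```python
LAKH = 100_000  # 1 lakh = 100,000 rupees


def lakhs_to_inr(amount_lakhs: float) -> float:
    """Convert an amount given in lakhs INR to rupees."""
    return amount_lakhs * LAKH


def revenue_status(yearly_revenue_lakhs: float) -> str:
    """Decode the Yearly Revenue sentinels listed in the data dictionary."""
    if yearly_revenue_lakhs == -1:
        return "negative revenue"
    if yearly_revenue_lakhs == 0:
        return "pre-revenue"
    return "revenue reported"


def implied_valuation_lakhs(amount_lakhs: float, equity_pct: float) -> float:
    """Assumed formula: amount / equity share. Not confirmed by the source."""
    if equity_pct <= 0:
        raise ValueError("equity percentage must be positive")
    return amount_lakhs * 100 / equity_pct
```

    For example, an ask of 50 lakhs for 5% equity would imply a valuation of 1000 lakhs under this assumed formula.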
  2. PREDIK Data-Driven: Geospatial Data | USA | Tailor-made datasets: Foot...

    • datarade.ai
    Updated Oct 13, 2021
    Cite
    Predik Data-driven (2021). PREDIK Data-Driven: Geospatial Data | USA | Tailor-made datasets: Foot traffic & Places Data [Dataset]. https://datarade.ai/data-products/predik-data-driven-geospatial-data-usa-tailor-made-datas-predik-data-driven
    Explore at:
    Available download formats: .json, .csv, .xls, .sql
    Dataset updated
    Oct 13, 2021
    Dataset authored and provided by
    Predik Data-driven
    Area covered
    United States
    Description

    This Location Data & Foot Traffic dataset, available for all countries, includes enriched raw mobility data and visitation at POIs to answer questions such as:

    - How often do people visit a location? (daily, monthly, absolute, and averages)
    - What type of places do they visit? (parks, schools, hospitals, etc.)
    - Which social characteristics do people have in a certain POI? Breakdown by type: residents, workers, visitors.
    - What is their mobility like during night hours and day hours?
    - What is the frequency of visits, partitioned by day of the week and hour of the day?

    Extra insights
    - Visitors' relative income level.
    - Visitors' preferences as derived from their visits to shopping, parks, sports facilities, churches, among others.

    Overview & Key Concepts

    Each record corresponds to a ping from a mobile device, at a particular moment in time and at a particular latitude and longitude. We procure this data from reliable technology partners, which obtain it through partnerships with location-aware apps. The entire process is compliant with applicable privacy laws.

    We clean and process these massive datasets with a number of complex, computer-intensive calculations to make them easier to use in different data science and machine learning applications, especially those related to understanding customer behavior.

    Featured attributes of the data

    Device speed: based on the distance between each observation and the previous one, we estimate the speed at which the device is moving. This is particularly useful to differentiate between vehicles, pedestrians, and stationary observations.

    Night base of the device: we calculate the approximated location of where the device spends the night, which is usually their home neighborhood.

    Day base of the device: we calculate the most common daylight location during weekdays, which is usually their work location.

    Income level: we use the night neighborhood of the device, and intersect it with available socioeconomic data, to infer the device’s income level. Depending on the country, and the availability of good census data, this figure ranges from a relative wealth index to a currency-calculated income.

    POI visited: we intersect each observation with a number of POI databases, to estimate check-ins to different locations. POI databases can vary significantly, in scope and depth, between countries.

    Category of visited POI: for each observation that can be attributable to a POI, we also include a standardized location category (park, hospital, among others).

    Coverage: Worldwide.
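
    The device-speed attribute described above (distance between consecutive observations divided by elapsed time) can be illustrated as follows. This is a sketch under stated assumptions, not the provider's implementation: the haversine great-circle distance, the tuple layout of a ping, and the function names are all choices made for this example.

```python
import math
from datetime import datetime

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in metres


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two lat/lon points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def speeds_kmh(pings):
    """Estimate speed between consecutive pings of one device.

    pings: list of (timestamp, latitude, longitude), sorted by timestamp.
    Returns one value per consecutive pair; None when no time has elapsed.
    """
    speeds = []
    for (t0, lat0, lon0), (t1, lat1, lon1) in zip(pings, pings[1:]):
        seconds = (t1 - t0).total_seconds()
        if seconds <= 0:
            speeds.append(None)  # duplicate or out-of-order timestamp
            continue
        metres = haversine_m(lat0, lon0, lat1, lon1)
        speeds.append(metres / seconds * 3.6)  # m/s to km/h
    return speeds
```

    A threshold on the resulting speed could then separate stationary, pedestrian, and vehicle observations; the source does not state which thresholds are used.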

    Delivery schemas

    We can deliver the data in three different formats:

    Full dataset: one record per mobile ping. These datasets are very large, and should only be consumed by experienced teams with large computing budgets.

    Visitation stream: one record per attributable visit. This dataset is considerably smaller than the full one but retains most of the more valuable elements in the dataset. This helps understand who visited a specific POI, characterize and understand the consumer's behavior.

    Audience profiles: one record per mobile device in a given period of time (usually monthly). All the visitation stream is aggregated by category. This is the most condensed version of the dataset and is very useful to quickly understand the types of consumers in a particular area and to create cohorts of users.
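
    The step from a visitation stream to audience profiles (one record per device and period, with visits aggregated by category) might look like the sketch below. The record layout, the monthly period key, and the function name are assumptions for illustration; the provider's actual schema is not given in the source.

```python
from collections import Counter, defaultdict


def audience_profiles(visits):
    """Aggregate a visitation stream into per-device monthly profiles.

    visits: iterable of (device_id, visit_date, category), where visit_date
    is an ISO 'YYYY-MM-DD' string (hypothetical layout).
    Returns {(device_id, 'YYYY-MM'): {category: visit_count}}.
    """
    profiles = defaultdict(Counter)
    for device_id, visit_date, category in visits:
        month = visit_date[:7]  # keep only 'YYYY-MM'
        profiles[(device_id, month)][category] += 1
    return {key: dict(counts) for key, counts in profiles.items()}
```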

  3. Medicinal Leaf Dataset

    • data.mendeley.com
    Updated Oct 22, 2020
    Cite
    Roopashree S (2020). Medicinal Leaf Dataset [Dataset]. http://doi.org/10.17632/nnytj2v3n5.1
    Dataset updated
    Oct 22, 2020
    Authors
    Roopashree S
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Mother earth is enriched and nourished with a variety of plants. These plants are useful in many ways, such as drug formulation, production of herbal products, and medicines to cure many common ailments and diseases. Ayurveda, a traditional Indian medicinal system practised for the past 5000 years, is widely accepted even today. India is rich in habitats for a variety of medicinal plants. Many parts of the plants, such as leaves, bark, roots, seeds, fruits, and many more, are used as vital ingredients in the production of herbal medicines. Herbal medicines are preferred in both developing and developed countries as an alternative to synthetic drugs, mainly because they are considered free of side effects. Recognition of these plants by human sight is tedious, time-consuming, and inaccurate. Applying image processing and computer vision techniques to identify medicinal plants is crucial, as many of them are threatened with extinction according to IUCN records. Hence, the digitization of useful medicinal plants is crucial for the conservation of biodiversity. Studies reveal that building an intelligent system for recognizing medicinal herbs requires a plant leaf dataset of decent size. The dataset comprises thirty species of healthy medicinal herbs such as Santalum album (Sandalwood), Muntingia calabura (Jamaica cherry), Plectranthus amboinicus / Coleus amboinicus (Indian Mint, Mexican mint), Brassica juncea (Oriental mustard), and many more. The dataset consists of 1500 images of forty species. Each species consists of 60 to 100 high-quality images. The folders are named after the species' botanical/scientific names. The leaves were plucked from different plants of the same species available in local gardens. Care was taken not to pluck many leaves to build the dataset, as each leaf goes to waste after its picture is captured. Healthy and mature leaves were selected for the dataset.
The instruments used are a mobile camera (model: Samsung S9+) and a printer (model: Canon Inkjet Printer). The leaf images in the dataset are slightly rotated and tilted to make them more useful for training machine learning and deep learning models. The contribution of the medicinal plant leaf dataset to developing artificial intelligence models (machine learning and deep learning) will help many researchers and computer scientists detect and identify the species and its diseases and learn more about the herbs' existence and medicinal properties. By releasing this dataset to the community, we look forward to stimulating research in medicinal plants, where the current lack of public datasets is one of the main barriers to progress.

  4. Transparency in Keyword Faceted Search: a dataset of Google Shopping html...

    • zenodo.org
    • data.niaid.nih.gov
    zip
    Updated Jan 24, 2020
    Cite
    Cozza Vittoria; Hoang Van Tien; Petrocchi Marinella; De Nicola Rocco (2020). Transparency in Keyword Faceted Search: a dataset of Google Shopping html pages [Dataset]. http://doi.org/10.5281/zenodo.1491557
    Explore at:
    Available download formats: zip
    Dataset updated
    Jan 24, 2020
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Cozza Vittoria; Hoang Van Tien; Petrocchi Marinella; De Nicola Rocco
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset contains a collection of around 2,000 HTML pages: these web pages contain the search results obtained in return to queries for different products, searched by a set of synthetic users surfing Google Shopping (US version) from different locations, in July, 2016.

    Each file name indicates the location from which the search was made, the userID, and the searched product: no_email_LOCATION_USERID.PRODUCT.shopping_testing.#.html

    The locations are Philippines (PHI), United States (US), India (IN). The userIDs: 26 to 30 for users searching from Philippines, 1 to 5 from US, 11 to 15 from India.
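
    The naming scheme above can be parsed mechanically. The following is a minimal sketch rather than tooling shipped with the dataset: it assumes that the `#` placeholder is a decimal number and that product names may contain spaces or dots, and the example file name used in the usage note is hypothetical.

```python
import re

# Pattern for: no_email_LOCATION_USERID.PRODUCT.shopping_testing.#.html
# Assumption: '#' is a run of digits.
FILENAME_RE = re.compile(
    r"^no_email_(?P<location>[A-Z]+)_(?P<user_id>\d+)"
    r"\.(?P<product>.+)\.shopping_testing\.(?P<index>\d+)\.html$"
)

# userID ranges per location, as stated in the dataset description.
USER_RANGES = {"PHI": range(26, 31), "US": range(1, 6), "IN": range(11, 16)}


def parse_filename(name):
    """Split a result-page file name into location, user, product, index."""
    m = FILENAME_RE.match(name)
    if m is None:
        raise ValueError(f"unexpected file name: {name!r}")
    location = m["location"]
    user_id = int(m["user_id"])
    if location not in USER_RANGES or user_id not in USER_RANGES[location]:
        raise ValueError(f"location/userID mismatch in {name!r}")
    return {
        "location": location,
        "user_id": user_id,
        "product": m["product"],
        "index": int(m["index"]),
    }
```

    Calling `parse_filename("no_email_US_3.Television.shopping_testing.1.html")` would yield the location `US`, user 3, and product `Television`; a file whose userID falls outside the documented range for its location is rejected.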

    Products were chosen according to a list of 130 keywords (e.g., MP3 player, MP4 Watch, Personal organizer, Television, etc.).

    In the following, we describe how the search results have been collected.

    Each user has a fresh profile. Creating a new profile corresponds to launching a new, isolated web browser client instance and opening the Google Shopping US web page.

    To mimic real users, the synthetic users can browse, scroll pages, stay on a page, and click on links.

    A fully-fledged web browser is used to get the correct desktop version of the website under investigation. This is because websites could be designed to behave according to user agents, as witnessed by the differences between the mobile and desktop versions of the same website.

    The prices are the retail ones displayed by Google Shopping in US dollars (thus, excluding shipping fees).

    Several frameworks have been proposed for interacting with web browsers and analysing results from search engines. This research adopts OpenWPM. OpenWPM is automatised with Selenium to efficiently create and manage different users with isolated Firefox and Chrome client instances, each of them with their own associated cookies.

    The experiments run, on average, 24 hours. In each of them, the software runs on our local server, but the browser's traffic is redirected to the designated remote servers (i.e., to India), via tunneling in SOCKS proxies. This way, all commands are simultaneously distributed over all proxies. The experiments adopt the Mozilla Firefox browser (version 45.0) for the web browsing tasks and run under Ubuntu 14.04. Also, for each query, we consider the first page of results, counting 40 products. Among them, the focus of the experiments is mostly on the top 10 and top 3 results.

    Due to connection errors, one of the Philippine profiles has no associated results. Also, for the Philippines, a few keywords did not lead to any results: videocassette recorders, totes, umbrellas. Similarly, for the US, no results were returned for totes and umbrellas.

    The search results have been analyzed to check whether there was evidence of price steering based on users' location.

    One term of usage applies:

    In any research product whose findings are based on this dataset, please cite

    @inproceedings{DBLP:conf/ircdl/CozzaHPN19,
     author  = {Vittoria Cozza and
            Van Tien Hoang and
            Marinella Petrocchi and
            Rocco {De Nicola}},
     title   = {Transparency in Keyword Faceted Search: An Investigation on Google
            Shopping},
     booktitle = {Digital Libraries: Supporting Open Science - 15th Italian Research
            Conference on Digital Libraries, {IRCDL} 2019, Pisa, Italy, January
            31 - February 1, 2019, Proceedings},
     pages   = {29--43},
     year   = {2019},
     crossref = {DBLP:conf/ircdl/2019},
     url    = {https://doi.org/10.1007/978-3-030-11226-4\_3},
     doi    = {10.1007/978-3-030-11226-4\_3},
     timestamp = {Fri, 18 Jan 2019 23:22:50 +0100},
     biburl  = {https://dblp.org/rec/bib/conf/ircdl/CozzaHPN19},
     bibsource = {dblp computer science bibliography, https://dblp.org}
    }
    

  5. Indo-German Literature Dataset

    • zenodo.org
    • data.niaid.nih.gov
    bin, csv
    Updated May 14, 2024
    Cite
    Nina Smirnova; Jack H. Culbert; Philipp Mayr (2024). Indo-German Literature Dataset [Dataset]. http://doi.org/10.5281/zenodo.10607235
    Explore at:
    Available download formats: bin, csv
    Dataset updated
    May 14, 2024
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Nina Smirnova; Jack H. Culbert; Philipp Mayr
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The IGLD is a dataset which is a mirror of the data utilised in the SEASON project selected from OpenAlex. It contains Indo-German research articles for research of academic collaboration between 1990 and 2022.

    Paper

    Our paper describing our work in the SEASON project:

    Aasif Ahmad Mir, Nina Smirnova, Jeyshankar Ramalingam, & Philipp Mayr (2024). The rise of Indo-German collaborative research: 1990-2022. In Global Knowledge, Memory and Communication, 2024. https://doi.org/10.1108/GKMC-09-2023-0328

    Usage

    Description of Selection and Cleaning

    The following search query: CU (“GERMANY” AND “INDIA”) was used to retrieve the data from WoS. The data were retrieved from the year 1990 till the 30th of November 2022. A total of 36,999 records were retrieved against the employed query. For the present dataset, we retrieved only articles identical to those from WoS.

    Our original dataset retrieved from WoS consisted of 36,999 entries. 33,319 entries possess a valid DOI, and 3,680 entries do not have a DOI. Therefore, we developed two approaches for retrieving the desired data from the OpenAlex collection. Articles possessing a DOI were matched by DOI (dataset 1), and articles without a DOI (dataset 2) were matched by article title and publication year.

    Afterwards, DOIs in dataset 1 were additionally compared to the DOIs from the original WoS dataset, and all inconsistencies were removed.

    For dataset 2, authors were additionally checked. Authors' surnames from dataset 2 and authors' surnames from the corresponding articles (matched by title and publication year) in the WoS dataset were compared. Only articles with matching publication years, author surname lists and titles were considered for the OpenAlex dataset. Subsequently, dataset 1 and dataset 2 were combined into one final dataset (OpenAlex data).

    Additionally, all duplicates (by article ID) were removed from the OpenAlex data. In the final step, we checked whether all entries contained both German and Indian affiliations. Some inconsistencies with the WoS data were observed: 5,584 entries that have both Indian and German affiliations in WoS had only one of these affiliations in OpenAlex. These entries were removed from the final dataset. The final dataset resulted in 22,844 unique entries.
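
    The selection and cleaning steps above can be summarised in code. This is an illustrative sketch only, not the authors' pipeline: the record layout (`doi`, `title`, `year`, `surnames`, `article_id`, `country_codes`), the text normalisation, and all function names are assumptions made for this example.

```python
def normalise_doi(doi):
    """Lower-case a DOI and strip a doi.org URL prefix (OpenAlex stores URLs)."""
    d = doi.strip().lower()
    for prefix in ("https://doi.org/", "http://doi.org/"):
        if d.startswith(prefix):
            d = d[len(prefix):]
    return d


def normalise_title(title):
    """Lower-case a title and collapse whitespace (assumed normalisation)."""
    return " ".join(title.lower().split())


def match_records(wos_records, openalex_records):
    """Match WoS entries to OpenAlex entries, then deduplicate by article ID."""
    by_doi = {normalise_doi(r["doi"]): r for r in openalex_records if r.get("doi")}
    by_title_year = {}
    for r in openalex_records:
        key = (normalise_title(r["title"]), r["year"])
        by_title_year.setdefault(key, []).append(r)

    matched = {}
    for w in wos_records:
        if w.get("doi"):
            # Dataset 1: entries with a DOI are matched by DOI.
            hit = by_doi.get(normalise_doi(w["doi"]))
            candidates = [hit] if hit else []
        else:
            # Dataset 2: match by title and year, then compare surname lists.
            wanted = sorted(s.lower() for s in w["surnames"])
            key = (normalise_title(w["title"]), w["year"])
            candidates = [
                c for c in by_title_year.get(key, [])
                if sorted(s.lower() for s in c["surnames"]) == wanted
            ]
        for c in candidates:
            matched[c["article_id"]] = c  # keying by ID removes duplicates
    return list(matched.values())


def has_both_affiliations(record):
    """Keep only entries with both German and Indian affiliations."""
    return {"DE", "IN"} <= set(record["country_codes"])
```

    The final dataset would then correspond to `[r for r in match_records(wos, openalex) if has_both_affiliations(r)]`.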

    Column Descriptions

    These descriptions are relevant summaries or extracts from the documentation at
    https://docs.openalex.org/api-entities/works/work-object,
    https://docs.openalex.org/api-entities/authors/author-object and
    https://docs.openalex.org/api-entities/institutions/institution-object.

    • article_id
      (Work attribute)

    • doi
      (Work attribute)

      • Digital Object Identifier for the work.
        Consists of a URL to doi.org

    • title
      (Work attribute)

      • Title of the work.

    • article_display_name
      (Work attribute)

      • Duplicate of "title" column, retained to match other OpenAlex objects' attribute.

    • publication_year
      (Work attribute)

      • The year in which the work was published.
        Please note that this is respective to the version of the work captured by OpenAlex as this particular entry. Other and potentially earlier published versions may be accessible in the work's location field, accessible from OpenAlex.

    • publication_date
      (Work attribute)

      • An ISO 8601 formatted date for the publication of the work.
        The same caveat to publication_year applies to publication_date.

    • article_type
      (Work attribute)

      • Type of work.
        E.g. Article, conference paper, report, dataset, etc.

    • article_type_crossref
      (Work attribute)

      • Legacy type information inherited from Crossref.

    • article_cited_by_count
      (Work attribute)

      • Number of citations to the work.

    • article_cited_by_api_url
      (Work attribute)

      • An OpenAlex URL that allows the user to view the works which cite this work.

    • article_grants
      (Work attribute)

      • A list of details for the grants which the work has received.
        This information is gathered from Crossref and is described by OpenAlex at time of publication as "limited".

    • article_referenced_works_count
      (Work attribute)

      • Number of works within OpenAlex that this work cites.
        Please note that the total number of references in the work may be higher.

    • language
      (Work attribute)

      • The ISO 639-1 style Language of the work.
        This attribute is inferred by a software library (langdetect) used by OpenAlex based on the abstract, or title if the abstract is not available.

    • article_counts_by_year
      (Work attribute)

      • A list of the citation count of this work per year, for up to the last 10 years.

    • article_locations_count
      (Work attribute)

      • Number of locations this work can be found.
        In OpenAlex, "locations" refer to the places on the internet where versions of this work are accessible.

    • author_id
      (Author attribute)

      • OpenAlex identifier for an author of the work.
        To retrieve OpenAlex's bibliography for this user you may visit https://openalex.org/authors/.
        The following author attributes are associated with the author identifier in each row. Please note that a work with multiple authors may have multiple rows, one for each author in OpenAlex.

    • orcid
      (Author attribute)

      • ORCID identifier for the author.

    • author_name
      (Author attribute)

      • Name of the author.

    • author_name_alternatives
      (Author attribute)

      • Alternative formats for the author's name which OpenAlex has observed.

    • author_works_count
      (Author attribute)

      • Number of works the author has created.

    • author_cited_by_count
      (Author attribute)

      • Number of works which cite a work the author has created.

    • author_last_known_institution
      (Author attribute)

      • Identifier for the institution with which the author is affiliated in the most recent publication from the author containing an institutional identifier.
        Please note this may differ from the institution associated with the author at time of the work's release, which is listed in this database as "institution_id".

    • author_summary_stats
      (Author attribute)

      • OpenAlex's citation metrics for the author.
        These include citation count, i10-index, h-index and more.

    • institution_id
      (Institution attribute)

      • OpenAlex identifier for the institution associated with the author when the work was published.

    • ror
      (Institution attribute)

      • Research Organization Registry (ROR) identifier for the institution.

    • institution_name
      (Institution attribute)

      • Name of the institution.

    • institution_country_code
      (Institution attribute)

      • ISO 3166-1 Alpha-2 (two-letter) country code for the country in which the institution is located.

    • insitution_type
      (Institution attribute)

      • ROR-style primary type for the institution.

    • institution_homepage_url
      (Institution attribute)

      • A URL for the institution's primary homepage

    • institution_display_name_acroynyms
      (Institution attribute)

      • Known acronyms or initialisms for the institution.

    • institution_display_name_alternatives
      (Institution attribute)

      • Alternative names for the institution.

    • institution_works_count
      (Institution attribute)

      • The number of works created by authors affiliated with this institution.

    • institution_cited_by_count
      (Institution attribute)

      • The number of works that cite a work created by authors affiliated with the institution.

    • institution_summary_stats
      (Institution attribute)

      • Citation metrics for the institution
        Similar to author_summary_stats.
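
    Because a work with several authors occupies several rows (one per author), counts over the table need to be grouped by `article_id` first. The sketch below shows one way to do this on rows loaded as dictionaries, for instance via `csv.DictReader`; the function names are hypothetical and only the `article_id` and `author_id` columns from the list above are used.

```python
def unique_works(rows):
    """Return one row per article_id, keeping the first occurrence."""
    first_seen = {}
    for row in rows:
        first_seen.setdefault(row["article_id"], row)
    return list(first_seen.values())


def authors_per_work(rows):
    """Map each article_id to the set of author_id values listed for it."""
    authors = {}
    for row in rows:
        authors.setdefault(row["article_id"], set()).add(row["author_id"])
    return authors
```

    Work-level attributes such as `article_cited_by_count` are repeated on every author row, so summing them without deduplicating first would overcount.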

    Information

    Contact Information

    Nina Smirnova - nina.smirnova@gesis.org

    Publication Information

    Released to Zenodo 1st Feb

  6. ‘Shark Tank India Companies’ analyzed by Analyst-2

    • analyst-2.ai
    Updated Feb 13, 2022
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com) (2022). ‘Shark Tank India Companies’ analyzed by Analyst-2 [Dataset]. https://analyst-2.ai/analysis/kaggle-shark-tank-india-companies-8330/latest
    Dataset updated
    Feb 13, 2022
    Dataset authored and provided by
    Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com)
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Analysis of ‘Shark Tank India Companies’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://www.kaggle.com/devanshu125/shark-tank-india-companies on 13 February 2022.

    --- Dataset description provided by original source is as follows ---

    Context

    Recently, I saw a dataset based on Shark Tank USA. This dataset inspired me to create one for India as well and since season 1 recently ended, I thought this was the perfect time to look at some insights based on the deals.

    Content

    This dataset contains the following information:

    1. episode - episode number
    2. pitch_no - pitch number (unique)
    3. company - company name
    4. idea - company description
    5. deal - final deal that was taken
    6. ashneer - Did Ashneer invest?
    7. namita - Did Namita invest?
    8. anupam - Did Anupam invest?
    9. vineeta - Did Vineeta invest?
    10. aman - Did Aman invest?
    11. peyush - Did Peyush invest?
    12. ghazal - Did Ghazal invest?

    Acknowledgements

    This data was scraped from Wikipedia.

    --- Original source retains full ownership of the source dataset ---

  7. Global Dated Landslide Data Base during Sentinel-2 satellite data...

    • b2find.eudat.eu
    Updated Apr 17, 2024
    Cite
    (2024). Global Dated Landslide Data Base during Sentinel-2 satellite data availability - Dataset - B2FIND [Dataset]. https://b2find.eudat.eu/dataset/bb7035e6-f42f-5f42-be92-979ed10665a6
    Dataset updated
    Apr 17, 2024
    Description

    This Global Dated Landslide Database (GDLDB) is part of the project WeMonitor (Weakly Supervised Deep Learning Models for Detecting and Monitoring Spatio-Temporal Anomalies in Optical and Radar Satellite Time Series), funded by the Helmholtz Imaging Platform. The aim is to develop a deep learning model that uses satellite image time series from Sentinel1/2 to automatically monitor changes caused, for example, by landslides, deforestation, large fires, dam failures, or the emergence of waste dumps. To train such a model, a reference dataset is required that shows the area and date of the changes as precise as possible. To allow for a generic and transferable model, the reference data also needs to cover the diversity of the process to be detected. Thus, the aim of the GDLDB is to comprise landslides of different sizes, shapes, and types, occurring at different seasons and in different regions with varying natural conditions and different triggering mechanisms such as rainfall and earthquake-induced landslides. To build the GDLDB, available local and regional landslide inventories from around the world are combined into one coherent database by verifying their location and date of occurrence with high-resolution remote sensing data. The selection criteria for the source inventories are the definition of the landslide location as polygons, at least a rough indication of the landslide origin date, and that the landslides occurred during the Sentinel-2 data availability from 2016 onwards. A total of 16 individual inventories are included (Table 1), one each from the USA, Dominica, Italy, Zimbabwe, southern India, Nepal, China, Papua New Guinea, and New Zealand, and two each from Kyrgyzstan, Japan, and the Philippines. In addition, a global inventory was added, including a small number of landslides from the USA, Peru, Chile, Europe, Pakistan, Nepal, India, and Taiwan, and a larger number of landslides from Indonesia. 
From each inventory, approximately 100 landslides were randomly selected to ensure an unbiased selection of landslides in terms of shape, size, and location. The original source inventories were produced using a variety of methods, including manual mapping in airborne data with ground verification and automatic identification in satellite remote sensing data. As a result, the mapping quality of the inventories varies greatly. In cases where we could not verify landslides using available optical remote sensing data (e.g. Sentinel-2, Planet Scope, and data available in Google Earth), new polygons were selected until the number of approximately 100 landslides was reached. In some inventories, the number of 100 landslides could not be guaranteed, either due to a lack of suitable landslides (e.g., small size, incorrect classification) or because the total number of landslides in the selected inventory was less than 100. For inventories with many small landslides that were difficult or impossible to observe, a size threshold of 1000 m² was introduced.
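
    The sampling procedure described above (random selection, replacement of unverifiable polygons, optional size threshold) could be sketched as follows. This is an illustration under assumptions, not the GDLDB authors' code: the `area_m2` field, the `is_verifiable` callback standing in for manual verification against imagery, and the function name are all hypothetical.

```python
import random


def select_landslides(inventory, is_verifiable, target=100, min_area_m2=None, seed=0):
    """Randomly pick up to `target` verifiable landslides from one inventory.

    inventory: list of dicts with an 'area_m2' key (hypothetical layout).
    is_verifiable: callable standing in for the manual check against imagery.
    min_area_m2: optional size threshold, e.g. 1000 for inventories dominated
    by small landslides.
    """
    candidates = [
        ls for ls in inventory
        if min_area_m2 is None or ls["area_m2"] >= min_area_m2
    ]
    rng = random.Random(seed)
    rng.shuffle(candidates)  # random order gives an unbiased draw

    selected = []
    for landslide in candidates:
        # Unverifiable polygons are skipped, so the next random one replaces them.
        if is_verifiable(landslide):
            selected.append(landslide)
            if len(selected) == target:
                break
    return selected  # may hold fewer than `target` for small inventories
```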

  8. Vadu HDSS INDEPTH Core Dataset 2009 - 2015 (Release 2017) - India

    • datacatalog.ihsn.org
    • catalog.ihsn.org
    Updated Mar 29, 2019
    Cite
    Dr. Siddhivinayak Hirve (Founding Investigator: from 2002-2009) (2019). Vadu HDSS INDEPTH Core Dataset 2009 - 2015 (Release 2017) - India [Dataset]. https://datacatalog.ihsn.org/catalog/study/IND_2009-2015_INDEPTH-VHDSS_v01_M
    Dataset updated
    Mar 29, 2019
    Dataset provided by
    Dr. Sanjay Juvekar (Founding Co-Investigator and presently Investigator: 2002 to date)
    Dr. Siddhivinayak Hirve (Founding Investigator: from 2002-2009)
    Time period covered
    2009 - 2015
    Area covered
    India
    Description

    Abstract

    The Vadu Rural Health Program of KEM Hospital Research Centre, Pune, has a rich tradition in health care and development and has been at the forefront of needs-based, issue-driven research for almost 35 years. During the 1980s and 1990s, research at Vadu focused on mother and child health, with epidemiological and social science research exploring low birth weight, child survival, maternal mortality, safe abortion and domestic violence. The research portfolio has since expanded to include adult health and aging, non-communicable and communicable diseases and, in recent years, clinical trials. This began with the establishment of the Health and Demographic Surveillance System at Vadu (HDSS Vadu) in August 2002, which seeks to establish a quasi-experimental design setting to allow evaluation of the impact of health interventions and to monitor secular trends in diseases, risk factors and health behavior.

    The term "demographic surveillance" means keeping close track of population dynamics. Vadu HDSS keeps track of health issues and demographic changes in the Vadu Rural Health Program (VRHP) area. It is one of the most promising projects of national relevance and aims at establishing a quasi-experimental intervention research setting with the following objectives: 1) to create a longitudinal database for efficient service delivery, future research, and linking all past micro-studies in the Vadu area; 2) to monitor trends in public health problems; 3) to keep track of population dynamics; 4) to evaluate intervention services.

    This dataset contains the events of all individuals ever resident during the study period (1 Jan. 2009 to 31 Dec. 2015).

    Geographic coverage

    Vadu HDSS falls in two administrative blocks: (1) Shirur and (2) Haweli of Pune district in Maharashtra in western India. It covers an area of approximately 232 square kilometers.

    Analysis unit

    Individual

    Universe

    Vadu HDSS covers as many as 50,000 households with a population of 140,000 spread across 22 villages.

    Kind of data

    Event history data

    Frequency of data collection

    Two rounds per year

    Sampling procedure

    The study area is the Vadu area, comprising 22 villages in two administrative blocks. This area was selected because it is primarily the coverage area of the Vadu Rural Health Program, which has been functioning for more than four decades. Every individual household is included in the HDSS. No sampling strategy is employed, as 100% population coverage in the area is expected.

    Mode of data collection

    Proxy Respondent [proxy]

    Research instrument

    The language of communication is Marathi or Hindi. The form labels are multilingual (English and Marathi), but the data entered through the forms are in English only.

    The following forms were used:
    - Field Worker Checklist Form: The checklist provides a guideline to ensure that all the households are covered during the round and the events that occurred in each household are captured.
    - Enumeration Form: To capture the population details at the start of the HDSS or any addition of villages afterwards.
    - Pregnancy Form: To capture pregnancy details of women in the age group 15 to 49.
    - Birth Form: To capture the details of the birth events.
    - Inmigration Form: To capture inward population movement from outside the HDSS area and also for movement within the HDSS area.
    - Outmigration Form: To capture outward population movement from inside the HDSS area and also for movement within the HDSS area.
    - Death Form: To capture death events.

    Cleaning operations

    Entered data undergo a data cleaning process. During cleaning, all erroneous data are either corrected in consultation with the data QC team, or the respective forms are sent back to the field for re-collection of correct data. Data editors have access to the raw dataset to make the necessary edits after corrected data are brought from the field.

    All individuals whose enumeration (ENU), inmigration (IMG) or birth (BTH) occurred before the left censoring date (2009-01-01) and who had not outmigrated (OMG) or died (DTH) before that date are included in the dataset as Enumeration (ENU) events with EventDate set to the left censoring date (2009-01-01). The actual date of observation of the event (ENU, BTH, IMG) is retained in the dataset as the observation date for these left-censored ENU events. An individual is dropped from the dataset if their end event (OMG or DTH) is prior to the left censoring date (2009-01-01).
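
The left-censoring rule above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the HDSS pipeline itself: the dictionary keys `code`, `date` and `obs_date` are hypothetical, and using the most recent start event before the cutoff as the retained observation date is our own reading of the rule.

```python
from datetime import date

LEFT_CENSOR = date(2009, 1, 1)
START_EVENTS = {"ENU", "IMG", "BTH"}  # events that begin residency
END_EVENTS = {"OMG", "DTH"}           # events that end residency


def left_censor(events, cutoff=LEFT_CENSOR):
    """Apply left censoring to one individual's event history.

    events: list of dicts with hypothetical keys "code" and "date".
    Returns a new list; an empty list means the individual is dropped.
    """
    events = sorted(events, key=lambda e: e["date"])
    before = [
        e for e in events
        if e["date"] < cutoff and e["code"] in START_EVENTS | END_EVENTS
    ]
    # Events on or after the cutoff are kept unchanged.
    kept = [dict(e, obs_date=e["date"]) for e in events if e["date"] >= cutoff]

    if before and before[-1]["code"] in START_EVENTS:
        # Resident at the cutoff: replace the earlier history with a single
        # ENU event dated at the cutoff, keeping the real date as obs_date.
        entry = {"code": "ENU", "date": cutoff, "obs_date": before[-1]["date"]}
        kept.insert(0, entry)
    return kept


born_early = left_censor([
    {"code": "BTH", "date": date(2005, 3, 1)},
    {"code": "DTH", "date": date(2012, 6, 1)},
])
left_early = left_censor([
    {"code": "ENU", "date": date(2003, 1, 1)},
    {"code": "OMG", "date": date(2007, 5, 1)},
])
```

In this sketch `born_early` starts with an ENU event dated 2009-01-01 whose `obs_date` is the real birth date, while `left_early` is empty because the individual outmigrated before the cutoff.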

    Response rate

    On average, the response rate is 99.99% in all rounds over the years.

    Sampling error estimates

    Not Applicable

    Data appraisal

    Data is cleaned to an acceptable level against the standard data rules using the Pentaho Data Integration Community Edition (PDI CE) tool. After the cleaning process, quality metrics were as follows:

    | CentreId | MetricTable | QMetric | Illegal | Legal | Total | Metric | RunDate |
    | -- | -- | -- | -- | -- | -- | -- | -- |
    | IN021 | MicroDataCleaned | Starts | 1 | 301112 | 301113 | 0. | 2017-05-31 20:06 |
    | IN021 | MicroDataCleaned | Transitions | 0 | 667010 | 667010 | 0. | 2017-05-31 20:07 |
    | IN021 | MicroDataCleaned | Ends | | | 301113 | | 2017-05-31 20:07 |
    | IN021 | MicroDataCleaned | SexValues | 29 | 666981 | 667010 | 0. | 2017-05-31 20:07 |
    | IN021 | MicroDataCleaned | DoBValues | 575 | 666435 | 667010 | 0. | 2017-05-31 20:07 |
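
As a quick sanity check on the figures above, the sketch below confirms that Illegal plus Legal equals Total for each complete row and computes the share of illegal values. The Metric column is truncated in this listing, so the ratio here is our own illustration and not necessarily how that column was computed; the function name is hypothetical.

```python
# (QMetric, Illegal, Legal, Total) for the rows that list all three counts.
rows = [
    ("Starts", 1, 301112, 301113),
    ("Transitions", 0, 667010, 667010),
    ("SexValues", 29, 666981, 667010),
    ("DoBValues", 575, 666435, 667010),
]


def illegal_share(illegal, legal, total):
    """Return illegal / total after checking that the counts are consistent."""
    if illegal + legal != total:
        raise ValueError("Illegal + Legal does not match Total")
    return illegal / total


shares = {name: illegal_share(i, l, t) for name, i, l, t in rows}

for name, share in shares.items():
    print(f"{name}: {share:.4%} illegal")
```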

    Note: Except for lower under-five mortality in 2012 and lower adult mortality among females in 2013, all other estimates are fairly within the expected range. The data underwent additional review of electronic data capture, data cleaning and management to look for reasons for the lower under-five mortality rates in 2013 and lower female adult mortality in 2013. The additional review returned marginally higher rates, which supports the validity of the collected data. Further field-related review of the 2012 and 2013 data is underway, and any revisions to published data/figures will be shared at a later stage.

  9. Hindi Wake Words & Voice Commands Speech Data

    • futurebeeai.com
    wav
    Updated Aug 1, 2022
    Cite
    FutureBee AI (2022). Hindi Wake Words & Voice Commands Speech Data [Dataset]. https://www.futurebeeai.com/dataset/wake-words-and-commands-dataset/wake-words-and-commands-hindi-india
    Explore at:
    Available download formats: wav
    Dataset updated
    Aug 1, 2022
    Dataset provided by
    FutureBeeAI
    Authors
    FutureBee AI
    License

    https://www.futurebeeai.com/policies/ai-data-license-agreement

    Dataset funded by
    FutureBeeAI
    Description

    Introduction

    The Hindi Wake Word & Voice Command Dataset is expertly curated to support the training and development of voice-activated systems. This dataset includes a large collection of wake words and command phrases, essential for enabling seamless user interaction with voice assistants and other speech-enabled technologies. It’s designed to ensure accurate wake word detection and voice command recognition, enhancing overall system performance and user experience.

    Speech Data

    This dataset includes 20,000+ audio recordings of wake words and command phrases. Each participant contributed 400 recordings, captured under varied environmental conditions and speaking speeds. The data covers:

    Wake words alone
    Wake words followed by command phrases

    Participant Diversity

    Speakers: 50 native Hindi speakers from the FutureBeeAI community
    Regions: Participants from various Indian provinces, ensuring broad coverage of accents and dialects
    Demographics: Ages 18–70; 60% male and 40% female participants

    Recording Details

    Type: Scripted wake words and command phrases
    Duration: 1 to 15 seconds per clip
    Format: WAV, stereo, 16-bit, with sample rates ranging from 16 kHz to 48 kHz
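
A buyer of such data might want to confirm that each clip matches the recording details listed above. The following sketch uses only Python's standard `wave` module; the function names and the in-memory test clip are our own assumptions, not part of the dataset or any vendor tooling.

```python
import io
import wave


def check_wav(source, min_rate=16_000, max_rate=48_000):
    """Read basic WAV properties and compare them with the listed spec
    (stereo, 16-bit, 16-48 kHz, 1-15 seconds)."""
    with wave.open(source, "rb") as wav:
        channels = wav.getnchannels()
        bits = wav.getsampwidth() * 8  # sample width is reported in bytes
        rate = wav.getframerate()
        seconds = wav.getnframes() / rate
    matches = (
        channels == 2
        and bits == 16
        and min_rate <= rate <= max_rate
        and 1 <= seconds <= 15
    )
    return {"channels": channels, "bits": bits, "rate": rate,
            "seconds": seconds, "matches_spec": matches}


def make_silent_wav(seconds=1, rate=16_000):
    """Build a silent stereo 16-bit clip in memory, for demonstration only."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(2)
        wav.setsampwidth(2)
        wav.setframerate(rate)
        # 2 channels x 2 bytes per sample for every frame
        wav.writeframes(b"\x00" * (rate * seconds * 2 * 2))
    buffer.seek(0)
    return buffer


info = check_wav(make_silent_wav(seconds=2, rate=44_100))
```

`check_wav` also accepts a file path, so the same function could be run over a folder of downloaded clips.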

    Dataset Diversity

    Wake Word Types
    Automobile Wake Words: Hey Mercedes, Hey BMW, Hey Porsche, Hey Volvo, Hey Audi, Hi Genesis, Ok Ford, etc.
    Voice Assistant Wake Words: Hey Siri, Ok Google, Alexa, Hey Cortana, Hi Bixby, Hey Celia, etc.
    Home Appliance Wake Words: Hi LG, Ok LG, Hello Lloyd, and more
    Command Types by Use Case
    Automobile: Play music, check directions, voice search, provide feedback, and more
    Voice Assistant: Ask general questions, make calls, control devices, shopping, manage calendars, and more
    Home Appliances: Control appliances, check status, set reminders/alarms, manage shopping lists, etc.
    Recording Environments
    No background noise
    Background traffic noise
    People talking in the background
    Speaking Pace
    Normal speed
    Fast speed

    This diversity ensures robust training for real-world voice assistant applications.

    Metadata

    Each audio file is accompanied by detailed metadata to support advanced filtering and training needs.

    Participant Metadata: Unique ID, age, gender, region, accent, dialect
    Recording Metadata: Transcript, environment, pace, device used, sample rate, bit depth, file format

    Use Cases & Applications

    Voice Assistant Activation: Train models to accurately detect and trigger based on wake words
    Smart Home Devices: Enable responsive voice control in smart appliances

  10. India - National Family Health Survey 1998-1999 - Dataset - waterdata

    • wbwaterdata.org
    Updated Mar 16, 2020
    Cite
    (2020). India - National Family Health Survey 1998-1999 - Dataset - waterdata [Dataset]. https://wbwaterdata.org/dataset/india-national-family-health-survey-1998-1999
    Dataset updated
    Mar 16, 2020
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    India
    Description

    The second National Family Health Survey (NFHS-2), conducted in 1998-99, provides information on fertility, mortality, family planning, and important aspects of nutrition, health, and health care. The International Institute for Population Sciences (IIPS) coordinated the survey, which collected information from a nationally representative sample of more than 90,000 ever-married women age 15-49. The NFHS-2 sample covers 99 percent of India's population living in all 26 states. This report, however, is based on the survey data for 25 of the 26 states, since data collection in Tripura was delayed due to local problems in the state. IIPS also coordinated the first National Family Health Survey (NFHS-1) in 1992-93. Most of the types of information collected in NFHS-2 were also collected in the earlier survey, making it possible to identify trends over the intervening period of six and one-half years. In addition, the NFHS-2 questionnaire covered a number of new or expanded topics with important policy implications, such as reproductive health, women's autonomy, domestic violence, women's nutrition, anaemia, and salt iodization. The NFHS-2 survey was carried out in two phases. Ten states were surveyed in the first phase, which began in November 1998, and the remaining states (except Tripura) were surveyed in the second phase, which began in March 1999. The field staff collected information from 91,196 households in these 25 states and interviewed 89,199 eligible women in these households. In addition, the survey collected information on 32,393 children born in the three years preceding the survey. One health investigator on each survey team measured the height and weight of eligible women and children and took blood samples to assess the prevalence of anaemia.

    SUMMARY OF FINDINGS

    POPULATION CHARACTERISTICS

    Three-quarters (73 percent) of the population lives in rural areas. 
The age distribution is typical of populations that have recently experienced a fertility decline, with relatively low proportions in the younger and older age groups. Thirty-six percent of the population is below age 15, and 5 percent is age 65 and above. The sex ratio is 957 females for every 1,000 males in rural areas but only 928 females for every 1,000 males in urban areas, suggesting that more men than women have migrated to urban areas. The survey provides a variety of demographic and socioeconomic background information. In the country as a whole, 82 percent of household heads are Hindu, 12 percent are Muslim, 3 percent are Christian, and 2 percent are Sikh. Muslims live disproportionately in urban areas, where they comprise 15 percent of household heads. Nineteen percent of household heads belong to scheduled castes, 9 percent belong to scheduled tribes, and 32 percent belong to other backward classes (OBCs). Two-fifths of household heads do not belong to any of these groups. Questions about housing conditions and the standard of living of households indicate some improvements since the time of NFHS-1. Sixty percent of households in India now have electricity and 39 percent have piped drinking water compared with 51 percent and 33 percent, respectively, at the time of NFHS-1. Sixty-four percent of households have no toilet facility compared with 70 percent at the time of NFHS-1. About three-fourths (75 percent) of males and half (51 percent) of females age six and above are literate, an increase of 6-8 percentage points from literacy rates at the time of NFHS-1. The percentage of illiterate males varies from 6-7 percent in Mizoram and Kerala to 37 percent in Bihar and the percentage of illiterate females varies from 11 percent in Mizoram and 15 percent in Kerala to 65 percent in Bihar. Seventy-nine percent of children age 6-14 are attending school, up from 68 percent in NFHS-1. 
The proportion of children attending school has increased for all ages, particularly for girls, but girls continue to lag behind boys in school attendance. Moreover, the disparity in school attendance by sex grows with increasing age of children. At age 6-10, 85 percent of boys attend school compared with 78 percent of girls. By age 15-17, 58 percent of boys attend school compared with 40 percent of girls. The percentage of girls 6-17 attending school varies from 51 percent in Bihar and 56 percent in Rajasthan to over 90 percent in Himachal Pradesh and Kerala. Women in India tend to marry at an early age. Thirty-four percent of women age 15-19 are already married including 4 percent who are married but gauna has yet to be performed. These proportions are even higher in the rural areas. Older women are more likely than younger women to have married at an early age: 39 percent of women currently age 45-49 married before age 15 compared with 14 percent of women currently age 15-19. Although this indicates that the proportion of women who marry young is declining rapidly, half the women even in the age group 20-24 have married before reaching the legal minimum age of 18 years. On average, women are five years younger than the men they marry. The median age at marriage varies from about 15 years in Madhya Pradesh, Bihar, Uttar Pradesh, Rajasthan, and Andhra Pradesh to 23 years in Goa. As part of an increasing emphasis on gender issues, NFHS-2 asked women about their participation in household decisionmaking. In India, 91 percent of women are involved in decision-making on at least one of four selected topics. A much lower proportion (52 percent), however, are involved in making decisions about their own health care. There are large variations among states in India with regard to women's involvement in household decisionmaking. 
More than three out of four women are involved in decisions about their own health care in Himachal Pradesh, Meghalaya, and Punjab compared with about two out of five or less in Madhya Pradesh, Orissa, and Rajasthan. Thirty-nine percent of women do work other than housework, and more than two-thirds of these women work for cash. Only 41 percent of women who earn cash can decide independently how to spend the money that they earn. Forty-three percent of working women report that their earnings constitute at least half of total family earnings, including 18 percent who report that the family is entirely dependent on their earnings. Women's work-participation rates vary from 9 percent in Punjab and 13 percent in Haryana to 60-70 percent in Manipur, Nagaland, and Arunachal Pradesh.

    FERTILITY AND FAMILY PLANNING

    Fertility continues to decline in India. At current fertility levels, women will have an average of 2.9 children each throughout their childbearing years. The total fertility rate (TFR) is down from 3.4 children per woman at the time of NFHS-1, but is still well above the replacement level of just over two children per woman. There are large variations in fertility among the states in India. Goa and Kerala have attained below replacement level fertility and Karnataka, Himachal Pradesh, Tamil Nadu, and Punjab are at or close to replacement level fertility. By contrast, fertility is 3.3 or more children per woman in Meghalaya, Uttar Pradesh, Rajasthan, Nagaland, Bihar, and Madhya Pradesh. More than one-third to less than half of all births in these latter states are fourth or higher-order births compared with 7-9 percent of births in Kerala, Goa, and Tamil Nadu. Efforts to encourage the trend towards lower fertility might usefully focus on groups within the population that have higher fertility than average. 
In India, rural women and women from scheduled tribes and scheduled castes have somewhat higher fertility than other women, but fertility is particularly high for illiterate women, poor women, and Muslim women. Another striking feature is the high level of childbearing among young women. More than half of women age 20-49 had their first birth before reaching age 20, and women age 15-19 account for almost one-fifth of total fertility. Studies in India and elsewhere have shown that health and mortality risks increase when women give birth at such young ages, both for the women themselves and for their children. Family planning programmes focusing on women in this age group could make a significant impact on maternal and child health and help to reduce fertility.

    INFANT AND CHILD MORTALITY

    NFHS-2 provides estimates of infant and child mortality and examines factors associated with the survival of young children. During the five years preceding the survey, the infant mortality rate was 68 deaths at age 0-11 months per 1,000 live births, substantially lower than 79 per 1,000 in the five years preceding the NFHS-1 survey. The child mortality rate, 29 deaths at age 1-4 years per 1,000 children reaching age one, also declined from the corresponding rate of 33 per 1,000 in NFHS-1. Ninety-five children out of 1,000 born do not live to age five years. Expressed differently, 1 in 15 children die in the first year of life, and 1 in 11 die before reaching age five. Child-survival programmes might usefully focus on specific groups of children with particularly high infant and child mortality rates, such as children who live in rural areas, children whose mothers are illiterate, children belonging to scheduled castes or scheduled tribes, and children from poor households. 
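
The mortality figures quoted above fit together arithmetically, which the short sketch below makes explicit. It is our own worked check under the stated definitions (the child rate applies only to children who survive to age one); the variable names are illustrative and the survey report may have derived its under-five figure differently.

```python
infant_rate = 68  # deaths at age 0-11 months per 1,000 live births
child_rate = 29   # deaths at age 1-4 years per 1,000 children reaching age one

# Of 1,000 live births, this many reach their first birthday.
survivors_to_one = 1000 - infant_rate

# Child deaths apply only to those survivors.
child_deaths = survivors_to_one * child_rate / 1000

# Deaths before age five per 1,000 live births.
under_five_rate = infant_rate + child_deaths

# "1 in N" restatements of the same rates.
one_in_infant = 1000 / infant_rate
one_in_under_five = 1000 / under_five_rate

print(round(under_five_rate), round(one_in_infant), round(one_in_under_five))
```

Rounded, these values agree with the 95 per 1,000, 1 in 15, and 1 in 11 figures given in the text.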
Infant mortality rates are more than two and one-half times as high for mothers who did not receive any of the recommended types of maternity-related medical care as for mothers who did receive all recommended types of care.

    HEALTH, HEALTH CARE, AND NUTRITION

    Promotion of maternal and child health has been one of the most important components of the Family Welfare Programme of the Government of India. One goal is for each pregnant woman to receive at least three antenatal check-ups plus two tetanus toxoid injections and a full course of iron and folic acid supplementation. In India, mothers of 65 percent of the children born in the three years preceding NFHS-2 received at least one antenatal

  11. Waste Management and Recycling in Indian Cities

    • kaggle.com
    Updated Dec 15, 2024
    Cite
    Krishna Yadu (2024). Waste Management and Recycling in Indian Cities [Dataset]. http://doi.org/10.34740/kaggle/dsv/10203312
    Explore at:
    Croissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Dec 15, 2024
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Krishna Yadu
    License

    MIT License: https://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    About the Dataset: Waste Management and Recycling in India

    Overview:

    This dataset provides comprehensive information on waste management and recycling practices in various cities across India. It includes key data related to waste generation, recycling rates, population density, municipal efficiency, landfill details, and more. The data spans multiple years (2019–2023) and covers a range of waste types, including plastic, organic waste, electronic waste (e-waste), construction waste, and hazardous waste.

    Purpose:

    The dataset aims to:

    • Promote efficient waste management practices across Indian cities.
    • Analyze trends in recycling and waste disposal methods.
    • Provide insights for improving municipal management systems.
    • Support research and development in sustainability, environmental science, and urban planning.

    Columns:

    1. City/District: The name of the Indian city or district.
    2. Waste Type: Type of waste generated, e.g., Plastic, Organic, E-Waste, Construction, Hazardous.
    3. Waste Generated (Tons/Day): Amount of waste generated in tons per day.
    4. Recycling Rate (%): The percentage of waste that is recycled.
    5. Population Density (People/km²): The number of people per square kilometer in the city.
    6. Municipal Efficiency Score (1-10): A score indicating how effectively the municipality manages waste (e.g., waste segregation, collection, disposal).
    7. Disposal Method: The method used for waste disposal (e.g., Landfill, Recycling, Incineration, Composting).
    8. Cost of Waste Management (₹/Ton): The cost of managing one ton of waste in Indian Rupees.
    9. Awareness Campaigns Count: The number of awareness campaigns organized by the municipality in that year related to waste management.
    10. Landfill Name: The name of the landfill site used by the city.
    11. Landfill Location (Lat, Long): The geographical location (latitude and longitude) of the landfill.
    12. Landfill Capacity (Tons): The total waste capacity (in tons) that the landfill can hold.
    13. Year: The year of the data entry, ranging from 2019 to 2023.
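
To show how the columns above relate to each other, here is a small sketch that derives recycled tonnage and daily cost from a record. The class, the snake_case field names and the two sample rows are invented for illustration; they are not taken from the dataset file.

```python
from dataclasses import dataclass


@dataclass
class WasteRecord:
    city: str
    waste_type: str
    waste_generated_tpd: float  # Waste Generated (Tons/Day)
    recycling_rate_pct: float   # Recycling Rate (%)
    cost_per_ton_inr: float     # Cost of Waste Management (₹/Ton)
    year: int

    def recycled_tpd(self) -> float:
        """Tons per day that are recycled."""
        return self.waste_generated_tpd * self.recycling_rate_pct / 100

    def daily_cost_inr(self) -> float:
        """Daily management cost in rupees."""
        return self.waste_generated_tpd * self.cost_per_ton_inr


def recycled_by_city(records):
    """Sum recycled tons per day across waste types for each city."""
    totals = {}
    for record in records:
        totals[record.city] = totals.get(record.city, 0.0) + record.recycled_tpd()
    return totals


# Made-up sample rows, not real dataset values.
sample = [
    WasteRecord("City A", "Plastic", 100.0, 50.0, 2000.0, 2023),
    WasteRecord("City A", "Organic", 200.0, 25.0, 1000.0, 2023),
]
totals = recycled_by_city(sample)
```

Since the dataset's own Sources section says the data is simulated, any totals computed this way describe the simulation rather than real municipal performance.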

    Applications:

    • Urban Planning: The dataset can be used to analyze and optimize waste management infrastructure in urban areas.
    • Sustainability Research: It can help in studying the progress of recycling and waste reduction strategies.
    • Policy Making: Government bodies can use this data to craft policies aimed at improving waste management and recycling rates.
    • Machine Learning/AI: The dataset can be used to build models for predicting waste generation trends, recycling outcomes, and municipal efficiency.

    Sources:

    • The data is simulated for this dataset based on average waste management practices observed in Indian cities.
    • Real-world data could come from municipal corporations, environmental agencies, and government reports on waste management.
  12. #IndiaNeedsOxygen Tweets

    • kaggle.com
    zip
    Updated Nov 14, 2021
    Cite
    Kash (2021). #IndiaNeedsOxygen Tweets [Dataset]. https://www.kaggle.com/kaushiksuresh147/indianeedsoxygen-tweets
    Explore at:
    Available download formats: zip (4441094 bytes)
    Dataset updated
    Nov 14, 2021
    Authors
    Kash
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    India marks one COVID-19 death every 5 minutes

    Content

    People across India scrambled for life-saving oxygen supplies on Friday and patients lay dying outside hospitals as the capital recorded the equivalent of one death from COVID-19 every five minutes.

    For the second day running, the country’s overnight infection total was higher than ever recorded anywhere in the world since the pandemic began last year, at 332,730.

    India’s second wave has hit with such ferocity that hospitals are running out of oxygen, beds, and anti-viral drugs. Many patients have been turned away because there was no space for them, doctors in Delhi said.

    Mass cremations have been taking place as the crematoriums have run out of space. Ambulance sirens sounded throughout the day in the deserted streets of the capital, one of India’s worst-hit cities, where a lockdown is in place to try and stem the transmission of the virus.

    Dataset

    The dataset consists of tweets posted with the #IndiaWantsOxygen hashtag over the past week. It contains 25,440 tweets in total and will be updated on a daily basis.

    The description of the features is given below:

    | No | Columns | Descriptions |
    | -- | -- | -- |
    | 1 | user_name | The name of the user, as they’ve defined it. |
    | 2 | user_location | The user-defined location for this account’s profile. |
    | 3 | user_description | The user-defined UTF-8 string describing their account. |
    | 4 | user_created | Time and date, when the account was created. |
    | 5 | user_followers | The number of followers an account currently has. |
    | 6 | user_friends | The number of friends an account currently has. |
    | 7 | user_favourites | The number of favorites an account currently has |
    | 8 | user_verified | When true, indicates that the user has a verified account |
    | 9 | date | UTC time and date when the Tweet was created |
    | 10 | text | The actual UTF-8 text of the Tweet |
    | 11 | hashtags | All the other hashtags posted in the tweet along with #IndiaWantsOxygen |
    | 12 | source | Utility used to post the Tweet, Tweets from the Twitter website have a source value - web |
    | 13 | is_retweet | Indicates whether this Tweet has been Retweeted by the authenticating user. |
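
A first pass over these columns could look like the sketch below, which counts tweets per posting utility and sums followers of verified accounts. The column names come from the table above, but the sample rows, the True/False encoding of `user_verified`, and the function name are assumptions for illustration; the real file may encode values differently.

```python
import csv
import io
from collections import Counter

# Invented sample rows that reuse a subset of the documented column names.
SAMPLE = """user_name,user_followers,user_verified,source,is_retweet,text
alice,120,False,Twitter for Android,False,Need oxygen in Delhi
bob,5400,True,Twitter Web App,False,Sharing a supplier list
carol,80,False,Twitter for Android,False,Please help
"""


def summarise(csv_file):
    """Count tweets per source and sum followers of verified accounts."""
    sources = Counter()
    verified_followers = 0
    for row in csv.DictReader(csv_file):
        sources[row["source"]] += 1
        # Assumes verification is stored as the text "True" or "False".
        if row["user_verified"].strip().lower() == "true":
            verified_followers += int(float(row["user_followers"]))
    return sources, verified_followers


sources, verified_followers = summarise(io.StringIO(SAMPLE))
```

For the downloaded file, the same function can be given an open file handle instead of the in-memory sample.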

    Acknowledgements

    https://globalnews.ca/news/7785122/india-covid-19-hospitals-record/ Image courtesy: BBC and Reuters

    Inspiration

    The past few days have been really depressing after seeing these incidents. These tweets are the voice of Indians requesting help and of people all over the globe asking their own countries to support India by providing oxygen tanks.

    And I strongly believe that this is not just some data, but the pure emotions of people and their call for help. And I hope we as data scientists could contribute on this front by providing valuable information and insights.

  13. Indian Railways Latest

    • kaggle.com
    Updated Dec 14, 2020
    Cite
    Arihant Jain (2020). Indian Railways Latest [Dataset]. https://www.kaggle.com/datasets/arihantjain09/indian-railways-latest
    Explore at:
    Croissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Dec 14, 2020
    Dataset provided by
    Kaggle
    Authors
    Arihant Jain
    License

    http://www.gnu.org/licenses/old-licenses/gpl-2.0.en.html

    Description

    Because a thorough Indian Railways dataset was not readily available, we decided to create one and give it to the world! There is no missing data in this dataset.

    Content

    We built this dataset using some information from data.gov.in, added distances along with a second table, train_info, and did much more cleaning. The dataset has two files, train_info and train_schedule: train_schedule has more than 186000 rows, while train_info has 11114 rows.

    Acknowledgements

    This dataset was part of our DBMS project, which is hosted and live on http://www.railways.live

    Inspiration

    Your data will be in front of the world's largest data science community. What questions do you want to see answered?

  14. Mammal occurrence records (2022-24) from Sakleshpura, central Western Ghats,...

    • zenodo.org
    • explore.openaire.eu
    bin, csv, jpeg, txt
    Updated Aug 20, 2024
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Vijay Karthick; Vijay Kumar; Anand Osuri (2024). Mammal occurrence records (2022-24) from Sakleshpura, central Western Ghats, India [Dataset]. http://doi.org/10.5281/zenodo.13340613
    Explore at:
    Available download formats: csv, bin, jpeg, txt
    Dataset updated
    Aug 20, 2024
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Vijay Karthick; Vijay Kumar; Anand Osuri
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Western Ghats, Sakleshpura, India
    Description

    Mammal occurrence records (2022-24) from Sakleshpura, central Western Ghats, India

    This dataset contains mammal occurrence records from 2022 to 2024 in the Sakleshpura region of central Western Ghats, India. It includes a few occurrence records of other chordates. Occurrence records were gathered in the field by researchers of the Nature Conservation Foundation, India, using a mobile data collection application. Suggested citation is:
    Nature Conservation Foundation (2024). Mammal occurrence records (2022-24) from Sakleshpura, central Western Ghats, India. Nature Conservation Foundation, India. Dataset

    Keywords: tropical rainforest, plantations, Sakleshpura, animal distribution, Western Ghats

    CONTACT #1
    1. Name: Anand M Osuri
    2. Work Address: Nature Conservation Foundation, 1311, 12th A Main, Vijayanagar 1st Stage, Mysuru 570017, Karnataka, India
    3. Work Phone: +91 821 2515601
    4. Email address: aosuri@ncf-india.org
    5. ORCID: https://orcid.org/0000-0001-9909-5633

    CONTACT #2
    1. Name: Vijay Karthick
    2. Work Address: Nature Conservation Foundation, 1311, 12th A Main, Vijayanagar 1st Stage, Mysuru 570017, Karnataka, India
    3. Work Phone: +91 821 2515601
    4. Email address: vijayk@ncf-india.org
    5. ORCID: https://orcid.org/0000-0001-6023-3955

    CONTACT #3
    1. Name: Vijay Kumar
    2. Work Address: Nature Conservation Foundation, 1311, 12th A Main, Vijayanagar 1st Stage, Mysuru 570017, Karnataka, India
    3. Work Phone: +91 821 2515601
    4. Email address: vijaykumar@ncf-india.org
    5. ORCID: https://orcid.org/0009-0000-4149-0083


    Geographic Coverage:
    1. Location/Study Area: Sakleshpura, Karnataka, India
    2. GPS coordinates: Kadamane Village (12.924647, 75.654650)

    Temporal Coverage:
    1. Begins: 2022-05-16 (Year, Month, Day)
    2. Ends: 2024-05-22 (Year, Month, Day)

    Besides the 000_readMe.txt file containing this information and the 14 images associated with individual observations, the dataset includes three comma-delimited text (csv) files, two R code files, and one Excel file, as explained below:
    1) 001_mammalData.csv -- This file has the main mammal occurrence data with relevant and renamed columns derived from the original downloaded Excel worksheet file

    2) 002_placeLocs.csv -- This file lists named places for which the GPS location was unavailable from the mobile phone application and which were manually assigned coordinates with 500 or 1000 m accuracy

    3) 003_nameMatch.csv -- This file matches the name as originally recorded with the correct common name and scientific name

    4) 004_GBIF_upload_code.R -- R code for processing the files to create a file for upload as an occurrence dataset on the Global Biodiversity Information Facility (GBIF.org)

    5) 005_download_images_from_googledrive.R -- R code to extract image IDs and download images from Google Drive

    6) 006_kadamane_mammal_occurrence.xlsx -- An Excel file that contains the raw data used by the code above

    FILES INCLUDED IN DATASET

    001_mammalData.csv
    This file has the main mammal occurrence data with relevant and renamed columns derived from the original downloaded Excel worksheet file

    observers: Observers who made the observation
    timestamp: Automatic time stamp of date and time when app was used
    date: Date of observation
    time: Time of observation
    decimalLatitude: Latitude in decimal degrees N
    decimalLongitude: Longitude in decimal degrees E
    GPSaltitude: Altitude in metres
    GPSaccuracy: Horizontal accuracy of GPS location in metres
    place: Name of locality
    habitat: Habitat type
    taxa: mammal or reptile/amphibian
    species: Species common name
    count: Number of individuals observed
    countType: Total (solitary or fully counted groups) or Partial (incompletely counted groups)
    obsType: Type of observation: sighting, sign (droppings or vocalisation), death, roadkill, electrocution, other
    notes: Notes or remarks on observation
    imageID: Link to the google drive photo, if photo is available
    instanceID: Automatically generated unique identifier of observation

    002_placeLocs.csv
    This file lists named places for which the GPS location was unavailable from the mobile phone application and which were manually assigned coordinates with 500 or 1000 m accuracy

    place: Name of locality as recorded
    lat: Assigned latitude in decimal degrees N
    long: Assigned longitude in decimal degrees E
    GPSaccuracy: Assigned as 500 or 1000m – Horizontal accuracy of GPS location in metres

    003_nameMatch.csv
    This file matches the name as originally recorded with the correct common name and scientific name.

    verbatimIdentification: Identification as originally recorded in the ‘species’ column of the 001_mammalData.csv file
    vernacularName: Common or English name
    scientificName: Scientific name

    004_GBIF_upload_code.R
    R code for processing the files to create a file for upload as an occurrence dataset on the Global Biodiversity Information Facility (GBIF.org)

    005_download_images_from_googledrive.R
    R code that extracts imageIDs from the 001_mammalData.csv file and downloads them automatically to a preferred directory

    006_kadamane_mammal_occurrence.xlsx
    An Excel file that contains the raw data used by the code above
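
    The three CSV files are described as fitting together: `species` in the main file corresponds to `verbatimIdentification` in the name-match file, and `place` links to the manually assigned coordinates. The following is a minimal Python sketch of that join, assuming only the column names listed above. The sample rows and coordinates are invented for illustration, and the dataset's own processing is done in its R code, not in this snippet.

```python
import csv
import io

# Invented sample rows that follow the documented column names.
MAMMAL_CSV = """place,species,decimalLatitude,decimalLongitude,GPSaccuracy
Estate road,gaur,12.9301,75.6602,8
Stream crossing,sambar,,,
"""
PLACE_CSV = """place,lat,long,GPSaccuracy
Stream crossing,12.9246,75.6546,500
"""
NAME_CSV = """verbatimIdentification,vernacularName,scientificName
gaur,Gaur,Bos gaurus
sambar,Sambar,Rusa unicolor
"""


def read_rows(text):
    """Parse CSV text into a list of dicts keyed by column name."""
    return list(csv.DictReader(io.StringIO(text)))


def build_occurrences(mammal_rows, place_rows, name_rows):
    """Fill missing coordinates from the place table and attach matched names."""
    places = {r["place"]: r for r in place_rows}
    names = {r["verbatimIdentification"]: r for r in name_rows}
    out = []
    for row in mammal_rows:
        rec = dict(row)
        # Use the manually assigned location only when the GPS fix is missing.
        if not rec["decimalLatitude"] and rec["place"] in places:
            loc = places[rec["place"]]
            rec["decimalLatitude"] = loc["lat"]
            rec["decimalLongitude"] = loc["long"]
            rec["GPSaccuracy"] = loc["GPSaccuracy"]
        match = names.get(rec["species"], {})
        rec["vernacularName"] = match.get("vernacularName", "")
        rec["scientificName"] = match.get("scientificName", "")
        out.append(rec)
    return out


occurrences = build_occurrences(
    read_rows(MAMMAL_CSV), read_rows(PLACE_CSV), read_rows(NAME_CSV)
)
```

    Records whose recorded name has no entry in the name-match table keep empty name fields here, which makes unmatched identifications easy to spot.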

  15. Study on Global Ageing and Adult Health-2007, Wave 1 - India

    • apps.who.int
    • catalog.ihsn.org
    • +3more
    Updated Oct 24, 2013
    + more versions
    Cite
    Professor P. Arokiasamy (2013). Study on Global Ageing and Adult Health-2007, Wave 1 - India [Dataset]. https://apps.who.int/healthinfo/systems/surveydata/index.php/catalog/65
    Explore at:
    Dataset updated
    Oct 24, 2013
    Dataset authored and provided by
    Professor P. Arokiasamy
    Time period covered
    2007
    Area covered
    India
    Description

    Abstract

    Purpose: The multi-country Study on Global Ageing and Adult Health (SAGE) is run by the World Health Organization's Multi-Country Studies unit in the Innovation, Information, Evidence and Research Cluster. SAGE is part of the unit's Longitudinal Study Programme, which is compiling longitudinal data on the health and well-being of adult populations, and the ageing process, through primary data collection and secondary data analysis. SAGE baseline data (Wave 0, 2002/3) was collected as part of WHO's World Health Survey (WHS, http://www.who.int/healthinfo/survey/en/index.html). SAGE Wave 1 (2007/10) provides a comprehensive data set on the health and well-being of adults in six low and middle-income countries: China, Ghana, India, Mexico, Russian Federation and South Africa.

    Objectives:
    • To obtain reliable, valid and comparable health, health-related and well-being data over a range of key domains for adult and older adult populations in nationally representative samples
    • To examine patterns and dynamics of age-related changes in health and well-being using longitudinal follow-up of a cohort as they age, and to investigate socio-economic consequences of these health changes
    • To supplement and cross-validate self-reported measures of health and the anchoring vignette approach to improving comparability of self-reported measures, through measured performance tests for selected health domains
    • To collect health examination and biomarker data that improves reliability of morbidity and risk factor data and to objectively monitor the effect of interventions

    Additional Objectives:
    • To generate large cohorts of older adult populations and comparison cohorts of younger populations for following-up intermediate outcomes, monitoring trends, examining transitions and life events, and addressing relationships between determinants and health, well-being and health-related outcomes
    • To develop a mechanism to link survey data to demographic surveillance site data
    • To build linkages with other national and multi-country ageing studies
    • To improve the methodologies to enhance the reliability and validity of health outcomes and determinants data
    • To provide a public-access information base to engage all stakeholders, including national policy makers and health systems planners, in planning and decision-making processes about the health and well-being of older adults

    Methods: SAGE's first full round of data collection included both follow-up and new respondents in most participating countries. The goal of the sampling design was to obtain a nationally representative cohort of persons aged 50 years and older, with a smaller cohort of persons aged 18 to 49 for comparison purposes. In the older households, all persons aged 50+ years (for example, spouses and siblings) were invited to participate. Proxy respondents were identified for respondents who were unable to respond for themselves. Standardized SAGE survey instruments were used in all countries consisting of five main parts: 1) household questionnaire; 2) individual questionnaire; 3) proxy questionnaire; 4) verbal autopsy questionnaire; and, 5) appendices including showcards. A VAQ was completed for deaths in the household over the last 24 months. The procedures for including country-specific adaptations to the standardized questionnaire and translations into local languages from English follow those developed by and used for the World Health Survey.

    Content

    Household questionnaire
    • 0000 Coversheet
    • 0100 Sampling Information
    • 0200 Geocoding and GPS Information
    • 0300 Recontact Information
    • 0350 Contact Record
    • 0400 Household Roster
    • 0450 Kish Tables and Household Consent
    • 0500 Housing
    • 0600 Household and Family Support Networks and Transfers
    • 0700 Assets and Household Income
    • 0800 Household Expenditures
    • 0900 Interviewer Observations

    Individual questionnaire
    • 1000 Socio-Demographic Characteristics
    • 1500 Work History and Benefits
    • 2000 Health State Descriptions and Vignettes
    • 2500 Anthropometrics, Performance Tests and Biomarkers
    • 3000 Risk Factors and Preventive Health Behaviours
    • 4000 Chronic Conditions and Health Services Coverage
    • 5000 Health Care Utilization
    • 6000 Social Cohesion
    • 7000 Subjective Well-Being and Quality of Life (WHOQoL-8 and Day Reconstruction Method)
    • 8000 Impact of Caregiving
    • 9000 Interviewer Assessment

    Geographic coverage

    National coverage

    Analysis unit

    households and individuals

    Universe

    The household section of the survey covered all households in 19 of the 28 states in India which covers 96% of the population. Institutionalised populations are excluded. The individual section covered all persons aged 18 years and older residing within individual households.

    Kind of data

    Sample survey data [ssd]

    Sampling procedure

    World Health Survey Sampling: India has 28 states and seven union territories. 19 of the 28 states were included in the design, representing 96% of the population. India used a stratified multistage cluster sample design. Six states were selected in accordance with their geographic location and level of development. Strata were defined by the six states (Assam, Karnataka, Maharashtra, Rajasthan, Uttar Pradesh and West Bengal) and locality (urban or rural), giving 12 strata in total. The 2000 Census demarcation was used as the sampling frame. Two-stage and three-stage sampling was adopted in rural and urban areas, respectively. In rural areas, PSUs (villages) were selected with probability proportional to size, the measure of size being the 2001 Census population of the village. SSUs (households) were selected using systematic sampling. TSUs (individuals) were selected using Kish tables. In urban areas, PSUs (city wards) were selected with probability proportional to size. Two SSUs (census enumeration blocks) were randomly selected from each PSU. TSUs (households) were selected using systematic sampling. QSUs (individuals) were selected as in rural areas. A sample of 379 EAs was selected as the primary sampling units (PSUs).
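
    The rural design described above can be sketched in Python: strata from state and locality, villages drawn with probability proportional to size, then households drawn systematically. This is an illustrative sketch only. The village sizes, household list, sample sizes and function names are invented, the systematic variant of PPS is an assumption (the description does not say which PPS method was used), and the Kish-table step for individuals is omitted.

```python
import itertools
import random

STATES = ["Assam", "Karnataka", "Maharashtra", "Rajasthan", "Uttar Pradesh", "West Bengal"]
LOCALITIES = ["urban", "rural"]

# Six states crossed with two localities give the 12 strata.
strata = list(itertools.product(STATES, LOCALITIES))


def pps_systematic(sizes, n, rng):
    """Pick n unit indices with probability proportional to size.

    Walks the cumulative size total at a fixed step from a random start.
    A unit larger than the step can be picked more than once.
    """
    step = sum(sizes) / n
    start = rng.random() * step
    chosen, cum, idx = [], 0.0, 0
    for i in range(n):
        target = start + i * step
        # Advance until the target falls inside the current unit's size range.
        while idx < len(sizes) - 1 and cum + sizes[idx] <= target:
            cum += sizes[idx]
            idx += 1
        chosen.append(idx)
    return chosen


def systematic_sample(items, n, rng):
    """Pick n items at a fixed interval from a random start."""
    step = len(items) / n
    start = rng.random() * step
    last = len(items) - 1
    return [items[min(int(start + i * step), last)] for i in range(n)]


rng = random.Random(42)
village_populations = [120, 800, 450, 60, 300]           # hypothetical PSU sizes
selected_villages = pps_systematic(village_populations, 2, rng)
households = [f"HH{i:03d}" for i in range(50)]            # hypothetical listing
selected_households = systematic_sample(households, 10, rng)
```

    In the urban design an extra stage sits between wards and households: two census enumeration blocks are drawn at random within each selected ward before the systematic household draw.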

    SAGE Sampling: The SAGE sample was pre-determined, as all PSUs and households selected for the WHS/SAGE Wave 0 survey were included. The exceptions are three PSUs in Assam, which were replaced as they were inaccessible due to flooding, and a further six PSUs, which were omitted because the household roster information was not available. In each selected EA, a listing of the households was conducted to classify each household into the following mutually exclusive categories:
    1) Households with a WHS/SAGE Wave 0 respondent aged 50-plus: all members aged 50-plus, including the WHS/SAGE Wave 0 respondent, were eligible for the individual interview.
    2) Households with a WHS/SAGE Wave 0 respondent aged 47-49: all members aged 50-plus, as well as the WHS/SAGE Wave 0 respondent aged 47-49, were eligible for the individual interview.
    3) Households with a WHS/SAGE Wave 0 female respondent aged 18-46: all female members aged 18-49, including the WHS/SAGE Wave 0 female respondent aged 18-46, were eligible for the individual interview.
    4) Households with a WHS/SAGE Wave 0 male respondent aged 18-46: three households were selected using systematic sampling and one male aged 18-49 was eligible for the individual interview. In the households not selected, all members aged 50-plus were eligible for the individual interview.

    Stages of selection
    Strata: State, Locality = 12
    PSU: EAs = 375 surveyed
    SSU: Households = 10424 surveyed
    TSU: Individual = 12198 surveyed

    Mode of data collection

    Face-to-face [f2f] PAPI

    Research instrument

    The questionnaires were based on the WHS Model Questionnaire with some modification and many new additions. A household questionnaire was administered to all households eligible for the study. A Verbal Autopsy questionnaire was administered to households that had a death in the last 24 months. An Individual questionnaire was administered to eligible respondents identified from the household roster. A Proxy questionnaire was administered to individual respondents who had cognitive limitations. A Woman's Questionnaire was administered to all females aged 18-49 years identified from the household roster. The questionnaires were developed in English and were piloted as part of the SAGE pretest in 2005. All documents were translated into Hindi, Assamese, Kannada and Marathi. SAGE generic questionnaires are available as external resources.

    Cleaning operations

    Data editing took place at a number of stages including: (1) office editing and coding (2) during data entry (3) structural checking of the CSPro files (4) range and consistency secondary edits in Stata

    Response rate

    Household: Response rate=88%, Cooperation rate=92%

    Individual: Response rate=68%, Cooperation rate=92%

  16. All Stocks Data of Indian Stock Market(1 Year)

    • kaggle.com
    Updated Jan 9, 2022
    Cite
    KESHAV_MAHESHWARI (2022). All Stocks Data of Indian Stock Market(1 Year) [Dataset]. https://www.kaggle.com/datasets/gmkeshav/all-stocks-data-of-indian-stock-market1-year
    Explore at:
    Croissant: a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Jan 9, 2022
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    KESHAV_MAHESHWARI
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Area covered
    India
    Description

    I made this dataset after some rigorous SQL queries and Python coding. It contains all stocks of the Indian stock market, 2435 stocks in total. The data covers one year: each row represents a stock, each column represents a date, and each cell holds the closing price. Enjoy and do some stock price predictions.
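
    A table with one row per stock and one column per date usually has to be reshaped into (stock, date, price) records before any time-series work. The following is a minimal sketch of that reshape and of a simple day-over-day return. The header name `Stock`, the tickers and the prices are invented for illustration, and the real file's layout may differ in detail.

```python
import csv
import io

# Invented sample in the layout described: one row per stock, one column per date.
WIDE = """Stock,2021-01-04,2021-01-05,2021-01-06
AAA,100.0,101.5,99.0
BBB,50.0,,51.0
"""


def wide_to_long(text):
    """Turn the wide table into (stock, date, close) records, skipping blanks."""
    reader = csv.reader(io.StringIO(text))
    header = next(reader)
    dates = header[1:]
    records = []
    for row in reader:
        name, prices = row[0], row[1:]
        for date, price in zip(dates, prices):
            if price:  # an empty cell means no closing price for that date
                records.append((name, date, float(price)))
    return records


def daily_returns(records):
    """Compute close-to-close returns per stock between consecutive available dates."""
    by_stock = {}
    for name, date, price in records:
        by_stock.setdefault(name, []).append((date, price))
    out = {}
    for name, series in by_stock.items():
        series.sort()  # ISO dates sort chronologically as strings
        out[name] = [
            (d2, p2 / p1 - 1) for (d1, p1), (d2, p2) in zip(series, series[1:])
        ]
    return out


long_records = wide_to_long(WIDE)
returns = daily_returns(long_records)
```

    Skipping blank cells means a return can span more than one calendar day when a price is missing, which is worth keeping in mind before feeding these values into a prediction model.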

  17. Mammal occurrence records (2020-23) in the Valparai Plateau and Anamalai...

    • zenodo.org
    • data.niaid.nih.gov
    bin, csv, jpeg, txt
    Updated Oct 10, 2024
    Cite
    T. R. Shankar Raman; Divya Mudappa (2024). Mammal occurrence records (2020-23) in the Valparai Plateau and Anamalai Tiger Reserve, Western Ghats, India [Dataset]. http://doi.org/10.5281/zenodo.11903722
    Explore at:
    Available download formats: jpeg, csv, txt, bin
    Dataset updated
    Oct 10, 2024
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    T. R. Shankar Raman; Divya Mudappa
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    Jun 17, 2024
    Area covered
    Valparai, Western Ghats, India
    Description

    This dataset contains Mammal occurrence records (January 2020 - June 2023) in the Valparai Plateau and Anamalai Tiger Reserve, Western Ghats, India. It includes a few occurrence records of reptiles. Occurrence records were gathered in the field by researchers of the Nature Conservation Foundation, India, using a mobile data collection application. Suggested citation is:
    Nature Conservation Foundation (2024). Mammal occurrence records (2020-23) in the Valparai Plateau and Anamalai Tiger Reserve, Western Ghats, India. Nature Conservation Foundation, India. Dataset, Zenodo. DOI: 10.5281/zenodo.11903722

    CONTACT #1
    1. Name: T. R. Shankar Raman
    2. Work Address: Nature Conservation Foundation, 1311, 12th A Main, Vijayanagar 1st Stage, Mysuru 570017, Karnataka, India
    3. Work Phone: +91 821 2515601
    4. Email address: trsr@ncf-india.org
    5. ORCID: https://orcid.org/0000-0002-1347-3953

    CONTACT #2
    1. Name: Divya Mudappa
    2. Work Address: Nature Conservation Foundation, 1311, 12th A Main, Vijayanagar 1st Stage, Mysuru 570017, Karnataka, India
    3. Work Phone: +91 821 2515601
    4. Email address: divya@ncf-india.org
    5. ORCID: https://orcid.org/0000-0001-9708-4826

    Keywords: tropical rainforest, plantations, Anamalai Hills, Western Ghats, animal distribution, mammals

    Geographic Coverage:
    1. Location/Study Area: Valparai Plateau, Tamil Nadu, India; Anamalai Tiger Reserve, Tamil Nadu, India
    2. GPS coordinates: Valparai Plateau (10°15'- 10°22'N, 76°52' - 76°59'E); Anamalai Tiger Reserve (10°12' - 10°35'N, 76°49' - 77°24'E)

    Temporal Coverage:
    1. Begins: 2020-01-11 (Year, Month, Day)
    2. Ends: 2023-06-02 (Year, Month, Day)

    Besides the 000_readMe.txt file containing this information, the dataset includes 60 images (photographs), three comma-delimited text (csv) files, and one R markdown text file with R code as explained below:
    1) 001_mammalData.csv -- This file has the main mammal occurrence data with relevant and renamed columns derived from the original downloaded Excel worksheet file

    2) 002_placeLocs.csv -- This file lists named places for which the GPS location was unavailable from the mobile phone application and which were manually assigned coordinates with 500 m accuracy

    3) 003_nameMatch.csv -- This file matches the name as originally recorded with the correct common name and scientific name

    4) 004_mammup.Rmd -- R code for processing the files to create a file for upload as an occurrence dataset on the Global Biodiversity Information Facility (GBIF.org)

    +60 image files (with ".jpg" file extension)

    FILES INCLUDED IN DATASET

    001_mammdata.csv
    This file has the main mammal occurrence data with relevant and renamed columns derived from the original downloaded Excel worksheet file
    recordedBy: Observer who recorded/made the observation
    username: Username of person on whose mobile phone the data were noted
    timestamp: Automatic time stamp of date and time when app was used
    date: Date of observation
    time: Time of observation
    decimalLatitude: Latitude in decimal degrees N
    decimalLongitude: Longitude in decimal degrees E
    GPSaltitude: Altitude in metres
    GPSaccuracy: Horizontal accuracy of GPS location in metres
    place: Name of locality
    habitat: Habitat type
    species: Species common name
    count: Number of individuals observed
    countType: Total (solitary or fully counted groups) or Partial (incompletely counted groups)
    obsType: Type of observation: sighting, sign (droppings or vocalisation), death, roadkill, electrocution, other
    notes: Notes or remarks on observation
    imageID: Image filename if available (NA, if not available)
    instanceID: Automatically generated unique identifier of observation

    002_placeLocs.csv
    This file lists named places for which the GPS location was unavailable from the mobile phone application and which were manually assigned coordinates with 500 m accuracy
    place: Name of locality as recorded
    lat: Assigned latitude in decimal degrees N
    long: Assigned longitude in decimal degrees E
    GPSaccuracy: Assigned as 500 m – Horizontal accuracy of GPS location in metres

    003_nameMatch.csv
    This file matches the name as originally recorded with the correct common name and scientific name.
    verbatimIdentification: Identification as originally recorded in the ‘species’ column of the mammdata.csv file
    vernacularName: Common or English name
    scientificName: Scientific name

    004_mammup.Rmd
    R code for processing the files to create a file for upload as an occurrence dataset on the Global Biodiversity Information Facility (GBIF.org)

  18. Poverty and inner wellbeing: India and Zambia 2010-2014 - Dataset - B2FIND

    • b2find.eudat.eu
    Updated Oct 29, 2023
    Cite
    (2023). Poverty and inner wellbeing: India and Zambia 2010-2014 - Dataset - B2FIND [Dataset]. https://b2find.eudat.eu/dataset/c2aaeedb-8c5a-51b2-81de-c64cafd29e00
    Explore at:
    Dataset updated
    Oct 29, 2023
    Area covered
    Zambia, India
    Description

    The main method of the project was a survey interview from which both qualitative and quantitative data were collected. Field research was undertaken in marginalised rural communities in Zambia (Chiawa) and India (Sarguja district, Chhattisgarh state). Two rounds of fieldwork were undertaken in each place, in Zambia August–November 2010 (Zambia T1) and August–October 2012 (Zambia T2); in India February–May 2011 (India T1) and February–June 2013 (India T2). In both locations, we talked to husbands and wives (separately) and women heading households. In India we surveyed 340 people in 2011 and 368 in 2013. 187 respondents were interviewed in both rounds. 7% of respondents were single women. Qualitative data include 105 survey notes. In Zambia we surveyed 412 people in 2010 and 370 in 2012. These included 52 women heading households. 358 respondents were surveyed both years. Qualitative data include notes from 105 survey interviews.

    This research aims to identify pathways of wellbeing and poverty within rural communities in Zambia and India. It will demonstrate how poverty affects wellbeing and how different constellations of wellbeing in turn affect people's movements into, within and out of poverty. Drawing on the sociology of development and psychology, it adopts a mixed method, cross-cultural longitudinal approach, with qualitative and quantitative data collection across a two-year interval, involving 700 respondents. Statistical tests assess the validity and reliability of our model of wellbeing. In-depth case studies provide a deeper sense of people's own understandings and experience. In particular, the research tests a key hypothesis that social and personal relationships constitute critical drivers of wellbeing in developing countries. The project is rooted in research-policy engagement. It involves partnership with NGOs committed to incorporating wellbeing into their programmes, and generates a broader programme of communications activities at national and global level.

    The Wellbeing and Poverty Pathways project developed a multi-dimensional model of wellbeing called “Inner Wellbeing” (IWB) which reflects what people think and feel they are able to be and do. The project explored relationships between people's subjective experiences of wellbeing and the external conditions in which they live their lives. Inner wellbeing comprises seven domains: economic confidence; agency and participation; social connections; close relationships; physical and mental health; competence and self-worth; values and meaning. It was constructed through a combination of theoretical reflection and empirical analysis in two rural communities, one in Zambia and one in India.

    The main research instrument was a survey which comprised three sections: an opening section on demographics and health; the central IWB section; and a final section on livelihoods and access to state services. Specifically for the central IWB section, the survey has five questions (or items) for each domain, which are designed to reflect different aspects of that domain. For each question respondents are asked to select one of five graduated answers. These are then scored on a scale from strong negative (1) to weak negative (2) to neutral (3) to weak positive (4) to strong positive wellbeing (5). The questions were extensively grounded and piloted to ensure they captured issues that were important to people’s lives locally.

    The studied population came from two rural areas of the Global South: Chiawa in Zambia and four villages in the Sarguja district of the Chhattisgarh state in India. No sample selection was applied. Instead, everyone in the study areas who would talk to us was interviewed.

    Chiawa is a Game Management Area (GMA), located in Kafue district, Lusaka province. To the south east it borders Zimbabwe and to the east the Lower Zambezi National Park. The majority population is Goba, a people-group that originated in what is now Zimbabwe.

    The research in India focused on four villages located in the historically remote hill and forest regions of northern Chhattisgarh. These villages were selected because they presented a range of contrasts. The communities there are extremely poor and people depend on (largely rainfed) farming, daily labour and gathering non-timber forest products to survive. Reflecting the area’s population as a whole, the majority of respondents (84%) are Adivasi, including Particularly Vulnerable Tribal Groups (PTG), with smaller numbers of Other Backward Caste (OBC) (15%) and Scheduled Caste (1%) people.
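
    The structure of the IWB section (seven domains, five items each, every item scored from 1 to 5) can be represented and checked with a few lines of Python. The sketch below is illustrative only: the description does not say how item scores are combined, so the per-domain mean used here is an assumption, and the example answers are invented.

```python
from statistics import mean

# The seven Inner Wellbeing domains named in the description.
DOMAINS = (
    "economic confidence",
    "agency and participation",
    "social connections",
    "close relationships",
    "physical and mental health",
    "competence and self-worth",
    "values and meaning",
)
ITEMS_PER_DOMAIN = 5


def domain_means(responses):
    """Validate one respondent's answers and return the mean score per domain.

    `responses` maps each domain to five item scores, each from 1
    (strong negative) to 5 (strong positive). Averaging is an assumed
    summary, not the project's documented scoring method.
    """
    result = {}
    for domain in DOMAINS:
        scores = responses[domain]
        if len(scores) != ITEMS_PER_DOMAIN or not all(1 <= s <= 5 for s in scores):
            raise ValueError(f"invalid scores for {domain!r}: {scores}")
        result[domain] = mean(scores)
    return result


example = {d: [3, 4, 2, 5, 3] for d in DOMAINS}  # invented answers
summary = domain_means(example)
```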

  19. DEVANAGARI CAPTCHA DATASET OF 1 Million Images : A challenge Test

    • data.mendeley.com
    • ieee-dataport.org
    Updated Apr 5, 2023
    + more versions
    Cite
    SANJAY PATE (2023). DEVANAGARI CAPTCHA DATASET OF 1 Million Images : A challenge Test [Dataset]. http://doi.org/10.17632/knmbfjsdwn.1
    Explore at:
    Dataset updated
    Apr 5, 2023
    Authors
    SANJAY PATE
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    CAPTCHA stands for Completely Automated Public Turing Test to Tell Computers and Humans Apart. Only humans can successfully complete this test; current computer systems cannot. It is used in several applications to tell humans and machines apart. Text-based CAPTCHAs are the most typical type used on websites. Because most of the letters in these CAPTCHAs are in English, it is challenging for rural residents who only speak their native languages to pass the test. Devanagari characters are more complex than standard English characters and numerals, which makes machine recognition much more difficult. The majority of official websites in India only offer information in Devanagari. Unfortunately, websites do not use Devanagari CAPTCHAs. As a result, we have created a new text-based CAPTCHA in Devanagari script in this article. Printed (computer font) and handwritten Devanagari characters (34 each) and digits (10 each), 44 + 44 = 88 character images in total, are used to design the CAPTCHA. General CAPTCHA generation principles are used to add noise to the image using digital image processing techniques. The size of each CAPTCHA image is 250 × 90 pixels. Four types of character sets are used: Printed Alphabet (34), Handwritten Alphabet (34), Printed Digit (10), and Handwritten Digit (10). Eleven classes were generated from combinations of these four sets. The string lengths considered for the CAPTCHA images are five, six, and seven (5, 6, 7). For each class, three subclasses are created depending on string length. In total there are 11 classes × 3 subclasses = 33 subclasses, so 33 types of CAPTCHA images were generated. For each class, 10,000 CAPTCHA images were created. With 11 classes × 10,000 images, a Devanagari CAPTCHA dataset of 1,10,000 (One Million Ten Thousand) images was created using Python. The aim is to make the CAPTCHA images harder to recognize and not easily broken; passing a test that requires identifying Devanagari characters is difficult. The dataset is beneficial to researchers investigating CAPTCHA recognition in this area, and to researchers designing OCR to recognize and break Devanagari CAPTCHAs.
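
    The counts quoted in the description can be reproduced with simple arithmetic, and the string-length subclasses can be illustrated with a small label generator. This is a sketch under stated assumptions: the constant and function names are invented, only the Devanagari digit block (U+0966 to U+096F) is used as the symbol pool because the description does not list its 34 characters, and no image rendering or noise is attempted.

```python
import random

# Character-set sizes as stated in the description.
CHARSET_SIZES = {
    "printed_alphabet": 34,
    "handwritten_alphabet": 34,
    "printed_digit": 10,
    "handwritten_digit": 10,
}
STRING_LENGTHS = (5, 6, 7)
NUM_CLASSES = 11
IMAGES_PER_CLASS = 10_000

total_glyphs = sum(CHARSET_SIZES.values())           # 44 + 44 character images
num_subclasses = NUM_CLASSES * len(STRING_LENGTHS)   # classes x string lengths
total_images = NUM_CLASSES * IMAGES_PER_CLASS        # product of the stated figures

# Devanagari digits occupy the Unicode code points U+0966..U+096F.
DEVANAGARI_DIGITS = [chr(cp) for cp in range(0x0966, 0x0970)]


def random_label(length, pool, rng):
    """Draw a CAPTCHA text label of the given length from a symbol pool."""
    return "".join(rng.choice(pool) for _ in range(length))


rng = random.Random(7)
labels = {n: random_label(n, DEVANAGARI_DIGITS, rng) for n in STRING_LENGTHS}
```

    With the figures as given, `total_glyphs` is 88, `num_subclasses` is 33, and `total_images` is 110,000.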

  20. Indian Candidates for General Election 2019

    • kaggle.com
    Updated Mar 3, 2020
    Cite
    Prakrut Chauhan (2020). Indian Candidates for General Election 2019 [Dataset]. https://www.kaggle.com/prakrutchauhan/indian-candidates-for-general-election-2019/code
    Explore at:
    Croissant: a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Mar 3, 2020
    Dataset provided by
    Kaggle
    Authors
    Prakrut Chauhan
    License

    Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0) https://creativecommons.org/licenses/by-nc-sa/4.0/
    License information was derived automatically

    Area covered
    India
    Description

    Context

    With over 600 million voters voting for 8500+ candidates across 543 constituencies, the general elections in the world's largest democracy are a potential goldmine of data. While there are separate existing datasets about the votes each candidate received and the personal information of each candidate, there was no comprehensive dataset that included both kinds of information. Thus, this dataset should be more usable than most existing datasets in this domain.

    Content

    I scraped the website of myneta.info to get the personal information of each candidate (as per their own sworn affidavits) and the website of the Election Commission of India to get the data about the votes received. I merged both datasets to create this comprehensive dataset. Only the candidates who secured at least 1% of the total votes polled in their constituency have been included.

    Acknowledgements

    I have collected the data from MyNeta.info maintained by the Association for Democratic Reforms and the website of Election Commission of India.

    Inspiration

    There are 2 main tasks that can be performed on this dataset: Exploratory Data Analytics to visualize the impact of each feature of the candidate and the use of machine learning to predict the chances of winning of a candidate.

Cite
Satya Thirumani (2025). 🦈 Shark Tank India dataset 🇮🇳 [Dataset]. https://www.kaggle.com/datasets/thirumani/shark-tank-india

🦈 Shark Tank India dataset 🇮🇳

Shark Tank India data set, includes Season 1 to Season 4 information

Explore at:
Croissant: a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
Dataset updated
Apr 20, 2025
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Satya Thirumani
License

https://creativecommons.org/publicdomain/zero/1.0/

Description

Shark Tank India Data set.

Shark Tank India - Season 1 to season 4 information, with 80 fields/columns and 630+ records.

All seasons/episodes of 🦈 SHARKTANK INDIA 🇮🇳 were broadcast on SonyLiv OTT/Sony TV.

Here is the data dictionary for (Indian) Shark Tank season's dataset.

  • Season Number - Season number
  • Startup Name - Company name or product name
  • Episode Number - Episode number within the season
  • Pitch Number - Overall pitch number
  • Season Start - Season first aired date
  • Season End - Season last aired date
  • Original Air Date - Episode original/first aired date, on OTT/TV
  • Episode Title - Episode title in SonyLiv
  • Anchor - Name of the episode presenter/host
  • Industry - Industry name or type
  • Business Description - Business Description
  • Company Website - Company Website URL
  • Started in - Year in which startup was started/incorporated
  • Number of Presenters - Number of presenters
  • Male Presenters - Number of male presenters
  • Female Presenters - Number of female presenters
  • Transgender Presenters - Number of transgender/LGBTQ presenters
  • Couple Presenters - Whether the presenters are wife and husband, 1-yes, 0-no
  • Pitchers Average Age - All pitchers average age, <30 young, 30-50 middle, >50 old
  • Pitchers City - Presenter's town/city or place where company head office exists
  • Pitchers State - Indian state pitcher hails from or state where company head office exists
  • Yearly Revenue - Yearly revenue, in lakhs INR, -1 means negative revenue, 0 means pre-revenue
  • Monthly Sales - Total monthly sales, in lakhs
  • Gross Margin - Gross margin/profit of company, in percentages
  • Net Margin - Net margin/profit of company, in percentages
  • EBITDA - Earnings Before Interest, Taxes, Depreciation, and Amortization
  • Cash Burn - Whether the company is running at a loss in the current year, with the founders paying out of their own pocket (yes/no)
  • SKUs - Stock Keeping Units or number of varieties, at the time of pitch
  • Has Patents - Pitcher has Patents/Intellectual property (filed/granted), at the time of pitch
  • Bootstrapped - Startup is bootstrapped or not (yes/no)
  • Part of Match off - Competition between two similar brands, pitched at same time
  • Original Ask Amount - Original Ask Amount, in lakhs INR
  • Original Offered Equity - Original Offered Equity, in percentages
  • Valuation Requested - Valuation Requested, in lakhs INR
  • Received Offer - Received offer or not, 1-received, 0-not received
  • Accepted Offer - Accepted offer or not, 1-accepted, 0-rejected
  • Total Deal Amount - Total Deal Amount, in lakhs INR
  • Total Deal Equity - Total Deal Equity, in percentages
  • Total Deal Debt - Total Deal debt/loan amount, in lakhs INR
  • Debt Interest - Debt interest rate, in percentages
  • Deal Valuation - Deal Valuation, in lakhs INR
  • Number of sharks in deal - Number of sharks involved in deal
  • Deal has conditions - Deal has conditions or not? (yes or no)
  • Royalty Percentage - Royalty percentage, if it's royalty deal
  • Royalty Recouped Amount - Royalty recouped amount, if it's royalty deal, in lakhs
  • Advisory Shares Equity - Deal with Advisory shares or equity, in percentages
  • Namita Investment Amount - Namita Investment Amount, in lakhs INR
  • Namita Investment Equity - Namita Investment Equity, in percentages
  • Namita Debt Amount - Namita Debt Amount, in lakhs INR
  • Vineeta Investment Amount - Vineeta Investment Amount, in lakhs INR
  • Vineeta Investment Equity - Vineeta Investment Equity, in percentages
  • Vineeta Debt Amount - Vineeta Debt Amount, in lakhs INR
  • Anupam Investment Amount - Anupam Investment Amount, in lakhs INR
  • Anupam Investment Equity - Anupam Investment Equity, in percentages
  • Anupam Debt Amount - Anupam Debt Amount, in lakhs INR
  • Aman Investment Amount - Aman Investment Amount, in lakhs INR
  • Aman Investment Equity - Aman Investment Equity, in percentages
  • Aman Debt Amount - Aman Debt Amount, in lakhs INR
  • Peyush Investment Amount - Peyush Investment Amount, in lakhs INR
  • Peyush Investment Equity - Peyush Investment Equity, in percentages
  • Peyush Debt Amount - Peyush Debt Amount, in lakhs INR
  • Ritesh Investment Amount - Ritesh Investment Amount, in lakhs INR
  • Ritesh Investment Equity - Ritesh Investment Equity, in percentages
  • Ritesh Debt Amount - Ritesh Debt Amount, in lakhs INR
  • Amit Investment Amount - Amit Investment Amount, in lakhs INR
  • Amit Investment Equity - Amit Investment Equity, in percentages
  • Amit Debt Amount - Amit Debt Amount, in lakhs INR
  • Guest Investment Amount - Guest Investment Amount, in lakhs INR
  • Guest Investment Equity - Guest Investment Equity, in percentages
  • Guest Debt Amount - Guest Debt Amount, in lakhs INR
  • Invested Guest Name - Name of the guest(s) who invested in deal
  • All Guest Names - Name of all guests, who are present in episode
  • Namita Present - Whether Namita present in episode or not
  • Vineeta Present - Whether Vineeta present in episode or not
  • Anupam ...
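
The units in the data dictionary above (amounts in lakhs INR, equity in percentages, coded values for Yearly Revenue) can be illustrated with a minimal Python sketch. It rests on assumptions: the helper names and the sample record are hypothetical, and the relation valuation = ask amount / equity share is the usual implied-valuation arithmetic, not something the dataset description confirms for the Valuation Requested column.

```python
from typing import Optional


def implied_valuation(ask_lakhs: float, equity_pct: float) -> Optional[float]:
    """Valuation (lakhs INR) implied by asking ask_lakhs for equity_pct percent.

    Assumption: valuation = amount / (equity / 100). Returns None when the
    equity is zero or negative, since no valuation can be implied.
    """
    if equity_pct <= 0:
        return None
    return ask_lakhs * 100.0 / equity_pct


def lakhs_to_crores(amount_lakhs: float) -> float:
    """Convert lakhs to crores (1 crore = 100 lakhs)."""
    return amount_lakhs / 100.0


def revenue_status(yearly_revenue_lakhs: float) -> str:
    """Decode the Yearly Revenue codes listed in the data dictionary."""
    if yearly_revenue_lakhs == -1:
        return "negative revenue"
    if yearly_revenue_lakhs == 0:
        return "pre-revenue"
    return "revenue reported"


# Hypothetical record, keyed by column names from the data dictionary.
pitch = {
    "Original Ask Amount": 50.0,     # lakhs INR
    "Original Offered Equity": 5.0,  # percent
    "Yearly Revenue": 0,             # 0 means pre-revenue
}

valuation = implied_valuation(
    pitch["Original Ask Amount"], pitch["Original Offered Equity"]
)
print(valuation, lakhs_to_crores(valuation), revenue_status(pitch["Yearly Revenue"]))
```

Comparing a value computed this way against the dataset's own Valuation Requested column would be a quick check of whether the assumed formula matches how that column was filled in.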