https://crawlfeeds.com/privacy_policy
Gain access to a structured dataset featuring thousands of products listed on Amazon India. This dataset is ideal for e-commerce analytics, competitor research, pricing strategies, and market trend analysis.
Product Details: Name, Brand, Category, and Unique ID
Pricing Information: Current Price, Discounted Price, and Currency
Availability & Ratings: Stock Status, Customer Ratings, and Reviews
Seller Information: Seller Name and Fulfillment Details
Additional Attributes: Product Description, Specifications, and Images
Format: CSV
Number of Records: 50,000+
Delivery Time: 3 Days
Price: $149.00
Availability: Immediate
This dataset provides structured and actionable insights to support e-commerce businesses, pricing strategies, and product optimization. If you're looking for more datasets for e-commerce analysis, explore our E-commerce datasets for a broader selection.
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
The dataset was created from data released by the Department for Promotion of Industry and Internal Trade (DPIIT), India. The data on the department's website is published in PDF format; I scraped it and converted it to CSV. The main motivation behind creating the dataset was that I couldn't find up-to-date month-wise FDI data for India. I checked the government websites, but those that do have data offer yearly figures only, and tracking quarterly performance requires month-wise data, so I decided to scrape the data from the available PDFs. I also came across some datasets on Kaggle, but they only ran up to 2021 and some were not in CSV format. I am also working on sector-wise data (now available in version 2) and state-wise data, which will also be available soon. I will try to update the data quarterly, and your support will go a long way in motivating me to keep updating. Cheers!
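The PDF-to-CSV conversion described above could be sketched roughly as follows. This is an illustrative example only: pdfplumber is one possible extraction library (the author does not name their tooling), and the file names and column labels are assumptions.

```python
import pandas as pd

def rows_to_frame(rows):
    """Convert extracted table rows (first row = header) into a DataFrame."""
    header, *body = rows
    return pd.DataFrame(body, columns=header)

def pdf_tables_to_csv(pdf_path, csv_path):
    """Extract one table per page from a PDF and write the combined CSV."""
    import pdfplumber  # third-party: pip install pdfplumber
    rows = []
    with pdfplumber.open(pdf_path) as pdf:
        for page in pdf.pages:
            table = page.extract_table()
            if table:
                # keep the header row only once, from the first page
                rows.extend(table if not rows else table[1:])
    rows_to_frame(rows).to_csv(csv_path, index=False)

# hypothetical usage: pdf_tables_to_csv("fdi_monthwise.pdf", "fdi_monthwise.csv")
```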
Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
The dataset is from an Indian study that used ChatGPT, a natural language processing model by OpenAI, to design a mental health literacy intervention for college students. Prompt engineering tactics were used to formulate prompts that acted as anchors in the conversations with the AI agent regarding mental health. The intervention lasted 20 days, with sessions of 15-20 minutes on alternate days. In the main study, fifty-one students completed pre-test and post-test measures of mental health literacy, mental help-seeking attitude, stigma, mental health self-efficacy, positive and negative experiences, and flourishing, which were then analyzed using paired t-tests. The results suggest that the intervention is effective among college students, as statistically significant changes were noted in mental health literacy and mental health self-efficacy scores. The study affirms the practicality, acceptance, and initial promise of AI-driven methods in advancing mental health literacy, and suggests promising prospects for innovative platforms such as ChatGPT within the field of applied positive psychology. Data used in the analysis for the intervention study.
CC0 1.0 Universal: https://creativecommons.org/publicdomain/zero/1.0/
This dataset contains detailed information on crop yields across various states in India for the year 1997. It includes data on different crops, their production, area under cultivation, season of cultivation, and state-specific information. Additionally, the dataset provides supplementary details such as annual rainfall, fertilizer use, pesticide use, and yield for each crop. This comprehensive dataset can be used for agricultural analysis, trend prediction, and studying the impact of various factors on crop yields in India.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The Residential School Locations Dataset [IRS_Locations.csv] contains the locations (latitude and longitude) of residential schools and student hostels operated by the federal government in Canada. All the residential schools and hostels listed in the Indian Residential Schools Settlement Agreement (IRSSA) are included in this dataset, as well as several industrial schools and residential schools that were not part of the IRSSA. This version of the dataset does not include the five schools under the Newfoundland and Labrador Residential Schools Settlement Agreement. The original school location data was created by the Truth and Reconciliation Commission and was provided to the researcher (Rosa Orlandini) by the National Centre for Truth and Reconciliation in April 2017. The dataset was created by Rosa Orlandini, and builds upon and enhances the previous work of the Truth and Reconciliation Commission, Morgan Hite (creator of the Atlas of Indian Residential Schools in Canada, produced for the Tk'emlups First Nation and the Justice for Day Scholars Initiative), and Stephanie Pyne (project lead for the Residential Schools Interactive Map). Each individual school location in this dataset is attributed to RSIM, Morgan Hite, the NCTR, or Rosa Orlandini. Many schools/hostels had several locations throughout the history of the institution. If a school/hostel moved from its original location to another property, the school is considered to have two unique locations in this dataset: the original location and the new location. For example, Lejac Indian Residential School had two locations while it was operating, Stuart Lake and Fraser Lake.
If a new school building was constructed on the same property as the original school building, it is not considered a new location, as in the case of Girouard Indian Residential School. When the precise location is known, the coordinates of the main building are provided; when the precise location of the building is not known, an approximate location is provided. For each residential school institution location, the following information is provided: official names, alternative names, dates of operation, religious affiliation, latitude and longitude coordinates, community location, Indigenous community name, contributor (of the location coordinates), school/institution photo (when available), location point precision, type of school (hostel or residential school), and a list of references used to determine the location of the main buildings or sites.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset contains species occurrence data and a species distribution modeling (SDM) analysis of eleven threatened tree species. Occurrences are compiled from extensive field surveys in the Anamalai Hills, along with data from the Global Biodiversity Information Facility (GBIF.org) and earlier work done within the southern Western Ghats, India.
References:
Page, N. V., & Shanker, K. (2020). Climatic stability drives latitudinal trends in range size and richness of woody plants in the Western Ghats, India. PLOS ONE, 15(7), e0235733. https://doi.org/10.1371/journal.pone.0235733
GBIF.org (2022) GBIF Occurrence Download, 2 August 2022. DOI:10.15468/dl.gnvuxj
AUTHOR #1
1. Name: A.P. Madhavan
2. Work Address: Nature Conservation Foundation, 1311, 12th A Main, Vijayanagar 1st Stage, Mysuru 570017, Karnataka, India
3. Email address: madhavan@ncf-india.org
4. ORCID: https://orcid.org/0009-0009-2754-8256
AUTHOR #2
1. Name: Kshama Bhat
2. Work Address: Nature Conservation Foundation, 1311, 12th A Main, Vijayanagar 1st Stage, Mysuru 570017, Karnataka, India
3. Email address: kshama@ncf-india.org
4. ORCID: https://orcid.org/0000-0002-6190-2687
AUTHOR #3
1. Name: Srinivasan Kasinathan
2. Work Address: Nature Conservation Foundation, 1311, 12th A Main, Vijayanagar 1st Stage, Mysuru 570017, Karnataka, India
3. Email address: srini@ncf-india.org
4. ORCID: https://orcid.org/0000-0001-7323-6653
AUTHOR #4
1. Name: Divya Mudappa
2. Work Address: Nature Conservation Foundation, 1311, 12th A Main, Vijayanagar 1st Stage, Mysuru 570017, Karnataka, India
3. Email address: divya@ncf-india.org
4. ORCID: https://orcid.org/0000-0001-9708-4826
AUTHOR #5
1. Name: Navendu Page
2. Work Address: Wildlife Institute of India, Post Box No. 18, Chandrabani, Dehradun, Uttarakhand 248001, India
3. Email address: navendu.page@gmail.com
4. ORCID: https://orcid.org/0000-0002-9413-7571
AUTHOR #6
1. Name: T. R. Shankar Raman
2. Work Address: Nature Conservation Foundation, 1311, 12th A Main, Vijayanagar 1st Stage, Mysuru 570017, Karnataka, India
3. Email address: trsr@ncf-india.org
4. ORCID: https://orcid.org/0000-0002-1347-3953
Keywords: tropical rainforest, climate change, tree distributions, species distribution models, range shifts, Western Ghats
Geographic Coverage:
1. Location/Study Area: Southern Western Ghats Montane Rain Forests, Southern Western Ghats Moist Deciduous Forests, India
2. GPS coordinates: SWG (73.95° – 80.33° E, 8.06° – 13.11°N)
Temporal coverage
Starts: 2020-08-01
Ends: 2024-03-28
Besides this README.txt file, the dataset includes three comma-delimited text (CSV) files, two R scripts, and one KML file of surveyed trails.
CSV files with the data in columns as explained below:
1) Focal_Tree_Dat.csv
Comp: Number identifier
FT_ID: Unique tree no for each individual
Focal_tree: Scientific name of species
Date: Date of occurrence observation
Place: Area/locality description
Trail: Unique trail ID
Waypoint: Waypoint number
Time: Time in hh:mm format
Location: Specific description of occurrence locality
Latitude: Latitude in decimal degrees N
Longitude: Longitude in decimal degrees E
Elevation: Elevation in metres
Slope: Category of slope
ID_Notes: Notes on identification
Phenophase: Phenophase expression at the time of observation
GBH: Girth at breast height in centimetres (comma separated list of numbers in case of multi-stemmed trees)
Tree_ht: Tree height in metres
Canopy_ht: Maximum height of the surrounding canopy in metres
Substrate: Soil substrate composition
Invasives: Name of invasive species (if present)
Stature: Vegetation strata position
Relatively: Stature of focal individual relative to other surrounding individuals
Deadwood: Description of deadwood on the tree
Damage: Description of damage on the bole
Shape: Description of tree canopy shape
Closure: Canopy closure at focal tree
Seedlings: Number of conspecific seedlings present in 5 m radius of focal tree
Saplings: Number of conspecific saplings present in 5 m radius of focal tree
Trees: Number of conspecific trees present in 5 m radius of focal tree
Remarks: Remarks
2) Ffspecies.csv
Source: Source of occurrence
ID: State/location of occurrence
Region: Biogeographic region of occurrence
decimalLatitude: Latitude in decimal degrees N
decimalLongitude: Longitude in decimal degrees E
species: Scientific name of species
3) ft_surveys.csv
Date: Date of survey of sample trail
Prot_type: Category indicating whether protected area or fragment
Place: Area/locality description
Route_description: Specific landmark description of trail
Trail: Unique trail ID
Trail_distance: Tracked distance of trail in km
Corrected_trail_distance: Corrected distance of trail in km
Track_filename_kml: File name of gps track
Sample_collected: Name of species if sample collected
Observers: Name of observers
Remarks: Remarks
ANALYSES SCRIPTS
flexsdm_script.R
Script containing all MaxEnt distribution modeling and associated analyses
Franklinia_density.Rmd
Script of density and abundance related analysis
CC0 1.0 Universal: https://creativecommons.org/publicdomain/zero/1.0/
Shark Tank India - Season 1 to season 4 information, with 80 fields/columns and 630+ records.
All seasons/episodes of 🦈 SHARKTANK INDIA 🇮🇳 were broadcast on SonyLiv OTT/Sony TV.
Here is the data dictionary for the (Indian) Shark Tank seasons dataset.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset comprises annual long-term total agricultural cropland nitrogen (N) surplus across India, provided at a district-level spatial resolution for the period from 1966 to 2017. The dataset includes twelve N surplus estimates that incorporate uncertainties stemming from various input data sources and methodological choices in key components of the N surplus. This dataset allows for the aggregation of N surplus at any relevant spatial scale, thereby supporting the development of effective water and land management strategies.
Data description:
1. District-level N surplus data (CSV format): This dataset includes 12 columns of N surplus values (Kg N/ha), each representing 52 years (1966-2017) of district-level data. The N surplus is calculated using different methods and data choices. 12_nitrogen_budjet_1966_2017
2. Aggregated N surplus at the state level (CSV format): This dataset contains 52 years (1966-2017) of N surplus data (Kg N/ha) at the state level, including the mean and standard deviation of 12 different estimates. State_N_surplus_mean_std.csv
3. District centroid locations: This dataset contains latitude and longitude coordinates for the centroid locations of districts in India where data is available, which can be used as identifiers. We have followed district name and classification using ICRISAT 1966 identifiers. district_centroids_lat_long.csv
4. Aggregated N Surplus (Kg/ha) at Indian River Basins: This dataset encompasses 52 years (1966-2017) of mean nitrogen surplus (Kg N/ha) data at the basin level, classified according to the basin classification provided by the Central Water Commission of India. The dataset is available in the file basin_level_mean_N_surplus_kg_ha_1966_2017.csv
Additionally, a CSV file named river_basin_IDs.csv
is provided, which contains the river basin IDs along with their names as assigned by the Central Water Commission (CWC) of India.
5. Aggregated N Surplus (Kg/ha) at Sub-Basin level: This dataset includes 52 years (1966-2017) of mean nitrogen surplus data (Kg N/ha) at the sub-basin level, based on the level 5 HydroSHEDS basin classification. The dataset is available in the file Sub_basin_mean_N_surplus_kg_ha_1966_2017.csv
Unit: kg/ha/yr (ha = net cropping area)
Time period: 1966-2017
Further information:
Details/Citation: Assessing Temporal Dynamics of Nitrogen Surplus in Indian Agriculture: District-Scale Data from 1966 to 2017 by S.S. Goyal, U. Bhatia and R. Kumar.
Further queries regarding these datasets can be directed to Shekhar Goyal (goyal_shekhar@iitgn.ac.in), Udit Bhatia (bhatia.u@iitgn.ac.in) and Rohini Kumar (rohini.kumar@ufz.de).
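The state-level aggregation described in item 2 (mean and standard deviation across the 12 estimates) could be sketched as below. This is an assumption-laden illustration: the column names (`state`, `year`, and the estimate columns) are placeholders, not the actual schema of the files.

```python
import pandas as pd

def state_mean_std(df, estimate_cols):
    """Average each N surplus estimate over the districts of a state-year,
    then take the mean and standard deviation across the estimates."""
    by_state = df.groupby(["state", "year"], as_index=False)[estimate_cols].mean()
    by_state["mean_kg_ha"] = by_state[estimate_cols].mean(axis=1)
    by_state["std_kg_ha"] = by_state[estimate_cols].std(axis=1)
    return by_state[["state", "year", "mean_kg_ha", "std_kg_ha"]]
```

Because the district-level file carries all 12 estimates as columns, the same pattern also supports aggregation to basins or sub-basins by swapping the grouping keys.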
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
This dataset presents a dual-version representation of employment-related data from India, crafted to highlight the importance of data cleaning and transformation in any real-world data science or analytics project.
It includes two parallel datasets: 1. Messy Dataset (Raw) – Represents a typical unprocessed dataset often encountered in data collection from surveys, databases, or manual entries. 2. Cleaned Dataset – This version demonstrates how proper data preprocessing can significantly enhance the quality and usability of data for analytical and visualization purposes.
Each record captures multiple attributes related to individuals in the Indian job market, including:
- Age Group
- Employment Status (Employed/Unemployed)
- Monthly Salary (INR)
- Education Level
- Industry Sector
- Years of Experience
- Location
- Perceived AI Risk
- Date of Data Recording
The raw dataset underwent comprehensive transformations to convert it into its clean, analysis-ready form:
- Missing Values: Identified and handled using either row elimination (where critical data was missing) or imputation techniques.
- Duplicate Records: Identified using row comparison and removed to prevent analytical skew.
- Inconsistent Formatting: Unified inconsistent column naming (like 'monthly_salary_(inr)' → 'Monthly Salary (INR)'), capitalization, and string spacing.
- Incorrect Data Types: Converted columns like salary from string/object to float for numerical analysis.
- Outliers: Detected and handled based on domain logic and distribution analysis.
- Categorization: Converted numeric ages into grouped age categories for comparative analysis.
- Standardization: Applied uniform labels for employment status, industry names, education, and AI risk levels for visualization clarity.
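A minimal pandas sketch of several of the cleaning steps described above. The raw column names (`monthly_salary_(inr)`, `age`) and the age bins are assumptions for illustration; they are not the dataset's documented schema.

```python
import pandas as pd

def clean(df):
    """Apply a subset of the described transformations to a raw frame."""
    df = df.copy()
    # Inconsistent formatting: unify column naming
    df = df.rename(columns={"monthly_salary_(inr)": "Monthly Salary (INR)"})
    # Duplicate records: drop exact duplicates to prevent analytical skew
    df = df.drop_duplicates()
    # Incorrect data types: salary from string/object to float
    df["Monthly Salary (INR)"] = pd.to_numeric(
        df["Monthly Salary (INR)"], errors="coerce")
    # Missing values: drop rows where the critical salary field is missing
    df = df.dropna(subset=["Monthly Salary (INR)"])
    # Categorization: numeric ages into grouped age categories
    df["Age Group"] = pd.cut(df["age"], bins=[17, 25, 35, 50, 65],
                             labels=["18-25", "26-35", "36-50", "51-65"])
    return df
```

Outlier handling and label standardization would follow the same pattern, driven by whatever domain rules the dataset's author applied.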
This dataset is ideal for learners and professionals who want to understand:
- The impact of messy data on visualization and insights
- How transformation steps can dramatically improve data interpretation
- Practical examples of preprocessing techniques before feeding into ML models or BI tools
It's also useful for:
- Training ML models with clean inputs
- Data storytelling with visual clarity
- Demonstrating reproducibility in data cleaning pipelines
By examining both the messy and clean datasets, users gain a deeper appreciation for why “garbage in, garbage out” rings true in the world of data science.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Mammal occurrence records (2022-24) from Sakleshpura, central Western Ghats, India
This dataset contains mammal occurrence records from 2022 to 2024 in the Sakleshpura region of central Western Ghats, India. It includes a few occurrence records of other chordates. Occurrence records were gathered in the field by researchers of the Nature Conservation Foundation, India, using a mobile data collection application. Suggested citation is:
Nature Conservation Foundation (2024). Mammal occurrence records (2022-24) from Sakleshpura, central Western Ghats, India. Nature Conservation Foundation, India. Dataset
Keywords: tropical rainforest, plantations, Sakleshpura, animal distribution, Western Ghats
CONTACT #1
1. Name: Anand M Osuri
2. Work Address: Nature Conservation Foundation, 1311, 12th A Main, Vijayanagar 1st Stage, Mysuru 570017, Karnataka, India
3. Work Phone: +91 821 2515601
4. Email address: aosuri@ncf-india.org
5. ORCID: https://orcid.org/0000-0001-9909-5633
CONTACT #2
1. Name: Vijay Karthick
2. Work Address: Nature Conservation Foundation, 1311, 12th A Main, Vijayanagar 1st Stage, Mysuru 570017, Karnataka, India
3. Work Phone: +91 821 2515601
4. Email address: vijayk@ncf-india.org
5. ORCID: https://orcid.org/0000-0001-6023-3955
CONTACT #3
1. Name: Vijay Kumar
2. Work Address: Nature Conservation Foundation, 1311, 12th A Main, Vijayanagar 1st Stage, Mysuru 570017, Karnataka, India
3. Work Phone: +91 821 2515601
4. Email address: vijaykumar@ncf-india.org
5. ORCID: https://orcid.org/0009-0000-4149-0083
Geographic Coverage:
1. Location/Study Area: Sakleshpura, Karnataka, India
2. GPS coordinates: Kadamane Village (12.924647, 75.654650)
Temporal Coverage:
1. Begins: 2022-05-16 (Year, Month, Day)
2. Ends: 2024-05-22 (Year, Month, Day)
Besides the 000_readMe.txt file containing this information and the 14 images associated with individual observations, the dataset includes three comma-delimited text (csv) files, and one R code file as explained below:
1) 001_mammalData.csv -- This file has the main mammal occurrence data with relevant and renamed columns derived from the original downloaded Excel worksheet file
2) 002_placeLocs.csv -- This file lists named places for which the GPS location was unavailable from the mobile phone application and was manually assigned coordinates with 500 or 1000 m accuracy
3) 003_nameMatch.csv -- This file matches the name as originally recorded with the correct common name and scientific name
4) 004_GBIF_upload_code.R -- R code for processing the files to create a file for upload as an occurrence dataset on the Global Biodiversity Information Facility (GBIF.org)
5) 005_download_images_from_googledrive.R - R code to extract image IDs and download images from googledrive
6) 006_kadamane_mammal_occurrence.xlsx -- An Excel file containing the raw data used by the scripts above
FILES INCLUDED IN DATASET
001_mammaldata.csv
This file has the main mammal occurrence data with relevant and renamed columns derived from the original downloaded Excel worksheet file
observers: Observers who made the observation
timestamp: Automatic time stamp of date and time when app was used
date: Date of observation
time: Time of observation
decimalLatitude: Latitude in decimal degrees N
decimalLongitude: Longitude in decimal degrees E
GPSaltitude: Altitude in metres
GPSaccuracy: Horizontal accuracy of GPS location in metres
place: Name of locality
habitat: Habitat type
taxa: mammal or reptile/amphibian
species: Species common name
count: Number of individuals observed
countType: Total (solitary or fully counted groups) or Partial (incompletely counted groups)
obsType: Type of observation: sighting, sign (droppings or vocalisation), death, roadkill, electrocution, other
notes: Notes or remarks on observation
imageID: Link to the google drive photo, if photo is available
instanceID: Automatically generated unique identifier of observation
002_placeLocs.csv
This file lists named places for which the GPS location was unavailable from the mobile phone application and was manually assigned coordinates with 500 or 1000 m accuracy
place: Name of locality as recorded
lat: Assigned latitude in decimal degrees N
long: Assigned longitude in decimal degrees E
GPSaccuracy: Assigned as 500 or 1000m – Horizontal accuracy of GPS location in metres
003_nameMatch.csv
This file matches the name as originally recorded with the correct common name and scientific name.
verbatimIdentification: Identification as originally recorded in the ‘species’ column of the mammaldata.csv file
vernacularName: Common or English name
scientificName: Scientific name
004_GBIF_upload_code.R
R code for processing the files to create a file for upload as an occurrence dataset on the Global Biodiversity Information Facility (GBIF.org)
005_download_images_from_googledrive.R
R code that extracts imageIDs from the 001_mammalData.csv file and downloads them automatically to a preferred directory
006_kadamane_mammal_occurrence.xlsx
An Excel file containing the raw data used by the scripts above
Eximpedia export-import trade data lets you search trade data and find active exporters, importers, buyers, suppliers, and manufacturers from over 209 countries.
Our India zip code Database offers comprehensive postal code data for spatial analysis, including postal and administrative areas. This dataset contains accurate and up-to-date information on all administrative divisions, cities, and zip codes, making it an invaluable resource for various applications such as address capture and validation, map and visualization, reporting and business intelligence (BI), master data management, logistics and supply chain management, and sales and marketing. Our location data packages are available in various formats, including CSV, optimized for seamless integration with popular systems like Esri ArcGIS, Snowflake, QGIS, and more. Product features include fully and accurately geocoded data, multi-language support with address names in local and foreign languages, comprehensive city definitions, and the option to combine map data with UNLOCODE and IATA codes, time zones, and daylight saving times. Companies choose our location databases for their enterprise-grade service, reduction in integration time and cost by 30%, and weekly updates to ensure the highest quality.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Using a Python script to scrape data from the web, we collected data on all 1,698 Hindi-language movies released in India across a 13-year period (2005-2017) from the website of Box Office India.
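A scraper of the kind described might look like the sketch below. This is purely illustrative: the URL pattern and the HTML table structure of Box Office India are assumptions, not the authors' actual script.

```python
import requests
from bs4 import BeautifulSoup  # third-party: pip install beautifulsoup4

def parse_movie_table(html):
    """Extract (title, gross) pairs from rows of an HTML table."""
    soup = BeautifulSoup(html, "html.parser")
    rows = []
    for tr in soup.select("table tr"):
        cells = [td.get_text(strip=True) for td in tr.find_all("td")]
        if len(cells) >= 2:
            rows.append((cells[0], cells[1]))
    return rows

def scrape_year(year):
    # hypothetical URL structure for illustration only
    resp = requests.get(f"https://www.boxofficeindia.com/year/{year}")
    resp.raise_for_status()
    return parse_movie_table(resp.text)
```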
CompanyData.com, powered by BoldData, delivers high-quality, verified B2B company information from official trade registers around the world. Our India company database includes 32,468,995 verified business records, giving you powerful insight into one of the fastest-growing economies on the planet.
Each company profile is rich with firmographic data, including company name, CIN (Corporate Identification Number), registration number, legal status, industry classification (NIC codes), revenue range, and employee size. Many records are enhanced with contact details such as email addresses, phone numbers, and names of key decision-makers, supporting direct outreach and smarter segmentation.
Our India dataset is designed for a wide range of business applications — from KYC and AML compliance, due diligence, and regulatory checks, to B2B sales, lead generation, marketing campaigns, CRM enrichment, and AI model training. Whether you’re targeting local startups or large enterprises, our data helps you connect with the right businesses at the right time.
Delivery is flexible to suit your needs. Choose from customized lists, full databases in Excel or CSV, access via our real-time API, or our intuitive self-service platform. We also offer data enrichment and cleansing services to refresh and improve your existing datasets with accurate, up-to-date company information from India.
With access to 32,468,995 verified companies across more than 200 countries, CompanyData.com helps businesses grow confidently — in India and beyond. Rely on our precise, structured data to fuel your strategies and scale with speed and accuracy.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Analysis of ‘India Census 2011’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://www.kaggle.com/danofer/india-census on 13 February 2022.
--- Dataset description provided by original source is as follows ---
2011 India census data. Includes population/demographic data and housing data for each district.
Data is raw counts per district, not normalized percentages!
Gathered from 2 sources: https://github.com/pigshell/india-census-2011 https://github.com/nishusharma1608/India-Census-2011-Analysis
Original census data released (and owned by) the Registrar General and Census Commissioner of India under the Ministry of Home Affairs, Government of India.
--- Original source retains full ownership of the source dataset ---
This dataset contains number of crimes filed under each category of the Indian Penal Code (IPC), number of victims of those crimes, and average crime rate. The data is presented separately by IPC category and sub-category. Data are available at the state/UT level for 2018.
● 7060_source_data.csv: The raw data from the source with original administrative dimensions. This dataset may have already been restructured by scraping PDFs, combining files, or pivoting tables to fit the proper tabular format used by NDAP, but the actual data values remain unchanged.
● NDAP_REPORT_7060.csv: The final standardised data using LGD geographic dimensions as seen on NDAP.
● 7060_metadata.csv: Variable-level metadata, including the following fields:
❖ VariableName: The full variable name as it appears in the data
❖ VariableCode: A unique variable code that is used as a short name for the variable during internal processing and can be used for simplicity if desired
❖ Type_Of_Variable: The classification of the column, whether it is a dimension or a variable (i.e. indicator)
❖ Unit_Of_Measure:
❖ Aggregation_Type: The default aggregation function to be used when aggregating each variable
❖ Weighing_Variable_Name: The weight assigned to each variable that is used by default when aggregating
❖ Weighing_Variable_ID: The weighing variable ID corresponding to the weighing variable name
❖ Long_Description: A more descriptive definition of the variable
❖ Scaling_factor: Scaling factor from source
● 7060_KEYS.csv: The key which maps source administrative units to the standardised Local Government Directory (LGD) dimensions. This file also contains pre-calculated weights for every constituent unit mapped from the source dimensions into the LGD. You can interpret each row as describing what fraction of the source unit is mapped to a corresponding LGD unit. This file includes the following fields:
❖ src[Unit]Name: The administrative unit name as it appears in the source data. Depending on the dataset, that may include State, District, Subdistrict, Block, Village/Town, etc.
❖ [Unit]Name: The standardised administrative unit name as it appears in the LGD. Depending on the dataset, that may include State, District, Subdistrict, Block, Village/Town, etc.
❖ [Unit]Code: The standardised administrative unit code corresponding to the unit name in the LGD.
❖ Year: The year in which the data was collected or reported. Depending on the dataset, other temporal variables may also be present (Quarter, Month, Calendar Day, etc.)
❖ Number_Of_Children: The number of LGD units associated with the mapping described by an individual row. Units from the source that have undergone a split will contain multiple children.
❖ Number_Of_Parents: The number of source units associated with the mapping described by an individual row. Units from the source that have undergone a merge will contain multiple parents.
❖ Weighing_Variables: Households, Population, Male Population, Female Population, Land Area (Total, Rural, and Urban versions of each). For each weighing variable there are the following associated fields:
■ Count: The total count of households, population, or land area mapped from the source unit to the LGD unit for that particular row (NumberOfHouseholds, TotalPopulation, LandArea).
■ Mapping_Error: The percentage error due to missing villages in the base data, i.e. what fraction of the weighing variable is dropped because the microdata could not be mapped to the LGD.
■ Weighing_Ratio: The weighing ratio for that constituent match of source unit to LGD unit for each particular row. This is the fraction applied to the source data to achieve the LGD-standardised final data.
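Applying the key file's pre-calculated weights could be sketched as below. This is an illustrative assumption: the concrete column names (`srcDistrictName`, `DistrictName`, `Weighing_Ratio`) follow the field descriptions above, but the real files may differ.

```python
import pandas as pd

def map_to_lgd(source_df, keys_df, value_col):
    """Split each source unit's value across its LGD children by weight,
    then sum the contributions arriving at each LGD unit."""
    merged = source_df.merge(keys_df, on="srcDistrictName")
    merged[value_col] = merged[value_col] * merged["Weighing_Ratio"]
    return merged.groupby("DistrictName", as_index=False)[value_col].sum()
```

A source district that split into two LGD districts thus has its counts divided between them in proportion to the weighing ratios, while merged districts accumulate the sums of their parents.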
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This repository was created for my Master's thesis in Computational Intelligence and Internet of Things at the University of Córdoba, Spain. The purpose of this repository is to store the datasets found that were used in some of the studies that served as research material for this Master's thesis. Also, the datasets used in the experimental part of this work are included.
Below are the datasets specified, along with the details of their references, authors, and download sources.
----------- STS-Gold Dataset ----------------
The dataset consists of 2026 tweets. The file consists of 3 columns: id, polarity, and tweet. The three columns denote the unique id, polarity index of the text and the tweet text respectively.
Reference: Saif, H., Fernandez, M., He, Y., & Alani, H. (2013). Evaluation datasets for Twitter sentiment analysis: a survey and a new dataset, the STS-Gold.
File name: sts_gold_tweet.csv
----------- Amazon Sales Dataset ----------------
This dataset contains ratings and reviews for 1K+ Amazon products, as per their details listed on the official website of Amazon. The data was scraped in January 2023.
Owner: Karkavelraja J., Postgraduate student at Puducherry Technological University (Puducherry, Puducherry, India)
Features:
License: CC BY-NC-SA 4.0
File name: amazon.csv
----------- Rotten Tomatoes Reviews Dataset ----------------
This rating inference dataset is a sentiment classification dataset containing 5,331 positive and 5,331 negative processed sentences from Rotten Tomatoes movie reviews. On average, these reviews consist of 21 words. The first 5,331 rows contain only negative samples and the last 5,331 rows contain only positive samples, so the data should be shuffled before use.
This data is collected from https://www.cs.cornell.edu/people/pabo/movie-review-data/ as a txt file and converted into a csv file. The file consists of 2 columns: reviews and labels (1 for fresh (good) and 0 for rotten (bad)).
Reference: Bo Pang and Lillian Lee. Seeing stars: Exploiting class relationships for sentiment categorization with respect to rating scales. In Proceedings of the 43rd Annual Meeting of the Association for Computational Linguistics (ACL'05), pages 115–124, Ann Arbor, Michigan, June 2005. Association for Computational Linguistics
File name: data_rt.csv
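Because the rows are ordered by class (all negative first, all positive last), a seeded shuffle before any train/test split avoids a degenerate split. A minimal pandas sketch; the four sample rows are made up, and only the column names (reviews, labels) follow the description above.

```python
import pandas as pd

# Toy stand-in for data_rt.csv: class-ordered rows, negative (0) first.
df = pd.DataFrame({
    "reviews": ["bad movie", "dull plot", "great film", "loved it"],
    "labels": [0, 0, 1, 1],  # 1 = fresh (good), 0 = rotten (bad)
})

# Shuffle all rows reproducibly, then reset the index.
shuffled = df.sample(frac=1, random_state=42).reset_index(drop=True)
```

With real data, replace the toy frame with `pd.read_csv("data_rt.csv")` and shuffle before splitting.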
----------- Preprocessed Dataset Sentiment Analysis ----------------
Preprocessed Amazon product review data for the Gen3EcoDot (Alexa), scraped entirely from amazon.in.
Stemmed and lemmatized using NLTK.
Sentiment labels are generated using TextBlob polarity scores.
The file consists of 4 columns: index, review (the review text, stemmed and lemmatized with NLTK), polarity (the polarity score), and division (a categorical label generated from the polarity score).
DOI: 10.34740/kaggle/dsv/3877817
Citation: @misc{pradeesh arumadi_2022, title={Preprocessed Dataset Sentiment Analysis}, url={https://www.kaggle.com/dsv/3877817}, DOI={10.34740/KAGGLE/DSV/3877817}, publisher={Kaggle}, author={Pradeesh Arumadi}, year={2022} }
This dataset was used in the experimental phase of my research.
File name: EcoPreprocessed.csv
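The division label can be reproduced from a TextBlob polarity score with a simple thresholding function. The cut-offs and label names below are an assumption, since the exact thresholds used for EcoPreprocessed.csv are not documented.

```python
def polarity_to_division(polarity: float) -> str:
    """Map a TextBlob polarity score in [-1, 1] to a categorical label.

    Hypothetical thresholds: a zero cut-off on either side, with exactly
    zero treated as neutral. The actual dataset may use different cut-offs.
    """
    if polarity > 0:
        return "positive"
    if polarity < 0:
        return "negative"
    return "neutral"
```

In practice the polarity input would come from `TextBlob(text).sentiment.polarity`, applied row by row over the review column.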
----------- Amazon Earphones Reviews ----------------
This dataset consists of 9930 Amazon reviews and star ratings for the 10 latest (as of mid-2019) Bluetooth earphone devices, collected for training machine learning models for sentiment analysis.
This dataset was employed in the experimental phase of my research. To align it with the objectives of my study, certain reviews were excluded from the original dataset, and an additional column was incorporated into this dataset.
The file consists of 5 columns: ReviewTitle, ReviewBody, ReviewStar, Product, and division (manually added: a categorical label generated from the ReviewStar score).
License: U.S. Government Works
Source: www.amazon.in
File name (original): AllProductReviews.csv (contains 14337 reviews)
File name (edited, used for my research): AllProductReviews2.csv (contains 9930 reviews)
----------- Amazon Musical Instruments Reviews ----------------
This dataset contains 7137 comments/reviews of different musical instruments coming from Amazon.
This dataset was employed in the experimental phase of my research. To align it with the objectives of my study, certain reviews were excluded from the original dataset, and an additional column was incorporated into this dataset.
The file consists of 10 columns: reviewerID, asin (ID of the product), reviewerName, helpful (helpfulness rating of the review), reviewText, overall (rating of the product), summary (summary of the review), unixReviewTime (time of the review, in Unix time), reviewTime (time of the review, raw), and division (manually added: a categorical label generated from the overall score).
Source: http://jmcauley.ucsd.edu/data/amazon/
File name (original): Musical_instruments_reviews.csv (contains 10261 reviews)
File name (edited, used for my research): Musical_instruments_reviews2.csv (contains 7137 reviews)
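For both of the edited datasets above, the added division column is derived from a star rating. One way to generate such a column with pandas is binning via `pd.cut`; the thresholds below are an assumption, as the actual cut-offs are not stated.

```python
import pandas as pd

# Toy frame standing in for the "overall" star-rating column.
reviews = pd.DataFrame({"overall": [5.0, 4.0, 3.0, 2.0, 1.0]})

# Hypothetical cut-offs: <=2 negative, 3 neutral, >=4 positive.
reviews["division"] = pd.cut(
    reviews["overall"],
    bins=[0, 2, 3, 5],
    labels=["negative", "neutral", "positive"],
)
```

The same pattern applies to the ReviewStar column of the earphone dataset.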
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This heart disease dataset was acquired from one of the multispecialty hospitals in India. It covers over 14 common features, making it one of the most comprehensive heart disease datasets available so far for research purposes. The dataset consists of 1000 subjects with 12 features, and will be useful for building early-stage heart disease detection systems as well as predictive machine learning models.
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Maps with wind speed, wind rose and wind power density potential in India. The GIS data stems from the Global Wind Atlas (http://globalwindatlas.info/). GIS data is available as JSON and CSV. The second link provides poster size (.pdf) and midsize maps (.png).
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The Residential Schools Locations Dataset in shapefile format contains the locations (latitude and longitude) of Residential Schools and student hostels operated by the federal government in Canada. All the residential schools and hostels that are listed in the Indian Residential School Settlement Agreement are included in this dataset, as well as several Industrial schools and residential schools that were not part of the IRSSA. This version of the dataset doesn't include the five schools under the Newfoundland and Labrador Residential Schools Settlement Agreement. The original school location data was created by the Truth and Reconciliation Commission, and was provided to the researcher (Rosa Orlandini) by the National Centre for Truth and Reconciliation in April 2017. The dataset was created by Rosa Orlandini, and builds upon and enhances the previous work of the Truth and Reconciliation Commission, Morgan Hite (creator of the Atlas of Indian Residential Schools in Canada that was produced for the Tk'emlups First Nation and Justice for Day Scholar's Initiative), and Stephanie Pyne (project lead for the Residential Schools Interactive Map). Each individual school location in this dataset is attributed to RSIM, Morgan Hite, the NCTR, or Rosa Orlandini. Many schools/hostels had several locations throughout the history of the institution. If the school/hostel moved from its original location to another property, then the school is considered to have two unique locations in this dataset: the original location and the new location. For example, Lejac Indian Residential School had two locations while it was operating, Stuart Lake and Fraser Lake. If a new school building was constructed on the same property as the original school building, it isn't considered to be a new location, as is the case of Girouard Indian Residential School.
When the precise location is known, the coordinates of the main building are provided; when the precise location of the building isn't known, an approximate location is provided. For each residential school institution location, the following information is provided: official names, alternative names, dates of operation, religious affiliation, latitude and longitude coordinates, community location, Indigenous community name, contributor (of the location coordinates), school/institution photo (when available), location point precision, type of school (hostel or residential school), and a list of references used to determine the location of the main buildings or sites. The geographic coordinate system for this dataset is WGS 1984. The data in shapefile format [IRS_locations.zip] can be viewed and mapped in Geographic Information System (GIS) software. Detailed metadata in xml format is available as part of the data in shapefile format. In addition, the field name descriptions (IRS_locfields.csv) and the detailed location descriptions (IRS_locdescription.csv) should be used alongside the data in shapefile format.