This dataset contains 55,000 entries of synthetic customer transactions, generated using Python's Faker library. The goal behind creating this dataset was to provide a resource for learners like myself to explore, analyze, and apply various data analysis techniques in a context that closely mimics real-world data.
About the Dataset:
- CID (Customer ID): A unique identifier for each customer.
- TID (Transaction ID): A unique identifier for each transaction.
- Gender: The gender of the customer, categorized as Male or Female.
- Age Group: Age group of the customer, divided into several ranges.
- Purchase Date: The timestamp of when the transaction took place.
- Product Category: The category of the product purchased, such as Electronics, Apparel, etc.
- Discount Availed: Indicates whether the customer availed any discount (Yes/No).
- Discount Name: Name of the discount applied (e.g., FESTIVE50).
- Discount Amount (INR): The amount of discount availed by the customer.
- Gross Amount: The total amount before applying any discount.
- Net Amount: The final amount after applying the discount.
- Purchase Method: The payment method used (e.g., Credit Card, Debit Card, etc.).
- Location: The city where the purchase took place.
Use Cases:
1. Exploratory Data Analysis (EDA): This dataset is ideal for conducting EDA, allowing users to practice techniques such as summary statistics, visualizations, and identifying patterns within the data.
2. Data Preprocessing and Cleaning: Learners can work on handling missing data, encoding categorical variables, and normalizing numerical values to prepare the dataset for analysis.
3. Data Visualization: Use tools like Python's Matplotlib, Seaborn, or Power BI to visualize purchasing trends, customer demographics, or the impact of discounts on purchase amounts.
4. Machine Learning Applications: After feature engineering, this dataset is suitable for supervised learning models, such as predicting whether a customer will avail a discount or forecasting purchase amounts based on the input features.
This dataset provides an excellent sandbox for honing skills in data analysis, machine learning, and visualization in a structured but flexible manner.
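As a minimal sketch of use case 1, assuming the column names listed above, an EDA pass in pandas might look like this (a tiny inline frame stands in for the real 55,000-row file):

```python
import pandas as pd

# Hypothetical stand-in for the real CSV; column names follow the description above.
df = pd.DataFrame({
    "CID": [1, 2, 3, 4],
    "Gender": ["Male", "Female", "Female", "Male"],
    "Product Category": ["Electronics", "Apparel", "Electronics", "Apparel"],
    "Discount Availed": ["Yes", "No", "Yes", "No"],
    "Gross Amount": [1000.0, 500.0, 1200.0, 800.0],
    "Net Amount": [900.0, 500.0, 1100.0, 800.0],
})

# Summary statistics for the numeric columns.
stats = df[["Gross Amount", "Net Amount"]].describe()

# One pattern worth checking: average discount given per product category.
df["Discount"] = df["Gross Amount"] - df["Net Amount"]
avg_discount = df.groupby("Product Category")["Discount"].mean()
print(avg_discount)
```

The same `groupby` pattern extends to gender, age group, or payment method once the real file is loaded with `pd.read_csv`.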
This is not a real dataset. It was generated using Python's Faker library for the sole purpose of learning.
This dataset was created by 64 Aashish Chaudhary
Released under Other (specified in description)
This is transaction data from an electronics store chain in the US. The data contains 12 CSV files, one for each month of 2019.
The naming convention is as follows: Sales_[MONTH_NAME]_2019
Each file contains anywhere from around 9000 to 26000 rows and 6 columns. The columns are as follows:
Order ID, Product, Quantity Ordered, Price Each, Order Date, Purchase Address
There are around 186,851 data points combining all 12 monthly files. There may be null values in some rows.
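A common first step with these files is concatenating all twelve months into one frame and dropping the null rows mentioned above. A sketch, assuming file names follow the Sales_[MONTH_NAME]_2019 convention (tiny stand-in files are written to a temporary directory here purely for illustration):

```python
import glob
import os
import tempfile

import pandas as pd

# Write two tiny stand-in monthly files; the real data has twelve.
tmp = tempfile.mkdtemp()
cols = "Order ID,Product,Quantity Ordered,Price Each,Order Date,Purchase Address"
with open(os.path.join(tmp, "Sales_January_2019.csv"), "w") as f:
    f.write(cols + "\n1,USB-C Cable,2,11.95,01/01/19 08:46,1 Main St\n")
with open(os.path.join(tmp, "Sales_February_2019.csv"), "w") as f:
    f.write(cols + "\n2,Monitor,1,149.99,02/03/19 10:12,2 Elm St\n,,,,,\n")

# Concatenate every monthly file into one frame, then drop fully-null rows.
paths = sorted(glob.glob(os.path.join(tmp, "Sales_*_2019.csv")))
frames = [pd.read_csv(p) for p in paths]
sales = pd.concat(frames, ignore_index=True).dropna(how="all")
print(len(sales))  # one row per non-null order
```

With the real files, pointing `glob` at the data directory yields the combined ~186,851-row frame in one pass.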
https://creativecommons.org/publicdomain/zero/1.0/
This is a cleaned version of a Netflix movies dataset originally used for exploratory data analysis (EDA).
Missing values have been handled using appropriate methods (mean, median, or 'unknown'), and new features such as rating_level and popular have been added for deeper analysis.
The dataset is ready for:
- EDA
- Data visualization
- Machine learning tasks
- Dashboard building
Used in the accompanying notebook
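The cleaning described above can be sketched as follows. The column names (`rating`, `duration`, `country`) are hypothetical since the exact schema is not listed here; the pattern is mean/median imputation for numerics, 'unknown' for categoricals, and a derived `rating_level` feature:

```python
import numpy as np
import pandas as pd

# Hypothetical frame; the real column names may differ.
df = pd.DataFrame({
    "rating": [7.0, np.nan, 9.0, 6.0],
    "duration": [90, 120, np.nan, 100],
    "country": ["US", None, "UK", "US"],
})

# Numeric gaps: mean for rating, median for duration; categorical gaps: 'unknown'.
df["rating"] = df["rating"].fillna(df["rating"].mean())
df["duration"] = df["duration"].fillna(df["duration"].median())
df["country"] = df["country"].fillna("unknown")

# Derived feature, as in the description: bucket ratings into levels.
df["rating_level"] = pd.cut(df["rating"], bins=[0, 6.5, 8, 10],
                            labels=["low", "medium", "high"])
print(df.isna().sum().sum())  # 0 remaining nulls
```

The bin edges here are illustrative; the actual thresholds used for `rating_level` are defined in the accompanying notebook.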
This dataset was created by syamalakumar
- ID: Unique identifier for each individual in the dataset.
- Name: Name of the individual.
- Date_of_Birth: Birth date of the individual.
- Avg_Salary: Average salary of the individual.
- Nationality: Nationality of the individual.
- Address: Address of the individual.
- Zip_Code: Zip code of the individual's address.
- Monthly_Spending(USD): Monthly spending in USD by the individual.
- Health_Condition: Health condition of the individual (e.g., Diabetes, Asthma, Depression, Cancer).
- Marital_Status: Marital status of the individual (e.g., Married, Single, Divorced).
- Education: Highest education level attained by the individual (e.g., Bachelor's, Master's, Ph.D., High School).
- Gender: Gender of the individual.
- Occupation: Occupation of the individual (e.g., Engineer, Doctor, Teacher, Businessman, Nurse, IT Professional, Chef, Scientist).
- Number_of_Child: Number of children the individual has.
- Email_Address: Email address of the individual.
- Blood_Type: Blood type of the individual.
- Property_Value: Value of the individual's property.
- Credit_Score: Credit score of the individual.
- Smoking_Habit: Smoking habit of the individual (Yes/No).
- Preferred_Social_Network: Preferred social network of the individual (e.g., Facebook, Instagram, WhatsApp, Snapchat).
Conducted an in-depth analysis of Cyclistic bike-share data to uncover customer usage patterns and trends. Cleaned and processed raw data using Python libraries such as pandas and NumPy to ensure data quality. Performed exploratory data analysis (EDA) to identify insights, including peak usage times, customer demographics, and trip duration patterns. Created visualizations using Matplotlib and Seaborn to effectively communicate findings. Delivered actionable recommendations to enhance customer engagement and optimize operational efficiency.
https://creativecommons.org/publicdomain/zero/1.0/
This data is publicly available on GitHub. It can be utilized for EDA, statistical analysis, and visualizations.
The dataset ifood_df.csv consists of 2,206 customers of XYZ company with data on:
- Customer profiles
- Product preferences
- Campaign successes/failures
- Channel performance
I do not own this dataset. I am simply making it accessible on this platform via the public GitHub link.
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
This dataset was created by Chaitu Devi
Released under MIT
https://creativecommons.org/publicdomain/zero/1.0/
Steps
1. Load Data
2. Check Nulls and Update Data if required
3. Perform Descriptive Statistics
4. Data Visualization
Univariate - Single Column Visualization
categorical - countplot
continuous - histogram
Bivariate - 2 Columns Visualization
continuous vs continuous - scatterplot, regplot
categorical vs continuous - boxplot
categorical vs categorical - crosstab, heatmap
Multivariate - Multi Columns Visualization
correlation plot
pairplot
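The steps above can be sketched in a single script. This is a minimal illustration using pandas plotting on matplotlib's non-interactive Agg backend so it runs headless; seaborn's countplot, regplot, boxplot, heatmap, and pairplot are drop-in upgrades for each panel:

```python
import matplotlib
matplotlib.use("Agg")  # headless backend, no display needed
import matplotlib.pyplot as plt
import pandas as pd

# 1. Load data (a tiny inline frame stands in for a real CSV).
df = pd.DataFrame({
    "category": ["A", "B", "A", "B", "A"],
    "price": [10.0, 20.0, 12.0, 22.0, 11.0],
    "rating": [4.0, 3.5, 4.2, 3.8, 4.1],
})

# 2. Check nulls.
nulls = df.isna().sum()

# 3. Descriptive statistics.
desc = df.describe()

# 4. Visualization: one panel per pattern named above.
fig, axes = plt.subplots(2, 2, figsize=(8, 6))
df["category"].value_counts().plot.bar(ax=axes[0, 0])    # univariate categorical (countplot)
df["price"].plot.hist(ax=axes[0, 1])                     # univariate continuous (histogram)
axes[1, 0].scatter(df["price"], df["rating"])            # continuous vs continuous (scatterplot)
df.boxplot(column="price", by="category", ax=axes[1, 1]) # categorical vs continuous (boxplot)
corr = df[["price", "rating"]].corr()                    # multivariate: correlation matrix
fig.savefig("eda_overview.png")
```

A `pd.crosstab` of two categorical columns rendered with `sns.heatmap`, and `sns.pairplot(df)` for the multivariate view, complete the checklist.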
https://creativecommons.org/publicdomain/zero/1.0/
Context
EDA using NumPy and pandas.
In this task, I have to predict which factors make an app perform well: whether it is size, price, category, or multiple factors together, and what makes an app rank at the top of the Google Play Store.
Column description:
- App: name of the application
- Category: category of the application
- Rating: rating of the application
- Reviews: reviews of the application
- Size: size of the application
- Installs: how many users installed the application
- Type: type of the application
- Price: price of the application
- Content Rating: rating of the content of the application
https://creativecommons.org/publicdomain/zero/1.0/
About Dataset
Description
The dataset was downloaded from Kaggle.
Context
Google PlayStore Android App Data. (2.3 Million+ App Data)
Backup repo: https://github.com/gauthamp10/Google-Playstore-Dataset
Content
I collected the data with the help of a Python script (Scrapy) running on a cloud VM instance.
The data was collected in June 2025.
Also check out:
Apple AppStore Apps dataset: https://www.kaggle.com/gauthamp10/apple-appstore-apps
Android App Permission dataset: https://www.kaggle.com/gauthamp10/app-permissions-android
Acknowledgements
I couldn't have built this dataset without the help of GitHub Education, and I switched to facundoolano/google-play-scraper for sanity reasons.
Inspiration
Took inspiration from: https://www.kaggle.com/lava18/google-play-store-apps to build a big database for students and researchers.
https://creativecommons.org/publicdomain/zero/1.0/
This dataset was created by ADITYA MISHRA
Released under CC0: Public Domain
This notebook presents a comprehensive exploratory data analysis on a smartphone dataset, covering the distribution of prices, feature prevalence, and relationships between specs and price. All code, plots, and insights are included. Feedback and suggestions welcome!
https://creativecommons.org/publicdomain/zero/1.0/
A retail store with multiple outlets across the country is facing issues in managing its inventory: matching demand with supply. As a data scientist, you have to come up with useful insights from the data and build prediction models to forecast sales for the next X months or years.
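As a minimal baseline for the forecasting task described, here is a simple moving-average model on hypothetical monthly sales figures; real work would validate against held-out months and compare richer models (ARIMA, gradient boosting, etc.):

```python
# Hypothetical monthly sales for one outlet; units are arbitrary.
sales = [120, 130, 125, 140, 150, 145, 160, 170, 165, 180, 190, 185]

def moving_average_forecast(history, window=3, horizon=3):
    """Forecast `horizon` future periods by rolling a simple moving average,
    feeding each forecast back in as the newest observation."""
    extended = list(history)
    for _ in range(horizon):
        extended.append(sum(extended[-window:]) / window)
    return extended[len(history):]

forecast = moving_average_forecast(sales, window=3, horizon=3)
print([round(f, 1) for f in forecast])
```

Any real model should beat this baseline's error before it earns a place in the inventory pipeline; that is the usual role of a naive forecast.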
This dataset was created by Akalya Subramanian
https://creativecommons.org/publicdomain/zero/1.0/
This dataset contains more than 180M consumer reviews of different Amazon products. It is also available on other dataset sites, but I wrangled it and shared it here.
This dataset contains the following attributes:
- Total Records: 180M+
- Total Columns: 6
- Domain Name: amazon.com
- File Extension: CSV
Available fields:
- rating: overall rating out of 5
- verified: verification status of the review (a term used on Amazon)
- reviewerID: reviewer ID
- product_id: product ID
- date: date of the review (timestamp)
- vote: helpful votes for the review
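With 180M+ rows, the file will not fit comfortably in memory on most machines. One common approach is pandas' chunked reader, aggregating per chunk; a tiny stand-in file is written here purely for illustration:

```python
import os
import tempfile

import pandas as pd

# Tiny stand-in for the real 180M-row file, using the fields listed above.
path = os.path.join(tempfile.mkdtemp(), "reviews.csv")
with open(path, "w") as f:
    f.write("rating,verified,reviewerID,product_id,date,vote\n")
    f.write("5,True,R1,P1,2019-01-01,3\n")
    f.write("4,True,R2,P1,2019-01-02,0\n")
    f.write("3,False,R3,P2,2019-01-03,1\n")

# Stream the file in chunks, accumulating a running sum and count of ratings.
total, count = 0.0, 0
for chunk in pd.read_csv(path, chunksize=2):
    total += chunk["rating"].sum()
    count += len(chunk)

mean_rating = total / count
print(mean_rating)  # 4.0
```

For the real file, a chunksize in the hundreds of thousands keeps memory bounded; the same loop shape works for per-product or per-reviewer aggregates.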
We wouldn't be here without the help of teams who gathered this dataset and made it public.
Applying exploratory data analysis and machine learning to this kind of dataset was one of the biggest inspirations for this contribution.
https://www.usa.gov/government-works/
I was reading Every Nose Counts: Using Metrics in Animal Shelters when I got inspired to conduct an EDA on animal shelter data. I looked online for data and found this dataset, which is curated by the Austin Animal Center. The data can be found at https://data.austintexas.gov.
This data can be utilized for EDA practice. So go ahead and help animal shelters with your EDA powers by completing this task!
The data set contains three CSVs:
1. Austin_Animal_Center_Intakes.csv
2. Austin_Animal_Center_Outcomes.csv
3. Austin_Animal_Center_Stray_Map.csv
More TBD!
Thank you Austin Animal Center for all the animal protection you provide to stray & owned animals. Also, thank you for making your data accessible to the public.
https://creativecommons.org/publicdomain/zero/1.0/
This dataset supports a study examining how students perceive the usefulness of artificial intelligence (AI) in educational settings. The project involved analyzing an open-access survey dataset that captures a wide range of student responses on AI tools in learning.
The data underwent cleaning and preprocessing, followed by an exploratory data analysis (EDA) to identify key trends and insights. Visualizations were created to support interpretation, and the results were summarized in a digital poster format to communicate findings effectively.
This resource may be useful for researchers, educators, and technologists interested in the evolving role of AI in education.
Keywords: Artificial Intelligence, Education, Student Perception, Survey, Data Analysis, EDA
Subject: Computer and Information Science
License: CC0 1.0 Universal Public Domain Dedication
DOI: https://doi.org/10.18738/T8/RXUCHK
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
This is Sergey's Home Credit public notebook code repo.
volume_stats.csv was obtained from the EDA code. The calculation was produced by the snippet below: