100+ datasets found
  1. Ecommerce Dataset for Data Analysis

    • kaggle.com
    zip
    Updated Sep 19, 2024
    Cite
    Shrishti Manja (2024). Ecommerce Dataset for Data Analysis [Dataset]. https://www.kaggle.com/datasets/shrishtimanja/ecommerce-dataset-for-data-analysis/code
    Explore at:
    Available download formats: zip (2028853 bytes)
    Dataset updated
    Sep 19, 2024
    Authors
    Shrishti Manja
    Description

    This dataset contains 55,000 entries of synthetic customer transactions, generated using Python's Faker library. The goal behind creating this dataset was to provide a resource for learners like myself to explore, analyze, and apply various data analysis techniques in a context that closely mimics real-world data.

    About the Dataset:

    • CID (Customer ID): A unique identifier for each customer.
    • TID (Transaction ID): A unique identifier for each transaction.
    • Gender: The gender of the customer, categorized as Male or Female.
    • Age Group: Age group of the customer, divided into several ranges.
    • Purchase Date: The timestamp of when the transaction took place.
    • Product Category: The category of the product purchased, such as Electronics, Apparel, etc.
    • Discount Availed: Indicates whether the customer availed any discount (Yes/No).
    • Discount Name: Name of the discount applied (e.g., FESTIVE50).
    • Discount Amount (INR): The amount of discount availed by the customer.
    • Gross Amount: The total amount before applying any discount.
    • Net Amount: The final amount after applying the discount.
    • Purchase Method: The payment method used (e.g., Credit Card, Debit Card, etc.).
    • Location: The city where the purchase took place.

    Use Cases:

      1. Exploratory Data Analysis (EDA): This dataset is ideal for conducting EDA, allowing users to practice techniques such as summary statistics, visualizations, and identifying patterns within the data.
      2. Data Preprocessing and Cleaning: Learners can work on handling missing data, encoding categorical variables, and normalizing numerical values to prepare the dataset for analysis.
      3. Data Visualization: Use tools like Python’s Matplotlib, Seaborn, or Power BI to visualize purchasing trends, customer demographics, or the impact of discounts on purchase amounts.
      4. Machine Learning Applications: After applying feature engineering, this dataset is suitable for supervised learning models, such as predicting whether a customer will avail a discount or forecasting purchase amounts based on the input features.
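
    A minimal pandas sketch of the first two use cases, assuming the column names listed in the description. The rows are invented stand-ins for the real file, and the check that Net Amount equals Gross Amount minus Discount Amount (INR) is an assumption drawn from the column descriptions, not something the dataset page states.

```python
import pandas as pd

# Invented stand-in rows; only the column names come from the description.
df = pd.DataFrame({
    "CID": [1, 2, 3, 4],
    "TID": [101, 102, 103, 104],
    "Product Category": ["Electronics", "Apparel", "Electronics", "Apparel"],
    "Discount Availed": ["Yes", "No", "Yes", "No"],
    "Discount Amount (INR)": [50.0, 0.0, 100.0, 0.0],
    "Gross Amount": [500.0, 300.0, 1000.0, 200.0],
    "Net Amount": [450.0, 300.0, 900.0, 200.0],
})

# Assumed consistency check: net = gross - discount.
expected_net = df["Gross Amount"] - df["Discount Amount (INR)"]
mismatches = df[(df["Net Amount"] - expected_net).abs() > 1e-6]

# Preprocessing: encode the Yes/No flag as 1/0 for later modelling.
df["discount_flag"] = df["Discount Availed"].map({"Yes": 1, "No": 0})

# EDA: mean net amount per product category.
mean_net_by_category = df.groupby("Product Category")["Net Amount"].mean()
print(mean_net_by_category)
```

    With the real file, the DataFrame would come from pd.read_csv on the extracted CSV; its file name is not given in the listing.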

    This dataset provides an excellent sandbox for honing skills in data analysis, machine learning, and visualization in a structured but flexible manner.

    This is not a real dataset. It was generated using Python's Faker library for the sole purpose of learning.

  2. Data from: Exploratory Data Analysis (EDA)

    • kaggle.com
    zip
    Updated Jul 26, 2024
    Cite
    64 Aashish Chaudhary (2024). Exploratory Data Analysis (EDA) [Dataset]. https://www.kaggle.com/datasets/trh2023/exploratory-data-analysis-eda/code
    Explore at:
    Available download formats: zip (176655 bytes)
    Dataset updated
    Jul 26, 2024
    Authors
    64 Aashish Chaudhary
    Description

    Dataset

    This dataset was created by 64 Aashish Chaudhary

    Released under Other (specified in description)

    Contents

  3. Electronics Store Sales Dataset for EDA

    • kaggle.com
    zip
    Updated Feb 13, 2021
    Cite
    Sinjoy Saha (2021). Electronics Store Sales Dataset for EDA [Dataset]. https://www.kaggle.com/sinjoysaha/sales-analysis-dataset
    Explore at:
    Available download formats: zip (2505035 bytes)
    Dataset updated
    Feb 13, 2021
    Authors
    Sinjoy Saha
    Description

    Content

    This is transaction data from an electronics store chain in the US. The data consists of 12 CSV files, one for each month of 2019, named Sales_[MONTH_NAME]_2019. Each file contains roughly 9,000 to 26,000 rows and 6 columns:

    • Order ID
    • Product
    • Quantity Ordered
    • Price Each
    • Order Date
    • Purchase Address

    Combined, the 12 monthly files hold around 186,851 data points. Some rows may contain null values.
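
    One way to work with the twelve monthly files is to concatenate them into a single table and drop the empty rows. The sketch below uses two small in-memory stand-ins with invented rows; the helper name combine_monthly_sales is hypothetical.

```python
import io
import pandas as pd

def combine_monthly_sales(sources):
    """Concatenate monthly sales files and drop rows that are entirely null."""
    frames = [pd.read_csv(src) for src in sources]
    combined = pd.concat(frames, ignore_index=True)
    return combined.dropna(how="all")

header = "Order ID,Product,Quantity Ordered,Price Each,Order Date,Purchase Address\n"

# In-memory stand-ins for two monthly files (invented rows, one fully empty).
january = io.StringIO(
    header
    + "1,USB-C Cable,2,11.95,01/05/19 10:00,\"1 Main St, Boston, MA 02215\"\n"
    + ",,,,,\n"
)
february = io.StringIO(
    header
    + "2,Monitor,1,149.99,02/10/19 15:30,\"2 Oak St, Austin, TX 73301\"\n"
)

sales = combine_monthly_sales([january, february])
sales["Revenue"] = sales["Quantity Ordered"] * sales["Price Each"]
print(sales[["Product", "Revenue"]])
```

    On disk, the list of sources could be built with glob.glob("Sales_*_2019.csv"), assuming the files carry a .csv extension after the documented naming pattern.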

    Inspiration

    Keith Galli

    Acknowledgements

  4. EDA on Cleaned Netflix Data

    • kaggle.com
    zip
    Updated Jul 7, 2025
    Cite
    Nikhil raman K (2025). EDA on Cleaned Netflix Data [Dataset]. https://www.kaggle.com/datasets/nikhilramank/eda-on-cleaned-netflix-data
    Explore at:
    Available download formats: zip (110806 bytes)
    Dataset updated
    Jul 7, 2025
    Authors
    Nikhil raman K
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    This is a cleaned version of a Netflix movies dataset originally used for exploratory data analysis (EDA). The dataset contains information such as:

    • Title
    • Release Year
    • Rating
    • Genre
    • Votes
    • Description
    • Stars

    Missing values have been handled using appropriate methods (mean, median, unknown), and new features like rating_level and popular have been added for deeper analysis.
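
    The description does not say which column received which imputation method, nor how rating_level and popular are defined. The sketch below is therefore only an illustration: it assumes Rating is filled with the mean, Votes with the median, and Genre with "Unknown", and the bin edges and the vote threshold are hypothetical.

```python
import pandas as pd

# Invented rows; only the column names come from the description.
movies = pd.DataFrame({
    "Title": ["A", "B", "C", "D"],
    "Rating": [8.0, None, 6.0, 7.0],
    "Votes": [1000.0, 200.0, None, 50.0],
    "Genre": ["Drama", None, "Comedy", "Drama"],
})

# Assumed mapping of methods to columns: mean, median, "Unknown".
movies["Rating"] = movies["Rating"].fillna(movies["Rating"].mean())
movies["Votes"] = movies["Votes"].fillna(movies["Votes"].median())
movies["Genre"] = movies["Genre"].fillna("Unknown")

# Hypothetical definitions of the two derived features.
movies["rating_level"] = pd.cut(
    movies["Rating"], bins=[0, 5, 7.5, 10], labels=["low", "medium", "high"]
)
movies["popular"] = movies["Votes"] >= 500

print(movies)
```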

    The dataset is ready for:

    • EDA
    • Data visualization
    • Machine learning tasks
    • Dashboard building

    Used in the accompanying notebook

  5. EDA Analysis for Amazon Books

    • kaggle.com
    zip
    Updated Apr 8, 2021
    Cite
    syamalakumar (2021). EDA Analysis for Amazon Books [Dataset]. https://www.kaggle.com/syamalakumar/eda-analysis-for-amazon-books
    Explore at:
    Available download formats: zip (512381 bytes)
    Dataset updated
    Apr 8, 2021
    Authors
    syamalakumar
    Description

    Dataset

    This dataset was created by syamalakumar

    Contents

  6. Anomaly-driven Exploratory Data Analysis (2024)

    • kaggle.com
    zip
    Updated Mar 18, 2024
    Cite
    Sunaina Jain (2024). Anomaly-driven Exploratory Data Analysis (2024) [Dataset]. https://www.kaggle.com/datasets/sunainajain/anomaly-driven-exploratory-data-analysis-2024
    Explore at:
    Available download formats: zip (169013 bytes)
    Dataset updated
    Mar 18, 2024
    Authors
    Sunaina Jain
    Description

    • ID: Unique identifier for each individual in the dataset.
    • Name: Name of the individual.
    • Date_of_Birth: Birth date of the individual.
    • Avg_Salary: Average salary of the individual.
    • Nationality: Nationality of the individual.
    • Address: Address of the individual.
    • Zip_Code: Zip code of the individual's address.
    • Monthly_Spending(USD): Monthly spending in USD by the individual.
    • Health_Condition: Health condition of the individual (e.g., Diabetes, Asthma, Depression, Cancer).
    • Marital_Status: Marital status of the individual (e.g., Married, Single, Divorced).
    • Education: Highest education level attained by the individual (e.g., Bachelor's, Master's, Ph.D., High School).
    • Gender: Gender of the individual.
    • Occupation: Occupation of the individual (e.g., Engineer, Doctor, Teacher, Businessman, Nurse, IT Professional, Chef, Scientist).
    • Number_of_Child: Number of children the individual has.
    • Email_Address: Email address of the individual.
    • Blood_Type: Blood type of the individual.
    • Property_Value: Value of the individual's property.
    • Credit_Score: Credit score of the individual.
    • Smoking_Habit: Smoking habit of the individual (Yes/No).
    • Preferred_Social_Network: Preferred social network of the individual (e.g., Facebook, Instagram, WhatsApp, Snapchat).
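
    The listing does not describe how anomalies are meant to be found in this dataset. As one common starting point, the sketch below flags values outside 1.5 times the interquartile range on the Avg_Salary column; the helper iqr_outliers and the sample values are hypothetical.

```python
import pandas as pd

def iqr_outliers(series, k=1.5):
    """Return a boolean mask marking values outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1 = series.quantile(0.25)
    q3 = series.quantile(0.75)
    iqr = q3 - q1
    return (series < q1 - k * iqr) | (series > q3 + k * iqr)

# Invented values; only the column names come from the description.
people = pd.DataFrame({
    "ID": [1, 2, 3, 4, 5, 6],
    "Avg_Salary": [40000, 42000, 45000, 47000, 50000, 500000],
})
people["salary_outlier"] = iqr_outliers(people["Avg_Salary"])
print(people[people["salary_outlier"]])
```

    The same mask could be applied to other numeric columns such as Credit_Score or Property_Value.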

  7. Cyclistic Bike - Data Analysis (Python)

    • kaggle.com
    zip
    Updated Jun 19, 2023
    Cite
    Amirthavarshini (2023). Cyclistic Bike - Data Analysis (Python) [Dataset]. https://www.kaggle.com/datasets/amirthavarshini12/cyclistic-bike-data-analysis-python/code
    Explore at:
    Available download formats: zip (211278092 bytes)
    Dataset updated
    Jun 19, 2023
    Authors
    Amirthavarshini
    Description

    Conducted an in-depth analysis of Cyclistic bike-share data to uncover customer usage patterns and trends. Cleaned and processed raw data using Python libraries such as pandas and NumPy to ensure data quality. Performed exploratory data analysis (EDA) to identify insights, including peak usage times, customer demographics, and trip duration patterns. Created visualizations using Matplotlib and Seaborn to effectively communicate findings. Delivered actionable recommendations to enhance customer engagement and optimize operational efficiency.
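
    A rough sketch of the trip-duration and peak-usage steps mentioned above. The column names started_at, ended_at and member_casual are assumptions, since the listing does not give the schema, and the rows are invented.

```python
import pandas as pd

# Invented rides; the column names are assumed, not taken from the listing.
rides = pd.DataFrame({
    "started_at": ["2023-01-01 08:00:00", "2023-01-01 08:30:00", "2023-01-01 17:15:00"],
    "ended_at": ["2023-01-01 08:20:00", "2023-01-01 09:15:00", "2023-01-01 17:25:00"],
    "member_casual": ["member", "casual", "member"],
})

# Parse timestamps so durations and hours can be computed.
rides["started_at"] = pd.to_datetime(rides["started_at"])
rides["ended_at"] = pd.to_datetime(rides["ended_at"])

# Trip duration in minutes.
rides["duration_min"] = (rides["ended_at"] - rides["started_at"]).dt.total_seconds() / 60

# Peak usage: the start hour with the most rides.
rides_per_hour = rides["started_at"].dt.hour.value_counts().sort_index()
peak_hour = rides_per_hour.idxmax()

# Average trip duration per rider type.
mean_duration = rides.groupby("member_casual")["duration_min"].mean()
print(peak_hour, mean_duration.to_dict())
```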

  8. Marketing Analytics

    • kaggle.com
    zip
    Updated Mar 6, 2022
    Cite
    Jack Daoud (2022). Marketing Analytics [Dataset]. https://www.kaggle.com/datasets/jackdaoud/marketing-data/discussion
    Explore at:
    Available download formats: zip (658411 bytes)
    Dataset updated
    Mar 6, 2022
    Authors
    Jack Daoud
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Context

    This data is publicly available on GitHub here. It can be utilized for EDA, Statistical Analysis, and Visualizations.

    Content

    The data set ifood_df.csv consists of 2206 customers of XYZ company with data on:

    • Customer profiles
    • Product preferences
    • Campaign successes/failures
    • Channel performance

    Acknowledgement

    I do not own this dataset. I am simply making it accessible on this platform via the public GitHub link.

  9. Exploratory Data Analysis

    • kaggle.com
    zip
    Updated Jun 17, 2024
    Cite
    Chaitu Devi (2024). Exploratory Data Analysis [Dataset]. https://www.kaggle.com/datasets/chaitudevi/exploratory-data-analysis/code
    Explore at:
    Available download formats: zip (10710014 bytes)
    Dataset updated
    Jun 17, 2024
    Authors
    Chaitu Devi
    License

    MIT License: https://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    Dataset

    This dataset was created by Chaitu Devi

    Released under MIT

    Contents

  10. Theatres_Data_Version_1.0

    • kaggle.com
    zip
    Updated Aug 4, 2025
    Cite
    deanesh takkallapati (2025). Theatres_Data_Version_1.0 [Dataset]. https://www.kaggle.com/datasets/deaneshtakkallapati/theatres-data-version-1-0/data
    Explore at:
    Available download formats: zip (30479 bytes)
    Dataset updated
    Aug 4, 2025
    Authors
    deanesh takkallapati
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Exploratory Data Analysis (EDA) of Theatres Data in India

    Steps
      1. Load Data
      2. Check Nulls and Update Data if required
      3. Perform Descriptive Statistics
      4. Data Visualization
         Univariate - Single Column Visualization
           categorical - countplot
           continuous - histogram
         Bivariate - 2 Columns Visualization
           continuous vs continuous - scatterplot, regplot
           categorical vs continuous - boxplot
           categorical vs categorical - crosstab, heatmap
         Multivariate - Multi Columns Visualization
           correlation plot
           pairplot
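
    The four steps can be sketched with pandas alone. The example computes the tables that the named plots would visualize rather than drawing them, so it runs without a plotting library. The column names state, type, screens and seats are hypothetical, as the listing does not give the schema.

```python
import pandas as pd

# Step 1: load data (an invented stand-in for the real file).
theatres = pd.DataFrame({
    "state": ["KA", "KA", "TN", "TN", "MH"],
    "type": ["Multiplex", "Single", "Single", "Multiplex", "Single"],
    "screens": [5, 1, 1, 4, None],
    "seats": [1200, 300, 350, 1000, 400],
})

# Step 2: check nulls and fill them where required (median used here).
null_counts = theatres.isna().sum()
theatres["screens"] = theatres["screens"].fillna(theatres["screens"].median())

# Step 3: descriptive statistics for the numeric columns.
summary = theatres.describe()

# Step 4: the data behind each kind of plot.
type_counts = theatres["type"].value_counts()                      # univariate, categorical (countplot)
state_by_type = pd.crosstab(theatres["state"], theatres["type"])   # categorical vs categorical (heatmap)
seats_by_type = theatres.groupby("type")["seats"].describe()       # categorical vs continuous (boxplot)
correlation = theatres[["screens", "seats"]].corr()                # multivariate (correlation plot)

print(null_counts, summary, state_by_type, correlation, sep="\n\n")
```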
    
  11. Play Store Data Analysis By Vaishnavi

    • kaggle.com
    zip
    Updated Apr 30, 2021
    Cite
    Vaishnavi Sahu (2021). Play Store Data Analysis By Vaishnavi [Dataset]. https://www.kaggle.com/vaishnavisahu/play-store-data-analysis-by-vaishnavi
    Explore at:
    Available download formats: zip (597350 bytes)
    Dataset updated
    Apr 30, 2021
    Authors
    Vaishnavi Sahu
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Context

    EDA using NumPy and pandas.

    Content

    In this task, I have to predict which factors make an app perform well: its size, price, category, or multiple factors together. What makes an app rank at the top of the Google Play Store?

    Column description:

    • App: Name of the application
    • Category: Category of the application
    • Rating: Rating of the application
    • Reviews: Reviews of the application
    • Size: Size of the application
    • Installs: How many users installed the application
    • Type: Type of application
    • Price: Price of the application
    • Content Rating: Rating of the content of the application

  12. Complete Google Playstore EDA 2025

    • kaggle.com
    zip
    Updated Jul 29, 2025
    Cite
    Muhammad Shayan (2025). Complete Google Playstore EDA 2025 [Dataset]. https://www.kaggle.com/datasets/muhammadshayan5839/complete-google-playstore-eda-2025
    Explore at:
    Available download formats: zip (20150127 bytes)
    Dataset updated
    Jul 29, 2025
    Authors
    Muhammad Shayan
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    - About Dataset

  13. Credit EDA Case Study Data

    • kaggle.com
    zip
    Updated Jan 11, 2022
    Cite
    ADITYA MISHRA (2022). Credit EDA Case Study Data [Dataset]. https://www.kaggle.com/adityamishra0708/credit-eda-case-study-data
    Explore at:
    Available download formats: zip (117814223 bytes)
    Dataset updated
    Jan 11, 2022
    Authors
    ADITYA MISHRA
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Dataset

    This dataset was created by ADITYA MISHRA

    Released under CC0: Public Domain

    Contents

  14. Exploratory data analysis on a smartphone dataset.

    • kaggle.com
    zip
    Updated Jun 2, 2025
    Cite
    Abhishek Kumar (2025). Exploratory data analysis on a smartphone dataset. [Dataset]. https://www.kaggle.com/datasets/abhishek9065/exploratory-data-analysis-on-a-smartphone-dataset/data
    Explore at:
    Available download formats: zip (2736269 bytes)
    Dataset updated
    Jun 2, 2025
    Authors
    Abhishek Kumar
    Description

    This notebook presents a comprehensive exploratory data analysis on a smartphone dataset, covering the distribution of prices, feature prevalence, and relationships between specs and price. All code, plots, and insights are included. Feedback and suggestions welcome!

  15. Walmart Data Analysis and Forcasting

    • kaggle.com
    zip
    Updated Apr 26, 2023
    Cite
    Amit Kumar Sahu (2023). Walmart Data Analysis and Forcasting [Dataset]. https://www.kaggle.com/datasets/asahu40/walmart-data-analysis-and-forcasting/code
    Explore at:
    Available download formats: zip (125153 bytes)
    Dataset updated
    Apr 26, 2023
    Authors
    Amit Kumar Sahu
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    A retail store with multiple outlets across the country is facing issues managing its inventory, that is, matching supply to demand. You are a data scientist who has to derive useful insights from the data and build prediction models to forecast sales for X number of months/years.
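
    The description does not prescribe a forecasting method. As a simple baseline against which other models could be compared, the sketch below forecasts future periods with a rolling mean of the most recent values. The function name and the sales figures are hypothetical.

```python
def moving_average_forecast(sales, window=3, horizon=2):
    """Forecast `horizon` future periods, each as the mean of the last `window` values.

    Each forecast is appended to the history, so later forecasts build on earlier ones.
    """
    history = list(sales)
    forecasts = []
    for _ in range(horizon):
        next_value = sum(history[-window:]) / window
        forecasts.append(next_value)
        history.append(next_value)
    return forecasts

# Invented monthly sales totals for one outlet.
monthly_sales = [100.0, 120.0, 110.0, 130.0, 120.0]
forecast = moving_average_forecast(monthly_sales, window=3, horizon=2)
print(forecast)
```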

  16. Dataset for exploratory data analytics

    • kaggle.com
    Updated Nov 24, 2020
    Cite
    Akalya Subramanian (2020). Dataset for exploratory data analytics [Dataset]. https://www.kaggle.com/akalyasubramanian/dataset-for-exploratory-data-analytics/code
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Nov 24, 2020
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Akalya Subramanian
    Description

    Dataset

    This dataset was created by Akalya Subramanian

    Contents

  17. Amazon Reviews EDA (2001-2018)

    • kaggle.com
    zip
    Updated Sep 10, 2021
    Cite
    ARHAM RUMI (2021). Amazon Reviews EDA (2001-2018) [Dataset]. https://www.kaggle.com/arhamrumi/amazon-reviews-eda-20012018
    Explore at:
    Available download formats: zip (770423224 bytes)
    Dataset updated
    Sep 10, 2021
    Authors
    ARHAM RUMI
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Context

    This dataset contains more than 180M consumer reviews of different Amazon products. It is also available on other dataset-related sites, but I wrangled it and shared it here.

    Content

    This dataset contains the following attributes:

    • Total Records: 180M+
    • Total Columns: 6
    • Domain Name: amazon.com
    • File Extension: CSV

    Available Fields: rating, verified, reviewerID, product_id, date, vote

    • rating: Overall rating out of 5
    • verified: Verification status of the review (a term used on Amazon)
    • reviewerID: Reviewer ID
    • product_id: Product ID
    • date: Date of review (timestamp)
    • vote: Helpful votes for the review
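
    With more than 180M rows, the file may not fit in memory on a typical machine. One option is to read it in chunks and aggregate as you go. The sketch below counts reviews per rating. The helper rating_counts is hypothetical, the sample rows are invented, and the field names are the ones listed above.

```python
import io
from collections import Counter

import pandas as pd

def rating_counts(source, chunksize=100_000):
    """Count reviews per rating while holding only one chunk in memory at a time."""
    totals = Counter()
    for chunk in pd.read_csv(source, usecols=["rating"], chunksize=chunksize):
        totals.update(chunk["rating"].value_counts().to_dict())
    return dict(sorted(totals.items()))

# Tiny in-memory stand-in for the real CSV (invented rows).
sample = io.StringIO(
    "rating,verified,reviewerID,product_id,date,vote\n"
    "5,True,R1,P1,1500000000,2\n"
    "4,False,R2,P1,1500000100,0\n"
    "5,True,R3,P2,1500000200,1\n"
)
counts = rating_counts(sample, chunksize=2)
print(counts)
```

    Passing usecols keeps only the column being aggregated, which reduces memory use per chunk.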

    Acknowledgements

    We wouldn't be here without the help of teams who gathered this dataset and made it public.

    Inspiration

    Applying exploratory data analysis and machine learning to this kind of dataset is one of the biggest inspirations for this contribution.

  18. Animal Shelter Analytics

    • kaggle.com
    zip
    Updated Mar 4, 2021
    Cite
    Jack Daoud (2021). Animal Shelter Analytics [Dataset]. https://www.kaggle.com/jackdaoud/animal-shelter-analytics
    Explore at:
    Available download formats: zip (8043946 bytes)
    Dataset updated
    Mar 4, 2021
    Authors
    Jack Daoud
    License

    https://www.usa.gov/government-works/

    Description

    Context

    I was reading Every Nose Counts: Using Metrics in Animal Shelters when I got inspired to conduct an EDA on animal shelter data. I looked online for data and found this dataset which is curated by Austin Animal Center. The data can be found on https://data.austintexas.gov.

    This data can be utilized for EDA practice. So go ahead and help animal shelters with your EDA powers by completing this task!

    Content

    The data set contains three CSVs:

      1. Austin_Animal_Center_Intakes.csv
      2. Austin_Animal_Center_Outcomes.csv
      3. Austin_Animal_Center_Stray_Map.csv

    More TBD!

    Acknowledgement

    Thank you Austin Animal Center for all the animal protection you provide to stray & owned animals. Also, thank you for making your data accessible to the public.

  19. Impact of Artificial Intelligence on Education

    • kaggle.com
    zip
    Updated Jun 9, 2025
    Cite
    INK (2025). Impact of Artificial Intelligence on Education [Dataset]. https://www.kaggle.com/datasets/irakozekelly/impact-of-artificial-intelligence-on-education/code
    Explore at:
    Available download formats: zip (327925 bytes)
    Dataset updated
    Jun 9, 2025
    Authors
    INK
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    This dataset supports a study examining how students perceive the usefulness of artificial intelligence (AI) in educational settings. The project involved analyzing an open-access survey dataset that captures a wide range of student responses on AI tools in learning.

    The data underwent cleaning and preprocessing, followed by an exploratory data analysis (EDA) to identify key trends and insights. Visualizations were created to support interpretation, and the results were summarized in a digital poster format to communicate findings effectively.

    This resource may be useful for researchers, educators, and technologists interested in the evolving role of AI in education.

    Keywords: Artificial Intelligence, Education, Student Perception, Survey, Data Analysis, EDA
    
    Subject: Computer and Information Science
    
    License: CC0 1.0 Universal Public Domain Dedication
    
    DOI: https://doi.org/10.18738/T8/RXUCHK
    
  20. 2025-BYU-Locating-Bacterial-Motors-public-repo

    • kaggle.com
    zip
    Updated Mar 10, 2025
    Cite
    Sergey Saharovskiy (2025). 2025-BYU-Locating-Bacterial-Motors-public-repo [Dataset]. https://www.kaggle.com/sergiosaharovskiy/2025-byu-locating-bacterial-motors-public-repo
    Explore at:
    Available download formats: zip (15615630 bytes)
    Dataset updated
    Mar 10, 2025
    Authors
    Sergey Saharovskiy
    License

    MIT License: https://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    Welcome

    This is Sergey's Home Credit public notebook code repo.

    volume_stats.csv was obtained from the EDA code.

    The calculation was obtained by using the below snippet:


Ecommerce Dataset for Data Analysis

Exploratory Data Analysis, Data Visualisation and Machine Learning

