https://www.gnu.org/licenses/gpl-3.0.html
Project Objective: Analyze sales data to identify sales trends, peak periods, and customer preferences, and to segment customers by purchasing behavior. The goal is to derive actionable insights for better targeting and strategy formulation.
The project includes 5 files:
1. E-commerce Data Analysis Project.csv: The dataset, composed of 18,590 rows × 8 columns.
2. Final_E_Code.py: The Python script used for data cleaning, EDA, analysis, and visualization.
3. Presentation.pdf: The slide deck, which uses the analysis and visualizations produced by the Python script to derive insights and recommendations.
4. LICENSE: The dataset is licensed under the GNU General Public License v3.0 (GPL-3.0).
5. README.md: Includes the project objective and attribution.
MIT License https://opensource.org/licenses/MIT
License information was derived automatically
I used data visualization techniques to extract insights from a comprehensive dataset. By visualizing sales patterns, customer behavior, and product trends, I identified key growth opportunities and provided actionable recommendations to optimize business strategies and improve overall performance. The code is available in the linked GitHub repository.
There are six tables: one fact table and five dimension tables.
Fact Table:
payment_key:
Description: An identifier representing the payment transaction associated with the fact.
Use Case: This key links to a payment dimension table, providing details about the payment method and related information.
customer_key:
Description: An identifier representing the customer associated with the fact.
Use Case: This key links to a customer dimension table, providing details about the customer, such as name, address, and other customer-specific information.
time_key:
Description: An identifier representing the time dimension associated with the fact.
Use Case: This key links to a time dimension table, providing details about the time of the transaction, such as date, day of the week, and month.
item_key:
Description: An identifier representing the item or product associated with the fact.
Use Case: This key links to an item dimension table, providing details about the product, such as category, sub-category, and product name.
store_key:
Description: An identifier representing the store or location associated with the fact.
Use Case: This key links to a store dimension table, providing details about the store, such as location, store name, and other store-specific information.
quantity:
Description: The quantity of items sold or involved in the transaction.
Use Case: Represents the amount or number of items associated with the transaction.
unit:
Description: The unit or measurement associated with the quantity (e.g., pieces, kilograms).
Use Case: Specifies the unit of measurement for the quantity.
unit_price:
Description: The price per unit of the item.
Use Case: Represents the cost or price associated with each unit of the item.
total_price:
Description: The total price of the transaction, calculated as the product of quantity and unit price.
Use Case: Represents the overall cost or revenue generated by the transaction.
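The stated relationship between total_price, quantity, and unit_price can be checked row by row. The following is a minimal sketch, assuming fact rows are available as dictionaries keyed by the column names above; the sample values and the `is_consistent` helper are hypothetical and not part of the dataset.

```python
from decimal import Decimal

# Hypothetical fact rows; only the column names come from the table description.
fact_rows = [
    {"quantity": 3, "unit_price": Decimal("2.50"), "total_price": Decimal("7.50")},
    {"quantity": 2, "unit_price": Decimal("4.00"), "total_price": Decimal("9.00")},
]

def is_consistent(row):
    """Return True when total_price equals quantity * unit_price."""
    return row["quantity"] * row["unit_price"] == row["total_price"]

# Collect the positions of rows that violate the stated relationship.
inconsistent = [i for i, row in enumerate(fact_rows) if not is_consistent(row)]
print(inconsistent)
```

Decimal is used here instead of float so that the equality test is not affected by binary rounding of prices.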
Customer Table:
customer_key:
Description: An identifier representing a unique customer.
Use Case: Serves as the primary key to link with the fact table, allowing for easy and efficient retrieval of customer-specific information.
name:
Description: The name of the customer.
Use Case: Captures the personal or business name of the customer for identification and reference purposes.
contact_no:
Description: The contact number associated with the customer.
Use Case: Stores the phone number or contact details for communication or outreach purposes.
nid:
Description: The National ID (NID) or a unique identification number for the customer.
Item Table:
item_key:
Description: An identifier representing a unique item or product.
Use Case: Serves as the primary key to link with the fact table, enabling retrieval of detailed information about specific items in transactions.
item_name:
Description: The name or title of the item.
Use Case: Captures the descriptive name of the item, providing a recognizable label for the product.
desc:
Description: A description of the item.
Use Case: Contains additional details about the item, such as features, specifications, or any relevant information.
unit_price:
Description: The price per unit of the item.
Use Case: Represents the cost or price associated with each unit of the item.
man_country:
Description: The country where the item is manufactured.
Use Case: Captures the origin or manufacturing location of the item.
supplier:
Description: The supplier or vendor providing the item.
Use Case: Stores the name or identifier of the supplier, facilitating tracking of item sources.
unit:
Description: The unit of measurement associated with the item (e.g., pieces, kilograms).
Store Table:
store_key:
Description: An identifier representing a unique store or location.
Use Case: Serves as the primary key to link with the fact table, allowing for easy retrieval of information about transactions associated with specific stores.
division:
Description: The administrative division or region where the store is located.
Use Case: Captures the broader geographical area in which...
https://creativecommons.org/publicdomain/zero/1.0/
E-Commerce Data Analysis (Excel & Python Project)
Overview
This project analyzes 10,000+ e-commerce sales records using Excel and Python (Pandas) to uncover business insights. It covers essential data analysis techniques such as cleaning, aggregation, and visualization, and is aimed at beginners and data analyst learners.
Objectives
Understand customer purchasing trends
Identify top-selling products
Analyze monthly sales and revenue performance
Calculate business KPIs such as Total Revenue, Total Orders, and Average Order Value (AOV)
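The three KPIs named above can be sketched in a few lines. This is an illustration only: it assumes revenue per row is quantity times price and that AOV is total revenue divided by the number of distinct orders; the sample rows and the `compute_kpis` helper are hypothetical.

```python
# Hypothetical order rows using the dataset's column names.
orders = [
    {"order_id": "A1", "quantity": 2, "price": 100.0},
    {"order_id": "A2", "quantity": 1, "price": 250.0},
    {"order_id": "A3", "quantity": 3, "price": 50.0},
]

def compute_kpis(rows):
    """Return total revenue, total orders, and average order value (AOV)."""
    total_revenue = sum(r["quantity"] * r["price"] for r in rows)
    # Count distinct order ids so repeated rows of one order are not double counted.
    total_orders = len({r["order_id"] for r in rows})
    aov = total_revenue / total_orders if total_orders else 0.0
    return {"total_revenue": total_revenue, "total_orders": total_orders, "aov": aov}

kpis = compute_kpis(orders)
print(kpis)
```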
Dataset Information
File: ecommerce_simple_10k.csv
Total Rows: 10,000
Columns:

| Column Name | Description |
| --- | --- |
| order_id | Unique order identifier |
| product | Product name |
| quantity | Number of items ordered |
| price | Price of a single item |
| order_date | Date of order placement |
| city | City where the order was placed |

Data Cleaning (Python)
Key cleaning steps:
Removed currency symbols (₹) and commas from price and total_sales
Converted order_date into proper datetime format
Created new column month from order_date
Handled missing or incorrect data entries
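The cleaning steps above might look roughly like the following plain-Python sketch. The project itself uses Pandas, where the same steps map onto `Series.str.replace` and `pandas.to_datetime`. The `%Y-%m-%d` date format, the sample rows, and the choice to drop unparseable rows are assumptions made for illustration; the helper names are hypothetical.

```python
from datetime import datetime

def clean_price(raw):
    """Strip the rupee sign, commas, and whitespace, then convert to float."""
    return float(raw.replace("₹", "").replace(",", "").strip())

def clean_row(row):
    """Return a cleaned copy of one raw row, or None if it cannot be parsed."""
    try:
        # Assumed date format; adjust to match the actual file.
        order_date = datetime.strptime(row["order_date"], "%Y-%m-%d")
        return {
            "order_id": row["order_id"],
            "price": clean_price(row["price"]),
            "order_date": order_date,
            "month": order_date.strftime("%Y-%m"),  # new month column
        }
    except (KeyError, ValueError):
        return None

# Hypothetical raw rows: one valid, one with an incorrect price entry.
raw_rows = [
    {"order_id": "A1", "price": "₹1,299.00", "order_date": "2024-03-15"},
    {"order_id": "A2", "price": "n/a", "order_date": "2024-03-16"},
]
cleaned = [c for c in (clean_row(r) for r in raw_rows) if c is not None]
print(cleaned)
```

Dropping bad rows is only one way to handle incorrect entries; imputing or flagging them may suit other analyses better.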
Inputs related to the analysis, for additional reference:
1. Why do we need customer segmentation? Every customer is unique and can be targeted in different ways, which is where customer segmentation plays an important role. Segmentation helps to understand customer profiles and to define cross-sell, upsell, activation, and acquisition strategies.
2. What is RFM segmentation? RFM stands for recency, frequency, and monetary value. Recency is the number of days since a customer made their last purchase. For a website or an app, it could be interpreted as the last visit day or the last login time. Frequency is the number of purchases in a given period, which could be 3 months, 6 months, or 1 year. It indicates how often customers use a company's product: the larger the value, the more engaged the customers are. Alternatively, frequency can be defined as the average duration between two transactions. Monetary is the total amount of money a customer spent in the given period. It differentiates big spenders, such as MVP or VIP customers, from the rest.
3. What is LTV and how is it defined? Today almost every retailer promotes a subscription, which is also used to understand the customer lifetime. Retailers can manage customers better if they know which ones have a high lifetime value. Customer lifetime value (LTV) can be defined as the monetary value of a customer relationship, based on the present value of the projected future cash flows from that relationship. It is an important concept because it encourages firms to shift their focus from quarterly profits to the long-term health of their customer relationships. It is also an important metric because it represents an upper limit on spending to acquire new customers. For this reason it is an important element in calculating the payback of advertising spend in marketing mix modelling.
4. Why do we need to predict customer lifetime value? LTV is an important building block in campaign design and marketing mix management. Although targeting models can help to identify the right customers to target, LTV analysis can help to quantify the expected outcome of targeting in terms of revenues and profits. LTV is also important because other major metrics and decision thresholds can be derived from it. For example, LTV is naturally an upper limit on the spending to acquire a customer, and the sum of the LTVs of all of a brand's customers, known as the customer equity, is a major metric for business valuations. Like many other problems in marketing analytics and algorithmic marketing, LTV modelling can be approached from descriptive, predictive, and prescriptive perspectives.
5. How does Next Purchase Day help retailers? The objective is to analyse when customers will next purchase products, so that strategies and marketing campaigns can be built for each group accordingly.
a. Group-1: Customers who will purchase in more than 60 days
b. Group-2: Customers who will purchase in 30-60 days
c. Group-3: Customers who will purchase in 0-30 days
6. What is cohort analysis, and how is it helpful? A cohort is a group of users who share a common characteristic that is identified in this report by an Analytics dimension. For example, all users with the same Acquisition Date belong to the same cohort. The Cohort Analysis report lets you isolate and analyze cohort behaviour. Cohort analysis in e-commerce means monitoring your customers' behaviour based on common traits they share (the first product they bought, when they became customers, etc.) to find patterns and tailor marketing activities for the group.
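The RFM definitions in item 2 can be illustrated with a short sketch. It assumes recency is measured in days from a chosen snapshot date, frequency is the purchase count, and monetary is the summed amount; the sample transactions, the snapshot date, and the `rfm_table` helper are hypothetical.

```python
from datetime import date

# Hypothetical transactions: (customer_id, purchase_date, amount).
transactions = [
    ("C1", date(2019, 1, 10), 120.0),
    ("C1", date(2019, 6, 5), 80.0),
    ("C2", date(2019, 11, 20), 300.0),
]

def rfm_table(rows, snapshot):
    """Return {customer_id: (recency_days, frequency, monetary)}."""
    table = {}
    for customer_id, purchased_on, amount in rows:
        last, freq, money = table.get(customer_id, (purchased_on, 0, 0.0))
        # Track the latest purchase date, the purchase count, and total spend.
        table[customer_id] = (max(last, purchased_on), freq + 1, money + amount)
    return {
        cid: ((snapshot - last).days, freq, money)
        for cid, (last, freq, money) in table.items()
    }

rfm = rfm_table(transactions, snapshot=date(2019, 12, 31))
print(rfm)
```

Segments are usually derived afterwards by binning each of the three values, for example into quantiles; that step is left out here.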
Transaction data has been provided for the period of 1st Jan 2019 to 31st Dec 2019. The data sets below have been provided.
Online_Sales.csv: This file contains actual orders data (point-of-sale data) at transaction level with the variables below.
CustomerID: Customer unique ID
Transaction_ID: Transaction unique ID
Transaction_Date: Date of transaction
Product_SKU: SKU ID, the unique ID for the product
Product_Description: Product description
Product_Cateogry: Product category
Quantity: Number of items ordered
Avg_Price: Price per one quantity
Delivery_Charges: Charges for delivery
Coupon_Status: Any discount coupon applied
Customers_Data.csv: This file contains customers' demographics.
CustomerID: Customer unique ID
Gender: Gender of customer
Location: Location of customer
Tenure_Months: Tenure in months
Discount_Coupon.csv: Discount coupons have been given for different categories in different months.
Month: Discount coupon applied in that month
Product_Category: Product categor...
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description: Explore a comprehensive dataset of e-commerce sales, encompassing a variety of product categories, pricing, customer reviews, and sales trends over the past year. This dataset is ideal for analyzing market trends, customer behavior, and sales performance. Dig into the data to uncover insights that can optimize product listings, pricing strategies, and marketing campaigns.
Columns:
product_id: Unique identifier for each product.
product_name: Name of the product.
category: Product category.
price: Price of the product.
review_score: Average customer review score (1 to 5).
review_count: Total number of reviews.
sales_month_1 to sales_month_12: Monthly sales data for each product over the past year.
Potential Analyses:
Identify top-performing product categories.
Analyze the impact of pricing on sales and customer reviews.
Discover seasonal sales trends and patterns.
Evaluate customer satisfaction based on review scores and counts.
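Two of the analyses listed above, top categories and seasonal patterns, reduce to summing the sales_month_1 to sales_month_12 columns in different directions. A minimal sketch follows; the sample products and the helper names `monthly_totals` and `category_totals` are hypothetical, and only the column names come from the description.

```python
# Hypothetical product rows using the sales_month_1..sales_month_12 columns.
products = [
    {"product_id": 1, "category": "Toys",
     **{f"sales_month_{m}": 10 * m for m in range(1, 13)}},
    {"product_id": 2, "category": "Books",
     **{f"sales_month_{m}": 5 for m in range(1, 13)}},
]

def monthly_totals(rows):
    """Sum each sales_month_N column across all products."""
    return {m: sum(r[f"sales_month_{m}"] for r in rows) for m in range(1, 13)}

def category_totals(rows):
    """Sum all twelve monthly columns per product category."""
    totals = {}
    for r in rows:
        year_sales = sum(r[f"sales_month_{m}"] for m in range(1, 13))
        totals[r["category"]] = totals.get(r["category"], 0) + year_sales
    return totals

by_month = monthly_totals(products)
peak_month = max(by_month, key=by_month.get)  # month number with highest sales
by_category = category_totals(products)
print(peak_month, by_category)
```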
MIT License https://opensource.org/licenses/MIT
License information was derived automatically
Attribution-ShareAlike 3.0 (CC BY-SA 3.0) https://creativecommons.org/licenses/by-sa/3.0/
License information was derived automatically
This dataset is a synthetic e-commerce dataset designed to provide a comprehensive view of transaction, customer, product, and advertising data in a dynamic marketplace. It simulates real-world scenarios with seasonal effects, regional variations, advertising metrics, and customer purchasing behaviors. This dataset can serve as a valuable resource for exploring e-commerce analytics, customer segmentation, product performance, and marketing effectiveness.
The dataset includes detailed transaction-level data featuring product categories, customer demographics, discounts, revenue, and advertising metrics such as impressions, clicks, conversion rates, and ad spend. Seasonal trends and regional multipliers are integrated into the data to create realistic patterns that mimic consumer behavior across different times of the year and geographic regions.
This dataset provides ample opportunities for data exploration, machine learning, and business analysis. We hope you find it insightful and useful for your projects!
Typically e-commerce datasets are proprietary and consequently hard to find among publicly available data. However, the UCI Machine Learning Repository has made available this dataset containing actual transactions from 2010 and 2011. The dataset is maintained on their site, where it can be found by the title "Online Retail".
"This is a transnational data set which contains all the transactions occurring between 01/12/2010 and 09/12/2011 for a UK-based and registered non-store online retail. The company mainly sells unique all-occasion gifts. Many customers of the company are wholesalers."
Per the UCI Machine Learning Repository, this data was made available by Dr Daqing Chen, Director: Public Analytics group. chend '@' lsbu.ac.uk, School of Engineering, London South Bank University, London SE1 0AA, UK.
Image from stocksnap.io.
Analyses for this dataset could include time series, clustering, classification and more.
This dataset was created by Bhavika Puri 2811
https://creativecommons.org/publicdomain/zero/1.0/
E-Commerce Customer Behavior and Sales Dataset
Dataset Overview
This comprehensive dataset contains 5,000 e-commerce transactions from a Turkish online retail platform, spanning from January 2023 to March 2024. The dataset provides detailed insights into customer demographics, purchasing behavior, product preferences, and engagement metrics.
Use Cases
This dataset is perfect for:
Customer Segmentation Analysis: Identify distinct customer groups based on behavior
Sales Forecasting: Predict future sales trends and patterns
Recommendation Systems: Build product recommendation engines
Customer Lifetime Value (CLV) Prediction: Estimate customer value
Churn Analysis: Identify customers at risk of leaving
Marketing Campaign Optimization: Target customers effectively
Price Optimization: Analyze price sensitivity across categories
Delivery Performance Analysis: Optimize logistics and shipping
Dataset Structure
The dataset contains 18 columns with the following features:
Order Information
Order_ID: Unique identifier for each order (ORD_XXXXXX format)
Date: Transaction date (2023-01-01 to 2024-03-26)
Customer Demographics
Customer_ID: Unique customer identifier (CUST_XXXXX format)
Age: Customer age (18-75 years)
Gender: Customer gender (Male, Female, Other)
City: Customer city (10 major Turkish cities)
Product Information
Product_Category: 8 categories (Electronics, Fashion, Home & Garden, Sports, Books, Beauty, Toys, Food)
Unit_Price: Price per unit (in TRY/Turkish Lira)
Quantity: Number of units purchased (1-5)
Transaction Details
Discount_Amount: Discount applied (if any)
Total_Amount: Final transaction amount after discount
Payment_Method: Payment method used (5 types)
Customer Behavior Metrics
Device_Type: Device used for purchase (Mobile, Desktop, Tablet)
Session_Duration_Minutes: Time spent on website (1-120 minutes)
Pages_Viewed: Number of pages viewed during session (1-50)
Is_Returning_Customer: Whether customer has purchased before (True/False)
Post-Purchase Metrics
Delivery_Time_Days: Delivery duration (1-30 days)
Customer_Rating: Customer satisfaction rating (1-5 stars)
Key Statistics
Total Records: 5,000 transactions
Date Range: January 2023 - March 2024 (15 months)
Average Transaction Value: ~450 TRY
Customer Satisfaction: 3.9/5.0 average rating
Returning Customer Rate: 60%
Mobile Usage: 55% of transactions
Data Quality
No missing values
Consistent formatting across all fields
Realistic data distributions
Proper data types for all columns
Logical relationships between features
Sample Analysis Ideas
Customer Segmentation with K-Means Clustering: Segment customers based on spending, frequency, and recency
Sales Trend Analysis: Identify seasonal patterns and peak shopping periods
Product Category Performance: Compare revenue, ratings, and return rates across categories
Device-Based Behavior Analysis: Understand how device choice affects purchasing patterns
Predictive Modeling: Build models to predict customer ratings or purchase amounts
City-Level Market Analysis: Compare market performance across different cities
Technical Details
File Format: CSV (Comma-Separated Values)
Encoding: UTF-8
File Size: ~500 KB
Delimiter: Comma (,)
Column Descriptions

| Column Name | Data Type | Description | Example |
| --- | --- | --- | --- |
| Order_ID | String | Unique order identifier | ORD_001337 |
| Customer_ID | String | Unique customer identifier | CUST_01337 |
| Date | DateTime | Transaction date | 2023-06-15 |
| Age | Integer | Customer age | 35 |
| Gender | String | Customer gender | Female |
| City | String | Customer city | Istanbul |
| Product_Category | String | Product category | Electronics |
| Unit_Price | Float | Price per unit | 1299.99 |
| Quantity | Integer | Units purchased | 2 |
| Discount_Amount | Float | Discount applied | 129.99 |
| Total_Amount | Float | Final amount paid | 2469.99 |
| Payment_Method | String | Payment method | Credit Card |
| Device_Type | String | Device used | Mobile |
| Session_Duration_Minutes | Integer | Session time | 15 |
| Pages_Viewed | Integer | Pages viewed | 8 |
| Is_Returning_Customer | Boolean | Returning customer | True |
| Delivery_Time_Days | Integer | Delivery duration | 3 |
| Customer_Rating | Integer | Satisfaction rating | 5 |

Learning Outcomes
By working with this dataset, you can learn:
Data cleaning and preprocessing techniques
Exploratory Data Analysis (EDA) with Python/R
Statistical analysis and hypothesis testing
Machine learning model development
Data visualization best practices
Business intelligence and reporting
Citation
If you use this dataset in your research or project, please cite:
E-Commerce Customer Behavior and Sales Dataset (2024) Turkish Online Retail Platform Data (2023-2024) Available on Kaggle
License
This dataset is released under the CC0: Public Domain license. You are free to use it for any purpose.
Contribution
Found any issues or have suggestions? Feel free to provide feedback!
Contact
For questions or collaborations, please reach out through Kaggle.
Happy Analyzing!
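The example values in the column descriptions suggest that Total_Amount is Unit_Price times Quantity minus Discount_Amount. That rule is an inference from the example row rather than something the description states outright, so the sketch below treats it as an assumption; the `expected_total` helper is hypothetical.

```python
from decimal import Decimal

def expected_total(unit_price, quantity, discount):
    """Total under the assumed rule: unit price times quantity, minus discount."""
    return Decimal(unit_price) * quantity - Decimal(discount)

# Example values taken from the column descriptions:
# Unit_Price 1299.99, Quantity 2, Discount_Amount 129.99.
total = expected_total("1299.99", 2, "129.99")
print(total)
```

Running the same check over every row is a quick way to test the "logical relationships between features" claim before modelling.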
Keywords: e-c...
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset provides a comprehensive collection of consumer behavior data that can be used for various market research and statistical analyses. It includes information on purchasing patterns, demographics, product preferences, customer satisfaction, and more, making it ideal for market segmentation, predictive modeling, and understanding customer decision-making processes.
The dataset is designed to help researchers, data scientists, and marketers gain insights into consumer purchasing behavior across a wide range of categories. By analyzing this dataset, users can identify key trends, segment customers, and make data-driven decisions to improve product offerings, marketing strategies, and customer engagement.
Key Features:
Customer Demographics: Understand age, income, gender, and education level for better segmentation and targeted marketing.
Purchase Behavior: Includes purchase amount, frequency, category, and channel preferences to assess spending patterns.
Customer Loyalty: Features like brand loyalty, engagement with ads, and loyalty program membership provide insights into long-term customer retention.
Product Feedback: Customer ratings and satisfaction levels allow for analysis of product quality and customer sentiment.
Decision-Making: Time spent on product research, time to decision, and purchase intent reflect how customers make purchasing decisions.
Influences on Purchase: Factors such as social media influence, discount sensitivity, and return rates are included to analyze how external factors affect purchasing behavior.
Columns Overview:
Customer_ID: Unique identifier for each customer.
Age: Customer's age (integer).
Gender: Customer's gender (categorical: Male, Female, Non-binary, Other).
Income_Level: Customer's income level (categorical: Low, Middle, High).
Marital_Status: Customer's marital status (categorical: Single, Married, Divorced, Widowed).
Education_Level: Highest level of education completed (categorical: High School, Bachelor's, Master's, Doctorate).
Occupation: Customer's occupation (categorical: Various job titles).
Location: Customer's location (city, region, or country).
Purchase_Category: Category of purchased products (e.g., Electronics, Clothing, Groceries).
Purchase_Amount: Amount spent during the purchase (decimal).
Frequency_of_Purchase: Number of purchases made per month (integer).
Purchase_Channel: The purchase method (categorical: Online, In-Store, Mixed).
Brand_Loyalty: Loyalty to brands (1-5 scale).
Product_Rating: Rating given by the customer to a purchased product (1-5 scale).
Time_Spent_on_Product_Research: Time spent researching a product (integer, hours or minutes).
Social_Media_Influence: Influence of social media on purchasing decision (categorical: High, Medium, Low, None).
Discount_Sensitivity: Sensitivity to discounts (categorical: Very Sensitive, Somewhat Sensitive, Not Sensitive).
Return_Rate: Percentage of products returned (decimal).
Customer_Satisfaction: Overall satisfaction with the purchase (1-10 scale).
Engagement_with_Ads: Engagement level with advertisements (categorical: High, Medium, Low, None).
Device_Used_for_Shopping: Device used for shopping (categorical: Smartphone, Desktop, Tablet).
Payment_Method: Method of payment used for the purchase (categorical: Credit Card, Debit Card, PayPal, Cash, Other).
Time_of_Purchase: Timestamp of when the purchase was made (date/time).
Discount_Used: Whether the customer used a discount (Boolean: True/False).
Customer_Loyalty_Program_Member: Whether the customer is part of a loyalty program (Boolean: True/False).
Purchase_Intent: The intent behind the purchase (categorical: Impulsive, Planned, Need-based, Wants-based).
Shipping_Preference: Shipping preference (categorical: Standard, Express, No Preference).
Payment_Frequency: Frequency of payment (categorical: One-time, Subscription, Installments).
Time_to_Decision: Time taken from consideration to actual purchase (in days).
Use Cases:
Market Segmentation: Segment customers based on demographics, preferences, and behavior.
Predictive Analytics: Use data to predict customer spending habits, loyalty, and product preferences.
Customer Profiling: Build detailed profiles of different consumer segments based on purchase behavior, social media influence, and decision-making patterns.
Retail and E-commerce Insights: Analyze purchase channels, payment methods, and shipping preferences to optimize marketing and sales strategies.
Target Audience:
Data scientists and analysts looking for consumer behavior data.
Marketers interested in improving customer segmentation and targeting.
Researchers exploring factors influencing consumer decisions and preferences.
Companies aiming to improve customer experience and increase sales through data-driven decisions.
This dataset is available in CSV format for easy integration into data analysis tools and platforms such as Python, R, and Excel.
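As a sketch of how the CSV might be read in Python for a simple segmentation, the snippet below averages Purchase_Amount per Income_Level using only the standard library. The inline sample stands in for the real file, the values are hypothetical, and Purchase_Amount is assumed to be a plain decimal with no currency symbol.

```python
import csv
import io
from collections import defaultdict

# Hypothetical CSV excerpt; only the column names come from the description.
sample_csv = """Customer_ID,Income_Level,Purchase_Amount
1,Low,20.00
2,High,150.00
3,High,250.00
"""

def mean_purchase_by_income(csv_file):
    """Average Purchase_Amount for each Income_Level value."""
    amounts = defaultdict(list)
    for row in csv.DictReader(csv_file):
        amounts[row["Income_Level"]].append(float(row["Purchase_Amount"]))
    return {level: sum(vals) / len(vals) for level, vals in amounts.items()}

averages = mean_purchase_by_income(io.StringIO(sample_csv))
print(averages)
```

With the real file, the `io.StringIO` object would be replaced by an opened file handle.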
https://creativecommons.org/publicdomain/zero/1.0/
This dataset contains a synthetic but realistic sample of e-commerce sales for an online store, covering the period from 2024 to 2025. It includes details about orders, customers, products, regions, pricing, discounts, sales, profit, and payment modes.
It is designed for data analysis, visualization, and machine learning projects. Beginners and advanced users can use this dataset to practice:
Exploratory Data Analysis (EDA)
Sales trend analysis
Profit margin and discount analysis
Customer segmentation
Predictive modeling (e.g., sales or profit prediction)
MIT License https://opensource.org/licenses/MIT
License information was derived automatically
This dataset was created by VikasNagarVK
Released under MIT
Apache License, v2.0 https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
Data has been provided from Sep 2016 to Oct 2018.
Tables:
Customers: Customers information
Sellers: Sellers information
Products: Product information
Orders: Orders info like ordered, product id, status, order dates, etc.
Order_Items: Order-level information
Order_Payments: Order payment information
Order_Review_Ratings: Customer ratings at order level
Geo-Location: Location details
The data model is shown in the image below.
https://www.googleapis.com/download/storage/v1/b/kaggle-user-content/o/inbox%2F16953750%2F9453e20264c25ed7812d2ec4ee0031a4%2FScreenshot%202024-08-25%20170248.png?generation=1724585592640394&alt=media
Business Context: The client is one of the leading online marketplaces in India and would like to partner with Analytixlabs. The client wants help in measuring, managing, and analysing the performance of the business. Analytixlabs has hired you as an analyst for this project, in which the client has asked you to provide data-driven insights about the business and to understand customer, seller, product, and channel behavior. While working on this project, you are expected to clean the data (if required) before analyzing it.
Business Objective: Below are a few sample business questions to be addressed as part of this analysis. The list is not exhaustive; you can add as many analyses as you like and provide insights on them.
1. Perform detailed exploratory analysis
a. Define and calculate high-level metrics (total revenue, total quantity, total products, total categories, total sellers, total locations, total channels, total payment methods, etc.)
b. Understand how many new customers are acquired every month
c. Understand customer retention on a month-on-month basis
d. Understand how revenue from existing and new customers changes on a month-on-month basis
e. Understand the trends and seasonality of sales and quantity by category, location, month, week, day, time, channel, payment method, etc.
f. Popular products by month, seller, state, category
g. Popular categories by state, month
h. List the top 10 most expensive products sorted by price
2. Perform customer and seller segmentation
a. Divide the customers into groups based on the revenue generated
b. Divide the sellers into groups based on the revenue generated
3. Cross-selling (which products are selling together)
4. Payment behaviour
a. How are customers paying?
b. Which payment channels are used by most customers?
5. Customer satisfaction towards category and product
a. Which categories (top 10) are rated highest and lowest?
b. Which products (top 10) are rated highest and lowest?
c. Average rating by location, seller, product, category, month, etc.
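Question 1b, new customers acquired per month, can be approached by finding each customer's first order and counting those first orders by month. The sketch below assumes orders are available as (customer id, order date) pairs; the sample data and the `new_customers_per_month` helper are hypothetical and do not reflect the actual table layout.

```python
from collections import Counter
from datetime import date

# Hypothetical (customer_id, order_date) pairs.
orders = [
    ("C1", date(2017, 1, 5)),
    ("C2", date(2017, 1, 20)),
    ("C1", date(2017, 2, 3)),
    ("C3", date(2017, 2, 14)),
]

def new_customers_per_month(rows):
    """Count customers by the month of their first order."""
    first_order = {}
    for customer_id, ordered_on in rows:
        if customer_id not in first_order or ordered_on < first_order[customer_id]:
            first_order[customer_id] = ordered_on
    return Counter(d.strftime("%Y-%m") for d in first_order.values())

acquired = new_customers_per_month(orders)
print(dict(acquired))
```

The same first-order lookup is a natural starting point for the retention and new-versus-existing revenue questions in 1c and 1d.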
https://creativecommons.org/publicdomain/zero/1.0/
This synthetic dataset simulates a large-scale e-commerce platform with 100,000 records, ideal for data analysis, machine learning, and visualization projects. It includes various data types and reflects real-world e-commerce operations, making it suitable for portfolio projects focused on user behavior analysis, sales trends, and product performance.
This dataset contains 100,000 rows with details on users, products, and transactions, as well as user engagement and transaction attributes. It is crafted to resemble actual e-commerce data, providing insights into customer demographics, purchasing patterns, and engagement.
Male, Female, and Non-Binary.
USA, Canada, UK, Australia, India, and Germany.
Laptop, Smartphone, Headphones, Shoes, T-shirt, Book, Watch).
Electronics, Apparel, Books, and Accessories.
Price * Quantity).
True or False).
Excellent, Good, Average, Poor).
Mobile, Desktop, and Tablet.
Organic Search, Ad Campaign, Email Marketing, or Social Media.
This dataset is intended for:
- Exploratory Data Analysis (EDA): Understanding customer demographics, popular products, and sales distribution.
- Data Visualization: Visualizing user engagement, sales trends, and product category performance.
- Machine Learning Models: Training models on customer segmentation, purchase prediction, and review rating analysis.
This dataset is freely available for use in projects and portfolios. When sharing results derived from this dataset, please credit it as a synthetic data source.
CSV version of Looker Ecommerce Dataset.
Overview
Dataset in BigQuery
TheLook is a fictitious eCommerce clothing site developed by the Looker team. The dataset contains information about customers, products, orders, logistics, web events and digital marketing campaigns. The contents of this dataset are synthetic, and are provided to industry practitioners for the purpose of product discovery, testing, and evaluation. This public dataset is hosted in Google BigQuery and is included in BigQuery's 1TB/mo of free tier processing. This means that each user receives 1TB of free BigQuery processing every month, which can be used to run queries on this public dataset. Watch this short video to learn how to get started quickly using BigQuery to access public datasets.
distribution_centers.csv
id: Unique identifier for each distribution center.
name: Name of the distribution center.
latitude: Latitude coordinate of the distribution center.
longitude: Longitude coordinate of the distribution center.

events.csv
id: Unique identifier for each event.
user_id: Identifier for the user associated with the event.
sequence_number: Sequence number of the event.
session_id: Identifier for the session during which the event occurred.
created_at: Timestamp indicating when the event took place.
ip_address: IP address from which the event originated.
city: City where the event occurred.
state: State where the event occurred.
postal_code: Postal code of the event location.
browser: Web browser used during the event.
traffic_source: Source of the traffic leading to the event.
uri: Uniform Resource Identifier associated with the event.
event_type: Type of event recorded.

inventory_items.csv
id: Unique identifier for each inventory item.
product_id: Identifier for the associated product.
created_at: Timestamp indicating when the inventory item was created.
sold_at: Timestamp indicating when the item was sold.
cost: Cost of the inventory item.
product_category: Category of the associated product.
product_name: Name of the associated product.
product_brand: Brand of the associated product.
product_retail_price: Retail price of the associated product.
product_department: Department to which the product belongs.
product_sku: Stock Keeping Unit (SKU) of the product.
product_distribution_center_id: Identifier for the distribution center associated with the product.

order_items.csv
id: Unique identifier for each order item.
order_id: Identifier for the associated order.
user_id: Identifier for the user who placed the order.
product_id: Identifier for the associated product.
inventory_item_id: Identifier for the associated inventory item.
status: Status of the order item.
created_at: Timestamp indicating when the order item was created.
shipped_at: Timestamp indicating when the order item was shipped.
delivered_at: Timestamp indicating when the order item was delivered.
returned_at: Timestamp indicating when the order item was returned.

orders.csv
order_id: Unique identifier for each order.
user_id: Identifier for the user who placed the order.
status: Status of the order.
gender: Gender information of the user.
created_at: Timestamp indicating when the order was created.
returned_at: Timestamp indicating when the order was returned.
shipped_at: Timestamp indicating when the order was shipped.
delivered_at: Timestamp indicating when the order was delivered.
num_of_item: Number of items in the order.

products.csv
id: Unique identifier for each product.
cost: Cost of the product.
category: Category to which the product belongs.
name: Name of the product.
brand: Brand of the product.
retail_price: Retail price of the product.
department: Department to which the product belongs.
sku: Stock Keeping Unit (SKU) of the product.
distribution_center_id: Identifier for the distribution center associated with the product.

users.csv
id: Unique identifier for each user.
first_name: First name of the user.
last_name: Last name of the user.
email: Email address of the user.
age: Age of the user.
gender: Gender of the user.
state: State where t...
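The files above are related through their identifier columns (for example, order_items.product_id points at products.id). A minimal pandas sketch of one such join follows. The rows are made up for illustration and are not real TheLook data, the status values "Complete" and "Returned" are assumptions since the description does not list the allowed values, and revenue_by_category is a hypothetical helper name.

```python
import pandas as pd

# Tiny made-up rows that mimic the documented columns; not real TheLook data.
products = pd.DataFrame({
    "id": [1, 2, 3],
    "category": ["Jeans", "Tops", "Jeans"],
    "retail_price": [60.0, 25.0, 80.0],
})
order_items = pd.DataFrame({
    "id": [10, 11, 12, 13],
    "order_id": [100, 100, 101, 102],
    "product_id": [1, 2, 3, 2],
    # Status values are assumed; the dataset description does not enumerate them.
    "status": ["Complete", "Complete", "Returned", "Complete"],
})


def revenue_by_category(order_items: pd.DataFrame, products: pd.DataFrame) -> pd.Series:
    """Join order items to products and sum retail_price per category."""
    # Drop returned items so they do not count toward revenue.
    kept = order_items[order_items["status"] != "Returned"]
    # Both tables have an "id" column, so suffixes keep them apart after the join.
    joined = kept.merge(
        products,
        left_on="product_id",
        right_on="id",
        how="left",
        suffixes=("_item", "_product"),
    )
    return joined.groupby("category")["retail_price"].sum().sort_values(ascending=False)


result = revenue_by_category(order_items, products)
print(result)
```

With the real CSV files, the two DataFrames would presumably be loaded with pd.read_csv instead of being built inline.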
https://creativecommons.org/publicdomain/zero/1.0/
Dataset Description: E-commerce Customer Behavior
Overview: This dataset provides a comprehensive view of customer behavior within an e-commerce platform. Each entry in the dataset corresponds to a unique customer, offering a detailed breakdown of their interactions and transactions. The information is crafted to facilitate a nuanced analysis of customer preferences, engagement patterns, and satisfaction levels, aiding businesses in making data-driven decisions to enhance the customer experience.
Columns:
Customer ID:
Gender:
Age:
City:
Membership Type:
Total Spend:
Items Purchased:
Average Rating:
Discount Applied:
Days Since Last Purchase:
Satisfaction Level:
Use Cases:
Customer Segmentation:
Satisfaction Analysis:
Promotion Strategy:
Retention Strategies:
City-based Insights:
Note: This dataset is synthetically generated for illustrative purposes, and any resemblance to real individuals or scenarios is coincidental.
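As a rough illustration of the segmentation and retention use cases, the sketch below groups customers by Membership Type and flags those who have not purchased recently. The sample rows, the membership values (Gold, Silver, Bronze), the 30-day threshold, and the helper name summarize_segments are all assumptions made for the example; only the column names come from the description above.

```python
import pandas as pd

# Made-up sample rows using the documented column names.
customers = pd.DataFrame({
    "Customer ID": [101, 102, 103, 104],
    "Membership Type": ["Gold", "Silver", "Gold", "Bronze"],  # values assumed
    "Total Spend": [1200.0, 800.0, 1000.0, 400.0],
    "Days Since Last Purchase": [10, 45, 20, 60],
})


def summarize_segments(df: pd.DataFrame, inactive_after_days: int = 30) -> pd.DataFrame:
    """Per membership tier: customer count, average spend, and at-risk count."""
    out = df.copy()
    # A customer is treated as "at risk" past the (hypothetical) inactivity threshold.
    out["At Risk"] = out["Days Since Last Purchase"] > inactive_after_days
    return out.groupby("Membership Type").agg(
        customers=("Customer ID", "count"),
        avg_spend=("Total Spend", "mean"),
        at_risk=("At Risk", "sum"),
    )


segment_summary = summarize_segments(customers)
print(segment_summary)
```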
Online Retail E-Commerce Data
Hey everyone!
This dataset contains real e-commerce transaction data from 2009 to 2011. It comes from a UK-based online store that sells a variety of products. The data includes details like invoices, product codes, descriptions, prices, and even customer IDs.
What's Inside? Each row represents a transaction, and the dataset has the following key columns:
Invoice: Unique order ID
StockCode: Product code
Description: Name of the product
Quantity: Number of units sold
InvoiceDate: When the purchase happened
Price: Unit price of the product
Customer ID: Unique identifier for each customer
Country: Where the customer is from
Why is this dataset useful? This dataset is great for exploring:
Customer Segmentation (Find high-value customers)
Customer Lifetime Value (LTV) Analysis
Sales & Revenue Trends
Market Basket Analysis (Which products are bought together?)
Predicting Churn & Retention Strategies
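One common starting point for finding high-value customers is a recency, frequency, monetary (RFM) table. The sketch below is a minimal version under stated assumptions: the rows are invented, line revenue is taken to be Quantity times Price, the reference date as_of is chosen arbitrarily, and rfm_table is a hypothetical helper name. Only the column names come from the dataset description.

```python
import pandas as pd

# Invented rows using the documented column names; not the real transactions.
tx = pd.DataFrame({
    "Invoice": ["A1", "A1", "A2", "B1"],
    "Quantity": [2, 1, 3, 5],
    "InvoiceDate": pd.to_datetime(
        ["2010-01-05", "2010-01-05", "2010-03-01", "2010-02-10"]
    ),
    "Price": [2.5, 10.0, 4.0, 1.0],
    "Customer ID": [1, 1, 1, 2],
})


def rfm_table(df: pd.DataFrame, as_of: pd.Timestamp) -> pd.DataFrame:
    """Recency (days since last invoice), frequency (distinct invoices), monetary (revenue)."""
    # Assumption: revenue per line is units sold times unit price.
    df = df.assign(Revenue=df["Quantity"] * df["Price"])
    grouped = df.groupby("Customer ID").agg(
        last_purchase=("InvoiceDate", "max"),
        frequency=("Invoice", "nunique"),
        monetary=("Revenue", "sum"),
    )
    grouped["recency_days"] = (as_of - grouped["last_purchase"]).dt.days
    return grouped[["recency_days", "frequency", "monetary"]]


rfm = rfm_table(tx, as_of=pd.Timestamp("2010-03-31"))
print(rfm)
```

On the real data, rows with a missing Customer ID or a negative Quantity (if any exist) would probably need to be handled before grouping.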
How Can You Use It? If you're into data science, machine learning, or business analytics, this dataset is perfect for hands-on projects. You can analyze customer behavior, predict sales, or even build recommendation systems.
Hope this dataset helps with your projects! Let me know if you find something interesting.
MIT License https://opensource.org/licenses/MIT
License information was derived automatically
Dataset Name: Online Store Dataset
Description: The Online Store dataset is a comprehensive collection of 500 rows of synthetic e-commerce product data. Designed to simulate an online retail environment similar to major e-commerce platforms like Amazon, this dataset includes a diverse range of attributes for each product. The dataset provides valuable insights into product characteristics, pricing, stock levels, and customer feedback, making it ideal for analysis, machine learning, and data visualization projects.
Features:
ID: Unique identifier for each product.
Product_Name: Name of the product, generated using random words to simulate real-world product names.
Category: Product category (e.g., Electronics, Clothing, Books, Home, Toys, Sports).
Price: Product price, ranging from $10 to $500.
Stock: Number of items available in stock.
Rating: Customer rating of the product (1 to 5 stars).
Reviews: Number of customer reviews.
Brand: Brand of the product.
Date_Added: Date when the product was added to the catalog.
Discount: Percentage discount applied to the product.
Use Cases:
Data Analysis: Explore trends and patterns in e-commerce product data.
Machine Learning: Build and train models for product recommendation, pricing strategies, or customer segmentation.
Data Visualization: Create visualizations to analyze product categories, pricing distribution, and customer reviews.
Notes:
The data is synthetic and randomly generated, reflecting typical attributes found in e-commerce platforms. This dataset can be used for educational purposes, practice, and experimentation with various data analysis and machine learning techniques.
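As a small example of the pricing analysis mentioned above, the sketch below derives a discounted price and a stock value per category. It assumes Discount is stored on a 0 to 100 scale, which the description implies but does not state outright. The rows are invented, and Discounted_Price, Stock_Value, and add_discounted_price are hypothetical names introduced for the example.

```python
import pandas as pd

# Invented rows using the documented column names.
products = pd.DataFrame({
    "ID": [1, 2, 3],
    "Category": ["Books", "Toys", "Books"],
    "Price": [20.0, 100.0, 50.0],
    "Stock": [10, 2, 4],
    "Discount": [10, 25, 0],  # assumed to be a percentage on a 0-100 scale
})


def add_discounted_price(df: pd.DataFrame) -> pd.DataFrame:
    """Add the price after discount and the value of the stock on hand."""
    out = df.copy()
    out["Discounted_Price"] = out["Price"] * (1 - out["Discount"] / 100)
    out["Stock_Value"] = out["Discounted_Price"] * out["Stock"]
    return out


priced = add_discounted_price(products)
stock_value_by_category = priced.groupby("Category")["Stock_Value"].sum()
print(stock_value_by_category)
```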
https://creativecommons.org/publicdomain/zero/1.0/
Overview: This dataset contains 50,000 fictional e-commerce transaction records, making it ideal for data analysis, visualization, and machine learning experiments. It includes user demographics, product categories, purchase amounts, payment methods, and transaction dates to help understand consumer behavior and sales trends.
Columns:
Transaction_ID: Unique identifier for each transaction
User_Name: Randomly generated user name
Age: Age of the user (18 to 70)
Country: Country where the transaction took place (randomly chosen from 10 countries)
Product_Category: Category of the purchased item (e.g., Electronics, Clothing, Books)
Purchase_Amount: Total amount spent on the transaction (randomly generated between $5 and $1000)
Payment_Method: Method used for payment (e.g., Credit Card, PayPal, UPI)
Transaction_Date: Date of the purchase (randomly selected within the past two years)
Use Cases:
Sales and trend analysis: Identify which product categories are most popular
Customer segmentation: Analyze spending behavior based on age and country
Fraud detection: Detect unusual purchase patterns
Machine learning projects: Train models for recommendation systems or revenue predictions
This dataset is synthetic and does not contain real user data. It can be used for research, experimentation, and educational purposes.
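For the customer segmentation use case, one simple approach is to bucket Age into bands and compare spending across them. The sketch below is illustrative only: the rows are invented, the band edges (18-30, 31-50, 51-70) are an arbitrary choice within the documented 18 to 70 range, and spend_by_age_band is a hypothetical helper name.

```python
import pandas as pd

# Invented rows using the documented column names.
transactions = pd.DataFrame({
    "Transaction_ID": [1, 2, 3, 4, 5],
    "Age": [18, 25, 40, 55, 70],
    "Purchase_Amount": [100.0, 300.0, 50.0, 150.0, 250.0],
})

# Bins are right-inclusive: (17, 30], (30, 50], (50, 70]. The edges are an assumption.
AGE_BINS = [17, 30, 50, 70]
AGE_LABELS = ["18-30", "31-50", "51-70"]


def spend_by_age_band(df: pd.DataFrame) -> pd.DataFrame:
    """Transaction count and mean purchase amount per age band."""
    bands = pd.cut(df["Age"], bins=AGE_BINS, labels=AGE_LABELS)
    return df.groupby(bands, observed=True)["Purchase_Amount"].agg(["count", "mean"])


age_summary = spend_by_age_band(transactions)
print(age_summary)
```

Because the amounts in this dataset are described as randomly generated, differences between bands on the real file are likely to be noise rather than a genuine pattern.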