Apache License, v2.0 https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
Dataset: Online Shopping Dataset;
CustomerID:
Description: Unique identifier for each customer. Data Type: Numeric;
Gender:
Description: Gender of the customer (e.g., Male, Female). Data Type: Categorical;
Location:
Description: Location or address information of the customer. Data Type: Text;
Tenure_Months:
Description: Number of months the customer has been associated with the platform. Data Type: Numeric;
Transaction_ID:
Description: Unique identifier for each transaction. Data Type: Numeric;
Transaction_Date:
Description: Date of the transaction. Data Type: Date;
Product_SKU:
Description: Stock Keeping Unit (SKU) identifier for the product. Data Type: Text;
Product_Description:
Description: Description of the product. Data Type: Text;
Product_Category:
Description: Category to which the product belongs. Data Type: Categorical;
Quantity:
Description: Quantity of the product purchased in the transaction. Data Type: Numeric;
Avg_Price:
Description: Average price of the product. Data Type: Numeric;
Delivery_Charges:
Description: Charges associated with the delivery of the product. Data Type: Numeric;
Coupon_Status:
Description: Status of the coupon associated with the transaction. Data Type: Categorical;
GST:
Description: Goods and Services Tax associated with the transaction. Data Type: Numeric;
Date:
Description: Date of the transaction (potentially redundant with Transaction_Date). Data Type: Date;
Offline_Spend:
Description: Amount spent offline by the customer. Data Type: Numeric;
Online_Spend:
Description: Amount spent online by the customer. Data Type: Numeric;
Month:
Description: Month of the transaction. Data Type: Categorical;
Coupon_Code:
Description: Code associated with a coupon, if applicable. Data Type: Text;
Discount_pct:
Description: Percentage of discount applied to the transaction. Data Type: Numeric;
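The note that `Date` may be redundant with `Transaction_Date` can be checked directly. The following is a minimal sketch using only the Python standard library; the two date formats and the sample values are assumptions for illustration and are not taken from the dataset.

```python
import csv
import io
from datetime import datetime


def parse_date(text, formats=("%Y-%m-%d", "%m/%d/%Y")):
    """Try each assumed date format in turn; return None if none match."""
    for fmt in formats:
        try:
            return datetime.strptime(text.strip(), fmt).date()
        except ValueError:
            continue
    return None


def count_date_mismatches(rows, left="Transaction_Date", right="Date"):
    """Count rows where the two date columns parse to different days."""
    return sum(1 for row in rows if parse_date(row[left]) != parse_date(row[right]))


# Tiny in-memory sample standing in for the real CSV (values are made up).
sample = io.StringIO(
    "Transaction_Date,Date\n"
    "2019-01-01,1/1/2019\n"
    "2019-01-02,1/3/2019\n"
)
mismatches = count_date_mismatches(csv.DictReader(sample))
print(mismatches)
```

A count of zero on the full file would suggest that one of the two columns can be dropped. Note that two unparseable values compare as equal here, so it may be worth counting `None` results separately.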
Context
E-commerce datasets are typically proprietary: they are owned and controlled by individual organizations and are rarely made public because of privacy and business considerations. Nevertheless, the UCI Machine Learning Repository, known for its extensive collection of datasets for machine learning and data mining research, hosts a dataset of actual transactions from 2010 and 2011. It is maintained on the repository's site under the title "Online Retail".
Content
The dataset is a transnational one, capturing every transaction made from December 1, 2010, through December 9, 2011, by a UK-based non-store online retail company. As an online retail entity, the company doesn't have a physical store presence, and its operations and sales are conducted purely online. The company's primary product offering includes unique gifts for all occasions. While the company serves a diverse range of customers, a significant number of its clientele includes wholesalers.
Acknowledgements
In collaboration with the UCI Machine Learning Repository, the dataset was provided and made available by Dr. Daqing Chen. Dr. Chen is the Director of the Public Analytics group at London South Bank University, UK. Any correspondence regarding this dataset can be sent to Dr. Chen at 'chend' at 'lsbu.ac.uk'. We are grateful to him for providing such an invaluable resource for researchers and data science enthusiasts.
The image used has been sourced from Canva
Inspiration
The rich and extensive data within this dataset opens the door for a multitude of potential analyses. It lends itself well to various methods and techniques in data science, including but not limited to time series analysis, clustering, and classification. By exploring this dataset, one could derive key insights into customer behavior, transaction trends, and product performance, providing ample opportunities for deep and insightful explorations.
https://creativecommons.org/publicdomain/zero/1.0/
Overview:
This dataset contains 1000 rows of synthetic online retail sales data, mimicking transactions from an e-commerce platform. It includes information about customer demographics, product details, purchase history, and (optional) reviews. This dataset is suitable for a variety of data analysis, data visualization and machine learning tasks, including but not limited to: customer segmentation, product recommendation, sales forecasting, market basket analysis, and exploring general e-commerce trends. The data was generated using the Python Faker library, ensuring realistic values and distributions, while maintaining no privacy concerns as it contains no real customer information.
Data Source:
This dataset is entirely synthetic. It was generated using the Python Faker library and does not represent any real individuals or transactions.
Data Content:
| Column Name | Data Type | Description |
|---|---|---|
| customer_id | Integer | Unique customer identifier (ranging from 10000 to 99999) |
| order_date | Date | Order date (a random date within the last year) |
| product_id | Integer | Product identifier (ranging from 100 to 999) |
| category_id | Integer | Product category identifier (10, 20, 30, 40, or 50) |
| category_name | String | Product category name (Electronics, Fashion, Home & Living, Books & Stationery, Sports & Outdoors) |
| product_name | String | Product name (randomly selected from a list of products within the corresponding category) |
| quantity | Integer | Quantity of the product ordered (ranging from 1 to 5) |
| price | Float | Unit price of the product (ranging from 10.00 to 500.00, with two decimal places) |
| payment_method | String | Payment method used (Credit Card, Bank Transfer, Cash on Delivery) |
| city | String | Customer's city (generated using Faker's city() method, so the locations depend on the Faker locale used) |
| review_score | Integer | Customer's product rating (ranging from 1 to 5, or None with a 20% probability) |
| gender | String | Customer's gender (M/F, or None with a 10% probability) |
| age | Integer | Customer's age (ranging from 18 to 75) |
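The column ranges in the table are enough to sketch how such rows could be produced. The example below is an illustration only: it uses the standard `random` module instead of Faker, it omits `product_name` and `city` (the product lists and locale are not given), and the category-to-ID pairing, the reference date, and the seed are assumptions.

```python
import random
from datetime import date, timedelta

# Assumed pairing of category IDs and names, in the order they are listed.
CATEGORIES = {
    10: "Electronics",
    20: "Fashion",
    30: "Home & Living",
    40: "Books & Stationery",
    50: "Sports & Outdoors",
}
PAYMENT_METHODS = ["Credit Card", "Bank Transfer", "Cash on Delivery"]


def make_row(rng, today):
    """Build one synthetic order row that follows the column ranges in the table."""
    category_id = rng.choice(list(CATEGORIES))
    return {
        "customer_id": rng.randint(10000, 99999),
        "order_date": today - timedelta(days=rng.randint(0, 364)),
        "product_id": rng.randint(100, 999),
        "category_id": category_id,
        "category_name": CATEGORIES[category_id],
        "quantity": rng.randint(1, 5),
        "price": round(rng.uniform(10.0, 500.0), 2),
        "payment_method": rng.choice(PAYMENT_METHODS),
        # None with 20% probability, otherwise a score from 1 to 5.
        "review_score": None if rng.random() < 0.2 else rng.randint(1, 5),
        # None with 10% probability, otherwise M or F.
        "gender": None if rng.random() < 0.1 else rng.choice(["M", "F"]),
        "age": rng.randint(18, 75),
    }


rng = random.Random(42)  # fixed seed so the sketch is reproducible
rows = [make_row(rng, date(2024, 1, 1)) for _ in range(1000)]
```

Because `review_score` and `gender` can be `None`, any analysis on these columns needs an explicit missing-value policy.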
Potential Use Cases (Inspiration):
Customer Segmentation: Group customers based on demographics, purchasing behavior, and preferences.
Product Recommendation: Build a recommendation system to suggest products to customers based on their past purchases and browsing history.
Sales Forecasting: Predict future sales based on historical trends.
Market Basket Analysis: Identify products that are frequently purchased together.
Price Optimization: Analyze the relationship between price and demand.
Geographic Analysis: Explore sales patterns across different cities.
Time Series Analysis: Investigate sales trends over time.
Educational Purposes: Great for practicing data cleaning, EDA, feature engineering, and modeling.
https://creativecommons.org/publicdomain/zero/1.0/
E-commerce has become a new channel to support business development. Through e-commerce, businesses can reach and establish a wider market presence by providing cheaper and more efficient distribution channels for their products or services. E-commerce has also changed the way people shop and consume products and services. Many people are turning to their computers or smart devices to order goods, which can easily be delivered to their homes.
This is a sales transaction data set from a UK-based e-commerce (online retail) business, covering one year. This London-based shop has been selling gifts and homewares for adults and children through its website since 2007. Its customers come from all over the world and usually make direct purchases for themselves. There are also small businesses that buy in bulk and sell to other customers through retail outlet channels.
The data set contains 500K rows and 8 columns. The following is the description of each column.
1. TransactionNo (categorical): a six-digit unique number that defines each transaction. The letter “C” in the code indicates a cancellation.
2. Date (numeric): the date when each transaction was generated.
3. ProductNo (categorical): a five or six-digit unique character used to identify a specific product.
4. Product (categorical): product/item name.
5. Price (numeric): the price of each product per unit in pound sterling (£).
6. Quantity (numeric): the quantity of each product per transaction. Negative values relate to cancelled transactions.
7. CustomerNo (categorical): a five-digit unique number that defines each customer.
8. Country (categorical): name of the country where the customer resides.
A small percentage of orders in the data set were cancelled. Most of these cancellations were due to out-of-stock conditions on some products. In this situation, customers tend to cancel an order because they want all products delivered at once.
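Cancellations can be separated from regular sales using the two signals in the column descriptions: the letter "C" in `TransactionNo` and a negative `Quantity`. The sketch below assumes rows are already loaded as dictionaries; the sample values are made up.

```python
def is_cancellation(transaction_no, quantity):
    """Flag a row as cancelled if the code contains 'C' or the quantity is negative."""
    return "C" in str(transaction_no).upper() or int(quantity) < 0


# Made-up rows in the column layout described above.
rows = [
    {"TransactionNo": "581482", "Quantity": 12},
    {"TransactionNo": "C581484", "Quantity": -3},
    {"TransactionNo": "581490", "Quantity": 1},
    {"TransactionNo": "C581499", "Quantity": -1},
]
cancelled = [r for r in rows if is_cancellation(r["TransactionNo"], r["Quantity"])]
cancel_rate = len(cancelled) / len(rows)
print(f"{cancel_rate:.0%} of rows are cancellations")
```

Checking both signals is a deliberate choice: if the two ever disagree in the real file, those rows deserve a closer look before they are counted either way.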
Information is a main asset of businesses nowadays. The success of a business in a competitive environment depends on its ability to acquire, store, and utilize information. Data is one of the main sources of information. Therefore, data analysis is an important activity for acquiring new and useful information. Analyze this dataset and try to answer the following questions.
1. How did sales trend over the months?
2. What are the most frequently purchased products?
3. How many products does the customer purchase in each transaction?
4. Which customer segments are the most profitable?
5. Based on your findings, what strategy could you recommend to the business to gain more profit?
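The first question (the monthly sales trend) reduces to summing `Price * Quantity` per calendar month. The following is one possible sketch; the `%m/%d/%Y` date format and the sample rows are assumptions, and cancelled rows (negative quantities) are skipped rather than netted against sales.

```python
from collections import defaultdict
from datetime import datetime


def monthly_revenue(rows, date_format="%m/%d/%Y"):
    """Sum Price * Quantity per calendar month, skipping cancelled (negative) rows."""
    totals = defaultdict(float)
    for row in rows:
        if row["Quantity"] < 0:
            continue
        month = datetime.strptime(row["Date"], date_format).strftime("%Y-%m")
        totals[month] += row["Price"] * row["Quantity"]
    return dict(sorted(totals.items()))


# Made-up rows for illustration.
rows = [
    {"Date": "12/09/2019", "Price": 10.0, "Quantity": 2},
    {"Date": "12/10/2019", "Price": 5.0, "Quantity": 1},
    {"Date": "01/05/2019", "Price": 4.0, "Quantity": 3},
    {"Date": "01/06/2019", "Price": 4.0, "Quantity": -3},
]
trend = monthly_revenue(rows)
print(trend)
```

Whether to skip cancellations or subtract them is an analysis decision; subtracting gives net revenue, while skipping shows the demand that was placed.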
Typically e-commerce datasets are proprietary and consequently hard to find among publicly available data. However, the UCI Machine Learning Repository has made available this dataset containing actual transactions from 2010 and 2011. The dataset is maintained on their site, where it can be found by the title "Online Retail".
"This is a transnational data set which contains all the transactions occurring between 01/12/2010 and 09/12/2011 for a UK-based and registered non-store online retail. The company mainly sells unique all-occasion gifts. Many customers of the company are wholesalers."
Per the UCI Machine Learning Repository, this data was made available by Dr Daqing Chen, Director: Public Analytics group. chend '@' lsbu.ac.uk, School of Engineering, London South Bank University, London SE1 0AA, UK.
Image from stocksnap.io.
Analyses for this dataset could include time series, clustering, classification and more.
By Weitong Li [source]
This dataset is a rich compilation of data that thoroughly guides us through consumers' behavior and their buying intentions while engaged in online shopping. It has been constructed with immense care to ensure it effectively examines an array of factors that influence customers' purchasing intentions in the increasingly significant realm of digital commerce.
The dataset is exhaustively composed with careful attention to collecting a diverse set of information, thus allowing a broad view into what affects online shopping behavior. Specific columns included cover customer's existing awareness about the website or source from where they are shopping, their information regarding the products they wish to purchase, and more importantly, their satisfaction level related to previous purchases.
Additionally, the dataset delves deep into investigating both objective and subjective aspects impacting customer behavior online. As such, it includes data on various webpage factors like loading speed, user-friendly interface design, webpage aesthetics, etc., which could significantly persuade the consumer's decision-making process during online shopping. The completion and submission convenience provided by those websites also form part of this database.
To capture consumer behavior in an online environment from multiple angles, the dataset also records individual consumers' subjective views: how much they trust an e-commerce site, whether they find shopping on these platforms more convenient than traditional methods, and whether they feel relaxed when doing so.
Because product competitiveness influences buyer intention, the dataset also includes columns on price comparisons against offline stores and against similar products on other websites.
Overall, this aggregated data collection aims not only to describe fundamental consumer preferences but also to support predicting future buying behavior, helping businesses capitalize on emerging trends in online retail more efficiently and profitably.
In an online-focused world, understanding consumer behavioral data is crucial. The 'Online Shopping Purchasing Intention Dataset' provides a comprehensive collection of consumer-based insights based on their behavior in virtual shopping environments. This dataset explores various factors that might affect a customer's decision to purchase. Here's how you can harness this dataset:
Defining the Problem
Identify a problem or question this data may answer. This might be: understanding what factors influence buying decisions, predicting whether a visit will result in a purchase based on user behavior, analyzing the impact of the month, operating system or traffic type on online purchasing intention etc.
Data Exploration
Understand the structure of the dataset by getting to know each variable and its meaning:
- Administrative: Counting different types of pages visited by the user in that session.
- Informational & Product Related: Measures how many informational/product related pages are viewed.
- Bounce Rates, Exit Rate, Page Values: Assess these metrics as they provide significant insight about visitor activity.
- Special Day: Explore correlation between proximity to special days (like Mother’s Day and Valentine’s Day) with transactions.
- Operating Systems / Browser / Region / Traffic Type: Uncover behavioral patterns associated with technical specs, geolocation, and source of traffic.
Analysis and Visualization
Use appropriate statistical techniques, such as correlation analysis or chi-square tests of independence, to examine relationships between variables.
Visualize your findings using plots such as bar graphs for comparing categorical features or scatter plots for relationships between numeric variables.
Model Building
If your goal is to predict purchase intention from the given features, machine learning algorithms such as logistic regression or decision tree models are a natural starting point.
This could also involve feature selection (choosing the most relevant predictors), training and testing the model, and finally assessing model performance through metrics such as accuracy, precision, and recall.
Remember to handle missing values, if any, before diving into predictive modeling.
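The evaluation metrics mentioned above can be computed by hand once a model has produced predictions. The sketch below assumes binary labels where 1 means the session ended in a purchase; the label lists are made up and no particular model is implied.

```python
def classification_metrics(y_true, y_pred):
    """Return accuracy, precision and recall for binary labels (1 = purchase)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        # Guard against division by zero when a class is never predicted or present.
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }


# Made-up labels: actual outcomes and a hypothetical model's predictions.
y_true = [1, 0, 0, 1, 0, 0, 0, 1]
y_pred = [1, 0, 1, 0, 0, 0, 0, 1]
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

When purchases are a minority of sessions, accuracy alone can look good for a model that rarely predicts a purchase, which is why precision and recall are reported alongside it.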
The comprehens...
https://creativecommons.org/publicdomain/zero/1.0/
Mariusz Łapczyński, Cracow University of Economics, Poland, lapczynm '@' uek.krakow.pl
Sylwester Białowąs, Poznan University of Economics and Business, Poland, sylwester.bialowas '@' ue.poznan.pl
The dataset contains information on clickstream from online store offering clothing for pregnant women. Data are from five months of 2008 and include, among others, product category, location of the photo on the page, country of origin of the IP address and product price in US dollars.
The dataset contains 14 variables described in a separate file (See 'Data set description')
N/A
If you use this dataset, please cite:
Łapczyński M., Białowąs S. (2013) Discovering Patterns of Users' Behaviour in an E-shop - Comparison of Consumer Buying Behaviours in Poland and Other European Countries, “Studia Ekonomiczne”, nr 151, “La société de l'information : perspective européenne et globale : les usages et les risques d'Internet pour les citoyens et les consommateurs”, p. 144-153
========================================================
========================================================
========================================================
========================================================
following categories:
1-Australia 2-Austria 3-Belgium 4-British Virgin Islands 5-Cayman Islands 6-Christmas Island 7-Croatia 8-Cyprus 9-Czech Republic 10-Denmark 11-Estonia 12-unidentified 13-Faroe Islands 14-Finland 15-France 16-Germany 17-Greece 18-Hungary 19-Iceland 20-India 21-Ireland 22-Italy 23-Latvia 24-Lithuania 25-Luxembourg 26-Mexico 27-Netherlands 28-Norway 29-Poland 30-Portugal 31-Romania 32-Russia 33-San Marino 34-Slovakia 35-Slovenia 36-Spain 37-Sweden 38-Switzerland 39-Ukraine 40-United Arab Emirates 41-United Kingdom 42-USA 43-biz (.biz) 44-com (.com) 45-int (.int) 46-net (.net) 47-org (*.org)
========================================================
========================================================
1-trousers 2-skirts 3-blouses 4-sale
========================================================
(217 products)
========================================================
1-beige 2-black 3-blue 4-brown 5-burgundy 6-gray 7-green 8-navy blue 9-of many colors 10-olive 11-pink 12-red 13-violet 14-white
========================================================
1-top left 2-top in the middle 3-top right 4-bottom left 5-bottom in the middle 6-bottom right
========================================================
1-en face 2-profile
========================================================
========================================================
the average price for the entire product category
1-yes 2-no
========================================================
++++++++++++++++++++++++++++++++++++++++++++++++++++++++
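The coded legends above (for example `1-beige 2-black ...`) can be turned into lookup tables for decoding the numeric columns. This is a minimal sketch; the variable names `colours` and `photo_location` are assumptions, since the variable headings did not survive in this listing.

```python
import re


def parse_legend(text):
    """Turn a '1-beige 2-black ...' legend into a {code: label} dictionary."""
    # Each entry is digits, a hyphen, then a label that runs until the next
    # 'digits-' token or the end of the string.
    pattern = re.compile(r"(\d+)-(.+?)(?=\s+\d+-|$)")
    return {int(code): label.strip() for code, label in pattern.findall(text)}


colours = parse_legend(
    "1-beige 2-black 3-blue 4-brown 5-burgundy 6-gray 7-green 8-navy blue "
    "9-of many colors 10-olive 11-pink 12-red 13-violet 14-white"
)
photo_location = parse_legend(
    "1-top left 2-top in the middle 3-top right "
    "4-bottom left 5-bottom in the middle 6-bottom right"
)
print(colours[8], "|", photo_location[5])
```

The lookahead keeps multi-word labels such as "navy blue" intact instead of splitting on every space.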
This data comes from an e-commerce store. I used PostgreSQL for data cleaning. I transformed NULL values to 'Not defined'. The original data had only one category name column ('category_code'), a dot-separated value that describes the product's class from broad to specific, so I split it on the delimiter ('.').
| column name | description |
|---|---|
| time | Time when the event happened (in UTC). |
| event_name | One of four values: purchase, cart, view, remove_from_cart |
| product_id | ID of a product |
| category_id | Product's category ID |
| category_name | Product's category taxonomy (code name) if it was possible to make it. Usually present for meaningful categories and skipped for different kinds of accessories. |
| brand | Downcased string of brand name. |
| price | Float price of a product. |
| user_id | Permanent user ID. |
| session | Temporary user's session ID. Same for each user's session. It changes every time the user comes back to the online store after a long pause. |
| category_1 | Broadest product class (first part of the dot-separated category) |
| category_2 | Intermediate product class (second part) |
| category_3 | Most specific product class (third part) |
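The cleaning step described above (NULL replaced with 'Not defined', and the dot-separated category split into three levels) was done in PostgreSQL by the author. The sketch below shows the same idea in Python for illustration only; padding missing levels with 'Not defined', truncating to three levels, and the example codes are assumptions.

```python
def split_category(category_code, depth=3, missing="Not defined"):
    """Split a dot-separated category code into `depth` levels, padding with `missing`."""
    if not category_code:  # treats None and '' as NULL
        return [missing] * depth
    parts = category_code.split(".")[:depth]
    return parts + [missing] * (depth - len(parts))


# Hypothetical codes in the dot-separated style described above.
print(split_category("electronics.audio.headphone"))
print(split_category("furniture.bedroom"))
print(split_category(None))
```

The three returned values correspond to `category_1`, `category_2`, and `category_3` in the table.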
Many thanks to REES46 Marketing Platform and Michael Kechinov for this dataset.
You can use this dataset for free. Just mention the source of it: link to this page and link to REES46 Marketing Platform and Origin data provider
By UCI [source]
Comprehensive Dataset on Online Retail Sales and Customer Data
Welcome to this comprehensive dataset offering a wide array of information related to online retail sales. This data set provides an in-depth look at transactions, product details, and customer information documented by an online retail company based in the UK. The scope of the data spans vastly, from granular details about each product sold to extensive customer data sets from different countries.
This transnational data set is a treasure trove of vital business insights as it meticulously catalogues all the transactions that happened during its span. It houses rich transactional records curated by a renowned non-store online retail company based in the UK known for selling unique all-occasion gifts. A considerable portion of its clientele includes wholesalers; ergo, this dataset can prove instrumental for companies looking for patterns or studying purchasing trends among such businesses.
The available attributes within this dataset offer valuable pieces of information:
InvoiceNo: This attribute refers to invoice numbers that are six-digit integral numbers uniquely assigned to every transaction logged in this system. Transactions marked with 'c' at the beginning signify cancellations - adding yet another dimension for purchase pattern analysis.
StockCode: Stock Code corresponds with specific items as they're represented within the inventory system via 5-digit integral numbers; these allow easy identification and distinction between products.
Description: This refers to product names, giving users qualitative knowledge about what kind of items are being bought and sold frequently.
Quantity: These figures ascertain the volume of each product per transaction – important figures that can help understand buying trends better.
InvoiceDate: Invoice Dates detail when each transaction was generated down to precise timestamps – invaluable when conducting time-based trend analysis or segmentation studies.
UnitPrice: Unit prices represent how much each unit retails at — crucial for revenue calculations or cost-related analyses.
Country: This locational attribute shows where each customer hails from, adding geographical segmentation to your data investigation toolkit.
This dataset was originally collated by Dr Daqing Chen, Director of the Public Analytics group based at the School of Engineering, London South Bank University. His research studies and business cases with this dataset have been published in various papers contributing to establishing a solid theoretical basis for direct, data and digital marketing strategies.
These records support exploration and hypothesis building about consumer behavior among wholesalers. Whether the task is managing inventory, studying transactional trends over time, or spotting cancellation patterns, this dataset suits many forms of retail analysis.
1. Sales Analysis:
Sales data forms the backbone of this dataset, and it allows users to delve into various aspects of sales performance. You can use the Quantity and UnitPrice fields to calculate metrics like revenue, and further combine it with InvoiceNo information to understand sales over individual transactions.
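As a concrete illustration of the revenue calculation just described, the sketch below multiplies `Quantity` by `UnitPrice` and sums the result per `InvoiceNo`. The sample rows are made up, and rows are assumed to be already loaded as dictionaries.

```python
from collections import defaultdict


def revenue_per_invoice(rows):
    """Sum Quantity * UnitPrice for every InvoiceNo."""
    totals = defaultdict(float)
    for row in rows:
        totals[row["InvoiceNo"]] += row["Quantity"] * row["UnitPrice"]
    return dict(totals)


# Made-up rows for illustration.
rows = [
    {"InvoiceNo": "536365", "Quantity": 6, "UnitPrice": 2.5},
    {"InvoiceNo": "536365", "Quantity": 2, "UnitPrice": 4.0},
    {"InvoiceNo": "536366", "Quantity": 1, "UnitPrice": 1.5},
]
invoice_totals = revenue_per_invoice(rows)
print(invoice_totals)
```

Cancelled invoices would contribute negative totals here if their quantities are negative, so it may be worth filtering them first depending on the question.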
2. Product Analysis:
Each product in this dataset comes with its unique identifier (StockCode) and its name (Description). You could analyse which products are most popular based on Quantity sold or look at popularity per transaction by considering both Quantity and InvoiceNo.
3. Customer Segmentation:
If you derive per-transaction values (such as total amounts), you can apply standard machine learning methods or RFM (Recency, Frequency, Monetary) segmentation, keyed on 'CustomerID', to understand customer behavior better. Grouping invoice numbers (each of which stands for a separate transaction) per customer also gives insight into your clients.
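An RFM table can be sketched with a single pass over the rows. In the example below, recency is the number of days between a chosen snapshot date and the customer's last invoice, frequency is the number of distinct invoices, and monetary is total spend. The snapshot date, the already-parsed `InvoiceDate` values, and the sample rows are assumptions for illustration.

```python
from collections import defaultdict
from datetime import date


def rfm_table(rows, snapshot):
    """Compute recency (days), frequency (invoices) and monetary value per customer."""
    last_purchase = {}
    invoices = defaultdict(set)
    spend = defaultdict(float)
    for row in rows:
        cid = row["CustomerID"]
        seen = last_purchase.get(cid, row["InvoiceDate"])
        last_purchase[cid] = max(seen, row["InvoiceDate"])
        invoices[cid].add(row["InvoiceNo"])
        spend[cid] += row["Quantity"] * row["UnitPrice"]
    return {
        cid: {
            "recency": (snapshot - last_purchase[cid]).days,
            "frequency": len(invoices[cid]),
            "monetary": spend[cid],
        }
        for cid in last_purchase
    }


# Made-up rows for illustration.
rows = [
    {"CustomerID": 1, "InvoiceNo": "A1", "InvoiceDate": date(2011, 11, 1), "Quantity": 2, "UnitPrice": 5.0},
    {"CustomerID": 1, "InvoiceNo": "A2", "InvoiceDate": date(2011, 12, 1), "Quantity": 1, "UnitPrice": 10.0},
    {"CustomerID": 2, "InvoiceNo": "B1", "InvoiceDate": date(2011, 6, 1), "Quantity": 4, "UnitPrice": 2.5},
]
rfm = rfm_table(rows, snapshot=date(2011, 12, 10))
print(rfm)
```

The three values per customer can then be binned into scores or passed to a clustering method; rows with a missing 'CustomerID' would need to be dropped first.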
4. Geographical Analysis:
The Country column enables analysts to study purchase patterns across different geographical locations.
Practical applications
Understand what products sell best where - It can help drive tailored marketing strategies. Anomalies detection – Identify unusual behaviors that might lead frau...
Attribution-ShareAlike 4.0 (CC BY-SA 4.0) https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
The Online Retail Dataset consists of records about retail transactions conducted online. It contains information about customer purchases, including the invoice number, stock code, description of the items purchased, quantity, unit price, invoice date, customer ID, and country.
Here's a breakdown of the columns in the dataset:
The dataset contains 542k records, with some missing values in the Description and CustomerID columns. The data types include integers, floats, datetime objects, and strings.
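Since the Description and CustomerID columns are noted as having missing values, a quick count per column is a sensible first check. The sketch below treats `None` and empty strings as missing; the sample rows are made up and the function name is hypothetical.

```python
from collections import Counter


def count_missing(rows, columns):
    """Count rows where each column is None or an empty string."""
    missing = Counter({column: 0 for column in columns})
    for row in rows:
        for column in columns:
            if row.get(column) in (None, ""):
                missing[column] += 1
    return dict(missing)


# Made-up rows for illustration.
rows = [
    {"InvoiceNo": "536365", "Description": "WHITE METAL LANTERN", "CustomerID": "17850"},
    {"InvoiceNo": "536414", "Description": "", "CustomerID": None},
    {"InvoiceNo": "536544", "Description": "DECORATIVE ROSE BATHROOM BOTTLE", "CustomerID": ""},
]
missing_counts = count_missing(rows, ["Description", "CustomerID"])
print(missing_counts)
```

Rows without a CustomerID cannot be used for customer-level analysis such as segmentation, but they can still contribute to product and sales totals.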
This dataset provides valuable insights into customer purchasing behavior, item popularity, sales trends over time, and geographic distribution of transactions. It can be used for various analytical purposes, including customer segmentation, sales forecasting, and market analysis.
https://creativecommons.org/publicdomain/zero/1.0/
🛒 E-Commerce Customer Behavior and Sales Dataset
📊 Dataset Overview
This comprehensive dataset contains 5,000 e-commerce transactions from a Turkish online retail platform, spanning from January 2023 to March 2024. The dataset provides detailed insights into customer demographics, purchasing behavior, product preferences, and engagement metrics.
🎯 Use Cases
This dataset is perfect for:
Customer Segmentation Analysis: Identify distinct customer groups based on behavior
Sales Forecasting: Predict future sales trends and patterns
Recommendation Systems: Build product recommendation engines
Customer Lifetime Value (CLV) Prediction: Estimate customer value
Churn Analysis: Identify customers at risk of leaving
Marketing Campaign Optimization: Target customers effectively
Price Optimization: Analyze price sensitivity across categories
Delivery Performance Analysis: Optimize logistics and shipping
📁 Dataset Structure
The dataset contains 18 columns with the following features:
Order Information
Order_ID: Unique identifier for each order (ORD_XXXXXX format)
Date: Transaction date (2023-01-01 to 2024-03-26)
Customer Demographics
Customer_ID: Unique customer identifier (CUST_XXXXX format)
Age: Customer age (18-75 years)
Gender: Customer gender (Male, Female, Other)
City: Customer city (10 major Turkish cities)
Product Information
Product_Category: 8 categories (Electronics, Fashion, Home & Garden, Sports, Books, Beauty, Toys, Food)
Unit_Price: Price per unit (in TRY/Turkish Lira)
Quantity: Number of units purchased (1-5)
Transaction Details
Discount_Amount: Discount applied (if any)
Total_Amount: Final transaction amount after discount
Payment_Method: Payment method used (5 types)
Customer Behavior Metrics
Device_Type: Device used for purchase (Mobile, Desktop, Tablet)
Session_Duration_Minutes: Time spent on website (1-120 minutes)
Pages_Viewed: Number of pages viewed during session (1-50)
Is_Returning_Customer: Whether customer has purchased before (True/False)
Post-Purchase Metrics
Delivery_Time_Days: Delivery duration (1-30 days)
Customer_Rating: Customer satisfaction rating (1-5 stars)
📈 Key Statistics
Total Records: 5,000 transactions
Date Range: January 2023 - March 2024 (15 months)
Average Transaction Value: ~450 TRY
Customer Satisfaction: 3.9/5.0 average rating
Returning Customer Rate: 60%
Mobile Usage: 55% of transactions
🔍 Data Quality
✅ No missing values
✅ Consistent formatting across all fields
✅ Realistic data distributions
✅ Proper data types for all columns
✅ Logical relationships between features
💡 Sample Analysis Ideas
Customer Segmentation with K-Means Clustering: Segment customers based on spending, frequency, and recency
Sales Trend Analysis: Identify seasonal patterns and peak shopping periods
Product Category Performance: Compare revenue, ratings, and return rates across categories
Device-Based Behavior Analysis: Understand how device choice affects purchasing patterns
Predictive Modeling: Build models to predict customer ratings or purchase amounts
City-Level Market Analysis: Compare market performance across different cities
🛠️ Technical Details
File Format: CSV (Comma-Separated Values)
Encoding: UTF-8
File Size: ~500 KB
Delimiter: Comma (,)
📚 Column Descriptions
| Column Name | Data Type | Description | Example |
|---|---|---|---|
| Order_ID | String | Unique order identifier | ORD_001337 |
| Customer_ID | String | Unique customer identifier | CUST_01337 |
| Date | DateTime | Transaction date | 2023-06-15 |
| Age | Integer | Customer age | 35 |
| Gender | String | Customer gender | Female |
| City | String | Customer city | Istanbul |
| Product_Category | String | Product category | Electronics |
| Unit_Price | Float | Price per unit | 1299.99 |
| Quantity | Integer | Units purchased | 2 |
| Discount_Amount | Float | Discount applied | 129.99 |
| Total_Amount | Float | Final amount paid | 2469.99 |
| Payment_Method | String | Payment method | Credit Card |
| Device_Type | String | Device used | Mobile |
| Session_Duration_Minutes | Integer | Session time | 15 |
| Pages_Viewed | Integer | Pages viewed | 8 |
| Is_Returning_Customer | Boolean | Returning customer | True |
| Delivery_Time_Days | Integer | Delivery duration | 3 |
| Customer_Rating | Integer | Satisfaction rating | 5 |
🎓 Learning Outcomes
By working with this dataset, you can learn:
Data cleaning and preprocessing techniques
Exploratory Data Analysis (EDA) with Python/R
Statistical analysis and hypothesis testing
Machine learning model development
Data visualization best practices
Business intelligence and reporting
📝 Citation
If you use this dataset in your research or project, please cite:
E-Commerce Customer Behavior and Sales Dataset (2024) Turkish Online Retail Platform Data (2023-2024) Available on Kaggle
⚖️ License
This dataset is released under the CC0: Public Domain license. You are free to use it for any purpose.
🤝 Contribution Found any issues or have suggestions? Feel free to provide feedback!
📞 Contact For questions or collaborations, please reach out through Kaggle.
Happy Analyzing! 🚀
Keywords: e-c...
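The example row in the column descriptions is consistent with `Total_Amount = Unit_Price * Quantity - Discount_Amount` (1299.99 × 2 − 129.99 = 2469.99). That relationship is inferred from the single example, not stated by the dataset author, so the sketch below treats it as an assumption to verify rather than a rule.

```python
def expected_total(unit_price, quantity, discount_amount):
    """Assumed relationship: total = unit price * quantity - discount."""
    return round(unit_price * quantity - discount_amount, 2)


# Values taken from the example row in the column descriptions.
example = {
    "Unit_Price": 1299.99,
    "Quantity": 2,
    "Discount_Amount": 129.99,
    "Total_Amount": 2469.99,
}
computed = expected_total(
    example["Unit_Price"], example["Quantity"], example["Discount_Amount"]
)
# Compare with a small tolerance to avoid floating-point surprises.
is_consistent = abs(computed - example["Total_Amount"]) < 0.005
print(computed, is_consistent)
```

Running the same comparison over every row would show whether the "logical relationships between features" claim holds for the amount columns.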
https://creativecommons.org/publicdomain/zero/1.0/
| Column Name | Description |
|---|---|
| InvoiceNo | A unique identifier for each sales transaction (invoice). |
| StockCode | The code representing the product stock-keeping unit (SKU). |
| Description | A brief description of the product. |
| Quantity | The number of units of the product sold in the transaction. |
| InvoiceDate | The date and time when the sale was recorded. |
| UnitPrice | The price per unit of the product in the transaction currency. |
| CustomerID | A unique identifier for each customer. |
| Country | The customer's country. |
| Discount | The discount applied to the transaction, if any. |
| PaymentMethod | The method of payment used for the transaction (e.g., PayPal, Bank Transfer). |
| ShippingCost | The cost of shipping for the transaction. |
| Category | The category to which the product belongs (e.g., Electronics, Apparel). |
| SalesChannel | The channel through which the sale was made (e.g., Online, In-store). |
| ReturnStatus | Indicates whether the item was returned or not. |
| ShipmentProvider | The provider responsible for delivering the order (e.g., UPS, FedEx). |
| WarehouseLocation | The warehouse location from which the order was fulfilled. |
| OrderPriority | The priority level of the order (e.g., High, Medium, Low). |
MIT License https://opensource.org/licenses/MIT
License information was derived automatically
📦 Ecommerce Dataset (Products & Sizes Included)
🛍️ Essential Data for Building an Ecommerce Website & Analyzing Online Shopping Trends
📌 Overview
This dataset contains 1,000+ ecommerce products, including detailed information on pricing, ratings, product specifications, seller details, and more. It is designed to help data scientists, developers, and analysts build product recommendation systems, price prediction models, and sentiment analysis tools.
🔹 Dataset Features
Column Name Description product_id Unique identifier for the product title Product name/title product_description Detailed product description rating Average customer rating (0-5) ratings_count Number of ratings received initial_price Original product price discount Discount percentage (%) final_price Discounted price currency Currency of the price (e.g., USD, INR) images URL(s) of product images delivery_options Available delivery methods (e.g., standard, express) product_details Additional product attributes breadcrumbs Category path (e.g., Electronics > Smartphones) product_specifications Technical specifications of the product amount_of_stars Distribution of star ratings (1-5 stars) what_customers_said Customer reviews (sentiments) seller_name Name of the product seller sizes Available sizes (for clothing, shoes, etc.) videos Product video links (if available) seller_information Seller details, such as location and rating variations Different variants of the product (e.g., color, size) best_offer Best available deal for the product more_offers Other available deals/offers category Product category
📊 Potential Use Cases
📌 Build an Ecommerce Website: Use this dataset to design a functional online store with product listings, filtering, and sorting.
🔍 Price Prediction Models: Predict product prices based on features like ratings, category, and discount.
🎯 Recommendation Systems: Suggest products based on user preferences, rating trends, and customer feedback.
🗣 Sentiment Analysis: Analyze what_customers_said to understand customer satisfaction and product popularity.
📈 Market & Competitor Analysis: Track pricing trends, popular categories, and seller performance.
🔍 Why Use This Dataset?
✅ Rich Feature Set: Includes all necessary ecommerce attributes.
✅ Realistic Pricing & Rating Data: Useful for price analysis and recommendations.
✅ Multi-Purpose: Suitable for machine learning, web development, and data visualization.
✅ Structured Format: Easy-to-use CSV format for quick integration.
📂 Dataset Format
CSV file (ecommerce_dataset.csv)
1000+ samples
Multi-category coverage
🔗 How to Use?
Download the dataset from Kaggle.
Load it in Python using Pandas:
import pandas as pd
df = pd.read_csv("ecommerce_dataset.csv")
df.head()
Explore trends & patterns using visualization tools (Seaborn, Matplotlib).
Build models & applications based on the dataset!
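As a rough illustration of the exploration step, the sketch below checks whether final_price agrees with initial_price and discount, then summarizes ratings and discounts per category. It uses a tiny invented DataFrame in place of ecommerce_dataset.csv, and it assumes that the price columns are already numeric and that discount is a percentage of initial_price; the real file may need cleaning (for example, stripping currency symbols) before this works.

```python
import pandas as pd

# Tiny stand-in for ecommerce_dataset.csv; the values are invented for illustration.
df = pd.DataFrame({
    "product_id": [1, 2, 3, 4],
    "category": ["Shoes", "Shoes", "Phones", "Phones"],
    "rating": [4.5, 3.5, 4.0, 5.0],
    "initial_price": [100.0, 80.0, 500.0, 900.0],
    "discount": [10.0, 25.0, 0.0, 20.0],
    "final_price": [90.0, 60.0, 500.0, 700.0],
})

# Recompute the discounted price (assumes discount is a percentage of initial_price).
df["expected_final"] = df["initial_price"] * (1 - df["discount"] / 100)

# Flag rows where the listed final_price disagrees with the recomputed value.
df["price_mismatch"] = (df["final_price"] - df["expected_final"]).abs() > 0.01

# Per-category summary: product count, mean rating, mean discount.
summary = df.groupby("category").agg(
    products=("product_id", "count"),
    mean_rating=("rating", "mean"),
    mean_discount=("discount", "mean"),
)
print(summary)
```

Rows flagged in price_mismatch are worth inspecting before training a price prediction model, since they may reflect extra offers or data-entry issues rather than the stated discount.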
CC0 1.0: https://creativecommons.org/publicdomain/zero/1.0/
E-commerce product recommendation is a feature commonly used in online retail to suggest products to customers based on various factors, including their browsing history, purchase behavior, product preferences, and other users' similar actions. This technique is pivotal in personalizing the shopping experience and increasing customer engagement and sales.
By ANil [source]
This dataset provides an in-depth look at the profitability of e-commerce sales. It contains data on a variety of sales channels, including Shiprocket and INCREFF, as well as financial information on related expenses and profits. The columns contain data such as SKU codes, design numbers, stock levels, product categories, sizes, and colors. It also includes the MRPs across multiple stores (Ajio MRP, Amazon MRP, Amazon FBA MRP, Flipkart MRP, Limeroad MRP, Myntra MRP, and Paytm MRP), along with other key parameters such as the amount paid by the customer for the purchase and the rate per piece for every individual transaction. Transactional parameters are included as well: date of sale, months, category, fulfilledby, B2B, Status, Qty, Currency, and Gross amt. This dataset is useful for anyone trying to understand the profitability of e-commerce sales in today's marketplace.
This dataset provides a comprehensive overview of e-commerce sales data from different channels covering a variety of products. Using this dataset, retailers and digital marketers can measure the performance of their campaigns more accurately and efficiently.
The following steps help users make the most out of this dataset:
- Analyze the general sales trends by examining information such as month, category, currency, stock level, and customer for each sale. This gives you an idea of how your e-commerce business is performing in each channel.
- Review the Shiprocket and INCREFF data to compare profitability across fulfilment methods. This comparison helps you maximize profit while minimizing the costs associated with each method's referral fees and fulfillment rates.
- Compare prices between channels such as Amazon FBA MRP, Myntra MRP, and Ajio MRP, using the corresponding column for each store. By analyzing these price points together with other sales information (TP1/TP2, the cost per piece), you can judge which stores offer the more profitable margins.
- Look at customer-specific data, such as gross amount or rate per piece by TP 1/TP 2 combination, or the total gross amount generated by any SKU across multiple customers, together with the relevant dates. This tracks how an individual item performs relative to others in its category over suitably filtered time periods. Keep an eye on items commonly sold under offers or promotional discounts when crafting inventory optimization and up-selling strategies.
- Finally, use the overall 'Stock' details alongside the P & L data, including the yearly Expenses_IIGF records, to identify cost-cutting measures, such as choosing carefully between the Shiprocket and INCREFF delivery options, reducing manual inspections, and saving on outsourced support personnel. Understanding how each channel performs can lower operational costs, cut previously unseen losses, and support new marketing campaigns tailored to different product ranges.
- Analysing the difference in profitability between sales made through Shiprocket and INCREFF. This data can be used to see where the biggest profit margins lie, and strategize accordingly.
- Examining the complete cost structure of a product, with all its components and their contribution towards revenue or profitability, i.e., TP 1 & 2, MRP Old & Final MRP Old, together with platform-based MRPs (Amazon, Myntra, Paytm, etc.) and currency-based profit margin.
- Building a predictive model using Machine Learning by leveraging historical data to predict future sales volume and profits for e-commerce products across multiple categories/devices/platforms such as Amazon, Flipkart, Myntra etc as well providing m...
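The channel price comparison described above can be sketched as follows. This is only an illustration: the rows are invented, the exact column spellings ("Sku", "TP 1", "Ajio MRP", and so on) are assumptions based on the description, and treating MRP minus TP 1 as a gross margin ignores referral fees, discounts, and fulfilment costs.

```python
import pandas as pd

# Invented rows; the column names follow the description but are not verified.
prices = pd.DataFrame({
    "Sku": ["A1", "B2"],
    "TP 1": [400.0, 250.0],          # assumed cost per piece
    "Ajio MRP": [900.0, 600.0],
    "Amazon MRP": [950.0, 580.0],
    "Myntra MRP": [920.0, 650.0],
}).set_index("Sku")

# Collect every per-store MRP column.
mrp_cols = [c for c in prices.columns if c.endswith("MRP")]

# Rough gross margin per store: listed MRP minus cost per piece.
margins = prices[mrp_cols].sub(prices["TP 1"], axis=0)

# Store with the largest rough margin for each SKU.
best_channel = margins.idxmax(axis=1)
print(best_channel)
```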
E-commerce (electronic commerce) is the buying and selling of goods and services, or the transmitting of funds or data, over an electronic network, primarily the internet. These business transactions occur as business-to-business (B2B), business-to-consumer (B2C), consumer-to-consumer (C2C), or consumer-to-business (C2B).
This is a simple dataset of a US online store from 2020.
The data comes with some questions:
What was the highest sale in 2020?
What is the average discount rate of chairs?
What are the highest-selling months in 2020?
What is the profit margin for each sales record?
How much profit is gained for each product?
What is the total Profit & Sales by Sub-Category?
People from which city/state shop the most?
Develop a function that returns a dataframe grouped by a particular column (given as an input).
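One possible approach to the grouping question and the profit margin question is sketched below. The column names ("Sales", "Profit", "Sub-Category", "State") and all values are assumptions made for illustration, since the description does not list the dataset's actual columns.

```python
import pandas as pd

def totals_by(df: pd.DataFrame, column: str) -> pd.DataFrame:
    """Return total Sales and Profit for each distinct value of `column`."""
    return df.groupby(column, as_index=False)[["Sales", "Profit"]].sum()

# Invented sample rows; hypothetical column names.
orders = pd.DataFrame({
    "Sub-Category": ["Chairs", "Chairs", "Phones"],
    "State": ["Texas", "Ohio", "Texas"],
    "Sales": [200.0, 100.0, 400.0],
    "Profit": [20.0, -10.0, 100.0],
})

# Profit margin for each sales record: profit as a fraction of sales.
orders["Profit Margin"] = orders["Profit"] / orders["Sales"]

by_subcategory = totals_by(orders, "Sub-Category")
by_state = totals_by(orders, "State")
print(by_subcategory)
```

Passing the column name as an argument keeps the same function usable for the sub-category, city, and state questions.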
If you have a wonderful idea about this dataset, you are welcome to contribute! Happy Kaggling, and please up-vote if you find this dataset helpful! 🖤
CC0 1.0: https://creativecommons.org/publicdomain/zero/1.0/
By Jeffrey Mvutu Mabilama [source]
Welcome to an exciting exploration of global C2C fashion store user behaviour! This dataset seeks to serve as a benchmark by providing valuable insights into e-commerce users, enabling you to make informed decisions and effectively grow your business. Let's dive right into the data!
This dataset contains records on over 9 million registered users from a successful online C2C fashion store launched in Europe around 2009 and later expanded worldwide. It includes metrics such as country, gender, active users, top buyers/sellers/ratio, products bought/sold/listed, and social network features (likes/follows). This is just a preview of a much larger dataset, which contains more detailed information, including product listings, comments on listed products, etc.
E-commerce has become an essential part of our lives: people are now accustomed to buying anything online with a few clicks. With so many unknowns involved in both selling and providing good customer service, understanding user behavior is key to success in this domain. With this dataset you can answer questions such as 'How many customers are likely to drop off after years of using my service?', 'Are my users active enough compared to those in this dataset?', or 'How likely are people from other countries to sign up on a C2C website?' In addition, if you think this kind of dataset may be useful, don't forget to show your support or appreciation by leaving an upvote or comment on the page!
My Telegram bot will answer any queries regarding the datasets and allow you to contact me directly if necessary. Also, don't forget to check out the *[data.world page](https://data.world/jfreex/e-commerce-users-of-a-french-c2c
This dataset provides a useful overview of global users' behavior in an online C2C fashion store. The data includes metrics such as buyers, top buyers, top buyer ratio, female buyers and their respective ratios, etc., per country. This dataset can be used to gain insights into how global audiences interact with the store and draw conclusions from comparison between different countries.
To make use of this dataset, first become familiar with the metrics it includes: country; number of overall buyers; number of top buyers; the ratio of top buyers to total buyers; female-related data (buyers, top female buyers); bought-to-wished/liked ratio (for top and non-top buyers separately); overall products bought/wished/liked; total products sold by top sellers within their own country versus outside it; mean values for product stats (sold/listed/etc.), computed over the whole population or only over users who perform those actions multiple times; average number of days users stay offline or lurk on the site without posting or buying anything; and mean follower counts.
Using this data, one could generate reports about user behavior within particular countries, either by computing the statistics manually or by querying the dataset with tools such as Pandas or SQL. Each country comes with the values needed to answer questions such as how many people buy per region and what type of buyers they are: Top buyer? Female? Etc.
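A per-country report of the kind described above might look like the following sketch. The column names (country, buyers, top_buyers, female_buyers) and the numbers are hypothetical stand-ins; the real file may name these metrics differently and may already ship the ratios precomputed.

```python
import pandas as pd

# Invented per-country rows with hypothetical column names.
countries = pd.DataFrame({
    "country": ["France", "Italy", "Spain"],
    "buyers": [200, 100, 50],
    "top_buyers": [20, 5, 10],
    "female_buyers": [150, 80, 25],
})

# Share of buyers who are top buyers, and share who are female.
countries["top_buyer_ratio"] = countries["top_buyers"] / countries["buyers"]
countries["female_buyer_ratio"] = countries["female_buyers"] / countries["buyers"]

# Rank countries by how concentrated their top buyers are.
ranked = countries.sort_values("top_buyer_ratio", ascending=False)
print(ranked[["country", "top_buyer_ratio", "female_buyer_ratio"]])
```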
Further potential work could involve using machine learning tools such as clustering algorithms to group similar customers together based on traits like age group, profession, etc., so that personalised marketing promotions can be targeted at these customer clusters rather than aiming more generic ads at everyone!
Finally, combined with other related product datasets, which are available upon request via JfreexDatasets_bot provided by the Jfreex team, this dataset can become another powerful tool, giving you actionable insights into customers today and helping you build better strategies for improving customer experience tomorrow!
- Analyzing the conversion rate of users on a website - Comparing user metrics like the overall number of buyers, female buyers, top buyers ratio and top buyer gender can help determine if users in certain countries are more or less likely to convert into customers. Additionally, comparing average metrics like products bought or offl...
This dataset contains data on customers who buy clothes online. The store offers in-store style and clothing advice sessions. Customers come in to the store, have sessions/meetings with a personal stylist, then go home and order the clothes they want on either a mobile app or a website.
The company is trying to decide whether to focus their efforts on their mobile app experience or their website.
Data description “e-shop clothing 2008”
Variables:
========================================================
========================================================
========================================================
========================================================
1-Australia 2-Austria 3-Belgium 4-British Virgin Islands 5-Cayman Islands 6-Christmas Island 7-Croatia 8-Cyprus 9-Czech Republic 10-Denmark 11-Estonia 12-unidentified 13-Faroe Islands 14-Finland 15-France 16-Germany 17-Greece 18-Hungary 19-Iceland 20-India 21-Ireland 22-Italy 23-Latvia 24-Lithuania 25-Luxembourg 26-Mexico 27-Netherlands 28-Norway 29-Poland 30-Portugal 31-Romania 32-Russia 33-San Marino 34-Slovakia 35-Slovenia 36-Spain 37-Sweden 38-Switzerland 39-Ukraine 40-United Arab Emirates 41-United Kingdom 42-USA 43-biz (.biz) 44-com (.com) 45-int (.int) 46-net (.net) 47-org (.org)
========================================================
========================================================
========================================================
========================================================
1-beige 2-black 3-blue 4-brown 5-burgundy 6-gray 7-green 8-navy blue 9-of many colors 10-olive 11-pink 12-red 13-violet 14-white
========================================================
1-top left 2-top in the middle 3-top right 4-bottom left 5-bottom in the middle 6-bottom right
========================================================
1-en face 2-profile
========================================================
========================================================
1-yes 2-no
========================================================
++++++++++++++++++++++++++++++++++++++++++++++++++++++++
I would like to see how this data can be used for different problems (clustering, regression, classification, EDA).
Source: https://archive.ics.uci.edu/ml/datasets/clickstream+data+for+online+shopping
CDLA-Sharing-1.0: https://cdla.io/sharing-1-0/
This dataset contains online retail sales data from an online store based in the UK. The data covers transactions from 01/12/2010 to 09/12/2011.
Columns:
InvoiceNo: Invoice number, a unique identifier for each transaction
StockCode: Product code, a unique identifier for each product
Description: Description of the product
Quantity: Quantity of each product purchased in a transaction
InvoiceDate: Date and time of the transaction
UnitPrice: Price of each product
CustomerID: Unique identifier for each customer
Country: The country where the customer resides
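Given the columns listed above, revenue per country can be derived from Quantity and UnitPrice. The sketch below uses a few invented rows; the day-first date format with hours and minutes is an assumption inferred from the stated date range, so the format string may need adjusting for the actual file.

```python
import pandas as pd

# Invented rows using the column names from the description.
retail = pd.DataFrame({
    "InvoiceNo": ["536365", "536365", "536370"],
    "StockCode": ["85123A", "71053", "22728"],
    "Quantity": [6, 2, 10],
    "InvoiceDate": ["01/12/2010 08:26", "01/12/2010 08:26", "09/12/2011 12:50"],
    "UnitPrice": [2.5, 3.0, 1.0],
    "Country": ["United Kingdom", "United Kingdom", "France"],
})

# Parse dates as day/month/year (assumed format).
retail["InvoiceDate"] = pd.to_datetime(retail["InvoiceDate"], format="%d/%m/%Y %H:%M")

# Revenue per line, then total revenue per country.
retail["LineTotal"] = retail["Quantity"] * retail["UnitPrice"]
revenue_by_country = (
    retail.groupby("Country")["LineTotal"].sum().sort_values(ascending=False)
)
print(revenue_by_country)
```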
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
Dataset: Online Shopping Dataset;
CustomerID:
Description: Unique identifier for each customer. Data Type: Numeric;
Gender:
Description: Gender of the customer (e.g., Male, Female). Data Type: Categorical;
Location:
Description: Location or address information of the customer. Data Type: Text;
Tenure_Months:
Description: Number of months the customer has been associated with the platform. Data Type: Numeric;
Transaction_ID:
Description: Unique identifier for each transaction. Data Type: Numeric;
Transaction_Date:
Description: Date of the transaction. Data Type: Date;
Product_SKU:
Description: Stock Keeping Unit (SKU) identifier for the product. Data Type: Text;
Product_Description:
Description: Description of the product. Data Type: Text;
Product_Category:
Description: Category to which the product belongs. Data Type: Categorical;
Quantity:
Description: Quantity of the product purchased in the transaction. Data Type: Numeric;
Avg_Price:
Description: Average price of the product. Data Type: Numeric;
Delivery_Charges:
Description: Charges associated with the delivery of the product. Data Type: Numeric;
Coupon_Status:
Description: Status of the coupon associated with the transaction. Data Type: Categorical;
GST:
Description: Goods and Services Tax associated with the transaction. Data Type: Numeric;
Date:
Description: Date of the transaction (potentially redundant with Transaction_Date). Data Type: Date;
Offline_Spend:
Description: Amount spent offline by the customer. Data Type: Numeric;
Online_Spend:
Description: Amount spent online by the customer. Data Type: Numeric;
Month:
Description: Month of the transaction. Data Type: Categorical;
Coupon_Code:
Description: Code associated with a coupon, if applicable. Data Type: Text;
Discount_pct:
Description: Percentage of discount applied to the transaction. Data Type: Numeric;
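The pricing fields above can be combined into an estimated amount per transaction line. The sketch below is one plausible reading, not the dataset author's formula: it assumes that Discount_pct applies only when Coupon_Status is "Used", that GST is stored as a fraction (0.18 for 18%), and that Delivery_Charges are added once per line without tax. The function name line_total and the sample values are hypothetical.

```python
def line_total(quantity, avg_price, discount_pct, gst, delivery_charges, coupon_status):
    """Estimate what one transaction line costs the customer.

    Assumptions (not confirmed by the dataset description):
    - Discount_pct applies only when Coupon_Status is "Used".
    - GST is a fraction of the discounted price (0.18 means 18%).
    - Delivery_Charges are added once per line and are not taxed.
    """
    discount = discount_pct / 100 if coupon_status == "Used" else 0.0
    net = quantity * avg_price * (1 - discount)
    return net * (1 + gst) + delivery_charges

# Same line with and without the coupon applied (invented values).
used = line_total(2, 50.0, 10, 0.18, 5.0, "Used")
not_used = line_total(2, 50.0, 10, 0.18, 5.0, "Not Used")
print(round(used, 2), round(not_used, 2))
```

Comparing the two results shows how much revenue a coupon gives up on a line, which can then be aggregated by Month or Product_Category.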