82 datasets found
  1. E-Commerce Sales Dataset

    • kaggle.com
    Updated Dec 3, 2022
    Cite
    The Devastator (2022). E-Commerce Sales Dataset [Dataset]. https://www.kaggle.com/datasets/thedevastator/unlock-profits-with-e-commerce-sales-data/code
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Dec 3, 2022
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    The Devastator
    Description

    E-Commerce Sales Dataset

    Analyzing and Maximizing Online Business Performance

    By ANil [source]

    About this dataset

    This dataset provides an in-depth look at the profitability of e-commerce sales. It contains data on a variety of sales channels, including Shiprocket and INCREFF, as well as financial information on related expenses and profits. The columns include SKU codes, design numbers, stock levels, product categories, sizes, and colors. It also includes the MRPs across multiple stores (Ajio MRP, Amazon MRP, Amazon FBA MRP, Flipkart MRP, Limeroad MRP, Myntra MRP, and PaytmMRP), along with other key parameters such as the amount paid by the customer for the purchase and the rate per piece for each individual transaction. Transactional parameters are included as well: date of sale, months, category, fulfilledby, B2b, Status, Qty, Currency, and Gross amt. The dataset is aimed at anyone trying to uncover the profitability of e-commerce sales in today's marketplace.


    How to use the dataset

    This dataset provides a comprehensive overview of e-commerce sales data from different channels covering a variety of products. Using this dataset, retailers and digital marketers can measure the performance of their campaigns more accurately and efficiently.

    The following steps help users make the most of this dataset:
    - Analyze general sales trends by examining fields such as month, category, currency, stock level, and customer for each sale. This gives an idea of how the e-commerce business is performing in each channel.
    - Review the Shiprocket and INCREFF data to compare profitability across fulfilment methods. This comparison supports decisions that raise profit while limiting the costs associated with each method's referral fees and fulfilment rates.
    - Compare prices between channels such as Amazon FBA MRP, Myntra MRP, and Ajio MRP using the corresponding column for each store. Combining these price points with the cost per piece (TP1/TP2) indicates which stores offer the more profitable margins without compromising on quality.
    - Look at customer-specific data, such as Gross Amount or Rate (price per piece) by TP 1/TP 2 combination, or the total gross amount generated by a SKU across multiple customers and dates, to track an item's performance against others in its category over a suitably filtered time period. Watch for items commonly sold under offers or promotional discounts, which can inform inventory optimization and up-selling strategies.
    - Finally, use the overall 'Stock' details together with the P & L data, including the yearly Expenses_IIGF records, to identify cost-cutting measures, such as choosing carefully between the Shiprocket and INCREFF delivery options and reducing manual inspections and support staffing costs.
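
The price-comparison step above can be sketched in Python. This is only an illustration under stated assumptions, not the dataset author's method: the column names (`Sku`, `TP 1`, `Amazon MRP`, `Myntra MRP`, `Ajio MRP`) and the sample rows are hypothetical and should be checked against the actual CSV headers.

```python
import csv
import io

# Hypothetical sample rows; the column names are assumed, not read from the real file.
SAMPLE = """Sku,TP 1,Amazon MRP,Myntra MRP,Ajio MRP
SKU-001,500,1200,1250,1150
SKU-002,300,700,650,720
"""

CHANNELS = ["Amazon MRP", "Myntra MRP", "Ajio MRP"]


def margins_by_channel(row, cost_column="TP 1"):
    """Return {channel: MRP minus cost per piece} for one product row."""
    cost = float(row[cost_column])
    return {channel: float(row[channel]) - cost for channel in CHANNELS}


def best_channel(row):
    """Name the channel with the largest list-price margin for this row."""
    margins = margins_by_channel(row)
    return max(margins, key=margins.get)


rows = list(csv.DictReader(io.StringIO(SAMPLE)))
for row in rows:
    print(row["Sku"], best_channel(row), margins_by_channel(row))
```

The margin here is simply list price minus cost per piece; it ignores referral fees, shipping, and discounts, which would need to be subtracted for a real profitability comparison.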

    A clear understanding of how each sales channel performs, supported by audits where needed, may lower operational costs compared with relying only on shipment tracking sheets that reflect current supply-chain efficiency. That in turn can help scale profits, cut unseen losses, and support new marketing campaigns tailored to the product range and the available transportation options.

    Research Ideas

    • Analysing the difference in profitability between sales made through Shiprocket and INCREFF. This data can be used to see where the biggest profit margins lie, and strategize accordingly.
    • Examining the complete cost structure of a product, with all its components and their contribution to revenue or profitability: TP 1 & 2, MRP Old & Final MRP Old, platform-based MRPs (Amazon, Myntra, Paytm, etc.), currency-based profit margin, etc.
    • Building a predictive machine-learning model that uses historical data to predict future sales volume and profits for e-commerce products across multiple categories/devices/platforms such as Amazon, Flipkart, and Myntra, as well as providing m...
  2. Retail e-commerce sales, inactive

    • open.canada.ca
    • ouvert.canada.ca
    • +1 more
    csv, html, xml
    Updated Mar 24, 2023
    + more versions
    Cite
    Statistics Canada (2023). Retail e-commerce sales, inactive [Dataset]. https://open.canada.ca/data/en/dataset/0ffbe1ee-7fa7-4369-ac78-a01c8175e1a6
    Explore at:
    Available download formats: html, csv, xml
    Dataset updated
    Mar 24, 2023
    Dataset provided by
    Statistics Canada
    License

    Open Government Licence - Canada 2.0 (https://open.canada.ca/en/open-government-licence-canada)
    License information was derived automatically

    Description

    This table contains 3 series, with data for years 2016 - 2017 (not all combinations necessarily have data for all years). This table contains data described by the following dimensions (Not all combinations are available): Geography (1 item: Canada); Sales (3 items: Retail trade; Electronic shopping and mail-order houses; Retail E-commerce sales).

  3. Linear Regression E-commerce Dataset

    • kaggle.com
    zip
    Updated Sep 16, 2019
    Cite
    Saurabh Kolawale (2019). Linear Regression E-commerce Dataset [Dataset]. https://www.kaggle.com/datasets/kolawale/focusing-on-mobile-app-or-website
    Explore at:
    Available download formats: zip (44169 bytes)
    Dataset updated
    Sep 16, 2019
    Authors
    Saurabh Kolawale
    Description

    This dataset contains data on customers who buy clothes online. The store offers in-store style and clothing advice sessions. Customers come in to the store, have sessions/meetings with a personal stylist, and can then go home and order the clothes they want on either a mobile app or a website.

    The company is trying to decide whether to focus their efforts on their mobile app experience or their website.
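
One rough way to approach this question is to compare how strongly spending rises with time spent on each channel. The sketch below fits two separate one-variable least-squares slopes in plain Python. The variable names and the toy values are assumptions made for illustration and are not drawn from the dataset; a full analysis would more likely fit a single multiple regression over all predictors.

```python
def ols_slope(xs, ys):
    """Least-squares slope of ys on xs (simple one-variable regression)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    return cov / var


# Made-up toy values, for illustration only (not taken from the dataset).
time_on_app = [10.0, 11.0, 12.0, 13.0]
time_on_website = [37.0, 36.0, 39.0, 38.0]
yearly_spend = [400.0, 440.0, 480.0, 520.0]

app_slope = ols_slope(time_on_app, yearly_spend)
web_slope = ols_slope(time_on_website, yearly_spend)
print(f"spend per extra unit of app time: {app_slope:.1f}")
print(f"spend per extra unit of website time: {web_slope:.1f}")
```

A larger slope suggests that time on that channel is more strongly associated with spending, but separate fits ignore any correlation between the two predictors, so the comparison is indicative only.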

  4. Furniture E-commerce Dataset – 140K+ Product Records with Categories &...

    • crawlfeeds.com
    csv, zip
    Updated Aug 20, 2025
    Cite
    Crawl Feeds (2025). Furniture E-commerce Dataset – 140K+ Product Records with Categories & Breadcrumbs (CSV for AI & NLP) [Dataset]. https://crawlfeeds.com/datasets/furniture-e-commerce-dataset-140k-product-records-with-categories-breadcrumbs-csv-for-ai-nlp
    Explore at:
    Available download formats: zip, csv
    Dataset updated
    Aug 20, 2025
    Dataset authored and provided by
    Crawl Feeds
    License

    https://crawlfeeds.com/privacy_policy

    Description

    This furniture e-commerce dataset includes 140,000+ structured product records collected from online retail sources. Each entry provides detailed product information, categories, and breadcrumb hierarchies, making it ideal for AI, machine learning, and analytics applications.

    Key Features:

    • 📊 140K+ furniture product records in structured format

    • 🏷 Includes categories, subcategories, and breadcrumbs for taxonomy mapping

    • 📂 Delivered as a clean CSV file for easy integration

    • 🔎 Perfect dataset for AI, NLP, and machine learning model training

    Best Use Cases:

    • LLM training & fine-tuning with domain-specific data

    • Product classification datasets for AI models

    • Recommendation engines & personalization in e-commerce

    • Market research & furniture retail analytics

    • Search optimization & taxonomy enrichment

    Why this dataset?

    • Large volume (140K+ furniture records) for robust training

    • Real-world e-commerce product data

    • Ready-to-use CSV, saving preprocessing time

    • Affordable licensing with bulk discounts for enterprise buyers

    Note:
    Each record in this dataset includes both a url (main product page) and a buy_url (the actual purchase page).
    The dataset is structured so that records are based on the buy_url, ensuring you get unique, actionable product-level data instead of just generic landing pages.
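
The note above implies that `buy_url` acts as the record key. A minimal sketch of checking or enforcing that uniqueness after loading the CSV might look like the following. Only the `url` and `buy_url` field names come from the note; the sample records and the helper name `unique_by_buy_url` are hypothetical.

```python
def unique_by_buy_url(records):
    """Keep the first record seen for each buy_url, preserving input order."""
    seen = set()
    unique = []
    for record in records:
        key = record["buy_url"]
        if key not in seen:
            seen.add(key)
            unique.append(record)
    return unique


# Hypothetical records: one product page (url) with several purchase pages (buy_url).
records = [
    {"url": "https://example.com/sofa", "buy_url": "https://example.com/sofa?v=grey"},
    {"url": "https://example.com/sofa", "buy_url": "https://example.com/sofa?v=blue"},
    {"url": "https://example.com/sofa", "buy_url": "https://example.com/sofa?v=grey"},
]
deduped = unique_by_buy_url(records)
print(len(records), "records in,", len(deduped), "unique buy_url values out")
```

Grouping by `url` instead would collapse distinct purchasable variants into one landing page, which is the situation the note says the dataset is designed to avoid.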

  5. Looker Ecommerce BigQuery Dataset

    • kaggle.com
    Updated Jan 18, 2024
    Cite
    Mustafa Keser (2024). Looker Ecommerce BigQuery Dataset [Dataset]. https://www.kaggle.com/datasets/mustafakeser4/looker-ecommerce-bigquery-dataset
    Explore at:
    Croissant
    Dataset updated
    Jan 18, 2024
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Mustafa Keser
    Description

    Looker Ecommerce Dataset Description

    CSV version of Looker Ecommerce Dataset.

    Overview

    Dataset in BigQuery: TheLook is a fictitious eCommerce clothing site developed by the Looker team. The dataset contains information about customers, products, orders, logistics, web events and digital marketing campaigns. The contents of this dataset are synthetic, and are provided to industry practitioners for the purpose of product discovery, testing, and evaluation. This public dataset is hosted in Google BigQuery and is included in BigQuery's 1TB/mo of free tier processing. This means that each user receives 1TB of free BigQuery processing every month, which can be used to run queries on this public dataset. Watch this short video to learn how to get started quickly using BigQuery to access public datasets.

    1. distribution_centers.csv

    • Columns:
      • id: Unique identifier for each distribution center.
      • name: Name of the distribution center.
      • latitude: Latitude coordinate of the distribution center.
      • longitude: Longitude coordinate of the distribution center.

    2. events.csv

    • Columns:
      • id: Unique identifier for each event.
      • user_id: Identifier for the user associated with the event.
      • sequence_number: Sequence number of the event.
      • session_id: Identifier for the session during which the event occurred.
      • created_at: Timestamp indicating when the event took place.
      • ip_address: IP address from which the event originated.
      • city: City where the event occurred.
      • state: State where the event occurred.
      • postal_code: Postal code of the event location.
      • browser: Web browser used during the event.
      • traffic_source: Source of the traffic leading to the event.
      • uri: Uniform Resource Identifier associated with the event.
      • event_type: Type of event recorded.

    3. inventory_items.csv

    • Columns:
      • id: Unique identifier for each inventory item.
      • product_id: Identifier for the associated product.
      • created_at: Timestamp indicating when the inventory item was created.
      • sold_at: Timestamp indicating when the item was sold.
      • cost: Cost of the inventory item.
      • product_category: Category of the associated product.
      • product_name: Name of the associated product.
      • product_brand: Brand of the associated product.
      • product_retail_price: Retail price of the associated product.
      • product_department: Department to which the product belongs.
      • product_sku: Stock Keeping Unit (SKU) of the product.
      • product_distribution_center_id: Identifier for the distribution center associated with the product.

    4. order_items.csv

    • Columns:
      • id: Unique identifier for each order item.
      • order_id: Identifier for the associated order.
      • user_id: Identifier for the user who placed the order.
      • product_id: Identifier for the associated product.
      • inventory_item_id: Identifier for the associated inventory item.
      • status: Status of the order item.
      • created_at: Timestamp indicating when the order item was created.
      • shipped_at: Timestamp indicating when the order item was shipped.
      • delivered_at: Timestamp indicating when the order item was delivered.
      • returned_at: Timestamp indicating when the order item was returned.

    5. orders.csv

    • Columns:
      • order_id: Unique identifier for each order.
      • user_id: Identifier for the user who placed the order.
      • status: Status of the order.
      • gender: Gender information of the user.
      • created_at: Timestamp indicating when the order was created.
      • returned_at: Timestamp indicating when the order was returned.
      • shipped_at: Timestamp indicating when the order was shipped.
      • delivered_at: Timestamp indicating when the order was delivered.
      • num_of_item: Number of items in the order.

    6. products.csv

    • Columns:
      • id: Unique identifier for each product.
      • cost: Cost of the product.
      • category: Category to which the product belongs.
      • name: Name of the product.
      • brand: Brand of the product.
      • retail_price: Retail price of the product.
      • department: Department to which the product belongs.
      • sku: Stock Keeping Unit (SKU) of the product.
      • distribution_center_id: Identifier for the distribution center associated with the product.

    7. users.csv

    • Columns:
      • id: Unique identifier for each user.
      • first_name: First name of the user.
      • last_name: Last name of the user.
      • email: Email address of the user.
      • age: Age of the user.
      • gender: Gender of the user.
      • state: State where t...
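
The column lists above suggest how the files join: `order_items.inventory_item_id` points at `inventory_items.id`. The sketch below loads a few made-up rows into an in-memory SQLite database and sums retail price minus cost per category for items that were not returned. The sample rows and the status values `'Complete'` and `'Returned'` are assumptions for illustration; only the table and column names are taken from the listing.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE inventory_items (
        id INTEGER PRIMARY KEY,
        cost REAL,
        product_category TEXT,
        product_retail_price REAL
    );
    CREATE TABLE order_items (
        id INTEGER PRIMARY KEY,
        inventory_item_id INTEGER,
        status TEXT
    );
    -- Made-up rows for illustration only.
    INSERT INTO inventory_items VALUES
        (1, 10.0, 'Jeans', 25.0),
        (2, 12.0, 'Jeans', 30.0),
        (3, 5.0, 'Socks', 8.0);
    INSERT INTO order_items VALUES
        (100, 1, 'Complete'),
        (101, 2, 'Returned'),
        (102, 3, 'Complete');
    """
)

# Join each order item to its inventory item and aggregate per category.
QUERY = """
    SELECT ii.product_category,
           SUM(ii.product_retail_price - ii.cost) AS retail_margin
    FROM order_items AS oi
    JOIN inventory_items AS ii ON ii.id = oi.inventory_item_id
    WHERE oi.status <> 'Returned'
    GROUP BY ii.product_category
    ORDER BY ii.product_category
"""
margin_by_category = conn.execute(QUERY).fetchall()
print(margin_by_category)
```

The same query shape should carry over to the CSV files once they are imported, provided the real status values are checked first.
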
  6. ECommerce Data Analysis

    • kaggle.com
    Updated Jan 1, 2024
    Cite
    M Mohaiminul Islam (2024). ECommerce Data Analysis [Dataset]. https://www.kaggle.com/datasets/mmohaiminulislam/ecommerce-data-analysis
    Explore at:
    Croissant
    Dataset updated
    Jan 1, 2024
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    M Mohaiminul Islam
    License

    MIT License (https://opensource.org/licenses/MIT)
    License information was derived automatically

    Description

    Objectives:

    • I leveraged advanced data visualization techniques to extract valuable insights from a comprehensive dataset. By visualizing sales patterns, customer behavior, and product trends, I identified key growth opportunities and provided actionable recommendations to optimize business strategies and enhance overall performance. You can find the code in the linked GitHub repository.

    Data Description:

    There are exactly 6 tables: 1 is a fact table and the rest are dimension tables.

    Fact Table:

    payment_key:
      Description: An identifier representing the payment transaction associated with the fact.
      Use Case: This key links to a payment dimension table, providing details about the payment method and related information.
    
    customer_key:
      Description: An identifier representing the customer associated with the fact.
      Use Case: This key links to a customer dimension table, providing details about the customer, such as name, address, and other customer-specific information.
    
    time_key:
      Description: An identifier representing the time dimension associated with the fact.
      Use Case: This key links to a time dimension table, providing details about the time of the transaction, such as date, day of the week, and month.
    
    item_key:
      Description: An identifier representing the item or product associated with the fact.
      Use Case: This key links to an item dimension table, providing details about the product, such as category, sub-category, and product name.
    
    store_key:
      Description: An identifier representing the store or location associated with the fact.
      Use Case: This key links to a store dimension table, providing details about the store, such as location, store name, and other store-specific information.
    
    quantity:
      Description: The quantity of items sold or involved in the transaction.
      Use Case: Represents the amount or number of items associated with the transaction.
    
    unit:
      Description: The unit or measurement associated with the quantity (e.g., pieces, kilograms).
      Use Case: Specifies the unit of measurement for the quantity.
    
    unit_price:
      Description: The price per unit of the item.
      Use Case: Represents the cost or price associated with each unit of the item.
    
    total_price:
      Description: The total price of the transaction, calculated as the product of quantity and unit price.
      Use Case: Represents the overall cost or revenue generated by the transaction.
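
Because `total_price` is described as the product of `quantity` and `unit_price`, the fact table can be sanity-checked before aggregating. The sketch below flags inconsistent rows and sums revenue per `store_key`. The sample rows and the tolerance value are assumptions for illustration; the column names follow the descriptions above.

```python
from collections import defaultdict

# Made-up fact rows using the column names described above.
fact_rows = [
    {"store_key": "S001", "quantity": 2, "unit_price": 15.0, "total_price": 30.0},
    {"store_key": "S001", "quantity": 1, "unit_price": 9.5, "total_price": 9.5},
    {"store_key": "S002", "quantity": 3, "unit_price": 4.0, "total_price": 13.0},
]


def inconsistent_rows(rows, tolerance=0.01):
    """Return rows whose total_price differs from quantity * unit_price."""
    return [
        row
        for row in rows
        if abs(row["quantity"] * row["unit_price"] - row["total_price"]) > tolerance
    ]


def revenue_by_store(rows):
    """Sum total_price per store_key."""
    totals = defaultdict(float)
    for row in rows:
        totals[row["store_key"]] += row["total_price"]
    return dict(totals)


print(inconsistent_rows(fact_rows))
print(revenue_by_store(fact_rows))
```

Replacing `store_key` with `customer_key`, `item_key`, or `time_key` would give the same roll-up along the other dimensions.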
    

    Customer Table:

    customer_key:

    Description: An identifier representing a unique customer.
    Use Case: Serves as the primary key to link with the fact table, allowing for easy and efficient retrieval of customer-specific information.
    

    name:

    Description: The name of the customer.
    Use Case: Captures the personal or business name of the customer for identification and reference purposes.
    

    contact_no:

    Description: The contact number associated with the customer.
    Use Case: Stores the phone number or contact details for communication or outreach purposes.
    

    nid:

    Description: The National ID (NID) or a unique identification number for the customer.
    

    Item Table:

    item_key:

    Description: An identifier representing a unique item or product.
    Use Case: Serves as the primary key to link with the fact table, enabling retrieval of detailed information about specific items in transactions.
    

    item_name:

    Description: The name or title of the item.
    Use Case: Captures the descriptive name of the item, providing a recognizable label for the product.
    

    desc:

    Description: A description of the item.
    Use Case: Contains additional details about the item, such as features, specifications, or any relevant information.
    

    unit_price:

    Description: The price per unit of the item.
    Use Case: Represents the cost or price associated with each unit of the item.
    

    man_country:

    Description: The country where the item is manufactured.
    Use Case: Captures the origin or manufacturing location of the item.
    

    supplier:

    Description: The supplier or vendor providing the item.
    Use Case: Stores the name or identifier of the supplier, facilitating tracking of item sources.
    

    unit:

    Description: The unit of measurement associated with the item (e.g., pieces, kilograms).
    

    Store Table:

    store_key:

    Description: An identifier representing a unique store or location.
    Use Case: Serves as the primary key to link with the fact table, allowing for easy retrieval of information about transactions associated with specific stores.
    

    division:

    Description: The administrative division or region where the store is located.
    Use Case: Captures the broader geographical area in which...
    
  7. Bitext-retail-ecommerce-llm-chatbot-training-dataset

    • huggingface.co
    Updated Aug 6, 2024
    + more versions
    Cite
    Bitext (2024). Bitext-retail-ecommerce-llm-chatbot-training-dataset [Dataset]. https://huggingface.co/datasets/bitext/Bitext-retail-ecommerce-llm-chatbot-training-dataset
    Explore at:
    Croissant
    Dataset updated
    Aug 6, 2024
    Dataset authored and provided by
    Bitext
    License

    https://choosealicense.com/licenses/cdla-sharing-1.0/

    Description

    Bitext - Retail (eCommerce) Tagged Training Dataset for LLM-based Virtual Assistants

      Overview
    

    This hybrid synthetic dataset is designed to be used to fine-tune Large Language Models such as GPT, Mistral and OpenELM, and has been generated using our NLP/NLG technology and our automated Data Labeling (DAL) tools. The goal is to demonstrate how Verticalization/Domain Adaptation for the [Retail (eCommerce)] sector can be easily achieved using our two-step approach to LLM… See the full description on the dataset page: https://huggingface.co/datasets/bitext/Bitext-retail-ecommerce-llm-chatbot-training-dataset.

  8. Synthetic E-Commerce Relational Datasets

    • kaggle.com
    Updated Aug 31, 2025
    Cite
    Nael Aqel (2025). Synthetic E-Commerce Relational Datasets [Dataset]. https://www.kaggle.com/datasets/naelaqel/synthetic-e-commerce-relational-dataset
    Explore at:
    Croissant
    Dataset updated
    Aug 31, 2025
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Nael Aqel
    License

    MIT License (https://opensource.org/licenses/MIT)
    License information was derived automatically

    Description

    Synthetic E-Commerce Relational Dataset

    This dataset is synthetically generated fake data designed to simulate a realistic e-commerce environment.

    Purpose

    To provide large-scale relational datasets for practicing database operations, analytics, and testing tools like DuckDB, Pandas, and SQL engines. Ideal for benchmarking, educational projects, and data engineering experiments.

    Entity Relationship Diagram (ERD) - Tables Overview

    1. Customers

    • customer_id (int): Unique identifier for each customer
    • name (string): Customer full name
    • email (string): Customer email address
    • gender (string): Customer gender ('Male', 'Female', 'Other')
    • signup_date (date): Date customer signed up
    • country (string): Customer country of residence

    2. Products

    • product_id (int): Unique identifier for each product
    • product_name (string): Name of the product
    • category (string): Product category (e.g., Electronics, Books)
    • price (float): Price per unit
    • stock_quantity (int): Available stock count
    • brand (string): Product brand name

    3. Orders

    • order_id (int): Unique identifier for each order
    • customer_id (int): ID of the customer who placed the order (foreign key to Customers)
    • order_date (date): Date when order was placed
    • total_amount (float): Total amount for the order
    • payment_method (string): Payment method used (Credit Card, PayPal, etc.)
    • shipping_country (string): Country where the order is shipped

    4. Order Items

    • order_item_id (int): Unique identifier for each order item
    • order_id (int): ID of the order this item belongs to (foreign key to Orders)
    • product_id (int): ID of the product ordered (foreign key to Products)
    • quantity (int): Number of units ordered
    • unit_price (float): Price per unit at order time

    5. Product Reviews

    • review_id (int): Unique identifier for each review
    • product_id (int): ID of the reviewed product (foreign key to Products)
    • customer_id (int): ID of the customer who wrote the review (foreign key to Customers)
    • rating (int): Rating score (1 to 5)
    • review_text (string): Text content of the review
    • review_date (date): Date the review was written
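
The foreign keys listed above lend themselves to simple integrity checks when practicing with these tables. The sketch below finds child rows whose foreign key has no matching parent and recomputes order totals from order items. The sample rows are made up, and the listing does not state whether `total_amount` equals the sum of `quantity * unit_price` in the generated data, so that comparison is an assumption to be tested rather than a guarantee.

```python
# Made-up rows using the column names from the tables above.
customers = [{"customer_id": 1}, {"customer_id": 2}]
orders = [
    {"order_id": 10, "customer_id": 1, "total_amount": 50.0},
    {"order_id": 11, "customer_id": 3, "total_amount": 20.0},  # no such customer
]
order_items = [
    {"order_item_id": 100, "order_id": 10, "quantity": 2, "unit_price": 20.0},
    {"order_item_id": 101, "order_id": 10, "quantity": 1, "unit_price": 10.0},
    {"order_item_id": 102, "order_id": 11, "quantity": 1, "unit_price": 20.0},
]


def orphaned(child_rows, fk, parent_rows, pk):
    """Return child rows whose foreign key value has no matching parent key."""
    parent_keys = {row[pk] for row in parent_rows}
    return [row for row in child_rows if row[fk] not in parent_keys]


def recomputed_totals(items):
    """Sum quantity * unit_price per order_id."""
    totals = {}
    for item in items:
        line_total = item["quantity"] * item["unit_price"]
        totals[item["order_id"]] = totals.get(item["order_id"], 0.0) + line_total
    return totals


print(orphaned(orders, "customer_id", customers, "customer_id"))
print(recomputed_totals(order_items))
```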

    Visual ERD

    https://www.googleapis.com/download/storage/v1/b/kaggle-user-content/o/inbox%2F9179978%2F7681afe8fc52a116ff56a2a4e179ad19%2FEDR.png?generation=1754741998037680&alt=media

    Notes

    • All data is randomly generated using Python’s Faker library, so it does not reflect any real individuals or companies.
    • The data is provided in both CSV and Parquet formats.
    • The generator script is available in the accompanying GitHub repository for reproducibility and customization.

    Output

    The script saves two folders inside the specified output path:

    csv/    # CSV files
    parquet/  # Parquet files
    

    License

    MIT License


  9. Pakistan Largest Ecommerce Dataset - Datasets - Open Data Pakistan

    • opendata.com.pk
    Updated Apr 2, 2021
    Cite
    (2021). Pakistan Largest Ecommerce Dataset - Datasets - Open Data Pakistan [Dataset]. https://opendata.com.pk/dataset/pakistan-largest-ecommerce-dataset
    Dataset updated
    Apr 2, 2021
    Area covered
    Pakistan
    Description

    This is the largest retail e-commerce orders dataset from Pakistan. It contains half a million transaction records from March 2016 to August 2018.

  10. Zara UK Products Dataset - Complete Fashion E-commerce Data

    • crawlfeeds.com
    csv, zip
    Updated Aug 17, 2025
    Cite
    Crawl Feeds (2025). Zara UK Products Dataset - Complete Fashion E-commerce Data [Dataset]. https://crawlfeeds.com/datasets/zara-uk-products-dataset-complete-fashion-e-commerce-data
    Explore at:
    Available download formats: zip, csv
    Dataset updated
    Aug 17, 2025
    Dataset authored and provided by
    Crawl Feeds
    License

    https://crawlfeeds.com/privacy_policy

    Description

    16,000 Zara UK Fashion Products in CSV Format

    Unlock fashion retail intelligence with our comprehensive Zara UK products dataset. This premium collection contains 16,000 products from Zara's UK online store, providing detailed insights into one of the world's leading fast-fashion retailers. Perfect for fashion trend analysis, pricing strategies, competitive research, and machine learning applications.

    Dataset Overview

    • Language: English
    • Coverage: Men's, women's, and children's fashion
    • File Size: ~30MB
    • Data Freshness: Recently collected (2025)

    Complete Data Fields Included

    Product Information

    • name: Complete product titles and descriptions
    • brand: Brand identification (Zara)
    • category: Product categories (tops, bottoms, dresses, accessories)
    • description: Detailed item descriptions and features
    • composition: Fabric composition and material details
    • breadcrumbs: Navigation path and product hierarchy

    Pricing and Promotions

    • price: Current prices in GBP
    • old_price: Original prices before discounts
    • discount: Discount percentages and savings
    • promotions: Active promotional campaigns
    • currency: GBP for UK market analysis

    Product Attributes

    • color: Available color variations
    • sizes: Size ranges and availability
    • images: High-resolution product image URLs
    • url: Direct product page links

    Technical Fields

    • uniq_id: Unique product identifiers
    • scraped_at: Data collection timestamps
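
The `price`, `old_price`, and `discount` fields are related by simple arithmetic, which can be used to cross-check the `discount` column. The sketch below assumes `price` and `old_price` have already been parsed into numbers and that `old_price` is empty for items that are not on sale; the real file may store these differently (for example as strings with a currency symbol), so both points are assumptions.

```python
def discount_percent(price, old_price):
    """Percentage reduction from old_price to price, or None if not discounted."""
    if old_price is None or old_price <= 0 or price >= old_price:
        return None
    return round((old_price - price) / old_price * 100, 1)


# Hypothetical products; values are made up for illustration.
products = [
    {"name": "Example shirt", "price": 30.0, "old_price": 40.0},
    {"name": "Example coat", "price": 90.0, "old_price": None},
]

for product in products:
    pct = discount_percent(product["price"], product["old_price"])
    label = f"{pct}% off" if pct is not None else "full price"
    print(product["name"], label)
```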

    Key Use Cases

    Fashion Trend Analysis

    • Track seasonal trends and popular styles
    • Analyze color preferences and combinations
    • Monitor fashion trend evolution
    • Predict upcoming fashion movements

    Competitive Intelligence

    • Study Zara's pricing strategies
    • Analyze product mix and category focus
    • Monitor inventory and availability patterns
    • Compare market positioning

    E-commerce Analytics

    • Category performance analysis
    • Price optimization strategies
    • Inventory planning insights
    • Customer preference mapping

    Machine Learning Applications

    • Fashion recommendation systems
    • Price prediction models
    • Trend forecasting algorithms
    • Image recognition training data

    Data Quality Features

    • Clean, Validated Data: Pre-processed and error-checked
    • Consistent Formatting: Standardized structure across records
    • No Duplicates: Unique products only
    • Complete Coverage: Entire Zara UK catalog included
    • Fresh Collection: Recently scraped for current relevance

    Target Industries

    Fashion Retailers

    • Competitive benchmarking
    • Trend adoption strategies
    • Pricing optimization
    • Product development insights

    Technology Companies

    • AI training datasets
    • Fashion analytics platforms
    • E-commerce enhancement
    • Style recommendation engines

    Market Research

    • Industry analysis reports
    • Brand performance tracking
    • Consumer behavior studies
    • Trend forecasting services

    Academic Research

    • Fashion industry studies
    • Business case studies
    • Data science applications
    • Sustainability research

    Licensing Options

    Commercial License

    • Full business usage rights
    • Team sharing permissions
    • Resale of processed insights
    • API integration allowed

    Academic License

    • Non-commercial research use
    • Educational institution sharing
    • Publication rights included
    • Discounted pricing available

    Delivery Methods

    • Instant

  11. Fraud Detection in E-Commerce Dataset

    • kaggle.com
    Updated Mar 3, 2025
    Cite
    Kevin Vagan (2025). Fraud Detection in E-Commerce Dataset [Dataset]. https://www.kaggle.com/datasets/kevinvagan/fraud-detection-dataset/data
    Explore at:
    Croissant
    Dataset updated
    Mar 3, 2025
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Kevin Vagan
    License

    MIT License (https://opensource.org/licenses/MIT)
    License information was derived automatically

    Description

    This is a fabricated dataset made by merging two datasets, Dataset1.csv and Dataset2.csv.

    The final dataset, merged_dataset.csv, is a synthetic dataset that uses probabilistic imputation to handle missing values.

    Balancing the Dataset: The dataset, which was initially imbalanced, was balanced using the ROSE (Random Over-Sampling Examples) package to ensure equal representation of fraudulent and non-fraudulent transactions.
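
To illustrate the general idea of balancing, the sketch below performs plain random over-sampling with replacement in Python. This is a simpler swapped-in technique, not a reimplementation of the ROSE package, which generates new synthetic examples rather than duplicating existing rows. The `is_fraud` label name and the sample rows are assumptions for illustration.

```python
import random


def random_oversample(rows, label_key, seed=0):
    """Duplicate minority-class rows at random until every class is equally common."""
    rng = random.Random(seed)
    by_label = {}
    for row in rows:
        by_label.setdefault(row[label_key], []).append(row)
    target = max(len(group) for group in by_label.values())
    balanced = []
    for group in by_label.values():
        balanced.extend(group)
        # Sample with replacement to fill the gap up to the majority size.
        balanced.extend(rng.choices(group, k=target - len(group)))
    rng.shuffle(balanced)
    return balanced


# Made-up transactions: 8 legitimate, 2 fraudulent.
rows = [{"id": i, "is_fraud": 0} for i in range(8)]
rows += [{"id": 100 + i, "is_fraud": 1} for i in range(2)]

balanced = random_oversample(rows, "is_fraud")
counts = {label: sum(1 for r in balanced if r["is_fraud"] == label) for label in (0, 1)}
print(counts)
```

Over-sampling of this kind is normally applied to the training split only, so that duplicated rows do not leak into evaluation data.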

    This dataset was used for my group's school project report, which is linked from the dataset page. The code for this project is available at https://github.com/slothislazy/DM_AOL

  12. E-Commerce Product Datasets for Product Catalog Insights

    • datarade.ai
    Cite
    Oxylabs, E-Commerce Product Datasets for Product Catalog Insights [Dataset]. https://datarade.ai/data-products/e-commerce-product-datasets-for-product-catalog-insights-oxylabs
    Explore at:
    Available download formats: .json, .xml, .csv, .xls
    Dataset authored and provided by
    Oxylabs
    Area covered
    Kazakhstan, Ethiopia, Saint Vincent and the Grenadines, Niue, French Polynesia, Puerto Rico, Lao People's Democratic Republic, Samoa, Nicaragua, Tanzania
    Description

    Introducing E-Commerce Product Datasets!

    Unlock the full potential of your product strategy with E-Commerce Product Datasets. Gain invaluable insights to optimize your product offerings and pricing, analyze top-selling strategies, and assess customer sentiment.

    Our E-Commerce Datasets Source:

    1. Amazon: Access accurate product data from Amazon, including categories, pricing, reviews, and more.

    2. Walmart: Receive comprehensive product information from Walmart, covering pricing, sellers, ratings, availability, and more.

    E-Commerce Product Datasets provide structured and actionable data, empowering you to understand customer needs and enhance product strategies. We deliver fresh and precise public e-commerce data, including product names, brands, prices, number of sellers, review counts, ratings, and availability.

    You have the flexibility to tailor data delivery to your specific needs:

    • Receive datasets in various formats, including JSON and CSV.
    • Choose delivery via SFTP or directly to your cloud storage (e.g., AWS S3, Google Cloud Storage).
    • Select from one-time, monthly, quarterly, or bi-annual data delivery frequencies.

    Why Choose Oxylabs E-Commerce Datasets:

    1. Fresh and accurate data: Access clean and structured public e-commerce data collected by our leading web scraping professionals.

    2. Time and resource savings: Let our experts handle data extraction at an affordable cost, allowing you to focus on your core business objectives.

    3. Customizable solutions: Share your unique business needs, and our team will craft customized dataset solutions tailored to your requirements.

    4. Legal compliance: Partner with a trusted leader in ethical data collection, endorsed by Fortune 500 companies and fully compliant with GDPR and CCPA regulations.

    Pricing Options:

    Custom Datasets: Tailor datasets from any public web domain to your unique business needs. Contact our sales team for custom pricing.

    Experience a seamless journey with Oxylabs:

    • Understanding your data needs: We work closely to understand your business nature and daily operations, defining your unique data requirements.
    • Developing a customized solution: Our experts create a custom framework to extract public data using our in-house web scraping infrastructure.
    • Delivering data sample: We provide a sample for your feedback on data quality and the entire delivery process.
    • Continuous data delivery: We continuously collect public data and deliver custom datasets per the agreed frequency.

    Unlock the potential of your e-commerce strategy with E-Commerce Product Datasets!

  13. Walmart products free dataset

    • crawlfeeds.com
    csv, zip
    Updated Apr 27, 2025
    Cite
    Crawl Feeds (2025). Walmart products free dataset [Dataset]. https://crawlfeeds.com/datasets/walmart-products-free-dataset
    Explore at:
    Available download formats: zip, csv
    Dataset updated
    Apr 27, 2025
    Dataset authored and provided by
    Crawl Feeds
    License

    https://crawlfeeds.com/privacy_policy

    Description

    Discover the Walmart Products Free Dataset, featuring 2,000 records in CSV format. This dataset includes detailed information about various Walmart products, such as names, prices, categories, and descriptions.

    It’s perfect for data analysis, e-commerce research, and machine learning projects. Download now and kickstart your insights with accurate, real-world data.

  14. E-commerce Product Dataset from Mercado Libre Perú

    • data.niaid.nih.gov
    Updated Oct 12, 2023
    Cite
    Cotacallapa Mamani, Harold Enrique (2023). E-commerce Product Dataset from Mercado Libre Perú [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_8415495
    Explore at:
    Dataset updated
    Oct 12, 2023
    Dataset provided by
    Universidad Peruana Unión
    Authors
    Cotacallapa Mamani, Harold Enrique
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    We offer a dataset comprising approximately 1,198,398 unique products sourced from Mercado Libre Perú. This dataset was collected from the platform's public API spanning from February 2022 to May 2023.

    Files description:

    ml_db_raw.db : Raw dataset stored in a SQLite Database

    ml_db_sample.csv : A sample of only 5 electronic categories

    test.csv* : 20% of data from ml_db_sample.csv

    train.csv* : 80% of data from ml_db_sample.csv

    • The dataset was divided into training and testing sets using a random stratified technique.
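    The 80/20 random stratified split described above can be sketched as follows. This is a minimal pure-Python illustration, not the author's actual procedure: the source does not say which column was used for stratification, so `Cat1` as the stratification key, the fixed seed, and the sample rows are all assumptions.

```python
import random
from collections import defaultdict


def stratified_split(rows, label_key, test_fraction=0.2, seed=0):
    """Split rows into (train, test) so each label keeps roughly the same share in both."""
    rng = random.Random(seed)

    # Group rows by the stratification label.
    by_label = defaultdict(list)
    for row in rows:
        by_label[row[label_key]].append(row)

    train, test = [], []
    for label in sorted(by_label):
        group = by_label[label][:]
        rng.shuffle(group)
        # Take the same fraction of every label for the test set.
        n_test = round(len(group) * test_fraction)
        test.extend(group[:n_test])
        train.extend(group[n_test:])
    return train, test


# Hypothetical sample: 10 products in each of two categories.
products = [{"id": i, "Cat1": "Laptops" if i % 2 else "Phones"} for i in range(20)]

train, test = stratified_split(products, label_key="Cat1")
```

    Splitting per label rather than over the whole table keeps rare categories represented in both files, which a plain random split does not guarantee.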

    Attributes description:

    CatX : Category Name for X level

    CatX_code : Category Code given by Mercado Libre for X level

    id : Unique product identifier

    title : Original product title

    price : Product price

    currency : Product currency (PEN, USD)

    link : Product link

    insert_date : Web scraping date

    mlp_updated_date : Mercado Libre product update date

    text : Cleaned product title

    taxonomy : Category path from general to specific categories

  15. Data from: Dataset for the electronic customer relationship management based...

    • data.mendeley.com
    Updated Feb 22, 2022
    + more versions
    Cite
    Bui Thanh Khoa (2022). Dataset for the electronic customer relationship management based on S-O-R model in electronic commerce [Dataset]. http://doi.org/10.17632/n9tdpdp45k.4
    Explore at:
    Dataset updated
    Feb 22, 2022
    Authors
    Bui Thanh Khoa
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset includes a data file (.csv), a questionnaire (.pdf), and a figure (.png). It contains 485 records and three research factors: electronic loyalty (3 items), perceived mental benefits (4 items), and hedonic value (4 items). The data analysis showed that perceived mental benefits and hedonic value both have a positive impact on electronic loyalty, which confirmed the Stimulus–Organism–Response (SOR) model. Additionally, hedonic value is a mediator in the relationship between electronic loyalty and perceived mental benefits in electronic customer relationship management.

  16. Ecommerce-FAQ-Chatbot-Dataset

    • kaggle.com
    Updated May 19, 2023
    Cite
    Muhammad Saad Makhdoom (2023). Ecommerce-FAQ-Chatbot-Dataset [Dataset]. https://www.kaggle.com/datasets/saadmakhdoom/ecommerce-faq-chatbot-dataset
    Explore at:
    Croissant: a format for machine-learning datasets. Learn more at mlcommons.org/croissant.
    Dataset updated
    May 19, 2023
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Muhammad Saad Makhdoom
    Description

    Dataset

    This dataset was created by Muhammad Saad Makhdoom

    Contents

  17. Pakistan's Largest E-Commerce Dataset

    • kaggle.com
    Updated Jan 30, 2021
    Cite
    Zeeshan-ul-hassan Usmani (2021). Pakistan's Largest E-Commerce Dataset [Dataset]. https://www.kaggle.com/datasets/zusmani/pakistans-largest-ecommerce-dataset/code
    Explore at:
    Croissant: a format for machine-learning datasets. Learn more at mlcommons.org/croissant.
    Dataset updated
    Jan 30, 2021
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Zeeshan-ul-hassan Usmani
    Area covered
    Pakistan
    Description

    Context

    This is the largest retail e-commerce orders dataset from Pakistan. It contains half a million transaction records from March 2016 to August 2018. The data was collected from various e-commerce merchants as part of a research study. I am releasing this dataset as a capstone project for my data science course at Alnafi (alnafi.com/zusmani).
    There is a dire need for such a dataset to learn about Pakistan’s emerging e-commerce potential, and I hope it will help many startups in many ways.

    Content

    Geography: Pakistan

    Time period: 03/2016 – 08/2018

    Unit of analysis: E-Commerce Orders

    Dataset: The dataset contains detailed information of half a million e-commerce orders in Pakistan from March 2016 to August 2018. It contains item details, shipping method, payment method like credit card, Easy-Paisa, Jazz-Cash, cash-on-delivery, product categories like fashion, mobile, electronics, appliance etc., date of order, SKU, price, quantity, total and customer ID. This is the most detailed dataset about e-commerce in Pakistan that you can find in the Public domain.

    Variables: The dataset contains Item ID, Order Status (Completed, Cancelled, Refund), Date of Order, SKU, Price, Quantity, Grand Total, Category, Payment Method and Customer ID.

    Size: 101 MB

    File Type: CSV

    Acknowledgements

    I would like to thank all the startups that are trying to make their mark in Pakistan despite the unavailability of research data.

    Inspiration

    I’d like to call the attention of my fellow Kagglers to use Machine Learning and Data Sciences to help me explore these ideas:

    • What is the best-selling category?
    • Visualize payment method and order status frequency
    • Find a correlation between payment method and order status
    • Find a correlation between order date and item category
    • Find any hidden patterns that are counter-intuitive for a layman
    • Can we predict number of orders, or item category or number of customers/amount in advance?
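    One way to start on the payment method and order status question is a simple cross-tabulation, sketched below with the standard library only. The keys `payment_method` and `status` and the sample orders are hypothetical stand-ins; the real CSV's column names may differ.

```python
from collections import Counter


def crosstab(rows, row_key, col_key):
    """Count how often each (row_key value, col_key value) pair occurs."""
    return Counter((row[row_key], row[col_key]) for row in rows)


def row_shares(counts):
    """Convert pair counts into shares within each row_key value."""
    totals = Counter()
    for (row_value, _), n in counts.items():
        totals[row_value] += n
    return {pair: n / totals[pair[0]] for pair, n in counts.items()}


# Hypothetical orders, for illustration only.
orders = [
    {"payment_method": "cod", "status": "Completed"},
    {"payment_method": "cod", "status": "Cancelled"},
    {"payment_method": "cod", "status": "Cancelled"},
    {"payment_method": "cod", "status": "Completed"},
    {"payment_method": "credit card", "status": "Completed"},
    {"payment_method": "credit card", "status": "Completed"},
]

counts = crosstab(orders, "payment_method", "status")
shares = row_shares(counts)
```

    Comparing the share of cancelled orders across payment methods gives a first indication of whether the two variables are related; a formal test of association would be a natural next step.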

  18. Dataset Literature Review E-commerce And CSR Strategy

    • data.niaid.nih.gov
    • zenodo.org
    Updated Jul 15, 2024
    Cite
    Ferdi Safari (2024). Dataset Literature Review E-commerce And CSR Strategy [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_7337446
    Explore at:
    Dataset updated
    Jul 15, 2024
    Dataset authored and provided by
    Ferdi Safari
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The data were obtained from the lens.org website by selecting Scholarly Works with the keyword "E-commerce And CSR Strategy". The results were then filtered twice on Document Type ("Journal Article" and "Conference Proceeding"), followed by the Subject Matter filter with the Subject "Law". At every stage, from first to last, the data were exported in CSV and BibTeX formats, and at the final stage all images shown in the analysis menu on lens.org were captured.

  19. Research on E-Commerce as per Scopus Database as at October 2020

    • search.datacite.org
    • data.mendeley.com
    Updated Oct 7, 2020
    Cite
    Aidi Ahmi (2020). Research on E-Commerce as per Scopus Database as at October 2020 [Dataset]. http://doi.org/10.17632/jc6mjmf29s
    Explore at:
    Dataset updated
    Oct 7, 2020
    Dataset provided by
    DataCite (https://www.datacite.org/)
    Mendeley
    Authors
    Aidi Ahmi
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset was obtained from the Scopus database as of 7 October 2020, using the following search query: TITLE ( "e-commerce" OR "electronic commerce" OR "e commerce" OR "ecommerce" ). It can be used to map the research on e-commerce from 1992 until 2020 using bibliometric analysis. The dataset comes in 9 parts, prepared in CSV and RIS formats. The CSV data can be opened and analysed in applications such as Microsoft Excel, or in VOSviewer for constructing and visualizing bibliometric networks. The RIS data can be opened for further analysis in any reference manager software, such as EndNote or Mendeley Desktop, and in Harzing's Publish or Perish.

  20. E-commerce, customer relation management (CRM) and secure transactions by...

    • data.europa.eu
    Updated Nov 30, 2009
    Cite
    Eurostat (2009). E-commerce, customer relation management (CRM) and secure transactions by size class of enterprise [Dataset]. https://data.europa.eu/data/datasets/i9yvadmdw9xeyctv8zeswg?locale=en
    Explore at:
    Dataset updated
    Nov 30, 2009
    Dataset authored and provided by
    Eurostat (https://ec.europa.eu/eurostat)
    Description

    The dataset "isoc_bde15dec" has been discontinued since 08/02/2024.

Cite
The Devastator (2022). E-Commerce Sales Dataset [Dataset]. https://www.kaggle.com/datasets/thedevastator/unlock-profits-with-e-commerce-sales-data/code

E-Commerce Sales Dataset

Analyzing and Maximizing Online Business Performance

Explore at:
Croissant: a format for machine-learning datasets. Learn more at mlcommons.org/croissant.
Dataset updated
Dec 3, 2022
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
The Devastator
Description

By ANil [source]

About this dataset

This dataset provides an in-depth look at the profitability of e-commerce sales. It contains data on a variety of sales channels, including Shiprocket and INCREFF, as well as financial information on related expenses and profits. The columns contain data such as SKU codes, design numbers, stock levels, product categories, sizes and colors. In addition, it includes the MRPs across multiple stores (Ajio MRP, Amazon MRP, Amazon FBA MRP, Flipkart MRP, Limeroad MRP, Myntra MRP and PaytmMRP), along with other key parameters such as the amount paid by the customer for the purchase and the rate per piece for every individual transaction. It also includes transactional parameters such as date of sale, months, category, fulfilledby, B2b, Status, Qty, Currency and Gross amt. This is a must-have dataset for anyone trying to uncover the profitability of e-commerce sales in today's marketplace.

How to use the dataset

This dataset provides a comprehensive overview of e-commerce sales data from different channels covering a variety of products. Using this dataset, retailers and digital marketers can measure the performance of their campaigns more accurately and efficiently.

The following steps help users make the most of this dataset:
- Analyze the general sales trends by examining information such as month, category, currency, stock level, and customer for each sale. This will give you an idea of how your e-commerce business is performing in each channel.
- Review the Shiprocket and INCREFF data to compare profitability across fulfilment methods. This comparison can help you make better decisions about maximizing profit while minimizing the costs associated with each method's referral fees and fulfilment rates.
- Compare prices between channels such as Amazon FBA MRP, Myntra MRP and Ajio MRP using the corresponding column for each store. You can judge which stores offer more profitable margins without compromising on quality by analyzing these price points together with other product sales information (TP1/TP2, the cost per piece).
- Look at customer-specific data, such as Gross Amount or Rate (price per piece) by TP 1/TP 2 combination, or the total gross amount generated by an SKU across multiple customers, together with the relevant dates, to track how an individual item performs relative to others in its category over suitably filtered time periods. Keep an eye on items commonly sold under offers or promotional discounts, so that you can craft strategies for inventory optimization and up-selling.
- Finally, use the overall 'Stock' details together with the P & L data, including the yearly Expenses_IIGF records, to identify essential cost-cutting measures, such as choosing carefully between the Shiprocket and INCREFF delivery options, or reducing manual inspections and outsourced support personnel.

With a comprehensive understanding of how each channel performs, supported by audits where needed, you may be able to lower operational costs and cut previously unseen losses, for example by tracking shipments to see how efficient the current supply chain is. You can then scale profits by introducing new marketing campaigns tailored to your range of goods and backed by carefully chosen shipping options.

Research Ideas

  • Analysing the difference in profitability between sales made through Shiprocket and INCREFF. This data can be used to see where the biggest profit margins lie, and strategize accordingly.
  • Examining the complete cost structure of a product, with all its components and their contribution to revenue or profitability: TP 1 and TP 2, MRP Old and Final MRP Old, platform-based MRPs (Amazon, Myntra, Paytm, etc.), and currency-based profit margin.
  • Building a predictive model using Machine Learning by leveraging historical data to predict future sales volume and profits for e-commerce products across multiple categories/devices/platforms such as Amazon, Flipkart, Myntra etc as well providing m...