2 datasets found
  1. Online_Retail_II

    • kaggle.com
    Updated Jul 2, 2025
    Cite
    Shah Nawaj (2025). Online_Retail_II [Dataset]. https://www.kaggle.com/datasets/shahnawaj9/online-retail/data
    Dataset updated
    Jul 2, 2025
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Shah Nawaj
    License

    MIT License (https://opensource.org/licenses/MIT)
    License information was derived automatically

    Description

    Cleaned & Merged UCI Online Retail Dataset (Dec 2009 – Dec 2011)

    This dataset is a cleaned and merged version of the original UCI Online Retail and Online Retail II datasets. It contains transaction data from a UK-based online retailer, covering a period from December 2009 to December 2011.

    The original UCI Online Retail II dataset contains two separate sheets:

    • Year 2009–2010
    • Year 2010–2011

    These have been merged with the original UCI Online Retail dataset to create a unified and continuous dataset.

    Cleaning and Preprocessing Performed

    • Merged all sheets into a single dataset
    • Removed:
      • Rows with negative or zero quantity
      • Rows with negative or zero price
      • Rows with missing customer_id
    • Created:
      • total_price column (quantity × price)
      • is_cancelled column based on invoice format or return flag
    • Standardized:
      • invoicedate formatting
      • Column names and data types
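
The cleaning steps above could be sketched in pandas roughly as follows. This is an illustration under stated assumptions, not the author's actual script: the raw column names (`Invoice`, `Customer ID`, and so on), the helper `clean_transactions`, and the sample rows are all hypothetical.

```python
import pandas as pd


def clean_transactions(raw: pd.DataFrame) -> pd.DataFrame:
    """Apply the listed cleaning steps to a raw transactions frame (sketch)."""
    df = raw.copy()
    # Standardize column names: lower-case, spaces replaced by underscores.
    df.columns = [c.strip().lower().replace(" ", "_") for c in df.columns]
    # Standardize data types.
    df["invoice"] = df["invoice"].astype(str)
    df["invoicedate"] = pd.to_datetime(df["invoicedate"])
    # Remove rows with non-positive quantity or price, or a missing customer_id.
    keep = (df["quantity"] > 0) & (df["price"] > 0) & df["customer_id"].notna()
    df = df[keep].copy()
    # Derived columns described in the list above.
    df["total_price"] = df["quantity"] * df["price"]
    df["is_cancelled"] = df["invoice"].str.startswith("C")
    return df.reset_index(drop=True)


# Hypothetical sample rows; raw column names are an assumption.
raw = pd.DataFrame({
    "Invoice": ["489434", "C489449", "489435", "489436", "489437"],
    "StockCode": ["85048", "22087", "22350", "21754", "21755"],
    "Description": ["item a", "item b", "item c", "item d", "item e"],
    "Quantity": [12, -12, 0, 6, 4],
    "InvoiceDate": ["2009-12-01 07:45:00"] * 5,
    "Price": [2.5, 2.95, 1.25, 5.95, 1.5],
    "Customer ID": [13085.0, 16321.0, 13085.0, float("nan"), 13078.0],
    "Country": ["United Kingdom"] * 5,
})
cleaned = clean_transactions(raw)
```

Only the first and last sample rows survive: the others have a negative quantity, a zero quantity, or no customer ID.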

    Column Definitions

    Column        Description
    invoice       Invoice number (returns start with 'C')
    stockcode     Product code
    description   Description of product
    quantity      Number of items purchased
    invoicedate   Date and time of invoice
    price         Unit price in GBP
    customer_id   Unique identifier for each customer
    country       Customer’s country
    is_cancelled  Boolean flag for cancelled transactions
    total_price   Computed total (quantity × price) for each line item

    Included Files and Descriptions

    File                                 Type    Description
    online_retail_cleaned.csv            Data    Cleaned and merged retail transactions from 2009–2011
    rfm_final_score.csv                  Output  Final RFM scores for each customer with segment labels
    Retail_Data_Analysis_Dashboard.xlsx  Excel   Interactive Excel dashboard with KPIs, CLV, monthly trends
    Retail_Data_Analysis_Dashboard.png   Image   Visual preview of the Excel dashboard
    RFM_Segmentation.sql                 SQL     SQL logic to calculate RFM scores and assign segments
    Cohort_Analysis_on_Customer.sql      SQL     Cohort analysis based on acquisition month
    Cohort_Analysis_on_Revenue.sql       SQL     Cohort revenue tracking over time

    Dataset Summary

    • Time range: December 2009 – December 2011
    • Data combined from all three sheets (original and Online Retail II)
    • Most customers are from the United Kingdom
    • Fully cleaned and ready for use in analysis or modeling

    Applications

    • Market basket analysis
    • RFM segmentation
    • Cohort and retention analysis
    • Customer lifetime value modeling
    • Time series forecasting

    Included Analysis & Dashboards

    In addition to the cleaned dataset, this dataset includes complete analysis artifacts:

    1. Excel Dashboard

    • Summary metrics: Total Revenue, Orders, Customers, AOV
    • Turnover by year
    • Customer Lifetime Value segmentation (High, Medium, Low)
    • Monthly customer acquisition and churn trend
    • Country-wise revenue
    • Key business recommendations

    2. SQL-Based RFM Segmentation

    • RFM scores (1–5 scale)
    • Segment grouping (e.g., Champions, At Risk, Loyal Customers)
    • Monetary value distributions
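
The dataset ships its RFM logic as SQL. As a rough pandas equivalent, one common way to get 1–5 scores is quantile binning, sketched below. The function `rfm_scores`, the snapshot date, and the sample rows are assumptions for illustration; the included SQL may bin differently, and segment labels are not assigned here because their rules are not described.

```python
import pandas as pd


def rfm_scores(tx: pd.DataFrame, snapshot: pd.Timestamp, bins: int = 5) -> pd.DataFrame:
    """Per-customer recency, frequency, monetary values with 1..bins scores (sketch)."""
    rfm = tx.groupby("customer_id").agg(
        recency=("invoicedate", lambda s: (snapshot - s.max()).days),
        frequency=("invoice", "nunique"),
        monetary=("total_price", "sum"),
    )

    def score(col: pd.Series, ascending: bool) -> pd.Series:
        # Rank first so tied values cannot collapse the quantile edges.
        ranks = col.rank(method="first", ascending=ascending)
        return pd.qcut(ranks, bins, labels=list(range(1, bins + 1))).astype(int)

    rfm["r"] = score(rfm["recency"], ascending=False)  # most recent buyers score highest
    rfm["f"] = score(rfm["frequency"], ascending=True)
    rfm["m"] = score(rfm["monetary"], ascending=True)
    return rfm


# Hypothetical transactions for five customers.
tx = pd.DataFrame({
    "customer_id": [1, 1, 1, 2, 2, 3, 4, 5],
    "invoice": ["1001", "1002", "1003", "1004", "1005", "1006", "1007", "1008"],
    "invoicedate": pd.to_datetime([
        "2011-12-01", "2011-11-15", "2011-10-15", "2011-11-01",
        "2011-09-01", "2011-10-01", "2011-09-01", "2011-08-01",
    ]),
    "total_price": [200.0, 200.0, 100.0, 250.0, 150.0, 300.0, 200.0, 100.0],
})
rfm = rfm_scores(tx, snapshot=pd.Timestamp("2011-12-10"))
```

In this sample, customer 1 is the most recent, most frequent, and highest-spending buyer, so it receives the top score on all three dimensions.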

    3. SQL-Based Cohort Analysis

    • Monthly cohorts based on acquisition date
    • Retention matrix for month-over-month analysis
    • Supports churn and lifecycle evaluation
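
A retention matrix of the kind described above could be computed in pandas along these lines. This is a minimal sketch that assumes acquisition month means the month of a customer's first invoice; `retention_matrix` and the sample orders are hypothetical, and the bundled SQL files may define cohorts differently.

```python
import pandas as pd


def retention_matrix(tx: pd.DataFrame) -> pd.DataFrame:
    """Share of each monthly cohort that orders again N months after acquisition (sketch)."""
    df = tx[["customer_id", "invoicedate"]].copy()
    # Date of each customer's first order, repeated on every row of that customer.
    first_order = df.groupby("customer_id")["invoicedate"].transform("min")
    df["cohort"] = first_order.dt.strftime("%Y-%m")
    # Whole months elapsed between the acquisition month and the order month.
    order_idx = df["invoicedate"].dt.year * 12 + df["invoicedate"].dt.month
    cohort_idx = first_order.dt.year * 12 + first_order.dt.month
    df["period"] = order_idx - cohort_idx
    counts = (
        df.groupby(["cohort", "period"])["customer_id"]
        .nunique()
        .unstack(fill_value=0)
    )
    # Divide each row by its cohort size (the period-0 column).
    return counts.div(counts[0], axis=0)


# Hypothetical orders for three customers.
orders = pd.DataFrame({
    "customer_id": [1, 1, 1, 2, 2, 3, 3],
    "invoicedate": pd.to_datetime([
        "2010-01-05", "2010-02-10", "2010-03-15",
        "2010-01-20", "2010-03-02",
        "2010-02-07", "2010-03-09",
    ]),
})
retention = retention_matrix(orders)
```

Each row is a cohort and each column is the number of months since acquisition; column 0 is 1.0 by construction.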

    These files are provided in .xlsx and .sql formats and can be used for further business analysis or modeling.

    Source

    Original datasets:

    • UCI Online Retail II: https://archive.ics.uci.edu/ml/datasets/Online+Retail+II

    This version was cleaned and merged by: Md Shah Nawaj

    Tags

    retail, ecommerce, customer segmentation, transactions, time series, data cleaning, rfm, python, pandas, online retail

  2. Global Shopping Trolley Market Size, Share, Growth Analysis, By Product...

    • skyquestt.com
    Updated Apr 18, 2024
    Cite
    SkyQuest Technology (2024). Global Shopping Trolley Market Size, Share, Growth Analysis, By Product type(Roller basket, Child cart), By Applications(Supermarkets, Hypermarkets), By Materials(Stainless steel, Plastic hybrid), By Wheels(Three-wheel, Four wheel) - Industry Forecast 2023-2030 [Dataset]. https://www.skyquestt.com/report/shopping-trolley-market
    Dataset updated
    Apr 18, 2024
    Dataset authored and provided by
    SkyQuest Technology
    License

    https://www.skyquestt.com/privacy/

    Time period covered
    2023 - 2030
    Area covered
    Global
    Description

    Global Shopping Trolley Market size was valued at USD 1094.79 million in 2021 and is poised to grow from USD 1396.62 million in 2022 to USD 9745.28 million by 2030, growing at a CAGR of 27.48% in the forecast period (2023-2030).
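
The quoted growth rate can be sanity-checked with the standard compound annual growth rate formula. The sketch below assumes the 2022 value is the base and that growth compounds over the eight years to 2030; the report does not state this explicitly, and `cagr` is a helper introduced here for illustration.

```python
def cagr(start_value: float, end_value: float, years: int) -> float:
    """Compound annual growth rate between two values over a number of years."""
    return (end_value / start_value) ** (1 / years) - 1


# Assumption: 2022 is the base year and compounding runs through 2030.
rate = cagr(1396.62, 9745.28, years=2030 - 2022)
print(f"Implied CAGR: {rate:.2%}")
```

Under that assumption the implied rate comes out at about 27.5%, in line with the quoted 27.48%.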

Online_Retail_II

Merged and cleaned transaction data from UCI Online Retail II (2009–2011)

4 scholarly articles cite this dataset (View in Google Scholar)