10 datasets found

Market Basket Analysis
kaggle.com
zip
Updated Dec 9, 2021
Cite
Aslan Ahmedov (2021). Market Basket Analysis [Dataset]. https://www.kaggle.com/datasets/aslanahmedov/market-basket-analysis
Explore at:
Available download formats: zip (23875170 bytes)
Dataset updated
Dec 9, 2021
Authors
Aslan Ahmedov
Description
Market Basket Analysis

Market basket analysis with Apriori algorithm

The retailer wants to target customers with suggestions on the itemsets they are most likely to purchase. I was given a retailer's dataset; the transaction data covers all the transactions that happened over a period of time. The retailer will use the results to grow in its industry and to suggest itemsets to customers, which should increase customer engagement, improve customer experience, and help identify customer behavior. I will solve this problem using association rules, an unsupervised learning technique that checks for the dependency of one data item on another.

Introduction

Association rules are most often used to find associations between objects in a set, that is, frequent patterns in a transaction database. They can tell you which items customers frequently buy together, and they allow the retailer to identify relationships between items.

An Example of Association Rules

Assume there are 100 customers: 10 of them bought a computer mouse, 9 bought a mouse mat, and 8 bought both.
- Rule: bought computer mouse => bought mouse mat
- support = P(mouse & mat) = 8/100 = 0.08
- confidence = support / P(mouse) = 0.08/0.10 = 0.80
- lift = confidence / P(mat) = 0.80/0.09 = 8.9

This is just a simple example. In practice, a rule needs the support of several hundred transactions before it can be considered statistically significant, and datasets often contain thousands or millions of transactions.
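
As a quick check of the arithmetic, here is a minimal Python sketch of the three metrics under their standard definitions: confidence = P(A & B) / P(A) and lift = confidence / P(B) for a rule A => B. The function name rule_metrics and its parameters are illustrative assumptions, not part of the dataset or of any library.

```python
def rule_metrics(n_total, n_antecedent, n_consequent, n_both):
    """Return (support, confidence, lift) for the rule antecedent => consequent."""
    support = n_both / n_total                    # P(A & B)
    confidence = n_both / n_antecedent            # P(B | A)
    lift = confidence / (n_consequent / n_total)  # P(B | A) / P(B)
    return support, confidence, lift


# Counts from the example: 100 customers, 10 mouse buyers, 9 mat buyers, 8 both.
support, confidence, lift = rule_metrics(100, 10, 9, 8)
print(f"support={support:.2f} confidence={confidence:.2f} lift={lift:.2f}")
# prints: support=0.08 confidence=0.80 lift=8.89
```

A lift well above 1 suggests the two items are bought together far more often than independence would predict.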

Strategy

Data Import

Data Understanding and Exploration

Transformation of the data – so that it is ready to be consumed by the association rules algorithm

Running association rules

Exploring the rules generated

Filtering the generated rules

Visualization of the rules
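
The "running" and "filtering" steps of this strategy can be sketched compactly. The workflow described here uses the R package arules; what follows is instead a minimal pure-Python Apriori sketch on made-up baskets, not the author's code. The function names apriori and rules, the toy baskets, and both thresholds are assumptions chosen for illustration.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {frozenset(itemset): support} for every itemset meeting min_support."""
    n = len(transactions)
    baskets = [frozenset(t) for t in transactions]
    # Level 1 candidates are the individual items.
    candidates = {frozenset([item]) for basket in baskets for item in basket}
    frequent = {}
    k = 1
    while candidates:
        # Count the baskets that contain each candidate itemset.
        counts = {c: sum(1 for b in baskets if c <= b) for c in candidates}
        level = {c: cnt / n for c, cnt in counts.items() if cnt / n >= min_support}
        frequent.update(level)
        # Join frequent k-itemsets into (k+1)-candidates, then prune any
        # candidate with an infrequent k-subset (the Apriori property).
        k += 1
        joined = {a | b for a in level for b in level if len(a | b) == k}
        candidates = {
            c for c in joined
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        }
    return frequent


def rules(frequent, min_confidence):
    """Yield (antecedent, consequent, confidence, lift) for rules meeting min_confidence."""
    for itemset, support in frequent.items():
        for r in range(1, len(itemset)):
            for antecedent in map(frozenset, combinations(itemset, r)):
                consequent = itemset - antecedent
                confidence = support / frequent[antecedent]
                if confidence >= min_confidence:
                    yield antecedent, consequent, confidence, confidence / frequent[consequent]


# Hypothetical baskets, one list of item names per bill.
baskets = [
    ["mouse", "mat"],
    ["mouse", "mat", "keyboard"],
    ["mouse"],
    ["keyboard"],
    ["mouse", "mat"],
]
frequent = apriori(baskets, min_support=0.4)
found = list(rules(frequent, min_confidence=0.7))
for antecedent, consequent, confidence, lift in found:
    print(sorted(antecedent), "=>", sorted(consequent), round(confidence, 2), round(lift, 2))
```

Every subset of a frequent itemset is itself frequent, which is why the lookups frequent[antecedent] and frequent[consequent] are safe here.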

Dataset Description

File name: Assignment-1_Data

List name: retaildata

File format: .xlsx

Number of rows: 522065

Number of Attributes: 7

BillNo: 6-digit number assigned to each transaction. Nominal.

Itemname: Product name. Nominal.

Quantity: The quantities of each product per transaction. Numeric.

Date: The day and time when each transaction was generated. Numeric.

Price: Product price. Numeric.

CustomerID: 5-digit number assigned to each customer. Nominal.

Country: Name of the country where each customer resides. Nominal.

Libraries in R

First, we need to load the required libraries. Each one is briefly described below.

arules - Provides the infrastructure for representing, manipulating and analyzing transaction data and patterns (frequent itemsets and association rules).

arulesViz - Extends package 'arules' with various visualization techniques for association rules and itemsets. The package also includes several interactive visualizations for rule exploration.

tidyverse - The tidyverse is an opinionated collection of R packages designed for data science.

readxl - Read Excel Files in R.

plyr - Tools for Splitting, Applying and Combining Data.

ggplot2 - A system for 'declaratively' creating graphics, based on "The Grammar of Graphics". You provide the data, tell 'ggplot2' how to map variables to aesthetics, what graphical primitives to use, and it takes care of the details.

knitr - Dynamic Report generation in R.

magrittr - Provides a mechanism for chaining commands with a new forward-pipe operator, %>%. This operator will forward a value, or the result of an expression, into the next function call/expression. There is flexible support for the type of right-hand side expressions.

dplyr - A fast, consistent tool for working with data frame like objects, both in memory and out of memory.

tidyverse - This package is designed to make it easy to install and load multiple 'tidyverse' packages in a single step.

Data Pre-processing

Next, we need to load Assignment-1_Data.xlsx into R to read the dataset. Now we can see our data in R.

Next, we clean the data frame by removing missing values.

To apply association rule mining, we need to convert the data frame into transaction data, so that all items that are bought together in one invoice will be in ...
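
The cleaning and conversion steps can be illustrated in a few lines. This is a hedged Python sketch rather than the R code the description refers to: it drops rows with a missing Itemname and groups the remaining items by BillNo. The helper name to_transactions and the sample rows are hypothetical; only the column names BillNo and Itemname come from the dataset description.

```python
from collections import defaultdict

# Hypothetical rows in the documented layout (only two of the seven columns shown).
rows = [
    {"BillNo": "100001", "Itemname": "item A"},
    {"BillNo": "100001", "Itemname": "item B"},
    {"BillNo": "100002", "Itemname": "item A"},
    {"BillNo": "100002", "Itemname": None},  # missing value, dropped below
]


def to_transactions(rows):
    """Group item names by bill number, skipping rows with a missing item name."""
    baskets = defaultdict(set)
    for row in rows:
        item = row.get("Itemname")
        if item:
            baskets[row["BillNo"]].add(item.strip())
    return dict(baskets)


transactions = to_transactions(rows)
print(transactions)
```

Each value is the set of items on one bill, which is the shape an association rule algorithm expects as input.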
Market basket analysis
kaggle.com
zip
Updated Feb 17, 2024
Cite
Boopathi M (2024). Market basket analysis [Dataset]. https://www.kaggle.com/datasets/boopathi09945/market-basket-analysis
Explore at:
Available download formats: zip (72860 bytes)
Dataset updated
Feb 17, 2024
Authors
Boopathi M
Description
Explore market basket analysis with Python as we uncover hidden patterns and relationships within transactional data. Discover how algorithms like Apriori can reveal valuable insights into customer behavior, product associations, and purchasing trends. Explore the power of data-driven decision-making in retail, marketing, and beyond, as we navigate through the fascinating realm of market basket analysis.
Retail Transactions Dataset
kaggle.com
zip
Updated May 18, 2024
Cite
Prasad Patil (2024). Retail Transactions Dataset [Dataset]. https://www.kaggle.com/datasets/prasad22/retail-transactions-dataset/code
Explore at:
Available download formats: zip (37330179 bytes)
Dataset updated
May 18, 2024
Authors
Prasad Patil
License
https://creativecommons.org/publicdomain/zero/1.0/
Description
This dataset was created to simulate a market basket dataset, providing insights into customer purchasing behavior and store operations. The dataset facilitates market basket analysis, customer segmentation, and other retail analytics tasks. Here's more information about the context and inspiration behind this dataset:

Context:

Retail businesses, from supermarkets to convenience stores, are constantly seeking ways to better understand their customers and improve their operations. Market basket analysis, a technique used in retail analytics, explores customer purchase patterns to uncover associations between products, identify trends, and optimize pricing and promotions. Customer segmentation allows businesses to tailor their offerings to specific groups, enhancing the customer experience.

Inspiration:

The inspiration for this dataset comes from the need for accessible and customizable market basket datasets. While real-world retail data is sensitive and often restricted, synthetic datasets offer a safe and versatile alternative. Researchers, data scientists, and analysts can use this dataset to develop and test algorithms, models, and analytical tools.

Dataset Information:

The columns provide information about the transactions, customers, products, and purchasing behavior, making the dataset suitable for various analyses, including market basket analysis and customer segmentation. Here's a brief explanation of each column in the Dataset:

Transaction_ID: A unique identifier for each transaction, represented as a 10-digit number. This column is used to uniquely identify each purchase.

Date: The date and time when the transaction occurred. It records the timestamp of each purchase.

Customer_Name: The name of the customer who made the purchase. It provides information about the customer's identity.

Product: A list of products purchased in the transaction. It includes the names of the products bought.

Total_Items: The total number of items purchased in the transaction. It represents the quantity of products bought.

Total_Cost: The total cost of the purchase, in currency. It represents the financial value of the transaction.

Payment_Method: The method used for payment in the transaction, such as credit card, debit card, cash, or mobile payment.

City: The city where the purchase took place. It indicates the location of the transaction.

Store_Type: The type of store where the purchase was made, such as a supermarket, convenience store, department store, etc.

Discount_Applied: A binary indicator (True/False) representing whether a discount was applied to the transaction.

Customer_Category: A category representing the customer's background or age group.

Season: The season in which the purchase occurred, such as spring, summer, fall, or winter.

Promotion: The type of promotion applied to the transaction, such as "None," "BOGO (Buy One Get One)," or "Discount on Selected Items."

Use Cases:

Market Basket Analysis: Discover associations between products and uncover buying patterns.

Customer Segmentation: Group customers based on purchasing behavior.

Pricing Optimization: Optimize pricing strategies and identify opportunities for discounts and promotions.

Retail Analytics: Analyze store performance and customer trends.

Note: This dataset is entirely synthetic and was generated using the Python Faker library, which means it doesn't contain real customer data. It's designed for educational and research purposes.
Mall Customer Segmentation Data
kaggle.com
Updated Aug 11, 2018
+ more versions
Cite
Vijay Choudhary (2018). Mall Customer Segmentation Data [Dataset]. https://www.kaggle.com/datasets/vjchoudhary7/customer-segmentation-tutorial-in-python/discussion?sort=undefined
Explore at:
Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
Dataset updated
Aug 11, 2018
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Vijay Choudhary
Description
Context

This data set was created only for the purpose of learning customer segmentation concepts, also known as market basket analysis. I will demonstrate this by using an unsupervised ML technique (KMeans Clustering Algorithm) in its simplest form.

Content

You own a supermarket mall and, through membership cards, you have some basic data about your customers, such as Customer ID, age, gender, annual income and spending score. Spending Score is something you assign to the customer based on your defined parameters, such as customer behavior and purchasing data.

Problem Statement

You own the mall and want to understand which customers can be converted easily [Target Customers], so that this insight can be given to the marketing team to plan its strategy accordingly.

Acknowledgements

From Udemy's Machine Learning A-Z course.

I am new to the data science field and want to share my knowledge with others.

https://github.com/SteffiPeTaffy/machineLearningAZ/blob/master/Machine%20Learning%20A-Z%20Template%20Folder/Part%204%20-%20Clustering/Section%2025%20-%20Hierarchical%20Clustering/Mall_Customers.csv

Inspiration

By the end of this case study, you should be able to answer the questions below.
1- How to achieve customer segmentation using a machine learning algorithm (KMeans Clustering) in Python in the simplest way.
2- Who your target customers are, with whom you can start your marketing strategy [easy to convert].
3- How the marketing strategy works in the real world.
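
To make the clustering idea concrete, here is a minimal sketch of the KMeans procedure (Lloyd's algorithm) in plain Python. It is not the course's or the author's code; the four (annual income, spending score) pairs and the choice of starting centroids are made up for illustration, and a real analysis would more likely use a library implementation.

```python
import math


def kmeans(points, centroids, iterations=20):
    """Plain Lloyd's algorithm; returns (centroids, labels)."""
    centroids = [tuple(c) for c in centroids]
    labels = []
    for _ in range(iterations):
        # Assignment step: each point goes to its nearest centroid.
        labels = [
            min(range(len(centroids)), key=lambda i: math.dist(p, centroids[i]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for i, old in enumerate(centroids):
            members = [p for p, lab in zip(points, labels) if lab == i]
            if members:
                new_centroids.append(tuple(sum(x) / len(members) for x in zip(*members)))
            else:
                new_centroids.append(old)  # keep the centroid of an empty cluster
        if new_centroids == centroids:
            break  # converged
        centroids = new_centroids
    return centroids, labels


# Hypothetical (annual income, spending score) pairs.
customers = [(15, 10), (18, 14), (80, 85), (85, 90)]
centroids, labels = kmeans(customers, centroids=[customers[0], customers[2]])
print(labels)     # prints: [0, 0, 1, 1]
print(centroids)  # prints: [(16.5, 12.0), (82.5, 87.5)]
```

In this toy data the second cluster (high income, high spending score) would be the natural candidate for the "target customers" group.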

Online Retail & E-Commerce Dataset

kaggle.com

zip

Updated Mar 20, 2025

Cite

Ertuğrul EŞOL (2025). Online Retail & E-Commerce Dataset [Dataset]. https://www.kaggle.com/datasets/ertugrulesol/online-retail-data

Explore at:

Available download formats: zip (26067 bytes)

Dataset updated

Mar 20, 2025

Authors

Ertuğrul EŞOL

License

https://creativecommons.org/publicdomain/zero/1.0/

Description

Overview:

This dataset contains 1000 rows of synthetic online retail sales data, mimicking transactions from an e-commerce platform. It includes information about customer demographics, product details, purchase history, and (optional) reviews. This dataset is suitable for a variety of data analysis, data visualization and machine learning tasks, including but not limited to: customer segmentation, product recommendation, sales forecasting, market basket analysis, and exploring general e-commerce trends. The data was generated using the Python Faker library, ensuring realistic values and distributions, while maintaining no privacy concerns as it contains no real customer information.

Data Source:

This dataset is entirely synthetic. It was generated using the Python Faker library and does not represent any real individuals or transactions.

Data Content:

Column Name	Data Type	Description
`customer_id`	Integer	Unique customer identifier (ranging from 10000 to 99999)
`order_date`	Date	Order date (a random date within the last year)
`product_id`	Integer	Product identifier (ranging from 100 to 999)
`category_id`	Integer	Product category identifier (10, 20, 30, 40, or 50)
`category_name`	String	Product category name (Electronics, Fashion, Home & Living, Books & Stationery, Sports & Outdoors)
`product_name`	String	Product name (randomly selected from a list of products within the corresponding category)
`quantity`	Integer	Quantity of the product ordered (ranging from 1 to 5)
`price`	Float	Unit price of the product (ranging from 10.00 to 500.00, with two decimal places)
`payment_method`	String	Payment method used (Credit Card, Bank Transfer, Cash on Delivery)
`city`	String	Customer's city (generated using Faker's `city()` method, so the locations will depend on the Faker locale you used)
`review_score`	Integer	Customer's product rating (ranging from 1 to 5, or None with a 20% probability)
`gender`	String	Customer's gender (M/F, or None with a 10% probability)
`age`	Integer	Customer's age (ranging from 18 to 75)

Potential Use Cases (Inspiration):

Customer Segmentation: Group customers based on demographics, purchasing behavior, and preferences.

Product Recommendation: Build a recommendation system to suggest products to customers based on their past purchases and browsing history.

Sales Forecasting: Predict future sales based on historical trends.

Market Basket Analysis: Identify products that are frequently purchased together.

Price Optimization: Analyze the relationship between price and demand.

Geographic Analysis: Explore sales patterns across different cities.

Time Series Analysis: Investigate sales trends over time.

Educational Purposes: Great for practicing data cleaning, EDA, feature engineering, and modeling.

Store Sales Dataset
kaggle.com
zip
Updated Sep 22, 2025
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Nimisha Davis (2025). Store Sales Dataset [Dataset]. https://www.kaggle.com/datasets/drnimishadavis/store-sales-dataset
Explore at:
zip(562846 bytes)Available download formats
Dataset updated
Sep 22, 2025
Authors
Nimisha Davis
Description
This dataset contains retail sales records from a superstore, including detailed information on orders, products, categories, sales, discounts, profits, customers, and regions.

It is widely used for business intelligence, data visualization, and machine learning projects. With features such as order date, ship mode, customer segment, and geographic region, the dataset is excellent for:

Sales forecasting

Profitability analysis

Market basket analysis

Customer segmentation

Data visualization practice (Tableau, Power BI, Excel, Python, R)

Inspiration:

Great dataset for learning how to build dashboards.

Commonly used in case studies for predictive analytics and decision-making.

Source: Originally inspired by a sample dataset frequently used in Tableau training and BI case studies.

E-commerce_dataset

kaggle.com

zip

Updated Nov 16, 2025

Cite

Abhay Ayare (2025). E-commerce_dataset [Dataset]. https://www.kaggle.com/datasets/abhayayare/e-commerce-dataset

Explore at:

Available download formats: zip (644123 bytes)

Dataset updated

Nov 16, 2025

Authors

Abhay Ayare

License

Attribution-ShareAlike 4.0 (CC BY-SA 4.0) https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically

Description

E-commerce_dataset

This dataset is a synthetic yet realistic E-commerce retail dataset generated programmatically using Python (Faker + NumPy + Pandas).
It is designed to closely mimic real-world online shopping behavior, user patterns, product interactions, seasonal trends, and marketplace events.

You can use this dataset for:

Machine Learning & Deep Learning
Recommender Systems
Customer Segmentation
Sales Forecasting
A/B Testing
E-commerce Behaviour Analysis
Data Cleaning / Feature Engineering Practice
SQL practice

📁Dataset Contents

The dataset contains 6 CSV files:
~~~
File             Rows     Description
users.csv        ~10,000  User profiles, demographics & signup info
products.csv     ~2,000   Product catalog with rating and pricing
orders.csv       ~20,000  Order-level transactions
order_items.csv  ~60,000  Items purchased per order
reviews.csv      ~15,000  Customer-written product reviews
events.csv       ~80,000  User event logs: view, cart, wishlist, purchase
~~~

🧬 Data Dictionary

1. Users (users.csv)
Column Description
user_id Unique user identifier
name  Full customer name
email  Email (synthetic, no real emails)
gender Male / Female / Other
city  City of residence
signup_date Account creation date

2. Products (products.csv)
Column Description
product_id Unique product identifier
product_name  Product title
category  Electronics, Clothing, Beauty, Home, Sports, etc.
price  Actual selling price
rating Average product rating

3. Orders (orders.csv)
Column Description
order_id  Unique order identifier
user_id User who placed the order
order_date Timestamp of the order
order_status  Completed / Cancelled / Returned
total_amount  Total order value

4. Order Items (order_items.csv)
Column Description
order_item_id  Unique identifier
order_id  Associated order
product_id Purchased product
quantity  Quantity purchased
item_price Price per unit

5. Reviews (reviews.csv)
Column Description
review_id  Unique review identifier
user_id User who submitted review
product_id Reviewed product
rating 1–5 star rating
review_text Short synthetic review
review_date Submission date

6. Events (events.csv)
Column Description
event_id  Unique event identifier
user_id User performing event
product_id Viewed/added/purchased product
event_type view/cart/wishlist/purchase
event_timestamp Timestamp of event
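
The data dictionary above implies a relational layout: order_items links to orders through order_id. As a hedged illustration, the sketch below builds two of the tables in an in-memory SQLite database from Python, using the documented column names (a subset for orders) and a few made-up rows, then runs a join with aggregation and a self-join that counts product pairs per order. The column types and the sample values are assumptions; the real CSV files would be loaded instead.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders (order_id INTEGER PRIMARY KEY, user_id INTEGER, order_status TEXT);
CREATE TABLE order_items (order_item_id INTEGER PRIMARY KEY, order_id INTEGER,
                          product_id INTEGER, quantity INTEGER, item_price REAL);
-- Made-up rows for illustration only.
INSERT INTO orders VALUES (1, 10, 'Completed'), (2, 11, 'Cancelled');
INSERT INTO order_items VALUES
    (1, 1, 100, 2, 5.0), (2, 1, 101, 1, 20.0), (3, 2, 100, 1, 5.0);
""")

# Join + aggregation: recompute the value of each completed order from its items.
order_totals = conn.execute("""
    SELECT o.order_id, SUM(oi.quantity * oi.item_price) AS computed_total
    FROM orders AS o
    JOIN order_items AS oi ON oi.order_id = o.order_id
    WHERE o.order_status = 'Completed'
    GROUP BY o.order_id
    ORDER BY o.order_id
""").fetchall()

# Self-join: count how often two products appear in the same order,
# which is the starting point for market basket analysis.
pair_counts = conn.execute("""
    SELECT a.product_id, b.product_id, COUNT(*) AS orders_together
    FROM order_items AS a
    JOIN order_items AS b
      ON a.order_id = b.order_id AND a.product_id < b.product_id
    GROUP BY a.product_id, b.product_id
""").fetchall()

print(order_totals)  # prints: [(1, 30.0)]
print(pair_counts)   # prints: [(100, 101, 1)]
```

The condition a.product_id < b.product_id keeps each unordered pair once and excludes a product paired with itself.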

🧠 Possible Use Cases (Ideas & Projects)

🔍 Machine Learning

Customer churn prediction
Review sentiment analysis (NLP)
Recommendation engines
Price optimization models
Demand forecasting (Time-series)

📦 Business Analytics

Market basket analysis
RFM segmentation
Cohort analysis
Funnel conversion tracking
A/B testing simulations

🧮 SQL Practice

Joins
Window functions
Aggregations
CTE-based funnels
Complex queries

🛠 How the Dataset Was Generated

The dataset was generated entirely in Python using:

Faker for realistic user and review generation
NumPy for probability-based event modeling
Pandas for data processing

Custom logic for:

demand variation
user behavior simulation
return/cancel probabilities
seasonal order timestamp distribution
The dataset does not include any real personal data.
Everything is generated synthetically.

⚠️ License

This dataset is released under CC BY 4.0 — free to use for:
Research
Education
Commercial projects
Kaggle competitions
Machine learning pipelines
Just provide attribution.

⭐ If you found this dataset helpful, please:

Upvote the dataset
Leave a comment
Share your notebooks using it

Mall_CustomerData_with_Nulls
kaggle.com
zip
Updated Nov 21, 2022
Cite
panda (2022). Mall_CustomerData_with_Nulls [Dataset]. https://www.kaggle.com/datasets/jangid6/mall-customerdata-with-nulls
Explore at:
Available download formats: zip (1603 bytes)
Dataset updated
Nov 21, 2022
Authors
panda
Description
This is an extension of the public dataset https://www.kaggle.com/datasets/vjchoudhary7/customer-segmentation-tutorial-in-python

Changelog:
- I have explicitly added NaN/Nulls to the Annual Income & Spending Score.
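
Since the nulls were added for practice, one common way to handle them is median imputation. The sketch below is an assumption-laden illustration, not part of the dataset: the helper impute_median, the exact column key "Annual Income", and the sample values are hypothetical, and missing values are represented as None.

```python
from statistics import median


def impute_median(rows, column):
    """Replace None in `column` with the median of the observed values."""
    observed = [r[column] for r in rows if r[column] is not None]
    fill = median(observed)
    return [{**r, column: fill if r[column] is None else r[column]} for r in rows]


# Hypothetical customer rows with one missing income.
rows = [
    {"CustomerID": 1, "Annual Income": 15},
    {"CustomerID": 2, "Annual Income": None},
    {"CustomerID": 3, "Annual Income": 17},
    {"CustomerID": 4, "Annual Income": 19},
]
cleaned = impute_median(rows, "Annual Income")
print([r["Annual Income"] for r in cleaned])  # prints: [15, 17, 17, 19]
```

Dropping the affected rows is the other common option; which is preferable depends on how many values are missing.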

About Dataset

Context

This data set was created only for the purpose of learning customer segmentation concepts, also known as market basket analysis. I will demonstrate this by using an unsupervised ML technique (KMeans Clustering Algorithm) in its simplest form.

Content

You own a supermarket mall and, through membership cards, you have some basic data about your customers, such as Customer ID, age, gender, annual income and spending score. Spending Score is something you assign to the customer based on your defined parameters, such as customer behavior and purchasing data.

Problem Statement

You own the mall and want to understand which customers can be converted easily [Target Customers], so that this insight can be given to the marketing team to plan its strategy accordingly.

Acknowledgements

From Udemy's Machine Learning A-Z course.

I am new to the data science field and want to share my knowledge with others.

https://github.com/SteffiPeTaffy/machineLearningAZ/blob/master/Machine%20Learning%20A-Z%20Template%20Folder/Part%204%20-%20Clustering/Section%2025%20-%20Hierarchical%20Clustering/Mall_Customers.csv

Inspiration

By the end of this case study, you should be able to answer the questions below.
1- How to achieve customer segmentation using a machine learning algorithm (KMeans Clustering) in Python in the simplest way.
2- Who your target customers are, with whom you can start your marketing strategy [easy to convert].
3- How the marketing strategy works in the real world.
👟 Sneakers & Streetwear Sales (2022)
kaggle.com
zip
Updated Jul 29, 2025
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Atharva Soundankar (2025). 👟 Sneakers & Streetwear Sales (2022) [Dataset]. https://www.kaggle.com/datasets/atharvasoundankar/sneakers-and-streetwear-sales-2022
Explore at:
Available download formats: zip (6320 bytes)
Dataset updated
Jul 29, 2025
Authors
Atharva Soundankar
License
Apache License, v2.0 https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
Description
👟 Sneakers & Streetwear Sales Data (2022)

This dataset offers a detailed snapshot of global retail sales from the fast-growing sneaker and streetwear market between January and August 2022. It captures essential sales insights from multiple countries, spanning brands like Nike, Adidas, Supreme, Yeezy, and Off-White, along with high-demand categories such as sneakers, hoodies, joggers, and graphic tees.

The data has been carefully simulated to mirror real-world patterns in retail e-commerce — including seasonality, gender preferences, price bands, and payment behaviors. Each record represents a successful transaction, making this dataset ideal for sales analytics, business intelligence projects, and predictive modeling.

📦 What’s Inside?

500 clean, non-null, and unique sales records

Covering 10 countries and 30+ product names

Fields include: Order Date, Country, Gender, Product, Category, Quantity Sold, Unit Price, Total Sale, and Payment Method

💡 Why This Dataset?

Sneakers and streetwear aren't just fashion — they're a data-rich ecosystem of global trends, influencer impact, resale value, and cultural relevance. Whether you're working on:

EDA & trend visualization

Time-series forecasting

Market basket analysis

Customer segmentation

Sales dashboards

… this dataset gives you everything you need to explore, model, and tell a data story.

✅ Key Features

Realistic sales simulation for 2022

Useful for beginners and advanced practitioners alike

Cleaned and curated — ready for analysis, dashboards, and ML

Ideal for Power BI, Tableau, Python (Pandas, Seaborn, Plotly), and ML libraries
Online Retail Transaction Records
kaggle.com
zip
Updated Dec 21, 2023
Cite
The Devastator (2023). Online Retail Transaction Records [Dataset]. https://www.kaggle.com/datasets/thedevastator/online-retail-transaction-records
Explore at:
Available download formats: zip (9098240 bytes)
Dataset updated
Dec 21, 2023
Authors
The Devastator
Description
Online Retail Transaction Records

Online Retail Sales: Product Transactions and Customer Details

By Ali Prasla [source]

About this dataset

The Online Retail Sales Dataset, often referred to as the Online Retail.csv file, is an extensive and comprehensive collection of data points relating to e-commerce transactions. This dataset provides a detailed view of sales activities within the online retail sector, covering numerous essential attributes necessary for a quantitative understanding of consumer behavior and the overall business performance.

One of the key elements covered in this dataset is 'InvoiceNo', which is a unique identifier for each transaction taking place in this retail environment. Given its uniqueness, it serves as a primary key for distinguishing individual transactions. It's worthwhile to note that these Invoice Numbers are numerical values.

Another important attribute included here is 'StockCode'. Each product listed or sold on this online retail platform has been assigned with its unique identification code or StockCode. These codes are also numerical values that offer another layer to clearly classify items and distinguish one from another.

For further understanding, every product comes with a basic description noted under the 'Description' column. In textual form, these descriptions provide insights into what exactly each product item entails. Aside from aiding identification efforts, they can potentially open avenues for text-based analysis such as sentiment analysis or keyword flagging based on product trends.

Moving on to the details of the transactions themselves, we have two crucial columns: 'Quantity' and 'UnitPrice'. As their names suggest, these show, respectively, how many units of an item were sold per transaction and the price per unit at which they were sold.

Further adding detail to our transactions information comes 'InvoiceDate', which records when each separate purchase occurred down to accurate date & time records. This data can be pivotal in recognizing sales patterns throughout different periods or predicting future trends based on historical timing behavior.

Last but not least comes our global indicator: the 'Country' column specifies the countries of residence of the customers who make purchases on this online platform. It gives us insight into the geographical dispersion of the user base across countries, potentially revealing regional preferences or global market segmentation.

With such a wealth of detailed transaction records and customer information, the Online Retail.csv dataset stands as an invaluable tool for those looking to delve deep into online retail sales data analysis. The possibilities with this dataset are vast, ranging from shaping efficient marketing strategies based on geographical data to predicting sales and growth metrics using historical behavior, and much more.

How to use the dataset

Here's how to make best use of this dataset:

Getting Started

Before you start analyzing your data, you'll have to load it into statistical software such as Python (using the pandas library) or R. The dataset is saved in .csv format, which most data manipulation software can read easily.

Understand The Fields

InvoiceNo: Each transaction made has an associated unique numerical identifier called InvoiceNo. Consider it like a receipt code - these allow for tracking individual transactions.

StockCode: To identify each product uniquely during analysis, refer to each StockCode value which is essentially a product identification code.

Description: A brief textual description about each product that can be invaluable when dealing with categories for market-basket type analysis.

Quantity: Each row lists out how many units of a particular item were involved in a single transaction - watch out for very large values as they might represent bulk orders.
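
A short sketch of how these fields might be used together, with Python's standard csv module instead of pandas so it runs without extra packages. The column order, the InvoiceDate format, the sample rows, and the bulk-order threshold of 500 units are all assumptions for illustration; only the column names come from the description above.

```python
import csv
import io
from collections import defaultdict

# Hypothetical extract in the documented column layout.
sample = io.StringIO(
    "InvoiceNo,StockCode,Description,Quantity,InvoiceDate,UnitPrice,Country\n"
    "1001,2001,ITEM A,6,2023-01-01 08:26,2.50,United Kingdom\n"
    "1001,2002,ITEM B,2,2023-01-01 08:26,4.00,United Kingdom\n"
    "1002,2001,ITEM A,1000,2023-01-02 10:00,2.00,France\n"
)

BULK_THRESHOLD = 500  # assumed cut-off for "very large" quantities

revenue_by_country = defaultdict(float)
bulk_invoices = []
for row in csv.DictReader(sample):
    quantity = int(row["Quantity"])
    # Revenue of one line item is Quantity times UnitPrice.
    revenue_by_country[row["Country"]] += quantity * float(row["UnitPrice"])
    if quantity > BULK_THRESHOLD:
        bulk_invoices.append(row["InvoiceNo"])

print(dict(revenue_by_country))  # prints: {'United Kingdom': 23.0, 'France': 2000.0}
print(bulk_invoices)             # prints: ['1002']
```

Flagging bulk orders separately keeps a handful of very large rows from dominating per-country totals.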

Market Basket Analysis

Analyzing Consumer Behaviour Using MBA Association Rule Mining

Explore at:

2 scholarly articles cite this dataset (View in Google Scholar)

zip(23875170 bytes)Available download formats

Dataset updated

Dec 9, 2021

Authors

Aslan Ahmedov

Description

Market Basket Analysis

Market basket analysis with Apriori algorithm

The retailer wants to target customers with suggestions on itemset that a customer is most likely to purchase .I was given dataset contains data of a retailer; the transaction data provides data around all the transactions that have happened over a period of time. Retailer will use result to grove in his industry and provide for customer suggestions on itemset, we be able increase customer engagement and improve customer experience and identify customer behavior. I will solve this problem with use Association Rules type of unsupervised learning technique that checks for the dependency of one data item on another data item.

Introduction

Association Rule is most used when you are planning to build association in different objects in a set. It works when you are planning to find frequent patterns in a transaction database. It can tell you what items do customers frequently buy together and it allows retailer to identify relationships between the items.

An Example of Association Rules

Assume there are 100 customers, 10 of them bought Computer Mouth, 9 bought Mat for Mouse and 8 bought both of them. - bought Computer Mouth => bought Mat for Mouse - support = P(Mouth & Mat) = 8/100 = 0.08 - confidence = support/P(Mat for Mouse) = 0.08/0.09 = 0.89 - lift = confidence/P(Computer Mouth) = 0.89/0.10 = 8.9 This just simple example. In practice, a rule needs the support of several hundred transactions, before it can be considered statistically significant, and datasets often contain thousands or millions of transactions.

Strategy

Data Import
Data Understanding and Exploration
Transformation of the data – so that is ready to be consumed by the association rules algorithm
Running association rules
Exploring the rules generated
Filtering the generated rules
Visualization of Rule

Dataset Description

File name: Assignment-1_Data
List name: retaildata
File format: . xlsx
Number of Row: 522065
Number of Attributes: 7
- BillNo: 6-digit number assigned to each transaction. Nominal.
- Itemname: Product name. Nominal.
- Quantity: The quantities of each product per transaction. Numeric.
- Date: The day and time when each transaction was generated. Numeric.
- Price: Product price. Numeric.
- CustomerID: 5-digit number assigned to each customer. Nominal.
- Country: Name of the country where each customer resides. Nominal.

https://user-images.githubusercontent.com/91852182/145270162-fc53e5a3-4ad1-4d06-b0e0-228aabcf6b70.png

Libraries in R

First, we need to load the required libraries. Each one is briefly described below.

arules - Provides the infrastructure for representing, manipulating and analyzing transaction data and patterns (frequent itemsets and association rules).
arulesViz - Extends package 'arules' with various visualization techniques for association rules and itemsets. The package also includes several interactive visualizations for rule exploration.
tidyverse - An opinionated collection of R packages designed for data science. This package makes it easy to install and load multiple 'tidyverse' packages in a single step.
readxl - Reads Excel files in R.
plyr - Tools for splitting, applying and combining data.
ggplot2 - A system for 'declaratively' creating graphics, based on "The Grammar of Graphics". You provide the data, tell 'ggplot2' how to map variables to aesthetics, what graphical primitives to use, and it takes care of the details.
knitr - Dynamic report generation in R.
magrittr - Provides a mechanism for chaining commands with a forward-pipe operator, %>%. This operator forwards a value, or the result of an expression, into the next function call or expression. There is flexible support for the type of right-hand side expressions.
dplyr - A fast, consistent tool for working with data frame like objects, both in memory and out of memory.
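A possible way to attach the packages listed above is sketched here. The original code is only available as a screenshot, so the order and the installation check are assumptions. plyr is listed first because attaching it after dplyr masks several dplyr functions.

```r
# Packages named in the list above; plyr goes before the tidyverse/dplyr
pkgs <- c("plyr", "arules", "arulesViz", "tidyverse", "readxl",
          "ggplot2", "knitr", "magrittr", "dplyr")

# Check which packages are installed before attaching them
installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
if (any(!installed)) {
  message("Not installed: ", paste(pkgs[!installed], collapse = ", "))
}

# Attach every package that is available
for (p in pkgs[installed]) {
  library(p, character.only = TRUE)
}
```

Any package reported as missing can be installed with install.packages() before rerunning the block.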

https://user-images.githubusercontent.com/91852182/145270210-49c8e1aa-9753-431b-a8d5-99601bc76cb5.png

Data Pre-processing

Next, we need to load Assignment-1_Data.xlsx into R to read the dataset. Now we can see our data in R.
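Reading the workbook might look like the sketch below. It assumes the file sits in the current working directory and that the data is on the first sheet. Neither is confirmed by the description, and the object name retail is chosen here for illustration.

```r
library(readxl)

# Assumed location: the current working directory
path <- "Assignment-1_Data.xlsx"

if (file.exists(path)) {
  # read_excel() reads the first sheet by default
  retail <- read_excel(path)
  str(retail)         # column names and types
  print(head(retail)) # first few rows
} else {
  message("File not found: ", path)
}
```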

https://user-images.githubusercontent.com/91852182/145270229-514f0983-3bbb-4cd3-be64-980e92656a02.png
https://user-images.githubusercontent.com/91852182/145270251-6f6f6472-8817-435c-a995-9bc4bfef10d1.png

Next, we clean our data frame by removing missing values.
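Removing missing values can be illustrated in base R on a small stand-in data frame. The rows below are hypothetical and only borrow column names from the dataset description. Which columns the original analysis filtered on is not stated, so this sketch simply drops every row containing any NA.

```r
# Hypothetical stand-in for the retail data
retail <- data.frame(
  BillNo = c("536365", "536365", "536366", NA),
  Itemname = c("WHITE MUG", NA, "RED LAMP", "BLUE PLATE"),
  Quantity = c(6, 2, 1, 3),
  stringsAsFactors = FALSE
)

# Count missing values per column before cleaning
print(colSums(is.na(retail)))

# Keep only rows with no missing values in any column
retail_clean <- retail[complete.cases(retail), ]
print(nrow(retail_clean))
```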

https://user-images.githubusercontent.com/91852182/145270286-05854e1a-2b6c-490e-ab30-9e99e731eacb.png

To apply association rule mining, we need to convert the data frame into transaction data, so that all items that are bought together in one invoice will be in ...
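One common way to do this conversion with arules is to group item names by bill number and coerce the resulting list to a transactions object. This is a sketch under assumptions: the rows are hypothetical, and the original description is cut off before it shows the approach actually used.

```r
library(arules)

# Hypothetical rows using the BillNo and Itemname columns
retail <- data.frame(
  BillNo = c("536365", "536365", "536366", "536366", "536366"),
  Itemname = c("WHITE MUG", "RED LAMP", "WHITE MUG", "BLUE PLATE", "RED LAMP"),
  stringsAsFactors = FALSE
)

# One character vector of item names per bill
baskets <- split(retail$Itemname, retail$BillNo)

# arules expects each item at most once per transaction
baskets <- lapply(baskets, unique)

# Coerce the list of baskets to a transactions object
trans <- as(baskets, "transactions")
print(summary(trans))
```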
