Market basket analysis with Apriori algorithm
The retailer wants to target customers with suggestions on the itemsets they are most likely to purchase. I was given a retailer's dataset; the transaction data covers all the transactions that happened over a period of time. The retailer will use the results to grow the business and to suggest itemsets to customers, which should increase customer engagement, improve the customer experience, and help identify customer behavior. I will solve this problem using association rules, a type of unsupervised learning technique that checks for the dependency of one data item on another data item.
Association rule mining is most often used to find associations between different objects in a set, that is, to find frequent patterns in a transaction database. It can tell you which items customers frequently buy together, and it allows the retailer to identify relationships between the items.
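To make "finding frequent patterns" concrete, here is a minimal sketch of the Apriori idea in plain Python. The walkthrough itself works in R; this block is only an illustration, and the function name `apriori`, the sample baskets, and the 0.6 threshold are assumptions made for the example, not taken from the dataset.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: support} for every itemset meeting min_support."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)
    # Level 1: every single item is a candidate itemset.
    candidates = {frozenset([item]) for t in transactions for item in t}
    frequent = {}
    k = 1
    while candidates:
        # Count the transactions that contain each candidate.
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        level = {c: cnt / n for c, cnt in counts.items() if cnt / n >= min_support}
        frequent.update(level)
        # Join step: merge frequent k-itemsets into (k+1)-item candidates.
        k += 1
        joined = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: keep a candidate only if all of its smaller subsets are frequent.
        candidates = {
            c for c in joined
            if all(frozenset(s) in level for s in combinations(c, k - 1))
        }
    return frequent


# Hypothetical baskets, invented for illustration.
baskets = [
    {"mouse", "mat"},
    {"mouse", "mat", "keyboard"},
    {"mouse", "keyboard"},
    {"mat"},
    {"mouse", "mat"},
]
frequent = apriori(baskets, min_support=0.6)
for itemset, support in sorted(frequent.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), support)
```

The prune step is what gives Apriori its efficiency: an itemset can only be frequent if every one of its subsets is frequent, so most larger candidates are discarded without being counted.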
Assume there are 100 customers: 10 of them bought a computer mouse, 9 bought a mouse mat, and 8 bought both. For the rule "bought computer mouse => bought mouse mat":
- support = P(mouse & mat) = 8/100 = 0.08
- confidence = support/P(mouse) = 0.08/0.10 = 0.80
- lift = confidence/P(mat) = 0.80/0.09 = 8.89
This is just a simple example. In practice, a rule needs the support of several hundred transactions before it can be considered statistically significant, and datasets often contain thousands or millions of transactions.
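The arithmetic for this example can be checked with a few lines of Python. This is a sketch using the standard definitions for a rule A => B, where confidence divides by the frequency of the antecedent A and lift divides confidence by the frequency of the consequent B. The function name `rule_metrics` is an assumption for illustration.

```python
def rule_metrics(n_total, n_a, n_b, n_both):
    """Support, confidence and lift for the rule A => B from raw counts."""
    support = n_both / n_total           # P(A and B)
    confidence = n_both / n_a            # P(B | A) = P(A and B) / P(A)
    lift = confidence / (n_b / n_total)  # P(B | A) / P(B)
    return support, confidence, lift


# 100 customers, 10 bought a mouse (A), 9 bought a mat (B), 8 bought both.
support, confidence, lift = rule_metrics(n_total=100, n_a=10, n_b=9, n_both=8)
print(f"support={support:.2f} confidence={confidence:.2f} lift={lift:.2f}")
```

A lift well above 1, as here, means the two items are bought together far more often than would be expected if the purchases were independent.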
Number of Attributes: 7
https://user-images.githubusercontent.com/91852182/145270162-fc53e5a3-4ad1-4d06-b0e0-228aabcf6b70.png
First, we need to load the required libraries. I briefly describe each library.
https://user-images.githubusercontent.com/91852182/145270210-49c8e1aa-9753-431b-a8d5-99601bc76cb5.png
Next, we need to load Assignment-1_Data.xlsx into R to read the dataset. Now we can see our data in R.
https://user-images.githubusercontent.com/91852182/145270229-514f0983-3bbb-4cd3-be64-980e92656a02.png
https://user-images.githubusercontent.com/91852182/145270251-6f6f6472-8817-435c-a995-9bc4bfef10d1.png
Next, we clean the data frame by removing missing values.
https://user-images.githubusercontent.com/91852182/145270286-05854e1a-2b6c-490e-ab30-9e99e731eacb.png
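The walkthrough does this step in R (the code is only visible in the screenshot). As a language-neutral illustration of dropping rows with missing values, here is a small Python sketch. The rows and the two fields (invoice id, item name) are hypothetical and do not reflect the dataset's real columns.

```python
# Hypothetical line items as (invoice_id, item_name); None marks a missing value.
rows = [
    ("1001", "computer mouse"),
    ("1001", None),        # missing item name
    (None, "keyboard"),    # missing invoice id
    ("1002", "mouse mat"),
]

# Keep only the rows in which every field is present.
clean_rows = [row for row in rows if all(value is not None for value in row)]
print(clean_rows)
```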
To apply association rule mining, we need to convert the data frame into transaction data, so that all items that are bought together in one invoice will be in ...
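The conversion described here amounts to grouping line items by invoice. The sketch below shows that idea in Python rather than the R used in the walkthrough. The function name `to_transactions`, the invoice ids, and the item names are all assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical line items as (invoice_id, item_name), one row per purchased item.
line_items = [
    ("1001", "computer mouse"),
    ("1001", "mouse mat"),
    ("1002", "keyboard"),
    ("1001", "computer mouse"),  # a repeated item collapses into the same basket
    ("1003", "mouse mat"),
]


def to_transactions(items):
    """Group line items into one set of distinct items per invoice."""
    baskets = defaultdict(set)
    for invoice_id, item_name in items:
        baskets[invoice_id].add(item_name)
    return dict(baskets)


transactions = to_transactions(line_items)
for invoice_id, basket in transactions.items():
    print(invoice_id, sorted(basket))
```

Using a set per invoice reflects the usual market-basket assumption that only the presence of an item in a basket matters, not its quantity.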
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset contains simulated object-centric event logs for four distinct business processes: Order-to-Cash (O2C), Procure-to-Pay (P2P), Hiring, and Hospital Patient Lifecycle. Each process is designed to reflect realistic workflows, encompassing multiple object types and capturing key activities, decision points, and process dynamics. The dataset is aimed at providing a rich source of data for process mining, analysis, and modeling activities.
1. Order-to-Cash (O2C):
The O2C process simulates an end-to-end business flow starting from customer order placement to payment receipt. It includes diverse activities such as order approval, fulfillment, invoice generation, and payment processing, involving object types like Customers, Orders, Products, and Invoices. The dataset captures variability through random decisions, synchronization between departments, and workarounds in credit checks and inventory adjustments. Attributes such as customer tiers, order values, and shipment statuses add further depth, allowing for detailed analysis of this complex process.
2. Procure-to-Pay (P2P):
The P2P process simulates the procurement lifecycle, from requisition creation to payment of suppliers. Key activities include purchase order creation, three-way matching, goods receipt, and payment processing. The event log records object types such as Purchase Requisitions, Purchase Orders, Suppliers, and Invoices. Variability is introduced through approval decisions, batching, and potential mismatches in the matching process. The dataset represents the inherent complexities of real-world procurement operations, including batching and synchronization issues between different process stages.
3. Hiring Process:
The hiring process log tracks the recruitment lifecycle, from job requisition creation to onboarding. It includes object types like Candidates, Job Requisitions, Recruiters, and Interviewers. The process covers activities such as resume screening, interviews, assessments, and offer management. Variability in the hiring process is introduced through random delays, candidate decisions, and background check durations. Batching occurs in stages like resume screening and onboarding, while synchronization challenges arise during interview scheduling.
4. Hospital Patient Lifecycle:
This log represents the lifecycle of patients within a hospital, capturing interactions with multiple resources such as physicians, beds, and medical equipment. The process begins with pre-admission activities, followed by diagnosis, treatment, and discharge. The dataset includes object types like Patients, Physicians, and Medical Equipment, with attributes related to patient demographics and event severity. The process reflects the dynamic nature of hospital operations, including synchronization of resources and the occurrence of workarounds in case of delays or resource unavailability.
Each process simulation captures high variability, synchronization issues, and batching, making this dataset suitable for analyzing real-world operational challenges. The logs provide a comprehensive view of complex workflows, supporting advanced analysis, including object-centric process mining.
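To illustrate what "object-centric" means in practice, the following Python sketch models events that each reference several objects and then projects the log onto a single object type. It is a simplified in-memory illustration, not the dataset's actual file format. The `Event` class, the `flatten` helper, and all ids, timestamps, and activity names are hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class Event:
    event_id: str
    activity: str
    timestamp: str  # ISO 8601 strings sort chronologically
    objects: dict = field(default_factory=dict)  # object type -> list of object ids


def flatten(events, object_type):
    """Project the log onto one object type: object id -> ordered activities."""
    traces = defaultdict(list)
    for ev in sorted(events, key=lambda e: e.timestamp):
        for obj_id in ev.objects.get(object_type, []):
            traces[obj_id].append(ev.activity)
    return dict(traces)


# A tiny made-up Order-to-Cash fragment; one event may touch several objects.
log = [
    Event("e1", "Place Order", "2024-01-01T09:00:00", {"Order": ["o1"], "Customer": ["c1"]}),
    Event("e2", "Approve Order", "2024-01-01T10:00:00", {"Order": ["o1"]}),
    Event("e3", "Generate Invoice", "2024-01-02T08:30:00", {"Order": ["o1"], "Invoice": ["i1"]}),
    Event("e4", "Receive Payment", "2024-01-05T14:00:00", {"Invoice": ["i1"], "Customer": ["c1"]}),
]

order_traces = flatten(log, "Order")
invoice_traces = flatten(log, "Invoice")
print(order_traces)
print(invoice_traces)
```

The same event ("Generate Invoice") appears in both the order view and the invoice view, which is the kind of cross-object relationship that a traditional single-case event log cannot represent directly.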
This description will provide the necessary details about the dataset, highlighting its structure, purpose, and potential uses for researchers and process analysts.
Object-centric event logs conceived and simulated by the o1-preview-2024-09-12 LRM, using the https://github.com/fit-alessandro-berti/llm-ocel-simulator project.