100+ datasets found
  1. Customer Purchases Behaviour Dataset

    • kaggle.com
    zip
    Updated Apr 6, 2024
    Cite
    Sanyam Goyal (2024). Customer Purchases Behaviour Dataset [Dataset]. https://www.kaggle.com/datasets/sanyamgoyal401/customer-purchases-behaviour-dataset
    Explore at:
    zip (1524741 bytes)
    Dataset updated
    Apr 6, 2024
    Authors
    Sanyam Goyal
    License

    Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
    License information was derived automatically

    Description

    Subtitle:

    Simulated Dataset of Customer Purchase Behavior

    Description:

    This dataset contains simulated data representing customer purchase behavior. It includes various features such as age, gender, income, education, region, loyalty status, purchase frequency, purchase amount, product category, promotion usage, and satisfaction score.

    File Information:

    • File Format: CSV
    • Number of Rows: 100000
    • Number of Columns: 12

    Column Descriptors:

    • age: Age of the customer.
    • gender: Gender of the customer (0 for Male, 1 for Female).
    • income: Annual income of the customer.
    • education: Education level of the customer.
    • region: Region where the customer resides.
    • loyalty_status: Loyalty status of the customer.
    • purchase_frequency: Frequency of purchases made by the customer.
    • purchase_amount: Amount spent by the customer in each purchase.
    • product_category: Category of the purchased product.
    • promotion_usage: Indicates whether the customer used promotional offers (0 for No, 1 for Yes).
    • satisfaction_score: Satisfaction score of the customer.

    Provenance:

    The dataset was simulated using the simstudy package in R. Various distributions and formulas were used to generate synthetic data representing customer purchase behavior. The data is organized to mimic real-world scenarios, but it does not represent actual customer data.
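
    The coded columns described above can be decoded while loading the file. The following is a minimal sketch, not the dataset author's code: the sample rows are invented for illustration, and only the column names and the 0/1 codes are taken from the column descriptors.

```python
import csv
import io

# Code-to-label maps taken from the column descriptors above.
GENDER = {"0": "Male", "1": "Female"}
PROMOTION = {"0": "No", "1": "Yes"}

# Hypothetical sample rows with a subset of columns; the real file is
# described as having 100000 rows and 12 columns.
SAMPLE = """age,gender,income,promotion_usage,satisfaction_score
27,0,40682,0,6
29,1,15317,1,6
"""

def decode_rows(handle):
    """Yield each row with gender and promotion_usage decoded to labels."""
    for row in csv.DictReader(handle):
        row["gender"] = GENDER[row["gender"]]
        row["promotion_usage"] = PROMOTION[row["promotion_usage"]]
        row["age"] = int(row["age"])
        yield row

rows = list(decode_rows(io.StringIO(SAMPLE)))
print(rows[0]["gender"], rows[1]["promotion_usage"])
```

    An unexpected code raises a KeyError here, which is a deliberate choice so that values outside the documented 0/1 range are noticed early.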

  2. Online Retail Sales and Customer Data

    • kaggle.com
    zip
    Updated Dec 21, 2023
    Cite
    The Devastator (2023). Online Retail Sales and Customer Data [Dataset]. https://www.kaggle.com/datasets/thedevastator/online-retail-sales-and-customer-data
    Explore at:
    zip (9098240 bytes)
    Dataset updated
    Dec 21, 2023
    Authors
    The Devastator
    Description

    Online Retail Sales and Customer Data

    Transactional Data with Product and Customer Details in Online Retail

    By Marc Szafraniec

    About this dataset

    The InvoiceNo column holds a unique identifier for each transaction. It makes individual sales or purchases easy to identify and supports record keeping.

    The InvoiceDate column gives the date and time of each transaction, which can reveal patterns in purchasing behaviour over time and also helps with record-keeping requirements.

    The StockCode column holds an alphanumeric code assigned to each item in stock, so individual products can be identified unambiguously in inventory records.

    The Description column gives a brief description of each product, adding context beyond the stock code and helping customers understand the products.

    The Quantity column lists the number of units in each transaction. It feeds into the total cost of each sale and helps track inventory levels through product outflow over a given period.

    The UnitPrice column gives the price per unit of the items sold, which is needed for metrics such as gross revenue.

    The Country column records the country where each transaction took place and where the customer resides. It lets businesses map their customer base, track regional performance, and plan localization and segmentation strategies.

    All of this information is provided in a CSV file titled online_retail.csv. The dataset is useful for anyone researching online sales trends, developing customer profiles, or studying inventory management practices.

    How to use the dataset

    Identifying Products: StockCode is the unique identifier for each product. You can use it to identify individual products, track their sales, or discover patterns related to specific items.

    Assessing Sales Volume: The Quantity column gives the number of units of a product in each transaction. Together with InvoiceNo, it lets you analyze overall sales volume or specific purchases over a selected period.

    Observing Price Fluctuations: Multiplying UnitPrice by Quantity gives the total cost per transaction. UnitPrice can also be used to observe price fluctuations over time or to identify the most profitable items.
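
    The Quantity × UnitPrice calculation can be sketched as follows. This is an illustration only: the sample rows are invented, and it assumes that one invoice may span several rows, which the description does not state explicitly.

```python
import csv
import io
from collections import defaultdict
from decimal import Decimal

# Hypothetical rows; only the column names come from the dataset description.
SAMPLE = """InvoiceNo,StockCode,Quantity,UnitPrice,Country
1001,A1,6,2.55,United Kingdom
1001,B2,2,3.39,United Kingdom
1002,A1,10,2.55,France
"""

def invoice_totals(handle):
    """Sum Quantity * UnitPrice per InvoiceNo, using Decimal to avoid float drift."""
    totals = defaultdict(Decimal)
    for row in csv.DictReader(handle):
        totals[row["InvoiceNo"]] += int(row["Quantity"]) * Decimal(row["UnitPrice"])
    return dict(totals)

totals = invoice_totals(io.StringIO(SAMPLE))
for invoice, total in totals.items():
    print(invoice, total)
```

    Decimal is used instead of float because prices are money values; grouping by StockCode or Country instead of InvoiceNo only requires changing the dictionary key.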

    Analyzing Description Patterns/Trends: The Description field sheds light upon what kind of products are being traded. This could provide some inspiration for text analysis like term frequency-inverse document frequency (TF-IDF), sentiment analysis on descriptions, etc., to figure out popular trends at given times.
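
    As a rough illustration of the TF-IDF idea mentioned above, the sketch below computes the plain, unsmoothed weights tf × log(N / df) with whitespace tokenization. The product descriptions are invented, and real text-analysis libraries typically use smoothed or normalized variants, so the numbers will differ from theirs.

```python
import math
from collections import Counter

def tf_idf(documents):
    """Return one {term: weight} dict per document.

    tf  = term count / number of terms in the document
    idf = log(N / number of documents containing the term)
    """
    tokenized = [doc.lower().split() for doc in documents]
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    n_docs = len(tokenized)
    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        weights.append({
            term: (count / len(tokens)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights

# Hypothetical product descriptions.
descriptions = [
    "white metal lantern",
    "red metal heart",
    "white hanging heart holder",
]
scores = tf_idf(descriptions)
print(sorted(scores[0].items(), key=lambda item: -item[1]))
```

    Terms that appear in only one description (such as "lantern") get the highest weight, while terms shared by several descriptions are weighted down.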

    Analysing Geographical Trends: With the Country column, you can analyze geographical trends in sales across nations, for example which country has more customers, orders larger quantities, or buys more expensive items (using the Quantity and UnitPrice columns).

    Keep in mind that proper extraction and transformation methodology should be applied while handling data from different columns as per their datatypes (textual/alphanumeric/numeric) requirements.

    This dataset not only gives retailers immediate insight into their operations but could also serve as a base dataset for machine learning work on predicting future transactions.

    Research Ideas

    • Inventory Management: By tracking the 'Quantity' and 'StockCode' over time, a business could use this data to notice if certain products are frequently purchased together or in specific seasons, allowing them to better stock their inventory.
    • Pricing Strategy:...
  3. Sales Dataset

    • kaggle.com
    zip
    Updated Nov 18, 2025
    Cite
    Indervir Singh (2025). Sales Dataset [Dataset]. https://www.kaggle.com/datasets/indervirsingh5/sales-dataset
    Explore at:
    zip (138709 bytes)
    Dataset updated
    Nov 18, 2025
    Authors
    Indervir Singh
    License

    http://opendatacommons.org/licenses/dbcl/1.0/

    Description

    Randomized dataset intended for trial ETL work using tools such as Microsoft Fabric, Azure Data Factory, Azure Synapse Analytics, Databricks, etc. Contains 3 CSV files: products.csv (dim), customers.csv (dim), sales.csv (facts).
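
    The dimension/fact layout described above can be exercised with a small join. This is a sketch under stated assumptions: the key names (product_id, customer_id), the other column names, and the sample contents are hypothetical, since the description only names the three files and their roles.

```python
import csv
import io

# Hypothetical file contents and key names; not taken from the real files.
PRODUCTS = "product_id,product_name\nP1,Widget\nP2,Gadget\n"
CUSTOMERS = "customer_id,customer_name\nC1,Ada\nC2,Grace\n"
SALES = (
    "sale_id,product_id,customer_id,amount\n"
    "S1,P1,C2,19.99\n"
    "S2,P2,C1,5.00\n"
    "S3,P9,C1,1.00\n"
)

def load_dimension(text, key):
    """Index a dimension table by its key column."""
    return {row[key]: row for row in csv.DictReader(io.StringIO(text))}

def join_facts(sales_text, products, customers):
    """Left-join each fact row to both dimensions; unmatched keys yield None."""
    joined = []
    for sale in csv.DictReader(io.StringIO(sales_text)):
        product = products.get(sale["product_id"])
        customer = customers.get(sale["customer_id"])
        joined.append({
            "sale_id": sale["sale_id"],
            "product_name": product["product_name"] if product else None,
            "customer_name": customer["customer_name"] if customer else None,
            "amount": float(sale["amount"]),
        })
    return joined

products = load_dimension(PRODUCTS, "product_id")
customers = load_dimension(CUSTOMERS, "customer_id")
report = join_facts(SALES, products, customers)
for line in report:
    print(line)
```

    Keeping unmatched fact rows (with None for the missing dimension) makes orphaned keys visible, which is a common check in trial ETL runs.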

  4. Customer DataSets

    • kaggle.com
    zip
    Updated Mar 15, 2023
    Cite
    zohre notash (2023). Customer DataSets [Dataset]. https://www.kaggle.com/datasets/zohrenotash/customer-datasets
    Explore at:
    zip (348498 bytes)
    Dataset updated
    Mar 15, 2023
    Authors
    zohre notash
    Description

    In a business context, clustering is a technique that supports customer segmentation, the process of grouping similar customers into the same segment. Clustering helps a business understand its customers in terms of both static demographics and dynamic behaviors. Customers with comparable characteristics often interact with the business similarly, so the business can benefit by creating a tailored marketing strategy for each segment and offering each one suitable discounts, offers, promo codes, and so on.

    As a simple example, suppose a bank wants to give credit card offers to its customers. Currently, it looks at the details of each customer and decides on that basis which offer to give. The bank can potentially have millions of customers, so reviewing each customer separately is a manual process that would take a huge amount of time. One option is to segment the customers into groups. For instance, the bank can divide its customers into three groups based on their income.

    The bank can now make three different strategies or offers, one for each group. Instead of creating a strategy for every individual customer, it only has to make three, which reduces both effort and time. Clustering is the process of dividing the entire data into groups (also known as clusters) based on the patterns in the data.
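
    The income example can be sketched with a small one-dimensional k-means (Lloyd's algorithm). This is an illustrative implementation, not the method used in the original notebook; the income values are invented, and the evenly spread starting centroids are a simplification chosen to keep the result deterministic.

```python
def kmeans_1d(values, k=3, iterations=100):
    """Lloyd's algorithm on scalar values; returns (centroids, labels)."""
    ordered = sorted(values)
    # Deterministic start: spread the initial centroids across the sorted data.
    centroids = [ordered[(2 * i + 1) * len(ordered) // (2 * k)] for i in range(k)]
    labels = []
    for _ in range(iterations):
        # Assignment step: each value joins its nearest centroid.
        labels = [min(range(k), key=lambda c: abs(v - centroids[c])) for v in values]
        # Update step: each centroid moves to the mean of its members.
        updated = []
        for c in range(k):
            members = [v for v, label in zip(values, labels) if label == c]
            updated.append(sum(members) / len(members) if members else centroids[c])
        if updated == centroids:
            break
        centroids = updated
    return centroids, labels

# Hypothetical annual incomes, in thousands.
incomes = [18, 22, 25, 48, 52, 55, 95, 102, 110]
centroids, labels = kmeans_1d(incomes)
print(centroids, labels)
```

    With these values the three lowest, three middle, and three highest incomes end up in separate clusters, so the bank would design one offer per cluster.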

    The purpose of this notebook

    Competition in the financial industry is expected to get harder over the next decade. One of the industry's main sources of revenue is interest income, earned by giving customers loans or credit payment facilities, so the more credit is given, the more interest is earned. Since data is collected for every credit activity, the company hopes to gain insight by processing it. Here, the data summarizes the usage behavior of about 9000 active credit card holders during the last 6 months. It includes transaction frequency, amount, tenure, etc.

    The bank's marketing team would like to leverage AI/ML to launch a targeted marketing ad campaign tailored to specific groups of customers. For the campaign to be successful, the bank has to divide its customers into at least 3 distinct groups. This process is known as "marketing segmentation" and is crucial for maximizing the campaign conversion rate. We will process the data with unsupervised learning to segment the customers, in the hope of finding characteristics that distinguish each segment. Then we will analyze each segment and plan the marketing approach that works best for it.

    Problem Statement

    This case requires developing a customer segmentation to define a marketing strategy. The task is to extract customer segments from the behaviour patterns in the dataset so that the company can focus its marketing strategy on a particular segment.

    What is customer segmentation?

    One way for a marketing team to understand its customers is to divide them by their characteristics, which is called customer segmentation. Customer segmentation is the process of dividing customers into groups based on common characteristics, such as demographics or behaviours, so you can market to those customers more effectively.

  5. Customer Transactions

    • kaggle.com
    zip
    Updated Oct 15, 2023
    Cite
    bkcoban (2023). Customer Transactions [Dataset]. https://www.kaggle.com/datasets/bkcoban/customer-transactions
    Explore at:
    zip (1379480 bytes)
    Dataset updated
    Oct 15, 2023
    Authors
    bkcoban
    License

    https://cdla.io/sharing-1-0/

    Description

    Synthetic Dataset of Customer Transactions with Demographic and Shopping Behavior Information

    Features
    • Customer ID: The unique identifier for each customer.
    • Name: The customer's name.
    • Surname: The customer's last name.
    • Gender: The gender of the customer.
    • Birthdate: The customer's date of birth.
    • Transaction Amount: The amount of the transaction ($).
    • Date: The date of the transaction.
    • Merchant Name: The name of the merchant where the transaction was made.
    • Category: The category of the transaction.
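
    One derived feature these columns allow is the customer's age at the time of each transaction, computed from Birthdate and Date. The sketch below is an assumption-laden example: the record is invented, and ISO dates (YYYY-MM-DD) are assumed because the listing does not state the date format.

```python
from datetime import date

def age_on(birthdate, on_date):
    """Full years elapsed between birthdate and on_date."""
    had_birthday = (on_date.month, on_date.day) >= (birthdate.month, birthdate.day)
    return on_date.year - birthdate.year - (0 if had_birthday else 1)

# Hypothetical record; the ISO date format is an assumption about the file.
record = {"Birthdate": "1990-10-20", "Date": "2023-10-15", "Transaction Amount": "42.50"}
age = age_on(date.fromisoformat(record["Birthdate"]), date.fromisoformat(record["Date"]))
print(age)
```

    If the real file uses another date format, datetime.strptime with a matching format string would replace date.fromisoformat.
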
  6. Customer Dataset for clustering

    • kaggle.com
    zip
    Updated Sep 3, 2024
    Cite
    Abhash Rai (2024). Customer Dataset for clustering [Dataset]. https://www.kaggle.com/datasets/abhashrai/customer-dataset-for-clustering
    Explore at:
    zip (20870 bytes)
    Dataset updated
    Sep 3, 2024
    Authors
    Abhash Rai
    License

    MIT License (https://opensource.org/licenses/MIT)
    License information was derived automatically

    Description

    Beginner friendly dataset for clustering.

    You can train a model to cluster customers into segments (High, Medium, Low) based on 'Avg_Order_Value' and 'Total_Spending'.

    The actual segment for each customer is also provided.
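
    Because the actual segment is provided, cluster assignments can be scored against it. The sketch below computes cluster purity: each cluster is mapped to its most common actual segment, and the share of customers matching that mapping is reported. The cluster ids and labels are invented for illustration, and purity is one possible metric among several.

```python
from collections import Counter, defaultdict

def cluster_purity(cluster_ids, actual_segments):
    """Map each cluster to its most common actual segment, then score agreement."""
    members = defaultdict(list)
    for cluster, segment in zip(cluster_ids, actual_segments):
        members[cluster].append(segment)
    mapping = {c: Counter(segs).most_common(1)[0][0] for c, segs in members.items()}
    matches = sum(mapping[c] == s for c, s in zip(cluster_ids, actual_segments))
    return mapping, matches / len(cluster_ids)

# Hypothetical cluster assignments and provided labels for six customers.
clusters = [0, 0, 1, 1, 2, 2]
actual = ["Low", "Low", "Medium", "High", "High", "High"]
mapping, purity = cluster_purity(clusters, actual)
print(mapping, purity)
```

    Purity rewards clusters that are internally consistent, but it does not penalize having more clusters than segments, so it is best read alongside the number of clusters used.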

  7. Mall Customers Dataset

    • kaggle.com
    zip
    Updated Feb 15, 2024
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    tusharatkare06 (2024). Mall Customers Dataset [Dataset]. https://www.kaggle.com/datasets/tusharatkare06/mall-customers-dataset
    Explore at:
    zip (1599 bytes)
    Dataset updated
    Feb 15, 2024
    Authors
    tusharatkare06
    License

    http://opendatacommons.org/licenses/dbcl/1.0/

    Description

    Dataset

    This dataset was created by tusharatkare06

    Released under Database: Open Database, Contents: Database Contents

  8. Customer Segmentation Data

    • kaggle.com
    zip
    Updated Mar 11, 2024
    Cite
    Smit Raval (2024). Customer Segmentation Data [Dataset]. https://www.kaggle.com/datasets/ravalsmit/customer-segmentation-data
    Explore at:
    zip (1842344 bytes)
    Dataset updated
    Mar 11, 2024
    Authors
    Smit Raval
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    This dataset provides comprehensive customer data suitable for segmentation analysis. It includes anonymized demographic, transactional, and behavioral attributes, allowing for detailed exploration of customer segments. Leveraging this dataset, marketers, data scientists, and business analysts can uncover valuable insights to optimize targeted marketing strategies and enhance customer engagement. Whether you're looking to understand customer behavior or improve campaign effectiveness, this dataset offers a rich resource for actionable insights and informed decision-making.

    Key Features:

    • Anonymized demographic, transactional, and behavioral data.
    • Suitable for customer segmentation analysis.
    • Opportunities to optimize targeted marketing strategies.
    • Valuable insights for improving campaign effectiveness.
    • Ideal for marketers, data scientists, and business analysts.

    Usage Examples:

    • Segmenting customers based on demographic attributes.
    • Analyzing purchase behavior to identify high-value customer segments.
    • Optimizing marketing campaigns for targeted engagement.
    • Understanding customer preferences and tailoring product offerings accordingly.
    • Evaluating the effectiveness of marketing strategies and iterating for improvement.

    Explore this dataset to unlock actionable insights and drive success in your marketing initiatives!

  9. Global E-Commerce Sales and Customer Data.csv

    • kaggle.com
    zip
    Updated Mar 12, 2026
    Cite
    Hassan Abbas (2026). Global E-Commerce Sales and Customer Data.csv [Dataset]. https://www.kaggle.com/datasets/hassnabbas611/global-e-commerce-sales-and-customer-data-csv
    Explore at:
    zip (62707 bytes)
    Dataset updated
    Mar 12, 2026
    Authors
    Hassan Abbas
    Description

    Dataset

    This dataset was created by Hassan Abbas

    Released under Other (specified in description)

  10. Sales data based on demographics

    • kaggle.com
    zip
    Updated Jan 12, 2023
    Cite
    The Devastator (2023). Sales data based on demographics [Dataset]. https://www.kaggle.com/datasets/thedevastator/demographical-shopping-purchases-data
    Explore at:
    zip (1541029 bytes)
    Dataset updated
    Jan 12, 2023
    Authors
    The Devastator
    Description

    Demographical Shopping Purchases Data

    Analyzing customer purchasing patterns and preferences

    By Joseph Nowicki

    About this dataset

    This dataset contains demographic information about customers who have made purchases in a store, including their name, IP address, region, age, number of items purchased, and total amount spent. It can provide insights into customer shopping behaviour for the store, from geographical information to what customers purchase. With demographic data like this, it is possible to make strategic decisions about target customers and to develop marketing campaigns or promotions tailored to their needs and interests. A deeper understanding of customer habits can help businesses seeking higher engagement with shoppers.

    How to use the dataset

    This dataset includes customers' names, IP addresses, ages, number of items purchased, and amount spent. It can be used to uncover patterns in the spending behavior of shoppers across regions and age groups.

    Research Ideas

    • Analyze customer shopping trends by age and region to improve targeted advertising.
    • Compare customer spending habits for in-store versus online purchases.
    • Use IP addresses to track geographical trends in items purchased from a particular online store to identify new markets for targeted expansion.

    Acknowledgements

    If you use this dataset in your research, please credit the original authors.

    License

    See the dataset description for more information.

    Columns

    File: Demographic_Data_Orig.csv

    | Column name | Description                                                                                     |
    |:------------|:------------------------------------------------------------------------------------------------|
    | full.name   | The full name of the customer. (String)                                                         |
    | ip.address  | The IP address of the customer. (String)                                                        |
    | region      | The region of residence of the customer. (String)                                               |
    | in.store    | A boolean value indicating whether the customer made the purchase in-store or online. (Boolean) |
    | age         | The age of the customer. (Integer)                                                              |
    | items       | The number of items purchased by the customer. (Integer)                                        |
    | amount      | The total amount spent by the customer. (Float)                                                 |
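
    The column types listed above can be applied when reading the file, after which in-store and online spending are easy to compare. This is a sketch with invented rows; how the boolean in.store value is encoded in the real file (for example 0/1 or TRUE/FALSE) is an assumption, so the parser accepts a few common spellings.

```python
import csv
import io
from statistics import mean

# Hypothetical rows; column names and types follow the column table.
SAMPLE = """full.name,ip.address,region,in.store,age,items,amount
Ada L,10.0.0.1,East,0,37,4,281.03
Bob M,10.0.0.2,East,0,35,2,219.51
Cy N,10.0.0.3,West,1,23,3,1525.70
"""

# Assumed spellings of a true value; adjust to the real encoding.
TRUE_VALUES = {"1", "true", "yes"}

def parse_row(row):
    """Convert the string fields of one CSV row to the documented types."""
    return {
        "full.name": row["full.name"],
        "ip.address": row["ip.address"],
        "region": row["region"],
        "in.store": row["in.store"].strip().lower() in TRUE_VALUES,
        "age": int(row["age"]),
        "items": int(row["items"]),
        "amount": float(row["amount"]),
    }

rows = [parse_row(r) for r in csv.DictReader(io.StringIO(SAMPLE))]
online_avg = mean(r["amount"] for r in rows if not r["in.store"])
in_store_avg = mean(r["amount"] for r in rows if r["in.store"])
print(online_avg, in_store_avg)
```

    The same parsed rows can be grouped by region or age instead of in.store to explore the other research ideas.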

    Acknowledgements

    If you use this dataset in your research, please credit Joseph Nowicki.

  11. 🔍 Diverse CSV Dataset Samples

    • kaggle.com
    zip
    Updated Nov 6, 2023
    Cite
    Samy Baladram (2023). 🔍 Diverse CSV Dataset Samples [Dataset]. https://www.kaggle.com/datasets/samybaladram/multidisciplinary-csv-datasets-collection
    Explore at:
    zip (330927 bytes)
    Dataset updated
    Nov 6, 2023
    Authors
    Samy Baladram
    License

    http://www.gnu.org/licenses/lgpl-3.0.html

    Description

    Overview

    The dataset provided here is a rich compilation of various data files gathered to support diverse analytical challenges and education in data science. It is especially curated to provide researchers, data enthusiasts, and students with real-world data across different domains, including biostatistics, travel, real estate, sports, media viewership, and more.

    Files

    Below is a brief overview of what each CSV file contains:

    • Addresses: Practical examples of string manipulation and address data formatting in CSV.
    • Air Travel: Historical dataset suitable for analyzing trends in air travel over a period of three years.
    • Biostats: A dataset of office workers' biometrics, ideal for introductory statistics and biology.
    • Cities: Geographic and administrative data for urban analysis or socio-demographic studies.
    • Car Crashes in Catalonia: Weekly traffic accident data from Catalonia, providing a base for public policy research.
    • De Niro's Film Ratings: Analyze trends in film ratings over time with this entertainment-focused dataset.
    • Ford Escort Sales: Pre-owned vehicle sales data, perfect for regression analysis or price prediction models.
    • Old Faithful Geyser: Geological data for pattern recognition and prediction in natural phenomena.
    • Freshman Year Weights and BMIs: Dataset depicting weight and BMI changes for health and lifestyle studies.
    • Grades: Education performance data which can be correlated with demographics or study patterns.
    • Home Sales: A dataset reflecting the housing market dynamics, useful for economic analysis or real estate appraisal.
    • Hooke's Law Demonstration: Physics data illustrating the classic principle of elasticity in springs.
    • Hurricanes and Storm Data: Climate data on hurricane and storm frequency for environmental risk assessments.
    • Height and Weight Measurements: Public health research dataset on anthropometric data.
    • Lead Shot Specs: Detailed engineering data for material sciences and manufacturing studies.
    • Alphabet Letter Frequency: Text analysis dataset for frequency distribution studies in large text samples.
    • MLB Player Statistics: Comprehensive athletic data set for analysis of performance metrics in sports.
    • MLB Teams' Seasonal Performance: A dataset combining financial and sports performance data from the 2012 MLB season.
    • TV News Viewership: Media consumption data which can be used to analyze viewing patterns and trends.
    • Historical Nile Flood Data: A unique environmental dataset for historical trend analysis in flood levels.
    • Oscar Winner Ages: A dataset to explore age trends among Oscar-winning actors and actresses.
    • Snakes and Ladders Statistics: Data from the game outcomes useful in studying probability and game theory.
    • Tallahassee Cab Fares: Price modeling data from the real-world pricing of taxi services.
    • Taxable Goods Data: A snapshot of economic data concerning taxation impact on prices.
    • Tree Measurements: Ecological and environmental science data related to tree growth and forest management.
    • Real Estate Prices from Zillow: Market analysis dataset for those interested in housing price determinants.

    Format

    The enclosed data respect the comma-separated values (CSV) file format standards, ensuring compatibility with most data processing libraries in Python, R, and other languages. The datasets are ready for import into Jupyter notebooks, RStudio, or any other integrated development environment (IDE) used for data science.

    Quality Assurance

    The data is pre-checked for common issues such as missing values, duplicate records, and inconsistent entries, offering a clean and reliable dataset for various analytical exercises. With initial header lines in some CSV files, users can easily identify dataset fields and start their analysis without additional data cleaning for headers.

    Acknowledgements

    The dataset adheres to the GNU LGPL license, making it freely available for modification and distribution, provided that the original source is cited. This opens up possibilities for educators to integrate real-world data into curricula, researchers to validate models against diverse datasets, and practitioners to refine their analytical skills with hands-on data.

    This dataset has been compiled from https://people.sc.fsu.edu/~jburkardt/data/csv/csv.html, with gratitude to the authors and maintainers for their dedication to providing open data resources for educational and research purposes.

  12. Customer Segmentation Dataset

    • kaggle.com
    zip
    Updated Aug 22, 2024
    Cite
    Madhura Atmaram Bhagat (2024). Customer Segmentation Dataset [Dataset]. https://www.kaggle.com/datasets/madhuraatmarambhagat/customer-segmentation-dataset
    Explore at:
    zip (1583 bytes)
    Dataset updated
    Aug 22, 2024
    Authors
    Madhura Atmaram Bhagat
    License

    Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
    License information was derived automatically

    Description

    Dataset

    This dataset was created by Madhura Atmaram Bhagat

    Released under Apache 2.0

  13. Mall Customer

    • kaggle.com
    zip
    Updated Jan 10, 2024
    Cite
    Ramisa Sharar Nidhi (2024). Mall Customer [Dataset]. https://www.kaggle.com/datasets/ramisashararnidhi/mall-customer-csv
    Explore at:
    zip (1583 bytes)
    Dataset updated
    Jan 10, 2024
    Authors
    Ramisa Sharar Nidhi
    License

    Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
    License information was derived automatically

    Description

    Dataset Name: Mall Customer Dataset
    Purpose: To analyze customer demographics, income, and spending scores for market segmentation and business decision-making.
    Data Format: CSV file containing customer data with labels for gender, age, income, and spending score.

    Fields:
    • CustomerID: Unique identifier for customers.
    • Gender: Gender of the customer (Male/Female).
    • Age: Age group of the customer (e.g., 18-23, 24-28).
    • Annual Income (k$): Annual income of the customer (in thousands of dollars).
    • Spending Score (1-100): Customer's spending score, ranging from 1 to 100.

    Total Entries: 200 customers with various demographic and spending information.
    Missing Data: None. All fields are filled for every customer.
    Unique Values: Unique values for each attribute like gender, age, income range, and spending score range.

    Suggested Tasks:

    Customer Segmentation
    • Task: Use clustering algorithms (e.g., K-means, DBSCAN, Agglomerative Clustering) to segment customers into different groups based on their demographics and spending behaviors.
    • Approach: Segment customers into clusters such as "High spenders," "Middle-income, low spenders," etc.

    Predict Spending Behavior
    • Task: Build a predictive model to forecast the spending score of a customer based on demographic features (age, gender, income).
    • Approach: Train a regression model (e.g., Linear Regression, Random Forest, or Gradient Boosting) to predict the spending score.

    Customer Behavior Analysis
    • Task: Analyze spending patterns across different age groups, income brackets, or genders.
    • Approach: Use visualization techniques (e.g., bar charts, box plots) to analyze relationships between income, age, and spending behavior.

    Market Basket Analysis
    • Task: Apply market basket analysis to identify which customer segments are likely to purchase similar products or services.
    • Approach: Use association rule learning (e.g., Apriori) to identify frequent itemsets and customer purchase patterns.

    Personalized Marketing Strategy
    • Task: Based on customer segmentation, develop personalized marketing campaigns targeting specific customer groups.
    • Approach: Use targeted advertising or personalized product recommendations for each customer segment.

    Customer Churn Prediction
    • Task: Use the dataset to predict the likelihood of customers leaving or reducing their spending.
    • Approach: Use classification algorithms (e.g., Logistic Regression, Decision Trees) to predict customer churn based on demographics and spending history.

    Customer Lifetime Value (CLV) Prediction
    • Task: Estimate the future revenue a customer might bring based on their current spending behavior and demographic profile.
    • Approach: Use machine learning models to estimate CLV, incorporating spending score, income, and age.
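
    As a small starting point for the customer behavior analysis task, the sketch below averages the spending score per age band. The customer values are invented, and the code assumes ages are stored as integers; if the file already stores age groups such as 18-23, the age_band helper (a hypothetical name) is unnecessary and the existing group can be used as the key.

```python
from collections import defaultdict
from statistics import mean

def age_band(age, width=10, start=18):
    """Label the band that contains age, e.g. 18-27 for width=10."""
    low = start + ((age - start) // width) * width
    return f"{low}-{low + width - 1}"

# Hypothetical customers: (Age, Spending Score (1-100)).
customers = [(19, 39), (21, 81), (31, 40), (35, 6), (64, 3)]

by_band = defaultdict(list)
for age, score in customers:
    by_band[age_band(age)].append(score)
summary = {band: mean(scores) for band, scores in by_band.items()}
print(summary)
```

    The resulting per-band averages are the kind of summary a bar chart or box plot would then visualize.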

  14. Customer Survey Report

    • kaggle.com
    zip
    Updated Nov 29, 2023
    Cite
    smmmmmmmmmmmm (2023). Customer Survey Report [Dataset]. https://www.kaggle.com/datasets/smmmmmmmmmmmm/customer-survey-report
    Explore at:
    zip (87056 bytes)
    Dataset updated
    Nov 29, 2023
    Authors
    smmmmmmmmmmmm
    License

    MIT License (https://opensource.org/licenses/MIT)
    License information was derived automatically

    Description

    Customer surveys are invaluable tools for businesses to gauge satisfaction and improve their offerings. Analyzing various metrics provides a comprehensive understanding of customer experiences.

    Each aspect of the survey holds significance. "Overall Satisfaction" encapsulates the holistic perception customers have toward the brand. "Product Quality" and "Service Speed" pinpoint core areas needing attention. "Support Helpfulness" measures the effectiveness of customer service, while "Website Ease of Use" reflects on user experience. "Delivery Speed" and "Price Competitiveness" directly impact customer satisfaction and loyalty. "Recommendation Likelihood" indicates potential advocacy, a key driver for growth.

    The "Experience with Brand" section delves into the emotional connection customers have with the brand, which goes beyond mere transactions. Open-ended questions like "Feedback Comments" uncover nuanced insights that quantitative data might miss, providing qualitative depth. "Contact Channel" analysis highlights preferred communication methods.

    Effective use of this data involves understanding correlations and trends between variables. For instance, a correlation between "Product Quality" and "Overall Satisfaction" might emphasize the importance of product excellence. Similarly, identifying a discrepancy between "Service Speed" and "Support Helpfulness" could prompt improvements in customer service training.
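
    The correlation between two survey metrics can be measured with the Pearson coefficient. The sketch below implements it directly; the scores are invented for illustration, and the real survey columns may use a different scale.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)

# Hypothetical survey scores for five respondents.
product_quality = [2, 3, 4, 5, 5]
overall_satisfaction = [1, 3, 4, 4, 5]
r = pearson(product_quality, overall_satisfaction)
print(round(r, 3))
```

    A value near 1 indicates that the two scores rise together; it does not by itself show that one causes the other.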

    Moreover, trends over time help identify improvements or declines, guiding strategic decisions. If "Website Ease of Use" scores drop, it might signal the need for website optimization.

    Acting on customer feedback is crucial. Resolving issues highlighted in "Feedback Comments" can improve customer experience and loyalty. Recognizing high satisfaction areas aids in emphasizing and promoting these strengths.

    Ultimately, interpreting this data holistically shapes actionable strategies. Prioritizing areas that significantly impact overall satisfaction while fostering a positive emotional connection with the brand strengthens customer relationships and bolsters business growth. Regular surveys ensure continual alignment with evolving customer preferences and market dynamics, fostering a customer-centric approach.

  15. Customer Support Ticket Dataset

    • kaggle.com
    zip
    Updated Jul 25, 2024
    Cite
    Waseem AlAstal (2024). Customer Support Ticket Dataset [Dataset]. https://www.kaggle.com/datasets/waseemalastal/customer-support-ticket-dataset
    Explore at:
    zip(847457 bytes)Available download formats
    Dataset updated
    Jul 25, 2024
    Authors
    Waseem AlAstal
    License

    http://opendatacommons.org/licenses/dbcl/1.0/

    Description

    Overview

    This dataset comprises detailed records of customer support tickets, providing valuable insights into various aspects of customer service operations. It is designed to aid in the analysis and modeling of customer support processes, offering a wealth of information for data scientists, machine learning practitioners, and business analysts.

    Dataset Description

    The dataset includes the following features:

    • Ticket ID: Unique identifier for each support ticket.
    • Customer Name: Name of the customer who submitted the ticket.
    • Customer Email: Email address of the customer.
    • Customer Age: Age of the customer.
    • Customer Gender: Gender of the customer.
    • Product Purchased: Product for which the customer has requested support.
    • Date of Purchase: Date when the product was purchased.
    • Ticket Type: Type of support ticket (e.g., Technical Issue, Billing Inquiry).
    • Ticket Subject: Brief subject or title of the ticket.
    • Ticket Description: Detailed description of the issue or inquiry.
    • Ticket Status: Current status of the ticket (e.g., Open, Closed, Pending).
    • Resolution: Description of how the ticket was resolved.
    • Ticket Priority: Priority level of the ticket (e.g., High, Medium, Low).
    • Ticket Channel: The channel through which the ticket was submitted (e.g., Email, Phone, Web).
    • First Response Time: Time taken for the first response to the ticket.
    • Time to Resolution: Total time taken to resolve the ticket.
    • Customer Satisfaction Rating: Customer satisfaction rating for the support received.

    Usage

    This dataset can be utilized for various analytical and modeling purposes, including but not limited to:

    • Customer Support Analysis: Understand trends and patterns in customer support requests, and analyze ticket volumes, response times, and resolution effectiveness.
    • NLP for Ticket Categorization: Develop natural language processing models to automatically classify tickets based on their content.
    • Customer Satisfaction Prediction: Build predictive models to estimate customer satisfaction based on ticket attributes.
    • Ticket Resolution Time Prediction: Predict the time required to resolve tickets based on historical data.
    • Customer Segmentation: Segment customers based on their support interactions and demographics.
    • Recommender Systems: Develop systems to recommend products or solutions based on past support tickets.

    Potential Applications:

    • Enhancing customer support workflows by identifying bottlenecks and areas for improvement.
    • Automating the ticket triaging process to ensure timely responses.
    • Improving customer satisfaction through predictive analytics.
    • Personalizing customer support based on segmentation and past interactions.

    File information: The dataset is provided in CSV format and contains 8470 records and [number of columns] features.

  16. Sales Data

    • kaggle.com
    zip
    Updated Aug 31, 2022
    Cite
    Jehanzaib Bhatti (2022). Sales Data [Dataset]. https://www.kaggle.com/datasets/jehanzaibbhatti/sales-data
    Explore at:
    zip(1206658 bytes)Available download formats
    Dataset updated
    Aug 31, 2022
    Authors
    Jehanzaib Bhatti
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Description:

    This dataset offers a rich source of information for understanding sales trends, patterns, and customer demographics. It is structured to provide detailed insights at different temporal levels (daily, monthly, and yearly) and offers a comprehensive view of various products ordered by customers across different demographics.

    Key Features:

    Temporal Insights:

    • Daily Data: Provides a granular view of sales activity on a day-to-day basis, allowing users to track daily fluctuations and identify short-term trends.
    • Monthly Data: Summarizes sales data on a monthly basis, enabling users to observe monthly trends, seasonality, and growth patterns.
    • Yearly Data: Offers a high-level overview of sales performance on an annual basis, making it easier to identify long-term trends and growth.

    Product-Level Information:

    • Product IDs: Unique identifiers for each product in the dataset.
    • Sales Quantity: The number of units sold for each product.
    • Sales Revenue: The total revenue generated by each product.

    Customer Demographics:

    • Country: The country where the customer is located.
    • State: The state or region within the country.
    • Age: The age of the customer.
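The three temporal levels described above can be derived from one another. The sketch below rolls hypothetical daily revenue records up to monthly and yearly totals; the record layout and the values are assumptions for illustration, not the dataset's actual schema.

```python
from collections import defaultdict
from datetime import date

# Hypothetical daily records: (order date, sales revenue). Not real dataset rows.
daily_sales = [
    (date(2021, 1, 5), 120.0),
    (date(2021, 1, 20), 80.0),
    (date(2021, 2, 3), 50.0),
    (date(2022, 1, 10), 200.0),
]

def roll_up(records):
    """Sum revenue per (year, month) and per year."""
    monthly = defaultdict(float)
    yearly = defaultdict(float)
    for day, revenue in records:
        monthly[(day.year, day.month)] += revenue
        yearly[day.year] += revenue
    return dict(monthly), dict(yearly)

monthly_totals, yearly_totals = roll_up(daily_sales)
for (year, month), total in sorted(monthly_totals.items()):
    print(f"{year}-{month:02d}: {total:.2f}")
for year, total in sorted(yearly_totals.items()):
    print(f"{year}: {total:.2f}")
```

Keying the monthly totals by (year, month) rather than month alone keeps January 2021 and January 2022 separate, which matters when looking for seasonality across years.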

  17. Flipkart Product reviews with sentiment Dataset

    • kaggle.com
    zip
    Updated Feb 3, 2023
    Cite
    Nirali vaghani (2023). Flipkart Product reviews with sentiment Dataset [Dataset]. https://www.kaggle.com/datasets/niraliivaghani/flipkart-product-customer-reviews-dataset
    Explore at:
    zip(3970956 bytes)Available download formats
    Dataset updated
    Feb 3, 2023
    Authors
    Nirali vaghani
    License

    http://opendatacommons.org/licenses/dbcl/1.0/

    Description

    This dataset contains information about Product name, Product price, Rate, Reviews, Summary and Sentiment in CSV format. It covers 104 different types of products from flipkart.com, such as electronics, clothing for men, women and kids, home decor items, automated systems, and so on. It has 205053 rows and 6 columns. If a product has a summary but no review, the blank review field is already filled with a NaN value.

    The dataset has a multiclass sentiment label: positive, neutral, or negative. The sentiment was derived from the Summary column using NLP and the VADER model. We then manually checked each label and moved entries into the appropriate category; for example, a summary containing text like "okay" or "just ok", or containing one positive and one negative remark, was labeled neutral. The Summary and price columns have already been cleaned using the Python libraries NumPy and pandas, both of which are widely used and covered by many online resources.

    The data was collected from flipkart.com through web scraping with the BeautifulSoup library. The scraping was done in December 2022.
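VADER assigns each text a compound score between -1 and 1, which then has to be mapped to a class. The sketch below shows one common mapping with cutoffs at ±0.05. The description does not say which thresholds the authors used, so the cutoff is an assumption, and the example scores are invented rather than computed from the dataset.

```python
def label_from_compound(compound, threshold=0.05):
    """Map a VADER-style compound score in [-1, 1] to a sentiment label."""
    if not -1.0 <= compound <= 1.0:
        raise ValueError("compound score must lie in [-1, 1]")
    if compound >= threshold:
        return "positive"
    if compound <= -threshold:
        return "negative"
    return "neutral"

# Hypothetical compound scores for three summaries (not computed from the dataset).
scores = {"great product": 0.62, "just ok": 0.0, "waste of money": -0.42}
labels = {text: label_from_compound(score) for text, score in scores.items()}
print(labels)
```

A purely threshold-based rule cannot catch mixed summaries with one positive and one negative remark, which is presumably why the labels were also checked by hand.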

    Usage

    Sentiment Analysis: The text of customer reviews and the associated labels (such as positive, negative, or neutral) can be used to train machine learning models to automatically classify the sentiment of customer reviews.

    Predictive Modeling: Customer ratings, summary and reviews, along with their associated labels, can be used as features to build predictive models for various outcomes, such as customer behavior, purchasing patterns, product preferences and so on.

    Text Classification: The labeled customer reviews or summaries can be used to train machine learning models for text classification tasks, such as spam detection, topic classification, and intent recognition.

    Natural Language Processing (NLP): It can be used to train NLP algorithms, such as sentiment analysis models, for applications in other domains.

    Evaluating Machine Learning Models: This dataset can be used to evaluate the performance of machine learning models for sentiment analysis and other NLP tasks.

    Customer Service: Customer reviews, summary and labels can provide insight into customer complaints, issues, and suggestions, which can help companies improve their customer service.

    However, the applications of this type of data will depend on the specific dataset and the problem it is being used to solve.

  18. Bakery Customer Data

    • kaggle.com
    zip
    Updated Oct 7, 2024
    Cite
    Sarthak M. (2024). Bakery Customer Data [Dataset]. https://www.kaggle.com/datasets/sarthakmangalmurti/bakery-customer-data
    Explore at:
    zip(7035 bytes)Available download formats
    Dataset updated
    Oct 7, 2024
    Authors
    Sarthak M.
    License

    Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
    License information was derived automatically

    Description

    Dataset Description

    This dataset contains 500 records of customer transactions across five distinct bakeries, providing a rich source of information for analyzing consumer behavior in the bakery industry. Each record is characterized by several key features:

    • Bakery_ID: A unique identifier for each bakery, allowing for comparative analysis across different locations.
    • Customer_ID: A unique identifier assigned to each customer, facilitating individual transaction tracking without personal identification.
    • Items_Purchased: The quantity of items purchased in each transaction, which helps gauge customer buying habits and preferences.
    • Amount_Spent: The total expenditure of each customer during their visit, serving as a primary metric for assessing customer spending behavior.
    • Payment_Method: The method used by customers to complete their purchases, including options like Card, Cash, and Mobile Payment, which offers insights into payment trends and preferences.
    • Loyalty Member: This column indicates whether the customer is a member of the bakery's loyalty program. Analyzing loyalty membership data can provide insights into customer retention and the effectiveness of loyalty initiatives in driving repeat business.
    • Age: This column indicates the age of the customer at the time of purchase. It helps analyze spending patterns and preferences across different age groups.
    • Gender: This column represents the gender of the customer, providing insights into purchasing behavior and preferences. Analyzing gender data can assist bakeries in tailoring marketing strategies and product offerings.
    • Purchase_Date: This column records the date of each transaction, allowing for the identification of seasonal trends and peak shopping periods. It aids in understanding customer buying behavior over time.
    • Time_of_Purchase: This column categorizes the time of day when the purchase was made. Analyzing this data helps identify peak hours for customer visits, enabling bakeries to optimize staffing and inventory.

    This dataset is designed to facilitate various analyses, including spending patterns, payment method preferences, and overall consumer trends in the bakery sector. By utilizing this dataset, stakeholders can derive actionable insights to enhance customer engagement, optimize product offerings, and inform marketing strategies.
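Two of the analyses mentioned above, payment method preferences and spending per bakery, can be sketched with the standard library alone. The sample rows and ID formats below are hypothetical; only the column names are taken from the description.

```python
import csv
import io
from collections import Counter, defaultdict

# Hypothetical sample using column names from the description (not real rows).
sample_csv = """Bakery_ID,Customer_ID,Amount_Spent,Payment_Method
B1,C001,12.50,Card
B1,C002,7.50,Cash
B2,C003,20.00,Card
B2,C004,10.00,Mobile Payment
"""

def summarize(csv_text):
    """Return (payment method counts, average Amount_Spent per Bakery_ID)."""
    method_counts = Counter()
    spend_by_bakery = defaultdict(list)
    for row in csv.DictReader(io.StringIO(csv_text)):
        method_counts[row["Payment_Method"]] += 1
        spend_by_bakery[row["Bakery_ID"]].append(float(row["Amount_Spent"]))
    averages = {b: sum(v) / len(v) for b, v in spend_by_bakery.items()}
    return method_counts, averages

method_counts, avg_spend = summarize(sample_csv)
print(method_counts.most_common())
print(avg_spend)
```

To run this against the real file, replace the in-memory string with the opened CSV and check the header names first, since the description mixes underscored and spaced column names.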

  19. Customer Purchase Data

    • kaggle.com
    zip
    Updated Mar 3, 2025
    Cite
    YEKA (2025). Customer Purchase Data [Dataset]. https://www.kaggle.com/datasets/presheepresh/customer-purchase-data
    Explore at:
    zip(336435 bytes)Available download formats
    Dataset updated
    Mar 3, 2025
    Authors
    YEKA
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    This dataset, downloaded from Kaggle, contains detailed customer sales transactions, including age, annual income, gender, number of purchases, and product details. It is useful for analyzing buying behavior, predicting trends, and optimizing sales strategies.

    Dataset Features

    • Age_Group – Customer age category (Young, Middle Age, Old)
    • Gender – Male/Female
    • Annual_Income – Customer's annual income in USD
    • Number_of_Purchases – Total purchases made
    • Purchase_Amount – Value of the purchase in USD
    • Product_Category – Type of product purchased

    Source & Context

    This dataset is based on anonymized retail transaction records from an e-commerce platform. It can help businesses understand customer segmentation, analyze spending habits, and build machine learning models for customer prediction.

    Potential Use Cases

    ✔️ Customer segmentation analysis
    ✔️ Sales trend forecasting
    ✔️ Price sensitivity analysis
    ✔️ Predicting customer churn
    ✔️ Recommender system development

    Example Questions to Explore

    • Which age group makes the most purchases?
    • What is the average purchase amount by income level?
    • Do certain product categories sell better to specific demographics?
    • How does income impact buying frequency?
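One way to approach the income question is to bin Annual_Income into bands and compare the average Number_of_Purchases across them. The sketch below does this on invented rows; the band cutoffs (40,000 and 90,000 USD) are illustrative assumptions and are not defined by the dataset.

```python
from collections import defaultdict

# Hypothetical (Annual_Income in USD, Number_of_Purchases) pairs, not real rows.
customers = [(25_000, 4), (32_000, 6), (58_000, 9), (61_000, 11), (120_000, 15)]

def income_band(income):
    """Assign an income to a band; the cutoffs are illustrative assumptions."""
    if income < 40_000:
        return "low"
    if income < 90_000:
        return "middle"
    return "high"

def mean_purchases_by_band(rows):
    """Average Number_of_Purchases within each income band."""
    grouped = defaultdict(list)
    for income, purchases in rows:
        grouped[income_band(income)].append(purchases)
    return {band: sum(vals) / len(vals) for band, vals in grouped.items()}

result = mean_purchases_by_band(customers)
print(result)
```

The choice of cutoffs changes the picture, so it is worth trying a few band definitions (or quantile-based bands) before drawing conclusions.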

  20. Tips Dataset

    • kaggle.com
    zip
    Updated Apr 20, 2024
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Sakshi Satre (2024). Tips Dataset [Dataset]. https://www.kaggle.com/datasets/sakshisatre/tips-dataset
    Explore at:
    zip(1878 bytes)Available download formats
    Dataset updated
    Apr 20, 2024
    Authors
    Sakshi Satre
    License

    Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
    License information was derived automatically

    Description

    The "tips" dataset is a popular dataset often used for demonstration and practice in data analysis and visualization. It contains information about various attributes of customers in a restaurant, including the total bill amount, tip amount, gender, whether the customer smokes or not, the day of the week, time of day, and the size of the party.

    The data contains the following columns:

    • total_bill: This attribute represents the total amount of the bill paid by the customer, including the cost of the meal, taxes, and any additional charges.

    • tip: This attribute denotes the amount of tip left by the customer. It's typically calculated as a percentage of the total bill and is often discretionary.

    • sex: This attribute indicates the gender of the customer. It could be either male or female.

    • smoker: This attribute indicates whether the customer is a smoker or a non-smoker. It's a categorical variable with two possible values: "Yes" for smokers and "No" for non-smokers.

    • day: This attribute represents the day of the week when the meal was consumed. It could be any of the seven days in a week (e.g., Monday, Tuesday, etc.).

    • time: This attribute denotes the time of the day when the meal was consumed. It's often categorized into two values: "Lunch" for meals consumed during the day and "Dinner" for meals consumed in the evening.

    • size: This attribute indicates the size of the party dining together. It represents the number of people included in the bill.
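A typical first exercise with these columns is to express each tip as a percentage of the bill and compare groups. The sketch below does that by time of day; the rows are invented for illustration and only the column names come from the list above.

```python
from collections import defaultdict

# Hypothetical rows using the column names above (not taken from the dataset).
rows = [
    {"total_bill": 20.0, "tip": 3.0, "time": "Dinner"},
    {"total_bill": 10.0, "tip": 2.0, "time": "Lunch"},
    {"total_bill": 40.0, "tip": 4.0, "time": "Dinner"},
]

def tip_pct(row):
    """Tip as a percentage of the total bill."""
    return 100.0 * row["tip"] / row["total_bill"]

def mean_tip_pct_by(rows, key):
    """Average tip percentage for each value of a categorical column."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[row[key]].append(tip_pct(row))
    return {k: sum(v) / len(v) for k, v in grouped.items()}

by_time = mean_tip_pct_by(rows, "time")
print(by_time)
```

Passing "day", "sex", or "smoker" as the key gives the same comparison for the other categorical columns.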

Cite
Sanyam Goyal (2024). Customer Purchases Behaviour Dataset [Dataset]. https://www.kaggle.com/datasets/sanyamgoyal401/customer-purchases-behaviour-dataset

Customer Purchases Behaviour Dataset

Simulated Dataset of Customer Purchase Behavior

Explore at:
2 scholarly articles cite this dataset (View in Google Scholar)
zip(1524741 bytes)Available download formats
Dataset updated
Apr 6, 2024
Authors
Sanyam Goyal
License

Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
License information was derived automatically

Description

Subtitle:

Simulated Dataset of Customer Purchase Behavior

Description:

This dataset contains simulated data representing customer purchase behavior. It includes various features such as age, gender, income, education, region, loyalty status, purchase frequency, purchase amount, product category, promotion usage, and satisfaction score.

File Information:

  • File Format: CSV
  • Number of Rows: 100000
  • Number of Columns: 12

Column Descriptors:

  • age: Age of the customer.
  • gender: Gender of the customer (0 for Male, 1 for Female).
  • income: Annual income of the customer.
  • education: Education level of the customer.
  • region: Region where the customer resides.
  • loyalty_status: Loyalty status of the customer.
  • purchase_frequency: Frequency of purchases made by the customer.
  • purchase_amount: Amount spent by the customer in each purchase.
  • product_category: Category of the purchased product.
  • promotion_usage: Indicates whether the customer used promotional offers (0 for No, 1 for Yes).
  • satisfaction_score: Satisfaction score of the customer.

Provenance:

The dataset was simulated using the simstudy package in R. Various distributions and formulas were used to generate synthetic data representing customer purchase behavior. The data is organized to mimic real-world scenarios, but it does not represent actual customer data.
