100+ datasets found

Customer Shopping Trends Dataset
kaggle.com
Updated Oct 5, 2023
+ more versions
Cite
Sourav Banerjee (2023). Customer Shopping Trends Dataset [Dataset]. https://www.kaggle.com/datasets/iamsouravbanerjee/customer-shopping-trends-dataset
Dataset updated
Oct 5, 2023
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Sourav Banerjee
Description
Context

The Customer Shopping Preferences Dataset offers valuable insights into consumer behavior and purchasing patterns. Understanding customer preferences and trends is critical for businesses to tailor their products, marketing strategies, and overall customer experience. This dataset captures a wide range of customer attributes including age, gender, purchase history, preferred payment methods, frequency of purchases, and more. Analyzing this data can help businesses make informed decisions, optimize product offerings, and enhance customer satisfaction. The dataset stands as a valuable resource for businesses aiming to align their strategies with customer needs and preferences. It's important to note that this dataset is a Synthetic Dataset Created for Beginners to learn more about Data Analysis and Machine Learning.

Content

This dataset encompasses various features related to customer shopping preferences, gathering essential information for businesses seeking to enhance their understanding of their customer base. The features include customer age, gender, purchase amount, preferred payment methods, frequency of purchases, and feedback ratings. Additionally, data on the type of items purchased, shopping frequency, preferred shopping seasons, and interactions with promotional offers is included. With a collection of 3900 records, this dataset serves as a foundation for businesses looking to apply data-driven insights for better decision-making and customer-centric strategies.

Dataset Glossary (Column-wise)

Customer ID - Unique identifier for each customer

Age - Age of the customer

Gender - Gender of the customer (Male/Female)

Item Purchased - The item purchased by the customer

Category - Category of the item purchased

Purchase Amount (USD) - The amount of the purchase in USD

Location - Location where the purchase was made

Size - Size of the purchased item

Color - Color of the purchased item

Season - Season during which the purchase was made

Review Rating - Rating given by the customer for the purchased item

Subscription Status - Indicates if the customer has a subscription (Yes/No)

Shipping Type - Type of shipping chosen by the customer

Discount Applied - Indicates if a discount was applied to the purchase (Yes/No)

Promo Code Used - Indicates if a promo code was used for the purchase (Yes/No)

Previous Purchases - The total count of transactions concluded by the customer at the store, excluding the ongoing transaction

Payment Method - Customer's most preferred payment method

Frequency of Purchases - Frequency at which the customer makes purchases (e.g., Weekly, Fortnightly, Monthly)
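The glossary above can be explored with a simple group-by. The following is a minimal sketch using only column names from the glossary; the sample rows and the helper `average_purchase_by` are hypothetical and do not come from the dataset itself.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical sample rows; the values are invented for illustration only.
rows = [
    {"Customer ID": 1, "Category": "Clothing", "Purchase Amount (USD)": 50},
    {"Customer ID": 2, "Category": "Clothing", "Purchase Amount (USD)": 70},
    {"Customer ID": 3, "Category": "Footwear", "Purchase Amount (USD)": 90},
]

def average_purchase_by(rows, column):
    """Return the mean purchase amount for each distinct value of `column`."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[column]].append(row["Purchase Amount (USD)"])
    return {key: mean(values) for key, values in groups.items()}

by_category = average_purchase_by(rows, "Category")
print(by_category)
```

The same helper could be pointed at any categorical column in the glossary, such as Season or Payment Method, once real rows are loaded.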

Structure of the Dataset

https://i.imgur.com/6UEqejq.png

Acknowledgement

This dataset is a synthetic creation generated using ChatGPT to simulate a realistic customer shopping experience. Its purpose is to provide a platform for beginners and data enthusiasts, allowing them to create, enjoy, practice, and learn from a dataset that mirrors real-world customer shopping behavior. The aim is to foster learning and experimentation in a simulated environment, encouraging a deeper understanding of data analysis and interpretation in the context of consumer preferences and retail scenarios.

Cover Photo by: Freepik

Thumbnail by: Clothing icons created by Flat Icons - Flaticon
Sample Customers Data - Link - Dataset - Datopian CKAN instance
demo.dev.datopian.com
Updated May 27, 2025
Cite
(2025). Sample Customers Data - Link - Dataset - Datopian CKAN instance [Dataset]. https://demo.dev.datopian.com/dataset/my-test-org7842hq--sample-customers-data---link
Dataset updated
May 27, 2025
Description
This dataset contains customer and contact information from various toy companies, including Tailspin Toys and Wingtip Toys, showing details such as Customer ID, Customer Name, Person ID, and Contact Person Name.
Ecommerce Order & Supply Chain Dataset
kaggle.com
Updated Aug 7, 2024
Cite
Aditya Bagus Pratama (2024). Ecommerce Order & Supply Chain Dataset [Dataset]. https://www.kaggle.com/datasets/bytadit/ecommerce-order-dataset
Dataset updated
Aug 7, 2024
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Aditya Bagus Pratama
License
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0): https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
Description
Dataset Description

The E-commerce Order Dataset provides comprehensive information related to orders, items within orders, customers, payments, and products for an e-commerce platform. This dataset is structured with multiple tables, each containing specific information about various aspects of the e-commerce operations.

Dataset Features

Orders Table:

order_id: Unique identifier for an order, acting as the primary key.

customer_id: Unique identifier for a customer. Values in this column may not be unique within the Orders table.

order_status: Indicates the status of an order (e.g., delivered, cancelled, processing, etc.).

order_purchase_timestamp: Timestamp when the order was made by the customer.

order_approved_at: Timestamp when the order was approved from the seller's side.

order_delivered_timestamp: Timestamp when the order was delivered at the customer's location.

order_estimated_delivery_date: Estimated date of delivery shared with the customer while placing the order.

Order Items Table

order_id: Unique identifier for an order.

order_item_id: Item number in each order, acting as part of the primary key along with the order_id.

product_id: Unique identifier for a product.

seller_id: Unique identifier for the seller.

price: Selling price of the product.

shipping_charges: Charges associated with the shipping of the product.

Customers Table

customer_id: Unique identifier for a customer, acting as the primary key.

customer_zip_code_prefix: Customer's Zip code.

customer_city: Customer's city.

customer_state: Customer's state.

Payments Table

order_id: Unique identifier for an order.

payment_sequential: Provides information about the sequence of payments for the given order.

payment_type: Type of payment (e.g., credit_card, debit_card, etc.).

payment_installments: Payment installment number in case of credit cards.

payment_value: Transaction value.

Products Table

product_id: Unique identifier for each product, acting as the primary key.

product_category_name: Name of the category the product belongs to.

product_weight_g: Product weight in grams.

product_length_cm: Product length in centimeters.

product_height_cm: Product height in centimeters.

product_width_cm: Product width in centimeters.
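The tables above share `order_id` as a key, so they can be cross-checked against each other. The following is a minimal sketch, assuming that the payments for an order should add up to the sum of `price` and `shipping_charges` over its items; that relationship is an assumption, not something the description states, and the sample rows and the helper `total_by_order` are hypothetical.

```python
from collections import defaultdict

# Hypothetical rows using column names from the table descriptions above.
order_items = [
    {"order_id": "o1", "order_item_id": 1, "price": 20.0, "shipping_charges": 5.0},
    {"order_id": "o1", "order_item_id": 2, "price": 10.0, "shipping_charges": 2.5},
    {"order_id": "o2", "order_item_id": 1, "price": 40.0, "shipping_charges": 0.0},
]
payments = [
    {"order_id": "o1", "payment_sequential": 1, "payment_value": 30.0},
    {"order_id": "o1", "payment_sequential": 2, "payment_value": 7.5},
    {"order_id": "o2", "payment_sequential": 1, "payment_value": 40.0},
]

def total_by_order(rows, value_of):
    """Sum value_of(row) over all rows that share the same order_id."""
    totals = defaultdict(float)
    for row in rows:
        totals[row["order_id"]] += value_of(row)
    return dict(totals)

item_totals = total_by_order(order_items, lambda r: r["price"] + r["shipping_charges"])
paid_totals = total_by_order(payments, lambda r: r["payment_value"])

# Orders whose payments differ from the item total (assumed relationship).
unbalanced = [
    order_id
    for order_id, total in item_totals.items()
    if abs(total - paid_totals.get(order_id, 0.0)) > 0.01
]
print(item_totals, paid_totals, unbalanced)
```

Grouping by `order_id` before comparing matters because both Order Items and Payments can hold several rows per order.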
Indeterminate Likert Scale - Sample Dataset - Customer Feedback of...
data.mendeley.com
Updated Dec 23, 2018
Cite
Ilanthenral Kandasamy (2018). Indeterminate Likert Scale - Sample Dataset - Customer Feedback of Restaurant [Dataset]. http://doi.org/10.17632/ywjxpyw95w.1
Unique identifier
https://doi.org/10.17632/ywjxpyw95w.1
Dataset updated
Dec 23, 2018
Authors
Ilanthenral Kandasamy
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Research Hypothesis: Using the concept of Neutrosophy to deal with indeterminacy in feedback

Data: Feedback given by customers of a restaurant. The questionnaire is based on six factors: Quality of Food, Service, Hygiene, Value for money, Ambiance, and Overall Experience. Each question (one per factor) has five membership values: Positive, Positive Indeterminate, Indeterminate, Negative Indeterminate, and Negative.
Sample
data.sfgov.org
Updated Jul 3, 2025
+ more versions
Cite
San Francisco 311 (2025). Sample [Dataset]. https://data.sfgov.org/City-Infrastructure/Sample/buzf-b9hq
Available download formats: tsv, csv, application/rssxml, application/rdfxml, xml, kmz, kml, application/geo+json
Dataset updated
Jul 3, 2025
Dataset authored and provided by
San Francisco 311
License
ODC Public Domain Dedication and Licence (PDDL) v1.0: http://www.opendatacommons.org/licenses/pddl/1.0/
License information was derived automatically
Description
This dataset was reset and modified on 04/19/2017. Read the full notice of changes in the 'About' section of this dataset.

SF311 cases created since 7/1/2008 with location information. For more information about Open311, see http://www.open311.org/.
Bitext-customer-support-llm-chatbot-training-dataset
huggingface.co
opendatalab.com
Updated Jul 16, 2024
+ more versions
Cite
Bitext (2024). Bitext-customer-support-llm-chatbot-training-dataset [Dataset]. https://huggingface.co/datasets/bitext/Bitext-customer-support-llm-chatbot-training-dataset
Dataset updated
Jul 16, 2024
Dataset authored and provided by
Bitext
License
https://choosealicense.com/licenses/cdla-sharing-1.0/
Description
Bitext - Customer Service Tagged Training Dataset for LLM-based Virtual Assistants

Overview

This hybrid synthetic dataset is designed to be used to fine-tune Large Language Models such as GPT, Mistral and OpenELM, and has been generated using our NLP/NLG technology and our automated Data Labeling (DAL) tools. The goal is to demonstrate how Verticalization/Domain Adaptation for the Customer Support sector can be easily achieved using our two-step approach to LLM… See the full description on the dataset page: https://huggingface.co/datasets/bitext/Bitext-customer-support-llm-chatbot-training-dataset.
Datasets for Sentiment Analysis
zenodo.org
csv
Updated Dec 10, 2023
Cite
Julie R. Campos Arias (2023). Datasets for Sentiment Analysis [Dataset]. http://doi.org/10.5281/zenodo.10157504
Available download formats: csv
Unique identifier
https://doi.org/10.5281/zenodo.10157504
Dataset updated
Dec 10, 2023
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Julie R. Campos Arias (repository creator)
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This repository was created for my Master's thesis in Computational Intelligence and Internet of Things at the University of Córdoba, Spain. The purpose of this repository is to store the datasets found that were used in some of the studies that served as research material for this Master's thesis. Also, the datasets used in the experimental part of this work are included.
Below are the datasets specified, along with the details of their references, authors, and download sources.

----------- STS-Gold Dataset ----------------
The dataset consists of 2026 tweets. The file consists of 3 columns: id, polarity, and tweet. The three columns denote the unique id, polarity index of the text and the tweet text respectively.
Reference: Saif, H., Fernandez, M., He, Y., & Alani, H. (2013). Evaluation datasets for Twitter sentiment analysis: a survey and a new dataset, the STS-Gold.
File name: sts_gold_tweet.csv
----------- Amazon Sales Dataset ----------------
This dataset contains the ratings and reviews of 1K+ Amazon products, as listed on the official Amazon website. The data was scraped from the official Amazon website in January 2023.
Owner: Karkavelraja J., Postgraduate student at Puducherry Technological University (Puducherry, Puducherry, India)
Features:
product_id - Product ID
product_name - Name of the Product
category - Category of the Product
discounted_price - Discounted Price of the Product
actual_price - Actual Price of the Product
discount_percentage - Percentage of Discount for the Product
rating - Rating of the Product
rating_count - Number of people who voted for the Amazon rating
about_product - Description about the Product
user_id - ID of the user who wrote review for the Product
user_name - Name of the user who wrote review for the Product
review_id - ID of the user review
review_title - Short review
review_content - Long review
img_link - Image Link of the Product
product_link - Official Website Link of the Product
License: CC BY-NC-SA 4.0
File name: amazon.csv
----------- Rotten Tomatoes Reviews Dataset ----------------
This rating inference dataset is a sentiment classification dataset containing 5,331 positive and 5,331 negative processed sentences from Rotten Tomatoes movie reviews. On average, these reviews consist of 21 words. The first 5,331 rows contain only negative samples and the last 5,331 rows contain only positive samples, so the data should be shuffled before use.
This data is collected from https://www.cs.cornell.edu/people/pabo/movie-review-data/ as a txt file and converted into a csv file. The file consists of 2 columns: reviews and labels (1 for fresh (good) and 0 for rotten (bad)).
Reference: Bo Pang and Lillian Lee. Seeing stars: Exploiting class relationships for sentiment categorization with respect to rating scales. In Proceedings of the 43rd Annual Meeting of the Association for Computational Linguistics (ACL'05), pages 115–124, Ann Arbor, Michigan, June 2005. Association for Computational Linguistics
File name: data_rt.csv
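Because the rows are ordered by label, a plain head/tail split would put a single class in each part. The following is a minimal sketch of shuffling before splitting; the sample rows stand in for data_rt.csv, and the helper `shuffled_split`, the seed, and the 20% test fraction are hypothetical choices, not part of the dataset.

```python
import random

# Hypothetical stand-in for data_rt.csv: negatives (0) first, then positives (1).
rows = [(f"negative review {i}", 0) for i in range(5)]
rows += [(f"positive review {i}", 1) for i in range(5)]

def shuffled_split(rows, test_fraction=0.2, seed=42):
    """Shuffle a copy of the rows, then split it into train and test parts."""
    rng = random.Random(seed)   # fixed seed so the split is reproducible
    shuffled = rows[:]          # copy, so the caller's list keeps its order
    rng.shuffle(shuffled)
    test_size = round(len(shuffled) * test_fraction)
    cut = len(shuffled) - test_size
    return shuffled[:cut], shuffled[cut:]

train, test = shuffled_split(rows)
print(len(train), len(test))
```

For a classifier, a stratified split that keeps the label ratio equal in both parts may be preferable; the sketch only shows the shuffle step the description asks for.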
----------- Preprocessed Dataset Sentiment Analysis ----------------
Preprocessed Amazon product review data for the Gen3EcoDot (Alexa), scraped entirely from amazon.in
Stemmed and lemmatized using nltk.
Sentiment labels are generated using TextBlob polarity scores.
The file consists of 4 columns: index, review (stemmed and lemmatized review using nltk), polarity (score) and division (categorical label generated using polarity score).
DOI: 10.34740/kaggle/dsv/3877817
Citation: @misc{pradeesh arumadi_2022, title={Preprocessed Dataset Sentiment Analysis}, url={https://www.kaggle.com/dsv/3877817}, DOI={10.34740/KAGGLE/DSV/3877817}, publisher={Kaggle}, author={Pradeesh Arumadi}, year={2022} }
This dataset was used in the experimental phase of my research.
File name: EcoPreprocessed.csv
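The `division` column above is described as a categorical label derived from the polarity score. The following is a minimal sketch of such a mapping; the cut-off at zero and the label names are assumptions, since the description does not state the thresholds actually used, and `division_from_polarity` is a hypothetical helper.

```python
def division_from_polarity(polarity: float) -> str:
    """Map a polarity score in [-1.0, 1.0] to a categorical label.

    The zero threshold and the label names are assumptions made for
    illustration; the dataset description does not specify them.
    """
    if polarity > 0:
        return "positive"
    if polarity < 0:
        return "negative"
    return "neutral"

labels = [division_from_polarity(score) for score in (0.6, 0.0, -0.25)]
print(labels)
```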
----------- Amazon Earphones Reviews ----------------
This dataset consists of 9930 Amazon reviews and star ratings for the 10 latest (as of mid-2019) Bluetooth earphone devices, intended for learning how to train a machine for sentiment analysis.
This dataset was employed in the experimental phase of my research. To align it with the objectives of my study, certain reviews were excluded from the original dataset, and an additional column was incorporated into this dataset.
The file consists of 5 columns: ReviewTitle, ReviewBody, ReviewStar, Product and division (manually added - categorical label generated using ReviewStar score)
License: U.S. Government Works
Source: www.amazon.in
File name (original): AllProductReviews.csv (contains 14337 reviews)
File name (edited - used for my research) : AllProductReviews2.csv (contains 9930 reviews)
----------- Amazon Musical Instruments Reviews ----------------
This dataset contains 7137 comments/reviews of different musical instruments coming from Amazon.
This dataset was employed in the experimental phase of my research. To align it with the objectives of my study, certain reviews were excluded from the original dataset, and an additional column was incorporated into this dataset.
The file consists of 10 columns: reviewerID, asin (ID of the product), reviewerName, helpful (helpfulness rating of the review), reviewText, overall (rating of the product), summary (summary of the review), unixReviewTime (time of the review, in Unix time), reviewTime (time of the review, raw), and division (manually added categorical label generated using the overall score).
Source: http://jmcauley.ucsd.edu/data/amazon/
File name (original): Musical_instruments_reviews.csv (contains 10261 reviews)
File name (edited - used for my research) : Musical_instruments_reviews2.csv (contains 7137 reviews)
Sales Dataset of USA [Updated]
kaggle.com
Updated Jun 20, 2023
Cite
Sulaiman ahmed (2023). Sales Dataset of USA [Updated] [Dataset]. https://www.kaggle.com/datasets/sulaimanahmed/sales-dataset-of-usa-updated
Dataset updated
Jun 20, 2023
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Sulaiman ahmed
License
http://opendatacommons.org/licenses/dbcl/1.0/
Area covered
United States
Description
The given dataset appears to be a sales dataset containing information about different orders. Here is a description of the data:

Row ID: An identifier for each row in the dataset.

Order ID: Unique identifier for each order.

Order Date: The date when the order was placed.

Ship Date: The date when the order was shipped.

Ship Mode: The mode of shipping chosen for the order.

Customer ID: Unique identifier for each customer.

Customer Name: Name of the customer who placed the order.

Segment: The segment to which the customer belongs (e.g., consumer, corporate).

Country: The country where the order was placed (in this case, United States).

City: The city where the order was placed.

State: The state where the order was placed.

Postal Code: The postal code associated with the order's location.

Region: The region of the country where the order was placed.

Product ID: Unique identifier for each product.

Category: The category to which the product belongs (e.g., furniture, office supplies).

Sub-Category: The sub-category to which the product belongs (e.g., bookcases, chairs).

Product Name: The name of the product.

Cost: The cost of the product.

Price: The price at which the product was sold.

Profit: The profit made from the sale of the product.

Quantity: The quantity of the product ordered.

Sales: The total sales generated from the order (quantity multiplied by price).

The dataset provides detailed information about each order, including customer details, product details, sales information, and shipping information. It can be used to analyze various aspects of the sales data, such as profitability, customer segments, product categories, and regional sales performance.
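The column descriptions above state that Sales is quantity multiplied by price. The following is a minimal sketch of that relationship; the profit formula (quantity times the difference between price and cost) is an assumption, since the description does not define how Profit is computed, and `OrderLine` and its values are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class OrderLine:
    """One row of the sales data, reduced to the numeric columns."""
    cost: float      # Cost column
    price: float     # Price column
    quantity: int    # Quantity column

    @property
    def sales(self) -> float:
        # Stated in the description: quantity multiplied by price.
        return self.quantity * self.price

    @property
    def profit(self) -> float:
        # Assumption: profit is the per-unit margin times quantity.
        return self.quantity * (self.price - self.cost)

line = OrderLine(cost=6.0, price=10.0, quantity=3)
print(line.sales, line.profit)
```

Recomputing these values and comparing them with the stored Sales and Profit columns is one way to check whether the assumed formulas match the data.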
Data from: The American Customer Satisfaction Index (ACSI): A Sample Dataset...
data.mendeley.com
Updated Jan 23, 2023
Cite
Tomas Hult (2023). The American Customer Satisfaction Index (ACSI): A Sample Dataset and Description [Dataset]. http://doi.org/10.17632/64xkbj2ry5.1
Unique identifier
https://doi.org/10.17632/64xkbj2ry5.1
Dataset updated
Jan 23, 2023
Authors
Tomas Hult
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This dataset provides a sample of survey data collected by the American Customer Satisfaction Index (ACSI). Using online sampling and stratified interviewing techniques of actual customers of predominantly large market-share (“large cap”) companies, the ACSI annually collects data from some 400,000 consumers residing across the United States for more than 400 companies within about 50 consumer industries. For this data depository, consumers’ perceptions of their experiences with individual companies included within four consumer industries as defined and measured by ACSI – processed food, commercial airlines, Internet service providers, and commercial banks – are included in the dataset. The overall sample size is n=8239 consumer responses for this sample ACSI dataset.
Envestnet | Yodlee's De-Identified Consumer Behavior Data | Row/Aggregate...
datarade.ai
.sql, .txt
+ more versions
Cite
Envestnet | Yodlee, Envestnet | Yodlee's De-Identified Consumer Behavior Data | Row/Aggregate Level | USA Consumer Data covering 3600+ corporations | 90M+ Accounts [Dataset]. https://datarade.ai/data-products/envestnet-yodlee-s-de-identified-consumer-behavior-data-r-envestnet-yodlee
Available download formats: .sql, .txt
Dataset provided by
Yodlee
Envestnet (http://envestnet.com/)
Authors
Envestnet | Yodlee
Area covered
United States of America
Description
Envestnet®| Yodlee®'s Consumer Behavior Data (Aggregate/Row) Panels consist of de-identified, near-real time (T+1) USA credit/debit/ACH transaction level data – offering a wide view of the consumer activity ecosystem. The underlying data is sourced from end users leveraging the aggregation portion of the Envestnet®| Yodlee®'s financial technology platform.

Envestnet | Yodlee Consumer Panels (Aggregate/Row) include data relating to millions of transactions, including ticket size and merchant location. The dataset includes de-identified credit/debit card and bank transactions (such as a payroll deposit, account transfer, or mortgage payment). Our coverage offers insights into areas such as consumer, TMT, energy, REITs, internet, utilities, ecommerce, MBS, CMBS, equities, credit, commodities, FX, and corporate activity. We apply rigorous data science practices to deliver key KPIs daily that are focused, relevant, and ready to put into production.

We offer free trials. Our team is available to provide support for loading, validation, sample scripts, or other services you may need to generate insights from our data.

Investors, corporate researchers, and corporates can use our data to answer some key business questions such as:
- How much are consumers spending with specific merchants/brands and how is that changing over time?
- Is the share of consumer spend at a specific merchant increasing or decreasing?
- How are consumers reacting to new products or services launched by merchants?
- For loyal customers, how is the share of spend changing over time?
- What is the company’s market share in a region for similar customers?
- Is the company’s loyal user base increasing or decreasing?
- Is the lifetime customer value increasing or decreasing?

Additional Use Cases:
- Use spending data to analyze sales/revenue broadly (sector-wide) or granularly (company-specific). Historically, our tracked consumer spend has correlated above 85% with company-reported data from thousands of firms. Users can sort and filter by many metrics and KPIs, such as sales and transaction growth rates and online or offline transactions, as well as view customer behavior within a geographic market at a state or city level.
- Reveal cohort consumer behavior to decipher long-term behavioral consumer spending shifts. Measure market share, wallet share, loyalty, consumer lifetime value, retention, demographics, and more.
- Study the effects of inflation rates via such metrics as increased total spend, ticket size, and number of transactions.
- Seek out alpha-generating signals or manage your business strategically with essential, aggregated transaction and spending data analytics.

Use Case Categories (our data supports innumerable use cases, and we look forward to working with new ones):

1. Market Research: Company Analysis, Company Valuation, Competitive Intelligence, Competitor Analysis, Competitor Analytics, Competitor Insights, Customer Data Enrichment, Customer Data Insights, Customer Data Intelligence, Demand Forecasting, Ecommerce Intelligence, Employee Pay Strategy, Employment Analytics, Job Income Analysis, Job Market Pricing, Marketing, Marketing Data Enrichment, Marketing Intelligence, Marketing Strategy, Payment History Analytics, Price Analysis, Pricing Analytics, Retail, Retail Analytics, Retail Intelligence, Retail POS Data Analysis, and Salary Benchmarking

2. Investment Research: Financial Services, Hedge Funds, Investing, Mergers & Acquisitions (M&A), Stock Picking, Venture Capital (VC)

3. Consumer Analysis: Consumer Data Enrichment, Consumer Intelligence

4. Market Data: Analytics, B2C Data Enrichment, Bank Data Enrichment, Behavioral Analytics, Benchmarking, Customer Insights, Customer Intelligence, Data Enhancement, Data Enrichment, Data Intelligence, Data Modeling, Ecommerce Analysis, Ecommerce Data Enrichment, Economic Analysis, Financial Data Enrichment, Financial Intelligence, Local Economic Forecasting, Location-based Analytics, Market Analysis, Market Analytics, Market Intelligence, Market Potential Analysis, Market Research, Market Share Analysis, Sales, Sales Data Enrichment, Sales Enablement, Sales Insights, Sales Intelligence, Spending Analytics, Stock Market Predictions, and Trend Analysis
‘JB Link Telco Customer Churn’ analyzed by Analyst-2
analyst-2.ai
Updated Jan 28, 2022
Cite
Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com) (2022). ‘JB Link Telco Customer Churn’ analyzed by Analyst-2 [Dataset]. https://analyst-2.ai/analysis/kaggle-jb-link-telco-customer-churn-742f/5fbf9511/?iid=042-751&v=presentation
Dataset updated
Jan 28, 2022
Dataset authored and provided by
Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com)
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Analysis of ‘JB Link Telco Customer Churn’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://www.kaggle.com/johnflag/jb-link-telco-customer-churn on 28 January 2022.

--- Dataset description provided by original source is as follows ---

This is a customized version of the widely known IBM Telco Customer Churn dataset. I've added a few more columns and modified others in order to make it a little more realistic.

My customizations are based on the following version: Telco customer churn (11.1.3+)

Below you may find a fictional business problem I created. You may use it in order to start developing something around this dataset.

JB Link Customer Churn Problem

JB Link is a small telecom company located in the state of California that provides Phone and Internet services to customers in more than 1,000 cities and 1,600 zip codes.

The company has been in the market for just 6 years and has grown quickly by investing in infrastructure to bring internet and phone networks to regions that had poor or no coverage.

The company also has a very skilled sales team that consistently performs well at attracting new customers. The number of new customers acquired in the past quarter represents 15% of the total.

However, by the end of this same period, only 43% of these customers stayed with the company, and most of them decided not to renew their contracts after a few months, meaning the customer churn rate is very high and the company is now facing a big challenge in retaining its customers.

The total customer churn rate last quarter was around 27%, resulting in a decrease of almost 12% in the total number of customers.
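The figures above can be checked with a short calculation. The following is a minimal sketch, assuming that both the 15% of new customers and the 27% churn rate are measured against the customer count at the start of the quarter; the starting base of 1,000 customers and the helper `net_customer_change` are hypothetical.

```python
def net_customer_change(base: int, new_pct: int, churn_pct: int):
    """Return (customers at end of quarter, percentage change vs. the start).

    Assumes both percentages are relative to the starting customer count.
    """
    acquired = base * new_pct // 100
    churned = base * churn_pct // 100
    end = base + acquired - churned
    change_pct = (end - base) * 100 / base
    return end, change_pct

# Hypothetical starting base of 1,000 customers.
end_customers, change_pct = net_customer_change(1000, 15, 27)
print(end_customers, change_pct)  # 880 -12.0
```

Under that assumption, acquisitions offset a little over half of the churn, which matches the stated decrease of almost 12%.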

The executive leadership of JB Link is aware that some competitors are investing in new technologies and in the expansion of their network coverage, and they believe this is one of the main drivers of the high customer churn rate.

Therefore, as an action plan, they have decided to create a task force inside the company that will be responsible for working on a customer retention strategy.

The task force will involve members from different areas of the company, including Sales, Finance, Marketing, Customer Service, Tech Support and a recently formed Data Science team.

The data science team will play a key role in this process and was assigned some very important tasks that will support the decisions and actions the other teams will be taking:
- Gather insights from the data to understand what is driving the high customer churn rate.
- Develop a Machine Learning model that can accurately predict the customers that are more likely to churn.
- Prescribe customized actions that could be taken in order to retain each of those customers.

The Data Science team was given a dataset with a random sample of 7,043 customers that can help in achieving this task.

The executives are aware that the cost of acquiring a new customer can be up to five times higher than the cost of retaining a customer, so they are expecting that the results of this project will save the company a lot of money and make it start growing again.

--- Original source retains full ownership of the source dataset ---
Consumer Marketing Data | Comprehensive Data of Consumer Marketing Insights...
datarade.ai
.csv, .xls, .txt
Updated Sep 11, 2024
Cite
VisitIQ™ (2024). Consumer Marketing Data | Comprehensive Data of Consumer Marketing Insights | Database & Dataset [Dataset]. https://datarade.ai/data-products/visitiq-consumer-marketing-data-comprehensive-data-of-co-visitiq
Available download formats: .csv, .xls, .txt
Dataset updated
Sep 11, 2024
Dataset authored and provided by
VisitIQ™
Area covered
United States of America
Description
At VisitIQ™, we provide a wealth of consumer marketing data to help businesses unlock deeper insights and optimize their B2C strategies. Our extensive and meticulously curated datasets are designed to provide a 360-degree view of your target consumers, combining a wide range of behavioral, demographic, and psychographic data points to deliver actionable insights that drive measurable results.

Our comprehensive consumer marketing database is built to fuel data-driven marketing strategies. With our rich behavioral insights, you can understand not just who your customers are, but also how they interact with your brand, what they are looking for, and what motivates their purchasing decisions. By tracking online and offline behaviors, preferences, purchase history, and engagement patterns, VisitIQ™ enables you to segment your audience more effectively and craft personalized marketing messages that resonate with your ideal customer profiles.

In addition to behavioral insights, our datasets provide detailed demographic information, including age, gender, location, income level, education, and household characteristics. This allows you to pinpoint your marketing efforts with incredible precision, reaching the right audience with the right message at the right time. Our data also includes psychographic attributes, such as lifestyle preferences, interests, and values, providing a deeper understanding of what drives consumer behavior and helping you create more compelling and relevant content.

VisitIQ™'s platform integrates seamlessly with your existing marketing stack, enabling you to utilize our consumer marketing data across multiple channels, from digital and social media to email and direct mail. With our data, you can improve targeting, increase engagement, reduce customer acquisition costs, and ultimately achieve a higher return on your marketing investment.

Whether you’re looking to attract new customers, retain existing ones, or re-engage lapsed consumers, VisitIQ™ provides the data you need to build effective, data-driven B2C marketing strategies. Our comprehensive datasets empower you to make informed decisions, optimize your marketing campaigns in real-time, and drive successful outcomes.

Unlock the full potential of your consumer marketing efforts with VisitIQ™. Transform your approach with powerful insights, sharpen your competitive edge, and achieve unparalleled marketing success.
Dataset for "Customer Feedback Text Analysis for Online Stores Reviews in...
dataverse.harvard.edu
Updated Nov 7, 2018
Cite
Tsvetanka Georgieva-Trifonova; Milena Stefanova; Stefan Kalchev (2018). Dataset for "Customer Feedback Text Analysis for Online Stores Reviews in Bulgarian" [Dataset]. http://doi.org/10.7910/DVN/TXIK9P
Unique identifier
https://doi.org/10.7910/DVN/TXIK9P
Dataset updated
Nov 7, 2018
Dataset provided by
Harvard Dataverse
Authors
Tsvetanka Georgieva-Trifonova; Milena Stefanova; Stefan Kalchev
License
https://dataverse.harvard.edu/api/datasets/:persistentId/versions/1.3/customlicense?persistentId=doi:10.7910/DVN/TXIK9P
Description
The dataset Customer_feedback_bg consists of customer reviews of online stores, written in Bulgarian. The data were retrieved from otzivi.bg and pazaruvaj.com and cover 87 online stores. 906 customer reviews were collected as free text and manually assigned to one of the following categories: compliments, complaints, mixed, suggestions.
Customer Churn Dataset for Telecom
kaggle.com
Updated Apr 8, 2025
Cite
MehmetErtas (2025). Customer Churn Dataset for Telecom [Dataset]. https://www.kaggle.com/datasets/mehmetertas/customer-churn-prediction
Dataset updated
Apr 8, 2025
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
MehmetErtas
Description
Dataset

This dataset was created by Mehmet Ertas

Contents
Sample Dataset for Testing
ieee-dataport.org
Updated Apr 28, 2025
Cite
Alex Outman (2025). Sample Dataset for Testing [Dataset]. https://ieee-dataport.org/documents/sample-dataset-testing
Dataset updated
Apr 28, 2025
Authors
Alex Outman
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
10
‘Sample Sales Data’ analyzed by Analyst-2
analyst-2.ai
Updated Jan 28, 2022
Cite
Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com) (2022). ‘Sample Sales Data’ analyzed by Analyst-2 [Dataset]. https://analyst-2.ai/analysis/kaggle-sample-sales-data-1dc8/1310507b/?iid=023-678&v=presentation
Dataset updated
Jan 28, 2022
Dataset authored and provided by
Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com)
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Analysis of ‘Sample Sales Data’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://www.kaggle.com/kyanyoga/sample-sales-data on 28 January 2022.

--- Dataset description provided by original source is as follows ---

Sample Sales Data, Order Info, Sales, Customer, Shipping, etc., Used for Segmentation, Customer Analytics, Clustering and More. Inspired by retail analytics. This was originally used for Pentaho DI Kettle, but I found the set could be useful for Sales Simulation training.

Originally Written by María Carina Roldán, Pentaho Community Member, BI consultant (Assert Solutions), Argentina. This work is licensed under the Creative Commons Attribution-Noncommercial-Share Alike 3.0 Unported License. Modified by Gus Segura June 2014.

--- Original source retains full ownership of the source dataset ---
customer-requests-nvidia-personas-sample
huggingface.co
Updated Jun 24, 2025
Cite
Daniel Vila (2025). customer-requests-nvidia-personas-sample [Dataset]. https://huggingface.co/datasets/dvilasuero/customer-requests-nvidia-personas-sample
Dataset updated
Jun 24, 2025
Authors
Daniel Vila
Description
dvilasuero/customer-requests-nvidia-personas-sample dataset hosted on Hugging Face and contributed by the HF Datasets community
FOI 26605 - Datasets - Open Data Portal
opendata.nhsbsa.net
Updated Sep 6, 2023
Cite
(2023). FOI 26605 - Datasets - Open Data Portal [Dataset]. https://opendata.nhsbsa.net/dataset/foi-26605
Dataset updated
Sep 6, 2023
Description
Further to the original Enterprise Application request, the contract below has expired. Please provide the current status.

Finance Capita CRM Trustmarque Solutions Ltd

I'd like to apologise for the length of this request, and how tedious it may be to handle. That being said, please make an effort to provide all of this information. The information I'm requesting is regarding the software contracts that the organisation uses, for the following fields.

Enterprise Resource Planning Software Solution (ERP):
Primary Customer Relationship Management Solution (CRM): For example, Salesforce, Lagan CRM, Microsoft Dynamics; software of this nature.
Primary Human Resources (HR) and Payroll Software Solution: For example, iTrent, ResourceLink, HealthRoster; software of this nature.
The organisation’s primary corporate Finance Software Solution: For example, Agresso, Integra, Sapphire Systems; software of this nature.

Name of Supplier: Can you please provide me with the software provider for each contract?
The brand of the software: Can you please provide me with the actual name of the software. Please do not provide me with the supplier name again; please provide me with the actual software name.
Description of the contract: Can you please provide me with detailed information about this contract and please state if upgrade, maintenance and support is included. Please also list the software modules included in these contracts.
Number of Users/Licenses: What is the total number of user/licenses for this contract?
Annual Spend: What is the annual average spend for each contract?
Contract Duration: What is the duration of the contract? Please include any available extensions within the contract.
Contract Start Date: What is the start date of this contract? Please include month and year of the contract. DD-MM-YY or MM-YY.
Contract Expiry: What is the expiry date of this contract? Please include month and year of the contract. DD-MM-YY or MM-YY.

🛍️ Fashion Retail Sales Dataset

kaggle.com

Updated Apr 1, 2025

Cite

Atharva Soundankar (2025). 🛍️ Fashion Retail Sales Dataset [Dataset]. https://www.kaggle.com/datasets/atharvasoundankar/fashion-retail-sales

Dataset updated

Apr 1, 2025

Dataset provided by

Kaggle (http://kaggle.com/)

Authors

Atharva Soundankar

License

Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically

Description

📜 Dataset Overview

This dataset contains 3,400 records of fashion retail sales, capturing various details about customer purchases, including item details, purchase amounts, ratings, and payment methods. It is useful for analyzing customer buying behavior, product popularity, and payment preferences.

📂 Dataset Details

Column Name	Data Type	Non-Null Count	Description
`Customer Reference ID`	Integer	3,400	A unique identifier for each customer.
`Item Purchased`	String	3,400	The name of the fashion item purchased.
`Purchase Amount (USD)`	Float	2,750	The purchase price of the item in USD (650 missing values).
`Date Purchase`	String	3,400	The date on which the purchase was made (format: DD-MM-YYYY).
`Review Rating`	Float	3,076	The customer review rating (scale: 1 to 5, 324 missing values).
`Payment Method`	String	3,400	The payment method used (e.g., Credit Card, Cash).

🔍 Key Insights

The dataset contains 3,400 transactions.
Missing values are present in:
- Purchase Amount (USD): 650 missing values
- Review Rating: 324 missing values
Payment Method includes multiple categories, allowing analysis of payment trends.
Date Purchase is in DD-MM-YYYY format, which can be useful for time-series analysis.
The dataset can help analyze sales trends, customer preferences, and payment behaviors in the fashion retail industry.
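
The two points above about missing values and the DD-MM-YYYY date format can be sketched in code. This is only an illustration under assumptions: the sample rows are made up, the column names are taken from the table above, and it assumes missing values appear as empty cells in the CSV, which the description does not confirm.

```python
import csv
import io
from datetime import datetime

# Hypothetical sample rows. Column names follow the table above; values are invented.
SAMPLE_CSV = """Customer Reference ID,Item Purchased,Purchase Amount (USD),Date Purchase,Review Rating,Payment Method
4018,Handbag,4619.0,05-02-2023,,Credit Card
4115,Tunic,,11-07-2023,2.0,Credit Card
4019,Tank Top,97.0,23-03-2023,4.1,Cash
"""


def to_float_or_none(cell):
    """Treat an empty cell as a missing value (an assumption about the file)."""
    return float(cell) if cell.strip() else None


def load_sales(text):
    """Parse CSV text into dicts with typed amounts, ratings, and dates."""
    rows = []
    for raw in csv.DictReader(io.StringIO(text)):
        rows.append({
            "customer_id": int(raw["Customer Reference ID"]),
            "item": raw["Item Purchased"],
            "amount": to_float_or_none(raw["Purchase Amount (USD)"]),
            # DD-MM-YYYY, as stated in the dataset description.
            "date": datetime.strptime(raw["Date Purchase"], "%d-%m-%Y").date(),
            "rating": to_float_or_none(raw["Review Rating"]),
            "payment": raw["Payment Method"],
        })
    return rows


def missing_counts(rows):
    """Count missing values in the two columns the description flags."""
    return {key: sum(1 for r in rows if r[key] is None) for key in ("amount", "rating")}


rows = load_sales(SAMPLE_CSV)
print(missing_counts(rows))
print(rows[0]["date"].isoformat())
```

Parsing the date with an explicit format avoids the day/month ambiguity that automatic date detection can introduce for values such as 05-02-2023.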

📊 Potential Use Cases

Sales Analysis: Understanding which fashion items are selling the most.
Customer Insights: Analyzing purchase behaviors and spending patterns.
Trend Forecasting: Identifying seasonal trends in fashion retail.
Payment Method Preferences: Understanding how customers prefer to pay.

Customer Segmentation - Raw Source Data
dataverse.harvard.edu
Updated May 6, 2025
Cite
Diomar Anez; Dimar Anez (2025). Customer Segmentation - Raw Source Data [Dataset]. http://doi.org/10.7910/DVN/0NS2KB
Unique identifier
https://doi.org/10.7910/DVN/0NS2KB
Dataset updated
May 6, 2025
Dataset provided by
Harvard Dataverse
Authors
Diomar Anez; Dimar Anez
License
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Description
This dataset contains raw, unprocessed data files pertaining to the management tool 'Customer Segmentation', including the closely related concept of Market Segmentation. The data originates from five distinct sources, each reflecting different facets of the tool's prominence and usage over time. Files preserve the original metrics and temporal granularity before any comparative normalization or harmonization.

Data Sources & File Details:

Google Trends File (Prefix: GT_):
- Metric: Relative Search Interest (RSI) Index (0-100 scale).
- Keywords Used: "customer segmentation" + "market segmentation" + "customer segmentation marketing"
- Time Period: January 2004 - January 2025 (Native Monthly Resolution).
- Scope: Global Web Search, broad categorization.
- Extraction Date: Data extracted January 2025.
- Notes: Index relative to peak interest within the period for these terms. Reflects public/professional search interest trends. Based on probabilistic sampling.
- Source URL: Google Trends Query

Google Books Ngram Viewer File (Prefix: GB_):
- Metric: Annual Relative Frequency (% of total n-grams in the corpus).
- Keywords Used: Customer Segmentation + Market Segmentation
- Time Period: 1950 - 2022 (Annual Resolution).
- Corpus: English.
- Parameters: Case Insensitive OFF, Smoothing 0.
- Extraction Date: Data extracted January 2025.
- Notes: Reflects term usage frequency in Google's digitized book corpus. Subject to corpus limitations (English bias, coverage).
- Source URL: Ngram Viewer Query

Crossref.org File (Prefix: CR_):
- Metric: Absolute count of publications per month matching keywords.
- Keywords Used: ("customer segmentation" OR "market segmentation") AND ("marketing" OR "strategy" OR "management" OR "targeting" OR "analysis" OR "approach" OR "practice")
- Time Period: 1950 - 2025 (Queried for monthly counts based on publication date metadata).
- Search Fields: Title, Abstract.
- Extraction Date: Data extracted January 2025.
- Notes: Reflects volume of relevant academic publications indexed by Crossref. Deduplicated using DOIs; records without DOIs omitted.
- Source URL: Crossref Search Query

Bain & Co. Survey - Usability File (Prefix: BU_):
- Metric: Original Percentage (%) of executives reporting tool usage.
- Tool Names/Years Included: Customer Segmentation (1999, 2000, 2002, 2004, 2006, 2008, 2010, 2012, 2014, 2017).
- Respondent Profile: CEOs, CFOs, COOs, other senior leaders; global, multi-sector.
- Source: Bain & Company Management Tools & Trends publications (Rigby D., Bilodeau B., et al., various years: 2001, 2003, 2005, 2007, 2009, 2011, 2013, 2015, 2017). Note: Tool not included in the 2022 survey data.
- Data Compilation Period: July 2024 - January 2025.
- Notes: Data points correspond to specific survey years. Sample sizes: 1999/475; 2000/214; 2002/708; 2004/960; 2006/1221; 2008/1430; 2010/1230; 2012/1208; 2014/1067; 2017/1268.

Bain & Co. Survey - Satisfaction File (Prefix: BS_):
- Metric: Original Average Satisfaction Score (Scale 0-5).
- Tool Names/Years Included: Customer Segmentation (1999, 2000, 2002, 2004, 2006, 2008, 2010, 2012, 2014, 2017).
- Respondent Profile: CEOs, CFOs, COOs, other senior leaders; global, multi-sector.
- Source: Bain & Company Management Tools & Trends publications (Rigby D., Bilodeau B., et al., various years: 2001, 2003, 2005, 2007, 2009, 2011, 2013, 2015, 2017). Note: Tool not included in the 2022 survey data.
- Data Compilation Period: July 2024 - January 2025.
- Notes: Data points correspond to specific survey years. Sample sizes: 1999/475; 2000/214; 2002/708; 2004/960; 2006/1221; 2008/1430; 2010/1230; 2012/1208; 2014/1067; 2017/1268. Reflects subjective executive perception of utility.

File Naming Convention: Files generally follow the pattern PREFIX_Tool.csv, where the PREFIX indicates the data source:
- GT_: Google Trends
- GB_: Google Books Ngram
- CR_: Crossref.org (Count Data for this Raw Dataset)
- BU_: Bain & Company Survey (Usability)
- BS_: Bain & Company Survey (Satisfaction)

The essential identification comes from the PREFIX and the Tool Name segment. This dataset resides within the 'Management Tool Source Data (Raw Extracts)' Dataverse.
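
The PREFIX_Tool.csv naming convention described above lends itself to a small lookup. The sketch below is illustrative only: the helper name `describe_file` and the example file name `GT_CustomerSegmentation.csv` are hypothetical, since the description gives the pattern but no actual file names.

```python
from pathlib import Path

# Prefix-to-source mapping, taken from the naming convention in the description.
SOURCES = {
    "GT": "Google Trends",
    "GB": "Google Books Ngram",
    "CR": "Crossref.org",
    "BU": "Bain & Company Survey (Usability)",
    "BS": "Bain & Company Survey (Satisfaction)",
}


def describe_file(filename):
    """Split a PREFIX_Tool.csv name into (data source, tool name segment).

    Raises ValueError when the name has no underscore or an unknown prefix.
    """
    stem = Path(filename).stem
    prefix, separator, tool = stem.partition("_")
    if not separator or prefix not in SOURCES:
        raise ValueError(f"unrecognized file name: {filename!r}")
    return SOURCES[prefix], tool


# Hypothetical file name, used only to exercise the helper.
print(describe_file("GT_CustomerSegmentation.csv"))
```

Because the description says files "generally" follow the pattern, raising on unknown names makes exceptions visible instead of silently mislabeling a file.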

Cite

Sourav Banerjee (2023). Customer Shopping Trends Dataset [Dataset]. https://www.kaggle.com/datasets/iamsouravbanerjee/customer-shopping-trends-dataset

Customer Shopping Trends Dataset

Journey into Consumer Insights and Retail Evolution with Synthetic Data

32 scholarly articles cite this dataset (View in Google Scholar)

Dataset updated

Oct 5, 2023

Dataset provided by

Kaggle (http://kaggle.com/)

Authors

Sourav Banerjee

Description

Context

The Customer Shopping Preferences Dataset offers valuable insights into consumer behavior and purchasing patterns. Understanding customer preferences and trends is critical for businesses to tailor their products, marketing strategies, and overall customer experience. This dataset captures a wide range of customer attributes including age, gender, purchase history, preferred payment methods, frequency of purchases, and more. Analyzing this data can help businesses make informed decisions, optimize product offerings, and enhance customer satisfaction. The dataset stands as a valuable resource for businesses aiming to align their strategies with customer needs and preferences. It's important to note that this dataset is a Synthetic Dataset Created for Beginners to learn more about Data Analysis and Machine Learning.

Content

This dataset encompasses various features related to customer shopping preferences, gathering essential information for businesses seeking to enhance their understanding of their customer base. The features include customer age, gender, purchase amount, preferred payment methods, frequency of purchases, and feedback ratings. Additionally, data on the type of items purchased, shopping frequency, preferred shopping seasons, and interactions with promotional offers is included. With a collection of 3900 records, this dataset serves as a foundation for businesses looking to apply data-driven insights for better decision-making and customer-centric strategies.

Dataset Glossary (Column-wise)

Customer ID - Unique identifier for each customer
Age - Age of the customer
Gender - Gender of the customer (Male/Female)
Item Purchased - The item purchased by the customer
Category - Category of the item purchased
Purchase Amount (USD) - The amount of the purchase in USD
Location - Location where the purchase was made
Size - Size of the purchased item
Color - Color of the purchased item
Season - Season during which the purchase was made
Review Rating - Rating given by the customer for the purchased item
Subscription Status - Indicates if the customer has a subscription (Yes/No)
Shipping Type - Type of shipping chosen by the customer
Discount Applied - Indicates if a discount was applied to the purchase (Yes/No)
Promo Code Used - Indicates if a promo code was used for the purchase (Yes/No)
Previous Purchases - The total count of transactions concluded by the customer at the store, excluding the ongoing transaction
Payment Method - Customer's most preferred payment method
Frequency of Purchases - Frequency at which the customer makes purchases (e.g., Weekly, Fortnightly, Monthly)
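
As a rough illustration of how the glossary columns might be used, the sketch below averages Purchase Amount (USD) per value of a categorical column such as Subscription Status. The rows are invented for the example and include only a subset of the columns; the helper name `mean_amount_by` is hypothetical.

```python
import csv
import io
from collections import defaultdict

# Invented rows using a subset of the glossary's column names.
SAMPLE_CSV = """Customer ID,Age,Gender,Purchase Amount (USD),Subscription Status,Frequency of Purchases
1,55,Male,53,Yes,Fortnightly
2,19,Male,64,Yes,Fortnightly
3,50,Female,73,No,Weekly
4,21,Female,90,No,Monthly
"""


def mean_amount_by(text, column):
    """Return the mean purchase amount for each distinct value of `column`."""
    totals = defaultdict(lambda: [0.0, 0])  # value -> [running sum, row count]
    for row in csv.DictReader(io.StringIO(text)):
        bucket = totals[row[column]]
        bucket[0] += float(row["Purchase Amount (USD)"])
        bucket[1] += 1
    return {value: total / count for value, (total, count) in totals.items()}


print(mean_amount_by(SAMPLE_CSV, "Subscription Status"))
```

The same helper works for any categorical column in the glossary, for example Season or Payment Method, provided the column is present in the file.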

Structure of the Dataset

https://i.imgur.com/6UEqejq.png

Acknowledgement

This dataset is a synthetic creation generated using ChatGPT to simulate a realistic customer shopping experience. Its purpose is to provide a platform for beginners and data enthusiasts, allowing them to create, enjoy, practice, and learn from a dataset that mirrors real-world customer shopping behavior. The aim is to foster learning and experimentation in a simulated environment, encouraging a deeper understanding of data analysis and interpretation in the context of consumer preferences and retail scenarios.

Cover Photo by: Freepik

Thumbnail by: Clothing icons created by Flat Icons - Flaticon
