100+ datasets found

Banking Customer Churn Prediction Dataset
kaggle.com
zip
Updated May 16, 2024
Cite
Saurabh Badole (2024). Banking Customer Churn Prediction Dataset [Dataset]. https://www.kaggle.com/datasets/saurabhbadole/bank-customer-churn-prediction-dataset
Explore at:
Available download formats: zip (267794 bytes)
Dataset updated
May 16, 2024
Authors
Saurabh Badole
License
Attribution-NonCommercial-NoDerivs 4.0 (CC BY-NC-ND 4.0): https://creativecommons.org/licenses/by-nc-nd/4.0/
License information was derived automatically
Description
This dataset contains information about bank customers and their churn status, which indicates whether they have exited the bank or not. It is suitable for exploring and analyzing factors influencing customer churn in banking institutions and for building predictive models to identify customers at risk of churning.

Features:

RowNumber: The sequential number assigned to each row in the dataset.

CustomerId: A unique identifier for each customer.

Surname: The surname of the customer.

CreditScore: The credit score of the customer.

Geography: The geographical location of the customer (e.g., country or region).

Gender: The gender of the customer.

Age: The age of the customer.

Tenure: The number of years the customer has been with the bank.

Balance: The account balance of the customer.

NumOfProducts: The number of bank products the customer has.

HasCrCard: Indicates whether the customer has a credit card (binary: yes/no).

IsActiveMember: Indicates whether the customer is an active member (binary: yes/no).

EstimatedSalary: The estimated salary of the customer.

Exited: Indicates whether the customer has exited the bank (binary: yes/no).

Usage:

This dataset can be used for exploratory data analysis to understand the factors influencing customer churn in banks.

It can also be used to build machine learning models for predicting customer churn based on the given features.
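
A minimal sketch of how the columns above might be prepared for a churn model, assuming the binary columns are stored as 0/1 integers. The sample rows and their values are hypothetical and only illustrate the column layout.

```python
# Hypothetical sample rows that follow the column list above; all values are made up.
# Assumption: the binary columns (HasCrCard, IsActiveMember, Exited) are 0/1 integers.
ROWS = [
    {"RowNumber": 1, "CustomerId": 1001, "Surname": "Doe", "CreditScore": 620,
     "Geography": "France", "Gender": "Female", "Age": 42, "Tenure": 2,
     "Balance": 0.0, "NumOfProducts": 1, "HasCrCard": 1, "IsActiveMember": 1,
     "EstimatedSalary": 50000.0, "Exited": 1},
    {"RowNumber": 2, "CustomerId": 1002, "Surname": "Roe", "CreditScore": 710,
     "Geography": "Spain", "Gender": "Male", "Age": 35, "Tenure": 7,
     "Balance": 82000.0, "NumOfProducts": 2, "HasCrCard": 0, "IsActiveMember": 1,
     "EstimatedSalary": 64000.0, "Exited": 0},
]

ID_COLUMNS = ("RowNumber", "CustomerId", "Surname")  # identifiers, dropped before modeling
TARGET = "Exited"


def split_features_target(row):
    """Return (features, target) with identifier columns removed."""
    features = {k: v for k, v in row.items() if k not in ID_COLUMNS and k != TARGET}
    return features, int(row[TARGET])


def one_hot(features, column, categories):
    """Replace one categorical column with a 0/1 indicator per category."""
    encoded = {k: v for k, v in features.items() if k != column}
    for category in categories:
        encoded[f"{column}_{category}"] = int(features[column] == category)
    return encoded


geographies = sorted({row["Geography"] for row in ROWS})
X, y = [], []
for row in ROWS:
    features, target = split_features_target(row)
    X.append(one_hot(features, "Geography", geographies))
    y.append(target)
```

Gender is left as a string here; it could be encoded the same way before fitting a model.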

License:

This dataset is made available under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
Telco Customer Churn
kaggle.com
zip
Updated Feb 23, 2018
Cite
BlastChar (2018). Telco Customer Churn [Dataset]. https://www.kaggle.com/datasets/blastchar/telco-customer-churn
Explore at:
Available download formats: zip (175758 bytes)
Dataset updated
Feb 23, 2018
Authors
BlastChar
Description
Context

"Predict behavior to retain customers. You can analyze all relevant customer data and develop focused customer retention programs." [IBM Sample Data Sets]

Content

Each row represents a customer; each column contains a customer attribute described in the column metadata.

The data set includes information about:

Customers who left within the last month – the column is called Churn

Services that each customer has signed up for – phone, multiple lines, internet, online security, online backup, device protection, tech support, and streaming TV and movies

Customer account information – how long they’ve been a customer, contract, payment method, paperless billing, monthly charges, and total charges

Demographic info about customers – gender, age range, and if they have partners and dependents

Inspiration

To explore this type of model and learn more about the subject.

New version from IBM: https://community.ibm.com/community/user/businessanalytics/blogs/steven-macko/2019/07/11/telco-customer-churn-1113
Predictive Analytics for Customer Churn: Dataset
kaggle.com
zip
Updated Oct 6, 2023
Cite
Safrin S (2023). Predictive Analytics for Customer Churn: Dataset [Dataset]. https://www.kaggle.com/datasets/safrin03/predictive-analytics-for-customer-churn-dataset
Explore at:
Available download formats: zip (25124511 bytes)
Dataset updated
Oct 6, 2023
Authors
Safrin S
Description
Context : This dataset is part of a data science project focused on customer churn prediction for a subscription-based service. Customer churn, the rate at which customers cancel their subscriptions, is a vital metric for businesses offering subscription services. Predictive analytics techniques are employed to anticipate which customers are likely to churn, enabling companies to take proactive measures for customer retention.

Content : This dataset contains anonymized information about customer subscriptions and their interaction with the service. The data includes various features such as subscription type, payment method, viewing preferences, customer support interactions, and other relevant attributes. It consists of three files: "test.csv", "train.csv", and "data_descriptions.csv".

Columns :

CustomerID: Unique identifier for each customer

SubscriptionType: Type of subscription plan chosen by the customer (e.g., Basic, Premium, Deluxe)

PaymentMethod: Method used for payment (e.g., Credit Card, Electronic Check, PayPal)

PaperlessBilling: Whether the customer uses paperless billing (Yes/No)

ContentType: Type of content accessed by the customer (e.g., Movies, TV Shows, Documentaries)

MultiDeviceAccess: Whether the customer has access on multiple devices (Yes/No)

DeviceRegistered: Device registered by the customer (e.g., Smartphone, Smart TV, Laptop)

GenrePreference: Genre preference of the customer (e.g., Action, Drama, Comedy)

Gender: Gender of the customer (Male/Female)

ParentalControl: Whether parental control is enabled (Yes/No)

SubtitlesEnabled: Whether subtitles are enabled (Yes/No)

AccountAge: Age of the customer's subscription account (in months)

MonthlyCharges: Monthly subscription charges

TotalCharges: Total charges incurred by the customer

ViewingHoursPerWeek: Average number of viewing hours per week

SupportTicketsPerMonth: Number of customer support tickets raised per month

AverageViewingDuration: Average duration of each viewing session

ContentDownloadsPerMonth: Number of content downloads per month

UserRating: Customer satisfaction rating (1 to 5)

WatchlistSize: Size of the customer's content watchlist

Acknowledgments : The dataset used in this project was obtained from the Data Science Challenge on Coursera and is used for educational and research purposes. Any resemblance to real persons or entities is purely coincidental.
Customer Churn Prediction Business Dataset
kaggle.com
zip
Updated Dec 14, 2025
Cite
Arif Miah (2025). Customer Churn Prediction Business Dataset [Dataset]. https://www.kaggle.com/datasets/miadul/customer-churn-prediction-business-dataset
Explore at:
Available download formats: zip (519989 bytes)
Dataset updated
Dec 14, 2025
Authors
Arif Miah
License
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
Description
📄 Dataset Description

Customer churn is one of the most critical challenges for subscription-based and service-oriented businesses. Retaining existing customers is significantly more cost-effective than acquiring new ones, making churn prediction a key business analytics problem.

This dataset is a synthetic but business-realistic customer churn dataset designed for machine learning, data science, and predictive analytics use cases. The data simulates real-world customer behavior by incorporating customer demographics, product usage patterns, billing and payment history, customer support interactions, and engagement metrics.

The target variable, churn, indicates whether a customer is likely to discontinue the service. Churn labels are generated using business-driven rules combined with probabilistic noise, ensuring realistic feature correlations rather than random labeling.

This dataset is ideal for:

Exploratory Data Analysis (EDA)

Feature engineering

Customer churn prediction modeling

Explainable AI (SHAP, feature importance)

Business dashboards and decision support systems

End-to-end ML deployment using Streamlit or Flask

🧾 Dataset Characteristics

Number of records: 10,000 customers

Target variable: churn (0 = No, 1 = Yes)

Data types: Numerical & Categorical

Domain: Subscription / SaaS / Telecom / Service Business

Data source: Synthetic (business-logic driven)

📊 Feature Categories

Customer Profile: age, gender, location, tenure, contract type

Product Usage: logins, session duration, feature usage, activity trends

Billing & Payment: subscription fees, revenue, payment failures, discounts

Customer Support: tickets, resolution time, CSAT, complaints

Engagement & Feedback: email activity, NPS score, survey responses

🎯 Use Cases

Predict high-risk churn customers

Identify key churn drivers

Estimate revenue at risk

Build retention strategies

Train and evaluate ML/DL models

Create executive-level business dashboards

⚠️ Disclaimer

This dataset is synthetically generated for educational, research, and portfolio purposes. While it reflects realistic business patterns, it does not represent real customer data.
Online Retail Customer Churn Dataset
kaggle.com
zip
Updated Feb 14, 2024
Cite
Hassane Skikri (2024). Online Retail Customer Churn Dataset [Dataset]. https://www.kaggle.com/datasets/hassaneskikri/online-retail-customer-churn-dataset
Explore at:
Available download formats: zip (23795 bytes)
Dataset updated
Feb 14, 2024
Authors
Hassane Skikri
License
CC0 1.0 Universal: https://creativecommons.org/publicdomain/zero/1.0/
Description
Overview:

This dataset provides a comprehensive overview of customer interactions with an online retail store, aiming to predict customer churn based on various behavioral and demographic features. It includes data on customer demographics, spending behavior, satisfaction levels, and engagement with marketing campaigns. The dataset is designed for analysis and development of predictive models to identify customers at risk of churn, enabling targeted customer retention strategies.

Description of Columns:

Customer_ID: A unique identifier for each customer.

Age: The customer's age.

Gender: The customer's gender (Male, Female, Other).

Annual_Income: The annual income of the customer in thousands of dollars.

Total_Spend: The total amount spent by the customer in the last year.

Years_as_Customer: The number of years the individual has been a customer of the store.

Num_of_Purchases: The number of purchases the customer made in the last year.

Average_Transaction_Amount: The average amount spent per transaction.

Num_of_Returns: The number of items the customer returned in the last year.

Num_of_Support_Contacts: The number of times the customer contacted support in the last year.

Satisfaction_Score: A score from 1 to 5 indicating the customer's satisfaction with the store.

Last_Purchase_Days_Ago: The number of days since the customer's last purchase.

Email_Opt_In: Whether the customer has opted in to receive marketing emails.

Promotion_Response: The customer's response to the last promotional campaign (Responded, Ignored, Unsubscribed).

Target_Churn: Indicates whether the customer churned (True or False).
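
As a rough illustration of how the Target_Churn and Promotion_Response columns could be explored together, the sketch below computes a churn rate per response category. It assumes the CSV stores the target as the strings "True" and "False"; the sample rows are hypothetical.

```python
from collections import defaultdict


def parse_bool(text):
    """Interpret "True"/"False" strings (any casing) as booleans."""
    return str(text).strip().lower() == "true"


def churn_rate_by(rows, column, target="Target_Churn"):
    """Return {category: share of churned customers} for one categorical column."""
    totals = defaultdict(int)
    churned = defaultdict(int)
    for row in rows:
        key = row[column]
        totals[key] += 1
        churned[key] += int(parse_bool(row[target]))
    return {key: churned[key] / totals[key] for key in totals}


# Hypothetical rows; only the two columns used here are shown.
sample_rows = [
    {"Promotion_Response": "Responded", "Target_Churn": "False"},
    {"Promotion_Response": "Responded", "Target_Churn": "True"},
    {"Promotion_Response": "Ignored", "Target_Churn": "True"},
    {"Promotion_Response": "Unsubscribed", "Target_Churn": "True"},
]
rates = churn_rate_by(sample_rows, "Promotion_Response")
```

The same helper works for any other categorical column, such as Gender or Email_Opt_In.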
Customer Churn Prediction Dataset
kaggle.com
zip
Updated Apr 7, 2025
Cite
Ziya (2025). Customer Churn Prediction Dataset [Dataset]. https://www.kaggle.com/datasets/ziya07/customer-churn-prediction-dataset
Explore at:
Available download formats: zip (7446 bytes)
Dataset updated
Apr 7, 2025
Authors
Ziya
License
CC0 1.0 Universal: https://creativecommons.org/publicdomain/zero/1.0/
Description
The Customer Churn Prediction Dataset is a dataset designed to predict customer churn based on various behavioral and demographic features. The dataset contains information about 1,000 customers, and includes the following key features:

Customer_ID: A unique identifier for each customer.

Age: The age of the customer (ranging from 18 to 70 years).

Gender: The gender of the customer (0 = Male, 1 = Female).

Monthly_Spending: The amount of money spent monthly by the customer (ranging from 50 to 500 units).

Subscription_Length: The number of years the customer has been subscribed to the service (ranging from 1 to 10 years).

Support_Interactions: The number of times the customer has interacted with customer support (ranging from 0 to 5).

Churn: The target variable indicating whether the customer has churned (1) or remained (0).
Bank Customer Churn Dataset
kaggle.com
zip
Updated Aug 30, 2022
Cite
Gaurav Topre (2022). Bank Customer Churn Dataset [Dataset]. https://www.kaggle.com/datasets/gauravtopre/bank-customer-churn-dataset
Explore at:
Available download formats: zip (191965 bytes)
Dataset updated
Aug 30, 2022
Authors
Gaurav Topre
Description
This dataset covers ABC Multistate Bank and has the following columns:

customer_id, unused variable.

credit_score, used as input.

country, used as input.

gender, used as input.

age, used as input.

tenure, used as input.

balance, used as input.

products_number, used as input.

credit_card, used as input.

active_member, used as input.

estimated_salary, used as input.

churn, used as the target. 1 if the client has left the bank during some period or 0 if he/she has not.

The aim is to predict customer churn for ABC Bank.
Customer Churn Prediction Dataset
kaggle.com
zip
Updated Apr 12, 2025
Cite
AJ (2025). Customer Churn Prediction Dataset [Dataset]. https://www.kaggle.com/datasets/smayanj/customer-churn-prediction-dataset
Explore at:
Available download formats: zip (622898 bytes)
Dataset updated
Apr 12, 2025
Authors
AJ
License
CC0 1.0 Universal: https://creativecommons.org/publicdomain/zero/1.0/
Description
This is a synthetic dataset created to simulate customer behavior in a subscription-based service. It includes 15,000 rows, with each row representing a single customer.

Features:

tenure_months
How long (in months) the customer has been using the service.

monthly_usage_hours
Average number of hours the customer uses the service per month.

has_multiple_devices
Binary value (1 = yes, 0 = no). Whether the customer uses more than one device.

customer_support_calls
Number of times the customer contacted customer support.

payment_failures
Binary value (1 = yes, 0 = no). Whether the customer had recent payment issues.

is_premium_plan
Binary value (1 = yes, 0 = no). Whether the customer is on a premium subscription.

Target:

churn
Binary value (1 = customer will leave, 0 = customer will stay).
This is calculated based on a rule-based formula that considers factors like low tenure, low usage, support calls, and payment issues. Some randomness is added to mimic real-world uncertainty.
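
The description does not give the actual formula, so the following is only a sketch of what a rule-based label with added randomness could look like. The weights and thresholds are hypothetical, not the dataset author's values.

```python
import random


def churn_probability(customer):
    """Hypothetical rule-based score; weights and thresholds are illustrative only."""
    score = 0.05  # baseline risk
    if customer["tenure_months"] < 6:
        score += 0.25  # low tenure
    if customer["monthly_usage_hours"] < 10:
        score += 0.20  # low usage
    score += 0.05 * min(customer["customer_support_calls"], 5)  # support calls, capped
    if customer["payment_failures"] == 1:
        score += 0.20  # recent payment issues
    return min(score, 0.95)


def label_churn(customer, rng):
    """Draw a 0/1 label so that identical customers can still get different labels."""
    return int(rng.random() < churn_probability(customer))


at_risk = {"tenure_months": 2, "monthly_usage_hours": 4,
           "customer_support_calls": 4, "payment_failures": 1}
loyal = {"tenure_months": 40, "monthly_usage_hours": 60,
         "customer_support_calls": 0, "payment_failures": 0}

rng = random.Random(0)  # seeded so the draw is reproducible
labels = [label_churn(at_risk, rng), label_churn(loyal, rng)]
```

Because the label is sampled rather than thresholded, a model trained on such data cannot reach perfect accuracy, which is the "real-world uncertainty" the description refers to.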
Customer Churn Prediction Dataset
kaggle.com
zip
Updated Mar 31, 2025
Cite
Şahide Şeker (2025). Customer Churn Prediction Dataset [Dataset]. https://www.kaggle.com/datasets/sahideseker/customer-churn-prediction-dataset
Explore at:
Available download formats: zip (9296 bytes)
Dataset updated
Mar 31, 2025
Authors
Şahide Şeker
License
Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
Description
🇬🇧 English:

This synthetic dataset was designed for those who want to practice customer churn prediction using structured tabular data. It includes 1,000 customer records, each containing features such as age, service tenure, service type, monthly fee, and churn status.

Use this dataset to:

Build classification models like Logistic Regression, Random Forest, or XGBoost

Explore churn-related patterns (e.g. short tenure, high price, mobile users)

Simulate real-world business scenarios without needing real customer data

Features:

customer_id: Unique customer ID (e.g. C1001 to C2000)

age: Age of the customer

tenure: Number of months the customer has been active

service_type: Type of service used (internet, mobile, tv, bundle)

monthly_fee: Monthly subscription fee

churn: Whether the customer has left the service (1 = Yes, 0 = No)

🇹🇷 Turkish:

This synthetic dataset was created for researchers and students who want to work on customer churn prediction. It contains fake but realistic data on the age, service tenure, service type, monthly payment, and subscription status of 1,000 customers.

With this dataset:

Classification models such as Logistic Regression, Random Forest, and XGBoost can be applied

Factors that affect churn behavior can be examined (for example short membership, high price, mobile users)

Business scenarios can be studied without access to real customer data

🧾 Variables:

customer_id: Customer ID (e.g. C1001 – C2000)

age: Age of the customer

tenure: Number of months the customer has received the service

service_type: Type of service received (internet, mobile, tv, bundle)

monthly_fee: Monthly payment amount

churn: Whether the customer has left the service (1 = Yes, 0 = No)
Patient Churn Prediction Dataset for Healthcare
kaggle.com
zip
Updated Jan 20, 2026
Cite
Nudrat Abbas (2026). Patient Churn Prediction Dataset for Healthcare [Dataset]. https://www.kaggle.com/datasets/nudratabbas/patient-churn-prediction-dataset-for-healthcare
Explore at:
Available download formats: zip (54926 bytes)
Dataset updated
Jan 20, 2026
Authors
Nudrat Abbas
License
CC0 1.0 Universal: https://creativecommons.org/publicdomain/zero/1.0/
Description
Context

Patient churn (attrition) is a critical challenge in healthcare, costing providers billions in lost revenue and disrupting continuity of care. Studies show:

Acquiring a new patient costs 5-25x more than retaining an existing one

20-30% of patients switch providers annually

Low satisfaction and poor engagement are primary drivers of churn

Understanding which patients are at risk of leaving—and why—enables healthcare organizations to:

Implement proactive retention strategies

Improve patient satisfaction and outcomes

Optimize resource allocation for high-risk patients

Increase lifetime patient value

Content

The dataset contains 2,000 patient records with detailed behavioral, satisfaction, and engagement metrics:

Patient Demographics:

Age, Gender, Geographic location

Tenure with healthcare provider (months)

Service Utilization:

Annual visit frequency

Missed appointments

Days since last interaction

Specialty of care

Satisfaction Metrics:

Overall satisfaction score (1-5)

Wait time satisfaction

Staff interaction satisfaction

Provider rating

Financial & Engagement Factors:

Insurance type

Average out-of-pocket costs

Billing issues flag

Patient portal usage

Referral behavior

Distance to facility

Target Variable:

Churned (0 = Retained, 1 = Churned)

Inspiration & Use Cases

This dataset enables healthcare organizations to:

Churn Prediction - Build machine learning models to identify at-risk patients before they leave

Retention Marketing - Target high-risk patients with personalized retention campaigns

Patient Experience Improvement - Identify key drivers of satisfaction and dissatisfaction

Resource Optimization - Allocate retention resources to patients most likely to benefit

Lifetime Value Analysis - Understand the long-term impact of patient retention

Service Quality Enhancement - Pinpoint operational issues affecting patient loyalty

Ideal for: healthcare analysts, data scientists, patient experience managers, and students learning classification algorithms

Online Retail Customer Churn Prediction Dataset

kaggle.com

zip

Updated Jun 21, 2025

Cite

Sahil Islam007 (2025). Online Retail Customer Churn Prediction Dataset [Dataset]. https://www.kaggle.com/datasets/sahilislam007/online-retail-customer-churn-prediction-dataset

Explore at:

Available download formats: zip (430917 bytes)

Dataset updated

Jun 21, 2025

Authors

Sahil Islam007

License

MIT License: https://opensource.org/licenses/MIT
License information was derived automatically

Description

This synthetic dataset simulates customer behavior data for an online retail company and is designed to be useful for Exploratory Data Analysis (EDA) and various machine learning tasks such as:

Customer segmentation

Churn prediction

Recommendation systems

Customer lifetime value estimation

🔍 Dataset Overview: Each row represents a unique customer, and the columns provide information on their demographics, shopping habits, engagement with the website, and satisfaction.

Column	Description
`CustomerID`	Unique identifier for each customer
`Age`	Customer's age
`Gender`	Gender of the customer
`Annual_Income_USD`	Annual income in US dollars
`Spending_Score`	Score based on spending behavior (1–100)
`Membership_Status`	Customer loyalty level (Bronze to Platinum)
`Preferred_Payment_Method`	Payment method most often used
`Region`	Geographical region (e.g., North, South)
`Total_Purchases`	Total number of purchases made
`Avg_Purchase_Value`	Average value of each purchase
`Last_Purchase_Date`	Date of the most recent purchase
`Churn`	Whether the customer has churned (0 = No, 1 = Yes)
`Satisfaction_Score`	Satisfaction score (1–5 scale)
`Website_Visits_Last_Month`	Number of visits to the website last month
`Avg_Time_Per_Visit_Minutes`	Average time spent on website per visit
`Support_Tickets_Last_6_Months`	Number of support tickets raised
`Referred_Friends`	Number of friends referred to the platform

✅ Use Cases:

Churn Prediction: Predict if a customer will churn based on behavior and demographics.

Segmentation: Use clustering to segment customers by behavior (e.g., income, spending, satisfaction).

Classification/Regression: Predict customer satisfaction or spending score.

Recommendation Engines: Based on purchase history and behavior patterns.

Data from: Customer Churn Dataset
kaggle.com
zip
Updated Dec 23, 2025
Cite
Sonal Shinde (2025). Customer Churn Dataset [Dataset]. https://www.kaggle.com/datasets/sonalshinde123/customer-churn-prediction-dataset
Explore at:
Available download formats: zip (593965 bytes)
Dataset updated
Dec 23, 2025
Authors
Sonal Shinde
License
CC0 1.0 Universal: https://creativecommons.org/publicdomain/zero/1.0/
Description
This dataset is a synthetic customer churn dataset designed to simulate real-world telecom customer behavior. It is generated using business-driven rules based on customer tenure, billing amount, contract type, service usage, and support interactions. Controlled randomness and noise are added to avoid perfect patterns and make the dataset suitable for realistic machine learning classification tasks. The dataset is ideal for beginners to practice exploratory data analysis, feature engineering, and customer churn prediction using machine learning models.
SaaS Customer Churn Prediction Dataset
kaggle.com
zip
Updated Feb 14, 2026
Cite
Suhani Gupta_04 (2026). SaaS Customer Churn Prediction Dataset [Dataset]. https://www.kaggle.com/datasets/suhanigupta04/saas-customer-churn-prediction-dataset
Explore at:
Available download formats: zip (114493 bytes)
Dataset updated
Feb 14, 2026
Authors
Suhani Gupta_04
License
CC0 1.0 Universal: https://creativecommons.org/publicdomain/zero/1.0/
Description
📊 SaaS Customer Churn Prediction

Context

"TechFlow" is a fictional SaaS company providing project management software. Like many SaaS companies, they are experiencing customer churn (users cancelling subscriptions). The company has collected data on user usage, account age, and the textual content of their latest customer support interaction.

Content

The dataset contains 2,500 records (split into train.csv and test.csv). Each row represents a unique customer.

Features

Customer_ID: Unique ID for the customer.

Name: Name of the customer.

Email: Email of the customer.

Account_Age_Days: Number of days the customer has been active.

Login_Frequency: Categorical (Daily, Weekly, Rarely).

Daily_Usage_Mins: Average minutes the customer spends on the platform per day.

Last_Support_Ticket: The full text of the customer's most recent support ticket.

Churn: Target label (1 = Churned/Left, 0 = Retained/Stayed).

Inspiration

Most churn datasets are purely numerical. This dataset challenges you to combine numerical analysis with Natural Language Processing (NLP).

1. EDA: Does Login_Frequency correlate with Churn?
2. NLP: Can you perform sentiment analysis on Last_Support_Ticket to see if angry customers are more likely to churn?
3. Modeling: Build a model that uses both the usage metrics and the text data to predict churn with high accuracy.
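
One simple way to combine the usage metrics with the ticket text is to turn the text into a numeric signal first. The sketch below counts negative words with a tiny hand-made lexicon; the word list, the ordinal coding of Login_Frequency, and the sample record are all assumptions for illustration, and a real sentiment model would be more robust.

```python
import re

# Hypothetical lexicon; not part of the dataset.
NEGATIVE_WORDS = {"angry", "cancel", "refund", "broken", "terrible", "unacceptable"}

# Assumed ordinal coding: higher means less frequent logins.
LOGIN_ORDER = {"Daily": 0, "Weekly": 1, "Rarely": 2}


def negative_word_count(ticket_text):
    """Count lexicon hits in the ticket text as a crude sentiment signal."""
    tokens = re.findall(r"[a-z']+", ticket_text.lower())
    return sum(token in NEGATIVE_WORDS for token in tokens)


def to_features(record):
    """Build one numeric feature vector from usage columns plus the text signal."""
    return [
        record["Account_Age_Days"],
        LOGIN_ORDER[record["Login_Frequency"]],
        record["Daily_Usage_Mins"],
        negative_word_count(record["Last_Support_Ticket"]),
    ]


# Hypothetical record using the column names listed above.
record = {
    "Account_Age_Days": 120,
    "Login_Frequency": "Rarely",
    "Daily_Usage_Mins": 4.5,
    "Last_Support_Ticket": "This is terrible, I want a refund and will cancel!",
}
features = to_features(record)
```

The resulting vector can be passed to any tabular classifier alongside the Churn label.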

Data Generation

This is a synthetic dataset generated using Python's Faker library. Real-world patterns (e.g., angry support tickets leading to higher churn) were simulated using weighted probabilities to make the data useful for machine learning practice.
Netflix Customer Churn dataset(upvote if you like)
kaggle.com
zip
Updated Jul 5, 2025
Cite
Abdul Wadood (2025). Netflix Customer Churn dataset(upvote if you like) [Dataset]. https://www.kaggle.com/datasets/abdulwadood11220/netflix-customer-churn-dataset
Explore at:
Available download formats: zip (190004 bytes)
Dataset updated
Jul 5, 2025
Authors
Abdul Wadood
License
Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
Description
This dataset contains synthetic data simulating customer behavior for a Netflix-like video streaming service. It includes 5,000 records with 14 carefully engineered features designed for churn prediction modeling, business insights, and customer segmentation.

The dataset is ideal for:

Machine learning classification tasks (churn vs. non-churn)

Exploratory data analysis (EDA)

Customer behavior modeling in OTT platforms

Customer Churn Prediction Dataset_ 1M

kaggle.com

zip

Updated Jan 11, 2026

Cite

Isandeep06 (2026). Customer Churn Prediction Dataset_ 1M [Dataset]. https://www.kaggle.com/datasets/isandeep06/customer-churn-prediction-dataset-1m

Explore at:

Available download formats: zip (42854638 bytes)

Dataset updated

Jan 11, 2026

Authors

Isandeep06

License

CC0 1.0 Universal: https://creativecommons.org/publicdomain/zero/1.0/

Description

Customer Churn Prediction Dataset

Overview

A comprehensive, production-scale synthetic dataset for customer churn prediction in the telecommunications industry. This dataset contains 1,000,000 customer records with 28 features designed to simulate real-world churn behavior while maintaining privacy compliance. Perfect for exploring machine learning, feature engineering, and business analytics in customer retention scenarios.

Business Context

Telecommunications companies face significant customer attrition (churn) that directly impacts revenue. Predicting which customers are likely to churn allows businesses to implement targeted retention strategies, optimize marketing spend, and improve customer lifetime value.

Dataset Details

Rows: 1,000,000 customer records
Columns: 28 features + 1 target variable
Size: ~200 MB (compressed ~40 MB)
Churn Rate: 20-25% (industry-realistic)
Time Span: 5 years of customer data
Format: CSV with UTF-8 encoding

Column Descriptions

Customer Demographics (8 features)

Column	Type	Description	Missing Values
`customer_id`	String	Unique customer identifier	0%
`signup_date`	Date	Customer registration date	0%
`age`	Integer	Customer age (18-90)	0%
`gender`	Categorical	Gender: Male, Female, Other	0%
`annual_income`	Float	Annual income in USD	3%
`education`	Categorical	Education level	0%
`marital_status`	Categorical	Marital status	0%
`dependents`	Integer	Number of dependents (0-5)	0%
`senior_citizen`	Binary	1 if age ≥ 65, else 0	0%

Account Information (4 features)

Column	Type	Description	Missing Values
`tenure`	Integer	Months with company (1-72)	0%
`contract`	Categorical	Contract type: month_to_month, one_year, two_year	0%
`payment_method`	Categorical	Payment method	0%
`paperless_billing`	Categorical	Paperless billing: Yes/No	0%

Service Usage & Billing (11 features)

Column	Type	Description	Missing Values
`monthlycharges`	Float	Monthly service charges ($20-$200)	0%
`totalcharges`	Float	Total charges to date	0%
`num_services`	Integer	Number of subscribed services (1-6)	0%
`has_phone_service`	Binary	1 if has phone service	0%
`has_internet_service`	Binary	1 if has internet service	0%
`has_online_security`	Binary	1 if has online security	0%
`has_online_backup`	Binary	1 if has online backup	0%
`has_device_protection`	Binary	1 if has device protection	0%
`has_tech_support`	Binary	1 if has tech support	0%
`has_streaming_tv`	Binary	1 if has streaming TV	0%
`has_streaming_movies`	Binary	1 if has streaming movies	0%

Customer Behavior & Risk (7 features)

Column	Type	Description	Missing Values
`customer_satisfaction`	Integer	Satisfaction score (1-10)	2%
`num_complaints`	Integer	Complaints in last year (0-8)	3%
`num_service_calls`	Integer	Service calls last month (0-12)	0%
`late_payments`	Integer	Late payments last 3 months (0-5)	0%
`avg_monthly_gb`	Float	Average monthly data usage (GB)	5%
`days_since_last_interaction`	Integer	Days since last contact (1-365)	0%
`credit_score`	Integer	Credit score (300-850)	4%

Target Variable

Column	Type	Description	Missing Values
`churn`	Binary	Target: 1 if churned, 0 if retained	0%

Key Dataset Characteristics

Realistic Patterns

New customers (tenure < 6 months) have 2-3x higher churn risk
Monthly contracts show 40%+ churn rates vs <10% for two-year contracts
High satisfaction (8+) reduces churn probability by 60%
Late payments increase churn risk substantially
Service bundling (multiple services) improves retention

Data Quality Features

Missing values: Strategically introduced (2-5%) in key columns
Outliers: Mild outliers in monetary columns (0.1% of data)
Skewed distributions: Income (lognormal), tenure (exponential)
Correlations: Services correlate with internet subscription
Noise: totalcharges ≈ monthlycharges × tenure, with ±10% noise
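
The stated relationship between totalcharges, monthlycharges, and tenure can be checked row by row. This is a minimal sketch under the assumption that the noise is a multiplicative factor within ±10%; the sample rows are hypothetical.

```python
def charge_ratio(row):
    """Ratio of recorded total charges to the noise-free value monthlycharges * tenure."""
    return row["totalcharges"] / (row["monthlycharges"] * row["tenure"])


def within_noise(row, tolerance=0.10):
    """True if totalcharges is within the assumed ±10% band around the expected value."""
    return abs(charge_ratio(row) - 1.0) <= tolerance


# Hypothetical rows; tenure is at least 1 per the column description, so no division by zero.
consistent = {"monthlycharges": 50.0, "tenure": 12, "totalcharges": 630.0}
suspicious = {"monthlycharges": 50.0, "tenure": 12, "totalcharges": 700.0}

flags = [within_noise(consistent), within_noise(suspicious)]
```

Rows that fall outside the band may correspond to the mild outliers mentioned for monetary columns.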

Interaction Effects

Low tenure × monthly contract → Very high churn risk
High charges × low satisfaction → Elevated churn probability
Many service calls × no tech support → Increased churn likelihood
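
The interaction effects listed above could be turned into explicit features, for example as below. This is a sketch only: the feature names are made up, the 6-month threshold is borrowed from the "new customers" pattern in the description, and customer_satisfaction is assumed to have been imputed beforehand since that column has missing values.

```python
def interaction_features(row):
    """Derive hypothetical interaction features from columns named in the description."""
    low_tenure = int(row["tenure"] < 6)
    monthly_contract = int(row["contract"] == "month_to_month")
    dissatisfaction = 10 - row["customer_satisfaction"]  # satisfaction is scored 1-10
    return {
        "low_tenure_x_monthly": low_tenure * monthly_contract,
        "charges_x_dissatisfaction": row["monthlycharges"] * dissatisfaction,
        "calls_x_no_support": row["num_service_calls"] * (1 - row["has_tech_support"]),
    }


# Hypothetical customer record.
customer = {
    "tenure": 3,
    "contract": "month_to_month",
    "monthlycharges": 80.0,
    "customer_satisfaction": 4,
    "num_service_calls": 5,
    "has_tech_support": 0,
}
derived = interaction_features(customer)
```

Tree-based models can often find such interactions on their own, so explicit features like these mainly help linear models.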

Potential Use Cases

Machine Learning

Binary classification (churn prediction)
Fea...

Customer Churn Prediction Dataset
kaggle.com
zip
Updated Sep 18, 2020
Cite
Study Mart (2020). Customer Churn Prediction Dataset [Dataset]. https://www.kaggle.com/studymart/customer-churn-prediction
Explore at:
Available download formats: zip (175743 bytes)
Dataset updated
Sep 18, 2020
Authors
Study Mart
Description
Dataset

This dataset was created by Study Mart

Contents
Real World Customer Churn Dataset
kaggle.com
zip
Updated Oct 24, 2023
Cite
Lasal Jayawardena (2023). Real World Customer Churn Dataset [Dataset]. https://www.kaggle.com/datasets/lasaljaywardena/real-world-churn/code
Explore at:
Available download formats: zip (415985883 bytes)
Dataset updated
Oct 24, 2023
Authors
Lasal Jayawardena
License
CC0 1.0 Universal: https://creativecommons.org/publicdomain/zero/1.0/
Area covered
World
Description
60,000+ Real Anonymized Customer Usage Data for Churn Prediction!

Dataset Information

Dataset Name: Real World Customer Churn Dataset in Telco Domain

Snapshot Period: January 1, 2023, to March 31, 2023

Source: One of the Largest Telco Companies in Sri Lanka

Data Anonymization: The Dataset is Anonymized to Protect Customer Privacy.

Overview

The "Real World Customer Churn Dataset in Telco Domain" is a comprehensive collection of anonymized data that provides insights into customer behavior and churn prediction within the telecommunications industry.

Usage Categories

The dataset contains data on over 60,000 customers across more than 10 distinct usage categories. Some of the key usage categories include:

usage_app_youtube_daily: YouTube Traffic in MBs.

usage_app_facebook_daily: Facebook Traffic in MBs.

usage_app_tiktok_daily: TikTok Traffic in MBs.

usage_app_whatsapp_daily: WhatsApp Traffic in MBs.

usage_app_helakuru_daily: Helakuru App traffic in MBs.

usage_voice_o2o_outgoing: Outgoing call volume in minutes within the same operator.

usage_voice_o2op_outgoing: Outgoing call volume in minutes from the operator to other operators.

usage_voice_o2o_incoming: Incoming call volume in minutes within the same operator.

usage_voice_op2o_incoming: Incoming call volume in minutes from other operators to the operator.

usage_pack_data: Spend in LKR for data package purchasing.

usage_pack_vas: Spend in LKR for value-added service rentals or usage.
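The usage columns listed above lend themselves to simple derived features. The sketch below uses the column names from the list, but the example values are invented and the two derived features (off-net share of outgoing minutes, total app traffic) are illustrative choices, not something the dataset ships with.

```python
APP_COLUMNS = (
    "usage_app_youtube_daily",
    "usage_app_facebook_daily",
    "usage_app_tiktok_daily",
    "usage_app_whatsapp_daily",
    "usage_app_helakuru_daily",
)


def off_net_outgoing_share(row):
    """Share of outgoing minutes that go to other operators."""
    same_operator = float(row["usage_voice_o2o_outgoing"])
    other_operators = float(row["usage_voice_o2op_outgoing"])
    total = same_operator + other_operators
    if total == 0:
        return None  # no outgoing calls, share is undefined
    return other_operators / total


def app_traffic_total(row):
    """Total app traffic in MB; missing or empty values count as zero."""
    return sum(float(row.get(col) or 0) for col in APP_COLUMNS)


# Invented example row for illustration only.
row = {
    "usage_app_youtube_daily": 100,
    "usage_app_facebook_daily": 50,
    "usage_app_tiktok_daily": 25,
    "usage_app_whatsapp_daily": 5,
    "usage_voice_o2o_outgoing": 30,
    "usage_voice_o2op_outgoing": 10,
}
print(off_net_outgoing_share(row), app_traffic_total(row))
```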

Dataset Files

The dataset consists of the following key files:

main.csv: An aggregated dataset that compiles usage data from all usage categories, providing a holistic view of customer behavior.

raw_dump folder: The raw data export, preserving the original source data for detailed exploration.

test and train folders: These folders contain customer IDs and corresponding Churn Labels, facilitating model training and testing.

usage_profiles folder: Contains per-customer data frames broken down by usage category, allowing in-depth analysis of individual customer behavior within each category.
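Because the churn labels live in the train and test folders while the features live in main.csv, the two have to be joined on the customer ID before modeling. The sketch below shows that join with Python's csv module on small in-memory samples. The column names customer_id and churn are assumptions, as the listing does not document the actual headers or the file names inside the folders.

```python
import csv
import io


def attach_labels(feature_rows, label_rows, key="customer_id", label="churn"):
    """Inner-join feature rows with label rows on the customer ID."""
    labels = {r[key]: r[label] for r in label_rows}
    joined = []
    for row in feature_rows:
        if row[key] in labels:
            merged = dict(row)
            merged[label] = labels[row[key]]
            joined.append(merged)
    return joined


# In-memory stand-ins for main.csv and a label file from the train folder.
main_csv = io.StringIO("customer_id,usage_pack_data\nA1,250\nA2,0\nA3,120\n")
train_csv = io.StringIO("customer_id,churn\nA1,0\nA3,1\n")

features = list(csv.DictReader(main_csv))
labels = list(csv.DictReader(train_csv))
train_rows = attach_labels(features, labels)
print(train_rows)
```

Customers without a label in the given split (A2 here) are dropped, which keeps the train and test sets separate as long as each label file is joined on its own.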

Potential Use Cases

The "Real World Customer Churn Dataset in Telco Domain" offers a range of potential use cases, including:

Customer Churn Prediction: Leveraging customer usage patterns to predict and reduce churn.

Targeted Marketing: Designing customized marketing campaigns based on customer preferences.

Service Quality Enhancement: Identifying areas for service improvement, such as network quality.

Revenue Optimization: Maximizing revenue through the analysis of data package spending and value-added service usage.

Dataset Importance

This dataset's real-world aspect is of significant importance. It reflects actual customer interactions with a major telecommunications company in Sri Lanka, offering insights that can be directly applied to real-world scenarios. The dataset is sourced from one of the largest telco companies in the country, adding credibility and relevance to the insights it provides.

Understanding customer churn and usage behavior is pivotal for the telecommunications industry, and this dataset empowers researchers, data scientists, and businesses to gain deeper insights into these aspects.

Disclaimer

The dataset is anonymized to protect customer privacy, and all data used is in compliance with privacy regulations and agreements. Users are encouraged to explore and contribute to the "Real World Customer Churn Dataset in Telco Domain."

Thank you for your valuable contributions to this dataset.
Bank Customer Churn Dataset
kaggle.com
zip
Updated Jul 11, 2023
Cite
Bhuvi Ranga (2023). Bank Customer Churn Dataset [Dataset]. https://www.kaggle.com/datasets/bhuviranga/customer-churn-data
Available download formats: zip (191965 bytes)
Dataset updated
Jul 11, 2023
Authors
Bhuvi Ranga
License
http://opendatacommons.org/licenses/dbcl/1.0/
Description
This customer churn dataset is a collection of customer data for predicting churn, the tendency of customers to stop using a company's products or services. It contains features that describe each customer: credit score, country, gender, age, tenure, balance, number of products, credit card status, active membership, estimated salary, and churn status. The churn status indicates whether a customer has churned. The dataset can be used to analyze the factors that contribute to churn and to build predictive models that identify customers at risk of churning, with the goal of developing strategies and interventions that reduce churn and improve retention.
Tour & Travels Customer Churn Prediction
kaggle.com
zip
Updated Oct 31, 2021
Cite
Tejashvi (2021). Tour & Travels Customer Churn Prediction [Dataset]. https://www.kaggle.com/datasets/tejashvi14/tour-travels-customer-churn-prediction
Available download formats: zip (3537 bytes)
Dataset updated
Oct 31, 2021
Authors
Tejashvi
License
https://creativecommons.org/publicdomain/zero/1.0/
Description
A Tour & Travels company wants to predict whether a customer will churn based on the indicators given below. Help build predictive models and save the company's money. Perform fascinating EDAs. The data was used for practice purposes and also during a mini hackathon; it is completely free to use.
Telecom Churn Predict
kaggle.com
zip
Updated Aug 11, 2023
Cite
Swaraj Khan (2023). Telecom Churn Predict [Dataset]. https://www.kaggle.com/datasets/swarajkhan/telecom-churn-predict
Available download formats: zip (27579 bytes)
Dataset updated
Aug 11, 2023
Authors
Swaraj Khan
License
Attribution-ShareAlike 4.0 (CC BY-SA 4.0) https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
Description
"Telecom Customer Churn Prediction Dataset" is a synthetic dataset designed to simulate customer data for a telecommunications company. This dataset is created for the purpose of predicting customer churn, which refers to the phenomenon of customers discontinuing their services with the company. The dataset contains a variety of features that capture different aspects of customer behavior and characteristics.

The dataset includes information such as customer age, gender, contract type, monthly charges, total amount spent, number of devices connected, and the number of customer support calls made. The key focus of this dataset is the binary target variable "Churn," which indicates whether a customer has churned (1) or not (0). This variable is essential for training and evaluating predictive models aimed at identifying customers who are likely to leave the service.
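Before training a model on a binary target like "Churn", it helps to know the churn rate and the accuracy a trivial majority-class predictor would reach, since any real model has to beat that number. The sketch below uses a made-up label list, not values from the dataset.

```python
def churn_rate(labels):
    """Fraction of customers with Churn == 1."""
    if not labels:
        raise ValueError("labels must not be empty")
    return sum(labels) / len(labels)


def majority_baseline_accuracy(labels):
    """Accuracy of always predicting the more common class."""
    rate = churn_rate(labels)
    return max(rate, 1.0 - rate)


# Made-up labels for illustration: 1 = churned, 0 = stayed.
labels = [0, 0, 0, 1, 0, 1, 0, 0, 0, 0]
print(churn_rate(labels), majority_baseline_accuracy(labels))
```

With an imbalanced target, accuracy alone can look good for a useless model, so metrics such as recall on the churned class are usually reported alongside it.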

Cite

Saurabh Badole (2024). Banking Customer Churn Prediction Dataset [Dataset]. https://www.kaggle.com/datasets/saurabhbadole/bank-customer-churn-prediction-dataset

Banking Customer Churn Prediction Dataset

Understanding Customer Behavior and Predicting Churn in Banking Institutions

27 scholarly articles cite this dataset (View in Google Scholar)

Available download formats: zip (267794 bytes)

Dataset updated

May 16, 2024

Authors

Saurabh Badole

License

Attribution-NonCommercial-NoDerivs 4.0 (CC BY-NC-ND 4.0) https://creativecommons.org/licenses/by-nc-nd/4.0/
License information was derived automatically

Description

Description:

This dataset contains information about bank customers and their churn status, which indicates whether they have exited the bank or not. It is suitable for exploring and analyzing factors influencing customer churn in banking institutions and for building predictive models to identify customers at risk of churning.

Features:

RowNumber: The sequential number assigned to each row in the dataset.

CustomerId: A unique identifier for each customer.

Surname: The surname of the customer.

CreditScore: The credit score of the customer.

Geography: The geographical location of the customer (e.g., country or region).

Gender: The gender of the customer.

Age: The age of the customer.

Tenure: The number of years the customer has been with the bank.

Balance: The account balance of the customer.

NumOfProducts: The number of bank products the customer has.

HasCrCard: Indicates whether the customer has a credit card (binary: yes/no).

IsActiveMember: Indicates whether the customer is an active member (binary: yes/no).

EstimatedSalary: The estimated salary of the customer.

Exited: Indicates whether the customer has exited the bank (binary: yes/no).

Usage:

This dataset can be used for exploratory data analysis to understand the factors influencing customer churn in banks.
It can also be used to build machine learning models for predicting customer churn based on the given features.
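When building a model from these features, the identifier columns (RowNumber, CustomerId, Surname) are normally excluded and Exited is separated as the target. The following is a minimal sketch of that split. It assumes Exited is stored as 0/1, while the description only says "binary: yes/no", and the example row contains invented values.

```python
ID_COLUMNS = ("RowNumber", "CustomerId", "Surname")
TARGET = "Exited"


def split_features_target(row):
    """Drop identifier columns and separate the Exited target.

    Assumes Exited is encoded as 0/1 (not confirmed by the description).
    """
    features = {
        name: value
        for name, value in row.items()
        if name not in ID_COLUMNS and name != TARGET
    }
    return features, int(row[TARGET])


# Invented example row using the column names from the feature list.
row = {
    "RowNumber": 1, "CustomerId": 10000001, "Surname": "Example",
    "CreditScore": 650, "Geography": "France", "Gender": "Female",
    "Age": 40, "Tenure": 3, "Balance": 0.0, "NumOfProducts": 1,
    "HasCrCard": 1, "IsActiveMember": 1, "EstimatedSalary": 50000.0,
    "Exited": 0,
}
features, target = split_features_target(row)
print(sorted(features), target)
```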

License:

This dataset is made available under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
