100+ datasets found
  1. Banking Customer Churn Prediction Dataset

    • kaggle.com
    zip
    Updated May 16, 2024
    Cite
    Saurabh Badole (2024). Banking Customer Churn Prediction Dataset [Dataset]. https://www.kaggle.com/datasets/saurabhbadole/bank-customer-churn-prediction-dataset
    Available download formats: zip (267794 bytes)
    Dataset updated
    May 16, 2024
    Authors
    Saurabh Badole
    License

    Attribution-NonCommercial-NoDerivs 4.0 (CC BY-NC-ND 4.0): https://creativecommons.org/licenses/by-nc-nd/4.0/
    License information was derived automatically

    Description

    This dataset contains information about bank customers and their churn status, which indicates whether they have exited the bank or not. It is suitable for exploring and analyzing factors influencing customer churn in banking institutions and for building predictive models to identify customers at risk of churning.

    Features:

    RowNumber: The sequential number assigned to each row in the dataset.

    CustomerId: A unique identifier for each customer.

    Surname: The surname of the customer.

    CreditScore: The credit score of the customer.

    Geography: The geographical location of the customer (e.g., country or region).

    Gender: The gender of the customer.

    Age: The age of the customer.

    Tenure: The number of years the customer has been with the bank.

    Balance: The account balance of the customer.

    NumOfProducts: The number of bank products the customer has.

    HasCrCard: Indicates whether the customer has a credit card (binary: yes/no).

    IsActiveMember: Indicates whether the customer is an active member (binary: yes/no).

    EstimatedSalary: The estimated salary of the customer.

    Exited: Indicates whether the customer has exited the bank (binary: yes/no).

    Usage:

    • This dataset can be used for exploratory data analysis to understand the factors influencing customer churn in banks.
    • It can also be used to build machine learning models for predicting customer churn based on the given features.

    License:

    This dataset is made available under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.

  2. Telco Customer Churn

    • kaggle.com
    zip
    Updated Feb 23, 2018
    Cite
    BlastChar (2018). Telco Customer Churn [Dataset]. https://www.kaggle.com/datasets/blastchar/telco-customer-churn
    Available download formats: zip (175758 bytes)
    Dataset updated
    Feb 23, 2018
    Authors
    BlastChar
    Description

    Context

    "Predict behavior to retain customers. You can analyze all relevant customer data and develop focused customer retention programs." [IBM Sample Data Sets]

    Content

    Each row represents a customer, and each column contains a customer attribute described in the column metadata.

    The data set includes information about:

    • Customers who left within the last month – the column is called Churn
    • Services that each customer has signed up for – phone, multiple lines, internet, online security, online backup, device protection, tech support, and streaming TV and movies
    • Customer account information – how long they’ve been a customer, contract, payment method, paperless billing, monthly charges, and total charges
    • Demographic info about customers – gender, age range, and if they have partners and dependents

    Inspiration

    To explore this type of model and learn more about the subject.

    New version from IBM: https://community.ibm.com/community/user/businessanalytics/blogs/steven-macko/2019/07/11/telco-customer-churn-1113

  3. Predictive Analytics for Customer Churn: Dataset

    • kaggle.com
    zip
    Updated Oct 6, 2023
    Cite
    Safrin S (2023). Predictive Analytics for Customer Churn: Dataset [Dataset]. https://www.kaggle.com/datasets/safrin03/predictive-analytics-for-customer-churn-dataset
    Available download formats: zip (25124511 bytes)
    Dataset updated
    Oct 6, 2023
    Authors
    Safrin S
    Description

    Context: This dataset is part of a data science project focused on customer churn prediction for a subscription-based service. Customer churn, the rate at which customers cancel their subscriptions, is a vital metric for businesses offering subscription services. Predictive analytics techniques are employed to anticipate which customers are likely to churn, enabling companies to take proactive measures for customer retention.

    Content: This dataset contains anonymized information about customer subscriptions and their interaction with the service. The data includes various features such as subscription type, payment method, viewing preferences, customer support interactions, and other relevant attributes. It consists of three files: "test.csv", "train.csv", and "data_descriptions.csv".

    Columns:

    CustomerID: Unique identifier for each customer

    SubscriptionType: Type of subscription plan chosen by the customer (e.g., Basic, Premium, Deluxe)

    PaymentMethod: Method used for payment (e.g., Credit Card, Electronic Check, PayPal)

    PaperlessBilling: Whether the customer uses paperless billing (Yes/No)

    ContentType: Type of content accessed by the customer (e.g., Movies, TV Shows, Documentaries)

    MultiDeviceAccess: Whether the customer has access on multiple devices (Yes/No)

    DeviceRegistered: Device registered by the customer (e.g., Smartphone, Smart TV, Laptop)

    GenrePreference: Genre preference of the customer (e.g., Action, Drama, Comedy)

    Gender: Gender of the customer (Male/Female)

    ParentalControl: Whether parental control is enabled (Yes/No)

    SubtitlesEnabled: Whether subtitles are enabled (Yes/No)

    AccountAge: Age of the customer's subscription account (in months)

    MonthlyCharges: Monthly subscription charges

    TotalCharges: Total charges incurred by the customer

    ViewingHoursPerWeek: Average number of viewing hours per week

    SupportTicketsPerMonth: Number of customer support tickets raised per month

    AverageViewingDuration: Average duration of each viewing session

    ContentDownloadsPerMonth: Number of content downloads per month

    UserRating: Customer satisfaction rating (1 to 5)

    WatchlistSize: Size of the customer's content watchlist

    Acknowledgments: The dataset used in this project was obtained from the Data Science Challenge on Coursera and is used for educational and research purposes. Any resemblance to real persons or entities is purely coincidental.

  4. Customer Churn Prediction Business Dataset

    • kaggle.com
    zip
    Updated Dec 14, 2025
    Cite
    Arif Miah (2025). Customer Churn Prediction Business Dataset [Dataset]. https://www.kaggle.com/datasets/miadul/customer-churn-prediction-business-dataset
    Available download formats: zip (519989 bytes)
    Dataset updated
    Dec 14, 2025
    Authors
    Arif Miah
    License

    Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
    License information was derived automatically

    Description

    📄 Dataset Description

    Customer churn is one of the most critical challenges for subscription-based and service-oriented businesses. Retaining existing customers is significantly more cost-effective than acquiring new ones, making churn prediction a key business analytics problem.

    This dataset is a synthetic but business-realistic customer churn dataset designed for machine learning, data science, and predictive analytics use cases. The data simulates real-world customer behavior by incorporating customer demographics, product usage patterns, billing and payment history, customer support interactions, and engagement metrics.

    The target variable, churn, indicates whether a customer is likely to discontinue the service. Churn labels are generated using business-driven rules combined with probabilistic noise, ensuring realistic feature correlations rather than random labeling.

    This dataset is ideal for:

    • Exploratory Data Analysis (EDA)
    • Feature engineering
    • Customer churn prediction modeling
    • Explainable AI (SHAP, feature importance)
    • Business dashboards and decision support systems
    • End-to-end ML deployment using Streamlit or Flask

    🧾 Dataset Characteristics

    • Number of records: 10,000 customers
    • Target variable: churn (0 = No, 1 = Yes)
    • Data types: Numerical & Categorical
    • Domain: Subscription / SaaS / Telecom / Service Business
    • Data source: Synthetic (business-logic driven)

    📊 Feature Categories

    • Customer Profile: age, gender, location, tenure, contract type
    • Product Usage: logins, session duration, feature usage, activity trends
    • Billing & Payment: subscription fees, revenue, payment failures, discounts
    • Customer Support: tickets, resolution time, CSAT, complaints
    • Engagement & Feedback: email activity, NPS score, survey responses

    🎯 Use Cases

    • Predict high-risk churn customers
    • Identify key churn drivers
    • Estimate revenue at risk
    • Build retention strategies
    • Train and evaluate ML/DL models
    • Create executive-level business dashboards

    ⚠️ Disclaimer

    This dataset is synthetically generated for educational, research, and portfolio purposes. While it reflects realistic business patterns, it does not represent real customer data.

  5. Online Retail Customer Churn Dataset

    • kaggle.com
    zip
    Updated Feb 14, 2024
    Cite
    Hassane Skikri (2024). Online Retail Customer Churn Dataset [Dataset]. https://www.kaggle.com/datasets/hassaneskikri/online-retail-customer-churn-dataset
    Available download formats: zip (23795 bytes)
    Dataset updated
    Feb 14, 2024
    Authors
    Hassane Skikri
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Overview:

    This dataset provides a comprehensive overview of customer interactions with an online retail store, aiming to predict customer churn based on various behavioral and demographic features. It includes data on customer demographics, spending behavior, satisfaction levels, and engagement with marketing campaigns. The dataset is designed for analysis and development of predictive models to identify customers at risk of churn, enabling targeted customer retention strategies.

    Description of Columns:

    • Customer_ID: A unique identifier for each customer.
    • Age: The customer's age.
    • Gender: The customer's gender (Male, Female, Other).
    • Annual_Income: The annual income of the customer in thousands of dollars.
    • Total_Spend: The total amount spent by the customer in the last year.
    • Years_as_Customer: The number of years the individual has been a customer of the store.
    • Num_of_Purchases: The number of purchases the customer made in the last year.
    • Average_Transaction_Amount: The average amount spent per transaction.
    • Num_of_Returns: The number of items the customer returned in the last year.
    • Num_of_Support_Contacts: The number of times the customer contacted support in the last year.
    • Satisfaction_Score: A score from 1 to 5 indicating the customer's satisfaction with the store.
    • Last_Purchase_Days_Ago: The number of days since the customer's last purchase.
    • Email_Opt_In: Whether the customer has opted in to receive marketing emails.
    • Promotion_Response: The customer's response to the last promotional campaign (Responded, Ignored, Unsubscribed).
    • Target_Churn: Indicates whether the customer churned (True or False).
  6. Customer Churn Prediction Dataset

    • kaggle.com
    zip
    Updated Apr 7, 2025
    Cite
    Ziya (2025). Customer Churn Prediction Dataset [Dataset]. https://www.kaggle.com/datasets/ziya07/customer-churn-prediction-dataset
    Available download formats: zip (7446 bytes)
    Dataset updated
    Apr 7, 2025
    Authors
    Ziya
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    The Customer Churn Prediction Dataset is a dataset designed to predict customer churn based on various behavioral and demographic features. The dataset contains information about 1,000 customers, and includes the following key features:

    Customer_ID: A unique identifier for each customer.

    Age: The age of the customer (ranging from 18 to 70 years).

    Gender: The gender of the customer (0 = Male, 1 = Female).

    Monthly_Spending: The amount of money spent monthly by the customer (ranging from 50 to 500 units).

    Subscription_Length: The number of years the customer has been subscribed to the service (ranging from 1 to 10 years).

    Support_Interactions: The number of times the customer has interacted with customer support (ranging from 0 to 5).

    Churn: The target variable indicating whether the customer has churned (1) or remained (0).

  7. Bank Customer Churn Dataset

    • kaggle.com
    zip
    Updated Aug 30, 2022
    Cite
    Gaurav Topre (2022). Bank Customer Churn Dataset [Dataset]. https://www.kaggle.com/datasets/gauravtopre/bank-customer-churn-dataset
    Available download formats: zip (191965 bytes)
    Dataset updated
    Aug 30, 2022
    Authors
    Gaurav Topre
    Description

    This dataset is for ABC Multistate Bank and has the following columns:

    1. customer_id, unused variable.
    2. credit_score, used as input.
    3. country, used as input.
    4. gender, used as input.
    5. age, used as input.
    6. tenure, used as input.
    7. balance, used as input.
    8. products_number, used as input.
    9. credit_card, used as input.
    10. active_member, used as input.
    11. estimated_salary, used as input.
    12. churn, used as the target. 1 if the client has left the bank during some period or 0 if he/she has not.

    The aim is to predict customer churn for ABC Bank.
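
    The column roles listed above map directly onto a feature/target split. A minimal sketch in plain Python, assuming the file has already been read into a list of dicts (for example with `csv.DictReader`); the helper name and the two sample rows are made up for illustration and are not values from the dataset:

```python
# Column roles as listed in the dataset description.
ID_COLUMN = "customer_id"   # listed as an unused variable
TARGET_COLUMN = "churn"     # 1 = client left the bank, 0 = client stayed


def split_features_target(rows):
    """Return (features, targets), dropping the ID column from the inputs."""
    features, targets = [], []
    for row in rows:
        targets.append(int(row[TARGET_COLUMN]))
        features.append(
            {k: v for k, v in row.items() if k not in (ID_COLUMN, TARGET_COLUMN)}
        )
    return features, targets


# Hypothetical rows for illustration only.
sample_rows = [
    {"customer_id": "1", "credit_score": "600", "country": "France",
     "gender": "Female", "age": "40", "tenure": "3", "balance": "0.0",
     "products_number": "1", "credit_card": "1", "active_member": "1",
     "estimated_salary": "50000.0", "churn": "1"},
    {"customer_id": "2", "credit_score": "700", "country": "Spain",
     "gender": "Male", "age": "35", "tenure": "8", "balance": "1200.5",
     "products_number": "2", "credit_card": "0", "active_member": "1",
     "estimated_salary": "62000.0", "churn": "0"},
]
X, y = split_features_target(sample_rows)
```

    Here `X` holds the ten input columns per customer and `y` holds the corresponding churn labels.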


  8. Customer Churn Prediction Dataset

    • kaggle.com
    zip
    Updated Apr 12, 2025
    Cite
    AJ (2025). Customer Churn Prediction Dataset [Dataset]. https://www.kaggle.com/datasets/smayanj/customer-churn-prediction-dataset
    Available download formats: zip (622898 bytes)
    Dataset updated
    Apr 12, 2025
    Authors
    AJ
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    This is a synthetic dataset created to simulate customer behavior in a subscription-based service. It includes 15,000 rows, with each row representing a single customer.

    Features:

    • tenure_months
      How long (in months) the customer has been using the service.

    • monthly_usage_hours
      Average number of hours the customer uses the service per month.

    • has_multiple_devices
      Binary value (1 = yes, 0 = no). Whether the customer uses more than one device.

    • customer_support_calls
      Number of times the customer contacted customer support.

    • payment_failures
      Binary value (1 = yes, 0 = no). Whether the customer had recent payment issues.

    • is_premium_plan
      Binary value (1 = yes, 0 = no). Whether the customer is on a premium subscription.

    Target:

    • churn
      Binary value (1 = customer will leave, 0 = customer will stay).
      This is calculated based on a rule-based formula that considers factors like low tenure, low usage, support calls, and payment issues. Some randomness is added to mimic real-world uncertainty.
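
    The description does not give the actual formula, so the following is only a sketch of what a rule-based label with added randomness could look like. The function name, weights, thresholds, and noise level are all hypothetical choices made for illustration:

```python
import random


def churn_label(row, rng, noise=0.1):
    """Hypothetical rule-based churn label with random flips.

    The weights and thresholds below are illustrative assumptions,
    not the formula used to build the dataset.
    """
    score = 0.0
    if row["tenure_months"] < 6:            # low tenure
        score += 0.3
    if row["monthly_usage_hours"] < 5:      # low usage
        score += 0.3
    if row["customer_support_calls"] > 3:   # many support calls
        score += 0.2
    if row["payment_failures"] == 1:        # recent payment issues
        score += 0.2
    label = 1 if score >= 0.5 else 0
    # Flip the label with probability `noise` to mimic real-world uncertainty.
    if rng.random() < noise:
        label = 1 - label
    return label


rng = random.Random(0)
risky = {"tenure_months": 2, "monthly_usage_hours": 1,
         "customer_support_calls": 5, "payment_failures": 1}
loyal = {"tenure_months": 48, "monthly_usage_hours": 40,
         "customer_support_calls": 0, "payment_failures": 0}
risky_label = churn_label(risky, rng, noise=0.0)
loyal_label = churn_label(loyal, rng, noise=0.0)
```

    With `noise=0.0` the rules are deterministic; raising `noise` makes some labels disagree with the rules, which is the kind of imperfection the description alludes to.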
  9. Customer Churn Prediction Dataset

    • kaggle.com
    zip
    Updated Mar 31, 2025
    Cite
    Şahide Şeker (2025). Customer Churn Prediction Dataset [Dataset]. https://www.kaggle.com/datasets/sahideseker/customer-churn-prediction-dataset
    Available download formats: zip (9296 bytes)
    Dataset updated
    Mar 31, 2025
    Authors
    Şahide Şeker
    License

    Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
    License information was derived automatically

    Description

    🇬🇧 English:

    This synthetic dataset was designed for those who want to practice customer churn prediction using structured tabular data. It includes 1,000 customer records, each containing features such as age, service tenure, service type, monthly fee, and churn status.

    Use this dataset to:

    • Build classification models like Logistic Regression, Random Forest, or XGBoost
    • Explore churn-related patterns (e.g. short tenure, high price, mobile users)
    • Simulate real-world business scenarios without needing real customer data

    Features:

    • customer_id: Unique customer ID (e.g. C1001 to C2000)
    • age: Age of the customer
    • tenure: Number of months the customer has been active
    • service_type: Type of service used (internet, mobile, tv, bundle)
    • monthly_fee: Monthly subscription fee
    • churn: Whether the customer has left the service (1 = Yes, 0 = No)

    🇹🇷 Türkçe (translated to English):

    This synthetic dataset was created for researchers and students who want to work on customer churn prediction. It contains fake but realistic data for 1,000 customers covering age, service tenure, service type, monthly payment, and subscription status.

    With this dataset:

    • Classification models such as Logistic Regression, Random Forest, and XGBoost can be applied
    • Factors that affect churn behavior can be examined (e.g. short membership, high price, mobile users)
    • Business scenarios can be studied without access to real customer data

    🧾 Variables:

    • customer_id: Customer ID (e.g. C1001 – C2000)
    • age: Customer age
    • tenure: Number of months the customer has received the service
    • service_type: Type of service received (internet, mobile, tv, bundle)
    • monthly_fee: Monthly payment amount
    • churn: Whether the customer has left the service (1 = Yes, 0 = No)
  10. Patient Churn Prediction Dataset for Healthcare

    • kaggle.com
    zip
    Updated Jan 20, 2026
    Cite
    Nudrat Abbas (2026). Patient Churn Prediction Dataset for Healthcare [Dataset]. https://www.kaggle.com/datasets/nudratabbas/patient-churn-prediction-dataset-for-healthcare
    Available download formats: zip (54926 bytes)
    Dataset updated
    Jan 20, 2026
    Authors
    Nudrat Abbas
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Context

    Patient churn (attrition) is a critical challenge in healthcare, costing providers billions in lost revenue and disrupting continuity of care. Studies show:

    • Acquiring a new patient costs 5-25x more than retaining an existing one
    • 20-30% of patients switch providers annually
    • Low satisfaction and poor engagement are primary drivers of churn

    Understanding which patients are at risk of leaving—and why—enables healthcare organizations to:

    • Implement proactive retention strategies
    • Improve patient satisfaction and outcomes
    • Optimize resource allocation for high-risk patients
    • Increase lifetime patient value

    Content

    The dataset contains 2,000 patient records with detailed behavioral, satisfaction, and engagement metrics:

    Patient Demographics:

    • Age, Gender, Geographic location
    • Tenure with healthcare provider (months)

    Service Utilization:

    • Annual visit frequency
    • Missed appointments
    • Days since last interaction
    • Specialty of care

    Satisfaction Metrics:

    • Overall satisfaction score (1-5)
    • Wait time satisfaction
    • Staff interaction satisfaction
    • Provider rating

    Financial & Engagement Factors:

    • Insurance type
    • Average out-of-pocket costs
    • Billing issues flag
    • Patient portal usage
    • Referral behavior
    • Distance to facility

    Target Variable:

    • Churned (0 = Retained, 1 = Churned)

    Inspiration & Use Cases

    This dataset enables healthcare organizations to:

    1. Churn Prediction - Build machine learning models to identify at-risk patients before they leave
    2. Retention Marketing - Target high-risk patients with personalized retention campaigns
    3. Patient Experience Improvement - Identify key drivers of satisfaction and dissatisfaction
    4. Resource Optimization - Allocate retention resources to patients most likely to benefit
    5. Lifetime Value Analysis - Understand the long-term impact of patient retention
    6. Service Quality Enhancement - Pinpoint operational issues affecting patient loyalty

    Ideal for: healthcare analysts, data scientists, patient experience managers, and students learning classification algorithms

  11. Online Retail Customer Churn Prediction Dataset

    • kaggle.com
    zip
    Updated Jun 21, 2025
    Cite
    Sahil Islam007 (2025). Online Retail Customer Churn Prediction Dataset [Dataset]. https://www.kaggle.com/datasets/sahilislam007/online-retail-customer-churn-prediction-dataset
    Available download formats: zip (430917 bytes)
    Dataset updated
    Jun 21, 2025
    Authors
    Sahil Islam007
    License

    MIT License: https://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    This synthetic dataset simulates customer behavior data for an online retail company and is designed for Exploratory Data Analysis (EDA) and various machine learning tasks such as:

    • Customer segmentation
    • Churn prediction
    • Recommendation systems
    • Customer lifetime value estimation

    🔍 Dataset Overview: Each row represents a unique customer, and the columns provide information on their demographics, shopping habits, engagement with the website, and satisfaction.

    Columns:

    • CustomerID: Unique identifier for each customer
    • Age: Customer's age
    • Gender: Gender of the customer
    • Annual_Income_USD: Annual income in US dollars
    • Spending_Score: Score based on spending behavior (1–100)
    • Membership_Status: Customer loyalty level (Bronze to Platinum)
    • Preferred_Payment_Method: Payment method most often used
    • Region: Geographical region (e.g., North, South)
    • Total_Purchases: Total number of purchases made
    • Avg_Purchase_Value: Average value of each purchase
    • Last_Purchase_Date: Date of the most recent purchase
    • Churn: Whether the customer has churned (0 = No, 1 = Yes)
    • Satisfaction_Score: Satisfaction score (1–5 scale)
    • Website_Visits_Last_Month: Number of visits to the website last month
    • Avg_Time_Per_Visit_Minutes: Average time spent on website per visit
    • Support_Tickets_Last_6_Months: Number of support tickets raised
    • Referred_Friends: Number of friends referred to the platform

    ✅ Use Cases:

    • Churn Prediction: Predict if a customer will churn based on behavior and demographics.
    • Segmentation: Use clustering to segment customers by behavior (e.g., income, spending, satisfaction).
    • Classification/Regression: Predict customer satisfaction or spending score.
    • Recommendation Engines: Based on purchase history and behavior patterns.

  12. Data from: Customer Churn Dataset

    • kaggle.com
    zip
    Updated Dec 23, 2025
    Cite
    Sonal Shinde (2025). Customer Churn Dataset [Dataset]. https://www.kaggle.com/datasets/sonalshinde123/customer-churn-prediction-dataset
    Available download formats: zip (593965 bytes)
    Dataset updated
    Dec 23, 2025
    Authors
    Sonal Shinde
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    This dataset is a synthetic customer churn dataset designed to simulate real-world telecom customer behavior. It is generated using business-driven rules based on customer tenure, billing amount, contract type, service usage, and support interactions. Controlled randomness and noise are added to avoid perfect patterns and make the dataset suitable for realistic machine learning classification tasks. The dataset is ideal for beginners to practice exploratory data analysis, feature engineering, and customer churn prediction using machine learning models.

  13. SaaS Customer Churn Prediction Dataset

    • kaggle.com
    zip
    Updated Feb 14, 2026
    Cite
    Suhani Gupta_04 (2026). SaaS Customer Churn Prediction Dataset [Dataset]. https://www.kaggle.com/datasets/suhanigupta04/saas-customer-churn-prediction-dataset
    Available download formats: zip (114493 bytes)
    Dataset updated
    Feb 14, 2026
    Authors
    Suhani Gupta_04
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    📊 SaaS Customer Churn Prediction

    Context

    "TechFlow" is a fictional SaaS company providing project management software. Like many SaaS companies, they are experiencing customer churn (users cancelling subscriptions). The company has collected data on user usage, account age, and the textual content of their latest customer support interaction.

    Content

    The dataset contains 2,500 records (split into train.csv and test.csv). Each row represents a unique customer.

    Features

    • Customer_ID: Unique ID for the customer.
    • Name: Name of the customer.
    • Email: Email of the customer.
    • Account_Age_Days: Number of days the customer has been active.
    • Login_Frequency: Categorical (Daily, Weekly, Rarely).
    • Daily_Usage_Mins: Average minutes the customer spends on the platform per day.
    • Last_Support_Ticket: The full text of the customer's most recent support ticket.
    • Churn: Target label (1 = Churned/Left, 0 = Retained/Stayed).

    Inspiration

    Most churn datasets are purely numerical. This dataset challenges you to combine numerical analysis with Natural Language Processing (NLP).

    1. EDA: Does Login_Frequency correlate with Churn?
    2. NLP: Can you perform sentiment analysis on Last_Support_Ticket to see if angry customers are more likely to churn?
    3. Modeling: Build a model that uses both the usage metrics and the text data to predict churn with high accuracy.
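
    For the EDA question (does Login_Frequency correlate with Churn?), one simple starting point is the churn rate per Login_Frequency category. A minimal sketch in plain Python, assuming rows have been loaded from train.csv into dicts; the helper name and the sample rows are made up for illustration:

```python
from collections import defaultdict


def churn_rate_by(rows, key, target="Churn"):
    """Return {category: share of rows with target == 1} for the given column."""
    totals = defaultdict(int)
    churned = defaultdict(int)
    for row in rows:
        totals[row[key]] += 1
        churned[row[key]] += int(row[target])
    return {category: churned[category] / totals[category] for category in totals}


# Hypothetical rows, not taken from the dataset.
sample = [
    {"Login_Frequency": "Daily", "Churn": 0},
    {"Login_Frequency": "Daily", "Churn": 0},
    {"Login_Frequency": "Rarely", "Churn": 1},
    {"Login_Frequency": "Rarely", "Churn": 0},
]
rates = churn_rate_by(sample, "Login_Frequency")
```

    A large gap between categories would suggest that login frequency is worth keeping as a model feature, though a per-category rate alone does not establish a causal link.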

    Data Generation

    This is a synthetic dataset generated using Python's Faker library. Real-world patterns (e.g., angry support tickets leading to higher churn) were simulated using weighted probabilities to make the data useful for machine learning practice.

  14. Netflix Customer Churn dataset(upvote if you like)

    • kaggle.com
    zip
    Updated Jul 5, 2025
    Cite
    Abdul Wadood (2025). Netflix Customer Churn dataset(upvote if you like) [Dataset]. https://www.kaggle.com/datasets/abdulwadood11220/netflix-customer-churn-dataset
    Available download formats: zip (190004 bytes)
    Dataset updated
    Jul 5, 2025
    Authors
    Abdul Wadood
    License

    Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
    License information was derived automatically

    Description

    This dataset contains synthetic data simulating customer behavior for a Netflix-like video streaming service. It includes 5,000 records with 14 carefully engineered features designed for churn prediction modeling, business insights, and customer segmentation.

    The dataset is ideal for:

    • Machine learning classification tasks (churn vs. non-churn)
    • Exploratory data analysis (EDA)
    • Customer behavior modeling in OTT platforms

  15. Customer Churn Prediction Dataset_ 1M

    • kaggle.com
    zip
    Updated Jan 11, 2026
    Cite
    Isandeep06 (2026). Customer Churn Prediction Dataset_ 1M [Dataset]. https://www.kaggle.com/datasets/isandeep06/customer-churn-prediction-dataset-1m
    Available download formats: zip (42854638 bytes)
    Dataset updated
    Jan 11, 2026
    Authors
    Isandeep06
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Customer Churn Prediction Dataset

    Overview

    A comprehensive, production-scale synthetic dataset for customer churn prediction in the telecommunications industry. This dataset contains 1,000,000 customer records with 28 features designed to simulate real-world churn behavior while maintaining privacy compliance. Perfect for exploring machine learning, feature engineering, and business analytics in customer retention scenarios.

    Business Context

    Telecommunications companies face significant customer attrition (churn) that directly impacts revenue. Predicting which customers are likely to churn allows businesses to implement targeted retention strategies, optimize marketing spend, and improve customer lifetime value.

    Dataset Details

    • Rows: 1,000,000 customer records
    • Columns: 28 features + 1 target variable
    • Size: ~200 MB (compressed ~40 MB)
    • Churn Rate: 20-25% (industry-realistic)
    • Time Span: 5 years of customer data
    • Format: CSV with UTF-8 encoding

    Column Descriptions

    Customer Demographics (8 features)

    Column | Type | Description | Missing Values
    customer_id | String | Unique customer identifier | 0%
    signup_date | Date | Customer registration date | 0%
    age | Integer | Customer age (18-90) | 0%
    gender | Categorical | Gender: Male, Female, Other | 0%
    annual_income | Float | Annual income in USD | 3%
    education | Categorical | Education level | 0%
    marital_status | Categorical | Marital status | 0%
    dependents | Integer | Number of dependents (0-5) | 0%
    senior_citizen | Binary | 1 if age ≥ 65, else 0 | 0%

    Account Information (4 features)

    Column | Type | Description | Missing Values
    tenure | Integer | Months with company (1-72) | 0%
    contract | Categorical | Contract type: month_to_month, one_year, two_year | 0%
    payment_method | Categorical | Payment method | 0%
    paperless_billing | Categorical | Paperless billing: Yes/No | 0%

    Service Usage & Billing (11 features)

    Column | Type | Description | Missing Values
    monthlycharges | Float | Monthly service charges ($20-$200) | 0%
    totalcharges | Float | Total charges to date | 0%
    num_services | Integer | Number of subscribed services (1-6) | 0%
    has_phone_service | Binary | 1 if has phone service | 0%
    has_internet_service | Binary | 1 if has internet service | 0%
    has_online_security | Binary | 1 if has online security | 0%
    has_online_backup | Binary | 1 if has online backup | 0%
    has_device_protection | Binary | 1 if has device protection | 0%
    has_tech_support | Binary | 1 if has tech support | 0%
    has_streaming_tv | Binary | 1 if has streaming TV | 0%
    has_streaming_movies | Binary | 1 if has streaming movies | 0%

    Customer Behavior & Risk (7 features)

    Column                       Type     Description                        Missing Values
    customer_satisfaction        Integer  Satisfaction score (1-10)          2%
    num_complaints               Integer  Complaints in last year (0-8)      3%
    num_service_calls            Integer  Service calls last month (0-12)    0%
    late_payments                Integer  Late payments last 3 months (0-5)  0%
    avg_monthly_gb               Float    Average monthly data usage (GB)    5%
    days_since_last_interaction  Integer  Days since last contact (1-365)    0%
    credit_score                 Integer  Credit score (300-850)             4%

    Target Variable

    Column  Type    Description                          Missing Values
    churn   Binary  Target: 1 if churned, 0 if retained  0%

    Key Dataset Characteristics

    Realistic Patterns

    • New customers (tenure < 6 months) have 2-3x higher churn risk
    • Monthly contracts show 40%+ churn rates vs <10% for two-year contracts
    • High satisfaction (8+) reduces churn probability by 60%
    • Late payments increase churn risk substantially
    • Service bundling (multiple services) improves retention
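
    The patterns listed above could be checked with a simple grouped churn rate. The following is a minimal sketch using only the Python standard library; the helper name `churn_rate_by` and the sample rows are illustrative assumptions and are not taken from the dataset. Only the column names `tenure`, `contract`, and `churn` come from the description.

```python
from collections import defaultdict

def churn_rate_by(rows, key):
    """Return {group: fraction of churned rows} with groups given by key(row)."""
    totals = defaultdict(int)
    churned = defaultdict(int)
    for row in rows:
        group = key(row)
        totals[group] += 1
        churned[group] += int(row["churn"])
    return {group: churned[group] / totals[group] for group in totals}

# Tiny made-up sample for illustration only, NOT real dataset rows.
sample = [
    {"tenure": 2, "contract": "month_to_month", "churn": 1},
    {"tenure": 3, "contract": "month_to_month", "churn": 1},
    {"tenure": 30, "contract": "month_to_month", "churn": 0},
    {"tenure": 40, "contract": "two_year", "churn": 0},
    {"tenure": 5, "contract": "two_year", "churn": 0},
    {"tenure": 60, "contract": "two_year", "churn": 1},
]

# Churn rate per contract type.
by_contract = churn_rate_by(sample, lambda r: r["contract"])

# Churn rate for new customers (tenure < 6 months) versus everyone else.
by_tenure = churn_rate_by(
    sample, lambda r: "new" if r["tenure"] < 6 else "established"
)
```

    On the full file, the same helper could be fed rows from `csv.DictReader`, converting `tenure` to an integer first.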

    Data Quality Features

    • Missing values: Strategically introduced (2-5%) in key columns
    • Outliers: Mild outliers in monetary columns (0.1% of data)
    • Skewed distributions: Income (lognormal), tenure (exponential)
    • Correlations: Services correlate with internet subscription
    • Noise: ±10% noise in totalcharges ≈ monthlycharges × tenure
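
    The noise relationship above suggests a row-level sanity check. This is a hedged sketch: the function name `charges_consistent` and the default tolerance are assumptions chosen to mirror the stated ±10% noise, and the description also mentions mild outliers, so some rows in the real file may legitimately fail the check.

```python
def charges_consistent(monthlycharges, tenure, totalcharges, tolerance=0.10):
    """Check that totalcharges is within `tolerance` of monthlycharges * tenure."""
    expected = monthlycharges * tenure
    if expected == 0:
        # Avoid a meaningless relative comparison when nothing is expected.
        return totalcharges == 0
    return abs(totalcharges - expected) <= tolerance * expected

# Made-up values for illustration: 12 months at $50 gives $600 expected.
within_noise = charges_consistent(50.0, 12, 630.0)   # 5% above expected
outside_noise = charges_consistent(50.0, 12, 700.0)  # about 17% above expected
```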

    Interaction Effects

    • Low tenure × monthly contract → Very high churn risk
    • High charges × low satisfaction → Elevated churn probability
    • Many service calls × no tech support → Increased churn likelihood

    Potential Use Cases

    Machine Learning

    • Binary classification (churn prediction)
    • Fea...
  16. Customer Churn Prediction Dataset

    • kaggle.com
    zip
    Updated Sep 18, 2020
    Cite
    Study Mart (2020). Customer Churn Prediction Dataset [Dataset]. https://www.kaggle.com/studymart/customer-churn-prediction
    Explore at:
    Available download formats: zip (175743 bytes)
    Dataset updated
    Sep 18, 2020
    Authors
    Study Mart
    Description

    Dataset

    This dataset was created by Study Mart

    Contents

  17. Real World Customer Churn Dataset

    • kaggle.com
    zip
    Updated Oct 24, 2023
    Cite
    Lasal Jayawardena (2023). Real World Customer Churn Dataset [Dataset]. https://www.kaggle.com/datasets/lasaljaywardena/real-world-churn/code
    Explore at:
    Available download formats: zip (415985883 bytes)
    Dataset updated
    Oct 24, 2023
    Authors
    Lasal Jayawardena
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Area covered
    World
    Description

    60,000+ Real Anonymized Customer Usage Data for Churn Prediction!

    Dataset Information

    • Dataset Name: Real World Customer Churn Dataset in Telco Domain
    • Snapshot Period: January 1, 2023, to March 31, 2023
    • Source: One of the Largest Telco Companies in Sri Lanka
    • Data Anonymization: The Dataset is Anonymized to Protect Customer Privacy.

    Overview

    The "Real World Customer Churn Dataset in Telco Domain" is a comprehensive collection of anonymized data that provides insights into customer behavior and churn prediction within the telecommunications industry.


    Usage Categories

    The dataset contains data on over 60,000 customers across more than 10 distinct usage categories. Some of the key usage categories include:

    • usage_app_youtube_daily: YouTube Traffic in MBs.
    • usage_app_facebook_daily: Facebook Traffic in MBs.
    • usage_app_tiktok_daily: TikTok Traffic in MBs.
    • usage_app_whatsapp_daily: WhatsApp Traffic in MBs.
    • usage_app_helakuru_daily: Helakuru App traffic in MBs.
    • usage_voice_o2o_outgoing: Outgoing call volume in minutes within the same operator.
    • usage_voice_o2op_outgoing: Outgoing call volume in minutes from the operator to other operators.
    • usage_voice_o2o_incoming: Incoming call volume in minutes within the same operator.
    • usage_voice_op2o_incoming: Incoming call volume in minutes from other operators to the operator.
    • usage_pack_data: Spend in LKR for data package purchasing.
    • usage_pack_vas: Spend in LKR for value-added service rentals or usage.

    Dataset Files

    The dataset consists of the following key files:

    1. main.csv: An aggregated dataset that compiles usage data from all usage categories, providing a holistic view of customer behavior.
    2. raw_dump folder: The raw data export, preserving the original source data for detailed exploration.
    3. test and train folders: These folders contain customer IDs and corresponding Churn Labels, facilitating model training and testing.
    4. usage_profiles folder: It comprises broken-down data frames for each customer under specific usage categories, allowing in-depth analysis of individual customer behavior within various usage categories.

    Potential Use Cases

    The "Real World Customer Churn Dataset in Telco Domain" offers a range of potential use cases, including:

    • Customer Churn Prediction: Leveraging customer usage patterns to predict and reduce churn.
    • Targeted Marketing: Designing customized marketing campaigns based on customer preferences.
    • Service Quality Enhancement: Identifying areas for service improvement, such as network quality.
    • Revenue Optimization: Maximizing revenue through the analysis of data package spending and value-added service usage.

    Dataset Importance

    This dataset's real-world aspect is of significant importance. It reflects actual customer interactions with a major telecommunications company in Sri Lanka, offering insights that can be directly applied to real-world scenarios. The dataset is sourced from one of the largest telco companies in the country, adding credibility and relevance to the insights it provides.

    Understanding customer churn and usage behavior is pivotal for the telecommunications industry, and this dataset empowers researchers, data scientists, and businesses to gain deeper insights into these aspects.

    Disclaimer

    The dataset is anonymized to protect customer privacy, and all data used is in compliance with privacy regulations and agreements. Users are encouraged to explore and contribute to the "Real World Customer Churn Dataset in Telco Domain."

    Thank you for your valuable contributions to this dataset.

  18. Bank Customer Churn Dataset

    • kaggle.com
    zip
    Updated Jul 11, 2023
    + more versions
    Cite
    Bhuvi Ranga (2023). Bank Customer Churn Dataset [Dataset]. https://www.kaggle.com/datasets/bhuviranga/customer-churn-data
    Explore at:
    Available download formats: zip (191965 bytes)
    Dataset updated
    Jul 11, 2023
    Authors
    Bhuvi Ranga
    License

    http://opendatacommons.org/licenses/dbcl/1.0/

    Description

    The customer churn dataset is a collection of customer data for predicting customer churn, the tendency of customers to stop using a company's products or services. The dataset contains features that describe each customer: credit score, country, gender, age, tenure, balance, number of products, credit card status, active membership, estimated salary, and churn status. The churn status indicates whether a customer has churned or not. The dataset is used to analyze the factors that contribute to customer churn and to build predictive models that identify customers at risk of churning. The goal is to develop strategies and interventions to reduce churn and improve customer retention.

  19. Tour & Travels Customer Churn Prediction

    • kaggle.com
    zip
    Updated Oct 31, 2021
    Cite
    Tejashvi (2021). Tour & Travels Customer Churn Prediction [Dataset]. https://www.kaggle.com/datasets/tejashvi14/tour-travels-customer-churn-prediction
    Explore at:
    Available download formats: zip (3537 bytes)
    Dataset updated
    Oct 31, 2021
    Authors
    Tejashvi
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    A tour and travels company wants to predict whether a customer will churn or not based on the indicators given below. Help build predictive models and save the company's money. Perform fascinating EDAs. The data was used for practice purposes and also during a mini hackathon. It's completely free to use.

  20. Telecom Churn Predict

    • kaggle.com
    zip
    Updated Aug 11, 2023
    Cite
    Swaraj Khan (2023). Telecom Churn Predict [Dataset]. https://www.kaggle.com/datasets/swarajkhan/telecom-churn-predict
    Explore at:
    Available download formats: zip (27579 bytes)
    Dataset updated
    Aug 11, 2023
    Authors
    Swaraj Khan
    License

    Attribution-ShareAlike 4.0 (CC BY-SA 4.0), https://creativecommons.org/licenses/by-sa/4.0/
    License information was derived automatically

    Description

    "Telecom Customer Churn Prediction Dataset" is a synthetic dataset designed to simulate customer data for a telecommunications company. This dataset is created for the purpose of predicting customer churn, which refers to the phenomenon of customers discontinuing their services with the company. The dataset contains a variety of features that capture different aspects of customer behavior and characteristics.

    The dataset includes information such as customer age, gender, contract type, monthly charges, total amount spent, number of devices connected, and the number of customer support calls made. The key focus of this dataset is the binary target variable "Churn," which indicates whether a customer has churned (1) or not (0). This variable is essential for training and evaluating predictive models aimed at identifying customers who are likely to leave the service.

Cite
Saurabh Badole (2024). Banking Customer Churn Prediction Dataset [Dataset]. https://www.kaggle.com/datasets/saurabhbadole/bank-customer-churn-prediction-dataset

Banking Customer Churn Prediction Dataset

Understanding Customer Behavior and Predicting Churn in Banking Institutions

Explore at:
27 scholarly articles cite this dataset (View in Google Scholar)
Available download formats: zip (267794 bytes)
Dataset updated
May 16, 2024
Authors
Saurabh Badole
License

Attribution-NonCommercial-NoDerivs 4.0 (CC BY-NC-ND 4.0), https://creativecommons.org/licenses/by-nc-nd/4.0/
License information was derived automatically

Description

Description:

This dataset contains information about bank customers and their churn status, which indicates whether they have exited the bank or not. It is suitable for exploring and analyzing factors influencing customer churn in banking institutions and for building predictive models to identify customers at risk of churning.

Features:

RowNumber: The sequential number assigned to each row in the dataset.

CustomerId: A unique identifier for each customer.

Surname: The surname of the customer.

CreditScore: The credit score of the customer.

Geography: The geographical location of the customer (e.g., country or region).

Gender: The gender of the customer.

Age: The age of the customer.

Tenure: The number of years the customer has been with the bank.

Balance: The account balance of the customer.

NumOfProducts: The number of bank products the customer has.

HasCrCard: Indicates whether the customer has a credit card (binary: yes/no).

IsActiveMember: Indicates whether the customer is an active member (binary: yes/no).

EstimatedSalary: The estimated salary of the customer.

Exited: Indicates whether the customer has exited the bank (binary: yes/no).

Usage:

  • This dataset can be used for exploratory data analysis to understand the factors influencing customer churn in banks.
  • It can also be used to build machine learning models for predicting customer churn based on the given features.
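
As a starting point for the modeling use mentioned above, identifier columns such as RowNumber, CustomerId, and Surname are commonly set aside before training, since they describe the row rather than customer behavior. The sketch below is a minimal illustration using only the standard library. The function name `split_features_target`, the sample record, and the assumption that Exited is stored as 0/1 are all hypothetical; only the column names come from the feature list.

```python
ID_COLUMNS = {"RowNumber", "CustomerId", "Surname"}
TARGET = "Exited"

def split_features_target(record):
    """Drop identifier columns and separate the target from one record."""
    features = {
        name: value
        for name, value in record.items()
        if name not in ID_COLUMNS and name != TARGET
    }
    # Assumption: the target is encoded as 0/1 in the file.
    return features, int(record[TARGET])

# Made-up record for illustration; the values are not from the dataset.
example = {
    "RowNumber": 1, "CustomerId": 101, "Surname": "Doe",
    "CreditScore": 650, "Geography": "France", "Gender": "Female",
    "Age": 40, "Tenure": 3, "Balance": 1000.0, "NumOfProducts": 2,
    "HasCrCard": 1, "IsActiveMember": 0, "EstimatedSalary": 50000.0,
    "Exited": 1,
}
features, target = split_features_target(example)
```

Categorical fields such as Geography and Gender would still need encoding before most models can use them.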

License:

This dataset is made available under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
