http://opendatacommons.org/licenses/dbcl/1.0/
The customer churn dataset is a collection of customer data that focuses on predicting customer churn, which refers to the tendency of customers to stop using a company's products or services. The dataset contains various features that describe each customer, such as their credit score, country, gender, age, tenure, balance, number of products, credit card status, active membership, estimated salary, and churn status. The churn status indicates whether a customer has churned or not. The dataset is used to analyze and understand factors that contribute to customer churn and to build predictive models to identify customers at risk of churning. The goal is to develop strategies and interventions to reduce churn and improve customer retention
https://creativecommons.org/publicdomain/zero/1.0/
This dataset contains customer information collected from a consumer credit card portfolio, with the aim of helping analysts predict customer attrition. It includes demographic details such as age, gender, marital status and income category, as well as details of each customer’s relationship with the credit card provider, such as card type, number of months on book and inactive periods. It also holds data on customers’ spending behavior in the period leading up to their churn decision, such as total revolving balance, credit limit and average open-to-buy rate, along with metrics such as the total amount change from quarter 4 to quarter 1, the average utilization ratio and a Naive Bayes classifier attrition flag (card category combined with contacts count in a 12-month period, dependent count, education level and months inactive). Taken together, these variables capture up-to-date information that can indicate long-term account stability or an impending departure, which helps when managing a portfolio or serving individual customers.
This dataset can be used to analyze the key factors that influence customer attrition. Analysts can use this dataset to understand customer demographics, spending patterns, and relationship with the credit card provider to better predict customer attrition.
- Using the customer demographics, such as gender, marital status, education level and income category to determine which customer demographic is more likely to churn.
- Analyzing the customer’s spending behavior leading up to churning and using this data to better predict the likelihood of a customer churning in the future.
- Creating a classifier that can predict which customers are more susceptible to attrition based on their credit score, credit limit, utilization ratio and other spending behavior metrics over time; this could be used as an early warning system for predicting potential attrition before it happens.
If you use this dataset in your research, please credit the original authors.
License: CC0 1.0 Universal (CC0 1.0) - Public Domain Dedication No Copyright - You can copy, modify, distribute and perform the work, even for commercial purposes, all without asking permission. See Other Information.
File: BankChurners.csv

| Column name | Description |
|:---|:---|
| CLIENTNUM | Unique identifier for each customer. (Integer) |
| Attrition_Flag | Flag indicating whether or not the customer has churned out. (Boolean) |
| Customer_Age | Age of customer. (Integer) |
| Gender | Gender of customer. (String) |
| Dependent_count | Number of dependents that customer has. (Integer) |
| Education_Level ...
https://creativecommons.org/publicdomain/zero/1.0/
This dataset was created by Avinash Bhardwaz
Released under CC0: Public Domain
Bank Customer Churn Prediction dataset. Source: https://huggingface.co/datasets/krisnadwipaj/customer-churn
In comparison to the original dataset mentioned in S4E1 Playground Series Data description this dataset has 4 additional columns: Complain, Satisfaction Score, Card Type, Points Earned
https://www.googleapis.com/download/storage/v1/b/kaggle-user-content/o/inbox%2F10356799%2Fcd443895ee4018e0b563a722695cb2d6%2FScreenshot%202024-01-23%20at%2022.12.01.png?generation=1706044758886339&alt=media
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Analysis of ‘Churn for Bank Customers’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://www.kaggle.com/mathchi/churn-for-bank-customers on 12 November 2021.
--- Dataset description provided by original source is as follows ---
As we know, it is much more expensive to sign up a new client than to keep an existing one.
It is advantageous for banks to know what leads a client towards the decision to leave the company.
Churn prevention allows companies to develop loyalty programs and retention campaigns to keep as many customers as possible.
--- Original source retains full ownership of the source dataset ---
Apache License, v2.0 https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
This dataset provides comprehensive information about a bank's customers, focusing on their demographic, financial, and account activity details. It is designed to help analyze factors influencing customer churn and develop predictive models for customer retention strategies.
This dataset is perfect for beginners and professionals alike to explore customer churn prediction, develop insights, and create impactful business solutions.
Apache License, v2.0 https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
The data will be used to predict whether a customer of the bank will churn. If a customer churns, it means they left the bank and took their business elsewhere. If you can predict which customers are likely to churn, you can take measures to retain them before they do. These measures could be promotions, discounts, or other incentives to boost customer satisfaction and, therefore, retention.
The dataset contains:
10,000 rows – each row is a unique customer of the bank
14 columns:
RowNumber: Row numbers from 1 to 10,000
CustomerId: Customer’s unique ID assigned by bank
Surname: Customer’s last name
CreditScore: Customer’s credit score. This number can range from 300 to 850.
Geography: Customer’s country of residence
Gender: Categorical indicator
Age: Customer’s age (years)
Tenure: Number of years customer has been with bank
Balance: Customer’s bank balance (Euros)
NumOfProducts: Number of products the customer has with the bank
HasCrCard: Indicates whether the customer has a credit card with the bank
IsActiveMember: Indicates whether the customer is considered active
EstimatedSalary: Customer’s estimated annual salary (Euros)
Exited: Indicates whether the customer churned (left the bank)
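The column list above can be turned into a quick sanity check before any modeling. The following is a minimal sketch using only the Python standard library: it verifies that a CSV has the 14 listed columns, flags credit scores outside the stated 300 to 850 range, and computes the share of churned customers. The sample rows are made up for illustration and are not taken from the dataset; the helper names `load_rows`, `out_of_range_scores` and `churn_rate` are hypothetical.

```python
import csv
import io

# Column names as listed in the dataset description above.
EXPECTED_COLUMNS = [
    "RowNumber", "CustomerId", "Surname", "CreditScore", "Geography",
    "Gender", "Age", "Tenure", "Balance", "NumOfProducts", "HasCrCard",
    "IsActiveMember", "EstimatedSalary", "Exited",
]

# Made-up rows for illustration only; they are NOT real dataset records.
SAMPLE_CSV = """RowNumber,CustomerId,Surname,CreditScore,Geography,Gender,Age,Tenure,Balance,NumOfProducts,HasCrCard,IsActiveMember,EstimatedSalary,Exited
1,1001,Alpha,620,France,Female,41,2,0.00,1,1,1,50000.00,1
2,1002,Beta,710,Spain,Male,35,5,82000.50,2,0,1,64000.00,0
3,1003,Gamma,905,Germany,Female,52,8,120500.25,1,1,0,98000.00,0
4,1004,Delta,480,France,Male,29,1,0.00,2,1,1,31000.00,1
"""


def load_rows(text):
    """Parse CSV text and fail early if the header differs from the description."""
    reader = csv.DictReader(io.StringIO(text))
    if reader.fieldnames != EXPECTED_COLUMNS:
        raise ValueError(f"unexpected columns: {reader.fieldnames}")
    return list(reader)


def out_of_range_scores(rows, low=300, high=850):
    """Return CustomerId values whose CreditScore falls outside [low, high]."""
    return [
        int(row["CustomerId"])
        for row in rows
        if not low <= int(row["CreditScore"]) <= high
    ]


def churn_rate(rows):
    """Fraction of rows with Exited == 1."""
    return sum(int(row["Exited"]) for row in rows) / len(rows)


rows = load_rows(SAMPLE_CSV)
bad_ids = out_of_range_scores(rows)
rate = churn_rate(rows)
print(f"rows={len(rows)} out_of_range={bad_ids} churn_rate={rate:.2f}")
```

To run the same checks on the real file, the `SAMPLE_CSV` text would be replaced by the contents of the downloaded CSV.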
https://creativecommons.org/publicdomain/zero/1.0/
This dataset simulates customer behavior for a fictional telecommunications company. It contains demographic information, account details, services subscribed to, and whether the customer ultimately churned (stopped using the service) or not. The data is synthetically generated but designed to reflect realistic patterns often found in telecom churn scenarios.
Purpose:
The primary goal of this dataset is to provide a clean and straightforward resource for beginners learning about:
Features:
The dataset includes the following columns:
CustomerID: Unique identifier for each customer.
Age: Customer's age in years.
Gender: Customer's gender (Male/Female).
Location: General location of the customer (e.g., New York, Los Angeles).
SubscriptionDurationMonths: How many months the customer has been subscribed.
MonthlyCharges: The amount the customer is charged each month.
TotalCharges: The total amount the customer has been charged over their subscription period.
ContractType: The type of contract the customer has (Month-to-month, One year, Two year).
PaymentMethod: How the customer pays their bill (e.g., Electronic check, Credit card).
OnlineSecurity: Whether the customer has online security service (Yes, No, No internet service).
TechSupport: Whether the customer has tech support service (Yes, No, No internet service).
StreamingTV: Whether the customer has TV streaming service (Yes, No, No internet service).
StreamingMovies: Whether the customer has movie streaming service (Yes, No, No internet service).
Churn: (Target Variable) Whether the customer churned (1 = Yes, 0 = No).
Data Quality:
This dataset is intentionally clean with no missing values, making it easy for beginners to focus on analysis and modeling concepts without complex data cleaning steps.
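Because the target is encoded as 1 = Yes and 0 = No and there are no missing values, a first exploratory step is to compare churn rates across a categorical column. The sketch below does this for ContractType with the standard library only. The records are invented for illustration and use just a subset of the listed columns; `churn_rate_by` is a hypothetical helper name.

```python
from collections import defaultdict

# Invented records using a subset of the columns described above.
customers = [
    {"CustomerID": 1, "ContractType": "Month-to-month", "MonthlyCharges": 70.0, "Churn": 1},
    {"CustomerID": 2, "ContractType": "Month-to-month", "MonthlyCharges": 55.5, "Churn": 0},
    {"CustomerID": 3, "ContractType": "One year", "MonthlyCharges": 80.0, "Churn": 0},
    {"CustomerID": 4, "ContractType": "Two year", "MonthlyCharges": 95.0, "Churn": 0},
    {"CustomerID": 5, "ContractType": "Month-to-month", "MonthlyCharges": 60.0, "Churn": 1},
    {"CustomerID": 6, "ContractType": "One year", "MonthlyCharges": 65.0, "Churn": 1},
]


def churn_rate_by(records, key):
    """Return {group value: fraction of records in that group with Churn == 1}."""
    totals = defaultdict(int)
    churned = defaultdict(int)
    for record in records:
        totals[record[key]] += 1
        churned[record[key]] += record["Churn"]
    return {group: churned[group] / totals[group] for group in totals}


rates = churn_rate_by(customers, "ContractType")
for contract, value in sorted(rates.items()):
    print(f"{contract}: {value:.2f}")
```

The same helper works for any other categorical column, for example `churn_rate_by(customers, "PaymentMethod")` once that field is present in the records.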
Inspiration:
Understanding customer churn is crucial for many businesses. This dataset provides a sandbox environment to practice the fundamental techniques used in churn analysis and prediction.
https://cdla.io/permissive-1-0/
This dataset was adapted from 'Credit Card Churn Prediction' (https://www.kaggle.com/datasets/anwarsan/credit-card-bank-churn) for visualization in our university project. We have modified the customer information and spending behavior, and added revenue targets.
Scenario 🕶️
In 2019, the marketing team launched a campaign to attract millennial customers (born 1980-1996) with the goal of increasing revenue and enhancing the brand's appeal to a younger audience.
As the BI team, your task is to create a dashboard for users.
1. The Vice President of Sales wants to view the performance of the credit business.
2. The marketing team is interested in understanding customer segments and customer spending to measure Customer Lifetime Value (CLV) and Marketing Cost per Acquired Customer (MCAC).
⚠️Note: This is just a suggestion to guide the creation of the dashboard
Example in Tableau
Executive summary
https://www.googleapis.com/download/storage/v1/b/kaggle-user-content/o/inbox%2F10099382%2F508a2d2d89dabdfd368743f86c2a71e1%2Fexecutive%20overview.JPG?generation=1696110593484137&alt=media
Customer behavior
https://www.googleapis.com/download/storage/v1/b/kaggle-user-content/o/inbox%2F10099382%2F1e4a1f62a25eab3c6707d002243894c7%2Fcustomer_behaviour.JPG?generation=1696110689732332&alt=media
Attribution-ShareAlike 4.0 (CC BY-SA 4.0) https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
Dataset Description
This dataset contains information about 8,500+ mobile service customers, including demographic details, device usage, billing patterns, and call behavior. The primary goal of this dataset is to enable analysis and modeling to predict customer churn, i.e., customers who decide to drop their mobile service provider.
The data includes 33 features and one binary target column (customer_dropped). This dataset is ideal for exploring churn prediction models, customer segmentation, lifetime value analysis, and marketing strategy development.
Features
- customer_id: Unique identifier for each customer
- age: Age of the customer
- job: Occupation or profession of the customer
- urban_rural: Indicates whether the customer resides in an urban or rural area
- marital_status: Marital status of the customer
- kids: Number of children the customer has
- disposable_income: Disposable income of the customer
- mobiles_changed: Number of times the customer has changed their mobile device
- mobile_age: Age of the current mobile device
- own_smartphone: Indicates whether the customer owns a smartphone
- current_mobile_price: Price of the customer's current mobile device
- credit_card_type: Type of credit card held
- own_house: Indicates whether the customer owns a house
- own_cr_card: Indicates whether the customer owns a credit card
- monthly_bill: Monthly bill for mobile service
- call_mins: Total call minutes used
- basic_plan_amount: Basic mobile plan amount
- extra_mins: Extra minutes used beyond the plan
- roam_call_mins: Roaming call minutes
- call_mins_delta: Change in call minutes compared to the previous billing period
- bill_amount_delta: Change in bill amount compared to the previous billing period
- incoming_call_mins: Total incoming call minutes
- outgoing_calls: Number of outgoing calls
- incoming_calls: Number of incoming calls
- day_night_call_ratio: Ratio of call minutes during the day versus night
- day_night_call_delta: Change in day vs night call minutes compared to the previous period
- calls_dropped: Number of calls dropped
- loyalty_months: Customer tenure in months
- complaint_calls: Number of complaint calls made
- promo_calls_made: Number of promotional calls made
- promo_offers_accepted: Number of promotional offers accepted
- new_numbers_called: Number of new contacts called
- customer_dropped: Target column indicating churn (1 = churned, 0 = retained)
Use Cases
- Develop machine learning models for churn prediction
- Perform customer segmentation and behavioral profiling
- Analyze call usage trends and billing sensitivity
- Identify key drivers of customer loyalty or attrition
- Design data-driven retention strategies
Open Database License (ODbL) v1.0 https://www.opendatacommons.org/licenses/odbl/1.0/
License information was derived automatically
The dataset contains the following columns:
Customer ID: A unique identifier for each customer.
Customer Name: The name of the customer (generated by Faker).
Customer Age: The age of the customer (generated by Faker).
Gender: The gender of the customer (generated by Faker).
Purchase Date: The date of each purchase made by the customer.
Product Category: The category or type of the purchased product.
Product Price: The price of the purchased product.
Quantity: The quantity of the product purchased.
Total Purchase Amount: The total amount spent by the customer in each transaction.
Payment Method: The method of payment used by the customer (e.g., credit card, PayPal).
Returns: Whether the customer returned any products from the order (binary: 0 for no return, 1 for return).
Churn: A binary column indicating whether the customer has churned (0 for retained, 1 for churned).
Note:
MIT License https://opensource.org/licenses/MIT
License information was derived automatically
Credit risk assessment remains a critical function within financial services, influencing lending decisions, portfolio risk management, and regulatory compliance. This dataset integrates multiple categories of financial, transactional, and behavioral data to enable advanced machine learning applications in the domain of financial risk modeling.
The dataset comprises a total of 1,212 distinct features, systematically grouped into four principal categories, alongside a binary target variable. Each feature category represents a specific dimension of credit risk assessment, reflecting both internal transactional data and externally sourced credit bureau information.
The dependent variable, denoted as bad_flag, represents a binary risk classification outcome associated with each customer account. The variable takes the following values:
This variable serves as the target for binary classification models aimed at predicting credit risk propensity.
Category | Number of Features | Description |
---|---|---|
Transaction Attributes | 664 | Customer-level transaction behavior, repayment patterns, financial habits |
Bureau Credit Data | 452 | Credit scores, external bureau records, delinquency flags, historical credit data |
Bureau Enquiries | 50 | Credit inquiry history, frequency and type of external credit applications |
ONUS Attributes | 48 | Internal bank relationship metrics, account engagement indicators |
Each feature within a category follows a systematic sequential naming convention (e.g., transaction_attribute_1, bureau_1), facilitating programmatic identification and group-level analysis.
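The sequential naming convention makes it straightforward to group columns by category in code. The sketch below is one possible approach, assuming every feature name ends in an underscore followed by a number; it uses only the two prefixes given as examples above (transaction_attribute and bureau), and the column list is invented for illustration. `group_by_prefix` is a hypothetical helper name.

```python
import re
from collections import defaultdict

# Splits a name such as "transaction_attribute_12" into
# prefix "transaction_attribute" and index "12".
FEATURE_NAME = re.compile(r"^(?P<prefix>.+)_(?P<index>\d+)$")


def group_by_prefix(columns):
    """Group column names by prefix, ordering each group by numeric suffix.

    Names without a numeric suffix (for example the target column) are skipped.
    """
    groups = defaultdict(list)
    for name in columns:
        match = FEATURE_NAME.match(name)
        if match:
            groups[match.group("prefix")].append((int(match.group("index")), name))
    return {prefix: [name for _, name in sorted(items)] for prefix, items in groups.items()}


# Invented column list for illustration; the real dataset has far more columns.
columns = [
    "bad_flag",
    "transaction_attribute_2",
    "transaction_attribute_1",
    "bureau_10",
    "bureau_2",
    "bureau_1",
]

groups = group_by_prefix(columns)
for prefix, names in groups.items():
    print(prefix, len(names), names)
```

Sorting by the integer suffix rather than the raw string keeps bureau_2 ahead of bureau_10, which plain alphabetical ordering would not.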
The dataset exhibits several characteristics that mirror operational credit risk data environments:
The dataset was constructed by simulating data generation processes typical within financial services institutions. Transactional behaviors, bureau records, and inquiry histories were aggregated and engineered into derivative features.
MIT License https://opensource.org/licenses/MIT
License information was derived automatically
This dataset provides the scikit-survival 0.23.1 Python package in .whl format, enabling users to perform survival analysis using machine learning techniques. scikit-survival is a powerful library that extends scikit-learn to handle censored data, commonly encountered in medical research, reliability engineering, and event-time prediction tasks.
To install the package, first download the .whl file from this Kaggle dataset. Then install it using pip:
pip install scikit_survival-0.23.1-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl
Ensure that you have Python 3.10 installed, as this wheel is built specifically for that version (the cp310 tag in the filename denotes CPython 3.10).
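The interpreter a wheel targets is encoded in its filename, so compatibility can be checked before running pip. The following is a minimal sketch that splits the filename above into its tags and compares the Python tag with the running interpreter. It relies on the standard wheel filename layout (name, version, optional build tag, Python tag, ABI tag, platform tag); the helper names `parse_wheel_filename` and `cpython_version` are hypothetical, and the check covers only CPython tags of the form cpXY.

```python
import sys

WHEEL = "scikit_survival-0.23.1-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl"


def parse_wheel_filename(filename):
    """Split a wheel filename into its name, version and compatibility tags."""
    if not filename.endswith(".whl"):
        raise ValueError("not a wheel file")
    parts = filename[: -len(".whl")].split("-")
    # Five parts normally; six when an optional build tag is present.
    if len(parts) not in (5, 6):
        raise ValueError(f"unexpected wheel filename: {filename}")
    python_tag, abi_tag, platform_tag = parts[-3:]
    return {
        "name": parts[0],
        "version": parts[1],
        "python_tag": python_tag,
        "abi_tag": abi_tag,
        "platform_tag": platform_tag,
    }


def cpython_version(python_tag):
    """Translate a tag such as 'cp310' into (3, 10); return None otherwise."""
    digits = python_tag[2:]
    if not python_tag.startswith("cp") or not digits.isdigit() or len(digits) < 2:
        return None
    return int(digits[0]), int(digits[1:])


tags = parse_wheel_filename(WHEEL)
target = cpython_version(tags["python_tag"])
running = tuple(sys.version_info[:2])
print(f"wheel targets CPython {target}, running {running}, match={target == running}")
```

If the versions do not match, pip will refuse the wheel as unsupported on the current platform, so this check mainly gives a clearer message up front.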
scikit-learn for easy model training and validation