"Predict behavior to retain customers. You can analyze all relevant customer data and develop focused customer retention programs." [IBM Sample Data Sets]
Each row represents a customer; each column contains a customer attribute described in the column metadata.
The data set includes information about:
Use this dataset to explore this type of model and learn more about the subject.
New version from IBM: https://community.ibm.com/community/user/businessanalytics/blogs/steven-macko/2019/07/11/telco-customer-churn-1113
Dataset Card for Telco Customer Churn
This dataset contains information about customers of a fictional telecommunications company, including demographic information, services subscribed to, location details, and churn behavior. This merged dataset combines the information from the original Telco Customer Churn dataset with additional details.
Dataset Details
Dataset Description
This merged Telco Customer Churn dataset provides a comprehensive view of customer… See the full description on the dataset page: https://huggingface.co/datasets/aai510-group1/telco-customer-churn.
Context : This dataset is part of a data science project focused on customer churn prediction for a subscription-based service. Customer churn, the rate at which customers cancel their subscriptions, is a vital metric for businesses offering subscription services. Predictive analytics techniques are employed to anticipate which customers are likely to churn, enabling companies to take proactive measures for customer retention.
Content : This dataset contains anonymized information about customer subscriptions and their interaction with the service. The data includes various features such as subscription type, payment method, viewing preferences, customer support interactions, and other relevant attributes. It consists of three files: "test.csv", "train.csv", and "data_descriptions.csv".
Columns :
CustomerID: Unique identifier for each customer
SubscriptionType: Type of subscription plan chosen by the customer (e.g., Basic, Premium, Deluxe)
PaymentMethod: Method used for payment (e.g., Credit Card, Electronic Check, PayPal)
PaperlessBilling: Whether the customer uses paperless billing (Yes/No)
ContentType: Type of content accessed by the customer (e.g., Movies, TV Shows, Documentaries)
MultiDeviceAccess: Whether the customer has access on multiple devices (Yes/No)
DeviceRegistered: Device registered by the customer (e.g., Smartphone, Smart TV, Laptop)
GenrePreference: Genre preference of the customer (e.g., Action, Drama, Comedy)
Gender: Gender of the customer (Male/Female)
ParentalControl: Whether parental control is enabled (Yes/No)
SubtitlesEnabled: Whether subtitles are enabled (Yes/No)
AccountAge: Age of the customer's subscription account (in months)
MonthlyCharges: Monthly subscription charges
TotalCharges: Total charges incurred by the customer
ViewingHoursPerWeek: Average number of viewing hours per week
SupportTicketsPerMonth: Number of customer support tickets raised per month
AverageViewingDuration: Average duration of each viewing session
ContentDownloadsPerMonth: Number of content downloads per month
UserRating: Customer satisfaction rating (1 to 5)
WatchlistSize: Size of the customer's content watchlist
Acknowledgments : The dataset used in this project was obtained from the Data Science Challenge on Coursera and is used for educational and research purposes. Any resemblance to real persons or entities is purely coincidental.
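The column list above mixes Yes/No flags, numeric measures, and free-text categories. The following is a minimal sketch, using only the Python standard library, of how rows from a file such as "train.csv" might be coerced into typed values. The grouping of columns by type is an assumption based on the descriptions above, and the sample rows are invented for illustration; they are not taken from the dataset.

```python
import csv
import io

# Columns described above as Yes/No flags (assumed to be stored as text).
YES_NO_COLUMNS = {
    "PaperlessBilling", "MultiDeviceAccess",
    "ParentalControl", "SubtitlesEnabled",
}

# Columns described above as numeric measures (assumed to parse as floats).
NUMERIC_COLUMNS = {
    "AccountAge", "MonthlyCharges", "TotalCharges", "ViewingHoursPerWeek",
    "SupportTicketsPerMonth", "AverageViewingDuration",
    "ContentDownloadsPerMonth", "UserRating", "WatchlistSize",
}


def parse_row(row):
    """Convert one CSV row (a dict of strings) into typed Python values."""
    parsed = {}
    for name, value in row.items():
        if name in YES_NO_COLUMNS:
            parsed[name] = value.strip().lower() == "yes"
        elif name in NUMERIC_COLUMNS:
            parsed[name] = float(value)
        else:
            # Identifiers and categories are kept as plain strings.
            parsed[name] = value
    return parsed


# Invented sample rows with a subset of the columns, for illustration only.
SAMPLE_CSV = """CustomerID,SubscriptionType,PaperlessBilling,AccountAge,MonthlyCharges
C001,Premium,Yes,20,11.5
C002,Basic,No,57,5.25
"""

rows = [parse_row(r) for r in csv.DictReader(io.StringIO(SAMPLE_CSV))]
print(rows[0])
```

To read the real file, the `io.StringIO(SAMPLE_CSV)` argument would be replaced by an open file handle.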
Attribution-NonCommercial-NoDerivs 4.0 (CC BY-NC-ND 4.0) https://creativecommons.org/licenses/by-nc-nd/4.0/
License information was derived automatically
This dataset contains information about bank customers and their churn status, which indicates whether they have exited the bank or not. It is suitable for exploring and analyzing factors influencing customer churn in banking institutions and for building predictive models to identify customers at risk of churning.
RowNumber: The sequential number assigned to each row in the dataset.
CustomerId: A unique identifier for each customer.
Surname: The surname of the customer.
CreditScore: The credit score of the customer.
Geography: The geographical location of the customer (e.g., country or region).
Gender: The gender of the customer.
Age: The age of the customer.
Tenure: The number of years the customer has been with the bank.
Balance: The account balance of the customer.
NumOfProducts: The number of bank products the customer has.
HasCrCard: Indicates whether the customer has a credit card (binary: yes/no).
IsActiveMember: Indicates whether the customer is an active member (binary: yes/no).
EstimatedSalary: The estimated salary of the customer.
Exited: Indicates whether the customer has exited the bank (binary: yes/no).
This dataset is made available under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
Although the results were close, the industry in the United States where customers were most likely to leave their current provider due to poor customer service appears to be cable television, with a 25 percent churn rate in 2020.
Churn rate
Churn rate, sometimes also called attrition rate, is the percentage of customers that stop using a service within a given time period. It is often used to measure businesses that have a contractual customer base, especially subscriber-based service models.
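As a worked illustration of the definition above, the sketch below computes a churn rate as the share of customers at the start of a period who left during it. Using the start-of-period count as the denominator is an assumption; some businesses use an average customer count instead. The function name and the example figures are hypothetical.

```python
def churn_rate(customers_at_start, customers_lost):
    """Return the churn rate for a period, as a percentage.

    customers_at_start: number of customers at the beginning of the period.
    customers_lost: how many of those customers stopped using the service.
    """
    if customers_at_start <= 0:
        raise ValueError("customers_at_start must be positive")
    if not 0 <= customers_lost <= customers_at_start:
        raise ValueError("customers_lost must be between 0 and customers_at_start")
    return 100.0 * customers_lost / customers_at_start


# Hypothetical figures: 50 of 200 customers leave during the period.
print(churn_rate(200, 50))
```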
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Customer churn prediction dataset of a fictional telecommunication company made by IBM Sample Datasets.
Context: Predict behavior to retain customers. You can analyze all relevant customer data and develop focused customer retention programs.
Content: Each row represents a customer, each column contains customer’s attributes described on the column metadata. The data set includes information about:
Customers who left within the last month: the column is called Churn
Services that each customer… See the full description on the dataset page: https://huggingface.co/datasets/scikit-learn/churn-prediction.
https://cubig.ai/store/terms-of-service
1) Data Introduction
• The Telco Customer Churn Dataset includes telecom customers' service usage, account information, demographics, and churn status, which can be used to predict and analyze customer churn.
2) Data Utilization
(1) The Telco Customer Churn Dataset has the following characteristics:
• It includes a variety of customer and service attributes, including gender, age group, partner and dependents, service subscription status (telephone, Internet, security, backup, device protection, technical support, streaming, etc.), contract type, payment method, monthly fee, total fee, and churn.
(2) The Telco Customer Churn Dataset can be used for:
• Development of a customer churn prediction model: using customer service usage patterns and account information, one can build a machine learning-based churn prediction model to proactively identify customers at risk of churn.
Business problem overview
In the telecom industry, customers are able to choose from multiple service providers and actively switch from one operator to another. In this highly competitive market, the telecommunications industry experiences an average annual churn rate of 15-25%. Given that it costs 5-10 times more to acquire a new customer than to retain an existing one, customer retention has now become even more important than customer acquisition.
For many incumbent operators, retaining highly profitable customers is the number one business goal.
To reduce customer churn, telecom companies need to predict which customers are at high risk of churn.
In this project, you will analyse customer-level data of a leading telecom firm, build predictive models to identify customers at high risk of churn and identify the main indicators of churn.
Understanding and defining churn
There are two main models of payment in the telecom industry: postpaid (customers pay a monthly/annual bill after using the services) and prepaid (customers pay/recharge with a certain amount in advance and then use the services).
In the postpaid model, when customers want to switch to another operator, they usually inform the existing operator to terminate the services, and you directly know that this is an instance of churn.
However, in the prepaid model, customers who want to switch to another network can simply stop using the services without any notice, and it is hard to know whether someone has actually churned or is simply not using the services temporarily (e.g. someone may be on a trip abroad for a month or two and then intend to resume using the services again).
Thus, churn prediction is usually more critical (and non-trivial) for prepaid customers, and the term ‘churn’ should be defined carefully. Also, prepaid is the most common model in India and Southeast Asia, while postpaid is more common in Europe and North America.
This project is based on the Indian and Southeast Asian market.
Definitions of churn
There are various ways to define churn, such as:
Revenue-based churn: Customers who have not utilised any revenue-generating facilities such as mobile internet, outgoing calls, SMS etc. over a given period of time. One could also use aggregate metrics such as ‘customers who have generated less than INR 4 per month in total/average/median revenue’.
The main shortcoming of this definition is that there are customers who only receive calls/SMSes from their wage-earning counterparts, i.e. they don’t generate revenue but use the services. For example, many users in rural areas only receive calls from their wage-earning siblings in urban areas.
Usage-based churn: Customers who have had no usage, either incoming or outgoing, in terms of calls, internet etc. over a period of time.
A potential shortcoming of this definition is that when the customer has stopped using the services for a while, it may be too late to take any corrective actions to retain them. For example, if you define churn based on a ‘two-months zero usage’ period, predicting churn could be useless since by that time the customer would have already switched to another operator.
In this project, you will use the usage-based definition to define churn.
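The difference between the two definitions above can be sketched in a few lines. This is an illustrative sketch only: the field names (`revenue`, `incoming_minutes`, `outgoing_minutes`, `data_mb`) are hypothetical and do not come from the project's dataset, and the INR 4 threshold is the example figure quoted in the text.

```python
# Hypothetical usage fields for one customer over the evaluation period.
USAGE_FIELDS = ("incoming_minutes", "outgoing_minutes", "data_mb")


def revenue_based_churn(record, min_revenue=4.0):
    """Churned if the customer generated less than min_revenue (INR)."""
    return record["revenue"] < min_revenue


def usage_based_churn(record):
    """Churned only if there was no usage at all, incoming or outgoing."""
    return all(record[field] == 0 for field in USAGE_FIELDS)


# A customer who only receives calls: no revenue, but still active.
receive_only = {"revenue": 0.0, "incoming_minutes": 120,
                "outgoing_minutes": 0, "data_mb": 0}

# A customer with no activity of any kind.
inactive = {"revenue": 0.0, "incoming_minutes": 0,
            "outgoing_minutes": 0, "data_mb": 0}

for name, record in (("receive_only", receive_only), ("inactive", inactive)):
    print(name, revenue_based_churn(record), usage_based_churn(record))
```

The receive-only customer is flagged by the revenue-based rule but not by the usage-based rule, which is the shortcoming of the revenue-based definition described above.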
High-value churn
In the Indian and the Southeast Asian market, approximately 80% of revenue comes from the top 20% of customers (called high-value customers). Thus, if we can reduce churn among the high-value customers, we will be able to reduce significant revenue leakage.
In this project, you will define high-value customers based on a certain metric (mentioned later below) and predict churn only on high-value customers.
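The revenue concentration described above can be checked on any list of per-customer revenues. The sketch below is not the project's high-value metric, which is not given in this excerpt; it only measures what share of total revenue comes from the top fraction of customers. The function name and the sample revenues are invented for illustration.

```python
def top_customer_revenue_share(revenues, top_fraction=0.2):
    """Return the share of total revenue contributed by the top customers.

    revenues: per-customer revenue figures for a period.
    top_fraction: fraction of customers, ranked by revenue, to count as "top".
    """
    if not revenues:
        raise ValueError("revenues must not be empty")
    ranked = sorted(revenues, reverse=True)
    # Always include at least one customer in the top group.
    top_count = max(1, round(len(ranked) * top_fraction))
    total = sum(ranked)
    if total == 0:
        return 0.0
    return sum(ranked[:top_count]) / total


# Invented revenues for ten customers; the top two contribute 800 of 1000.
sample_revenues = [400, 400, 50, 50, 25, 25, 20, 10, 10, 10]
share = top_customer_revenue_share(sample_revenues)
print(f"Top 20% of customers contribute {share:.0%} of revenue")
```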
Understanding the business objective and the data
The dataset contains customer-level information for a span of four consecutive months: June, July, August and September. The months are encoded as 6, 7, 8 and 9, respectively.
The business objective is to predict the churn in the last (i.e. the ninth) month using the data (features) from the first three months. To do this task well, understanding the typical customer behaviour during churn will be helpful.
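One way to picture this setup is to separate each customer's columns into a feature window (months 6, 7 and 8) and a target window (month 9). The sketch below assumes, hypothetically, that month-specific columns end in `_6`, `_7`, `_8` or `_9`; the naming convention and the sample record are assumptions, not taken from the dataset.

```python
FEATURE_MONTHS = ("6", "7", "8")  # June, July, August
TARGET_MONTH = "9"                # September, the month in which churn is assessed


def split_by_month(record):
    """Split one customer's columns into feature-month and target-month dicts.

    Columns without a recognised month suffix (e.g. an ID) are ignored.
    """
    features, target = {}, {}
    for name, value in record.items():
        # rpartition splits on the last underscore: "total_calls_6" -> "6".
        _, _, suffix = name.rpartition("_")
        if suffix in FEATURE_MONTHS:
            features[name] = value
        elif suffix == TARGET_MONTH:
            target[name] = value
    return features, target


# Invented record with hypothetical column names.
record = {"customer_id": "A1", "total_calls_6": 30, "total_calls_7": 22,
          "total_calls_8": 5, "total_calls_9": 0}
features, target = split_by_month(record)
print(features)
print(target)
```

Keeping month 9 out of the feature set matters: the churn label is derived from it, so using it as an input would leak the answer into the model.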
Understanding customer behaviour during churn
Customers usually do not decide to switch to a competitor instantly, but rather over a period of time (this is especially applicable to high-value customers). In churn prediction, we assume that there are three phases of the customer lifecycle:
The ‘good’ phase: In this phase, the customer is happy with the service and behaves as usual.
The ‘action’ phase: The customer experience starts to sour in this phase, e.g. he/she gets a compelling offer from a competitor, faces unjust charges, becomes unhappy with service quality etc. In this phase, the customer usually shows different behaviour than in the ‘good’ months. Also, it is crucial to...
https://creativecommons.org/publicdomain/zero/1.0/
This Global Customer Churn Dataset is curated to aid in understanding and predicting customer churn behaviour across various industries. With detailed customer profiles, including demographics, product interactions, and banking behaviors, this dataset is a resource for developing machine learning models aimed at identifying at-risk customers and devising targeted retention strategies.
The columns are described below:
RowNumber: A unique identifier for each row in the dataset.
CustomerId: Unique customer identification number.
Surname: The last name of the customer (for privacy reasons, consider anonymizing this data if not already done).
CreditScore: The customer's credit score at the time of data collection.
Geography: The customer's country or region, providing insights into location-based trends in churn.
Gender: The customer's gender.
Age: The customer's age, valuable for demographic analysis.
Tenure: The number of years the customer has been with the bank.
Balance: The customer's account balance.
NumOfProducts: The number of products the customer has purchased or subscribed to.
HasCrCard: Indicates whether the customer has a credit card (1) or not (0).
IsActiveMember: Indicates whether the customer is an active member (1) or not (0).
EstimatedSalary: The customer's estimated salary.
Exited: The target variable, indicating whether the customer has churned (1) or not (0).
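Because `Exited` is encoded as 1 or 0, a churn rate per group is just the mean of that column within the group. A minimal sketch, assuming rows are available as dictionaries keyed by the column names above; the sample rows and the `Geography` values are invented for illustration.

```python
from collections import defaultdict


def exit_rate_by(rows, key):
    """Return the fraction of customers with Exited == 1 for each value of key."""
    totals = defaultdict(int)
    exited = defaultdict(int)
    for row in rows:
        group = row[key]
        totals[group] += 1
        exited[group] += int(row["Exited"])  # int() also handles "0"/"1" strings
    return {group: exited[group] / totals[group] for group in totals}


# Invented rows with only the columns needed for this example.
sample_rows = [
    {"Geography": "France", "Exited": 1},
    {"Geography": "France", "Exited": 0},
    {"Geography": "Spain", "Exited": 0},
    {"Geography": "Spain", "Exited": 0},
]
rates = exit_rate_by(sample_rows, "Geography")
print(rates)
```

The same function works for any categorical column, for example `Gender` or `IsActiveMember`.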
https://cubig.ai/store/terms-of-service
1) Data Introduction
• The Customer Churn Dataset collects various customer characteristics and service usage information to predict whether or not communication service customers will churn.
2) Data Utilization
(1) The Customer Churn Dataset has the following characteristics:
• The dataset consists of several categorical and numerical variables, including customer demographics, service types, contract information, charges, usage patterns, and churn.
(2) The Customer Churn Dataset can be used for:
• Development of a customer churn prediction model: Machine learning and deep learning techniques can be used to develop classification models that predict churn based on customer characteristics and service use data.
• Segmenting customers and developing marketing strategies: It can be used to analyze customer groups at high risk of churn and to design custom retention strategies or targeted marketing campaigns.
https://www.marketresearchintellect.com/privacy-policy
Find detailed analysis in Market Research Intellect's Customer Churn Analysis Software Market Report, estimated at USD 2.1 billion in 2024 and forecasted to climb to USD 4.8 billion by 2033, reflecting a CAGR of 10.2%. Stay informed about adoption trends, evolving technologies, and key market participants.
https://creativecommons.org/publicdomain/zero/1.0/
This dataset simulates customer behavior for a fictional telecommunications company. It contains demographic information, account details, services subscribed to, and whether the customer ultimately churned (stopped using the service) or not. The data is synthetically generated but designed to reflect realistic patterns often found in telecom churn scenarios.
Purpose:
The primary goal of this dataset is to provide a clean and straightforward resource for beginners learning about:
Features:
The dataset includes the following columns:
CustomerID: Unique identifier for each customer.
Age: Customer's age in years.
Gender: Customer's gender (Male/Female).
Location: General location of the customer (e.g., New York, Los Angeles).
SubscriptionDurationMonths: How many months the customer has been subscribed.
MonthlyCharges: The amount the customer is charged each month.
TotalCharges: The total amount the customer has been charged over their subscription period.
ContractType: The type of contract the customer has (Month-to-month, One year, Two year).
PaymentMethod: How the customer pays their bill (e.g., Electronic check, Credit card).
OnlineSecurity: Whether the customer has online security service (Yes, No, No internet service).
TechSupport: Whether the customer has tech support service (Yes, No, No internet service).
StreamingTV: Whether the customer has TV streaming service (Yes, No, No internet service).
StreamingMovies: Whether the customer has movie streaming service (Yes, No, No internet service).
Churn: (Target Variable) Whether the customer churned (1 = Yes, 0 = No).
Data Quality:
This dataset is intentionally clean with no missing values, making it easy for beginners to focus on analysis and modeling concepts without complex data cleaning steps.
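Even with no missing values, the service columns listed above (OnlineSecurity, TechSupport, StreamingTV, StreamingMovies) hold three text values rather than numbers, so most models need them encoded first. The following is one possible sketch, a plain one-hot encoding written without external libraries; the function name is hypothetical and other encodings are equally valid.

```python
# The three values the service columns are described as taking.
SERVICE_LEVELS = ("Yes", "No", "No internet service")


def one_hot_service(value):
    """Encode a service column value as a one-hot list over SERVICE_LEVELS."""
    if value not in SERVICE_LEVELS:
        raise ValueError(f"unexpected service value: {value!r}")
    return [1 if value == level else 0 for level in SERVICE_LEVELS]


for level in SERVICE_LEVELS:
    print(level, one_hot_service(level))
```

Keeping "No internet service" as its own category, rather than merging it with "No", preserves the distinction between declining a service and being unable to have it.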
Inspiration:
Understanding customer churn is crucial for many businesses. This dataset provides a sandbox environment to practice the fundamental techniques used in churn analysis and prediction.
krisnadwipaj/customer-churn dataset hosted on Hugging Face and contributed by the HF Datasets community
SudeendraMG/Bank-Customer-Churn dataset hosted on Hugging Face and contributed by the HF Datasets community
In the first quarter of 2024, T-Mobile US had a churn rate of **** percent for postpaid subscribers, a ***** percentage point increase compared to the previous quarter. T-Mobile US has lowered its postpaid churn rate from more than *** percent to below *** percent over the last ten years.
https://www.datainsightsmarket.com/privacy-policy
Discover the booming Churn Prediction Software market! Learn about key trends, growth drivers, leading companies like Salesforce & SAP, and regional market analysis for 2025-2033. Maximize customer retention with predictive analytics.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Analysis of ‘Customer Churn Dataset’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://www.kaggle.com/sercanyesiloz/customer-churn-dataset on 30 September 2021.
--- No further description of dataset provided by original source ---
--- Original source retains full ownership of the source dataset ---
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This is a collection of the data used for analysis (master dataset, training data, test data), and the code and processes that have been used to conduct the analysis for this research project.
This dataset is designed for analyzing customer behavior and predicting customer churn in a retail store. With 5,329 samples and 19 independent variables, the dataset provides a comprehensive view of various factors that influence whether a customer will continue their engagement with the store or not. The primary goal is to derive actionable insights and trends that can improve overall business performance, particularly in reducing customer churn.
Customer Churn Indicator: This binary variable indicates whether a customer has churned (i.e., stopped engaging with the retail store) or not. It serves as the target variable for the machine learning model.
1. Customer Information:
Customer ID: Unique identifier for each customer.
Gender: Gender of the customer (Male/Female).
Marital Status: Indicates whether the customer is single, married, divorced, etc.
Number of Complaints: Total number of complaints filed by the customer to the retail store.
Total Orders (1 month): Number of orders placed by the customer in the last month.
2. Transaction Information:
Preferred Log-In Device: The type of device used by the customer to connect to the retail store for purchases (e.g., mobile phone, computer).
Payment Method: The payment method preferred by the customer (e.g., Credit Card, UPI).
Product Category: The category to which the purchased products belong.
Distance from Warehouse: The distance between the retail store's warehouse and the customer's location.
The main objective of analyzing this dataset is to predict customer churn and understand the factors contributing to it. By doing so, the retail store can develop targeted strategies for customer retention, optimize marketing efforts, and improve overall customer satisfaction.
The insights gained from this analysis will be invaluable for the store's management and marketing teams. They can identify patterns and trends related to customer churn, enabling them to take proactive steps to retain valuable customers, address customer complaints effectively, and tailor marketing campaigns to specific customer segments. The ultimate goal is to enhance business performance by reducing churn and increasing customer loyalty.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Analyzing customers’ characteristics and giving early warning of customer churn with machine learning algorithms can help enterprises provide targeted marketing strategies and personalized services, and save considerable operating costs. Data cleaning, oversampling, data standardization, and other preprocessing operations were performed in Python on a telecom data set of 900,000 customer personal characteristics and historical behavior records. Appropriate model parameters were selected to build a BPNN (Back Propagation Neural Network). Random Forest (RF) and Adaboost, two classic ensemble learning models, were introduced, and an Adaboost dual-ensemble learning model with RF as the base learner was put forward. These four models and four other classical machine learning models (decision tree, naive Bayes, K-Nearest Neighbor (KNN), and Support Vector Machine (SVM)) were each used to analyze the customer churn data. The results show that the four models perform better in terms of recall, precision, F1 score, and other indicators, and that the RF-Adaboost dual-ensemble model performs best. The recall rates of BPNN, RF, Adaboost, and the RF-Adaboost dual-ensemble model on positive samples are 79%, 90%, 89%, and 93%, respectively; the precision rates are 97%, 99%, 98%, and 99%; and the F1 scores are 87%, 95%, 94%, and 96%. For the RF-Adaboost dual-ensemble model, the three indicators are 10%, 1%, and 6% higher than the reference. The prediction results of customer churn provide strong data support for telecom companies to adopt appropriate retention strategies for pre-churn customers and reduce customer churn.
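The three indicators reported above are linked: the F1 score is the harmonic mean of precision and recall. The sketch below shows that relationship using the published (rounded) percentages for two of the models. It is an illustration of the standard formula, not the authors' evaluation code, and the function name is hypothetical.

```python
def f1_score(precision, recall):
    """Return the harmonic mean of precision and recall (both in [0, 1])."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Rounded precision and recall figures as quoted in the abstract.
reported = {
    "BPNN": (0.97, 0.79),
    "RF-Adaboost": (0.99, 0.93),
}

for model, (precision, recall) in reported.items():
    print(f"{model}: F1 = {f1_score(precision, recall):.2f}")
```

Recomputing from the rounded inputs reproduces the reported 87% and 96%. For RF and Adaboost the same calculation gives values about one point below the reported F1 scores, which may simply reflect rounding in the published precision and recall.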