Apache License, v2.0 https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
Bank Customer Churn Dataset is a collection of data related to customers of a bank who have either left (churned) or stayed with the bank. This dataset is typically used for predictive modeling to identify patterns and factors that lead to customer churn, enabling banks to take proactive measures to retain customers.
id: Unique identifier for each customer.
CustomerId: Unique identifier for the customer account.
Surname: Last name of the customer.
CreditScore: Numeric representation of the customer's creditworthiness.
Geography: Country or region where the customer resides.
Gender: Gender of the customer (e.g., Male, Female).
Age: Age of the customer.
Tenure: Number of years the customer has been with the bank.
Balance: Current balance in the customer's account.
NumOfProducts: Number of bank products the customer uses.
HasCrCard: Binary indicator (0 or 1) for whether the customer has a credit card.
IsActiveMember: Binary indicator (0 or 1) for whether the customer is an active member.
EstimatedSalary: Estimated salary of the customer.
Exited: Binary indicator (0 or 1) for whether the customer has churned (the target).
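As a minimal sketch of how the columns above might be read, the snippet below parses a small CSV sample and computes the share of churned customers from the Exited column. The three sample rows are made up for illustration and are not taken from the dataset; the integer and float casts are assumptions about the column types.

```python
import csv
import io

# Hypothetical three-row sample using the column names listed above.
# The values are invented for illustration only.
SAMPLE_CSV = """id,CustomerId,Surname,CreditScore,Geography,Gender,Age,Tenure,Balance,NumOfProducts,HasCrCard,IsActiveMember,EstimatedSalary,Exited
0,15600001,Smith,650,France,Male,35,5,0.0,2,1,1,50000.0,0
1,15600002,Jones,720,Spain,Female,42,3,120000.5,1,1,0,80000.0,1
2,15600003,Brown,580,Germany,Female,29,7,75000.0,1,0,1,62000.0,0
"""

INT_COLUMNS = ("CreditScore", "Age", "Tenure", "NumOfProducts",
               "HasCrCard", "IsActiveMember", "Exited")
FLOAT_COLUMNS = ("Balance", "EstimatedSalary")


def load_rows(text):
    """Parse CSV text into dicts, casting numeric columns (assumed types)."""
    rows = []
    for raw in csv.DictReader(io.StringIO(text)):
        row = dict(raw)
        for col in INT_COLUMNS:
            row[col] = int(row[col])
        for col in FLOAT_COLUMNS:
            row[col] = float(row[col])
        rows.append(row)
    return rows


def churn_rate(rows):
    """Fraction of customers whose Exited flag is 1."""
    return sum(r["Exited"] for r in rows) / len(rows)


rows = load_rows(SAMPLE_CSV)
print(f"churn rate: {churn_rate(rows):.3f}")
```

With a real file, the same functions could be applied to the file's text instead of SAMPLE_CSV.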
MIT License https://opensource.org/licenses/MIT
License information was derived automatically
This dataset belongs to a leading online E-commerce company. The company wants to identify customers who are likely to churn, so they can proactively approach these customers with promotional offers.
The dataset contains various features related to customer behavior and characteristics, which can be used to predict customer churn.
The main task is to predict customer churn based on the given features. This is a binary classification problem where the target variable is 'Churn'.
This dataset is provided for educational purposes. While it represents a real-world scenario, the data itself may be simulated or anonymized.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
https://cubig.ai/store/terms-of-service
1) Data Introduction
• The Telco Customer Churn Dataset includes carrier customer service usage, account information, demographics, and churn status, which can be used to predict and analyze customer churn.
2) Data Utilization
(1) Characteristics of the Telco Customer Churn Dataset:
• This dataset includes a variety of customer and service characteristics, including gender, age group, partner and dependents, service subscription status (telephone, Internet, security, backup, device protection, technical support, streaming, etc.), contract type, payment method, monthly fee, total fee, and departure (churn).
(2) Uses of the Telco Customer Churn Dataset:
• Development of a customer churn prediction model: Using customer service usage patterns and account information, a machine learning-based churn prediction model can be built to proactively identify customers at risk of churn.
https://creativecommons.org/publicdomain/zero/1.0/
9. Plot the decision tree
Average customer churn is 27%. Churn can occur when tenure is >= 7.5 and there is no internet service.
The most significant variables are Internet Service and Tenure; the least significant are Streaming Movies and Tech Support.
Run library(randomForest). Here we are using the default ntree (500) and the default mtry, which for classification is floor(sqrt(p)), where p is the number of independent variables.
From the confusion matrix, the accuracy is 79.27%, marginally higher than the decision tree's 79.00%. The error rate is pretty low when predicting "No" and much higher when predicting "Yes".
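To make the relationship between a confusion matrix, overall accuracy, and per-class error concrete, here is a small sketch in Python (swapped in for the R workflow described here). The counts are hypothetical and are not the values from the screenshots; the per-class error is computed per actual class, which is an assumption about how the error rates above were read.

```python
def confusion_metrics(tn, fp, fn, tp):
    """Accuracy and per-class error from a 2x2 confusion matrix.

    tn: actual "No"  predicted "No"
    fp: actual "No"  predicted "Yes"
    fn: actual "Yes" predicted "No"
    tp: actual "Yes" predicted "Yes"
    """
    total = tn + fp + fn + tp
    return {
        "accuracy": (tn + tp) / total,
        # Share of actual "No" customers that were misclassified.
        "error_no": fp / (tn + fp),
        # Share of actual "Yes" customers that were misclassified.
        "error_yes": fn / (fn + tp),
    }


# Hypothetical counts chosen only to illustrate an imbalanced result.
metrics = confusion_metrics(tn=900, fp=100, fn=250, tp=250)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

A model can reach a respectable overall accuracy while still missing a large share of the churners, which is why the two class errors are worth reporting separately.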
Plot the model showing which variables reduce the Gini impurity the most and least. Total charges and tenure reduce the Gini impurity the most, while phone service has the least impact.
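For readers unfamiliar with the measure, the following sketch shows how Gini impurity and the impurity decrease of a single split can be computed. It is written in Python rather than R, and the label lists are hypothetical; it illustrates the quantity that the variable importance plot aggregates, not the randomForest implementation itself.

```python
from collections import Counter


def gini(labels):
    """Gini impurity: 1 minus the sum of squared class proportions."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def gini_decrease(parent, left, right):
    """Impurity of the parent minus the size-weighted impurity of the children."""
    n = len(parent)
    weighted = (len(left) / n) * gini(left) + (len(right) / n) * gini(right)
    return gini(parent) - weighted


# Hypothetical node with 4 churners and 6 non-churners, split in two.
parent = ["Yes"] * 4 + ["No"] * 6
left = ["Yes"] * 3 + ["No"] * 1
right = ["Yes"] * 1 + ["No"] * 5

print(f"parent impurity: {gini(parent):.4f}")
print(f"decrease from split: {gini_decrease(parent, left, right):.4f}")
```

A variable that produces larger decreases across many splits ranks higher in the importance plot.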
Tune the model: mtry = 2 has the lowest OOB error rate.
Use random forest with mtry = 2 and ntree = 200
From the confusion matrix, the accuracy is 79.71%, marginally higher than that of the default model (ntree = 500, mtry = 4) at 79.27% and of the decision tree at 79.00%. The error rate is pretty low when predicting "No" and m...
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
This Synthetic Customer Churn Prediction Dataset has been designed as an educational resource for exploring data science, machine learning, and predictive modelling techniques in a customer retention context. The dataset simulates key attributes relevant to customer churn analysis, such as service usage, contract details, and customer demographics. It allows users to practice data manipulation, visualization, and the development of models to predict churn behaviour in industries like telecommunications, subscription services, or utilities.
Figure: Synthetic Customer Churn Prediction Dataset Distribution
This dataset is useful for a variety of applications, including:
This dataset is synthetic and anonymized, making it a safe tool for experimentation and learning without compromising real customer privacy.
CC0 (Public Domain)
https://dataintelo.com/privacy-and-policy
The global customer churn software market size was valued at approximately USD 1.5 billion in 2023 and is projected to reach USD 4.8 billion by 2032, growing at a CAGR of 13.7% during the forecast period. This robust growth is driven by several factors, including the increasing importance of customer retention in competitive markets, advancements in AI and machine learning technologies, and the growing adoption of digital transformation initiatives across industries.
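The figures above can be sanity-checked with the standard compound annual growth rate formula. The sketch below assumes nine compounding years between 2023 and 2032, which the text does not state explicitly.

```python
def cagr(start_value, end_value, years):
    """Compound annual growth rate between two values over a number of years."""
    return (end_value / start_value) ** (1 / years) - 1


def project(start_value, rate, years):
    """Value after compounding start_value at a fixed annual rate."""
    return start_value * (1 + rate) ** years


# Assumption: 2023 to 2032 is treated as 9 compounding years.
YEARS = 2032 - 2023

implied_rate = cagr(1.5, 4.8, YEARS)
projected_value = project(1.5, 0.137, YEARS)

print(f"implied CAGR: {implied_rate:.1%}")
print(f"projected 2032 value at 13.7%: USD {projected_value:.2f} billion")
```

Under that assumption, the stated start value, end value, and growth rate are consistent with each other to within rounding.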
One of the primary growth factors propelling the customer churn software market is the increasing emphasis on customer satisfaction and retention. In today's highly competitive business environment, retaining existing customers is more cost-effective than acquiring new ones. Companies are realizing the value of customer loyalty, and as a result, they are investing heavily in tools that can help predict and mitigate churn. Customer churn software offers advanced analytics and predictive capabilities, enabling organizations to identify at-risk customers and take proactive measures to retain them.
Another significant driver is the advancement in artificial intelligence (AI) and machine learning technologies. These technologies have revolutionized the way customer data is analyzed and interpreted. AI-powered customer churn software can process vast amounts of data from multiple sources, identify patterns, and generate actionable insights. This ability to leverage big data and predictive analytics is crucial for businesses aiming to stay ahead of the competition. As AI and machine learning continue to evolve, the effectiveness and efficiency of customer churn software are expected to improve further.
The increasing adoption of digital transformation initiatives across various industries is also contributing to the market growth. As businesses undergo digital transformation, they generate enormous amounts of data related to customer behavior, preferences, and interactions. Customer churn software helps organizations make sense of this data, enabling them to develop personalized strategies to enhance customer experience and loyalty. The shift towards data-driven decision-making is compelling companies to invest in advanced analytics solutions, thereby driving the demand for customer churn software.
From a regional perspective, North America holds a significant share of the customer churn software market, driven by the presence of major technology companies and the early adoption of advanced analytics solutions. However, the Asia Pacific region is expected to witness the highest growth rate during the forecast period. Factors such as the rapid digitalization of economies, increasing investments in AI and machine learning, and the growing focus on customer-centric strategies in emerging markets are fueling the demand for customer churn software in this region.
The customer churn software market is segmented into two primary components: software and services. The software segment includes the actual customer churn solutions, while the services segment encompasses implementation, training, support, and consulting services. The software segment is expected to dominate the market due to the high demand for advanced analytics and predictive tools. Companies across various industries are increasingly adopting software solutions to gain insights into customer behavior and predict churn. The software segment's growth is further supported by continuous advancements in AI and machine learning technologies, which enhance the capabilities of customer churn solutions.
The services segment, although smaller in comparison to the software segment, plays a crucial role in the market. Services such as implementation and training ensure that organizations can effectively deploy and utilize customer churn software. Support and consulting services are equally important, as they help companies optimize their software usage and develop customized strategies to address specific churn-related challenges. The demand for these services is expected to grow in tandem with the adoption of customer churn software, as businesses seek to maximize their return on investment and achieve better customer retention outcomes.
Moreover, the integration of customer churn software with existing CRM systems and other business applications is becoming increasingly important. This integration enables a seamless flow of data and enhances the overall efficiency of customer retention efforts. As a result, solutions that offer robust integration capa
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Analysis of ‘JB Link Telco Customer Churn’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://www.kaggle.com/johnflag/jb-link-telco-customer-churn on 28 January 2022.
--- Dataset description provided by original source is as follows ---
This is a customized version of the widely known IBM Telco Customer Churn dataset. I've added a few more columns and modified others in order to make it a little more realistic.
My customizations are based on the following version: Telco customer churn (11.1.3+)
Below you may find a fictional business problem I created. You may use it in order to start developing something around this dataset.
JB Link is a small telecom company located in the state of California that provides Phone and Internet services to customers in more than 1,000 cities and 1,600 zip codes.
The company has been in the market for just 6 years and has grown quickly by investing in infrastructure to bring internet and phone networks to regions that had poor or no coverage.
The company also has a very skilled sales team that consistently performs well at attracting new customers. The new customers acquired in the past quarter represent 15% of the total.
However, by the end of this same period, only 43% of these customers stayed with the company, and most of them decided not to renew their contracts after a few months, meaning the customer churn rate is very high and the company now faces a big challenge in retaining its customers.
The total customer churn rate last quarter was around 27%, resulting in a decrease of almost 12% in the total number of customers.
The executive leadership of JB Link is aware that some competitors are investing in new technologies and in the expansion of their network coverage, and they believe this is one of the main drivers of the high customer churn rate.
Therefore, as an action plan, they have decided to create a task force inside the company that will be responsible for working on a customer retention strategy.
The task force will involve members from different areas of the company, including Sales, Finance, Marketing, Customer Service, Tech Support and a recently formed Data Science team.
The data science team will play a key role in this process and was assigned some very important tasks that will support the decisions and actions the other teams will be taking:
- Gather insights from the data to understand what is driving the high customer churn rate.
- Develop a Machine Learning model that can accurately predict the customers that are more likely to churn.
- Prescribe customized actions that could be taken in order to retain each of those customers.
The Data Science team was given a dataset with a random sample of 7,043 customers that can help in achieving this task.
The executives are aware that the cost of acquiring a new customer can be up to five times higher than the cost of retaining a customer, so they expect the results of this project to save the company a lot of money and make it start growing again.
--- Original source retains full ownership of the source dataset ---
https://creativecommons.org/publicdomain/zero/1.0/
This dataset contains customer data from multiple sources that can be used to predict customer churn and analyze its effect on revenue. We'll use this data to gain insights into customer behavior, such as when customers are likely to churn, how their behavior affects revenue and what patterns of behavior can help us better understand customers. This dataset features several different attributes for each customer: their unique identifier, total charges paid over time, contract information and more. Additionally, we can use predictive analytical models based on this data to identify at-risk customers who may be more likely to churn in the near future. By gaining deep insight into which customers are most likely to leave and why they are leaving, businesses will be better equipped with the tools necessary for taking proactive measures against potential revenue losses due to customer churn.
This dataset is an excellent tool for businesses to understand what factors are associated with customer churn and its impact on revenue. It can provide insights into which customers are most likely to leave, and how companies can prevent them from leaving.
To use this dataset, here are the steps businesses can follow:
1. Understand each of the data points available in the dataset and what they represent. For example, CustomerID is a unique identifier for each customer, Churn indicates whether a customer has left the company, gender denotes the customer's gender, and so on.
2. Analyze any trends or patterns in your data. Look for correlations between variables, such as OnlineSecurity usage and churn rate, or MonthlyCharges and tenure, to determine how these variables affect customers' decisions to stay with a company or leave it.
3. Use machine learning models on your dataset. Apply supervised learning algorithms such as logistic regression to determine which variable correlates most closely with customer loyalty, that is, which variable best indicates whether a particular customer will stay with your company.
4. Explore various ways of increasing retention rates. Think about ways you could incentivize customers who might be considering leaving their current provider (for example, discounts or free trials). You could also use A/B testing to see which incentive works best for preventing churn and increasing retention.
5. Test strategies before implementing them. Once you have decided on incentives that might work well, run small-scale tests to check whether they generate the desired results before investing resources in full rollout programs. Systems based on machine learning algorithms allow you to test assumptions quickly, without large investments of time and money, before committing these changes to fully operational processes.
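The A/B testing idea in the steps above can be sketched with a two-proportion z-test, one common way to compare retention between a control group and a group that received an incentive. The choice of test and the group counts are assumptions made for illustration; the dataset description does not prescribe a method.

```python
import math


def two_proportion_z_test(retained_a, n_a, retained_b, n_b):
    """Two-sided z-test for a difference in retention rates between groups A and B."""
    p_a = retained_a / n_a
    p_b = retained_b / n_b
    # Pooled retention rate under the null hypothesis of no difference.
    pooled = (retained_a + retained_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / std_err
    # Standard normal CDF evaluated at |z|, via the error function.
    cdf = 0.5 * (1 + math.erf(abs(z) / math.sqrt(2)))
    p_value = 2 * (1 - cdf)
    return z, p_value


# Hypothetical experiment: group A gets no offer, group B gets a discount.
z_stat, p_val = two_proportion_z_test(retained_a=700, n_a=1000,
                                      retained_b=750, n_b=1000)
print(f"z = {z_stat:.3f}, p = {p_val:.4f}")
```

A small p-value suggests the difference in retention is unlikely to be due to chance alone, though the appropriate threshold and sample size depend on the business context.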
- Using customer data to identify and target customers who are at a high risk of churning to counter this effect with relevant customer service initiatives.
- Analyzing the effects of promotional campaigns and loyalty programs on customer retention rates and overall revenue.
- Machine learning models that predict future chances of customer churn which can be used by businesses to improve strategies for better retention & profitability
If you use this dataset in your research, please credit the original authors. Data Source
License
License: CC0 1.0 Universal (CC0 1.0) - Public Domain Dedication No Copyright - You can copy, modify, distribute and perform the work, even for commercial purposes, all without asking permission. See Other Information.
File: dataset1.csv
| Column name | Description |
|:---------------------|:-----------------------------------------------------------------|
| CustomerID | Unique identifier for each customer. (Integer) |
| Churn | Whether or not the customer has churned. (Boolean) |
| gender | Gender of the customer. (String) |
| SeniorCitizen | Whether or not the customer is a senior citizen. (Boolean) |
...
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Churn prediction aims to detect customers who intend to leave a service provider. Gaining a new customer costs an organization 5 to 10 times more than retaining an existing one. Predictive models can correctly identify possible churners in the near future so that a retention solution can be provided. This paper presents a new prediction model based on Data Mining (DM) techniques. The proposed model is composed of six steps: identify problem domain, data selection, investigate data set, classification, clustering, and knowledge usage. A data set with 23 attributes and 5000 instances is used; 4000 instances are used for training the model and 1000 instances as a testing set. The predicted churners are clustered into 3 categories for use in a retention strategy. The data mining techniques used in this paper are Decision Tree, Support Vector Machine, and Neural Network, run through the open-source software WEKA.
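As an illustration of the 4000/1000 partition mentioned in the abstract, the sketch below splits 5000 instance indices into a training and a testing set. The abstract does not say how the split was drawn; a seeded random shuffle is assumed here, and the function name is hypothetical.

```python
import random


def split_instances(instances, n_train, seed=0):
    """Shuffle a copy of the instances and cut it into train and test parts."""
    shuffled = list(instances)
    random.Random(seed).shuffle(shuffled)
    return shuffled[:n_train], shuffled[n_train:]


# Stand-in for the 5000 instances: integer indices only.
all_instances = range(5000)
train_set, test_set = split_instances(all_instances, n_train=4000)

print(len(train_set), len(test_set))
```

Fixing the seed keeps the split reproducible, so results from different classifiers can be compared on the same held-out instances.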
According to our latest research, the AI-powered customer churn prediction market size reached USD 1.96 billion globally in 2024, with a robust CAGR of 18.3% projected through the forecast period. By 2033, the market is expected to hit USD 8.87 billion, driven by the increasing adoption of AI and machine learning solutions across multiple industries to proactively manage and reduce customer attrition. The rapid digital transformation and the growing emphasis on customer experience optimization have emerged as primary growth factors fueling the expansion of this dynamic market.
One of the core growth factors propelling the AI-powered customer churn prediction market is the exponential increase in customer data generation across industries. As businesses increasingly digitize their operations, vast amounts of customer interactions, behavioral data, and transactional records are being accumulated every day. AI-powered churn prediction tools leverage advanced analytics and machine learning algorithms to extract actionable insights from this data, allowing companies to identify at-risk customers with high accuracy. This enables organizations to implement timely retention strategies, reduce churn rates, and ultimately boost long-term profitability. The continuous evolution of AI algorithms, including deep learning and natural language processing, further enhances the predictive capabilities of these solutions, making them indispensable in highly competitive sectors such as telecommunications, BFSI, and retail.
Another significant driver is the escalating demand for personalized customer experiences. Modern consumers expect brands to anticipate their needs and deliver tailored interactions across all touchpoints. AI-powered customer churn prediction systems empower businesses to segment their customer base, understand individual preferences, and proactively address potential pain points. This targeted approach not only improves customer satisfaction but also increases the effectiveness of marketing campaigns and retention efforts. Moreover, the integration of AI with CRM platforms and omnichannel engagement tools has streamlined the deployment of churn prediction models, making them accessible even to small and medium-sized enterprises. The ability to automate and scale these insights across large customer populations is a critical factor stimulating market growth.
The rising cost of customer acquisition compared to retention is also amplifying the importance of AI-powered churn prediction solutions. As competition intensifies and customer loyalty becomes harder to secure, organizations are prioritizing strategies that maximize the lifetime value of existing clients. AI-driven churn analytics provide a cost-effective means to identify early warning signals and intervene before customers decide to leave. This not only reduces the financial impact of churn but also enhances brand reputation and customer advocacy. The scalability, real-time processing, and predictive accuracy offered by AI solutions are attracting investments from both established enterprises and emerging startups, further accelerating market expansion.
Regionally, North America continues to dominate the AI-powered customer churn prediction market, accounting for the largest revenue share in 2024. The region’s advanced technological infrastructure, high digital adoption rates, and concentration of leading AI vendors are key contributors to its leadership position. However, the Asia Pacific region is poised for the fastest growth, fueled by the rapid digitization of economies, increasing mobile and internet penetration, and rising investments in AI and analytics by enterprises. Europe also presents significant opportunities, particularly in sectors like BFSI and retail, where regulatory pressures and customer-centricity are driving early adoption of churn prediction tools. The market landscape in Latin America and the Middle East & Africa is evolving, with organizations gradually recognizing the value of proactive churn management in enhancing competitiveness and customer loyalty.
https://dataintelo.com/privacy-and-policy
According to our latest research, the AI-powered customer churn prediction market size reached USD 1.58 billion globally in 2024, with a robust CAGR of 19.7% expected from 2025 to 2033. Driven by rapid digital transformation and the increasing need for predictive analytics across sectors, the market is forecasted to attain a value of USD 7.57 billion by 2033. The growth of this market is primarily attributed to the escalating adoption of AI and machine learning technologies by enterprises seeking to reduce customer attrition, optimize retention strategies, and enhance overall customer lifetime value, as per the latest industry research.
One of the fundamental growth drivers for the AI-powered customer churn prediction market is the proliferation of customer data and the imperative need for businesses to leverage this data to drive actionable insights. With the advent of digital touchpoints, organizations are now able to collect vast amounts of structured and unstructured data from various customer interactions. This data, when processed using advanced AI and machine learning algorithms, empowers companies to predict potential churn with high accuracy. As a result, businesses across industries such as telecommunications, BFSI, retail, and healthcare are increasingly investing in AI-powered churn prediction solutions to proactively identify at-risk customers and implement targeted retention strategies, thereby reducing revenue loss and improving profitability.
Another significant factor fueling market expansion is the growing emphasis on customer experience and personalization. In today's hyper-competitive landscape, retaining existing customers has become more cost-effective than acquiring new ones. AI-powered churn prediction tools enable organizations to segment their customer base, understand behavior patterns, and tailor interventions for individual customers. This level of personalization not only helps in reducing churn rates but also enhances customer satisfaction and loyalty. The integration of AI-driven insights into CRM systems and marketing automation platforms further streamlines the process, making it easier for businesses to act on predictions in real time. Moreover, the rising adoption of cloud-based solutions has made these technologies more accessible to small and medium enterprises (SMEs), broadening the market’s reach.
The surge in demand for scalable, real-time analytics platforms is also contributing to market growth. Enterprises are increasingly seeking AI-powered solutions that can integrate seamlessly with their existing IT infrastructure, deliver instant insights, and scale as their data grows. The shift towards cloud deployment models has accelerated this trend, offering cost-effective, flexible, and easily deployable churn prediction solutions. Additionally, advancements in natural language processing (NLP), deep learning, and big data analytics are further enhancing the accuracy and reliability of churn prediction models. As organizations strive to stay ahead of the competition by minimizing customer attrition, the demand for sophisticated, AI-driven predictive analytics tools continues to rise.
Regionally, North America holds the largest market share, followed by Europe and Asia Pacific. The dominance of North America can be attributed to the early adoption of AI technologies, presence of major technology vendors, and a strong focus on customer-centric strategies among enterprises in the region. Europe is also witnessing significant growth, driven by stringent regulations around data protection and a growing emphasis on customer retention in industries like BFSI and retail. The Asia Pacific region is expected to exhibit the highest CAGR during the forecast period, fueled by rapid digitalization, increasing investments in AI, and the expansion of e-commerce and telecommunications sectors. Latin America and the Middle East & Africa are also experiencing gradual adoption, primarily in financial services and telecommunications.
The component segment of the AI-powered customer churn prediction market is categorized into software and services. The software segment dominates the market, accounting for the largest share in 2024, owing to the widespread deployment of advanced AI and machine learning platforms
Although the results were close, the industry in the United States where customers were most likely to leave their current provider due to poor customer service appears to be cable television, with a 25 percent churn rate in 2020.
Churn rate
Churn rate, sometimes also called attrition rate, is the percentage of customers that stop using a service within a given time period. It is often used to measure businesses that have a contractual customer base, especially subscriber-based service models.
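The definition above can be written as a short calculation. This sketch uses one common convention, dividing customers lost during the period by customers at the start of the period; other conventions exist, and the counts are hypothetical.

```python
def churn_rate_percent(customers_at_start, customers_lost):
    """Percentage of starting customers who stopped using the service in the period."""
    if customers_at_start <= 0:
        raise ValueError("customers_at_start must be positive")
    return 100.0 * customers_lost / customers_at_start


# Hypothetical provider: 2,000 subscribers at the start of the year, 500 lost.
rate = churn_rate_percent(customers_at_start=2000, customers_lost=500)
print(f"churn rate: {rate:.1f}%")
```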
https://www.marketresearchforecast.com/privacy-policy
The Customer Churn Software market is experiencing robust growth, driven by the increasing need for businesses across diverse sectors to improve customer retention and enhance profitability. The market's expansion is fueled by several key factors. Firstly, the rising adoption of cloud-based solutions offers scalability and cost-effectiveness, attracting a wider range of businesses. Secondly, advancements in AI and machine learning are enabling more sophisticated churn prediction and proactive customer engagement strategies. The telecommunications, banking and finance, and retail and e-commerce sectors are currently leading the adoption, leveraging the software to identify at-risk customers and implement targeted retention programs. However, factors such as high implementation costs, integration challenges with existing systems, and the need for skilled personnel to manage the software can act as restraints on market growth. We project a substantial market expansion in the coming years, with a steady compound annual growth rate (CAGR) contributing to a significant increase in market value. The competitive landscape is dynamic, with established players like IBM, Salesforce, and Microsoft competing alongside specialized churn management solution providers. This competition fosters innovation and drives the development of more advanced features and functionalities. Looking ahead, the market will witness further consolidation through mergers and acquisitions, as larger companies seek to expand their market share. The increasing emphasis on data privacy and security regulations will also shape market dynamics, with vendors focusing on compliant solutions. The market is expected to witness the rise of niche solutions tailored to specific industry segments, providing customized functionalities. 
The geographic distribution of the market is expected to remain concentrated in North America and Europe initially, with significant growth potential in emerging markets like Asia Pacific and the Middle East & Africa, fueled by increasing digitalization and adoption of sophisticated business analytics. The continued evolution of AI and machine learning algorithms will be crucial in improving the accuracy and efficiency of churn prediction models, further enhancing the value proposition of Customer Churn Software. This convergence of technological advancement, regulatory compliance, and industry-specific needs will shape the future trajectory of the Customer Churn Software market.
MIT License https://opensource.org/licenses/MIT
License information was derived automatically
The Dataset
About the Customer churn dataset
To start out, you'll be working with real data from the Kenyan Telecommunication survey. The dataset, "Churn.xls", supports customer churn analysis for a telecommunications company. Customer churn refers to customers who stop doing business with a company. The dataset includes customer attributes and usage patterns that are typically used to predict whether a customer is likely to leave the service (churn) or stay. Here is a brief description of the variables provided in the dataset:
1. ID: A unique identifier for each customer.
2. COLLEGE: Indicates whether the customer has a college degree ("one" for yes, "zero" for no).
3. INCOME: The annual income of the customer.
4. OVERAGE: The number of overage minutes the customer used.
5. LEFTOVER: The number of leftover minutes the customer has.
6. HOUSE: The value of the customer's house.
7. HANDSET_PRICE: The price of the customer's handset.
8. OVER_15MINS_CALLS_PER_MONTH: The number of calls per month that exceed 15 minutes.
9. AVERAGE_CALL_DURATION: The average duration of calls made by the customer.
10. REPORTED_SATISFACTION: The customer's reported level of satisfaction with the service (e.g., "unsat", "very_sat").
11. REPORTED_USAGE_LEVEL: The customer's reported usage level of the service (e.g., "little", "very_high").
12. CONSIDERING_CHANGE_OF_PLAN: Indicates whether the customer is considering changing their plan (e.g., "no", "considering").
13. LEAVE: The target variable, indicating whether the customer left within the last month ("LEAVE") or stayed ("STAY").
Based on these variables, the dataset can be used for predictive modeling to identify factors that influence customer churn and to develop strategies to retain customers. The variables cover demographic information, usage patterns, customer satisfaction, and the likelihood of changing plans, all of which are crucial in understanding and predicting churn behavior.
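As a minimal sketch of how the text-coded fields described above might be turned into numeric inputs, the snippet below maps COLLEGE and LEAVE to 0/1 values. The two sample rows and the `encode_row` helper are hypothetical and are not taken from Churn.xls. REPORTED_SATISFACTION is passed through unchanged because the description lists only some of its levels.

```python
# Hypothetical sample rows; the values are invented for illustration
# and do not come from Churn.xls.
rows = [
    {"COLLEGE": "one", "OVERAGE": 120,
     "REPORTED_SATISFACTION": "unsat", "LEAVE": "LEAVE"},
    {"COLLEGE": "zero", "OVERAGE": 0,
     "REPORTED_SATISFACTION": "very_sat", "LEAVE": "STAY"},
]


def encode_row(row):
    """Map the text-coded COLLEGE and LEAVE fields to 0/1 integers."""
    return {
        "college": 1 if row["COLLEGE"] == "one" else 0,
        "overage": row["OVERAGE"],
        # Left as text: the full set of satisfaction levels is not listed.
        "reported_satisfaction": row["REPORTED_SATISFACTION"],
        "churned": 1 if row["LEAVE"] == "LEAVE" else 0,
    }


encoded = [encode_row(r) for r in rows]
for item in encoded:
    print(item)
```

Keeping the target as a 0/1 integer makes it straightforward to feed into most binary classifiers.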
Why Analysis?
Customer churn refers to customers discontinuing their relationship or subscription with a company or service provider. The churn rate is the rate at which customers stop using a company's products or services within a specific period. Churn is an important metric for businesses because it directly impacts revenue, growth, and customer retention. In the Churn dataset, the churn label indicates whether a customer has churned. A churned customer is one who has decided to discontinue their subscription or usage of the company's services, while a non-churned customer is one who remains engaged and keeps their relationship with the company. Understanding customer churn helps businesses identify the patterns, factors, and indicators that contribute to customer attrition. By analyzing churn behavior and its associated features, companies can develop strategies to retain existing customers, improve customer satisfaction, and reduce customer turnover. Predictive modeling techniques can also be applied to forecast potential churn, enabling companies to take proactive measures to retain at-risk customers.
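The churn rate described above can be sketched as the share of customers labeled as churned in a given period. The labels below are hypothetical (1 = churned, 0 = stayed), and `churn_rate` is an illustrative helper rather than part of any dataset tooling.

```python
def churn_rate(labels):
    """Return the fraction of customers whose label is 1 (churned)."""
    if not labels:
        raise ValueError("labels must be non-empty")
    return sum(labels) / len(labels)


# Hypothetical labels for ten customers over one period.
labels = [0, 1, 0, 0, 1, 0, 0, 0, 1, 0]
rate = churn_rate(labels)
print(f"Churn rate: {rate:.0%}")
```

Three of the ten hypothetical customers churned, so the rate for this sample is 30%.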
https://www.wiseguyreports.com/pages/privacy-policy
BASE YEAR | 2024 |
HISTORICAL DATA | 2019 - 2024 |
REPORT COVERAGE | Revenue Forecast, Competitive Landscape, Growth Factors, and Trends |
MARKET SIZE 2023 | 5.01 (USD Billion) |
MARKET SIZE 2024 | 5.64 (USD Billion) |
MARKET SIZE 2032 | 14.52 (USD Billion) |
SEGMENTS COVERED | Deployment Mode, Application, Industry, Model Complexity, Data Type, Regional |
COUNTRIES COVERED | North America, Europe, APAC, South America, MEA |
KEY MARKET DYNAMICS | Cloud-based Deployment; Integration of Machine Learning; Big Data Analytics; Increase in Demand for Predictive Analytics; Rising Prevalence of Chronic Diseases |
MARKET FORECAST UNITS | USD Billion |
KEY COMPANIES PROFILED | Qlik Technologies, Oracle, Tableau Software, Alteryx, Teradata, SAS Institute, Dell Technologies, KNIME, H2O.ai, DataRobot, HP Enterprise, SAP SE, Microsoft, IBM, RapidMiner |
MARKET FORECAST PERIOD | 2025 - 2032 |
KEY MARKET OPPORTUNITIES | 1. Expanding healthcare applications; 2. Growing demand in pharmaceuticals; 3. Rise of ecommerce and logistics; 4. Increasing focus on predictive analytics; 5. Advancements in machine learning algorithms |
COMPOUND ANNUAL GROWTH RATE (CAGR) | 12.56% (2025 - 2032) |
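As a rough cross-check of the figures in the table, a compound annual growth rate can be computed from two market sizes and the number of years between them. The sketch below assumes the 2024 and 2032 market sizes are the endpoints, giving eight compounding years. The report quotes its CAGR for 2025 - 2032 and does not list a 2025 market size, so this is an approximation rather than the report's own calculation.

```python
def cagr(start_value, end_value, years):
    """Compound annual growth rate between two values over `years` years."""
    if start_value <= 0 or years <= 0:
        raise ValueError("start_value and years must be positive")
    return (end_value / start_value) ** (1 / years) - 1


# Assumption: use the 2024 and 2032 market sizes (USD billion) as endpoints.
implied = cagr(5.64, 14.52, 2032 - 2024)
print(f"Implied CAGR: {implied:.2%}")
```

Under this assumption the implied rate comes out at roughly 12.5%, close to the 12.56% stated in the table.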
https://www.datainsightsmarket.com/privacy-policy
Predictive analytics is rapidly transforming the banking sector, offering institutions the ability to enhance decision-making across various operations. The market, currently valued at approximately $15 billion in 2025, is projected to experience robust growth, driven by several key factors. Increasing regulatory scrutiny demanding improved risk management necessitates advanced analytical tools. The need for personalized customer experiences, coupled with the rising adoption of digital banking channels, fuels demand for predictive modeling in areas such as fraud detection, customer churn prediction, and targeted marketing. Furthermore, the availability of vast amounts of data, combined with advancements in machine learning and artificial intelligence, empowers banks to derive actionable insights with unprecedented accuracy. The market's expansion is further accelerated by the growing adoption of cloud-based solutions, offering scalability and cost-effectiveness. However, challenges remain. Data security and privacy concerns are paramount, requiring robust data governance frameworks. The need for skilled professionals to develop, implement, and interpret predictive models presents another hurdle. Additionally, the integration of predictive analytics solutions with existing legacy systems within banking institutions can prove complex and time-consuming. Despite these challenges, the long-term outlook for predictive analytics in banking remains positive, with a projected Compound Annual Growth Rate (CAGR) of approximately 15% from 2025 to 2033. This growth is anticipated to be driven by continuous technological innovation, increasing data availability, and the growing recognition of the substantial return on investment associated with predictive modeling within the financial industry. 
The competitive landscape includes established players like FICO, IBM, and Oracle, as well as specialized providers such as Accretive Technologies and Angoss Software, vying for market share through innovative solutions and strategic partnerships.
https://creativecommons.org/publicdomain/zero/1.0/
This dataset simulates customer behavior for a fictional telecommunications company. It contains demographic information, account details, services subscribed to, and whether the customer ultimately churned (stopped using the service) or not. The data is synthetically generated but designed to reflect realistic patterns often found in telecom churn scenarios.
Purpose:
The primary goal of this dataset is to provide a clean and straightforward resource for beginners learning about:
Features:
The dataset includes the following columns:
CustomerID: Unique identifier for each customer.
Age: Customer's age in years.
Gender: Customer's gender (Male/Female).
Location: General location of the customer (e.g., New York, Los Angeles).
SubscriptionDurationMonths: How many months the customer has been subscribed.
MonthlyCharges: The amount the customer is charged each month.
TotalCharges: The total amount the customer has been charged over their subscription period.
ContractType: The type of contract the customer has (Month-to-month, One year, Two year).
PaymentMethod: How the customer pays their bill (e.g., Electronic check, Credit card).
OnlineSecurity: Whether the customer has online security service (Yes, No, No internet service).
TechSupport: Whether the customer has tech support service (Yes, No, No internet service).
StreamingTV: Whether the customer has TV streaming service (Yes, No, No internet service).
StreamingMovies: Whether the customer has movie streaming service (Yes, No, No internet service).
Churn: (Target Variable) Whether the customer churned (1 = Yes, 0 = No).
Data Quality:
This dataset is intentionally clean with no missing values, making it easy for beginners to focus on analysis and modeling concepts without complex data cleaning steps.
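Because several of the columns listed above are categorical, a common first step is to encode them numerically. The sketch below one-hot encodes ContractType and collapses the three-valued service columns into 0/1 flags. The customer record is hypothetical, and treating "No internet service" the same as "No" is an assumption made for illustration, not something the dataset description prescribes.

```python
CONTRACT_TYPES = ["Month-to-month", "One year", "Two year"]


def one_hot_contract(value):
    """Return a one-hot list for ContractType in the order of CONTRACT_TYPES."""
    if value not in CONTRACT_TYPES:
        raise ValueError(f"unexpected contract type: {value!r}")
    return [1 if value == contract else 0 for contract in CONTRACT_TYPES]


def has_service(value):
    """Map Yes to 1; map No and No internet service to 0 (an assumption)."""
    return 1 if value == "Yes" else 0


# Hypothetical customer record using the column names listed above.
customer = {
    "ContractType": "One year",
    "OnlineSecurity": "No internet service",
    "TechSupport": "Yes",
    "Churn": 0,
}

features = one_hot_contract(customer["ContractType"]) + [
    has_service(customer["OnlineSecurity"]),
    has_service(customer["TechSupport"]),
]
print(features, customer["Churn"])
```

If the distinction between "No" and "No internet service" matters for a model, a separate indicator column would preserve it.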
Inspiration:
Understanding customer churn is crucial for many businesses. This dataset provides a sandbox environment to practice the fundamental techniques used in churn analysis and prediction.
In the first quarter of 2024, T-Mobile US had a churn rate of **** percent for postpaid subscribers, a *** percentage point increase compared to the previous quarter. T-Mobile US has lowered its postpaid churn rate from more than *** percent to below *** percent over the last ten years.
https://www.archivemarketresearch.com/privacy-policy
The Big Data and Machine Learning (BDML) in Telecom market is experiencing robust growth, driven by the explosive increase in mobile data traffic, the rise of 5G networks, and the increasing need for personalized customer experiences. The market, valued at approximately $15 billion in 2025, is projected to witness a Compound Annual Growth Rate (CAGR) of 18% from 2025 to 2033, reaching an estimated $60 billion by 2033. This expansion is fueled by several key factors. Telecom operators are leveraging BDML for network optimization, predictive maintenance, fraud detection, customer churn prediction, and personalized service offerings. The adoption of descriptive, predictive, and prescriptive analytics across various applications, including processing, storage, and analysis of vast datasets, is a significant driver. Furthermore, advancements in machine learning algorithms and feature engineering techniques are empowering telecom companies to extract deeper insights from their data, leading to significant efficiency gains and improved revenue streams. The increasing availability of cloud-based BDML solutions is also fostering wider adoption, particularly among smaller operators. However, challenges remain. Data security and privacy concerns, the need for skilled data scientists and engineers, and the high initial investment costs associated with implementing BDML solutions can hinder market growth. Despite these restraints, the strategic advantages offered by BDML are undeniable, making its adoption crucial for telecom companies aiming to stay competitive in a rapidly evolving landscape. Segments like predictive analytics and machine learning for network optimization are expected to experience the most significant growth during the forecast period, driven by the increasing complexity of telecom networks and the demand for proactive network management. 
Geographic regions such as North America and Asia Pacific, with their advanced technological infrastructure and substantial investments in 5G, are anticipated to lead the market, followed by Europe and other regions.
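The growth figures quoted above can be sanity-checked by compounding the 2025 value forward at the stated rate. This is a minimal sketch that assumes annual compounding over the eight years from 2025 to 2033. Both inputs are described as approximate in the text, so only rough agreement with the 2033 estimate should be expected.

```python
def project(value, rate, years):
    """Compound `value` forward at annual growth `rate` for `years` years."""
    return value * (1 + rate) ** years


# Figures from the text: about $15 billion in 2025, 18% CAGR through 2033.
projected_2033 = project(15.0, 0.18, 2033 - 2025)
print(f"Projected 2033 market size: ${projected_2033:.1f} billion")
```

The result lands in the mid-50s of billions of dollars, which is broadly in line with the rounded estimate of $60 billion.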