23 datasets found
  1. Details of feature variables of the data set.

    • plos.figshare.com
    xls
    Updated Dec 8, 2023
    Cite
    Ke Peng; Yan Peng; Wenguang Li (2023). Details of feature variables of the data set. [Dataset]. http://doi.org/10.1371/journal.pone.0289724.t002
    Explore at:
    xls (available download formats)
    Dataset updated
    Dec 8, 2023
    Dataset provided by
    PLOS ONE
    Authors
    Ke Peng; Yan Peng; Wenguang Li
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    In recent years, as the financial system has matured and the banking industry has grown rapidly, competition among banks has intensified. At the same time, advances in information and Internet technology have diversified customers’ choice of financial products and weakened their dependence on and loyalty to individual banks, making customer churn an increasingly prominent problem for commercial banks. Predicting customer behavior and retaining existing customers has become a major challenge for banks. This study therefore takes a bank’s business data from the Kaggle platform as its research object, compares several sampling methods for balancing the data, builds a bank customer churn prediction model with GA-XGBoost, and performs an interpretability analysis of that model to provide decision support and suggestions for preventing customer churn. The results show that: (1) SMOTEENN handles the imbalance in the banking data more effectively than SMOTE and ADASYN. (2) The XGBoost model optimized with a genetic algorithm reaches F1 and AUC values of 90% and 99%, respectively, outperforming the six other machine learning models compared. The GA-XGBoost classifier was identified as the best solution for the customer churn problem. (3) Shapley values are used to explain how each feature affects the model results and to analyze the features with a high impact on the model prediction, such as the total number of transactions in the past year, the amount of transactions in the past year, the number of products owned by customers, and the total sales balance.
    The paper makes two main contributions: (1) it extracts useful information from a black-box model while accurately identifying churned customers, which can help commercial banks improve their service quality and retain customers; (2) it can serve as a reference for customer churn early-warning models in other related industries, and can help the banking industry maintain customer stability, maintain market position, and reduce corporate losses.
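The abstract reports results as F1 and AUC values. As a reminder of what those two numbers measure, here is a minimal pure-Python sketch. It is not the paper's code, the function names are hypothetical, and the counts and scores in the usage lines are toy values chosen only to exercise the functions.

```python
def f1_score(tp, fp, fn):
    """F1 = harmonic mean of precision and recall, from confusion-matrix counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def roc_auc(labels, scores):
    """AUC as the probability that a random churner (label 1) receives a
    higher score than a random non-churner (label 0); ties count as half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy values (assumptions, not taken from the paper).
toy_f1 = f1_score(tp=8, fp=2, fn=2)
toy_auc = roc_auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1])
print(toy_f1, toy_auc)
```

The pairwise AUC above is quadratic in the number of examples, which is fine for illustration but would likely be replaced by a rank-based computation on a dataset of realistic size.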

  2. Bank Customer Churn Dataset

    • kaggle.com
    Updated Jul 11, 2023
    Cite
    Bhuvi Ranga (2023). Bank Customer Churn Dataset [Dataset]. https://www.kaggle.com/datasets/bhuviranga/customer-churn-data
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Jul 11, 2023
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Bhuvi Ranga
    License

    http://opendatacommons.org/licenses/dbcl/1.0/

    Description

    The customer churn dataset is a collection of customer data for predicting customer churn, that is, the tendency of customers to stop using a company's products or services. Each customer is described by features such as credit score, country, gender, age, tenure, balance, number of products, credit card status, active membership, estimated salary, and churn status. The churn status indicates whether a customer has churned. The dataset is used to analyze the factors that contribute to customer churn and to build predictive models that identify customers at risk of churning. The goal is to develop strategies and interventions that reduce churn and improve customer retention.

  3. Confusion matrix.

    • plos.figshare.com
    xls
    Updated Dec 8, 2023
    Cite
    Ke Peng; Yan Peng; Wenguang Li (2023). Confusion matrix. [Dataset]. http://doi.org/10.1371/journal.pone.0289724.t004
    Explore at:
    xls (available download formats)
    Dataset updated
    Dec 8, 2023
    Dataset provided by
    PLOS ONE
    Authors
    Ke Peng; Yan Peng; Wenguang Li
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically


  4. Bank Churn (test)

    • kaggle.com
    Updated Jan 21, 2024
    Cite
    Harshit Sharma (2024). Bank Churn (test) [Dataset]. https://www.kaggle.com/datasets/harshitstark/bank-churn-dataset-test
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Jan 21, 2024
    Dataset provided by
    Kaggle
    Authors
    Harshit Sharma
    License

    MIT License: https://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    Explore the 'Bank Churn (test)' dataset, a comprehensive collection designed for evaluating predictive models and analyzing customer attrition in the banking sector. This test dataset, derived from real-world scenarios, offers a robust platform to assess the effectiveness of machine learning algorithms in predicting and understanding bank churn dynamics.

  5. Predictive Analytics in Banking Report

    • datainsightsmarket.com
    doc, pdf, ppt
    Updated Jun 17, 2025
    Cite
    Data Insights Market (2025). Predictive Analytics in Banking Report [Dataset]. https://www.datainsightsmarket.com/reports/predictive-analytics-in-banking-1448930
    Explore at:
    pdf, doc, ppt (available download formats)
    Dataset updated
    Jun 17, 2025
    Dataset authored and provided by
    Data Insights Market
    License

    https://www.datainsightsmarket.com/privacy-policy

    Time period covered
    2025 - 2033
    Area covered
    Global
    Variables measured
    Market Size
    Description

    Predictive analytics is rapidly transforming the banking sector, offering institutions the ability to enhance decision-making across various operations. The market, currently valued at approximately $15 billion in 2025, is projected to experience robust growth, driven by several key factors. Increasing regulatory scrutiny demanding improved risk management necessitates advanced analytical tools. The need for personalized customer experiences, coupled with the rising adoption of digital banking channels, fuels demand for predictive modeling in areas such as fraud detection, customer churn prediction, and targeted marketing. Furthermore, the availability of vast amounts of data, combined with advancements in machine learning and artificial intelligence, empowers banks to derive actionable insights with unprecedented accuracy. The market's expansion is further accelerated by the growing adoption of cloud-based solutions, offering scalability and cost-effectiveness. However, challenges remain. Data security and privacy concerns are paramount, requiring robust data governance frameworks. The need for skilled professionals to develop, implement, and interpret predictive models presents another hurdle. Additionally, the integration of predictive analytics solutions with existing legacy systems within banking institutions can prove complex and time-consuming. Despite these challenges, the long-term outlook for predictive analytics in banking remains positive, with a projected Compound Annual Growth Rate (CAGR) of approximately 15% from 2025 to 2033. This growth is anticipated to be driven by continuous technological innovation, increasing data availability, and the growing recognition of the substantial return on investment associated with predictive modeling within the financial industry. 
The competitive landscape includes established players like FICO, IBM, and Oracle, as well as specialized providers such as Accretive Technologies and Angoss Software, vying for market share through innovative solutions and strategic partnerships.
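The description quotes a starting value of approximately $15 billion in 2025 and a CAGR of approximately 15% through 2033. The sketch below only compounds those two quoted figures to show what they would imply year by year. The function name is hypothetical, both inputs are approximations, and the output is arithmetic on the quoted numbers, not a figure from the report.

```python
def project_value(base, cagr, years):
    """Compound growth: value after `years` periods at a constant annual rate."""
    return base * (1 + cagr) ** years


BASE_2025_USD_BN = 15.0  # "approximately $15 billion", as quoted in the description
CAGR = 0.15              # "approximately 15%", as quoted in the description

# Implied market size for each year of the covered period (2025 to 2033).
for year in range(2025, 2034):
    value = project_value(BASE_2025_USD_BN, CAGR, year - 2025)
    print(f"{year}: ~${value:.1f} billion")
```

Under these assumptions the implied 2033 value is roughly three times the 2025 base, since 1.15 compounded over eight years is about 3.06.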

  6. Comparison results of different model.

    • plos.figshare.com
    xls
    Updated Dec 8, 2023
    + more versions
    Cite
    Ke Peng; Yan Peng; Wenguang Li (2023). Comparison results of different model. [Dataset]. http://doi.org/10.1371/journal.pone.0289724.t006
    Explore at:
    xls (available download formats)
    Dataset updated
    Dec 8, 2023
    Dataset provided by
    PLOS ONE
    Authors
    Ke Peng; Yan Peng; Wenguang Li
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically


  7. Bank customer churn model prediction

    • kaggle.com
    Updated Jul 13, 2023
    Cite
    T S S ABHI RAM KOTIPALLI (2023). Bank customer churn model prediction [Dataset]. https://www.kaggle.com/datasets/tssabhiramkotipalli/bank-customer-churn-model-prediction/code
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Jul 13, 2023
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    T S S ABHI RAM KOTIPALLI
    Description

    This dataset was created by T S S ABHI RAM KOTIPALLI.

  8. Data from: Bank Customer Churn Prediction

    • kaggle.com
    Updated Mar 21, 2024
    Cite
    Murilo Zangari (2024). Bank Customer Churn Prediction [Dataset]. https://www.kaggle.com/datasets/murilozangari/customer-churn-from-a-bank/code
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Mar 21, 2024
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Murilo Zangari
    License

    Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
    License information was derived automatically

    Description

    The data will be used to predict whether a customer of the bank will churn. If a customer churns, it means they left the bank and took their business elsewhere. If you can predict which customers are likely to churn, you can take measures to retain them before they do. These measures could be promotions, discounts, or other incentives to boost customer satisfaction and, therefore, retention.

    The dataset contains:

    10,000 rows – each row is a unique customer of the bank

    14 columns:

    RowNumber: Row numbers from 1 to 10,000

    CustomerId: Customer’s unique ID assigned by bank

    Surname: Customer’s last name

    CreditScore: Customer’s credit score. This number can range from 300 to 850.

    Geography: Customer’s country of residence

    Gender: Categorical indicator

    Age: Customer’s age (years)

    Tenure: Number of years customer has been with bank

    Balance: Customer’s bank balance (Euros)

    NumOfProducts: Number of products the customer has with the bank

    HasCrCard: Indicates whether the customer has a credit card with the bank

    IsActiveMember: Indicates whether the customer is considered active

    EstimatedSalary: Customer’s estimated annual salary (Euros)

    Exited: Indicates whether the customer churned (left the bank)
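The column list above can be turned into a small loading routine. The following is a minimal sketch using only the standard library: it drops the three identifier columns (RowNumber, CustomerId, Surname), converts the remaining fields to numbers where the listing implies a numeric type, and computes the overall churn rate. The function names are hypothetical, and the four sample rows are made up for illustration; they are not rows from the actual dataset.

```python
import csv
import io

# Made-up rows following the 14-column layout described above (NOT real data).
SAMPLE_CSV = """RowNumber,CustomerId,Surname,CreditScore,Geography,Gender,Age,Tenure,Balance,NumOfProducts,HasCrCard,IsActiveMember,EstimatedSalary,Exited
1,1001,Alpha,650,France,Female,40,3,0.00,1,1,1,50000.00,1
2,1002,Beta,700,Spain,Male,35,5,12000.50,2,0,1,62000.00,0
3,1003,Gamma,480,Germany,Female,52,8,98000.00,3,1,0,71000.00,1
4,1004,Delta,810,France,Male,29,1,0.00,2,0,0,43000.00,0
"""

ID_COLUMNS = ("RowNumber", "CustomerId", "Surname")  # identifiers, not features
INT_COLUMNS = ("CreditScore", "Age", "Tenure", "NumOfProducts",
               "HasCrCard", "IsActiveMember", "Exited")
FLOAT_COLUMNS = ("Balance", "EstimatedSalary")


def load_customers(text):
    """Parse CSV text into a list of dicts with identifiers removed and numbers typed."""
    rows = []
    for raw in csv.DictReader(io.StringIO(text)):
        row = {k: v for k, v in raw.items() if k not in ID_COLUMNS}
        for name in INT_COLUMNS:
            row[name] = int(row[name])
        for name in FLOAT_COLUMNS:
            row[name] = float(row[name])
        rows.append(row)
    return rows


def churn_rate(rows):
    """Fraction of customers whose Exited flag is 1."""
    return sum(r["Exited"] for r in rows) / len(rows)


customers = load_customers(SAMPLE_CSV)
print(len(customers), "customers, churn rate", churn_rate(customers))
```

To read a downloaded file instead of the inline sample, the same `csv.DictReader` call could be pointed at an open file handle; the file name would depend on how the dataset is distributed.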

  9. Comparison of GA-XGBoost with XGBoost and LightGBM test results.

    • figshare.com
    xls
    Updated Dec 8, 2023
    Cite
    Ke Peng; Yan Peng; Wenguang Li (2023). Comparison of GA-XGBoost with XGBoost and LightGBM test results. [Dataset]. http://doi.org/10.1371/journal.pone.0289724.t008
    Explore at:
    xls (available download formats)
    Dataset updated
    Dec 8, 2023
    Dataset provided by
    PLOS ONE
    Authors
    Ke Peng; Yan Peng; Wenguang Li
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Comparison of GA-XGBoost with XGBoost and LightGBM test results.

  10. Bank_ Customer_ Churn _Prediction_ Model_1

    • kaggle.com
    zip
    Updated Apr 14, 2021
    + more versions
    Cite
    Abraz Laskar (2021). Bank_ Customer_ Churn _Prediction_ Model_1 [Dataset]. https://www.kaggle.com/abrazlaskar/bank-customer-churn-prediction-model-1
    Explore at:
    zip, 220458 bytes (available download formats)
    Dataset updated
    Apr 14, 2021
    Authors
    Abraz Laskar
    Description

    This dataset was created by Abraz Laskar.

  11. Customer Churn Software Report

    • marketresearchforecast.com
    doc, pdf, ppt
    Updated Mar 25, 2025
    Cite
    Market Research Forecast (2025). Customer Churn Software Report [Dataset]. https://www.marketresearchforecast.com/reports/customer-churn-software-56060
    Explore at:
    pdf, ppt, doc (available download formats)
    Dataset updated
    Mar 25, 2025
    Dataset authored and provided by
    Market Research Forecast
    License

    https://www.marketresearchforecast.com/privacy-policy

    Time period covered
    2025 - 2033
    Area covered
    Global
    Variables measured
    Market Size
    Description

    The Customer Churn Software market is experiencing robust growth, driven by the increasing need for businesses across diverse sectors to improve customer retention and enhance profitability. The market's expansion is fueled by several key factors. Firstly, the rising adoption of cloud-based solutions offers scalability and cost-effectiveness, attracting a wider range of businesses. Secondly, advancements in AI and machine learning are enabling more sophisticated churn prediction and proactive customer engagement strategies. The telecommunications, banking and finance, and retail and e-commerce sectors are currently leading the adoption, leveraging the software to identify at-risk customers and implement targeted retention programs. However, factors such as high implementation costs, integration challenges with existing systems, and the need for skilled personnel to manage the software can act as restraints on market growth. We project a substantial market expansion in the coming years, with a steady compound annual growth rate (CAGR) contributing to a significant increase in market value. The competitive landscape is dynamic, with established players like IBM, Salesforce, and Microsoft competing alongside specialized churn management solution providers. This competition fosters innovation and drives the development of more advanced features and functionalities. Looking ahead, the market will witness further consolidation through mergers and acquisitions, as larger companies seek to expand their market share. The increasing emphasis on data privacy and security regulations will also shape market dynamics, with vendors focusing on compliant solutions. The market is expected to witness the rise of niche solutions tailored to specific industry segments, providing customized functionalities. 
The geographic distribution of the market is expected to remain concentrated in North America and Europe initially, with significant growth potential in emerging markets like Asia Pacific and the Middle East & Africa, fueled by increasing digitalization and adoption of sophisticated business analytics. The continued evolution of AI and machine learning algorithms will be crucial in improving the accuracy and efficiency of churn prediction models, further enhancing the value proposition of Customer Churn Software. This convergence of technological advancement, regulatory compliance, and industry-specific needs will shape the future trajectory of the Customer Churn Software market.

  12. The summary of the literature review.

    • plos.figshare.com
    xls
    Updated Dec 8, 2023
    Cite
    Ke Peng; Yan Peng; Wenguang Li (2023). The summary of the literature review. [Dataset]. http://doi.org/10.1371/journal.pone.0289724.t001
    Explore at:
    xls (available download formats)
    Dataset updated
    Dec 8, 2023
    Dataset provided by
    PLOS ONE
    Authors
    Ke Peng; Yan Peng; Wenguang Li
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically


  13. Churn Prediction Software Report

    • datainsightsmarket.com
    doc, pdf, ppt
    Updated May 11, 2025
    Cite
    Data Insights Market (2025). Churn Prediction Software Report [Dataset]. https://www.datainsightsmarket.com/reports/churn-prediction-software-502488
    Explore at:
    doc, ppt, pdf (available download formats)
    Dataset updated
    May 11, 2025
    Dataset authored and provided by
    Data Insights Market
    License

    https://www.datainsightsmarket.com/privacy-policy

    Time period covered
    2025 - 2033
    Area covered
    Global
    Variables measured
    Market Size
    Description

    The Churn Prediction Software market is experiencing robust growth, driven by the increasing need for businesses across diverse sectors to proactively manage customer retention. The market's expansion is fueled by the rising adoption of cloud-based solutions, offering scalability and cost-effectiveness. Key applications include telecommunications, banking and finance, retail, e-commerce, and healthcare, where minimizing customer churn is crucial for profitability. The market is witnessing a shift towards sophisticated predictive analytics and machine learning algorithms that provide more accurate churn predictions, allowing businesses to implement targeted retention strategies. This includes personalized offers, proactive customer support, and improved product/service offerings. Furthermore, the integration of churn prediction software with CRM systems enhances data analysis and facilitates more effective customer relationship management. Competition is intensifying with established players like SAP, Salesforce, and Oracle competing alongside agile startups offering specialized solutions. The market's growth, while positive, also faces certain restraints, such as the high initial investment costs for implementing these sophisticated solutions and the need for skilled data scientists to interpret and leverage the insights derived from the analyses. Despite these challenges, the market's future remains promising. The increasing availability of large datasets, coupled with advancements in artificial intelligence and machine learning, is expected to drive innovation and further enhance the accuracy and effectiveness of churn prediction software. Regional growth will vary, with North America and Europe likely leading the market initially, driven by higher technology adoption rates and established business practices. 
However, growth in Asia-Pacific is anticipated to accelerate significantly in the coming years as businesses in developing economies prioritize customer retention strategies. The continued development of user-friendly interfaces and the increasing integration of these tools into existing business workflows will further contribute to the overall market expansion and wider adoption across various industries.

  14. ‘Churn for Bank Customers’ analyzed by Analyst-2

    • analyst-2.ai
    Updated Mar 27, 2019
    Cite
    Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com) (2019). ‘Churn for Bank Customers’ analyzed by Analyst-2 [Dataset]. https://analyst-2.ai/analysis/kaggle-churn-for-bank-customers-2e90/7961ea42/?iid=013-409&v=presentation
    Explore at:
    Dataset updated
    Mar 27, 2019
    Dataset authored and provided by
    Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com)
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Analysis of ‘Churn for Bank Customers’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://www.kaggle.com/mathchi/churn-for-bank-customers on 28 January 2022.

    --- Dataset description provided by original source is as follows ---

    Content

    • RowNumber—corresponds to the record (row) number and has no effect on the output.
    • CustomerId—contains random values and has no effect on customer leaving the bank.
    • Surname—the surname of a customer has no impact on their decision to leave the bank.
    • CreditScore—can have an effect on customer churn, since a customer with a higher credit score is less likely to leave the bank.
    • Geography—a customer’s location can affect their decision to leave the bank.
    • Gender—it’s interesting to explore whether gender plays a role in a customer leaving the bank.
    • Age—this is certainly relevant, since older customers are less likely to leave their bank than younger ones.
    • Tenure—refers to the number of years that the customer has been a client of the bank. Normally, older clients are more loyal and less likely to leave a bank.
    • Balance—also a very good indicator of customer churn, as people with a higher balance in their accounts are less likely to leave the bank compared to those with lower balances.
    • NumOfProducts—refers to the number of products that a customer has purchased through the bank.
    • HasCrCard—denotes whether or not a customer has a credit card. This column is also relevant, since people with a credit card are less likely to leave the bank.
    • IsActiveMember—active customers are less likely to leave the bank.
    • EstimatedSalary—as with balance, people with lower salaries are more likely to leave the bank compared to those with higher salaries.
    • Exited—whether or not the customer left the bank.
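Several bullets above state an expectation about churn (for example, that active members are less likely to leave). One way to check such an expectation against the data, rather than take it as given, is to compare the churn rate across the values of a column. The sketch below assumes rows are dicts keyed by the column names listed above; the helper name is hypothetical and the six rows are toy values, not taken from the dataset.

```python
from collections import defaultdict


def churn_rate_by(rows, column):
    """Return {column value: fraction of rows with Exited == 1} for each value."""
    totals = defaultdict(int)
    exits = defaultdict(int)
    for row in rows:
        key = row[column]
        totals[key] += 1
        exits[key] += row["Exited"]
    return {key: exits[key] / totals[key] for key in totals}


# Toy rows (assumption, for illustration only).
toy_rows = [
    {"IsActiveMember": 1, "Exited": 0},
    {"IsActiveMember": 1, "Exited": 0},
    {"IsActiveMember": 1, "Exited": 1},
    {"IsActiveMember": 0, "Exited": 1},
    {"IsActiveMember": 0, "Exited": 1},
    {"IsActiveMember": 0, "Exited": 0},
]

rates = churn_rate_by(toy_rows, "IsActiveMember")
print(rates)
```

The same helper works for any categorical column such as Geography or HasCrCard; continuous columns such as Balance or Age would need to be binned first.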

    Acknowledgements

    As we know, it is much more expensive to sign up a new client than to keep an existing one.

    It is advantageous for banks to know what leads a client towards the decision to leave the company.

    Churn prevention allows companies to develop loyalty programs and retention campaigns to keep as many customers as possible.

    --- Original source retains full ownership of the source dataset ---

  15. Performance comparison of different adoption algorithms in XGBoost model.

    • plos.figshare.com
    xls
    Updated Dec 8, 2023
    Cite
    Ke Peng; Yan Peng; Wenguang Li (2023). Performance comparison of different adoption algorithms in XGBoost model. [Dataset]. http://doi.org/10.1371/journal.pone.0289724.t005
    Explore at:
    xls (available download formats)
    Dataset updated
    Dec 8, 2023
    Dataset provided by
    PLOS ONE
    Authors
    Ke Peng; Yan Peng; Wenguang Li
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Performance comparison of different adoption algorithms in XGBoost model.

  16. Bank Customer Churn Dataset

    • kaggle.com
    Updated Aug 30, 2022
    Cite
    Gaurav Topre (2022). Bank Customer Churn Dataset [Dataset]. https://www.kaggle.com/datasets/gauravtopre/bank-customer-churn-dataset/code
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Aug 30, 2022
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Gaurav Topre
    Description

    This dataset is for ABC Multistate Bank, with the following columns:

    1. customer_id, unused variable.
    2. credit_score, used as input.
    3. country, used as input.
    4. gender, used as input.
    5. age, used as input.
    6. tenure, used as input.
    7. balance, used as input.
    8. products_number, used as input.
    9. credit_card, used as input.
    10. active_member, used as input.
    11. estimated_salary, used as input.
    12. churn, used as the target. 1 if the client has left the bank during some period or 0 if he/she has not.

    The aim is to predict customer churn for ABC Bank.
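The column roles listed above can be sketched in code. This is only a minimal illustration that assumes each record is available as a Python dict keyed by the column names in the list; the helper `split_row` and the sample record are hypothetical and are not part of the dataset or its documentation.

```python
# Column roles taken from the dataset description above.
ID_COLUMN = "customer_id"      # listed as an unused variable
TARGET_COLUMN = "churn"        # 1 = client left the bank, 0 = client stayed
FEATURE_COLUMNS = [
    "credit_score", "country", "gender", "age", "tenure", "balance",
    "products_number", "credit_card", "active_member", "estimated_salary",
]


def split_row(row):
    """Return (features, target) for one record, dropping the unused id.

    Hypothetical helper for illustration only.
    """
    features = {name: row[name] for name in FEATURE_COLUMNS}
    target = int(row[TARGET_COLUMN])
    return features, target


# Invented record, used purely to show the shape of the data.
example_row = {
    "customer_id": 1, "credit_score": 600, "country": "France",
    "gender": "Female", "age": 40, "tenure": 3, "balance": 0.0,
    "products_number": 1, "credit_card": 1, "active_member": 1,
    "estimated_salary": 50000.0, "churn": 1,
}
features, target = split_row(example_row)
```

Keeping the column roles in named constants makes it harder to feed the identifier or the target into a model as an input by accident.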


  17. ‘Bank Turnover Dataset’ analyzed by Analyst-2

    • analyst-2.ai
    Updated Jan 28, 2022
    Cite
    Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com) (2022). ‘Bank Turnover Dataset’ analyzed by Analyst-2 [Dataset]. https://analyst-2.ai/analysis/kaggle-bank-turnover-dataset-db8f/480641c5/?iid=013-140&v=presentation
    Dataset updated
    Jan 28, 2022
    Dataset authored and provided by
    Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com)
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Analysis of ‘Bank Turnover Dataset’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://www.kaggle.com/barelydedicated/bank-customer-churn-modeling on 28 January 2022.

    --- No further description of dataset provided by original source ---

    --- Original source retains full ownership of the source dataset ---

  18. ‘Bank Customers Churn ’ analyzed by Analyst-2

    • analyst-2.ai
    Updated Nov 13, 2021
    Cite
    Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com) (2021). ‘Bank Customers Churn ’ analyzed by Analyst-2 [Dataset]. https://analyst-2.ai/analysis/kaggle-bank-customers-churn-bbf0/7d7c24bc/?iid=029-116&v=presentation
    Dataset updated
    Nov 13, 2021
    Dataset authored and provided by
    Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com)
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Analysis of ‘Bank Customers Churn ’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://www.kaggle.com/santoshd3/bank-customers on 30 September 2021.

    --- Dataset description provided by original source is as follows ---

    Context

    A dataset containing customers who are closing their accounts with the bank because of losses and other issues. With the help of this data, we try to analyse the problem and maintain accuracy.

    Content

    What's inside is more than just rows and columns. Make it easy for others to get started by describing how you acquired the data and what time period it represents, too.

    Acknowledgements

    We wouldn't be here without the help of others. If you owe any attributions or thanks, include them here along with any citations of past research.

    Inspiration

    Your data will be in front of the world's largest data science community. What questions do you want to see answered?

    --- Original source retains full ownership of the source dataset ---

  19. Bank Customer Churn

    • kaggle.com
    Updated Mar 14, 2025
    Cite
    CAT Reloaded || Data Science circle (2025). Bank Customer Churn [Dataset]. https://www.kaggle.com/datasets/cat-reloaded-data-science/bank-customer-churn/versions/1
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Mar 14, 2025
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    CAT Reloaded || Data Science circle
    License

    Apache License, v2.0 https://www.apache.org/licenses/LICENSE-2.0
    License information was derived automatically

    Description

    Bank Customer Churn Dataset is a collection of data related to customers of a bank who have either left (churned) or stayed with the bank. This dataset is typically used for predictive modeling to identify patterns and factors that lead to customer churn, enabling banks to take proactive measures to retain customers.

    • id: Unique identifier for each customer.

    • CustomerId: Unique identifier for the customer account.

    • Surname: Last name of the customer.

    • CreditScore: Numeric representation of the customer's creditworthiness.

    • Geography: Country or region where the customer resides.

    • Gender: Gender of the customer (e.g., Male, Female).

    • Age: Age of the customer.

    • Tenure: Number of years the customer has been with the bank.

    • Balance: Current balance in the customer's account.

    • NumOfProducts: Number of bank products the customer uses.

    • HasCrCard: Binary indicator (0 or 1) for whether the customer has a credit card.

    • IsActiveMember: Binary indicator (0 or 1) for whether the customer is an active member.

    • EstimatedSalary: Estimated salary of the customer.

    • Exited: Binary indicator (0 or 1) for whether the customer has churned (the target).

  20. Churn Banking Prediction

    • kaggle.com
    Updated Jul 21, 2021
    Cite
    Aliahmd15 (2021). Churn Banking Prediction [Dataset]. https://www.kaggle.com/datasets/aliahmd15/churn-banking-prediction/discussion
    Explore at:
    Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
    Dataset updated
    Jul 21, 2021
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    Aliahmd15
    Description

    Dataset

    This dataset was created by Aliahmd15

    Contents

Cite
Ke Peng; Yan Peng; Wenguang Li (2023). Details of feature variables of the data set. [Dataset]. http://doi.org/10.1371/journal.pone.0289724.t002

Details of feature variables of the data set.

Related Article
Explore at:
xls (available download formats)
Dataset updated
Dec 8, 2023
Dataset provided by
PLOS ONE
Authors
Ke Peng; Yan Peng; Wenguang Li
License

Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically

Description

In recent years, with the continuous improvement of the financial system and the rapid development of the banking industry, the competition of the banking industry itself has intensified. At the same time, with the rapid development of information technology and Internet technology, customers’ choice of financial products is becoming more and more diversified, and customers’ dependence and loyalty to banking institutions is becoming less and less, and the problem of customer churn in commercial banks is becoming more and more prominent. How to predict customer behavior and retain existing customers has become a major challenge for banks to solve. Therefore, this study takes a bank’s business data on Kaggle platform as the research object, uses multiple sampling methods to compare the data for balancing, constructs a bank customer churn prediction model for churn identification by GA-XGBoost, and conducts interpretability analysis on the GA-XGBoost model to provide decision support and suggestions for the banking industry to prevent customer churn. The results show that: (1) The applied SMOTEENN is more effective than SMOTE and ADASYN in dealing with the imbalance of banking data. (2) The F1 and AUC values of the model improved and optimized by XGBoost using genetic algorithm can reach 90% and 99%, respectively, which are optimal compared to other six machine learning models. The GA-XGBoost classifier was identified as the best solution for the customer churn problem. (3) Using Shapley values, we explain how each feature affects the model results, and analyze the features that have a high impact on the model prediction, such as the total number of transactions in the past year, the amount of transactions in the past year, the number of products owned by customers, and the total sales balance. 
The contribution of this paper is mainly in two aspects: (1) this study can provide useful information from the black box model based on the accurate identification of churned customers, which can provide reference for commercial banks to improve their service quality and retain customers; (2) it can provide reference for customer churn early warning models of other related industries, which can help the banking industry to maintain customer stability, maintain market position and reduce corporate losses.
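
The description reports model quality as F1 and AUC. As a reminder of what the F1 figure measures, here is a minimal sketch that computes it from confusion-matrix counts. The counts are invented for illustration and are not results from the study; the function name `f1_from_counts` is likewise hypothetical.

```python
def f1_from_counts(tp, fp, fn):
    """F1 is the harmonic mean of precision and recall.

    tp: churners correctly flagged, fp: non-churners wrongly flagged,
    fn: churners the model missed.
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Invented counts: precision and recall are both 0.9 here,
# so the F1 score is also approximately 0.9.
score = f1_from_counts(tp=90, fp=10, fn=10)
```

Because F1 ignores true negatives, it is less flattering than plain accuracy on imbalanced data such as churn records, where most customers stay.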
