28 datasets found
  1. Bank Churn (test)

    • kaggle.com
    Updated Jan 21, 2024
    Cite
    Harshit Sharma (2024). Bank Churn (test) [Dataset]. https://www.kaggle.com/datasets/harshitstark/bank-churn-dataset-test
    Dataset updated: Jan 21, 2024
    Dataset provided by: Kaggle
    Authors: Harshit Sharma
    License: MIT License https://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    Explore the 'Bank Churn (test)' dataset, a comprehensive collection designed for evaluating predictive models and analyzing customer attrition in the banking sector. This test dataset, derived from real-world scenarios, offers a robust platform to assess the effectiveness of machine learning algorithms in predicting and understanding bank churn dynamics.

  2. Bank Churn Prediction

    • kaggle.com
    Updated Jan 23, 2024
    Cite
    willian oliveira gibin (2024). Bank Churn Prediction [Dataset]. http://doi.org/10.34740/kaggle/dsv/7466166
    Dataset updated: Jan 23, 2024
    Dataset provided by: Kaggle
    Authors: willian oliveira gibin
    License: https://creativecommons.org/publicdomain/zero/1.0/

    Description

    In the synthetic dataset for the Playground Series S4 E1 Binary Classification with a Bank Churn Dataset, various features have been engineered to capture relevant information about customers. The dataset includes label-encoded surnames and features derived from them using the TF-IDF vectorizer. The credit score serves as a numerical representation of a customer's creditworthiness, while the geography feature indicates the country of residence, with one-hot encoding for France, Spain, and Germany.

    Gender is represented with one-hot encoding for male and female categories. Age, tenure, balance, and the number of products used by the customer offer insights into their banking behavior. The presence of a credit card and active membership status are included as binary features, alongside the customer's estimated salary.

    Notable engineered features provide additional insights. Mem_no_Products is the product of the number of products and active membership status, offering a combined metric. Cred_Bal_Sal represents the ratio of the product of credit score and balance to estimated salary, providing a relative measure of financial health. The balance-to-salary ratio (Bal_sal) and the tenure-to-age ratio (Tenure_Age) offer further dimensions for analysis. Finally, Age_Tenure_product is a feature capturing the interaction between age and tenure.
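
The engineered features described above can be sketched in a few lines of plain Python. This is only an illustration, not the dataset author's code: the raw column names (`NumOfProducts`, `IsActiveMember`, `CreditScore`, `Balance`, `EstimatedSalary`, `Tenure`, `Age`) are assumed from the usual bank churn schema, the example row is made up, and the sketch assumes `EstimatedSalary` and `Age` are non-zero.

```python
def add_engineered_features(row: dict) -> dict:
    """Return a copy of `row` with the five engineered features added.

    Assumes the raw columns named below exist and that EstimatedSalary
    and Age are non-zero (no guard against division by zero here).
    """
    out = dict(row)
    # Product of the number of products and active membership status.
    out["Mem_no_Products"] = row["NumOfProducts"] * row["IsActiveMember"]
    # (credit score * balance) relative to estimated salary.
    out["Cred_Bal_Sal"] = (row["CreditScore"] * row["Balance"]) / row["EstimatedSalary"]
    # Balance-to-salary ratio.
    out["Bal_sal"] = row["Balance"] / row["EstimatedSalary"]
    # Tenure-to-age ratio.
    out["Tenure_Age"] = row["Tenure"] / row["Age"]
    # Interaction between age and tenure.
    out["Age_Tenure_product"] = row["Age"] * row["Tenure"]
    return out


# Made-up customer used only to exercise the function.
example = {
    "CreditScore": 600,
    "Balance": 50000.0,
    "EstimatedSalary": 100000.0,
    "NumOfProducts": 2,
    "IsActiveMember": 1,
    "Tenure": 4,
    "Age": 40,
}
engineered = add_engineered_features(example)
```

If the published files use different raw column names, only the dictionary keys need to change.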

    The target variable, 'Exited,' indicates whether a customer has churned, with a value of 1 for churned customers and 0 for those who have not. This dataset, with its diverse set of features and engineered metrics, provides a comprehensive foundation for binary classification tasks, enabling the exploration of factors influencing customer churn in the banking domain. Analysts and data scientists can leverage these features to build predictive models and gain insights into the dynamics of customer retention.

  3. Bank Customer Churn

    • kaggle.com
    Updated Mar 14, 2025
    Cite
    CAT Reloaded || Data Science circle (2025). Bank Customer Churn [Dataset]. https://www.kaggle.com/datasets/cat-reloaded-data-science/bank-customer-churn
    Dataset updated: Mar 14, 2025
    Dataset provided by: Kaggle http://kaggle.com/
    Authors: CAT Reloaded || Data Science circle
    License: Apache License, v2.0 https://www.apache.org/licenses/LICENSE-2.0
    License information was derived automatically

    Description

    The Bank Customer Churn dataset describes customers of a bank who have either left (churned) or stayed. It is typically used for predictive modeling to identify the patterns and factors that lead to customer churn, so that banks can take proactive measures to retain customers.

    • id: Unique identifier for each customer.

    • CustomerId: Unique identifier for the customer account.

    • Surname: Last name of the customer.

    • CreditScore: Numeric representation of the customer's creditworthiness.

    • Geography (str): Country or region where the customer resides.

    • Gender (str): Gender of the customer (e.g., Male, Female).

    • Age: Age of the customer.

    • Tenure: Number of years the customer has been with the bank.

    • Balance: Current balance in the customer's account.

    • NumOfProducts: Number of bank products the customer uses.

    • HasCrCard: Binary indicator (0 or 1) for whether the customer has a credit card.

    • IsActiveMember: Binary indicator (0 or 1) for whether the customer is an active member.

    • EstimatedSalary: Estimated salary of the customer.

    • Exited: Binary indicator (0 or 1) for whether the customer has churned (the target).
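
As a rough sketch of how the columns listed above might be turned into model inputs, the snippet below drops the identifier columns, one-hot encodes the two string columns, and separates the `Exited` target. Dropping `Surname` and the category levels shown (`France`, `Germany`, `Spain`; `Female`, `Male`) are assumptions made for illustration, as is the example row; none of this is taken from the dataset's own documentation.

```python
# Columns treated as identifiers (an assumption: they carry no predictive signal).
ID_COLUMNS = ("id", "CustomerId", "Surname")
TARGET = "Exited"

# Assumed category levels; check them against the actual file before use.
CATEGORIES = {
    "Geography": ("France", "Germany", "Spain"),
    "Gender": ("Female", "Male"),
}


def split_features_target(row: dict, categories: dict = CATEGORIES):
    """Return (features, target) for one customer record."""
    features = {}
    for name, value in row.items():
        if name in ID_COLUMNS or name == TARGET:
            continue  # identifiers and the label are not features
        if name in categories:
            # One 0/1 indicator per known level of the categorical column.
            for level in categories[name]:
                features[f"{name}_{level}"] = 1 if value == level else 0
        else:
            features[name] = value
    return features, row[TARGET]


# Made-up record used only to exercise the function.
record = {
    "id": 0, "CustomerId": 15600000, "Surname": "Example",
    "CreditScore": 650, "Geography": "France", "Gender": "Male",
    "Age": 35, "Tenure": 5, "Balance": 0.0, "NumOfProducts": 2,
    "HasCrCard": 1, "IsActiveMember": 0, "EstimatedSalary": 80000.0,
    "Exited": 0,
}
features, target = split_features_target(record)
```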

  4. ‘Bank Turnover Dataset’ analyzed by Analyst-2

    • analyst-2.ai
    Updated Jan 28, 2022
    Cite
    Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com) (2022). ‘Bank Turnover Dataset’ analyzed by Analyst-2 [Dataset]. https://analyst-2.ai/analysis/kaggle-bank-turnover-dataset-db8f/latest
    Dataset updated: Jan 28, 2022
    Dataset authored and provided by: Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com)
    License: Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Analysis of ‘Bank Turnover Dataset’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://www.kaggle.com/barelydedicated/bank-customer-churn-modeling on 28 January 2022.

    --- No further description of dataset provided by original source ---

    --- Original source retains full ownership of the source dataset ---

  5. Comparison of models test results.

    • plos.figshare.com
    xls
    Updated Dec 8, 2023
    + more versions
    Cite
    Ke Peng; Yan Peng; Wenguang Li (2023). Comparison of models test results. [Dataset]. http://doi.org/10.1371/journal.pone.0289724.t009
    Available download formats: xls
    Dataset updated: Dec 8, 2023
    Dataset provided by: PLOS ONE
    Authors: Ke Peng; Yan Peng; Wenguang Li
    License: Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    In recent years, with the continuous improvement of the financial system and the rapid development of the banking industry, the competition of the banking industry itself has intensified. At the same time, with the rapid development of information technology and Internet technology, customers’ choice of financial products is becoming more and more diversified, and customers’ dependence and loyalty to banking institutions is becoming less and less, and the problem of customer churn in commercial banks is becoming more and more prominent. How to predict customer behavior and retain existing customers has become a major challenge for banks to solve. Therefore, this study takes a bank’s business data on Kaggle platform as the research object, uses multiple sampling methods to compare the data for balancing, constructs a bank customer churn prediction model for churn identification by GA-XGBoost, and conducts interpretability analysis on the GA-XGBoost model to provide decision support and suggestions for the banking industry to prevent customer churn. The results show that: (1) The applied SMOTEENN is more effective than SMOTE and ADASYN in dealing with the imbalance of banking data. (2) The F1 and AUC values of the model improved and optimized by XGBoost using genetic algorithm can reach 90% and 99%, respectively, which are optimal compared to other six machine learning models. The GA-XGBoost classifier was identified as the best solution for the customer churn problem. (3) Using Shapley values, we explain how each feature affects the model results, and analyze the features that have a high impact on the model prediction, such as the total number of transactions in the past year, the amount of transactions in the past year, the number of products owned by customers, and the total sales balance. 
The contribution of this paper is mainly in two aspects: (1) this study can provide useful information from the black box model based on the accurate identification of churned customers, which can provide reference for commercial banks to improve their service quality and retain customers; (2) it can provide reference for customer churn early warning models of other related industries, which can help the banking industry to maintain customer stability, maintain market position and reduce corporate losses.
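
The F1 and AUC metrics quoted in the abstract can be illustrated with a small standard-library sketch. It is not the authors' evaluation code; the labels, scores, and the 0.5 decision threshold are invented for the example, and the AUC is computed with the simple pairwise (rank-comparison) definition rather than any particular library routine.

```python
def f1_score(y_true, y_pred):
    """F1 for the positive class (label 1): harmonic mean of precision and recall."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def roc_auc(y_true, scores):
    """AUC as the probability that a random churner outscores a random non-churner."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")
    # Ties between a positive and a negative score count as half a win.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Invented toy data: 1 = churned, 0 = stayed.
y_true = [1, 0, 1, 1, 0, 0]
scores = [0.9, 0.2, 0.6, 0.4, 0.5, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

f1 = f1_score(y_true, y_pred)
auc = roc_auc(y_true, scores)
```

F1 depends on the chosen threshold, while AUC is computed from the raw scores, which is why the two figures in the abstract can differ substantially.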

  6. Details of feature variables of the data set.

    • plos.figshare.com
    xls
    Updated Dec 8, 2023
    Cite
    Ke Peng; Yan Peng; Wenguang Li (2023). Details of feature variables of the data set. [Dataset]. http://doi.org/10.1371/journal.pone.0289724.t002
    Available download formats: xls
    Dataset updated: Dec 8, 2023
    Dataset provided by: PLOS ONE
    Authors: Ke Peng; Yan Peng; Wenguang Li
    License: Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

  7. ‘Churn for Bank Customers’ analyzed by Analyst-2

    • analyst-2.ai
    Updated Jan 28, 2022
    Cite
    Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com) (2022). ‘Churn for Bank Customers’ analyzed by Analyst-2 [Dataset]. https://analyst-2.ai/analysis/kaggle-churn-for-bank-customers-2e90/7961ea42/?iid=013-142&v=presentation
    Dataset updated: Jan 28, 2022
    Dataset authored and provided by: Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com)
    License: Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Analysis of ‘Churn for Bank Customers’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://www.kaggle.com/mathchi/churn-for-bank-customers on 28 January 2022.

    --- Dataset description provided by original source is as follows ---

    Content

    • RowNumber: corresponds to the record (row) number and has no effect on the output.
    • CustomerId: contains random values and has no effect on a customer leaving the bank.
    • Surname: the surname of a customer has no impact on their decision to leave the bank.
    • CreditScore: can have an effect on customer churn, since a customer with a higher credit score is less likely to leave the bank.
    • Geography: a customer’s location can affect their decision to leave the bank.
    • Gender: it’s interesting to explore whether gender plays a role in a customer leaving the bank.
    • Age: this is certainly relevant, since older customers are less likely to leave their bank than younger ones.
    • Tenure: refers to the number of years that the customer has been a client of the bank. Normally, older clients are more loyal and less likely to leave a bank.
    • Balance: also a very good indicator of customer churn, as people with a higher balance in their accounts are less likely to leave the bank compared to those with lower balances.
    • NumOfProducts: refers to the number of products that a customer has purchased through the bank.
    • HasCrCard: denotes whether or not a customer has a credit card. This column is also relevant, since people with a credit card are less likely to leave the bank.
    • IsActiveMember: active customers are less likely to leave the bank.
    • EstimatedSalary: as with balance, people with lower salaries are more likely to leave the bank compared to those with higher salaries.
    • Exited: whether or not the customer left the bank.
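
Several of the column notes above make directional claims (for example, that active members are less likely to leave). One way to check such a claim against the data is to compare churn rates between groups. The sketch below assumes the rows have been loaded as dictionaries with the column names listed above and that `Exited` is coded 0/1; the toy rows are invented and say nothing about the real dataset.

```python
from collections import defaultdict


def churn_rate_by(rows, column):
    """Return {group value: share of customers with Exited == 1} for `column`."""
    counts = defaultdict(lambda: [0, 0])  # group value -> [exited, total]
    for row in rows:
        bucket = counts[row[column]]
        bucket[0] += row["Exited"]  # assumes Exited is coded as 0 or 1
        bucket[1] += 1
    return {value: exited / total for value, (exited, total) in counts.items()}


# Invented rows, only to show the shape of the input and output.
toy_rows = [
    {"IsActiveMember": 1, "Exited": 0},
    {"IsActiveMember": 1, "Exited": 0},
    {"IsActiveMember": 1, "Exited": 1},
    {"IsActiveMember": 0, "Exited": 1},
    {"IsActiveMember": 0, "Exited": 1},
    {"IsActiveMember": 0, "Exited": 0},
]
rates = churn_rate_by(toy_rows, "IsActiveMember")
```

The same function can be pointed at `HasCrCard`, `Geography`, or `Gender`; numeric columns such as `Balance` would need to be binned first.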

    Acknowledgements

    As we know, it is much more expensive to sign up a new client than to keep an existing one.

    It is advantageous for banks to know what leads a client towards the decision to leave the company.

    Churn prevention allows companies to develop loyalty programs and retention campaigns to keep as many customers as possible.

    --- Original source retains full ownership of the source dataset ---

  8. Bank Customer Churn Data

    • kaggle.com
    Updated Nov 3, 2023
    Cite
    Penta Krishna Kishore (2023). Bank Customer Churn Data [Dataset]. https://www.kaggle.com/datasets/pentakrishnakishore/bank-customer-churn-data/code
    Dataset updated: Nov 3, 2023
    Dataset provided by: Kaggle http://kaggle.com/
    Authors: Penta Krishna Kishore
    License: Apache License, v2.0 https://www.apache.org/licenses/LICENSE-2.0
    License information was derived automatically

    Description

    The churn prediction dataset contains raw data on 28,382 customers. It includes the following columns:

    • customer_id: Unique identifier for each customer.
    • vintage: The duration of the customer's relationship with the company.
    • age: Age of the customer.
    • gender: Gender of the customer.
    • dependents: Number of dependents the customer has.
    • occupation: The occupation of the customer.
    • city: City in which the customer is located.
    • customer_nw_category: Net worth category of the customer.
    • branch_code: Code identifying the branch associated with the customer.
    • current_balance: Current balance in the customer's account.
    • previous_month_end_balance: Account balance at the end of the previous month.
    • average_monthly_balance_prevQ: Average monthly balance in the previous quarter.
    • average_monthly_balance_prevQ2: Average monthly balance in the second previous quarter.
    • current_month_credit: Credit amount in the current month.
    • previous_month_credit: Credit amount in the previous month.
    • current_month_debit: Debit amount in the current month.
    • previous_month_debit: Debit amount in the previous month.
    • current_month_balance: Account balance in the current month.
    • previous_month_balance: Account balance in the previous month.
    • churn: The target variable indicating whether the customer has churned (1 for churned, 0 for not churned).
    • last_transaction: Timestamp of the customer's last transaction.

    This dataset provides a comprehensive view of various attributes related to the customers' banking activities. With these features, it becomes possible to build predictive models to identify potential churners based on historical and current customer behavior. The dataset's size allows for robust analysis and modeling to improve customer retention strategies.

  9. ‘Bank Customers Churn’ analyzed by Analyst-2

    • analyst-2.ai
    Updated Nov 13, 2021
    Cite
    Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com) (2021). ‘Bank Customers Churn ’ analyzed by Analyst-2 [Dataset]. https://analyst-2.ai/analysis/kaggle-bank-customers-churn-bbf0/7d7c24bc/?iid=029-116&v=presentation
    Dataset updated: Nov 13, 2021
    Dataset authored and provided by: Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com)
    License: Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Analysis of ‘Bank Customers Churn ’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://www.kaggle.com/santoshd3/bank-customers on 30 September 2021.

    --- Dataset description provided by original source is as follows ---

    Context

    A dataset containing customers who are withdrawing their accounts from the bank due to losses and other issues. With the help of this data, we try to analyse the churn and maintain accuracy.

    --- Original source retains full ownership of the source dataset ---

  10. Confusion matrix.

    • plos.figshare.com
    xls
    Updated Dec 8, 2023
    Cite
    Ke Peng; Yan Peng; Wenguang Li (2023). Confusion matrix. [Dataset]. http://doi.org/10.1371/journal.pone.0289724.t004
    Available download formats: xls
    Dataset updated: Dec 8, 2023
    Dataset provided by: PLOS ONE
    Authors: Ke Peng; Yan Peng; Wenguang Li
    License: Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

  11. bank-churn-synthetic

    • huggingface.co
    Updated Apr 2, 2024
    Cite
    Kevin Jiang (2024). bank-churn-synthetic [Dataset]. https://huggingface.co/datasets/kevin50jiang/bank-churn-synthetic
    Dataset updated: Apr 2, 2024
    Authors: Kevin Jiang
    License: Attribution-ShareAlike 4.0 (CC BY-SA 4.0) https://creativecommons.org/licenses/by-sa/4.0/
    License information was derived automatically

    Description

    Collated dataset for LLM training on the dataset for https://www.kaggle.com/competitions/playground-series-s4e1/data

  12. Bank churn dataset

    • kaggle.com
    Updated Jan 23, 2024
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Ali Hamza01 (2024). Bank churn dataset [Dataset]. https://www.kaggle.com/datasets/alihamza01/bank-churn-dataset/suggestions
    Dataset updated: Jan 23, 2024
    Dataset provided by: Kaggle http://kaggle.com/
    Authors: Ali Hamza01
    License: Apache License, v2.0 https://www.apache.org/licenses/LICENSE-2.0
    License information was derived automatically

    Description

    This dataset was created by Ali Hamza01 and released under Apache 2.0.

  13. Comparison of GA-XGBoost with XGBoost and LightGBM test results.

    • figshare.com
    xls
    Updated Dec 8, 2023
    Cite
    Ke Peng; Yan Peng; Wenguang Li (2023). Comparison of GA-XGBoost with XGBoost and LightGBM test results. [Dataset]. http://doi.org/10.1371/journal.pone.0289724.t008
    Available download formats: xls
    Dataset updated: Dec 8, 2023
    Dataset provided by: PLOS ONE
    Authors: Ke Peng; Yan Peng; Wenguang Li
    License: Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Comparison of GA-XGBoost with XGBoost and LightGBM test results.

  14. BANK CUSTOMER CHURN

    • kaggle.com
    Updated Jan 20, 2024
    Cite
    NITANT TYAGI (2024). BANK CUSTOMER CHURN [Dataset]. https://www.kaggle.com/nitanttyagi/bank-customer-churn
    Dataset updated: Jan 20, 2024
    Dataset provided by: Kaggle http://kaggle.com/
    Authors: NITANT TYAGI
    License: Apache License, v2.0 https://www.apache.org/licenses/LICENSE-2.0
    License information was derived automatically

    Description

    This dataset was created by NITANT TYAGI and released under Apache 2.0.

  15. Bank Customer Churn Model

    • kaggle.com
    Updated Feb 17, 2024
    Cite
    Shashank Moon (2024). Bank Customer Churn Model [Dataset]. https://www.kaggle.com/datasets/shashankmoon/bank-customer-churn-model/code
    Dataset updated: Feb 17, 2024
    Dataset provided by: Kaggle http://kaggle.com/
    Authors: Shashank Moon
    Description

    This dataset was created by Shashank Moon.

  16. The summary of the literature review.

    • plos.figshare.com
    xls
    Updated Dec 8, 2023
    Cite
    Ke Peng; Yan Peng; Wenguang Li (2023). The summary of the literature review. [Dataset]. http://doi.org/10.1371/journal.pone.0289724.t001
    Available download formats: xls
    Dataset updated: Dec 8, 2023
    Dataset provided by: PLOS ONE
    Authors: Ke Peng; Yan Peng; Wenguang Li
    License: Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

  17. Bank Churn Dataset

    • kaggle.com
    Updated Jan 22, 2024
    Cite
    Charlamagne (2024). Bank Churn Dataset [Dataset]. https://www.kaggle.com/datasets/firmanhasibuan1/bank-churn-dataset/code
    Dataset updated: Jan 22, 2024
    Dataset provided by: Kaggle http://kaggle.com/
    Authors: Charlamagne
    License: Apache License, v2.0 https://www.apache.org/licenses/LICENSE-2.0
    License information was derived automatically

    Description

    This dataset was created by Charlamagne and released under Apache 2.0.

  18. ‘Churn Modelling’ analyzed by Analyst-2

    • analyst-2.ai
    Updated May 22, 2019
    Cite
    Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com) (2019). ‘Churn Modelling’ analyzed by Analyst-2 [Dataset]. https://analyst-2.ai/analysis/kaggle-churn-modelling-0f30/9c1f1166/?iid=013-176&v=presentation
    Dataset updated: May 22, 2019
    Dataset authored and provided by: Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com)
    License: Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Analysis of ‘Churn Modelling’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://www.kaggle.com/shrutimechlearn/churn-modelling on 28 January 2022.

    --- Dataset description provided by original source is as follows ---

    Content

    This data set contains details of a bank's customers. The target is a binary variable indicating whether the customer left the bank (closed their account) or remains a customer.

    Acknowledgements

    Big thanks to https://www.superdatascience.com/pages/deep-learning. Banner photo by Sharon McCutcheon on Unsplash.

    --- Original source retains full ownership of the source dataset ---

  19. Performance comparison of different adoption algorithms in XGBoost model.

    • plos.figshare.com
    xls
    Updated Dec 8, 2023
    Cite
    Ke Peng; Yan Peng; Wenguang Li (2023). Performance comparison of different adoption algorithms in XGBoost model. [Dataset]. http://doi.org/10.1371/journal.pone.0289724.t005
    Available download formats: xls
    Dataset updated: Dec 8, 2023
    Dataset provided by: PLOS http://plos.org/
    Authors: Ke Peng; Yan Peng; Wenguang Li
    License: Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Performance comparison of different adoption algorithms in XGBoost model.

  20. Bank Churn Dataset

    • kaggle.com
    Updated Jan 25, 2025
    + more versions
    Cite
    Sanyam Jain (2025). Bank Churn Dataset [Dataset]. https://www.kaggle.com/datasets/sanyamjain404/bank-churn-dataset
    Dataset updated: Jan 25, 2025
    Dataset provided by: Kaggle http://kaggle.com/
    Authors: Sanyam Jain
    Description

    This dataset was created by Sanyam Jain.

Bank Churn (test)

Evaluating Customer Retention: Test Dataset for Bank Churn Analysis.
