11 datasets found

Data from: Customer Churn Dataset
ieee-dataport.org
Updated Jun 4, 2024
Cite
Usman JOY (2024). Customer Churn Dataset [Dataset]. https://ieee-dataport.org/documents/customer-churn-dataset
Dataset updated
Jun 4, 2024
Authors
Usman JOY
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
259
Bank Customer Churn Dataset
kaggle.com
Updated Jul 11, 2023
Cite
Bhuvi Ranga (2023). Bank Customer Churn Dataset [Dataset]. https://www.kaggle.com/datasets/bhuviranga/customer-churn-data
Dataset updated
Jul 11, 2023
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Bhuvi Ranga
License
http://opendatacommons.org/licenses/dbcl/1.0/
Description
The customer churn dataset is a collection of customer data that focuses on predicting customer churn, which refers to the tendency of customers to stop using a company's products or services. The dataset contains various features that describe each customer, such as their credit score, country, gender, age, tenure, balance, number of products, credit card status, active membership, estimated salary, and churn status. The churn status indicates whether a customer has churned or not. The dataset is used to analyze and understand factors that contribute to customer churn and to build predictive models to identify customers at risk of churning. The goal is to develop strategies and interventions to reduce churn and improve customer retention.
E-commerce Customer Churn
kaggle.com
Updated Aug 6, 2024
Cite
Samuel Semaya (2024). E-commerce Customer Churn [Dataset]. https://www.kaggle.com/datasets/samuelsemaya/e-commerce-customer-churn
Dataset updated
Aug 6, 2024
Dataset provided by
Kaggle
Authors
Samuel Semaya
License
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
Description
E-commerce Customer Churn Dataset

Context

This dataset belongs to a leading online E-commerce company. The company wants to identify customers who are likely to churn, so they can proactively approach these customers with promotional offers.

Content

The dataset contains various features related to customer behavior and characteristics, which can be used to predict customer churn.

Features

Tenure: Tenure of a customer in the company (numeric)

WarehouseToHome: Distance from the warehouse to the customer's home (numeric)

NumberOfDeviceRegistered: Total number of devices registered to a particular customer (numeric)

PreferedOrderCat: Preferred order category of a customer in the last month (categorical)

SatisfactionScore: Satisfaction score of a customer on service (numeric)

MaritalStatus: Marital status of a customer (categorical)

NumberOfAddress: Total number of addresses added for a particular customer (numeric)

Complaint: Whether any complaint has been raised in the last month (binary)

DaySinceLastOrder: Days since last order by customer (numeric)

CashbackAmount: Average cashback in last month (numeric)

Churn: Churn flag (target variable, binary)

Task

The main task is to predict customer churn based on the given features. This is a binary classification problem where the target variable is 'Churn'.
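
As a rough illustration of the task setup described above, the sketch below separates the 'Churn' target from the feature columns and computes the churn rate per value of one feature. Only the column names come from the feature list; the sample rows and the helper names `split_features_target` and `churn_rate_by` are made up for this example.

```python
import csv
import io

# Hypothetical sample rows; only the column names come from the feature list.
SAMPLE_CSV = """Tenure,Complaint,SatisfactionScore,Churn
1,1,5,1
12,0,3,0
3,1,4,1
24,0,2,0
6,1,1,0
"""


def split_features_target(rows, target="Churn"):
    """Separate the binary target column from the feature columns."""
    features, labels = [], []
    for row in rows:
        labels.append(int(row[target]))
        features.append({k: float(v) for k, v in row.items() if k != target})
    return features, labels


def churn_rate_by(features, labels, column):
    """Share of churned customers for each distinct value of one feature."""
    totals, churned = {}, {}
    for x, y in zip(features, labels):
        key = x[column]
        totals[key] = totals.get(key, 0) + 1
        churned[key] = churned.get(key, 0) + y
    return {key: churned[key] / totals[key] for key in totals}


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
X, y = split_features_target(rows)
rates = churn_rate_by(X, y, "Complaint")
print(rates)
```

A per-feature churn rate like this is only a first look at the data; it does not replace fitting and validating a classifier.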

Potential Applications

Customer Retention: Identify at-risk customers and take proactive measures to retain them.

Targeted Marketing: Design specific marketing campaigns for customers likely to churn.

Service Improvement: Analyze features contributing to churn and improve those aspects of the service.

Acknowledgements

This dataset is provided for educational purposes. While it represents a real-world scenario, the data itself may be simulated or anonymized.
WA_Fn-UseC_-Telco-Customer-Churn
ieee-dataport.org
Updated Feb 19, 2024
Cite
Mengjing Hao (2024). WA_Fn-UseC_-Telco-Customer-Churn [Dataset]. https://ieee-dataport.org/documents/wafn-usec-telco-customer-churn
Dataset updated
Feb 19, 2024
Authors
Mengjing Hao
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Nowadays
‘Customer Churn’ analyzed by Analyst-2
analyst-2.ai
Updated Mar 5, 2018
Cite
Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com) (2018). ‘Customer Churn’ analyzed by Analyst-2 [Dataset]. https://analyst-2.ai/analysis/kaggle-customer-churn-4f0b/a31eb722/?iid=005-077&v=presentation
Dataset updated
Mar 5, 2018
Dataset authored and provided by
Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com)
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Analysis of ‘Customer Churn’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://www.kaggle.com/hassanamin/customer-churn on 14 February 2022.

--- Dataset description provided by original source is as follows ---

Binary Customer Churn

A marketing agency has many customers that use its service to produce ads for the customers' websites. The agency has noticed quite a bit of churn among these clients. Account managers are currently assigned more or less at random, but the agency wants you to create a machine learning model that predicts which customers will churn (stop buying the service), so that it can assign an account manager to the customers most at risk of churning. Luckily, some historical data is available. Create a classification algorithm that classifies whether or not a customer churned. The company can then apply it to incoming data for future customers to predict which of them will churn and assign them an account manager.

Content

The data is saved as customer_churn.csv. Here are the fields and their definitions:

Name: Name of the latest contact at the company

Age: Customer age

Total_Purchase: Total ads purchased

Account_Manager: Binary (0 = no manager, 1 = account manager assigned)

Years: Total years as a customer

Num_sites: Number of websites that use the service.

Onboard_date: Date that the name of the latest contact was onboarded

Location: Client HQ Address

Company: Name of Client Company

Once you've created the model and evaluated it, test out the model on some new data (you can think of this almost like a hold-out set) that your client has provided, saved under new_customers.csv. The client wants to know which customers are most likely to churn given this data (they don't have the label yet).
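
The train-then-score workflow described above can be sketched without any external libraries. The example below is an illustration under stated assumptions, not the dataset author's method: it uses a hand-written logistic regression, only the Years and Num_sites fields, a hypothetical 0/1 churn label, and made-up rows standing in for customer_churn.csv and new_customers.csv.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def standardize(rows, stats=None):
    """Scale columns to zero mean and unit variance using training statistics."""
    if stats is None:
        cols = list(zip(*rows))
        means = [sum(c) / len(c) for c in cols]
        stds = [math.sqrt(sum((v - m) ** 2 for v in c) / len(c)) or 1.0
                for c, m in zip(cols, means)]
        stats = (means, stds)
    means, stds = stats
    scaled = [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]
    return scaled, stats


def fit_logistic(X, y, lr=0.1, epochs=2000):
    """Batch gradient descent on the logistic log-loss."""
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * len(w), 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            grad_w = [g + err * xj for g, xj in zip(grad_w, xi)]
            grad_b += err
        w = [wj - lr * g / len(X) for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / len(X)
    return w, b


def churn_probability(model, xi):
    """Predicted probability of churn for one standardized row."""
    w, b = model
    return sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b)


# Made-up historical rows: [Years, Num_sites] with a hypothetical churn label.
train_X = [[5.0, 8], [6.0, 7], [4.5, 9], [5.5, 8],
           [5.0, 12], [6.5, 13], [4.0, 11], [5.5, 14]]
train_y = [0, 0, 0, 0, 1, 1, 1, 1]

# Made-up unlabeled rows standing in for new_customers.csv.
new_customers = {"Company A": [5.0, 15], "Company B": [5.0, 6]}

scaled_X, stats = standardize(train_X)
model = fit_logistic(scaled_X, train_y)

scored = {}
for name, row in new_customers.items():
    scaled_row, _ = standardize([row], stats)
    scored[name] = churn_probability(model, scaled_row[0])

# Customers most likely to churn come first.
ranked = sorted(scored.items(), key=lambda item: item[1], reverse=True)
for name, prob in ranked:
    print(f"{name}: {prob:.3f}")
```

In practice a library classifier with a proper validation split would replace the hand-written model; the point here is only the shape of the workflow: fit on labeled history, then rank unlabeled customers by predicted risk.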

Acknowledgements

We wouldn't be here without the help of others. If you owe any attributions or thanks, include them here along with any citations of past research.

Inspiration

Your data will be in front of the world's largest data science community. What questions do you want to see answered?

--- Original source retains full ownership of the source dataset ---
Telco_Customer_churn_Data
test.researchdata.tuwien.at
bin, csv, png
Updated Apr 28, 2025
Cite
Erum Naz (2025). Telco_Customer_churn_Data [Dataset]. http://doi.org/10.82556/b0ch-cn44
Available download formats: png, csv, bin
Unique identifier
https://doi.org/10.82556/b0ch-cn44
Dataset updated
Apr 28, 2025
Dataset provided by
TU Wien
Authors
Erum Naz
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Time period covered
Apr 28, 2025
Description
Context and Methodology

The dataset originates from the research domain of Customer Churn Prediction in the Telecom Industry. It was created as part of the project "Data-Driven Churn Prediction: ML Solutions for the Telecom Industry," completed within the Data Stewardship course (Master programme Data Science, TU Wien).

The primary purpose of this dataset is to support machine learning model development for predicting customer churn based on customer demographics, service usage, and account information.
The dataset enables the training, testing, and evaluation of classification algorithms, allowing researchers and practitioners to explore techniques for customer retention optimization.

The dataset was originally obtained from the IBM Accelerator Catalog and adapted for academic use. It was uploaded to TU Wien’s DBRepo test system and accessed via SQLAlchemy connections to the MariaDB environment.

Technical Details

The dataset has a tabular structure and was initially stored in CSV format. It contains:

Rows: 7,043 customer records

Columns: 21 features including customer attributes (gender, senior citizen status, partner status), account information (tenure, contract type, payment method), service usage (internet service, streaming TV, tech support), and the target variable (Churn: Yes/No).

Naming Convention:

The table in the database is named telco_customer_churn_data.

Software Requirements:

To open and work with the dataset, any standard database client or programming language that supports MariaDB connections (e.g., Python) can be used.

For machine learning applications, libraries such as pandas, scikit-learn, and joblib are typically used.
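
A minimal sketch of querying the table and turning the Yes/No target into numeric labels follows. It uses Python's built-in sqlite3 as a stand-in for the MariaDB connection, so it is not the DBRepo setup itself. The table name telco_customer_churn_data comes from the description; the column names customerID, tenure, and Contract and all sample rows are assumptions made for illustration.

```python
import sqlite3

# In-memory SQLite database standing in for the MariaDB environment.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE telco_customer_churn_data "
    "(customerID TEXT, tenure INTEGER, Contract TEXT, Churn TEXT)"
)

# Made-up sample rows; the real table holds 7,043 records and 21 columns.
conn.executemany(
    "INSERT INTO telco_customer_churn_data VALUES (?, ?, ?, ?)",
    [
        ("0001", 1, "Month-to-month", "Yes"),
        ("0002", 34, "One year", "No"),
        ("0003", 2, "Month-to-month", "Yes"),
        ("0004", 45, "One year", "No"),
        ("0005", 8, "Month-to-month", "No"),
    ],
)

# Map the Yes/No target to 1/0 for binary classification.
rows = conn.execute(
    "SELECT tenure, Contract, Churn FROM telco_customer_churn_data "
    "ORDER BY customerID"
).fetchall()
labels = [1 if churn == "Yes" else 0 for _, _, churn in rows]

# Churn rate per contract type, computed in SQL.
churn_rate_by_contract = dict(
    conn.execute(
        "SELECT Contract, AVG(CASE WHEN Churn = 'Yes' THEN 1.0 ELSE 0.0 END) "
        "FROM telco_customer_churn_data GROUP BY Contract"
    ).fetchall()
)
conn.close()

print(labels)
print(churn_rate_by_contract)
```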

Additional Resources:

Source code for data loading, preprocessing, model training, and evaluation is available at the associated GitHub repository: https://github.com/nazerum/fair-ml-customer-churn

Further Details

When reusing the dataset, users should be aware:

Licensing: The dataset is shared under the Creative Commons Attribution 4.0 International (CC BY 4.0) license.

Use Case Suitability: The dataset is best suited for classification tasks, particularly binary classification (churn vs. no churn).

Metadata Standards: Metadata describing the dataset adheres to FAIR principles and is supplemented by CodeMeta and Croissant standards for improved interoperability.
Expresso Churn Prediction Challenge
kaggle.com
Updated Aug 30, 2021
Cite
Hamza (2021). Expresso Churn Prediction Challenge [Dataset]. https://www.kaggle.com/hamzaghanmi/expresso-churn-prediction-challenge/tasks
Dataset updated
Aug 30, 2021
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Hamza
Description
Context

This data was imported from the Zindi platform, where it was published as part of a competition. The objective of the competition is to develop a predictive model that determines the likelihood of a customer churning, that is, of no longer purchasing airtime and data from Expresso.

Content

The data describes 2.5 million Expresso clients.

Train.csv: contains information about 2 million customers. There is a column called CHURN that indicates if a client churned or did not churn. This is the target. You must estimate the likelihood that these clients churned. You will use this file to train your model.

Test.csv: is similar to train, but without the CHURN column. You will use this file to test your model on.

SampleSubmission.csv: is an example of what your submission should look like. The order of the rows does not matter but the name of the user_id must be correct.
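
A submission file in the spirit of SampleSubmission.csv could be written as sketched below. The exact layout is an assumption: the sketch uses a header of user_id and CHURN, one row per client, and a predicted churn probability in the second column. The identifiers, probabilities, and the helper name `write_submission` are made up.

```python
import csv
import io


def write_submission(predictions, out):
    """Write one row per client: user_id and predicted churn likelihood."""
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["user_id", "CHURN"])  # assumed header layout
    for user_id, prob in predictions.items():
        if not 0.0 <= prob <= 1.0:
            raise ValueError(f"probability out of range for {user_id}: {prob}")
        writer.writerow([user_id, f"{prob:.4f}"])


# Made-up identifiers and probabilities for illustration.
predictions = {"user_a": 0.12, "user_b": 0.87}

buffer = io.StringIO()
write_submission(predictions, buffer)
submission_lines = buffer.getvalue().splitlines()
print(submission_lines)
```

Because row order does not matter, the writer simply follows the order of the predictions mapping; what it does enforce is that every probability lies in the valid range.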
Club Data Set
kaggle.com
Updated Mar 4, 2020
Cite
so NaN (2020). Club Data Set [Dataset]. https://www.kaggle.com/sonannguyenngoc/club-data-set/metadata
Dataset updated
Mar 4, 2020
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
so NaN
Description
Context

A certain premium club boasts a large customer membership. The members pay an annual membership fee in return for using the exclusive facilities offered by this club. The fees are customized for every member's personal package. In the last few years, however, the club has been facing an issue with a lot of members cancelling their memberships. The club management plans to address this issue by proactively addressing customer grievances. They do not, however, have enough bandwidth to reach out to the entire customer base individually and are looking to see whether a statistical approach can help them identify customers at risk. Can you help them? Relevant historical data is provided in “club_churn_train.csv”.

Acknowledgements

Club Data Set

Inspiration

Club Data Set
cofinfad
huggingface.co
Cite
Luis David Trejos Rojas, cofinfad [Dataset]. http://doi.org/10.57967/hf/2942
Unique identifier
https://doi.org/10.57967/hf/2942
Authors
Luis David Trejos Rojas
License
https://choosealicense.com/licenses/odc-by/
Description
COFINFAD: Colombian Fintech Financial Analytics Dataset

COFINFAD (Colombian Fintech Financial Analytics Dataset) is a dataset containing almost 12 months of transactional and demographic data from an anonymous Colombian fintech company. This dataset is designed to facilitate research in customer behavior analysis, churn prediction, and financial pattern recognition in the Latin American fintech sector.

Files

customer_data.csv: Contains demographic, behavioral… See the full description on the dataset page: https://huggingface.co/datasets/luisdavidtrejosrojas/cofinfad.
BCG Data Science Simulation
kaggle.com
Updated Feb 12, 2025
Cite
PAVITR KUMAR SWAIN (2025). BCG Data Science Simulation [Dataset]. https://www.kaggle.com/datasets/pavitrkumar/bcg-data-science-simulation
Dataset updated
Feb 12, 2025
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
PAVITR KUMAR SWAIN
License
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
Description
Feature Engineering for Churn Prediction

🚀 BCG Data Science Job Simulation | Forage

This notebook focuses on feature engineering techniques to enhance a dataset for churn prediction modeling. As part of the BCG Data Science Job Simulation, I transformed raw customer data into valuable features to improve predictive performance.

📊 What’s Inside?

✅ Data Cleaning: Removing irrelevant columns to reduce noise

✅ Date-Based Feature Extraction: Converting raw dates into useful insights like activation year, contract length, and renewal month

✅ New Predictive Features:

consumption_trend → Measures if a customer’s last-month usage is increasing or decreasing

total_gas_and_elec → Aggregates total energy consumption

✅ Final Processed Dataset: Ready for churn prediction modeling
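
The date-based and consumption features listed above could be derived roughly as follows. This is a sketch under assumptions, not the notebook's actual code. The input field names (date_activ, date_end, date_renewal, cons_12m, cons_gas_12m, cons_last_month) are hypothetical, and consumption_trend is assumed here to be last-month usage minus the average monthly usage over the previous 12 months.

```python
from datetime import date


def engineer_features(record):
    """Derive date-based and consumption features from one raw customer record."""
    activated = date.fromisoformat(record["date_activ"])
    ends = date.fromisoformat(record["date_end"])
    renewal = date.fromisoformat(record["date_renewal"])

    # Assumed definition: positive means last month was above the 12-month average.
    avg_monthly = record["cons_12m"] / 12
    return {
        "activation_year": activated.year,
        "contract_length_days": (ends - activated).days,
        "renewal_month": renewal.month,
        "consumption_trend": record["cons_last_month"] - avg_monthly,
        "total_gas_and_elec": record["cons_12m"] + record["cons_gas_12m"],
    }


# Made-up record with hypothetical field names.
raw = {
    "date_activ": "2013-06-15",
    "date_end": "2016-06-15",
    "date_renewal": "2015-06-23",
    "cons_12m": 12000,
    "cons_gas_12m": 2400,
    "cons_last_month": 800,
}

features = engineer_features(raw)
print(features)
```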

📂 Dataset Used:

📌 clean_data_after_eda.csv → Original dataset after Exploratory Data Analysis (EDA)

📌 clean_data_with_new_features.csv → Final dataset after feature engineering

🛠 Technologies Used:

🔹 Python (Pandas, NumPy)

🔹 Data Preprocessing & Feature Engineering

🌟 Why Feature Engineering? Feature engineering is one of the most critical steps in machine learning. Well-engineered features improve model accuracy and uncover deeper insights into customer behavior.

🚀 This notebook is a great reference for anyone learning data preprocessing, feature selection, and predictive modeling in Data Science!

📩 Connect with Me:

🔗 GitHub Repo: https://github.com/Pavitr-Swain/BCG-Data-Science-Job-Simulation

💼 LinkedIn: https://www.linkedin.com/in/pavitr-kumar-swain-ab708b227/

🔍 Let’s explore churn prediction insights together! 🎯

Airline Loyalty Program (Canada)

kaggle.com

Updated May 28, 2025

Cite

Siddharth Vora (2025). Airline Loyalty Program (Canada) [Dataset]. https://www.kaggle.com/datasets/siddharth0935/airline-loyalty-program

Dataset updated

May 28, 2025

Dataset provided by

Kaggle

Authors

Siddharth Vora

License

https://creativecommons.org/publicdomain/zero/1.0/

Area covered

Canada

Description

Airline Loyalty Program Promotion Dataset

This dataset contains information about customer activity and demographics related to an airline's loyalty program, including a promotional campaign aimed at enhancing program enrollment.

Files

1. Customer Flight Activity.csv

Field	Description
Loyalty Number	Customer's unique loyalty number
Year	Year of the period
Month	Month of the period
Flights Booked	Number of flights booked for member only in the period
Flights with Companions	Number of flights booked with additional passengers in the period
Total Flights	Sum of Flights Booked and Flights with Companions
Distance	Flight distance traveled in the period (km)
Points Accumulated	Loyalty points accumulated in the period
Points Redeemed	Loyalty points redeemed in the period
Dollar Cost Points Redeemed	Dollar equivalent for points redeemed in the period in CDN

2. Customer Loyalty History.csv

Field	Description
Loyalty Number	Customer's unique loyalty number
Country	Country of residence
Province	Province of residence
City	City of residence
Postal Code	Postal code of residence
Gender	Gender
Education	Highest education level (High school or lower > College > Bachelor > Master > Doctor)
Salary	Annual income
Marital Status	Marital status (Single, Married, Divorced)
Loyalty Card	Loyalty card status (Star > Nova > Aurora)
CLV	Customer lifetime value - total invoice value for all flights ever booked by member
Enrollment Type	Enrollment type (Standard / 2018 Promotion)
Enrollment Year	Year Member enrolled in membership program
Enrollment Month	Month Member enrolled in membership program
Cancellation Year	Year Member cancelled their membership
Cancellation Month	Month Member cancelled their membership

Context

The airline implemented a promotional campaign (2018 Promotion) aimed at enhancing program enrollment. The dataset encompasses information regarding:

- Customer flight activity and loyalty points
- Program signups and enrollment details
- Cancellations within the loyalty program
- Comprehensive customer demographics

Potential Use Cases

Analyze the effectiveness of the promotional campaign
Predict customer churn/cancellations
Identify high-value customer segments
Understand factors influencing loyalty program engagement
Optimize loyalty point redemption strategies
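
For the churn/cancellation use case, the two files can be combined on Loyalty Number. The sketch below is one possible approach under assumptions: it treats an empty Cancellation Year as "still a member", which the source does not state explicitly, and it uses a few made-up rows with only a subset of the documented columns.

```python
import csv
import io
from collections import defaultdict

# Made-up rows using a subset of the documented columns.
ACTIVITY_CSV = """Loyalty Number,Year,Month,Flights Booked,Flights with Companions,Total Flights
100018,2018,1,3,1,4
100018,2018,2,2,0,2
100102,2018,1,5,2,7
"""

HISTORY_CSV = """Loyalty Number,Enrollment Type,Cancellation Year,Cancellation Month
100018,Standard,,
100102,2018 Promotion,2018,8
"""

activity = list(csv.DictReader(io.StringIO(ACTIVITY_CSV)))
history = list(csv.DictReader(io.StringIO(HISTORY_CSV)))

# Aggregate monthly activity per member and check the documented identity:
# Total Flights = Flights Booked + Flights with Companions.
flights_per_member = defaultdict(int)
for row in activity:
    booked = int(row["Flights Booked"])
    companions = int(row["Flights with Companions"])
    if int(row["Total Flights"]) != booked + companions:
        raise ValueError(f"inconsistent totals for {row['Loyalty Number']}")
    flights_per_member[row["Loyalty Number"]] += booked + companions

# Join onto the loyalty history and derive a churn flag.
members = {}
for row in history:
    number = row["Loyalty Number"]
    members[number] = {
        "enrollment_type": row["Enrollment Type"],
        "total_flights": flights_per_member.get(number, 0),
        # Assumption: a non-empty Cancellation Year marks a churned member.
        "churned": row["Cancellation Year"] != "",
    }

print(members)
```

The resulting per-member records carry both a target (churned) and candidate predictors, and the enrollment type allows a comparison between standard and 2018 Promotion signups.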
