24 datasets found

Loan Approval Classification Dataset

kaggle.com

Updated Oct 29, 2024

Cite

Ta-wei Lo (2024). Loan Approval Classification Dataset [Dataset]. https://www.kaggle.com/datasets/taweilo/loan-approval-classification-data

Explore at:

Croissant: a format for machine-learning datasets. Learn more at mlcommons.org/croissant.

Dataset updated

Oct 29, 2024

Dataset provided by

Kaggle (http://kaggle.com/)

Authors

Ta-wei Lo

License

Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
License information was derived automatically

Description

1. Data Source

This dataset is a synthetic version inspired by the original Credit Risk dataset on Kaggle, enriched with additional variables based on Financial Risk for Loan Approval data. SMOTENC was used to generate additional synthetic records and enlarge the dataset. The dataset contains both categorical and continuous features.

2. Metadata

The dataset contains 45,000 records and 14 variables, each described below:

Column	Description	Type
`person_age`	Age of the person	Float
`person_gender`	Gender of the person	Categorical
`person_education`	Highest education level	Categorical
`person_income`	Annual income	Float
`person_emp_exp`	Years of employment experience	Integer
`person_home_ownership`	Home ownership status (e.g., rent, own, mortgage)	Categorical
`loan_amnt`	Loan amount requested	Float
`loan_intent`	Purpose of the loan	Categorical
`loan_int_rate`	Loan interest rate	Float
`loan_percent_income`	Loan amount as a percentage of annual income	Float
`cb_person_cred_hist_length`	Length of credit history in years	Float
`credit_score`	Credit score of the person	Integer
`previous_loan_defaults_on_file`	Indicator of previous loan defaults	Categorical
`loan_status` (target variable)	Loan approval status: 1 = approved; 0 = rejected	Integer

3. Data Usage

The dataset can be used for multiple purposes:

Exploratory Data Analysis (EDA): Analyze key features, distribution patterns, and relationships to understand credit risk factors.
Classification: Build predictive models to classify the loan_status variable (approved/not approved) for potential applicants.
Regression: Develop regression models to predict the credit_score variable based on individual and loan-related attributes.

Note the data-quality issues inherited from the original data, such as records where the age exceeds 100 years.
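One way to handle the age issue noted above is to flag records whose `person_age` exceeds a plausibility threshold before modeling. The sketch below uses only the standard library; the sample rows and the 100-year cutoff are illustrative assumptions, not values taken from the dataset.

```python
import csv
import io

# Illustrative rows only; the real file has 45,000 records and 14 columns.
SAMPLE_CSV = """person_age,person_income,loan_status
22.0,71948.0,1
144.0,300616.0,0
35.0,58000.0,0
"""

MAX_PLAUSIBLE_AGE = 100.0  # assumed cutoff, adjust as needed


def split_by_age(rows, max_age=MAX_PLAUSIBLE_AGE):
    """Return (kept, flagged) lists of rows based on person_age."""
    kept, flagged = [], []
    for row in rows:
        (flagged if float(row["person_age"]) > max_age else kept).append(row)
    return kept, flagged


rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
kept, flagged = split_by_age(rows)
print(len(kept), "rows kept;", len(flagged), "rows flagged")
```

Whether flagged rows should be dropped, capped, or corrected depends on the analysis, so the sketch only separates them.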

This dataset provides a rich basis for understanding financial risk factors and simulating predictive modeling processes for loan approval and credit scoring.

Feel free to leave comments on the discussion. I'd appreciate your upvote if you find my dataset useful! 😀

Data from: Loan Approval Prediction Dataset
kaggle.com
Updated Mar 14, 2025
Cite
Kamran Ansari (2025). Loan Approval Prediction Dataset [Dataset]. https://www.kaggle.com/datasets/korpionn/loan-approval-prediction-dataset/discussion
Explore at:
Croissant
Dataset updated
Mar 14, 2025
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Kamran Ansari
License
http://opendatacommons.org/licenses/dbcl/1.0/
Description
Dataset

This dataset was created by Kamran Ansari

Released under Database: Open Database, Contents: Database Contents

Bank Loan Application Approvals
gomask.ai
csv, json
Updated Jul 12, 2025
Cite
GoMask.ai (2025). Bank Loan Application Approvals [Dataset]. https://gomask.ai/marketplace/datasets/bank-loan-application-approvals
Explore at:
Available download formats: csv (10 MB), json
Dataset updated
Jul 12, 2025
Dataset provided by
GoMask.ai
License
CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
License information was derived automatically
Time period covered
2024 - 2025
Area covered
Global
Variables measured
loan_amount, applicant_id, loan_purpose, applicant_dob, decision_date, denial_reason, interest_rate, application_id, residence_city, approved_amount, and 18 more
Description
This dataset contains detailed synthetic records of bank loan applications, including applicant demographics, financial background, loan request details, and final approval or denial outcomes. It is ideal for developing and benchmarking predictive models for credit risk assessment, as well as for analyzing approval patterns and fairness in lending decisions.

Data from: Loan Approval Prediction

kaggle.com

Updated May 28, 2022

Cite

Siddharth Sharma (2022). Loan Approval Prediction [Dataset]. https://www.kaggle.com/datasets/ssiddharth408/loan-prediction-dataset

Explore at:

Croissant

Dataset updated

May 28, 2022

Dataset provided by

Kaggle (http://kaggle.com/)

Authors

Siddharth Sharma

License

https://creativecommons.org/publicdomain/zero/1.0/

Description

This is a classification problem. The dataset contains 13 columns, and the Loan_Status column is the target to predict.

Columns

Variable	Description
Loan_ID	Unique Loan ID
Gender	Male/ Female
Married	Applicant married (Y/N)
Dependents	Number of dependents
Education	Applicant Education (Graduate/ Under Graduate)
Self_Employed	Self employed (Y/N)
ApplicantIncome	Applicant income
CoapplicantIncome	Coapplicant income
LoanAmount	Loan amount in thousands
Loan_Amount_Term	Term of loan in months
Credit_History	credit history meets guidelines
Property_Area	Urban/ Semi Urban/ Rural
Loan_Status	(Target) Loan approved (Y/N)
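Several columns in the table above are Y/N flags, and `LoanAmount` is stated in thousands. A minimal sketch of preparing such a row for a classifier follows; the sample record and the helper names `yn_to_int` and `encode_row` are hypothetical.

```python
def yn_to_int(value):
    """Map 'Y'/'N' flags to 1/0; anything else (e.g. missing) becomes None."""
    return {"Y": 1, "N": 0}.get(value.strip().upper()) if value else None


def encode_row(row):
    """Encode the Y/N columns and express LoanAmount in absolute units."""
    return {
        "Married": yn_to_int(row["Married"]),
        "Self_Employed": yn_to_int(row["Self_Employed"]),
        "LoanAmount": float(row["LoanAmount"]) * 1000,  # table says thousands
        "Loan_Status": yn_to_int(row["Loan_Status"]),
    }


# Made-up record for illustration.
sample = {"Married": "Y", "Self_Employed": "N", "LoanAmount": "128", "Loan_Status": "Y"}
print(encode_row(sample))
```

Returning `None` for unexpected values keeps missing entries visible instead of silently coding them as 0.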

Data from: Optimizing Bank Loan Approval with Cutting-Edge Deep Learning...
zenodo.org
bin
Updated Oct 25, 2023
Cite
Abdalla Mahgoub; Abdalla Mahgoub (2023). Optimizing Bank Loan Approval with Cutting-Edge Deep Learning model [Dataset]. http://doi.org/10.5281/zenodo.10041577
Explore at:
Available download formats: bin
Unique identifier
https://doi.org/10.5281/zenodo.10041577
Dataset updated
Oct 25, 2023
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Abdalla Mahgoub
Description
Abstract
For any bank or financial institution, managing loans and controlling leverage is one of the most important tasks they have to undertake. A bank cannot function efficiently without a well-designed loan-to-deposit business model. As technology continues to evolve, the mechanism of handling and granting loans underwent a significant change with the introduction of use cases concerning machine learning and data science.

Hence, this data-driven research utilized advanced machine learning techniques to analyze and manipulate the data, aiming to predict the best possible way to recommend a loan to a client. These predictions are based on modified yet unique features created from the data obtained from the client. The dataset was tested using two different methodologies: a logistic regression model and a Neural Network algorithm. Both of these methodologies produced high-level accuracy rates. However, the latter outperformed the currently used methodologies by over 20%, resulting in an accuracy of 90%.

The successful research results were obtained due to the use of a perfectly balanced, unbiased, and cleaned dataset, as well as the well-executed combination of activation functions for the Neural Network model. A performance assessment was conducted based on a confusion matrix evaluation to demonstrate its feasibility and performance.
CPL Prediction
kaggle.com
Updated Jun 26, 2020
Cite
Plavak Das (2020). CPL Prediction [Dataset]. https://www.kaggle.com/plavak10/cpl-prediction/activity
Explore at:
Croissant
Dataset updated
Jun 26, 2020
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Plavak Das
License
https://creativecommons.org/publicdomain/zero/1.0/
Description
Content

Data of persons relating to loan approval status
(Cleaned) Credit Score for Classification Dataset
cubig.ai
Updated Jun 22, 2025
Cite
CUBIG (2025). (Cleaned) Credit Score for Classification Dataset [Dataset]. https://cubig.ai/store/products/504/cleaned-credit-score-for-classification-dataset
Dataset updated
Jun 22, 2025
Dataset authored and provided by
CUBIG
License
https://cubig.ai/store/terms-of-service
Measurement technique
Synthetic data generation using AI techniques for model training, Privacy-preserving data transformation via differential privacy
Description
1) Data Introduction

• The (Cleaned) Credit Score Dataset for Classification Dataset is a structured dataset designed for training machine learning models to classify individuals into credit score categories based on various credit-related attributes.

2) Data Utilization

(1) Characteristics of the (Cleaned) Credit Score Dataset for Classification Dataset:

• The dataset includes key financial variables that influence credit scoring, such as delinquency history, credit limit, credit utilization ratio, and repayment records. The credit score category serves as the multiclass classification label.

(2) Applications of the (Cleaned) Credit Score Dataset for Classification Dataset:

• Credit score classification model training: The dataset can be used to train machine learning models that predict an individual’s credit score category based on financial indicators.

• Financial risk assessment and customer segmentation: It can support tasks such as loan approval decision-making, interest rate setting, and personalized financial product recommendations by identifying a customer’s credit level in advance.
Historical Loan Records with Default Status
kaggle.com
Updated Jul 16, 2025
Cite
Abhishek Nishad (2025). Historical Loan Records with Default Status [Dataset]. https://www.kaggle.com/datasets/abhisheknishad8988/defaulter-data/versions/1
Explore at:
Croissant
Dataset updated
Jul 16, 2025
Dataset provided by
Kaggle
Authors
Abhishek Nishad
License
https://creativecommons.org/publicdomain/zero/1.0/
Description
Loan Default Prediction Dataset

This dataset contains preprocessed loan records from a large-scale financial dataset, designed for loan default prediction modeling. It includes a wide range of features related to borrower profiles, loan terms, and historical repayment behavior.

The data has been cleaned, preprocessed, and structured for modeling. Most numerical features are normalized, and a binary target column indicates whether a loan defaulted.

Want to Improve or Customize It?

Users are encouraged to:

Apply additional feature engineering (e.g., create debt-to-income ratio, rolling averages)

Encode categorical variables using different techniques (e.g., one-hot encoding, target encoding)

Handle class imbalance with oversampling (SMOTE) or undersampling

Perform train/test splitting, cross-validation, or scaling

Add derived features based on domain knowledge
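Two of the suggestions above can be sketched with the standard library alone: one-hot encoding of a categorical column, and balancing classes by plain random oversampling. Note that this is simple duplication of minority rows, not SMOTE, and the column names and toy records are hypothetical.

```python
import random


def one_hot(rows, column):
    """Replace a categorical column with 0/1 indicator columns."""
    categories = sorted({row[column] for row in rows})
    encoded = []
    for row in rows:
        new_row = {k: v for k, v in row.items() if k != column}
        for cat in categories:
            new_row[f"{column}_{cat}"] = int(row[column] == cat)
        encoded.append(new_row)
    return encoded


def random_oversample(rows, target, seed=0):
    """Duplicate minority-class rows at random until classes are balanced."""
    rng = random.Random(seed)
    by_class = {}
    for row in rows:
        by_class.setdefault(row[target], []).append(row)
    largest = max(len(group) for group in by_class.values())
    balanced = []
    for group in by_class.values():
        balanced.extend(group)
        balanced.extend(rng.choice(group) for _ in range(largest - len(group)))
    return balanced


# Toy records; column names are hypothetical.
data = [
    {"purpose": "car", "defaulted": 0},
    {"purpose": "home", "defaulted": 0},
    {"purpose": "car", "defaulted": 0},
    {"purpose": "home", "defaulted": 1},
]
balanced = random_oversample(one_hot(data, "purpose"), "defaulted")
print(len(balanced), "rows after oversampling")
```

Oversampling is normally applied to the training split only, so that duplicated rows do not leak into the evaluation data.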

Use Cases:

Binary classification for loan default prediction

Credit risk modeling

Financial machine learning

Loan approval system development

Model benchmarking and testing
bank_loan_data
kaggle.com
Updated Feb 19, 2025
Cite
Uday Malviya (2025). bank_loan_data [Dataset]. http://doi.org/10.34740/kaggle/dsv/10791226
Explore at:
Croissant
Unique identifier
https://doi.org/10.34740/kaggle/dsv/10791226
Dataset updated
Feb 19, 2025
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Uday Malviya
License
MIT License (https://opensource.org/licenses/MIT)
License information was derived automatically
Description
Overview

This dataset contains 45,000 records of loan applicants, with various attributes related to personal demographics, financial status, and loan details. The dataset can be used for predictive modeling, particularly in credit risk assessment and loan default prediction.

Dataset Content

The dataset includes 14 columns representing different factors influencing loan approvals and defaults:

Personal Information

person_age: Age of the applicant (in years).
person_gender: Gender of the applicant (male, female).
person_education: Educational background (High School, Bachelor, Master, etc.).
person_income: Annual income of the applicant (in USD).
person_emp_exp: Years of employment experience.
person_home_ownership: Type of home ownership (RENT, OWN, MORTGAGE).

Loan Details

loan_amnt: Loan amount requested (in USD).
loan_intent: Purpose of the loan (PERSONAL, EDUCATION, MEDICAL, etc.).
loan_int_rate: Interest rate on the loan (percentage).
loan_percent_income: Ratio of loan amount to income.

Credit & Loan History

cb_person_cred_hist_length: Length of the applicant's credit history (in years).
credit_score: Credit score of the applicant.
previous_loan_defaults_on_file: Whether the applicant has previous loan defaults (Yes or No).

Target Variable

loan_status: 1 if the loan was repaid successfully, 0 if the applicant defaulted.

Use Cases

Loan Default Prediction: Build a classification model to predict loan repayment.
Credit Risk Analysis: Analyze the relationship between income, credit score, and loan defaults.
Feature Engineering: Extract new insights from employment history, home ownership, and loan amounts.

Acknowledgments

This dataset is synthetic and designed for machine learning and financial risk analysis.
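The description defines `loan_percent_income` as the ratio of loan amount to income. As a small worked example of that definition, the sketch below recomputes the ratio from `loan_amnt` and `person_income`; the applicant values are made up, and treating the stored column as a plain ratio rather than a percentage is an assumption based on the wording above.

```python
def loan_percent_income(loan_amnt, person_income):
    """Ratio of loan amount to annual income (0.25 means 25%)."""
    if person_income <= 0:
        raise ValueError("person_income must be positive")
    return loan_amnt / person_income


# Made-up applicant values for illustration.
ratio = loan_percent_income(loan_amnt=10_000, person_income=40_000)
print(f"{ratio:.2f}")  # 0.25
```

Recomputing the ratio this way can also serve as a consistency check against the stored column.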
creditrisk Dataset
cubig.ai
Updated Jun 22, 2025
Cite
CUBIG (2025). creditrisk Dataset [Dataset]. https://cubig.ai/store/products/506/creditrisk-dataset
Dataset updated
Jun 22, 2025
Dataset authored and provided by
CUBIG
License
https://cubig.ai/store/terms-of-service
Measurement technique
Privacy-preserving data transformation via differential privacy, Synthetic data generation using AI techniques for model training
Description
1) Data Introduction

• The credit_risk Dataset is a structured dataset designed to predict loan default status (default) based on a customer’s financial condition, credit history, and loan-related information. Each sample includes various features necessary for assessing the applicant’s credit risk.

2) Data Utilization

(1) Characteristics of the credit_risk Dataset:

• The dataset includes key financial indicators such as current account balance, savings balance, loan amount, job type, and number of existing loans. The default column serves as a binary classification label indicating whether the customer failed to repay the loan.

(2) Applications of the credit_risk Dataset:

• Loan default prediction model training: The dataset can be used to train machine learning-based binary classification models that estimate a customer’s credit risk in advance and support decisions on loan approvals.

• Credit risk analysis and policy development: By analyzing the relationship between financial status and credit history, the dataset can help in setting credit scoring criteria, adjusting risk-based interest rates, and personalizing financial services.
Comprehensive Loan Information for Credit Risk
kaggle.com
Updated Dec 21, 2023
Cite
Sheen (2023). Comprehensive Loan Information for Credit Risk [Dataset]. https://www.kaggle.com/datasets/nezukokamaado/auto-loan-dataset
Explore at:
Croissant
Dataset updated
Dec 21, 2023
Dataset provided by
Kaggle
Authors
Sheen
License
Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
License information was derived automatically
Description
Some of the applications are as follows:

1) Credit Risk Assessment: Banks and financial institutions can leverage the dataset to develop models for assessing the credit risk associated with loan applicants. This involves predicting the likelihood of loan default based on various features.

2) Loan Portfolio Management: Financial organizations can use the dataset to manage and optimize their loan portfolios. This includes diversifying risk, setting interest rates, and making informed decisions about loan approval or denial.

3) Market Trend Analysis: By analyzing the dataset, researchers and analysts can identify trends in borrower behavior, regional variations, and shifts in loan purposes. This information can be valuable for making data-driven market predictions.

4) Customer Segmentation: Understanding the characteristics of different borrower segments can help banks tailor their services and products. This dataset can be used for clustering customers based on attributes like income, employment length, and loan history.

5) Regulatory Compliance: Financial institutions can use the dataset to ensure compliance with regulations. For example, assessing whether loans are being offered fairly across different demographics and regions.

6) Machine Learning Model Development: Data scientists can use this dataset to develop and test machine learning models for predicting loan outcomes. This can include classification tasks such as predicting loan approval or denial.

7) Lending Strategy Optimization: Banks can optimize their lending strategies by analyzing patterns in loan amounts, interest rates, and repayment behavior. This could involve adjusting lending criteria to attract desirable borrowers.

8) Fraud Detection: The dataset may be used to identify patterns indicative of fraudulent loan applications. Unusual patterns in borrower information could be flagged for further investigation.
Loans Dataset
kaggle.com
Updated Apr 5, 2024
Cite
Zaki Hanfer (2024). Loans Dataset [Dataset]. https://www.kaggle.com/datasets/zakihanfer/loans-dataset
Explore at:
Croissant
Dataset updated
Apr 5, 2024
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Zaki Hanfer
Description
Data Dictionary

The data contains one file:

loan.csv: This file has 18 columns:

loanId: This is a unique loan identifier. Use this for joins with the payment.csv file

anon_ssn: This is a hash based on a client’s SSN (Anonymous ssn). You can use this as if it is a SSN to compare if a loan belongs to a previous customer.

payFrequency: This column represents repayment frequency of the loan:

B is biweekly payments

I is irregular

M is monthly

S is semi monthly

W is weekly

apr: Annual Percentage Rate of the loan (%)

applicationDate: Date of application (start date)

originated: Indicates if the loan has been initiated (underwriting process started).

originatedDate: Date of origination, day the loan was originated

nPaidOff: Number of MoneyLion loans previously paid off by the client.

approved: Indicates if the loan has been approved (final step of underwriting).

isFunded: Whether or not a loan is ultimately funded. A loan can be voided by a customer shortly after it is approved, so not all approved loans are ultimately funded.

loanStatus: Current loan status (this column is used for prediction). Most are self-explanatory. Below are the statuses which need clarification:

Withdrawn Application: The applicant has withdrawn their loan application before it was approved or funded.

Paid Off Loan: The loan has been fully paid off by the borrower according to the repayment terms.

Rejected: The loan application was rejected, typically due to failure to meet underwriting criteria.

New Loan: A newly approved loan that has not yet been funded.

Internal Collection: The loan is being managed and collected internally by MoneyLion due to missed payments or delinquency.

CSR Voided New Loan: A new loan application was voided by a customer service representative (CSR) before funding.

External Collection: The loan has been transferred to an external collection agency for management and collection.

Returned Item: A payment on the loan has been returned due to insufficient funds in the borrower's account.

Customer Voided New Loan: The borrower voided a new loan application before funding.

Credit Return Void: The loan was voided due to a credit return, typically related to a refunded transaction.

Pending Paid Off: The loan is in the process of being paid off, but the process is pending completion.

Charged Off Paid Off: The loan has been charged off as a loss by MoneyLion but has also been paid off by the borrower.

Settled Bankruptcy: The loan has been settled as part of a bankruptcy proceeding.

Settlement Paid Off: The loan has been paid off through a settlement agreement.

Charged Off: The loan has been charged off as a loss by MoneyLion due to nonpayment.

Pending Rescind: The loan is pending rescission, meaning it may be canceled or reversed.

Customver Voided New Loan: Likely a misspelling of "Customer Voided New Loan" in the data, with the same meaning: the borrower voided a new loan application before funding.

Pending Application: The loan application is pending review and approval.

Voided New Loan: The loan application was voided before funding.

Pending Application Fee: The loan application is pending due to the application fee not being paid.

Settlement Pending Paid Off: The loan is pending being paid off through a settlement agreement.

loanAmount: Principal amount of the loan ('Dollars') (for non-funded loans this will be the principal in the loan application)

originallyScheduledPaymentAmount: This is the initially scheduled repayment amount ('Dollars') (if a customer pays off all of their scheduled payments, this is the amount we should receive)

state: State of the client

Lead type: The lead type determines the underwriting rules for a lead.

bvMandatory: leads that are bought from the ping tree – required to perform bank verification before loan approval

lead: very similar to bvMandatory, except bank verification is optional for loan approval

california: similar to lead, but optimized for California lending rules

organic: customers that came through the MoneyLion website

rc_returning: customers who have at least 1 paid off loan in another loan portfolio. (The first paid off loan is not in this data set).

prescreen: preselected customers who have been offered a loan through direct mail campaigns

express: promotional “express” loans

repeat: promotional loans offered through ...
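The coded values in the data dictionary above lend themselves to a small lookup step before analysis. The sketch below expands the `payFrequency` codes and folds the misspelled status into its correctly spelled form; the sample row, the helper name `clean_record`, and the "unknown" fallback are hypothetical.

```python
# Codes as listed in the data dictionary.
PAY_FREQUENCY = {
    "B": "biweekly",
    "I": "irregular",
    "M": "monthly",
    "S": "semi-monthly",
    "W": "weekly",
}

# The data dictionary lists this misspelled status alongside the correct one.
STATUS_FIXES = {"Customver Voided New Loan": "Customer Voided New Loan"}


def clean_record(record):
    """Return a copy with a readable pay frequency and a normalized status."""
    cleaned = dict(record)
    cleaned["payFrequency"] = PAY_FREQUENCY.get(record["payFrequency"], "unknown")
    status = record["loanStatus"].strip()
    cleaned["loanStatus"] = STATUS_FIXES.get(status, status)
    return cleaned


# Hypothetical row; real rows have 18 columns.
row = {"loanId": "example-1", "payFrequency": "B", "loanStatus": "Customver Voided New Loan"}
print(clean_record(row))
```

Merging the two spellings avoids treating them as separate classes when `loanStatus` is used as a prediction target.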
United Kingdom Mortgage Approvals
tradingeconomics.com
csv, excel, json, xml
Updated Sep 2, 2025
Cite
TRADING ECONOMICS (2025). United Kingdom Mortgage Approvals [Dataset]. https://tradingeconomics.com/united-kingdom/mortgage-approvals
Explore at:
Available download formats: csv, excel, json, xml
Dataset updated
Sep 2, 2025
Dataset authored and provided by
TRADING ECONOMICS
License
Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
License information was derived automatically
Time period covered
Oct 31, 1986 - Jul 31, 2025
Area covered
United Kingdom
Description
Mortgage Approvals in the United Kingdom increased to 65.35 Thousand in July from 64.57 Thousand in June of 2025. This dataset provides the latest reported value for - United Kingdom Mortgage Approvals - plus previous releases, historical high and low, short-term forecast and long-term prediction, economic calendar, survey consensus and news.
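The month-over-month move quoted above works out to roughly a 1.2% increase. A minimal sketch of that arithmetic, using only the two figures from the description (the function name is hypothetical):

```python
def percent_change(previous, current):
    """Relative change from previous to current, in percent."""
    return (current - previous) / previous * 100


# Figures quoted in the description, in thousands of approvals.
june, july = 64.57, 65.35
print(f"{percent_change(june, july):.2f}%")
```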
Credit Approval (Mixed Attributes)
kaggle.com
Updated Dec 14, 2022
Cite
The Devastator (2022). Credit Approval (Mixed Attributes) [Dataset]. https://www.kaggle.com/datasets/thedevastator/improving-credit-approval-with-mixed-attributes/data
Explore at:
Croissant
Dataset updated
Dec 14, 2022
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
The Devastator
Description
Credit Approval (Mixed Attributes)

Continuous and Categorical Features

By UCI [source]

About this dataset

This dataset explores the phenomenon of credit card application acceptance or rejection. It includes a range of both continuous and categorical attributes, such as the applicant's gender, credit score, and income; as well as details about recent credit card activity including balance transfers and delinquency. This data presents a unique opportunity to investigate how these different attributes interact in determining application status. With careful analysis of this dataset, we can gain valuable insights into understanding what factors help ensure a successful application outcome. This could lead us to developing more effective strategies for predicting and improving financial credit access for everyone


How to use the dataset

This dataset is an excellent resource for researching the credit approval process, as it provides a mix of continuous and categorical attributes. The aim of this guide is to provide tips on how to make the most of this dataset.

- Understand the data: Before attempting to work with this dataset, it's important to understand what kind of information it contains. Since there is a mix of continuous and categorical attributes in this data set, make sure you familiarise yourself with all the different columns before proceeding further.
- Exploratory Analysis: It is recommended that you conduct some exploratory analysis on your data in order to gain an overall understanding of its characteristics and distributions. By investigating things like missing values and correlations between different independent variables (IVs) or dependent variables (DVs), you can better prepare yourself for making meaningful analyses or predictions in further steps.
- Data Cleaning: Once you have familiarised yourself with your data, begin cleaning up any potential discrepancies such as missing values or outliers by replacing them appropriately or removing them from your dataset if necessary.
- Feature Selection/Engineering: After cleaning your data set, feature selection/engineering may be necessary if certain columns are redundant or not proving useful for constructing meaningful models or analyses (this usually becomes apparent during exploratory analysis). Be careful when deciding which features to remove, so that no information about potentially important relationships is lost.
- Model Building/Analysis: Once the data has been pre-processed appropriately, you can move forward with developing the desired models or analyses on the transformed dataset.
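The exploratory step above mentions checking for missing values. A minimal sketch of counting them per column in a header-less CSV follows; the `?` marker and the sample rows are assumptions for illustration, so check the actual file for how missing entries are encoded.

```python
import csv
import io
from collections import Counter

MISSING = "?"  # assumed missing-value marker

# Made-up rows without a header line.
SAMPLE = "b,30.83,0,w\na,?,4.46,q\n?,24.50,0.5,q\n"


def missing_per_column(text, marker=MISSING):
    """Count cells equal to the missing-value marker, keyed by column index."""
    counts = Counter()
    for row in csv.reader(io.StringIO(text)):
        for index, cell in enumerate(row):
            if cell.strip() == marker:
                counts[index] += 1
    return dict(counts)


print(missing_per_column(SAMPLE))
```

Columns are keyed by position because the file is read without a header row.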

Research Ideas

Developing predictive models to identify customers who are likely to default on their credit card payments.

Creating a risk analysis system that can identify customers who pose a higher risk for fraud or misuse of their credit cards.

Developing an automated lending decision system that can use the data points provided in the dataset (i.e., gender, average monthly balance, etc.) to decide whether or not to approve applications for new credit lines and loans

Acknowledgements

If you use this dataset in your research, please credit the original authors.

License

See the dataset description for more information.

Columns

File: crx.data.csv

| Column name | Description |
|:------------|:------------|
| b | Gender (Categorical) |
| 30.83 | Average Monthly Balance (Continuous) |
| 0 | Number of Months Since Applicant's Last Delinquency (Continuous) |
| w | Number of Months Since Applicant's Last Credit Card Approval (Continuous) |
| 1.25 | Number Of Months since The applicant's last balance increase (Continuous) ...
China Loan Prime Rate
tradingeconomics.com
csv, excel, json, xml
Updated Aug 20, 2025
Cite
TRADING ECONOMICS (2025). China Loan Prime Rate [Dataset]. https://tradingeconomics.com/china/interest-rate
Explore at:
Available download formats: xml, csv, excel, json
Dataset updated
Aug 20, 2025
Dataset authored and provided by
TRADING ECONOMICS
License
Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
License information was derived automatically
Time period covered
Oct 25, 2013 - Aug 20, 2025
Area covered
China
Description
The benchmark interest rate in China was last recorded at 3 percent. This dataset provides the latest reported value for - China Interest Rate - plus previous releases, historical high and low, short-term forecast and long-term prediction, economic calendar, survey consensus and news.
Average mortgage interest rates in the UK 2000-2025, by month and type
statista.com
Updated Jun 24, 2025
Cite
Statista (2025). Average mortgage interest rates in the UK 2000-2025, by month and type [Dataset]. https://www.statista.com/statistics/386301/uk-average-mortgage-interest-rates/
Dataset updated
Jun 24, 2025
Dataset authored and provided by
Statista (http://statista.com/)
Time period covered
Jan 2000 - May 2025
Area covered
United Kingdom
Description
Mortgage rates increased at a record pace in 2022, with the 10-year fixed mortgage rate doubling between March 2022 and December 2022. With inflation increasing, the Bank of England introduced several bank rate hikes, resulting in higher mortgage rates. In May 2025, the average 10-year fixed mortgage rate reached **** percent. As borrowing costs rise, demand for housing is expected to decrease, leading to declining market sentiment and slower house price growth.

How have the mortgage hikes affected the market?

After surging in 2021, the number of residential properties sold declined in 2023, reaching just above *** million. Despite the number of transactions falling, this figure was higher than in the period before the COVID-19 pandemic. The falling transaction volume also impacted mortgage borrowing. Between the first quarter of 2023 and the first quarter of 2024, the value of new mortgage loans fell year-on-year for five straight quarters.

How are higher mortgages affecting homebuyers?

Homeowners with a mortgage loan usually lock in a fixed rate deal for two to ten years, meaning that after this period runs out, they need to renegotiate the terms of the loan. Many of the outstanding mortgages were taken out during the period of record-low mortgage rates, and their holders have since faced notable increases in monthly repayments. About **** million homeowners are projected to see their deal expire by the end of 2026. About *** million of these loans are projected to experience a monthly payment increase of up to *** British pounds by 2026.
submission.json
kaggle.com
Updated Sep 22, 2024
Cite
Sharmila Ghosh (2024). submission.json [Dataset]. https://www.kaggle.com/datasets/sharmilaghosh/submission-json/suggestions
Dataset updated
Sep 22, 2024
Dataset provided by
Kaggle (http://kaggle.com/)
Authors
Sharmila Ghosh
License
Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
License information was derived automatically
Description
Context, Sources, and Inspirations Behind the Dataset

When developing a hybrid model that combines human-like reasoning with neural network precision, the choice of dataset is crucial. The datasets used in training such a model were selected and curated based on specific goals and requirements, drawing inspiration from a variety of contexts. Below is a breakdown of the datasets, their origins, sources, and the inspirations behind selecting them:

1. Context of the Dataset Selection

Objective: To create a model capable of generalizing across diverse tasks, including classification, regression, language understanding, and visual recognition. The model is designed to tackle challenges involving unseen data, complex reasoning, and multi-modal inputs.
Approach: A combination of publicly available benchmark datasets and proprietary datasets from specific domains was used. The data sources aimed to provide comprehensive coverage of real-world scenarios and diverse input types to enhance the model's robustness.

2. Data Sources

A. Public Benchmark Datasets

ImageNet and COCO (Common Objects in Context):
Inspiration: Widely recognized for image classification and object detection tasks. They provide a large and varied set of labeled images, covering thousands of object categories.
Source: Open datasets maintained by research communities.
Usage: Used for training and testing the vision component of the hybrid model, focusing on object recognition and scene understanding.

MultiWOZ (Multi-Domain Wizard-of-Oz):
Inspiration: A comprehensive dialogue dataset covering multiple domains (e.g., restaurant booking, hotel reservations).
Source: Created by dialogue researchers, it provides annotated conversations mimicking real-world human interactions.
Usage: Leveraged for training the language understanding and dialogue generation capabilities of the model.

ConceptNet:
Inspiration: Designed to provide commonsense knowledge, helping models reason beyond factual information by understanding relationships and contexts.
Source: An open-source project that aggregates data from various crowdsourced resources like Wikipedia, WordNet, and Open Mind Common Sense.
Usage: Integrated into the reasoning module to improve multi-hop and commonsense reasoning.

UCI Machine Learning Repository:
Inspiration: A well-known repository containing diverse datasets for various machine learning tasks, such as loan approval and medical diagnosis.
Source: Academic research and publicly available datasets contributed by the research community.
Usage: Used for structured data tasks, particularly in financial and healthcare analytics.

B. Proprietary and Domain-Specific Datasets

Healthcare Records Dataset:
Inspiration: The increasing demand for predictive analytics in healthcare motivated the use of patient records to predict health outcomes.
Source: Anonymized data collected from healthcare providers, including patient demographics, medical history, and diagnostic information.
Usage: Trained and tested the model's ability to handle regression tasks, such as predicting patient recovery rates and health risks.

Financial Transactions and Loan Application Data:
Inspiration: To address risk analytics in financial services, loan application datasets containing applicant profiles, credit scores, and financial history were used.
Source: Collaboration with financial institutions provided access to anonymized loan application data.
Usage: Focused on classification tasks for loan approval predictions and credit scoring.

C. Synthesized Data and Augmented Datasets

Synthetic Dialogue Scenarios:
Inspiration: To test the model's performance on hypothetical scenarios and rare cases not covered in standard datasets.
Source: Generated using rule-based models and simulations to create additional training samples, especially for edge cases in dialogue tasks.
Usage: Improved model robustness by exposing it to challenging and less common dialogue interactions.

3. Inspirations Behind the Dataset Choice

Diverse Task Requirements: The hybrid model was designed to handle multiple types of tasks (classification, regression, reasoning), necessitating diverse datasets covering different input formats (images, text, structured data).
Real-World Relevance: The selected datasets were inspired by real-world use cases in healthcare, finance, and customer service, reflecting common scenarios where such a hybrid model could be applied.
Challenging Scenarios: To test the model's reasoning capabilities, datasets like ConceptNet and synthetic scenarios were included, inspired by the need to handle complex logical reasoning and inferencing tasks.
Inclusivity and Fairness: Public datasets were chosen to ensure coverage across various demographic groups, reducing bias and improving fairness in predictions.

4. Pre-Processing and Data Preparation

Standardization and Normalization: Structured data were ...
Sweden Household Lending Growth
tradingeconomics.com
Updated Aug 27, 2025
TRADING ECONOMICS (2025). Sweden Household Lending Growth [Dataset]. https://tradingeconomics.com/sweden/loan-growth
Available download formats: excel, csv, xml, json
Dataset updated
Aug 27, 2025
Dataset authored and provided by
TRADING ECONOMICS
License
Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
License information was derived automatically
Time period covered
Dec 31, 1976 - Jul 31, 2025
Area covered
Sweden
Description
The value of loans in Sweden increased 2.60 percent in July of 2025 over the same month in the previous year. This dataset provides the latest reported value for Sweden Household Lending Growth, plus previous releases, historical high and low, short-term forecast and long-term prediction, economic calendar, survey consensus and news.
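The year-over-year figure above compares one month with the same month a year earlier. A minimal Python sketch of that calculation follows; the function name `yoy_growth_pct` and the two loan-stock levels are hypothetical values chosen for illustration and are not taken from the Trading Economics data.

```python
def yoy_growth_pct(current: float, year_ago: float) -> float:
    """Percent change of `current` relative to the same month one year earlier."""
    if year_ago == 0:
        raise ValueError("year_ago must be non-zero")
    return (current / year_ago - 1.0) * 100.0


# Hypothetical loan-stock levels in arbitrary units (not real data).
loans_year_ago = 1000.0
loans_current = 1026.0

growth = yoy_growth_pct(loans_current, loans_year_ago)
print(f"{growth:.2f} percent year over year")
```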
Germany Bank Lending Rate
tradingeconomics.com
Updated Dec 15, 2024
TRADING ECONOMICS (2024). Germany Bank Lending Rate [Dataset]. https://tradingeconomics.com/germany/bank-lending-rate
Available download formats: excel, xml, csv, json
Dataset updated
Dec 15, 2024
Dataset authored and provided by
TRADING ECONOMICS
License
Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
License information was derived automatically
Time period covered
Jan 31, 2003 - Jun 30, 2025
Area covered
Germany
Description
The bank lending rate in Germany decreased to 4 percent in June 2025 from 4.09 percent in May 2025. This dataset provides the latest reported value for Germany Bank Lending Rate, plus previous releases, historical high and low, short-term forecast and long-term prediction, economic calendar, survey consensus and news.
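The move from 4.09 percent to 4 percent can be read either as a change in percentage points or as a relative change in the rate itself. The Python sketch below computes both from the two values quoted above; the function names are illustrative and are not part of any Trading Economics API.

```python
def change_in_points(new_rate: float, old_rate: float) -> float:
    """Absolute change between two rates, in percentage points."""
    return new_rate - old_rate


def relative_change_pct(new_rate: float, old_rate: float) -> float:
    """Change of the rate relative to its previous value, in percent."""
    if old_rate == 0:
        raise ValueError("old_rate must be non-zero")
    return (new_rate - old_rate) / old_rate * 100.0


# The two monthly values quoted in the description above.
may_rate = 4.09
june_rate = 4.0

points = change_in_points(june_rate, may_rate)
relative = relative_change_pct(june_rate, may_rate)
print(f"{points:.2f} percentage points, {relative:.2f} percent relative change")
```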
Credit Card Balance Prediction
kaggle.com
Updated Jul 13, 2022
Abdalrahman Ali El nashar (2022). Credit Card Balance Prediction [Dataset]. https://www.kaggle.com/datasets/abdalrahmanelnashar/credit-card-balance-prediction
Dataset updated
Jul 13, 2022
Dataset provided by
Kaggle
Authors
Abdalrahman Ali El nashar
Description
This dataset contains information about credit card balances. The data can be used for many purposes, such as credit card balance prediction. The columns in the dataset are as follows:

Income: Income of the customer.
Limit: Credit limit provided to the customer.
Rating: The customer's credit rating.
Cards: The number of credit cards the customer has.
Age: Age of the customer.
Education: Educational level of the customer.
Gender: Sex of the customer.
Student: Whether the customer is a student.
Married: Whether the customer is married.
Ethnicity: Ethnicity of the customer.
Balance: Credit balance of the customer.

$ Income : num 14.9 106 104.6 148.9 55.9 ...
$ Limit : int 3606 6645 7075 9504 4897 8047 3388 7114 3300 6819 ...
$ Rating : int 283 483 514 681 357 569 259 512 266 491 ...
$ Cards : int 2 3 4 3 2 4 2 2 5 3 ...
$ Age : int 34 82 71 36 68 77 37 87 66 41 ...
$ Education: int 11 15 11 11 16 10 12 9 13 19 ...
$ Gender : Factor w/ 2 levels " Male","Female": 1 2 1 2 1 1 2 1 2 2 ...
$ Student : Factor w/ 2 levels "No","Yes": 1 2 1 1 1 1 1 1 1 2 ...
$ Married : Factor w/ 2 levels "No","Yes": 2 2 1 1 2 1 1 1 1 2 ...
$ Ethnicity: Factor w/ 3 levels "African American",..: 3 2 2 2 3 3 1 2 3 1 ...
$ Balance : int 333 903 580 964 331 1151 203 872 279 1350 ...
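The listing above is the output of R's `str()`, where each `Factor` column is stored as 1-based integer codes that index into its list of levels. As a rough Python sketch, the first five rows can be rebuilt from the values shown; the helper `decode_factor` is a hypothetical name, and `Ethnicity` is left as raw codes because its second and third levels are elided in the listing.

```python
# First five values of each numeric column, copied from the str() listing.
numeric_columns = {
    "Income": [14.9, 106.0, 104.6, 148.9, 55.9],
    "Limit": [3606, 6645, 7075, 9504, 4897],
    "Rating": [283, 483, 514, 681, 357],
    "Cards": [2, 3, 4, 3, 2],
    "Age": [34, 82, 71, 36, 68],
    "Education": [11, 15, 11, 11, 16],
    "Balance": [333, 903, 580, 964, 331],
}

# Factor columns as (levels, 1-based integer codes), as printed by str().
factor_columns = {
    "Gender": ([" Male", "Female"], [1, 2, 1, 2, 1]),
    "Student": (["No", "Yes"], [1, 2, 1, 1, 1]),
    "Married": (["No", "Yes"], [2, 2, 1, 1, 2]),
}

# Ethnicity levels are truncated ("..") in the listing, so keep the raw codes.
ethnicity_codes = [3, 2, 2, 2, 3]


def decode_factor(levels, codes):
    """Map 1-based R factor codes to their labels, trimming stray whitespace."""
    return [levels[code - 1].strip() for code in codes]


decoded = {
    name: decode_factor(levels, codes)
    for name, (levels, codes) in factor_columns.items()
}

# Assemble one dictionary per customer from the column-wise data.
rows = []
for i in range(5):
    row = {name: values[i] for name, values in numeric_columns.items()}
    row.update({name: labels[i] for name, labels in decoded.items()})
    row["Ethnicity_code"] = ethnicity_codes[i]
    rows.append(row)

for row in rows:
    print(row)
```

Stripping the labels removes the leading space in the `" Male"` level, which could otherwise cause mismatches when filtering by gender.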

Ta-wei Lo (2024). Loan Approval Classification Dataset [Dataset]. https://www.kaggle.com/datasets/taweilo/loan-approval-classification-data

Loan Approval Classification Dataset

Synthetic Data for binary classification on Loan Approval

32 scholarly articles cite this dataset (View in Google Scholar)

Dataset updated

Oct 29, 2024

Dataset provided by

Kaggle (http://kaggle.com/)

Authors

Ta-wei Lo

License

Apache License, v2.0 (https://www.apache.org/licenses/LICENSE-2.0)
License information was derived automatically

Description

1. Data Source

2. Metadata

The dataset contains 45,000 records and 14 variables, each described below:

Column	Description	Type
`person_age`	Age of the person	Float
`person_gender`	Gender of the person	Categorical
`person_education`	Highest education level	Categorical
`person_income`	Annual income	Float
`person_emp_exp`	Years of employment experience	Integer
`person_home_ownership`	Home ownership status (e.g., rent, own, mortgage)	Categorical
`loan_amnt`	Loan amount requested	Float
`loan_intent`	Purpose of the loan	Categorical
`loan_int_rate`	Loan interest rate	Float
`loan_percent_income`	Loan amount as a percentage of annual income	Float
`cb_person_cred_hist_length`	Length of credit history in years	Float
`credit_score`	Credit score of the person	Integer
`previous_loan_defaults_on_file`	Indicator of previous loan defaults	Categorical
`loan_status` (target variable)	Loan approval status: 1 = approved; 0 = rejected	Integer
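
One way to apply the types in the table when reading the CSV is to map each column to a Python converter. This is a minimal sketch using only the standard library; the dictionary `COLUMN_TYPES`, the function `parse_row`, and the sample row are illustrative assumptions (the sample values are made up, not taken from the dataset), and categorical columns are simply kept as strings.

```python
import csv
import io

# Converter per column, following the Type column of the metadata table.
COLUMN_TYPES = {
    "person_age": float,
    "person_gender": str,
    "person_education": str,
    "person_income": float,
    "person_emp_exp": int,
    "person_home_ownership": str,
    "loan_amnt": float,
    "loan_intent": str,
    "loan_int_rate": float,
    "loan_percent_income": float,
    "cb_person_cred_hist_length": float,
    "credit_score": int,
    "previous_loan_defaults_on_file": str,
    "loan_status": int,
}


def parse_row(raw):
    """Convert one csv.DictReader row of strings into typed values."""
    return {name: convert(raw[name]) for name, convert in COLUMN_TYPES.items()}


# Hypothetical one-row CSV used only to exercise the parser.
sample_csv = io.StringIO(
    ",".join(COLUMN_TYPES) + "\n"
    "30.0,female,Bachelor,50000.0,5,RENT,10000.0,EDUCATION,11.5,0.2,6.0,650,No,1\n"
)

records = [parse_row(raw) for raw in csv.DictReader(sample_csv)]
print(records[0])
```

If the real file stores an integer column with a decimal point (for example `5.0`), `int` would raise a `ValueError`, so the converter for that column would need to be adjusted.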

3. Data Usage

The dataset can be used for multiple purposes:

Exploratory Data Analysis (EDA): Analyze key features, distribution patterns, and relationships to understand credit risk factors.
Classification: Build predictive models to classify the loan_status variable (approved/not approved) for potential applicants.
Regression: Develop regression models to predict the credit_score variable based on individual and loan-related attributes.

Be aware of data-quality issues inherited from the original data, such as records where person_age is above 100 years.
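
A simple guard against the implausible ages mentioned above is to separate those records out before modeling. The sketch below assumes the rows are already parsed into dictionaries; the threshold of 100 follows the note, while the names `MAX_PLAUSIBLE_AGE` and `split_by_age` and the three sample records are hypothetical.

```python
MAX_PLAUSIBLE_AGE = 100.0  # threshold taken from the note about ages over 100


def split_by_age(records, max_age=MAX_PLAUSIBLE_AGE):
    """Return (kept, dropped) lists, dropping rows whose person_age exceeds max_age."""
    kept, dropped = [], []
    for record in records:
        if record["person_age"] <= max_age:
            kept.append(record)
        else:
            dropped.append(record)
    return kept, dropped


# Hypothetical records for illustration only (not taken from the dataset).
sample = [
    {"person_age": 25.0, "loan_status": 1},
    {"person_age": 144.0, "loan_status": 0},
    {"person_age": 61.0, "loan_status": 0},
]

kept, dropped = split_by_age(sample)
print(len(kept), "kept;", len(dropped), "dropped")
```

Returning the dropped rows as well makes it easy to inspect them instead of discarding them silently.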

This dataset provides a rich basis for understanding financial risk factors and simulating predictive modeling processes for loan approval and credit scoring.

Feel free to leave comments on the discussion. I'd appreciate your upvote if you find my dataset useful! 😀
