https://creativecommons.org/publicdomain/zero/1.0/
This dataset contains customer data from a loan company known as Prosper. It comprises 113,937 loans with 81 variables on each loan, including loan amount, borrower rate (or interest rate), current loan status, borrower income, and many others.
Definition of Variables:
ListingKey: Unique key for each listing, same value as the 'key' used in the listing object in the API.
ListingNumber: The number that uniquely identifies the listing to the public as displayed on the website.
ListingCreationDate: The date the listing was created.
CreditGrade: The credit rating that was assigned at the time the listing went live. Applicable to listings from the pre-2009 period and populated only for those listings.
Term: The length of the loan expressed in months.
LoanStatus: The current status of the loan: Cancelled, Chargedoff, Completed, Current, Defaulted, FinalPaymentInProgress, PastDue. The PastDue status will be accompanied by a delinquency bucket.
ClosedDate: Closed date is applicable for Cancelled, Completed, Chargedoff and Defaulted loan statuses.
BorrowerAPR: The Borrower's Annual Percentage Rate (APR) for the loan.
BorrowerRate: The Borrower's interest rate for this loan.
LenderYield: The Lender yield on the loan. Lender yield is equal to the interest rate on the loan less the servicing fee.
EstimatedEffectiveYield: Effective yield is equal to the borrower interest rate (i) minus the servicing fee rate, (ii) minus estimated uncollected interest on charge-offs, (iii) plus estimated collected late fees. Applicable for loans originated after July 2009.
EstimatedLoss: Estimated loss is the estimated principal loss on charge-offs. Applicable for loans originated after July 2009.
EstimatedReturn: The estimated return assigned to the listing at the time it was created. Estimated return is the difference between the Estimated Effective Yield and the Estimated Loss Rate. Applicable for loans originated after July 2009.
ProsperRating (numeric): The Prosper Rating assigned at the time the listing was created: 0 - N/A, 1 - HR, 2 - E, 3 - D, 4 - C, 5 - B, 6 - A, 7 - AA. Applicable for loans originated after July 2009.
ProsperRating (Alpha): The Prosper Rating assigned at the time the listing was created, between AA and HR. Applicable for loans originated after July 2009.
ProsperScore: A custom risk score built using historical Prosper data. The score ranges from 1 to 10, with 10 being the best, or lowest risk, score. Applicable for loans originated after July 2009.
ListingCategory: The category of the listing that the borrower selected when posting their listing: 0 - Not Available, 1 - Debt Consolidation, 2 - Home Improvement, 3 - Business, 4 - Personal Loan, 5 - Student Use, 6 - Auto, 7 - Other, 8 - Baby&Adoption, 9 - Boat, 10 - Cosmetic Procedure, 11 - Engagement Ring, 12 - Green Loans, 13 - Household Expenses, 14 - Large Purchases, 15 - Medical/Dental, 16 - Motorcycle, 17 - RV, 18 - Taxes, 19 - Vacation, 20 - Wedding Loans.
BorrowerState: The two-letter abbreviation of the state of the address of the borrower at the time the listing was created.
Occupation: The occupation selected by the Borrower at the time they created the listing.
EmploymentStatus: The employment status of the borrower at the time they posted the listing.
EmploymentStatusDuration: The length in months of the employment status at the time the listing was created.
IsBorrowerHomeowner: A Borrower will be classified as a homeowner if they have a mortgage on their credit profile or provide documentation confirming they are a homeowner.
CurrentlyInGroup: Specifies whether or not the Borrower was in a group at the time the listing was created.
GroupKey: The key of the group of which the Borrower is a member. Value will be null if the borrower does not have a group affiliation.
DateCreditPulled: The date the credit profile was pulled.
CreditScoreRangeLower: The lower value representing the range of the borrower's credit score as provided by a consumer credit rating agency.
CreditScoreRangeUpper: The upper value representing the range of the borrower's credit score as provided by a consumer credit rating agency.
FirstRecordedCreditLine: The date the first credit line was opened.
CurrentCreditLines: Number of current credit lines at the time the credit profile was pulled.
OpenCreditLines: Number of open credit lines at the time the credit profile was pulled.
TotalCreditLinespast7years: Number of credit lines in the past seven years at the time the credit profile was pulled.
OpenRevolvingAccounts: Number of open revolving accounts at the time the credit profile was pulled.
OpenRevolvingMonthlyPayment: Monthly payment on revolving accounts at the time the credit profile was pulled.
InquiriesLast6Months: Number of inquiries in the past six months at the time the cre...
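The yield definitions in the Prosper variable list (LenderYield, EstimatedEffectiveYield, EstimatedReturn) reduce to simple arithmetic. The following is a minimal sketch of those relationships only; the function names, parameter names, and the sample rates are hypothetical and are not taken from the dataset.

```python
# Sketch of the yield relationships described in the variable definitions.
# All rates are plain decimals (0.15 means 15%); the inputs are illustrative.

def lender_yield(borrower_rate: float, servicing_fee: float) -> float:
    """Interest rate on the loan less the servicing fee."""
    return borrower_rate - servicing_fee


def estimated_effective_yield(
    borrower_rate: float,
    servicing_fee: float,
    uncollected_interest: float,
    late_fees: float,
) -> float:
    """Borrower rate minus servicing fee, minus estimated uncollected
    interest on charge-offs, plus estimated collected late fees."""
    return borrower_rate - servicing_fee - uncollected_interest + late_fees


def estimated_return(effective_yield: float, estimated_loss: float) -> float:
    """Difference between the estimated effective yield and the estimated loss rate."""
    return effective_yield - estimated_loss


if __name__ == "__main__":
    eff = estimated_effective_yield(0.15, 0.01, 0.02, 0.005)
    print(round(lender_yield(0.15, 0.01), 4))
    print(round(eff, 4))
    print(round(estimated_return(eff, 0.06), 4))
```

Such a check can be useful for spotting rows where the published columns do not agree with the stated definitions.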
The National Student Loan Data System (NSLDS) is the national database of information about loans and grants awarded to students under Title IV of the Higher Education Act (HEA) of 1965. NSLDS provides a centralized, integrated view of Title IV loans and grants during their complete life cycle, from aid approval through disbursement, repayment, deferment, delinquency, and closure.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset provides values for LOAN GROWTH reported in several countries. The data includes current values, previous releases, historical highs and record lows, release frequency, reported unit and currency.
These tables provide additional detail on the loan assets of U.S. depository institutions by reporting mortgage and consumer loan portfolios broken down by the banks' estimates of the probability of default, as defined below. This information facilitates analysis of the potential concentration of risk in specific loan categories. The institutions reporting this information are generally those with $10 billion or more of assets.
Lending Club offers peer-to-peer (P2P) loans through a technological platform for various personal finance purposes and is today one of the companies that dominate the US P2P lending market. The original dataset is publicly available on Kaggle and corresponds to all the loans issued by Lending Club between 2007 and 2018. The present version of the dataset is intended for constructing a granting model, that is, a model designed to decide whether to grant a loan based on information available at the time of the loan application. Consequently, our dataset contains only a selection of variables from the original one, namely those known at the moment the loan request is made. Furthermore, the target variable of a granting model represents the final status of the loan, which is either "default" or "fully paid". Thus, we filtered out of the original dataset all loans in transitory states. Our dataset comprises 1,347,681 records or obligations (approximately 60% of the original); it was also cleaned for completeness and consistency (less than 1% of our dataset was filtered out).
TARGET VARIABLE
The dataset includes a target variable based on the final resolution of the credit: the default category corresponds to the event charged off and the non-default category to the event fully paid. It does not consider other values in the loan status variable since this variable represents the state of the loan at the end of the considered time window. Thus, there are no loans in transitory states. The original dataset includes the target variable “loan status”, which contains several categories ('Fully Paid', 'Current', 'Charged Off', 'In Grace Period', 'Late (31-120 days)', 'Late (16-30 days)', 'Default'). However, in our dataset, we just consider loans that are either “Fully Paid” or “Default” and transform this variable into a binary variable called “Default”, with a 0 for fully paid loans and a 1 for defaulted loans.
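As a rough illustration of the target construction described above, the sketch below keeps only loans with a final status and derives a binary `Default` field. The column name `loan_status` and the exact status strings are assumptions based on the description; the mapping treats "Charged Off" as the default event, as stated in the first sentence, and is passed as a parameter so it can be adjusted.

```python
# Sketch: build a binary "Default" target and drop loans in transitory states.
# The status strings and the "loan_status" key are assumptions, not verified
# against the actual files.

FINAL_STATUS_TO_DEFAULT = {"Fully Paid": 0, "Charged Off": 1}


def build_target(loans, status_map=None):
    """Return copies of the loans whose status is final, with a 'Default' field."""
    if status_map is None:
        status_map = FINAL_STATUS_TO_DEFAULT
    labelled = []
    for loan in loans:
        status = loan.get("loan_status")
        if status not in status_map:
            continue  # transitory state (e.g. 'Current'): filtered out
        row = dict(loan)
        row["Default"] = status_map[status]
        labelled.append(row)
    return labelled


if __name__ == "__main__":
    sample = [
        {"id": 1, "loan_status": "Fully Paid"},
        {"id": 2, "loan_status": "Current"},
        {"id": 3, "loan_status": "Charged Off"},
        {"id": 4, "loan_status": "In Grace Period"},
    ]
    for row in build_target(sample):
        print(row["id"], row["Default"])
```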
EXPLANATORY VARIABLES
The explanatory variables that we use correspond only to the information available at the time of the application. Variables such as the interest rate, grade, or subgrade are generated by the company as a result of a credit risk assessment process, so they were filtered out of the dataset: they must not be used in risk models that predict default at the time credit is granted.
FULL LIST OF VARIABLES
Loan identification variables:
id: Loan id (unique identifier).
issue_d: Month and year in which the loan was approved.
Quantitative variables:
revenue: Borrower's self-declared annual income during registration.
dti_n: Indebtedness ratio for obligations excluding mortgage. Monthly information. This ratio has been calculated considering the indebtedness of the whole group of applicants. It is estimated as the ratio calculated using the co-borrowers’ total payments on the total debt obligations divided by the co-borrowers’ combined monthly income.
loan_amnt: Amount of credit requested by the borrower.
fico_n: Defined between 300 and 850, reported by Fair Isaac Corporation as a risk measure based on historical credit information reported at the time of application. This value has been calculated as the average of the variables “fico_range_low” and “fico_range_high” in the original dataset.
experience_c: Binary variable that indicates whether the borrower is new to the entity. This variable is constructed from the credit date of the previous obligation in LC and the credit date of the current obligation; if the difference between dates is positive, it is not considered as a new experience with LC.
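Two of the quantitative variables above are derived values. The sketch below shows how `fico_n` and `dti_n` could be computed from their stated definitions. The function and parameter names are hypothetical, and treating `dti_n` as a plain ratio (rather than a percentage) is an assumption.

```python
# Sketch of two derived quantitative variables, following their descriptions.

def fico_n(fico_range_low: float, fico_range_high: float) -> float:
    """Average of the lower and upper bounds of the reported FICO range."""
    return (fico_range_low + fico_range_high) / 2


def dti_n(total_monthly_debt_payments: float, combined_monthly_income: float) -> float:
    """Monthly debt payments (excluding mortgage) divided by combined monthly income.

    Returned as a plain ratio; whether the dataset stores a percentage is not
    stated in the description.
    """
    if combined_monthly_income <= 0:
        raise ValueError("combined_monthly_income must be positive")
    return total_monthly_debt_payments / combined_monthly_income


if __name__ == "__main__":
    print(fico_n(670, 674))
    print(dti_n(500.0, 4000.0))
```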
Categorical variables:
emp_length: Categorical variable with the employment length of the borrower (includes the no information category)
purpose: Credit purpose category for the loan request.
home_ownership_n: Homeownership status provided by the borrower in the registration process. Categories defined by LC: “mortgage”, “rent”, “own”, “other”, “any”, “none”. We merged the categories “other”, “any” and “none” as “other”.
addr_state: Borrower's state of residence in the USA.
zip_code: Zip code of the borrower's residence.
Textual variables:
title: Title of the credit request description provided by the borrower.
desc: Description of the credit request provided by the borrower.
We cleaned the textual variables. First, we removed all descriptions that contained the default description provided by Lending Club on its web form (“Tell your story. What is your loan for?”). Moreover, we removed the prefix “Borrower added on DD/MM/YYYY >” from the descriptions to avoid any temporal background in them. Finally, as these descriptions came from a web form, we replaced all HTML entities with their corresponding characters (e.g. “&amp;” was replaced by “&”, “&lt;” was replaced by “<”, etc.).
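The three cleaning steps can be sketched with the Python standard library as follows. This is an illustration under assumptions, not the authors' code: the function name is hypothetical, "removed" is interpreted as returning `None` for descriptions containing the default prompt, the date pattern is deliberately loose, and entities are decoded before the prefix is stripped so that an escaped `>` in the prefix is also matched.

```python
import html
import re

# Default text shown on the Lending Club web form, as quoted in the description.
DEFAULT_PROMPT = "Tell your story. What is your loan for?"

# Loose pattern for the "Borrower added on <date> >" prefix (assumed date layout).
PREFIX_RE = re.compile(r"Borrower added on \d{1,2}/\d{1,2}/\d{2,4} >\s*")


def clean_description(desc):
    """Return the cleaned description, or None if it should be discarded."""
    if desc is None or DEFAULT_PROMPT in desc:
        return None
    text = html.unescape(desc)      # e.g. "&amp;" -> "&", "&lt;" -> "<"
    text = PREFIX_RE.sub("", text)  # drop every dated prefix
    return text.strip()


if __name__ == "__main__":
    raw = "Borrower added on 01/02/2012 > Pay off cards &amp; car"
    print(clean_description(raw))
```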
RELATED WORKS
This dataset has been used in the following academic articles:
Sanz-Guerrero, M., Arroyo, J. (2024). Credit Risk Meets Large Language Models: Building a Risk Indicator from Loan Descriptions in P2P Lending. arXiv preprint arXiv:2401.16458. https://doi.org/10.48550/arXiv.2401.16458
Ariza-Garzón, M.J., Arroyo, J., Caparrini, A., Segovia-Vargas, M.J. (2020). Explainability of a machine learning granting scoring model in peer-to-peer lending. IEEE Access 8, 64873 - 64890. https://doi.org/10.1109/ACCESS.2020.2984412
This dataset contains loan application data aimed at detecting fraud. Each application has a loan status that indicates the outcome and is categorized into two groups: normal loans (value 0) and fraudulent loans (value 1).
Normal loans include statuses like Paid Off Loan, Charged Off Paid Off, and Settlement Paid Off. Fraudulent loans include Rejected, Internal Collection, and Charged Off. Other statuses are excluded from the classification process.
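The grouping of statuses into the two classes can be sketched as a small lookup. The status strings below are copied from the description and may not match the exact values in the files; the function name is hypothetical.

```python
# Sketch: map a loan status to the fraud label described above.
# Returns 0 for normal, 1 for fraudulent, None for statuses excluded
# from classification.

NORMAL_STATUSES = {"Paid Off Loan", "Charged Off Paid Off", "Settlement Paid Off"}
FRAUD_STATUSES = {"Rejected", "Internal Collection", "Charged Off"}


def fraud_label(status):
    if status in NORMAL_STATUSES:
        return 0
    if status in FRAUD_STATUSES:
        return 1
    return None  # excluded from the classification process


if __name__ == "__main__":
    for s in ["Paid Off Loan", "Charged Off", "Pending"]:
        print(s, fraud_label(s))
```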
The dataset includes training data for building predictive models and evaluation data for testing. A detailed dictionary is provided to explain the columns for clarity.
In the data.zip file, you will find 3 folders:
The submission.csv file should be filled with predictions based on this data.
The Department of Energy's Loan Programs, administered by LPO, enable DOE to work with private companies and lenders to mitigate the financing risks associated with clean energy projects, and thereby encourage their development on a broader and much-needed scale. The Loan Programs consist of three separate programs managed by two offices, the Loan Guarantee Program Office (LGP) and the Advanced Technology Vehicles Manufacturing Loan Program Office. LPO originates, guarantees, and monitors loans to support clean energy projects through these programs. The programs are:
Section 1703: Under Section 1703 of Title XVII, DOE LGP is authorized to guarantee loans for projects that employ new or significantly improved energy technologies and avoid, reduce or sequester air pollutants or greenhouse gases.
Section 1705: Under Section 1705 of Title XVII, added by the American Recovery and Reinvestment Act (ARRA), DOE LGP is authorized to guarantee loans for certain clean energy projects that commenced construction on or before September 30, 2011. The Section 1705 program expired, pursuant to statute, on September 30, 2011. LPO will actively monitor projects that previously received loan guarantees under the 1705 program but will no longer issue new loan guarantees under it.
Advanced Technology Vehicles Manufacturing (ATVM): Under Section 136 of the Energy Independence and Security Act of 2007, DOE is authorized to provide direct loans to finance advanced vehicle technologies.
As of August 2021, the forbearance rate of single-family housing mortgages owned by Freddie Mac in the U.S. was approximately 2.09 percent. Forbearance is a type of borrower assistance which allows the lender to negotiate a temporary postponement of a mortgage repayment. It allows a payment period relief in lieu of the creditor foreclosing on any property that was used as collateral for the loan.
This dataset was created by ridah ahmed
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
United States WAS: Refinance Loans % of the Total Loan Amounts data was reported at 32.700 % in 20 Jul 2018. This records an increase from the previous number of 31.900 % for 13 Jul 2018. United States WAS: Refinance Loans % of the Total Loan Amounts data is updated weekly, averaging 47.010 % from Jan 1990 (Median) to 20 Jul 2018, with 1490 observations. The data reached an all-time high of 86.500 % in 09 Jan 2009 and a record low of 9.990 % in 10 Mar 1995. United States WAS: Refinance Loans % of the Total Loan Amounts data remains active status in CEIC and is reported by Mortgage Bankers Association. The data is categorized under Global Database’s USA – Table US.KA016: Weekly Applications Survey: Mortgage Loan Applications.
https://fred.stlouisfed.org/legal/#copyright-public-domain
Graph and download economic data for Loans and Leases in Bank Credit, All Commercial Banks (LOANS) from Jan 1947 to Apr 2025 about leases, credits, commercial, loans, banks, depository institutions, and USA.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The inherent fragility of the agricultural industry significantly restricts the financing channels available to new agricultural operating entities. Access to credit loans emerges as a pivotal means to address capital shortages among farmers and enhance production inputs. Drawing on survey data from 17,745 new agricultural operating entities engaged in food production in Lu’an City, Anhui Province, and agricultural households documented in the China Household Finance Survey Database, this paper employs the Logit model and the Heckman selection model to empirically analyze the loan decision-making behavior of these entities from two perspectives: loan willingness and credit scale. The research reveals that several key variables exert a significant positive influence on the borrowing willingness of grain producers. Specifically, the planting area range, input range per hectare, the rate range of return on investment, membership in cooperatives, and operation as a family farm all notably enhance their willingness to seek loans. Conversely, the net income per hectare and the number of crop types cultivated significantly diminish their inclination to borrow. Additionally, male operators and those with higher educational backgrounds demonstrate a stronger willingness to obtain loans. Furthermore, the study indicates that the planting area and membership in cooperatives also positively correlate with the scale of loans secured by these agricultural operating entities. Therefore, from the perspective of food security, it is essential to cultivate food-producing new agricultural operating entities. This requires a focus on the counter-cyclical adjustment of financial support, increasing credit support during years of low investment returns. Additionally, it is necessary to develop multiple forms of moderate-scale operations, enhance policy support, and boost the production enthusiasm of food-producing new agricultural operating entities.
The assessment is accomplished by estimating the loan's default probability through analysing this historical dataset and then classifying the loan into one of two categories: (a) higher risk—likely to default on the loan (i.e., be charged off/failure to pay in full) or (b) lower risk—likely to pay off the loan in full.
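The two-way classification described above amounts to thresholding an estimated default probability. The sketch below illustrates only that final step; the 0.5 cut-off and the function name are assumptions, and the probability itself would come from whatever model is fitted to the historical data.

```python
# Sketch: turn an estimated default probability into one of the two risk
# categories. The threshold is a hypothetical choice, not given in the source.

def classify_risk(default_probability: float, threshold: float = 0.5) -> str:
    if not 0.0 <= default_probability <= 1.0:
        raise ValueError("default_probability must lie in [0, 1]")
    return "higher risk" if default_probability >= threshold else "lower risk"


if __name__ == "__main__":
    print(classify_risk(0.72))
    print(classify_risk(0.08))
```

In practice the threshold would be tuned to the relative cost of approving a loan that defaults versus denying one that would have been repaid.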
Should This Loan be Approved or Denied?
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The model accuracy verification and comparison result on LendingClub Loan Data and German Credit Data.
The loan balance of JPMorgan Chase increased every year between 2019 and 2023. In 2023, the global loans portfolio of JPMorgan Chase was valued at around 1.32 trillion U.S. dollars. JPMorgan Chase, which has its headquarters in New York, was the bank in the U.S. with the largest consumer loans portfolio.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Netflix reported $14.01B in Loan Capital for its fiscal quarter ending in March of 2025. Data for Netflix | NFLX - Loan Capital, including historical values, tables and charts, were last updated by Trading Economics in June 2025.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Loan: Local & Foreign Currency: Culture, Sport & Entertainment data was reported at 645.069 RMB bn in 2022. This records an increase from the previous number of 610.039 RMB bn for 2021. Loan: Local & Foreign Currency: Culture, Sport & Entertainment data is updated yearly, averaging 334.300 RMB bn from Dec 2010 (Median) to 2022, with 13 observations. The data reached an all-time high of 645.069 RMB bn in 2022 and a record low of 100.873 RMB bn in 2010. Loan: Local & Foreign Currency: Culture, Sport & Entertainment data remains active status in CEIC and is reported by The People's Bank of China. The data is categorized under China Premium Database’s Money and Banking – Table CN.KB: Loan: By Industry.
https://fred.stlouisfed.org/legal/#copyright-public-domain
Graph and download economic data for Finance Rate on Consumer Installment Loans at Commercial Banks, New Autos 48 Month Loan (TERMCBAUTO48NS) from Feb 1972 to Feb 2025 about 4-years, installment, financing, consumer credit, vehicles, new, loans, consumer, interest rate, banks, interest, depository institutions, rate, and USA.
Financial institutions incur significant losses due to defaults on vehicle loans. This has led to tighter vehicle loan underwriting and higher vehicle loan rejection rates, and these institutions have also raised the need for a better credit risk scoring model. This warrants a study to estimate the determinants of vehicle loan default. A financial institution has hired you to accurately predict the probability of a loanee/borrower defaulting on a vehicle loan in the first EMI (Equated Monthly Instalment) on the due date. The following information regarding the loan and loanee is provided in the datasets:
Loanee information (demographic data such as age, identity proof, etc.)
Loan information (disbursal details, loan-to-value ratio, etc.)
Bureau data and history (bureau score, number of active accounts, the status of other loans, credit history, etc.)
Accurate predictions will ensure that clients capable of repayment are not rejected and will help identify important determinants that can be used to minimise default rates.
https://fred.stlouisfed.org/legal/#copyright-public-domain
Graph and download economic data for Loan Loss Reserve to Total Loans for all U.S. Banks (DISCONTINUED) (USLLRTL) from Q1 1984 to Q3 2020 about gains/losses, reserves, loans, banks, depository institutions, and USA.