2 datasets found
  1. Systimec_And_Banking_Crises

    • kaggle.com
    zip
    Updated May 29, 2022
    Mohamed Abd Al-mgyd (2022). Systimec_And_Banking_Crises [Dataset]. https://www.kaggle.com/datasets/mohamedabdalmgyd/systimec-and-banking-crises
    Available download formats: zip (267294 bytes)
    Dataset updated: May 29, 2022
    Authors: Mohamed Abd Al-mgyd
    Description

    (Banking And Systemic Crises)

    prepared by (Mohamed Abd Al-mgyd)

    https://github.com/1145267383/Systemic-And-Banking-Crises

    Dataset

    A) 20160923_global_crisis_data:

    https://www.hbs.edu/behavioral-finance-and-financial-stability/data/Pages/global.aspx

    This data was collected over many years by Carmen Reinhart (with her coauthors Ken Rogoff, Christoph Trebesch, and Vincent Reinhart). It covers the banking crises of 70 countries from 1800 to 2016, with a total of 15,190 records and 16 variables. After cleaning and adjustment, the data comes to 8,642 records and 17 variables.

    B) Label_Country: This data describes each country as either Developing or Developed.

    Variable: Description:

    1-Case: ID number for the country.

    2-Cc3: ID string for the country.

    3-Country: Name of the country.

    4-Year: The year, from 1800 to 2016.

    5-Banking_Crisis: Banking problems can often be traced to a decrease in the value of banks' assets:

    A) due to a collapse in real estate prices, or when bank asset values decrease substantially.
    B) if a government stops paying its obligations, which can trigger a sharp decline in the value of bonds.

    6-Systemic_Crisis: When many banks in a country are in serious solvency or liquidity problems at the same time, either:

    A) because they are all hit by the same outside shock.
    B) or because failure in one bank or a group of banks spreads to other banks in the system.

    7-Gold_Standard: Whether the country has a Gold Standard crisis.

    8-Exch_Usd: Exchange rate of the local currency against the USD, except for the USD itself, which is quoted against the GBP.

    9-Domestic_Debt_In_Default: Whether the country has domestic debt in default.

    10-Sovereign_External_Debt_1: Defaults and restructurings. Does not include defaults on WWI debt to the United States and United Kingdom, or post-1975 defaults on Official External Creditors.

    11-Sovereign_External_Debt_2: Defaults and restructurings. Does not include defaults on WWI debt to the United States and United Kingdom, but includes post-1975 defaults on Official External Creditors.

    12-Gdp_Weighted_Default: GDP-weighted default for the country.

    13-Inflation: Annual percentage change in average consumer prices.

    14-Independence: Independence status of the country.

    15-Currency_Crises: Whether the country has a currency crisis.

    16-Inflation_Crises: Whether the country has an inflation crisis.

    17-Level_Country: Whether the country is Developing or Developed.
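The 17th variable, Level_Country, appears to come from joining the Label_Country table onto the crisis records. The sketch below shows one way such a join could look in plain Python. The rows are made up for illustration, and both the join key (Cc3) and the helper name `attach_level` are assumptions, not taken from the author's repository.

```python
# Made-up illustrative rows; the real tables are in the Kaggle archive.
crisis_rows = [
    {"Cc3": "AAA", "Country": "Country A", "Year": 1990, "Banking_Crisis": 1},
    {"Cc3": "BBB", "Country": "Country B", "Year": 2008, "Banking_Crisis": 0},
]
label_rows = [
    {"Cc3": "AAA", "Level_Country": "Developing"},
    {"Cc3": "BBB", "Level_Country": "Developed"},
]


def attach_level(crisis, labels, key="Cc3"):
    """Left-join the country label onto each crisis record (hypothetical helper)."""
    lookup = {row[key]: row["Level_Country"] for row in labels}
    # Records whose country has no label get None instead of being dropped.
    return [{**row, "Level_Country": lookup.get(row[key])} for row in crisis]


merged = attach_level(crisis_rows, label_rows)
```

Each merged record carries the original columns plus Level_Country, which would account for the step from 16 to 17 variables.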

  2. Realistic Loan Approval Dataset | US & Canada

    • kaggle.com
    zip
    Updated Nov 1, 2025
    Parth Patel2130 (2025). Realistic Loan Approval Dataset | US & Canada [Dataset]. https://www.kaggle.com/datasets/parthpatel2130/realistic-loan-approval-dataset-us-and-canada
    Available download formats: zip (1717268 bytes)
    Dataset updated: Nov 1, 2025
    Authors: Parth Patel2130
    License: Attribution-NonCommercial 4.0 (CC BY-NC 4.0), https://creativecommons.org/licenses/by-nc/4.0/
    License information was derived automatically
    Area covered: Canada, United States
    Description

    🏦 Synthetic Loan Approval Dataset

    A Realistic, High-Quality Dataset for Credit Risk Modelling

    🎯 Why This Dataset?

    Most loan datasets on Kaggle have unrealistic patterns where:

    1. ❌ Credit scores don't matter
    2. ❌ Approval logic is backwards
    3. ❌ Models learn nonsense patterns

    Unlike most loan datasets available online, this one is built on real banking criteria from US and Canadian financial institutions. Drawing from 3 years of hands-on finance industry experience, the dataset incorporates realistic correlations and business logic that reflect how actual lending decisions are made. This makes it perfect for data scientists looking to build portfolio projects that showcase not just coding ability, but genuine understanding of credit risk modelling.

    📊 Dataset Overview

    Metric: Value
    Total Records: 50,000
    Features: 20 (customer_id + 18 predictors + 1 target)
    Target Distribution: 55% Approved, 45% Rejected
    Missing Values: 0 (Complete dataset)
    Product Types: Credit Card, Personal Loan, Line of Credit
    Market: United States & Canada
    Use Case: Binary Classification (Approved/Rejected)

    🔑 Key Features

    Identifier:

    -Customer ID (unique identifier for each application)

    Demographics:

    -Age, Occupation Status, Years Employed

    Financial Profile:

    -Annual Income, Credit Score, Credit History Length
    -Savings/Assets, Current Debt

    Credit Behaviour:

    -Defaults on File, Delinquencies, Derogatory Marks

    Loan Request:

    -Product Type, Loan Intent, Loan Amount, Interest Rate

    Calculated Ratios:

    -Debt-to-Income, Loan-to-Income, Payment-to-Income
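The listing does not define how the three ratios are calculated. The sketch below uses common definitions as an assumption: debt-to-income as current debt over annual income, loan-to-income as loan amount over annual income, and payment-to-income as annual loan payments over annual income, with payments from the standard fixed-rate amortization formula. The 5-year term and all function and parameter names are hypothetical.

```python
def monthly_payment(principal, annual_rate_pct, years):
    """Fixed-rate amortized monthly payment."""
    r = annual_rate_pct / 100 / 12  # monthly interest rate
    n = years * 12                  # number of monthly payments
    if r == 0:
        return principal / n        # no interest: spread principal evenly
    return principal * r / (1 - (1 + r) ** -n)


def ratios(annual_income, current_debt, loan_amount, annual_rate_pct, years=5):
    """Return the three ratios as fractions (0.2 means 20%)."""
    payment = monthly_payment(loan_amount, annual_rate_pct, years)
    return {
        "debt_to_income": current_debt / annual_income,
        "loan_to_income": loan_amount / annual_income,
        "payment_to_income": 12 * payment / annual_income,
    }


example = ratios(annual_income=50_000, current_debt=10_000,
                 loan_amount=20_000, annual_rate_pct=12.0)
```

If the dataset's own columns disagree with these values, the dataset's definitions differ from the ones assumed here.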

    💡 What Makes This Dataset Special?

    1️⃣ Real-World Approval Logic

    The dataset implements actual banking criteria:
    - DTI ratio > 50% = automatic rejection
    - Defaults on file = instant reject
    - Credit score bands match real lending thresholds
    - Employment verification for loans ≥$20K
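The hard rules listed above can be sketched as a small screening function. This is an illustration, not the generator used to build the dataset: the function and parameter names are hypothetical, the credit score bands are left out because the listing gives no thresholds, and treating a failed employment check as a rejection is an assumption.

```python
def hard_rule_decision(dti, defaults_on_file, loan_amount, employment_verified):
    """Apply only the hard rules; return (decision, reason).

    dti is a fraction, so 0.55 means 55%.
    """
    if defaults_on_file > 0:
        return "reject", "defaults on file"
    if dti > 0.50:
        return "reject", "DTI above 50%"
    # Assumption: an unverified applicant asking for $20K or more is rejected.
    if loan_amount >= 20_000 and not employment_verified:
        return "reject", "employment not verified for loan of $20K or more"
    return "continue", "passes hard rules"


decision, reason = hard_rule_decision(
    dti=0.55, defaults_on_file=0, loan_amount=5_000, employment_verified=True
)
```

An application that returns "continue" would still need to pass whatever score-band logic the dataset applies.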

    2️⃣ Realistic Correlations
    - Higher income → Better credit scores
    - Older applicants → Longer credit history
    - Students → Lower income, special treatment for small loans
    - Loan intent affects approval (Education best, Debt Consolidation worst)

    3️⃣ Product-Specific Rules
    - Credit Cards: More lenient, higher limits
    - Personal Loans: Standard criteria, up to $100K
    - Line of Credit: Capped at $50K, manual review for high amounts

    4️⃣ Edge Cases Included
    - Young applicants (age 18) building first credit
    - Students with thin credit files
    - Self-employed with variable income
    - High debt-to-income ratios
    - Multiple delinquencies

    🎓 Perfect For
    - Machine Learning Practice: Binary classification with real patterns
    - Credit Risk Modelling: Learn actual lending criteria
    - Portfolio Projects: Build impressive, explainable models
    - Feature Engineering: Rich dataset with meaningful relationships
    - Business Analytics: Understand financial decision-making

    📈 Quick Stats

    Approval Rates by Product
    - Credit Card: 60.4% (more lenient)
    - Personal Loan: 46.9% (standard)
    - Line of Credit: 52.6% (moderate)

    Loan Intent (Best → Worst Approval Odds)
    1. Education (63% approved)
    2. Personal (58% approved)
    3. Medical/Home (52% approved)
    4. Business (48% approved)
    5. Debt Consolidation (40% approved)

    Credit Score Distribution
    - Mean: 644
    - Range: 300-850
    - Realistic bell curve around 600-700

    Income Distribution
    - Mean: $50,063
    - Median: $41,608
    - Range: $15K - $250K
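The approval rates by product and by loan intent quoted above could be recomputed from the records with a simple group-and-count. The sketch below uses plain Python on a few made-up rows. The column names `product_type` and `approved` are assumptions, since the listing does not give the actual column names.

```python
from collections import defaultdict


def approval_rate_by(records, group_key, target_key="approved"):
    """Share of approved applications (target == 1) within each group."""
    totals = defaultdict(int)
    approved = defaultdict(int)
    for rec in records:
        totals[rec[group_key]] += 1
        approved[rec[group_key]] += int(rec[target_key] == 1)
    return {group: approved[group] / totals[group] for group in totals}


# Made-up rows, only to show the shape of the input.
sample = [
    {"product_type": "Credit Card", "approved": 1},
    {"product_type": "Credit Card", "approved": 0},
    {"product_type": "Personal Loan", "approved": 0},
    {"product_type": "Personal Loan", "approved": 0},
]
rates = approval_rate_by(sample, "product_type")
```

Passing a different `group_key`, such as a loan intent column, gives the second breakdown.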

    🎯 Expected Model Performance

    With proper feature engineering and tuning:
    - Accuracy: 75-85%
    - ROC-AUC: 0.80-0.90
    - F1-Score: 0.75-0.85
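For readers who want to see what the three metrics above measure, here is a minimal plain-Python sketch of each. It assumes labels are 0/1 with 1 meaning approved, and it computes ROC-AUC as the share of (approved, rejected) pairs that the model ranks correctly, counting ties as half. In practice a library such as scikit-learn would normally be used; the labels and scores below are made up.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def f1_score(y_true, y_pred):
    """Harmonic mean of precision and recall for the positive class."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def roc_auc(y_true, scores):
    """Probability that a random positive outscores a random negative."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


y_true = [1, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.1]                # made-up model scores
y_pred = [int(s >= 0.5) for s in scores]     # 0.5 cutoff is an assumption
```

Accuracy and F1 depend on the chosen cutoff, while ROC-AUC depends only on how the scores rank the applications.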

    Important: Feature importance should show:
    1. Credit Score (most important)
    2. Debt-to-Income Ratio
    3. Delinquencies
    4. Loan Amount
    5. Income

    If your model shows different patterns, something's wrong!

    🏆 Use Cases & Projects

    Beginner
    - Binary classification with XGBoost/Random Forest
    - EDA and visualization practice
    - Feature importance analysis

    Intermediate
    - Custom threshold optimization (profit maximization)
    - Cost-sensitive learning (false positive vs false negative)
    - Ensemble methods and stacking

    Advanced
    - Explainable AI (SHAP, LIME)
    - Fairness analysis across demographics
    - Production-ready API with FastAPI/Flask
    - Streamlit deployment with business rules
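As an illustration of the threshold optimization idea listed under Intermediate, the sketch below picks the approval cutoff that maximizes total profit on labeled data. The payoff values (a gain of 1 for a correct approval, a loss of 5 for a wrong one) are made-up assumptions, as are the function name and the sample scores.

```python
def best_threshold(y_true, scores, gain_correct=1.0, loss_wrong=5.0):
    """Return (cutoff, profit) for the cutoff with the highest total profit.

    An application is approved when its score is >= cutoff. A cutoff of
    infinity means approving nobody, which yields zero profit.
    """
    candidates = sorted(set(scores)) + [float("inf")]
    best_cutoff, best_profit = None, float("-inf")
    for cutoff in candidates:
        profit = 0.0
        for label, score in zip(y_true, scores):
            if score >= cutoff:
                profit += gain_correct if label == 1 else -loss_wrong
        if profit > best_profit:
            best_cutoff, best_profit = cutoff, profit
    return best_cutoff, best_profit


y_true = [1, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.1]  # made-up model scores
cutoff, profit = best_threshold(y_true, scores)
```

Because a wrong approval is assumed to cost five times what a correct one earns, the chosen cutoff ends up stricter than the usual 0.5.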

    ⚠️ Important Notes

    This is SYNTHETIC Data
    - Generated based on real banking criteria
    - No real customer data was used
    - Safe for public sharing and portfolio use

    Limitations
    - Simplified approval logic (real banks use 100+ factors)
    - No temporal component (no time series)
    - Single country/currency assumed (USD)
    - No external factors (economy, market conditions)

    Educational Purpose

    This dataset is designed for:
    - Learning credit risk modeling
    - Portfolio projects
    - ML practice
    - Understanding lending criteria

    NOT for:
    - Actual lending decisions
    - Financial advice
    - Production use without validation

    🤝 Contributing

    Found an issue? Have suggestions?
    - Open an issue on GitHub
    - Suggest i...

