67 datasets found
  1. Heart Disease Prediction UCI

    • kaggle.com
    zip
    Updated Apr 16, 2020
    Cite
    Priyanka (2020). Heart Disease Prediction UCI [Dataset]. https://www.kaggle.com/datasets/priyanka841/heart-disease-prediction-uci
    Explore at:
    zip(3478 bytes)Available download formats
    Dataset updated
    Apr 16, 2020
    Authors
    Priyanka
    Description

    Dataset

    This dataset was created by Priyanka

    Contents

  2. Heart Disease UCI

    • kaggle.com
    zip
    Updated Aug 13, 2022
    Cite
    Matt Hartman (2022). Heart Disease UCI [Dataset]. https://www.kaggle.com/datasets/hartman/heart-disease-uci
    Explore at:
    zip(3494 bytes)Available download formats
    Dataset updated
    Aug 13, 2022
    Authors
    Matt Hartman
    Description
    1. age - age in years
    2. sex (1 = male; 0 = female)
    3. cp - chest pain type
      • 0: Typical angina: chest pain related to decreased blood supply to the heart
      • 1: Atypical angina: chest pain not related to heart
      • 2: Non-anginal pain: typically esophageal spasms (non heart related)
      • 3: Asymptomatic: chest pain not showing signs of disease
    4. trestbps - resting blood pressure (in mm Hg on admission to the hospital); anything above 130-140 is typically cause for concern
    5. chol - serum cholesterol in mg/dl
      • serum = LDL + HDL + 0.2 * triglycerides
      • above 200 is cause for concern
    6. fbs - (fasting blood sugar > 120 mg/dl) (1 = true; 0 = false)
      • > 126 mg/dL signals diabetes
    7. restecg - resting electrocardiographic results
      • 0: Nothing to note
      • 1: ST-T Wave abnormality
        • can range from mild symptoms to severe problems
        • signals non-normal heart beat
      • 2: Possible or definite left ventricular hypertrophy
        • Enlarged heart's main pumping chamber
    8. thalach - maximum heart rate achieved
    9. exang - exercise induced angina (1 = yes; 0 = no)
    10. oldpeak - ST depression induced by exercise relative to rest; looks at stress of the heart during exercise (an unhealthy heart will stress more)
    11. slope - the slope of the peak exercise ST segment
      • 0: Upsloping: better heart rate with exercise (uncommon)
      • 1: Flatsloping: minimal change (typically healthy heart)
      • 2: Downsloping: signs of unhealthy heart
    12. ca - number of major vessels (0-3) colored by fluoroscopy
      • colored vessel means the doctor can see the blood passing through
      • the more blood movement the better (no clots)
    13. thal - thallium stress result (3 = normal; 6 = fixed defect; 7 = reversible defect)
      • 1,3: normal
      • 6: fixed defect: used to be a defect but ok now
      • 7: reversible defect: no proper blood movement when exercising
    14. target - have disease or not (1=yes, 0=no) (=the predicted attribute)
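    The coded fields above can be turned into readable labels with a small lookup. The following is a minimal Python sketch, not part of the dataset: the helper names (`decode_record`, `total_cholesterol`) and the example row are hypothetical, the code tables simply restate the list above, and the 200 mg/dl threshold is the "cause for concern" figure quoted there.

```python
# Lookup tables restating the codes listed in the description above.
CHEST_PAIN = {
    0: "typical angina",
    1: "atypical angina",
    2: "non-anginal pain",
    3: "asymptomatic",
}
RESTECG = {
    0: "nothing to note",
    1: "ST-T wave abnormality",
    2: "possible or definite left ventricular hypertrophy",
}


def total_cholesterol(ldl, hdl, triglycerides):
    """Serum cholesterol in mg/dl: LDL + HDL + 0.2 * triglycerides."""
    return ldl + hdl + 0.2 * triglycerides


def decode_record(record):
    """Return a copy of `record` with coded fields replaced by labels."""
    decoded = dict(record)
    decoded["sex"] = "male" if record["sex"] == 1 else "female"
    decoded["cp"] = CHEST_PAIN[record["cp"]]
    decoded["restecg"] = RESTECG[record["restecg"]]
    # Flag quoted in the description: above 200 mg/dl is cause for concern.
    decoded["chol_concern"] = record["chol"] > 200
    return decoded


# Hypothetical example row (values invented for illustration).
example = {"age": 54, "sex": 1, "cp": 2, "restecg": 0, "chol": 239}
decoded_example = decode_record(example)
```

    Keeping the tables as plain dictionaries means an unexpected code raises a `KeyError` instead of being silently mislabeled.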
  3. Cardiovascular Disease Dataset

    • ieee-dataport.org
    Updated Oct 29, 2025
    Cite
    Rajib Kumar Halder Halder (2025). Cardiovascular Disease Dataset [Dataset]. https://ieee-dataport.org/documents/cardiovascular-disease-dataset
    Explore at:
    Dataset updated
    Oct 29, 2025
    Authors
    Rajib Kumar Halder Halder
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This heart disease dataset is curated by combining 3 popular heart disease datasets. The first dataset (collected from Kaggle) contains 70000 records with 11 independent features, which makes it the largest heart disease dataset available so far for research purposes. These data were collected at the moment of medical examination and from information given by the patient. The second and third datasets contain 303 and 293 instances respectively, with 13 common features. The three datasets used for its curation are: Cardio Data (Kaggle Dataset)

  4. Heart Disease UCI

    • kaggle.com
    zip
    Updated Oct 6, 2024
    Cite
    Ronak Kantariya (2024). Heart Disease UCI [Dataset]. https://www.kaggle.com/datasets/ronakkantariya/heart-disease-uci
    Explore at:
    zip(12672 bytes)Available download formats
    Dataset updated
    Oct 6, 2024
    Authors
    Ronak Kantariya
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Dataset

    This dataset was created by Ronak Kantariya

    Released under CC0: Public Domain

    Contents

  5. Heart Disease Dataset 2.0

    • kaggle.com
    zip
    Updated Jun 17, 2024
    + more versions
    Cite
    Gregory Grems (2024). Heart Disease Dataset 2.0 [Dataset]. https://www.kaggle.com/datasets/gregorygrems/heart-disease-dataset-2002/code
    Explore at:
    zip(5743488 bytes)Available download formats
    Dataset updated
    Jun 17, 2024
    Authors
    Gregory Grems
    License

    Apache License, v2.0 https://www.apache.org/licenses/LICENSE-2.0
    License information was derived automatically

    Description

    Heart Disease Data combined from UCI repository of following places:

    Cleveland, Hungary, Switzerland, and VA Long Beach

    Features:

    • Age: Age of individual. 20-80
    • Sex: This is the gender of the individual. It is represented as a binary value where 1 stands for male and 0 stands for female.
    • ChestPainType: This categorizes the type of chest pain experienced by the individual. The values are:
      • Value 1: Typical angina, which is chest pain related to the heart.
      • Value 2: Atypical angina, which is chest pain not related to the heart.
      • Value 3: Non-anginal pain, which is typically sharp and non-continuous.
      • Value 4: Asymptomatic, meaning the individual experiences no symptoms.
    • RestingBP: This is the individual’s resting blood pressure (in mm Hg) when they are at rest.
    • Cholesterol: This is the individual’s cholesterol level, measured in mg/dl.
    • FastingBS: This indicates whether the individual’s fasting blood sugar is greater than 120 mg/dl. It is represented as a binary value where 1 stands for true and 0 stands for false.
    • MaxHR: This is the maximum heart rate achieved by the individual.
    • ExerciseAngina: This indicates whether the individual experiences angina (chest pain) induced by exercise. It is represented as a binary value where 1 stands for yes and 0 stands for no.

  6. Heart disease uci dataset

    • kaggle.com
    zip
    Updated Aug 23, 2024
    + more versions
    Cite
    Muneer Iqbal24 (2024). Heart disease uci dataset [Dataset]. https://www.kaggle.com/datasets/muneeriqbal24/heart-disease-uci-dataset
    Explore at:
    zip(12672 bytes)Available download formats
    Dataset updated
    Aug 23, 2024
    Authors
    Muneer Iqbal24
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Dataset

    This dataset was created by Muneer Iqbal24

    Released under CC0: Public Domain

    Contents

  7. Heart Disease Data Set

    • figshare.com
    • kaggle.com
    txt
    Updated Jun 2, 2023
    Cite
    Xinyue Zhang (2023). Heart Disease Data Set [Dataset]. http://doi.org/10.6084/m9.figshare.19322552.v1
    Explore at:
    txtAvailable download formats
    Dataset updated
    Jun 2, 2023
    Dataset provided by
    Figshare http://figshare.com/
    Authors
    Xinyue Zhang
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Adaptation of http://archive.ics.uci.edu/ml/datasets/Heart+Disease

    Ready for usage with ehrapy

  8. Heart Disease Risk Prediction Dataset

    • kaggle.com
    zip
    Updated Feb 7, 2025
    Cite
    Mahatir Ahmed Tusher (2025). Heart Disease Risk Prediction Dataset [Dataset]. https://www.kaggle.com/datasets/mahatiratusher/heart-disease-risk-prediction-dataset
    Explore at:
    zip(1448235 bytes)Available download formats
    Dataset updated
    Feb 7, 2025
    Authors
    Mahatir Ahmed Tusher
    License

    Attribution-ShareAlike 4.0 (CC BY-SA 4.0) https://creativecommons.org/licenses/by-sa/4.0/
    License information was derived automatically

    Description

    Heart Disease Risk Prediction Dataset

    Overview

    This synthetic dataset is designed to predict the risk of heart disease based on a combination of symptoms, lifestyle factors, and medical history. Each row in the dataset represents a patient, with binary (Yes/No) indicators for symptoms and risk factors, along with a computed risk label indicating whether the patient is at high or low risk of developing heart disease.

    The dataset contains 70,000 samples, making it suitable for training machine learning models for classification tasks. The goal is to provide researchers, data scientists, and healthcare professionals with a clean and structured dataset to explore predictive modeling for cardiovascular health.

    This dataset is a side project of EarlyMed, developed by students of Vellore Institute of Technology (VIT-AP). EarlyMed aims to leverage data science and machine learning for early detection and prevention of chronic diseases.

    Dataset Features

    Input Features

    Symptoms (Binary - Yes/No)

    1. Chest Pain (chest_pain): Presence of chest pain, a common symptom of heart disease.
    2. Shortness of Breath (shortness_of_breath): Difficulty breathing, often associated with heart conditions.
    3. Unexplained Fatigue (fatigue): Persistent tiredness without an obvious cause.
    4. Palpitations (palpitations): Irregular or rapid heartbeat.
    5. Dizziness/Fainting (dizziness): Episodes of lightheadedness or fainting.
    6. Swelling in Legs/Ankles (swelling): Swelling due to fluid retention, often linked to heart failure.
    7. Pain in Arm/Jaw/Neck/Back (radiating_pain): Radiating pain, a hallmark of angina or heart attacks.
    8. Cold Sweats & Nausea (cold_sweats): Symptoms commonly associated with acute cardiac events.

    Risk Factors (Binary - Yes/No or Continuous)

    1. Age (age): Patient's age in years (continuous variable).
    2. High Blood Pressure (hypertension): History of hypertension (Yes/No).
    3. High Cholesterol (cholesterol_high): Elevated cholesterol levels (Yes/No).
    4. Diabetes (diabetes): Diagnosis of diabetes (Yes/No).
    5. Smoking History (smoker): Whether the patient is a smoker (Yes/No).
    6. Obesity (obesity): Obesity status (Yes/No).
    7. Family History of Heart Disease (family_history): Family history of cardiovascular conditions (Yes/No).

    Output Label

    • Heart Disease Risk (risk_label): Binary label indicating the risk of heart disease:
      • 0: Low risk
      • 1: High risk

    Data Generation Process

    This dataset was synthetically generated using Python libraries such as numpy and pandas. The generation process ensured a balanced distribution of high-risk and low-risk cases while maintaining realistic correlations between features. For example:

    • Patients with multiple risk factors (e.g., smoking, hypertension, and diabetes) were more likely to be labeled as high risk.
    • Symptom patterns were modeled after clinical guidelines and research studies on heart disease.
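    As an illustration of the idea only, and not the authors' actual generator (which is not published in this listing), the sketch below draws binary indicators at random and labels the higher-scoring half of patients as high risk, which yields a balanced label by construction. It uses the standard `random` module rather than numpy and pandas; the function names, the age range, and the scoring rule (a plain count of positive indicators, ignoring age) are all assumptions.

```python
import random

# Column names taken from the feature lists above.
SYMPTOMS = [
    "chest_pain", "shortness_of_breath", "fatigue", "palpitations",
    "dizziness", "swelling", "radiating_pain", "cold_sweats",
]
RISK_FACTORS = [
    "hypertension", "cholesterol_high", "diabetes",
    "smoker", "obesity", "family_history",
]
INDICATORS = SYMPTOMS + RISK_FACTORS


def generate_patient(rng):
    """Draw one synthetic patient with independent 0/1 indicators."""
    patient = {name: rng.randint(0, 1) for name in INDICATORS}
    patient["age"] = rng.randint(20, 84)  # assumed age range
    return patient


def risk_score(patient):
    """Assumed scoring rule: number of positive symptoms and risk factors."""
    return sum(patient[name] for name in INDICATORS)


def generate_dataset(n, seed=0):
    """Generate n patients; the top half by score is labeled high risk."""
    rng = random.Random(seed)
    patients = [generate_patient(rng) for _ in range(n)]
    ranked = sorted(range(n), key=lambda i: risk_score(patients[i]))
    high_risk = set(ranked[n // 2:])
    for i, patient in enumerate(patients):
        patient["risk_label"] = int(i in high_risk)
    return patients


dataset = generate_dataset(1000, seed=42)
```

    A median split like this guarantees balance but produces a label that is a deterministic function of the inputs; a real generator would more plausibly add noise or feature correlations.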

    Sources of Inspiration

    The design of this dataset was inspired by the following resources:

    Books

    • "Harrison's Principles of Internal Medicine" by J. Larry Jameson et al.: A comprehensive resource on cardiovascular diseases and their symptoms.
    • "Mayo Clinic Cardiology" by Joseph G. Murphy et al.: Provides insights into heart disease risk factors and diagnostic criteria.

    Research Papers

    • Framingham Heart Study: A landmark study identifying key risk factors for cardiovascular disease.
    • American Heart Association (AHA) Guidelines: Recommendations for diagnosing and managing heart disease.

    Existing Datasets

    • UCI Heart Disease Dataset: A widely used dataset for heart disease prediction.
    • Kaggle’s Heart Disease datasets: Various datasets contributed by the community.

    Clinical Guidelines

    • Centers for Disease Control and Prevention (CDC): Information on heart disease symptoms and risk factors.
    • World Health Organization (WHO): Global statistics and risk factor analysis for cardiovascular diseases.

    Applications

    This dataset can be used for a variety of purposes:

    1. Machine Learning Research:

      • Train classification models (e.g., Logistic Regression, Random Forest, XGBoost) to predict heart disease risk.
      • Experiment with feature engineering, model tuning, and evaluation metrics like Accuracy, Precision, Recall, and ROC-AUC.
    2. Healthcare Analytics:

      • Identify key risk factors contributing to heart disease.
      • Develop decision support systems for early detection of cardiovascular risks.
    3. Educational Purposes:

      • Teach students and practitioners about predictive modeling in healthcare.
      • Demonstrate the importance of feature selection...
  9. Heart Disease Dataset

    • kaggle.com
    zip
    Updated Jul 25, 2024
    Cite
    Nezahat Korkmaz (2024). Heart Disease Dataset [Dataset]. https://www.kaggle.com/datasets/nezahatkk/heart-disease-data
    Explore at:
    zip(29490 bytes)Available download formats
    Dataset updated
    Jul 25, 2024
    Authors
    Nezahat Korkmaz
    License

    Apache License, v2.0 https://www.apache.org/licenses/LICENSE-2.0
    License information was derived automatically

    Description

    Heart Disease Data: Enhanced with Feature Engineering for Advanced Analysis

    This dataset is an advanced version of the classic UCI Machine Learning heart disease dataset, enriched with feature engineering to support more sophisticated analyses. The original features have been supplemented with newly derived attributes that help to better understand and model cardiovascular risk factors.

    Dataset Description:

    1. age: Age of the patient (int).
    2. sex: Gender of the patient; 1 for male, 0 for female (int).
    3. cp: Chest pain type (int); 1: Typical angina, 2: Atypical angina, 3: Non-anginal pain, 4: Asymptomatic.
    4. trestbps: Resting blood pressure (mm Hg) (int).
    5. chol: Serum cholesterol level (mg/dl) (int).
    6. fbs: Fasting blood sugar > 120 mg/dl; 1 if true, 0 if false (int).
    7. restecg: Resting electrocardiographic results (int); 0: Normal, 1: ST-T wave abnormality, 2: Showing probable or definite left ventricular hypertrophy.
    8. thalach: Maximum heart rate achieved (int).
    9. exang: Exercise induced angina; 1 if yes, 0 if no (int).
    10. oldpeak: ST depression induced by exercise relative to rest (float).
    11. slope: Slope of the peak exercise ST segment (int); 1: Upsloping, 2: Flat, 3: Downsloping.
    12. ca: Number of major vessels colored by fluoroscopy (float).
    13. thal: Thalassemia; 3: Normal, 6: Fixed defect, 7: Reversible defect (float).
    14. num: Heart disease diagnosis (int); 0: No disease, 1-4: Disease present with increasing severity.

    Derived Features:

    1. age_group: Age category; '30s', '40s', '50s', '60s'.
    2. cholesterol_level: Cholesterol level category; 'low', 'normal', 'high'.
    3. bp_level: Blood pressure category; 'low', 'normal', 'high'.
    4. risk_score: Calculated risk score using the formula: \(\text{age} \times \text{chol} / 1000 + \text{trestbps} / 100\).
    5. symptom_severity: Severity of symptoms; calculated as \(\text{cp} \times \text{oldpeak}\).
    6. log_chol: Logarithm of cholesterol level.
    7. log_trestbps: Logarithm of resting blood pressure.
    8. age_squared: Square of age.
    9. chol_squared: Square of cholesterol level.
    10. age_thalach_ratio: Ratio of maximum heart rate to age (plus 1 to avoid division by zero).
    11. risk_factor: Risk factor calculated as \(\text{cp} \times \text{oldpeak} \times \text{thal}\).
    12. missing_values: Count of missing values in 'ca' and 'thal'.
    13. chol_trestbps_ratio: Ratio of cholesterol level to resting blood pressure.
    14. log_thalach_chol: Logarithm of the product of maximum heart rate and cholesterol level.
    15. symptom_zscore: Z-score of symptom severity.
    16. avg_chol_by_age_group: Average cholesterol level for the age group.
    17. thalach_chol_diff: Difference between maximum heart rate and cholesterol level.
    18. symptom_severity_diff: Difference in symptom severity compared to the average for the age group.
    19. age_chol_effect: Product of age and cholesterol level.
    20. thalach_risk_effect: Product of maximum heart rate and risk score.
    21. age_trestbps_effect: Product of age and resting blood pressure.
    22. chol_risk_ratio: Ratio of cholesterol level to risk score.
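
    Several of the row-level derived features can be reproduced directly from the formulas listed above. This is a sketch under stated assumptions rather than the dataset author's code: `derive_features` is a hypothetical helper, logarithms are assumed to be natural logs, and `age_thalach_ratio` is read as `thalach / (age + 1)`. Group-level features such as avg_chol_by_age_group and symptom_zscore need the whole table and are left out.

```python
import math


def derive_features(row):
    """Compute row-level derived features from the original columns."""
    age, chol, trestbps = row["age"], row["chol"], row["trestbps"]
    thalach, cp = row["thalach"], row["cp"]
    oldpeak, thal = row["oldpeak"], row["thal"]

    # risk_score = age * chol / 1000 + trestbps / 100 (formula from the list)
    risk_score = age * chol / 1000 + trestbps / 100
    # symptom_severity = cp * oldpeak
    symptom_severity = cp * oldpeak

    return {
        "risk_score": risk_score,
        "symptom_severity": symptom_severity,
        "risk_factor": symptom_severity * thal,      # cp * oldpeak * thal
        "log_chol": math.log(chol),                  # natural log assumed
        "log_trestbps": math.log(trestbps),          # natural log assumed
        "age_squared": age ** 2,
        "chol_squared": chol ** 2,
        "age_thalach_ratio": thalach / (age + 1),    # assumed reading
        "chol_trestbps_ratio": chol / trestbps,
        "thalach_chol_diff": thalach - chol,
        "age_chol_effect": age * chol,
        "thalach_risk_effect": thalach * risk_score,
        "age_trestbps_effect": age * trestbps,
        "chol_risk_ratio": chol / risk_score,
    }


# Hypothetical patient row (values invented for illustration).
example_row = {
    "age": 50, "chol": 200, "trestbps": 120,
    "thalach": 150, "cp": 2, "oldpeak": 1.5, "thal": 3,
}
features = derive_features(example_row)
```

    Most of these columns are deterministic functions of a handful of inputs, so they are strongly correlated with one another; that is worth keeping in mind when interpreting feature importances.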

    These additional features facilitate a deeper analysis of cardiovascular health by incorporating various derived metrics and categorizations, enhancing the overall utility of the dataset for predictive modeling and data exploration.

  10. UCI Heart Disease - Explainable AI Project Assets

    • kaggle.com
    zip
    Updated Nov 18, 2025
    Cite
    Ariyan_Pro (2025). UCI Heart Disease - Explainable AI Project Assets [Dataset]. https://www.kaggle.com/datasets/ariyannadeem/uci-heart-disease-explainable-ai-project-assets
    Explore at:
    zip(1051043 bytes)Available download formats
    Dataset updated
    Nov 18, 2025
    Authors
    Ariyan_Pro
    License

    Apache License, v2.0 https://www.apache.org/licenses/LICENSE-2.0
    License information was derived automatically

    Description

    Medical-Grade Explainable AI Project Assets

    This dataset contains comprehensive assets for a production-ready Explainable AI (XAI) heart disease prediction system achieving 94.1% accuracy with full model transparency.

    📊 CONTEXT: Healthcare AI faces a critical "black box" problem where models make predictions without explanations. This project demonstrates how to build trustworthy medical AI using SHAP and LIME for real-time explainability.

    🎯 PROJECT GOAL: Create a clinically deployable AI system that not only predicts heart disease with high accuracy but also provides interpretable explanations for each prediction, enabling doctor-AI collaboration.

    🚀 KEY FEATURES:

    • 94.1% prediction accuracy (XGBoost + Optuna)
    • Real-time SHAP & LIME explanations
    • FastAPI backend with medical validation
    • Gradio clinical dashboard
    • Full MLOps pipeline (MLflow tracking)
    • 4-Layer enterprise architecture

    📁 ASSETS INCLUDED:

    • heart_clean.csv - Clinical dataset ready for analysis
    • SHAP summary plots for global explainability
    • Performance metrics and visualizations
    • Architecture diagrams
    • Model evaluation results

    🔗 COMPANION RESOURCES:

    • Live Demo: https://huggingface.co/spaces/Ariyan-Pro/HeartDisease-Predictor
    • Notebook: https://www.kaggle.com/code/ariyannadeem/heart-disease-prediction-with-explainable-ai
    • Source Code: https://github.com/Ariyan-Pro/ExplainableAI-HeartDisease

    Perfect for learning medical AI implementation, explainable AI techniques, and production deployment.

  11. Data from: Heart Failure Prediction Dataset

    • kaggle.com
    zip
    Updated Sep 10, 2021
    + more versions
    Cite
    fedesoriano (2021). Heart Failure Prediction Dataset [Dataset]. https://www.kaggle.com/datasets/fedesoriano/heart-failure-prediction/code
    Explore at:
    zip(8762 bytes)Available download formats
    Dataset updated
    Sep 10, 2021
    Authors
    fedesoriano
    License

    Open Database License (ODbL) v1.0 https://www.opendatacommons.org/licenses/odbl/1.0/
    License information was derived automatically

    Description

    Similar Datasets

    • Hepatitis C Dataset: LINK
    • Body Fat Prediction Dataset: LINK
    • Cirrhosis Prediction Dataset: LINK
    • Stroke Prediction Dataset: LINK
    • Stellar Classification Dataset - SDSS17: LINK
    • Wind Speed Prediction Dataset: LINK
    • Spanish Wine Quality Dataset: LINK

    Context

    Cardiovascular diseases (CVDs) are the number 1 cause of death globally, taking an estimated 17.9 million lives each year, which accounts for 31% of all deaths worldwide. Four out of five CVD deaths are due to heart attacks and strokes, and one-third of these deaths occur prematurely in people under 70 years of age. Heart failure is a common event caused by CVDs, and this dataset contains 11 features that can be used to predict possible heart disease.

    People with cardiovascular disease or who are at high cardiovascular risk (due to the presence of one or more risk factors such as hypertension, diabetes, hyperlipidaemia or already established disease) need early detection and management wherein a machine learning model can be of great help.

    Attribute Information

    1. Age: age of the patient [years]
    2. Sex: sex of the patient [M: Male, F: Female]
    3. ChestPainType: chest pain type [TA: Typical Angina, ATA: Atypical Angina, NAP: Non-Anginal Pain, ASY: Asymptomatic]
    4. RestingBP: resting blood pressure [mm Hg]
    5. Cholesterol: serum cholesterol [mg/dl]
    6. FastingBS: fasting blood sugar [1: if FastingBS > 120 mg/dl, 0: otherwise]
    7. RestingECG: resting electrocardiogram results [Normal: Normal, ST: having ST-T wave abnormality (T wave inversions and/or ST elevation or depression of > 0.05 mV), LVH: showing probable or definite left ventricular hypertrophy by Estes' criteria]
    8. MaxHR: maximum heart rate achieved [Numeric value between 60 and 202]
    9. ExerciseAngina: exercise-induced angina [Y: Yes, N: No]
    10. Oldpeak: ST depression induced by exercise relative to rest [numeric value]
    11. ST_Slope: the slope of the peak exercise ST segment [Up: upsloping, Flat: flat, Down: downsloping]
    12. HeartDisease: output class [1: heart disease, 0: Normal]
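
    Because several of these attributes are stored as strings, they usually need to be encoded before modeling. The sketch below one-hot encodes the categorical columns using the level codes listed above; `encode_record` and the example row are hypothetical, and one-hot encoding is only one of several reasonable choices.

```python
# Category levels restated from the attribute list above.
CATEGORIES = {
    "ChestPainType": ("TA", "ATA", "NAP", "ASY"),
    "RestingECG": ("Normal", "ST", "LVH"),
    "ST_Slope": ("Up", "Flat", "Down"),
}
NUMERIC = ("Age", "RestingBP", "Cholesterol", "FastingBS", "MaxHR", "Oldpeak")


def encode_record(record):
    """Return numeric features for one row; the label column is left out."""
    encoded = {name: record[name] for name in NUMERIC}
    # Two-level columns become a single 0/1 indicator.
    encoded["Sex_M"] = int(record["Sex"] == "M")
    encoded["ExerciseAngina_Y"] = int(record["ExerciseAngina"] == "Y")
    # Multi-level columns become one 0/1 indicator per level.
    for column, levels in CATEGORIES.items():
        value = record[column]
        if value not in levels:
            raise ValueError(f"unexpected {column} value: {value!r}")
        for level in levels:
            encoded[f"{column}_{level}"] = int(value == level)
    return encoded


# Hypothetical example row (values invented for illustration).
example = {
    "Age": 40, "Sex": "M", "ChestPainType": "ATA", "RestingBP": 140,
    "Cholesterol": 289, "FastingBS": 0, "RestingECG": "Normal",
    "MaxHR": 172, "ExerciseAngina": "N", "Oldpeak": 0.0,
    "ST_Slope": "Up", "HeartDisease": 0,
}
encoded_example = encode_record(example)
```

    Rejecting unknown levels with a `ValueError` surfaces spelling variants early, which matters when rows come from several merged sources.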

    Source

    This dataset was created by combining different datasets already available independently but not combined before. In this dataset, 5 heart datasets are combined over 11 common features which makes it the largest heart disease dataset available so far for research purposes. The five datasets used for its curation are:

    • Cleveland: 303 observations
    • Hungarian: 294 observations
    • Switzerland: 123 observations
    • Long Beach VA: 200 observations
    • Statlog (Heart) Data Set: 270 observations

    Total: 1190 observations
    Duplicated: 272 observations

    Final dataset: 918 observations
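
    The counts above are internally consistent, as a quick check shows. The snippet also includes a small order-preserving de-duplication helper as an illustration; `drop_duplicate_rows` is a hypothetical name, and the listing does not say how duplicates were actually identified.

```python
# Observation counts per source, as listed above.
SOURCE_COUNTS = {
    "Cleveland": 303,
    "Hungarian": 294,
    "Switzerland": 123,
    "Long Beach VA": 200,
    "Statlog (Heart) Data Set": 270,
}
DUPLICATED = 272

total = sum(SOURCE_COUNTS.values())   # 1190
final = total - DUPLICATED            # 918


def drop_duplicate_rows(rows):
    """Keep the first occurrence of each row (rows must be hashable)."""
    seen = set()
    unique = []
    for row in rows:
        if row not in seen:
            seen.add(row)
            unique.append(row)
    return unique
```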

    Every dataset used can be found under the Index of heart disease datasets from UCI Machine Learning Repository on the following link: https://archive.ics.uci.edu/ml/machine-learning-databases/heart-disease/

    Citation

    fedesoriano. (September 2021). Heart Failure Prediction Dataset. Retrieved [Date Retrieved] from https://www.kaggle.com/fedesoriano/heart-failure-prediction.

    Acknowledgements

    Creators:

    1. Hungarian Institute of Cardiology. Budapest: Andras Janosi, M.D.
    2. University Hospital, Zurich, Switzerland: William Steinbrunn, M.D.
    3. University Hospital, Basel, Switzerland: Matthias Pfisterer, M.D.
    4. V.A. Medical Center, Long Beach and Cleveland Clinic Foundation: Robert Detrano, M.D., Ph.D.

    Donor: David W. Aha (aha '@' ics.uci.edu) (714) 856-8779

  12. Heart Disease Dataset

    • kaggle.com
    zip
    Updated Feb 16, 2023
    Cite
    George Williams77555 (2023). Heart Disease Dataset [Dataset]. https://www.kaggle.com/datasets/georgewilliams77555/heart-disease-dataset
    Explore at:
    zip(18005 bytes)Available download formats
    Dataset updated
    Feb 16, 2023
    Authors
    George Williams77555
    License

    http://opendatacommons.org/licenses/dbcl/1.0/

    Description

    This data set dates from 1988 and consists of four databases: Cleveland, Hungary, Switzerland, and Long Beach V. It contains 9 attributes and is a shorter version of the original model. The "target" field refers to the presence of heart disease in the patient. It is integer valued 0 = no disease and 1 = disease. Source of the original data can be found here: https://archive.ics.uci.edu/ml/datasets/heart+Disease

    1. age
    2. sex
    3. chest pain type (4 values)
    4. resting blood pressure
    5. serum cholesterol in mg/dl
    6. fasting blood sugar > 120 mg/dl
    7. heart rate max- maximum heart rate achieved
    8. angina - exercise induced angina 0 no, 1 yes
    9. target - 1 = heart disease, 0 = no heart disease
  13. Heart Disease UCI Dataset

    • kaggle.com
    zip
    Updated Aug 16, 2025
    Cite
    Navjot Kaushal (2025). Heart Disease UCI Dataset [Dataset]. https://www.kaggle.com/datasets/navjotkaushal/heart-disease-uci-dataset
    Explore at:
    zip(9617 bytes)Available download formats
    Dataset updated
    Aug 16, 2025
    Authors
    Navjot Kaushal
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    This dataset contains cleaned clinical, demographic, and physiological attributes collected from patients undergoing medical evaluation for potential heart disease. It is widely used for predictive modeling in healthcare, particularly to identify whether a patient is likely to have heart disease based on diagnostic measurements.

    The target variable (num) indicates the presence or absence of heart disease, making this dataset suitable for binary classification tasks.

    Dataset Structure

    Columns (features):

    1. age → Age of the patient (years)

    2. sex → Gender (Male / Female)

    3. cp → Chest pain type

    • typical angina

    • atypical angina

    • non-anginal pain

    • asymptomatic

    4. trestbps → Resting blood pressure (mm Hg)

    5. chol → Serum cholesterol (mg/dl)

    6. fbs → Fasting blood sugar (True if > 120 mg/dl, else False)

    7. restecg → Resting electrocardiographic results (normal, lv hypertrophy, etc.)

    8. thalch → Maximum heart rate achieved

    9. exang → Exercise induced angina (True = yes, False = no)

    10. oldpeak → ST depression induced by exercise relative to rest (numeric value)

    11. num → Target variable (Heart disease diagnosis)

           0 → No heart disease
           1-4 → Heart disease present (severity levels)
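
    Since num takes values 0-4, using it for binary classification means collapsing the severity levels first. A minimal sketch, assuming any value from 1 to 4 counts as disease present (`binarize_target` is a hypothetical helper):

```python
def binarize_target(num):
    """Map the 0-4 diagnosis code to 0 (no disease) or 1 (disease present)."""
    if num not in (0, 1, 2, 3, 4):
        raise ValueError(f"unexpected num value: {num!r}")
    return int(num > 0)


# Example: severity levels 1-4 all collapse to the positive class.
labels = [binarize_target(n) for n in [0, 1, 2, 3, 4, 0]]
```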
      

    Use Cases

    • Predictive modeling for heart disease classification

    • Exploratory data analysis (EDA) of risk factors

    • Machine learning projects in healthcare analytics

    • Medical research on correlations between risk factors and heart disease

  14. heart-disease-uci

    • kaggle.com
    zip
    Updated Oct 3, 2024
    Cite
    QuangNguyen711 (2024). heart-disease-uci [Dataset]. https://www.kaggle.com/datasets/quangnguyen711/heart-disease-uci
    Explore at:
    zip(142030 bytes)Available download formats
    Dataset updated
    Oct 3, 2024
    Authors
    QuangNguyen711
    License

    MIT License https://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    Dataset

    This dataset was created by QuangNguyen711

    Released under MIT

    Contents

  15. UCI Cleveland Heart Dataset

    • kaggle.com
    zip
    Updated Sep 8, 2022
    Cite
    Nishant Bansal (2022). UCI Cleveland Heart Dataset [Dataset]. https://www.kaggle.com/datasets/nishantbansal01/uci-cleveland-heart-dataset
    Explore at:
    zip(3478 bytes)Available download formats
    Dataset updated
    Sep 8, 2022
    Authors
    Nishant Bansal
    Area covered
    Cleveland
    Description

    Dataset

    This dataset was created by Nishant Bansal

    Contents

  16. Framingham heart study dataset

    • kaggle.com
    zip
    Updated Apr 19, 2022
    Cite
    Ashish Bhardwaj (2022). Framingham heart study dataset [Dataset]. https://www.kaggle.com/datasets/aasheesh200/framingham-heart-study-dataset
    Explore at:
    zip(59440 bytes)Available download formats
    Dataset updated
    Apr 19, 2022
    Authors
    Ashish Bhardwaj
    Area covered
    Framingham
    Description

    The "Framingham" heart disease dataset includes over 4,240 records, 16 columns and 15 attributes. The goal of the dataset is to predict whether the patient has a 10-year risk of future coronary heart disease (CHD).

  17. Heart Disease Prediction Dataset

    • kaggle.com
    zip
    Updated May 26, 2024
    + more versions
    Cite
    Haider Rasool Qadri (2024). Heart Disease Prediction Dataset [Dataset]. https://www.kaggle.com/datasets/haiderrasoolqadri/heart-disease-dataset-uci
    Explore at:
    zip(12672 bytes)Available download formats
    Dataset updated
    May 26, 2024
    Authors
    Haider Rasool Qadri
    License

    Apache License, v2.0 https://www.apache.org/licenses/LICENSE-2.0
    License information was derived automatically

    Description

    Context

    This is a multivariate dataset, meaning it involves a variety of separate mathematical or statistical variables. It is composed of 14 attributes, including age, sex, chest pain type, resting blood pressure, serum cholesterol, fasting blood sugar, resting electrocardiographic results, maximum heart rate achieved, exercise-induced angina, oldpeak (ST depression induced by exercise relative to rest), the slope of the peak exercise ST segment, number of major vessels, and thalassemia. The full database includes 76 attributes, but all published studies use a subset of 14 of them. The Cleveland database is the only one used by ML researchers to date. One major task on this dataset is to predict, from a patient's attributes, whether that person has heart disease. The other is exploratory: to diagnose and draw out insights from the dataset that could help in understanding the problem.

    Content

    Column Descriptions:

    • id: unique id for each patient
    • age: age of the patient in years
    • origin: place of study
    • sex: Male/Female
    • cp: chest pain type
      1. typical angina
      2. atypical angina
      3. non-anginal
      4. asymptomatic
    • trestbps: resting blood pressure (in mm Hg on admission to the hospital)
    • chol: serum cholesterol in mg/dl
    • fbs: if fasting blood sugar > 120 mg/dl
    • restecg: resting electrocardiographic results. Values: [normal, stt abnormality, lv hypertrophy]
    • thalach: maximum heart rate achieved
    • exang: exercise-induced angina (True/False)
    • oldpeak: ST depression induced by exercise relative to rest
    • slope: the slope of the peak exercise ST segment
    • ca: number of major vessels (0-3) colored by fluoroscopy
    • thal: [normal; fixed defect; reversible defect]
    • num: the predicted attribute [0 shows no disease; 1, 2, 3 and 4 show different levels of disease]

    Acknowledgements

    Creators:

    • Hungarian Institute of Cardiology, Budapest: Andras Janosi, M.D.
    • University Hospital, Zurich, Switzerland: William Steinbrunn, M.D.
    • University Hospital, Basel, Switzerland: Matthias Pfisterer, M.D.
    • V.A. Medical Center, Long Beach and Cleveland Clinic Foundation: Robert Detrano, M.D., Ph.D.

    Relevant Papers:

    • Detrano, R., Janosi, A., Steinbrunn, W., Pfisterer, M., Schmid, J., Sandhu, S., Guppy, K., Lee, S., & Froelicher, V. (1989). International application of a new probability algorithm for the diagnosis of coronary artery disease. American Journal of Cardiology, 64, 304-310.
    • David W. Aha & Dennis Kibler. "Instance-based prediction of heart-disease presence with the Cleveland database."
    • Gennari, J.H., Langley, P., & Fisher, D. (1989). Models of incremental concept formation. Artificial Intelligence, 40, 11-61.

    Citation Request:

    The authors of the databases have requested that any publications resulting from the use of the data include the names of the principal investigator responsible for the data collection at each institution.

    They would be:

    • Hungarian Institute of Cardiology, Budapest: Andras Janosi, M.D.
    • University Hospital, Zurich, Switzerland: William Steinbrunn, M.D.
    • University Hospital, Basel, Switzerland: Matthias Pfisterer, M.D.
    • V.A. Medical Center, Long Beach and Cleveland Clinic Foundation: Robert Detrano, M.D., Ph.D.

  18. Integrated Heart Disease Dataset

    • kaggle.com
    zip
    Updated Apr 2, 2019
    Cite
    Rahul Gyawali (2019). Integrated Heart Disease Dataset [Dataset]. https://www.kaggle.com/unikpoet/heartdisease
    Explore at:
    zip(37479 bytes)Available download formats
    Dataset updated
    Apr 2, 2019
    Authors
    Rahul Gyawali
    Description

    Context

    This dataset integrates all the databases in the Heart Disease Dataset available at the UCI Machine Learning Repository. The original contains four databases: Cleveland, Hungarian, Long Beach, and Switzerland. Most work to date has used only the Cleveland database.

    Content

    The original dataset has 76 attributes, and which ones to select depends on one's needs. Here I've taken 10 attributes for the prediction.


  19. Heart Disease Prediction using DifferentTechniques

    • kaggle.com
    zip
    Updated Oct 22, 2021
    Cite
    Jillani SofTech (2021). Heart Disease Prediction using DifferentTechniques [Dataset]. https://www.kaggle.com/datasets/jillanisofttech/heart-disease-prediction-using-differenttechniques
    Explore at:
    zip(3478 bytes)Available download formats
    Dataset updated
    Oct 22, 2021
    Authors
    Jillani SofTech
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Context: Heart disease is the leading cause of death in the developed world, so work is needed to help reduce the risk of heart attack or stroke.

    Content: Use this dataset to predict which patients are most likely to suffer from heart disease in the near future using the features given.

    Acknowledgment: This data comes from the UCI at https://archive.ics.uci.edu/ml/datasets/Heart+Disease.

  20. Heart Disease UCI Dataset

    • kaggle.com
    zip
    Updated May 8, 2024
    Cite
    Saqlain Sheikh (2024). Heart Disease UCI Dataset [Dataset]. https://www.kaggle.com/datasets/saqlainsheikh/heart-disease-uci-dataset/suggestions
    Explore at:
    zip(3478 bytes)Available download formats
    Dataset updated
    May 8, 2024
    Authors
    Saqlain Sheikh
    Description

    Dataset

    This dataset was created by Saqlain Sheikh

    Contents

Cite
Priyanka (2020). Heart Disease Prediction UCI [Dataset]. https://www.kaggle.com/datasets/priyanka841/heart-disease-prediction-uci
Heart Disease Prediction UCI

Machine Learning Classification Project

Explore at:
11 scholarly articles cite this dataset (View in Google Scholar)
zip(3478 bytes)Available download formats
Dataset updated
Apr 16, 2020
Authors
Priyanka
Description

Dataset

This dataset was created by Priyanka

Contents
