83 datasets found
  1. Heart disease uci dataset

    • kaggle.com
    zip
    Updated Aug 23, 2024
    + more versions
    Cite
    Muneer Iqbal24 (2024). Heart disease uci dataset [Dataset]. https://www.kaggle.com/datasets/muneeriqbal24/heart-disease-uci-dataset
    Explore at:
    zip (12672 bytes)
    Dataset updated
    Aug 23, 2024
    Authors
    Muneer Iqbal24
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Dataset

    This dataset was created by Muneer Iqbal24

    Released under CC0: Public Domain

    Contents

  2. Heart Disease Prediction UCI

    • kaggle.com
    zip
    Updated Apr 16, 2020
    Cite
    Priyanka (2020). Heart Disease Prediction UCI [Dataset]. https://www.kaggle.com/datasets/priyanka841/heart-disease-prediction-uci
    Explore at:
    zip (3478 bytes)
    Dataset updated
    Apr 16, 2020
    Authors
    Priyanka
    Description

    Dataset

    This dataset was created by Priyanka

    Contents

  3. Cardiovascular Disease Dataset

    • ieee-dataport.org
    Updated Oct 29, 2025
    Cite
    Rajib Kumar Halder Halder (2025). Cardiovascular Disease Dataset [Dataset]. https://ieee-dataport.org/documents/cardiovascular-disease-dataset
    Explore at:
    Dataset updated
    Oct 29, 2025
    Authors
    Rajib Kumar Halder Halder
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This heart disease dataset is curated by combining 3 popular heart disease datasets. The first dataset (collected from Kaggle) contains 70000 records with 11 independent features, which makes it the largest heart disease dataset available so far for research purposes. These data were collected at the time of medical examination and from information given by the patient. The second and third datasets contain 303 and 293 instances respectively, with 13 common features. The three datasets used for its curation are: Cardio Data (Kaggle Dataset)

  4. Heart Disease Dataset 2.0

    • kaggle.com
    zip
    Updated Jun 17, 2024
    + more versions
    Cite
    Gregory Grems (2024). Heart Disease Dataset 2.0 [Dataset]. https://www.kaggle.com/datasets/gregorygrems/heart-disease-dataset-2002/code
    Explore at:
    zip (5743488 bytes)
    Dataset updated
    Jun 17, 2024
    Authors
    Gregory Grems
    License

    Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
    License information was derived automatically

    Description

    Heart Disease Data combined from UCI repository of following places:

    Cleveland, Hungary, Switzerland, and VA Long Beach

    Features:

    • Age: Age of the individual (20-80).
    • Sex: Gender of the individual, a binary value where 1 stands for male and 0 stands for female.
    • ChestPainType: Type of chest pain experienced by the individual. The values are:
      • Value 1: Typical angina, which is chest pain related to the heart.
      • Value 2: Atypical angina, which is chest pain not related to the heart.
      • Value 3: Non-anginal pain, which is typically sharp and non-continuous.
      • Value 4: Asymptomatic, meaning the individual experiences no symptoms.
    • RestingBP: The individual's resting blood pressure (in mm Hg).
    • Cholesterol: The individual's cholesterol level, measured in mg/dl.
    • FastingBS: Whether the individual's fasting blood sugar is greater than 120 mg/dl, a binary value where 1 stands for true and 0 stands for false.
    • MaxHR: The maximum heart rate achieved by the individual.
    • ExerciseAngina: Whether the individual experiences angina (chest pain) induced by exercise, a binary value where 1 stands for yes and 0 stands for no.

  5. Heart Disease Data Set

    • figshare.com
    • kaggle.com
    txt
    Updated Jun 2, 2023
    Cite
    Xinyue Zhang (2023). Heart Disease Data Set [Dataset]. http://doi.org/10.6084/m9.figshare.19322552.v1
    Explore at:
    txt
    Dataset updated
    Jun 2, 2023
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Xinyue Zhang
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Adaptation of http://archive.ics.uci.edu/ml/datasets/Heart+Disease

    Ready for usage with ehrapy

  6. heart

    • huggingface.co
    Updated Apr 6, 2023
    Cite
    Mattia (2023). heart [Dataset]. https://huggingface.co/datasets/mstz/heart
    Explore at:
    Croissant (a format for machine-learning datasets; see mlcommons.org/croissant)
    Dataset updated
    Apr 6, 2023
    Authors
    Mattia
    License

    https://choosealicense.com/licenses/cc/

    Description

    Heart

    The Heart dataset from the UCI ML repository. Does the patient have heart disease?

      Configurations and tasks
    

    Configuration Task

    hungary Binary classification

      Usage
    

    from datasets import load_dataset

    dataset = load_dataset("mstz/heart", "hungary")["train"]

  7. Heart Disease UCI Dataset

    • kaggle.com
    zip
    Updated Aug 16, 2025
    Cite
    Navjot Kaushal (2025). Heart Disease UCI Dataset [Dataset]. https://www.kaggle.com/datasets/navjotkaushal/heart-disease-uci-dataset
    Explore at:
    zip (9617 bytes)
    Dataset updated
    Aug 16, 2025
    Authors
    Navjot Kaushal
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    This dataset contains cleaned clinical, demographic, and physiological attributes collected from patients undergoing medical evaluation for potential heart disease. It is widely used for predictive modeling in healthcare, particularly to identify whether a patient is likely to have heart disease based on diagnostic measurements.

    The target variable (num) indicates the presence or absence of heart disease, making this dataset suitable for binary classification tasks.

    Dataset Structure

    Columns (features):

    1. age → Age of the patient (years)

    2. sex → Gender (Male / Female)

    3. cp → Chest pain type

    • typical angina

    • atypical angina

    • non-anginal pain

    • asymptomatic

    4. trestbps → Resting blood pressure (mm Hg)

    5. chol → Serum cholesterol (mg/dl)

    6. fbs → Fasting blood sugar (True if > 120 mg/dl, else False)

    7. restecg → Resting electrocardiographic results (normal, lv hypertrophy, etc.)

    8. thalch → Maximum heart rate achieved

    9. exang → Exercise induced angina (True = yes, False = no)

    10. oldpeak → ST depression induced by exercise relative to rest (numeric value)

    11. num → Target variable (Heart disease diagnosis)

           0 → No heart disease
           1-4 → Heart disease present (severity levels)
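
The `num` column described above takes values 0 to 4, so a binary classification task needs it collapsed to two classes first. The following is a minimal sketch in plain Python, assuming the rows are already loaded as dictionaries keyed by the column names listed above. The helper name `binarize_num` and the sample rows are illustrative and not part of the dataset.

```python
def binarize_num(num):
    """Map the 0-4 diagnosis code to 0 (no disease) or 1 (disease present)."""
    num = int(num)
    if not 0 <= num <= 4:
        raise ValueError(f"unexpected num value: {num}")
    return 1 if num > 0 else 0


# Hypothetical rows; only the columns needed for the example are shown.
rows = [
    {"age": 63, "num": 0},
    {"age": 67, "num": 2},
    {"age": 41, "num": 4},
]
labels = [binarize_num(r["num"]) for r in rows]
print(labels)
```

Collapsing the severity levels discards information, so a multi-class setup may be preferable when severity itself is of interest.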
      

    Use Cases

    • Predictive modeling for heart disease classification

    • Exploratory data analysis (EDA) of risk factors

    • Machine learning projects in healthcare analytics

    • Medical research on correlations between risk factors and heart disease

  8. heart-disease-dataset

    • huggingface.co
    Updated Mar 25, 2025
    Cite
    Nezahat Korkmaz (2025). heart-disease-dataset [Dataset]. https://huggingface.co/datasets/nezahatkorkmaz/heart-disease-dataset
    Explore at:
    Dataset updated
    Mar 25, 2025
    Authors
    Nezahat Korkmaz
    License

    MIT License: https://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    ❤️ Heart Disease Dataset (Enhanced with Feature Engineering)

      📌 Overview
    

    This dataset is an enhanced version of the classic UCI Heart Disease dataset, enriched with extensive feature engineering to support advanced data analysis and machine learning applications. In addition to the original clinical features, several derived variables have been introduced to provide deeper insights into cardiovascular risk patterns. These engineered features allow for improved predictive… See the full description on the dataset page: https://huggingface.co/datasets/nezahatkorkmaz/heart-disease-dataset.

  9. Replication Data for: Cleveland Heart Disease

    • dataverse.harvard.edu
    • search.dataone.org
    Updated Apr 6, 2016
    Cite
    Christopher Bartley (2016). Replication Data for: Cleveland Heart Disease [Dataset]. http://doi.org/10.7910/DVN/QWXVNT
    Explore at:
    Croissant (a format for machine-learning datasets; see mlcommons.org/croissant)
    Dataset updated
    Apr 6, 2016
    Dataset provided by
    Harvard Dataverse
    Authors
    Christopher Bartley
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Area covered
    Cleveland
    Description

    Original data from: https://archive.ics.uci.edu/ml/datasets/Heart+Disease

    Changes made:
    - four rows with missing values were removed, leaving 299 records
    - Chest Pain Type, Restecg, Thal variables were converted to indicator variables
    - class attribute binarised to -1 (no disease) / +1 (disease; original values 1, 2, 3)

    Attributes:
    Col 0: CLASS: -1: no disease, +1: disease
    Col 1: Age (cts)
    Col 2: Sex (0/1)
    Col 3: indicator (0/1) for typ angina
    Col 4: indicator for atyp angina
    Col 5: indicator for non-ang pain
    Col 6: resting blood pressure (cts)
    Col 7: Serum cholest (cts)
    Col 8: fasting blood sugar >120mg/dl (0/1)
    Col 9: indicator for electrocardio value 1
    Col 10: indicator for electrocardio value 2
    Col 11: Max heart rate (cts)
    Col 12: exercised induced angina (0/1)
    Col 13: ST depression induced by exercise (cts)
    Col 14: indicator for slope of peak exercise up
    Col 15: indicator for slope of peak exercise down
    Col 16: no major vessels colored by fluro (ctsish: 0,1,2,3)
    Col 17: Thal reversible defect indicator
    Col 18: Thal fixed defect indicator
    Col 19: Class 0-4, where 0 is disease not present, 1-4 is present
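
The two transformations described above (categorical variables turned into 0/1 indicators, and the class binarised to -1 / +1) can be sketched as follows. This is an illustration only, not the author's script: the function names and the sample record are hypothetical, and the chest-pain coding (1 = typical angina, 2 = atypical angina, 3 = non-anginal pain, 4 = asymptomatic as the baseline with no column) is an assumption based on the original UCI coding.

```python
def to_indicators(value, categories):
    """Return one 0/1 indicator per listed category (baseline category omitted)."""
    return [1 if value == c else 0 for c in categories]


def binarise_class(num):
    """Map the original 0-4 class to -1 (no disease) or +1 (disease)."""
    return -1 if int(num) == 0 else 1


# Hypothetical record using the assumed 1-4 chest-pain coding.
record = {"cp": 2, "num": 3}
cp_indicators = to_indicators(record["cp"], categories=[1, 2, 3])
label = binarise_class(record["num"])
print(cp_indicators, label)
```

Leaving one category without its own column avoids a redundant indicator, which matches the three chest-pain columns (Col 3 to Col 5) listed for a four-valued variable.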

  10. UCI Cleveland Heart Dataset

    • kaggle.com
    zip
    Updated Sep 8, 2022
    Cite
    Nishant Bansal (2022). UCI Cleveland Heart Dataset [Dataset]. https://www.kaggle.com/datasets/nishantbansal01/uci-cleveland-heart-dataset
    Explore at:
    zip (3478 bytes)
    Dataset updated
    Sep 8, 2022
    Authors
    Nishant Bansal
    Area covered
    Cleveland
    Description

    Dataset

    This dataset was created by Nishant Bansal

    Contents

  11. Heart Disease Dataset

    • kaggle.com
    zip
    Updated Jul 25, 2024
    Cite
    Nezahat Korkmaz (2024). Heart Disease Dataset [Dataset]. https://www.kaggle.com/datasets/nezahatkk/heart-disease-data
    Explore at:
    zip (29490 bytes)
    Dataset updated
    Jul 25, 2024
    Authors
    Nezahat Korkmaz
    License

    Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
    License information was derived automatically

    Description

    Heart Disease Data: Enhanced with Feature Engineering for Advanced Analysis

    This dataset is an advanced version of the classic UCI Machine Learning heart disease dataset, enriched with feature engineering to support more sophisticated analyses. The original features have been supplemented with newly derived attributes that help to better understand and model cardiovascular risk factors.

    Dataset Description:

    1. age: Age of the patient (int).
    2. sex: Gender of the patient; 1 for male, 0 for female (int).
    3. cp: Chest pain type (int); 1: Typical angina, 2: Atypical angina, 3: Non-anginal pain, 4: Asymptomatic.
    4. trestbps: Resting blood pressure (mm Hg) (int).
    5. chol: Serum cholesterol level (mg/dl) (int).
    6. fbs: Fasting blood sugar > 120 mg/dl; 1 if true, 0 if false (int).
    7. restecg: Resting electrocardiographic results (int); 0: Normal, 1: ST-T wave abnormality, 2: Showing probable or definite left ventricular hypertrophy.
    8. thalach: Maximum heart rate achieved (int).
    9. exang: Exercise induced angina; 1 if yes, 0 if no (int).
    10. oldpeak: ST depression induced by exercise relative to rest (float).
    11. slope: Slope of the peak exercise ST segment (int); 1: Upsloping, 2: Flat, 3: Downsloping.
    12. ca: Number of major vessels colored by fluoroscopy (float).
    13. thal: Thalassemia; 3: Normal, 6: Fixed defect, 7: Reversible defect (float).
    14. num: Heart disease diagnosis (int); 0: No disease, 1-4: Disease present with increasing severity.

    Derived Features:

    1. age_group: Age category; '30s', '40s', '50s', '60s'.
    2. cholesterol_level: Cholesterol level category; 'low', 'normal', 'high'.
    3. bp_level: Blood pressure category; 'low', 'normal', 'high'.
    4. risk_score: Calculated risk score using the formula: age × chol / 1000 + trestbps / 100.
    5. symptom_severity: Severity of symptoms; calculated as cp × oldpeak.
    6. log_chol: Logarithm of cholesterol level.
    7. log_trestbps: Logarithm of resting blood pressure.
    8. age_squared: Square of age.
    9. chol_squared: Square of cholesterol level.
    10. age_thalach_ratio: Ratio of maximum heart rate to age (plus 1 to avoid division by zero).
    11. risk_factor: Risk factor calculated as cp × oldpeak × thal.
    12. missing_values: Count of missing values in 'ca' and 'thal'.
    13. chol_trestbps_ratio: Ratio of cholesterol level to resting blood pressure.
    14. log_thalach_chol: Logarithm of the product of maximum heart rate and cholesterol level.
    15. symptom_zscore: Z-score of symptom severity.
    16. avg_chol_by_age_group: Average cholesterol level for the age group.
    17. thalach_chol_diff: Difference between maximum heart rate and cholesterol level.
    18. symptom_severity_diff: Difference in symptom severity compared to the average for the age group.
    19. age_chol_effect: Product of age and cholesterol level.
    20. thalach_risk_effect: Product of maximum heart rate and risk score.
    21. age_trestbps_effect: Product of age and resting blood pressure.
    22. chol_risk_ratio: Ratio of cholesterol level to risk score.

    These additional features facilitate a deeper analysis of cardiovascular health by incorporating various derived metrics and categorizations, enhancing the overall utility of the dataset for predictive modeling and data exploration.
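
Several of the derived features listed above are simple arithmetic on the original columns. The sketch below computes a handful of them for a single record using only the standard library. It is an illustration, not the dataset author's code: the function name and the sample patient are hypothetical, the formulas are read literally with standard operator precedence, and the natural logarithm is an assumption because the description does not state the base.

```python
import math


def derive_features(row):
    """Compute a few of the derived columns from the formulas listed above."""
    risk_score = row["age"] * row["chol"] / 1000 + row["trestbps"] / 100
    symptom_severity = row["cp"] * row["oldpeak"]
    return {
        "risk_score": risk_score,
        "symptom_severity": symptom_severity,
        "risk_factor": symptom_severity * row["thal"],
        # Natural log is an assumption; the description does not state the base.
        "log_chol": math.log(row["chol"]),
        "age_squared": row["age"] ** 2,
        "chol_trestbps_ratio": row["chol"] / row["trestbps"],
        "age_chol_effect": row["age"] * row["chol"],
        "thalach_risk_effect": row["thalach"] * risk_score,
    }


# Hypothetical patient record using the original column names.
patient = {"age": 50, "chol": 200, "trestbps": 120, "cp": 2,
           "oldpeak": 1.5, "thal": 3.0, "thalach": 150}
features = derive_features(patient)
print(features)
```

Features such as symptom_zscore or avg_chol_by_age_group depend on statistics over the whole table, so they cannot be computed one record at a time like this.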

  12. UCI Heart Disease

    • opendatalab.com
    zip
    Updated Feb 4, 2024
    Cite
    Hungarian Institute of Cardiology (2024). UCI Heart Disease [Dataset]. https://opendatalab.com/OpenDataLab/UCI_Heart_Disease
    Explore at:
    zip
    Dataset updated
    Feb 4, 2024
    Dataset provided by
    Hungarian Institute of Cardiology
    University Hospital, Zurich
    License

    https://archive.ics.uci.edu/ml/datasets/heart+Disease

    Description

    The UCI Heart Disease Dataset contains a total of 76 attributes, but all published experiments refer to a subset of 14 of them. The Cleveland database is the only one ML researchers have used. The "goal" field refers to whether a patient has heart disease or not, and the experiments on the Cleveland database focused on distinguishing presence (values 1, 2, 3, 4) from absence (value 0).

  13. Heart Disease UCI

    • kaggle.com
    zip
    Updated Oct 6, 2024
    Cite
    Ronak Kantariya (2024). Heart Disease UCI [Dataset]. https://www.kaggle.com/datasets/ronakkantariya/heart-disease-uci
    Explore at:
    zip (12672 bytes)
    Dataset updated
    Oct 6, 2024
    Authors
    Ronak Kantariya
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Dataset

    This dataset was created by Ronak Kantariya

    Released under CC0: Public Domain

    Contents

  14. Heart Disease Dataset (Comprehensive)

    • ieee-dataport.org
    Updated Jan 1, 2025
    + more versions
    Cite
    MANU SIDDHARTHA (2025). Heart Disease Dataset (Comprehensive) [Dataset]. https://ieee-dataport.org/open-access/heart-disease-dataset-comprehensive
    Explore at:
    Dataset updated
    Jan 1, 2025
    Authors
    MANU SIDDHARTHA
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This heart disease dataset is curated by combining 5 popular heart disease datasets already available independently but not combined before. In this dataset

  15. arrhythmia

    • openml.org
    Updated Apr 6, 2014
    Cite
    H. Altay Guvenir; Burak Acar; Haldun Muderrisoglu (2014). arrhythmia [Dataset]. https://www.openml.org/d/5
    Explore at:
    Croissant (a format for machine-learning datasets; see mlcommons.org/croissant)
    Dataset updated
    Apr 6, 2014
    Authors
    H. Altay Guvenir; Burak Acar; Haldun Muderrisoglu
    Description

    Author: H. Altay Guvenir, Burak Acar, Haldun Muderrisoglu
    Source: UCI
    Please cite: UCI

    Cardiac Arrhythmia Database
    The aim is to determine the type of arrhythmia from the ECG recordings. This database contains 279 attributes, 206 of which are linear valued and the rest are nominal.

    Concerning the study of H. Altay Guvenir: "The aim is to distinguish between the presence and absence of cardiac arrhythmia and to classify it in one of the 16 groups. Class 01 refers to 'normal' ECG classes, 02 to 15 refers to different classes of arrhythmia and class 16 refers to the rest of unclassified ones. For the time being, there exists a computer program that makes such a classification. However, there are differences between the cardiologist's and the program's classification. Taking the cardiologist's as a gold standard we aim to minimize this difference by means of machine learning tools.

    The names and id numbers of the patients were recently removed from the database.

    Attribute Information

      1 Age: Age in years , linear
      2 Sex: Sex (0 = male; 1 = female) , nominal
      3 Height: Height in centimeters , linear
      4 Weight: Weight in kilograms , linear
      5 QRS duration: Average of QRS duration in msec., linear
      6 P-R interval: Average duration between onset of P and Q waves
       in msec., linear
      7 Q-T interval: Average duration between onset of Q and offset
       of T waves in msec., linear
      8 T interval: Average duration of T wave in msec., linear
      9 P interval: Average duration of P wave in msec., linear
     Vector angles in degrees on front plane of:, linear
     10 QRS
     11 T
     12 P
     13 QRST
     14 J
     15 Heart rate: Number of heart beats per minute ,linear
     Of channel DI:
      Average width, in msec., of: linear
      16 Q wave
      17 R wave
      18 S wave
      19 R' wave, small peak just after R
      20 S' wave
      21 Number of intrinsic deflections, linear
      22 Existence of ragged R wave, nominal
      23 Existence of diphasic derivation of R wave, nominal
      24 Existence of ragged P wave, nominal
      25 Existence of diphasic derivation of P wave, nominal
      26 Existence of ragged T wave, nominal
      27 Existence of diphasic derivation of T wave, nominal
     Of channel DII: 
      28 .. 39 (similar to 16 .. 27 of channel DI)
     Of channels DIII:
      40 .. 51
     Of channel AVR:
      52 .. 63
     Of channel AVL:
      64 .. 75
     Of channel AVF:
      76 .. 87
     Of channel V1:
      88 .. 99
     Of channel V2:
      100 .. 111
     Of channel V3:
      112 .. 123
     Of channel V4:
      124 .. 135
     Of channel V5:
      136 .. 147
     Of channel V6:
      148 .. 159
     Of channel DI:
      Amplitude, * 0.1 millivolt, of
      160 JJ wave, linear
      161 Q wave, linear
      162 R wave, linear
      163 S wave, linear
      164 R' wave, linear
      165 S' wave, linear
      166 P wave, linear
      167 T wave, linear
      168 QRSA , Sum of areas of all segments divided by 10,
        ( Area= width * height / 2 ), linear
      169 QRSTA = QRSA + 0.5 * width of T wave * 0.1 * height of T
        wave. (If T is diphasic then the bigger segment is
        considered), linear
     Of channel DII:
      170 .. 179
     Of channel DIII:
      180 .. 189
     Of channel AVR:
      190 .. 199
     Of channel AVL:
      200 .. 209
     Of channel AVF:
      210 .. 219
     Of channel V1:
      220 .. 229
     Of channel V2:
      230 .. 239
     Of channel V3:
      240 .. 249
     Of channel V4:
      250 .. 259
     Of channel V5:
      260 .. 269
     Of channel V6:
      270 .. 279
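
The attribute list above follows a regular layout: 15 patient-level attributes, then 12 width and flag attributes per channel (16 to 159), then 10 amplitude attributes per channel (160 to 279). A small lookup like the one below can translate an attribute number into its channel and position. The function name, group labels, and tuple format are illustrative choices, not part of the dataset.

```python
CHANNELS = ["DI", "DII", "DIII", "AVR", "AVL", "AVF",
            "V1", "V2", "V3", "V4", "V5", "V6"]


def locate_attribute(index):
    """Return (group, channel, offset) for a 1-based attribute index.

    Attributes 1-15 are patient-level; 16-159 are per-channel wave widths,
    deflection counts and existence flags (12 per channel); 160-279 are
    per-channel amplitudes and areas (10 per channel).
    """
    if 1 <= index <= 15:
        return ("general", None, index)
    if 16 <= index <= 159:
        channel, offset = divmod(index - 16, 12)
        return ("widths_and_flags", CHANNELS[channel], offset)
    if 160 <= index <= 279:
        channel, offset = divmod(index - 160, 10)
        return ("amplitudes", CHANNELS[channel], offset)
    raise ValueError(f"attribute index out of range: {index}")


print(locate_attribute(28))   # first width attribute of channel DII
print(locate_attribute(279))  # last amplitude attribute of channel V6
```

The offset is zero-based within the channel block, so offset 0 in the first group corresponds to the Q wave width and offset 0 in the second group to the JJ wave amplitude.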
    

    Class code - class - number of instances:

      01       Normal        245
      02       Ischemic changes (Coronary Artery Disease)  44
      03       Old Anterior Myocardial Infarction      15
      04       Old Inferior Myocardial Infarction      15
      05       Sinus tachycardy    13
      06       Sinus bradycardy    25
      07       Ventricular Premature Contraction (PVC)    3
      08       Supraventricular Premature Contraction    2
      09       Left bundle branch block     9 
      10       Right bundle branch block    50
      11       1. degree AtrioVentricular block    0 
      12       2. degree AV block        0
      13       3. degree AV block        0
      14       Left ventricule hypertrophy        4
      15       Atrial Fibrillation or Flutter        5
      16       Others         22
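
The class table above is heavily imbalanced, which matters for the presence versus absence task mentioned in the description. The short sketch below tallies the counts copied from the table. Treating every class other than 01, including class 16 ("Others"), as arrhythmia is an assumption made here for illustration.

```python
# Instance counts per class code, copied from the table above.
CLASS_COUNTS = {
    1: 245, 2: 44, 3: 15, 4: 15, 5: 13, 6: 25, 7: 3, 8: 2,
    9: 9, 10: 50, 11: 0, 12: 0, 13: 0, 14: 4, 15: 5, 16: 22,
}

total = sum(CLASS_COUNTS.values())
normal = CLASS_COUNTS[1]
arrhythmia = total - normal  # every class other than 01 (assumption)
empty = [code for code, n in CLASS_COUNTS.items() if n == 0]

print(f"total={total}, normal={normal}, arrhythmia={arrhythmia}")
print(f"share of normal: {normal / total:.1%}")
print(f"classes with no instances: {empty}")
```

Classes 11 to 13 have no instances at all, and several others have fewer than ten, so a 16-way classifier would need stratified evaluation or class merging.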
    
  16. Heart Disease Risk Prediction Dataset

    • kaggle.com
    zip
    Updated Feb 7, 2025
    Cite
    Mahatir Ahmed Tusher (2025). Heart Disease Risk Prediction Dataset [Dataset]. https://www.kaggle.com/datasets/mahatiratusher/heart-disease-risk-prediction-dataset
    Explore at:
    zip (1448235 bytes)
    Dataset updated
    Feb 7, 2025
    Authors
    Mahatir Ahmed Tusher
    License

    Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
    License information was derived automatically

    Description

    Heart Disease Risk Prediction Dataset

    Overview

    This synthetic dataset is designed to predict the risk of heart disease based on a combination of symptoms, lifestyle factors, and medical history. Each row in the dataset represents a patient, with binary (Yes/No) indicators for symptoms and risk factors, along with a computed risk label indicating whether the patient is at high or low risk of developing heart disease.

    The dataset contains 70,000 samples, making it suitable for training machine learning models for classification tasks. The goal is to provide researchers, data scientists, and healthcare professionals with a clean and structured dataset to explore predictive modeling for cardiovascular health.

    This dataset is a side project of EarlyMed, developed by students of Vellore Institute of Technology (VIT-AP). EarlyMed aims to leverage data science and machine learning for early detection and prevention of chronic diseases.

    Dataset Features

    Input Features

    Symptoms (Binary - Yes/No)

    1. Chest Pain (chest_pain): Presence of chest pain, a common symptom of heart disease.
    2. Shortness of Breath (shortness_of_breath): Difficulty breathing, often associated with heart conditions.
    3. Unexplained Fatigue (fatigue): Persistent tiredness without an obvious cause.
    4. Palpitations (palpitations): Irregular or rapid heartbeat.
    5. Dizziness/Fainting (dizziness): Episodes of lightheadedness or fainting.
    6. Swelling in Legs/Ankles (swelling): Swelling due to fluid retention, often linked to heart failure.
    7. Pain in Arm/Jaw/Neck/Back (radiating_pain): Radiating pain, a hallmark of angina or heart attacks.
    8. Cold Sweats & Nausea (cold_sweats): Symptoms commonly associated with acute cardiac events.

    Risk Factors (Binary - Yes/No or Continuous)

    1. Age (age): Patient's age in years (continuous variable).
    2. High Blood Pressure (hypertension): History of hypertension (Yes/No).
    3. High Cholesterol (cholesterol_high): Elevated cholesterol levels (Yes/No).
    4. Diabetes (diabetes): Diagnosis of diabetes (Yes/No).
    5. Smoking History (smoker): Whether the patient is a smoker (Yes/No).
    6. Obesity (obesity): Obesity status (Yes/No).
    7. Family History of Heart Disease (family_history): Family history of cardiovascular conditions (Yes/No).

    Output Label

    • Heart Disease Risk (risk_label): Binary label indicating the risk of heart disease:
      • 0: Low risk
      • 1: High risk

    Data Generation Process

    This dataset was synthetically generated using Python libraries such as numpy and pandas. The generation process ensured a balanced distribution of high-risk and low-risk cases while maintaining realistic correlations between features. For example:

    • Patients with multiple risk factors (e.g., smoking, hypertension, and diabetes) were more likely to be labeled as high risk.
    • Symptom patterns were modeled after clinical guidelines and research studies on heart disease.
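
To make the idea of labels that become more likely as risk factors accumulate concrete, here is a rough sketch using only the standard library `random` module. It is not the authors' generator: the probabilities, the age range, and the function name are all made-up assumptions, symptoms are omitted, and no class balancing is performed.

```python
import random

RISK_FACTORS = ["hypertension", "cholesterol_high", "diabetes",
                "smoker", "obesity", "family_history"]


def make_patient(rng):
    """Draw one synthetic patient; all probabilities are illustrative only."""
    patient = {name: int(rng.random() < 0.3) for name in RISK_FACTORS}
    patient["age"] = rng.randint(20, 84)  # assumed age range
    # More risk factors -> higher chance of the high-risk label.
    n_factors = sum(patient[name] for name in RISK_FACTORS)
    p_high = min(0.1 + 0.2 * n_factors, 0.95)
    patient["risk_label"] = int(rng.random() < p_high)
    return patient


rng = random.Random(0)  # fixed seed so the sketch is reproducible
patients = [make_patient(rng) for _ in range(1000)]
print(patients[0])
```

A generator along these lines bakes its own rules into the labels, so a model trained on the output mostly recovers those rules rather than learning anything clinical.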

    Sources of Inspiration

    The design of this dataset was inspired by the following resources:

    Books

    • "Harrison's Principles of Internal Medicine" by J. Larry Jameson et al.: A comprehensive resource on cardiovascular diseases and their symptoms.
    • "Mayo Clinic Cardiology" by Joseph G. Murphy et al.: Provides insights into heart disease risk factors and diagnostic criteria.

    Research Papers

    • Framingham Heart Study: A landmark study identifying key risk factors for cardiovascular disease.
    • American Heart Association (AHA) Guidelines: Recommendations for diagnosing and managing heart disease.

    Existing Datasets

    • UCI Heart Disease Dataset: A widely used dataset for heart disease prediction.
    • Kaggle’s Heart Disease datasets: Various datasets contributed by the community.

    Clinical Guidelines

    • Centers for Disease Control and Prevention (CDC): Information on heart disease symptoms and risk factors.
    • World Health Organization (WHO): Global statistics and risk factor analysis for cardiovascular diseases.

    Applications

    This dataset can be used for a variety of purposes:

    1. Machine Learning Research:

      • Train classification models (e.g., Logistic Regression, Random Forest, XGBoost) to predict heart disease risk.
      • Experiment with feature engineering, model tuning, and evaluation metrics like Accuracy, Precision, Recall, and ROC-AUC.
    2. Healthcare Analytics:

      • Identify key risk factors contributing to heart disease.
      • Develop decision support systems for early detection of cardiovascular risks.
    3. Educational Purposes:

      • Teach students and practitioners about predictive modeling in healthcare.
      • Demonstrate the importance of feature selection...
  17. processed.cleveland.data.csv

    • figshare.com
    txt
    Updated Aug 1, 2022
    Cite
    Ramkumar R P; Sanjeeva Polepaka; Karuna G; Ch Mallikarjuna Rao (2022). processed.cleveland.data.csv [Dataset]. http://doi.org/10.6084/m9.figshare.20410665.v1
    Explore at:
    txt
    Dataset updated
    Aug 1, 2022
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Ramkumar R P; Sanjeeva Polepaka; Karuna G; Ch Mallikarjuna Rao
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Cleveland
    Description

    Heart Disease Dataset from UCI Repository

  18. Heart Disease UCI

    • kaggle.com
    zip
    Updated Aug 13, 2022
    Cite
    Matt Hartman (2022). Heart Disease UCI [Dataset]. https://www.kaggle.com/datasets/hartman/heart-disease-uci
    Explore at:
    zip (3494 bytes)
    Dataset updated
    Aug 13, 2022
    Authors
    Matt Hartman
    Description
    1. age - age in years
    2. sex (1 = male; 0 = female)
    3. cp - chest pain type
      • 0: Typical angina: chest pain related to decreased blood supply to the heart
      • 1: Atypical angina: chest pain not related to heart
      • 2: Non-anginal pain: typically esophageal spasms (non heart related)
      • 3: Asymptomatic: chest pain not showing signs of disease
    4. trestbps - resting blood pressure (in mm Hg on admission to the hospital); anything above 130-140 is typically cause for concern
    5. chol - serum cholesterol in mg/dl
      • serum = LDL + HDL + 0.2 * triglycerides
      • above 200 is cause for concern
    6. fbs - (fasting blood sugar > 120 mg/dl) (1 = true; 0 = false)
      • '>126' mg/dL signals diabetes
    7. restecg - resting electrocardiographic results
      • 0: Nothing to note
      • 1: ST-T Wave abnormality
        • can range from mild symptoms to severe problems
        • signals non-normal heart beat
      • 2: Possible or definite left ventricular hypertrophy
        • Enlarged heart's main pumping chamber
    8. thalach - maximum heart rate achieved
    9. exang - exercise induced angina (1 = yes; 0 = no)
    10. oldpeak - ST depression induced by exercise relative to rest; looks at stress of the heart during exercise (an unhealthy heart will stress more)
    11. slope - the slope of the peak exercise ST segment
      • 0: Upsloping: better heart rate with exercise (uncommon)
      • 1: Flatsloping: minimal change (typically healthy heart)
      • 2: Downsloping: signs of unhealthy heart
    12. ca - number of major vessels (0-3) colored by fluoroscopy
      • colored vessel means the doctor can see the blood passing through
      • the more blood movement the better (no clots)
    13. thal - thallium stress result 3 = normal; 6 = fixed defect; 7 = reversible defect
      • 1,3: normal
      • 6: fixed defect: used to be defect but ok now
      • 7: reversible defect: no proper blood movement when exercising
    14. target - have disease or not (1=yes, 0=no) (=the predicted attribute)
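
The description above quotes a formula for serum cholesterol and a few rule-of-thumb thresholds. The sketch below applies them to made-up values. The function names and inputs are hypothetical, the lower bound of the "130-140" range is used as the blood pressure cutoff by assumption, and none of this is clinical guidance.

```python
def serum_cholesterol(ldl, hdl, triglycerides):
    """Total serum cholesterol in mg/dl, per the formula quoted above."""
    return ldl + hdl + 0.2 * triglycerides


def flags(trestbps, chol, fbs):
    """Apply the rule-of-thumb thresholds from the description."""
    return {
        "bp_concern": trestbps > 130,   # "above 130-140"; lower bound assumed
        "chol_concern": chol > 200,     # "above 200 is cause for concern"
        "fbs_high": fbs == 1,           # fbs encodes fasting blood sugar > 120 mg/dl
    }


chol = serum_cholesterol(ldl=130, hdl=50, triglycerides=150)
result = flags(trestbps=145, chol=chol, fbs=0)
print(chol, result)
```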
  19. Heart Disease Dataset

    • kaggle.com
    zip
    Updated Feb 16, 2023
    Cite
    George Williams77555 (2023). Heart Disease Dataset [Dataset]. https://www.kaggle.com/datasets/georgewilliams77555/heart-disease-dataset
    Explore at:
    zip (18005 bytes)
    Dataset updated
    Feb 16, 2023
    Authors
    George Williams77555
    License

    http://opendatacommons.org/licenses/dbcl/1.0/

    Description

    This data set dates from 1988 and consists of four databases: Cleveland, Hungary, Switzerland, and Long Beach V. It contains 9 attributes and is a shorter version of the original model. The "target" field refers to the presence of heart disease in the patient. It is integer valued 0 = no disease and 1 = disease. Source of the original data can be found here: https://archive.ics.uci.edu/ml/datasets/heart+Disease

    1. age
    2. sex
    3. chest pain type (4 values)
    4. resting blood pressure
    5. serum cholesterol in mg/dl
    6. fasting blood sugar > 120 mg/dl
    7. heart rate max - maximum heart rate achieved
    8. angina - exercise induced angina 0 no, 1 yes
    9. target - 1 = heart disease, 0 = no heart disease
  20. heart-disease-uci

    • kaggle.com
    zip
    Updated Oct 3, 2024
    Cite
    QuangNguyen711 (2024). heart-disease-uci [Dataset]. https://www.kaggle.com/datasets/quangnguyen711/heart-disease-uci
    Explore at:
    zip (142030 bytes)
    Dataset updated
    Oct 3, 2024
    Authors
    QuangNguyen711
    License

    MIT License: https://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    Dataset

    This dataset was created by QuangNguyen711

    Released under MIT

    Contents
