Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This heart disease dataset is curated by combining 3 popular heart disease datasets. The first dataset (collected from Kaggle) contains 70000 records with 11 independent features, which makes it the largest heart disease dataset available so far for research purposes. These data were collected at the moment of medical examination and from information given by the patient. The second and third datasets contain 303 and 293 instances respectively, with 13 common features. The three datasets used for its curation are: Cardio Data (Kaggle Dataset)
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Analysis of ‘Heart Disease UCI’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://www.kaggle.com/ronitf/heart-disease-uci on 28 January 2022.
--- Dataset description provided by original source is as follows ---
This database contains 76 attributes, but all published experiments refer to using a subset of 14 of them. In particular, the Cleveland database is the only one that has been used by ML researchers to this date. The "goal" field refers to the presence of heart disease in the patient. It is integer valued from 0 (no presence) to 4.
Attribute Information:
- age
- sex
- chest pain type (4 values)
- resting blood pressure
- serum cholesterol in mg/dl
- fasting blood sugar > 120 mg/dl
- resting electrocardiographic results (values 0,1,2)
- maximum heart rate achieved
- exercise induced angina
- oldpeak = ST depression induced by exercise relative to rest
- the slope of the peak exercise ST segment
- number of major vessels (0-3) colored by fluoroscopy
- thal: 3 = normal; 6 = fixed defect; 7 = reversible defect
The names and social security numbers of the patients were recently removed from the database, replaced with dummy values. One file has been "processed", that one containing the Cleveland database. All four unprocessed files also exist in this directory.
To see Test Costs (donated by Peter Turney), please see the folder "Costs"
Creators:
1. Hungarian Institute of Cardiology. Budapest: Andras Janosi, M.D.
2. University Hospital, Zurich, Switzerland: William Steinbrunn, M.D.
3. University Hospital, Basel, Switzerland: Matthias Pfisterer, M.D.
4. V.A. Medical Center, Long Beach and Cleveland Clinic Foundation: Robert Detrano, M.D., Ph.D.
Donor: David W. Aha (aha '@' ics.uci.edu) (714) 856-8779
Experiments with the Cleveland database have concentrated on simply attempting to distinguish presence (values 1,2,3,4) from absence (value 0).
See if you can find any other trends in heart data to predict certain cardiovascular events or find any clear indications of heart health.
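The presence/absence split described above can be sketched in a few lines of Python. This is only an illustration: the function name `binarize_goal` and the sample values are hypothetical and are not part of the dataset or of any published experiment.

```python
def binarize_goal(goal: int) -> int:
    """Map the 0-4 "goal" value to 0 (absence) or 1 (presence)."""
    if goal not in range(5):
        raise ValueError(f"goal must be an integer from 0 to 4, got {goal!r}")
    return 0 if goal == 0 else 1


# Hypothetical goal values for six patients (not taken from the real data).
goals = [0, 2, 1, 0, 4, 3]
labels = [binarize_goal(g) for g in goals]
print(labels)
```

Collapsing values 1 to 4 into a single class discards the graded information in the original field, which is the trade-off the Cleveland experiments accepted.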
--- Original source retains full ownership of the source dataset ---
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Adaptation of http://archive.ics.uci.edu/ml/datasets/Heart+Disease
Ready for usage with ehrapy
Attribution-ShareAlike 4.0 (CC BY-SA 4.0) https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
The dataset is the Cleveland Heart Disease dataset taken from the UCI repository. It consists of data on 303 individuals. There are 14 columns in the dataset (which have been extracted from a larger set of 75). There are no missing values. The classification task is to predict whether an individual is suffering from heart disease or not (0: absence, 1: presence).
original data: https://archive.ics.uci.edu/ml/datasets/Heart+Disease
This database contains 13 attributes and a target variable. It has 8 nominal values and 5 numeric values. The detailed description of all these features is as follows:
Absence (1) or presence (2) of heart disease
Cost Matrix
          abse  pres
absence    0     1
presence   5     0
where the rows represent the true values and the columns the predicted.
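A cost matrix like this one is typically used to score predictions by total misclassification cost rather than by accuracy. The sketch below assumes the matrix reads 0, 1 in the absence row and 5, 0 in the presence row, and uses the class codes 1 (absence) and 2 (presence) given above. The function name `total_cost` and the sample labels are hypothetical.

```python
ABSENCE, PRESENCE = 1, 2

# COST[(true_label, predicted_label)]: rows are true values, columns predicted.
# Assumption: missing a diseased patient (true presence, predicted absence)
# costs 5, while a false alarm costs 1.
COST = {
    (ABSENCE, ABSENCE): 0,
    (ABSENCE, PRESENCE): 1,
    (PRESENCE, ABSENCE): 5,
    (PRESENCE, PRESENCE): 0,
}


def total_cost(y_true, y_pred):
    """Sum the cost of each (true, predicted) pair."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    return sum(COST[(t, p)] for t, p in zip(y_true, y_pred))


# Hypothetical labels: one missed case (cost 5) and one false alarm (cost 1).
y_true = [1, 2, 2, 1, 2]
y_pred = [1, 1, 2, 2, 2]
print(total_cost(y_true, y_pred))  # 6
```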
No missing values.
303 observations
Creators:
1. Hungarian Institute of Cardiology. Budapest: Andras Janosi, M.D.
2. University Hospital, Zurich, Switzerland: William Steinbrunn, M.D.
3. University Hospital, Basel, Switzerland: Matthias Pfisterer, M.D.
4. V.A. Medical Center, Long Beach and Cleveland Clinic Foundation: Robert Detrano, M.D., Ph.D.
Donor: David W. Aha (aha '@' ics.uci.edu) (714) 856-8779
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This heart disease dataset is curated by combining 5 popular heart disease datasets already available independently but not combined before. In this dataset
https://choosealicense.com/licenses/cc/
Heart
The Heart dataset from the UCI ML repository. Does the patient have heart disease?
Configurations and tasks
Configuration Task
hungary Binary classification
Usage
from datasets import load_dataset
dataset = load_dataset("mstz/heart", "hungary")["train"]
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Analysis of ‘Heart Disease Prediction UCI’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://www.kaggle.com/priyanka841/heart-disease-prediction-uci on 28 January 2022.
--- No further description of dataset provided by original source ---
--- Original source retains full ownership of the source dataset ---
This dataset was created by Yujie Ma
This dataset was created by Nagaveda Reddy
MIT License https://opensource.org/licenses/MIT
License information was derived automatically
❤️ Heart Disease Dataset (Enhanced with Feature Engineering)
📌 Overview
This dataset is an enhanced version of the classic UCI Heart Disease dataset, enriched with extensive feature engineering to support advanced data analysis and machine learning applications. In addition to the original clinical features, several derived variables have been introduced to provide deeper insights into cardiovascular risk patterns. These engineered features allow for improved predictive… See the full description on the dataset page: https://huggingface.co/datasets/nezahatkorkmaz/heart-disease-dataset.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Heart Disease Dataset from UCI Repository
http://opendatacommons.org/licenses/dbcl/1.0/
This data set dates from 1988 and consists of four databases: Cleveland, Hungary, Switzerland, and Long Beach V. It contains 9 attributes and is a shorter version of the original model. The "target" field refers to the presence of heart disease in the patient. It is integer valued 0 = no disease and 1 = disease. Source of the original data can be found here: https://archive.ics.uci.edu/ml/datasets/heart+Disease
Author: H. Altay Guvenir, Burak Acar, Haldun Muderrisoglu
Source: UCI
Please cite: UCI
Cardiac Arrhythmia Database
The aim is to determine the type of arrhythmia from the ECG recordings. This database contains 279 attributes, 206 of which are linear valued and the rest are nominal.
Concerning the study of H. Altay Guvenir: "The aim is to distinguish between the presence and absence of cardiac arrhythmia and to classify it in one of the 16 groups. Class 01 refers to 'normal' ECG classes, 02 to 15 refer to different classes of arrhythmia and class 16 refers to the rest of unclassified ones. For the time being, there exists a computer program that makes such a classification. However, there are differences between the cardiologist's and the program's classification. Taking the cardiologist's as a gold standard we aim to minimize this difference by means of machine learning tools."
The names and id numbers of the patients were recently removed from the database.
1 Age: Age in years , linear
2 Sex: Sex (0 = male; 1 = female) , nominal
3 Height: Height in centimeters , linear
4 Weight: Weight in kilograms , linear
5 QRS duration: Average of QRS duration in msec., linear
6 P-R interval: Average duration between onset of P and Q waves
in msec., linear
7 Q-T interval: Average duration between onset of Q and offset
of T waves in msec., linear
8 T interval: Average duration of T wave in msec., linear
9 P interval: Average duration of P wave in msec., linear
Vector angles in degrees on front plane of:, linear
10 QRS
11 T
12 P
13 QRST
14 J
15 Heart rate: Number of heart beats per minute ,linear
Of channel DI:
Average width, in msec., of: linear
16 Q wave
17 R wave
18 S wave
19 R' wave, small peak just after R
20 S' wave
21 Number of intrinsic deflections, linear
22 Existence of ragged R wave, nominal
23 Existence of diphasic derivation of R wave, nominal
24 Existence of ragged P wave, nominal
25 Existence of diphasic derivation of P wave, nominal
26 Existence of ragged T wave, nominal
27 Existence of diphasic derivation of T wave, nominal
Of channel DII:
28 .. 39 (similar to 16 .. 27 of channel DI)
Of channel DIII:
40 .. 51
Of channel AVR:
52 .. 63
Of channel AVL:
64 .. 75
Of channel AVF:
76 .. 87
Of channel V1:
88 .. 99
Of channel V2:
100 .. 111
Of channel V3:
112 .. 123
Of channel V4:
124 .. 135
Of channel V5:
136 .. 147
Of channel V6:
148 .. 159
Of channel DI:
Amplitude, * 0.1 millivolt, of
160 JJ wave, linear
161 Q wave, linear
162 R wave, linear
163 S wave, linear
164 R' wave, linear
165 S' wave, linear
166 P wave, linear
167 T wave, linear
168 QRSA , Sum of areas of all segments divided by 10,
( Area= width * height / 2 ), linear
169 QRSTA = QRSA + 0.5 * width of T wave * 0.1 * height of T
wave. (If T is diphasic then the bigger segment is
considered), linear
Of channel DII:
170 .. 179
Of channel DIII:
180 .. 189
Of channel AVR:
190 .. 199
Of channel AVL:
200 .. 209
Of channel AVF:
210 .. 219
Of channel V1:
220 .. 229
Of channel V2:
230 .. 239
Of channel V3:
240 .. 249
Of channel V4:
250 .. 259
Of channel V5:
260 .. 269
Of channel V6:
270 .. 279
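The per-channel numbering above follows a regular pattern: the first block of each channel (wave widths, number of deflections and wave-shape flags) spans 12 attributes starting at 16, and the amplitude block spans 10 attributes starting at 160. The sketch below computes both ranges for a channel. The names `CHANNELS` and `channel_ranges` are hypothetical, and the numbers are the 1-based attribute numbers used in the list, not column indices of any particular file layout.

```python
# Channel order as listed in the attribute description.
CHANNELS = ["DI", "DII", "DIII", "AVR", "AVL", "AVF",
            "V1", "V2", "V3", "V4", "V5", "V6"]


def channel_ranges(channel: str):
    """Return ((first, last), (first, last)) attribute numbers for a channel.

    The first pair is the 12-attribute width/shape block,
    the second pair is the 10-attribute amplitude block.
    """
    k = CHANNELS.index(channel)  # raises ValueError for an unknown channel
    shape_block = (16 + 12 * k, 27 + 12 * k)
    amplitude_block = (160 + 10 * k, 169 + 10 * k)
    return shape_block, amplitude_block


print(channel_ranges("DI"))   # ((16, 27), (160, 169))
print(channel_ranges("V6"))   # ((148, 159), (270, 279))
```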
Class code - class - number of instances:
01 Normal 245
02 Ischemic changes (Coronary Artery Disease) 44
03 Old Anterior Myocardial Infarction 15
04 Old Inferior Myocardial Infarction 15
05 Sinus tachycardy 13
06 Sinus bradycardy 25
07 Ventricular Premature Contraction (PVC) 3
08 Supraventricular Premature Contraction 2
09 Left bundle branch block 9
10 Right bundle branch block 50
11 1. degree AtrioVentricular block 0
12 2. degree AV block 0
13 3. degree AV block 0
14 Left ventricule hypertrophy 4
15 Atrial Fibrillation or Flutter 5
16 Others 22
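The class counts above show a strong imbalance, which matters when choosing an evaluation metric. As a rough sketch, the counts can be tallied as follows. The variable names are hypothetical, and the numbers are copied from the listing above.

```python
# Number of instances per class code, copied from the class listing.
CLASS_COUNTS = {
    1: 245, 2: 44, 3: 15, 4: 15, 5: 13, 6: 25, 7: 3, 8: 2,
    9: 9, 10: 50, 11: 0, 12: 0, 13: 0, 14: 4, 15: 5, 16: 22,
}

total = sum(CLASS_COUNTS.values())
normal_share = CLASS_COUNTS[1] / total
empty_classes = [code for code, n in CLASS_COUNTS.items() if n == 0]

print(total)                     # 452
print(round(normal_share, 3))    # 0.542
print(empty_classes)             # [11, 12, 13]
```

More than half of the instances are class 01 (normal), and the three AV block classes have no instances at all, so a classifier can never learn or be tested on them from this data.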
https://creativecommons.org/publicdomain/zero/1.0/
Context: The leading cause of death in the developed world is heart disease. Therefore, work needs to be done to help prevent the risks of having a heart attack or stroke.
Content: Use this dataset to predict which patients are most likely to suffer from a heart disease in the near future using the features given.
Acknowledgement: This data comes from the University of California Irvine's Machine Learning Repository at https://archive.ics.uci.edu/ml/datasets/Heart+Disease.
This dataset was created by PAVAN KUMAR D
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Heart disease is one of the leading causes of mortality and morbidity in the world population. Precise prediction and early detection of this disease could reduce the mortality rate. Machine learning is used to address many problems in data science. Naïve Bayes (NB) is one of the efficient methods for classification because of its ability to learn the inherent features of data. However, this method generally classifies data with a single classifier, which makes it relatively less efficient on large classification problems with several classes. In this article, we present the tree-structured naïve Bayes (Tree-NB), which uses a tree structure to divide a large classification problem into small classification problems. After the division, a particular classifier is tuned for each small classification problem. Because several classifiers are employed, they can complement each other in classification performance, and the single-classifier issue is solved. As all of the classifiers are end-to-end frameworks, Tree-NB can automatically learn the nonlinear relationship between input and output data without feature extraction. To verify the validity of our model, we compare Tree-NB with modern methods using a UCI dataset. Experimental results show that Tree-NB obtains higher performance with less training time. The average accuracy of Tree-NB is 1.19% higher than that of the other modern methods, and it also has higher average recall and precision.
https://creativecommons.org/publicdomain/zero/1.0/
Context This database contains 76 attributes, but all published experiments refer to using a subset of 14 of them. In particular, the Cleveland database is the only one that has been used by ML researchers to this date. The "goal" field refers to the presence of heart disease in the patient. It is integer-valued from 0 (no presence) to 4.
Acknowledgements Creators:
Hungarian Institute of Cardiology. Budapest: Andras Janosi, M.D. University Hospital, Zurich, Switzerland: William Steinbrunn, M.D. University Hospital, Basel, Switzerland: Matthias Pfisterer, M.D. V.A. Medical Center, Long Beach and Cleveland Clinic Foundation: Robert Detrano, M.D., PhD. Donor: David W. Aha (aha '@' ics.uci.edu) (714) 856-8779
Inspiration Experiments with the Cleveland database have concentrated on simply attempting to distinguish presence (values 1,2,3,4) from absence (value 0).
See if you can find any other trends in heart data to predict certain cardiovascular events or find any clear indications of heart health.
About Dataset: This dataset is a heart disease database similar to a database already present in the repository (Heart Disease databases) but in a slightly different form.
Cite at: Dua, D. and Graff, C. (2019). UCI Machine Learning Repository [http://archive.ics.uci.edu/ml]. Irvine, CA: University of California, School of Information and Computer Science.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This program establishes a CNN+LSTM deep learning model for continuous monitoring of exercise heart rate from PPG signals containing motion artifacts, and has achieved good results on the PPG-DaLiA database. The description is as follows:
1. The file main_program_file is the main file, including model construction, data processing, data training, model data verification, and other processing programs for PPG signals that are not used in this article.
model: file that builds the exercise heart rate monitoring model;
activity_time.xls: the activity time nodes of each volunteer's signal, collected from the PPG-DaLiA database;
original_data_read.py: signal data preprocessing program (signals from the PPG-DaLiA database);
ppg_filed_hr_cornet_estimate.py: training and prediction program for all volunteers' PPG signals;
ppg_filed_hr_cornet_estimate_single.py: a program to predict the PPG signal of a single volunteer;
_1d_cnn, _2d_cnn, ppg_excerise_cnn_type.py, ppg_filed_hr_cnn_estimate.py: programs that use the CNN method for prediction;
spc_hr_cornet_estimate.py, spc_hr_cnn_estimate.py: programs for predicting and verifying using PPG signals from another database;
save_model_estimate_hr.py, save_model_estimate_hr_spc.py: programs that save the heart rate prediction model and apply the saved model to the SPC database;
out_fig: output folder for model prediction pictures.
2. Data source. The data comes from the PPG-DaLiA database (PPG Data For Daily Life Activity, https://archive.ics.uci.edu/ml/datasets/PPG-DaLiA). The database comes from Robert Bosch GmbH and Bosch Sensortec GmbH. The signals in this database come from 15 volunteers of different ages and different physical conditions. PPG and heart rate data are continuously collected during different exercises. The preprocessing of the downloaded data is in the program original_data_read.py.
3. Other. _0_basic_fun, ch3_preprocess, my_pyhht_lib: external references of the main program, mainly the functions called by the data preprocessing part; their functions can be viewed from the main program.