100+ datasets found
  1. An Extensive Dataset for the Heart Disease Classification System

    • data.mendeley.com
    Updated Feb 17, 2022
    Cite
    Sozan S. Maghdid (2022). An Extensive Dataset for the Heart Disease Classification System [Dataset]. http://doi.org/10.17632/65gxgy2nmg.2
    Dataset updated
    Feb 17, 2022
    Authors
    Sozan S. Maghdid
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Finding a good data source is the first step toward creating a database. Cardiovascular diseases (CVDs) are the leading cause of death worldwide. CVDs include coronary heart disease, cerebrovascular disease, rheumatic heart disease, and other heart and blood vessel problems. According to the World Health Organization, 17.9 million people die from CVDs each year. Heart attacks and strokes account for more than four out of every five CVD deaths, with one-third of these deaths occurring before the age of 70. A comprehensive database of factors that contribute to a heart attack has been constructed; its main purpose is to collect the characteristics of heart attacks and the factors that contribute to them. The dataset has 1319 samples with nine fields: eight input fields and one output field. The input fields are age, gender, heart rate (impulse), systolic BP (pressurehight), diastolic BP (pressurelow), blood sugar (glucose), CK-MB (kcm), and Test-Troponin (troponin). The output field (class) records the presence of a heart attack and takes two values: negative (no heart attack) and positive (heart attack).
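
    The nine-field layout described above can be sketched in Python as follows. This is a minimal illustration, not the author's loader: the column names are taken from the parenthesised names in the description and the real file's header may differ, and the two sample rows are made up purely to exercise the code.

```python
import csv
import io

# Field names as given in the description; the actual CSV header is an assumption.
INPUT_FIELDS = ["age", "gender", "impulse", "pressurehight", "pressurelow",
                "glucose", "kcm", "troponin"]
TARGET_FIELD = "class"


def split_row(row):
    """Split one record into (inputs, label); label is True for 'positive'."""
    inputs = {name: float(row[name]) for name in INPUT_FIELDS}
    label = row[TARGET_FIELD].strip().lower() == "positive"
    return inputs, label


# Two made-up rows for illustration only (not taken from the dataset).
SAMPLE_CSV = """age,gender,impulse,pressurehight,pressurelow,glucose,kcm,troponin,class
60,1,70,120,80,100,2.5,0.01,negative
45,0,90,140,95,250,6.0,1.20,positive
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
parsed = [split_row(r) for r in rows]
```

    With the real file, the `io.StringIO(SAMPLE_CSV)` argument would be replaced by an open file handle; everything else stays the same as long as the header matches.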

  2. Heart Failure Prediction - Dataset - CKAN

    • data.poltekkes-smg.ac.id
    Updated Oct 8, 2024
    Cite
    (2024). Heart Failure Prediction - Dataset - CKAN [Dataset]. https://data.poltekkes-smg.ac.id/dataset/heart-failure-prediction
    Dataset updated
    Oct 8, 2024
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Cardiovascular diseases (CVDs) are the number 1 cause of death globally, taking an estimated 17.9 million lives each year, which accounts for 31% of all deaths worldwide. Heart failure is a common event caused by CVDs and this dataset contains 12 features that can be used to predict mortality by heart failure. Most cardiovascular diseases can be prevented by addressing behavioural risk factors such as tobacco use, unhealthy diet and obesity, physical inactivity and harmful use of alcohol using population-wide strategies. People with cardiovascular disease or who are at high cardiovascular risk (due to the presence of one or more risk factors such as hypertension, diabetes, hyperlipidaemia or already established disease) need early detection and management, wherein a machine learning model can be of great help.

  3. GP recorded coronary heart disease rates - Dataset - data.gov.uk

    • ckan.publishing.service.gov.uk
    Updated Jun 3, 2016
    Cite
    (2016). GP recorded coronary heart disease rates - Dataset - data.gov.uk [Dataset]. https://ckan.publishing.service.gov.uk/dataset/gp-recorded-chd-rates
    Dataset updated
    Jun 3, 2016
    License

    Open Government Licence 3.0: http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/
    License information was derived automatically

    Description

    A dataset providing GP recorded coronary heart disease. Coronary heart disease (CHD) is the leading cause of death both in the UK and worldwide. It's responsible for more than 73,000 deaths in the UK each year. About 1 in 6 men and 1 in 10 women die from CHD. In the UK, there are an estimated 2.3 million people living with CHD and around 2 million people affected by angina (the most common symptom of coronary heart disease). CHD generally affects more men than women, although from the age of 50 the chances of developing the condition are similar for both sexes. As well as angina (chest pain), the main symptoms of CHD are heart attacks and heart failure. However, not everyone has the same symptoms and some people may not have any before CHD is diagnosed. CHD is sometimes called ischaemic heart disease.

  4. Predicting Heart Failure

    • kaggle.com
    Updated Sep 13, 2022
    Cite
    Aman Chauhan (2022). Predicting Heart Failure [Dataset]. https://www.kaggle.com/datasets/whenamancodes/heart-failure-clinical-records
    Dataset updated
    Sep 13, 2022
    Dataset provided by
    Kaggle: http://kaggle.com/
    Authors
    Aman Chauhan
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Cardiovascular diseases (CVDs) are the number 1 cause of death globally, taking an estimated 17.9 million lives each year, which accounts for 31% of all deaths worldwide. Heart failure is a common event caused by CVDs and this dataset contains 12 features that can be used to predict mortality by heart failure.

    Most cardiovascular diseases can be prevented by addressing behavioural risk factors such as tobacco use, unhealthy diet and obesity, physical inactivity and harmful use of alcohol using population-wide strategies.

    People with cardiovascular disease or who are at high cardiovascular risk (due to the presence of one or more risk factors such as hypertension, diabetes, hyperlipidaemia or already established disease) need early detection and management wherein a machine learning model can be of great help.

    Attribute Information:

    Thirteen (13) clinical features:

    • age: age of the patient (years)
    • anaemia: decrease of red blood cells or hemoglobin (boolean)
    • high blood pressure: if the patient has hypertension (boolean)
    • creatinine phosphokinase (CPK): level of the CPK enzyme in the blood (mcg/L)
    • diabetes: if the patient has diabetes (boolean)
    • ejection fraction: percentage of blood leaving the heart at each contraction (percentage)
    • platelets: platelets in the blood (kiloplatelets/mL)
    • sex: woman or man (binary)
    • serum creatinine: level of serum creatinine in the blood (mg/dL)
    • serum sodium: level of serum sodium in the blood (mEq/L)
    • smoking: if the patient smokes or not (boolean)
    • time: follow-up period (days)
    • [target] death event: if the patient deceased during the follow-up period (boolean)
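
    The attribute list can be written down as a typed record, which also shows how the "12 features" in the description and the thirteen listed attributes fit together: twelve predictors plus one target. This is only a sketch; the snake_case field names are an assumption and may not match the column headers in the published file.

```python
from dataclasses import dataclass, fields


@dataclass
class HeartFailureRecord:
    # Field names are illustrative; units follow the attribute list above.
    age: float                        # years
    anaemia: bool
    high_blood_pressure: bool
    creatinine_phosphokinase: float   # mcg/L
    diabetes: bool
    ejection_fraction: float          # percentage
    platelets: float                  # kiloplatelets/mL
    sex: int                          # binary
    serum_creatinine: float           # mg/dL
    serum_sodium: float               # mEq/L
    smoking: bool
    time: int                         # follow-up period in days
    death_event: bool                 # target


TARGET = "death_event"
# Every field except the target is a predictor.
FEATURES = [f.name for f in fields(HeartFailureRecord) if f.name != TARGET]
```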


  5. SHIP Age-Adjusted Mortality Rate From Heart Disease 2009-2021

    • catalog.data.gov
    • opendata.maryland.gov
    Updated Aug 16, 2024
    Cite
    opendata.maryland.gov (2024). SHIP Age-Adjusted Mortality Rate From Heart Disease 2009-2021 [Dataset]. https://catalog.data.gov/dataset/ship-age-adjusted-mortality-rate-from-heart-disease-2009-2017
    Dataset updated
    Aug 16, 2024
    Dataset provided by
    opendata.maryland.gov
    Description

    This is historical data. The update frequency has been set to "Static Data" and is here for historic value. Updated on 8/14/2024. Age-Adjusted Mortality Rate From Heart Disease: this indicator shows the age-adjusted mortality rate from heart disease (per 100,000 population). Heart disease is the leading cause of death in Maryland, accounting for 25% of all deaths. Between 2012 and 2014, over 30,000 people died of heart disease in Maryland.

  6. Coronary heart disease (in persons of all ages): England

    • hub.arcgis.com
    • data.catchmentbasedapproach.org
    Updated Apr 7, 2021
    Cite
    The Rivers Trust (2021). Coronary heart disease (in persons of all ages): England [Dataset]. https://hub.arcgis.com/datasets/theriverstrust::coronary-heart-disease-in-persons-of-all-ages-england/explore
    Dataset updated
    Apr 7, 2021
    Dataset authored and provided by
    The Rivers Trust
    Description

    SUMMARY

    This analysis, designed and executed by Ribble Rivers Trust, identifies areas across England with the greatest levels of coronary heart disease (in persons of all ages). Please read the below information to gain a full understanding of what the data shows and how it should be interpreted.

    ANALYSIS METHODOLOGY

    The analysis was carried out using Quality and Outcomes Framework (QOF) data, derived from NHS Digital, relating to coronary heart disease (in persons of all ages).

    This information was recorded at the GP practice level. However, GP catchment areas are not mutually exclusive: they overlap, with some areas covered by 30+ GP practices. Therefore, to increase the clarity and usability of the data, the GP-level statistics were converted into statistics based on Middle Layer Super Output Area (MSOA) census boundaries.

    The percentage of each MSOA’s population (all ages) with coronary heart disease was estimated. This was achieved by calculating a weighted average based on:

    • The percentage of the MSOA area that was covered by each GP practice’s catchment area
    • Of the GPs that covered part of that MSOA: the percentage of registered patients that have that illness

    The estimated percentage of each MSOA’s population with coronary heart disease was then combined with Office for National Statistics Mid-Year Population Estimates (2019) data for MSOAs, to estimate the number of people in each MSOA with coronary heart disease, within the relevant age range.

    Each MSOA was assigned a relative score between 1 and 0 (1 = worst, 0 = best) based on:

    • A) the PERCENTAGE of the population within that MSOA who are estimated to have coronary heart disease
    • B) the NUMBER of people within that MSOA who are estimated to have coronary heart disease

    An average of scores A & B was taken, and converted to a relative score between 1 and 0 (1 = worst, 0 = best). The closer to 1 the score, the greater both the number and percentage of the population in the MSOA that are estimated to have coronary heart disease, compared to other MSOAs. In other words, those are areas where it’s estimated a large number of people suffer from coronary heart disease, and where those people make up a large percentage of the population, indicating there is a real issue with coronary heart disease within the population and the investment of resources to address that issue could have the greatest benefits.

    LIMITATIONS

    1. GP data for the financial year 1st April 2018 – 31st March 2019 was used in preference to data for the financial year 1st April 2019 – 31st March 2020, as the onset of the COVID19 pandemic during the latter year could have affected the reporting of medical statistics by GPs. However, for 53 GPs (out of 7670) that did not submit data in 2018/19, data from 2019/20 was used instead. Note also that some GPs (997 out of 7670) did not submit data in either year. This dataset should be viewed in conjunction with the ‘Health and wellbeing statistics (GP-level, England): Missing data and potential outliers’ dataset, to determine areas where data from 2019/20 was used, where one or more GPs did not submit data in either year, or where there were large discrepancies between the 2018/19 and 2019/20 data (differences in statistics that were > mean +/- 1 St.Dev.), which suggests erroneous data in one of those years (it was not feasible for this study to investigate this further), and thus where data should be interpreted with caution. Note also that there are some rural areas (with little or no population) that do not officially fall into any GP catchment area (although this will not affect the results of this analysis if there are no people living in those areas).

    2. Although all of the obesity/inactivity-related illnesses listed can be caused or exacerbated by inactivity and obesity, it was not possible to distinguish from the data the cause of the illnesses in patients: obesity and inactivity are highly unlikely to be the cause of all cases of each illness. By combining the data with data relating to levels of obesity and inactivity in adults and children (see the ‘Levels of obesity, inactivity and associated illnesses: Summary (England)’ dataset), we can identify where obesity/inactivity could be a contributing factor, and where interventions to reduce obesity and increase activity could be most beneficial for the health of the local population.

    3. It was not feasible to incorporate ultra-fine-scale geographic distribution of populations that are registered with each GP practice or who live within each MSOA. Populations might be concentrated in certain areas of a GP practice’s catchment area or MSOA and relatively sparse in other areas. Therefore, the dataset should be used to identify general areas where there are high levels of coronary heart disease, rather than interpreting the boundaries between areas as ‘hard’ boundaries that mark definite divisions between areas with differing levels of coronary heart disease.

    TO BE VIEWED IN COMBINATION WITH:

    This dataset should be viewed alongside the following datasets, which highlight areas of missing data and potential outliers in the data:

    • Health and wellbeing statistics (GP-level, England): Missing data and potential outliers
    • Levels of obesity, inactivity and associated illnesses (England): Missing data

    DOWNLOADING THIS DATA

    To access this data on your desktop GIS, download the ‘Levels of obesity, inactivity and associated illnesses: Summary (England)’ dataset.

    DATA SOURCES

    This dataset was produced using:

    • Quality and Outcomes Framework data: Copyright © 2020, Health and Social Care Information Centre. The Health and Social Care Information Centre is a non-departmental body created by statute, also known as NHS Digital.
    • GP Catchment Outlines. Copyright © 2020, Health and Social Care Information Centre. The Health and Social Care Information Centre is a non-departmental body created by statute, also known as NHS Digital. Data was cleaned by Ribble Rivers Trust before use.

    COPYRIGHT NOTICE

    The reproduction of this data must be accompanied by the following statement:

    © Ribble Rivers Trust 2021. Analysis carried out using data that is: Copyright © 2020, Health and Social Care Information Centre. The Health and Social Care Information Centre is a non-departmental body created by statute, also known as NHS Digital.

    CaBA HEALTH & WELLBEING EVIDENCE BASE

    This dataset forms part of the wider CaBA Health and Wellbeing Evidence Base.
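
    The scoring steps in the methodology (area-weighted prevalence per MSOA, an estimated case count, then a combined relative score) can be sketched roughly as below. The description does not say how values were converted to a 0 to 1 score, so min-max rescaling is an assumption here, as are all function names; this is not the Ribble Rivers Trust implementation.

```python
def weighted_prevalence(coverage_and_rates):
    """Area-weighted mean of GP-level prevalence (%) for one MSOA.

    coverage_and_rates: list of (share_of_msoa_area_covered, gp_prevalence_pct).
    """
    total_weight = sum(weight for weight, _ in coverage_and_rates)
    return sum(weight * rate for weight, rate in coverage_and_rates) / total_weight


def min_max(values):
    """Rescale values to the range 0..1 (assumed scoring method)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def relative_scores(prevalence_pct, population):
    """Combine score A (percentage) and score B (number) into one 0..1 score."""
    counts = [p * n / 100 for p, n in zip(prevalence_pct, population)]
    score_a = min_max(prevalence_pct)   # A: percentage of population affected
    score_b = min_max(counts)           # B: estimated number of people affected
    combined = [(a + b) / 2 for a, b in zip(score_a, score_b)]
    return min_max(combined)            # 1 = worst, 0 = best


# Made-up inputs for three hypothetical MSOAs.
example_scores = relative_scores([2.0, 3.0, 4.0], [5000, 8000, 10000])
```

    Averaging A and B before rescaling means an MSOA scores close to 1 only when both its prevalence and its absolute case count are high relative to other areas, which matches the interpretation given in the description.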

  7. 1.1 Under 75 mortality rate from cardiovascular disease

    • digital.nhs.uk
    csv, pdf, xlsx
    Updated Mar 17, 2022
    Cite
    (2022). 1.1 Under 75 mortality rate from cardiovascular disease [Dataset]. https://digital.nhs.uk/data-and-information/publications/statistical/nhs-outcomes-framework/march-2022
    Available download formats: csv (148.2 kB), pdf (860.1 kB), xlsx (239.1 kB), pdf (225.4 kB)
    Dataset updated
    Mar 17, 2022
    License

    https://digital.nhs.uk/about-nhs-digital/terms-and-conditions

    Time period covered
    Jan 1, 2003 - Dec 31, 2020
    Area covered
    England
    Description

    Update 2 March 2023: Following the merger of NHS Digital and NHS England on 1st February 2023 we are reviewing the future presentation of the NHS Outcomes Framework indicators. As part of this review, the annual publication which was due to be released in March 2023 has been delayed. Further announcements about this dataset will be made on this page in due course.

    Indicator: directly standardised mortality rate from cardiovascular disease for people aged under 75, per 100,000 population. Purpose: to ensure that the NHS is held to account for doing all that it can to prevent deaths from cardiovascular disease in people under 75. Some different patterns have been observed in the 2020 mortality data which are likely to have been impacted by the coronavirus (COVID-19) pandemic. Statistics from this period should be interpreted with care. Legacy unique identifier: P01730

  8. Mortality Rates

    • catalog.data.gov
    • data.amerigeoss.org
    Updated Nov 22, 2024
    Cite
    Lake County Illinois GIS (2024). Mortality Rates [Dataset]. https://catalog.data.gov/dataset/mortality-rates-6fb72
    Dataset updated
    Nov 22, 2024
    Dataset provided by
    Lake County Illinois GIS
    Description

    Mortality Rates for Lake County, Illinois. Explanation of field attributes:

    • Average Age of Death: the average age at which people in the given zip code die.
    • Cancer Deaths: individuals who have died of cancer as the underlying cause. This is a rate per 100,000.
    • Heart Disease Related Deaths: individuals who have died of heart disease as the underlying cause. This is a rate per 100,000.
    • COPD Related Deaths: individuals who have died of chronic obstructive pulmonary disease (COPD) as the underlying cause. This is a rate per 100,000.

  9. Deaths, by cause, Chapter IX: Diseases of the circulatory system (I00 to...

    • www150.statcan.gc.ca
    • ouvert.canada.ca
    Updated Feb 19, 2025
    Cite
    Government of Canada, Statistics Canada (2025). Deaths, by cause, Chapter IX: Diseases of the circulatory system (I00 to I99) [Dataset]. http://doi.org/10.25318/1310014701-eng
    Dataset updated
    Feb 19, 2025
    Dataset provided by
    Statistics Canada: https://statcan.gc.ca/en
    Area covered
    Canada
    Description

    Number of deaths caused by diseases of the circulatory system, by age group and sex, 2000 to most recent year.

  10. ‘Heart Failure Prediction’ analyzed by Analyst-2

    • analyst-2.ai
    Updated Nov 21, 2021
    Cite
    Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com) (2021). ‘Heart Failure Prediction’ analyzed by Analyst-2 [Dataset]. https://analyst-2.ai/analysis/kaggle-heart-failure-prediction-e809/6cc020ab/?iid=025-855&v=presentation
    Dataset updated
    Nov 21, 2021
    Dataset authored and provided by
    Analyst-2 (analyst-2.ai) / Inspirient GmbH (inspirient.com)
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Analysis of ‘Heart Failure Prediction’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://www.kaggle.com/andrewmvd/heart-failure-clinical-data on 21 November 2021.

    --- Dataset description provided by original source is as follows ---

    About this dataset

    Cardiovascular diseases (CVDs) are the number 1 cause of death globally, taking an estimated 17.9 million lives each year, which accounts for 31% of all deaths worldwide. Heart failure is a common event caused by CVDs and this dataset contains 12 features that can be used to predict mortality by heart failure.

    Most cardiovascular diseases can be prevented by addressing behavioural risk factors such as tobacco use, unhealthy diet and obesity, physical inactivity and harmful use of alcohol using population-wide strategies.

    People with cardiovascular disease or who are at high cardiovascular risk (due to the presence of one or more risk factors such as hypertension, diabetes, hyperlipidaemia or already established disease) need early detection and management wherein a machine learning model can be of great help.

    How to use this dataset

    • Create a model for predicting mortality caused by Heart Failure.
    • Your kernel can be featured here!
    • More datasets

    Acknowledgements

    If you use this dataset in your research, please credit the authors

    Citation

    Davide Chicco, Giuseppe Jurman: Machine learning can predict survival of patients with heart failure from serum creatinine and ejection fraction alone. BMC Medical Informatics and Decision Making 20, 16 (2020).

    License

    CC BY 4.0

    Splash icon

    Icon by Freepik, available on Flaticon.

    Splash banner

    Wallpaper by jcomp, available on Freepik.

    --- Original source retains full ownership of the source dataset ---

  11. Ivory Coast CI: Mortality from CVD, Cancer, Diabetes or CRD between Exact...

    • ceicdata.com
    Cite
    CEICdata.com, Ivory Coast CI: Mortality from CVD, Cancer, Diabetes or CRD between Exact Ages 30 and 70: Male [Dataset]. https://www.ceicdata.com/en/ivory-coast/health-statistics/ci-mortality-from-cvd-cancer-diabetes-or-crd-between-exact-ages-30-and-70-male
    Dataset provided by
    CEICdata.com
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    Dec 1, 2000 - Dec 1, 2016
    Area covered
    Côte d'Ivoire
    Description

    Ivory Coast CI: Mortality from CVD, Cancer, Diabetes or CRD between Exact Ages 30 and 70: Male data was reported at 28.200 NA in 2016, a decrease from the previous figure of 28.500 NA for 2015. The series is updated yearly, with a median of 27.700 NA from Dec 2000 to 2016 across 5 observations. The data reached an all-time high of 28.500 NA in 2015 and a record low of 25.200 NA in 2000. The series remains in active status in CEIC and is reported by the World Bank. It is categorized under Global Database’s Ivory Coast – Table CI.World Bank.WDI: Health Statistics. Mortality from CVD, cancer, diabetes or CRD is the percentage of 30-year-old people who would die before their 70th birthday from cardiovascular disease, cancer, diabetes, or chronic respiratory disease, assuming that they would experience current mortality rates at every age and would not die from any other cause of death (e.g., injuries or HIV/AIDS). Source: World Health Organization, Global Health Observatory Data Repository (http://apps.who.int/ghodata/). Aggregation method: weighted average.

  12. BRFSS 2020 Heart Disease Dataset(Cleaned Version)

    • zenodo.org
    csv
    Updated May 8, 2025
    Cite
    Koushal Kumar; BP Pande (2025). BRFSS 2020 Heart Disease Dataset(Cleaned Version) [Dataset]. http://doi.org/10.5281/zenodo.15364962
    Available download formats: csv
    Dataset updated
    May 8, 2025
    Dataset provided by
    Zenodo: http://zenodo.org/
    Authors
    Koushal Kumar; BP Pande
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Originally, the dataset comes from the CDC and is a major part of the Behavioral Risk Factor Surveillance System (BRFSS), which conducts annual telephone surveys to gather data on the health status of U.S. residents. As the CDC describes: "Established in 1984 with 15 states, BRFSS now collects data in all 50 states as well as the District of Columbia and three U.S. territories. BRFSS completes more than 400,000 adult interviews each year, making it the largest continuously conducted health survey system in the world." The most recent dataset (as of February 15, 2022) includes data from 2020. It consists of 401,958 rows and 279 columns. The vast majority of columns are questions asked to respondents about their health status, such as "Do you have serious difficulty walking or climbing stairs?" or "Have you smoked at least 100 cigarettes in your entire life? [Note: 5 packs = 100 cigarettes]".

    To improve the efficiency and relevance of our analysis, we removed certain attributes from the original BRFSS dataset. Many of the 279 original attributes included administrative codes, metadata, or survey-specific variables that do not contribute meaningfully to heart disease prediction—such as respondent IDs, timestamps, state-level identifiers, and detailed lifestyle questions unrelated to cardiovascular health. By focusing on a carefully selected subset of 18 attributes directly linked to medical, behavioral, and demographic factors known to influence heart health, we streamlined the dataset. This not only reduced computational complexity but also improved model interpretability and performance by eliminating noise and irrelevant information. All predicting variables could be divided into 4 broad categories:

    1. Demographic factors: sex, age category (14 levels), race, BMI (Body Mass Index)

    2. Diseases: whether the respondent has ever had asthma, skin cancer, diabetes, stroke or kidney disease (not including kidney stones, bladder infection or incontinence)

    3. Unhealthy habits:

      • Smoking - respondents who have smoked at least 100 cigarettes in their entire life (5 packs = 100 cigarettes)
      • Alcohol Drinking - heavy drinkers (adult men having more than 14 drinks per week and adult women having more than 7 drinks per week)

    4. General Health:

      • Difficulty Walking - whether the respondent has serious difficulty walking or climbing stairs
      • Physical Activity - adults who reported doing physical activity or exercise during the past 30 days other than their regular job
      • Sleep Time - respondent’s reported average hours of sleep in a 24-hour period
      • Physical Health - number of days being physically ill or injured (0-30 days)
      • Mental Health - number of days having bad mental health (0-30 days)
      • General Health - respondents rated their health as ‘Excellent’, ‘Very good’, ‘Good’, ‘Fair’ or ‘Poor’

    Below is a description of the features collected for each patient:

    S. No. | Original Variable/Attribute | Coded Variable/Attribute | Interpretation
    1. | CVDINFR4 | HeartDisease | Those who have ever had CHD or myocardial infarction
    2. | _BMI5CAT | BMI | Body Mass Index
    3. | _SMOKER3 | Smoking | Have you ever smoked more than 100 cigarettes in your life? (The answer is either yes or no)
    4. | _RFDRHV7 | AlcoholDrinking | Adult men who drink more than 14 drinks per week and adult women who consume more than 7 drinks per week are considered heavy drinkers
    5. | CVDSTRK3 | Stroke | (Ever told) (you had) a stroke?
    6. | PHYSHLTH | PhysicalHealth | It includes physical illness and injury during the past 30 days
    7. | MENTHLTH | MentalHealth | How many days in the last 30 days have you had poor mental health?
    8. | DIFFWALK | DiffWalking | Are you having trouble walking or climbing stairs?
    9. | SEXVAR | Sex | Are you male or female?
    10. | _AGE_G | AgeCategory | Out of given fourteen age groups, which group do you fall into?

  13. BRFSS 2020 Heart Disease Dataset(Cleaned Version)

    • zenodo.org
    csv
    Updated May 4, 2025
    Cite
    Koushal Kumar; BP Pande (2025). BRFSS 2020 Heart Disease Dataset(Cleaned Version) [Dataset]. http://doi.org/10.5281/zenodo.15336526
    Available download formats: csv
    Dataset updated
    May 4, 2025
    Dataset provided by
    Zenodo: http://zenodo.org/
    Authors
    Koushal Kumar; BP Pande
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Originally, the dataset comes from the CDC and is a major part of the Behavioral Risk Factor Surveillance System (BRFSS), which conducts annual telephone surveys to gather data on the health status of U.S. residents. As the CDC describes: "Established in 1984 with 15 states, BRFSS now collects data in all 50 states as well as the District of Columbia and three U.S. territories. BRFSS completes more than 400,000 adult interviews each year, making it the largest continuously conducted health survey system in the world." The most recent dataset (as of February 15, 2022) includes data from 2020. It consists of 401,958 rows and 279 columns. The vast majority of columns are questions asked to respondents about their health status, such as "Do you have serious difficulty walking or climbing stairs?" or "Have you smoked at least 100 cigarettes in your entire life? [Note: 5 packs = 100 cigarettes]".

    To improve the efficiency and relevance of our analysis, we removed certain attributes from the original BRFSS dataset. Many of the 279 original attributes included administrative codes, metadata, or survey-specific variables that do not contribute meaningfully to heart disease prediction—such as respondent IDs, timestamps, state-level identifiers, and detailed lifestyle questions unrelated to cardiovascular health. By focusing on a carefully selected subset of 18 attributes directly linked to medical, behavioral, and demographic factors known to influence heart health, we streamlined the dataset. This not only reduced computational complexity but also improved model interpretability and performance by eliminating noise and irrelevant information. All predicting variables could be divided into 4 broad categories:

    1. Demographic factors: sex, age category (14 levels), race, BMI (Body Mass Index)

    2. Diseases: weather respondent ever had such diseases as asthma, skin cancer, diabetes, stroke or kidney disease (not including kidney stones, bladder infection or incontinence)

    3. Unhealthy habits:

      • Smoking - respondents that smoked at least 100 cigarettes in their entire life (5 packs = 100 cigarettes)
      • Alcohol Drinking - heavy drinkers (adult men having more than 14 drinks per week and adult women having more than 7 drinks per week
    4. General Health:

      • Difficulty Walking - weather respondent have serious difficulty walking or climbing stairs
      • Physical Activity - adults who reported doing physical activity or exercise during the past 30 days other than their regular job
      • Sleep Time - respondent’s reported average hours of sleep in a 24-hour period
      • Physical Health - number of days being physically ill or injured (0-30 days)
      • Mental Health - number of days having bad mental health (0-30 days)
      • General Health - respondents declared their health as ‘Excellent’, ‘Very good’, ‘Good’, ‘Fair’ or ‘Poor’
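The two habit definitions in the list above are simple threshold rules. A minimal Python sketch of those rules follows. The function and parameter names are hypothetical, and the survey itself supplies these flags as precomputed variables, so this only illustrates the definitions rather than how the dataset was built.

```python
CIGARETTES_PER_PACK = 20  # the survey note states 5 packs = 100 cigarettes


def is_smoker(lifetime_cigarettes: int) -> bool:
    """Smoking flag: at least 100 cigarettes (5 packs) smoked in a lifetime."""
    return lifetime_cigarettes >= 5 * CIGARETTES_PER_PACK


def is_heavy_drinker(sex: str, drinks_per_week: float) -> bool:
    """Heavy-drinking flag: more than 14 drinks/week for men, more than 7 for women."""
    limits = {"male": 14, "female": 7}
    return drinks_per_week > limits[sex.lower()]
```

Both thresholds are strict in the sense given by the text: exactly 14 drinks per week for a man, or exactly 7 for a woman, does not count as heavy drinking.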

    Below is a description of the features collected for each patient:

    # | Feature | Coded Variable Name | Description
    1 | HeartDisease | CVDINFR4 | Respondents that have ever reported having coronary heart disease (CHD) or myocardial infarction (MI)
    2 | BMI | _BMI5CAT | Body Mass Index (BMI)
    3 | Smoking | _SMOKER3 | Have you smoked at least 100 cigarettes in your entire life? [Note: 5 packs = 100 cigarettes]
    4 | AlcoholDrinking | _RFDRHV7 | Heavy drinkers (adult men having more than 14 drinks per week and adult women having more than 7 drinks per week)
    5 | Stroke | CVDSTRK3 | (Ever told) (you had) a stroke?
    6 | PhysicalHealth | PHYSHLTH | Now thinking about your physical health, which includes physical illness and injury, for how many days during the past 30
    7 | MentalHealth | MENTHLTH | Thinking about your mental health, for how many days during the past 30 days was your mental health not good?
    8 | DiffWalking | DIFFWALK | Do you have serious difficulty walking or climbing stairs?
    9 | Sex | SEXVAR | Are you male or female?
    10 | AgeCategory | _AGE_G | Fourteen-level age category
    11 | Race | _IMPRACE | Imputed race/ethnicity value
    12 | Diabetic | DIABETE4 | (Ever told) (you had) diabetes?
    13 | PhysicalActivity | EXERANY2 | Adults who reported doing physical activity or exercise during the past 30 days other than their regular job
    14 | GenHealth | GENHLTH | Would you say that in general your health is...
    15 | SleepTime | SLEPTIM1 | On average, how many hours of sleep do you get in a 24-hour period?
    16 | Asthma | CHASTHMA | (Ever told) (you had) asthma?
    17 | KidneyDisease | CHCKDNY2 | Not including kidney stones, bladder infection or incontinence, were you ever told you had kidney disease?
    18 | SkinCancer | CHCSCNCR | (Ever told) (you had) skin cancer?
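The reduction from 279 survey columns to 18 features amounts to a column selection plus a rename. The sketch below uses the coded variable names listed in the feature table. The function name `select_features`, the `RESPONDENT_ID` column and the sample values are hypothetical, and the recoding of raw survey answers into final feature values is not shown.

```python
# Coded BRFSS variable name -> feature name, as listed in the feature table.
BRFSS_TO_FEATURE = {
    "CVDINFR4": "HeartDisease",
    "_BMI5CAT": "BMI",
    "_SMOKER3": "Smoking",
    "_RFDRHV7": "AlcoholDrinking",
    "CVDSTRK3": "Stroke",
    "PHYSHLTH": "PhysicalHealth",
    "MENTHLTH": "MentalHealth",
    "DIFFWALK": "DiffWalking",
    "SEXVAR": "Sex",
    "_AGE_G": "AgeCategory",
    "_IMPRACE": "Race",
    "DIABETE4": "Diabetic",
    "EXERANY2": "PhysicalActivity",
    "GENHLTH": "GenHealth",
    "SLEPTIM1": "SleepTime",
    "CHASTHMA": "Asthma",
    "CHCKDNY2": "KidneyDisease",
    "CHCSCNCR": "SkinCancer",
}


def select_features(row: dict) -> dict:
    """Keep only the mapped columns of one survey row and rename them."""
    return {
        feature: row[code]
        for code, feature in BRFSS_TO_FEATURE.items()
        if code in row
    }


# Hypothetical raw row: the administrative column is dropped by the selection.
raw_row = {"RESPONDENT_ID": "0001", "CVDINFR4": "2", "SEXVAR": "1"}
selected = select_features(raw_row)
```

Keeping the mapping in one dictionary makes the selection easy to audit against the table, which matters when most of the original columns are being discarded.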
  14. Under 75 mortality rate from cardiovascular disease (NHSOF 1.1) - Dataset -...

    • ckan.publishing.service.gov.uk
    Updated Aug 4, 2015
    Cite
    ckan.publishing.service.gov.uk (2015). Under 75 mortality rate from cardiovascular disease (NHSOF 1.1) - Dataset - data.gov.uk [Dataset]. https://ckan.publishing.service.gov.uk/dataset/under-75-mortality-rate-from-cardiovascular-disease-nhsof-1-1
    Explore at:
    Dataset updated
    Aug 4, 2015
    Dataset provided by
    CKAN (https://ckan.org/)
    License

    Open Government Licence 3.0 (http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/)
    License information was derived automatically

    Description

    Directly standardised mortality rate from cardiovascular disease for people aged under 75, per 100,000 population. Purpose: To ensure that the NHS is held to account for doing all that it can to prevent deaths from cardiovascular disease in people under 75. Current version updated: Feb-17. Next version due: Nov-17.
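A directly standardised rate weights each age band's observed death rate by a standard population, so that areas with different age structures can be compared. A minimal sketch follows. The age bands, counts and weights are made-up illustrative numbers, not values from this dataset, and `directly_standardised_rate` is a hypothetical name.

```python
def directly_standardised_rate(deaths, population, standard_weights, per=100_000):
    """Weighted average of age-specific death rates, scaled to `per` population."""
    if not (len(deaths) == len(population) == len(standard_weights)):
        raise ValueError("all inputs must cover the same age bands")
    weighted = sum(
        w * d / p for d, p, w in zip(deaths, population, standard_weights)
    )
    return per * weighted / sum(standard_weights)


# Illustrative inputs for three age bands under 75 (not real data).
deaths = [10, 50, 200]
population = [100_000, 100_000, 100_000]
standard_weights = [0.5, 0.3, 0.2]
rate = directly_standardised_rate(deaths, population, standard_weights)
```

Dividing by the sum of the weights means the standard population can be given either as proportions or as raw counts.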

  15. SHIP Age-Adjusted Mortality Rate From Heart Disease 2009-2017

    • data.amerigeoss.org
    csv, json, rdf, xml
    Updated Jul 28, 2019
    Cite
    United States[old] (2019). SHIP Age-Adjusted Mortality Rate From Heart Disease 2009-2017 [Dataset]. https://data.amerigeoss.org/pl/dataset/a2c21bc0-3229-4bc7-a063-8a708e298601
    Explore at:
    Available download formats: xml, csv, rdf, json
    Dataset updated
    Jul 28, 2019
    Dataset provided by
    United States[old]
    Description

    Age-Adjusted Mortality Rate From Heart Disease - This indicator shows the age-adjusted mortality rate from heart disease (per 100,000 population). Heart disease is the leading cause of death in Maryland accounting for 25% of all deaths. Between 2012-2014, over 30,000 people died of heart disease in Maryland.

  16. Microsoft Data Science Capstone

    • kaggle.com
    Updated Jul 30, 2018
    Cite
    nandvard (2018). Microsoft Data Science Capstone [Dataset]. https://www.kaggle.com/nandvard/microsoft-data-science-capstone/code
    Explore at:
    Dataset updated
    Jul 30, 2018
    Dataset provided by
    Kaggle (http://kaggle.com/)
    Authors
    nandvard
    Description

    The goal is to predict the rate of heart disease (per 100,000 individuals) across the United States at the county-level from other socioeconomic indicators. The data is compiled from a wide range of sources and made publicly available by the United States Department of Agriculture Economic Research Service (USDA ERS).

    There are 33 variables in this dataset. Each row in the dataset represents a United States county, and the dataset we are working with covers two particular years, denoted a and b. We don't provide a unique identifier for an individual county, just a row_id for each row.

    The variables in the dataset have names of the form category_variable, where category is the high-level category of the variable (e.g. econ or health) and variable describes what the specific column contains.

    We're trying to predict the variable heart_disease_mortality_per_100k (a positive integer) for each row of the test data set.
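The category_variable naming convention can be exploited to group columns by theme before modelling. The sketch below splits each name at its first underscore. The helper names are hypothetical, and the target column is kept separate because its name does not follow the category prefix scheme.

```python
from collections import defaultdict

TARGET = "heart_disease_mortality_per_100k"


def split_column(name: str) -> tuple[str, str]:
    """Split 'category_variable' at the first underscore."""
    category, _, variable = name.partition("_")
    return category, variable


def group_by_category(columns):
    """Map each category prefix to the list of variables under it."""
    groups = defaultdict(list)
    for name in columns:
        category, variable = split_column(name)
        groups[category].append(variable)
    return dict(groups)


# A few of the column names described in this listing.
columns = [
    "area_rucc",
    "econ_pct_unemployment",
    "health_pct_diabetes",
    "demo_pct_female",
]
groups = group_by_category(columns)
```

Splitting only at the first underscore matters here, since the variable part itself contains underscores (for example pct_unemployment).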

    Columns

    area — information about the county

    area_rucc — Rural-Urban Continuum Codes "form a classification scheme that distinguishes metropolitan counties by the population size of their metro area, and nonmetropolitan counties by degree of urbanization and adjacency to a metro area. The official Office of Management and Budget (OMB) metro and nonmetro categories have been subdivided into three metro and six nonmetro categories. Each county in the U.S. is assigned one of the 9 codes." (USDA Economic Research Service, https://www.ers.usda.gov/data-products/rural-urban-continuum-codes/)

    area_urban_influence — Urban Influence Codes "form a classification scheme that distinguishes metropolitan counties by population size of their metro area, and nonmetropolitan counties by size of the largest city or town and proximity to metro and micropolitan areas." (USDA Economic Research Service, https://www.ers.usda.gov/data-products/urban-influence-codes/)

    econ — economic indicators

    econ_economic_typology — County Typology Codes "classify all U.S. counties according to six mutually exclusive categories of economic dependence and six overlapping categories of policy-relevant themes. The economic dependence types include farming, mining, manufacturing, Federal/State government, recreation, and nonspecialized counties. The policy-relevant types include low education, low employment, persistent poverty, persistent child poverty, population loss, and retirement destination." (USDA Economic Research Service, https://www.ers.usda.gov/data-products/county-typology-codes.aspx)

    econ_pct_civilian_labor — Civilian labor force, annual average, as percent of population (Bureau of Labor Statistics, http://www.bls.gov/lau/)

    econ_pct_unemployment — Unemployment, annual average, as percent of population (Bureau of Labor Statistics, http://www.bls.gov/lau/)

    econ_pct_uninsured_adults — Percent of adults without health insurance (Bureau of Labor Statistics, http://www.bls.gov/lau/)

    econ_pct_uninsured_children — Percent of children without health insurance (Bureau of Labor Statistics, http://www.bls.gov/lau/)

    health — health indicators

    health_pct_adult_obesity — Percent of adults who meet clinical definition of obese (National Center for Chronic Disease Prevention and Health Promotion)

    health_pct_adult_smoking — Percent of adults who smoke (Behavioral Risk Factor Surveillance System)

    health_pct_diabetes — Percent of population with diabetes (National Center for Chronic Disease Prevention and Health Promotion, Division of Diabetes Translation)

    health_pct_low_birthweight — Percent of babies born with low birth weight (National Center for Health Statistics)

    health_pct_excessive_drinking — Percent of adult population that engages in excessive consumption of alcohol (Behavioral Risk Factor Surveillance System)

    health_pct_physical_inacticity — Percent of adult population that is physically inactive (National Center for Chronic Disease Prevention and Health Promotion)

    health_air_pollution_particulate_matter — Fine particulate matter in µg/m³ (CDC WONDER, https://wonder.cdc.gov/wonder/help/pm.html)

    health_homicides_per_100k — Deaths by homicide per 100,000 population (National Center for Health Statistics)

    health_motor_vehicle_crash_deaths_per_100k — Deaths by motor vehicle crash per 100,000 population (National Center for Health Statistics)

    health_pop_per_dentist — Population per dentist (HRSA Area Resource File)

    health_pop_per_primary_care_physician — Population per Primary Care Physician (HRSA Area Resource File)

    demo — demographics information

    demo_pct_female — Percent of population that is female (US Census Population Estimates)

    demo_pct_below_18_years_of_age — Percent of population that is below 18 years of age (US Census Population Estimates)

    demo_pct_aged_65_years_and_older — Percent of population that is aged 65 years or older (US Census Population Estimates)

    dem...

  17. Kenya KE: Mortality from CVD, Cancer, Diabetes or CRD between Exact Ages 30...

    • ceicdata.com
    Cite
    CEICdata.com, Kenya KE: Mortality from CVD, Cancer, Diabetes or CRD between Exact Ages 30 and 70 [Dataset]. https://www.ceicdata.com/en/kenya/health-statistics/ke-mortality-from-cvd-cancer-diabetes-or-crd-between-exact-ages-30-and-70
    Explore at:
    Dataset provided by
    CEICdata.com
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Time period covered
    Dec 1, 2000 - Dec 1, 2015
    Area covered
    Kenya
    Description

    Kenya KE: Mortality from CVD, Cancer, Diabetes or CRD between Exact Ages 30 and 70 data was reported at 13.400 % in 2016. This records an increase from the previous number of 13.300 % for 2015. Kenya KE: Mortality from CVD, Cancer, Diabetes or CRD between Exact Ages 30 and 70 data is updated yearly, averaging 13.400 % from Dec 2000 (Median) to 2016, with 5 observations. The data reached an all-time high of 17.300 % in 2000 and a record low of 13.300 % in 2015. Kenya KE: Mortality from CVD, Cancer, Diabetes or CRD between Exact Ages 30 and 70 data remains in active status in CEIC and is reported by World Bank. The data is categorized under Global Database’s Kenya – Table KE.World Bank: Health Statistics. Mortality from CVD, cancer, diabetes or CRD is the percent of 30-year-old people who would die before their 70th birthday from any of cardiovascular disease, cancer, diabetes, or chronic respiratory disease, assuming that they would experience current mortality rates at every age and would not die from any other cause of death (e.g., injuries or HIV/AIDS). Source: World Health Organization, Global Health Observatory Data Repository (http://apps.who.int/ghodata/). Aggregation method: weighted average.
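The indicator defined above is a probability built from age-specific death rates. One commonly described life-table approach converts the rate in each 5-year age band into a probability of dying within that band and then chains the bands together. The sketch below follows that approach. Whether this exact procedure produced the series in this listing is an assumption, the function name is hypothetical, and the rates are invented for illustration.

```python
def unconditional_probability_of_dying(rates, width=5):
    """Combine age-specific death rates (per person-year) into one probability.

    Each band's rate M is converted to a probability of dying within the band,
    q = width * M / (1 + width / 2 * M), and the bands are chained as
    1 - product(1 - q).
    """
    survival = 1.0
    for m in rates:
        q = width * m / (1 + width / 2 * m)
        survival *= 1 - q
    return 1 - survival


# Eight 5-year bands covering ages 30-69; the rates are invented for illustration.
rates = [0.001, 0.0015, 0.002, 0.003, 0.004, 0.006, 0.008, 0.011]
probability_pct = 100 * unconditional_probability_of_dying(rates)
```

Only deaths from the four named causes would enter the rates, which is how the definition's "would not die from any other cause" condition is reflected.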

  18. Cardiovascular disease and diabetes profiles: March 2024 update

    • gov.uk
    Updated Mar 5, 2024
    Cite
    Office for Health Improvement and Disparities (2024). Cardiovascular disease and diabetes profiles: March 2024 update [Dataset]. https://www.gov.uk/government/statistics/cardiovascular-disease-and-diabetes-profiles-march-2024-update
    Explore at:
    Dataset updated
    Mar 5, 2024
    Dataset provided by
    GOV.UK (http://gov.uk/)
    Authors
    Office for Health Improvement and Disparities
    Description

    The cardiovascular disease profiles have been updated by the Office for Health Improvement and Disparities (OHID).

    The profiles provide an overview of data on cardiovascular and cardiovascular-related conditions: heart disease, stroke, diabetes and kidney disease. They are intended to help commissioners and health professionals assess the impact of cardiovascular disease (CVD) on their local population, make decisions about services and improve outcomes for patients.

  19. ECG Images dataset of Cardiac Patients

    • data.mendeley.com
    Updated Mar 19, 2021
    Cite
    Ali Haider Khan (2021). ECG Images dataset of Cardiac Patients [Dataset]. http://doi.org/10.17632/gwbz3fsgp8.2
    Explore at:
    Dataset updated
    Mar 19, 2021
    Authors
    Ali Haider Khan
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    ECG images dataset of cardiac patients, created under the auspices of the Ch. Pervaiz Elahi Institute of Cardiology, Multan, Pakistan. It aims to help the scientific community conduct research on cardiovascular diseases.

  20. Diagnosis of Coronary Heart Diseases Using Gene Expression Profiling; Stable...

    • plos.figshare.com
    • omicsdi.org
    tiff
    Updated Jun 2, 2023
    Cite
    Nabila Kazmi; Tom R. Gaunt (2023). Diagnosis of Coronary Heart Diseases Using Gene Expression Profiling; Stable Coronary Artery Disease, Cardiac Ischemia with and without Myocardial Necrosis [Dataset]. http://doi.org/10.1371/journal.pone.0149475
    Explore at:
    Available download formats: tiff
    Dataset updated
    Jun 2, 2023
    Dataset provided by
    PLOS ONE
    Authors
    Nabila Kazmi; Tom R. Gaunt
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    Cardiovascular disease (including coronary artery disease and myocardial infarction) is one of the leading causes of death in Europe, and is influenced by both environmental and genetic factors. With the recent advances in genomic tools and technologies there is potential to predict and diagnose heart disease using molecular data from analysis of blood cells. We analyzed gene expression data from blood samples taken from normal people (n = 21), non-significant coronary artery disease (n = 93), patients with unstable angina (n = 16), stable coronary artery disease (n = 14) and myocardial infarction (MI; n = 207). We used a feature selection approach to identify a set of gene expression variables which successfully differentiate different cardiovascular diseases. The initial features were discovered by fitting a linear model for each probe set across all arrays of normal individuals and patients with myocardial infarction. Three different feature optimisation algorithms were devised which identified two discriminating sets of genes, one using MI and normal controls (total genes = 6) and another one using MI and unstable angina patients (total genes = 7). In all our classification approaches we used a non-parametric k-nearest neighbour (KNN) classification method (k = 3). The results proved the diagnostic robustness of the final feature sets in discriminating patients with myocardial infarction from healthy controls. Interestingly it also showed efficacy in discriminating myocardial infarction patients from patients with clinical symptoms of cardiac ischemia but no myocardial necrosis or stable coronary artery disease, despite the influence of batch effects and different microarray gene chips and platforms.
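The classification step described above, k-nearest neighbour with k = 3, can be illustrated with a small standard-library sketch. This is not the authors' pipeline: the two-feature points are invented stand-ins for gene expression values, Euclidean distance is an assumed choice, and `knn_predict` is a hypothetical name.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k=3):
    """Label `query` by majority vote among its k nearest training points."""
    by_distance = sorted(
        zip(train_points, train_labels),
        key=lambda pair: math.dist(pair[0], query),
    )
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]


# Toy two-gene expression values (invented), labelled MI or control.
train_points = [(1.0, 1.2), (0.9, 1.0), (1.1, 0.8), (3.0, 3.2), (3.1, 2.9), (2.8, 3.0)]
train_labels = ["control", "control", "control", "MI", "MI", "MI"]
```

With two classes, an odd k such as 3 avoids tied votes, which is one practical reason for the choice.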

