Rate: Age-adjusted death rate, number of deaths due to diabetes, per 100,000 population.
Definition: Deaths with diabetes as the underlying cause of death (ICD-10 codes: E10-E14).
Data Sources:
(1) Death Certificate Database, Office of Vital Statistics and Registry, New Jersey Department of Health
(2) Population Estimates, State Data Center, New Jersey Department of Labor and Workforce Development
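The rate above can be illustrated with a minimal sketch of direct age standardization, which is the usual way an age-adjusted death rate is computed. The source does not state the method or the standard population used, so the function name, the three age groups and all numbers below are assumptions for illustration only.

```python
def age_adjusted_rate(deaths, population, standard_weights, per=100_000):
    """Directly standardized death rate per `per` population.

    deaths, population and standard_weights are aligned by age group.
    standard_weights are the standard population's age shares (sum to 1).
    """
    if not (len(deaths) == len(population) == len(standard_weights)):
        raise ValueError("inputs must have one entry per age group")
    if abs(sum(standard_weights) - 1.0) > 1e-9:
        raise ValueError("standard weights must sum to 1")
    rate = 0.0
    for d, p, w in zip(deaths, population, standard_weights):
        rate += w * (d / p)  # age-specific rate weighted by the standard share
    return rate * per


# Made-up counts for three broad age groups (illustration only).
deaths = [5, 40, 200]
population = [500_000, 400_000, 100_000]
weights = [0.5, 0.3, 0.2]
print(round(age_adjusted_rate(deaths, population, weights), 1))
```

Weighting each age-specific rate by a fixed standard population makes rates comparable across areas or years whose age structures differ.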
This dataset contains information on the total proportion of adults diagnosed with diabetes. It was collected through the Behavioral Risk Factor Surveillance System (BRFSS), a system of health-related telephone surveys conducted with more than 400,000 respondents from the 50 US states, the District of Columbia and three US territories.
SUMMARY
This analysis, designed and executed by Ribble Rivers Trust, identifies areas across England with the greatest levels of diabetes mellitus in persons (aged 17+). Please read the information below to gain a full understanding of what the data shows and how it should be interpreted.

ANALYSIS METHODOLOGY
The analysis was carried out using Quality and Outcomes Framework (QOF) data, derived from NHS Digital, relating to diabetes mellitus in persons (aged 17+).

This information was recorded at the GP practice level. However, GP catchment areas are not mutually exclusive: they overlap, with some areas covered by 30+ GP practices. Therefore, to increase the clarity and usability of the data, the GP-level statistics were converted into statistics based on Middle Layer Super Output Area (MSOA) census boundaries.

The percentage of each MSOA’s population (aged 17+) with diabetes mellitus was estimated. This was achieved by calculating a weighted average based on:
- The percentage of the MSOA area that was covered by each GP practice’s catchment area
- Of the GPs that covered part of that MSOA: the percentage of registered patients that have that illness

The estimated percentage of each MSOA’s population with diabetes mellitus was then combined with Office for National Statistics Mid-Year Population Estimates (2019) data for MSOAs, to estimate the number of people in each MSOA with diabetes mellitus, within the relevant age range.

Each MSOA was assigned a relative score between 1 and 0 (1 = worst, 0 = best) based on:
A) the PERCENTAGE of the population within that MSOA who are estimated to have diabetes mellitus
B) the NUMBER of people within that MSOA who are estimated to have diabetes mellitus

An average of scores A & B was taken, and converted to a relative score between 1 and 0 (1 = worst, 0 = best). The closer the score is to 1, the greater both the number and percentage of the population in the MSOA that are estimated to have diabetes mellitus, compared to other MSOAs.
In other words, those are areas where it’s estimated a large number of people suffer from diabetes mellitus, and where those people make up a large percentage of the population, indicating there is a real issue with diabetes mellitus within the population and the investment of resources to address that issue could have the greatest benefits.

LIMITATIONS
1. GP data for the financial year 1st April 2018 – 31st March 2019 was used in preference to data for the financial year 1st April 2019 – 31st March 2020, as the onset of the COVID-19 pandemic during the latter year could have affected the reporting of medical statistics by GPs. However, for 53 GPs (out of 7670) that did not submit data in 2018/19, data from 2019/20 was used instead. Note also that some GPs (997 out of 7670) did not submit data in either year. This dataset should be viewed in conjunction with the ‘Health and wellbeing statistics (GP-level, England): Missing data and potential outliers’ dataset, to determine areas where data from 2019/20 was used, where one or more GPs did not submit data in either year, or where there were large discrepancies between the 2018/19 and 2019/20 data (differences in statistics that were > mean +/- 1 St.Dev.), which suggests erroneous data in one of those years (it was not feasible for this study to investigate this further), and thus where data should be interpreted with caution. Note also that there are some rural areas (with little or no population) that do not officially fall into any GP catchment area (although this will not affect the results of this analysis if there are no people living in those areas).
2. Although all of the obesity/inactivity-related illnesses listed can be caused or exacerbated by inactivity and obesity, it was not possible to distinguish from the data the cause of the illnesses in patients: obesity and inactivity are highly unlikely to be the cause of all cases of each illness.
By combining the data with data relating to levels of obesity and inactivity in adults and children (see the ‘Levels of obesity, inactivity and associated illnesses: Summary (England)’ dataset), we can identify where obesity/inactivity could be a contributing factor, and where interventions to reduce obesity and increase activity could be most beneficial for the health of the local population.
3. It was not feasible to incorporate ultra-fine-scale geographic distribution of populations that are registered with each GP practice or who live within each MSOA. Populations might be concentrated in certain areas of a GP practice’s catchment area or MSOA and relatively sparse in other areas. Therefore, the dataset should be used to identify general areas where there are high levels of diabetes mellitus, rather than interpreting the boundaries between areas as ‘hard’ boundaries that mark definite divisions between areas with differing levels of diabetes mellitus.

TO BE VIEWED IN COMBINATION WITH:
This dataset should be viewed alongside the following datasets, which highlight areas of missing data and potential outliers in the data:
- Health and wellbeing statistics (GP-level, England): Missing data and potential outliers
- Levels of obesity, inactivity and associated illnesses (England): Missing data

DOWNLOADING THIS DATA
To access this data on your desktop GIS, download the ‘Levels of obesity, inactivity and associated illnesses: Summary (England)’ dataset.

DATA SOURCES
This dataset was produced using:
- Quality and Outcomes Framework data: Copyright © 2020, Health and Social Care Information Centre. The Health and Social Care Information Centre is a non-departmental body created by statute, also known as NHS Digital.
- GP Catchment Outlines. Copyright © 2020, Health and Social Care Information Centre. The Health and Social Care Information Centre is a non-departmental body created by statute, also known as NHS Digital.
Data was cleaned by Ribble Rivers Trust before use.

COPYRIGHT NOTICE
The reproduction of this data must be accompanied by the following statement:
© Ribble Rivers Trust 2021. Analysis carried out using data that is: Copyright © 2020, Health and Social Care Information Centre. The Health and Social Care Information Centre is a non-departmental body created by statute, also known as NHS Digital.

CaBA HEALTH & WELLBEING EVIDENCE BASE
This dataset forms part of the wider CaBA Health and Wellbeing Evidence Base.
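The MSOA weighting and scoring described in the analysis methodology above could be sketched roughly as follows. This is not the Ribble Rivers Trust code: the function names, the sample figures and the use of min-max rescaling for the "relative score" are assumptions, since the description does not say exactly how scores were rescaled.

```python
def weighted_prevalence(coverage):
    """Area-weighted prevalence (%) for one MSOA.

    coverage: list of (share_of_msoa_area_covered_by_gp, gp_prevalence_pct).
    """
    total_weight = sum(share for share, _ in coverage)
    if total_weight == 0:
        raise ValueError("MSOA is not covered by any GP catchment")
    return sum(share * prev for share, prev in coverage) / total_weight


def min_max(values):
    """Rescale values to the range 0..1 (assumed scoring method)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def relative_scores(msoas):
    """msoas: dict of name -> (coverage list, population aged 17+)."""
    names = list(msoas)
    pct = [weighted_prevalence(msoas[n][0]) for n in names]
    count = [p / 100 * msoas[n][1] for p, n in zip(pct, names)]
    score_a = min_max(pct)    # A) percentage of the population
    score_b = min_max(count)  # B) number of people
    combined = min_max([(a + b) / 2 for a, b in zip(score_a, score_b)])
    return dict(zip(names, combined))


# Hypothetical MSOAs with made-up coverage shares, prevalences and populations.
example = {
    "MSOA A": ([(0.6, 8.0), (0.4, 6.0)], 7000),
    "MSOA B": ([(1.0, 5.0)], 9000),
    "MSOA C": ([(0.5, 10.0), (0.5, 9.0)], 5000),
}
print(relative_scores(example))
```

Averaging the two scores before rescaling means an MSOA only approaches 1 when both its percentage and its absolute number of cases are high relative to other MSOAs, which matches the interpretation given in the text.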
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
India IN: Mortality from CVD, Cancer, Diabetes or CRD between Exact Ages 30 and 70: Female data was reported at 19.800 % in 2016, a decrease from the previous figure of 20.000 % for 2015. The series is updated yearly, with a median of 21.200 % from Dec 2000 to 2016 across 5 observations. It reached an all-time high of 23.400 % in 2000 and a record low of 19.800 % in 2016. The series remains in active status in CEIC and is reported by World Bank. The data is categorized under Global Database’s India – Table IN.World Bank.WDI: Health Statistics. Mortality from CVD, cancer, diabetes or CRD is the percentage of 30-year-old people who would die before their 70th birthday from any of cardiovascular disease, cancer, diabetes, or chronic respiratory disease, assuming that they would experience current mortality rates at every age and would not die from any other cause of death (e.g., injuries or HIV/AIDS). Source: World Health Organization, Global Health Observatory Data Repository (http://apps.who.int/ghodata/). Aggregation: weighted average.
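The indicator defined above is an unconditional probability of dying between exact ages 30 and 70. A minimal sketch of how such a figure can be derived from age-specific death rates is given below. It assumes the standard life-table conversion from five-year death rates to probabilities; the source does not spell out the calculation, and the function name and input rates are hypothetical.

```python
def premature_mortality_percent(rates):
    """Percent of 30-year-olds dying before 70 from the selected causes.

    rates: eight cause-specific death rates per person-year for the age
    groups 30-34, 35-39, ..., 65-69 (deaths from the four causes only).
    """
    if len(rates) != 8:
        raise ValueError("expected one rate for each 5-year group from 30-34 to 65-69")
    survival = 1.0
    for m in rates:
        q = 5 * m / (1 + 2.5 * m)  # probability of dying within the 5-year group
        survival *= 1 - q          # survive this group, given survival so far
    return 100 * (1 - survival)


# Made-up flat rate of 5 deaths per 1,000 person-years in every age group.
print(round(premature_mortality_percent([0.005] * 8), 1))
```

Because only deaths from the four causes enter the rates, the result ignores competing causes of death, in line with the "would not die from any other cause" assumption in the definition.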
Provisional death counts of diabetes, coronavirus disease 2019 (COVID-19) and other select causes of death, by month, sex, and age.
Health, United States is an annual report on trends in health statistics. Find more information at http://www.cdc.gov/nchs/hus.htm.
Note: This dataset is historical only, and there are no corresponding datasets for more recent time periods. For more recent information, please visit the Chicago Health Atlas at https://chicagohealthatlas.org.
This dataset contains the annual number of hospital discharges, crude hospitalization rates with corresponding 95% confidence intervals, and age-adjusted hospitalization rates with corresponding 95% confidence intervals, for the years 2000 – 2011, by Chicago U.S. Postal Service ZIP code or ZIP code aggregate. See the full description at http://bit.ly/Os5wnn.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Nigeria NG: Mortality from CVD, Cancer, Diabetes or CRD between Exact Ages 30 and 70: Male data was reported at 20.900 % in 2016, an increase from the previous figure of 20.800 % for 2015. The series is updated yearly, with a median of 21.000 % from Dec 2000 to 2016 across 5 observations. It reached an all-time high of 22.600 % in 2000 and a record low of 20.800 % in 2015. The series remains in active status in CEIC and is reported by World Bank. The data is categorized under Global Database’s Nigeria – Table NG.World Bank.WDI: Health Statistics. Mortality from CVD, cancer, diabetes or CRD is the percentage of 30-year-old people who would die before their 70th birthday from any of cardiovascular disease, cancer, diabetes, or chronic respiratory disease, assuming that they would experience current mortality rates at every age and would not die from any other cause of death (e.g., injuries or HIV/AIDS). Source: World Health Organization, Global Health Observatory Data Repository (http://apps.who.int/ghodata/). Aggregation: weighted average.
This is historical data. The update frequency has been set to "Static Data" and is here for historic value. Updated 8/14/2024.
Number of deaths among Maryland residents for which diabetes mellitus was the underlying cause of death. This includes deaths coded to the following International Classification of Diseases codes:
- ICD-3 (1920-1929) -- 57
- ICD-4 (1930-1938) -- 59
- ICD-5 (1939-1948) -- 61
- ICD-6 (1949-1957) -- 260
- ICD-7 (1958-1967) -- 260
- ICD-8 (1968-1978) -- 250
- ICD-9 (1979-1998) -- 250
- ICD-10 (1999-present) -- E10-E14
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Nigeria NG: Mortality from CVD, Cancer, Diabetes or CRD between Exact Ages 30 and 70 data was reported at 22.500 % in 2016, unchanged from the previous figure of 22.500 % for 2015. The series is updated yearly, with a median of 22.900 % from Dec 2000 to 2016 across 5 observations. It reached an all-time high of 25.500 % in 2000 and a record low of 22.500 % in 2016. The series remains in active status in CEIC and is reported by World Bank. The data is categorized under Global Database’s Nigeria – Table NG.World Bank: Health Statistics. Mortality from CVD, cancer, diabetes or CRD is the percentage of 30-year-old people who would die before their 70th birthday from any of cardiovascular disease, cancer, diabetes, or chronic respiratory disease, assuming that they would experience current mortality rates at every age and would not die from any other cause of death (e.g., injuries or HIV/AIDS). Source: World Health Organization, Global Health Observatory Data Repository (http://apps.who.int/ghodata/). Aggregation: weighted average.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
United States US: Diabetes Prevalence: % of Population Aged 20-79 data was reported at 10.790 % in 2017. The series is updated yearly, with a median of 10.790 % from Dec 2017 to 2017 across 1 observation. The series remains in active status in CEIC and is reported by World Bank. The data is categorized under Global Database’s USA – Table US.World Bank: Health Statistics. Diabetes prevalence refers to the percentage of people ages 20-79 who have type 1 or type 2 diabetes. Source: International Diabetes Federation, Diabetes Atlas. Aggregation: weighted average.
Data Series: Mortality rate attributed to cardiovascular disease, cancer, diabetes or chronic respiratory disease, by sex
Indicator: III.11 - Mortality rate attributed to cardiovascular disease, cancer, diabetes or chronic respiratory disease, by sex
Source year: 2022
This dataset is part of the Minimum Gender Dataset compiled by the United Nations Statistics Division.
Domain: Health and related services
T1DiabetesGranada
A longitudinal multi-modal dataset of type 1 diabetes mellitus
Documented by:
Rodriguez-Leon, C., Aviles-Perez, M. D., Banos, O., Quesada-Charneco, M., Lopez-Ibarra, P. J., Villalonga, C., & Munoz-Torres, M. (2023). T1DiabetesGranada: a longitudinal multi-modal dataset of type 1 diabetes mellitus. Scientific Data, 10(1), 916. https://doi.org/10.1038/s41597-023-02737-4
Background
Type 1 diabetes mellitus (T1D) patients face daily difficulties in keeping their blood glucose levels within appropriate ranges. Several techniques and devices, such as flash glucose meters, have been developed to help T1D patients improve their quality of life. Most recently, the data collected via these devices is being used to train advanced artificial intelligence models to characterize the evolution of the disease and support its management. The main problem for the generation of these models is the scarcity of data, as most published works use private or artificially generated datasets. For this reason, this work presents T1DiabetesGranada, a longitudinal dataset, open under specific permission, that provides not only continuous glucose levels but also patient demographic and clinical information. The dataset includes 257780 days of measurements over four years from 736 T1D patients from the province of Granada, Spain. This dataset progresses significantly beyond the state of the art as one of the longest and largest open datasets of continuous glucose measurements, thus boosting the development of new artificial intelligence models for glucose level characterization and prediction.
Data Records
The data are stored in four comma-separated values (CSV) files which are available in T1DiabetesGranada.zip. These files are described in detail below.
Patient_info.csv
Patient_info.csv is the file containing information about the patients, such as demographic data, start and end dates of blood glucose level measurements and biochemical parameters, number of biochemical parameters or number of diagnostics. This file is composed of 736 records, one for each patient in the dataset, and includes the following variables:
Patient_ID – Unique identifier of the patient. Format: LIB19XXXX.
Sex – Sex of the patient. Values: F (for female), M (for male).
Birth_year – Year of birth of the patient. Format: YYYY.
Initial_measurement_date – Date of the first blood glucose level measurement of the patient in the Glucose_measurements.csv file. Format: YYYY-MM-DD.
Final_measurement_date – Date of the last blood glucose level measurement of the patient in the Glucose_measurements.csv file. Format: YYYY-MM-DD.
Number_of_days_with_measures – Number of days with blood glucose level measurements of the patient, extracted from the Glucose_measurements.csv file. Values: ranging from 8 to 1463.
Number_of_measurements – Number of blood glucose level measurements of the patient, extracted from the Glucose_measurements.csv file. Values: ranging from 400 to 137292.
Initial_biochemical_parameters_date – Date of the first biochemical test to measure some biochemical parameter of the patient, extracted from the Biochemical_parameters.csv file. Format: YYYY-MM-DD.
Final_biochemical_parameters_date – Date of the last biochemical test to measure some biochemical parameter of the patient, extracted from the Biochemical_parameters.csv file. Format: YYYY-MM-DD.
Number_of_biochemical_parameters – Number of biochemical parameters measured on the patient, extracted from the Biochemical_parameters.csv file. Values: ranging from 4 to 846.
Number_of_diagnostics – Number of diagnoses made for the patient, extracted from the Diagnostics.csv file. Values: ranging from 1 to 24.
Glucose_measurements.csv
Glucose_measurements.csv is the file containing the continuous blood glucose level measurements of the patients. The file is composed of more than 22.6 million records that constitute the time series of continuous blood glucose level measurements. It includes the following variables:
Patient_ID – Unique identifier of the patient. Format: LIB19XXXX.
Measurement_date – Date of the blood glucose level measurement. Format: YYYY-MM-DD.
Measurement_time – Time of the blood glucose level measurement. Format: HH:MM:SS.
Measurement – Value of the blood glucose level measurement in mg/dL. Values: ranging from 40 to 500.
Biochemical_parameters.csv
Biochemical_parameters.csv is the file containing data of the biochemical tests performed on patients to measure their biochemical parameters. This file is composed of 87482 records and includes the following variables:
Patient_ID – Unique identifier of the patient. Format: LIB19XXXX.
Reception_date – Date of receipt in the laboratory of the sample to measure the biochemical parameter. Format: YYYY-MM-DD.
Name – Name of the measured biochemical parameter. Values: 'Potassium', 'HDL cholesterol', 'Gammaglutamyl Transferase (GGT)', 'Creatinine', 'Glucose', 'Uric acid', 'Triglycerides', 'Alanine transaminase (GPT)', 'Chlorine', 'Thyrotropin (TSH)', 'Sodium', 'Glycated hemoglobin (Ac)', 'Total cholesterol', 'Albumin (urine)', 'Creatinine (urine)', 'Insulin', 'IA ANTIBODIES'.
Value – Value of the biochemical parameter. Values: ranging from -4.0 to 6446.74.
Diagnostics.csv
Diagnostics.csv is the file containing diagnoses of diabetes mellitus complications or other diseases that patients have in addition to type 1 diabetes mellitus. This file is composed of 1757 records and includes the following variables:
Patient_ID – Unique identifier of the patient. Format: LIB19XXXX.
Code – ICD-9-CM diagnosis code. Values: subset of 594 of the ICD-9-CM codes (https://www.cms.gov/Medicare/Coding/ICD9ProviderDiagnosticCodes/codes).
Description – ICD-9-CM long description. Values: subset of 594 of the ICD-9-CM long description (https://www.cms.gov/Medicare/Coding/ICD9ProviderDiagnosticCodes/codes).
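As a minimal sketch of how the files described above might be read, the following example parses records laid out like Glucose_measurements.csv and computes a mean glucose level per patient. It assumes the file has a header row using exactly the variable names listed above; the sample rows and patient identifiers are invented for illustration and are not taken from the dataset.

```python
import csv
import io
from collections import defaultdict
from datetime import datetime

# Invented sample in the documented layout (not real patient data).
SAMPLE = """Patient_ID,Measurement_date,Measurement_time,Measurement
LIB190001,2020-01-01,08:00:00,110
LIB190001,2020-01-01,08:15:00,130
LIB190002,2020-01-01,09:00:00,250
"""


def read_glucose(handle):
    """Return (patient_id, timestamp, mg_dl) tuples from an open CSV handle."""
    rows = []
    for rec in csv.DictReader(handle):
        stamp = datetime.strptime(
            rec["Measurement_date"] + " " + rec["Measurement_time"],
            "%Y-%m-%d %H:%M:%S",
        )
        rows.append((rec["Patient_ID"], stamp, float(rec["Measurement"])))
    return rows


def mean_by_patient(rows):
    """Average glucose level (mg/dL) for each patient."""
    values = defaultdict(list)
    for patient_id, _, mg_dl in rows:
        values[patient_id].append(mg_dl)
    return {pid: sum(v) / len(v) for pid, v in values.items()}


rows = read_glucose(io.StringIO(SAMPLE))
print(mean_by_patient(rows))
```

For the real file, which holds more than 22.6 million records, the same function can be given an open file handle instead of the in-memory sample, although a streaming aggregation would use less memory than building the full list.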
Technical Validation
Blood glucose level measurements were collected using FreeStyle Libre devices, which are widely used in the care of patients with T1D. The manufacturer, Abbott Diabetes Care, Inc. (Alameda, CA, USA), has conducted validation studies of these devices, concluding that the measurements made by their sensors compare well with those of YSI analyzer devices (Xylem Inc.), the gold standard, yielding results within zones A and B of the consensus error grid 99.9% of the time. In addition, studies external to the company concluded that the accuracy of the measurements is adequate.
Moreover, it was checked that, in most cases, the blood glucose level measurements per patient in the Glucose_measurements.csv file were continuous (i.e. a sample at least every 15 minutes), as they should be.
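A continuity check of this kind can be sketched as below. This is not the authors' validation code; the function name is hypothetical, and the only value taken from the text is the 15-minute maximum spacing between samples.

```python
from datetime import datetime, timedelta


def find_gaps(timestamps, max_gap=timedelta(minutes=15)):
    """Return (previous, next) pairs of one patient's samples spaced more than max_gap apart."""
    ordered = sorted(timestamps)
    return [(a, b) for a, b in zip(ordered, ordered[1:]) if b - a > max_gap]


# Invented timestamps for a single patient.
stamps = [
    datetime(2020, 1, 1, 8, 0),
    datetime(2020, 1, 1, 8, 15),
    datetime(2020, 1, 1, 9, 0),  # 45 minutes after the previous sample
]
for start, end in find_gaps(stamps):
    print("gap from", start, "to", end)
```

Running the check per patient and counting the gaps gives a quick view of how continuous each time series is before it is used for modelling.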
Usage Notes
For data downloading, it is necessary to be authenticated on the Zenodo platform, accept the Data Usage Agreement and send a request specifying full name, email, and the justification of the data use. This request will be processed by the Secretary of the Department of Computer Engineering, Automatics, and Robotics of the University of Granada and access to the dataset will be granted.
The files that compose the dataset are comma-delimited CSV files and are available in T1DiabetesGranada.zip. A Jupyter Notebook (Python v. 3.8) with code, graphics and statistics that may help in understanding the dataset is available in UsageNotes.zip.
Graphs_and_stats.ipynb
The Jupyter Notebook generates tables, graphs and statistics for a better understanding of the dataset. It has four main sections, one dedicated to each file in the dataset. In addition, it has useful functions such as calculating a patient's age, removing a given list of patients from a dataset file, and keeping only a given list of patients in a dataset file.
Code Availability
The dataset was generated using custom code located in CodeAvailability.zip. The code is provided as Jupyter Notebooks created with Python v. 3.8. The code was used to conduct tasks such as data curation and transformation, and variable extraction.
Original_patient_info_curation.ipynb
This Jupyter Notebook preprocesses the original file with patient data. Mainly, irrelevant rows and columns are removed, and the sex variable is recoded.
Glucose_measurements_curation.ipynb
This Jupyter Notebook preprocesses the original file with the continuous glucose level measurements of the patients. Principally, rows without information and duplicated rows are removed, and the timestamp variable is split into two new variables, measurement date and measurement time.
Biochemical_parameters_curation.ipynb
This Jupyter Notebook preprocesses the original file with data from the biochemical tests performed on patients to measure their biochemical parameters. Mainly, irrelevant rows and columns are removed, and the variable with the name of the measured biochemical parameter is translated.
Diagnostic_curation.ipynb
This Jupyter Notebook preprocesses the original file with the diagnoses of diabetes mellitus complications or other diseases that patients have in addition to T1D.
Get_patient_info_variables.ipynb
This Jupyter Notebook implements the feature extraction process that uses the files Glucose_measurements.csv, Biochemical_parameters.csv and Diagnostics.csv to complete the file Patient_info.csv. It is divided into six sections: the first three extract the features from each of the mentioned files, and the next three add the extracted features to the resulting new file.
Data Usage Agreement
The conditions for use are as follows:
You confirm that you will not attempt to re-identify research participants for any reason, including for re-identification theory research.
You commit to keeping the T1DiabetesGranada dataset confidential and secure and will not redistribute data or Zenodo account credentials.
You will require
Population-based county-level estimates for prevalence of diabetes control (DC) were obtained from the Institute for Health Metrics and Evaluation (IHME) for the years 2004-2012 (16). DC prevalence rate was defined as the proportion of people within a county who had previously been diagnosed with diabetes (high fasting plasma glucose ≥126 mg/dL, hemoglobin A1c (HbA1c) of ≥6.5%, or diabetes diagnosis) but do not currently have high fasting plasma glucose or HbA1c for the period 2004-2012. DC prevalence estimates were calculated using a two-stage approach. The first stage used National Health and Nutrition Examination Survey (NHANES) data to predict high fasting plasma glucose (FPG) levels (≥126 mg/dL) and/or HbA1C levels (≥6.5% [48 mmol/mol]) based on self-reported demographic and behavioral characteristics (16). This model was then applied to Behavioral Risk Factor Surveillance System (BRFSS) data to impute high FPG and/or HbA1C status for each BRFSS respondent (16). The second stage used the imputed BRFSS data to fit a series of small area models, which were used to predict county-level prevalence of diabetes-related outcomes, including DC (16).

The EQI was constructed for 2006-2010 for all US counties and is composed of five domains (air, water, built, land, and sociodemographic), each composed of variables to represent the environmental quality of that domain. Domain-specific EQIs were developed using principal components analysis (PCA) to reduce these variables within each domain, while the overall EQI was constructed from a second PCA from these individual domains (L. C. Messer et al., 2014). To account for differences in environment across rural and urban counties, the overall and domain-specific EQIs were stratified by rural urban continuum codes (RUCCs) (U.S. Department of Agriculture, 2015).
Results are reported as prevalence rate differences (PRD) with 95% confidence intervals (CIs) comparing the highest quintile/worst environmental quality to the lowest quintile/best environmental quality exposure metrics. PRDs are representative of the entire period of interest, 2004-2012. Due to the availability of DC data and covariate data, not all counties were captured; however, the majority (3134 of 3142) were utilized in the analysis.

This dataset is not publicly accessible because: EPA cannot release personally identifiable information regarding living individuals, according to the Privacy Act and the Freedom of Information Act (FOIA). This dataset contains information about human research subjects. Because there is potential to identify individual participants and disclose personal information, either alone or in combination with other datasets, individual level data are not appropriate to post for public access. Restricted access may be granted to authorized persons by contacting the party listed. It can be accessed through the following means: Human health data are not available publicly. EQI data are available at: https://edg.epa.gov/data/Public/ORD/NHEERL/EQI. Format: Data are stored as csv files.

This dataset is associated with the following publication: Jagai, J., A. Krajewski, K. Price, D. Lobdell, and R. Sargis. Diabetes control is associated with environmental quality in the USA. Endocrine Connections. BioScientifica Ltd., Bristol, UK, 10(9): 1018-1026, (2021).
This Obesity and Diabetes Related Indicators dataset provides a subset of data (40 indicators) for the two topics: Obesity and Diabetes. The dataset includes the percentage or rate for Cirrhosis/Diabetes and Obesity and Related Indicators, where available, for all counties, regions and the state.
New York State Community Health Indicator Reports (CHIRS) were developed in 2012 and are updated annually to provide data for over 300 health indicators, organized by 15 health topics. Data for all counties, regions and the state are presented in table format with links to trend graphs and maps (http://www.health.ny.gov/statistics/chac/indicators/).
The most recent county- and state-level data are provided. Multiple-year combined data offer stable estimates of the burden of, and risk factors for, these two health topics. For more information, check out: http://www.health.ny.gov/statistics/chac/indicators/ or go to the “About” tab.
This dataset tracks the updates made on the dataset "Public Health Statistics - Diabetes hospitalizations in Chicago, 2000-2011 - Historical" as a repository for previous versions of the data and metadata.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Cardiovascular diseases (CVDs) are the number 1 cause of death globally, taking an estimated 17.9 million lives each year, which accounts for 31% of all deaths worldwide. Heart failure is a common event caused by CVDs and this dataset contains 12 features that can be used to predict mortality by heart failure.
Most cardiovascular diseases can be prevented by addressing behavioural risk factors such as tobacco use, unhealthy diet and obesity, physical inactivity and harmful use of alcohol using population-wide strategies.
People with cardiovascular disease or who are at high cardiovascular risk (due to the presence of one or more risk factors such as hypertension, diabetes, hyperlipidaemia or already established disease) need early detection and management wherein a machine learning model can be of great help.
Thirteen (13) clinical features:
- age: age of the patient (years)
- anaemia: decrease of red blood cells or hemoglobin (boolean)
- high blood pressure: if the patient has hypertension (boolean)
- creatinine phosphokinase (CPK): level of the CPK enzyme in the blood (mcg/L)
- diabetes: if the patient has diabetes (boolean)
- ejection fraction: percentage of blood leaving the heart at each contraction (percentage)
- platelets: platelets in the blood (kiloplatelets/mL)
- sex: woman or man (binary)
- serum creatinine: level of serum creatinine in the blood (mg/dL)
- serum sodium: level of serum sodium in the blood (mEq/L)
- smoking: if the patient smokes or not (boolean)
- time: follow-up period (days)
- [target] death event: if the patient deceased during the follow-up period (boolean)
Background
Randomized controlled trials have shown the importance of tight glucose control in type 1 diabetes (T1DM), but few recent studies have evaluated the risk of cardiovascular disease (CVD) and all-cause mortality among adults with T1DM. We evaluated these risks in adults with T1DM compared with the non-diabetic population in a nationwide study from Scotland and examined control of CVD risk factors in those with T1DM.

Methods and Findings
The Scottish Care Information-Diabetes Collaboration database was used to identify all people registered with T1DM and aged ≥20 years in 2005–2007 and to provide risk factor data. Major CVD events and deaths were obtained from the national hospital admissions database and death register. The age-adjusted incidence rate ratio (IRR) for CVD and mortality in T1DM (n = 21,789) versus the non-diabetic population (3.96 million) was estimated using Poisson regression. The age-adjusted IRR for first CVD event associated with T1DM versus the non-diabetic population was higher in women (3.0: 95% CI 2.4–3.8, p<0.001) than men (2.3: 2.0–2.7, p<0.001) while the IRR for all-cause mortality associated with T1DM was comparable at 2.6 (2.2–3.0, p<0.001) in men and 2.7 (2.2–3.4, p<0.001) in women. Between 2005–2007, among individuals with T1DM, 34 of 123 deaths among 10,173 who were <40 years and 37 of 907 deaths among 12,739 who were ≥40 years had an underlying cause of death of coma or diabetic ketoacidosis. Among individuals 60–69 years, approximately three extra deaths per 100 per year occurred among men with T1DM (28.51/1,000 person years at risk), and two per 100 per year for women (17.99/1,000 person years at risk). 28% of those with T1DM were current smokers, 13% achieved target HbA1c of <7% and 37% had very poor (≥9%) glycaemic control. Among those aged ≥40, 37% had blood pressures above even conservative targets (≥140/90 mmHg) and 39% of those ≥40 years were not on a statin.
Although many of these risk factors were comparable to those previously reported in other developed countries, CVD and mortality rates may not be generalizable to other countries. Limitations included lack of information on the specific insulin therapy used.

Conclusions
Although the relative risks for CVD and total mortality associated with T1DM in this population have declined relative to earlier studies, T1DM continues to be associated with higher CVD and death rates than the non-diabetic population. Risk factor management should be improved to further reduce risk but better treatment approaches for achieving good glycaemic control are badly needed.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Saudi Arabia SA: Mortality from CVD, Cancer, Diabetes or CRD between Exact Ages 30 and 70 data was reported at 16.400 % in 2016. This records a decrease from the previous number of 16.500 % for 2015. The data is updated yearly, with a median of 17.900 % from Dec 2000 to 2016, across 5 observations. The data reached an all-time high of 18.900 % in 2000 and a record low of 16.400 % in 2016. The series remains in active status in CEIC and is reported by World Bank. The data is categorized under Global Database’s Saudi Arabia – Table SA.World Bank: Health Statistics. Mortality from CVD, cancer, diabetes or CRD is the percentage of 30-year-old people who would die before their 70th birthday from any of cardiovascular disease, cancer, diabetes, or chronic respiratory disease, assuming that they experience current mortality rates at every age and do not die from any other cause of death (e.g., injuries or HIV/AIDS). Source: World Health Organization, Global Health Observatory Data Repository (http://apps.who.int/ghodata/). Aggregation method: weighted average.
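The definition above describes a life-table style probability. A minimal sketch of how such a figure could be computed is given below. It assumes the commonly used conversion from five-year age-specific death rates to interval probabilities, which is not spelled out in the text above; the function name and the input rates are hypothetical placeholders, not values from this dataset.

```python
def prob_dying_30_to_70(rates_by_age_group):
    """Probability (0 to 1) of dying between exact ages 30 and 70.

    rates_by_age_group: eight cause-specific death rates (deaths per person
    per year) for the age groups 30-34, 35-39, ..., 65-69. Deaths from all
    other causes are ignored, matching the "would not die from any other
    cause" assumption in the definition.
    """
    if len(rates_by_age_group) != 8:
        raise ValueError("expected eight 5-year age groups covering ages 30-69")
    survival = 1.0
    for m in rates_by_age_group:
        # Assumed conversion of a 5-year death rate into the probability
        # of dying within that 5-year interval.
        q = 5 * m / (1 + 2.5 * m)
        survival *= 1 - q
    return 1 - survival


# Illustrative input only: a flat rate of 4 deaths per 1,000 per year.
example_percent = 100 * prob_dying_30_to_70([0.004] * 8)
```

Multiplying the result by 100 gives a percentage on the same scale as the indicator values quoted above.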
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
In the shadows of the Covid-19 pandemic, there is another global health crisis that has gone largely unnoticed. This is the Noncommunicable Disease (NCD) pandemic.
The WHO website describes NCDs as follows:
Noncommunicable diseases (NCDs), also known as chronic diseases, tend to be of long duration and are the result of a combination of genetic, physiological, environmental and behavioural factors.
The main types of NCDs are cardiovascular diseases (like heart attacks and stroke), cancers, chronic respiratory diseases (such as chronic obstructive pulmonary disease and asthma) and diabetes.
NCDs disproportionately affect people in low- and middle-income countries where more than three quarters of global NCD deaths – 32 million – occur.
- Noncommunicable diseases (NCDs) kill 41 million people each year, equivalent to 71% of all deaths globally.
- Each year, 15 million people die from a NCD between the ages of 30 and 69 years; over 85% of these "premature" deaths occur in low- and middle-income countries.
- Cardiovascular diseases account for most NCD deaths, or 17.9 million people annually, followed by cancers (9.0 million), respiratory diseases (3.9 million), and diabetes (1.6 million).
- These 4 groups of diseases account for over 80% of all premature NCD deaths.
- Tobacco use, physical inactivity, the harmful use of alcohol and unhealthy diets all increase the risk of dying from a NCD.
- Detection, screening and treatment of NCDs, as well as palliative care, are key components of the response to NCDs.
This data repository consists of 3 CSV files: WHO-cause-of-death-by-NCD.csv is the main dataset, which provides the percentage of deaths caused by NCDs out of all causes of death, for each nation globally. Metadata_Country.csv and Metadata_Indicator.csv provide additional metadata which is helpful for interpreting the main CSV.
The data collected spans a period from 2000 to 2016. The main CSV has columns for every year from 1960 to 2019. It is advisable to drop all redundant columns where no data was collected.
Furthermore, it is advisable to merge Metadata_Country.csv with the main CSV as it provides valuable additional information, particularly on the economic situation of each nation.
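The two wrangling steps suggested above (dropping year columns with no data, then merging the country metadata) can be sketched with pandas. This is only an illustration: the column names "Country Code" and "IncomeGroup" are assumptions about the CSV layout, the function name is hypothetical, and the toy values are made-up placeholders rather than figures from the dataset.

```python
import pandas as pd


def drop_empty_years_and_merge(main, country_meta, key="Country Code"):
    """Drop columns that are entirely empty, then attach country metadata."""
    # Columns where every row is missing correspond to years with no data.
    trimmed = main.dropna(axis="columns", how="all")
    # A left join keeps every row of the main table, even without metadata.
    return trimmed.merge(country_meta, on=key, how="left")


# Toy frames standing in for WHO-cause-of-death-by-NCD.csv and
# Metadata_Country.csv (in practice these would come from pd.read_csv).
main = pd.DataFrame({
    "Country Code": ["AAA", "BBB"],
    "1960": [float("nan"), float("nan")],  # no data collected this year
    "2000": [70.0, 45.5],
    "2016": [73.2, float("nan")],          # partly filled, so it is kept
})
country_meta = pd.DataFrame({
    "Country Code": ["AAA", "BBB"],
    "IncomeGroup": ["High income", "Low income"],
})

tidy = drop_empty_years_and_merge(main, country_meta)
```

Using `how="all"` removes only the columns that are empty for every country, so years with partial coverage survive the cleanup.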
This dataset has been extracted from The World Bank 'Cause of death, by non-communicable diseases (% of total)' Dataset, derived based on the data from WHO's Global Health Estimates. It is freely provided under a Creative Commons Attribution 4.0 International License (CC BY 4.0), with the additional terms as stated on the World Bank website: World Bank Terms of Use for Datasets.
I would be interested to see some good data wrangling (dropping redundant columns), as well as kernels interpreting the additional information in the 'SpecialNotes' column in Metadata_Country.csv.
It would also be great to see what different factors influence NCDs: most of all, the geopolitical factors. Would be great to see some choropleth visualisations to get an idea of which regions are most affected by NCDs.
Rate: Age-adjusted death rate, number of deaths due to diabetes, per 100,000 population.
Definition: Deaths with diabetes as the underlying cause of death (ICD-10 codes: E10-E14).
Data Sources:
(1) Death Certificate Database, Office of Vital Statistics and Registry, New Jersey Department of Health
(2) Population Estimates, State Data Center, New Jersey Department of Labor and Workforce Development
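An age-adjusted rate is usually obtained by direct standardization: each age group's death rate is weighted by that group's share of a standard population. The sketch below assumes that method; the text above does not state which standard population or age groups are used, so the function name, the weights, and the counts are made-up placeholders.

```python
def age_adjusted_rate(deaths, population, std_weights, per=100_000):
    """Directly standardized death rate per `per` population.

    deaths, population: counts for each age group, in the same order.
    std_weights: the standard population for each age group (counts or
    shares); they are normalized here so they sum to 1.
    """
    if not (len(deaths) == len(population) == len(std_weights)):
        raise ValueError("all inputs must cover the same age groups")
    total_weight = sum(std_weights)
    weighted = sum(
        (d / p) * (w / total_weight)
        for d, p, w in zip(deaths, population, std_weights)
    )
    return per * weighted


# Illustrative two-group example with placeholder numbers.
example_deaths = [10, 50]
example_population = [100_000, 50_000]
example_weights = [0.7, 0.3]  # hypothetical standard population shares
adjusted = age_adjusted_rate(example_deaths, example_population, example_weights)
crude = 100_000 * sum(example_deaths) / sum(example_population)
```

Comparing `adjusted` with `crude` shows how the rate shifts when the local age structure differs from the standard population.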