https://creativecommons.org/publicdomain/zero/1.0/
This dataset provides COVID-19 mortality data with details on age groups, sex, and pre-existing conditions such as diabetes and hypertensive diseases. It includes the date of death, COVID-19 diagnosis, and comorbidities, helping to analyze the impact of COVID-19 on different demographics and health conditions. The dataset is valuable for epidemiological research, healthcare policy planning, and understanding the role of comorbidities in COVID-19-related deaths.
Provisional death counts of diabetes, coronavirus disease 2019 (COVID-19) and other select causes of death, by month, sex, and age.
The objective of this study was to compare the effect of diabetes and pathologies potentially related to diabetes on the risk of infection and death from COVID-19 among people from Highly-Developed Countries (HDC), including Italians, and immigrants from High-Migratory-Pressure Countries (HMPC). Among the population with diabetes, whose prevalence is known to be higher among immigrants, we compared the effect of body mass index among HDC and HMPC populations. A population-based cohort study was conducted, using population registries and routinely collected surveillance data. The population was stratified into HDC and HMPC, according to the place of birth; moreover, a focus was set on the South Asiatic population. Analyses restricted to the population with type-2 diabetes were performed. We reported incidence rate ratios (IRR), mortality rate ratios (MRR), and hazard ratios (HR) with 95% confidence intervals (CI) to estimate the effect of diabetes on SARS-CoV-2 infection and COVID-19 mortality. Overall, IRR of infection and MRR from COVID-19 comparing HMPC with HDC group were 0.84 (95% CI 0.82–0.87) and 0.67 (95% CI 0.46–0.99), respectively. The effect of diabetes on the risk of infection and death from COVID-19 was slightly higher in the HMPC population than in the HDC population (HRs for infection: 1.37 95% CI 1.22–1.53 vs. 1.20 95% CI 1.14–1.25; HRs for mortality: 3.96 95% CI 1.82–8.60 vs. 1.71 95% CI 1.50–1.95, respectively). No substantial difference in the strength of the association was observed between obesity or other comorbidities and SARS-CoV-2 infection. Similarly for COVID-19 mortality, HRs for obesity (HRs: 18.92 95% CI 4.48–79.87 vs. 3.91 95% CI 2.69–5.69) were larger in HMPC than in the HDC population, but differences could be due to chance. Among the population with diabetes, the HMPC group showed similar incidence (IRR: 0.99 95% CI: 0.88–1.12) and mortality (MRR: 0.89 95% CI: 0.49–1.61) to that of HDC individuals.
The effect of obesity on incidence was similar in both HDC and HMPC populations (HRs: 1.73 95% CI 1.41–2.11 among HDC vs. 1.41 95% CI 0.63–3.17 among HMPC), although the estimates were very imprecise. Despite a higher prevalence of diabetes and a stronger effect of diabetes on COVID-19 mortality in HMPC than in the HDC population, our cohort did not show an overall excess risk of COVID-19 mortality in immigrants.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Introduction: Diabetes is one of the comorbidities associated with poor prognosis in hospitalized COVID-19 patients. In this nationwide retrospective study, we evaluated the risk of in-hospital death attributed to diabetes. Methods: We analyzed data from discharge reports of patients hospitalized with COVID-19 in 2020 as submitted to the Polish National Health Fund. Several multivariate logistic regression models were used. In each model, in-hospital death was estimated with explanatory variables. Models were built either on the whole cohorts or cohorts matched with propensity score matching (PSM). The models examined either the main effects of diabetes itself or the interaction of diabetes with other variables. Results: We included 174,621 patients with COVID-19 who were hospitalized in the year 2020. Among them, there were 40,168 diabetic patients (DPs), and the proportion of DPs in this group was higher than in the general population (23.0% vs. 9.5%, p
This study at Eka Kotebe Hospital in Addis Ababa, Ethiopia, examined the impact of diabetes on COVID-19 mortality. We conducted a matched-retrospective cohort study of consecutive patients admitted with COVID-19. We compared severity markers and outcomes to determine the risk of death in patients with diabetes compared to matched controls. We used descriptive statistics, chi-square, and Poisson regression. In a univariate comparison, a p-value less than 0.05 was considered significant. Ethics approval was obtained from the Eka Kotebe Hospital Institutional Ethics Committee. The study involved 284 patients, with a 1:1 proportion of diabetics and non-diabetics. Results showed that diabetic patients had a higher number of severe and critical cases but did not have a higher mortality rate. Mortality was associated with malignancy, HIV, and a lymphocyte count <1000/µL.
It was estimated that around 30 percent of those aged 80 years and older who had COVID-19 in the United States from January 22 to May 30, 2020 died from the disease. Deaths due to COVID-19 are much higher among those with underlying health conditions such as cardiovascular disease, chronic lung disease, or diabetes. This statistic shows the percentage of people in the U.S. who had COVID-19 from January 22 to May 30, 2020 who died, by age.
For further information about the coronavirus (COVID-19) pandemic, please visit our dedicated Facts and Figures page.
Background: Diabetes mellitus (DM) is one of the most frequent comorbidities in patients suffering from severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) with a higher rate of severe course of coronavirus disease (COVID-19). However, data about post-COVID-19 syndrome (PCS) in patients with DM are limited. Methods: This multicenter, propensity score-matched study compared long-term follow-up data about cardiovascular, neuropsychiatric, respiratory, gastrointestinal, and other symptoms in 8,719 patients with DM to those without DM. The 1:1 propensity score matching (PSM) according to age and sex resulted in 1,548 matched pairs. Results: Diabetics and nondiabetics had a mean age of 72.6 ± 12.7 years old. At follow-up, cardiovascular symptoms such as dyspnea and increased resting heart rate occurred less in patients with DM (13.2% vs. 16.4%; p = 0.01) than those without DM (2.8% vs. 5.6%; p = 0.05), respectively. The incidence of newly diagnosed arterial hypertension was slightly lower in DM patients as compared to non-DM patients (0.5% vs. 1.6%; p = 0.18). Abnormal spirometry was observed more in patients with DM than those without DM (18.8% vs. 13; p = 0.24). Paranoia was diagnosed more frequently in patients with DM than in non-DM patients at follow-up time (4% vs. 1.2%; p = 0.009). The incidence of newly diagnosed renal insufficiency was higher in patients suffering from DM as compared to patients without DM (4.8% vs. 2.6%; p = 0.09). The rate of readmission was comparable in patients with and without DM (19.7% vs. 18.3%; p = 0.61). The reinfection rate with COVID-19 was comparable in both groups (2.9% in diabetics vs. 2.3% in nondiabetics; p = 0.55). Long-term mortality was higher in DM patients than in non-DM patients (33.9% vs. 29.1%; p = 0.005). Conclusions: The mortality rate was higher in patients with DM type II as compared to those without DM. Readmission and reinfection rates with COVID-19 were comparable in both groups.
The incidence of cardiovascular symptoms was higher in patients without DM.
By Valtteri Kurkela [source]
The dataset is updated continuously and synced hourly to keep it current. It offers numerous columns for analysis and exploration, from which users can extract valuable insights.
Some of the key metrics covered in the dataset include:
1. Vaccinations: The dataset covers total vaccinations administered worldwide as well as breakdowns of people vaccinated per hundred people and fully vaccinated individuals per hundred people.
2. Testing & Positivity: Information on total tests conducted along with new tests conducted per thousand people is provided. Additionally, details on positive rate (percentage of positive Covid-19 tests out of all conducted) are included.
3. Hospital & ICU: Data on ICU patients and hospital patients are available along with corresponding figures normalized per million people. Weekly admissions to intensive care units and hospitals are also provided.
4. Confirmed Cases: The number of confirmed Covid-19 cases globally is captured in both absolute numbers as well as normalized values representing cases per million people.
5. Confirmed Deaths: Total confirmed deaths due to Covid-19 worldwide are provided with figures adjusted for population size (total deaths per million).
6. Reproduction Rate: The estimated reproduction rate (R) indicates the contagiousness of the virus within a particular country or region.
7. Policy Responses: Besides healthcare-related metrics, this comprehensive dataset includes policy responses implemented by countries or regions such as lockdown measures or travel restrictions.
8. Other Variables of Interest: The data encompasses various socioeconomic factors that may influence Covid-19 outcomes, including population density, membership in a continent, and gross domestic product (GDP) per capita. Demographic factors include:
   - Age structure: percentage of the population aged 65 and older, aged 70 and older, and median age
   - Gender-specific factors: percentage of female smokers
   - Lifestyle-related factors: diabetes prevalence rate and extreme poverty rate
9. Excess Mortality: The dataset further provides insights into excess mortality rates, indicating the percentage increase in deaths above the expected number based on historical data.
The dataset consists of numerous columns providing specific information for analysis, such as ISO codes for countries/regions, location names, and units of measurement for different parameters.
Overall, this dataset serves as a valuable resource for researchers, analysts, and policymakers seeking to explore various aspects of Covid-19.
Introduction:
Understanding the Basic Structure:
- The dataset consists of various columns containing different data related to vaccinations, testing, hospitalization, cases, deaths, policy responses, and other key variables.
- Each row represents data for a specific country or region at a certain point in time.
Selecting Desired Columns:
- Identify the specific columns that are relevant to your analysis or research needs.
- Some important columns include population, total cases, total deaths, new cases per million people, and vaccination-related metrics.
Filtering Data:
- Use filters based on specific conditions such as date ranges or continents to focus on relevant subsets of data.
- This can help you analyze trends over time or compare data between different regions.
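The filtering step can be sketched with the Python standard library alone. This is a minimal illustration, not the dataset author's method: the column names `continent`, `date`, and `location`, the ISO date format, and the sample rows are all assumptions made for the example.

```python
from datetime import date

def filter_rows(rows, continent=None, start=None, end=None):
    """Keep rows that match a continent and fall in an inclusive date range."""
    kept = []
    for row in rows:
        # Assumed column name: "continent"
        if continent is not None and row.get("continent") != continent:
            continue
        # Assumed column name and format: "date" as YYYY-MM-DD
        day = date.fromisoformat(row["date"])
        if start is not None and day < start:
            continue
        if end is not None and day > end:
            continue
        kept.append(row)
    return kept

# Made-up rows, only to show the expected shape (one dict per row,
# e.g. as produced by csv.DictReader).
sample = [
    {"location": "A", "continent": "Europe", "date": "2021-01-15"},
    {"location": "B", "continent": "Asia", "date": "2021-02-01"},
    {"location": "C", "continent": "Europe", "date": "2021-06-30"},
]
europe_q1 = filter_rows(sample, "Europe", date(2021, 1, 1), date(2021, 3, 31))
```

Passing `None` for any argument disables that filter, so the same helper covers "all of one continent" and "one date window across all regions".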
Analyzing Vaccination Metrics:
- Explore variables like total_vaccinations, people_vaccinated, and people_fully_vaccinated to assess vaccination coverage in different countries.
- Calculate metrics such as people_vaccinated_per_hundred or total_boosters_per_hundred for standardized comparisons across populations.
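A per-hundred metric is simply a count divided by population and scaled by 100. The helper below is a hedged sketch: `per_hundred` is a hypothetical name, the numbers are invented, and returning `None` for missing inputs is one possible convention rather than the dataset's own.

```python
def per_hundred(count, population):
    """Return count per 100 people, or None when an input is missing or zero."""
    if count is None or not population:
        return None
    return 100.0 * count / population

# Invented figures: 2.5 million people vaccinated in a population of 10 million.
rate = per_hundred(2_500_000, 10_000_000)
```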
Investigating Testing Information:
- Examine columns such as total_tests, new_tests, and tests_per_case to understand testing efforts in various countries.
- Calculate rates like tests_per_case to assess testing efficiency or identify changes in testing strategies over time.
Exploring Hospitalization and ICU Data:
- Analyze variables like hosp_patients, icu_patients, and hospital_beds_per_thousand to understand healthcare systems' strain.
- Calculate rates like icu_patients_per_million or hosp_patients_per_million for cross-country comparisons.
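For a cross-country comparison, one option is to normalize ICU occupancy by population and rank the results. The sketch below assumes each row carries `icu_patients` and `population` for the same date; the two locations and their figures are made up.

```python
def icu_per_million(icu_patients, population):
    """ICU patients per one million inhabitants."""
    return 1_000_000 * icu_patients / population

# Hypothetical single-day snapshot of two locations.
snapshot = [
    {"location": "A", "icu_patients": 120, "population": 6_000_000},
    {"location": "B", "icu_patients": 300, "population": 60_000_000},
]

# Highest ICU load per capita first.
ranked = sorted(
    snapshot,
    key=lambda r: icu_per_million(r["icu_patients"], r["population"]),
    reverse=True,
)
```

Location B has more ICU patients in absolute terms, yet A ranks first once population is taken into account, which is the point of the normalized columns.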
Assessing Covid-19 Cases and Deaths:
- Analyze variables like total_cases, new_ca...
https://creativecommons.org/publicdomain/zero/1.0/
Coronavirus disease (COVID-19) is an infectious disease caused by a newly discovered coronavirus. Most people infected with COVID-19 virus will experience mild to moderate respiratory illness and recover without requiring special treatment. Older people, and those with underlying medical problems like cardiovascular disease, diabetes, chronic respiratory disease, and cancer are more likely to develop serious illness. During the entire course of the pandemic, one of the main problems that healthcare providers have faced is the shortage of medical resources and a proper plan to efficiently distribute them. In these tough times, being able to predict what kind of resource an individual might require at the time of being tested positive or even before that will be of immense help to the authorities as they would be able to procure and arrange for the resources necessary to save the life of that patient.
The main goal of this project is to build a machine learning model that, given a Covid-19 patient's current symptoms, status, and medical history, predicts whether the patient is at high risk.
The dataset was provided by the Mexican government. It contains a large amount of anonymized patient-related information, including pre-existing conditions. The raw dataset consists of 21 unique features and 1,048,576 unique patients. In the Boolean features, 1 means "yes" and 2 means "no". Values of 97 and 99 denote missing data.
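Before modeling, the 1/2/97/99 coding described above usually needs to be turned into proper Boolean values. A minimal sketch follows; `decode_flag` is a hypothetical helper, and raising on any other code is an assumption about how strictly unexpected values should be treated.

```python
MISSING_CODES = {97, 99}

def decode_flag(value):
    """Map 1 -> True, 2 -> False, and 97/99 -> None (missing)."""
    code = int(value)
    if code in MISSING_CODES:
        return None
    if code == 1:
        return True
    if code == 2:
        return False
    # Assumption: any other code is an error worth surfacing.
    raise ValueError(f"unexpected code: {value!r}")

decoded = [decode_flag(v) for v in ["1", "2", "97", "99"]]
```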
It was estimated that around 20 percent of those with underlying health conditions who had COVID-19 in the United States from January 22 to May 30, 2020 died from the disease, compared to just 2 percent of COVID-19 patients without underlying health conditions. Underlying health conditions such as cardiovascular disease, chronic lung disease, or diabetes greatly increase the chance of death due to COVID-19. This statistic shows the percentage of people in the U.S. who had COVID-19 from January 22 to May 30, 2020 with and without underlying health conditions who died, by age.
Objectives: Diabetes is a risk factor for poor COVID-19 prognosis. The analysis of related prognostic factors in diabetic patients with COVID-19 would be helpful for further treatment of such patients. Methods: This retrospective study involved 3623 patients with COVID-19 (325 with diabetes). Clinical characteristics and laboratory tests were collected and compared between the diabetic group and the non-diabetic group. Binary logistic regression analysis was applied to explore associated risk factors in diabetic patients with COVID-19. A prediction model was built based on these risk factors. Results: The risk factors for higher mortality in diabetic patients with COVID-19 were dyspnea, lung disease, cardiovascular diseases, neutrophil, PLT count, and CKMB. Similarly, dyspnea, cardiovascular diseases, neutrophil, PLT count, and CKMB were risk factors related to the severity of diabetes with COVID-19. Based on these factors, a risk score was built to predict the severity of disease in diabetic patients with COVID-19. Patients with a score of 7 or higher had an odds ratio of 7.616. Conclusions: Dyspnea is a critical clinical manifestation that is closely related to the severity of disease in diabetic patients with COVID-19. Attention should also be paid to the neutrophil, PLT count and CKMB levels after admission.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Background: We investigated if the concentration and “rangeability” of cystatin C (CysC) influenced the prognosis of coronavirus disease 2019 (COVID-19) in patients suffering from, or not suffering from, type 2 diabetes mellitus (T2DM). Methods: A total of 675 T2DM patients and 572 non-T2DM patients were divided into “low” and “high” CysC groups and low and high CysC-rangeability groups according to serum CysC level and range of change of CysC level, respectively. Demographic characteristics, clinical data, and laboratory results of the four groups were analyzed. Results: COVID-19 patients with a high level and rangeability of CysC had more organ damage and a higher risk of death compared with those with a low level or low rangeability of CysC. Patients with a higher level and rangeability of CysC had more blood lymphocytes and higher levels of C-reactive protein, alanine aminotransferase, and aspartate aminotransferase. After adjustment for possible confounders, multivariate analysis revealed that CysC >0.93 mg/dL was significantly associated with the risk of heart failure (OR = 2.231, 95% CI: 1.125–5.312) and all-cause death (2.694, 1.161–6.252). CysC rangeability >0 was significantly associated with all-cause death (OR = 4.217, 95% CI: 1.953–9.106). These associations were stronger in patients suffering from T2DM than in those not suffering from T2DM. Conclusions: The level and rangeability of CysC may influence the prognosis of COVID-19. Special care and appropriate intervention should be undertaken in COVID-19 patients with an increased CysC level during hospitalization and follow-up, especially for those with T2DM.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Logistic regression analysis of the relationships between comorbidities and COVID-19 deaths.
https://creativecommons.org/publicdomain/zero/1.0/
Provisional counts of deaths by the week the deaths occurred, by state of occurrence, and by select underlying causes of death for 2020-2022. The dataset also includes weekly provisional counts of death for COVID-19, coded to ICD-10 code U07.1 as an underlying or multiple cause of death.
NOTE: death counts are presented with a one-week lag.
This dataset is updated weekly when the notebook runs. The coverage period is 2020-2022. It can be used in conjunction with other datasets, e.g. 2014-2018, to plot the bigger picture.
The dataset highlights select causes of death. Some prominent causes are not listed individually.
| Column Name | Description |
|---|---|
| Data As Of | Date of analysis |
| Jurisdiction of Occurrence | Jurisdiction of Occurrence |
| MMWR Year | MMWR Year |
| MMWR Week | MMWR Week |
| Week Ending Date | Week Ending Date |
| All Cause | All Cause |
| Natural Cause | Natural Cause (A00-R99, U07) |
| Septicemia (A40-A41) | Septicemia (A40-A41) |
| Malignant neoplasms (C00-C97) | Malignant neoplasms (C00-C97) |
| Diabetes mellitus (E10-E14) | Diabetes mellitus (E10-E14) |
| Alzheimer disease (G30) | Alzheimer disease (G30) |
| Influenza and pneumonia (J09-J18) | Influenza and pneumonia (J09-J18) |
| Chronic lower respiratory diseases (J40-J47) | Chronic lower respiratory diseases (J40-J47) |
| Other diseases of respiratory system (J00-J06,J30-J39,J67,J70-J98) | Other diseases of respiratory system (J00-J06,J30-J39,J67,J70-J98) |
| Nephritis, nephrotic syndrome and nephrosis (N00-N07,N17-N19,N25-N27) | Nephritis, nephrotic syndrome and nephrosis (N00-N07,N17-N19,N25-N27) |
| Symptoms, signs and abnormal clinical and laboratory findings, not elsewhere classified (R00-R99) | Symptoms, signs and abnormal clinical and laboratory findings, not elsewhere classified (R00-R99) |
| Diseases of heart (I00-I09,I11,I13,I20-I51) | Diseases of heart (I00-I09,I11,I13,I20-I51) |
| Cerebrovascular diseases (I60-I69) | Cerebrovascular diseases (I60-I69) |
| COVID-19 (U071, Multiple Cause of Death) | COVID-19 (U071, Multiple Cause of Death) |
| COVID-19 (U071, Underlying Cause of Death) | COVID-19 (U071, Underlying Cause of Death) |
| flag_allcause | Suppressed (counts 1-9) for All causes of death |
| flag_natcause | Suppressed (counts 1-9) for Natural causes of death |
| flag_sept | Suppressed (counts 1-9) for Septicemia |
| flag_neopl | Suppressed (counts 1-9) for Malignant neoplasms |
| flag_diab | Suppressed (counts 1-9) for Diabetes mellitus |
| flag_alz | Suppressed (counts 1-9) for Alzheimer disease |
| flag_inflpn | Suppressed (counts 1-9) for Influenza and pneumonia |
| flag_clrd | Suppressed (counts 1-9) for Chronic lower respiratory diseases |
| flag_otherresp | Suppressed (counts 1-9) for Other diseases of respiratory system |
| flag_nephr | Suppressed (counts 1-9) for Nephritis, nephrotic syndrome and nephrosis |
| flag_otherunk | Suppressed (counts 1-9) for Symptoms, signs and abnormal clinical and laboratory findings, not elsewhere classified |
| flag_hd | Suppressed (counts 1-9) for Diseases of heart |
| flag_stroke | Suppressed (counts 1-9) for Cerebrovascular diseases |
| flag_cov19mcod | Suppressed (counts 1-9) for COVID-19 (U071, Multiple Cause of Death) |
| flag_cov19ucod | Suppressed (counts 1-9) for COVID-19 (U071, Underlying Cause of Death) |
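When reading the weekly counts, the flag columns tell you which cells were suppressed because the true count lies between 1 and 9. The sketch below shows one way to parse a count alongside its flag. It rests on two assumptions not confirmed by the table above: that a non-empty flag cell marks suppression, and that the matching count cell is blank or unusable in that case. `parse_count` is a hypothetical helper.

```python
def parse_count(raw, flag):
    """Return an integer count, or None when the cell is suppressed or blank."""
    # Assumption: any non-empty flag means the count was suppressed (1-9).
    if flag is not None and flag.strip():
        return None
    if raw is None or raw.strip() == "":
        return None
    # Tolerate thousands separators such as "1,234".
    return int(raw.replace(",", ""))

# Invented example row using two column names from the table above.
row = {"Diabetes mellitus (E10-E14)": "", "flag_diab": "Suppressed (counts 1-9)"}
diabetes_deaths = parse_count(row["Diabetes mellitus (E10-E14)"], row["flag_diab"])
```

Treating suppressed cells as `None` rather than zero keeps them out of sums by accident; a suppressed week still contributes between 1 and 9 deaths to any true total.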
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The full text of this article can be freely accessed on the publisher's website.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The first case of a novel coronavirus was reported in December 2019, and case numbers have increased steadily since. Many people lost their lives in the first wave of COVID-19, and the number of deaths increased in the second wave.
COVID-19 is commonly mild and self-limiting, but in a considerable portion of patients the disease is severe and fatal. Determining which patients are at high risk of severe illness or mortality is essential for appropriate clinical decision-making.
The data file contains information on demographics, comorbidities, admission laboratory values, admission medications, admission supplemental oxygen orders, discharge, and mortality. The data were derived from a healthcare surveillance software package (Clinical Looking Glass [CLG]; Streamline Health, Atlanta, Georgia) and review of the primary medical records. The data relate to COVID-19 patients admitted to a single healthcare system, over a specific period of time, and separated into the first 3 weeks of the pandemic and the second 3 weeks of the pandemic. Some of the variables included in the dataset are: length of hospital stay (LOS), myocardial infarction (MI), peripheral vascular disease (PVD), congestive heart failure (CHF), cardiovascular disease (CVD), dementia (Dement), chronic obstructive pulmonary disease (COPD), diabetes mellitus simple (DM simple), diabetes mellitus complicated (DM complicated), oxygen saturation (OsSats), mean arterial pressure, in mmHg (MAP), D-dimer, in mg/ml (Ddimer), platelets, in k per mm3 (Plts), international normalized ratio (INR), blood urea nitrogen, in mg/dL (BUN), alanine aminotransferase, in U/liter (AST), white blood cells, in per mm3 (WBC) and interleukin-6, in pg/ml (IL-6).
I would like to thank Scientific Reports for the study on COVID-19 patients.
This dataset can help predict mortality risk or severe COVID-19 in the early stages, when patients have just been admitted to the hospital. Early prediction of severe cases can help overburdened hospitals arrange resources such as oxygen cylinders and ICU beds accordingly, which can save patients' lives.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Logistic regression analysis of the relationships between signs and symptoms and COVID-19 deaths.
Heart conditions were the most common causes of death in Mexico in 2023. During that period, more than ******* people died in the North American country as a result of said conditions. Diabetes mellitus ranked second, with over ******* deaths registered that year.
Obesity in Mexico
Obesity and being overweight can worsen many risk factors for developing heart conditions, prediabetes, type 2 diabetes, and gestational diabetes, which in the case of a COVID-19 infection can lead to a severe course of the disease. In 2020, Mexico was reported as having one of the largest overweight and/or obese populations in Latin America, with ** percent of people in the country having a body mass index higher than 25. In 2022, obesity was announced as being one of the most common illnesses experienced in Mexico, with over ******* cases estimated. In a decade from now, it is predicted that about *** million children in Mexico will suffer from obesity. If estimations are correct, this North American country will belong to the world’s top 10 countries with the most obese children in 2030.
Physical activity in Mexico
It is not only a matter of food intake. A 2023 survey found, for instance, that only **** percent of the Mexican population practiced sports and physical activities in their free time, a figure that has decreased in comparison to 2013. Less than ** percent of physically active Mexicans practice sports for fun. However, the vast majority were motivated by health reasons.
According to a survey conducted in Europe in 2020, ** percent of diabetes sufferers reported it was not at all difficult to obtain their medication prior to the COVID-19 pandemic, although this dropped to below ** percent during the pandemic. Furthermore, ** percent of people living with diabetes said it was difficult to access medication during the pandemic.
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0) https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
Overview
The COVID-19 Patient Recovery Dataset is a synthetic collection of anonymized records for around 70,000 COVID-19 patients. It aims to assist with classification tasks in machine learning and epidemiological research. The dataset includes detailed clinical and demographic information, such as symptoms, existing health issues, vaccination status, COVID-19 variants, treatment details, and outcomes related to recovery or mortality. This dataset is great for predicting patient recovery (recovered), mortality (death), disease severity (severity), or the need for intensive care (icu_admission) using algorithms like Logistic Regression, Random Forest, XGBoost, or Neural Networks. It also allows for exploratory data analysis (EDA), statistical modeling, and time-series studies to find patterns in COVID-19 outcomes.
The data is synthetic and reflects realistic trends found in public health data, based on sources like WHO reports. It ensures privacy and follows ethical guidelines. Dates are provided in Excel serial format, meaning 44447 corresponds to September 8, 2021, and can be converted to standard dates using Python’s datetime or Excel. With 70,000 records and 28 columns, this dataset serves as a valuable resource for data scientists, researchers, and students interested in health-related machine learning or pandemic trends.
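The Excel serial conversion mentioned above can be sketched with Python's `datetime` module. The function name `excel_serial_to_date` is hypothetical; the epoch of 1899-12-30 is the usual convention for serial dates in this range and matches the example given in the description.

```python
from datetime import datetime, timedelta

# Conventional epoch for Excel serial dates (valid for modern dates).
EXCEL_EPOCH = datetime(1899, 12, 30)

def excel_serial_to_date(serial):
    """Convert an Excel serial day number to a datetime.date."""
    return (EXCEL_EPOCH + timedelta(days=int(serial))).date()

converted = excel_serial_to_date(44447)
print(converted.isoformat())  # 2021-09-08
```

Missing date cells should be checked before calling the helper, since `int()` fails on empty strings.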
Data Source and Collection
Source: Synthetic data based on public health patterns from sources like the World Health Organization (WHO). It includes placeholder URLs.
Collection Period: Simulated from early 2020 to mid-2022, covering the Alpha, Delta, and Omicron waves.
Number of Records: 70,000.
File Format: CSV, which works with Pandas, R, Excel, and more.
Data Quality Notes:
About 5% of the values are missing in fields like symptoms_2, symptoms_3, treatment_given_2, and date.
There are rare inconsistencies, such as between recovery/death flags and dates, which may need some preprocessing.
Unique, anonymized patient IDs.
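The flag and date inconsistencies noted above can be surfaced with a small per-record check before modeling. This is a sketch under assumptions: the rules below are one plausible reading of "inconsistencies between recovery/death flags and dates", not a documented specification, and `find_inconsistencies` is a hypothetical helper. It uses the dataset's `recovered`, `death`, `date_of_recovery`, and `date_of_death` columns.

```python
def find_inconsistencies(record):
    """Return a list of problems found in one patient record (a dict)."""
    problems = []
    recovered = record.get("recovered")
    death = record.get("death")
    has_recovery_date = record.get("date_of_recovery") not in (None, "")
    has_death_date = record.get("date_of_death") not in (None, "")

    # Assumed rules; adjust them to your own preprocessing policy.
    if recovered == 1 and death == 1:
        problems.append("recovered and death are both 1")
    if death == 1 and not has_death_date:
        problems.append("death=1 but date_of_death is empty")
    if death == 0 and has_death_date:
        problems.append("date_of_death set but death=0")
    if recovered == 1 and not has_recovery_date:
        problems.append("recovered=1 but date_of_recovery is empty")
    if recovered == 0 and has_recovery_date:
        problems.append("date_of_recovery set but recovered=0")
    return problems

# Invented records for illustration.
clean = {"recovered": 1, "death": 0, "date_of_recovery": 44447, "date_of_death": ""}
odd = {"recovered": 1, "death": 1, "date_of_recovery": 44447, "date_of_death": ""}
```

Records that return a non-empty list can then be dropped, corrected, or reviewed by hand, depending on how many there are.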
| Column Name | Data Type |
|---|---|
| patient_id | String |
| country | String |
| region/state | String |
| date_reported | Integer |
| age | Integer |
| gender | String |
| comorbidities | String |
| symptoms_1 | String |
| symptoms_2 | String |
| symptoms_3 | String |
| severity | String |
| hospitalized | Integer |
| icu_admission | Integer |
| ventilator_support | Integer |
| vaccination_status | String |
| variant | String |
| treatment_given_1 | String |
| treatment_given_2 | String |
| days_to_recovery | Integer |
| recovered | Integer |
| death | Integer |
| date_of_recovery | Integer |
| date_of_death | Integer |
| tests_conducted | Integer |
| test_type | String |
| hospital_name | String |
| doctor_assigned | String |
| source_url | String |
Key Column Details
patient_id: Unique identifier (e.g., P000001).
country: Reporting country (e.g., India, USA, Brazil, Germany, China, Pakistan, South Africa, UK).
region/state: Sub-national region (e.g., Sindh, California, São Paulo, Beijing).
date_reported, date_of_recovery, date_of_death: Excel serial dates (convert using datetime(1899,12,30) + timedelta(days=value)).
age: Patient age (1–100 years).
gender: Male or Female.
comorbidities: Pre-existing conditions (e.g., Diabetes, Hypertension, Cancer, Heart Disease, Asthma, None).
symptoms_1, symptoms_2, symptoms_3: Reported symptoms (e.g., Cough, Fever, Fatigue, Loss of Smell, Sore Throat, or empty).
severity: Case severity (Mild, Moderate, Severe, Critical).
hospitalized, icu_admission, ventilator_support: Binary (1 = Yes, 0 = No).
vaccination_status: None, Partial, Full, or Booster.
variant: COVID-19 variant (Omicron, Delta, Alpha).
treatment_given_1, treatment_given_2: Treatments administered (e.g., Antibiotics, Remdesivir, Oxygen, Steroids, Paracetamol, or empty).
days_to_recovery: Days from report to recovery (5–30, or empty if not recovered).
recovered, death: Binary outcomes (1 = Yes, 0 = No; generally mutually exclusive).
tests_conducted: Number of tests (1–5).
test_type: PCR or Antigen.
hospital_name: Fictional hospital (e.g., Aga Khan, Mayo Clinic, NHS Trust).
doctor_assigned: Fictional doctor name (e.g., Dr. Smith, Dr. Müller).
source_url: Placeholder.
Summary Statistics
Total Patients: 70,000.
Age: Mean ~50 years, Min 1, Max 100, evenly distributed.
Gender: ~50% Male, ~50% Female.
Top Countries: USA (20%), India (18%), Brazil (15%), China (12%), Germany (10%).
Comorbidities: Diabetes (25%), Hypertension (20%), Cancer (15%), Heart Disease (15%), Asthma (10%), None (15%).
Severity: Mild (60%), Moderate (25%), Severe (10%), Critical (5%).
Recovery Rate: ~60% recovered (recovered=1), ~30% deceased (death=1), ~10% unresolved (both 0).
Vaccination: None (40%), Full (30%), Partial (15%), Booster (15%).
Variants: Omicron (50%), Delt...