https://creativecommons.org/publicdomain/zero/1.0/
Coronavirus disease (COVID-19) is an infectious disease caused by a newly discovered coronavirus. Most people infected with COVID-19 virus will experience mild to moderate respiratory illness and recover without requiring special treatment. Older people, and those with underlying medical problems like cardiovascular disease, diabetes, chronic respiratory disease, and cancer are more likely to develop serious illness. During the entire course of the pandemic, one of the main problems that healthcare providers have faced is the shortage of medical resources and a proper plan to efficiently distribute them. In these tough times, being able to predict what kind of resource an individual might require at the time of being tested positive or even before that will be of immense help to the authorities as they would be able to procure and arrange for the resources necessary to save the life of that patient.
The main goal of this project is to build a machine learning model that, given a COVID-19 patient's current symptoms, status, and medical history, will predict whether the patient is at high risk or not.
The dataset was provided by the Mexican government (link). It contains a large volume of anonymized patient-related information, including pre-existing conditions. The raw dataset consists of 21 unique features and 1,048,576 unique patients. In the Boolean features, 1 means "yes" and 2 means "no". Values of 97 and 99 denote missing data.
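The coding convention described above (1 = yes, 2 = no, 97/99 = missing) usually needs recoding before modelling. A minimal sketch with pandas follows; the column names `DIABETES` and `OBESITY` and the sample values are assumptions for illustration, not taken from the actual file.

```python
import pandas as pd

# Hypothetical column names and made-up values; the real headers may differ.
raw = pd.DataFrame({
    "DIABETES": [1, 2, 97, 2],
    "OBESITY": [2, 99, 1, 1],
})

MISSING_CODES = [97, 99]  # codes the description says mark missing data


def recode_boolean(col: pd.Series) -> pd.Series:
    """Map 1 -> 1.0 (yes), 2 -> 0.0 (no); 97/99 become NaN (missing)."""
    without_missing = col.where(~col.isin(MISSING_CODES))
    return without_missing.map({1: 1.0, 2: 0.0})


clean = raw.apply(recode_boolean)
print(clean)
```

Keeping missing values as NaN rather than dropping the rows leaves the choice of imputation strategy to the modelling step.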
Note: Starting October 10th, 2025 this dataset is deprecated and is no longer being updated. As of April 27, 2023 updates changed from daily to weekly.
Summary: The cumulative number of confirmed COVID-19 deaths among Maryland residents by gender: Female; Male; Unknown.
Description: The MD COVID-19 - Confirmed Deaths by Gender Distribution data layer is a collection of the statewide confirmed and probable COVID-19 related deaths that have been reported each day by the Vital Statistics Administration by gender. A death is classified as confirmed if the person had a laboratory-confirmed positive COVID-19 test result. Some data on deaths may be unavailable due to the time lag between the death, typically reported by a hospital or other facility, and the submission of the complete death certificate. Probable deaths are available from the MD COVID-19 - Probable Deaths by Gender Distribution data layer.
Terms of Use: The Spatial Data, and the information therein, (collectively the "Data") is provided "as is" without warranty of any kind, either expressed, implied, or statutory. The user assumes the entire risk as to quality and performance of the Data. No guarantee of accuracy is granted, nor is any responsibility for reliance thereon assumed. In no event shall the State of Maryland be liable for direct, indirect, incidental, consequential or special damages of any kind. The State of Maryland does not accept liability for any damages or misrepresentation caused by inaccuracies in the Data or as a result to changes to the Data, nor is there responsibility assumed to maintain the Data in any manner or form. The Data can be freely distributed as long as the metadata entry is not modified or deleted. Any data derived from the Data must acknowledge the State of Maryland in the metadata.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Please see FAQ for latest information on COVID-19 Data Hub data flows: https://covid-19.geohive.ie/pages/helpfaqs.
Notice: See the Technical Data Issues section in the FAQ for information about issues in data: https://covid-19.geohive.ie/pages/helpfaqs.
Deaths: From 16th May 2022 onwards, reporting of Notified Deaths will be weekly (each Wednesday) with deaths notified since the previous Wednesday reported. This is based on the date on which a death was notified on CIDR, not the date on which the death occurred. Data on deaths by date of death is available on the new HPSC Epidemiology of COVID-19 Data Hub https://epi-covid-19-hpscireland.hub.arcgis.com/.
Notice: Please be advised that on 29th April 2021, the 'Aged65up' and 'HospitalisedAged65up' fields were removed from this table. The three fields 'Aged65to74', 'Aged75to84', and 'Aged85up' replace the 'Aged65up' field. The three fields 'HospitalisedAged65to74', 'HospitalisedAged75to84' and 'HospitalisedAged85up' replace the 'HospitalisedAged65up' field. Please be advised that on the week beginning 1st March 2021, the values in the following fields in this table were set to zero: 'CommunityTransmission', 'CloseContact', 'TravelAbroad' and 'ClustersNotified'.
This feature service contains the up to date Covid-19 Daily Statistics as well as the Profile of Covid-19 Daily Statistics for Ireland, as reported by the Health Protection Surveillance Centre. The Covid-19 Daily Statistics are updated once a week, each Wednesday, which includes data for the full time series. Data on deaths is updated once a week, each Wednesday, which includes data for the full time series. The further breakdown of these counts (age, gender, transmission, etc.) is part of a Daily Statistics Profile of Covid-19, to help identify patterns and trends.
The primary Date applies to the following fields: ConfirmedCovidCases, TotalConfirmedCovidCases, ConfirmedCovidDeaths, TotalCovidDeaths, ConfirmedCovidRecovered, SevenDayAverageCases.
The StatisticProfileDate applies to the following fields: CovidCasesConfirmed, HospitalisedCovidCases, RequiringICUCovidCases, HealthcareWorkersCovidCases, Clusters Notified, HospitalisedAged5, HospitalisedAged5to14, HospitalisedAged15to24, HospitalisedAged25to34, HospitalisedAged35to44, HospitalisedAged45to54, HospitalisedAged55to64, HospitalisedAged65to74, HospitalisedAged75to84, HospitalisedAged85up, Male, Female, Unknown, Aged1to4, Aged5to14, Aged15to24, Aged25to34, Aged35to44, Aged45to54, Aged55to64, Aged65to74, Aged75to84, Aged85up, MedianAge, CommunityTransmission, CloseContact, TravelAbroad, Total Deaths by Date of Death, Deaths by Date of Death.
Read the associated blogpost for a detailed description of how this dataset was prepared; plus extra code for producing animated maps.
The 2019 Novel Coronavirus (COVID-19) continues to spread in countries around the world. This dataset provides daily updated number of reported cases & deaths in Germany on the federal state (Bundesland) and county (Landkreis/Stadtkreis) level. In April 2021 I added a dataset on vaccination progress. In addition, I provide geospatial shape files and general state-level population demographics to aid the analysis.
The dataset consists of three main csv files: covid_de.csv, demographics_de.csv, and covid_de_vaccines.csv. The geospatial shapes are included in the de_state.* files. See the column descriptions below for more detailed information.
covid_de.csv: COVID-19 cases and deaths which will be updated daily. The original data are being collected by Germany's Robert Koch Institute and can be downloaded through the National Platform for Geographic Data (the latter site also hosts an interactive dashboard). I reshaped and translated the data (using R tidyverse tools) to make it more accessible. This blogpost explains how I prepared the data, and describes how to produce animated maps.
demographics_de.csv: General Demographic Data about Germany on the federal state level. Those have been downloaded from Germany's Federal Office for Statistics (Statistisches Bundesamt) through their Open Data platform GENESIS. The data reflect the (most recent available) estimates on 2018-12-31. You can find the corresponding table here.
covid_de_vaccines.csv: In April 2021 I added this file that contains the Covid-19 vaccination progress for Germany as a whole. It details daily doses, broken down cumulatively by manufacturer, as well as the cumulative number of people having received their first and full vaccination. The earliest data are from 2020-12-27.
de_state.*: Geospatial shape files for Germany's 16 federal states. Downloaded via Germany's Federal Agency for Cartography and Geodesy. Specifically, the shape file was obtained from this link.
COVID-19 dataset covid_de.csv:
state: Name of the German federal state. Germany has 16 federal states. I removed or converted special characters from the original data.
county: The name of the German Landkreis (LK) or Stadtkreis (SK), which correspond roughly to US counties.
age_group: The COVID-19 data is being reported for 6 age groups: 0-4, 5-14, 15-34, 35-59, 60-79, and above 80 years old. As a shorthand for the last category I'm using "80-99", but there might well be persons above 99 years old in this dataset. This column has a few NA entries.
gender: Reported as male (M) or female (F). This column has a few NA entries.
date: The calendar date on which a case or death was reported. There might be delays that will be corrected by retroactively assigning cases to earlier dates.
cases: COVID-19 cases that have been confirmed through laboratory work. This and the following 2 columns are counts per day, not cumulative counts.
deaths: COVID-19 related deaths.
recovered: Recovered cases.
Demographic dataset demographics_de.csv:
state, gender, age_group: same as above. The demographic data is available in higher age resolution, but I have binned it here to match the corresponding age groups in the covid_de.csv file.
population: Population counts for the respective categories. These numbers reflect the (most recent available) estimates on 2018-12-31.
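Because covid_de.csv holds daily counts and demographics_de.csv holds population per category, the two can be combined into per-capita figures. The sketch below aggregates both to the state level and computes cases per 100,000 inhabitants; the rows are made-up placeholders in the documented column layout, not real figures.

```python
import pandas as pd

# Made-up rows following the documented columns (not real data).
covid = pd.DataFrame({
    "state": ["Bayern", "Bayern", "Berlin"],
    "age_group": ["15-34", "35-59", "15-34"],
    "date": ["2020-03-20", "2020-03-20", "2020-03-21"],
    "cases": [10, 5, 4],  # daily counts, not cumulative
})
demographics = pd.DataFrame({
    "state": ["Bayern", "Bayern", "Berlin"],
    "age_group": ["15-34", "35-59", "15-34"],
    "population": [1_000_000, 500_000, 200_000],
})

# Sum daily cases and population per state, then normalise.
cases_by_state = covid.groupby("state")["cases"].sum()
population_by_state = demographics.groupby("state")["population"].sum()
cases_per_100k = (cases_by_state * 100_000 / population_by_state).rename("cases_per_100k")
print(cases_per_100k)
```

With the real files, the same pattern should work after `pd.read_csv`, and grouping by `age_group` or `gender` as well would give rates per demographic category.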
Vaccination progress dataset covid_de_vaccines.csv:
date: calendar date of vaccination
doses, doses_first, doses_second: Daily count of administered doses: total, 1st shot, 2nd shot.
pfizer_cumul, moderna_cumul, astrazeneca_cumul: Daily cumulative number of administered vaccinations by manufacturer.
persons_first_cumul, persons_full_cumul: Daily cumulative number of people having received their 1st shot and full vaccination, respectively.
All the data have been extracted from open data sources which are being gratefully acknowledged:
Open Government Licence 3.0 http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/
License information was derived automatically
Provisional counts of the number of deaths and age-standardised mortality rates involving the coronavirus (COVID-19), by occupational groups, for deaths registered between 9 March and 28 December 2020 in England and Wales. Figures are provided for males and females.
This dataset is a per-state amalgamation of demographic, public health and other relevant predictors for COVID-19.
Used positive, death and totalTestResults from the API for, respectively, Infected, Deaths and Tested in this dataset.
Please read the documentation of the API for more context on those columns
Density is people per meter squared https://worldpopulationreview.com/states/
https://worldpopulationreview.com/states/gdp-by-state/
https://worldpopulationreview.com/states/per-capita-income-by-state/
https://en.wikipedia.org/wiki/List_of_U.S._states_by_Gini_coefficient
Rates from Feb 2020 and are percentage of labor force
https://www.bls.gov/web/laus/laumstrk.htm
Ratio is Male / Female
https://www.kff.org/other/state-indicator/distribution-by-gender/
https://worldpopulationreview.com/states/smoking-rates-by-state/
Death rate per 100,000 people
https://www.cdc.gov/nchs/pressroom/sosmap/flu_pneumonia_mortality/flu_pneumonia.htm
Death rate per 100,000 people
https://www.cdc.gov/nchs/pressroom/sosmap/lung_disease_mortality/lung_disease.htm
https://www.kff.org/other/state-indicator/total-active-physicians/
https://www.kff.org/other/state-indicator/total-hospitals
Includes spending for all health care services and products by state of residence. Hospital spending is included and reflects the total net revenue. Costs such as insurance, administration, research, and construction expenses are not included.
https://www.kff.org/other/state-indicator/avg-annual-growth-per-capita/
Pollution: Average exposure of the general public to particulate matter of 2.5 microns or less (PM2.5) measured in micrograms per cubic meter (3-year estimate)
https://www.americashealthrankings.org/explore/annual/measure/air/state/ALL
For each state, number of medium and large airports https://en.wikipedia.org/wiki/List_of_the_busiest_airports_in_the_United_States
Note that FL was incorrect in the table, but is corrected in the Hottest States paragraph
https://worldpopulationreview.com/states/average-temperatures-by-state/
District of Columbia temperature computed as the average of Maryland and Virginia
Urbanization as a percentage of the population https://www.icip.iastate.edu/tables/population/urban-pct-states
https://www.kff.org/other/state-indicator/distribution-by-age/
Schools that haven't closed are marked NaN https://www.edweek.org/ew/section/multimedia/map-coronavirus-and-school-closures.html
Note that some datasets above did not contain data for District of Columbia; this missing data was found via Google searches and entered manually.
The United States has recently become the country with the most reported cases of 2019 Novel Coronavirus (COVID-19). This dataset contains daily updated number of reported cases & deaths in the US on the state and county level, as provided by the Johns Hopkins University. In addition, I provide matching demographic information for US counties.
The dataset consists of two main csv files: covid_us_county.csv and us_county.csv. See the column descriptions below for more detailed information. In addition, I've added US county shape files for geospatial plots: us_county.shp/dbf/prj/shx.
covid_us_county.csv: COVID-19 cases and deaths which will be updated daily. The data is provided by the Johns Hopkins University through their excellent github repo. I combined the separate "confirmed cases" and "deaths" files into a single table, removed a few (I think to be) redundant geo identifier columns, and reshaped the data into long format with a single date column. The earliest recorded cases are from 2020-01-22.
us_counties.csv: Demographic information on the US county level based on the (most recent) 2014-18 release of the American Community Survey. Derived via the great tidycensus package.
COVID-19 dataset covid_us_county.csv:
fips: County code in numeric format (i.e. no leading zeros). A small number of cases have NA values here, but can still be used for state-wise aggregation. Currently, this only affects the states of Massachusetts and Missouri.
county: Name of the US county. This is NA for the (aggregated counts of the) territories of American Samoa, Guam, Northern Mariana Islands, Puerto Rico, and Virgin Islands.
state: Name of US state or territory.
state_code: Two letter abbreviation of US state (e.g. "CA" for "California"). This feature has NA values for the territories listed above.
lat and long: coordinates of the county or territory.
date: Reporting date.
cases & deaths: Cumulative numbers for cases & deaths.
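Since cases and deaths are cumulative here, daily increments have to be derived by differencing within each county. A minimal sketch with pandas, using made-up rows in the documented layout:

```python
import pandas as pd

# Made-up cumulative counts for two counties (not real data).
covid = pd.DataFrame({
    "fips": [1001, 1001, 1001, 1003, 1003, 1003],
    "date": pd.to_datetime(["2020-03-24", "2020-03-25", "2020-03-26"] * 2),
    "cases": [1, 4, 6, 0, 2, 2],
})

# Sort so that differencing runs in chronological order per county.
covid = covid.sort_values(["fips", "date"])

# diff() is NaN on each county's first row; use the cumulative value there.
covid["new_cases"] = covid.groupby("fips")["cases"].diff().fillna(covid["cases"])
print(covid)
```

This sketch assumes every row has a fips value; rows with NA fips would need separate handling (for example, grouping by state instead).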
Demographic dataset us_counties.csv:
fips, county, state, state_code: same as above. The county names are slightly different, but mostly the difference is that this dataset has the word "County" added. I recommend joining on fips.
male & female: Population numbers for male and female.
population: Total population for the county. Provided as convenience feature; is always the sum of male + female.
female_percentage: Another convenience feature: female / population in percent.
median_age: Overall median age for the county.
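The recommendation to join on fips can be sketched as follows. The rows and numbers are made up for illustration; the point is that county names differ between the two tables while the numeric fips code matches, and that rows with NA fips must be set aside before the join.

```python
import pandas as pd

# Made-up rows in the documented layout (not real data).
covid = pd.DataFrame({
    "fips": [6037.0, 6037.0, None],  # NA fips occurs for a few rows
    "county": ["Los Angeles", "Los Angeles", "Unassigned"],
    "date": ["2020-03-25", "2020-03-26", "2020-03-26"],
    "cases": [100, 150, 3],
})
demographics = pd.DataFrame({
    "fips": [6037],
    "county": ["Los Angeles County"],  # name differs, so do not join on it
    "population": [10_000_000],
})

# Drop rows without a county code and make the key an integer on both sides.
with_fips = covid.dropna(subset=["fips"]).astype({"fips": "int64"})

merged = with_fips.merge(
    demographics[["fips", "population"]],
    on="fips",
    how="left",
    validate="many_to_one",  # each county should appear once in demographics
)
merged["cases_per_100k"] = merged["cases"] * 100_000 / merged["population"]
print(merged)
```

The `validate` argument makes pandas raise an error if the demographic table unexpectedly contains duplicate fips codes.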
Data provided for educational and academic research purposes by the Johns Hopkins University Center for Systems Science and Engineering (JHU CSSE).
The github repo states that:
This GitHub repo and its contents herein, including all data, mapping, and analysis, copyright 2020 Johns Hopkins University, all rights reserved, is provided to the public strictly for educational and academic research purposes. The Website relies upon publicly available data from multiple sources, that do not always agree. The Johns Hopkins University hereby disclaims any and all representations and warranties with respect to the Website, including accuracy, fitness for use, and merchantability. Reliance on the Website for medical guidance or use of the Website in commerce is strictly prohibited.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Outcomes of male vs female adults with COVID-19.
Background: We aimed to determine the trend of TB-related deaths during the COVID-19 pandemic.
Methods: TB-related mortality data of decedents aged ≥25 years from 2006 to 2021 were analyzed. Excess deaths were estimated by determining the difference between observed and projected mortality rates during the pandemic.
Results: A total of 18,628 TB-related deaths were documented from 2006 to 2021. TB-related age-standardized mortality rates (ASMRs) were 0.51 in 2020 and 0.52 in 2021, corresponding to an excess mortality of 10.22 and 9.19%, respectively. Female patients with TB demonstrated a higher relative increase in mortality (26.33 vs. 2.17% in 2020; 21.48 vs. 3.23% in 2021) when compared to male patients. Females aged 45–64 years showed a surge in mortality, with an annual percent change (APC) of −2.2% pre-pandemic to 22.8% (95% CI: −1.7 to 68.7%) during the pandemic, corresponding to excess mortalities of 62.165 and 99.16% in 2020 and 2021, respectively; these excess mortality rates were higher than those observed in the overall female population aged 45–64 years in 2020 (17.53%) and 2021 (33.79%).
Conclusion: The steady decline in TB-related mortality in the United States has been reversed by COVID-19. Females with TB were disproportionately affected by the pandemic.
This study explored the change in mortality rates of respiratory disease during the coronavirus disease 2019 (COVID-19) pandemic. Death data of registered residents of Suzhou from 2014 to 2020 were collected and the weekly mortality rates due to respiratory disease and all deaths were analyzed. The differences in mortality rates during the pandemic and the same period in previous years were compared. Before the pandemic, the crude mortality rate (CMR) and standardized mortality rate (SMR) of Suzhou residents, including for respiratory disease, were not much different from those in previous years. During the emergency period, the CMR of Suzhou residents was 180.2/100,000 and the SMR was 85.5/100,000, decreasing by 9.1% and 14.6%, respectively; the CMR of respiratory disease was 16.4/100,000 and the SMR was 6.8/100,000, down 41.4% and 44.9%, respectively. For both all deaths and respiratory disease, mortality rates were higher in males than in females, although males had a slightly greater decrease in all deaths during the emergency period compared with females, and the opposite was true for respiratory disease. During the pandemic, the death rate of residents decreased, especially that due to respiratory disease.
http://opendatacommons.org/licenses/dbcl/1.0/
Greetings everyone! I hope you find this dataset valuable for your COVID-19 models. It is aligned with SRK's Novel Corona Virus dataset. Feel free to upvote if you use it!
This dataset contains what I find as essential demographic information for every country specified in the submission COVID-19 competition file. Moreover, there is additional data which is critical in my point of view in order to predict the infection rate and mortality rate per country such as the number of COVID detection tests, detection date of 'patient zero' and initial restrictions dates. Please look at the columns description for the comprehensive explanation.
My
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Background: The differential effect of comorbidities on COVID-19 severe outcomes by sex has not been fully evaluated.
Objective: To examine the association of major comorbidities and COVID-19 mortality in men and women separately.
Methods: We performed a retrospective cohort analysis using a large electronic health record (EHR) database in the U.S. We included adult patients with a clinical diagnosis of COVID-19 who also had necessary information on demographics and comorbidities from January 1, 2016 to October 31, 2021. We defined comorbidities by the Charlson Comorbidity Index (CCI) using ICD-10 codes at or before the COVID-19 diagnosis. We conducted logistic regressions to compare the risk of death associated with comorbidities stratifying by sex.
Results: A total of 121,342 patients were included in the final analysis. We found significant sex differences in the association between comorbidities and COVID-19 death. Specifically, moderate/severe liver disease, dementia, metastatic solid tumor, and heart failure and the increased number of comorbidities appeared to confer a greater magnitude of mortality risk in women compared to men.
Conclusions: Our study suggests sex differences in the effect of comorbidities on COVID-19 mortality and highlights the importance of implementing sex-specific preventive or treatment approaches in patients with COVID-19.
https://creativecommons.org/publicdomain/zero/1.0/
The datasets hold information about the cases and deaths from COVID-19 for multiple countries between January 22, 2020 and March 30, 2020. There is a separate sheet for every country. The following is the information that the dataset holds.
A separate CSV sheet is provided for each country. The datasets are intended to be updated periodically to reflect current COVID-19 values.
Special thanks to - https://www.kaggle.com/koryto/countryinfo for providing the much essential information for building the dataset.
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0) https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
Overview
The COVID-19 Patient Recovery Dataset is a synthetic collection of anonymized records for around 70,000 COVID-19 patients. It aims to assist with classification tasks in machine learning and epidemiological research. The dataset includes detailed clinical and demographic information, such as symptoms, existing health issues, vaccination status, COVID-19 variants, treatment details, and outcomes related to recovery or mortality. This dataset is great for predicting patient recovery (recovered), mortality (death), disease severity (severity), or the need for intensive care (icu_admission) using algorithms like Logistic Regression, Random Forest, XGBoost, or Neural Networks. It also allows for exploratory data analysis (EDA), statistical modeling, and time-series studies to find patterns in COVID-19 outcomes.
The data is synthetic and reflects realistic trends found in public health data, based on sources like WHO reports. It ensures privacy and follows ethical guidelines. Dates are provided in Excel serial format, meaning 44447 corresponds to September 8, 2021, and can be converted to standard dates using Python’s datetime or Excel. With 70,000 records and 28 columns, this dataset serves as a valuable resource for data scientists, researchers, and students interested in health-related machine learning or pandemic trends.
Data Source and Collection
Source: Synthetic data based on public health patterns from sources like the World Health Organization (WHO). It includes placeholder URLs.
Collection Period: Simulated from early 2020 to mid-2022, covering the Alpha, Delta, and Omicron waves.
Number of Records: 70,000.
File Format: CSV, which works with Pandas, R, Excel, and more.
Data Quality Notes:
About 5% of the values are missing in fields like symptoms_2, symptoms_3, treatment_given_2, and date.
There are rare inconsistencies, such as between recovery/death flags and dates, which may need some preprocessing.
Unique, anonymized patient IDs.
| Column Name | Data Type |
|---|---|
| patient_id | String |
| country | String |
| region/state | String |
| date_reported | Integer |
| age | Integer |
| gender | String |
| comorbidities | String |
| symptoms_1 | String |
| symptoms_2 | String |
| symptoms_3 | String |
| severity | String |
| hospitalized | Integer |
| icu_admission | Integer |
| ventilator_support | Integer |
| vaccination_status | String |
| variant | String |
| treatment_given_1 | String |
| treatment_given_2 | String |
| days_to_recovery | Integer |
| recovered | Integer |
| death | Integer |
| date_of_recovery | Integer |
| date_of_death | Integer |
| tests_conducted | Integer |
| test_type | String |
| hospital_name | String |
| doctor_assigned | String |
| source_url | String |
Key Column Details
patient_id: Unique identifier (e.g., P000001).
country: Reporting country (e.g., India, USA, Brazil, Germany, China, Pakistan, South Africa, UK).
region/state: Sub-national region (e.g., Sindh, California, São Paulo, Beijing).
date_reported, date_of_recovery, date_of_death: Excel serial dates (convert using datetime(1899,12,30) + timedelta(days=value)).
age: Patient age (1–100 years).
gender: Male or Female.
comorbidities: Pre-existing conditions (e.g., Diabetes, Hypertension, Cancer, Heart Disease, Asthma, None).
symptoms_1, symptoms_2, symptoms_3: Reported symptoms (e.g., Cough, Fever, Fatigue, Loss of Smell, Sore Throat, or empty).
severity: Case severity (Mild, Moderate, Severe, Critical).
hospitalized, icu_admission, ventilator_support: Binary (1 = Yes, 0 = No).
vaccination_status: None, Partial, Full, or Booster.
variant: COVID-19 variant (Omicron, Delta, Alpha).
treatment_given_1, treatment_given_2: Treatments administered (e.g., Antibiotics, Remdesivir, Oxygen, Steroids, Paracetamol, or empty).
days_to_recovery: Days from report to recovery (5–30, or empty if not recovered).
recovered, death: Binary outcomes (1 = Yes, 0 = No; generally mutually exclusive).
tests_conducted: Number of tests (1–5).
test_type: PCR or Antigen.
hospital_name: Fictional hospital (e.g., Aga Khan, Mayo Clinic, NHS Trust).
doctor_assigned: Fictional doctor name (e.g., Dr. Smith, Dr. Müller).
source_url: Placeholder.
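The Excel serial date conversion mentioned for date_reported, date_of_recovery, and date_of_death can be sketched as below. The helper name `excel_serial_to_date` is hypothetical; the epoch of 1899-12-30 is the one given in the column details, and missing values are passed through as None on the assumption that empty date cells load as NaN.

```python
from datetime import date, datetime, timedelta
from typing import Optional

EXCEL_EPOCH = datetime(1899, 12, 30)  # epoch stated in the column details


def excel_serial_to_date(value: Optional[float]) -> Optional[date]:
    """Convert an Excel serial day number to a calendar date.

    Returns None for missing input (None or NaN).
    """
    if value is None or value != value:  # NaN is the only value unequal to itself
        return None
    return (EXCEL_EPOCH + timedelta(days=int(value))).date()


print(excel_serial_to_date(44447))  # 2021-09-08
```

For a whole pandas column, `pd.to_datetime(series, unit="D", origin="1899-12-30")` should give the same result in vectorised form.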
Summary Statistics
Total Patients: 70,000.
Age: Mean ~50 years, Min 1, Max 100, evenly distributed.
Gender: ~50% Male, ~50% Female.
Top Countries: USA (20%), India (18%), Brazil (15%), China (12%), Germany (10%).
Comorbidities: Diabetes (25%), Hypertension (20%), Cancer (15%), Heart Disease (15%), Asthma (10%), None (15%).
Severity: Mild (60%), Moderate (25%), Severe (10%), Critical (5%).
Recovery Rate: ~60% recovered (recovered=1), ~30% deceased (death=1), ~10% unresolved (both 0).
Vaccination: None (40%), Full (30%), Partial (15%), Booster (15%).
Variants: Omicron (50%), Delt...
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Summary of demographics and characteristics of male vs female adults with COVID-19.
This repository contains datasets about the number of Italian Sars-CoV-2 confirmed cases and deaths disaggregated by age group and sex. The data is (automatically) extracted from pdf reports (like this) published by Istituto Superiore di Sanità (ISS) two times a week. A link to the most recent report can be found in this page under section "Documento esteso".
PDF reports are usually published on Tuesday and Friday and contain data updated to 4 p.m. of the day before their release.
I wrote a script that is run periodically in order to automatically update this repository when a new report is published. The code is hosted in a separate repository.
For feedback and issues, refer to the GitHub repository.
The data folder is structured as follows:
```
data
├── by-date
│   └── iccas_{date}.csv    Dataset with cases/deaths updated to 4 p.m. of {date}
└── iccas_full.csv          Dataset with data from all reports (by date)
```
The full dataset is obtained by concatenating all datasets in by-date and has an additional date column. If you use pandas, I suggest reading this dataset using a multi-index on the first two columns:

```python
import pandas as pd

# Use the first two columns ('date', 'age_group') as a MultiIndex.
df = pd.read_csv('iccas_full.csv', index_col=[0, 1])
```

NOTE: {date} is the date the data refers to, NOT the release date of the report it was extracted from: as written above, a report is usually released with a one-day delay. For example, iccas_2020-03-19.csv contains data relative to 2020-03-19 which was extracted from the report published on 2020-03-20.
Each dataset in the by-date folder contains the same data you can find in "Table 1" of the corresponding ISS report. This table contains the number of confirmed cases, deaths and other derived information disaggregated by age group (0-9, 10-19, ..., 80-89, >=90) and sex.
WARNING: the sum of male and female cases is not equal to the total number of cases, since the sex of some cases is unknown. The same applies to deaths.
Below, {sex} can be male or female.
| Column | Description |
|---|---|
| date | (Only in iccas_full.csv) Date in the format YYYY-MM-DD; numbers are updated to 4 p.m. of this date |
| age_group | Values: "0-9", "10-19", ..., "80-89", ">=90" |
| cases | Number of confirmed cases (both sexes + unknown-sex; active + closed) |
| deaths | Number of deaths (both sexes + unknown-sex) |
| {sex}_cases | Number of cases of sex {sex} |
| {sex}_deaths | Number of cases of sex {sex} that ended in death |
| cases_percentage | 100 * cases / cases_of_all_ages |
| deaths_percentage | 100 * deaths / deaths_of_all_ages |
| fatality_rate | 100 * deaths / cases |
| {sex}_cases_percentage | 100 * {sex}_cases / (male_cases + female_cases) (cases of unknown sex excluded) |
| {sex}_deaths_percentage | 100 * {sex}_deaths / (male_deaths + female_deaths) (cases of unknown sex excluded) |
| {sex}_fatality_rate | 100 * {sex}_deaths / {sex}_cases |

All columns that can be computed from absolute counts of cases and deaths (bottom half of the table above) were re-computed to increase precision.
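As an illustration of the formulas in the table, the derived columns can be re-computed from the absolute counts roughly as follows. This is a sketch on made-up numbers for a single date, not the repository's actual script; with iccas_full.csv the totals would have to be taken per date.

```python
import pandas as pd

# Made-up counts for one report date (not real data).
df = pd.DataFrame({
    "age_group": ["0-9", "10-19"],
    "cases": [200, 800],
    "deaths": [1, 4],
    "male_cases": [90, 400],
    "female_cases": [100, 380],  # male + female < cases: some sex unknown
}).set_index("age_group")

# Shares of each age group in the total, and the fatality rate.
df["cases_percentage"] = 100 * df["cases"] / df["cases"].sum()
df["deaths_percentage"] = 100 * df["deaths"] / df["deaths"].sum()
df["fatality_rate"] = 100 * df["deaths"] / df["cases"]

# Sex-specific shares exclude cases of unknown sex from the denominator.
known_sex_cases = df["male_cases"] + df["female_cases"]
df["male_cases_percentage"] = 100 * df["male_cases"] / known_sex_cases
df["female_cases_percentage"] = 100 * df["female_cases"] / known_sex_cases
print(df)
```

Because unknown-sex cases are excluded from the denominator, the male and female percentages in each row sum to 100 even though the counts do not sum to `cases`.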
Notice: Starting October 10th, 2025 this dataset is deprecated and is no longer being updated. Please refer to the Open Data resource at https://data.maryland.gov/Health-and-Human-Services/COVID-Master-Tracker/37gh-4yqf for continued weekly updates.
Summary: The cumulative number of probable COVID-19 deaths among Maryland residents by gender: Female; Male; Unknown.
Description: The MD COVID-19 - Probable Deaths by Gender Distribution data layer is a collection of the statewide confirmed and probable COVID-19 related deaths that have been reported each day by the Vital Statistics Administration by gender. A death is classified as probable if the person's death certificate notes COVID-19 to be a probable, suspect or presumed cause or condition. Probable deaths have not yet been confirmed by a laboratory test. Some data on deaths may be unavailable due to the time lag between the death, typically reported by a hospital or other facility, and the submission of the complete death certificate. Confirmed deaths are available from the MD COVID-19 - Confirmed Deaths by Gender Distribution data layer.
COVID-19 is a disease caused by a respiratory virus first identified in Wuhan, Hubei Province, China in December 2019. COVID-19 is a new virus that hasn't caused illness in humans before. Worldwide, COVID-19 has resulted in thousands of infections, causing illness and in some cases death. Cases have spread to countries throughout the world, with more cases reported daily. The Maryland Department of Health reports daily on COVID-19 cases by county.
Understanding gender is essential to understanding the risk factors of poor health, early death and health inequities. The COVID-19 outbreak is no different. At this point in the pandemic, we are unable to provide a clear answer to the question of the extent to which sex and gender are influencing the health outcomes of people diagnosed with COVID-19. However, experience and evidence thus far tell us that both sex and gender are important drivers of risk and response to infection and disease.
http://globalhealth5050.org/covid19 https://data.humdata.org/dataset/covid-19-sex-disaggregated-data-tracker
In order to understand the role gender is playing in the COVID-19 outbreak, countries urgently need to begin both collecting and publicly reporting sex-disaggregated data. At a minimum, this should include the number of cases and deaths in men and women.
In collaboration with CNN, Global Health 50/50 began compiling publicly available sex-disaggregated data reported by national governments to date and is exploring how gender may be driving the higher proportion of reported deaths in men among confirmed cases so far.
Photo by Nick Fewings on Unsplash
Covid-19 Pandemic.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset contains COVID-19 Daily Statistics and the Profile of Covid-19 Daily Statistics for Ireland as reported by the Health Protection Surveillance Centre. This data includes confirmed cases (PCR) only and does not include positive antigen results uploaded to the HSE portal. Time series dataset from March 2020 to November 2023.
Deaths: From 16th May 2022 to November 2023, reporting of Notified Deaths changed from daily to weekly. Data on deaths is based on the date on which a death was notified on CIDR, not the date on which the death occurred. Data on deaths by date of death is available on the HPSC Respiratory Virus Notification Hub https://respiratoryvirus.hpsc.ie/.
Notice: Please be advised that on 29th April 2021, the 'Aged65up' and 'HospitalisedAged65up' fields were removed from this table. The three fields 'Aged65to74', 'Aged75to84', and 'Aged85up' replace the 'Aged65up' field. The three fields 'HospitalisedAged65to74', 'HospitalisedAged75to84' and 'HospitalisedAged85up' replace the 'HospitalisedAged65up' field. On the week beginning 1st March 2021, the values in the following fields in this table were set to zero: 'CommunityTransmission', 'CloseContact', 'TravelAbroad' and 'ClustersNotified'.
The primary Date applies to the following fields: ConfirmedCovidCases, TotalConfirmedCovidCases, ConfirmedCovidDeaths, TotalCovidDeaths, ConfirmedCovidRecovered, SevenDayAverageCases.
The StatisticProfileDate applies to the following fields: CovidCasesConfirmed, HospitalisedCovidCases, RequiringICUCovidCases, HealthcareWorkersCovidCases, Clusters Notified, HospitalisedAged5, HospitalisedAged5to14, HospitalisedAged15to24, HospitalisedAged25to34, HospitalisedAged35to44, HospitalisedAged45to54, HospitalisedAged55to64, HospitalisedAged65to74, HospitalisedAged75to84, HospitalisedAged85up, Male, Female, Unknown, Aged1to4, Aged5to14, Aged15to24, Aged25to34, Aged35to44, Aged45to54, Aged55to64, Aged65to74, Aged75to84, Aged85up, MedianAgeCommunityTransmission, CloseContact, TravelAbroad, Total Deaths by Date of Death, Deaths by Date of Death.
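For analyses that previously relied on the removed 'Aged65up' and 'HospitalisedAged65up' fields, a combined 65+ figure can be approximated from the three replacement fields. The sketch below assumes the removed fields were simple sums of the three age bands, which the notice above does not state explicitly; the helper names `aged_65_up` and `hospitalised_aged_65_up` and the sample record are hypothetical.

```python
from typing import Mapping, Optional

# Replacement fields named in the notice above.
AGED_65_PLUS_FIELDS = ("Aged65to74", "Aged75to84", "Aged85up")
HOSPITALISED_65_PLUS_FIELDS = (
    "HospitalisedAged65to74",
    "HospitalisedAged75to84",
    "HospitalisedAged85up",
)


def _sum_fields(record: Mapping[str, Optional[int]], fields) -> int:
    """Sum the given fields, treating absent or empty values as zero."""
    return sum(record.get(name) or 0 for name in fields)


def aged_65_up(record: Mapping[str, Optional[int]]) -> int:
    """Approximate the removed 'Aged65up' field (assumed to be a plain sum)."""
    return _sum_fields(record, AGED_65_PLUS_FIELDS)


def hospitalised_aged_65_up(record: Mapping[str, Optional[int]]) -> int:
    """Approximate the removed 'HospitalisedAged65up' field (assumed sum)."""
    return _sum_fields(record, HOSPITALISED_65_PLUS_FIELDS)


# Hypothetical record for illustration only; not real data.
sample = {
    "Aged65to74": 10,
    "Aged75to84": 7,
    "Aged85up": 3,
    "HospitalisedAged65to74": 4,
    "HospitalisedAged75to84": None,
    "HospitalisedAged85up": 1,
}
combined = aged_65_up(sample)
combined_hospitalised = hospitalised_aged_65_up(sample)
```

Treating empty values as zero is a convenience for the sketch; depending on the analysis it may be preferable to propagate them as missing instead.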
https://creativecommons.org/publicdomain/zero/1.0/
Coronavirus disease 2019 (COVID‑19) is an infectious disease caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). It was first identified in December 2019 in Wuhan, Hubei, China, and has resulted in an ongoing pandemic. As of 12 August 2020, more than 20.2 million cases have been reported across 188 countries and territories, resulting in more than 741,000 deaths. More than 12.5 million people have recovered. Most people infected with the COVID-19 virus will experience mild to moderate respiratory illness and recover without requiring special treatment. Older people, and those with underlying medical problems like cardiovascular disease, diabetes, chronic respiratory disease, and cancer are more likely to develop serious illness.
These numbers are sampled exclusively from Denmark between 11th of March 2020 and 9th of August 2020.
This contains 10 data files:
Wiki about COVID-19 in Denmark: https://en.wikipedia.org/wiki/COVID-19_pandemic_in_Denmark
Dashboard with information on COVID-19 in Denmark: https://experience.arcgis.com/experience/aa41b29149f24e20a4007a0c4e13db1d
Current case count: https://www.worldometers.info/coronavirus/country/denmark/
https://creativecommons.org/publicdomain/zero/1.0/
Coronavirus disease (COVID-19) is an infectious disease caused by a newly discovered coronavirus. Most people infected with COVID-19 virus will experience mild to moderate respiratory illness and recover without requiring special treatment. Older people, and those with underlying medical problems like cardiovascular disease, diabetes, chronic respiratory disease, and cancer are more likely to develop serious illness. During the entire course of the pandemic, one of the main problems that healthcare providers have faced is the shortage of medical resources and a proper plan to efficiently distribute them. In these tough times, being able to predict what kind of resource an individual might require at the time of being tested positive or even before that will be of immense help to the authorities as they would be able to procure and arrange for the resources necessary to save the life of that patient.
The main goal of this project is to build a machine learning model that, given a Covid-19 patient's current symptoms, status, and medical history, will predict whether the patient is at high risk or not.
The dataset was provided by the Mexican government (link). This dataset contains a large volume of anonymized patient-related information, including pre-existing conditions. The raw dataset consists of 21 unique features and 1,048,576 unique patients. In the Boolean features, 1 means "yes" and 2 means "no". Values of 97 and 99 indicate missing data.
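The Boolean coding described above can be decoded before modelling. The following is a minimal sketch, not the project's actual preprocessing: the function name `decode_boolean` and the column names in the sample row are hypothetical, and treating any code other than 1 or 2 as missing is an assumption that goes slightly beyond the stated 97 and 99.

```python
from typing import Dict, Optional

# Codes the dataset description marks as missing data.
MISSING_CODES = {97, 99}


def decode_boolean(value: int) -> Optional[bool]:
    """Map the dataset's coding to True ("yes"), False ("no") or None (missing)."""
    if value == 1:
        return True
    if value == 2:
        return False
    # 97 and 99 are documented as missing. Any other unexpected code is
    # also returned as None here, which is an assumption of this sketch.
    return None


def decode_row(row: Dict[str, int]) -> Dict[str, Optional[bool]]:
    """Decode every Boolean feature in a row of raw integer codes."""
    return {name: decode_boolean(code) for name, code in row.items()}


# Hypothetical row for illustration only; column names are not from the source.
raw_row = {"DIABETES": 1, "OBESITY": 2, "PREGNANT": 97, "INTUBED": 99}
decoded_row = decode_row(raw_row)
```

Keeping missing values as `None` rather than folding them into "no" leaves the choice of imputation or row filtering to a later, explicit step.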