42 datasets found
  1. COVID-19 Dataset

    • kaggle.com
    zip
    Updated Nov 13, 2022
    Cite
    Meir Nizri (2022). COVID-19 Dataset [Dataset]. https://www.kaggle.com/datasets/meirnizri/covid19-dataset
    Available download formats: zip (4890659 bytes)
    Dataset updated
    Nov 13, 2022
    Authors
    Meir Nizri
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Context

    Coronavirus disease (COVID-19) is an infectious disease caused by a newly discovered coronavirus. Most people infected with the COVID-19 virus will experience mild to moderate respiratory illness and recover without requiring special treatment. Older people, and those with underlying medical problems like cardiovascular disease, diabetes, chronic respiratory disease, and cancer are more likely to develop serious illness. During the entire course of the pandemic, one of the main problems that healthcare providers have faced is the shortage of medical resources and the lack of a proper plan to distribute them efficiently. In these tough times, being able to predict what kind of resource an individual might require at the time of testing positive, or even before that, will be of immense help to the authorities, as they would be able to procure and arrange for the resources necessary to save the life of that patient.

    The main goal of this project is to build a machine learning model that, given a COVID-19 patient's current symptoms, status, and medical history, will predict whether the patient is at high risk or not.

    Content

    The dataset was provided by the Mexican government (link). This dataset contains a large amount of anonymized patient-related information, including pre-conditions. The raw dataset consists of 21 unique features and 1,048,576 unique patients. In the Boolean features, 1 means "yes" and 2 means "no". Values of 97 and 99 indicate missing data.

    • sex: 1 for female and 2 for male.
    • age: of the patient.
    • classification: covid test findings. Values 1-3 mean that the patient was diagnosed with covid in different degrees. 4 or higher means that the patient is not a carrier of covid or that the test is inconclusive.
    • patient type: type of care the patient received in the unit. 1 for returned home and 2 for hospitalization.
    • pneumonia: whether the patient already has air sac inflammation or not.
    • pregnancy: whether the patient is pregnant or not.
    • diabetes: whether the patient has diabetes or not.
    • copd: Indicates whether the patient has Chronic obstructive pulmonary disease or not.
    • asthma: whether the patient has asthma or not.
    • inmsupr: whether the patient is immunosuppressed or not.
    • hypertension: whether the patient has hypertension or not.
    • cardiovascular: whether the patient has a heart or blood vessel related disease.
    • renal chronic: whether the patient has chronic renal disease or not.
    • other disease: whether the patient has other disease or not.
    • obesity: whether the patient is obese or not.
    • tobacco: whether the patient is a tobacco user.
    • usmr: Indicates whether the patient was treated in medical units of the first, second or third level.
    • medical unit: type of institution of the National Health System that provided the care.
    • intubed: whether the patient was connected to the ventilator.
    • icu: Indicates whether the patient had been admitted to an Intensive Care Unit.
    • date died: the date of death if the patient died, and 9999-99-99 otherwise.
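
    The coding conventions described above (1 = yes, 2 = no, 97 and 99 = missing, 9999-99-99 = no death recorded) can be decoded along these lines. This is only a minimal sketch: the helper names and the sample record keys are hypothetical, and the real file's column names and date format are not confirmed by this listing.

```python
# Codes taken from the dataset description; everything else is illustrative.
MISSING_CODES = {97, 99}           # description: 97 and 99 are missing data
NO_DEATH_SENTINEL = "9999-99-99"   # description: used when the patient did not die


def decode_boolean(code):
    """Map 1 -> True, 2 -> False, and 97/99 -> None (missing)."""
    if code in MISSING_CODES:
        return None
    if code == 1:
        return True
    if code == 2:
        return False
    raise ValueError(f"unexpected Boolean code: {code}")


def patient_died(date_died):
    """A patient is treated as deceased unless the sentinel date is present."""
    return date_died != NO_DEATH_SENTINEL


# Hypothetical record; key names are assumptions, not the file's real headers.
record = {"diabetes": 2, "pregnancy": 97, "icu": 1, "date_died": "9999-99-99"}
decoded = {
    "diabetes": decode_boolean(record["diabetes"]),
    "pregnancy": decode_boolean(record["pregnancy"]),
    "icu": decode_boolean(record["icu"]),
    "died": patient_died(record["date_died"]),
}
```

    Note that sex uses its own coding (1 for female, 2 for male), so it should not be passed through the Boolean decoder.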
  2. MD COVID-19 - Confirmed Deaths by Gender Distribution

    • catalog.data.gov
    • opendata.maryland.gov
    Updated Oct 18, 2025
    Cite
    opendata.maryland.gov (2025). MD COVID-19 - Confirmed Deaths by Gender Distribution [Dataset]. https://catalog.data.gov/dataset/md-covid-19-confirmed-deaths-by-gender-distribution
    Dataset updated
    Oct 18, 2025
    Dataset provided by
    opendata.maryland.gov
    Description

    Note: Starting October 10th, 2025 this dataset is deprecated and is no longer being updated. As of April 27, 2023 updates changed from daily to weekly.

    Summary

    The cumulative number of confirmed COVID-19 deaths among Maryland residents by gender: Female; Male; Unknown.

    Description

    The MD COVID-19 - Confirmed Deaths by Gender Distribution data layer is a collection of the statewide confirmed and probable COVID-19 related deaths that have been reported each day by the Vital Statistics Administration by gender. A death is classified as confirmed if the person had a laboratory-confirmed positive COVID-19 test result. Some data on deaths may be unavailable due to the time lag between the death, typically reported by a hospital or other facility, and the submission of the complete death certificate. Probable deaths are available from the MD COVID-19 - Probable Deaths by Gender Distribution data layer.

    Terms of Use

    The Spatial Data, and the information therein, (collectively the "Data") is provided "as is" without warranty of any kind, either expressed, implied, or statutory. The user assumes the entire risk as to quality and performance of the Data. No guarantee of accuracy is granted, nor is any responsibility for reliance thereon assumed. In no event shall the State of Maryland be liable for direct, indirect, incidental, consequential or special damages of any kind. The State of Maryland does not accept liability for any damages or misrepresentation caused by inaccuracies in the Data or as a result of changes to the Data, nor is there responsibility assumed to maintain the Data in any manner or form. The Data can be freely distributed as long as the metadata entry is not modified or deleted. Any data derived from the Data must acknowledge the State of Maryland in the metadata.

  3. COVID-19 HPSC Detailed Statistics Profile

    • covid-19.geohive.ie
    • geohive.ie
    Updated Mar 31, 2020
    Cite
    content_geohive (2020). COVID-19 HPSC Detailed Statistics Profile [Dataset]. https://covid-19.geohive.ie/datasets/d8eb52d56273413b84b0187a4e9117be
    Dataset updated
    Mar 31, 2020
    Dataset authored and provided by
    content_geohive
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Please see the FAQ for the latest information on COVID-19 Data Hub data flows: https://covid-19.geohive.ie/pages/helpfaqs.

    Notice: See the Technical Data Issues section in the FAQ for information about issues in data: https://covid-19.geohive.ie/pages/helpfaqs.

    Deaths: From 16th May 2022 onwards, reporting of Notified Deaths will be weekly (each Wednesday) with deaths notified since the previous Wednesday reported. This is based on the date on which a death was notified on CIDR, not the date on which the death occurred. Data on deaths by date of death is available on the new HPSC Epidemiology of COVID-19 Data Hub https://epi-covid-19-hpscireland.hub.arcgis.com/.

    Notice: Please be advised that on 29th April 2021, the 'Aged65up' and 'HospitalisedAged65up' fields were removed from this table. The three fields 'Aged65to74', 'Aged75to84', and 'Aged85up' replace the 'Aged65up' field. The three fields 'HospitalisedAged65to74', 'HospitalisedAged75to84' and 'HospitalisedAged85up' replace the 'HospitalisedAged65up' field.

    Please be advised that on the week beginning 1st March 2021, the values in the following fields in this table were set to zero: 'CommunityTransmission', 'CloseContact', 'TravelAbroad' and 'ClustersNotified'.

    This feature service contains the up to date Covid-19 Daily Statistics as well as the Profile of Covid-19 Daily Statistics for Ireland, as reported by the Health Protection Surveillance Centre. The Covid-19 Daily Statistics are updated once a week, each Wednesday, which includes data for the full time series. Data on deaths is updated once a week, each Wednesday, which includes data for the full time series. The further breakdown of these counts (age, gender, transmission, etc.) is part of a Daily Statistics Profile of Covid-19, to help identify patterns and trends.

    The primary Date applies to the following fields: ConfirmedCovidCases, TotalConfirmedCovidCases, ConfirmedCovidDeaths, TotalCovidDeaths, ConfirmedCovidRecovered, SevenDayAverageCases.

    The StatisticProfileDate applies to the following fields: CovidCasesConfirmed, HospitalisedCovidCases, RequiringICUCovidCases, HealthcareWorkersCovidCases, Clusters Notified, HospitalisedAged5, HospitalisedAged5to14, HospitalisedAged15to24, HospitalisedAged25to34, HospitalisedAged35to44, HospitalisedAged45to54, HospitalisedAged55to64, HospitalisedAged65to74, HospitalisedAged75to84, HospitalisedAged85up, Male, Female, Unknown, Aged1to4, Aged5to14, Aged15to24, Aged25to34, Aged35to44, Aged45to54, Aged55to64, Aged65to74, Aged75to84, Aged85up, MedianAge, CommunityTransmission, CloseContact, TravelAbroad, Total Deaths by Date of Death, Deaths by Date of Death.
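
    Code written against the older schema may still expect the removed 'Aged65up' and 'HospitalisedAged65up' fields. The sketch below rebuilds them from the three replacement fields, assuming the new bands exactly partition the old one (the notice says they "replace" it but does not state this outright). The function name and the sample row are hypothetical.

```python
# Suffixes of the three fields that replaced the single 65+ field.
BAND_SUFFIXES = ("65to74", "75to84", "85up")


def rebuild_65_up(row, prefix="Aged"):
    """Sum the three replacement bands into the former 65+ total.

    Use prefix="Aged" for case counts and prefix="HospitalisedAged" for
    hospitalised counts. Assumes the bands do not overlap.
    """
    return sum(row[f"{prefix}{suffix}"] for suffix in BAND_SUFFIXES)


# Made-up values for illustration only.
sample_row = {
    "Aged65to74": 10, "Aged75to84": 7, "Aged85up": 3,
    "HospitalisedAged65to74": 4, "HospitalisedAged75to84": 3, "HospitalisedAged85up": 2,
}
aged_65_up = rebuild_65_up(sample_row)
hospitalised_65_up = rebuild_65_up(sample_row, prefix="HospitalisedAged")
```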

  4. COVID-19 Tracking Germany

    • kaggle.com
    zip
    Updated Feb 7, 2023
    Cite
    Heads or Tails (2023). COVID-19 Tracking Germany [Dataset]. https://www.kaggle.com/datasets/headsortails/covid19-tracking-germany
    Available download formats: zip (14492010 bytes)
    Dataset updated
    Feb 7, 2023
    Authors
    Heads or Tails
    Area covered
    Germany
    Description

    Read the associated blogpost for a detailed description of how this dataset was prepared; plus extra code for producing animated maps.

    Context

    The 2019 Novel Coronavirus (COVID-19) continues to spread in countries around the world. This dataset provides daily updated number of reported cases & deaths in Germany on the federal state (Bundesland) and county (Landkreis/Stadtkreis) level. In April 2021 I added a dataset on vaccination progress. In addition, I provide geospatial shape files and general state-level population demographics to aid the analysis.

    Content

    The dataset consists of three main csv files: covid_de.csv, demographics_de.csv, and covid_de_vaccines.csv. The geospatial shapes are included in the de_state.* files. See the column descriptions below for more detailed information.

    • covid_de.csv: COVID-19 cases and deaths which will be updated daily. The original data are being collected by Germany's Robert Koch Institute and can be downloaded through the National Platform for Geographic Data (the latter site also hosts an interactive dashboard). I reshaped and translated the data (using R tidyverse tools) to make it more accessible. This blogpost explains how I prepared the data, and describes how to produce animated maps.

    • demographics_de.csv: General Demographic Data about Germany on the federal state level. Those have been downloaded from Germany's Federal Office for Statistics (Statistisches Bundesamt) through their Open Data platform GENESIS. The data reflect the (most recent available) estimates on 2018-12-31. You can find the corresponding table here.

    • covid_de_vaccines.csv: In April 2021 I added this file that contains the Covid-19 vaccination progress for Germany as a whole. It details daily doses, broken down cumulatively by manufacturer, as well as the cumulative number of people having received their first and full vaccination. The earliest data are from 2020-12-27.

    • de_state.*: Geospatial shape files for Germany's 16 federal states. Downloaded via Germany's Federal Agency for Cartography and Geodesy. Specifically, the shape file was obtained from this link.

    Column Description

    COVID-19 dataset covid_de.csv:

    • state: Name of the German federal state. Germany has 16 federal states. I removed or converted special characters from the original data.

    • county: The name of the German Landkreis (LK) or Stadtkreis (SK), which correspond roughly to US counties.

    • age_group: The COVID-19 data is being reported for 6 age groups: 0-4, 5-14, 15-34, 35-59, 60-79, and above 80 years old. As a shorthand for the last category I'm using "80-99", but there might well be persons above 99 years old in this dataset. This column has a few NA entries.

    • gender: Reported as male (M) or female (F). This column has a few NA entries.

    • date: The calendar date on which a case or death was reported. There might be delays that will be corrected by retroactively assigning cases to earlier dates.

    • cases: COVID-19 cases that have been confirmed through laboratory work. This and the following 2 columns are counts per day, not cumulative counts.

    • deaths: COVID-19 related deaths.

    • recovered: Recovered cases.

    Demographic dataset demographics_de.csv:

    • state, gender, age_group: same as above. The demographic data is available in higher age resolution, but I have binned it here to match the corresponding age groups in the covid_de.csv file.

    • population: Population counts for the respective categories. These numbers reflect the (most recent available) estimates on 2018-12-31.
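
    Because covid_de.csv and demographics_de.csv share the state, gender, and age_group columns, daily counts can be summed and related to population. The following is a rough sketch of that join using plain dictionaries; the sample rows and numbers are invented for illustration, and rows with NA (here None) in a key column are simply skipped, which is one possible choice rather than the author's method.

```python
from collections import defaultdict

KEY_COLUMNS = ("state", "age_group", "gender")


def cases_per_100k(covid_rows, demographics_rows):
    """Sum daily cases per (state, age_group, gender) and scale by population."""
    population = {
        tuple(row[c] for c in KEY_COLUMNS): row["population"]
        for row in demographics_rows
    }
    totals = defaultdict(int)
    for row in covid_rows:
        key = tuple(row[c] for c in KEY_COLUMNS)
        if None in key:
            continue  # age_group and gender have a few NA entries
        totals[key] += row["cases"]  # cases are per-day counts, so summing is valid
    return {
        key: 100_000 * count / population[key]
        for key, count in totals.items()
        if key in population
    }


# Invented sample data, shaped like the described columns.
covid_rows = [
    {"state": "Bayern", "age_group": "15-34", "gender": "F", "cases": 3},
    {"state": "Bayern", "age_group": "15-34", "gender": "F", "cases": 2},
    {"state": "Bayern", "age_group": None, "gender": "M", "cases": 1},
]
demographics_rows = [
    {"state": "Bayern", "age_group": "15-34", "gender": "F", "population": 200_000},
]
rates = cases_per_100k(covid_rows, demographics_rows)
```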

    Vaccination progress dataset covid_de_vaccines.csv:

    • date: calendar date of vaccination

    • doses, doses_first, doses_second: Daily count of administered doses: total, 1st shot, 2nd shot.

    • pfizer_cumul, moderna_cumul, astrazeneca_cumul: Daily cumulative number of administered vaccinations by manufacturer.

    • persons_first_cumul, persons_full_cumul: Daily cumulative number of people having received their 1st shot and full vaccination, respectively.

    Acknowledgements

    All the data have been extracted from open data sources which are being gratefully acknowledged:

    • The [Robert ...
  5. Coronavirus (COVID-19) related deaths by occupation, England and Wales

    • ons.gov.uk
    • cy.ons.gov.uk
    xlsx
    Updated Jan 25, 2021
    Cite
    Office for National Statistics (2021). Coronavirus (COVID-19) related deaths by occupation, England and Wales [Dataset]. https://www.ons.gov.uk/peoplepopulationandcommunity/healthandsocialcare/causesofdeath/datasets/coronaviruscovid19relateddeathsbyoccupationenglandandwales
    Available download formats: xlsx
    Dataset updated
    Jan 25, 2021
    Dataset provided by
    Office for National Statistics (http://www.ons.gov.uk/)
    License

    Open Government Licence 3.0 http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/
    License information was derived automatically

    Description

    Provisional counts of the number of deaths and age-standardised mortality rates involving the coronavirus (COVID-19), by occupational groups, for deaths registered between 9 March and 28 December 2020 in England and Wales. Figures are provided for males and females.

  6. COVID-19 State Data

    • kaggle.com
    zip
    Updated Nov 3, 2020
    Cite
    Night Ranger (2020). COVID-19 State Data [Dataset]. https://www.kaggle.com/nightranger77/covid19-state-data
    Available download formats: zip (4501 bytes)
    Dataset updated
    Nov 3, 2020
    Authors
    Night Ranger
    Description

    This dataset is a per-state amalgamation of demographic, public health and other relevant predictors for COVID-19.

    Deaths, Infections and Tests by State

    The COVID Tracking Project: https://covidtracking.com/data/api

    The API fields positive, death and totalTestResults were used for, respectively, the Infected, Deaths and Tested columns in this dataset. Please read the documentation of the API for more context on those columns.
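
    The field-to-column mapping just described can be written down as a small lookup. This is a minimal sketch: only the three field names and three column names come from the description; the function name, the "state" key, and the sample record values are assumptions made for illustration.

```python
# API field name -> column name in this dataset, as stated in the description.
FIELD_TO_COLUMN = {
    "positive": "Infected",
    "death": "Deaths",
    "totalTestResults": "Tested",
}


def to_dataset_row(api_record):
    """Keep the state identifier and rename the three API fields."""
    row = {"State": api_record.get("state")}  # "state" key is an assumption
    for field, column in FIELD_TO_COLUMN.items():
        row[column] = api_record.get(field)   # None if the API omitted the field
    return row


# Invented record for illustration; real API responses carry many more fields.
example = to_dataset_row(
    {"state": "XX", "positive": 120, "death": 4, "totalTestResults": 900}
)
```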

    Predictor Data and Sources

    Population (2020)

    Density is people per meter squared https://worldpopulationreview.com/states/

    ICU Beds and Age 60+

    https://khn.org/news/as-coronavirus-spreads-widely-millions-of-older-americans-live-in-counties-with-no-icu-beds/

    GDP

    https://worldpopulationreview.com/states/gdp-by-state/

    Income per capita (2018)

    https://worldpopulationreview.com/states/per-capita-income-by-state/

    Gini

    https://en.wikipedia.org/wiki/List_of_U.S._states_by_Gini_coefficient

    Unemployment (2020)

    Rates from Feb 2020 and are percentage of labor force
    https://www.bls.gov/web/laus/laumstrk.htm

    Sex (2017)

    Ratio is Male / Female
    https://www.kff.org/other/state-indicator/distribution-by-gender/

    Smoking Percentage (2020)

    https://worldpopulationreview.com/states/smoking-rates-by-state/

    Influenza and Pneumonia Death Rate (2018)

    Death rate per 100,000 people
    https://www.cdc.gov/nchs/pressroom/sosmap/flu_pneumonia_mortality/flu_pneumonia.htm

    Chronic Lower Respiratory Disease Death Rate (2018)

    Death rate per 100,000 people
    https://www.cdc.gov/nchs/pressroom/sosmap/lung_disease_mortality/lung_disease.htm

    Active Physicians (2019)

    https://www.kff.org/other/state-indicator/total-active-physicians/

    Hospitals (2018)

    https://www.kff.org/other/state-indicator/total-hospitals

    Health spending per capita

    Includes spending for all health care services and products by state of residence. Hospital spending is included and reflects the total net revenue. Costs such as insurance, administration, research, and construction expenses are not included.
    https://www.kff.org/other/state-indicator/avg-annual-growth-per-capita/

    Pollution (2019)

    Pollution: Average exposure of the general public to particulate matter of 2.5 microns or less (PM2.5) measured in micrograms per cubic meter (3-year estimate)
    https://www.americashealthrankings.org/explore/annual/measure/air/state/ALL

    Medium and Large Airports

    For each state, number of medium and large airports https://en.wikipedia.org/wiki/List_of_the_busiest_airports_in_the_United_States

    Temperature (2019)

    Note that FL was incorrect in the table, but is corrected in the Hottest States paragraph
    https://worldpopulationreview.com/states/average-temperatures-by-state/
    District of Columbia temperature computed as the average of Maryland and Virginia

    Urbanization (2010)

    Urbanization as a percentage of the population https://www.icip.iastate.edu/tables/population/urban-pct-states

    Age Groups (2018)

    https://www.kff.org/other/state-indicator/distribution-by-age/

    School Closure Dates

    Schools that haven't closed are marked NaN https://www.edweek.org/ew/section/multimedia/map-coronavirus-and-school-closures.html

    Note that some datasets above did not contain data for the District of Columbia; this missing data was found via Google searches and entered manually.

  7. COVID-19 US County JHU Data & Demographics

    • kaggle.com
    zip
    Updated Mar 1, 2023
    Cite
    Heads or Tails (2023). COVID-19 US County JHU Data & Demographics [Dataset]. https://www.kaggle.com/headsortails/covid19-us-county-jhu-data-demographics
    Available download formats: zip (40873869 bytes)
    Dataset updated
    Mar 1, 2023
    Authors
    Heads or Tails
    Area covered
    United States
    Description

    Context

    The United States have recently become the country with the most reported cases of 2019 Novel Coronavirus (COVID-19). This dataset contains daily updated number of reported cases & deaths in the US on the state and county level, as provided by the Johns Hopkins University. In addition, I provide matching demographic information for US counties.

    Content

    The dataset consists of two main csv files: covid_us_county.csv and us_counties.csv. See the column descriptions below for more detailed information. In addition, I've added US county shape files for geospatial plots: us_county.shp/dbf/prj/shx.

    • covid_us_county.csv: COVID-19 cases and deaths which will be updated daily. The data is provided by the Johns Hopkins University through their excellent github repo. I combined the separate "confirmed cases" and "deaths" files into a single table, removed a few (what I think to be) redundant geo identifier columns, and reshaped the data into long format with a single date column. The earliest recorded cases are from 2020-01-22.

    • us_counties.csv: Demographic information on the US county level based on the (most recent) 2014-18 release of the American Community Survey. Derived via the great tidycensus package.

    Column Description

    COVID-19 dataset covid_us_county.csv:

    • fips: County code in numeric format (i.e. no leading zeros). A small number of cases have NA values here, but can still be used for state-wise aggregation. Currently, this only affects the states of Massachusetts and Missouri.

    • county: Name of the US county. This is NA for the (aggregated counts of the) territories of American Samoa, Guam, Northern Mariana Islands, Puerto Rico, and Virgin Islands.

    • state: Name of US state or territory.

    • state_code: Two letter abbreviation of US state (e.g. "CA" for "California"). This feature has NA values for the territories listed above.

    • lat and long: coordinates of the county or territory.

    • date: Reporting date.

    • cases & deaths: Cumulative numbers for cases & deaths.
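
    Since cases and deaths are cumulative, daily new counts have to be derived by differencing consecutive reporting dates within each county. A minimal sketch of that step follows; the function name and the sample series are illustrative, and the input is assumed to be one county's values already sorted by date.

```python
def daily_from_cumulative(cumulative):
    """Convert a date-ordered cumulative series into per-day increments.

    The first day's increment equals the first cumulative value. Increments
    can be negative if the source later revised its totals downwards.
    """
    daily = []
    previous = 0
    for total in cumulative:
        daily.append(total - previous)
        previous = total
    return daily


# Invented cumulative case counts for one county over five reporting dates.
new_cases = daily_from_cumulative([1, 1, 4, 4, 9])
```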

    Demographic dataset us_counties.csv:

    • fips, county, state, state_code: same as above. The county names are slightly different, but mostly the difference is that this dataset has the word "County" added. I recommend joining on fips.

    • male & female: Population numbers for male and female.

    • population: Total population for the county. Provided as convenience feature; is always the sum of male + female.

    • female_percentage: Another convenience feature: female / population in percent.

    • median_age: Overall median age for the county.
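
    The recommended join on fips can be sketched as follows. This assumes fips may arrive either as an integer or as a digit string (possibly with leading zeros) and that NA is represented as None or an empty string; those representations, the function names, and the sample rows are assumptions, not details confirmed by the listing.

```python
def normalize_fips(value):
    """Return fips as an int (dropping leading zeros), or None if missing."""
    if value is None or value == "":
        return None
    return int(value)


def join_on_fips(covid_rows, county_rows):
    """Attach population to each COVID row whose fips matches a county row."""
    by_fips = {normalize_fips(row["fips"]): row for row in county_rows}
    joined = []
    for row in covid_rows:
        key = normalize_fips(row["fips"])
        match = by_fips.get(key)
        if key is None or match is None:
            continue  # NA fips rows remain usable for state-wise aggregation only
        joined.append({**row, "population": match["population"]})
    return joined


# Invented rows for illustration.
covid_rows = [
    {"fips": 6037, "state": "California", "cases": 10},
    {"fips": None, "state": "Massachusetts", "cases": 2},
]
county_rows = [{"fips": "06037", "population": 1000}]
joined = join_on_fips(covid_rows, county_rows)
```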

    Acknowledgements

    Data provided for educational and academic research purposes by the Johns Hopkins University Center for Systems Science and Engineering (JHU CSSE).

    Licence

    The github repo states that:

    This GitHub repo and its contents herein, including all data, mapping, and analysis, copyright 2020 Johns Hopkins University, all rights reserved, is provided to the public strictly for educational and academic research purposes. The Website relies upon publicly available data from multiple sources, that do not always agree. The Johns Hopkins University hereby disclaims any and all representations and warranties with respect to the Website, including accuracy, fitness for use, and merchantability. Reliance on the Website for medical guidance or use of the Website in commerce is strictly prohibited.
    

    Version history

    • In version 1, a small number of cases had values of `county == "Unassigned"`. Those have been superseded.
    • Version 5: added US county shape files
  8. Outcomes of male vs female adults with COVID-19.

    • figshare.com
    • plos.figshare.com
    xls
    Updated Jun 9, 2023
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Ninh T. Nguyen; Justine Chinn; Morgan De Ferrante; Katharine A. Kirby; Samuel F. Hohmann; Alpesh Amin (2023). Outcomes of male vs female adults with COVID-19. [Dataset]. http://doi.org/10.1371/journal.pone.0254066.t002
    Available download formats: xls
    Dataset updated
    Jun 9, 2023
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Ninh T. Nguyen; Justine Chinn; Morgan De Ferrante; Katharine A. Kirby; Samuel F. Hohmann; Alpesh Amin
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Outcomes of male vs female adults with COVID-19.

  9. Data_Sheet_1_Sex disparities of the effect of the COVID-19 pandemic on...

    • datasetcatalog.nlm.nih.gov
    • frontiersin.figshare.com
    Updated Jun 18, 2024
    Cite
    He, Xinyuan; Qi, Mingyan; Ji, Fanpu; Li, Xiaofeng; Gao, Ning; Zeng, Qing-Lei; Lv, Fan; Bo, Yajing; Liu, Yishan; Qiu, Sikai; Deng, Huan (2024). Data_Sheet_1_Sex disparities of the effect of the COVID-19 pandemic on mortality among patients living with tuberculosis in the United States.docx [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0001429023
    Dataset updated
    Jun 18, 2024
    Authors
    He, Xinyuan; Qi, Mingyan; Ji, Fanpu; Li, Xiaofeng; Gao, Ning; Zeng, Qing-Lei; Lv, Fan; Bo, Yajing; Liu, Yishan; Qiu, Sikai; Deng, Huan
    Area covered
    United States
    Description

    Background: We aimed to determine the trend of TB-related deaths during the COVID-19 pandemic.

    Methods: TB-related mortality data of decedents aged ≥25 years from 2006 to 2021 were analyzed. Excess deaths were estimated by determining the difference between observed and projected mortality rates during the pandemic.

    Results: A total of 18,628 TB-related deaths were documented from 2006 to 2021. TB-related age-standardized mortality rates (ASMRs) were 0.51 in 2020 and 0.52 in 2021, corresponding to an excess mortality of 10.22 and 9.19%, respectively. Female patients with TB demonstrated a higher relative increase in mortality (26.33 vs. 2.17% in 2020; 21.48 vs. 3.23% in 2021) when compared to males. Females aged 45–64 years showed a surge in mortality, with an annual percent change (APC) of −2.2% pre-pandemic to 22.8% (95% CI: −1.7 to 68.7%) during the pandemic, corresponding to excess mortalities of 62.165 and 99.16% in 2020 and 2021, respectively; these excess mortality rates were higher than those observed in the overall female population aged 45–64 years in 2020 (17.53%) and 2021 (33.79%).

    Conclusion: The steady decline in TB-related mortality in the United States has been reversed by COVID-19. Females with TB were disproportionately affected by the pandemic.
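
    The abstract describes excess mortality as the difference between observed and projected rates and reports it as a percentage. One common way to express such a figure is relative to the projected rate, sketched below. This formula is an assumption about the convention, not a statement of the study's exact method, and the input numbers are made up rather than taken from the paper.

```python
def excess_mortality_percent(observed_rate, projected_rate):
    """Relative excess: (observed - projected) / projected, as a percentage."""
    if projected_rate <= 0:
        raise ValueError("projected_rate must be positive")
    return 100.0 * (observed_rate - projected_rate) / projected_rate


# Hypothetical rates chosen for easy arithmetic, not values from the study.
example_excess = excess_mortality_percent(observed_rate=12.0, projected_rate=10.0)
```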

  10. Data from: Change in mortality rates of respiratory disease during the...

    • datasetcatalog.nlm.nih.gov
    • tandf.figshare.com
    Updated Mar 31, 2021
    Cite
    Lu, Yan; Huang, Chunyan; Wang, Linchi; Zhang, Jun; Xu, Jianrong; Wei, Xiaolin; Zhang, Zhengji; Hua, Yujie (2021). Change in mortality rates of respiratory disease during the COVID-19 pandemic [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0000927808
    Dataset updated
    Mar 31, 2021
    Authors
    Lu, Yan; Huang, Chunyan; Wang, Linchi; Zhang, Jun; Xu, Jianrong; Wei, Xiaolin; Zhang, Zhengji; Hua, Yujie
    Description

    This study explored the change in mortality rates of respiratory disease during the corona virus disease 2019 (COVID-19) pandemic. Death data of registered residents of Suzhou from 2014 to 2020 were collected and the weekly mortality rates due to respiratory disease and all deaths were analyzed. The differences in mortality rates during the pandemic and the same period in previous years were compared. Before the pandemic, the crude mortality rate (CMR) and standardized mortality rate (SMR) of Suzhou residents, including respiratory disease, were not much different from those in previous years. During the emergency period, the CMR of Suzhou residents was 180.2/100,000 and the SMR was 85.5/100,000, decreasing by 9.1% and 14.6%, respectively; the CMR of respiratory disease was 16.4/100,000 and the SMR was 6.8/100,000, down 41.4% and 44.9%, respectively. Regardless of the mortality rates of all deaths or respiratory disease, the rates were higher in males than in females, although males had a slightly greater decrease in all deaths during the emergency period compared with females, and the opposite was true for respiratory disease. During the pandemic, the death rate of residents decreased, especially that due to respiratory disease.

  11. countryinfo

    • kaggle.com
    zip
    Updated Apr 14, 2020
    Cite
    My Koryto (2020). countryinfo [Dataset]. https://www.kaggle.com/koryto/countryinfo
    Available download formats: zip (24384 bytes)
    Dataset updated
    Apr 14, 2020
    Authors
    My Koryto
    License

    http://opendatacommons.org/licenses/dbcl/1.0/

    Description

    Greetings everyone! I hope you find this dataset valuable for your COVID-19 models. It is aligned with SRK's Novel Corona Virus dataset. Feel free to upvote if you use it!

    This dataset contains what I consider essential demographic information for every country specified in the COVID-19 competition submission file. Moreover, there is additional data which, from my point of view, is critical for predicting the infection rate and mortality rate per country, such as the number of COVID detection tests, the detection date of 'patient zero' and the initial restriction dates. Please look at the column descriptions for a comprehensive explanation.

    Major Insights:

    1. I've seen that there are some pretty clear distinctions between female and male mortality rate as men tend to develop more severe symptoms. Therefore, I added some variables which represent the sex ratio (amount of males per female) in each country, with separation by age groups & total. Moreover, I added lung disease data (death rate per 100k people) in each country with separation by sex as well.
    2. The average amount of children per woman has a quite high p-value when trying to analyze the trend of the confirmed cases. Especially when it comes in interaction with 'density' and school restrictions.

    Citations and Data Gathering

    1. https://www.worldometers.info/ - Population, Density, Median Age, Urban Population, Fertility Rate, Patient Zero Detection Date, Confirmed Cases, New Cases, Total Deaths, Total Recovered, Critical Cases.
    2. @benhamner 's link (see acknowledgements section below) - Restrictions Initial dates.
    3. https://worldpopulationreview.com/countries/smoking-rates-by-country/ - % of smokers by country.
    4. https://data.worldbank.org/indicator/SH.MED.BEDS.ZS - Hospital beds per 1000 citizens.
    5. https://en.wikipedia.org/wiki/List_of_countries_by_sex_ratio - Sex ratio by age.
    6. https://www.worldlifeexpectancy.com/cause-of-death/lung-disease/by-country/ - Lung diseases death rate.
    7. https://en.wikipedia.org/wiki/COVID-19_testing - COVID-19 Tests
    8. https://www.worldbank.org/ - GDP 2019, Health Expenses (Whatever was missing was filled with information from Wikipedia)
    9. https://en.climate-data.org/ - Temperature and Humidity raw data.

    Acknowledgements

    1. Restrictions are taken from here. Thanks to Ben Hamner for sharing this link!
    2. Special thanks to @diamondsnake for the idea of collecting the average temperature and humidity.

    Good luck trying to learn more about the virus, feel free to comment and collaborate in order to collect more relevant data!

    My

  12. Table_1_Sex differences in comorbidities and COVID-19 mortality–Report from...

    • frontiersin.figshare.com
    docx
    Updated Jun 4, 2023
    + more versions
    Cite
    Yilin Yoshida; Jia Wang; Yuanhao Zu (2023). Table_1_Sex differences in comorbidities and COVID-19 mortality–Report from the real-world data.DOCX [Dataset]. http://doi.org/10.3389/fpubh.2022.881660.s003
    Explore at:
    docxAvailable download formats
    Dataset updated
    Jun 4, 2023
    Dataset provided by
    Frontiers
    Authors
    Yilin Yoshida; Jia Wang; Yuanhao Zu
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Background: The differential effect of comorbidities on COVID-19 severe outcomes by sex has not been fully evaluated.
    Objective: To examine the association of major comorbidities and COVID-19 mortality in men and women separately.
    Methods: We performed a retrospective cohort analysis using a large electronic health record (EHR) database in the U.S. We included adult patients with a clinical diagnosis of COVID-19 who also had necessary information on demographics and comorbidities from January 1, 2016 to October 31, 2021. We defined comorbidities by the Charlson Comorbidity Index (CCI) using ICD-10 codes at or before the COVID-19 diagnosis. We conducted logistic regressions to compare the risk of death associated with comorbidities stratifying by sex.
    Results: A total of 121,342 patients were included in the final analysis. We found significant sex differences in the association between comorbidities and COVID-19 death. Specifically, moderate/severe liver disease, dementia, metastatic solid tumor, and heart failure and the increased number of comorbidities appeared to confer a greater magnitude of mortality risk in women compared to men.
    Conclusions: Our study suggests sex differences in the effect of comorbidities on COVID-19 mortality and highlights the importance of implementing sex-specific preventive or treatment approaches in patients with COVID-19.

  13. CoVCSD - Covid-19 Countries Statistical Dataset

    • kaggle.com
    zip
    Updated Jun 10, 2020
    Cite
    Aman Kumar (2020). CoVCSD - Covid-19 Countries Statistical Dataset [Dataset]. https://www.kaggle.com/aestheteaman01/covcsd-covid19-countries-statistical-dataset
    Explore at:
    zip(8443990 bytes)Available download formats
    Dataset updated
    Jun 10, 2020
    Authors
    Aman Kumar
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    The datasets hold information about the cases and deaths from COVID-19 for multiple countries between January 22, 2020 and March 30, 2020. There is a separate Excel sheet for every country. The datasets hold the following information.

    1. The date for which the observation was made for the country/state.
    2. Information regarding the state of the country where the case is reported.
    3. The country where the case is reported.
    4. Cumulative confirmed cases and cumulative deaths
    5. Daily cases reported and daily deaths
    6. Latitude and Longitude for the country
    7. Average temperature for that day.
    8. Minimum and Maximum temperature for that day.
    9. Wind speed reported for that day.
    10. Precipitation and Fog (1 denotes the presence)
    11. Population, Population density and median population for that country.
    12. The sex ratio for that country.
    13. % of population above 65 years of age.
    14. Hospital Beds and Available Hospital beds/1000 people
    15. Confirmed COVID-19 cases/1000 people
    16. No. of males and females per 1 million people suffering from lung disease / COPD.
    17. Life Expectancy (Males and Females)
    18. Total COVID-19 Tests conducted for that country.
    19. Outbound | Inbound | Domestic travels for that country.

    A separate CSV file is provided for each country. The datasets will be updated periodically to reflect current COVID-19 values.

    Special thanks to https://www.kaggle.com/koryto/countryinfo for providing essential information for building the dataset.

  14. COVID-19 Recovery Dataset

    • kaggle.com
    zip
    Updated Oct 4, 2025
    Cite
    Eshaal Malik (2025). COVID-19 Recovery Dataset [Dataset]. https://www.kaggle.com/datasets/eshaalnmalik/covid-19-recovery-dataset
    Explore at:
    zip(1761581 bytes)Available download formats
    Dataset updated
    Oct 4, 2025
    Authors
    Eshaal Malik
    License

    Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0) https://creativecommons.org/licenses/by-nc-sa/4.0/
    License information was derived automatically

    Description

    Overview

    The COVID-19 Patient Recovery Dataset is a synthetic collection of anonymized records for around 70,000 COVID-19 patients. It aims to assist with classification tasks in machine learning and epidemiological research. The dataset includes detailed clinical and demographic information, such as symptoms, existing health issues, vaccination status, COVID-19 variants, treatment details, and outcomes related to recovery or mortality. This dataset is great for predicting patient recovery (recovered), mortality (death), disease severity (severity), or the need for intensive care (icu_admission) using algorithms like Logistic Regression, Random Forest, XGBoost, or Neural Networks. It also allows for exploratory data analysis (EDA), statistical modeling, and time-series studies to find patterns in COVID-19 outcomes.
    The data is synthetic and reflects realistic trends found in public health data, based on sources like WHO reports. It ensures privacy and follows ethical guidelines. Dates are provided in Excel serial format, meaning 44447 corresponds to September 8, 2021, and can be converted to standard dates using Python’s datetime or Excel. With 70,000 records and 28 columns, this dataset serves as a valuable resource for data scientists, researchers, and students interested in health-related machine learning or pandemic trends.
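The Excel serial date conversion mentioned above can be sketched with Python's standard library. This is a minimal illustration, not the dataset author's code: the function name `excel_serial_to_date` and the choice to return `None` for empty values are assumptions made here.

```python
from datetime import date, timedelta

# Epoch conventionally used for Excel serial dates; it absorbs Excel's
# historical treatment of 1900 as a leap year.
EXCEL_EPOCH = date(1899, 12, 30)


def excel_serial_to_date(serial):
    """Convert an Excel serial day number to a date.

    Returns None for missing values (None or empty string); this handling
    is an assumption, since some date fields are described as partly missing.
    """
    if serial is None or serial == "":
        return None
    return EXCEL_EPOCH + timedelta(days=int(float(serial)))


print(excel_serial_to_date(44447))  # 2021-09-08
```

The same function can be applied to date_reported, date_of_recovery, and date_of_death after loading the CSV.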

    Data Source and Collection

    Source: Synthetic data based on public health patterns from sources like the World Health Organization (WHO). It includes placeholder URLs.
    Collection Period: Simulated from early 2020 to mid-2022, covering the Alpha, Delta, and Omicron waves.
    Number of Records: 70,000.
    File Format: CSV, which works with Pandas, R, Excel, and more.
    Data Quality Notes:

    About 5% of the values are missing in fields like symptoms_2, symptoms_3, treatment_given_2, and date.
    There are rare inconsistencies, such as between recovery/death flags and dates, which may need some preprocessing.
    Unique, anonymized patient IDs.
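One way to surface the flag/date inconsistencies noted above is sketched below. The rules are assumptions chosen for illustration (a row is flagged when both outcome flags are set, or when a flag and its matching date disagree); the dataset description does not define what counts as inconsistent. Rows are assumed to be plain dictionaries keyed by the dataset's column names.

```python
def find_outcome_inconsistencies(rows):
    """Return patient_ids whose outcome flags and outcome dates disagree.

    The three checks below are illustrative assumptions, not rules taken
    from the dataset documentation.
    """
    flagged = []
    for row in rows:
        recovered = row.get("recovered") == 1
        died = row.get("death") == 1
        has_recovery_date = row.get("date_of_recovery") not in (None, "")
        has_death_date = row.get("date_of_death") not in (None, "")
        if (
            (recovered and died)                 # both outcomes set
            or recovered != has_recovery_date    # flag and date disagree
            or died != has_death_date
        ):
            flagged.append(row["patient_id"])
    return flagged


# Hypothetical rows, for illustration only.
sample_rows = [
    {"patient_id": "P000001", "recovered": 1, "death": 0,
     "date_of_recovery": 44447, "date_of_death": ""},
    {"patient_id": "P000002", "recovered": 1, "death": 1,
     "date_of_recovery": 44450, "date_of_death": 44451},
    {"patient_id": "P000003", "recovered": 0, "death": 0,
     "date_of_recovery": 44460, "date_of_death": ""},
]
print(find_outcome_inconsistencies(sample_rows))
```

Flagged rows can then be inspected, corrected, or dropped, depending on the analysis.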

    Column Name           Data Type
    patient_id            String
    country               String
    region/state          String
    date_reported         Integer
    age                   Integer
    gender                String
    comorbidities         String
    symptoms_1            String
    symptoms_2            String
    symptoms_3            String
    severity              String
    hospitalized          Integer
    icu_admission         Integer
    ventilator_support    Integer
    vaccination_status    String
    variant               String
    treatment_given_1     String
    treatment_given_2     String
    days_to_recovery      Integer
    recovered             Integer
    death                 Integer
    date_of_recovery      Integer
    date_of_death         Integer
    tests_conducted       Integer
    test_type             String
    hospital_name         String
    doctor_assigned       String
    source_url            String

    Key Column Details

    patient_id: Unique identifier (e.g., P000001).
    country: Reporting country (e.g., India, USA, Brazil, Germany, China, Pakistan, South Africa, UK).
    region/state: Sub-national region (e.g., Sindh, California, São Paulo, Beijing).
    date_reported, date_of_recovery, date_of_death: Excel serial dates (convert using datetime(1899,12,30) + timedelta(days=value)).
    age: Patient age (1–100 years).
    gender: Male or Female.
    comorbidities: Pre-existing conditions (e.g., Diabetes, Hypertension, Cancer, Heart Disease, Asthma, None).
    symptoms_1, symptoms_2, symptoms_3: Reported symptoms (e.g., Cough, Fever, Fatigue, Loss of Smell, Sore Throat, or empty).
    severity: Case severity (Mild, Moderate, Severe, Critical).
    hospitalized, icu_admission, ventilator_support: Binary (1 = Yes, 0 = No).
    vaccination_status: None, Partial, Full, or Booster.
    variant: COVID-19 variant (Omicron, Delta, Alpha).
    treatment_given_1, treatment_given_2: Treatments administered (e.g., Antibiotics, Remdesivir, Oxygen, Steroids, Paracetamol, or empty).
    days_to_recovery: Days from report to recovery (5–30, or empty if not recovered).
    recovered, death: Binary outcomes (1 = Yes, 0 = No; generally mutually exclusive).
    tests_conducted: Number of tests (1–5).
    test_type: PCR or Antigen.
    hospital_name: Fictional hospital (e.g., Aga Khan, Mayo Clinic, NHS Trust).
    doctor_assigned: Fictional doctor name (e.g., Dr. Smith, Dr. Müller).
    source_url: Placeholder.

    Summary Statistics

    Total Patients: 70,000.
    Age: Mean ~50 years, Min 1, Max 100, evenly distributed.
    Gender: ~50% Male, ~50% Female.
    Top Countries: USA (20%), India (18%), Brazil (15%), China (12%), Germany (10%).
    Comorbidities: Diabetes (25%), Hypertension (20%), Cancer (15%), Heart Disease (15%), Asthma (10%), None (15%).
    Severity: Mild (60%), Moderate (25%), Severe (10%), Critical (5%).
    Recovery Rate: ~60% recovered (recovered=1), ~30% deceased (death=1), ~10% unresolved (both 0).
    Vaccination: None (40%), Full (30%), Partial (15%), Booster (15%).
    Variants: Omicron (50%), Delt...

  15. Summary of demographics and characteristics of male vs female adults with...

    • plos.figshare.com
    xls
    Updated Jun 5, 2023
    Cite
    Ninh T. Nguyen; Justine Chinn; Morgan De Ferrante; Katharine A. Kirby; Samuel F. Hohmann; Alpesh Amin (2023). Summary of demographics and characteristics of male vs female adults with COVID-19. [Dataset]. http://doi.org/10.1371/journal.pone.0254066.t001
    Explore at:
    xlsAvailable download formats
    Dataset updated
    Jun 5, 2023
    Dataset provided by
    PLOS http://plos.org/
    Authors
    Ninh T. Nguyen; Justine Chinn; Morgan De Ferrante; Katharine A. Kirby; Samuel F. Hohmann; Alpesh Amin
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Summary of demographics and characteristics of male vs female adults with COVID-19.

  16. Italian Coronavirus Cases by Age group and Sex

    • kaggle.com
    zip
    Updated Nov 19, 2025
    Cite
    janluke (2025). Italian Coronavirus Cases by Age group and Sex [Dataset]. https://www.kaggle.com/giangip/iccas
    Explore at:
    zip(132873 bytes)Available download formats
    Dataset updated
    Nov 19, 2025
    Authors
    janluke
    Description

    Italy Coronavirus Cases by Age group and Sex (ICCAS)

    This repository contains datasets about the number of Italian Sars-CoV-2 confirmed cases and deaths disaggregated by age group and sex. The data is (automatically) extracted from pdf reports (like this) published by Istituto Superiore di Sanità (ISS) two times a week. A link to the most recent report can be found in this page under section "Documento esteso".

    PDF reports are usually published on Tuesday and Friday and contain data updated to 4 p.m. of the day before their release.

    I wrote a script that is run periodically in order to automatically update this repository when a new report is published. The code is hosted in a separate repository.

    For feedback and issues, refer to the GitHub repository.

    Data folder structure

    The data folder is structured as follows:

        data
        ├── by-date
        │   └── iccas_{date}.csv    Dataset with cases/deaths updated to 4 p.m. of {date}
        └── iccas_full.csv          Dataset with data from all reports (by date)

    The full dataset is obtained by concatenating all datasets in by-date and has an additional date column. If you use pandas, I suggest reading this dataset using a multi-index on the first two columns:

        import pandas as pd
        df = pd.read_csv('iccas_full.csv', index_col=(0, 1))  # ('date', 'age_group')

    NOTE: {date} is the date the data refers to, NOT the release date of the report it was extracted from: as written above, a report is usually released with a one-day delay. For example, iccas_2020-03-19.csv contains data for 2020-03-19, which was extracted from the report published on 2020-03-20.

    Dataset details

    Each dataset in the by-date folder contains the same data you can find in "Table 1" of the corresponding ISS report. This table contains the number of confirmed cases, deaths and other derived information disaggregated by age group (0-9, 10-19, ..., 80-89, >=90) and sex.

    WARNING: the sum of male and female cases is not equal to the total number of cases, since the sex of some cases is unknown. The same applies to deaths.

    Below, {sex} can be male or female.

    Column                      Description
    date                        (Only in iccas_full.csv) Date in the format YYYY-MM-DD; numbers are updated to 4 p.m. of this date
    age_group                   Values: "0-9", "10-19", ..., "80-89", ">=90"
    cases                       Number of confirmed cases (both sexes + unknown-sex; active + closed)
    deaths                      Number of deaths (both sexes + unknown-sex)
    {sex}_cases                 Number of cases of sex {sex}
    {sex}_deaths                Number of cases of sex {sex} that ended in death
    cases_percentage            100 * cases / cases_of_all_ages
    deaths_percentage           100 * deaths / deaths_of_all_ages
    fatality_rate               100 * deaths / cases
    {sex}_cases_percentage      100 * {sex}_cases / (male_cases + female_cases) (cases of unknown sex excluded)
    {sex}_deaths_percentage     100 * {sex}_deaths / (male_deaths + female_deaths) (cases of unknown sex excluded)
    {sex}_fatality_rate         100 * {sex}_deaths / {sex}_cases

    All columns that can be computed from absolute counts of cases and deaths (bottom half of the table above) were re-computed to increase precision.
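The derived columns in the table above can be recomputed from the absolute counts roughly as follows. This is a sketch under stated assumptions rather than the repository's actual script: the helper names `pct` and `derive_columns` are hypothetical, returning `None` on a zero denominator is a choice made here, and the example counts are made up.

```python
def pct(numerator, denominator):
    """Return 100 * numerator / denominator, or None when the denominator is zero."""
    if denominator == 0:
        return None
    return 100 * numerator / denominator


def derive_columns(row, cases_of_all_ages, deaths_of_all_ages):
    """Recompute the derived columns for one age-group row of absolute counts."""
    # Cases and deaths of unknown sex are excluded from the sex-specific shares.
    known_sex_cases = row["male_cases"] + row["female_cases"]
    known_sex_deaths = row["male_deaths"] + row["female_deaths"]
    derived = {
        "cases_percentage": pct(row["cases"], cases_of_all_ages),
        "deaths_percentage": pct(row["deaths"], deaths_of_all_ages),
        "fatality_rate": pct(row["deaths"], row["cases"]),
    }
    for sex in ("male", "female"):
        derived[f"{sex}_cases_percentage"] = pct(row[f"{sex}_cases"], known_sex_cases)
        derived[f"{sex}_deaths_percentage"] = pct(row[f"{sex}_deaths"], known_sex_deaths)
        derived[f"{sex}_fatality_rate"] = pct(row[f"{sex}_deaths"], row[f"{sex}_cases"])
    return derived


# Made-up counts for a single age group, for illustration only.
example_row = {
    "cases": 1000, "deaths": 50,
    "male_cases": 600, "female_cases": 380,
    "male_deaths": 35, "female_deaths": 14,
}
example = derive_columns(example_row, cases_of_all_ages=4000, deaths_of_all_ages=200)
print(example["fatality_rate"])
```

Note that male_cases + female_cases is smaller than cases in the example, mirroring the warning that some cases have unknown sex.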

  17. MDCOVID19 ProbableDeathsByGenderDistribution

    • data-maryland.opendata.arcgis.com
    • data.imap.maryland.gov
    • +1more
    Updated May 22, 2020
    + more versions
    Cite
    ArcGIS Online for Maryland (2020). MDCOVID19 ProbableDeathsByGenderDistribution [Dataset]. https://data-maryland.opendata.arcgis.com/datasets/mdcovid19-probabledeathsbygenderdistribution/api
    Explore at:
    Dataset updated
    May 22, 2020
    Dataset authored and provided by
    ArcGIS Online for Maryland
    Description

    Notice: Starting October 10th, 2025 this dataset is deprecated and is no longer being updated. Please refer to the Open Data resource at https://data.maryland.gov/Health-and-Human-Services/COVID-Master-Tracker/37gh-4yqf for continued weekly updates.

    Summary: The cumulative number of probable COVID-19 deaths among Maryland residents by gender: Female; Male; Unknown.

    Description: The MD COVID-19 - Probable Deaths by Gender Distribution data layer is a collection of the statewide confirmed and probable COVID-19 related deaths that have been reported each day by the Vital Statistics Administration by gender. A death is classified as probable if the person's death certificate notes COVID-19 to be a probable, suspect or presumed cause or condition. Probable deaths have not yet been confirmed by a laboratory test. Some data on deaths may be unavailable due to the time lag between the death, typically reported by a hospital or other facility, and the submission of the complete death certificate. Confirmed deaths are available from the MD COVID-19 - Confirmed Deaths by Gender Distribution data layer.

    COVID-19 is a disease caused by a respiratory virus first identified in Wuhan, Hubei Province, China in December 2019. COVID-19 is a new virus that hasn't caused illness in humans before. Worldwide, COVID-19 has resulted in thousands of infections, causing illness and in some cases death. Cases have spread to countries throughout the world, with more cases reported daily. The Maryland Department of Health reports daily on COVID-19 cases by county.

  18. COVID-19 Sex-Disaggregated Data

    • kaggle.com
    zip
    Updated Sep 22, 2020
    Cite
    Marília Prata (2020). COVID-19 Sex-Disaggregated Data [Dataset]. https://www.kaggle.com/mpwolke/cusersmarildownloadsdisaggregatedcsv
    Explore at:
    zip(10606 bytes)Available download formats
    Dataset updated
    Sep 22, 2020
    Authors
    Marília Prata
    Description

    Context

    Understanding gender is essential to understanding the risk factors of poor health, early death and health inequities. The COVID-19 outbreak is no different. At this point in the pandemic, we are unable to provide a clear answer to the question of the extent to which sex and gender are influencing the health outcomes of people diagnosed with COVID-19. However, experience and evidence thus far tell us that both sex and gender are important drivers of risk and response to infection and disease.

    http://globalhealth5050.org/covid19 https://data.humdata.org/dataset/covid-19-sex-disaggregated-data-tracker

    Content

    In order to understand the role gender is playing in the COVID-19 outbreak, countries urgently need to begin both collecting and publicly reporting sex-disaggregated data. At a minimum, this should include the number of cases and deaths in men and women.

    In collaboration with CNN, Global Health 50/50 began compiling publicly available sex-disaggregated data reported by national governments to date and is exploring how gender may be driving the higher proportion of reported deaths in men among confirmed cases so far.

    Acknowledgements

    http://globalhealth5050.org/covid19 https://data.humdata.org/dataset/covid-19-sex-disaggregated-data-tracker

    Photo by Nick Fewings on Unsplash

    Inspiration

    Covid-19 Pandemic.

  19. COVID-19 HPSC Detailed Statistics Profile

    • ga.covid-19.geohive.ie
    Updated Mar 31, 2020
    Cite
    content_geohive (2020). COVID-19 HPSC Detailed Statistics Profile [Dataset]. https://ga.covid-19.geohive.ie/maps/d8eb52d56273413b84b0187a4e9117be_0/about
    Explore at:
    Dataset updated
    Mar 31, 2020
    Dataset authored and provided by
    content_geohive
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Area covered
    Description

    This dataset contains COVID-19 Daily Statistics and the Profile of Covid-19 Daily Statistics for Ireland as reported by the Health Protection Surveillance Centre. This data includes confirmed cases (PCR) only and does not include positive antigen results uploaded to the HSE portal. Time series dataset from March 2020 to November 2023.

    Deaths: From 16th May 2022 to November 2023, reporting of Notified Deaths changed from daily to weekly. Data on deaths is based on the date on which a death was notified on CIDR, not the date on which the death occurred. Data on deaths by date of death is available on the HPSC Respiratory Virus Notification Hub https://respiratoryvirus.hpsc.ie/.

    Notice: Please be advised that on 29th April 2021, the 'Aged65up' and 'HospitalisedAged65up' fields were removed from this table. The three fields 'Aged65to74', 'Aged75to84', and 'Aged85up' replace the 'Aged65up' field. The three fields 'HospitalisedAged65to74', 'HospitalisedAged75to84' and 'HospitalisedAged85up' replace the 'HospitalisedAged65up' field. On the week beginning 1st March 2021, the values in the following fields in this table were set to zero: 'CommunityTransmission', 'CloseContact', 'TravelAbroad' and 'ClustersNotified'.

    The primary Date applies to the following fields: ConfirmedCovidCases, TotalConfirmedCovidCases, ConfirmedCovidDeaths, TotalCovidDeaths, ConfirmedCovidRecovered, SevenDayAverageCases.

    The StatisticProfileDate applies to the following fields: CovidCasesConfirmed, HospitalisedCovidCases, RequiringICUCovidCases, HealthcareWorkersCovidCases, Clusters Notified, HospitalisedAged5, HospitalisedAged5to14, HospitalisedAged15to24, HospitalisedAged25to34, HospitalisedAged35to44, HospitalisedAged45to54, HospitalisedAged55to64, HospitalisedAged65to74, HospitalisedAged75to84, HospitalisedAged85up, Male, Female, Unknown, Aged1to4, Aged5to14, Aged15to24, Aged25to34, Aged35to44, Aged45to54, Aged55to64, Aged65to74, Aged75to84, Aged85up, MedianAge, CommunityTransmission, CloseContact, TravelAbroad, Total Deaths by Date of Death, Deaths by Date of Death.

  20. COVID-19 In Denmark

    • kaggle.com
    zip
    Updated Aug 12, 2020
    Cite
    Christian Lillelund (2020). COVID-19 In Denmark [Dataset]. https://www.kaggle.com/christianlillelund/covid19-in-denmark
    Explore at:
    zip(11090 bytes)Available download formats
    Dataset updated
    Aug 12, 2020
    Authors
    Christian Lillelund
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Area covered
    Denmark
    Description

    Introduction

    Coronavirus disease 2019 (COVID‑19) is an infectious disease caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). It was first identified in December 2019 in Wuhan, Hubei, China, and has resulted in an ongoing pandemic. As of 12 August 2020, more than 20.2 million cases have been reported across 188 countries and territories, resulting in more than 741,000 deaths. More than 12.5 million people have recovered. Most people infected with the COVID-19 virus will experience mild to moderate respiratory illness and recover without requiring special treatment. Older people, and those with underlying medical problems like cardiovascular disease, diabetes, chronic respiratory disease, and cancer are more likely to develop serious illness.

    These numbers are sampled exclusively from Denmark between 11th of March 2020 and 9th of August 2020.

    Content

    This dataset contains 10 data files:

    • Cases_by_age.csv: Current number of confirmed cases for each age group.
    • Cases_by_sex.csv: Current number of confirmed cases for men and women.
    • Deaths_over_time.csv: The death toll for each day.
    • Municipality_test_pos.csv: Number of tested and confirmed cases for each Danish municipality.
    • Newly_admitted_over_time.csv: Number of newly hospitalised people for each region per day.
    • Region_summary.csv: Number of tested and confirmed cases for each Danish region.
    • Rt_cases.csv: Reproduction rate each day. A key measure of how fast the virus is growing.
    • Rt_hospitalised.csv: Reproduction rate for hospitalised cases.
    • Test_pos_over_time.csv: Number of new positive cases over time and total tested.
    • Test_regions.csv: Number of tests done in each Danish region.

    Wiki about COVID-19 in Denmark: https://en.wikipedia.org/wiki/COVID-19_pandemic_in_Denmark
    Dashboard with information on COVID-19 in Denmark: https://experience.arcgis.com/experience/aa41b29149f24e20a4007a0c4e13db1d
    Current case count: https://www.worldometers.info/coronavirus/country/denmark/

Cite
Meir Nizri (2022). COVID-19 Dataset [Dataset]. https://www.kaggle.com/datasets/meirnizri/covid19-dataset

COVID-19 Dataset

COVID-19 patient's symptoms, status, and medical history.

Explore at:
28 scholarly articles cite this dataset (View in Google Scholar)
zip(4890659 bytes)Available download formats
Dataset updated
Nov 13, 2022
Authors
Meir Nizri
License

https://creativecommons.org/publicdomain/zero/1.0/

Description

Context

Coronavirus disease (COVID-19) is an infectious disease caused by a newly discovered coronavirus. Most people infected with COVID-19 virus will experience mild to moderate respiratory illness and recover without requiring special treatment. Older people, and those with underlying medical problems like cardiovascular disease, diabetes, chronic respiratory disease, and cancer are more likely to develop serious illness. During the entire course of the pandemic, one of the main problems that healthcare providers have faced is the shortage of medical resources and a proper plan to efficiently distribute them. In these tough times, being able to predict what kind of resource an individual might require at the time of being tested positive or even before that will be of immense help to the authorities as they would be able to procure and arrange for the resources necessary to save the life of that patient.

The main goal of this project is to build a machine learning model that, given a Covid-19 patient's current symptoms, status, and medical history, will predict whether the patient is at high risk or not.

Content

The dataset was provided by the Mexican government (link). This dataset contains a large amount of anonymized patient-related information, including pre-conditions. The raw dataset consists of 21 unique features and 1,048,576 unique patients. In the Boolean features, 1 means "yes" and 2 means "no". Values of 97 and 99 indicate missing data.

  • sex: 1 for female and 2 for male.
  • age: of the patient.
  • classification: covid test findings. Values 1-3 mean that the patient was diagnosed with covid in different degrees. 4 or higher means that the patient is not a carrier of covid or that the test is inconclusive.
  • patient type: type of care the patient received in the unit. 1 for returned home and 2 for hospitalization.
  • pneumonia: whether the patient already has inflammation of the air sacs or not.
  • pregnancy: whether the patient is pregnant or not.
  • diabetes: whether the patient has diabetes or not.
  • copd: Indicates whether the patient has Chronic obstructive pulmonary disease or not.
  • asthma: whether the patient has asthma or not.
  • inmsupr: whether the patient is immunosuppressed or not.
  • hypertension: whether the patient has hypertension or not.
  • cardiovascular: whether the patient has heart or blood vessels related disease.
  • renal chronic: whether the patient has chronic renal disease or not.
  • other disease: whether the patient has other disease or not.
  • obesity: whether the patient is obese or not.
  • tobacco: whether the patient is a tobacco user.
  • usmr: Indicates whether the patient was treated in medical units of the first, second or third level.
  • medical unit: type of institution of the National Health System that provided the care.
  • intubed: whether the patient was connected to the ventilator.
  • icu: Indicates whether the patient had been admitted to an Intensive Care Unit.
  • date died: the date of death if the patient died, and 9999-99-99 otherwise.
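
The encoding described above (1 = yes, 2 = no, 97 and 99 = missing, and 9999-99-99 for patients who did not die) can be decoded with a small helper. This is a minimal sketch: the function names are hypothetical, and raising an error for any code other than 1, 2, 97, or 99 is a design choice made here so that undocumented codes are noticed rather than silently absorbed.

```python
MISSING_CODES = {97, 99}            # documented above as missing data
NOT_DEAD_SENTINEL = "9999-99-99"    # documented placeholder for "did not die"


def decode_boolean(value):
    """Map the dataset's Boolean codes to True, False, or None (missing)."""
    code = int(value)
    if code == 1:
        return True
    if code == 2:
        return False
    if code in MISSING_CODES:
        return None
    # Assumption: any other code is unexpected and should be surfaced.
    raise ValueError(f"undocumented Boolean code: {code}")


def has_died(date_died):
    """Return True when the date-died field holds anything but the sentinel."""
    return date_died.strip() != NOT_DEAD_SENTINEL


print(decode_boolean(2), decode_boolean(97), has_died("9999-99-99"))
```

Deriving a binary death label with has_died, and converting missing codes to None before modeling, keeps the placeholder values from being read as real categories.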