73 datasets found
  1. Projected COVID-19 deaths in the U.S. from Dec. 1, 2020 to Mar. 31, 2021, by...

    • statista.com
    Cite
    Statista, Projected COVID-19 deaths in the U.S. from Dec. 1, 2020 to Mar. 31, 2021, by scenario [Dataset]. https://www.statista.com/statistics/1176649/covid-projected-deaths-by-scenario-us/
    Dataset authored and provided by
    Statista, http://statista.com/
    Area covered
    United States
    Description

    Based on projections made on December 17, the number of deaths due to COVID-19 in the United States by the end of March 2021 could range from 505,894 to 713,674 depending on the scenario. The best-case scenario assumes universal (95 percent) mask usage, and the worst case assumes continued easing of social distancing mandates. This statistic shows the projected number of deaths due to COVID-19 in the U.S. from December 1, 2020 to March 31, 2021 under three different scenarios, as of December 17.

  2. World Coronavirus COVID-19 Deaths

    • tradingeconomics.com
    csv, excel, json, xml
    Updated Mar 9, 2020
    + more versions
    Cite
    TRADING ECONOMICS (2020). World Coronavirus COVID-19 Deaths [Dataset]. https://tradingeconomics.com/world/coronavirus-deaths
    Available download formats
    excel, csv, xml, json
    Dataset updated
    Mar 9, 2020
    Dataset authored and provided by
    TRADING ECONOMICS
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    Jan 4, 2020 - May 17, 2023
    Area covered
    World
    Description

    The World Health Organization reported 6,932,591 coronavirus deaths since the epidemic began. In addition, countries reported 766,440,796 coronavirus cases. This dataset provides World Coronavirus Deaths: actual values, historical data, forecast, chart, statistics, economic calendar and news.

  3. COVID-19 Dashboard

    • catalog.data.gov
    • healthdata.gov
    • +2 more
    Updated Oct 23, 2025
    + more versions
    Cite
    California Department of Public Health (2025). COVID-19 Dashboard [Dataset]. https://catalog.data.gov/dataset/covid-19-dashboard
    Dataset updated
    Oct 23, 2025
    Dataset provided by
    California Department of Public Health, https://www.cdph.ca.gov/
    Description

    The dashboard is updated each Friday.

    Laboratory surveillance data: California laboratories report SARS-CoV-2 test results to CDPH through electronic laboratory reporting. Los Angeles County SARS-CoV-2 lab data has a 7-day reporting lag. Test positivity is calculated using SARS-CoV-2 lab tests that have a specimen collection date reported during a given week. Specimens for testing are collected from patients in healthcare settings and do not reflect all testing for COVID-19 in California. Test positivity for a given week is calculated by dividing the number of positive COVID-19 results by the total number of specimens tested for that virus. Weekly laboratory surveillance data are defined as Sunday through Saturday.

    Hospitalization data: Data on COVID-19 and influenza hospital admissions are from the Centers for Disease Control and Prevention’s (CDC) National Healthcare Safety Network (NHSN) Hospitalization dataset. The requirement to report COVID-19-associated hospitalizations was effective November 1, 2024. CDPH pulls NHSN data from the CDC on the Wednesday prior to the publication of the report. Results may differ depending on which day data are pulled. Admission rates are calculated using population estimates from the P-3: Complete State and County Projections Dataset (https://dof.ca.gov/forecasting/demographics/projections/) provided by the State of California Department of Finance. Reported weekly admission rates for the entire season use the population estimates for the year the season started. For more information on NHSN data including the protocol and data collection information, see the CDC NHSN webpage (https://www.cdc.gov/nhsn/index.html). Weekly hospitalization data are defined as Sunday through Saturday.

    Death certificate data: CDPH receives weekly year-to-date dynamic data on deaths occurring in California from the CDPH Center for Health Statistics and Informatics. These data are limited to deaths occurring among California residents and are analyzed to identify COVID-19-coded deaths. These deaths are not necessarily laboratory-confirmed and are an underestimate of all COVID-19-associated deaths in California. Weekly death data are defined as Sunday through Saturday.
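
    The test positivity rule described above (positive results divided by specimens tested, within a Sunday-through-Saturday week) can be sketched as follows. This is only an illustration: the `(collection_date, is_positive)` record layout and the function names are assumptions, not CDPH's actual data format or code.

```python
from collections import defaultdict
from datetime import date, timedelta

def week_start(d: date) -> date:
    """Return the Sunday that opens the Sunday-through-Saturday week containing d."""
    # date.weekday() gives Monday=0 ... Sunday=6, so days since Sunday is (weekday + 1) % 7.
    return d - timedelta(days=(d.weekday() + 1) % 7)

def weekly_positivity(specimens):
    """specimens: iterable of (collection_date, is_positive) pairs (hypothetical layout).

    Returns {week_start_date: positives / specimens_tested}.
    """
    tested = defaultdict(int)
    positive = defaultdict(int)
    for collected, is_positive in specimens:
        wk = week_start(collected)
        tested[wk] += 1
        positive[wk] += int(is_positive)
    return {wk: positive[wk] / tested[wk] for wk in tested}
```

    Grouping by specimen collection date, rather than report date, follows the wording of the description; any reporting lag would still need to be handled separately.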

  4. Data_Sheet_1_Toward a Country-Based Prediction Model of COVID-19 Infections...

    • frontiersin.figshare.com
    • datasetcatalog.nlm.nih.gov
    pdf
    Updated May 30, 2023
    + more versions
    Cite
    Tianshu Gu; Lishi Wang; Ning Xie; Xia Meng; Zhijun Li; Arnold Postlethwaite; Lotfi Aleya; Scott C. Howard; Weikuan Gu; Yongjun Wang (2023). Data_Sheet_1_Toward a Country-Based Prediction Model of COVID-19 Infections and Deaths Between Disease Apex and End: Evidence From Countries With Contained Numbers of COVID-19.PDF [Dataset]. http://doi.org/10.3389/fmed.2021.585115.s001
    Available download formats
    pdf
    Dataset updated
    May 30, 2023
    Dataset provided by
    Frontiers Media, http://www.frontiersin.org/
    Authors
    Tianshu Gu; Lishi Wang; Ning Xie; Xia Meng; Zhijun Li; Arnold Postlethwaite; Lotfi Aleya; Scott C. Howard; Weikuan Gu; Yongjun Wang
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The complexity of COVID-19 and variations in control measures and containment efforts in different countries have caused difficulties in the prediction and modeling of the COVID-19 pandemic. We attempted to predict the scale of the latter half of the pandemic based on real data using the ratio between the early and latter halves from countries where the pandemic is largely over. We collected daily pandemic data from China, South Korea, and Switzerland and subtracted the ratio of pandemic days before and after the disease apex day of COVID-19. We obtained the ratio of pandemic data and created multiple regression models for the relationship between before and after the apex day. We then tested our models using data from the first wave of the disease from 14 countries in Europe and the US. We then tested the models using data from these countries from the entire pandemic up to March 30, 2021. Results indicate that the actual number of cases from these countries during the first wave mostly fall in the predicted ranges of linear regression, excepting Spain and Russia. Similarly, the actual deaths in these countries mostly fall into the range of predicted data. Using the accumulated data up to the day of apex and total accumulated data up to March 30, 2021, the data of case numbers in these countries are falling into the range of predicted data, except for data from Brazil. The actual number of deaths in all the countries are at or below the predicted data. In conclusion, a linear regression model built with real data from countries or regions from early pandemics can predict pandemic scales of the countries where the pandemics occur late. Such a prediction with a high degree of accuracy provides valuable information for governments and the public.

  5. CDC COVID-19 Cases and Deaths Ensemble Forecast Archive

    • healthdata.gov
    • odgavaprod.ogopendata.com
    • +1 more
    csv, xlsx, xml
    Updated Apr 27, 2023
    + more versions
    Cite
    data.cdc.gov (2023). CDC COVID-19 Cases and Deaths Ensemble Forecast Archive [Dataset]. https://healthdata.gov/CDC/CDC-COVID-19-Cases-and-Deaths-Ensemble-Forecast-Ar/hjhg-fag8
    Available download formats
    csv, xlsx, xml
    Dataset updated
    Apr 27, 2023
    Dataset provided by
    data.cdc.gov
    Description

    This dataset contains forecasted weekly numbers of reported COVID-19 incident cases, incident deaths, and cumulative deaths in the United States, previously reported on COVID Data Tracker (https://covid.cdc.gov/covid-data-tracker/#datatracker-home). These forecasts were generated using mathematical models by CDC partners in the COVID-19 Forecast Hub (https://covid19forecasthub.org/doc/ensemble/). A CDC ensemble model was produced every week using the submitted models from that week at the national, and state/territory level.

    This dataset is intended to mirror the observed and forecasted data, previously available for download on the CDC’s COVID Data Tracker. Mortality forecasts for both new and cumulative reported COVID-19 deaths were produced at the state and territory level and national level. Forecasts of new reported COVID-19 cases were produced at the county, state/territory, and national level. Please note that this dataset is not complete for every model, date, location or combination thereof. Specifically, county level submissions for COVID-19 incident cases were accepted, but not required, and are missing or incomplete for many models and dates. State and territory-level forecasts are more complete, but not all models submitted forecasts for all locations, dates, and targets (new reported deaths, new reported cases, and cumulative reported deaths). Forecasts for COVID-19 incident cases were discontinued in February 2022. Forecasts for COVID-19 cumulative and incident deaths were discontinued in March 2023.

  6. Data from: International COVID-19 mortality forecast visualization:...

    • datasetcatalog.nlm.nih.gov
    • datadryad.org
    Updated Dec 24, 2021
    Cite
    Bui, Alex; Akre, Samir; Liu, Patrick; Friedman, Joseph (2021). International COVID-19 mortality forecast visualization: covidcompare.io [Dataset]. http://doi.org/10.5068/D1V68X
    Dataset updated
    Dec 24, 2021
    Authors
    Bui, Alex; Akre, Samir; Liu, Patrick; Friedman, Joseph
    Description

    COVID-19 mortality forecasting models provide critical information about the trajectory of the pandemic, which is used by policymakers and public health officials to guide decision-making. However, thousands of published COVID-19 mortality forecasts now exist, many with their own unique methods, assumptions, format, and visualization. As a result, it is difficult to compare models and understand under which circumstances a model performs best. Here, we describe the construction and usability of covidcompare.io, a web tool built to compare numerous forecasts and offer insight into how each has performed over the course of the pandemic. From its launch in December 2020 to June 2021, we have seen 4,600 unique visitors from 85 countries. A study conducted with public health professionals showed high usability overall as formally assessed using a Post-Study System Usability Questionnaire (PSSUQ). We find that covidcompare.io is an impactful tool for the comparison of international COVID-19 mortality forecasting models.

  7. Data from: COVID-19 Forecasts: Deaths

    • catalog.midasnetwork.us
    Updated Mar 9, 2023
    Cite
    Centers for Disease Control and Prevention (CDC) (2023). COVID-19 Forecasts: Deaths [Dataset]. https://catalog.midasnetwork.us/collection/147
    Dataset updated
    Mar 9, 2023
    Dataset provided by
    MIDAS COORDINATION CENTER
    Authors
    Centers for Disease Control and Prevention (CDC)
    License

    Apache License, v2.0, https://www.apache.org/licenses/LICENSE-2.0
    License information was derived automatically

    Area covered
    Country, State
    Variables measured
    Viruses, disease, COVID-19, modeling, pathogen, forecasting, Homo sapiens, host organism, mortality data, Population count, and 6 more
    Dataset funded by
    National Institute of General Medical Sciences
    Description

    The dataset contains observed and four-week forecast values of new and total weekly COVID-19 deaths at the national and state level until March 9, 2023. Forecasting teams predict numbers of deaths using different types of data (e.g., COVID-19 data, demographic data, mobility data), methods, and estimates of the impacts of interventions (e.g., social distancing, use of face coverings).

  8. Medical supplies required in US for COVID-19

    • kaggle.com
    zip
    Updated May 20, 2020
    Cite
    Aman Kumar (2020). Medical supplies required in US for COVID-19 [Dataset]. https://www.kaggle.com/aestheteaman01/medical-supplies-required-in-us-for-covid19
    Available download formats
    zip (667560 bytes)
    Dataset updated
    May 20, 2020
    Authors
    Aman Kumar
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Area covered
    United States
    Description

    Context

    IHME has produced forecasts which show hospital bed use, need for intensive care beds, and ventilator use due to COVID-19 based on projected deaths for all 50 U.S. states. These projections are produced by models based on observed death rates from COVID-19 and include uncertainty intervals.

    They incorporate information about social distancing and other protective measures and are being updated daily with new data. These forecasts were developed in order to provide hospitals, policymakers, and the public with crucial information about how expected need aligns with existing resources so that cities and states can best prepare.

    All the column descriptors and details are attached in the PDF.

    Acknowledgements

    Institute for Health Metrics and Evaluation (IHME). United States COVID-19 Hospital Needs and Death Projections. Seattle, United States of America: Institute for Health Metrics and Evaluation (IHME), University of Washington, 2020

  9. COVID-19 Data

    • kaggle.com
    zip
    Updated Nov 21, 2025
    Cite
    Umut Toygar Göz (2025). COVID-19 Data [Dataset]. https://www.kaggle.com/datasets/umuttoygargoz/covid19-data
    Available download formats
    zip (8484157 bytes)
    Dataset updated
    Nov 21, 2025
    Authors
    Umut Toygar Göz
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Dataset

    This dataset was created by Umut Toygar Göz

    Released under Attribution 4.0 International (CC BY 4.0)

    Contents

  10. Modelling projections for opioid-related deaths during the COVID-19 outbreak...

    • open.canada.ca
    • ouvert.canada.ca
    html
    Updated Oct 25, 2021
    + more versions
    Cite
    Public Health Agency of Canada (2021). Modelling projections for opioid-related deaths during the COVID-19 outbreak [Dataset]. https://open.canada.ca/data/info/579520e1-9d19-428a-9b22-327d87bdd284
    Available download formats
    html
    Dataset updated
    Oct 25, 2021
    Dataset provided by
    Public Health Agency Of Canada, http://www.phac-aspc.gc.ca/
    License

    Open Government Licence - Canada 2.0, https://open.canada.ca/en/open-government-licence-canada
    License information was derived automatically

    Description

    The Public Health Agency of Canada (PHAC) released new modelling projections of the number of opioid-related deaths that may occur over the course of the coming months. The results of the model suggest that, under some scenarios, the number of opioid-related deaths may remain high or may even increase through to December 31, 2021.

  11. clinical lab parameters covid

    • kaggle.com
    zip
    Updated Nov 16, 2020
    Cite
    Paul Larmuseau (2020). clinical lab parameters covid [Dataset]. https://www.kaggle.com/plarmuseau/forecast-covid-death
    Available download formats
    zip (3047780 bytes)
    Dataset updated
    Nov 16, 2020
    Authors
    Paul Larmuseau
    Description

    Context

    Is there a decision tree for covid19 possible with these datasets?

    validation demonstrates limited clinical utility of the interpretable mortality prediction model for patients with COVID-19

    https://github.com/HAIRLAB/Pre_Surv_COVID_19/blob/master/response/EDA.ipynb

    The sudden increase of COVID-19 cases is putting high pressure on health-care services worldwide. At the current stage, fast, accurate and early clinical assessment of the disease severity is vital. To support decision making and logistical planning in healthcare systems, this study leverages a database of blood samples from 485 infected patients in the region of Wuhan, China to identify crucial predictive biomarkers of disease mortality. For this purpose, machine learning tools selected three biomarkers that predict the mortality of individual patients with more than 90% accuracy: lactic dehydrogenase (LDH), lymphocyte and high-sensitivity C-reactive protein (hs-CRP). In particular, relatively high levels of LDH alone seem to play a crucial role in distinguishing the vast majority of cases that require immediate medical attention. This finding is consistent with current medical knowledge that high LDH levels are associated with tissue breakdown occurring in various diseases, including pulmonary disorders such as pneumonia. Overall, this paper suggests a simple and operable decision rule to quickly predict patients at the highest risk, allowing them to be prioritised and potentially reducing the mortality rate.

  12. Life table data for "Bounce backs amid continued losses: Life expectancy...

    • zenodo.org
    • data.niaid.nih.gov
    csv
    Updated Jul 19, 2022
    + more versions
    Cite
    Jonas Schöley; José Manuel Aburto; Ilya Kashnitsky; Maxi S. Kniffka; Luyin Zhang; Hannaliis Jaadla; Jennifer B. Dowd; Ridhi Kashyap (2022). Life table data for "Bounce backs amid continued losses: Life expectancy changes since COVID-19" [Dataset]. http://doi.org/10.5281/zenodo.6241025
    Available download formats
    csv
    Dataset updated
    Jul 19, 2022
    Dataset provided by
    Zenodo, http://zenodo.org/
    Authors
    Jonas Schöley; José Manuel Aburto; Ilya Kashnitsky; Maxi S. Kniffka; Luyin Zhang; Hannaliis Jaadla; Jennifer B. Dowd; Ridhi Kashyap
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Life table data for "Bounce backs amid continued losses: Life expectancy changes since COVID-19"

    cc-by Jonas Schöley, José Manuel Aburto, Ilya Kashnitsky, Maxi S. Kniffka, Luyin Zhang, Hannaliis Jaadla, Jennifer B. Dowd, and Ridhi Kashyap. "Bounce backs amid continued losses: Life expectancy changes since COVID-19".

    These are CSV files of life tables over the years 2015 through 2021 across 29 countries analyzed in the paper "Bounce backs amid continued losses: Life expectancy changes since COVID-19".

    40-lifetables.csv

    Life table statistics 2015 through 2021 by sex and region with uncertainty quantiles based on Poisson replication of death counts.

    30-lt_input.csv

    Life table input data.

    • `id`: unique row identifier
    • `region_iso`: iso3166-2 region codes
    • `sex`: Male, Female, Total
    • `year`: iso year
    • `age_start`: start of age group
    • `age_width`: width of age group, Inf for age_start 100, otherwise 1
    • `nweeks_year`: number of weeks in that year, 52 or 53
    • `death_total`: number of deaths by any cause
    • `population_py`: person-years of exposure (adjusted for leap-weeks and missing weeks in input data on all cause deaths)
    • `death_total_nweeksmiss`: number of weeks in the raw input data with at least one missing death count for this region-sex-year stratum. A week counts as missing when it is implicitly missing from the input data, when any NAs are encountered in that week, or when age groups are implicitly missing for that week in the input data (e.g. 40-45, 50-55)
    • `death_total_minnageraw`: the minimum number of age-groups in the raw input data within this region-sex-year stratum
    • `death_total_maxnageraw`: the maximum number of age-groups in the raw input data within this region-sex-year stratum
    • `death_total_minopenageraw`: the minimum age at the start of the open age group in the raw input data within this region-sex-year stratum
    • `death_total_maxopenageraw`: the maximum age at the start of the open age group in the raw input data within this region-sex-year stratum
    • `death_total_source`: source of the all-cause death data
    • `population_midyear`: midyear population (July 1st)
    • `population_source`: source of the population count/exposure data
    • `death_covid`: number of deaths due to covid
    • `death_covid_date`: the date as of which the number of deaths due to covid is counted
    • `death_covid_nageraw`: the number of age groups in the covid input data
    • `ex_wpp_estimate`: life expectancy estimates from the World Population Prospects for a five-year period, merged at the midpoint year
    • `ex_hmd_estimate`: life expectancy estimates from the Human Mortality Database
    • `nmx_hmd_estimate`: death rate estimates from the Human Mortality Database
    • `nmx_cntfc`: Lee-Carter death rate projections based on trend in the years 2015 through 2019
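
    As a rough illustration of how the columns listed above might be used, the sketch below reads rows with Python's standard `csv` module and derives a crude death rate as `death_total / population_py`. The sample rows are made up for the example and are not values from `30-lt_input.csv`; the helper name `crude_death_rates` is likewise hypothetical.

```python
import csv
import io

def crude_death_rates(csv_text: str):
    """Yield (region_iso, sex, year, age_start, age_width, rate) per row,
    where rate = death_total / population_py (deaths per person-year)."""
    for row in csv.DictReader(io.StringIO(csv_text)):
        exposure = float(row["population_py"])
        if exposure <= 0:
            continue  # a rate is undefined without exposure
        yield (
            row["region_iso"],
            row["sex"],
            int(row["year"]),
            int(row["age_start"]),
            float(row["age_width"]),  # Python's float() parses "Inf" as infinity
            float(row["death_total"]) / exposure,
        )

# Made-up rows for illustration only; these are NOT values from the dataset.
SAMPLE = """id,region_iso,sex,year,age_start,age_width,death_total,population_py
1,XX,Total,2020,99,1,50,200
2,XX,Total,2020,100,Inf,30,60
"""

for record in crude_death_rates(SAMPLE):
    print(record)
```

    The open-ended 100+ group is the only row where `age_width` is `Inf`, so it can be told apart from single-year ages without a separate flag.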

    Deaths

    • source:
      • STMF:
        • harmonized to single ages via pclm
        • pclm iterates over country, sex, year, and within-year age grouping pattern and converts irregular age groupings, which may vary by country, year and week, into a regular age grouping of 0:110
        • smoothing parameters estimated via BIC grid search separately for every pclm iteration
        • last age group set to [110,111)
        • ages 100:110+ are then summed into 100+ to be consistent with mid-year population information
        • deaths in unknown weeks are considered; deaths in unknown ages are not considered
      • ONS:
        • data already in single ages
        • ages 100:105+ are summed into 100+ to be consistent with mid-year population information
        • PCLM smoothing applied for consistency reasons
      • CDC:
        • The CDC data comes in single ages 0:100 for the US. For 2020 we only have the STMF data in a much coarser age grouping, i.e. (0, 1, 5, 15, 25, 35, 45, 55, 65, 75, 85+). In order to calculate life-tables in a manner consistent with 2020, we summarise the pre 2020 US death counts into the 2020 age grouping and then apply the pclm ungrouping into single year ages, mirroring the approach to the 2020 data.
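
    The first half of the CDC step described above, summarising single-age death counts into the coarser 2020 grouping, can be sketched as below. The function and variable names are illustrative assumptions, and the sketch deliberately stops before the pclm ungrouping step, which it does not attempt to reproduce.

```python
from bisect import bisect_right

# Lower bounds of the coarse grouping quoted above; the last group (85+) is open-ended.
AGE_GROUP_STARTS = [0, 1, 5, 15, 25, 35, 45, 55, 65, 75, 85]

def regroup_deaths(deaths_by_single_age):
    """Sum single-age death counts ({age: count}) into the coarse age groups.

    Returns {group_start_age: total_deaths}.
    """
    grouped = {start: 0 for start in AGE_GROUP_STARTS}
    for age, count in deaths_by_single_age.items():
        # bisect_right returns the index of the first group start greater than age;
        # stepping back one gives the group that contains this age.
        start = AGE_GROUP_STARTS[bisect_right(AGE_GROUP_STARTS, age) - 1]
        grouped[start] += count
    return grouped
```

    The same summing pattern would cover the rule of collapsing ages 100 and above into a single 100+ group, using a different list of group starts.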

    Population

    • source:
      • for years 2000 to 2019: World Population Prospects 2019 single year-age population estimates 1950-2019
      • for year 2020: World Population Prospects 2019 single year-age population projections 2020-2100
    • mid-year population
      • mid-year population translated into exposures:
        • if a region reports annual deaths using the Gregorian calendar definition of a year (365 or 366 days long) set exposures equal to mid year population estimates
        • if a region reports annual deaths using the iso-week-year definition of a year (364 or 371 days long), and if there is a leap-week in that year, set exposures equal to 371/364 * mid_year_population to account for the longer reporting period. In years without leap-weeks, set exposures equal to mid-year population estimates. Further multiply by the fraction of observed weeks out of all weeks in the year.
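
    A minimal sketch of the exposure rules above follows. The function name and parameters are hypothetical, and the sketch assumes that the observed-weeks adjustment applies only to regions reporting by iso-week-year, which the description does not state explicitly.

```python
def exposure_person_years(mid_year_population, iso_week_year,
                          nweeks_year=52, weeks_observed=None):
    """Translate a mid-year population into person-years of exposure.

    iso_week_year  -- True if the region reports deaths by iso-week-year,
                      False if it reports by Gregorian calendar year
    nweeks_year    -- 52 or 53 (53 marks a leap-week year)
    weeks_observed -- weeks with data; None means all weeks were observed
    """
    exposure = float(mid_year_population)
    if iso_week_year:
        if nweeks_year == 53:
            # Leap-week year: the reporting period is 371 days instead of 364.
            exposure *= 371 / 364
        if weeks_observed is not None:
            # Assumption: scale by the fraction of weeks actually observed.
            exposure *= weeks_observed / nweeks_year
    return exposure
```

    For example, `exposure_person_years(364000, True, nweeks_year=52, weeks_observed=26)` halves the exposure because only half of the year's weeks were observed.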

    COVID deaths

    • source: COVerAGE-DB (https://osf.io/mpwjq/)
    • the data base reports cumulative numbers of COVID deaths over days of a year, we extract the most up to date yearly total
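
    One way to read "extract the most up to date yearly total" is to keep, for each year, the cumulative count with the latest date. The sketch below does that; the `(date, cumulative_deaths)` input layout is an assumption for illustration and not the COVerAGE-DB schema.

```python
from datetime import date

def latest_cumulative_by_year(records):
    """records: iterable of (date, cumulative_deaths) pairs (hypothetical layout).

    Returns {year: (as_of_date, cumulative_deaths)} using the latest date seen
    in each year, whatever order the records arrive in.
    """
    latest = {}
    for day, cumulative in records:
        current = latest.get(day.year)
        if current is None or day > current[0]:
            latest[day.year] = (day, cumulative)
    return latest
```

    Keeping the as-of date next to the count mirrors the pairing of the `death_covid` and `death_covid_date` columns in the input data.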

    External life expectancy estimates

  13. San Marino Coronavirus COVID-19 Deaths

    • tradingeconomics.com
    • fi.tradingeconomics.com
    csv, excel, json, xml
    Updated May 18, 2023
    + more versions
    Cite
    TRADING ECONOMICS (2023). San Marino Coronavirus COVID-19 Deaths [Dataset]. https://tradingeconomics.com/san-marino/coronavirus-deaths
    Available download formats
    xml, csv, excel, json
    Dataset updated
    May 18, 2023
    Dataset authored and provided by
    TRADING ECONOMICS
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    Jan 4, 2020 - May 17, 2023
    Area covered
    San Marino
    Description

    San Marino recorded 125 coronavirus deaths since the epidemic began, according to the World Health Organization (WHO). In addition, San Marino reported 24,247 coronavirus cases. This dataset provides San Marino Coronavirus Deaths: actual values, historical data, forecast, chart, statistics, economic calendar and news.

  14. COVID-19 Dataset

    • kaggle.com
    zip
    Updated Nov 13, 2022
    Cite
    Meir Nizri (2022). COVID-19 Dataset [Dataset]. https://www.kaggle.com/datasets/meirnizri/covid19-dataset
    Available download formats
    zip (4890659 bytes)
    Dataset updated
    Nov 13, 2022
    Authors
    Meir Nizri
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Context

    Coronavirus disease (COVID-19) is an infectious disease caused by a newly discovered coronavirus. Most people infected with the COVID-19 virus will experience mild to moderate respiratory illness and recover without requiring special treatment. Older people, and those with underlying medical problems like cardiovascular disease, diabetes, chronic respiratory disease, and cancer, are more likely to develop serious illness. During the entire course of the pandemic, one of the main problems that healthcare providers have faced is the shortage of medical resources and the lack of a proper plan to distribute them efficiently. In these tough times, being able to predict what kind of resource an individual might require at the time of testing positive, or even before that, will be of immense help to the authorities, as they would be able to procure and arrange the resources necessary to save the life of that patient.

    The main goal of this project is to build a machine learning model that, given a COVID-19 patient's current symptoms, status, and medical history, will predict whether the patient is at high risk or not.

    Content

    The dataset was provided by the Mexican government (link). This dataset contains a large amount of anonymized patient-related information, including pre-conditions. The raw dataset consists of 21 unique features and 1,048,576 unique patients. In the Boolean features, 1 means "yes" and 2 means "no". Values of 97 and 99 indicate missing data.

    • sex: 1 for female and 2 for male.
    • age: of the patient.
    • classification: covid test findings. Values 1-3 mean that the patient was diagnosed with covid in different degrees. 4 or higher means that the patient is not a carrier of covid or that the test is inconclusive.
    • patient type: type of care the patient received in the unit. 1 for returned home and 2 for hospitalization.
    • pneumonia: whether the patient already has air sacs inflammation or not.
    • pregnancy: whether the patient is pregnant or not.
    • diabetes: whether the patient has diabetes or not.
    • copd: Indicates whether the patient has Chronic obstructive pulmonary disease or not.
    • asthma: whether the patient has asthma or not.
    • inmsupr: whether the patient is immunosuppressed or not.
    • hypertension: whether the patient has hypertension or not.
    • cardiovascular: whether the patient has a heart or blood vessel related disease or not.
    • renal chronic: whether the patient has chronic renal disease or not.
    • other disease: whether the patient has other disease or not.
    • obesity: whether the patient is obese or not.
    • tobacco: whether the patient is a tobacco user.
    • usmr: indicates whether the patient was treated in medical units of the first, second or third level.
    • medical unit: type of institution of the National Health System that provided the care.
    • intubed: whether the patient was connected to the ventilator.
    • icu: Indicates whether the patient had been admitted to an Intensive Care Unit.
    • date died: the date of death if the patient died, and 9999-99-99 otherwise.

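
    The coding scheme described above (1 for yes, 2 for no, 97 and 99 for missing, classification values 1-3 for a COVID diagnosis, and 9999-99-99 for patients who did not die) could be decoded along these lines. The helper names are hypothetical, and raising an error on any other Boolean code is a choice made for this sketch, not something the description specifies.

```python
MISSING_CODES = {97, 99}      # described above as missing data
NOT_DECEASED = "9999-99-99"   # placeholder used when the patient did not die

def decode_boolean(code):
    """Map 1 -> True, 2 -> False, 97/99 -> None (missing)."""
    code = int(code)
    if code in MISSING_CODES:
        return None
    if code == 1:
        return True
    if code == 2:
        return False
    # Assumption for this sketch: anything else is unexpected and surfaced loudly.
    raise ValueError(f"unexpected Boolean code: {code}")

def has_covid(classification):
    """Values 1-3 mean a COVID diagnosis; 4 or higher means negative or inconclusive."""
    return 1 <= int(classification) <= 3

def died(date_died):
    """True when the 'date died' field holds a date rather than the placeholder."""
    return date_died.strip() != NOT_DECEASED
```

    Decoding missing values to `None` rather than `False` keeps "unknown" distinct from "no", which matters if these fields are later used as model features.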
  15. Data from: COVID-19 Deaths Dataset

    • zenodo.org
    • data.niaid.nih.gov
    zip
    Updated Apr 1, 2021
    Cite
    Rakshitha Godahewa; Christoph Bergmeir; Geoff Webb (2021). COVID-19 Deaths Dataset [Dataset]. http://doi.org/10.5281/zenodo.3994922
    Available download formats
    zip
    Dataset updated
    Apr 1, 2021
    Dataset provided by
    Zenodo, http://zenodo.org/
    Authors
    Rakshitha Godahewa; Christoph Bergmeir; Geoff Webb
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset contains 266 daily time series that represent the COVID-19 deaths in a set of countries and states from 22/01/2020 to 20/08/2020. It was extracted from the Johns Hopkins repository.

  16. Respiratory Virus Weekly Report

    • data.ca.gov
    • data.chhs.ca.gov
    • +2 more
    csv, zip
    Updated Nov 28, 2025
    + more versions
    Cite
    California Department of Public Health (2025). Respiratory Virus Weekly Report [Dataset]. https://data.ca.gov/dataset/respiratory-virus-weekly-report
    Available download formats
    csv, zip
    Dataset updated
    Nov 28, 2025
    Dataset authored and provided by
    California Department of Public Health, https://www.cdph.ca.gov/
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Data is from the California Department of Public Health (CDPH) Respiratory Virus Weekly Report.

    The report is updated each Friday.

    Laboratory surveillance data: California laboratories report SARS-CoV-2 test results to CDPH through electronic laboratory reporting. Los Angeles County SARS-CoV-2 lab data has a 7-day reporting lag. Test positivity is calculated using SARS-CoV-2 lab tests that have a specimen collection date reported during a given week.

    Laboratory surveillance for influenza, respiratory syncytial virus (RSV), and other respiratory viruses (parainfluenza types 1-4, human metapneumovirus, non-SARS-CoV-2 coronaviruses, adenovirus, enterovirus/rhinovirus) involves the use of data from clinical sentinel laboratories (hospital, academic or private) located throughout California. Specimens for testing are collected from patients in healthcare settings and do not reflect all testing for influenza, respiratory syncytial virus, and other respiratory viruses in California. These laboratories report the number of laboratory-confirmed influenza, respiratory syncytial virus, and other respiratory virus detections and isolations, and the total number of specimens tested by virus type on a weekly basis.

    Test positivity for a given week is calculated by dividing the number of positive COVID-19, influenza, RSV, or other respiratory virus results by the total number of specimens tested for that virus. Weekly laboratory surveillance data are defined as Sunday through Saturday.
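The test positivity calculation described above can be sketched as follows. The function name and the example counts are hypothetical, and returning `None` for a week with no specimens is an assumption made here, not something the report specifies.

```python
from typing import Optional


def positivity(positive: int, tested: int) -> Optional[float]:
    """Share of specimens tested for one virus in one week that were positive."""
    if positive < 0 or tested < 0 or positive > tested:
        raise ValueError("counts must satisfy 0 <= positive <= tested")
    if tested == 0:
        # Assumption: positivity is undefined when nothing was tested.
        return None
    return positive / tested


# Hypothetical week: 150 positive results out of 2,000 specimens tested.
share = positivity(150, 2000)
print(f"{share:.1%}")
```

Numerator and denominator must refer to the same virus and the same Sunday-through-Saturday week for the ratio to be meaningful.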

    Hospitalization data: Data on COVID-19 and influenza hospital admissions are from Centers for Disease Control and Prevention’s (CDC) National Healthcare Safety Network (NHSN) Hospitalization dataset. The requirement to report COVID-19 and influenza-associated hospitalizations was effective November 1, 2024. CDPH pulls NHSN data from the CDC on the Wednesday prior to the publication of the report. Results may differ depending on which day data are pulled. Admission rates are calculated using population estimates from the P-3: Complete State and County Projections Dataset provided by the State of California Department of Finance (https://dof.ca.gov/forecasting/demographics/projections/). Reported weekly admission rates for the entire season use the population estimates for the year the season started. For more information on NHSN data including the protocol and data collection information, see the CDC NHSN webpage (https://www.cdc.gov/nhsn/index.html).
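A population-based admission rate of the kind described above might be computed as in this sketch. The per-100,000 scaling, the function names, and all numbers are assumptions for illustration; the listing does not state the scaling or any population figures.

```python
def admission_rate(admissions: int, population: int, per: int = 100_000) -> float:
    """Admissions per `per` residents (per-100,000 scaling is an assumption)."""
    if population <= 0:
        raise ValueError("population must be positive")
    return admissions / population * per


def season_population(population_by_year: dict, season_start_year: int) -> int:
    """Pick the estimate for the year the season started, as the text describes."""
    return population_by_year[season_start_year]


# Hypothetical population estimates, not taken from the P-3 projections.
estimates = {2024: 39_000_000, 2025: 39_100_000}

# Every week of a season that began in 2024 uses the 2024 estimate,
# including weeks that fall in calendar year 2025.
pop = season_population(estimates, 2024)
print(round(admission_rate(390, pop), 2))  # about 1 admission per 100,000
```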

    CDPH collaborates with Northern California Kaiser Permanente (NCKP) to monitor trends in RSV admissions. The percentage of RSV admissions is calculated by dividing the number of RSV-related admissions by the total number of admissions during the same period. Admissions for pregnancy, labor and delivery, birth, and outpatient procedures are not included in total number of admissions. These admissions serve as a proxy for RSV activity and do not necessarily represent laboratory confirmed hospitalizations for RSV infections; NCKP members are not representative of all Californians.

    Weekly hospitalization data are defined as Sunday through Saturday.
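Grouping daily records into the Sunday-through-Saturday weeks used throughout this report can be sketched as below. The helper name `week_start` is hypothetical, and nothing here reflects how CDPH implements the aggregation.

```python
from datetime import date, timedelta


def week_start(d: date) -> date:
    """Return the Sunday that begins the Sunday-through-Saturday week containing d."""
    # date.weekday() numbers Monday as 0 and Sunday as 6,
    # so the number of days elapsed since Sunday is (weekday + 1) % 7.
    days_since_sunday = (d.weekday() + 1) % 7
    return d - timedelta(days=days_since_sunday)


# Friday, Nov 28, 2025 falls in the week that begins Sunday, Nov 23, 2025.
print(week_start(date(2025, 11, 28)))
```

Using the week's starting Sunday as a grouping key keeps laboratory, hospitalization, death, and wastewater series aligned on the same weekly boundaries.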

    Death certificate data: CDPH receives weekly year-to-date dynamic data on deaths occurring in California from the CDPH Center for Health Statistics and Informatics. These data are limited to deaths occurring among California residents and are analyzed to identify influenza, respiratory syncytial virus, and COVID-19-coded deaths. These deaths are not necessarily laboratory-confirmed and are an underestimate of all influenza, respiratory syncytial virus, and COVID-19-associated deaths in California. Weekly death data are defined as Sunday through Saturday.

    Wastewater data: This dataset represents statewide weekly SARS-CoV-2 wastewater summary values. SARS-CoV-2 wastewater concentrations from all sites in California are combined into a single, statewide, unit-less summary value for each week, using a method for data transformation and aggregation developed by the CDC National Wastewater Surveillance System (NWSS). Please see the CDC NWSS data methods page for a description of how these summary values are calculated. Weekly wastewater data are defined as Sunday through Saturday.

  17. Data_Sheet_1_The risk profile of patients with COVID-19 as predictors of...

    • datasetcatalog.nlm.nih.gov
    Updated Jul 26, 2022
    Cite
    Sturkenboom, Miriam; Bouhaddani, Said el; Royo, Albert Cid; Rahimi, Ezat; Ahmadizar, Fariba; Sigari, Naseh; Shahisavandi, Mina; Azizi, Mohammad (2022). Data_Sheet_1_The risk profile of patients with COVID-19 as predictors of lung lesions severity and mortality—Development and validation of a prediction model.PDF [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0000358035
    Explore at:
    Dataset updated
    Jul 26, 2022
    Authors
    Sturkenboom, Miriam; Bouhaddani, Said el; Royo, Albert Cid; Rahimi, Ezat; Ahmadizar, Fariba; Sigari, Naseh; Shahisavandi, Mina; Azizi, Mohammad
    Description

    Objective: We developed and validated a prediction model based on individuals' risk profiles to predict the severity of lung involvement and death in patients hospitalized with coronavirus disease 2019 (COVID-19) infection.

    Methods: In this retrospective study, we studied hospitalized COVID-19 patients with data on chest CT scans performed during hospital stay (February 2020-April 2021) in a training dataset (TD) (n = 2,251) and an external validation dataset (eVD) (n = 993). We used the most relevant demographical, clinical, and laboratory variables (n = 25) as potential predictors of COVID-19-related outcomes. The primary and secondary endpoints were the severity of lung involvement quantified as mild (≤25%), moderate (26–50%), severe (>50%), and in-hospital death, respectively. We applied random forest (RF) classifier, a machine learning technique, and multivariable logistic regression analysis to study our objectives.

    Results: In the TD and the eVD, respectively, the mean [standard deviation (SD)] age was 57.9 (18.0) and 52.4 (17.6) years; patients with severe lung involvement [n (%): 185 (8.2) and 116 (11.7)] were significantly older [mean (SD) age: 64.2 (16.9), and 56.2 (18.9)] than the other two groups (mild and moderate). The mortality rate was higher in patients with severe (64.9 and 38.8%) compared to moderate (5.5 and 12.4%) and mild (2.3 and 7.1%) lung involvement. The RF analysis showed age, C reactive protein (CRP) levels, and duration of hospitalizations as the three most important predictors of lung involvement severity at the time of the first CT examination. Multivariable logistic regression analysis showed a significant strong association between the extent of the severity of lung involvement (continuous variable) and death; adjusted odds ratio (OR): 9.3; 95% CI: 7.1–12.1 in the TD and 2.6 (1.8–3.5) in the eVD.

    Conclusion: In hospitalized patients with COVID-19, the severity of lung involvement is a strong predictor of death. Age, CRP levels, and duration of hospitalizations are the most important predictors of severe lung involvement. A simple prediction model based on available clinical and imaging data provides a validated tool that predicts the severity of lung involvement and death probability among hospitalized patients with COVID-19.
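The severity categories quoted in the abstract (mild ≤25%, moderate 26–50%, severe >50%) can be written as a small lookup. This sketch covers only the category thresholds, not the study's prediction model. The function name is hypothetical, and treating fractional values between 25 and 26 as moderate is an assumption, since the abstract gives whole-number cut-offs.

```python
def lung_severity(involvement_pct: float) -> str:
    """Map percent lung involvement to the severity category from the abstract."""
    if not 0 <= involvement_pct <= 100:
        raise ValueError("involvement must be a percentage between 0 and 100")
    if involvement_pct <= 25:
        return "mild"
    if involvement_pct <= 50:
        # Assumption: values just above 25 count as moderate.
        return "moderate"
    return "severe"


for pct in (10, 25, 26, 50, 51):
    print(pct, lung_severity(pct))
```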

  18. COVID-19 SPI-M-O medium term projections, created October 2020 to end of...

    • data-staging.niaid.nih.gov
    • data.niaid.nih.gov
    Updated Jul 4, 2022
    Cite
    SPI-M-O modelling groups; Dstl (2022). COVID-19 SPI-M-O medium term projections, created October 2020 to end of January 2021 [Dataset]. https://data-staging.niaid.nih.gov/resources?id=zenodo_6778104
    Explore at:
    Dataset updated
    Jul 4, 2022
    Authors
    SPI-M-O modelling groups; Dstl
    License

    Open Government Licence 3.0: http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/
    License information was derived automatically

    Description

    SPI-M-O consensus and individual model medium term projections created between October 2020 and the end of January 2021 for daily number of COVID-19 patients admitted to hospital, and deaths within 28 days of positive test by date of death, within England and English regions.

  19. Table_2_Difference in mortality rates in hospitalized COVID-19 patients...

    • datasetcatalog.nlm.nih.gov
    • frontiersin.figshare.com
    Updated Sep 20, 2022
    + more versions
    Cite
    Rojas-Chaves, Sebastián; Echeverri-McCandless, Ann; Solano-Vargas, Mariela; Silesky-Jiménez, Juan Ignacio; Sanabría-Castro, Alfredo; Mora, Javier; Weigert, Andreas; Sibaja-Campos, Mario; Landaverde-Recinos, Denis; Suarez-Sánchez, María José; Madrigal-Sánchez, Juan José; Rojas-Salas, María Paula; Figueroa-Protti, Lucia; Villafuerte-Mena, Danae; Chaverri-Fernández, José Miguel; Calvo-Flores, Leonardo; Soto-Rodríguez, Andrés; Castro-Castro, Ana Cristina; Boza-Calvo, Carolina; Molina-Mora, Jose Arturo (2022). Table_2_Difference in mortality rates in hospitalized COVID-19 patients identified by cytokine profile clustering using a machine learning approach: An outcome prediction alternative.XLSX [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0000230585
    Explore at:
    Dataset updated
    Sep 20, 2022
    Authors
    Rojas-Chaves, Sebastián; Echeverri-McCandless, Ann; Solano-Vargas, Mariela; Silesky-Jiménez, Juan Ignacio; Sanabría-Castro, Alfredo; Mora, Javier; Weigert, Andreas; Sibaja-Campos, Mario; Landaverde-Recinos, Denis; Suarez-Sánchez, María José; Madrigal-Sánchez, Juan José; Rojas-Salas, María Paula; Figueroa-Protti, Lucia; Villafuerte-Mena, Danae; Chaverri-Fernández, José Miguel; Calvo-Flores, Leonardo; Soto-Rodríguez, Andrés; Castro-Castro, Ana Cristina; Boza-Calvo, Carolina; Molina-Mora, Jose Arturo
    Description

    COVID-19 is an acute respiratory disease caused by the novel coronavirus SARS-CoV-2 that can eventually lead to severe acute respiratory syndrome (SARS). An exacerbated inflammatory response is characteristic of SARS-CoV-2 infection, which leads to a cytokine release syndrome, also known as cytokine storm, associated with the severity of the disease. Considering the importance of this event in the immunopathology of COVID-19, this study analyses cytokine levels of hospitalized patients to identify cytokine profiles associated with severity and mortality. Using a machine learning approach, 3 clusters of COVID-19 hospitalized patients were created based on their cytokine profile. Significant differences in the mortality rate were found among the clusters, associated with different CXCL10/IL-38 ratios. The balance of a CXCL10-induced inflammation with an appropriate immune regulation mediated by the anti-inflammatory cytokine IL-38 appears to generate the adequate immune context to overrule SARS-CoV-2 infection without creating a harmful inflammatory reaction. This study supports the concept that analyzing a single cytokine is insufficient to determine the outcome of a complex disease such as COVID-19, and different strategies incorporating bioinformatic analyses considering a broader immune profile represent a more robust alternative to predict the outcome of hospitalized patients with SARS-CoV-2 infection.

  20. COVID-19-Daily-Data2

    • kaggle.com
    zip
    Updated Apr 13, 2020
    Cite
    Craig Phillips (2020). COVID-19-Daily-Data2 [Dataset]. https://www.kaggle.com/craigphillips/covid19dailydata2
    Explore at:
    Available download formats: zip (113701 bytes)
    Dataset updated
    Apr 13, 2020
    Authors
    Craig Phillips
    Description

    Context

    Our primary objective is to commit our data and ideas into code so that we can share these ideas with true Data Scientists to be used to better understand this pandemic. Our current model uses the most current data available to create predictive models by country/region that estimate the maximum number of Confirmed Cases and a reasonable timeline to go with it.

    Content

    Most of us are familiar with the data. China (mainly Hubei) was at the epicenter of this pandemic starting around January 22, 2020, and from there it spread to Europe and then around the world. Since the Far East is further along in this situation, we are already seeing cases of COVID flatten out in certain areas, namely Hubei, China, and South Korea. Most other countries are still in the growth stage of their development. However, from Hubei and South Korea we were able to fit regression curves to these data. Of notable importance was a version of the Sigmoid curve-fit equation as shown below. Yes, there are other equations that had better fits (r2); however, the Sigmoid equation has meaningful fit parameters that stand for something to us, the users.
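The authors' exact equation is not reproduced in this listing. As an illustration of what "meaningful fit parameters" can look like, the sketch below uses a common three-parameter logistic (sigmoid) curve. The functional form, the parameter names `L`, `k`, and `t0`, and all numbers are assumptions, not the notebook's actual model.

```python
import math


def logistic(t: float, L: float, k: float, t0: float) -> float:
    """Three-parameter sigmoid for cumulative confirmed cases on day t.

    L  : plateau, i.e. the estimated maximum number of confirmed cases
    k  : growth rate, controlling how steep the rise is
    t0 : day of the inflection point, where cases reach L / 2
    """
    return L / (1.0 + math.exp(-k * (t - t0)))


# Hypothetical parameters, chosen only to show how each one reads.
L, k, t0 = 68_000.0, 0.2, 20.0

print(logistic(t0, L, k, t0))            # half of the plateau at the inflection day
print(round(logistic(100.0, L, k, t0)))  # approaches the plateau L late in the curve
```

Each parameter maps onto a quantity a reader can interpret directly (final size, speed, and timing of the outbreak), which is the appeal described above even when another curve fits slightly better.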

    Acknowledgements

    We have studied and openly used code from covid-19-digging-a-bit-deeper and COVID Global Forecast: SIR model + ML regressions as go-by's in the preparation of this notebook. These were both great notebooks that allowed this non-programmer to at least share some ideas in the spirit of collaboration.

    Inspiration

    These COVID data have certain characteristics by country/region, as pointed out by Tomas Pueyo in the Medium article "Coronavirus: The Hammer and the Dance". Tomas did an excellent job of describing these artifacts in the Hubei data in relation to what he called the Hammer and the Dance, and this gave us insight into interpreting the data from South Korea and hopefully the rest of the world soon.

