Official statistics are produced impartially and free from political influence.
The MIDAS (Models of Infectious Disease Agent Study) Online Portal for COVID-19 Modeling Research is a collection of publicly available COVID-19 resources to support dashboard monitoring, data processing, modeling, and visualization efforts. Collections listed in the portal include case counts and case line lists with documented metadata, peer-reviewed and non-peer-reviewed parameter estimates, and software created by MIDAS community members. Datasets and parameter estimates are maintained and stored in the MIDAS GitHub repository; software is hosted by its respective creators on GitHub or a personal webpage.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset includes three tables with the model-based projections and estimates as shown on CalCAT in 2025 (http://calcat.cdph.ca.gov) for California state, regions, and counties.
(1) COVID-19 Nowcasts includes the R-effective estimates for COVID-19 from the different models available for the past 80 days from the archive date and the median ensemble thereof.
(2) CalCAT Forecasts includes hospital census and admissions forecasts for COVID-19 and Influenza, and the corresponding ensemble metrics for a 4 week horizon from the archive date.
(3) Variant Proportion Nowcasts contains the Integrated Genomic Epidemiology Dataset (IGED)-based and Terra-based estimates of COVID-19 variants circulating over the past 3 months as well as model-based predictions for the proportions of the variants of concern for dates leading up to the archive date. Prediction intervals are included when available.
This dataset provides CalCAT users with programmatic access to the downloadable datasets on CalCAT.
This dataset also includes a zipped file with the historical archives of the COVID-19 Nowcasts, CalCAT Forecasts and Variant Proportion Nowcasts through 2023.
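The "median ensemble" of R-effective estimates mentioned in table (1) can be illustrated with a short sketch. The record layout (date, model name, R-effective), the model names, and the numbers below are hypothetical and are not taken from the CalCAT files; the sketch only assumes that the ensemble value for a date is the median across the models reporting on that date.

```python
from collections import defaultdict
from statistics import median


def median_ensemble(estimates):
    """Return {date: median R-effective across models} for each date.

    `estimates` is an iterable of (date, model_name, r_eff) tuples.
    This layout is a hypothetical one chosen for illustration.
    """
    by_date = defaultdict(list)
    for date, _model, r_eff in estimates:
        if r_eff is not None:  # skip models with no estimate for that date
            by_date[date].append(r_eff)
    return {date: median(values) for date, values in sorted(by_date.items())}


# Made-up example values, not real CalCAT estimates.
rows = [
    ("2025-01-01", "model_a", 0.9),
    ("2025-01-01", "model_b", 1.1),
    ("2025-01-01", "model_c", 1.0),
    ("2025-01-02", "model_a", 0.8),
    ("2025-01-02", "model_b", 1.2),
]
ensemble = median_ensemble(rows)
```

A median is less sensitive than a mean to a single outlying model, which is one common reason to prefer it when combining a small number of estimates.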
Attribution-NonCommercial 4.0 (CC BY-NC 4.0) https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
My PhD thesis, titled "Investigating Changes in COVID-19 Epidemiological Parameters from Different Perspectives", focuses on using anonymized line list data, patient hospitalization data, and viral load data to improve the estimation of key epidemiological parameters during the COVID-19 pandemic in Hong Kong. This dataset contains supporting data for reproducibility. It has six subfolders corresponding to the six chapters of the thesis that contain figures and data analyses (chapters 2, 4, 5, 6, 7, and 8). Each subfolder contains the data and R code for reproducing the figures and other analytical results, accompanied by a README file.
In chapter 2, I provide an overview of the COVID-19 pandemic in Hong Kong and worldwide, so the subfolder dataset chapter 2 contains case incidence data and R code to generate the incidence figure. I also conducted a systematic review of latent period estimation, and the same subfolder includes the EndNote library and a spreadsheet of the EndNote output documenting my paper screening process.
In chapter 4, I carried out a detailed statistical analysis of the changing serial interval of COVID-19 in Hong Kong. The subfolder dataset chapter 4 contains anonymized transmission pair line list data for estimating the serial interval, together with R code and an essential subset of the data output for reproducing my results. The related work is published in the American Journal of Epidemiology; its DOI is given in README chapter4.txt.
In chapter 5, I developed a framework for inferring the generation interval on a temporal scale. The subfolder dataset chapter 5 contains publicly available line list data from mainland China, together with R code and an essential subset of the data output for reproducing my results. The related work is published in Nature Communications, and the data and code are also available on GitHub; the DOI and GitHub link are given in README chapter5.txt.
In chapter 6, I investigated the superspreading potential and setting-specific generation interval in Hong Kong. The subfolder dataset chapter 6 contains simplified and anonymized transmission cluster size information and the R code to reproduce the results, as well as the R code for model building and a summary of the generation interval estimates.
In chapter 7, I estimated the latent period of COVID-19 in different settings in Hong Kong. The subfolder dataset chapter 7 contains processed and anonymized viral load records and transmission pair information for COVID-19 cases in Hong Kong, the R code to reproduce the results, and two spreadsheets summarizing the estimates. The analysis involves many R scripts, which are placed in two subfolders (R and Stan) under dataset chapter 7; the original GitHub link for the R implementation of the method is given in README chapter 7.txt.
In chapter 8, I analyzed the length of hospital stay of COVID-19 patients in Hong Kong and its potential association with vaccination status. The subfolder dataset chapter 8 contains a simplified and anonymized dataset of patients' hospitalization records, including vaccination status and length of stay, together with R code and an essential subset of the data output to reproduce the results.
Open Government Licence 3.0 http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/
License information was derived automatically
Global model data has been generated for COVID-19 (Coronavirus Disease 2019) simulations. The model used was the United Kingdom Earth System Model 1.0 (UKESM1.0), in an atmosphere-only nudged configuration, with Met Office Unified Model version 11.5. The data is on a global N96 grid (192 x 144 points), and covers the years 2012, 2013, and 2014. These data were used to study the effect of COVID-19 lockdowns (simulated scenarios) on atmospheric composition and radiative forcing.
The dataset includes data used in the paper submitted to Geophysical Research Letters (GRL) August 2020 with title 'Minimal climate impacts from short-lived climate forcers following emission reductions related to the COVID-19 pandemic'. See Details/Docs tab for a link to this. For this purpose, there are four experimental integrations (a1, a2, a3, a4), and a control (con) for each year. The files are labelled using variable codes such as m01s34i001 to determine the model variable field contained. A full description of what these are can be found in the included docs/file variable_codes.txt.
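Variable codes such as m01s34i001 follow the Unified Model STASH pattern of model, section, and item numbers. The sketch below splits such a code into its three parts so it can be looked up in variable_codes.txt; the file name used in the example is hypothetical, and the meaning of any particular code should be taken from the included documentation rather than from this snippet.

```python
import re
from typing import NamedTuple


class StashCode(NamedTuple):
    model: int
    section: int
    item: int


# Pattern for codes of the form m01s34i001 (2-digit model, 2-digit section, 3-digit item).
_STASH_PATTERN = re.compile(r"m(\d{2})s(\d{2})i(\d{3})")


def parse_stash(text):
    """Extract the first STASH-style variable code found in `text`."""
    match = _STASH_PATTERN.search(text)
    if match is None:
        raise ValueError(f"no STASH-style code found in {text!r}")
    return StashCode(*(int(group) for group in match.groups()))


# Hypothetical file name, used only to show the parsing step.
code = parse_stash("con_2012_m01s34i001.nc")
print(code)
```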
The data are in NetCDF format, and were generated from the following suites: u-bt034, u-bt090, u-bt091, u-bt092, u-bt637, u-bt341, u-bt342, u-bt343, u-bt344, u-bt926, u-bt375, u-bt376, u-bt377, u-bt378, u-bt927.
This is a NERC funded project.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Estimate, standard error (SE) and 95% confidence interval of peak statistics using the COVID-19 daily case reporting data from Italy (2020-02-20 to 2020-07-11).
Attribution-NonCommercial 3.0 (CC BY-NC 3.0) https://creativecommons.org/licenses/by-nc/3.0/
License information was derived automatically
Dataset general description:
• This dataset reports 4195 recurrent neural network models, their settings, and their generated prediction csv files, graphs, and metadata files, for predicting COVID-19's daily infections in Brazil by training on limited raw data (30 time-steps and 40 time-steps alternatives). The code was developed by the author and is available in the following online data repository: http://dx.doi.org/10.17632/yp4d95pk7n.1
Dataset content:
• Models, Graphs, and csv predictions files: 1. Deterministic mode (DM): includes 1194 generated models files (30 time-steps), and their generated 2835 graphs and 2835 predictions files. Similarly, this mode includes 1976 generated model files (40 time-steps), and their generated 7301 graphs and 7301 predictions files. 2. Non-deterministic mode (NDM): includes 20 generated model files (30 time-steps), and their generated 53 graphs and 53 predictions files. 3. Technical validation mode (TVM): includes 1001 generated model files (30 time-steps), and their generated 3619 graphs and 3619 predictions files for 358 models, which are a sample of 1001 models. Also, 1 model in control group for India. 4. 1 graph and 1 prediction files for each of DM and NDM, reporting evaluation till 2020-07-11.
• Settings and metadata for the above 3 categories: 1. Used settings in json files for reproducibility. 2. Metadata about training and prediction setup and accuracy in csv files.
Raw data source that was used to train the models:
• The raw data used to train the models is from the COVID-19 Data Repository by the Center for Systems Science and Engineering (CSSE) at Johns Hopkins University: https://github.com/CSSEGISandData/COVID-19
• The models were trained on these versions of the raw data: 1. Link till 2020-06-29 (accessed 2020-07-08): https://github.com/CSSEGISandData/COVID-19/raw/78d91b2dbc2a26eb2b2101fa499c6798aa22fca8/csse_covid_19_data/csse_covid_19_time_series/time_series_covid19_confirmed_global.csv 2. Link till 2020-06-13 (accessed 2020-07-08): https://github.com/CSSEGISandData/COVID-19/raw/02ea750a263f6d8b8945fdd3253b35d3fd9b1bee/csse_covid_19_data/csse_covid_19_time_series/time_series_covid19_confirmed_global.csv
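As a rough illustration of how a short training window could be taken from the raw data, the sketch below reads a CSV in the wide layout of time_series_covid19_confirmed_global.csv (one row per region, one column per date, cumulative counts), converts the cumulative series for one country into daily new cases, and keeps the last N time-steps. It is not the author's preprocessing code; the function name and the tiny sample table, including its coordinates and counts, are made up for illustration.

```python
import csv
import io


def daily_new_cases(csv_text, country, window):
    """Return the last `window` daily increments for `country`.

    Assumes the wide JHU CSSE layout: Province/State, Country/Region,
    Lat, Long, then one cumulative-count column per date.
    """
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    country_col = header.index("Country/Region")
    first_date_col = header.index("Long") + 1
    totals = [0] * (len(header) - first_date_col)
    found = False
    for row in reader:
        if row[country_col] == country:
            found = True
            # Sum over rows so countries split into provinces are handled too.
            for k, value in enumerate(row[first_date_col:]):
                totals[k] += int(float(value))
    if not found:
        raise ValueError(f"country {country!r} not present in the data")
    daily = [later - earlier for earlier, later in zip(totals, totals[1:])]
    return daily[-window:]


# Made-up sample in the same column layout (not real counts).
SAMPLE = """Province/State,Country/Region,Lat,Long,1/22/20,1/23/20,1/24/20,1/25/20
,Brazil,-14.2,-51.9,0,2,5,9
,India,20.6,78.9,1,1,3,4
"""
print(daily_new_cases(SAMPLE, "Brazil", window=2))
```

With the real file, `window` would be 30 or 40 to match the two time-step alternatives described above.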
License: This prediction Dataset is licensed under CC BY NC 3.0.
Notice and disclaimer: 1- This prediction Dataset is for scientific and research purposes only. 2- The generation of this Dataset complies with the terms of use of the publicly available raw data from the COVID-19 Data Repository by the Center for Systems Science and Engineering (CSSE) at Johns Hopkins University: https://github.com/CSSEGISandData/COVID-19 and therefore, the author of the prediction Dataset disclaims any and all responsibility and warranties regarding the contents of used raw data, including but not limited to: the correctness, completeness, and any issues linked to third-party rights.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Raw, original data and fits data set for "A Simple Model to Predict Future SARS-CoV-2 Infections on a National Level" by Blanco et al. in EXCEL and GraphPad Prism file formats.
This dataset contains forecasted weekly numbers of reported COVID-19 incident cases, incident deaths, and cumulative deaths in the United States, previously reported on COVID Data Tracker (https://covid.cdc.gov/covid-data-tracker/#datatracker-home). These forecasts were generated using mathematical models by CDC partners in the COVID-19 Forecast Hub (https://covid19forecasthub.org/doc/ensemble/). A CDC ensemble model was produced every week using the submitted models from that week at the national, and state/territory level.
This dataset is intended to mirror the observed and forecasted data, previously available for download on the CDC’s COVID Data Tracker. Mortality forecasts for both new and cumulative reported COVID-19 deaths were produced at the state and territory level and national level. Forecasts of new reported COVID-19 cases were produced at the county, state/territory, and national level. Please note that this dataset is not complete for every model, date, location or combination thereof. Specifically, county level submissions for COVID-19 incident cases were accepted, but not required, and are missing or incomplete for many models and dates. State and territory-level forecasts are more complete, but not all models submitted forecasts for all locations, dates, and targets (new reported deaths, new reported cases, and cumulative reported deaths). Forecasts for COVID-19 incident cases were discontinued in February 2022. Forecasts for COVID-19 cumulative and incident deaths were discontinued in March 2023.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
In Examples 1 to 3, we demonstrated how to use Excel to calculate the variables Sn, En, In, Rn, and yn in the l-i SEIR (Susceptible-Exposed-Infectious-Recovered) model, to determine the time-dependent kn, and to find the actual total number of infections in the absence of vaccination and breakthrough infections. In the l-i SEIR model, l is the length of the latent period, i is the length of the infectious period, and yn is the number of daily confirmed cases. In this section (Example 4), we extend the l-i SEIR model to an l-i SEIR-vaccination model to examine the effect of vaccination on COVID-19 transmission. Two files (one Word file and one Excel file) are attached. In the Word file, the author describes how to build the l-i SEIR-vaccination model and how to calculate the number of daily confirmed cases of COVID-19 infection, yn, in Excel. The calculated and reported yn are compared and displayed graphically in the Excel file.
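For readers who prefer code to a spreadsheet, the following is a generic discrete-time SEIR sketch with a fixed latent period of l days, a fixed infectious period of i days, and a constant number of vaccinations per day. It is not the author's Excel formulation, which is described only in the attached Word file. The update rules, the function and parameter names, the treatment of vaccinated people as fully removed (no breakthrough infections), and all numbers are assumptions made for illustration.

```python
def simulate_fixed_delay_seir(population, k, latent_days, infectious_days,
                              initial_exposed=1.0, vaccinated_per_day=0.0):
    """Simulate S, E, I, R day by day; `k` holds one transmission coefficient per day.

    Assumed rules (hypothetical, not from the source): new infections on day n
    are k[n] * S[n] * I[n] / N; a person spends exactly `latent_days` in E and
    then exactly `infectious_days` in I; vaccinated susceptibles move to R.
    """
    days = len(k)
    entered_e = [0.0] * (days + 1)  # number entering E at the start of each day
    entered_e[0] = initial_exposed
    s, e = [population - initial_exposed], [initial_exposed]
    i, r = [0.0], [0.0]
    onsets = []  # daily number of people becoming infectious
    for n in range(days):
        infections = min(s[n], k[n] * s[n] * i[n] / population)
        vaccinated = min(vaccinated_per_day, s[n] - infections)
        entered_e[n + 1] = infections
        to_i_day = n + 1 - latent_days       # cohort leaving E today
        to_r_day = to_i_day - infectious_days  # cohort leaving I today
        to_i = entered_e[to_i_day] if to_i_day >= 0 else 0.0
        to_r = entered_e[to_r_day] if to_r_day >= 0 else 0.0
        s.append(s[n] - infections - vaccinated)
        e.append(e[n] + infections - to_i)
        i.append(i[n] + to_i - to_r)
        r.append(r[n] + to_r + vaccinated)
        onsets.append(to_i)
    return {"S": s, "E": e, "I": i, "R": r, "onsets": onsets}


# Hypothetical scenario: constant k, with and without vaccination.
baseline = simulate_fixed_delay_seir(1_000_000.0, [0.4] * 120, 3, 7, 10.0)
with_vaccine = simulate_fixed_delay_seir(1_000_000.0, [0.4] * 120, 3, 7, 10.0,
                                         vaccinated_per_day=5_000.0)
```

Comparing `baseline["onsets"]` with `with_vaccine["onsets"]` shows the qualitative effect of removing susceptibles through vaccination under these assumptions; it says nothing about the fitted kn values in the attached files.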
http://opendatacommons.org/licenses/dbcl/1.0/
I combined several data sources into an integrated dataset of country-level COVID-19 confirmed cases, recovered cases, and fatalities, which can be used to build epidemic models such as SIR and SIR with mortality. Population information is included so that incidence and prevalence rates can be calculated. One of my applications based on this dataset is published at https://dylansp.shinyapps.io/COVID19_Visualization_Analysis_Tool/.
My approach is to retrieve cumulative confirmed cases, fatalities, and recovered cases from 2020-01-22 onwards from the Johns Hopkins University Center for Systems Science and Engineering (JHU CSSE) COVID-19 dataset, and to merge them with the country code and population of each country. For the purpose of building epidemic models, I calculated daily new confirmed cases, recovered cases, and fatalities, together with the remaining confirmed cases, which equal cumulative confirmed cases - cumulative recovered cases - cumulative fatalities. I have not yet found credible data sources on probable cases for the various countries. I'll add them once I find them.
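The derived columns described above can be sketched as follows. The function name, the dictionary keys, and the sample numbers are hypothetical; the only rule taken from the description is that remaining confirmed cases equal cumulative confirmed minus cumulative recovered minus cumulative fatalities. The sketch also assumes that counts before the first day in the series are zero.

```python
def derive_daily(confirmed, recovered, fatalities):
    """Derive daily increments and remaining (active) cases from cumulative series."""

    def increments(series):
        # The first value is treated as that day's increment (assumes zero before it).
        return [series[0]] + [b - a for a, b in zip(series, series[1:])]

    remaining = [c - r - f for c, r, f in zip(confirmed, recovered, fatalities)]
    return {
        "new_confirmed": increments(confirmed),
        "new_recovered": increments(recovered),
        "new_fatalities": increments(fatalities),
        "remaining_confirmed": remaining,
    }


# Made-up cumulative counts for three consecutive days.
derived = derive_daily(
    confirmed=[1, 4, 10],
    recovered=[0, 1, 3],
    fatalities=[0, 0, 1],
)
print(derived["remaining_confirmed"])
```

In an SIR-with-mortality setting, the remaining confirmed series is the natural stand-in for the infectious compartment, while the recovered and fatality series map onto the two removed compartments.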
https://spdx.org/licenses/CC0-1.0.html
Studies examining factors responsible for COVID-19 incidence are mostly focused at the national or sub-national level. A global-level characterization of contributing factors and temporal trajectories of disease incidence is lacking. Here we conducted a global-scale analysis of COVID-19 infections to identify key factors associated with early disease incidence. Additionally, we compared longitudinal trends of COVID-19 incidence at a per-country level and classified countries based on COVID-19 incidence trajectories and effects of lockdown responses. Univariate analysis identified eleven variables as independently associated with COVID-19 infections at a global level (p<1e-05). Multivariable analysis identified a 4-variable model as optimal for explaining global variations in COVID-19 (p<0.01). COVID-19 case trajectories for most countries were best captured by a log-logistic model, as determined by AIC estimates. Six predominant country clusters were identified when characterizing the effects of lockdown intervals on variations in COVID-19 new cases per country. Globally, economic and meteorological factors are important determinants of early COVID-19 incidence. Analysis of longitudinal trends and lockdown effects on COVID-19 highlights important nuances in country-specific responses to infections. These results provide valuable insights into disease incidence at a per-country level, possibly allowing for more informed decision making by individual governments in future disease outbreaks. Methods: Data for COVID-19 confirmed cases were obtained from https://ourworldindata.org/coronavirus-source-data, which is updated daily and based on data on confirmed cases and deaths from Johns Hopkins University. Data on additional demographic, meteorological, health or economic variables were downloaded from a variety of sources online. 
For each variable, values from the most recent year for which data on the greatest number of countries were available were utilized (varied between 2016-2019). Variables were categorized into Demographic, Meteorological, Health or Economic domains. Please see the README document ("README_data_COVID19_112322.txt") and the accompanying published article: Ghosh, S., Roy, S.S. Global-scale modeling of early factors and country-specific trajectories of COVID-19 incidence: a cross-sectional study of the first 6 months of the pandemic. BMC Public Health 22, 1919 (2022). https://doi.org/10.1186/s12889-022-14336-w
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset, collected from the official website of the Nigeria Centre for Disease Control (NCDC), provides the daily incidence of COVID-19 from February 23, 2020, to April 10, 2021, organised in a spreadsheet to build a daily time-series database. The dataset also contains the population per state in Nigeria, COVID-19 testing laboratories, etc.
https://www.immport.org/agreement
COVID-19 has challenged health systems to learn how to learn. This paper describes the context, methods and challenges for learning to improve COVID-19 care at one academic health center. Challenges to learning include: (1) choosing a right clinical target; (2) designing methods for accurate predictions by borrowing strength from prior patients' experiences; (3) communicating the methodology to clinicians so they understand and trust it; (4) communicating the predictions to the patient at the moment of clinical decision; and (5) continuously evaluating and revising the methods so they adapt to changing patients and clinical demands. To illustrate these challenges, this paper contrasts two statistical modeling approaches - prospective longitudinal models in common use and retrospective analogues complementary in the COVID-19 context - for predicting future biomarker trajectories and major clinical events. The methods are applied to and validated on a cohort of 1,678 patients who were hospitalized with COVID-19 during the early months of the pandemic. We emphasize graphical tools to promote physician learning and inform clinical decision making.
Classical epidemiological models assume mass action. However, this assumption is violated when interactions are not random. With the recent COVID-19 pandemic, and resulting shelter in place social distancing directives, mass action models must be modified to account for limited social interactions. In this paper we apply a pairwise network model with moment closure to study the early transmission of COVID-19 in New York and San Francisco and to investigate the factors determining the severity and duration of outbreak in these two cities. In particular, we consider the role of population density, transmission rates and social distancing on the disease dynamics and outcomes. Sensitivity analysis shows that there is a strongly negative correlation between the clustering coefficient in the pairwise model and the basic reproduction number and the effective reproduction number. The shelter in place policy makes the clustering coefficient increase thereby reducing the basic reproduction number and the effective reproduction number. By switching population densities in New York and San Francisco we demonstrate how the outbreak would progress if New York had the same density as San Francisco and vice-versa. The results underscore the crucial role that population density has in the epidemic outcomes. We also show that under the assumption of no further changes in policy or transmission dynamics not lifting the shelter in place policy would have little effect on final outbreak size in New York, but would reduce the final size in San Francisco by 97%.
This dataset was created by inversion
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Goodness-of-fit measures of the models fitted to the second COVID-19 data set.
https://www.immport.org/agreement
Background: The impact of variable infection risk by race and ethnicity on the dynamics of SARS-CoV-2 spread is largely unknown. Methods: Here, we fit structured compartmental models to seroprevalence data from New York State and analyze how herd immunity thresholds (HITs), final sizes, and epidemic risk change across groups. Results: A simple model where interactions occur proportionally to contact rates reduced the HIT, but more realistic models of preferential mixing within groups increased the threshold toward the value observed in homogeneous populations. Across all models, the burden of infection fell disproportionately on minority populations: in a model fit to Long Island serosurvey and census data, 81% of Hispanics or Latinos were infected when the HIT was reached compared to 34% of non-Hispanic whites. Conclusions: Our findings, which are meant to be illustrative and not best estimates, demonstrate how racial and ethnic disparities can impact epidemic trajectories and result in unequal distributions of SARS-CoV-2 infection. Funding: K.C.M. was supported by National Science Foundation GRFP grant DGE1745303. Y.H.G. and M.L. were funded by the Morris-Singer Foundation. M.L. was supported by SeroNet cooperative agreement U01 CA261277
Apache License, v2.0 https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
The COVID-19 Scenario Hub contains a standardized set of data on scenario projections from teams making projections of cumulative and incident deaths and incident hospitalizations due to COVID-19 in the United States. The Scenario Hub harmonizes scenario projections in the United States to generate long-term COVID-19 projections that combine insights from different models, and to make them available to decision-makers, public health experts, and the general public.
Open Government Licence - Canada 2.0 https://open.canada.ca/en/open-government-licence-canada
License information was derived automatically
Mathematical modelling of the COVID-19 pandemic, which uses mathematical equations to estimate how many cases of a disease may occur in the coming weeks or months.