Facebook
Twitterhttps://github.com/nytimes/covid-19-data/blob/master/LICENSEhttps://github.com/nytimes/covid-19-data/blob/master/LICENSE
The New York Times is releasing a series of data files with cumulative counts of coronavirus cases in the United States, at the state and county level, over time. We are compiling this time series data from state and local governments and health departments in an attempt to provide a complete record of the ongoing outbreak.
Since the first reported coronavirus case in Washington State on Jan. 21, 2020, The Times has tracked cases of coronavirus in real time as they were identified after testing. Because of the widespread shortage of testing, however, the data is necessarily limited in the picture it presents of the outbreak.
We have used this data to power our maps and reporting tracking the outbreak, and it is now being made available to the public in response to requests from researchers, scientists and government officials who would like access to the data to better understand the outbreak.
The data begins with the first reported coronavirus case in Washington State on Jan. 21, 2020. We will publish regular updates to the data in this repository.
Facebook
Twitterhttps://ottawa.ca/en/city-hall/get-know-your-city/open-data#open-data-licence-version-2-0https://ottawa.ca/en/city-hall/get-know-your-city/open-data#open-data-licence-version-2-0
This file contains data regarding a 7-day average of the estimated instantaneous reproduction number, R(t), of COVID-19 in Ottawa. The reproduction number, R, is the average number of secondary cases of disease caused by a single infected individual over his or her infectious period. R(t) values greater than 1 indicate the virus is spreading faster and each case infects more than one contact, and less than 1 indicates the spread is slowing and the epidemic is coming under control.
R(t) was calculated using the EpiEstim package, developed by Cori et al. (2013; DOI: 10.1093/aje/kwt133), in the R software environment for statistical computing and graphics. Accurate episode date was used as the time anchor and cases were assigned as having a local or travel-related source of infection.
Accuracy: Points of consideration for interpretation of the data: Data are entered into and extracted by Ottawa Public Health from la Solution de gestion des cas et des contacts pour la santé publique (Solution GCC). The CCM is a dynamic disease reporting system that allows for ongoing updates; data represent a snapshot at the time of extraction and may differ from previous or subsequent reports.As the cases are investigated and more information is available, the dates are updated.A person’s exposure may have occurred up to 14 days prior to onset of symptoms. Symptomatic cases occurring in approximately the last 14 days are likely under-reported due to the time for individuals to seek medical assessment, availability of testing, and receipt of test results.Confirmed cases are those with a confirmed COVID-19 laboratory result as per the Ministry of Health Public health management of cases and contacts of COVID-19 in Ontario. March 25, 2020 version 6.0.Counts will be subject to varying degrees of underreporting due to a variety of factors, such as disease awareness and medical care seeking behaviours, which may depend on severity of illness, clinical practice, changes in laboratory testing, and reporting behaviours.Surveillance testing for COVID-19 began in long term care facilities on April 25, 2020. Attributes: Data fields: Date – the earliest of symptom onset, test or reported date for cases (YYYY-MM-DD H:MM).Lower Bound - 95% Confidence Interval - lower bound of the 95% confidence interval for the 7-day average of the R(t) estimate. Upper Bound - 95% Confidence Interval - upper bound of the 95% confidence interval for the 7-day average of the R(t) estimate.Estimate of R(t) (7 Day Average) - 7-day average of the estimated instantaneous reproduction number, R(t), of COVID-19 in Ottawa. Nowcasting Adjusted Cases by Episode Date – number of Ottawa residents with confirmed COVID-19 by episode date. Counts for the most recent 14 days represent a nowcasting adjusted estimate developed by R. Imgrund in 2020. The model uses linear regression to estimate the number of future cases expected to have an accurate episode date within that 14-day window. Update Frequency: As of March 2022, the dataset is no longer updated. Historical data only. Contact: OPH Epidemiology Team
Facebook
TwitterIdentifying changes in the reproduction number, rate of spread, and doubling time during the course of the COVID-19 outbreak whilst accounting for potential biases due to delays in case reporting both nationally and subnationally in India. These results are impacted by changes in testing effort, increases and decreases in testing effort will increase and decrease reproduction number estimates respectively.
Facebook
TwitterRead the associated blogpost for a detailed description of how this dataset was prepared; plus extra code for producing animated maps.
The 2019 Novel Coronavirus (COVID-19) continues to spread in countries around the world. This dataset provides daily updated number of reported cases & deaths in Germany on the federal state (Bundesland) and county (Landkreis/Stadtkreis) level. In April 2021 I added a dataset on vaccination progress. In addition, I provide geospatial shape files and general state-level population demographics to aid the analysis.
The dataset consists of thre main csv files: covid_de.csv, demgraphics_de.csv, and covid_de_vaccines.csv. The geospatial shapes are included in the de_state.* files. See the column descriptions below for more detailed information.
covid_de.csv: COVID-19 cases and deaths which will be updated daily. The original data are being collected by Germany's Robert Koch Institute and can be download through the National Platform for Geographic Data (the latter site also hosts an interactive dashboard). I reshaped and translated the data (using R tidyverse tools) to make it better accessible. This blogpost explains how I prepared the data, and describes how to produces animated maps.
demographics_de.csv: General Demographic Data about Germany on the federal state level. Those have been downloaded from Germany's Federal Office for Statistics (Statistisches Bundesamt) through their Open Data platform GENESIS. The data reflect the (most recent available) estimates on 2018-12-31. You can find the corresponding table here.
covid_de_vaccines.csv: In April 2021 I added this file that contains the Covid-19 vaccination progress for Germany as a whole. It details daily doses, broken down cumulatively by manufacturer, as well as the cumulative number of people having received their first and full vaccination. The earliest data are from 2020-12-27.
de_state.*: Geospatial shape files for Germany's 16 federal states. Downloaded via Germany's Federal Agency for Cartography and Geodesy . Specifically, the shape file was obtained from this link.
COVID-19 dataset covid_de.csv:
state: Name of the German federal state. Germany has 16 federal states. I removed converted special characters from the original data.
county: The name of the German Landkreis (LK) or Stadtkreis (SK), which correspond roughly to US counties.
age_group: The COVID-19 data is being reported for 6 age groups: 0-4, 5-14, 15-34, 35-59, 60-79, and above 80 years old. As a shortcut the last category I'm using "80-99", but there might well be persons above 99 years old in this dataset. This column has a few NA entries.
gender: Reported as male (M) or female (F). This column has a few NA entries.
date: The calendar date of when a case or death were reported. There might be delays that will be corrected by retroactively assigning cases to earlier dates.
cases: COVID-19 cases that have been confirmed through laboratory work. This and the following 2 columns are counts per day, not cumulative counts.
deaths: COVID-19 related deaths.
recovered: Recovered cases.
Demographic dataset demographics_de.csv:
state, gender, age_group: same as above. The demographic data is available in higher age resolution, but I have binned it here to match the corresponding age groups in the covid_de.csv file.
population: Population counts for the respective categories. These numbers reflect the (most recent available) estimates on 2018-12-31.
Vaccination progress dataset covid_de_vaccines.csv:
date: calendar date of vaccination
doses, doses_first, doses_second: Daily count of administered doses: total, 1st shot, 2nd shot.
pfizer_cumul, moderna_cumul, astrazeneca_cumul: Daily cumulative number of administered vaccinations by manufacturer.
persons_first_cumul, persons_full_cumul: Daily cumulative number of people having received their 1st shot and full vaccination, respectively.
All the data have been extracted from open data sources which are being gratefully acknowledged:
Facebook
TwitterIdentifying changes in the reproduction number, rate of spread, and doubling time during the course of the COVID-19 outbreak whilst accounting for potential biases due to delays in case reporting both nationally and subnationally in the United States of America. These results are impacted by changes in testing effort, increases and decreases in testing effort will increase and decrease reproduction number estimates respectively.
Facebook
TwitterIdentifying changes in the reproduction number, rate of spread, and doubling time during the course of the COVID-19 outbreak whilst accounting for potential biases due to delays in case reporting both nationally and subnationally in the United Kingdom. These results are impacted by changes in testing effort, increases and decreases in testing effort will increase and decrease reproduction number estimates respectively.
Facebook
TwitterAs global communities responded to COVID-19, we heard from public health officials that the same type of aggregated, anonymized insights we use in products such as Google Maps would be helpful as they made critical decisions to combat COVID-19. These Community Mobility Reports aimed to provide insights into what changed in response to policies aimed at combating COVID-19. The reports charted movement trends over time by geography, across different categories of places such as retail and recreation, groceries and pharmacies, parks, transit stations, workplaces, and residential.
Facebook
TwitterAttribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Facebook
TwitterIdentifying changes in the reproduction number, rate of spread, and doubling time during the course of the COVID-19 outbreak whilst accounting for potential biases due to delays in case reporting both nationally and subnationally in Belgium. These results are impacted by changes in testing effort, increases and decreases in testing effort will increase and decrease reproduction number estimates respectively.
Facebook
TwitterIdentifying changes in the reproduction number, rate of spread, and doubling time during the course of the COVID-19 outbreak whilst accounting for potential biases due to delays in case reporting both nationally and subnationally in Italy. These results are impacted by changes in testing effort, increases and decreases in testing effort will increase and decrease reproduction number estimates respectively.
Facebook
TwitterBy Valtteri Kurkela [source]
The dataset is constantly updated and synced hourly to ensure up-to-date information. With over several columns available for analysis and exploration purposes, users can extract valuable insights from this extensive dataset.
Some of the key metrics covered in the dataset include:
Vaccinations: The dataset covers total vaccinations administered worldwide as well as breakdowns of people vaccinated per hundred people and fully vaccinated individuals per hundred people.
Testing & Positivity: Information on total tests conducted along with new tests conducted per thousand people is provided. Additionally, details on positive rate (percentage of positive Covid-19 tests out of all conducted) are included.
Hospital & ICU: Data on ICU patients and hospital patients are available along with corresponding figures normalized per million people. Weekly admissions to intensive care units and hospitals are also provided.
Confirmed Cases: The number of confirmed Covid-19 cases globally is captured in both absolute numbers as well as normalized values representing cases per million people.
5.Confirmed Deaths: Total confirmed deaths due to Covid-19 worldwide are provided with figures adjusted for population size (total deaths per million).
6.Reproduction Rate: The estimated reproduction rate (R) indicates the contagiousness of the virus within a particular country or region.
7.Policy Responses: Besides healthcare-related metrics, this comprehensive dataset includes policy responses implemented by countries or regions such as lockdown measures or travel restrictions.
8.Other Variables of InterestThe data encompasses various socioeconomic factors that may influence Covid-19 outcomes including population density,membership in a continent,gross domestic product(GDP)per capita;
For demographic factors: -Age Structure : percentage populations aged 65 and older,aged (70)older,median age -Gender-specific factors: Percentage of female smokers -Lifestyle-related factors: Diabetes prevalence rate and extreme poverty rate
- Excess Mortality: The dataset further provides insights into excess mortality rates, indicating the percentage increase in deaths above the expected number based on historical data.
The dataset consists of numerous columns providing specific information for analysis, such as ISO code for countries/regions, location names,and units of measurement for different parameters.
Overall,this dataset serves as a valuable resource for researchers, analysts, and policymakers seeking to explore various aspects related to Covid-19
Introduction:
Understanding the Basic Structure:
- The dataset consists of various columns containing different data related to vaccinations, testing, hospitalization, cases, deaths, policy responses, and other key variables.
- Each row represents data for a specific country or region at a certain point in time.
Selecting Desired Columns:
- Identify the specific columns that are relevant to your analysis or research needs.
- Some important columns include population, total cases, total deaths, new cases per million people, and vaccination-related metrics.
Filtering Data:
- Use filters based on specific conditions such as date ranges or continents to focus on relevant subsets of data.
- This can help you analyze trends over time or compare data between different regions.
Analyzing Vaccination Metrics:
- Explore variables like total_vaccinations, people_vaccinated, and people_fully_vaccinated to assess vaccination coverage in different countries.
- Calculate metrics such as people_vaccinated_per_hundred or total_boosters_per_hundred for standardized comparisons across populations.
Investigating Testing Information:
- Examine columns such as total_tests, new_tests, and tests_per_case to understand testing efforts in various countries.
- Calculate rates like tests_per_case to assess testing efficiency or identify changes in testing strategies over time.
Exploring Hospitalization and ICU Data:
- Analyze variables like hosp_patients, icu_patients, and hospital_beds_per_thousand to understand healthcare systems' strain.
- Calculate rates like icu_patients_per_million or hosp_patients_per_million for cross-country comparisons.
Assessing Covid-19 Cases and Deaths:
- Analyze variables like total_cases, new_ca...
Facebook
TwitterAttribution-NonCommercial 4.0 (CC BY-NC 4.0)https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
The dataset complements the following study: This study looks at the content on Reddit’s COVID-19 community, r/Coronavirus, to capture and understand the main themes and discussions around the global pandemic, and their evolution, over the first year. It is based on 356,690 submissions and 9,413,331 comments corresponding to the period of 20th January 2020 and 31st January 2021. On each of these datasets we carried out analysis based on lexical sentiment and topics generated from unsupervised topic modelling. The study found that negative sentiments show higher ratio in submissions while negative sentiments were of the same ratio as positive ones in the comments. Terms associated more positively or negatively were identified. Upon assessment of the upvotes and downvotes, this study also uncovered contentious topics, particularly “fake” or misleading news. Through topic modelling, 9 distinct topics were identified from submissions while 20 were identified from comments. Overall, this study provides a clear overview on the dominating topics and popular sentiments pertaining the pandemic during the first year. Our methodology provides an invaluable tool for governments and health authorities to obtain a deeper understanding of the dominant public concerns.
Facebook
TwitterAttribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
2020
Facebook
TwitterDeaths involving coronavirus disease 2019 (COVID-19) by month of death, region, age, place of death, and race and Hispanic origin.
Facebook
TwitterAttribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Total cases, deaths, and Farr’s ratios associated with COVID-19 pandemic, per country until April 10, 2020.
Facebook
TwitterAttribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Table 1 - Estimating the basic reproduction number for COVID-19 in Western Europe
Facebook
TwitterAs of March 10, 2023, the death rate from COVID-19 in the state of New York was 397 per 100,000 people. New York is one of the states with the highest number of COVID-19 cases.
Facebook
TwitterOpen Government Licence 3.0http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/
License information was derived automatically
Antibody data, by UK country and age, from the Coronavirus (COVID-19) Infection Survey.
Facebook
TwitterCC0 1.0 Universal Public Domain Dedicationhttps://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Identifying changes in the reproduction number, rate of spread, and doubling time during the course of the COVID-19 outbreak whilst accounting for potential biases due to delays in case reporting both nationally and subnationally in the Russian Federation. These results are impacted by changes in testing effort, increases and decreases in testing effort will increase and decrease reproduction number estimates respectively.
Facebook
TwitterIdentifying changes in the reproduction number, rate of spread, and doubling time during the course of the COVID-19 outbreak whilst accounting for potential biases due to delays in case reporting at the local authority level in the United Kingdom.
Facebook
Twitterhttps://github.com/nytimes/covid-19-data/blob/master/LICENSEhttps://github.com/nytimes/covid-19-data/blob/master/LICENSE
The New York Times is releasing a series of data files with cumulative counts of coronavirus cases in the United States, at the state and county level, over time. We are compiling this time series data from state and local governments and health departments in an attempt to provide a complete record of the ongoing outbreak.
Since the first reported coronavirus case in Washington State on Jan. 21, 2020, The Times has tracked cases of coronavirus in real time as they were identified after testing. Because of the widespread shortage of testing, however, the data is necessarily limited in the picture it presents of the outbreak.
We have used this data to power our maps and reporting tracking the outbreak, and it is now being made available to the public in response to requests from researchers, scientists and government officials who would like access to the data to better understand the outbreak.
The data begins with the first reported coronavirus case in Washington State on Jan. 21, 2020. We will publish regular updates to the data in this repository.