100+ datasets found

Coronavirus (Covid-19) Data in the United States
github.com
openicpsr.org
+2 more
csv
Cite
New York Times, Coronavirus (Covid-19) Data in the United States [Dataset]. https://github.com/nytimes/covid-19-data
Available download formats: csv
Dataset provided by
New York Times
License
https://github.com/nytimes/covid-19-data/blob/master/LICENSE
Description
The New York Times is releasing a series of data files with cumulative counts of coronavirus cases in the United States, at the state and county level, over time. We are compiling this time series data from state and local governments and health departments in an attempt to provide a complete record of the ongoing outbreak.
Since the first reported coronavirus case in Washington State on Jan. 21, 2020, The Times has tracked cases of coronavirus in real time as they were identified after testing. Because of the widespread shortage of testing, however, the data is necessarily limited in the picture it presents of the outbreak.
We have used this data to power our maps and reporting tracking the outbreak, and it is now being made available to the public in response to requests from researchers, scientists and government officials who would like access to the data to better understand the outbreak.
The data begins with the first reported coronavirus case in Washington State on Jan. 21, 2020. We will publish regular updates to the data in this repository.
COVID-19 Trends in Each Country
coronavirus-resources.esri.com
hub.arcgis.com
+2 more
Updated Mar 28, 2020
Cite
Urban Observatory by Esri (2020). COVID-19 Trends in Each Country [Dataset]. https://coronavirus-resources.esri.com/maps/a16bb8b137ba4d8bbe645301b80e5740
Dataset updated
Mar 28, 2020
Dataset authored and provided by
Urban Observatory by Esri
Area covered
Earth
Description
On March 10, 2023, the Johns Hopkins Coronavirus Resource Center ceased its collecting and reporting of global COVID-19 data. For updated cases, deaths, and vaccine data please visit: World Health Organization (WHO). For more information, visit the Johns Hopkins Coronavirus Resource Center.

COVID-19 Trends Methodology

Our goal is to analyze and present daily updates in the form of recent trends within countries, states, or counties during the COVID-19 global pandemic. The data we are analyzing is taken directly from the Johns Hopkins University Coronavirus COVID-19 Global Cases Dashboard, though we expect to be one day behind the dashboard’s live feeds to allow for quality assurance of the data.

DOI: https://doi.org/10.6084/m9.figshare.12552986

3/7/2022 - Adjusted the rate of active cases calculation in the U.S. to reflect the rates of serious and severe cases due to the nearly completely dominant Omicron variant.
6/24/2020 - Expanded Case Rates discussion to include fix on 6/23 for calculating active cases.
6/22/2020 - Added Executive Summary and Subsequent Outbreaks sections.
Revisions on 6/10/2020 based on updated CDC reporting. This affects the estimate of active cases by revising the average duration of cases with hospital stays downward from 30 days to 25 days. The result shifted 76 U.S. counties out of the Epidemic trend into the Spreading trend, with no change for national level trends.
Methodology update on 6/2/2020: This sets the length of the tail of new cases to between 6 and a maximum of 14 days, rather than 21 days as determined by the last 1/3 of cases. This was done to align trends and criteria for them with U.S. CDC guidance. The impact is that areas transition into the Controlled trend sooner, because they no longer bear the burden of new cases from 15-21 days earlier.
Correction on 6/1/2020.
Discussion of our assertion of an abundance of caution in assigning trends in rural counties added 5/7/2020.
Revisions added on 4/30/2020 are highlighted.
Revisions added on 4/23/2020 are highlighted.

Executive Summary

COVID-19 Trends is a methodology for characterizing the current trend for places during the COVID-19 global pandemic. Each day we assign one of five trends (Emergent, Spreading, Epidemic, Controlled, or End Stage) to geographic areas based on the number of new cases, the number of active cases, the total population, and an algorithm (described below) that contextualizes the most recent fourteen days with the overall COVID-19 case history. Currently we analyze the countries of the world and the U.S. counties. The purpose is to give policymakers, citizens, and analysts a fact-based, data-driven sense of the direction each place is currently going. When a place has its initial cases, it is assigned Emergent, and if that place controls the rate of new cases, it can move directly to Controlled, and even to End Stage in a short time. However, if the reporting or measures to curtail spread are not adequate and significant numbers of new cases continue, it is assigned Spreading, and in cases where the spread is clearly uncontrolled, the Epidemic trend.

We analyze the data reported by Johns Hopkins University to produce the trends, and we report the rates of cases, spikes of new cases, the number of days since the last reported case, and the number of deaths. We also make adjustments to the assignments based on population so rural areas are not assigned trends based solely on case rates, which can be quite high relative to local populations.

Two key factors are not consistently known or available and should be taken into consideration with the assigned trend. First is the amount of resources (e.g., hospital beds, physicians, etc.) that are currently available in each area. Second is the number of recoveries, which are often not tested or reported. On the latter, we provide a probable number of active cases based on CDC guidance for the typical duration of mild to severe cases.

Reasons for undertaking this work in March of 2020:

The popular online maps and dashboards show counts of confirmed cases, deaths, and recoveries by country or administrative sub-region. Comparing the counts of one country to another can only provide a basis for comparison during the initial stages of the outbreak when counts were low and the number of local outbreaks in each country was low. By late March 2020, countries with small populations were being left out of the mainstream news because it was not easy to recognize they had high per capita rates of cases (Switzerland, Luxembourg, Iceland, etc.). Additionally, comparing countries that have had confirmed COVID-19 cases for high numbers of days to countries where the outbreak occurred recently is also a poor basis for comparison.

The graphs of confirmed cases and daily increases in cases were fit into a standard size rectangle, though the Y-axis for one country had a maximum value of 50, and for another country 100,000, which potentially misled people interpreting the slope of the curve. Such misleading circumstances affected comparing large population countries to small population countries, or countries with low numbers of cases to China, which had a large count of cases in the early part of the outbreak. These challenges for interpreting and comparing these graphs represent work each reader must do based on their experience and ability. Thus, we felt it would be a service to attempt to automate the thought process experts would use when visually analyzing these graphs, particularly the most recent tail of the graph, and provide readers with a resulting synthesis to characterize the state of the pandemic in that country, state, or county.

The lack of reliable data for confirmed recoveries and therefore active cases. Merely subtracting deaths from total cases to arrive at this figure progressively loses accuracy after two weeks. The reason is that 81% of cases recover after experiencing mild symptoms in 10 to 14 days. Severe cases are 14% and last 15-30 days (based on average days with symptoms of 11 when admitted to hospital plus 12 days median stay, plus one week to include a full range of severely affected people who recover). Critical cases are 5% and last 31-56 days. Sources: U.S. CDC. April 3, 2020 Interim Clinical Guidance for Management of Patients with Confirmed Coronavirus Disease (COVID-19). Accessed online. Initial older guidance was also obtained online. Additionally, many people who recover may not be tested, and many who are may not be tracked due to privacy laws.

Thus, the formula used to compute an estimate of active cases is:
Active Cases = 100% of new cases in past 14 days + 19% from past 15-25 days + 5% from past 26-49 days - total deaths.
On 3/17/2022, the U.S. calculation was adjusted to:
Active Cases = 100% of new cases in past 14 days + 6% from past 15-25 days + 3% from past 26-49 days - total deaths.
Sources: https://www.cdc.gov/mmwr/volumes/71/wr/mm7104e4.htm https://covid.cdc.gov/covid-data-tracker/#variant-proportions
If a new variant arrives and appears to cause higher rates of serious cases, we will roll back this adjustment.

We’ve never been inside a pandemic with the ability to learn of new cases as they are confirmed anywhere in the world. After reviewing epidemiological and pandemic scientific literature, three needs arose.

We need to specify which portions of the pandemic lifecycle this map covers. The World Health Organization (WHO) specifies six phases. The source data for this map begins just after the beginning of Phase 5 (human to human spread) and encompasses Phase 6 (pandemic phase). Phase 6 is only characterized in terms of pre- and post-peak. However, these two phases are after-the-fact analyses and cannot be ascertained during the event. Instead, we describe (below) a series of five trends for Phase 6 of the COVID-19 pandemic.

Choosing terms to describe the five trends was informed by the scientific literature, particularly the use of epidemic, which signifies uncontrolled spread. The five trends are: Emergent, Spreading, Epidemic, Controlled, and End Stage. Not every locale will experience all five, but all will experience at least three: Emergent, Controlled, and End Stage.

This layer presents the current trends for the COVID-19 pandemic by country (or appropriate level). There are five trends:
Emergent: Early stages of outbreak.
Spreading: Early stages and, depending on an administrative area’s capacity, this may represent a manageable rate of spread.
Epidemic: Uncontrolled spread.
Controlled: Very low levels of new cases.
End Stage: No new cases.

These trends can be applied at several levels of administration:
Local: Ex., City, District or County – a.k.a. Admin level 2
State: Ex., State or Province – a.k.a. Admin level 1
National: Country – a.k.a. Admin level 0
We recommend that at least 100,000 persons be represented by a unit; granted, this may not be possible, and then the case rate per 100,000 will become more important.

Key Concepts and Basis for Methodology:

10 Total Cases minimum threshold: Empirically, there must be enough cases to constitute an outbreak. Ideally, this would be 5.0 per 100,000, but not every area has a population of 100,000 or more. Ten, or fewer, cases are also relatively less difficult to track and trace to sources.

21 Days of Cases minimum threshold: Empirically based on COVID-19 and would need to be adjusted for any other event. 21 days is also the minimum threshold for analyzing the “tail” of the new cases curve, providing seven cases as the basis for a likely trend (note that 21 days in the tail is preferred). This is the minimum needed to encompass the onset and duration of a normal case (5-7 days plus 10-14 days). Specifically, a median of 5.1 days incubation time, and 11.2 days for 97.5% of cases to incubate. This is also driven by pressure to understand trends and could easily be adjusted to 28 days. Source
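
The active-case estimate stated in the methodology above can be illustrated with a short sketch. This is not the authors' code: the function name, the argument names, and the choice of a plain list of daily new-case counts (oldest first) are assumptions made for illustration. The window boundaries (14 days, 15-25 days, 26-49 days) and the weights (19% and 5%, or 6% and 3% for the adjusted U.S. calculation) are taken from the description, and total deaths are subtracted exactly as the formula is written there.

```python
def estimate_active_cases(daily_new_cases, total_deaths,
                          mid_weight=0.19, late_weight=0.05):
    """Estimate active cases from daily new-case counts.

    daily_new_cases: list of daily new cases, oldest first (hypothetical layout).
    total_deaths: total deaths, subtracted as in the stated formula.
    mid_weight / late_weight: shares of cases from 15-25 and 26-49 days ago
    that are assumed to still be active.
    """
    # Reorder so index 0 is the most recent day.
    recent_first = list(reversed(daily_new_cases))

    past_14 = sum(recent_first[0:14])      # days 1-14: counted at 100%
    days_15_25 = sum(recent_first[14:25])  # days 15-25: weighted by mid_weight
    days_26_49 = sum(recent_first[25:49])  # days 26-49: weighted by late_weight

    return (past_14
            + mid_weight * days_15_25
            + late_weight * days_26_49
            - total_deaths)


# Example with made-up numbers: 10 new cases every day for 49 days, 5 deaths.
series = [10] * 49
original = estimate_active_cases(series, total_deaths=5)
us_adjusted = estimate_active_cases(series, total_deaths=5,
                                    mid_weight=0.06, late_weight=0.03)
```

Passing the weights as parameters keeps the original and the adjusted U.S. calculation in one function. Series shorter than 49 days simply contribute fewer days to the later windows.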
The COVID Tracking Project
covidtracking.com
google sheets
+ more versions
Cite
The COVID Tracking Project [Dataset]. https://covidtracking.com/
Available download formats: google sheets
Description
The COVID Tracking Project collects information from 50 US states, the District of Columbia, and 5 other US territories to provide the most comprehensive testing data we can collect for the novel coronavirus, SARS-CoV-2. We attempt to include positive and negative results, pending tests, and total people tested for each state or district currently reporting that data.
Testing is a crucial part of any public health response, and sharing test data is essential to understanding this outbreak. The CDC is currently not publishing complete testing data, so we’re doing our best to collect it from each state and provide it to the public. The information is patchy and inconsistent, so we’re being transparent about what we find and how we handle it—the spreadsheet includes our live comments about changing data and how we’re working with incomplete information.
From here, you can also learn about our methodology, see who makes this, and find out what information states provide and how we handle it.
Number of U.S. COVID-19 cases from Jan. 20, 2020 - Nov. 11, 2022, by week
statista.com
Updated Nov 17, 2022
Cite
Statista (2022). Number of U.S. COVID-19 cases from Jan. 20, 2020 - Nov. 11, 2022, by week [Dataset]. https://www.statista.com/statistics/1102816/coronavirus-covid19-cases-number-us-americans-by-day/
Dataset updated
Nov 17, 2022
Dataset authored and provided by
Statista (http://statista.com/)
Time period covered
Jan 20, 2020 - Nov 11, 2022
Area covered
United States
Description
Around 282 thousand new cases of COVID-19 were reported in the United States during the week ending November 11, 2022. Between January 20, 2020 and November 11, 2022 there had been around 96.8 million confirmed cases of COVID-19 with over one million deaths in the U.S. as reported by the World Health Organization.

How did the coronavirus outbreak start?

Pneumonia cases with an unknown cause were first reported in the Hubei province of China at the end of December 2019. Patients described symptoms including a fever and difficulty breathing, and early reports suggested no evidence of human-to-human transmission. We now know that a novel coronavirus named SARS-CoV-2 is causing the disease COVID-19. The outbreak has been characterized as a pandemic and the virus continues to spread from person to person – there have been around 642 million cases worldwide as of November 17, 2022.

The importance of isolation and quarantine

In an effort to contain the early spread of the virus, China tightened travel restrictions and enforced isolation measures in the hardest-hit areas. The World Health Organization endorsed this strategy, and countries around the world implemented similar quarantine measures. Staying at home can limit the spread of the virus, and this applies to individuals who are only showing mild symptoms or none at all. Asymptomatic carriers of the virus – those who are experiencing no symptoms – may transmit the virus to people who are at a higher risk of getting very sick.
Coronavirus COVID-19 Global Cases by the Center for Systems Science and...
github.com
systems.jhu.edu
+1 more
Cite
Johns Hopkins University Center for Systems Science and Engineering (JHU CSSE), Coronavirus COVID-19 Global Cases by the Center for Systems Science and Engineering (CSSE) at Johns Hopkins University (JHU) [Dataset]. https://github.com/CSSEGISandData/COVID-19
Dataset provided by
Johns Hopkins University Center for Systems Science and Engineering (JHU CSSE)
Area covered
Global
Description
2019 Novel Coronavirus COVID-19 (2019-nCoV) Visual Dashboard and Map:
https://www.arcgis.com/apps/opsdashboard/index.html#/bda7594740fd40299423467b48e9ecf6
Confirmed Cases by Country/Region/Sovereignty
Confirmed Cases by Province/State/Dependency
Deaths
Recovered
Downloadable data:
https://github.com/CSSEGISandData/COVID-19
Additional Information about the Visual Dashboard:
https://systems.jhu.edu/research/public-health/ncov
Increase in medical app downloads during peak of COVID-19 crisis by country...
statista.com
Updated Oct 22, 2020
Cite
Statista (2020). Increase in medical app downloads during peak of COVID-19 crisis by country 2020 [Dataset]. https://www.statista.com/statistics/1181413/medical-app-downloads-growth-during-covid-pandemic-by-country/
Dataset updated
Oct 22, 2020
Dataset authored and provided by
Statista (http://statista.com/)
Time period covered
Jan 2020 - Jul 2020
Area covered
Worldwide
Description
This statistic illustrates the growth in the number of medical apps downloaded in January 2020 compared to the 'peak' month for the COVID-19 crisis in each respective country. South Korea had the highest growth, with a 135 percent increase in such downloads comparing its peak month of the pandemic with January.
Data from: COVID-19 prevalence and predictors in United States adults during...
search.dataone.org
data.niaid.nih.gov
+1 more
Updated Apr 30, 2025
Cite
Robert Morlock (2025). COVID-19 prevalence and predictors in United States adults during peak stay-at-home orders [Dataset]. http://doi.org/10.5061/dryad.2547d7wpq
Unique identifier
https://doi.org/10.5061/dryad.2547d7wpq
Dataset updated
Apr 30, 2025
Dataset provided by
Dryad Digital Repository
Authors
Robert Morlock
Time period covered
Jan 1, 2021
Area covered
United States
Description
This was a cross-sectional nationwide survey of adults in the US conducted between April 24 and May 13, 2020. The survey targeted a representative sample of approximately 5,000 respondents. The rate of COVID-19 cases and testing, most frequently reported symptoms, symptom severity, treatment received, impact of COVID-19 on mental and physical health, and factors predictive of testing positive were assessed.
COVID-19 reporting
mass.gov
Updated Dec 4, 2023
Cite
Executive Office of Health and Human Services (2023). COVID-19 reporting [Dataset]. https://www.mass.gov/info-details/covid-19-reporting
Dataset updated
Dec 4, 2023
Dataset provided by
Executive Office of Health and Human Services
Department of Public Health
Area covered
Massachusetts
Description
The COVID-19 dashboard includes data on city/town COVID-19 activity, confirmed and probable cases of COVID-19, confirmed and probable deaths related to COVID-19, and the demographic characteristics of cases and deaths.
Infections averted in the general population with 5-day testing and one-time...
figshare.com
xls
Updated Jun 4, 2023
Cite
Lauren E. Cipriano; Wael M. R. Haddara; Gregory S. Zaric; Eva A. Enns (2023). Infections averted in the general population with 5-day testing and one-time testing of students compared to a policy of no routine asymptomatic testing (symptom-based surveillance and contact tracing only). [Dataset]. http://doi.org/10.1371/journal.pone.0255782.t002
Available download formats: xls
Unique identifier
https://doi.org/10.1371/journal.pone.0255782.t002
Dataset updated
Jun 4, 2023
Dataset provided by
PLOS ONE
Authors
Lauren E. Cipriano; Wael M. R. Haddara; Gregory S. Zaric; Eva A. Enns
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Infections averted in the general population with 5-day testing and one-time testing of students compared to a policy of no routine asymptomatic testing (symptom-based surveillance and contact tracing only).
Average number of daily new infections in the simulation for the first 226...
plos.figshare.com
xlsx
Updated Jun 4, 2023
+ more versions
Cite
Pinar Keskinocak; Buse Eylul Oruc; Arden Baxter; John Asplund; Nicoleta Serban (2023). Average number of daily new infections in the simulation for the first 226 days. [Dataset]. http://doi.org/10.1371/journal.pone.0239798.s018
Available download formats: xlsx
Unique identifier
https://doi.org/10.1371/journal.pone.0239798.s018
Dataset updated
Jun 4, 2023
Dataset provided by
PLOS (http://plos.org/)
Authors
Pinar Keskinocak; Buse Eylul Oruc; Arden Baxter; John Asplund; Nicoleta Serban
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Data values are averages from the 30 runs. The number of children, adults, elderly and total number of people infected on a given day in the simulation is recorded. Since the simulation is based on the population of Georgia, and each entity in the simulation represents ten people, all data values recorded are based on one-tenth of the population of Georgia, that is, there are a total of roughly one million people in the simulation. (XLSX)
COVID-19 Trends in Each Country-Copy
open-data-pittsylvania.hub.arcgis.com
Updated Jun 4, 2020
+ more versions
Cite
United Nations Population Fund (2020). COVID-19 Trends in Each Country-Copy [Dataset]. https://open-data-pittsylvania.hub.arcgis.com/datasets/UNFPAPDP::covid-19-trends-in-each-country-copy
Dataset updated
Jun 4, 2020
Dataset authored and provided by
United Nations Population Fund
Area covered
Description
COVID-19 Trends Methodology

Our goal is to analyze and present daily updates in the form of recent trends within countries, states, or counties during the COVID-19 global pandemic. The data we are analyzing is taken directly from the Johns Hopkins University Coronavirus COVID-19 Global Cases Dashboard, though we expect to be one day behind the dashboard’s live feeds to allow for quality assurance of the data.

Revisions added on 4/23/2020 are highlighted.
Revisions added on 4/30/2020 are highlighted.
Discussion of our assertion of an abundance of caution in assigning trends in rural counties added 5/7/2020.
Correction on 6/1/2020.
Methodology update on 6/2/2020: This sets the length of the tail of new cases to between 6 and a maximum of 14 days, rather than 21 days as determined by the last 1/3 of cases. This was done to align trends and criteria for them with U.S. CDC guidance. The impact is that areas transition into the Controlled trend sooner, because they no longer bear the burden of new cases from 15-21 days earlier.

Reasons for undertaking this work:

The popular online maps and dashboards show counts of confirmed cases, deaths, and recoveries by country or administrative sub-region. Comparing the counts of one country to another can only provide a basis for comparison during the initial stages of the outbreak when counts were low and the number of local outbreaks in each country was low. By late March 2020, countries with small populations were being left out of the mainstream news because it was not easy to recognize they had high per capita rates of cases (Switzerland, Luxembourg, Iceland, etc.). Additionally, comparing countries that have had confirmed COVID-19 cases for high numbers of days to countries where the outbreak occurred recently is also a poor basis for comparison.

The graphs of confirmed cases and daily increases in cases were fit into a standard size rectangle, though the Y-axis for one country had a maximum value of 50, and for another country 100,000, which potentially misled people interpreting the slope of the curve. Such misleading circumstances affected comparing large population countries to small population countries, or countries with low numbers of cases to China, which had a large count of cases in the early part of the outbreak. These challenges for interpreting and comparing these graphs represent work each reader must do based on their experience and ability. Thus, we felt it would be a service to attempt to automate the thought process experts would use when visually analyzing these graphs, particularly the most recent tail of the graph, and provide readers with a resulting synthesis to characterize the state of the pandemic in that country, state, or county.

The lack of reliable data for confirmed recoveries and therefore active cases. Merely subtracting deaths from total cases to arrive at this figure progressively loses accuracy after two weeks. The reason is that 81% of cases recover after experiencing mild symptoms in 10 to 14 days. Severe cases are 14% and last 15-30 days (based on average days with symptoms of 11 when admitted to hospital plus 12 days median stay, plus one week to include a full range of severely affected people who recover). Critical cases are 5% and last 31-56 days. Sources: U.S. CDC. April 3, 2020 Interim Clinical Guidance for Management of Patients with Confirmed Coronavirus Disease (COVID-19). Accessed online. Initial older guidance was also obtained online. Additionally, many people who recover may not be tested, and many who are may not be tracked due to privacy laws.

Thus, the formula used to compute an estimate of active cases is:
Active Cases = 100% of new cases in past 14 days + 19% from past 15-30 days + 5% from past 31-56 days - total deaths.

We’ve never been inside a pandemic with the ability to learn of new cases as they are confirmed anywhere in the world. After reviewing epidemiological and pandemic scientific literature, three needs arose.

We need to specify which portions of the pandemic lifecycle this map covers. The World Health Organization (WHO) specifies six phases. The source data for this map begins just after the beginning of Phase 5 (human to human spread) and encompasses Phase 6 (pandemic phase). Phase 6 is only characterized in terms of pre- and post-peak. However, these two phases are after-the-fact analyses and cannot be ascertained during the event. Instead, we describe (below) a series of five trends for Phase 6 of the COVID-19 pandemic.

Choosing terms to describe the five trends was informed by the scientific literature, particularly the use of epidemic, which signifies uncontrolled spread. The five trends are: Emergent, Spreading, Epidemic, Controlled, and End Stage. Not every locale will experience all five, but all will experience at least three: Emergent, Controlled, and End Stage.

This layer presents the current trends for the COVID-19 pandemic by country (or appropriate level). There are five trends:
Emergent: Early stages of outbreak.
Spreading: Early stages and, depending on an administrative area’s capacity, this may represent a manageable rate of spread.
Epidemic: Uncontrolled spread.
Controlled: Very low levels of new cases.
End Stage: No new cases.

These trends can be applied at several levels of administration:
Local: Ex., City, District or County – a.k.a. Admin level 2
State: Ex., State or Province – a.k.a. Admin level 1
National: Country – a.k.a. Admin level 0
We recommend that at least 100,000 persons be represented by a unit; granted, this may not be possible, and then the case rate per 100,000 will become more important.

Key Concepts and Basis for Methodology:

10 Total Cases minimum threshold: Empirically, there must be enough cases to constitute an outbreak. Ideally, this would be 5.0 per 100,000, but not every area has a population of 100,000 or more. Ten, or fewer, cases are also relatively less difficult to track and trace to sources.

21 Days of Cases minimum threshold: Empirically based on COVID-19 and would need to be adjusted for any other event. 21 days is also the minimum threshold for analyzing the “tail” of the new cases curve, providing seven cases as the basis for a likely trend (note that 21 days in the tail is preferred). This is the minimum needed to encompass the onset and duration of a normal case (5-7 days plus 10-14 days). Specifically, a median of 5.1 days incubation time, and 11.2 days for 97.5% of cases to incubate. This is also driven by pressure to understand trends and could easily be adjusted to 28 days. Source used as basis: Stephen A. Lauer, MS, PhD *; Kyra H. Grantz, BA *; Qifang Bi, MHS; Forrest K. Jones, MPH; Qulu Zheng, MHS; Hannah R. Meredith, PhD; Andrew S. Azman, PhD; Nicholas G. Reich, PhD; Justin Lessler, PhD. 2020. The Incubation Period of Coronavirus Disease 2019 (COVID-19) From Publicly Reported Confirmed Cases: Estimation and Application. Annals of Internal Medicine. DOI: 10.7326/M20-0504.

New Cases per Day (NCD): Measures the daily spread of COVID-19. This is the basis for all rates.

Back-casting revisions: The Johns Hopkins data provides the cumulative number of cases per day, which presumes an ever-increasing sequence of numbers, e.g., 0,0,1,1,2,5,7,7,7, etc. However, revisions do occur and would look like 0,0,1,1,2,5,7,7,6. To accommodate this, we revised the lists to eliminate decreases, which makes this list look like 0,0,1,1,2,5,6,6,6.

Reporting Interval: In the early weeks, Johns Hopkins' data provided reporting every day regardless of change. In late April, this changed, allowing for days to be skipped if no new data was available. The day was still included, but the value of total cases was set to Null. The processing therefore was updated to include tracking of the spacing between intervals with valid values.

100 New Cases in a day as a spike threshold: Empirically, this is based on COVID-19’s rate of spread, or r0 of ~2.5, which indicates each case will infect between two and three other people. There is a point at which each administrative area’s capacity will not have the resources to trace and account for all contacts of each patient. Thus, this is an indicator of uncontrolled or epidemic trend. Spiking activity in combination with the rate of new cases is the basis for determining whether an area has a spreading or epidemic trend (see below). Source used as basis: World Health Organization (WHO). 16-24 Feb 2020. Report of the WHO-China Joint Mission on Coronavirus Disease 2019 (COVID-19). Obtained online.

Mean of Recent Tail of NCD: Empirical, and a COVID-19-specific basis for establishing a recent trend. The recent mean of NCD is taken from the most recent fourteen days. A minimum of 21 days of cases is required for analysis but cannot be considered reliable. Thus, a preference of 42 days of cases ensures much higher reliability. This analysis is not explanatory and thus merely represents a likely trend. The tail is analyzed for the following:

Most recent 2 days: In terms of likelihood, this does not mean much, but can indicate a reason for hope and a basis to share positive change that is not yet a trend. There are two worthwhile indicators:
Last 2 days count of new cases is less than any in either the past five or 14 days.
Past 2 days has only one or fewer new cases – this is an extremely positive outcome if the rate of testing has continued at the same rate as the previous 5 days or 14 days.

Most recent 5 days: In terms of likelihood, this is more meaningful, as it does represent a short-term trend. There are five worthwhile indicators:
Past five days is greater than past 2 days and past 14 days indicates the potential of the past 2 days being an aberration.
Past five days is greater than past 14 days and less than past 2 days indicates slight positive trend, but likely still within peak trend time frame.
Past five days is less than the past 14 days. This means a downward trend. This would be an
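
The back-casting revision and the New Cases per Day (NCD) measure described in the methodology above can be sketched as follows. The description gives only the before and after example (0,0,1,1,2,5,7,7,6 becoming 0,0,1,1,2,5,6,6,6), so the approach here, a running minimum taken from the most recent day backwards, is an assumption that reproduces that example rather than the authors' actual procedure. The function names are hypothetical, and days with Null values are not handled.

```python
def remove_decreases(cumulative):
    """Return a copy of a cumulative series with decreases eliminated.

    Walks from the most recent day backwards and lowers any earlier value
    that exceeds the value after it (a reverse running minimum).
    """
    corrected = list(cumulative)
    for i in range(len(corrected) - 2, -1, -1):
        if corrected[i] > corrected[i + 1]:
            corrected[i] = corrected[i + 1]
    return corrected


def new_cases_per_day(cumulative):
    """Daily new cases as day-over-day differences of the corrected series."""
    corrected = remove_decreases(cumulative)
    return [later - earlier for earlier, later in zip(corrected, corrected[1:])]


# The revised series from the description.
reported = [0, 0, 1, 1, 2, 5, 7, 7, 6]
corrected = remove_decreases(reported)  # [0, 0, 1, 1, 2, 5, 6, 6, 6]
ncd = new_cases_per_day(reported)       # [0, 1, 0, 1, 3, 1, 0, 0]
```

Correcting the cumulative series first guarantees that no daily count is negative, which the later rate and spike calculations rely on.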
Daily new COVID-19 confirmed cases Australia Mar-Sep 2020
statista.com
Updated Sep 15, 2022
Cite
Statista (2022). Daily new COVID-19 confirmed cases Australia Mar-Sep 2020 [Dataset]. https://www.statista.com/statistics/1113327/australia-covid-19-new-confirmed-cases/
Dataset updated
Sep 15, 2022
Dataset authored and provided by
Statista (http://statista.com/)
Time period covered
Mar 1, 2020 - Sep 30, 2020
Area covered
Australia
Description
On September 30, 2020, there were 17 new reported confirmed cases of COVID-19 in Australia. Australia's daily new confirmed coronavirus cases peaked on July 30 with 746 new cases on that day. This was considered to be the second wave of coronavirus infections in Australia, with the first wave peaking at the end of March at 460 cases before dropping to fewer than 20 cases per day throughout May and most of June.

 A second wave

Australia’s second wave of coronavirus found its epicenter in Melbourne, after over a month of recording low numbers of national daily cases. Despite being primarily focused within a single state, clusters of coronavirus cases in Victoria soon pushed the daily number of recorded cases over that of the first wave, with well over double the number of deaths. As a result, the Victorian Government once again increased lockdown measures to limit movement and social interaction. At the same time, the other states and territories closed or restricted movement across borders, with some of the strictest border closures taking place in Western Australia.

 Is Australia entering into a recession?

After narrowly avoiding a recession during the global financial crisis, by September 2020 Australia had recorded two consecutive quarters of economic decline, marking the country’s first recession since 1991. This did not necessarily come as a surprise for many Australians who had already witnessed a rising unemployment rate throughout the second quarter of 2020 alongside ongoing restrictions on retail and hospitality trading. However, thanks to welfare initiatives like JobKeeper and a government stimulus payment supplementing many household incomes, the economic situation could have been much worse at this point.
Number of coronavirus (COVID-19) cases in New York as of Dec. 16, 2022, by...
statista.com
Updated Dec 26, 2022
Cite
Statista (2022). Number of coronavirus (COVID-19) cases in New York as of Dec. 16, 2022, by county [Dataset]. https://www.statista.com/statistics/1109360/coronavirus-covid19-cases-number-new-york-by-county/
Explore at:
Dataset updated
Dec 26, 2022
Dataset authored and provided by
Statista (http://statista.com/)
Area covered
New York
Description
As of December 16, 2022, there had been almost 6.37 million COVID-19 cases in New York State, with 2.97 million cases found in New York City. New York has been one of the U.S. states most impacted by the pandemic, recording the highest number of deaths in the country.

 A closer look at the outbreak in New York

Towards the middle of December 2022, the number of deaths due to the coronavirus in New York State had reached almost 60 thousand, and almost half of those deaths were in New York City. However, the number of new daily deaths in New York City peaked early in the pandemic and although there have been times when the number of new daily deaths surged, they have not gotten close to reaching the levels seen at the beginning of the pandemic. New York City is made up of five counties, which are more commonly known by their borough names – Staten Island is the borough with the highest rate of COVID-19 cases.
COVID-19 NE Dataset
data.mendeley.com
narcis.nl
Updated Aug 18, 2020
Cite
Joel Mintz (2020). COVID-19 NE Dataset [Dataset]. http://doi.org/10.17632/42wzh29xrp.2
Explore at:
Unique identifier
https://doi.org/10.17632/42wzh29xrp.2
Dataset updated
Aug 18, 2020
Authors
Joel Mintz
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
COVID-19 Dataset for Correlation Between Early Government Interventions in the Northeastern United States and Peak COVID-19 Disease Burden by Joel Mintz.
File Type: Excel
Contents:
Tab 1 ("Raw"): Raw data as downloaded directly from the COVID Tracking Project, sorted by date
Tabs 2-14 ("State Name"): Data sorted by state
Tabs 2-14 headers (columns marked * are provided in the original raw data):
Column 1: Population per state, as recorded by the latest American Community Survey, and maximum (peak) COVID-19 outcome, with the date on which the outcome occurred
Column 2: Date on which numbers were recorded*
Column 3: State name*
Column 4: Number of reported positive COVID-19 tests*
Column 5: Number of reported negative COVID-19 tests*
Column 6: Pending COVID-19 tests*
Column 7: Currently hospitalized*
Column 8: Cumulatively hospitalized*
Column 9: Currently in ICU*
Column 10: Cumulatively in ICU*
Column 11: Currently on ventilator support*
Column 12: Cumulatively on ventilator support*
Column 13: Total recovered*
Column 14: Cumulative mortality*
Column 15: Total tests administered (Column 4 + Column 5)
Column 16: Placeholder
Column 17: % of total population tested
Column 18: New cases per day
Column 19: Change in new cases per day
Column 20: Positive cases per day per capita, per 100,000 population (Column 18 / total population * 100000)
Column 21: Change in positive cases per day per capita, per 100,000 population (Column 19 / total population * 100000)
Column 22: Hospitalizations per day per capita, per 100,000 population
Column 23: Change in hospitalizations per day per capita, per 100,000 population
Column 24: Deaths per day per capita, per 100,000 population
Column 25: Change in deaths per day per capita, per 100,000 population
Columns 26-31: Columns 20-25 with an applied 5-day moving average filter
Column 32: Adjusted hospitalization (subtract number of hospitalizations from the initial number of hospitalizations where reporting began)
Column 33: Adjusted hospitalizations per day per capita
Column 34: Adjusted hospitalizations per day per capita, with an applied 5-day moving average filter
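The derived columns above (18, 20, and 26-31) can be illustrated with a minimal Python sketch. It assumes that new cases per day are the day-over-day difference of the cumulative positive count and that the 5-day filter is a trailing average; the dataset description does not state either detail, and the function names are hypothetical.

```python
from typing import List, Optional


def daily_new(cumulative: List[float]) -> List[float]:
    """Day-over-day difference of a cumulative series (assumed basis for Column 18)."""
    return [cumulative[i] - cumulative[i - 1] for i in range(1, len(cumulative))]


def per_100k(daily_values: List[float], population: int) -> List[float]:
    """Column 20 style rate: value / total population * 100000."""
    return [v / population * 100_000 for v in daily_values]


def moving_average(values: List[float], window: int = 5) -> List[Optional[float]]:
    """Trailing moving average (assumption); None until a full window is available."""
    out: List[Optional[float]] = []
    for i in range(len(values)):
        if i + 1 < window:
            out.append(None)
        else:
            chunk = values[i + 1 - window : i + 1]
            out.append(sum(chunk) / window)
    return out
```

A centered window would shift the smoothed values by two days; which variant the author used is not stated, so results should be checked against Columns 26-31 before relying on this.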
COVID-19 Reported Patient Impact and Hospital Capacity by Facility
healthdata.gov
data.ct.gov
+5more
Updated May 3, 2024
Cite
U.S. Department of Health & Human Services (2024). COVID-19 Reported Patient Impact and Hospital Capacity by Facility [Dataset]. https://healthdata.gov/Hospital/COVID-19-Reported-Patient-Impact-and-Hospital-Capa/anag-cw7u
Explore at:
Available download formats: tsv, application/rssxml, csv, xml, application/rdfxml, application/geo+json, kmz, kml
Dataset updated
May 3, 2024
Dataset provided by
United States Department of Health and Human Services (http://www.hhs.gov/)
Authors
U.S. Department of Health & Human Services
License
https://www.usa.gov/government-works
Description
After May 3, 2024, this dataset and webpage will no longer be updated because hospitals are no longer required to report data on COVID-19 hospital admissions, and hospital capacity and occupancy data, to HHS through CDC’s National Healthcare Safety Network. Data voluntarily reported to NHSN after May 1, 2024, will be available starting May 10, 2024, at COVID Data Tracker Hospitalizations.

The following dataset provides facility-level data for hospital utilization aggregated on a weekly basis (Sunday to Saturday). These are derived from reports with facility-level granularity across two main sources: (1) HHS TeleTracking, and (2) reporting provided directly to HHS Protect by state/territorial health departments on behalf of their healthcare facilities.

The hospital population includes all hospitals registered with Centers for Medicare & Medicaid Services (CMS) as of June 1, 2020. It includes non-CMS hospitals that have reported since July 15, 2020. It does not include psychiatric, rehabilitation, Indian Health Service (IHS) facilities, U.S. Department of Veterans Affairs (VA) facilities, Defense Health Agency (DHA) facilities, and religious non-medical facilities.

For a given entry, the term “collection_week” signifies the start of the period that is aggregated. For example, a “collection_week” of 2020-11-15 means the average/sum/coverage of the elements captured from that given facility starting and including Sunday, November 15, 2020, and ending and including reports for Saturday, November 21, 2020.
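
As a small illustration of the paragraph above, the period covered by a “collection_week” value can be computed by adding six days to the start date. This is a sketch only: the function name is hypothetical, and it assumes the value is an ISO date string as in the example given.

```python
from datetime import date, timedelta
from typing import Tuple


def collection_week_range(collection_week: str) -> Tuple[date, date]:
    """Return the first and last day (both inclusive) of the aggregated period."""
    start = date.fromisoformat(collection_week)
    # Seven days in total: the start day plus the six days that follow.
    return start, start + timedelta(days=6)


start, end = collection_week_range("2020-11-15")
print(start.isoformat(), end.isoformat())
```

For the example in the text, this yields Sunday, November 15, 2020 through Saturday, November 21, 2020.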

Reported elements include an append of either “_coverage”, “_sum”, or “_avg”.

A “_coverage” append denotes how many times the facility reported that element during that collection week.

A “_sum” append denotes the sum of the reports provided for that facility for that element during that collection week.

A “_avg” append is the average of the reports provided for that facility for that element during that collection week.

The file will be updated weekly. No statistical analysis is applied to impute non-response. For averages, calculations are based on the number of values collected for a given hospital in that collection week. Suppression is applied to the file for sums and averages less than four (4). In these cases, the field will be replaced with “-999,999”.
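
The suffix convention and the suppression rule described above might be handled as in the following sketch. The helper names are hypothetical, and it is an assumption that the suppressed value may appear either as the text “-999,999” or as the number -999999 depending on the download format.

```python
from typing import Optional, Tuple

SUFFIXES = ("_coverage", "_sum", "_avg")
SUPPRESSED = -999999.0  # assumed numeric form of the "-999,999" placeholder


def split_field(name: str) -> Tuple[str, Optional[str]]:
    """Split a field name into (element, kind), where kind is coverage, sum, or avg."""
    for suffix in SUFFIXES:
        if name.endswith(suffix):
            return name[: -len(suffix)], suffix[1:]
    return name, None  # fields such as hhs_ids carry no aggregation suffix


def parse_value(raw: str) -> Optional[float]:
    """Parse a cell, mapping empty and suppressed cells to None."""
    text = raw.strip().replace(",", "")
    if text == "":
        return None
    value = float(text)
    return None if value == SUPPRESSED else value
```

Mapping suppressed cells to None rather than to zero keeps them out of sums and averages, which matters because a suppressed cell stands for a small non-reported value, not for an absence of patients.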

A story page was created to display both corrected and raw datasets and can be accessed at this link: https://healthdata.gov/stories/s/nhgk-5gpv

This data is preliminary and subject to change as more data become available. Data is available starting on July 31, 2020.

Sometimes, reports for a given facility will be provided to both HHS TeleTracking and HHS Protect. When this occurs, to ensure that there are not duplicate reports, deduplication is applied according to prioritization rules within HHS Protect.

For influenza fields listed in the file, the current HHS guidance marks these fields as optional. As a result, coverage of these elements varies.

For recent updates to the dataset, scroll to the bottom of the dataset description.

On May 3, 2021, the following fields have been added to this data set.
hhs_ids
previous_day_admission_adult_covid_confirmed_7_day_coverage
previous_day_admission_pediatric_covid_confirmed_7_day_coverage
previous_day_admission_adult_covid_suspected_7_day_coverage
previous_day_admission_pediatric_covid_suspected_7_day_coverage
previous_week_personnel_covid_vaccinated_doses_administered_7_day_sum
total_personnel_covid_vaccinated_doses_none_7_day_sum
total_personnel_covid_vaccinated_doses_one_7_day_sum
total_personnel_covid_vaccinated_doses_all_7_day_sum
previous_week_patients_covid_vaccinated_doses_one_7_day_sum
previous_week_patients_covid_vaccinated_doses_all_7_day_sum

On May 8, 2021, this data set was converted to a corrected data set. The corrections applied to this data set smooth out data anomalies caused by keyed-in data errors. To help determine which records have had corrections made to them, an additional Boolean field called is_corrected has been added.

On May 13, 2021, vaccination fields were changed from sum to max or min fields. This reflects the maximum or minimum number reported for that metric in a given week.

On June 7, 2021, vaccination fields were changed from max or min fields to Wednesday reported only. This reflects that the number reported for that metric is only reported on Wednesdays in a given week.

On September 20, 2021, the following has been updated: The use of analytic dataset as a source.

On January 19, 2022, the following fields have been added to this dataset:

inpatient_beds_used_covid_7_day_avg
inpatient_beds_used_covid_7_day_sum
inpatient_beds_used_covid_7_day_coverage

On April 28, 2022, the following pediatric fields have been added to this dataset:

all_pediatric_inpatient_bed_occupied_7_day_avg
all_pediatric_inpatient_bed_occupied_7_day_coverage
all_pediatric_inpatient_bed_occupied_7_day_sum
all_pediatric_inpatient_beds_7_day_avg
all_pediatric_inpatient_beds_7_day_coverage
all_pediatric_inpatient_beds_7_day_sum
previous_day_admission_pediatric_covid_confirmed_0_4_7_day_sum
previous_day_admission_pediatric_covid_confirmed_12_17_7_day_sum
previous_day_admission_pediatric_covid_confirmed_5_11_7_day_sum
previous_day_admission_pediatric_covid_confirmed_unknown_7_day_sum
staffed_icu_pediatric_patients_confirmed_covid_7_day_avg
staffed_icu_pediatric_patients_confirmed_covid_7_day_coverage
staffed_icu_pediatric_patients_confirmed_covid_7_day_sum
staffed_pediatric_icu_bed_occupancy_7_day_avg
staffed_pediatric_icu_bed_occupancy_7_day_coverage
staffed_pediatric_icu_bed_occupancy_7_day_sum
total_staffed_pediatric_icu_beds_7_day_avg
total_staffed_pediatric_icu_beds_7_day_coverage
total_staffed_pediatric_icu_beds_7_day_sum

On October 24, 2022, the data includes more analytical calculations in efforts to provide a cleaner dataset. For a raw version of this dataset, please follow this link: https://healthdata.gov/Hospital/COVID-19-Reported-Patient-Impact-and-Hospital-Capa/uqq2-txqb

Due to changes in reporting requirements, after June 19, 2023, a collection week is defined as starting on a Sunday and ending on the next Saturday.
Covid-19 Protein Report
datainsightsmarket.com
doc, pdf, ppt
Updated May 28, 2025
Cite
Data Insights Market (2025). Covid-19 Protein Report [Dataset]. https://www.datainsightsmarket.com/reports/covid-19-protein-1818961
Explore at:
Available download formats: doc, pdf, ppt
Dataset updated
May 28, 2025
Dataset authored and provided by
Data Insights Market
License
https://www.datainsightsmarket.com/privacy-policy
Time period covered
2025 - 2033
Area covered
Global
Variables measured
Market Size
Description
The COVID-19 protein market experienced significant growth from 2019 to 2024, driven by the urgent need for diagnostic tools, vaccine development, and therapeutic research during the pandemic. While precise market figures aren't provided, the rapid advancements in understanding the virus and the intense global research efforts suggest a substantial market size, potentially exceeding $1 billion in 2024. The market's Compound Annual Growth Rate (CAGR) likely ranged from 20% to 30% during this period, reflecting the high demand for various COVID-19 proteins for research and development. Key drivers included the global pandemic itself, the accelerated development of vaccines and therapeutics, and ongoing research into the virus's long-term effects and variants. Market trends show a shift towards more sophisticated protein characterization techniques and the development of novel diagnostic assays. Constraints on the market included the initial scarcity of resources and manufacturing capabilities during the early stages of the pandemic, as well as regulatory hurdles for new diagnostic tests and therapies. The market is segmented by protein type (Spike, Nucleocapsid, etc.), application (research, diagnostics, therapeutics), and end-user (pharmaceutical companies, research institutions, diagnostic laboratories). Major players include Abcam, The Native Antigen Company, Bio-Rad Laboratories, Thermo Fisher Scientific, and others, competing based on protein quality, price, and technological advancements. Post-pandemic, the COVID-19 protein market is expected to experience a period of adjustment. While the immediate, explosive growth will likely moderate, sustained demand for research-grade proteins will persist. Ongoing studies related to long COVID, variant analysis, and the potential for future outbreaks will fuel continued growth, albeit at a lower CAGR than during the peak pandemic years. 
The market will see increasing consolidation among players as smaller companies are acquired by larger ones. The focus will shift towards a more stable, albeit smaller, market, with opportunities in developing more accurate, sensitive, and cost-effective diagnostics and therapeutics, further fueling innovation and expansion within the niche markets of long-COVID research and future pandemic preparedness.
Covid-19 reproductiegetal
ckan.mobidatalab.eu
data.rivm.nl
+3more
json
Updated Aug 5, 2023
Cite
NationaalGeoregisterNL (2023). Covid-19 reproductiegetal [Dataset]. https://ckan.mobidatalab.eu/dataset/covid-19-reproductiegetal
Explore at:
Available download formats: json
Dataset updated
Aug 5, 2023
Dataset provided by
NationaalGeoregisterNL
Description
Covid-19 reproduction number

The number of COVID-19 related hospitalizations has been low for quite some time, and COVID-19 is no longer a notifiable disease as of July 1, 2023. Therefore, the data will no longer be updated from July 11, 2023.

The reproduction number R gives the average number of people infected by one person with COVID-19. To estimate this reproduction number, we use the number of reported COVID-19 hospital admissions per day in the Netherlands. This number of hospital admissions is tracked by the NICE Foundation (National Intensive Care Evaluation). Because a COVID-19 admission is reported with some delay in the reporting system, we correct the number of admissions for this delay [1]. The first day of illness is known for a large proportion of the reported cases. This information is used to estimate the first day of illness for hospital admissions. By displaying the number of COVID-19 admissions per date of the first day of illness, it is immediately possible to see whether the number of infections is increasing, peaking or decreasing.

To calculate the reproduction number, it is also necessary to know the length of time between the first day of illness of a COVID-19 case and the first day of illness of his or her infector. This duration is an average of 4 days for SARS-CoV-2 variants in 2020 and 2021, and an average of 3.5 days for more recent variants, calculated on the basis of COVID-19 reports to the GGD (the public health service, PHS). With this information, the value of the reproduction number is calculated as described in Wallinga & Lipsitch 2007 [2]. Until June 12, 2020, the reproduction number was calculated on the basis of COVID-19 hospital admissions, and until March 15, 2023, the reproduction number was calculated on the basis of COVID-19 reports to the GGDs.

[1] van de Kassteele J, Eilers PHC, Wallinga J. Nowcasting the Number of New Symptomatic Cases During Infectious Disease Outbreaks Using Constrained P-spline Smoothing. Epidemiology. 2019;30(5):737-745. doi:10.1097/EDE.0000000000001050.
[2] Wallinga J, Lipsitch M. How generation intervals shape the relationship between growth rates and reproductive numbers. Proc Biol Sci. 2007;274(1609):599-604. doi:10.1098/rspb.2006.3754.

Description of the variables:

Version: Version number of the dataset. When the content of the dataset is structurally changed (so not the daily update or a correction at record level), the version number will be adjusted (+1) and also the corresponding metadata in RIVMdata (https://data.rivm.nl).

Version 2 update (February 8, 2022): In the calculation of the reproduction number, the date of the positive test result is now used instead of the GGD notification date.

Version 3 update (February 17, 2022): The calculation of the reproduction number now takes into account different generation times for different variants. For the variants up to and including Delta, the average generation time is 4 days; from Omicron it is 3.5 days. The reproduction number published here is a weighted average of the reproduction numbers per variant.

Version 4 update (September 1, 2022): As of September 1, 2022, this dataset is split into two parts. The first part contains the dates from the start of the pandemic to October 3, 2021 (week 39) and contains "tm" in the file name. This data will no longer be updated. The second part contains the data from October 4, 2021 (week 40) and is updated every Tuesday and Friday. Until August 31, the published reproduction number was calculated with the data of the day before publication. From September 1, the published reproduction number is calculated with the data of the day of publication.

Version 5 update (March 31, 2023): As of March 15, 2023, the reproduction number is calculated based on COVID-19 hospital admissions according to the NICE hospital registry. From June 13, 2020 to March 14, 2023, the reproduction number was calculated on the basis of COVID-19 reports to the GGD. However, the number of reports is strongly determined by the test policy, and is less suitable as a basis for calculating the reproduction number due to the adjusted test policy as of March 10, 2023 and the closure of the GGD test lanes as of March 17, 2023. Until June 12, 2020, the reproduction number was also calculated on the basis of hospital admissions, but then as reported to the GGD.

Date: Date for which the reproduction number was estimated
Rt_low: Lower bound of the 95% confidence interval
Rt_avg: Estimated reproduction number
Rt_up: Upper bound of the 95% confidence interval
population: Patient population, with value “hosp” for hospitalized patients or “testpos” for test-positive patients

For recent R estimates, the reliability is not great, because the reliability depends on the time between infection and becoming ill and the time between becoming ill and reporting. Therefore, the variable Rt_avg is absent in the last two weeks.
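Because Rt_avg is absent for the most recent two weeks, code that reads this dataset needs to tolerate records without it. The sketch below assumes the JSON download is a list of objects keyed by the variable names described above; that layout, the function name, and the sample values are illustrative assumptions, not real data.

```python
from typing import Any, Dict, List


def usable_estimates(records: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Keep only records with a point estimate and flag intervals entirely above 1."""
    rows = []
    for rec in records:
        avg = rec.get("Rt_avg")
        if avg is None or avg == "":
            continue  # recent dates carry bounds only, so skip them here
        low = float(rec["Rt_low"])
        up = float(rec["Rt_up"])
        rows.append({
            "Date": rec["Date"],
            "Rt_avg": float(avg),
            "Rt_low": low,
            "Rt_up": up,
            # A lower bound above 1 suggests infections were increasing on that date.
            "interval_above_1": low > 1.0,
        })
    return rows


# Made-up values for illustration only.
sample = [
    {"Date": "2022-01-01", "Rt_low": 1.05, "Rt_avg": 1.10, "Rt_up": 1.15, "population": "testpos"},
    {"Date": "2022-01-02", "Rt_low": 0.90, "Rt_avg": 0.98, "Rt_up": 1.06, "population": "testpos"},
    {"Date": "2022-01-03", "Rt_low": 0.80, "Rt_up": 1.20, "population": "testpos"},
]
print(usable_estimates(sample))
```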
INF-COVID: Longitudinal data - Switzerland French-speaking - T0-T1-T2-T3
datacatalogue.cessda.eu
swissubase.ch
Updated Apr 22, 2025
Cite
Ortoleva Bucher; Delmas; Oulevey Bachmann (2025). INF-COVID: Longitudinal data - Switzerland French-speaking - T0-T1-T2-T3 [Dataset]. http://doi.org/10.48573/syf6-5715
Explore at:
Unique identifier
https://doi.org/10.48573/syf6-5715
Dataset updated
Apr 22, 2025
Dataset provided by
Claudia
Annie
Philippe
Authors
Ortoleva Bucher; Delmas; Oulevey Bachmann
Area covered
France, Switzerland
Description
The COVID-19 pandemic was making a huge impact on Europe’s healthcare systems in the spring of 2020, and most predictive models concurred that pandemic waves were in the offing. Most studies adopted a pathogenic approach to the subject; few used a salutogenic approach. These showed, however, that nurses can retain their health despite a pandemic by mobilising generalised resistance resources. Our study aims to understand how nurses working in hospitals protected their health and workplace well-being during the COVID-19 pandemic by investigating the moderating effects of the health resources they mobilised against the stressors inherent to the situation. Data was gathered longitudinally in the following countries: Switzerland (French-speaking and German-speaking parts), France, Portugal and Canada. In addition, a cross-sectional sample of nurses from Belgium was also investigated. The questionnaires included the PSS, WHOQOL, NSS, BRIEF-COPE, PTGI, CD-RISC, MSPSS, COPSOQ, SISI and demographic information. See Ortoleva et al. 2021 (in the bibliographical reference section) for the published protocol of this project.
Base case parameters and sources.
figshare.com
xls
Updated Jun 5, 2023
Cite
Lauren E. Cipriano; Wael M. R. Haddara; Gregory S. Zaric; Eva A. Enns (2023). Base case parameters and sources. [Dataset]. http://doi.org/10.1371/journal.pone.0255782.t001
Explore at:
Available download formats: xls
Unique identifier
https://doi.org/10.1371/journal.pone.0255782.t001
Dataset updated
Jun 5, 2023
Dataset provided by
PLOS ONE
Authors
Lauren E. Cipriano; Wael M. R. Haddara; Gregory S. Zaric; Eva A. Enns
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Base case parameters and sources.
INTRODUCTION OF COVID-NEWS-US-NNK AND COVID-NEWS-BD-NNK DATASET
data.niaid.nih.gov
Updated Jul 19, 2024
Cite
Nafiz Sadman (2024). INTRODUCTION OF COVID-NEWS-US-NNK AND COVID-NEWS-BD-NNK DATASET [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_4047647
Explore at:
Dataset updated
Jul 19, 2024
Dataset provided by
Nafiz Sadman
Kishor Datta Gupta
Nishat Anjum
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Area covered
United States, Bangladesh
Description
Introduction

There are several works based on Natural Language Processing of newspaper reports. In mining opinions from headlines [1] using Stanford NLP and SVM, Rameshbhai et al. compared several algorithms on a small and a large dataset. Rubin et al., in their paper [2], created a mechanism to differentiate fake news from real news by building a set of characteristics of news according to their types. The purpose was to contribute to the low-resource data available for training machine learning algorithms. Doumit et al. in [3] implemented LDA, a topic modeling approach, to study bias present in online news media.

However, there is not much NLP research invested in studying COVID-19. Most applications include classification of chest X-rays and CT scans to detect the presence of pneumonia in the lungs [4], a consequence of the virus. Other research areas include studying the genome sequence of the virus [5][6][7] and replicating its structure to fight it and find a vaccine. This research is crucial in battling the pandemic. The few NLP-based research publications include sentiment classification of online tweets by Samuel et al. [8] to understand the fear persisting in people due to the virus. Similar work has been done using an LSTM network to classify sentiments from online discussion forums by Jelodar et al. [9]. To the best of our knowledge, the NKK dataset is the first study on a comparatively larger dataset of newspaper reports on COVID-19, which contributed to awareness of the virus.

2 Data-set Introduction

2.1 Data Collection

We accumulated 1000 online newspaper reports on COVID-19 from the United States of America (USA). The newspapers include The Washington Post (USA) and StarTribune (USA). We named this dataset “Covid-News-USA-NNK”. We also accumulated 50 online newspaper reports from Bangladesh on the issue and named that dataset “Covid-News-BD-NNK”. The newspapers include The Daily Star (BD) and Prothom Alo (BD). All of these newspapers are among the top providers and most read in their respective countries. The collection was done manually by 10 human data-collectors of age group 23- with university degrees. This approach was preferred over automation to ensure the news was highly relevant to the subject. The newspapers' online sites had dynamic content with advertisements in no particular order, so online scrapers were likely to collect inaccurate news reports. One challenge in collecting the data was the subscription requirement: each newspaper required $1 per subscription. Some of the criteria provided as guidelines to the human data-collectors were as follows:

The headline must have one or more words directly or indirectly related to COVID-19.

The content of each news must have 5 or more keywords directly or indirectly related to COVID-19.

The genre of the news can be anything as long as it is relevant to the topic. Political, social, and economic genres are to be prioritized.

Avoid taking duplicate reports.

Maintain a time frame for the above-mentioned newspapers.

To collect these data we used a Google Form for USA and BD. Two human editors went through each entry to check for spam or troll entries.
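The collection itself was manual, but the criteria above could also be checked automatically as a first pass before the human editors review the entries. The following is a minimal sketch under stated assumptions: the keyword list, the field names `headline` and `body`, and the choice to count keyword occurrences (rather than distinct keywords) are all hypothetical and are not taken from the dataset.

```python
import re

# Hypothetical keyword list; the list given to the data-collectors is not published.
COVID_KEYWORDS = {"covid", "coronavirus", "pandemic", "quarantine", "lockdown",
                  "virus", "outbreak", "infection", "mask", "vaccine"}


def count_keyword_hits(text, keywords=COVID_KEYWORDS):
    """Count tokens in `text` that start with any keyword (a crude stem match)."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return sum(1 for tok in tokens if any(tok.startswith(k) for k in keywords))


def filter_reports(reports, min_headline_hits=1, min_body_hits=5):
    """Keep reports that meet the headline/body criteria and are not exact duplicates."""
    seen = set()
    kept = []
    for report in reports:
        key = (report["headline"].strip().lower(), report["body"].strip().lower())
        if key in seen:
            continue  # criterion: avoid duplicate reports
        seen.add(key)
        if (count_keyword_hits(report["headline"]) >= min_headline_hits
                and count_keyword_hits(report["body"]) >= min_body_hits):
            kept.append(report)
    return kept
```

A check like this only flags obvious violations. Whether a report is relevant "directly or indirectly" still needs human judgment.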

2.2 Data Pre-processing and Statistics

Some pre-processing steps performed on the newspaper report dataset are as follows:

Remove hyperlinks.

Remove non-English alphanumeric characters.

Remove stop words.

Lemmatize text.

While more pre-processing could have been applied, we tried to keep the data as unchanged as possible, since changing sentence structures could result in the loss of valuable information. Although these steps were performed with the help of a script, we also assigned the same human collectors to cross-check the output for any remaining instances of the items listed above.
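The pre-processing steps listed above can be sketched as follows. This is an illustration, not the script used for the dataset: the stop-word list is a tiny hypothetical sample, and lemmatization is left as a pluggable function (identity by default) so the sketch runs without external libraries. In practice a lemmatizer from a library such as NLTK or spaCy would be passed in.

```python
import re

# Tiny illustrative stop-word list (an assumption); a real run would use a fuller list.
STOP_WORDS = {"a", "an", "the", "and", "or", "of", "to", "in", "on",
              "is", "are", "was", "for", "at"}

URL_PATTERN = re.compile(r"(?:https?://|www\.)\S+")
NON_ALNUM_PATTERN = re.compile(r"[^A-Za-z0-9\s]")


def preprocess(text, lemmatize=lambda token: token):
    """Apply the four steps in order and return the remaining tokens."""
    text = URL_PATTERN.sub(" ", text)        # 1. remove hyperlinks
    text = NON_ALNUM_PATTERN.sub(" ", text)  # 2. remove non-English alphanumeric characters
    tokens = [t for t in text.split() if t.lower() not in STOP_WORDS]  # 3. remove stop words
    return [lemmatize(t) for t in tokens]    # 4. lemmatize (identity unless a lemmatizer is given)
```

For example, `preprocess("The cases are rising: see https://example.com/a?b=1 for updates!")` keeps only the tokens `cases`, `rising`, `see`, and `updates`.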

The primary data statistics of the two datasets are shown in Tables 1 and 2.

Table 1: Covid-News-USA-NNK data statistics

No. of words per headline: 7 to 20

No. of words per body content: 150 to 2100

Table 2: Covid-News-BD-NNK data statistics

No. of words per headline: 10 to 20

No. of words per body content: 100 to 1500

2.3 Dataset Repository

We used GitHub as our primary data repository, under the account name NKK^1. There we created two repositories, USA-NKK^2 and BD-NNK^3. The dataset is available in both CSV and JSON formats. We regularly update the CSV files and regenerate the JSON using a Python script. We also provide a Python script file for essential operations. We welcome all outside collaboration to enrich the dataset.
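Regenerating the JSON from an updated CSV file could look roughly like the sketch below. It is not the script shipped in the repositories: the function name and file paths are hypothetical, and it assumes only that the first CSV row holds the column names.

```python
import csv
import json


def csv_to_json(csv_path, json_path):
    """Read every CSV row as a dict and write all rows as one JSON array."""
    with open(csv_path, newline="", encoding="utf-8") as src:
        rows = list(csv.DictReader(src))
    with open(json_path, "w", encoding="utf-8") as dst:
        # ensure_ascii=False keeps non-ASCII characters readable in the output file.
        json.dump(rows, dst, ensure_ascii=False, indent=2)
    return len(rows)
```

Returning the row count gives a quick sanity check that the CSV and the regenerated JSON hold the same number of reports.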

3 Literature Review

Natural Language Processing (NLP) deals with text (also known as categorical) data in computer science. It uses numerous methods, such as one-hot encoding and word embedding, to transform text into a numerical form that can be fed to machine learning and deep learning algorithms.
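As a small illustration of the simplest of these methods, one-hot encoding assigns each distinct word an index and represents the word as a vector with a single 1 at that index. The sketch below uses whitespace tokenization and hypothetical function names. It is meant only to show the idea.

```python
def build_vocabulary(sentences):
    """Assign each distinct lower-cased token an index in order of first appearance."""
    vocab = {}
    for sentence in sentences:
        for token in sentence.lower().split():
            vocab.setdefault(token, len(vocab))
    return vocab


def one_hot(token, vocab):
    """Return a vector of zeros with a single 1 at the token's vocabulary index."""
    vector = [0] * len(vocab)
    vector[vocab[token]] = 1
    return vector


vocab = build_vocabulary(["masks slow the virus", "the virus spreads"])
print(one_hot("virus", vocab))  # [0, 0, 0, 1, 0]
```

The vector length grows with the vocabulary, which is one reason dense word embeddings are often preferred for larger corpora.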

Some well-known applications of NLP include fraud detection on online media sites [10], authorship attribution in fallback authentication systems [11], intelligent conversational agents or chatbots [12], and machine translation as used by Google Translate [13]. While these are all downstream tasks, several exciting developments have been made in algorithms designed solely for Natural Language Processing tasks. The two most prominent are BERT [14], which uses a bidirectional Transformer encoder and achieves strong results on classification and masked-word prediction tasks, and the GPT-3 models released by OpenAI [15], which can generate almost human-like text. However, these are typically used as pre-trained models, since training them carries a huge computation cost.

Information extraction is a generalized concept of retrieving information from a dataset. Information extraction from an image could be retrieving vital feature spaces or targeted portions of an image; information extraction from speech could be retrieving information about names, places, etc. [16]. Information extraction in texts could be identifying named entities and locations or other essential data.

Topic modeling is a sub-task of NLP and also a process of information extraction. It clusters words and phrases of the same context together into groups. Topic modeling is an unsupervised learning method that gives us a brief idea about a set of texts. One commonly used topic modeling method is Latent Dirichlet Allocation, or LDA [17].

Keyword extraction is a process of information extraction and a sub-task of NLP that extracts essential words and phrases from a text. TextRank [18] is an efficient keyword extraction technique that builds a graph over the words, calculates a weight for each word, and picks the words with the highest weights.
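The idea behind TextRank can be sketched in a few lines: words that co-occur within a small window are linked in an undirected graph, and a PageRank-style score is iterated over that graph. The sketch below is a simplified, unweighted version operating on already tokenized text. The window size, damping factor, and iteration count are illustrative defaults, not settings reported for this dataset.

```python
from collections import defaultdict


def textrank_keywords(tokens, window=2, damping=0.85, iterations=50, top_k=5):
    """Rank tokens by an unweighted TextRank score over a co-occurrence graph."""
    # Link every token to the tokens that follow it within `window` positions.
    neighbors = defaultdict(set)
    for i, token in enumerate(tokens):
        for other in tokens[i + 1:i + 1 + window]:
            if other != token:
                neighbors[token].add(other)
                neighbors[other].add(token)
    if not neighbors:
        return []

    # Iterate the PageRank update: each node shares its score equally among its neighbors.
    scores = {node: 1.0 for node in neighbors}
    for _ in range(iterations):
        scores = {
            node: (1 - damping) + damping * sum(
                scores[nb] / len(neighbors[nb]) for nb in neighbors[node]
            )
            for node in neighbors
        }

    # Highest score first; ties are broken alphabetically for a stable order.
    ranked = sorted(scores, key=lambda node: (-scores[node], node))
    return ranked[:top_k]
```

Stop words would normally be removed before the graph is built. Otherwise very frequent function words tend to collect the highest scores.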

Word clouds are a great visualization technique for understanding the overall ‘talk of the topic’. The clustered words give us a quick understanding of the content.

4 Our Experiments and Result Analysis

We used the wordcloud library^4 to create the word clouds. Figures 1 and 3 present the word clouds of the Covid-News-USA-NNK dataset by month from February to May. From Figures 1, 2, and 3, we can make the following observations:

In February, both newspapers talked about China and the source of the outbreak.

StarTribune emphasized Minnesota as the state of most concern. In April, it appeared to be even more concerned.

Both newspapers talked about the virus impacting the economy, i.e., banks, elections, administrations, and markets.

Washington Post discussed global issues more than StarTribune.

In February, StarTribune mentioned the first precautionary measure, wearing masks, and the uncontrollable spread of the virus throughout the nation.

While both newspapers mentioned the outbreak in China in February, the spread in the United States is given more weight from March through May, reflecting the critical impact of the virus.
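A word cloud is essentially a rendering of word frequencies, so observations like those above can be cross-checked by counting words per month. The sketch below is illustrative only: the field names `month` and `body` are hypothetical and may not match the column names in the dataset files.

```python
from collections import Counter, defaultdict


def monthly_word_frequencies(reports, stop_words=frozenset()):
    """Map each month label to a Counter of word frequencies in that month's reports."""
    frequencies = defaultdict(Counter)
    for report in reports:
        words = [w for w in report["body"].lower().split() if w not in stop_words]
        frequencies[report["month"]].update(words)
    return dict(frequencies)
```

Each resulting counter could then be passed to the wordcloud library, for instance through `WordCloud().generate_from_frequencies(...)`, to draw one cloud per month.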

We used a script to extract all numbers related to certain keywords such as ‘Deaths’, ‘Infected’, ‘Died’, ‘Infections’, ‘Quarantined’, ‘Lock-down’, ‘Diagnosed’, etc. from the news reports and built a series of case numbers for both newspapers. Figure 4 shows the statistics of this series. From this extraction, we can observe that April was the peak month for COVID cases, which rose gradually from February. Both newspapers clearly show that the rise in COVID cases from February to March was slower than the rise from March to April. This is an important indicator of possible recklessness in preparations to battle the virus. However, the steep fall from April to May also shows a positive response to the outbreak.

We used Vader Sentiment Analysis to extract the sentiment of the headlines and the body. On average, the sentiment scores ranged from -0.5 to -0.9. The Vader sentiment scale ranges from -1 (highly negative) to 1 (highly positive). There were some cases where the sentiment scores of the headline and body contradicted each other, i.e., the sentiment of the headline was negative but the sentiment of the body was slightly positive. Overall, sentiment analysis can help us sort the most concerning (most negative) news from the positive ones, from which we can learn more about the indicators related to COVID-19 and the serious impact caused by it. Moreover, sentiment analysis can also provide information about how a state or country is reacting to the pandemic.

We used the PageRank algorithm to extract keywords from the headlines as well as the body content. PageRank efficiently highlights important, relevant keywords in the text. Some frequently occurring important keywords extracted from both datasets are: ‘China’, ‘Government’, ‘Masks’, ‘Economy’, ‘Crisis’, ‘Theft’, ‘Stock market’, ‘Jobs’, ‘Election’, ‘Missteps’, ‘Health’, ‘Response’. Keyword extraction acts as a filter allowing quick searches for indicators in case of locating situations of the economy,
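The keyword-based number extraction described in this section could be approximated with a regular expression that looks for a number followed, within a few words, by one of the keywords. The sketch below is a rough heuristic and not the script used to produce Figure 4: the pattern, the three-word gap, and the function name are assumptions, and numbers that appear after the keyword are not handled.

```python
import re

# Keywords named in the text; matching is case-insensitive.
KEYWORDS = ["deaths", "infected", "died", "infections",
            "quarantined", "lock-down", "diagnosed"]

# A number (optionally with thousands separators), then up to three
# intervening words, then one of the keywords.
NUMBER_THEN_KEYWORD = re.compile(
    r"\b(\d{1,3}(?:,\d{3})+|\d+)\s+(?:\w+\s+){0,3}?("
    + "|".join(re.escape(k) for k in KEYWORDS)
    + r")\b",
    re.IGNORECASE,
)


def extract_counts(text):
    """Return (keyword, number) pairs found in `text`, in order of appearance."""
    return [
        (keyword.lower(), int(number.replace(",", "")))
        for number, keyword in NUMBER_THEN_KEYWORD.findall(text)
    ]


print(extract_counts("At least 1,200 people were infected and 35 died in April."))
# [('infected', 1200), ('died', 35)]
```

Grouping the extracted pairs by the publication month of each report would then give a monthly series of the kind summarized above.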
