https://www.usa.gov/government-works
Reporting of new Aggregate Case and Death Count data was discontinued May 11, 2023, with the expiration of the COVID-19 public health emergency declaration. This dataset will receive a final update on June 1, 2023, to reconcile historical data through May 10, 2023, and will remain publicly available.
Aggregate Data Collection Process
Since the start of the COVID-19 pandemic, data have been gathered through a robust process with the following steps:
Methodology Changes
Several differences exist between the current, weekly-updated dataset and the archived version:
Confirmed and Probable Counts
In this dataset, counts by jurisdiction are not displayed by confirmed or probable status. Instead, confirmed and probable cases and deaths are included in the Total Cases and Total Deaths columns, when available. Not all jurisdictions report probable cases and deaths to CDC.* Confirmed and probable case definition criteria are described here: Council of State and Territorial Epidemiologists (ymaws.com).
Deaths
CDC reports death data on other sections of the website: CDC COVID Data Tracker: Home, CDC COVID Data Tracker: Cases, Deaths, and Testing, and NCHS Provisional Death Counts. Information presented on the COVID Data Tracker pages is based on the same source (total case counts) as the present dataset; however, NCHS Death Counts are based on death certificates that use information reported by physicians, medical examiners, or coroners in the cause-of-death section of each certificate. Data from each of these pages are considered provisional (not complete and pending verification) and are therefore subject to change. Counts from previous weeks are continually revised as more records are received and processed.
Number of Jurisdictions Reporting
There are currently 60 public health jurisdictions reporting cases of COVID-19. This includes the 50 states, the District of Columbia, New York City, the U.S. territories of American Samoa, Guam, the Commonwealth of the Northern Mariana Islands, Puerto Rico, and the U.S. Virgin Islands, as well as three independent countries in compacts of free association with the United States: the Federated States of Micronesia, the Republic of the Marshall Islands, and the Republic of Palau. New York State’s reported case and death counts do not include New York City’s counts, as the two jurisdictions separately report nationally notifiable conditions to CDC.
CDC COVID-19 data are available to the public as summary or aggregate count files, including total counts of cases and deaths, available by state and by county. These and other data on COVID-19 are available from multiple public locations, such as:
https://www.cdc.gov/coronavirus/2019-ncov/cases-updates/cases-in-us.html
https://www.cdc.gov/covid-data-tracker/index.html
https://www.cdc.gov/coronavirus/2019-ncov/covid-data/covidview/index.html
https://www.cdc.gov/coronavirus/2019-ncov/php/open-america/surveillance-data-analytics.html
Additional COVID-19 public use datasets, including line-level (patient-level) data, are available at: https://data.cdc.gov/browse?tags=covid-19.
Archived Data Notes:
November 3, 2022: Due to a reporting cadence issue, case rates for Missouri counties are calculated based on 11 days’ worth of case count data in the Weekly United States COVID-19 Cases and Deaths by State data released on November 3, 2022, instead of the customary 7 days’ worth of data.
November 10, 2022: Due to a reporting cadence change, case rates for Alabama counties are calculated based on 13 days’ worth of case count data in the Weekly United States COVID-19 Cases and Deaths by State data released on November 10, 2022, instead of the customary 7 days’ worth of data.
November 10, 2022: Per the request of the jurisdiction, cases and deaths among non-residents have been removed from all Hawaii county totals throughout the entire time series. Cumulative case and death counts reported by CDC will no longer match Hawaii’s COVID-19 Dashboard, which still includes non-resident cases and deaths.
November 17, 2022: Two new columns, weekly historic cases and weekly historic deaths, were added to this dataset on November 17, 2022. These columns reflect case and death counts that were reported that week but were historical in nature and not reflective of the current burden within the jurisdiction. These historical cases and deaths are not included in the new weekly case and new weekly death columns; however, they are reflected in the cumulative totals provided for each jurisdiction. These data are used to account for artificial increases in case and death totals due to batched reporting of historical data.
December 1, 2022: Due to cadence changes over the Thanksgiving holiday, case rates for all Ohio counties are reported as 0 in the data released on December 1, 2022.
January 5, 2023: Due to North Carolina’s holiday reporting cadence, aggregate case and death data will contain 14 days’ worth of data instead of the customary 7 days. As a result, case and death metrics will appear higher than expected in the January 5, 2023, weekly release.
January 12, 2023: Due to data processing delays, Mississippi’s aggregate case and death data will be reported as 0. As a result, case and death metrics will appear lower than expected in the January 12, 2023, weekly release.
January 19, 2023: Due to a reporting cadence issue, Mississippi’s aggregate case and death data will be calculated based on 14 days’ worth of data instead of the customary 7 days in the January 19, 2023, weekly release.
January 26, 2023: Due to a reporting backlog of historic COVID-19 cases, case rates for two Michigan counties (Livingston and Washtenaw) were higher than expected in the January 19, 2023 weekly release.
January 26, 2023: Due to a backlog of historic COVID-19 cases being reported this week, aggregate case and death counts in Charlotte County and Sarasota County, Florida, will appear higher than expected in the January 26, 2023 weekly release.
January 26, 2023: Due to data processing delays, Mississippi’s aggregate case and death data will be reported as 0 in the weekly release posted on January 26, 2023.
February 2, 2023: As of the data collection deadline, CDC observed an abnormally large increase in aggregate COVID-19 cases and deaths reported for Washington State. In response, totals for new cases and new deaths released on February 2, 2023, have been displayed as zero at the state level until the issue is addressed with state officials. CDC is working with state officials to address the issue.
February 2, 2023: Due to a decrease reported in cumulative case counts by Wyoming, case rates will be reported as 0 in the February 2, 2023, weekly release. CDC is working with state officials to verify the data submitted.
February 16, 2023: Due to data processing delays, Utah’s aggregate case and death data will be reported as 0 in the weekly release posted on February 16, 2023. As a result, case and death metrics will appear lower than expected and should be interpreted with caution.
February 16, 2023: Due to a reporting cadence change, Maine’s
https://github.com/nytimes/covid-19-data/blob/master/LICENSE
The New York Times is releasing a series of data files with cumulative counts of coronavirus cases in the United States, at the state and county level, over time. We are compiling this time series data from state and local governments and health departments in an attempt to provide a complete record of the ongoing outbreak.
Since the first reported coronavirus case in Washington State on Jan. 21, 2020, The Times has tracked cases of coronavirus in real time as they were identified after testing. Because of the widespread shortage of testing, however, the data is necessarily limited in the picture it presents of the outbreak.
We have used this data to power our maps and reporting tracking the outbreak, and it is now being made available to the public in response to requests from researchers, scientists and government officials who would like access to the data to better understand the outbreak.
The data begins with the first reported coronavirus case in Washington State on Jan. 21, 2020. We will publish regular updates to the data in this repository.
Notice of data discontinuation: Since the start of the pandemic, AP has reported case and death counts from data provided by Johns Hopkins University. Johns Hopkins University has announced that they will stop their daily data collection efforts after March 10. As Johns Hopkins stops providing data, the AP will also stop collecting daily numbers for COVID cases and deaths. The HHS and CDC now collect and visualize key metrics for the pandemic. AP advises using those resources when reporting on the pandemic going forward.
April 9, 2020
April 20, 2020
April 29, 2020
September 1st, 2020
February 12, 2021
new_deaths column.
February 16, 2021
The AP is using data collected by the Johns Hopkins University Center for Systems Science and Engineering as our source for outbreak caseloads and death counts for the United States and globally.
The Hopkins data is available at the county level in the United States. The AP has paired this data with population figures and county rural/urban designations, and has calculated caseload and death rates per 100,000 people. Be aware that caseloads may reflect the availability of tests -- and the ability to turn around test results quickly -- rather than actual disease spread or true infection rates.
This data is from the Hopkins dashboard that is updated regularly throughout the day. Like all organizations dealing with data, Hopkins is constantly refining and cleaning up their feed, so there may be brief moments where data does not appear correctly. At this link, you’ll find the Hopkins daily data reports, and a clean version of their feed.
The AP is updating this dataset hourly at 45 minutes past the hour.
To learn more about AP's data journalism capabilities for publishers, corporations and financial institutions, go here or email kromano@ap.org.
Use AP's queries to filter the data or to join to other datasets we've made available to help cover the coronavirus pandemic
Filter cases by state here
Rank states by their status as current hotspots. Calculates the 7-day rolling average of new cases per capita in each state: https://data.world/associatedpress/johns-hopkins-coronavirus-case-tracker/workspace/query?queryid=481e82a4-1b2f-41c2-9ea1-d91aa4b3b1ac
Find recent hotspots within your state by running a query to calculate the 7-day rolling average of new cases per capita in each county: https://data.world/associatedpress/johns-hopkins-coronavirus-case-tracker/workspace/query?queryid=b566f1db-3231-40fe-8099-311909b7b687&showTemplatePreview=true
Join county-level case data to an earlier dataset released by AP on local hospital capacity here. To find out more about the hospital capacity dataset, see the full details.
Pull the 100 counties with the highest per-capita confirmed cases here
Rank all the counties by the highest per-capita rate of new cases in the past 7 days here. Be aware that because this ranks per-capita caseloads, very small counties may rise to the very top, so take into account raw caseload figures as well.
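The 7-day rolling average of new cases per capita used in the queries above can be sketched in plain Python. This is a minimal illustration, not AP's actual query logic: the function names, the sample counts, and the choice to clamp negative daily differences to zero are all assumptions made for the example.

```python
def daily_new_cases(cumulative):
    """Convert cumulative counts into daily new counts.

    Negative differences (a source revising its totals downward) are
    clamped to zero here; that is a simplifying assumption.
    """
    return [max(curr - prev, 0) for prev, curr in zip(cumulative, cumulative[1:])]


def rolling_average_per_100k(cumulative, population, window=7):
    """Rolling mean of daily new cases, scaled to a rate per 100,000 people."""
    new_cases = daily_new_cases(cumulative)
    rates = []
    for end in range(window, len(new_cases) + 1):
        mean_new = sum(new_cases[end - window:end]) / window
        rates.append(mean_new / population * 100_000)
    return rates


# Illustrative cumulative counts for a hypothetical county of 50,000 people.
cumulative_counts = [100, 110, 125, 125, 140, 160, 170, 190, 205]
rates = rolling_average_per_100k(cumulative_counts, population=50_000)
```

Scaling by population is what lets very small counties rise to the top of a ranking, which is why the raw caseload should be read alongside the rate.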
The AP has designed an interactive map to track COVID-19 cases reported by Johns Hopkins.
https://datawrapper.dwcdn.net/nRyaf/15/
Johns Hopkins timeseries data
- Johns Hopkins pulls data regularly to update their dashboard. Once a day, around 8pm EDT, Johns Hopkins adds the counts for all areas they cover to the timeseries file. These counts are snapshots of the latest cumulative counts provided by the source on that day. This can lead to inconsistencies if a source updates their historical data for accuracy, either increasing or decreasing the latest cumulative count.
- Johns Hopkins periodically edits their historical timeseries data for accuracy. They provide a file documenting all errors in their timeseries files that they have identified and fixed here
This data should be credited to Johns Hopkins University COVID-19 tracking project
https://creativecommons.org/publicdomain/zero/1.0/
Coronavirus infection is currently the most important health topic. It has tested, and continues to test, healthcare systems around the world to the fullest extent. Although great progress has been made in handling the pandemic, a tremendous number of questions remain unanswered. I hereby present the local Bulgarian COVID-19 dataset with some context. It could be used as a comparator because it stands out compared to other countries and deserves analysis.
Context for the Bulgarian population:
Population - 6 948 445
Median age - 44.7 years
Aged >65 - 20.801%
Aged >70 - 13.272%
Summary of the results:
- The first pandemic wave was weak, probably because of the early state of emergency (5 days after the first confirmed case). Whether this was a good decision, or was too early and only postponed the inevitable, is debatable.
- The healthcare system collapsed (probably due to delayed measures) in the second and third waves, which put Bulgaria at the top of the mortality and morbidity tables worldwide and in the EU.
- The low percentage of vaccinated people has prolonged the epidemic and delayed the lifting of preventive measures.
Some of the important moments that should be considered when interpreting the data:
08.03.2020 - Bulgaria confirmed its first two cases. The government issued a nationwide ban on closed-door public events (first lockdown);
13.03.2020 - after 16 reported cases in one day, Bulgaria declared a state of emergency for one month until 13.04.2020. Schools, shopping centres, cinemas, restaurants, and other places of business were closed. All sports events were suspended. Only supermarkets, food markets, pharmacies, banks, and gas stations remain open.
03.04.2020 - The National Assembly approved the government's proposal to extend the state of emergency by one month until 13.05.2020;
14.05.2020 - the national emergency was lifted, and in its place was declared a state of an emergency epidemic situation. Schools and daycares remain closed, as well as shopping centers and indoor restaurants;
18.05.2020 - Shopping malls and fitness centers opened;
01.06.2020 - Restaurants and gaming halls opened;
10.07.2020 - discos and bars are closed, the sports events are without an audience;
29.10.2020 - High school and college students are transitioning to online learning;
27.11.2020 - the whole education is online, restaurants, nightclubs, bars, and discos are closed (second lockdown 27.11 - 21.12);
05.12.2020 - the 14-day mortality rate is the highest in the world;
16.01.2021 - some of the students went back to school;
01.03.2021 - restaurants and casinos opened;
22.03.2021 - restaurants, shopping malls, fitness centers, and schools are closed (third lockdown for 10 days - 22.03 - 31.03);
19.04.2021 - children daycare facilities, fitness centers, and nightclubs are opened;
This dataset consists of 447 rows with 29 columns and covers the period 08.03.2020 - 28.05.2021. In the beginning, there are some missing values until the proper statistical report was established.
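As a quick consistency check, the stated row count matches the number of calendar days in the covered period if the dataset holds one row per day. That one-row-per-day layout is an assumption here, not something the description states outright.

```python
from datetime import date

start = date(2020, 3, 8)   # 08.03.2020, first confirmed cases
end = date(2021, 5, 28)    # 28.05.2021, last day covered

# Both endpoints are included, hence the +1.
days_covered = (end - start).days + 1
```

Under that assumption, `days_covered` equals the 447 rows reported above.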
A publication proposal is extended to anyone who wishes to collaborate. Depending on the results, the value of the findings, and the relevance of the topic, publication is expected:
- in a local journal (guaranteed);
- in a SCOPUS journal (highly probable);
- in an IF journal (if the results are really insightful).
Topics could include, but are not limited to:
- descriptive analysis of the pandemic outbreak in the country;
- prediction of the pandemic or the vaccination rate;
- discussion of the numbers compared to other countries or the world;
- discussion of the government decisions;
- estimating cut-off values for step-down or step-up of the restrictions.
If you find an error, have a question, or wish to make a suggestion, I encourage you to reach me.
2019 Novel Coronavirus COVID-19 (2019-nCoV) Visual Dashboard and Map:
https://www.arcgis.com/apps/opsdashboard/index.html#/bda7594740fd40299423467b48e9ecf6
Downloadable data:
https://github.com/CSSEGISandData/COVID-19
Additional Information about the Visual Dashboard:
https://systems.jhu.edu/research/public-health/ncov
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
ABSTRACT: Objective: Estimating the potential number of COVID-19 deaths in Brazil for the coming months. Methods: The study included all confirmed cases of COVID-19 deaths, from the first confirmed death on March 17th to May 15th, 2020. These data were collected from an official Brazilian website of the Ministry of Health. The Boltzmann function was applied to a data simulation for each set of data regarding all states of the country. Results: The model data were well-fitted, with R² values close to 0.999. Up to May 15th, 14,817 COVID-19 deaths have been confirmed in the country. Amazonas has the highest rate of accumulated cases per 1,000,000 inhabitants (321.14), followed by Ceará (161.63). Rio de Janeiro, Roraima, Amazonas, Pará, and Pernambuco are estimated to experience a substantial increase in the rate of cumulative cases until July 15th. Mato Grosso do Sul, Paraná, Minas Gerais, Rio Grande do Sul, and Santa Catarina will show lower rates per 1,000,000 inhabitants. Conclusion: We estimate a substantial increase in the rate of cumulative cases in Brazil over the next months. The Boltzmann function proved to be a simple tool for epidemiological forecasting that can assist in the planning of measures to contain COVID-19.
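The abstract does not give the exact parameterization used, so the following is only a sketch. It assumes the common four-parameter Boltzmann sigmoid, y = A2 + (A1 - A2) / (1 + exp((x - x0) / dx)), and the standard coefficient of determination as the goodness-of-fit measure. The parameter values are hypothetical and chosen only to show the curve's shape.

```python
import math


def boltzmann(x, a1, a2, x0, dx):
    """Boltzmann sigmoid: rises from a1 to a2, centred at x0, with slope scale dx."""
    return a2 + (a1 - a2) / (1.0 + math.exp((x - x0) / dx))


def r_squared(observed, predicted):
    """Coefficient of determination between observed and predicted values."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Hypothetical parameters, chosen only to illustrate the curve's shape.
params = dict(a1=0.0, a2=1000.0, x0=30.0, dx=5.0)
days = list(range(0, 61, 10))
curve = [boltzmann(d, **params) for d in days]
```

In a real fit, the four parameters would be estimated from the cumulative death series of each state (for example with a nonlinear least-squares routine), and `r_squared` would compare the observed series with the fitted curve.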
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Analysis of ‘COVID-19 Healthy Diet Dataset’ provided by Analyst-2 (analyst-2.ai), based on source dataset retrieved from https://www.kaggle.com/mariaren/covid19-healthy-diet-dataset on 28 January 2022.
--- Dataset description provided by original source is as follows ---
“Health requires healthy food."
Roger Williams (1603 – 1683)
In the past couple months, we’ve witnessed doctors, nurses, paramedics and thousands of medical workers putting their lives on the frontline to save patients who are infected. And as the battle with COVID-19 continues, we should all ask ourselves – What should we do to help out? What can we do to protect our loved ones, those who sacrifice for us, and ourselves from this pandemic?
These questions all relate back to the CORD-19 Open Research Dataset Challenge Task Question: “What do we know about non-pharmaceutical interventions?”
And my simple answer is: we need to protect our families and our own health by adopting a healthy diet.
The USDA Center for Nutrition Policy and Promotion recommends a very simple daily diet intake guideline: 30% grains, 40% vegetables, 10% fruits, and 20% protein, but are we really eating in the healthy eating style recommended by these food divisions and balances?
In this dataset, I have combined data on different types of food, world population obesity and undernourishment rates, and global COVID-19 case counts in order to learn more about how a healthy eating style could help combat the coronavirus. From the dataset, we can gather information about diet patterns in countries with lower COVID infection rates and adjust our own diets accordingly.
In each of the 4 datasets below, I have calculated fat quantity, energy intake (kcal), food supply quantity (kg), and protein for different categories of food (all calculated as a percentage of total intake amount). I've also added the obesity and undernourishment rates (also as percentages) for comparison. The final columns of each dataset contain the most up-to-date confirmed, death, recovered, and active case counts (also as a percentage of each country's current population).
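The percentage-of-population columns described above follow a simple ratio. The sketch below is illustrative only: the function name and the sample figures are hypothetical, and the author's own R preprocessing code may differ.

```python
def share_of_population(count, population):
    """Return count as a percentage of population."""
    if population <= 0:
        raise ValueError("population must be positive")
    return count / population * 100.0


# Hypothetical country: 25,000 confirmed cases among 5,000,000 people.
confirmed_pct = share_of_population(25_000, 5_000_000)
```

The same ratio applies to the food columns, with a category's quantity in place of `count` and the total intake in place of `population`.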
Data for different food group supply quantities, nutrition values, obesity, and undernourished percentages are obtained from the Food and Agriculture Organization of the United Nations (FAO) website. To see the specific types of food included in each category from the FAO data, take a look at the last dataset, Supply_Food_Data_Description.csv.
Data for population count for each country comes from Population Reference Bureau PRB website
Data for COVID-19 confirmed, deaths, recovered and active cases are obtained from Johns Hopkins Center for Systems Science and Engineering CSSE website
The USDA Center for Nutrition Policy and Promotion diet intake guideline information can be found in ChooseMyPlate.gov
Note: I will update and push new versions of the datasets weekly. (Current version includes COVID data from the week of 02/06/2021.) Click here to see my data cleaning/preprocessing code in R
If you like this dataset, please don't forget to give me an upvote! 👍
--- Original source retains full ownership of the source dataset ---
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Background: The Omicron variant of SARS-CoV-2 is more highly infectious and transmissible than prior variants of concern. It was unclear which factors might have contributed to the alteration of COVID-19 cases and deaths during the Delta and Omicron variant periods. This study aimed to compare the COVID-19 average weekly infection fatality rate (AWIFR), investigate factors associated with COVID-19 AWIFR, and explore the factors linked to the increase in COVID-19 AWIFR between two periods of Delta and Omicron variants.
Materials and methods: An ecological study has been conducted among 110 countries over the first 12 weeks during two periods of Delta and Omicron variant dominance using open publicly available datasets. Our analysis included 102 countries in the Delta period and 107 countries in the Omicron period. Linear mixed-effects models and linear regression models were used to explore factors associated with the variation of AWIFR over Delta and Omicron periods.
Findings: During the Delta period, lower AWIFR was observed in countries with a better government effectiveness index [β = −0.762, 95% CI (−1.238)–(−0.287)] and a higher proportion of people fully vaccinated [β = −0.385, 95% CI (−0.629)–(−0.141)]. In contrast, a higher burden of cardiovascular diseases was positively associated with AWIFR (β = 0.517, 95% CI 0.102–0.932). Over the Omicron period, years lived with disability (YLD) caused by metabolism disorders (β = 0.843, 95% CI 0.486–1.2) and the proportion of the population aged older than 65 years (β = 0.737, 95% CI 0.237–1.238) were positively associated with poorer AWIFR, while a high proportion of the population vaccinated with a booster dose [β = −0.321, 95% CI (−0.624)–(−0.018)] was linked with a better outcome. Over the two periods of Delta and Omicron, an increase in the government effectiveness index was associated with a decrease in AWIFR [β = −0.438, 95% CI (−0.750)–(−0.126)], whereas higher death rates caused by diabetes and kidney (β = 0.472, 95% CI 0.089–0.855) and the percentage of the population aged older than 65 years (β = 0.407, 95% CI 0.013–0.802) were associated with a significant increase in AWIFR.
Conclusion: The COVID-19 infection fatality rates were strongly linked with the coverage of vaccination rate, effectiveness of government, and health burden related to chronic diseases. Therefore, proper policies for the improvement of vaccination coverage and support of vulnerable groups could substantially mitigate the burden of COVID-19.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
1 Introduction
Several works apply Natural Language Processing (NLP) to newspaper reports. Rameshbhai et al. [1] mined opinions from headlines using Stanford NLP and SVM, comparing several algorithms on a small and a large dataset. Rubin et al. [2] created a mechanism to differentiate fake news from real news by building a set of characteristics of news according to their types. The purpose was to contribute to the low-resource data available for training machine learning algorithms. Doumit et al. [3] implemented LDA, a topic modeling approach, to study bias present in online news media.
However, little NLP research has been devoted to studying COVID-19. Most applications involve the classification of chest X-rays and CT scans to detect the presence of pneumonia in the lungs [4], a consequence of the virus. Other research areas include studying the genome sequence of the virus [5][6][7] and replicating its structure to fight it and find a vaccine. This research is crucial in battling the pandemic. The few NLP-based research publications include sentiment classification of online tweets by Samuel et al. [8] to understand the fear persisting in people due to the virus. Similar work has been done by Jelodar et al. [9], using an LSTM network to classify sentiments from online discussion forums. To the best of our knowledge, the NKK dataset is the first study on a comparatively large dataset of newspaper reports on COVID-19, contributing to awareness of the virus.
2 Data-set Introduction
2.1 Data Collection
We accumulated 1000 online newspaper reports on COVID-19 from the United States of America (USA). The newspapers include The Washington Post (USA) and StarTribune (USA). We named this dataset “Covid-News-USA-NNK”. We also accumulated 50 online newspaper reports on the issue from Bangladesh and named that dataset “Covid-News-BD-NNK”. The newspapers include The Daily Star (BD) and Prothom Alo (BD). All of these newspapers are among the leading and most widely read in their respective countries. The collection was done manually by 10 human data-collectors of age group 23- with university degrees. This approach was preferred over automation to ensure that the news reports were highly relevant to the subject. The newspapers' online sites had dynamic content with advertisements in no particular order, so there was a high chance that online scrapers would collect inaccurate news reports. One of the challenges in collecting the data was the subscription requirement: each newspaper required $1 per subscription. Some of the criteria provided as guidelines to the human data-collectors were as follows:
The headline must have one or more words directly or indirectly related to COVID-19.
The content of each news must have 5 or more keywords directly or indirectly related to COVID-19.
The genre of the news can be anything as long as it is relevant to the topic. Political, social, and economic genres are to be prioritized.
Avoid taking duplicate reports.
Maintain a time frame for the above mentioned newspapers.
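The collection itself was manual, but the first, second, and fourth criteria above can be expressed as a simple check. The sketch below is illustrative only: the keyword list is hypothetical (the source does not publish the one given to collectors), and counting every keyword occurrence rather than distinct keywords is an assumption.

```python
import re

# Hypothetical keyword list; the source does not publish the one given to collectors.
COVID_KEYWORDS = {"covid", "coronavirus", "pandemic", "quarantine", "lockdown",
                  "outbreak", "virus", "infection", "mask", "vaccine"}


def count_keyword_hits(text, keywords=COVID_KEYWORDS):
    """Count tokens in text that match a keyword exactly (case-insensitive)."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return sum(1 for token in tokens if token in keywords)


def meets_criteria(headline, body, seen_headlines):
    """Apply the headline, body, and duplicate rules from the guideline."""
    if headline in seen_headlines:
        return False  # avoid duplicate reports
    if count_keyword_hits(headline) < 1:
        return False  # headline needs at least one related word
    return count_keyword_hits(body) >= 5  # body needs five or more keywords
```

Exact token matching misses inflected forms such as "masks", and "indirectly related" words cannot be captured by a fixed list at all, which is one reason a human check remains useful.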
To collect these data, we used a Google Form for both the USA and BD datasets. Two human editors went through each entry to check for spam or troll entries.
2.2 Data Pre-processing and Statistics
Some pre-processing steps performed on the newspaper report dataset are as follows:
Remove hyperlinks.
Remove non-English alphanumeric characters.
Remove stop words.
Lemmatize text.
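The first three steps above can be sketched with the standard library alone. This is not the authors' script: the stop-word list is a tiny illustrative one, "non-English alphanumeric characters" is interpreted as anything other than ASCII letters, digits, and whitespace, and lowercasing is added as an assumption. Lemmatization is left out because it needs an external NLP library.

```python
import re

# Tiny illustrative stop-word list; a real run would use a fuller list.
STOP_WORDS = {"a", "an", "and", "the", "of", "in", "on", "to", "is", "are", "was", "for"}

URL_PATTERN = re.compile(r"https?://\S+|www\.\S+")
NON_ASCII_ALNUM = re.compile(r"[^A-Za-z0-9\s]")


def preprocess(text):
    """Remove hyperlinks, non-English alphanumeric characters, and stop words."""
    text = URL_PATTERN.sub(" ", text)        # step 1: drop hyperlinks
    text = NON_ASCII_ALNUM.sub(" ", text)    # step 2: keep ASCII letters and digits
    tokens = text.lower().split()
    return [token for token in tokens if token not in STOP_WORDS]  # step 3
```

Hyperlinks are removed first so that their fragments do not survive the character filter as stray tokens.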
While more pre-processing could have been applied, we tried to keep the data as unchanged as possible, since changing sentence structures could cause a loss of valuable information. While this was done with the help of a script, we also assigned the same human collectors to cross-check for any presence of the items listed above.
The primary data statistics of the two datasets are shown in Tables 1 and 2.
Table 1: Covid-News-USA-NNK data statistics
No. of words per headline: 7 to 20
No. of words per body content: 150 to 2100
Table 2: Covid-News-BD-NNK data statistics
No. of words per headline: 10 to 20
No. of words per body content: 100 to 1500
2.3 Dataset Repository
We used GitHub as our primary data repository under the account name NKK^1. There, we created two repositories, USA-NKK^2 and BD-NNK^3. The dataset is available in both CSV and JSON formats. We regularly update the CSV files and regenerate the JSON using a Python script. We also provide a Python script file for essential operations. We welcome outside collaboration to enrich the dataset.
3 Literature Review
Natural Language Processing (NLP) deals with text (also known as categorical) data in computer science, utilizing numerous diverse methods like one-hot encoding, word embedding, etc., that transform text to machine language, which can be fed to multiple machine learning and deep learning algorithms.
Some well-known applications of NLP include fraud detection on online media sites [10], authorship attribution in fallback authentication systems [11], intelligent conversational agents or chatbots [12], and the machine translation used by Google Translate [13]. While these are all downstream tasks, several exciting developments have been made in algorithms designed solely for Natural Language Processing tasks. The two most prominent are BERT [14], a bidirectional Transformer encoder model that achieves strong results on classification tasks, and the GPT-3 models released by OpenAI [15], which can generate almost human-like text. However, these are all used as pre-trained models, since training them carries a huge computation cost. Information extraction is the general concept of retrieving information from a dataset. Information extraction from an image could mean retrieving vital feature spaces or targeted portions of the image; information extraction from speech could mean retrieving information about names, places, etc. [16]. Information extraction from text could mean identifying named entities, locations, or other essential data. Topic modeling is a sub-task of NLP and also a process of information extraction. It clusters words and phrases of the same context together into groups. Topic modeling is an unsupervised learning method that gives a brief idea of what a set of texts is about. One commonly used topic modeling method is Latent Dirichlet Allocation, or LDA [17].
Keyword extraction is a process of information extraction and a sub-task of NLP that extracts essential words and phrases from a text. TextRank [18] is an efficient keyword extraction technique that builds a graph over the words of a text, calculates a weight for each word, and selects the words with the highest weights.
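The graph-based weighting idea can be sketched as follows. This is a simplified, unweighted variant written for illustration, not the reference TextRank implementation and not the script used in this work: the window size, damping factor, and iteration count are assumed defaults, and the part-of-speech filtering that the technique normally applies is omitted.

```python
from collections import defaultdict


def textrank_keywords(tokens, window=2, damping=0.85, iterations=50, top_k=3):
    """Rank words by PageRank over an undirected co-occurrence graph."""
    # Link each word to the words that appear within `window` positions after it.
    neighbours = defaultdict(set)
    for i, word in enumerate(tokens):
        for other in tokens[i + 1:i + 1 + window]:
            if other != word:
                neighbours[word].add(other)
                neighbours[other].add(word)

    # Power iteration for the (unweighted) PageRank score of every node.
    scores = {word: 1.0 for word in neighbours}
    for _ in range(iterations):
        updated = {}
        for word in neighbours:
            incoming = sum(scores[n] / len(neighbours[n]) for n in neighbours[word])
            updated[word] = (1.0 - damping) + damping * incoming
        scores = updated

    # Highest score first; ties are broken alphabetically for determinism.
    ranked = sorted(scores, key=lambda w: (-scores[w], w))
    return ranked[:top_k]
```

A word scores highly when it co-occurs with many other words that are themselves well connected, so frequent, central terms rise to the top without any labelled training data.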
Word clouds are a great visualization technique to understand the overall ’talk of the topic’. The clustered words give us a quick understanding of the content.
4 Our experiments and Result analysis
We used the wordcloud library^4 to create the word clouds. Figures 1 and 3 present the word clouds of the Covid-News-USA-NNK dataset by month from February to May. From Figures 1, 2, and 3, we can make the following observations:
In February, both newspapers talked about China and the source of the outbreak.
StarTribune emphasized Minnesota as the state of most concern. In April, this concern appeared to increase.
Both newspapers talked about the virus impacting the economy, i.e., banks, elections, administrations, and markets.
The Washington Post discussed global issues more than StarTribune.
In February, StarTribune mentioned the first precautionary measure, wearing masks, and the uncontrollable spread of the virus throughout the nation.
While both newspapers mentioned the outbreak in China in February, the spread in the United States is given more weight from March through May, displaying the critical impact caused by the virus.
We used a script to extract all numbers associated with certain keywords, such as ’Deaths’, ’Infected’, ’Died’, ’Infections’, ’Quarantined’, ’Lock-down’, and ’Diagnosed’, from the news reports and built a number-of-cases series for both newspapers. Figure 4 shows the statistics of this series. From this extraction, we observe that April was the peak month for COVID cases, which rose gradually from February. Both newspapers show that the rise in cases from February to March was slower than the rise from March to April. This is an important indicator of possible recklessness in preparations to battle the virus. However, the steep fall from April to May also shows the positive response against the attack.
We used Vader Sentiment Analysis to extract the sentiment of the headlines and the body. The Vader sentiment scale ranges from -1 (highly negative) to 1 (highly positive). On average, the sentiments were between -0.5 and -0.9. There were some cases where the sentiment scores of the headline and body contradicted each other, i.e., the sentiment of the headline was negative but the sentiment of the body was slightly positive. Overall, sentiment analysis can help us sort the most concerning (most negative) news from the positive news, from which we can learn more about the indicators related to COVID-19 and its serious impact. Moreover, sentiment analysis can also tell us how a state or country is reacting to the pandemic.
We used the PageRank algorithm to extract keywords from headlines as well as the body content. PageRank efficiently highlights important, relevant keywords in the text. Some frequently occurring important keywords extracted from both datasets are: ’China’, ’Government’, ’Masks’, ’Economy’, ’Crisis’, ’Theft’, ’Stock market’, ’Jobs’, ’Election’, ’Missteps’, ’Health’, ’Response’. Keyword extraction acts as a filter allowing quick searches for indicators in case of locating situations of the economy,
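The keyword-based number extraction described in this section can be sketched with a regular expression. The authors' actual script is not shown here, so this is only a guess at the idea: the keyword list is taken from the text, while the pattern, the three-word gap allowance, and the name `extract_counts` are assumptions made for illustration.

```python
import re

# Keywords named in the description, lower-cased.
KEYWORDS = ["deaths", "infected", "died", "infections", "quarantined", "lockdown", "diagnosed"]

# A number (with optional thousands separators), then up to three
# intervening words, then one of the keywords.
PATTERN = re.compile(
    r"(\d{1,3}(?:,\d{3})+|\d+)\s+(?:\w+\s+){0,3}?(" + "|".join(KEYWORDS) + r")\b",
    re.IGNORECASE,
)


def extract_counts(text):
    """Return (number, keyword) pairs found in a news text."""
    return [(int(num.replace(",", "")), kw.lower()) for num, kw in PATTERN.findall(text)]


report = "Officials reported 1,250 new deaths and 300 people were diagnosed on Monday."
pairs = extract_counts(report)  # [(1250, 'deaths'), (300, 'diagnosed')]
```

Summing such pairs per month would give a series comparable in spirit to the one plotted in Figure 4, although a pattern this simple will also pick up numbers that are not case counts.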
https://creativecommons.org/publicdomain/zero/1.0/
An active discussion about the mortality data in Moscow has erupted in recent days. The Moscow Times newspaper drew attention to a significant increase in official mortality rates in April 2020: "Moscow recorded 20% more fatalities in April 2020 compared to its average April mortality total over the past decade, according to newly published preliminary data from Moscow’s civil registry office. The data comes as Russia sees the fastest growth in coronavirus infections in Europe, while its mortality rate remains much lower than in many countries. Moscow, the epicenter of Russia’s coronavirus outbreak, has continued to see daily spikes in new cases despite being under lockdown since March 30. According to the official data, 11,846 people died in Russia’s capital in April of this year, roughly a 20% increase from the 10-year average for April deaths, which is 9,866. The numbers suggest that the city’s statistics of coronavirus deaths may be higher in reality than official numbers indicate. Russia boasts a relatively low coronavirus mortality rate of 0.9%, which experts believe is linked to the way coronavirus-related deaths are counted."
After this publication was released, the Moscow Department of Health denied the claim that the count was inaccurate:
First, Moscow is a region that openly publishes mortality data on its websites. Moscow on an initiative basis published data for April before the federal structures did it.
Secondly, the comparison of mortality rates in the monthly dynamics is incorrect and is not a clear evidence of any trends. In April 2020, indeed, according to the Civil Registry Office in Moscow, 11,846 death certificates were issued. So, the increase compared to April 2019 amounted to 1841 people, and compared to the same month of 2018 - 985 people, i.e. 2 times less.
Thirdly, the diagnosis of coronavirus-infected deaths in Moscow is established after a mandatory autopsy is performed in strict accordance with the Provisional Guidelines of the Russian Ministry of Health. Of the total number of deaths in April 2020, 639 are people whose cause of death is coronavirus infection and its complications, most often pneumonia. It should be emphasized that the pathological autopsy of the dead with suspected CoV-19 in Russia and Moscow is carried out in 100% of cases, unlike most other countries. It is impossible to name the cause of death of COVID-19 in other cases. For example, over 60% of deaths occurred from obvious alternative causes, such as vascular accidents (myocardial infarction and stroke), stage 4 malignant diseases (essentially palliative patients), leukemia, systemic diseases with the development of organ failure (e.g. amyloidosis and terminal renal insufficiency) and other non-curable deadly diseases.
Fourth, any seasonal increase in the incidence of SARS, not to mention the pandemic caused by the spread of the new coronavirus, is always accompanied by an increase in mortality. This is due to the appearance of the dead directly from an infectious disease, but to an even greater extent from other diseases, the exacerbation of which and the decompensation of the condition of patients suffering from these diseases also leads to death. In these cases, the infectious onset is a catalyst for the rapid progression of chronic diseases and the manifestation of new diseases.
Fifthly, a similar situation with statistics is observed in other countries - mortality from COVID-19 is lower than the overall increase in mortality. According to the official sites of cities: In New York, mortality from coronavirus in April amounted to 11,861 people. At the same time, the total increase in mortality compared to the same period in 2019 is 15709. In London, in April, 3,589 people died with a diagnosis of coronavirus, while the total increase was 5531.
Sixth, even if all the additional mortality for April in Moscow is attributed to coronavirus, the mortality from COVID will be slightly more than 3%, which is lower than the official mortality in New York and London (10% and 23%, respectively). Moreover, if you make such a recount in these cities, the mortality rate in them will be 13% and 32%, respectively.
Seventh, Moscow is open for discussion and is ready to share experience with both Russian and foreign experts.
I think community members would be interested in studying the data on mortality in the Russian capital themselves and conducting a competent statistical check.
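A first step in such a check is to reproduce the excess-mortality arithmetic from the figures quoted above. The sketch below uses only numbers that appear in the two statements; the helper name `excess_deaths` is hypothetical, and the April 2019 and April 2018 totals are not quoted directly but implied by the stated increases.

```python
def excess_deaths(observed, baseline):
    """Return (absolute excess, percent excess) of observed deaths over a baseline."""
    excess = observed - baseline
    return excess, 100.0 * excess / baseline


april_2020 = 11_846               # death certificates issued in April 2020
ten_year_avg = 9_866              # 10-year April average quoted by The Moscow Times
april_2019 = april_2020 - 1_841   # implied by the stated increase over April 2019
april_2018 = april_2020 - 985     # implied by the stated increase over April 2018
covid_deaths = 639                # deaths attributed to coronavirus in April 2020

baselines = {"10-year average": ten_year_avg, "April 2019": april_2019, "April 2018": april_2018}
for label, baseline in baselines.items():
    excess, pct = excess_deaths(april_2020, baseline)
    share = 100.0 * covid_deaths / excess
    print(f"{label}: excess {excess} ({pct:.1f}%), COVID-attributed share of excess {share:.1f}%")
```

Against the 10-year average the excess is 1,980 deaths, or about 20.1%, which matches the "roughly 20%" in the newspaper quote. The choice of baseline clearly changes the result, which is one point the two statements dispute.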
This may be of particular interest given that the [US announced a grant of $250 thousand to "expose the disinformation of health care" in Russia](https://www....
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Objectives: Analyzing and comparing COVID-19 infection and case-fatality rates across different regions can help improve our response to future pandemics. Methods: We used public data from the WHO to calculate and compare the COVID-19 infection and case-fatality rates in different continents and income levels from 2019 to 2023. Results: The global prevalence of COVID-19 increased from 0.011 to 0.098, while case fatality rates declined from 0.024 to 0.009. Europe reported the highest cumulative infection rate (0.326), with Africa showing the lowest (0.011). Conversely, Africa experienced the highest cumulative case fatality rates (0.020), with Oceania the lowest (0.002). Infection rates in Asia showed a steady increase in contrast to other continents which observed initial rises followed by decreases. A correlation between economic status and infection rates was identified; high-income countries had the highest cumulative infection rate (0.353) and lowest case fatality rate (0.006). Low-income countries showed low cumulative infection rates (0.006) but the highest case fatality rate (0.016). Initially, high and upper-middle-income countries experienced elevated initial infection and case fatality rates, which subsequently underwent significant reductions. Conclusions: COVID-19 rates varied significantly by continent and income level. Europe and the Americas faced surges in infections and low case fatality rates. In contrast, Africa experienced low infection rates and higher case fatality rates, with lower- and middle-income nations exceeding case fatality rates in high-income countries over time.
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Here we investigated whether the dengue fever pandemic of 2019-2020 may have influenced COVID-19 incidence and spread around the world. In Brazil, the geographic distribution of dengue fever was highly complementary to that of COVID-19. This was accompanied by an inverse correlation between COVID-19 and dengue fever incidence that could not be explained by socioeconomic factors. This inverse correlation was observed for 5,016 Brazilian municipalities reporting COVID-19 cases, 558 micro- and 137 meso-regions, 27 states and 5 regions. Brazilian states with high population levels of dengue IgM in 2020 exhibited: (i) lower COVID-19 case and death incidence, (ii) slower infection growth rates, and (iii) took longer to accumulate COVID-19 cases. No such inverse correlations were observed for the chikungunya virus, which is also transmitted by the Aedes aegypti mosquito. The same inverse correlation between COVID-19 and dengue fever incidence was observed for 145 locations (66 countries and the 64 states of Mexico and Colombia) in Latin America, the Caribbean, and Asia. Countries with high dengue incidence took longer to accumulate COVID-19 cases than those without dengue. Although the dataset considered has quality and availability limitations, these findings raise the possibility of an immunological cross-reaction between dengue virus serotypes and SARS-CoV-2, which could have led to partial immunological protection for COVID-19 in dengue infected communities. However, further studies are necessary to better test this hypothesis.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Background: By February 2021, the overall impact of coronavirus disease 2019 (COVID-19) in South and Southeast Asia was relatively mild. Surprisingly, in early April 2021, the second wave significantly impacted the population and garnered widespread international attention. Methods: This study focused on the nine countries with the highest cumulative deaths from the disease as of August 17, 2021. We look at COVID-19 transmission dynamics in South and Southeast Asia using the reported death data, which fits a mathematical model with a time-varying transmission rate. Results: We estimated the transmission rate, infection fatality rate (IFR), infection attack rate (IAR), and the effects of vaccination in the nine countries in South and Southeast Asia. Our study suggested that the IAR is still low in most countries, and increased vaccination is required to prevent future waves. Conclusion: Implementing non-pharmacological interventions (NPIs) could have helped South and Southeast Asia keep COVID-19 under control in 2020, as demonstrated in our estimated low-transmission rate. We believe that the emergence of the new Delta variant, social unrest, and migrant workers could have triggered the second wave of COVID-19.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Coronavirus disease 2019 (COVID-19) has developed into a global pandemic, affecting every nation and territory in the world. Machine learning-based approaches are useful when trying to understand the complexity behind the spread of the disease and how to contain its spread effectively. The unsupervised learning method could be useful to evaluate the shortcomings of health facilities in areas of increased infection as well as what strategies are necessary to prevent disease spread within or outside of the country. To contribute toward the well-being of society, this paper focusses on the implementation of machine learning techniques for identifying common prevailing public health care facilities and concerns related to COVID-19 as well as attitudes to infection prevention strategies held by people from different countries concerning the current pandemic situation. Regression tree, random forest, cluster analysis and principal component machine learning techniques are used to analyze the global COVID-19 data of 133 countries obtained from the Worldometer website as of April 17, 2020. The analysis revealed that there are four major clusters among the countries. The eight countries with the highest cumulative infected cases and deaths form the first cluster. Seven countries, United States, Spain, Italy, France, Germany, United Kingdom, and Iran, play a vital role in explaining 60% of the total variation by use of the first component, which is characterized by all variables except for the rate variables. The remaining countries explain only 20% of the total variation by use of the second component, which is characterized by only rate variables. Most strikingly, the analysis found that the variable number of tests by the country did not play a vital role in the prediction of the cumulative number of confirmed cases.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2) has caused the Coronavirus Disease 2019 (COVID-19) worldwide pandemic in 2020. In response, most countries in the world implemented lockdowns, restricting their population's movements, work, education, gatherings, and general activities in attempt to “flatten the curve” of COVID-19 cases. The public health goal of lockdowns was to save the population from COVID-19 cases and deaths, and to prevent overwhelming health care systems with COVID-19 patients. In this narrative review I explain why I changed my mind about supporting lockdowns. The initial modeling predictions induced fear and crowd-effects (i.e., groupthink). Over time, important information emerged relevant to the modeling, including the lower infection fatality rate (median 0.23%), clarification of high-risk groups (specifically, those 70 years of age and older), lower herd immunity thresholds (likely 20–40% population immunity), and the difficult exit strategies. In addition, information emerged on significant collateral damage due to the response to the pandemic, adversely affecting many millions of people with poverty, food insecurity, loneliness, unemployment, school closures, and interrupted healthcare. Raw numbers of COVID-19 cases and deaths were difficult to interpret, and may be tempered by information placing the number of COVID-19 deaths in proper context and perspective relative to background rates. Considering this information, a cost-benefit analysis of the response to COVID-19 finds that lockdowns are far more harmful to public health (at least 5–10 times so in terms of wellbeing years) than COVID-19 can be. Controversies and objections about the main points made are considered and addressed. Progress in the response to COVID-19 depends on considering the trade-offs discussed here that determine the wellbeing of populations. 
I close with some suggestions for moving forward, including focused protection of those truly at high risk, opening of schools, and building back better with a economy.
https://spdx.org/licenses/CC0-1.0.html
Morbidity and mortality attributable to COVID-19 is devastating global health systems and economies. Bacillus Calmette Guérin (BCG) vaccination has been in use for many decades to prevent severe forms of tuberculosis in children. Studies have also shown a combination of improved long-term innate or trained immunity (through epigenetic reprogramming of myeloid cells) and adaptive responses after BCG vaccination, which leads to non-specific protective effects in adults. Observational studies have shown that countries with routine BCG vaccination programs have significantly less reported cases and deaths of COVID-19, but such studies are prone to significant bias and need confirmation. To date, in the absence of direct evidence, WHO does not recommend BCG for the prevention of COVID-19. This project aims to investigate in a timely manner whether and why BCG-revaccination can reduce infection rate and/or disease severity in health care workers during the SARS-CoV-2 outbreak in South Africa. These objectives will be achieved with a blinded, randomised controlled trial of BCG revaccination versus placebo in exposed front-line staff in hospitals in Cape Town. Observations will include the rate of infection with COVID-19 as well as the occurrence of mild, moderate or severe ambulatory respiratory tract infections, hospitalisation, need for oxygen, mechanical ventilation or death. HIV-positive individuals will be excluded. Safety of the vaccines will be monitored. A secondary endpoint is the occurrence of latent or active tuberculosis. Initial sample size and follow-up duration is at least 500 workers and 52 weeks. Statistical analysis will be model-based and ongoing in real time with frequent interim analyses and optional increases of both sample size or observation time, based on the unforeseeable trajectory of the South African COVID-19 epidemic, available funds and recommendations of an independent data and safety monitoring board. 
The study will be supported by a novel 3D lung organoid model of SARS-CoV-2 infection system that can mimic the cascade of immunological events after SARS-CoV-2 infection to determine and analyse the contribution of cellular components to the impact of BCG revaccination in this study. Given the immediate threat of the SARS-CoV-2 epidemic the trial has been designed as a pragmatic study with highly feasible endpoints that can be continuously measured. This allows for the most rapid identification of a beneficial outcome that would lead to immediate dissemination of the results, vaccination of the control group and outreach to the health authorities to consider BCG vaccination for all qualifying health care workers. Methods: This dataset was collected in a clinical randomised control trial under the TASK008-BCG CORONA protocol. The trial was conducted in South Africa. This trial was registered with ClinicalTrials.gov, NCT04379336.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Introduction: There have been large geographical differences in the infection and death rates of COVID-19. Foods and beverages containing high amounts of phytochemicals with bioactive properties were suggested to prevent contracting and to facilitate recovery from COVID-19. The goal of our study was to determine the correlation of the type of foods/beverages people consumed and the risk reduction of contracting COVID-19 and the recovery from COVID-19. Methods: We developed an online survey that asked the participants whether they contracted COVID-19, their symptoms, time to recover, and their frequency of eating various types of foods/beverages. The survey was developed in 10 different languages. Results: The participants who did not contract COVID-19 consumed vegetables, herbs/spices, and fermented foods/beverages significantly more than the participants who contracted COVID-19. Among the six countries (India/Iran/Italy/Japan/Russia/Spain) with over 100 participants and high correspondence between the location of the participants and the language of the survey, in India and Japan the people who contracted COVID-19 showed significantly shorter recovery time, and greater daily intake of vegetables, herbs/spices, and fermented foods/beverages was associated with faster recovery. Conclusions: Our results suggest that phytochemical compounds included in the vegetables may have contributed in not only preventing contraction of COVID-19, but also accelerating their recovery.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Objective: The goal of this study was to dynamically model next-wave scenarios to observe the impact of different lockdown measures on the infection rates (IR) and mortality for two different prototype countries, mimicking the 1st year of the COVID-19 pandemic in Europe. Methods: A dynamic simulation SIRD model was designed to assess the effectiveness of policy measures on four next-wave scenarios, each preceded by two different lockdowns. The four scenarios were (1) no-measures, (2) uniform measures, (3) differential measures based on isolating > 60 years of age group, and (4) differential measures with additional contact reduction measures for the 20–60 years of age group. The dynamic simulation model was prepared for two prototype European countries, Northwestern (NW) and Southern (S) country. Both prototype countries were characterized based on age composition and contact matrix. Results: The results show that the outcomes of the next-wave scenarios depend on number of infections of previous lockdowns. All scenarios reduce the incremental deaths compared with a no-measures scenario. Differential measures show lower number of deaths despite an increase of infections. Additionally, prototype S shows overall more deaths compared with prototype NW due to a higher share of older citizens. Conclusion: This study shows that differential measures are a worthwhile option for controlling the COVID-19 epidemic. This may also be the case in situations where relevant parts of the population have taken up vaccination. Additionally, the effectiveness of interventions strongly depends on the number of previously infected individuals. The results of this study may be useful when planning and forecasting the impact of non-pharmacological interventions and vaccination campaigns.
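For readers unfamiliar with SIRD models, the sketch below shows the basic compartment dynamics (Susceptible, Infected, Recovered, Dead) with a simple Euler integration. It is not the study's model: it has a single age group and no contact matrix, and the parameter values, the halved transmission rate standing in for "uniform measures", and the function name `simulate_sird` are assumptions chosen only for illustration.

```python
def simulate_sird(population, infected0, beta, gamma, mu, days, dt=0.1):
    """Integrate a single-group SIRD model with Euler steps.

    beta: transmission rate, gamma: recovery rate, mu: death rate (all per day).
    Returns one (S, I, R, D) tuple per day, starting with day 0.
    """
    s, i, r, d = population - infected0, float(infected0), 0.0, 0.0
    history = [(s, i, r, d)]
    steps_per_day = round(1 / dt)
    for _ in range(days):
        for _ in range(steps_per_day):
            new_infections = beta * s * i / population * dt
            new_recoveries = gamma * i * dt
            new_deaths = mu * i * dt
            s -= new_infections
            i += new_infections - new_recoveries - new_deaths
            r += new_recoveries
            d += new_deaths
        history.append((s, i, r, d))
    return history


# Hypothetical scenarios: measures are modelled only as a lower transmission rate.
no_measures = simulate_sird(1_000_000, 10, beta=0.30, gamma=0.1, mu=0.001, days=200)
uniform_measures = simulate_sird(1_000_000, 10, beta=0.15, gamma=0.1, mu=0.001, days=200)

deaths_no_measures = no_measures[-1][3]
deaths_uniform = uniform_measures[-1][3]
```

Representing the differential scenarios from the abstract would require splitting each compartment by age group and replacing the single `beta` with a contact matrix, which this sketch deliberately leaves out.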
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Early detection and isolation of COVID-19 patients are essential for successful implementation of mitigation strategies and eventually curbing the disease spread. With a limited number of daily COVID-19 tests performed in every country, simulating the COVID-19 spread along with the potential effect of each mitigation strategy currently remains one of the most effective ways in managing the healthcare system and guiding policy-makers. We introduce COVIDHunter, a flexible and accurate COVID-19 outbreak simulation model that evaluates the current mitigation measures that are applied to a region, predicts COVID-19 statistics (the daily number of cases, hospitalizations, and deaths), and provides suggestions on what strength the upcoming mitigation measure should be. The key idea of COVIDHunter is to quantify the spread of COVID-19 in a geographical region by simulating the average number of new infections caused by an infected person considering the effect of external factors, such as environmental conditions (e.g., climate, temperature, humidity), different variants of concern, vaccination rate, and mitigation measures. Using Switzerland as a case study, COVIDHunter estimates that we are experiencing a deadly new wave that will peak on 26 January 2022, which is very similar in numbers to the wave we had in February 2020. The policy-makers have only one choice that is to increase the strength of the currently applied mitigation measures for 30 days. Unlike existing models, the COVIDHunter model accurately monitors and predicts the daily number of cases, hospitalizations, and deaths due to COVID-19. Our model is flexible to configure and simple to modify for modeling different scenarios under different environmental conditions and mitigation measures. 
We release the source code of the COVIDHunter implementation at https://github.com/CMU-SAFARI/COVIDHunter and show how to flexibly configure our model for any scenario and easily extend it for different measures and conditions than we account for.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Autoimmune bullous diseases (AIBDs) are a heterogeneous group of life-threatening disorders associated with subepidermal or intraepidermal blistering. Skin barrier alterations and prolonged immunosuppressive treatments increase the risk of infections in patients with AIBDs, who are considered fragile. COVID-19 pandemic had a heavy impact on these patients. Although advances have been made in terms of prevention and treatment of COVID-19, this topic remains significant as the pandemic and its waves could last several years and, so far, a relevant proportion of the population worldwide is not vaccinated. This review is a 2022 update that summarizes and discusses the pandemic’s burden on AIBD patients mainly considering relevant studies in terms of: (i) sample dimension; (ii) quality of control populations; (iii) possible standardization by age, gender and country. The findings show that: (i) the risk of COVID-19 infection and its severe course were comparable in AIBD patients and in the general population, except for rituximab-treated patients that presented a higher risk of infection and severe disease; (ii) the mortality rate in COVID-19-infected bullous pemphigoid patients was higher than in the general population, (iii) 121 cases of AIBD onset and 185 cases of relapse or exacerbation occurred after COVID-19 vaccination and a causal relationship has not been demonstrated so far. Altogether, acquired knowledge on COVID-19 pandemic could also be important in possible, albeit undesirable, future pandemic scenarios.
https://www.usa.gov/government-works
Reporting of new Aggregate Case and Death Count data was discontinued May 11, 2023, with the expiration of the COVID-19 public health emergency declaration. This dataset will receive a final update on June 1, 2023, to reconcile historical data through May 10, 2023, and will remain publicly available.
Aggregate Data Collection Process Since the start of the COVID-19 pandemic, data have been gathered through a robust process with the following steps:
Methodology Changes Several differences exist between the current, weekly-updated dataset and the archived version:
Confirmed and Probable Counts In this dataset, counts by jurisdiction are not displayed by confirmed or probable status. Instead, confirmed and probable cases and deaths are included in the Total Cases and Total Deaths columns, when available. Not all jurisdictions report probable cases and deaths to CDC.* Confirmed and probable case definition criteria are described here:
Council of State and Territorial Epidemiologists (ymaws.com).
Deaths CDC reports death data on other sections of the website: CDC COVID Data Tracker: Home, CDC COVID Data Tracker: Cases, Deaths, and Testing, and NCHS Provisional Death Counts. Information presented on the COVID Data Tracker pages is based on the same source (total case counts) as the present dataset; however, NCHS Death Counts are based on death certificates that use information reported by physicians, medical examiners, or coroners in the cause-of-death section of each certificate. Data from each of these pages are considered provisional (not complete and pending verification) and are therefore subject to change. Counts from previous weeks are continually revised as more records are received and processed.
Number of Jurisdictions Reporting There are currently 60 public health jurisdictions reporting cases of COVID-19. This includes the 50 states, the District of Columbia, New York City, the U.S. territories of American Samoa, Guam, the Commonwealth of the Northern Mariana Islands, Puerto Rico, and the U.S Virgin Islands as well as three independent countries in compacts of free association with the United States, Federated States of Micronesia, Republic of the Marshall Islands, and Republic of Palau. New York State’s reported case and death counts do not include New York City’s counts as they separately report nationally notifiable conditions to CDC.
CDC COVID-19 data are available to the public as summary or aggregate count files, including total counts of cases and deaths, available by state and by county. These and other data on COVID-19 are available from multiple public locations, such as:
https://www.cdc.gov/coronavirus/2019-ncov/cases-updates/cases-in-us.html
https://www.cdc.gov/covid-data-tracker/index.html
https://www.cdc.gov/coronavirus/2019-ncov/covid-data/covidview/index.html
https://www.cdc.gov/coronavirus/2019-ncov/php/open-america/surveillance-data-analytics.html
Additional COVID-19 public use datasets, including line-level (patient-level) data, are available at: https://data.cdc.gov/browse?tags=covid-19.
Archived Data Notes:
November 3, 2022: Due to a reporting cadence issue, case rates for Missouri counties are calculated based on 11 days’ worth of case count data in the Weekly United States COVID-19 Cases and Deaths by State data released on November 3, 2022, instead of the customary 7 days’ worth of data.
November 10, 2022: Due to a reporting cadence change, case rates for Alabama counties are calculated based on 13 days’ worth of case count data in the Weekly United States COVID-19 Cases and Deaths by State data released on November 10, 2022, instead of the customary 7 days’ worth of data.
November 10, 2022: Per the request of the jurisdiction, cases and deaths among non-residents have been removed from all Hawaii county totals throughout the entire time series. Cumulative case and death counts reported by CDC will no longer match Hawaii’s COVID-19 Dashboard, which still includes non-resident cases and deaths.
November 17, 2022: Two new columns, weekly historic cases and weekly historic deaths, were added to this dataset on November 17, 2022. These columns reflect case and death counts that were reported that week but were historical in nature and not reflective of the current burden within the jurisdiction. These historical cases and deaths are not included in the new weekly case and new weekly death columns; however, they are reflected in the cumulative totals provided for each jurisdiction. These data are used to account for artificial increases in case and death totals due to batched reporting of historical data.
December 1, 2022: Due to cadence changes over the Thanksgiving holiday, case rates for all Ohio counties are reported as 0 in the data released on December 1, 2022.
January 5, 2023: Due to North Carolina’s holiday reporting cadence, aggregate case and death data will contain 14 days’ worth of data instead of the customary 7 days. As a result, case and death metrics will appear higher than expected in the January 5, 2023, weekly release.
January 12, 2023: Due to data processing delays, Mississippi’s aggregate case and death data will be reported as 0. As a result, case and death metrics will appear lower than expected in the January 12, 2023, weekly release.
January 19, 2023: Due to a reporting cadence issue, Mississippi’s aggregate case and death data will be calculated based on 14 days’ worth of data instead of the customary 7 days in the January 19, 2023, weekly release.
January 26, 2023: Due to a reporting backlog of historic COVID-19 cases, case rates for two Michigan counties (Livingston and Washtenaw) were higher than expected in the January 19, 2023 weekly release.
January 26, 2023: Due to a backlog of historic COVID-19 cases being reported this week, aggregate case and death counts in Charlotte County and Sarasota County, Florida, will appear higher than expected in the January 26, 2023 weekly release.
January 26, 2023: Due to data processing delays, Mississippi’s aggregate case and death data will be reported as 0 in the weekly release posted on January 26, 2023.
February 2, 2023: As of the data collection deadline, CDC observed an abnormally large increase in aggregate COVID-19 cases and deaths reported for Washington State. In response, totals for new cases and new deaths released on February 2, 2023, have been displayed as zero at the state level until the issue is addressed with state officials. CDC is working with state officials to address the issue.
February 2, 2023: Due to a decrease reported in cumulative case counts by Wyoming, case rates will be reported as 0 in the February 2, 2023, weekly release. CDC is working with state officials to verify the data submitted.
February 16, 2023: Due to data processing delays, Utah’s aggregate case and death data will be reported as 0 in the weekly release posted on February 16, 2023. As a result, case and death metrics will appear lower than expected and should be interpreted with caution.
February 16, 2023: Due to a reporting cadence change, Maine’s