Facebook
TwitterThe COVID-19 dashboard includes data on city/town COVID-19 activity, confirmed and probable cases of COVID-19, confirmed and probable deaths related to COVID-19, and the demographic characteristics of cases and deaths.
Facebook
TwitterNotice of data discontinuation: Since the start of the pandemic, AP has reported case and death counts from data provided by Johns Hopkins University. Johns Hopkins University has announced that they will stop their daily data collection efforts after March 10. As Johns Hopkins stops providing data, the AP will also stop collecting daily numbers for COVID cases and deaths. The HHS and CDC now collect and visualize key metrics for the pandemic. AP advises using those resources when reporting on the pandemic going forward.
April 9, 2020
April 20, 2020
April 29, 2020
September 1st, 2020
February 12, 2021
new_deaths column.February 16, 2021
The AP is using data collected by the Johns Hopkins University Center for Systems Science and Engineering as our source for outbreak caseloads and death counts for the United States and globally.
The Hopkins data is available at the county level in the United States. The AP has paired this data with population figures and county rural/urban designations, and has calculated caseload and death rates per 100,000 people. Be aware that caseloads may reflect the availability of tests -- and the ability to turn around test results quickly -- rather than actual disease spread or true infection rates.
This data is from the Hopkins dashboard that is updated regularly throughout the day. Like all organizations dealing with data, Hopkins is constantly refining and cleaning up their feed, so there may be brief moments where data does not appear correctly. At this link, you’ll find the Hopkins daily data reports, and a clean version of their feed.
The AP is updating this dataset hourly at 45 minutes past the hour.
To learn more about AP's data journalism capabilities for publishers, corporations and financial institutions, go here or email kromano@ap.org.
Use AP's queries to filter the data or to join to other datasets we've made available to help cover the coronavirus pandemic
Filter cases by state here
Rank states by their status as current hotspots. Calculates the 7-day rolling average of new cases per capita in each state: https://data.world/associatedpress/johns-hopkins-coronavirus-case-tracker/workspace/query?queryid=481e82a4-1b2f-41c2-9ea1-d91aa4b3b1ac
Find recent hotspots within your state by running a query to calculate the 7-day rolling average of new cases by capita in each county: https://data.world/associatedpress/johns-hopkins-coronavirus-case-tracker/workspace/query?queryid=b566f1db-3231-40fe-8099-311909b7b687&showTemplatePreview=true
Join county-level case data to an earlier dataset released by AP on local hospital capacity here. To find out more about the hospital capacity dataset, see the full details.
Pull the 100 counties with the highest per-capita confirmed cases here
Rank all the counties by the highest per-capita rate of new cases in the past 7 days here. Be aware that because this ranks per-capita caseloads, very small counties may rise to the very top, so take into account raw caseload figures as well.
The AP has designed an interactive map to track COVID-19 cases reported by Johns Hopkins.
@(https://datawrapper.dwcdn.net/nRyaf/15/)
<iframe title="USA counties (2018) choropleth map Mapping COVID-19 cases by county" aria-describedby="" id="datawrapper-chart-nRyaf" src="https://datawrapper.dwcdn.net/nRyaf/10/" scrolling="no" frameborder="0" style="width: 0; min-width: 100% !important;" height="400"></iframe><script type="text/javascript">(function() {'use strict';window.addEventListener('message', function(event) {if (typeof event.data['datawrapper-height'] !== 'undefined') {for (var chartId in event.data['datawrapper-height']) {var iframe = document.getElementById('datawrapper-chart-' + chartId) || document.querySelector("iframe[src*='" + chartId + "']");if (!iframe) {continue;}iframe.style.height = event.data['datawrapper-height'][chartId] + 'px';}}});})();</script>
Johns Hopkins timeseries data - Johns Hopkins pulls data regularly to update their dashboard. Once a day, around 8pm EDT, Johns Hopkins adds the counts for all areas they cover to the timeseries file. These counts are snapshots of the latest cumulative counts provided by the source on that day. This can lead to inconsistencies if a source updates their historical data for accuracy, either increasing or decreasing the latest cumulative count. - Johns Hopkins periodically edits their historical timeseries data for accuracy. They provide a file documenting all errors in their timeseries files that they have identified and fixed here
This data should be credited to Johns Hopkins University COVID-19 tracking project
Facebook
Twitterhttps://github.com/nytimes/covid-19-data/blob/master/LICENSEhttps://github.com/nytimes/covid-19-data/blob/master/LICENSE
The New York Times is releasing a series of data files with cumulative counts of coronavirus cases in the United States, at the state and county level, over time. We are compiling this time series data from state and local governments and health departments in an attempt to provide a complete record of the ongoing outbreak.
Since the first reported coronavirus case in Washington State on Jan. 21, 2020, The Times has tracked cases of coronavirus in real time as they were identified after testing. Because of the widespread shortage of testing, however, the data is necessarily limited in the picture it presents of the outbreak.
We have used this data to power our maps and reporting tracking the outbreak, and it is now being made available to the public in response to requests from researchers, scientists and government officials who would like access to the data to better understand the outbreak.
The data begins with the first reported coronavirus case in Washington State on Jan. 21, 2020. We will publish regular updates to the data in this repository.
Facebook
TwitterAs of March 10, 2023, the state with the highest number of COVID-19 cases was California. Almost 104 million cases have been reported across the United States, with the states of California, Texas, and Florida reporting the highest numbers.
From an epidemic to a pandemic The World Health Organization declared the COVID-19 outbreak a pandemic on March 11, 2020. The term pandemic refers to multiple outbreaks of an infectious illness threatening multiple parts of the world at the same time. When the transmission is this widespread, it can no longer be traced back to the country where it originated. The number of COVID-19 cases worldwide has now reached over 669 million.
The symptoms and those who are most at risk Most people who contract the virus will suffer only mild symptoms, such as a cough, a cold, or a high temperature. However, in more severe cases, the infection can cause breathing difficulties and even pneumonia. Those at higher risk include older persons and people with pre-existing medical conditions, including diabetes, heart disease, and lung disease. People aged 85 years and older have accounted for around 27 percent of all COVID-19 deaths in the United States, although this age group makes up just two percent of the U.S. population
Facebook
TwitterNote: This dataset is no longer being updated as of June 2, 2025.
This dataset contains numbers of COVID-19 outbreaks and associated cases, categorized by setting, reported to CDPH since January 1, 2021.
AB 685 (Chapter 84, Statutes of 2020) and the Cal/OSHA COVID-19 Emergency Temporary Standards (Title 8, Subchapter 7, Sections 3205-3205.4) required non-healthcare employers in California to report workplace COVID-19 outbreaks to their local health department (LHD) between January 1, 2021 – December 31, 2022. Beginning January 1, 2023, non-healthcare employer reporting of COVID-19 outbreaks to local health departments is voluntary, unless a local order is in place. More recent data collected without mandated reporting may therefore be less representative of all outbreaks that have occurred, compared to earlier data collected during mandated reporting. Licensed health facilities continue to be mandated to report outbreaks to LHDs.
LHDs report confirmed outbreaks to the California Department of Public Health (CDPH) via the California Reportable Disease Information Exchange (CalREDIE), the California Connected (CalCONNECT) system, or other established processes. Data are compiled and categorized by setting by CDPH. Settings are categorized by U.S. Census industry codes. Total outbreaks and cases are included for individual industries as well as for broader industrial sectors.
The first dataset includes numbers of outbreaks in each setting by month of onset, for outbreaks reported to CDPH since January 1, 2021. This dataset includes some outbreaks with onset prior to January 1 that were reported to CDPH after January 1; these outbreaks are denoted with month of onset “Before Jan 2021.” The second dataset includes cumulative numbers of COVID-19 outbreaks with onset after January 1, 2021, categorized by setting. Due to reporting delays, the reported numbers may not reflect all outbreaks that have occurred as of the reporting date; additional outbreaks may have occurred that have not yet been reported to CDPH.
While many of these settings are workplaces, cases may have occurred among workers, other community members who visited the setting, or both. Accordingly, these data do not distinguish between outbreaks involving only workers, outbreaks involving only residents or patrons, or outbreaks involving both.
Several additional data limitations should be kept in mind:
Outbreaks are classified as “Insufficient information” for outbreaks where not enough information was available for CDPH to assign an industry code.
Some sectors, particularly congregate residential settings, may have increased testing and therefore increased likelihood of outbreak recognition and reporting. As a result, in congregate residential settings, the number of outbreak-associated cases may be more accurate.
However, in most settings, outbreak and case counts are likely underestimates. For most cases, it is not possible to identify the source of exposure, as many cases have multiple possible exposures.
Because some settings have been at times been closed or open with capacity restrictions, numbers of outbreak reports in those settings do not reflect COVID-19 transmission risk.
The number of outbreaks in different settings will depend on the number of different workplaces in each setting. More outbreaks would be expected in settings with many workplaces compared to settings with few workplaces.
Facebook
TwitterDaily count of NYC residents who tested positive for SARS-CoV-2, who were hospitalized with COVID-19, and deaths among COVID-19 patients. Note that this dataset currently pulls from https://raw.githubusercontent.com/nychealth/coronavirus-data/master/trends/data-by-day.csv on a daily basis.
Facebook
TwitterAttribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The United States of America
Facebook
TwitterThe first two cases of the new coronavirus (COVID-19) in Italy were recorded between the end of January and the beginning of February 2020. Since then, the number of cases in Italy increased steadily, reaching over 26.9 million as of January 8, 2025. The region mostly hit by the virus in the country was Lombardy, counting almost 4.4 million cases. On January 11, 2022, 220,532 new cases were registered, which represented the biggest daily increase in cases in Italy since the start of the pandemic. The virus originated in Wuhan, a Chinese city populated by millions and located in the province of Hubei. More statistics and facts about the virus in Italy are available here.For a global overview, visit Statista's webpage exclusively dedicated to coronavirus, its development, and its impact.
Facebook
TwitterMIT Licensehttps://opensource.org/licenses/MIT
License information was derived automatically
From World Health Organization - On 31 December 2019, WHO was alerted to several cases of pneumonia in Wuhan City, Hubei Province of China. The virus did not match any other known virus. This raised concern because when a virus is new, we do not know how it affects people.
So daily level information on the affected people can give some interesting insights when it is made available to the broader data science community.
Johns Hopkins University has made an excellent dashboard using the affected cases data. Data is extracted from the google sheets associated and made available here.
Now data is available as csv files in the Johns Hopkins Github repository. Please refer to the github repository for the Terms of Use details. Uploading it here for using it in Kaggle kernels and getting insights from the broader DS community.
2019 Novel Coronavirus (2019-nCoV) is a virus (more specifically, a coronavirus) identified as the cause of an outbreak of respiratory illness first detected in Wuhan, China. Early on, many of the patients in the outbreak in Wuhan, China reportedly had some link to a large seafood and animal market, suggesting animal-to-person spread. However, a growing number of patients reportedly have not had exposure to animal markets, indicating person-to-person spread is occurring. At this time, it’s unclear how easily or sustainably this virus is spreading between people - CDC
This dataset has daily level information on the number of affected cases, deaths and recovery from 2019 novel coronavirus. Please note that this is a time series data and so the number of cases on any given day is the cumulative number.
The data is available from 22 Jan, 2020.
Here’s a polished version suitable for a professional Kaggle dataset description:
This dataset contains time-series and case-level records of the COVID-19 pandemic. The primary file is covid_19_data.csv, with supporting files for earlier records and individual-level line list data.
This is the primary dataset and contains aggregated COVID-19 statistics by location and date.
This file contains earlier COVID-19 records. It is no longer updated and is provided only for historical reference. For current analysis, please use covid_19_data.csv.
This file provides individual-level case information, obtained from an open data source. It includes patient demographics, travel history, and case outcomes.
Another individual-level case dataset, also obtained from public sources, with detailed patient-level information useful for micro-level epidemiological analysis.
✅ Use covid_19_data.csv for up-to-date aggregated global trends.
✅ Use the line list datasets for detailed, individual-level case analysis.
If you are interested in knowing country level data, please refer to the following Kaggle datasets:
India - https://www.kaggle.com/sudalairajkumar/covid19-in-india
South Korea - https://www.kaggle.com/kimjihoo/coronavirusdataset
Italy - https://www.kaggle.com/sudalairajkumar/covid19-in-italy
Brazil - https://www.kaggle.com/unanimad/corona-virus-brazil
USA - https://www.kaggle.com/sudalairajkumar/covid19-in-usa
Switzerland - https://www.kaggle.com/daenuprobst/covid19-cases-switzerland
Indonesia - https://www.kaggle.com/ardisragen/indonesia-coronavirus-cases
Johns Hopkins University for making the data available for educational and academic research purposes
MoBS lab - https://www.mobs-lab.org/2019ncov.html
World Health Organization (WHO): https://www.who.int/
DXY.cn. Pneumonia. 2020. http://3g.dxy.cn/newh5/view/pneumonia.
BNO News: https://bnonews.com/index.php/2020/02/the-latest-coronavirus-cases/
National Health Commission of the People’s Republic of China (NHC): http://www.nhc.gov.cn/xcs/yqtb/list_gzbd.shtml
China CDC (CCDC): http://weekly.chinacdc.cn/news/TrackingtheEpidemic.htm
Hong Kong Department of Health: https://www.chp.gov.hk/en/features/102465.html
Macau Government: https://www.ssm.gov.mo/portal/
Taiwan CDC: https://sites.google....
Facebook
TwitterThe COVID Tracking Project collects information from 50 US states, the District of Columbia, and 5 other US territories to provide the most comprehensive testing data we can collect for the novel coronavirus, SARS-CoV-2. We attempt to include positive and negative results, pending tests, and total people tested for each state or district currently reporting that data.
Testing is a crucial part of any public health response, and sharing test data is essential to understanding this outbreak. The CDC is currently not publishing complete testing data, so we’re doing our best to collect it from each state and provide it to the public. The information is patchy and inconsistent, so we’re being transparent about what we find and how we handle it—the spreadsheet includes our live comments about changing data and how we’re working with incomplete information.
From here, you can also learn about our methodology, see who makes this, and find out what information states provide and how we handle it.
Facebook
TwitterU.S. Government Workshttps://www.usa.gov/government-works
License information was derived automatically
Note: Starting October 10th, 2025 this dataset is deprecated and is no longer being updated. As of April 27, 2023 updates changed from daily to weekly.
Summary The cumulative number of positive COVID-19 cases among Maryland residents within a single Maryland jurisdiction.
Description The MD COVID-19 - Cases by County data layer is a collection of positive COVID-19 test results that have been reported each day by the local health department via the ESSENCE system.
Terms of Use The Spatial Data, and the information therein, (collectively the "Data") is provided "as is" without warranty of any kind, either expressed, implied, or statutory. The user assumes the entire risk as to quality and performance of the Data. No guarantee of accuracy is granted, nor is any responsibility for reliance thereon assumed. In no event shall the State of Maryland be liable for direct, indirect, incidental, consequential or special damages of any kind. The State of Maryland does not accept liability for any damages or misrepresentation caused by inaccuracies in the Data or as a result to changes to the Data, nor is there responsibility assumed to maintain the Data in any manner or form. The Data can be freely distributed as long as the metadata entry is not modified or deleted. Any data derived from the Data must acknowledge the State of Maryland in the metadata.
Facebook
TwitterBrazil is the Latin American country affected the most by the COVID-19 pandemic. As of May 2025, the country had reported around 38 million cases. It was followed by Argentina, with approximately ten million confirmed cases of COVID-19. In total, the region had registered more than 83 million diagnosed patients, as well as a growing number of fatal COVID-19 cases. The research marathon Normally, the development of vaccines takes years of research and testing until options are available to the general public. However, with an alarming and threatening situation as that of the COVID-19 pandemic, scientists quickly got on board in a vaccine marathon to develop a safe and effective way to prevent and control the spread of the virus in record time. Over two years after the first cases were reported, the world had around 1,521 drugs and vaccines targeting the COVID-19 disease. As of June 2022, a total of 39 candidates were already launched and countries all over the world had started negotiations and acquisition of the vaccine, along with immunization campaigns. COVID vaccination rates in Latin America As immunization against the spread of the disease continues to progress, regional disparities in vaccination coverage persist. While Brazil, Argentina, and Mexico were among the Latin American nations with the most COVID-19 cases, those that administered the highest number of COVID-19 doses per 100 population are Cuba, Chile, and Peru. Leading the vaccination coverage in the region is the Caribbean nation, with more than 406 COVID-19 vaccines administered per every 100 inhabitants as of January 5, 2024.For further information about the coronavirus (COVID-19) pandemic, please visit our dedicated Facts and Figures page.
Facebook
TwitterAs of March 10, 2023, the state with the highest rate of COVID-19 cases was Rhode Island followed by Alaska. Around 103.9 million cases have been reported across the United States, with the states of California, Texas, and Florida reporting the highest numbers of infections.
From an epidemic to a pandemic The World Health Organization declared the COVID-19 outbreak as a pandemic on March 11, 2020. The term pandemic refers to multiple outbreaks of an infectious illness threatening multiple parts of the world at the same time; when the transmission is this widespread, it can no longer be traced back to the country where it originated. The number of COVID-19 cases worldwide is roughly 683 million, and it has affected almost every country in the world.
The symptoms and those who are most at risk Most people who contract the virus will suffer only mild symptoms, such as a cough, a cold, or a high temperature. However, in more severe cases, the infection can cause breathing difficulties and even pneumonia. Those at higher risk include older persons and people with pre-existing medical conditions, including diabetes, heart disease, and lung disease. Those aged 85 years and older have accounted for around 27 percent of all COVID deaths in the United States, although this age group makes up just two percent of the total population
Facebook
TwitterAttribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Context
The dataset contains the list of COVID Fake News/Claims which is shared all over the internet.
Content
Inspiration
In many research portals, there was this common question in which the combined fake news dataset is available or not. This led to the publication of this dataset.
Facebook
TwitterAfter over two years of public reporting, the State Profile Report will no longer be produced and distributed after February 2023. The final release was on February 23, 2023. We want to thank everyone who contributed to the design, production, and review of this report and we hope that it provided insight into the data trends throughout the COVID-19 pandemic. Data about COVID-19 will continue to be updated at CDC’s COVID Data Tracker. The State Profile Report (SPR) is generated by the Data Strategy and Execution Workgroup in the Joint Coordination Cell, in collaboration with the White House. It is managed by an interagency team with representatives from multiple agencies and offices (including the United States Department of Health and Human Services (HHS), the Centers for Disease Control and Prevention, the HHS Assistant Secretary for Preparedness and Response, and the Indian Health Service). The SPR provides easily interpretable information on key indicators for each state, down to the county level. It is a weekly snapshot in time that: Focuses on recent outcomes in the last seven days and changes relative to the month prior Provides additional contextual information at the county level for each state, and includes national level information Supports rapid visual interpretation of results with color thresholds
Facebook
TwitterAttribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset contains world news related to Covid-19 and vaccine and also with the news article's available metadata.
Facebook
Twitterhttps://www.ontario.ca/page/open-government-licence-ontariohttps://www.ontario.ca/page/open-government-licence-ontario
This dataset compiles daily snapshots of publicly reported data on 2019 Novel Coronavirus (COVID-19) testing in Ontario.
Data includes:
**Effective November 14, 2024 this page will no longer be updated. Information about COVID-19 and other respiratory viruses is available on Public Health Ontario’s interactive respiratory virus tool: https://www.publichealthontario.ca/en/Data-and-Analysis/Infectious-Disease/Respiratory-Virus-Tool **
Data for the period of October 24, 2023 to March 24, 2024 excludes hospitals in the West region who were experiencing data availability issues.
Daily adult, pediatric, and neonatal patient ICU census data were impacted by technical issues between September 9 and October 20, 2023. As a result, when public reporting resumes on November 16, 2023, historical ICU data for this time period will be excluded.
As of August 3, 2023, the data in this file has been updated to reflect that there are now six Ontario Health (OH) regions.
This dataset is subject to change. Please review the daily epidemiologic summaries for information on variables, methodology, and technical considerations.
Facebook
Twitterhttps://www.ontario.ca/page/open-government-licence-ontariohttps://www.ontario.ca/page/open-government-licence-ontario
This dataset compiles daily snapshots of publicly reported data on 2019 Novel Coronavirus (COVID-19) testing in Ontario.
Data includes:
This dataset is subject to change. Please review the daily epidemiologic summaries for information on variables, methodology, and technical considerations.
This data is no longer available on this page. Information about COVID-19, and other respiratory viruses, is available through Public Health Ontario’s “Ontario Respiratory Virus Tool".
On November 30, 2023 the count of COVID-19 deaths was updated to include missing historical deaths from January 15, 2020 to March 31, 2023. This impacts data captured in the column ‘Outcome1’.
Due to changes in data availability, the following variables will be removed from this file, effective Thursday April 13, 2023: ‘Case_AcquisitionInfo’, ‘Outbreak_Related’. Also due to changes in data availability, the variable ‘Outcome1’ will be equal to ‘Fatal’ (deaths due to COVID-19) or blank (all other cases)
The methodology used to count COVID-19 deaths has changed to exclude deaths not caused by COVID. This impacts data captured in the column ‘‘Outcome1’ starting with data posted to the catalogue on March 11, 2022.
CCM is a dynamic disease reporting system which allows ongoing update to data previously entered. As a result, data extracted from CCM represents a snapshot at the time of extraction and may differ from previous or subsequent results. Public Health Units continually clean up COVID-19 data, correcting for missing or overcounted cases and deaths. These corrections can result in data spikes and current totals being different from previously reported cases and deaths. Observed trends over time should be interpreted with caution for the most recent period due to reporting and/or data entry lags.
Facebook
TwitterOn March 10, 2023, the Johns Hopkins Coronavirus Resource Center ceased its collecting and reporting of global COVID-19 data. For updated cases, deaths, and vaccine data please visit: World Health Organization (WHO)For more information, visit the Johns Hopkins Coronavirus Resource Center.COVID-19 Trends MethodologyOur goal is to analyze and present daily updates in the form of recent trends within countries, states, or counties during the COVID-19 global pandemic. The data we are analyzing is taken directly from the Johns Hopkins University Coronavirus COVID-19 Global Cases Dashboard, though we expect to be one day behind the dashboard’s live feeds to allow for quality assurance of the data.DOI: https://doi.org/10.6084/m9.figshare.125529863/7/2022 - Adjusted the rate of active cases calculation in the U.S. to reflect the rates of serious and severe cases due nearly completely dominant Omicron variant.6/24/2020 - Expanded Case Rates discussion to include fix on 6/23 for calculating active cases.6/22/2020 - Added Executive Summary and Subsequent Outbreaks sectionsRevisions on 6/10/2020 based on updated CDC reporting. This affects the estimate of active cases by revising the average duration of cases with hospital stays downward from 30 days to 25 days. The result shifted 76 U.S. counties out of Epidemic to Spreading trend and no change for national level trends.Methodology update on 6/2/2020: This sets the length of the tail of new cases to 6 to a maximum of 14 days, rather than 21 days as determined by the last 1/3 of cases. This was done to align trends and criteria for them with U.S. CDC guidance. The impact is areas transition into Controlled trend sooner for not bearing the burden of new case 15-21 days earlier.Correction on 6/1/2020Discussion of our assertion of an abundance of caution in assigning trends in rural counties added 5/7/2020. Revisions added on 4/30/2020 are highlighted.Revisions added on 4/23/2020 are highlighted.Executive SummaryCOVID-19 Trends is a methodology for characterizing the current trend for places during the COVID-19 global pandemic. Each day we assign one of five trends: Emergent, Spreading, Epidemic, Controlled, or End Stage to geographic areas to geographic areas based on the number of new cases, the number of active cases, the total population, and an algorithm (described below) that contextualize the most recent fourteen days with the overall COVID-19 case history. Currently we analyze the countries of the world and the U.S. Counties. The purpose is to give policymakers, citizens, and analysts a fact-based data driven sense for the direction each place is currently going. When a place has the initial cases, they are assigned Emergent, and if that place controls the rate of new cases, they can move directly to Controlled, and even to End Stage in a short time. However, if the reporting or measures to curtail spread are not adequate and significant numbers of new cases continue, they are assigned to Spreading, and in cases where the spread is clearly uncontrolled, Epidemic trend.We analyze the data reported by Johns Hopkins University to produce the trends, and we report the rates of cases, spikes of new cases, the number of days since the last reported case, and number of deaths. We also make adjustments to the assignments based on population so rural areas are not assigned trends based solely on case rates, which can be quite high relative to local populations.Two key factors are not consistently known or available and should be taken into consideration with the assigned trend. First is the amount of resources, e.g., hospital beds, physicians, etc.that are currently available in each area. Second is the number of recoveries, which are often not tested or reported. On the latter, we provide a probable number of active cases based on CDC guidance for the typical duration of mild to severe cases.Reasons for undertaking this work in March of 2020:The popular online maps and dashboards show counts of confirmed cases, deaths, and recoveries by country or administrative sub-region. Comparing the counts of one country to another can only provide a basis for comparison during the initial stages of the outbreak when counts were low and the number of local outbreaks in each country was low. By late March 2020, countries with small populations were being left out of the mainstream news because it was not easy to recognize they had high per capita rates of cases (Switzerland, Luxembourg, Iceland, etc.). Additionally, comparing countries that have had confirmed COVID-19 cases for high numbers of days to countries where the outbreak occurred recently is also a poor basis for comparison.The graphs of confirmed cases and daily increases in cases were fit into a standard size rectangle, though the Y-axis for one country had a maximum value of 50, and for another country 100,000, which potentially misled people interpreting the slope of the curve. Such misleading circumstances affected comparing large population countries to small population counties or countries with low numbers of cases to China which had a large count of cases in the early part of the outbreak. These challenges for interpreting and comparing these graphs represent work each reader must do based on their experience and ability. Thus, we felt it would be a service to attempt to automate the thought process experts would use when visually analyzing these graphs, particularly the most recent tail of the graph, and provide readers with an a resulting synthesis to characterize the state of the pandemic in that country, state, or county.The lack of reliable data for confirmed recoveries and therefore active cases. Merely subtracting deaths from total cases to arrive at this figure progressively loses accuracy after two weeks. The reason is 81% of cases recover after experiencing mild symptoms in 10 to 14 days. Severe cases are 14% and last 15-30 days (based on average days with symptoms of 11 when admitted to hospital plus 12 days median stay, and plus of one week to include a full range of severely affected people who recover). Critical cases are 5% and last 31-56 days. Sources:U.S. CDC. April 3, 2020 Interim Clinical Guidance for Management of Patients with Confirmed Coronavirus Disease (COVID-19). Accessed online. Initial older guidance was also obtained online. Additionally, many people who recover may not be tested, and many who are, may not be tracked due to privacy laws. Thus, the formula used to compute an estimate of active cases is: Active Cases = 100% of new cases in past 14 days + 19% from past 15-25 days + 5% from past 26-49 days - total deaths. On 3/17/2022, the U.S. calculation was adjusted to: Active Cases = 100% of new cases in past 14 days + 6% from past 15-25 days + 3% from past 26-49 days - total deaths. Sources: https://www.cdc.gov/mmwr/volumes/71/wr/mm7104e4.htm https://covid.cdc.gov/covid-data-tracker/#variant-proportions If a new variant arrives and appears to cause higher rates of serious cases, we will roll back this adjustment. We’ve never been inside a pandemic with the ability to learn of new cases as they are confirmed anywhere in the world. After reviewing epidemiological and pandemic scientific literature, three needs arose. We need to specify which portions of the pandemic lifecycle this map cover. The World Health Organization (WHO) specifies six phases. The source data for this map begins just after the beginning of Phase 5: human to human spread and encompasses Phase 6: pandemic phase. Phase six is only characterized in terms of pre- and post-peak. However, these two phases are after-the-fact analyses and cannot ascertained during the event. Instead, we describe (below) a series of five trends for Phase 6 of the COVID-19 pandemic.Choosing terms to describe the five trends was informed by the scientific literature, particularly the use of epidemic, which signifies uncontrolled spread. The five trends are: Emergent, Spreading, Epidemic, Controlled, and End Stage. Not every locale will experience all five, but all will experience at least three: emergent, controlled, and end stage.This layer presents the current trends for the COVID-19 pandemic by country (or appropriate level). There are five trends:Emergent: Early stages of outbreak. Spreading: Early stages and depending on an administrative area’s capacity, this may represent a manageable rate of spread. Epidemic: Uncontrolled spread. Controlled: Very low levels of new casesEnd Stage: No New cases These trends can be applied at several levels of administration: Local: Ex., City, District or County – a.k.a. Admin level 2State: Ex., State or Province – a.k.a. Admin level 1National: Country – a.k.a. Admin level 0Recommend that at least 100,000 persons be represented by a unit; granted this may not be possible, and then the case rate per 100,000 will become more important.Key Concepts and Basis for Methodology: 10 Total Cases minimum threshold: Empirically, there must be enough cases to constitute an outbreak. Ideally, this would be 5.0 per 100,000, but not every area has a population of 100,000 or more. Ten, or fewer, cases are also relatively less difficult to track and trace to sources. 21 Days of Cases minimum threshold: Empirically based on COVID-19 and would need to be adjusted for any other event. 21 days is also the minimum threshold for analyzing the “tail” of the new cases curve, providing seven cases as the basis for a likely trend (note that 21 days in the tail is preferred). This is the minimum needed to encompass the onset and duration of a normal case (5-7 days plus 10-14 days). Specifically, a median of 5.1 days incubation time, and 11.2 days for 97.5% of cases to incubate. This is also driven by pressure to understand trends and could easily be adjusted to 28 days. Source
Facebook
Twitterhttps://creativecommons.org/publicdomain/zero/1.0/https://creativecommons.org/publicdomain/zero/1.0/
By [source]
The FakeCovid dataset is an unparalleled compilation of 7623 fact-checked news articles related to COVID-19. Obtained from 92 fact-checking websites located in 105 countries, this comprehensive collection covers a wide range of sources and languages, including locations across Africa, Europe, Asia, The Americas and Oceania. With data gathered from references on Poynter and Snopes, this unique dataset is an invaluable resource for researching the accuracy of global news related to the pandemic. It offers an invaluable insight into the international nature of COVID information with its column headers covering country's involved; categories such as coronavirus health updates or political interference during coronavirus; URLs for referenced articles; verifiers employed by websites; article classes that can range from true to false or even mixed evaluations; publication dates ; article sources injected with credibility verification as well as article text and language standardization. This one-of-a kind dataset serves as an essential tool in understanding both global information flow around the world concerning COVID 19 while simultaneously offering transparency into whose interests guide it
For more datasets, click here.
- 🚨 Your notebook can be here! 🚨!
The FakeCovid dataset is a multilingual cross-domain collection of 7623 fact-checked news articles related to COVID-19. It is collected from 92 fact-checking websites and covers a wide range of sources and countries, including locations in Africa, Asia, Europe, The Americas, and Oceania. This dataset can be used for research related to understanding the truth and accuracy of news sources related to COVID-19 in different countries and languages.
To use this dataset effectively, you will need basic knowledge of data science principles such as data manipulation with pandas or Python libraries such as NumPy or ScikitLearn. The data is in CSV (comma separated values) format that can be read by most spreadsheet applications or text editor like Notepad++. Here are some steps on how to get started: - Access the FakeCovid Fact Checked News Dataset from Kaggle: https://www.kaggle.com/c/fakecovidfactcheckednewsdataset/data - Download the provided CSV file containing all fact checked news articles and place it into your desired folder location - Load the CSV file into your preferred software application like Jupyter Notebook or RStudio 4)Explore your dataset using built-in functions within data science libraries such as Pandas & matplotlib – find meaningful information through statistical analysis &//or create visualizations 5)Modify parameters within the csv file if required & save 6)Share your creative projects through Gitter chatroom #fakecovidauthors 7 )Publish any interesting discoveries you find within open source repositories like GitHub 8 )Engage with our Hangouts group #FakeCoviDFactCheckersClub 9 )Show off fun graphics via Twitter hashtag #FakeCovidiauthors 10 )Reach out if you have further questions via email contactfakecovidadatateam 11 )Stay connected by joining our mailing list#FakeCoviDAuthorsGroup
We hope this guide helps you better understand how to use our FakeCoviD Fact Checked News Dataset for generating meaningful insights relating to COVID-19 news articles worldwide!
- Developing an automated algorithm to detect fake news related to COVID-19 by leveraging the fact-checking flags and other results included in this dataset for machine learning and natural language processing tasks.
- Training a sentiment analysis model on the data to categorize articles according to their sentiments which can be used for further investigations into why certain news topics or countries have certain outcomes, motivations, or behaviors due to their content relatedness or author biasness(if any).
- Using unsupervised clustering techniques, this dataset could be used as a tool for identifying any discrepancies between news circulated in different populations in different countries (langauge and regions) so that publicists can focus more on providing factual information rather than spreading false rumors or misinformation about the pandemic
If you use this dataset in your research, please credit the original authors. Data Source
**License: [CC0 1.0 Universal (CC0 1.0) - Public Do...
Facebook
TwitterThe COVID-19 dashboard includes data on city/town COVID-19 activity, confirmed and probable cases of COVID-19, confirmed and probable deaths related to COVID-19, and the demographic characteristics of cases and deaths.