Based on a comparison of coronavirus deaths in 210 countries relative to their population, Peru had the most losses to COVID-19 up until July 13, 2022. As of the same date, the virus had infected over 557.8 million people worldwide, and the number of deaths had totaled more than 6.3 million. Note, however, that COVID-19 test rates can vary per country. Additionally, big differences show up between countries when combining the number of deaths against confirmed COVID-19 cases. The source seemingly does not differentiate between "the Wuhan strain" (2019-nCOV) of COVID-19, "the Kent mutation" (B.1.1.7) that appeared in the UK in late 2020, the 2021 Delta variant (B.1.617.2) from India or the Omicron variant (B.1.1.529) from South Africa.
The difficulties of death figures
This table aims to provide a complete picture on the topic, but it very much relies on data that has become more difficult to compare. As the coronavirus pandemic developed across the world, countries already used different methods to count fatalities, and they sometimes changed them during the course of the pandemic. On April 16, for example, the Chinese city of Wuhan added a 50 percent increase in their death figures to account for community deaths. These deaths occurred outside of hospitals and went unaccounted for so far. The state of New York did something similar two days before, revising their figures with 3,700 new deaths as they started to include “assumed” coronavirus victims. The United Kingdom started counting deaths in care homes and private households on April 29, adjusting their number with about 5,000 new deaths (which were corrected lowered again by the same amount on August 18). This makes an already difficult comparison even more difficult. Belgium, for example, counts suspected coronavirus deaths in their figures, whereas other countries have not done that (yet). This means two things. First, it could have a big impact on both current as well as future figures. On April 16 already, UK health experts stated that if their numbers were corrected for community deaths like in Wuhan, the UK number would change from 205 to “above 300”. This is exactly what happened two weeks later. Second, it is difficult to pinpoint exactly which countries already have “revised” numbers (like Belgium, Wuhan or New York) and which ones do not. One work-around could be to look at (freely accessible) timelines that track the reported daily increase of deaths in certain countries. Several of these are available on our platform, such as for Belgium, Italy and Sweden. A sudden large increase might be an indicator that the domestic sources changed their methodology.
Where are these numbers coming from?
The numbers shown here were collected by Johns Hopkins University, a source that manually checks the data with domestic health authorities. For the majority of countries, this is from national authorities. In some cases, like China, the United States, Canada or Australia, city reports or other various state authorities were consulted. In this statistic, these separately reported numbers were put together. For more information or other freely accessible content, please visit our dedicated Facts and Figures page.
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset provides values for CORONAVIRUS DEATHS reported in several countries. The data includes current values, previous releases, historical highs and record lows, release frequency, reported unit and currency.
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset provides values for CORONAVIRUS reported in several countries. The data includes current values, previous releases, historical highs and record lows, release frequency, reported unit and currency.
As of May 2, 2023, the outbreak of the coronavirus disease (COVID-19) had spread to almost every country in the world, and more than 6.86 million people had died after contracting the respiratory virus. Over 1.16 million of these deaths occurred in the United States.
Waves of infections Almost every country and territory worldwide have been affected by the COVID-19 disease. At the end of 2021 the virus was once again circulating at very high rates, even in countries with relatively high vaccination rates such as the United States and Germany. As rates of new infections increased, some countries in Europe, like Germany and Austria, tightened restrictions once again, specifically targeting those who were not yet vaccinated. However, by spring 2022, rates of new infections had decreased in many countries and restrictions were once again lifted.
What are the symptoms of the virus? It can take up to 14 days for symptoms of the illness to start being noticed. The most commonly reported symptoms are a fever and a dry cough, leading to shortness of breath. The early symptoms are similar to other common viruses such as the common cold and flu. These illnesses spread more during cold months, but there is no conclusive evidence to suggest that temperature impacts the spread of the SARS-CoV-2 virus. Medical advice should be sought if you are experiencing any of these symptoms.
As of May 2, 2023, there were roughly 687 million global cases of COVID-19. Around 660 million people had recovered from the disease, while there had been almost 6.87 million deaths. The United States, India, and Brazil have been among the countries hardest hit by the pandemic.
The various types of human coronavirus The SARS-CoV-2 virus is the seventh known coronavirus to infect humans. Its emergence makes it the third in recent years to cause widespread infectious disease following the viruses responsible for SARS and MERS. A continual problem is that viruses naturally mutate as they attempt to survive. Notable new variants of SARS-CoV-2 were first identified in the UK, South Africa, and Brazil. Variants are of particular interest because they are associated with increased transmission.
Vaccination campaigns Common human coronaviruses typically cause mild symptoms such as a cough or a cold, but the novel coronavirus SARS-CoV-2 has led to more severe respiratory illnesses and deaths worldwide. Several COVID-19 vaccines have now been approved and are being used around the world.
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset provides values for CORONAVIRUS CASES reported in several countries. The data includes current values, previous releases, historical highs and record lows, release frequency, reported unit and currency.
MIT Licensehttps://opensource.org/licenses/MIT
License information was derived automatically
From World Health Organization - On 31 December 2019, WHO was alerted to several cases of pneumonia in Wuhan City, Hubei Province of China. The virus did not match any other known virus. This raised concern because when a virus is new, we do not know how it affects people.
So daily level information on the affected people can give some interesting insights when it is made available to the broader data science community.
Johns Hopkins University has made an excellent dashboard using the affected cases data. Data is extracted from the google sheets associated and made available here.
Now data is available as csv files in the Johns Hopkins Github repository. Please refer to the github repository for the Terms of Use details. Uploading it here for using it in Kaggle kernels and getting insights from the broader DS community.
2019 Novel Coronavirus (2019-nCoV) is a virus (more specifically, a coronavirus) identified as the cause of an outbreak of respiratory illness first detected in Wuhan, China. Early on, many of the patients in the outbreak in Wuhan, China reportedly had some link to a large seafood and animal market, suggesting animal-to-person spread. However, a growing number of patients reportedly have not had exposure to animal markets, indicating person-to-person spread is occurring. At this time, it’s unclear how easily or sustainably this virus is spreading between people - CDC
This dataset has daily level information on the number of affected cases, deaths and recovery from 2019 novel coronavirus. Please note that this is a time series data and so the number of cases on any given day is the cumulative number.
The data is available from 22 Jan, 2020.
Here’s a polished version suitable for a professional Kaggle dataset description:
This dataset contains time-series and case-level records of the COVID-19 pandemic. The primary file is covid_19_data.csv
, with supporting files for earlier records and individual-level line list data.
This is the primary dataset and contains aggregated COVID-19 statistics by location and date.
This file contains earlier COVID-19 records. It is no longer updated and is provided only for historical reference. For current analysis, please use covid_19_data.csv
.
This file provides individual-level case information, obtained from an open data source. It includes patient demographics, travel history, and case outcomes.
Another individual-level case dataset, also obtained from public sources, with detailed patient-level information useful for micro-level epidemiological analysis.
✅ Use covid_19_data.csv
for up-to-date aggregated global trends.
✅ Use the line list datasets for detailed, individual-level case analysis.
If you are interested in knowing country level data, please refer to the following Kaggle datasets:
India - https://www.kaggle.com/sudalairajkumar/covid19-in-india
South Korea - https://www.kaggle.com/kimjihoo/coronavirusdataset
Italy - https://www.kaggle.com/sudalairajkumar/covid19-in-italy
Brazil - https://www.kaggle.com/unanimad/corona-virus-brazil
USA - https://www.kaggle.com/sudalairajkumar/covid19-in-usa
Switzerland - https://www.kaggle.com/daenuprobst/covid19-cases-switzerland
Indonesia - https://www.kaggle.com/ardisragen/indonesia-coronavirus-cases
Johns Hopkins University for making the data available for educational and academic research purposes
MoBS lab - https://www.mobs-lab.org/2019ncov.html
World Health Organization (WHO): https://www.who.int/
DXY.cn. Pneumonia. 2020. http://3g.dxy.cn/newh5/view/pneumonia.
BNO News: https://bnonews.com/index.php/2020/02/the-latest-coronavirus-cases/
National Health Commission of the People’s Republic of China (NHC): http://www.nhc.gov.cn/xcs/yqtb/list_gzbd.shtml
China CDC (CCDC): http://weekly.chinacdc.cn/news/TrackingtheEpidemic.htm
Hong Kong Department of Health: https://www.chp.gov.hk/en/features/102465.html
Macau Government: https://www.ssm.gov.mo/portal/
Taiwan CDC: https://sites.google....
As of November 24, 2024 there were over 274 million confirmed cases of coronavirus (COVID-19) across the whole of Europe since the first confirmed cases in France in January 2020. France has been the worst affected country in Europe with 39,028,437 confirmed cases, followed by Germany with 38,437,756 cases. Italy and the UK have approximately 26.8 million and 25 million cases respectively. For further information about the coronavirus pandemic, please visit our dedicated Facts and Figures page.
As of January 13, 2023, Bulgaria had the highest rate of COVID-19 deaths among its population in Europe at 548.6 deaths per 100,000 population. Hungary had recorded 496.4 deaths from COVID-19 per 100,000. Furthermore, Russia had the highest number of confirmed COVID-19 deaths in Europe, at over 394 thousand.
Number of cases in Europe During the same period, across the whole of Europe, there have been over 270 million confirmed cases of COVID-19. France has been Europe's worst affected country with around 38.3 million cases, this translates to an incidence rate of approximately 58,945 cases per 100,000 population. Germany and Italy had approximately 37.6 million and 25.3 million cases respectively.
Current situation In March 2023, the rate of cases in Austria over the last seven days was 224 per 100,000 which was the highest in Europe. Luxembourg and Slovenia both followed with seven day rates of infections at 122 and 108 respectively.
COVID-19 Trends MethodologyOur goal is to analyze and present daily updates in the form of recent trends within countries, states, or counties during the COVID-19 global pandemic. The data we are analyzing is taken directly from the Johns Hopkins University Coronavirus COVID-19 Global Cases Dashboard, though we expect to be one day behind the dashboard’s live feeds to allow for quality assurance of the data.Revisions added on 4/23/2020 are highlighted.Revisions added on 4/30/2020 are highlighted.Discussion of our assertion of an abundance of caution in assigning trends in rural counties added 5/7/2020. Correction on 6/1/2020Methodology update on 6/2/2020: This sets the length of the tail of new cases to 6 to a maximum of 14 days, rather than 21 days as determined by the last 1/3 of cases. This was done to align trends and criteria for them with U.S. CDC guidance. The impact is areas transition into Controlled trend sooner for not bearing the burden of new case 15-21 days earlier.Reasons for undertaking this work:The popular online maps and dashboards show counts of confirmed cases, deaths, and recoveries by country or administrative sub-region. Comparing the counts of one country to another can only provide a basis for comparison during the initial stages of the outbreak when counts were low and the number of local outbreaks in each country was low. By late March 2020, countries with small populations were being left out of the mainstream news because it was not easy to recognize they had high per capita rates of cases (Switzerland, Luxembourg, Iceland, etc.). Additionally, comparing countries that have had confirmed COVID-19 cases for high numbers of days to countries where the outbreak occurred recently is also a poor basis for comparison.The graphs of confirmed cases and daily increases in cases were fit into a standard size rectangle, though the Y-axis for one country had a maximum value of 50, and for another country 100,000, which potentially misled people interpreting the slope of the curve. Such misleading circumstances affected comparing large population countries to small population counties or countries with low numbers of cases to China which had a large count of cases in the early part of the outbreak. These challenges for interpreting and comparing these graphs represent work each reader must do based on their experience and ability. Thus, we felt it would be a service to attempt to automate the thought process experts would use when visually analyzing these graphs, particularly the most recent tail of the graph, and provide readers with an a resulting synthesis to characterize the state of the pandemic in that country, state, or county.The lack of reliable data for confirmed recoveries and therefore active cases. Merely subtracting deaths from total cases to arrive at this figure progressively loses accuracy after two weeks. The reason is 81% of cases recover after experiencing mild symptoms in 10 to 14 days. Severe cases are 14% and last 15-30 days (based on average days with symptoms of 11 when admitted to hospital plus 12 days median stay, and plus of one week to include a full range of severely affected people who recover). Critical cases are 5% and last 31-56 days. Sources:U.S. CDC. April 3, 2020 Interim Clinical Guidance for Management of Patients with Confirmed Coronavirus Disease (COVID-19). Accessed online. Initial older guidance was also obtained online. Additionally, many people who recover may not be tested, and many who are, may not be tracked due to privacy laws. Thus, the formula used to compute an estimate of active cases is: Active Cases = 100% of new cases in past 14 days + 19% from past 15-30 days + 5% from past 31-56 days - total deaths.We’ve never been inside a pandemic with the ability to learn of new cases as they are confirmed anywhere in the world. After reviewing epidemiological and pandemic scientific literature, three needs arose. We need to specify which portions of the pandemic lifecycle this map cover. The World Health Organization (WHO) specifies six phases. The source data for this map begins just after the beginning of Phase 5: human to human spread and encompasses Phase 6: pandemic phase. Phase six is only characterized in terms of pre- and post-peak. However, these two phases are after-the-fact analyses and cannot ascertained during the event. Instead, we describe (below) a series of five trends for Phase 6 of the COVID-19 pandemic.Choosing terms to describe the five trends was informed by the scientific literature, particularly the use of epidemic, which signifies uncontrolled spread. The five trends are: Emergent, Spreading, Epidemic, Controlled, and End Stage. Not every locale will experience all five, but all will experience at least three: emergent, controlled, and end stage.This layer presents the current trends for the COVID-19 pandemic by country (or appropriate level). There are five trends:Emergent: Early stages of outbreak. Spreading: Early stages and depending on an administrative area’s capacity, this may represent a manageable rate of spread. Epidemic: Uncontrolled spread. Controlled: Very low levels of new casesEnd Stage: No New cases These trends can be applied at several levels of administration: Local: Ex., City, District or County – a.k.a. Admin level 2State: Ex., State or Province – a.k.a. Admin level 1National: Country – a.k.a. Admin level 0Recommend that at least 100,000 persons be represented by a unit; granted this may not be possible, and then the case rate per 100,000 will become more important.Key Concepts and Basis for Methodology: 10 Total Cases minimum threshold: Empirically, there must be enough cases to constitute an outbreak. Ideally, this would be 5.0 per 100,000, but not every area has a population of 100,000 or more. Ten, or fewer, cases are also relatively less difficult to track and trace to sources. 21 Days of Cases minimum threshold: Empirically based on COVID-19 and would need to be adjusted for any other event. 21 days is also the minimum threshold for analyzing the “tail” of the new cases curve, providing seven cases as the basis for a likely trend (note that 21 days in the tail is preferred). This is the minimum needed to encompass the onset and duration of a normal case (5-7 days plus 10-14 days). Specifically, a median of 5.1 days incubation time, and 11.2 days for 97.5% of cases to incubate. This is also driven by pressure to understand trends and could easily be adjusted to 28 days. Source used as basis:Stephen A. Lauer, MS, PhD *; Kyra H. Grantz, BA *; Qifang Bi, MHS; Forrest K. Jones, MPH; Qulu Zheng, MHS; Hannah R. Meredith, PhD; Andrew S. Azman, PhD; Nicholas G. Reich, PhD; Justin Lessler, PhD. 2020. The Incubation Period of Coronavirus Disease 2019 (COVID-19) From Publicly Reported Confirmed Cases: Estimation and Application. Annals of Internal Medicine DOI: 10.7326/M20-0504.New Cases per Day (NCD) = Measures the daily spread of COVID-19. This is the basis for all rates. Back-casting revisions: In the Johns Hopkins’ data, the structure is to provide the cumulative number of cases per day, which presumes an ever-increasing sequence of numbers, e.g., 0,0,1,1,2,5,7,7,7, etc. However, revisions do occur and would look like, 0,0,1,1,2,5,7,7,6. To accommodate this, we revised the lists to eliminate decreases, which make this list look like, 0,0,1,1,2,5,6,6,6.Reporting Interval: In the early weeks, Johns Hopkins' data provided reporting every day regardless of change. In late April, this changed allowing for days to be skipped if no new data was available. The day was still included, but the value of total cases was set to Null. The processing therefore was updated to include tracking of the spacing between intervals with valid values.100 News Cases in a day as a spike threshold: Empirically, this is based on COVID-19’s rate of spread, or r0 of ~2.5, which indicates each case will infect between two and three other people. There is a point at which each administrative area’s capacity will not have the resources to trace and account for all contacts of each patient. Thus, this is an indicator of uncontrolled or epidemic trend. Spiking activity in combination with the rate of new cases is the basis for determining whether an area has a spreading or epidemic trend (see below). Source used as basis:World Health Organization (WHO). 16-24 Feb 2020. Report of the WHO-China Joint Mission on Coronavirus Disease 2019 (COVID-19). Obtained online.Mean of Recent Tail of NCD = Empirical, and a COVID-19-specific basis for establishing a recent trend. The recent mean of NCD is taken from the most recent fourteen days. A minimum of 21 days of cases is required for analysis but cannot be considered reliable. Thus, a preference of 42 days of cases ensures much higher reliability. This analysis is not explanatory and thus, merely represents a likely trend. The tail is analyzed for the following:Most recent 2 days: In terms of likelihood, this does not mean much, but can indicate a reason for hope and a basis to share positive change that is not yet a trend. There are two worthwhile indicators:Last 2 days count of new cases is less than any in either the past five or 14 days. Past 2 days has only one or fewer new cases – this is an extremely positive outcome if the rate of testing has continued at the same rate as the previous 5 days or 14 days. Most recent 5 days: In terms of likelihood, this is more meaningful, as it does represent at short-term trend. There are five worthwhile indicators:Past five days is greater than past 2 days and past 14 days indicates the potential of the past 2 days being an aberration. Past five days is greater than past 14 days and less than past 2 days indicates slight positive trend, but likely still within peak trend time frame.Past five days is less than the past 14 days. This means a downward trend. This would be an
The New York Times is releasing a series of data files with cumulative counts of coronavirus cases in the United States, at the state and county level, over time. We are compiling this time series data from state and local governments and health departments in an attempt to provide a complete record of the ongoing outbreak.
Since late January, The Times has tracked cases of coronavirus in real time as they were identified after testing. Because of the widespread shortage of testing, however, the data is necessarily limited in the picture it presents of the outbreak.
We have used this data to power our maps and reporting tracking the outbreak, and it is now being made available to the public in response to requests from researchers, scientists and government officials who would like access to the data to better understand the outbreak.
The data begins with the first reported coronavirus case in Washington State on Jan. 21, 2020. We will publish regular updates to the data in this repository.
https://creativecommons.org/publicdomain/zero/1.0/https://creativecommons.org/publicdomain/zero/1.0/
This dataset contains a list of all african countries affected by covid 19, this statistics was last updated on 09 April 2020, it contains number of confirmed cases, recovered, deaths and name of countries and their regions
with the help of the worldometer and the google searche engine i was able to collect a list of all african countries affected
Worldometer
Your data will be in front of the world's largest data science community. What questions do you want to see answered?
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Our complete COVID-19 dataset is a collection of the COVID-19 data maintained by Our World in Data. We will update it daily throughout the duration of the COVID-19 pandemic (more information on our updating process and schedule here). It includes the following data:
Metrics | Source | Updated | Countries |
---|---|---|---|
Vaccinations | Official data collated by the Our World in Data team | Daily | 218 |
Tests & positivity | Official data collated by the Our World in Data team | Weekly | 151 |
Hospital & ICU | Official data collated by the Our World in Data team | Daily | 47 |
Confirmed cases | JHU CSSE COVID-19 Data | Daily | 216 |
Confirmed deaths | JHU CSSE COVID-19 Data | Daily | 216 |
Reproduction rate | Arroyo-Marioli F, Bullano F, Kucinskas S, Rondón-Moreno C | Daily | 189 |
Policy responses | Oxford COVID-19 Government Response Tracker | Daily | 186 |
Other variables of interest | International organizations (UN, World Bank, OECD, IHME…) | Fixed | 241 |
A specific section of this repository is also dedicated to vaccinations, with a lighter dataset containing only vaccination data.
**Our complete COVID-19 dataset is available in CSV, XLSX, and JSON formats, and inc...
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset provides values for CORONAVIRUS RECOVERED reported in several countries. The data includes current values, previous releases, historical highs and record lows, release frequency, reported unit and currency.
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset provides values for CORONAVIRUS VACCINATION TOTAL reported in several countries. The data includes current values, previous releases, historical highs and record lows, release frequency, reported unit and currency.
The outbreak of the novel coronavirus in Wuhan, China, saw infection cases spread throughout the Asia-Pacific region. By April 13, 2024, India had faced over 45 million coronavirus cases. South Korea followed behind India as having had the second highest number of coronavirus cases in the Asia-Pacific region, with about 34.6 million cases. At the same time, Japan had almost 34 million cases. At the beginning of the outbreak, people in South Korea had been optimistic and predicted that the number of cases would start to stabilize. What is SARS CoV 2?Novel coronavirus, officially known as SARS CoV 2, is a disease which causes respiratory problems which can lead to difficulty breathing and pneumonia. The illness is similar to that of SARS which spread throughout China in 2003. After the outbreak of the coronavirus, various businesses and shops closed to prevent further spread of the disease. Impacts from flight cancellations and travel plans were felt across the Asia-Pacific region. Many people expressed feelings of anxiety as to how the virus would progress. Impact throughout Asia-PacificThe Coronavirus and its variants have affected the Asia-Pacific region in various ways. Out of all Asia-Pacific countries, India was highly affected by the pandemic and experienced more than 50 thousand deaths. However, the country also saw the highest number of recoveries within the APAC region, followed by South Korea and Japan.
2019 Novel Coronavirus COVID-19 (2019-nCoV) Visual Dashboard and Map:
https://www.arcgis.com/apps/opsdashboard/index.html#/bda7594740fd40299423467b48e9ecf6
Downloadable data:
https://github.com/CSSEGISandData/COVID-19
Additional Information about the Visual Dashboard:
https://systems.jhu.edu/research/public-health/ncov
CC0 1.0 Universal Public Domain Dedicationhttps://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
This data release includes two Wikipedia datasets related to the readership of the project as it relates to the early COVID-19 pandemic period. The first dataset is COVID-19 article page views by country, the second dataset is one hop navigation where one of the two pages are COVID-19 related. The data covers roughly the first six months of the pandemic, more specifically from January 1st 2020 to June 30th 2020. For more background on the pandemic in those months, see English Wikipedia's Timeline of the COVID-19 pandemic.Wikipedia articles are considered COVID-19 related according the methodology described here, the list of COVID-19 articles used for the released datasets is available in covid_articles.tsv. For simplicity and transparency, the same list of articles from 20 April 2020 was used for the entire dataset though in practice new COVID-19-relevant articles were constantly being created as the pandemic evolved.Privacy considerationsWhile this data is considered valuable for the insight that it can provide about information-seeking behaviors around the pandemic in its early months across diverse geographies, care must be taken to not inadvertently reveal information about the behavior of individual Wikipedia readers. We put in place a number of filters to release as much data as we can while minimizing the risk to readers.The Wikimedia foundation started to release most viewed articles by country from Jan 2021. At the beginning of the COVID-19 an exemption was made to store reader data about the pandemic with additional privacy protections:- exclude the page views from users engaged in an edit session- exclude reader data from specific countries (with a few exceptions)- the aggregated statistics are based on 50% of reader sessions that involve a pageview to a COVID-19-related article (see covid_pages.tsv). As a control, a 1% random sample of reader sessions that have no pageviews to COVID-19-related articles was kept. In aggregate, we make sure this 1% non-COVID-19 sample and 50% COVID-19 sample represents less than 10% of pageviews for a country for that day. The randomization and filters occurs on a daily cadence with all timestamps in UTC.- exclude power users - i.e. userhashes with greater than 500 pageviews in a day. This doubles as another form of likely bot removal, protects very heavy users of the project, and also in theory would help reduce the chance of a single user heavily skewing the data.- exclude readership from users of the iOS and Android Wikipedia apps. In effect, the view counts in this dataset represent comparable trends rather than the total amount of traffic from a given country. For more background on readership data per country data, and the COVID-19 privacy protections in particular, see this phabricator.To further minimize privacy risks, a k-anonymity threshold of 100 was applied to the aggregated counts. For example, a page needs to be viewed at least 100 times in a given country and week in order to be included in the dataset. In addition, the view counts are floored to a multiple of 100.DatasetsThe datasets published in this release are derived from a reader session dataset generated by the code in this notebook with the filtering described above. The raw reader session data itself will not be publicly available due to privacy considerations. The datasets described below are similar to the pageviews and clickstream data that the Wikimedia foundation publishes already, with the addition of the country specific counts.COVID-19 pageviewsThe file covid_pageviews.tsv contains:- pageview counts for COVID-19 related pages, aggregated by week and country- k-anonymity threshold of 100- example: In the 13th week of 2020 (23 March - 29 March 2020), the page 'Pandémie_de_Covid-19_en_Italie' on French Wikipedia was visited 11700 times from readers in Belgium- as a control bucket, we include pageview counts to all pages aggregated by week and country. Due to privacy considerations during the collection of the data, the control bucket was sampled at ~1% of all view traffic. The view counts for the control
title are thus proportional to the total number of pageviews to all pages.The file is ~8 MB and contains ~134000 data points across the 27 weeks, 108 countries, and 168 projects.Covid reader session bigramsThe file covid_session_bigrams.tsv contains:- number of occurrences of visits to pages A -> B, where either A or B is a COVID-19 related article. Note that the bigrams are tuples (from, to) of articles viewed in succession, the underlying mechanism can be clicking on a link in an article, but it may also have been a new search or reading both articles based on links from third source articles. In contrast, the clickstream data is based on referral information only- aggregated by month and country- k-anonymity threshold of 100- example: In March of 2020, there were a 1000 occurences of readers accessing the page es.wikipedia/SARS-CoV-2 followed by es.wikipedia/Orthocoronavirinae from ChileThe file is ~10 MB and contains ~90000 bigrams across the 6 months, 96 countries, and 56 projects.ContactPlease reach out to research-feedback@wikimedia.org for any questions.
The New York Times is releasing a series of data files with cumulative counts of coronavirus cases in the United States, at the state and county level, over time. We are compiling this time series data from state and local governments and health departments in an attempt to provide a complete record of the ongoing outbreak.
Since late January, The Times has tracked cases of coronavirus in real time as they were identified after testing. Because of the widespread shortage of testing, however, the data is necessarily limited in the picture it presents of the outbreak.
We have used this data to power our maps and reporting tracking the outbreak, and it is now being made available to the public in response to requests from researchers, scientists and government officials who would like access to the data to better understand the outbreak.
The data begins with the first reported coronavirus case in Washington State on Jan. 21, 2020. We will publish regular updates to the data in this repository.
Data on cumulative coronavirus cases and deaths can be found in two files for states and counties.
Each row of data reports cumulative counts based on our best reporting up to the moment we publish an update. We do our best to revise earlier entries in the data when we receive new information.
Both files contain FIPS codes, a standard geographic identifier, to make it easier for an analyst to combine this data with other data sets like a map file or population data.
State-level data can be found in the us-states.csv file.
date,state,fips,cases,deaths
2020-01-21,Washington,53,1,0
...
County-level data can be found in the us-counties.csv file.
date,county,state,fips,cases,deaths
2020-01-21,Snohomish,Washington,53061,1,0
...
In some cases, the geographies where cases are reported do not map to standard county boundaries. See the list of geographic exceptions for more detail on these.
This dataset contains COVID-19 data for the United States of America made available by The New York Times on github at https://github.com/nytimes/covid-19-data
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
List of countries where the survey participants were located, with the corresponding number of participants per country.
Based on a comparison of coronavirus deaths in 210 countries relative to their population, Peru had the most losses to COVID-19 up until July 13, 2022. As of the same date, the virus had infected over 557.8 million people worldwide, and the number of deaths had totaled more than 6.3 million. Note, however, that COVID-19 test rates can vary per country. Additionally, big differences show up between countries when combining the number of deaths against confirmed COVID-19 cases. The source seemingly does not differentiate between "the Wuhan strain" (2019-nCOV) of COVID-19, "the Kent mutation" (B.1.1.7) that appeared in the UK in late 2020, the 2021 Delta variant (B.1.617.2) from India or the Omicron variant (B.1.1.529) from South Africa.
The difficulties of death figures
This table aims to provide a complete picture on the topic, but it very much relies on data that has become more difficult to compare. As the coronavirus pandemic developed across the world, countries already used different methods to count fatalities, and they sometimes changed them during the course of the pandemic. On April 16, for example, the Chinese city of Wuhan added a 50 percent increase in their death figures to account for community deaths. These deaths occurred outside of hospitals and went unaccounted for so far. The state of New York did something similar two days before, revising their figures with 3,700 new deaths as they started to include “assumed” coronavirus victims. The United Kingdom started counting deaths in care homes and private households on April 29, adjusting their number with about 5,000 new deaths (which were corrected lowered again by the same amount on August 18). This makes an already difficult comparison even more difficult. Belgium, for example, counts suspected coronavirus deaths in their figures, whereas other countries have not done that (yet). This means two things. First, it could have a big impact on both current as well as future figures. On April 16 already, UK health experts stated that if their numbers were corrected for community deaths like in Wuhan, the UK number would change from 205 to “above 300”. This is exactly what happened two weeks later. Second, it is difficult to pinpoint exactly which countries already have “revised” numbers (like Belgium, Wuhan or New York) and which ones do not. One work-around could be to look at (freely accessible) timelines that track the reported daily increase of deaths in certain countries. Several of these are available on our platform, such as for Belgium, Italy and Sweden. A sudden large increase might be an indicator that the domestic sources changed their methodology.
Where are these numbers coming from?
The numbers shown here were collected by Johns Hopkins University, a source that manually checks the data with domestic health authorities. For the majority of countries, this is from national authorities. In some cases, like China, the United States, Canada or Australia, city reports or other various state authorities were consulted. In this statistic, these separately reported numbers were put together. For more information or other freely accessible content, please visit our dedicated Facts and Figures page.