In May 2020, up to six percent of all online news and posts related to the coronavirus (COVID-19) and released in Italy were false or inaccurate. The percentage was calculated on the average volume of posts and articles published by the Italian media outlets, including posts on social media. The peak in the release of fake news was registered in the early stage of the pandemic, at the end of January 2020, when 7.3 percent of coronavirus-related information was false or inaccurate.
For further information about the coronavirus (COVID-19) pandemic, please visit our dedicated Facts and Figures page.
A survey from April 2020 showed that during the coronavirus (COVID-19) pandemic, 48 percent of Italian online users looked for local news, while 41 percent of them were more interested in the situation across the country. News about the current situation in foreign countries was searched for by 29 percent of respondents.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Coronaviruses are a large family of viruses which may cause illness in animals or humans. In humans, several coronaviruses are known to cause respiratory infections ranging from the common cold to more severe diseases such as Middle East Respiratory Syndrome (MERS) and Severe Acute Respiratory Syndrome (SARS). The most recently discovered coronavirus causes coronavirus disease COVID-19 - WHO
People can catch COVID-19 from others who have the virus. The virus has been spreading rapidly around the world, and Italy is one of the most affected countries.
On March 8, 2020 - Italy’s prime minister announced a sweeping coronavirus quarantine early Sunday, restricting the movements of about a quarter of the country’s population in a bid to limit contagions at the epicenter of Europe’s outbreak. - TIME
This dataset is from https://github.com/pcm-dpc/COVID-19
collected by Sito del Dipartimento della Protezione Civile - Emergenza Coronavirus: la risposta nazionale
This dataset has two files:
covid19_italy_province.csv - Province level data of COVID-19 cases
covid_italy_region.csv - Region level data of COVID-19 cases
Data is collected by Sito del Dipartimento della Protezione Civile - Emergenza Coronavirus: la risposta nazionale and is uploaded into this github repo.
Dashboard on the data can be seen here. Picture courtesy is from the dashboard.
Insights on
* Spread to various regions over time
* Try to predict the spread of COVID-19 ahead of time to take preventive measures
Italian people perceived the TV newscast as the most reliable source of information regarding the coronavirus (COVID-19), giving a score of 7.3 points. Online news sites and printed media followed in the ranking with 6.8 and 6.6, respectively. For further information about the coronavirus (COVID-19) pandemic, please visit our dedicated Facts and Figures page.
After the outbreak of the coronavirus (COVID-19) pandemic in Italy in February 2020, the number of people trying to stay up to date with the latest news about the emergency the country was facing increased dramatically. This attitude among the Italian population could be seen in the growth of the audience share of news websites. Between the 2nd and 8th of March 2020, La7 registered the most significant increase in comparison with the previous weeks (255 percent), followed by ANSA (119.1 percent). The website of the all-news channel Rai News ranked third with a growth of 116.7 percent. For further information about the coronavirus (COVID-19) pandemic, please visit our dedicated Facts and Figures page.
A survey from April 2020 showed that Italian people considered TV newscasts the most reliable news source regarding the coronavirus (COVID-19). The Government followed in the ranking, with 48 percent of individuals seeing it as a reliable news source. News shared by friends and family was perceived as more reliable (20 percent) than radio (17 percent).
After entering Italy, the coronavirus (COVID-19) spread fast. The strict lockdown implemented by the government during spring 2020 helped to slow down the outbreak. However, in the following months the country had to face four new harsh waves of contagion. As of January 1, 2025, the authorities had reported 198,638 deaths caused by COVID-19, of which approximately 48.7 thousand were in the region of Lombardy, 20.1 thousand in the region of Emilia-Romagna, and roughly 17.6 thousand in Veneto, the hardest-hit regions. The total number of cases reported in the country reached over 26.9 million. The north of the country was hit hardest, and the region with the highest number of cases was Lombardy, which registered almost 4.4 million of them. The north-eastern region of Veneto counted about 2.9 million cases. Italy's death toll was one of the most tragic in the world. More recently, however, the country saw an end to this situation: as of November 2023, 85 percent of the total Italian population was fully vaccinated. For a global overview, visit Statista's webpage exclusively dedicated to coronavirus, its development, and its impact.
MIT License https://opensource.org/licenses/MIT
License information was derived automatically
From World Health Organization - On 31 December 2019, WHO was alerted to several cases of pneumonia in Wuhan City, Hubei Province of China. The virus did not match any other known virus. This raised concern because when a virus is new, we do not know how it affects people.
So daily level information on the affected people can give some interesting insights when it is made available to the broader data science community.
Johns Hopkins University has made an excellent dashboard using the affected cases data. Data is extracted from the google sheets associated and made available here.
Now data is available as csv files in the Johns Hopkins Github repository. Please refer to the github repository for the Terms of Use details. Uploading it here for using it in Kaggle kernels and getting insights from the broader DS community.
2019 Novel Coronavirus (2019-nCoV) is a virus (more specifically, a coronavirus) identified as the cause of an outbreak of respiratory illness first detected in Wuhan, China. Early on, many of the patients in the outbreak in Wuhan, China reportedly had some link to a large seafood and animal market, suggesting animal-to-person spread. However, a growing number of patients reportedly have not had exposure to animal markets, indicating person-to-person spread is occurring. At this time, it’s unclear how easily or sustainably this virus is spreading between people - CDC
This dataset has daily level information on the number of affected cases, deaths and recovery from 2019 novel coronavirus. Please note that this is a time series data and so the number of cases on any given day is the cumulative number.
The data is available from 22 Jan, 2020.
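Because the counts are cumulative, daily new cases have to be derived by differencing consecutive days. The following is a minimal sketch of that step; the function name and the sample numbers are hypothetical and are not taken from the dataset.

```python
from typing import List


def daily_from_cumulative(cumulative: List[int]) -> List[int]:
    """Convert a cumulative series into per-day increments.

    The first day has no predecessor, so its increment is its own value.
    """
    daily = []
    previous = 0
    for total in cumulative:
        daily.append(total - previous)
        previous = total
    return daily


# Hypothetical cumulative confirmed counts for five consecutive days.
cumulative_confirmed = [2, 5, 5, 9, 14]
new_cases = daily_from_cumulative(cumulative_confirmed)
print(new_cases)  # [2, 3, 0, 4, 5]
```

If a source revises a cumulative total downward, the corresponding increment comes out negative, so it may be worth checking for negative values before plotting or modelling.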
This dataset contains time-series and case-level records of the COVID-19 pandemic. The primary file is covid_19_data.csv, with supporting files for earlier records and individual-level line list data.
This is the primary dataset and contains aggregated COVID-19 statistics by location and date.
This file contains earlier COVID-19 records. It is no longer updated and is provided only for historical reference. For current analysis, please use covid_19_data.csv.
This file provides individual-level case information, obtained from an open data source. It includes patient demographics, travel history, and case outcomes.
Another individual-level case dataset, also obtained from public sources, with detailed patient-level information useful for micro-level epidemiological analysis.
✅ Use covid_19_data.csv for up-to-date aggregated global trends.
✅ Use the line list datasets for detailed, individual-level case analysis.
If you are interested in knowing country level data, please refer to the following Kaggle datasets:
India - https://www.kaggle.com/sudalairajkumar/covid19-in-india
South Korea - https://www.kaggle.com/kimjihoo/coronavirusdataset
Italy - https://www.kaggle.com/sudalairajkumar/covid19-in-italy
Brazil - https://www.kaggle.com/unanimad/corona-virus-brazil
USA - https://www.kaggle.com/sudalairajkumar/covid19-in-usa
Switzerland - https://www.kaggle.com/daenuprobst/covid19-cases-switzerland
Indonesia - https://www.kaggle.com/ardisragen/indonesia-coronavirus-cases
Johns Hopkins University for making the data available for educational and academic research purposes
MoBS lab - https://www.mobs-lab.org/2019ncov.html
World Health Organization (WHO): https://www.who.int/
DXY.cn. Pneumonia. 2020. http://3g.dxy.cn/newh5/view/pneumonia.
BNO News: https://bnonews.com/index.php/2020/02/the-latest-coronavirus-cases/
National Health Commission of the People’s Republic of China (NHC): http://www.nhc.gov.cn/xcs/yqtb/list_gzbd.shtml
China CDC (CCDC): http://weekly.chinacdc.cn/news/TrackingtheEpidemic.htm
Hong Kong Department of Health: https://www.chp.gov.hk/en/features/102465.html
Macau Government: https://www.ssm.gov.mo/portal/
Taiwan CDC: https://sites.google....
As of January 1, 2025, the number of active coronavirus (COVID-19) infections in Italy was approximately 218,000. Among these, 42 infected individuals were being treated in intensive care units. Another 1,332 individuals infected with the coronavirus were hospitalized with symptoms, while approximately 217,000 were in isolation at home. The total number of coronavirus cases in Italy reached over 26.9 million (including active cases, individuals who recovered, and individuals who died) as of the same date. The region hit hardest by the spread of the virus was Lombardy, which counted almost 4.4 million cases. For a global overview, visit Statista's webpage exclusively dedicated to coronavirus, its development, and its impact.
Attribution-ShareAlike 4.0 (CC BY-SA 4.0) https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
The COVID-19 pandemic generated (and keeps generating) a huge corpus of news articles, easily retrievable in Factiva with very targeted queries.
This dataset, generated with an ad-hoc parser and NLP pipeline, reports the frequency of lemmas and named entities in news articles (in German, French, Italian and English) regarding Switzerland and COVID-19.
The analysis of large bodies of grey literature via text mining and computational linguistics is an increasingly frequent approach to understand the large-scale trends of specific topics. We used Factiva, a news monitoring and search engine developed and owned by Dow Jones, to gather and download all the news articles published between January 2020 and May 2021 on Covid-19 and Switzerland.
Due to Factiva's copyright policy, it is not possible to share the original dataset with the exports of the articles' text; however, we can share the results of our work on the corpus. All the information relevant to reproduce the results is provided.
Factiva allows a very granular definition of the queries, and moreover has access to full-text articles published by the major media outlets of the world. The query has been defined as follows (each query fragment is followed by its explanation):
((coronavirus or Wuhan virus or corvid19 or corvid 19 or covid19 or covid 19 or ncov or novel coronavirus or sars) and (atleast3 coronavirus or atleast3 wuhan or atleast3 corvid* or atleast3 covid* or atleast3 ncov or atleast3 novel or atleast3 corona*))
Keywords for covid19; must appear at least 3 times in the text
and ns=(gsars or gout)
Subject is “novel coronaviruses” or “outbreaks and epidemics” and “general news”
and la=X
Language is X (DE, FR, IT, EN)
and rst=tmnb
Restrict to TMNB (major news and business publications)
and wc>300
At least 300 words
and date from 20191001 to 20212005
Date interval
and re=SWITZ
Region is Switzerland
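The fragments listed above are joined with `and`, and the language placeholder X is replaced once per language. The sketch below only assembles the query text for the four languages; it does not call Factiva, and the function name `build_query` is hypothetical. The fragments are copied as listed, including the date range exactly as written.

```python
# Query fragments copied verbatim from the listing above.
KEYWORDS = (
    "((coronavirus or Wuhan virus or corvid19 or corvid 19 or covid19 "
    "or covid 19 or ncov or novel coronavirus or sars) and "
    "(atleast3 coronavirus or atleast3 wuhan or atleast3 corvid* "
    "or atleast3 covid* or atleast3 ncov or atleast3 novel "
    "or atleast3 corona*))"
)
LANGUAGES = ["DE", "FR", "IT", "EN"]


def build_query(language: str) -> str:
    """Join the fragments with 'and', in the order given in the listing."""
    parts = [
        KEYWORDS,
        "ns=(gsars or gout)",               # subject descriptors
        f"la={language}",                   # language placeholder X
        "rst=tmnb",                         # major news and business publications
        "wc>300",                           # at least 300 words
        "date from 20191001 to 20212005",   # date interval, as written in the source
        "re=SWITZ",                         # region is Switzerland
    ]
    return " and ".join(parts)


# One query string per language.
all_queries = {lang: build_query(lang) for lang in LANGUAGES}
for lang, query in all_queries.items():
    print(lang, len(query))
```

Keeping the language as the only parameter makes it easy to confirm that the four corpora differ only in the `la=` filter.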
It is important to specify some details that characterize the query. The query is not limited to articles published by Swiss media, but to articles regarding Switzerland. The reason is simple: a Swiss user googling for “Schweiz Coronavirus” or for “Coronavirus Ticino” can easily find and read articles published by foreign media outlets (namely, German or Italian) on that topic. If the objective is capturing and describing the information trends to which people are exposed, this approach makes much more sense than limiting the analysis to articles published by Swiss media. Factiva’s field “NS” is a descriptor for the content of the article. “gsars” is defined in Factiva’s documentation as “All news on Severe Acute Respiratory Syndrome”, and “gout” as “The widespread occurrence of an infectious disease affecting many people or animals in a given population at the same time”; however, the way these descriptors are assigned to articles is not specified in the documentation.
Finally, the query has been restricted to major news and business publications of at least 300 words. Duplicate check is performed by Factiva. Given the incredibly large amount of articles published on COVID-19, this (absolutely arbitrary) restriction allows retrieving a corpus that is both meaningful and manageable.
metadata.xlsx contains information about the articles retrieved (strategy, amount)
This work is part of the PubliCo research project.
Background: The COVID-19 pandemic propelled immunology into global news and social media, resulting in the potential for misinterpreting and misusing complex scientific concepts.
Objective: To study the extent to which immunology is discussed in news articles and YouTube videos in English and Italian, and if related scientific concepts are used to support specific political or ideological narratives in the context of COVID-19.
Methods: In English and Italian we searched the period 11/09/2019 to 11/09/2022 on YouTube, using the software Mozdeh, for videos mentioning COVID-19 and one of nine immunological concepts: antibody-dependent enhancement, anergy, cytokine storm, herd immunity, hygiene hypothesis, immunity debt, original antigenic sin, oxidative stress and viral interference. We repeated this using MediaCloud for news articles. Four samples of 200 articles/videos were obtained from the randomised data gathered and analysed for mentions of concepts, stance on vaccines, masks, lockdown, social distancing, and political signifiers.
Results: Vaccine-negative information was higher in videos than news (8-fold in English, 6-fold in Italian) and higher in Italian than English (4-fold in news, 3-fold in videos). We also observed the existence of information bubbles, where a negative stance towards one intervention was associated with a negative stance to other linked ideas. Some immunological concepts (immunity debt, viral interference, anergy and original antigenic sin) were associated with anti-vaccine or anti-NPI (non-pharmacological intervention) views. Videos in English mentioned politics more frequently than those in Italian and, in all media and languages, politics was more frequently mentioned in anti-guidelines and anti-vaccine media by a factor of 3 in video and of 3–5 in news.
Conclusion: There is evidence that some immunological concepts are used to provide credibility to specific narratives and ideological views. The existence of information bubbles supports the concept of the “rabbit hole” effect, where interest in unconventional views/media leads to ever more extreme algorithmic recommendations.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The current study provides data about the immediate risk perceptions and psychological effects of the COVID-19 pandemic among Italian participants. A sample of 980 volunteers answered a web-based survey which aimed to investigate the many facets of risk perceptions connected to COVID-19 (health, work, institutional-economy, interpersonal and psychological), and risk-related variables such as perceived knowledge, news seeking, perceived control, perceived efficacy of containment measures, and affective states. Socio-demographic characteristics were also collected. Results showed that although levels of general concern are relatively high among Italians, risk perceptions are highest with regards to the institutional-economy and work, and lowest concerning health. COVID-19 has been also estimated to be the least likely cause of death. Cognitive and affective risk-related variables contributed to explain the several risk perception domains differently. COVID-19 perceived knowledge did not affect any risk perception while the perceived control decreased health risk likelihood. The other risk-related variables amplified risk perceptions: News seeking increased work and institutional-economy risk; perceived efficacy of containment measures increased almost all perceived risks; negative affective states of fear, anger and sadness increased health risk; anxiety increased health, interpersonal and psychological risks, and uncertainty increased work, institutional-economy, interpersonal and psychological risk perceptions. Finally, positive affective states increased health risk perception. Socio-psychological implications are discussed.
I continue to work on improving this dataset and will upload an improved version as soon as I have one. I don't own this dataset; I have merely tried to enrich the data that is gathered from multiple sources by Johns Hopkins CSSE.
COVID-19 is perhaps the biggest historical event of our lifetime with the kind of destruction and disruption it has already caused to the people around the world. I wanted to build a dashboard summarizing the events from beginning to date and that's the reason I worked on combining all the daily reports into one file.
This file consists of incidents reported from across the world from Jan 22 onwards. Incidents are categorized into Confirmed, Deaths and Recovered. Country/Region and/or Province/State information is available. Geo-coordinates are available, but these are missing for countries like China.
This data belongs to Johns Hopkins CSSE, which gathered it from multiple sources. The text below is from the JHU GitHub account; please read it before using the dataset.
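Since some rows are reported at the Province/State level and others only at the Country/Region level, a common first step is to roll the counts up to country totals. The sketch below assumes the column headers are spelled exactly as the field names mentioned above and uses hypothetical sample rows for a single date; the real file may use different headers and contain many dates.

```python
import csv
import io
from collections import defaultdict

# Hypothetical rows for one reporting date; header names are assumed.
SAMPLE = """Province/State,Country/Region,Confirmed,Deaths,Recovered
Hubei,China,100,5,10
Guangdong,China,20,0,2
,Italy,3,0,0
"""

FIELDS = ("Confirmed", "Deaths", "Recovered")


def totals_by_country(csv_text: str) -> dict:
    """Sum Confirmed, Deaths and Recovered over all rows of each country."""
    totals = defaultdict(lambda: {field: 0 for field in FIELDS})
    for row in csv.DictReader(io.StringIO(csv_text)):
        country = row["Country/Region"]
        for field in FIELDS:
            # Treat an empty cell as zero.
            totals[country][field] += int(row[field] or 0)
    return dict(totals)


totals = totals_by_country(SAMPLE)
print(totals["China"])  # {'Confirmed': 120, 'Deaths': 5, 'Recovered': 12}
```

If the counts in the combined file are cumulative per day, the rows should be filtered to a single date before summing; otherwise the totals would count the same cases repeatedly.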
This is the data repository for the 2019 Novel Coronavirus Visual Dashboard operated by the Johns Hopkins University Center for Systems Science and Engineering (JHU CSSE). Also, Supported by ESRI Living Atlas Team and the Johns Hopkins University Applied Physics Lab (JHU APL).
Visual Dashboard (desktop): https://www.arcgis.com/apps/opsdashboard/index.html#/bda7594740fd40299423467b48e9ecf6
Visual Dashboard (mobile): http://www.arcgis.com/apps/opsdashboard/index.html#/85320e2ea5424dfaaa75ae62e5c06e61
Lancet Article: An interactive web-based dashboard to track COVID-19 in real time
Provided by Johns Hopkins University Center for Systems Science and Engineering (JHU CSSE): https://systems.jhu.edu/
Data Sources:
World Health Organization (WHO): https://www.who.int/
DXY.cn. Pneumonia. 2020. http://3g.dxy.cn/newh5/view/pneumonia.
BNO News: https://bnonews.com/index.php/2020/02/the-latest-coronavirus-cases/
National Health Commission of the People’s Republic of China (NHC): http://www.nhc.gov.cn/xcs/yqtb/list_gzbd.shtml
China CDC (CCDC): http://weekly.chinacdc.cn/news/TrackingtheEpidemic.htm
Hong Kong Department of Health: https://www.chp.gov.hk/en/features/102465.html
Macau Government: https://www.ssm.gov.mo/portal/
Taiwan CDC: https://sites.google.com/cdc.gov.tw/2019ncov/taiwan?authuser=0
US CDC: https://www.cdc.gov/coronavirus/2019-ncov/index.html
Government of Canada: https://www.canada.ca/en/public-health/services/diseases/coronavirus.html
Australia Government Department of Health: https://www.health.gov.au/news/coronavirus-update-at-a-glance
European Centre for Disease Prevention and Control (ECDC): https://www.ecdc.europa.eu/en/geographical-distribution-2019-ncov-cases
Ministry of Health Singapore (MOH): https://www.moh.gov.sg/covid-19
Italy Ministry of Health: http://www.salute.gov.it/nuovocoronavirus
1Point3Acres: https://coronavirus.1point3acres.com/en
WorldoMeters: https://www.worldometers.info/coronavirus/
Additional Information about the Visual Dashboard: https://systems.jhu.edu/research/public-health/ncov/
Contact Us:
Email: jhusystems@gmail.com
Terms of Use:
This GitHub repo and its contents herein, including all data, mapping, and analysis, copyright 2020 Johns Hopkins University, all rights reserved, is provided to the public strictly for educational and academic research purposes. The Website relies upon publicly available data from multiple sources, that do not always agree. The Johns Hopkins University hereby disclaims any and all representations and warranties with respect to the Website, including accuracy, fitness for use, and merchantability. Reliance on the Website for medical guidance or use of the Website in commerce is strictly prohibited.
The first two cases of the new coronavirus (COVID-19) in Italy were recorded between the end of January and the beginning of February 2020. Since then, the number of cases in Italy increased steadily, reaching over 26.9 million as of January 8, 2025. The region hit hardest by the virus in the country was Lombardy, counting almost 4.4 million cases. On January 11, 2022, 220,532 new cases were registered, which represented the biggest daily increase in cases in Italy since the start of the pandemic. The virus originated in Wuhan, a Chinese city populated by millions and located in the province of Hubei. More statistics and facts about the virus in Italy are available here. For a global overview, visit Statista's webpage exclusively dedicated to coronavirus, its development, and its impact.
https://spdx.org/licenses/CC0-1.0.html
Online platforms play a relevant role in the creation and diffusion of false or misleading news. Concerningly, the COVID-19 pandemic is shaping a communication network which reflects the emergence of collective attention towards a topic that rapidly gained universal interest. Here, we characterize the dynamics of this network on Twitter, analysing how unreliable content distributes among its users. We find that a minority of accounts is responsible for the majority of the misinformation circulating online, and identify two categories of users: a few active ones, playing the role of ‘creators’, and a majority playing the role of ‘consumers’. The relative proportion of these groups (approx. 14% creators versus 86% consumers) appears stable over time: consumers are mostly exposed to the opinions of a vocal minority of creators (who are the origin of 82% of fake content in our data), which could be mistakenly understood as representative of the majority of users. The corresponding pressure from a perceived majority is identified as a potential driver of the ongoing COVID-19 infodemic.
Methods: The datasets that we used in this work come from the COVID-19 Infodemics Observatory (https://covid19obs.fbk.eu/#/). Tweets associated with the COVID-19 pandemic (coronavirus, ncov, #Wuhan, covid19, COVID-19, SARSCoV2, COVID) have been automatically collected using the Twitter Filter API. The data contain 7.7 million retweets for the USA, 300 thousand for Italy and 900 thousand for the UK. The collection period runs from the 22nd of January to the 22nd of May for the USA, while for Italy and the UK it runs from the 22nd of January to the 2nd of December. For each tweet we specified the ID code as well as the time at which it was created. This dataset also includes the tables necessary to reproduce exactly the figures in the paper.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
In times of uncertainty, people often seek out information to help alleviate fear, possibly leaving them vulnerable to false information. During the COVID-19 pandemic, we attended to a viral spread of incorrect and misleading information that compromised collective actions and public health measures to contain the spread of the disease. We investigated the influence of fear of COVID-19 on social and cognitive factors including believing in fake news, bullshit receptivity, overclaiming, and problem-solving—within two of the populations that have been severely hit by COVID-19: Italy and the United States of America. To gain a better understanding of the role of misinformation during the early height of the COVID-19 pandemic, we also investigated whether problem-solving ability and socio-cognitive polarization were associated with believing in fake news. Results showed that fear of COVID-19 is related to seeking out information about the virus and avoiding infection in the Italian and American samples, as well as a willingness to share real news (COVID and non-COVID-related) headlines in the American sample. However, fear positively correlated with bullshit receptivity, suggesting that the pandemic might have contributed to creating a situation where people were pushed toward pseudo-profound existential beliefs. Furthermore, problem-solving ability was associated with correctly discerning real or fake news, whereas socio-cognitive polarization was the strongest predictor of believing in fake news in both samples. From these results, we concluded that a construct reflecting cognitive rigidity, neglecting alternative information, and black-and-white thinking negatively predicts the ability to discern fake from real news. Such a construct extends also to reasoning processes based on thinking outside the box and considering alternative information such as problem-solving.
News audiences in Norway were the most likely to pay for online news according to a global study on paid digital news content consumption, with 42 percent having paid for news online in the last year. Sweden ranked second, followed by Switzerland, Australia, and Austria. With the changing media landscape leading more and more consumers to turn to digital sources for news, publishers are adding paywalls to their sites. However, not all consumers are equally inclined to pay for digital news content. News audiences in Italy and the UK, for example, were substantially less likely to pay for online news than U.S. consumers.
Why pay for online news?
The reasons for paying for news are diverse and dependent on various factors. The digitalization of news allows stories to be shared and disseminated on a global scale, but not all sources are reliable or credible. For consumers, it is often difficult to identify trustworthy news sources, and therefore which sources they would happily pay for. Consumers may also be reluctant to pay for news because of the sheer amount of free content online. While the availability of free content has made news more accessible, it also affects journalists and publishers. In Finland, for example, this has led to a correlated decrease in sales of printed content. As traditional print publications move online, there is also a growing reliance on advertising to generate revenue. Users are encouraged to pay for access to restricted material as publishers limit content to members only. Consumers' willingness to pay was seen to depend on content, with Americans happier to pay for news than for features or e-magazines.
Impact of the coronavirus
With the coronavirus pandemic forcing millions across the globe to stay at home, having access to digital news has never been more crucial. Accordingly, an increase in subscribers paying for premium news content could be expected. However, the health crisis has also led to economic hardship for many, which may instead lead to people cutting out luxuries such as paid news subscriptions. In the UK, for example, 2020 saw a decrease in people paying for news content compared to the previous year. With the pandemic dominating news reports, 2020 also saw audiences experience news fatigue, and after a year of news coverage saturated with coronavirus updates, consumers may feel the need to switch off entirely.
A survey from April 2020 showed that 79 percent of Italian people believed Facebook to be responsible for spreading false or inaccurate information regarding the coronavirus (COVID-19) and its impact. Data revealed that television was considered less reliable than Twitter or Instagram.
A survey from April 2020 showed that about eight out of ten Italian people believed Facebook to be responsible for spreading false or inaccurate information regarding the coronavirus (COVID-19) and its impact. In more detail, 78 percent of male respondents held this opinion, while the percentage amounted to 80 percent among women. However, when it came to information about the pandemic, male respondents seemed to distrust all other news sources more than the female respondents did.
A survey from April 2020 showed that 34 percent of Italian individuals living in the Islands believed newspapers to be responsible for spreading fake or inaccurate information about the coronavirus (COVID-19) and its impact. The percentage was lower in the North-West of the country (25 percent).