100+ datasets found

COVID-19 Global Case and Death Data
kaggle.com
zip
Updated Dec 4, 2023
Cite
The Devastator (2023). COVID-19 Global Case and Death Data [Dataset]. https://www.kaggle.com/datasets/thedevastator/covid-19-global-case-and-death-data
Explore at:
Available download formats: zip (81724234 bytes)
Dataset updated
Dec 4, 2023
Authors
The Devastator
Description
COVID-19 Global Case and Death Data

Global COVID-19 Cases and Deaths Over Time

By Coronavirus (COVID-19) Data Hub [source]

About this dataset

The COVID-19 Global Time Series Case and Death Data is a comprehensive collection of global COVID-19 case and death information recorded over time. This dataset includes data from various sources such as JHU CSSE COVID-19 Data and The New York Times.

The dataset consists of several columns providing detailed information on different aspects of the COVID-19 situation. The COUNTRY_SHORT_NAME column represents the short name of the country where the data is recorded, while the Data_Source column indicates the source from which the data was obtained.

Other important columns include Cases, which denotes the number of COVID-19 cases reported, and Difference, which indicates the difference in case numbers compared to the previous day. Additionally, there are columns such as CONTINENT_NAME, DATA_SOURCE_NAME, COUNTRY_ALPHA_3_CODE, COUNTRY_ALPHA_2_CODE that provide additional details about countries and continents.

Furthermore, this dataset also includes information on deaths related to COVID-19. The column PEOPLE_DEATH_NEW_COUNT shows the number of new deaths reported on a specific date.

To provide more context to the data, certain columns offer demographic details about locations. For instance, Population_Count provides population counts for different areas. Moreover, a FIPS code is available for provincial/state regions for identification purposes.

It is important to note that this dataset covers both confirmed cases (Case_Type: confirmed) as well as probable cases (Case_Type: probable). These classifications help differentiate between various types of COVID-19 infections.

Overall, this dataset offers a comprehensive picture of the global COVID-19 situation by providing accurate and up-to-date information on cases, deaths, demographic details (such as population count or FIPS code), source references (such as JHU CSSE or The New York Times), and geographical information (country names coded with ALPHA codes), making it useful for researchers studying patterns and trends associated with this pandemic.

How to use the dataset

Understanding the Dataset Structure:

The dataset is available in two files: COVID-19 Activity.csv and COVID-19 Cases.csv.

Both files contain different columns that provide information about the COVID-19 cases and deaths.

Some important columns to look out for are:
a. PEOPLE_POSITIVE_CASES_COUNT: The total number of confirmed positive COVID-19 cases.
b. COUNTY_NAME: The name of the county where the data is recorded.
c. PROVINCE_STATE_NAME: The name of the province or state where the data is recorded.
d. REPORT_DATE: The date when the data was reported.
e. CONTINENT_NAME: The name of the continent where the data is recorded.
f. DATA_SOURCE_NAME: The name of the data source.
g. PEOPLE_DEATH_NEW_COUNT: The number of new deaths reported on a specific date.
h. COUNTRY_ALPHA_3_CODE: The three-letter alpha code that represents the country.
i. Lat, Long: The latitude and longitude coordinates of the location.
j. Country_Region or COUNTRY_SHORT_NAME: The country or region where cases were reported.

Choosing Relevant Columns: It's important to determine which columns are relevant to your analysis or research question before proceeding with further analysis.

Exploring Data Patterns: Use various statistical techniques like summarizing statistics, creating visualizations (e.g., bar charts, line graphs), etc., to explore patterns in different variables over time or across regions/countries.

Filtering Data: You can filter your dataset based on specific criteria using column(s) such as COUNTRY_SHORT_NAME, CONTINENT_NAME, or PROVINCE_STATE_NAME to focus on specific countries, continents, or regions of interest.

Combining Data: You can combine data from different sources (e.g., COVID-19 cases and deaths) to perform advanced analysis or create insightful visualizations.

Analyzing Trends: Use the dataset to analyze and identify trends in COVID-19 cases and deaths over time. You can examine factors such as population count, testing count, hospitalization count, etc., to gain deeper insights into the impact of the virus.
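
As a minimal sketch of the filtering and trend steps above, the snippet below reads rows with Python's standard csv module, keeps one country, and sums new deaths per report date. The column names come from the description; the sample rows, the ISO date format, and the helper name new_deaths_by_date are assumptions made for illustration and are not taken from the actual files.

```python
import csv
import io
from collections import defaultdict

# Invented sample rows that reuse column names listed in the description.
# The real files may use a different date format and contain more columns.
SAMPLE_CSV = """COUNTRY_SHORT_NAME,CONTINENT_NAME,REPORT_DATE,PEOPLE_POSITIVE_CASES_COUNT,PEOPLE_DEATH_NEW_COUNT
Italy,Europe,2020-03-01,100,5
Italy,Europe,2020-03-02,150,18
Spain,Europe,2020-03-01,40,0
Spain,Europe,2020-03-02,60,0
"""


def new_deaths_by_date(csv_file, country):
    """Sum PEOPLE_DEATH_NEW_COUNT per REPORT_DATE for one country.

    Summing handles files that hold several rows per date
    (for example one row per province or county).
    """
    totals = defaultdict(int)
    for row in csv.DictReader(csv_file):
        if row["COUNTRY_SHORT_NAME"] == country:
            totals[row["REPORT_DATE"]] += int(row["PEOPLE_DEATH_NEW_COUNT"])
    # Sort by date string; this is chronological only for ISO-style dates.
    return dict(sorted(totals.items()))


italy = new_deaths_by_date(io.StringIO(SAMPLE_CSV), "Italy")
print(italy)
```

To run this against a downloaded file, the StringIO object could be replaced with an opened file handle, for example open("COVID-19 Activity.csv", newline="").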

Comparing Countries/Regions: Compare COVID-19

Research Ideas

Trend Analysis: This dataset can be used to analyze and track the trends of COVID-19 cases and deaths over time. It provides comprehensive global data, allowing researchers and po...
COVID-19 Coronavirus data - weekly
kaggle.com
zip
Updated Mar 17, 2022
+ more versions
Cite
Habib Gültekin (2022). COVID-19 Coronavirus data - weekly [Dataset]. https://www.kaggle.com/hgultekin/covid19-coronavirus-data-weekly
Explore at:
Available download formats: zip (811658 bytes)
Dataset updated
Mar 17, 2022
Authors
Habib Gültekin
License
http://opendatacommons.org/licenses/dbcl/1.0/
Description
Content

The dataset contains a weekly situation update on COVID-19, the epidemiological curve and the global geographical distribution (EU/EEA and the UK, worldwide).

Since the beginning of the coronavirus pandemic, ECDC’s Epidemic Intelligence team has collected the number of COVID-19 cases and deaths, based on reports from health authorities worldwide. This comprehensive and systematic process was carried out on a daily basis until 14/12/2020. See the discontinued daily dataset: COVID-19 Coronavirus data - daily. ECDC’s decision to discontinue daily data collection is based on the fact that the daily number of cases reported or published by countries is frequently subject to retrospective corrections, delays in reporting and/or clustered reporting of data for several days. Therefore, the daily number of cases may not reflect the true number of cases at EU/EEA level on a given day of reporting. Consequently, day-to-day variations in the number of cases do not constitute a valid basis for policy decisions.

ECDC continues to monitor the situation. Every week between Monday and Wednesday, a team of epidemiologists screen up to 500 relevant sources to collect the latest figures for publication on Thursday. The data screening is followed by ECDC’s standard epidemic intelligence process for which every single data entry is validated and documented in an ECDC database. An extract of this database, complete with up-to-date figures and data visualisations, is then shared on the ECDC website, ensuring a maximum level of transparency.

ECDC receives regular updates from EU/EEA countries through the Early Warning and Response System (EWRS), The European Surveillance System (TESSy), the World Health Organization (WHO) and email exchanges with other international stakeholders. This information is complemented by screening up to 500 sources every day to collect COVID-19 figures from 196 countries. This includes websites of ministries of health (43% of the total number of sources), websites of public health institutes (9%), websites from other national authorities (ministries of social services and welfare, governments, prime minister cabinets, cabinets of ministries, websites on health statistics and official response teams) (6%), WHO websites and WHO situation reports (2%), and official dashboards and interactive maps from national and international institutions (10%). In addition, ECDC screens social media accounts maintained by national authorities on for example Twitter, Facebook, YouTube or Telegram accounts run by ministries of health (28%) and other official sources (e.g. official media outlets) (2%). Several media and social media sources are screened to gather additional information which can be validated with the official sources previously mentioned. Only cases and deaths reported by the national and regional competent authorities from the countries and territories listed are aggregated in our database.

Disclaimer: National updates are published at different times and in different time zones. This, and the time ECDC needs to process these data, might lead to discrepancies between the national numbers and the numbers published by ECDC. Users are advised to use all data with caution and awareness of their limitations. Data are subject to retrospective corrections; corrected datasets are released as soon as processing of updated national data has been completed.

Source

https://data.europa.eu/euodp/en/data/dataset/covid-19-coronavirus-data-weekly-from-17-december-2020
Global Covid-19 Data
kaggle.com
zip
Updated Dec 3, 2023
Cite
The Devastator (2023). Global Covid-19 Data [Dataset]. https://www.kaggle.com/datasets/thedevastator/global-covid-19-data
Explore at:
Available download formats: zip (15394324 bytes)
Dataset updated
Dec 3, 2023
Authors
The Devastator
Description
Global Covid-19 Data

Global Covid-19 data on cases, deaths, vaccinations, and more

By Valtteri Kurkela [source]

About this dataset

The dataset is updated continuously and synced hourly to keep the information current. With many columns available for analysis and exploration, users can extract valuable insights from this extensive dataset.

Some of the key metrics covered in the dataset include:

1. Vaccinations: The dataset covers total vaccinations administered worldwide as well as breakdowns of people vaccinated per hundred people and fully vaccinated individuals per hundred people.

2. Testing & Positivity: Information on total tests conducted along with new tests conducted per thousand people is provided. Additionally, details on positive rate (percentage of positive Covid-19 tests out of all conducted) are included.

3. Hospital & ICU: Data on ICU patients and hospital patients are available along with corresponding figures normalized per million people. Weekly admissions to intensive care units and hospitals are also provided.

4. Confirmed Cases: The number of confirmed Covid-19 cases globally is captured both in absolute numbers and as normalized values representing cases per million people.

5. Confirmed Deaths: Total confirmed deaths due to Covid-19 worldwide are provided with figures adjusted for population size (total deaths per million).

6. Reproduction Rate: The estimated reproduction rate (R) indicates the contagiousness of the virus within a particular country or region.

7. Policy Responses: Besides healthcare-related metrics, this comprehensive dataset includes policy responses implemented by countries or regions such as lockdown measures or travel restrictions.

8. Other Variables of Interest: The data encompasses various socioeconomic factors that may influence Covid-19 outcomes, including population density, continent membership, and gross domestic product (GDP) per capita.

Demographic factors:
- Age structure: percentage of the population aged 65 and older, aged 70 and older, and median age
- Gender-specific factors: percentage of female smokers
- Lifestyle-related factors: diabetes prevalence rate and extreme poverty rate

9. Excess Mortality: The dataset further provides insights into excess mortality rates, indicating the percentage increase in deaths above the expected number based on historical data.

The dataset consists of numerous columns providing specific information for analysis, such as ISO codes for countries/regions, location names, and units of measurement for different parameters.

Overall, this dataset serves as a valuable resource for researchers, analysts, and policymakers seeking to explore various aspects related to Covid-19.

How to use the dataset

Introduction:

Understanding the Basic Structure:

The dataset consists of various columns containing different data related to vaccinations, testing, hospitalization, cases, deaths, policy responses, and other key variables.

Each row represents data for a specific country or region at a certain point in time.

Selecting Desired Columns:

Identify the specific columns that are relevant to your analysis or research needs.

Some important columns include population, total cases, total deaths, new cases per million people, and vaccination-related metrics.

Filtering Data:

Use filters based on specific conditions such as date ranges or continents to focus on relevant subsets of data.

This can help you analyze trends over time or compare data between different regions.

Analyzing Vaccination Metrics:

Explore variables like total_vaccinations, people_vaccinated, and people_fully_vaccinated to assess vaccination coverage in different countries.

Calculate metrics such as people_vaccinated_per_hundred or total_boosters_per_hundred for standardized comparisons across populations.

Investigating Testing Information:

Examine columns such as total_tests, new_tests, and tests_per_case to understand testing efforts in various countries.

Calculate rates like tests_per_case to assess testing efficiency or identify changes in testing strategies over time.

Exploring Hospitalization and ICU Data:

Analyze variables like hosp_patients, icu_patients, and hospital_beds_per_thousand to understand healthcare systems' strain.

Calculate rates like icu_patients_per_million or hosp_patients_per_million for cross-country comparisons.
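
The normalized metrics mentioned above (per hundred, per million) can be sketched as follows. This is a minimal illustration assuming each record carries a raw count and a population figure; the helper names per_hundred and per_million and the sample row are hypothetical, and the dataset may already ship these columns precomputed.

```python
def per_hundred(count, population):
    """Return count per 100 people, or None when a value is missing."""
    if count is None or not population:
        return None
    return 100.0 * count / population


def per_million(count, population):
    """Return count per 1,000,000 people, or None when a value is missing."""
    if count is None or not population:
        return None
    return 1_000_000.0 * count / population


# Hypothetical row; the values are invented, not taken from the dataset.
row = {"population": 200, "people_vaccinated": 50, "icu_patients": None}

normalized = {
    "people_vaccinated_per_hundred": per_hundred(
        row["people_vaccinated"], row["population"]
    ),
    "icu_patients_per_million": per_million(
        row["icu_patients"], row["population"]
    ),
}
print(normalized)
```

Returning None for missing inputs, rather than zero, keeps gaps in reporting distinguishable from genuine zero counts when comparing countries.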

Assessing Covid-19 Cases and Deaths:

Analyze variables like total_cases, new_ca...
CORONAVIRUS DEATHS by Country Dataset
tradingeconomics.com
csv, excel, json, xml
Updated Mar 4, 2020
+ more versions
Cite
TRADING ECONOMICS (2020). CORONAVIRUS DEATHS by Country Dataset [Dataset]. https://tradingeconomics.com/country-list/coronavirus-deaths
Explore at:
Available download formats: csv, excel, xml, json
Dataset updated
Mar 4, 2020
Dataset authored and provided by
TRADING ECONOMICS
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Time period covered
2025
Area covered
World
Description
This dataset provides values for CORONAVIRUS DEATHS reported in several countries. The data includes current values, previous releases, historical highs and record lows, release frequency, reported unit and currency.
COVID-19 Country Data
kaggle.com
zip
Updated May 3, 2020
Cite
Patrick (2020). COVID-19 Country Data [Dataset]. https://www.kaggle.com/datasets/bitsnpieces/covid19-country-data/code
Explore at:
Available download formats: zip (190821 bytes)
Dataset updated
May 3, 2020
Authors
Patrick
License
http://opendatacommons.org/licenses/dbcl/1.0/
Description
Motivation

Why did I create this dataset? This is my first time creating a notebook in Kaggle and I am interested in learning more about COVID-19 and how different countries are affected by it and why. It might be useful to compare different metrics between different countries. And I also wanted to participate in a challenge, and I've decided to join the COVID-19 datasets challenge. While looking through the projects, I noticed https://www.kaggle.com/koryto/countryinfo and it inspired me to start this project.

Method

My approach is to scour the Internet and Kaggle looking for country data that can potentially have an impact on how the COVID-19 pandemic spreads. In the end, I ended up with the following for each country:

Monthly temperature and precipitation from Worldbank

Latitude and longitude

Population, density, gender and age

Airport traffic from Worldbank

COVID-19 date of first case and number of cases and deaths as of March 26, 2020

2009 H1N1 flu pandemic cases and deaths obtained from Wikipedia

Property affordability index and Health care index from Numbeo

Number of hospital beds and ICU beds from Wikipedia

Flu and pneumonia death rate from Worldlifeexpectancy.com (Age Adjusted Death Rate Estimates: 2017)

School closures due to COVID-19

Number of COVID-19 tests done

Number of COVID-19 genetic strains

US Social Distancing Policies from COVID19StatePolicy’s SocialDistancing repository on GitHub

DHL Global Connectedness Index 2018 (People Breadth scores)

Datasets have been merged by country name whenever possible. I needed to rename some countries by hand, e.g. US to United States, etc., but it's possible that I might have missed some. See the output file covid19_merged.csv for the merged result.

See covid19_data - data_sources.csv for data source details.

Notebook: https://www.kaggle.com/bitsnpieces/covid19-data

Caveats

Since I did not personally collect each datapoint, and because each datasource is different with different objectives, collected at different times, measured in different ways, any inferences from this dataset will need further investigation.

Other interesting sources of information

IMF Policy Tracker

nCov strain analysis across the globe!

Google Mobility Report

Oxford government response tracker

Impact on aviation

Impact on restaurants

Harvard

Tableau datahub

Unemployment outlook from ILO

Acknowledgements

I want to acknowledge the authors of the datasets that made their data publicly available which has made this project possible. Banner image is by Brian.

I hope that the community finds this dataset useful. Feel free to recommend other datasets that you think will be useful / relevant! Thanks for looking.
Coronavirus COVID-19 Global Cases by the Center for Systems Science and...
github.com
systems.jhu.edu
+1more
Cite
Johns Hopkins University Center for Systems Science and Engineering (JHU CSSE), Coronavirus COVID-19 Global Cases by the Center for Systems Science and Engineering (CSSE) at Johns Hopkins University (JHU) [Dataset]. https://github.com/CSSEGISandData/COVID-19
Explore at:
Dataset provided by
Johns Hopkins University Center for Systems Science and Engineering (JHU CSSE)
Area covered
Global
Description
2019 Novel Coronavirus COVID-19 (2019-nCoV) Visual Dashboard and Map:
https://www.arcgis.com/apps/opsdashboard/index.html#/bda7594740fd40299423467b48e9ecf6
Confirmed Cases by Country/Region/Sovereignty
Confirmed Cases by Province/State/Dependency
Deaths
Recovered
Downloadable data:
https://github.com/CSSEGISandData/COVID-19
Additional Information about the Visual Dashboard:
https://systems.jhu.edu/research/public-health/ncov
COVID-19 development in Vietnam and 10 neighboring countries in Asia -...
data.opendevelopmentmekong.net
Updated Jul 30, 2020
+ more versions
Cite
(2020). COVID-19 development in Vietnam and 10 neighboring countries in Asia - Dataset OD Mekong Datahub [Dataset]. https://data.opendevelopmentmekong.net/dataset/covid-19-increasing-in-vietnam-and-14-neighboring-countries-in-asia
Explore at:
Dataset updated
Jul 30, 2020
License
Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
Area covered
Mekong River, Asia, Vietnam
Description
The data set provides the number of COVID-19 infections in Vietnam and some neighboring countries in Asia. For Vietnam, cases are classified into the total number of cases, new cases, and new infections in the community, where infections in the community are those of unknown origin. The case data are published daily and aggregated up to the time of the current statistics.
Data from: Worldwide differences in COVID-19-related mortality
scielo.figshare.com
jpeg
Updated Jun 1, 2023
Cite
Pedro Curi Hallal (2023). Worldwide differences in COVID-19-related mortality [Dataset]. http://doi.org/10.6084/m9.figshare.14284478.v1
Explore at:
Available download formats: jpeg
Unique identifier
https://doi.org/10.6084/m9.figshare.14284478.v1
Dataset updated
Jun 1, 2023
Dataset provided by
SciELO (http://www.scielo.org/)
Authors
Pedro Curi Hallal
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Abstract: Mortality statistics due to COVID-19 worldwide are compared, by adjusting for the size of the population and the stage of the pandemic. Data from the European Centre for Disease Prevention and Control and Our World in Data websites were used. Analyses are based on the number of deaths per one million inhabitants. In order to account for the stage of the pandemic, the baseline date was defined as the day on which the 10th death was reported. The analyses included 78 countries and territories which reported 10 or more deaths by April 9. On day 10, India had 0.06 deaths per million, Belgium had 30.46 and San Marino 618.78. On day 20, India had 0.27 deaths per million, China had 0.71 and Spain 139.62. On day 30, four Asian countries had the lowest mortality figures, whereas eight European countries had the highest ones. In Italy and Spain, mortality on day 40 was greater than 250 per million, whereas in China and South Korea, mortality was below 4 per million. Mortality on day 10 was moderately correlated with life expectancy, but not with population density. Asian countries presented much lower mortality figures as compared to European ones. Life expectancy was found to be correlated with mortality.
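
The alignment described in the abstract can be illustrated with a short sketch: find the first day on which cumulative deaths reach 10, then report deaths per million a fixed number of days later. This is not the authors' code. The function names, the convention that the baseline date counts as day 0, and the sample series are assumptions made for illustration.

```python
def baseline_index(cumulative_deaths, threshold=10):
    """Index of the first day with at least `threshold` cumulative deaths."""
    for i, total in enumerate(cumulative_deaths):
        if total >= threshold:
            return i
    return None  # the threshold was never reached


def deaths_per_million_on_day(cumulative_deaths, population, day, threshold=10):
    """Cumulative deaths per million `day` days after the baseline date.

    Assumes the baseline date itself is day 0, which the abstract
    does not state explicitly.
    """
    start = baseline_index(cumulative_deaths, threshold)
    if start is None or start + day >= len(cumulative_deaths):
        return None  # not enough data for the requested day
    return cumulative_deaths[start + day] * 1_000_000 / population


# Invented daily cumulative death counts for a hypothetical country.
series = [0, 2, 9, 12, 20, 35, 50]
print(baseline_index(series))
print(deaths_per_million_on_day(series, population=2_000_000, day=2))
```

Aligning on the baseline date rather than the calendar date lets countries at different stages of the pandemic be compared at an equivalent point in their outbreaks.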
Data from: COVID-19 Datasets for predicting the number of new cases of...
data.mendeley.com
narcis.nl
Updated Jul 28, 2020
Cite
Pınar Tüfekci (2020). COVID-19 Datasets for predicting the number of new cases of COVID-19 ahead of 1 day, 3 days, and 10 days [Dataset]. http://doi.org/10.17632/499vtcykvw.1
Explore at:
Unique identifier
https://doi.org/10.17632/499vtcykvw.1
Dataset updated
Jul 28, 2020
Authors
Pınar Tüfekci
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Four datasets are presented here. The original dataset is a collection of the COVID-19 data maintained by Our World in Data. It includes data on confirmed cases and deaths, as well as other variables of potential interest, for ten countries: Australia, Brazil, Canada, China, Denmark, France, Israel, Italy, the United Kingdom, and the United States. The original dataset covers 31 December 2019 to 31 May 2020, with a total of 1,530 instances and 19 features. This dataset is collected from a variety of sources (the European Centre for Disease Prevention and Control, United Nations, World Bank, Global Burden of Disease, Blavatnik School of Government, etc.). The original dataset is first pre-processed by cleaning and removing unnecessary and blank data. Then, all strings are converted to numeric values, and some new features such as continent, hemisphere, year, month, and day are added by extracting them from the original features. After that, the processed dataset is organized for predicting the number of new cases of COVID-19 1 day, 3 days, and 10 days ahead, and three datasets (Dataset-1, 2, 3) are created for that purpose.
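
One common way to organize a daily series for prediction at a fixed horizon is to pair each day's value with the value observed some days later. The sketch below shows that framing for horizons of 1, 3, and 10 days. It is an assumption about how such datasets are typically built, not the author's documented procedure, and the function name make_supervised and the sample values are hypothetical.

```python
def make_supervised(new_cases, horizon):
    """Pair each day's value with the value `horizon` days later.

    Returns a list of (feature, target) tuples; the last `horizon`
    days have no target yet and are dropped.
    """
    return [
        (new_cases[i], new_cases[i + horizon])
        for i in range(len(new_cases) - horizon)
    ]


# Invented daily new-case counts for a single hypothetical country.
daily_new_cases = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]

# One derived dataset per horizon, mirroring Dataset-1, 2, and 3.
datasets = {h: make_supervised(daily_new_cases, h) for h in (1, 3, 10)}
for horizon, pairs in datasets.items():
    print(horizon, len(pairs))
```

In a multi-country table, this pairing would need to be applied per country so that targets are never taken from a different country's rows.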
Excess mortality by month
ec.europa.eu
Updated Sep 16, 2025
Cite
Eurostat (2025). Excess mortality by month [Dataset]. http://doi.org/10.2908/DEMO_MEXRT
Explore at:
Available download formats: tsv, application/vnd.sdmx.data+csv;version=2.0.0, application/vnd.sdmx.data+csv;version=1.0.0, application/vnd.sdmx.genericdata+xml;version=2.1, application/vnd.sdmx.data+xml;version=3.0.0, json
Unique identifier
https://doi.org/10.2908/DEMO_MEXRT
Dataset updated
Sep 16, 2025
Dataset authored and provided by
Eurostat (https://ec.europa.eu/eurostat)
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Time period covered
Jan 2020 - Jun 2025
Area covered
Latvia, Finland, Romania, Poland, France, Hungary, Norway, Germany, Malta, Lithuania
Description
The monthly excess mortality indicator is based on the exceptional data collection on weekly deaths that Eurostat and the National Statistical Institutes set up, in April 2020, in order to support the policy and research efforts related to the COVID-19 pandemic. With that data collection, Eurostat's target was to provide quickly statistics assessing the changing situation of the total number of deaths on a weekly basis, from early 2020 onwards.

The National Statistical Institutes transmit available data on total weekly deaths, classified by sex, 5-year age groups and NUTS3 regions (NUTS2021) over the last 20 years, on a voluntary basis. The resulting online tables, and complementary metadata, are available in the folder Weekly deaths - special data collection (demomwk).

Starting in 2025, the weekly deaths data are collected on a quarterly basis. The database was updated on 16 June 2025 (1st quarter) and on 16 September 2025 (2nd quarter); the next updates will be in mid-December 2025 (3rd quarter) and mid-February 2026 (4th quarter).

In December 2020, Eurostat released the European Recovery Statistical Dashboard containing also indicators tracking economic and social developments, including health. In this context, “excess mortality” offers elements for monitoring and further analysing direct and indirect effects of the COVID-19 pandemic.

The monthly excess mortality indicator draws attention to the magnitude of the crisis by providing a comprehensive comparison of additional deaths amongst the European countries and allowing for further analysis of its causes. The number of deaths from all causes is compared with the expected number of deaths during a certain period in the past (baseline period, 2016-2019).
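
The comparison described above can be written as a percentage of additional deaths relative to the expected number. The sketch below assumes the expected number is the plain average of deaths in the same month across the baseline years 2016-2019; the source states the baseline period but not the exact averaging method, and the function name and sample figures are hypothetical.

```python
from statistics import mean


def excess_mortality_pct(observed_deaths, baseline_deaths):
    """Percentage of deaths above (or below) the baseline average.

    `baseline_deaths` holds deaths for the same month in each baseline
    year (assumed here to be 2016-2019, averaged with equal weight).
    """
    expected = mean(baseline_deaths)
    return 100.0 * (observed_deaths - expected) / expected


# Invented monthly death counts for a hypothetical country.
baseline_march = [1000, 1000, 1000, 1000]  # March 2016 to March 2019
print(excess_mortality_pct(1200, baseline_march))
```

A negative result indicates fewer deaths than the baseline average for that month.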

Excess mortality may vary according to different phenomena because the indicator compares the total number of deaths from all causes with the expected number of deaths during a certain period in the past (baseline). While a substantial increase largely coincides with a COVID-19 outbreak in each country, the indicator does not make a distinction between causes of death. Similarly, it does not take into account changes over time and differences between countries in terms of the size and age/sex structure of the population. Statistics on excess deaths provide information about the burden of mortality potentially related to the COVID-19 pandemic, thereby covering not only deaths that are directly attributed to the virus but also those indirectly related to it or even due to another reason. For example, in July 2022, several countries recorded unusually high numbers of excess deaths compared to the same month of 2020 and 2021, a situation probably connected not only to COVID-19 but also to the heatwaves that affected parts of Europe during the reference period.

In addition to confirmed deaths, excess mortality captures COVID-19 deaths that were not correctly diagnosed and reported, as well as deaths from other causes that may be attributed to the overall crisis. It also accounts for the partial absence of deaths from other causes like accidents that did not occur due, for example, to the limitations in commuting or travel during the lockdown periods.
CoVid Plots and Analysis
orda.shef.ac.uk
datasetcatalog.nlm.nih.gov
+2more
txt
Updated Feb 26, 2023
Cite
Colin Angus (2023). CoVid Plots and Analysis [Dataset]. http://doi.org/10.15131/shef.data.12328226.v60
Explore at:
Available download formats: txt
Unique identifier
https://doi.org/10.15131/shef.data.12328226.v60
Dataset updated
Feb 26, 2023
Dataset provided by
The University of Sheffield
Authors
Colin Angus
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
COVID-19
Plots and analysis relating to the coronavirus pandemic. Includes five sets of plots and associated R code to generate them.
1) Heatmaps
Updated every few days - heatmaps of COVID-19 case and death trajectories for Local Authorities (or equivalent) in England, Wales, Scotland, Ireland and Germany.
2) All cause mortality
Updated on Tuesday (for England & Wales), Wednesday (for Scotland) and Friday (for Northern Ireland) - analysis and plots of weekly all-cause deaths in 2020 compared to previous years by country, age, sex and region. Also a set of international comparisons using data from mortality.org
3) Exposures
No longer updated - mapping of potential COVID-19 mortality exposure at local levels (LSOAs) in England based on the age-sex structure of the population and levels of poor health. There is also a Shiny app which creates slightly lower resolution versions of the same plots online, which you can find here: https://victimofmaths.shinyapps.io/covidmapper/, on GitHub https://github.com/VictimOfMaths/COVIDmapper and uploaded to this record
4) Index of Multiple Deprivation
No longer updated - preliminary analysis of the inequality impacts of COVID-19 based on Local Authority level cases and levels of deprivation.
5) Socioeconomic inequalities
No longer updated (unless ONS release more data) - Analysis of published ONS figures of COVID-19 and other cause mortality in 2020 compared to previous years by deprivation decile.
Latest versions of plots and associated analysis can be found on Twitter: https://twitter.com/victimofmaths
This work is described in more detail on the UK Data Service Impact and Innovation Lab blog: https://blog.ukdataservice.ac.uk/visualising-high-risk-areas-for-covid-19-mortality/
Adapted from data from the Office for National Statistics licensed under the Open Government Licence v.1.0. http://www.nationalarchives.gov.uk/doc/open-government-licence/version/3/
Data on the daily number of new reported COVID-19 cases and deaths by EU/EEA...
data.niaid.nih.gov
Updated Jan 11, 2024
+ more versions
Cite
Rocca, Marica Teresa (2024). Data on the daily number of new reported COVID-19 cases and deaths by EU/EEA countries [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_10491751
Explore at:
Dataset updated
Jan 11, 2024
Dataset provided by
Università degli Studi di Pavia
Authors
Rocca, Marica Teresa
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Area covered
European Union
Description
The dataset contains the number of new cases and deaths reported per day and per Country in the EU/EEA.

It is based on data originally downloaded from the site https://www.ecdc.europa.eu/en/covid-19.

Raw data from ECDC; harmonization and homogenization of the data by UNIPV - Laboratory of Geomatics
Data_Sheet_4_Toward a Country-Based Prediction Model of COVID-19 Infections...
datasetcatalog.nlm.nih.gov
frontiersin.figshare.com
Updated Jun 10, 2021
+ more versions
Cite
Howard, Scott C.; Aleya, Lotfi; Wang, Lishi; Gu, Weikuan; Meng, Xia; Xie, Ning; Wang, Yongjun; Gu, Tianshu; Li, Zhijun; Postlethwaite, Arnold (2021). Data_Sheet_4_Toward a Country-Based Prediction Model of COVID-19 Infections and Deaths Between Disease Apex and End: Evidence From Countries With Contained Numbers of COVID-19.PDF [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0000850339
Explore at:
Dataset updated
Jun 10, 2021
Authors
Howard, Scott C.; Aleya, Lotfi; Wang, Lishi; Gu, Weikuan; Meng, Xia; Xie, Ning; Wang, Yongjun; Gu, Tianshu; Li, Zhijun; Postlethwaite, Arnold
Description
The complexity of COVID-19 and variations in control measures and containment efforts in different countries have caused difficulties in the prediction and modeling of the COVID-19 pandemic. We attempted to predict the scale of the latter half of the pandemic based on real data using the ratio between the early and latter halves from countries where the pandemic is largely over. We collected daily pandemic data from China, South Korea, and Switzerland and subtracted the ratio of pandemic days before and after the disease apex day of COVID-19. We obtained the ratio of pandemic data and created multiple regression models for the relationship between before and after the apex day. We then tested our models using data from the first wave of the disease from 14 countries in Europe and the US. We then tested the models using data from these countries from the entire pandemic up to March 30, 2021. Results indicate that the actual numbers of cases from these countries during the first wave mostly fall in the predicted ranges of linear regression, except for Spain and Russia. Similarly, the actual deaths in these countries mostly fall into the range of predicted data. Using the accumulated data up to the day of apex and total accumulated data up to March 30, 2021, the case numbers in these countries fall into the range of predicted data, except for data from Brazil. The actual numbers of deaths in all the countries are at or below the predicted data. In conclusion, a linear regression model built with real data from countries or regions from early pandemics can predict pandemic scales of the countries where the pandemics occur late. Such a prediction with a high degree of accuracy provides valuable information for governments and the public.
INTRODUCTION OF COVID-NEWS-US-NNK AND COVID-NEWS-BD-NNK DATASET
data.niaid.nih.gov
Updated Jul 19, 2024
Cite
Nafiz Sadman; Nishat Anjum; Kishor Datta Gupta (2024). INTRODUCTION OF COVID-NEWS-US-NNK AND COVID-NEWS-BD-NNK DATASET [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_4047647
Explore at:
Dataset updated
Jul 19, 2024
Dataset provided by
Silicon Orchard Lab, Bangladesh
Independent University, Bangladesh
University of Memphis, USA
Authors
Nafiz Sadman; Nishat Anjum; Kishor Datta Gupta
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Area covered
United States, Bangladesh
Description
Introduction

There are several works based on Natural Language Processing on newspaper reports. Rameshbhai et al., in mining opinions from headlines [ 1 ] using Stanford NLP and SVM, compared several algorithms on a small and a large dataset. Rubin et al., in their paper [ 2 ], created a mechanism to differentiate fake news from real news by building a set of characteristics of news according to their types. The purpose was to contribute to the low-resource data available for training machine learning algorithms. Doumit et al. in [ 3 ] implemented LDA, a topic modeling approach, to study bias present in online news media.

However, there is not much NLP research invested in studying COVID-19. Most applications include classification of chest X-rays and CT-scans to detect the presence of pneumonia in lungs [ 4 ], a consequence of the virus. Other research areas include studying the genome sequence of the virus [ 5 ][ 6 ][ 7 ] and replicating its structure to fight the virus and find a vaccine. This research is crucial in battling the pandemic. The few NLP-based research publications include sentiment classification of online tweets by Samuel et al. [ 8 ] to understand the fear persisting in people due to the virus. Similar work has been done using an LSTM network to classify sentiments from online discussion forums by Jelodar et al. [ 9 ]. To the best of our knowledge, the NNK dataset is the first study on a comparatively larger dataset of newspaper reports on COVID-19, which contributed to awareness of the virus.

2 Data-set Introduction

2.1 Data Collection

We accumulated 1000 online newspaper reports on COVID-19 from the United States of America (USA). The newspapers include The Washington Post (USA) and StarTribune (USA). We named this dataset “Covid-News-USA-NNK”. We also accumulated 50 online newspaper reports from Bangladesh on the issue and named that dataset “Covid-News-BD-NNK”. The newspapers include The Daily Star (BD) and Prothom Alo (BD). All these newspapers are among the top providers and most read in their respective countries. The collection was done manually by 10 human data-collectors of age group 23- with university degrees. This approach was preferred over automation to ensure the news was highly relevant to the subject. The newspapers' online sites had dynamic content with advertisements in no particular order, so online scrapers were likely to collect inaccurate news reports. One of the challenges while collecting the data was the subscription requirement: each newspaper required $1 per subscription. Some criteria provided as guidelines to the human data-collectors were as follows:

The headline must have one or more words directly or indirectly related to COVID-19.

The content of each news must have 5 or more keywords directly or indirectly related to COVID-19.

The genre of the news can be anything as long as it is relevant to the topic. Political, social, economical genres are to be more prioritized.

Avoid taking duplicate reports.

Maintain a time frame for the above mentioned newspapers.

To collect these data we used a Google form for USA and BD. Two human editors went through each entry to check for any spam or troll entries.

2.2 Data Pre-processing and Statistics

Some pre-processing steps performed on the newspaper report dataset are as follows:

Remove hyperlinks.

Remove non-English alphanumeric characters.

Remove stop words.

Lemmatize text.

While more pre-processing could have been applied, we tried to keep the data as unchanged as possible, since changing sentence structures could result in the loss of valuable information. While this was done with the help of a script, we also assigned the same human collectors to cross-check for any presence of the above-mentioned criteria.
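The pre-processing steps listed above can be sketched roughly as follows. This is only an illustration, not the authors' script: the stop-word list, the regular expressions, and the function name `preprocess` are assumptions, and lemmatization is left out because it needs an external language resource.

```python
import re

# Tiny illustrative stop-word list (an assumption; the authors' list is not given).
STOP_WORDS = {"a", "an", "the", "of", "in", "on", "and", "to", "is", "are", "at"}

URL_RE = re.compile(r"https?://\S+|www\.\S+")
NON_ALNUM_RE = re.compile(r"[^A-Za-z0-9\s]")


def preprocess(text):
    """Return lower-cased tokens with links, symbols and stop words removed."""
    text = URL_RE.sub(" ", text)        # step 1: remove hyperlinks
    text = NON_ALNUM_RE.sub(" ", text)  # step 2: drop non-English alphanumeric characters
    tokens = text.lower().split()
    # step 3: remove stop words (step 4, lemmatization, is omitted in this sketch)
    return [t for t in tokens if t not in STOP_WORDS]


print(preprocess("The virus spread in China: see https://example.com/news now!"))
```

A real pipeline would likely swap in a fuller stop-word list and a lemmatizer from an NLP library.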

The primary data statistics of the two datasets are shown in Tables 1 and 2.

Table 1: Covid-News-USA-NNK data statistics

No of words per headline: 7 to 20
No of words per body content: 150 to 2100

Table 2: Covid-News-BD-NNK data statistics

No of words per headline: 10 to 20
No of words per body content: 100 to 1500

2.3 Dataset Repository

We used GitHub as our primary data repository under the account name NKK^1. Here, we created two repositories, USA-NKK^2 and BD-NNK^3. The dataset is available in both CSV and JSON formats. We regularly update the CSV files and regenerate the JSON using a Python script. We also provide a Python script file for essential operations. We welcome outside collaboration to enrich the dataset.

3 Literature Review

Natural Language Processing (NLP) deals with text (also known as categorical) data in computer science, utilizing numerous diverse methods like one-hot encoding, word embedding, etc., that transform text to machine language, which can be fed to multiple machine learning and deep learning algorithms.

Some well-known applications of NLP include fraud detection on online media sites[ 10 ], authorship attribution in fallback authentication systems[ 11 ], intelligent conversational agents or chatbots[ 12 ], and machine translation as used by Google Translate[ 13 ]. While these are all downstream tasks, several exciting developments have been made in algorithms designed solely for Natural Language Processing tasks. The two most prominent ones are BERT[ 14 ], which uses a bidirectional encoder-decoder architecture to create the transformer model and can do near-perfect classification tasks and next-word predictions, and the GPT-3 models released by OpenAI[ 15 ], which can generate almost human-like text. However, these are all pre-trained models since they carry a huge computation cost. Information Extraction is a generalized concept of retrieving information from a dataset. Information extraction from an image could be retrieving vital feature spaces or targeted portions of an image; information extraction from speech could be retrieving information about names, places, etc[ 16 ]. Information extraction from text could be identifying named entities and locations or essential data. Topic modeling is a sub-task of NLP and also a process of information extraction. It clusters words and phrases of the same context together into groups. Topic modeling is an unsupervised learning method that gives us a brief idea about a set of texts. One commonly used topic modeling method is Latent Dirichlet Allocation, or LDA[17].

Keyword extraction is a process of information extraction and a sub-task of NLP that extracts essential words and phrases from a text. TextRank [ 18 ] is an efficient keyword extraction technique that uses a graph to calculate the weight of each word and picks the words with the highest weights.
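A simplified sketch of the graph-based idea behind TextRank is shown below. It is an assumption-laden illustration rather than the reference algorithm: the co-occurrence graph is unweighted, there is no part-of-speech filtering, and the function name `textrank_keywords` and its parameters are hypothetical.

```python
from collections import defaultdict


def textrank_keywords(tokens, window=2, damping=0.85, iterations=50, top_k=3):
    """Rank words by PageRank over an undirected co-occurrence graph."""
    # Build the graph: two distinct words are linked if one follows the
    # other within `window` tokens.
    neighbors = defaultdict(set)
    for i, word in enumerate(tokens):
        for other in tokens[i + 1 : i + 1 + window]:
            if other != word:
                neighbors[word].add(other)
                neighbors[other].add(word)

    # Power iteration of the unweighted PageRank update.
    scores = {word: 1.0 for word in neighbors}
    for _ in range(iterations):
        scores = {
            word: (1 - damping)
            + damping * sum(scores[n] / len(neighbors[n]) for n in neighbors[word])
            for word in neighbors
        }
    return sorted(scores, key=scores.get, reverse=True)[:top_k]


tokens = ["virus", "economy", "virus", "masks", "virus", "election", "virus", "jobs"]
print(textrank_keywords(tokens, window=1, top_k=1))
```

In this toy input every other word co-occurs only with "virus", so "virus" ends up with the highest score.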

Word clouds are a great visualization technique to understand the overall ’talk of the topic’. The clustered words give us a quick understanding of the content.

4 Our experiments and Result analysis

We used the wordcloud library^4 to create the word clouds. Figures 1 and 3 present the word clouds of the Covid-News-USA-NNK dataset by month from February to May. From Figures 1, 2 and 3, we can point out a few observations:

In February, both newspapers talked about China and the source of the outbreak.

StarTribune emphasized Minnesota as the most concerned state. In April, it seemed to have been concerned more.

Both newspapers talked about the virus impacting the economy, i.e., banks, elections, administrations, markets.

Washington Post discussed global issues more than StarTribune.

StarTribune in February mentioned the first precautionary measure, wearing masks, and the uncontrollable spread of the virus throughout the nation.

While both newspapers mentioned the outbreak in China in February, the spread in the United States is highlighted more from March through May, displaying the critical impact caused by the virus.

We used a script to extract all numbers related to certain keywords like ’Deaths’, ’Infected’, ’Died’, ’Infections’, ’Quarantined’, ’Lock-down’, ’Diagnosed’, etc. from the news reports and created a series of case numbers for both newspapers. Figure 4 shows the statistics of this series. From this extraction technique, we can observe that April was the peak month for COVID cases, which rose gradually from February. Both newspapers clearly show that the rise in COVID cases from February to March was slower than the rise from March to April. This is an important indicator of possible recklessness in preparations to battle the virus. However, the steep fall from April to May also shows the positive response against the attack. We used Vader Sentiment Analysis to extract the sentiment of the headlines and the body. On average, the sentiments were from -0.5 to -0.9. The Vader Sentiment scale ranges from -1 (highly negative) to 1 (highly positive). There were some cases where the sentiment scores of the headline and body contradicted each other, i.e., the sentiment of the headline was negative but the sentiment of the body was slightly positive. Overall, sentiment analysis can help us sort the most concerning (most negative) news from the positive ones, from which we can learn more about the indicators related to COVID-19 and the serious impact caused by it. Moreover, sentiment analysis can also provide us information about how a state or country is reacting to the pandemic. We used the PageRank algorithm to extract keywords from headlines as well as the body content. PageRank efficiently highlights important relevant keywords in the text. Some frequently occurring important keywords extracted from both the datasets are: ’China’, ’Government’, ’Masks’, ’Economy’, ’Crisis’, ’Theft’, ’Stock market’, ’Jobs’, ’Election’, ’Missteps’, ’Health’, ’Response’. Keyword extraction acts as a filter allowing quick searches for indicators in case of locating situations of the economy,
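The number-extraction step mentioned in this description (numbers related to keywords such as ’Deaths’ or ’Infections’) might look something like the sketch below. The authors' script is not shown in the source, so the regular expression, the keyword subset, and the name `extract_counts` are all assumptions.

```python
import re

# Subset of the keywords named in the description, lower-cased for matching.
KEYWORDS = ("deaths", "infected", "died", "infections", "quarantined", "diagnosed")

# A number (optionally with thousands separators) followed, within at most
# two intervening words, by one of the keywords.
PATTERN = re.compile(
    r"(\d[\d,]*)\s+(?:\w+\s+){0,2}?(" + "|".join(KEYWORDS) + r")\b",
    re.IGNORECASE,
)


def extract_counts(text):
    """Return (keyword, number) pairs for numbers that precede a keyword."""
    return [(kw.lower(), int(num.replace(",", ""))) for num, kw in PATTERN.findall(text)]


print(extract_counts("Officials reported 1,200 new infections and 35 deaths on Monday."))
```

This only catches numbers written as digits that appear shortly before a keyword; spelled-out numbers or other word orders would need additional patterns.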
COVID-19 Stats and Mobility Trends
kaggle.com
zip
Updated Mar 28, 2021
Cite
Diogo Alex (2021). COVID-19 Stats and Mobility Trends [Dataset]. https://www.kaggle.com/datasets/diogoalex/covid19-stats-and-trends
Explore at:
zip (998511 bytes)
Dataset updated
Mar 28, 2021
Authors
Diogo Alex
License
https://creativecommons.org/publicdomain/zero/1.0/
Description
COVID-19 Stats & Trends

Context

This dataset seeks to provide insights into what has changed due to policies aimed at combating COVID-19 and evaluate the changes in community activities and its relation to reduced confirmed cases of COVID-19. The reports chart movement trends, compared to an expected baseline, over time (from 2020/02/15 to 2020/02/05) by geography (across 133 countries), as well as some other stats about the country that might help explain the evolution of the disease.

Content

Grocery & Pharmacy: Mobility trends for places like grocery markets, food warehouses, farmers' markets, specialty food shops, drug stores, and pharmacies.

Parks: Mobility trends for places like national parks, public beaches, marinas, dog parks, plazas, and public gardens.

Residential: Mobility trends for places of residence.

Retail & Recreation: Mobility trends for places like restaurants, cafes, shopping centers, theme parks, museums, libraries, and movie theaters.

Transit stations: Mobility trends for places like public transport hubs such as subway, bus, and train stations.

Workplaces: Mobility trends for places of work.

Total Cases: Total number of people infected with SARS-CoV-2.

Fatalities: Total number of deaths caused by COVID-19.

Government Response Stringency Index: Additive score of nine indicators of government response to COVID-19: school closures, workplace closures, cancellation of public events, public information campaigns, stay at home policies, restrictions on internal movement, international travel controls, testing policy, and contact tracing.

COVID-19 Testing: Total number of tests performed.

Total Vaccinations: Total number of shots given.

Total People Vaccinated: Total number of people given a shot.

Total People Fully Vaccinated: Total number of people fully vaccinated (might require two shots of some vaccines).

Population: Total number of inhabitants.

Population Density per km2: Number of human inhabitants per square kilometer.

Health System Index: Overall performance of the health system.

Human Development Index (HDI): Summary index based on life expectancy at birth, expected years of schooling for children and mean years of schooling for adults, and GNI per capita.

GDP (PPP) per capita: Gross Domestic Product (GDP) per capita based on Purchasing Power Parity (PPP), taking into account the relative cost of local goods, services and inflation rates of the country, rather than using international market exchange rates, which may distort the real differences in per capita income.

Elderly Population (percentage): Percentage of the population above the age of 65 years old.

References & Acknowledgements

Bing COVID-19 data. Available at: https://github.com/microsoft/Bing-COVID-19-Data
COVID-19 Community Mobility Report. Available at: https://www.google.com/covid19/mobility/
COVID-19: Government Response Stringency Index. Available at: https://ourworldindata.org/grapher/covid-stringency-index
Coronavirus (COVID-19) Testing. Available at: https://github.com/owid/covid-19-data/blob/master/public/data/testing/covid-testing-all-observations.csv
Coronavirus (COVID-19) Vaccination. Available at: https://raw.githubusercontent.com/owid/covid-19-data/master/public/data/vaccinations/vaccinations.csv
List of countries and dependencies by population. Available at: https://www.kaggle.com/tanuprabhu/population-by-country-2020
List of countries and dependencies by population density. Available at: https://www.kaggle.com/tanuprabhu/population-by-country-2020
List of countries by Human Development Index. Available at: http://hdr.undp.org/en/data
Measuring Overall Health System Performance. Available at: https://www.who.int/healthinfo/paper30.pdf?ua=1
List of countries by GDP (PPP) per capita. Available at: https://data.worldbank.org/indicator/NY.GDP.PCAP.PP.CD
List of countries by age structure (65+). Available at: https://data.worldbank.org/indicator/SP.POP.65UP.TO.ZS

Authors

Diogo Silva, up201706892@fe.up.pt
Data_Sheet_1_Systematic Assessment of COVID-19 Pandemic in Bangladesh:...
frontiersin.figshare.com
docx
Updated May 30, 2023
Cite
Priom Saha; Jahida Gulshan (2023). Data_Sheet_1_Systematic Assessment of COVID-19 Pandemic in Bangladesh: Effectiveness of Preparedness in the First Wave.docx [Dataset]. http://doi.org/10.3389/fpubh.2021.628931.s001
Explore at:
docx
Unique identifier
https://doi.org/10.3389/fpubh.2021.628931.s001
Dataset updated
May 30, 2023
Dataset provided by
Frontiers Media http://www.frontiersin.org/
Authors
Priom Saha; Jahida Gulshan
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Area covered
Bangladesh
Description
Background: To develop an effective countermeasure and determine our susceptibilities to the outbreak of COVID-19 is challenging for a densely populated developing country like Bangladesh and a systematic review of the disease on a continuous basis is necessary.

Methods: Publicly available and globally acclaimed datasets (4 March 2020–30 September 2020) from IEDCR, Bangladesh, JHU, and ECDC database are used for this study. Visual exploratory data analysis is used and we fitted a polynomial model for the number of deaths. A comparison of Bangladesh scenario over different time points as well as with global perspectives is made.

Results: In Bangladesh, the number of active cases had decreased, after reaching a peak, with a constant pattern of death rate at from July to the end of September, 2020. Seventy-one percent of the cases and 77% of the deceased were males. People aged between 21 and 40 years were most vulnerable to the coronavirus and most of the fatalities (51.49%) were in the 60+ population. A strong positive correlation (0.93) between the number of tests and confirmed cases and a constant incidence rate (around 21%) from June 1 to August 31, 2020 was observed. The case fatality ratio was between 1 and 2. The number of cases and the number of deaths in Bangladesh were much lower compared to other countries.

Conclusions: This study will help to understand the patterns of spread and transition in Bangladesh, possible measures, effectiveness of the preparedness, implementation gaps, and their consequences to gather vital information and prevent future pandemics.
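As an illustration of the kind of correlation reported in this abstract (between the number of tests and confirmed cases), a Pearson coefficient can be computed as in the sketch below. The abstract does not say which correlation measure was used, so Pearson is an assumption, and the input numbers are made up for demonstration rather than taken from the study.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical daily test counts and confirmed cases (illustrative values only).
tests = [1000, 1500, 2000, 2500, 3000]
cases = [200, 330, 410, 540, 610]
print(round(pearson(tests, cases), 3))
```

A value close to 1 indicates that days with more tests tend to be days with more confirmed cases.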
Coronavirus (Covid-19) Data in the United States
nytimes.com
openicpsr.org
Cite
New York Times, Coronavirus (Covid-19) Data in the United States [Dataset]. https://www.nytimes.com/interactive/2020/us/coronavirus-us-cases.html
Explore at:
Dataset provided by
New York Times
Description
The New York Times is releasing a series of data files with cumulative counts of coronavirus cases in the United States, at the state and county level, over time. We are compiling this time series data from state and local governments and health departments in an attempt to provide a complete record of the ongoing outbreak.
Since late January, The Times has tracked cases of coronavirus in real time as they were identified after testing. Because of the widespread shortage of testing, however, the data is necessarily limited in the picture it presents of the outbreak.
We have used this data to power our maps and reporting tracking the outbreak, and it is now being made available to the public in response to requests from researchers, scientists and government officials who would like access to the data to better understand the outbreak.
The data begins with the first reported coronavirus case in Washington State on Jan. 21, 2020. We will publish regular updates to the data in this repository.
Descriptive statistics.
plos.figshare.com
bin
Updated Oct 23, 2023
Cite
Michael Bergmann; Melanie Wagner (2023). Descriptive statistics. [Dataset]. http://doi.org/10.1371/journal.pone.0287158.t001
Explore at:
bin
Unique identifier
https://doi.org/10.1371/journal.pone.0287158.t001
Dataset updated
Oct 23, 2023
Dataset provided by
PLOS http://plos.org/
Authors
Michael Bergmann; Melanie Wagner
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
The COVID-19 pandemic began impacting Europe in early 2020, posing significant challenges for individuals requiring care. This group is particularly susceptible to severe COVID-19 infections and depends on regular health care services. In this article, we examine the situation of European care recipients aged 50 years and older 18 months after the pandemic outbreak and compare it to the initial phase of the pandemic. In the descriptive section, we illustrate the development of (unmet) care needs and access to health care throughout the pandemic. Additionally, we explore regional variations in health care receipt across Europe. In the analytical section, we shed light on the mid- and long-term health consequences of COVID-19-related restrictions on accessing health care services by making comparisons between care recipients and individuals without care needs. We conducted an analysis using data from the representative Corona Surveys of the Survey of Health, Ageing and Retirement in Europe (SHARE). Our study examines changes in approximately 3,400 care-dependent older Europeans (aged 50+) interviewed in 2020 and 2021, comparing them with more than 45,000 respondents not receiving care. The dataset provides a cross-national perspective on care recipients across 27 European countries and Israel. Our findings reveal that in 2021, compared to the previous year, difficulties in obtaining personal care from someone outside the household were significantly reduced in Western and Southern European countries. Access to health care services improved over the course of the pandemic, particularly with respect to medical treatments and appointments that had been canceled by health care institutions. However, even 18 months after the COVID-19 outbreak, a considerable number of treatments had been postponed either by respondents themselves or by health care institutions. 
These delayed medical treatments had adverse effects on the physical and mental health of both care receivers and individuals who did not rely on care.
Dataset from International Registry of Healthcare Workers Exposed to...
data-staging.niaid.nih.gov
data.niaid.nih.gov
Updated Feb 22, 2025
Cite
Certara; Roman Casciano, Masters; Craig Rayner, PharmD (2025). Dataset from International Registry of Healthcare Workers Exposed to COVID-19 Patients [Dataset]. http://doi.org/10.25934/PR00007500
Explore at:
Unique identifier
https://doi.org/10.25934/PR00007500
Dataset updated
Feb 22, 2025
Dataset provided by
Certara
Authors
Certara; Roman Casciano, Masters; Craig Rayner, PharmD
Area covered
Kenya, Pakistan, Uganda, Senegal, Zambia, Nigeria, South Africa
Variables measured
Hospitalization, SARS-CoV-2 Virus, All Cause Mortality, Administration Of Prophylactic Treatment
Description
The International Registry of Healthcare Workers Exposed to COVID-19 Patients (UNITY Global) is an international registry of approximately 10,000 healthcare workers in low- and middle-income countries experiencing increasing numbers of COVID-19 cases and commensurate increased exposure to the SARS-CoV-2 virus among their healthcare worker populations.
[Archived] COVID-19 cases in Pacific Island Countries and Territories
pacificdata.org
pacific-data.sprep.org
csv
Updated May 22, 2025
Cite
SPC (2025). [Archived] COVID-19 cases in Pacific Island Countries and Territories [Dataset]. https://pacificdata.org/data/dataset/archived-covid-19-cases-in-pacific-island-countries-and-territories-df-covid
Explore at:
csv
Dataset updated
May 22, 2025
Dataset provided by
SPC
Time period covered
Jan 1, 2020 - May 31, 2024
Description
Disclaimer: As of January 2025, SPC will no longer provide updated information on COVID-19 cases and deaths. The information presented on this page is for reference only. For current epidemic and emerging disease alerts in the Pacific region, please visit: https://www.spc.int/epidemics/

Statistics from SPC's Public Health Division (PHD) on the number of cases of COVID-19 and the number of deaths attributed to COVID-19 in Pacific Island Countries and Territories.

Find more Pacific data on PDH.stat.

Cite

The Devastator (2023). COVID-19 Global Case and Death Data [Dataset]. https://www.kaggle.com/datasets/thedevastator/covid-19-global-case-and-death-data

COVID-19 Global Case and Death Data

Global COVID-19 Cases and Deaths Over Time

Explore at:

zip (81724234 bytes)

Dataset updated

Dec 4, 2023

Authors

The Devastator

Description

COVID-19 Global Case and Death Data

Global COVID-19 Cases and Deaths Over Time

By Coronavirus (COVID-19) Data Hub [source]

About this dataset

The COVID-19 Global Time Series Case and Death Data is a comprehensive collection of global COVID-19 case and death information recorded over time. This dataset includes data from various sources such as JHU CSSE COVID-19 Data and The New York Times.

The dataset consists of several columns providing detailed information on different aspects of the COVID-19 situation. The COUNTRY_SHORT_NAME column represents the short name of the country where the data is recorded, while the Data_Source column indicates the source from which the data was obtained.

Other important columns include Cases, which denotes the number of COVID-19 cases reported, and Difference, which indicates the difference in case numbers compared to the previous day. Additionally, there are columns such as CONTINENT_NAME, DATA_SOURCE_NAME, COUNTRY_ALPHA_3_CODE, COUNTRY_ALPHA_2_CODE that provide additional details about countries and continents.
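The relationship between the Cases and Difference columns described above can be illustrated with a short sketch. It assumes that Cases holds a running count per location and that Difference is simply today's value minus the previous day's; the function name `daily_difference` and the sample numbers are hypothetical.

```python
def daily_difference(cases):
    """Change from the previous day's value; the first day has no predecessor."""
    if not cases:
        return []
    return [None] + [today - yesterday for yesterday, today in zip(cases, cases[1:])]


# Illustrative running case counts for four consecutive days.
print(daily_difference([10, 15, 15, 22]))  # [None, 5, 0, 7]
```

The first entry is left as `None` because there is no earlier day to compare against.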

Furthermore, this dataset also includes information on deaths related to COVID-19. The column PEOPLE_DEATH_NEW_COUNT shows the number of new deaths reported on a specific date.

To provide more context to the data, certain columns offer demographic details about locations. For instance, Population_Count provides population counts for different areas. Moreover, a FIPS code is available for provincial/state regions for identification purposes.

It is important to note that this dataset covers both confirmed cases (Case_Type: confirmed) as well as probable cases (Case_Type: probable). These classifications help differentiate between various types of COVID-19 infections.

Overall, this dataset offers a comprehensive picture of the global COVID-19 situation by providing accurate and up-to-date information on cases, deaths, demographic details (such as population count or FIPS code), source references (such as JHU CSSE or NY Times), and geographical information (country names coded with ALPHA codes), making it useful for researchers studying patterns and trends associated with this pandemic.

How to use the dataset

Understanding the Dataset Structure:

The dataset is available in two files: COVID-19 Activity.csv and COVID-19 Cases.csv.

Both files contain different columns that provide information about the COVID-19 cases and deaths.

Some important columns to look out for are:
a. PEOPLE_POSITIVE_CASES_COUNT: The total number of confirmed positive COVID-19 cases.
b. COUNTY_NAME: The name of the county where the data is recorded.
c. PROVINCE_STATE_NAME: The name of the province or state where the data is recorded.
d. REPORT_DATE: The date when the data was reported.
e. CONTINENT_NAME: The name of the continent where the data is recorded.
f. DATA_SOURCE_NAME: The name of the data source.
g. PEOPLE_DEATH_NEW_COUNT: The number of new deaths reported on a specific date.
h. COUNTRY_ALPHA_3_CODE: The three-letter alpha code that represents the country.
i. Lat, Long: The latitude and longitude coordinates that represent the location.
j. Country_Region or COUNTRY_SHORT_NAME: The country or region where cases were reported.

Choosing Relevant Columns: It's important to determine which columns are relevant to your analysis or research question before proceeding with further analysis.

Exploring Data Patterns: Use various statistical techniques like summarizing statistics, creating visualizations (e.g., bar charts, line graphs), etc., to explore patterns in different variables over time or across regions/countries.

Filtering Data: You can filter your dataset based on specific criteria using column(s) such as COUNTRY_SHORT_NAME, CONTINENT_NAME, or PROVINCE_STATE_NAME to focus on specific countries, continents, or regions of interest.
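A minimal sketch of this kind of filtering, using only Python's standard `csv` module, is given below. The column names come from the description above, but the rows are made-up sample data and the helper `filter_rows` is hypothetical; with the real files you would open the downloaded CSV instead of the in-memory string.

```python
import csv
import io

# Made-up sample rows using column names from the dataset description.
SAMPLE = """COUNTRY_SHORT_NAME,CONTINENT_NAME,REPORT_DATE,PEOPLE_POSITIVE_CASES_COUNT
Italy,Europe,2020-03-01,10
Italy,Europe,2020-03-02,15
Kenya,Africa,2020-03-02,1
"""


def filter_rows(csv_file, column, value):
    """Return the rows (as dicts) whose `column` equals `value`."""
    return [row for row in csv.DictReader(csv_file) if row[column] == value]


rows = filter_rows(io.StringIO(SAMPLE), "COUNTRY_SHORT_NAME", "Italy")
for row in rows:
    print(row["REPORT_DATE"], row["PEOPLE_POSITIVE_CASES_COUNT"])
```

The same helper works for CONTINENT_NAME or PROVINCE_STATE_NAME by changing the `column` argument.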

Combining Data: You can combine data from different sources (e.g., COVID-19 cases and deaths) to perform advanced analysis or create insightful visualizations.

Analyzing Trends: Use the dataset to analyze and identify trends in COVID-19 cases and deaths over time. You can examine factors such as population count, testing count, hospitalization count, etc., to gain deeper insights into the impact of the virus.

Comparing Countries/Regions: Compare COVID-19

Research Ideas

Trend Analysis: This dataset can be used to analyze and track the trends of COVID-19 cases and deaths over time. It provides comprehensive global data, allowing researchers and po...
