100+ datasets found
  1. Novel Covid-19 Dataset

    • kaggle.com
    Updated Sep 18, 2025
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    GHOST5612 (2025). Novel Covid-19 Dataset [Dataset]. https://www.kaggle.com/datasets/ghost5612/novel-covid-19-dataset
    Explore at:
    CroissantCroissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Sep 18, 2025
    Dataset provided by
    Kagglehttp://kaggle.com/
    Authors
    GHOST5612
    License

    MIT Licensehttps://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    Context:

    From World Health Organization - On 31 December 2019, WHO was alerted to several cases of pneumonia in Wuhan City, Hubei Province of China. The virus did not match any other known virus. This raised concern because when a virus is new, we do not know how it affects people.

    So daily level information on the affected people can give some interesting insights when it is made available to the broader data science community.

    Johns Hopkins University has made an excellent dashboard using the affected cases data. Data is extracted from the google sheets associated and made available here.

    Edited:

    Now data is available as csv files in the Johns Hopkins Github repository. Please refer to the github repository for the Terms of Use details. Uploading it here for using it in Kaggle kernels and getting insights from the broader DS community.

    Content

    2019 Novel Coronavirus (2019-nCoV) is a virus (more specifically, a coronavirus) identified as the cause of an outbreak of respiratory illness first detected in Wuhan, China. Early on, many of the patients in the outbreak in Wuhan, China reportedly had some link to a large seafood and animal market, suggesting animal-to-person spread. However, a growing number of patients reportedly have not had exposure to animal markets, indicating person-to-person spread is occurring. At this time, it’s unclear how easily or sustainably this virus is spreading between people - CDC

    This dataset has daily level information on the number of affected cases, deaths and recovery from 2019 novel coronavirus. Please note that this is a time series data and so the number of cases on any given day is the cumulative number.

    The data is available from 22 Jan, 2020.

    Here’s a polished version suitable for a professional Kaggle dataset description:

    Dataset Description

    This dataset contains time-series and case-level records of the COVID-19 pandemic. The primary file is covid_19_data.csv, with supporting files for earlier records and individual-level line list data.

    Files and Columns

    1. covid_19_data.csv (Main File)

    This is the primary dataset and contains aggregated COVID-19 statistics by location and date.

    • Sno – Serial number of the record
    • ObservationDate – Date of the observation (MM/DD/YYYY)
    • Province/State – Province or state of the observation (may be missing for some entries)
    • Country/Region – Country of the observation
    • Last Update – Timestamp (UTC) when the record was last updated (not standardized, requires cleaning before use)
    • Confirmed – Cumulative number of confirmed cases on that date
    • Deaths – Cumulative number of deaths on that date
    • Recovered – Cumulative number of recoveries on that date

    2. 2019_ncov_data.csv (Legacy File)

    This file contains earlier COVID-19 records. It is no longer updated and is provided only for historical reference. For current analysis, please use covid_19_data.csv.

    3. COVID_open_line_list_data.csv

    This file provides individual-level case information, obtained from an open data source. It includes patient demographics, travel history, and case outcomes.

    4. COVID19_line_list_data.csv

    Another individual-level case dataset, also obtained from public sources, with detailed patient-level information useful for micro-level epidemiological analysis.

    ✅ Use covid_19_data.csv for up-to-date aggregated global trends.

    ✅ Use the line list datasets for detailed, individual-level case analysis.

    Country level datasets:

    If you are interested in knowing country level data, please refer to the following Kaggle datasets:

    India - https://www.kaggle.com/sudalairajkumar/covid19-in-india

    South Korea - https://www.kaggle.com/kimjihoo/coronavirusdataset

    Italy - https://www.kaggle.com/sudalairajkumar/covid19-in-italy

    Brazil - https://www.kaggle.com/unanimad/corona-virus-brazil

    USA - https://www.kaggle.com/sudalairajkumar/covid19-in-usa

    Switzerland - https://www.kaggle.com/daenuprobst/covid19-cases-switzerland

    Indonesia - https://www.kaggle.com/ardisragen/indonesia-coronavirus-cases

    Acknowledgements :

    Johns Hopkins University for making the data available for educational and academic research purposes

    MoBS lab - https://www.mobs-lab.org/2019ncov.html

    World Health Organization (WHO): https://www.who.int/

    DXY.cn. Pneumonia. 2020. http://3g.dxy.cn/newh5/view/pneumonia.

    BNO News: https://bnonews.com/index.php/2020/02/the-latest-coronavirus-cases/

    National Health Commission of the People’s Republic of China (NHC): http://www.nhc.gov.cn/xcs/yqtb/list_gzbd.shtml

    China CDC (CCDC): http://weekly.chinacdc.cn/news/TrackingtheEpidemic.htm

    Hong Kong Department of Health: https://www.chp.gov.hk/en/features/102465.html

    Macau Government: https://www.ssm.gov.mo/portal/

    Taiwan CDC: https://sites.google....

  2. All Country's COVID-19 Dataset.

    • kaggle.com
    zip
    Updated Oct 23, 2024
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Aman Sharma (2024). All Country's COVID-19 Dataset. [Dataset]. https://www.kaggle.com/datasets/aman2626786/all-countrys-covid-19-dataset
    Explore at:
    zip(12187 bytes)Available download formats
    Dataset updated
    Oct 23, 2024
    Authors
    Aman Sharma
    License

    https://creativecommons.org/publicdomain/zero/1.0/https://creativecommons.org/publicdomain/zero/1.0/

    Description

    This dataset provides comprehensive statistics on COVID-19 for countries around the world. It includes data on the number of active cases, critical cases, total deaths, and total tests conducted. The dataset is updated frequently to ensure the most current information is available.

    Key Features:

    Global Coverage: Data for countries across all continents, including Asia, Africa, Europe, North America, South America, and Oceania. Detailed Statistics: Includes metrics such as active cases, critical cases, total deaths, and total tests. Population Data: Provides population figures for each country to contextualize the COVID-19 statistics. Frequent Updates: The dataset is updated regularly to reflect the latest information.

  3. T

    CORONAVIRUS CASES by Country Dataset

    • tradingeconomics.com
    csv, excel, json, xml
    Updated Mar 4, 2020
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    TRADING ECONOMICS (2020). CORONAVIRUS CASES by Country Dataset [Dataset]. https://tradingeconomics.com/country-list/coronavirus-cases
    Explore at:
    excel, json, csv, xmlAvailable download formats
    Dataset updated
    Mar 4, 2020
    Dataset authored and provided by
    TRADING ECONOMICS
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    2025
    Area covered
    World
    Description

    This dataset provides values for CORONAVIRUS CASES reported in several countries. The data includes current values, previous releases, historical highs and record lows, release frequency, reported unit and currency.

  4. COVID-19 First Case Date By Country (Coronavirus)

    • kaggle.com
    zip
    Updated May 20, 2020
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Joseph Glynn (2020). COVID-19 First Case Date By Country (Coronavirus) [Dataset]. https://www.kaggle.com/datasets/josephglynn/covid19-first-case-date-by-country-coronavirus/code
    Explore at:
    zip(3258 bytes)Available download formats
    Dataset updated
    May 20, 2020
    Authors
    Joseph Glynn
    Description

    Context

    This data was collected as part of a university research paper where COVID-19 cases were analysed using a cross-sectional regression model as at 17th May 2020. In order to better understand COVID-19 cases growth at a country level I decided to create a dataset containing key dates in the progression of the virus globally.

    Content

    210 rows, 6 columns.

    This dataset contains data relating to COVID-19 cases for 210 countries globally. Data was collected using the most recent and reliable information as at 17th May 2020. The majority of data was collected from Worldometer. https://www.worldometers.info/coronavirus/#countries

    This dataset contains dates for the 1st coronavirus case, 100th coronavirus case, and (50th coronavirus case per 1 million people) for 210 countries. Data is also provided for the number of days between the 1st case and the 100th as well as the 1st case and the 50th per 1 million people.

    Data prior to 15th February 2020, was not easily accessible at the country level from Worldometer. Therefore any dates prior to 15th February 2020 were not sourced from Worldometer but reputable government and local media sources.

    Blanks (null values) indicate that the country in question has not reached either 50 coronavirus cases per 1 million people or 100 coronavirus cases. These were left blank.

    Acknowledgements

    I would like to acknowledge Worldometer for providing the vast majority of the data in this file. Worldometer is a website that provides real time statistics on topics such as coronavirus cases. Its sources include government official reports as well as trusted local media sources all of which are referenced on their website.

    Inspiration

    Hopefully this data can be used to better understand the growth of COVID-19 cases globally.

  5. g

    Coronavirus COVID-19 Global Cases by the Center for Systems Science and...

    • github.com
    • systems.jhu.edu
    • +1more
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Johns Hopkins University Center for Systems Science and Engineering (JHU CSSE), Coronavirus COVID-19 Global Cases by the Center for Systems Science and Engineering (CSSE) at Johns Hopkins University (JHU) [Dataset]. https://github.com/CSSEGISandData/COVID-19
    Explore at:
    Dataset provided by
    Johns Hopkins University Center for Systems Science and Engineering (JHU CSSE)
    Area covered
    Global
    Description

    2019 Novel Coronavirus COVID-19 (2019-nCoV) Visual Dashboard and Map:
    https://www.arcgis.com/apps/opsdashboard/index.html#/bda7594740fd40299423467b48e9ecf6

    • Confirmed Cases by Country/Region/Sovereignty
    • Confirmed Cases by Province/State/Dependency
    • Deaths
    • Recovered

    Downloadable data:
    https://github.com/CSSEGISandData/COVID-19

    Additional Information about the Visual Dashboard:
    https://systems.jhu.edu/research/public-health/ncov

  6. COVID-19 Cases by Country

    • console.cloud.google.com
    Updated Jul 23, 2020
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    https://console.cloud.google.com/marketplace/browse?filter=partner:European%20Centre%20for%20Disease%20Prevention%20and%20Control (2020). COVID-19 Cases by Country [Dataset]. https://console.cloud.google.com/marketplace/product/european-cdc/covid-19-global-cases
    Explore at:
    Dataset updated
    Jul 23, 2020
    Dataset provided by
    Googlehttp://google.com/
    Description

    This dataset is maintained by the European Centre for Disease Prevention and Control (ECDC) and reports on the geographic distribution of COVID-19 cases worldwide. This data includes COVID-19 reported cases and deaths broken out by country. This data can be visualized via ECDC’s Situation Dashboard . More information on ECDC’s response to COVID-19 is available here . This public dataset is hosted in Google BigQuery and is included in BigQuery's 1TB/mo of free tier processing. This means that each user receives 1TB of free BigQuery processing every month, which can be used to run queries on this public dataset. Watch this short video to learn how to get started quickly using BigQuery to access public datasets. What is BigQuery . This dataset is hosted in both the EU and US regions of BigQuery. See the links below for the appropriate dataset copy: US region EU region This dataset has significant public interest in light of the COVID-19 crisis. All bytes processed in queries against this dataset will be zeroed out, making this part of the query free. Data joined with the dataset will be billed at the normal rate to prevent abuse. After September 15, queries over these datasets will revert to the normal billing rate. Users of ECDC public-use data files must comply with data use restrictions to ensure that the information will be used solely for statistical analysis or reporting purposes.

  7. m

    Data for: COVID-19 Dataset: Worldwide Spread Log Including Countries First...

    • data.mendeley.com
    Updated Jul 20, 2020
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Hasmot Ali (2020). Data for: COVID-19 Dataset: Worldwide Spread Log Including Countries First Case And First Death [Dataset]. http://doi.org/10.17632/vw427wzzkk.5
    Explore at:
    Dataset updated
    Jul 20, 2020
    Authors
    Hasmot Ali
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Contain informative data related to COVID-19 pandemic. Specially, figure out about the First Case and First Death information for every single country. The datasets mainly focus on two major fields first one is First Case which consists of information of Date of First Case(s), Number of confirm Case(s) at First Day, Age of the patient(s) of First Case, Last Visited Country and the other one First Death information consist of Date of First Death and Age of the Patient who died first for every Country mentioning corresponding Continent. The datasets also contain the Binary Matrix of spread chain among different country and region.

    *This is not a country. This is a ship. The name of the Cruise Ship was not given from the government.
    "N+": the age is not specified but greater than N
    “No Trace”: some data was not found
    “Unspecified”: not available from the authority
    “N/A”: for “Last Visited Country(s) of Confirmed Case(s)” column, “N/A” indicates that the confirmed case(s) of those countries do not have any travel history in recent past; in “Age of First Death(s)” column “N/A” indicates that those countries do not have may death case till May 16, 2020.

  8. WHO COVID-19 cases - dataset

    • kaggle.com
    zip
    Updated Nov 19, 2024
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Iron Wolf (2024). WHO COVID-19 cases - dataset [Dataset]. https://www.kaggle.com/datasets/ironwolf437/who-covid-19-cases-dataset/code
    Explore at:
    zip(597020 bytes)Available download formats
    Dataset updated
    Nov 19, 2024
    Authors
    Iron Wolf
    License

    Apache License, v2.0https://www.apache.org/licenses/LICENSE-2.0
    License information was derived automatically

    Description

    Overview:

    • This dataset includes the number of coronavirus cases and deaths, and the time of recording each case by country.
    • I have added a new column, continent, to provide more information about the number of infections in each continent.

    Objective:

    The objective of this dataset is to: 1. Perform a data analysis to find out the number of cases and deaths in each country and continent, and compare the times when the number of cases or deaths was available. 2. Create a machine learning model (if possible) to train the model to predict deaths based on the cases and deaths.

    • The dataset is allowed to be modified during the analysis.

    Details of the columns:

    | | Columns | Description | Type | |----------------------|---------------------------------------------------------------------------------|---------------| |1| Date_reported | Date of recording the number of cases, whether infected or dead. | Categorical | |2| Country_code | A standardized code (ISO-3166) for the country (e.g., US for the United States).| Categorical | |3| Country | The full name of the country where the data is collected. | Categorical | |4| Continent | The continent on which the country is located (e.g., Asia, Europe). | Categorical | |5| WHO_region | The specific WHO region that the country belongs to (e.g., AFRO, EMRO, EURO). | Categorical | |6| New_cases | The number of newly reported COVID-19 cases on the reporting date. | Numerical (float) | |7| Cumulative_cases | The total number of COVID-19 cases reported in the country to date. | Numerical (int) | |8| New_deaths | The number of newly reported deaths due to COVID-19 on the reporting date. | Numerical (float) | |9| Cumulative_deaths | The total number of deaths due to COVID-19 reported in the country to date. | Numerical (int) |

  9. T

    World Coronavirus COVID-19 Cases

    • tradingeconomics.com
    csv, excel, json, xml
    Updated Mar 9, 2020
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    TRADING ECONOMICS (2020). World Coronavirus COVID-19 Cases [Dataset]. https://tradingeconomics.com/world/coronavirus-cases
    Explore at:
    csv, excel, xml, jsonAvailable download formats
    Dataset updated
    Mar 9, 2020
    Dataset authored and provided by
    TRADING ECONOMICS
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    Jan 4, 2020 - May 17, 2023
    Area covered
    World
    Description

    The World Health Organization reported 766440796 Coronavirus Cases since the epidemic began. In addition, countries reported 6932591 Coronavirus Deaths. This dataset provides - World Coronavirus Cases- actual values, historical data, forecast, chart, statistics, economic calendar and news.

  10. P

    [Archived] COVID-19 cases in Pacific Island Countries and Territories

    • pacificdata.org
    • pacific-data.sprep.org
    csv
    Updated May 22, 2025
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    SPC (2025). [Archived] COVID-19 cases in Pacific Island Countries and Territories [Dataset]. https://pacificdata.org/data/dataset/archived-covid-19-cases-in-pacific-island-countries-and-territories-df-covid
    Explore at:
    csvAvailable download formats
    Dataset updated
    May 22, 2025
    Dataset provided by
    SPC
    Time period covered
    Jan 1, 2020 - May 31, 2024
    Description

    Disclaimer: As of January 2025, SPC will no longer provide updated information on COVID-19 cases and deaths. The information presented on this page is for reference only. For current epidemic and emerging disease alerts in the Pacific region, please visit: https://www.spc.int/epidemics/

    Statistics from SPC's Public Health Division (PHD) on the number of cases of COVID-19 and the number of deaths attributed to COVID-19 in Pacific Island Countries and Territories.

    Find more Pacific data on PDH.stat.

  11. e

    COVID-19 Coronavirus data - weekly (from 17 December 2020)

    • data.europa.eu
    csv, excel xlsx, html +3
    Updated Dec 17, 2020
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    European Centre for Disease Prevention and Control (2020). COVID-19 Coronavirus data - weekly (from 17 December 2020) [Dataset]. https://data.europa.eu/data/datasets/covid-19-coronavirus-data-weekly-from-17-december-2020?locale=en
    Explore at:
    html, csv, json, unknown, xml, excel xlsxAvailable download formats
    Dataset updated
    Dec 17, 2020
    Dataset authored and provided by
    European Centre for Disease Prevention and Control
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The dataset contains a weekly situation update on COVID-19, the epidemiological curve and the global geographical distribution (EU/EEA and the UK, worldwide).

    Since the beginning of the coronavirus pandemic, ECDC’s Epidemic Intelligence team has collected the number of COVID-19 cases and deaths, based on reports from health authorities worldwide. This comprehensive and systematic process was carried out on a daily basis until 14/12/2020. See the discontinued daily dataset: COVID-19 Coronavirus data - daily. ECDC’s decision to discontinue daily data collection is based on the fact that the daily number of cases reported or published by countries is frequently subject to retrospective corrections, delays in reporting and/or clustered reporting of data for several days. Therefore, the daily number of cases may not reflect the true number of cases at EU/EEA level at a given day of reporting. Consequently, day to day variations in the number of cases does not constitute a valid basis for policy decisions.

    ECDC continues to monitor the situation. Every week between Monday and Wednesday, a team of epidemiologists screen up to 500 relevant sources to collect the latest figures for publication on Thursday. The data screening is followed by ECDC’s standard epidemic intelligence process for which every single data entry is validated and documented in an ECDC database. An extract of this database, complete with up-to-date figures and data visualisations, is then shared on the ECDC website, ensuring a maximum level of transparency.

    ECDC receives regular updates from EU/EEA countries through the Early Warning and Response System (EWRS), The European Surveillance System (TESSy), the World Health Organization (WHO) and email exchanges with other international stakeholders. This information is complemented by screening up to 500 sources every day to collect COVID-19 figures from 196 countries. This includes websites of ministries of health (43% of the total number of sources), websites of public health institutes (9%), websites from other national authorities (ministries of social services and welfare, governments, prime minister cabinets, cabinets of ministries, websites on health statistics and official response teams) (6%), WHO websites and WHO situation reports (2%), and official dashboards and interactive maps from national and international institutions (10%). In addition, ECDC screens social media accounts maintained by national authorities on for example Twitter, Facebook, YouTube or Telegram accounts run by ministries of health (28%) and other official sources (e.g. official media outlets) (2%). Several media and social media sources are screened to gather additional information which can be validated with the official sources previously mentioned. Only cases and deaths reported by the national and regional competent authorities from the countries and territories listed are aggregated in our database.

    Disclaimer: National updates are published at different times and in different time zones. This, and the time ECDC needs to process these data, might lead to discrepancies between the national numbers and the numbers published by ECDC. Users are advised to use all data with caution and awareness of their limitations. Data are subject to retrospective corrections; corrected datasets are released as soon as processing of updated national data has been completed.

    If you reuse or enrich this dataset, please share it with us.

  12. T

    CORONAVIRUS DEATHS by Country Dataset

    • tradingeconomics.com
    csv, excel, json, xml
    Updated Mar 4, 2020
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    TRADING ECONOMICS (2020). CORONAVIRUS DEATHS by Country Dataset [Dataset]. https://tradingeconomics.com/country-list/coronavirus-deaths
    Explore at:
    csv, excel, xml, jsonAvailable download formats
    Dataset updated
    Mar 4, 2020
    Dataset authored and provided by
    TRADING ECONOMICS
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    2025
    Area covered
    World
    Description

    This dataset provides values for CORONAVIRUS DEATHS reported in several countries. The data includes current values, previous releases, historical highs and record lows, release frequency, reported unit and currency.

  13. D

    Covid-19 Country Level Social Science Dataset

    • dataverse.no
    • dataverse.azure.uit.no
    application/dbf +10
    Updated Oct 20, 2020
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Øystein Solvang; Øystein Solvang; Kari Elida Eriksen; Jonas Stein; Camilla Brattland; Kari Elida Eriksen; Jonas Stein; Camilla Brattland (2020). Covid-19 Country Level Social Science Dataset [Dataset]. http://doi.org/10.18710/VMUP44
    Explore at:
    type/x-r-syntax(11257), csv(36577), application/prj(146), type/x-r-syntax(12007), application/shx(2140), application/dbf(323441), bin(5), txt(9844), application/sbn(2796), application/prj(145), application/shp(8800376), type/x-r-syntax(4038), application/sbx(349), pdf(189956), bin(6), csv(41050), pdf(138533), application/dbf(10298), application/sbx(348), pdf(339251)Available download formats
    Dataset updated
    Oct 20, 2020
    Dataset provided by
    DataverseNO
    Authors
    Øystein Solvang; Øystein Solvang; Kari Elida Eriksen; Jonas Stein; Camilla Brattland; Kari Elida Eriksen; Jonas Stein; Camilla Brattland
    License

    https://dataverse.no/api/datasets/:persistentId/versions/2.0/customlicense?persistentId=doi:10.18710/VMUP44https://dataverse.no/api/datasets/:persistentId/versions/2.0/customlicense?persistentId=doi:10.18710/VMUP44

    Time period covered
    Jan 1, 2020 - Jul 15, 2020
    Area covered
    Covers 199 countries
    Description

    The dataset is a cross-sectional dataset covering social and public health data pertaining to the Covid-19 outbreak in 199 countries. The dataset was compiled from public register and other openly available sources. Data on Covid-19 cases and related fatalities is current as of medio July 2020. Data on other variables is mainly from the last three years, depending on data availability. Standardized unique unit identifiers (ISO-3166-1 Alpha-3) are included, enabling merging with other data. The dataset was assembled concurrently with a similar one on the Norwegian municipal level, as part of the project «Ressurs for studentaktiv læring i undervisning i statistisk og romlig analyse for samfunnsfag», at the Department of Social Science and The Norwegian College of Fishery Science, UiT. Dette er et tverrsnittsdatasett med samfunns- og folkehelsedata relatert til den pågående Covid-19-pandemien. Datasettet dekker 199 land. Det er satt sammen med data fra offentlige registre og andre åpent tilgjengelige kilder. Data om Covid-19-tilfeller og -dødsfall er à jour per medio juli 2020. Data på andre variabler er hovedsaklig fra de tre siste årene, avhengig av hva som var tilgjengelig på innsamlingstidspunktet. Standardiserte unike ID-variabler (ISO-3166-1 Alpha-3) er inkludert for å muliggjøre fusjonering med annen data. Datasettet ble satt sammen parallellt med et tilsvarende på kommunenivå (Norge), som en del av prosjektet «Ressurs for studentaktiv læring i undervisning i statistisk og romlig analyse for samfunnsfag» ved Institutt for samfunnsvitenskap og Norges fiskerihøgskole, UiT.

  14. A

    Spatiotemporal data for 2019-Novel Coronavirus Covid-19 Cases and deaths

    • data.amerigeoss.org
    csv, pdf, txt
    Updated Jan 4, 2022
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    UN Humanitarian Data Exchange (2022). Spatiotemporal data for 2019-Novel Coronavirus Covid-19 Cases and deaths [Dataset]. https://data.amerigeoss.org/it/dataset/2019-novel-coronavirus-cases
    Explore at:
    txt(23645), csv(4916), pdf(15032), txt(7422), csv(795112664)Available download formats
    Dataset updated
    Jan 4, 2022
    Dataset provided by
    UN Humanitarian Data Exchange
    Description

    Data Overview

    This repository contains spatiotemporal data from many official sources for 2019-Novel Coronavirus beginning 2019 in Hubei, China ("nCoV_2019")

    You may not use this data for commercial purposes. If there is a need for commercial use of the data, please contact Metabiota at info@metabiota.com to obtain a commercial use license.

    The incidence data are in a CSV file format. One row in an incidence file contains a piece of epidemiological data extracted from the specified source.

    The file contains data from multiple sources at multiple spatial resolutions in cumulative and non-cumulative formats by confirmation status. To select a single time series of case or death data, filter the incidence dataset by source, spatial resolution, location, confirmation status, and cumulative flag.

    Data are collected, structured, and validated by Metabiota’s digital surveillance experts. The data structuring process is designed to produce the most reliable estimates of reported cases and deaths over space and time. The data are cleaned and provided in a uniform format such that information can be compared across multiple sources. Data are collected at the time of publication in the highest geographic and temporal resolutions available in the original report.

    This repository is intended to provide a single access point for data from a wide range of data sources. Data will be updated periodically with the latest epidemiological data. Metabiota maintains a database of epidemiological information for over two thousand high-priority infectious disease events. Please contact us (info@metabiota.com) if you are interested in licensing the complete dataset.

    Cumulative vs. Non-Cumulative Incidence

    Reporting sources provide either cumulative incidence, non-cumulative incidence, or both. If the source only provides a non-cumulative incidence value, the cumulative values are inferred using prior reports from the same source. Use the CUMULATIVE FLAG variable to subset the data to cumulative (TRUE) or non-cumulative (FALSE) values.

    Case Confirmation Status

    The incidence datasets include the confirmation status of cases and deaths when this information is provided by the reporting source. Subset the data by the CONFIRMATION_STATUS variable to either TOTAL, CONFIRMED, SUSPECTED, or PROBABLE to obtain the data of your choice.

    Total incidence values include confirmed, suspected, and probable incidence values. If a source only provides suspected, probable, or confirmed incidence, the total incidence is inferred to be the sum of the provided values. If the report does not specify confirmation status, the value is included in the "total" confirmation status value.

    The data provided under the "Metabiota Composite Source" often does not include suspected incidence due to inconsistencies in reporting cases and deaths with this confirmation status.

    Outcome - Cases vs. Deaths

    The incidence datasets include cases and deaths. Subset the data to either CASE or DEATH using the OUTCOME variable. It should be noted that deaths are included in case counts.

    Spatial Resolution

    Data are provided at multiple spatial resolutions. Data should be subset to a single spatial resolution of interest using the SPATIAL_RESOLUTION variable.

    Information is included at the finest spatial resolution provided to the original epidemic report. We also aggregate incidence to coarser geographic resolutions. For example, if a source only provides data at the province-level, then province-level data are included in the dataset as well as country-level totals. Users should avoid summing all cases or deaths in a given country for a given date without specifying the SPATIAL_RESOLUTION value. For example, subset the data to SPATIAL_RESOLUTION equal to “AL0” in order to view only the aggregated country level data.

    There are differences in administrative division naming practices by country. Administrative levels in this dataset are defined using the Google Geolocation API (https://developers.google.com/maps/documentation/geolocation/). For example, the data for the 2019-nCoV from one source provides information for the city of Beijing, which Google Geolocations indicates is a “locality.” Beijing is also the name of the municipality where the city Beijing is located. Thus, the 2019-nCoV dataset includes rows of data for both the city Beijing, as well as the municipality of the same name. If additional cities in the Beijing municipality reported data, those data would be aggregated with the city Beijing data to form the municipality Beijing data.

    Sources

    Data sources in this repository were selected to provide comprehensive spatiotemporal data for each outbreak. Data from a specific source can be selected using the SOURCE variable.

    In addition to the original reporting sources, Metabiota compiles multiple sources to generate the most comprehensive view of an outbreak. This compilation is stored in the database under the source name “Metabiota Composite Source.” The purpose of generating this new view of the outbreak is to provide the most accurate and precise spatiotemporal data for the outbreak. At this time, Metabiota does not incorporate unofficial - including media - sources into the “Metabiota Composite Source” dataset.

    Quality Assurance

    Data are collected by a team of digital surveillance experts and undergo many quality assurance tests. After data are collected, they are independently verified by at least one additional analyst. The data also pass an automated validation program to ensure data consistency and integrity.

    NonCommercial Use License

    • Creative Commons License Attribution-NonCommercial-ShareAlike 3.0 Unported (CC BY-NC-SA 3.0)

    • This is a human-readable summary of the Legal Code.

    • You are free:

      to Share — to copy, distribute and transmit the work to Remix — to adapt the work

    • Under the following conditions:

      Attribution — You must attribute the work in the manner specified by the author or licensor (but not in any way that suggests that they endorse you or your use of the work).

      Noncommercial — You may not use this work for commercial purposes.

      Share Alike — If you alter, transform, or build upon this work, you may distribute the resulting work only under the same or similar license to this one.

    • With the understanding that:

      Waiver — Any of the above conditions can be waived if you get permission from the copyright holder.

      Public Domain — Where the work or any of its elements is in the public domain under applicable law, that status is in no way affected by the license.

      Other Rights — In no way are any of the following rights affected by the license: Your fair dealing or fair use rights, or other applicable copyright exceptions and limitations; The author's moral rights; Rights other persons may have either in the work itself or in how the work is used, such as publicity or privacy rights. Notice — For any reuse or distribution, you must make clear to others the license terms of this work. The best way to do this is with a link to this web page.

    For details and the full license text, see http://creativecommons.org/licenses/by-nc-sa/3.0/

    Liability

    Metabiota shall in no event be liable for any decision taken by the user based on the data made available. Under no circumstances, shall Metabiota be liable for any damages (whatsoever) arising out of the use or inability to use the database. The entire risk arising out of the use of the database remains with the user.

  15. m

    Data for: Clustering analysis of countries using the COVID-19 cases dataset

    • data.mendeley.com
    Updated Apr 26, 2021
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Efthimios Zervas (2021). Data for: Clustering analysis of countries using the COVID-19 cases dataset [Dataset]. http://doi.org/10.17632/ffrk66tnvf.1
    Explore at:
    Dataset updated
    Apr 26, 2021
    Authors
    Efthimios Zervas
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Cases of COVID-19 Figure1: per country (per day after the first case and per date), Figure2: clustering of Figure1 data, Figure 3: per country/population (per day after the first case and per date), Figure 4: clustering of Figure3 data, Figure5: per country/population/land area (per day after the first case and per date), Figure 6: clustering of Figure 5 data

  16. T

    CORONAVIRUS CASES by Country in EUROPE

    • tradingeconomics.com
    csv, excel, json, xml
    Updated Mar 8, 2020
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    TRADING ECONOMICS (2020). CORONAVIRUS CASES by Country in EUROPE [Dataset]. https://tradingeconomics.com/country-list/coronavirus-cases?continent=europe
    Explore at:
    excel, xml, csv, jsonAvailable download formats
    Dataset updated
    Mar 8, 2020
    Dataset authored and provided by
    TRADING ECONOMICS
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    2025
    Area covered
    Europe
    Description

    This dataset provides values for CORONAVIRUS CASES reported in several countries. The data includes current values, previous releases, historical highs and record lows, release frequency, reported unit and currency.

  17. Asia Covid 19 Cases

    • kaggle.com
    zip
    Updated Oct 11, 2021
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Vivek Chowdhury (2021). Asia Covid 19 Cases [Dataset]. https://www.kaggle.com/datasets/vivek468/asia-covid-19-cases-updated-10-oct-21
    Explore at:
    zip(2174 bytes)Available download formats
    Dataset updated
    Oct 11, 2021
    Authors
    Vivek Chowdhury
    License

    http://opendatacommons.org/licenses/dbcl/1.0/http://opendatacommons.org/licenses/dbcl/1.0/

    Area covered
    Asia
    Description

    About the Data:

    A year ago, when WHO declared COVID-19 outbreak a pandemic, countries in WHO South-East Asia Region were either responding to their first cases of importation or cluster of cases or keeping a strict vigil against importation of the new coronavirus.

    The following months were unprecedented, and for many reasons. Scientists, experts, governments, societies, communities and even individuals responded to the new virus with urgency and measures never witnessed before.

    Metadata:

    ID: Unique Identifier Country: Name of Country TotalCases: Total Number of cases recorded so far TotalDeaths: Total Deaths recorded so far TotalRecovered: How many people survived ActiveCases: Number of people who currently has the virus TotalCasesPerMillion: How many cases are recorded per million individual TotalDeathsPerMillion: How many deaths recorded per million individual TotalTests: Total number of COVID19 tests conducted RTPCR + RAT + any other tests TotalTestsPerMillion: How many tests were conducted per million individual TotalPopulation: Population of the country

    Acknowledgements:

    This dataset was collected from: https://www.worldometers.info/coronavirus/#countries

    Call For Code:

    Fellow Data Scientist and ML engineers, can you identify which countries are doing relatively well and which ones need immediate attention? Your insights can save millions of lives in Asia!

  18. m

    Coronavirus Panoply.io for Database Warehousing and Post Analysis using...

    • data.mendeley.com
    Updated Feb 4, 2020
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Pranav Pandya (2020). Coronavirus Panoply.io for Database Warehousing and Post Analysis using Sequal Language (SQL) [Dataset]. http://doi.org/10.17632/4gphfg5tgs.2
    Explore at:
    Dataset updated
    Feb 4, 2020
    Authors
    Pranav Pandya
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    It has never been easier to solve any database related problem using any sequel language and the following gives an opportunity for you guys to understand how I was able to figure out some of the interline relationships between databases using Panoply.io tool.

    I was able to insert coronavirus dataset and create a submittable, reusable result. I hope it helps you work in Data Warehouse environment.

    The following is list of SQL commands performed on dataset attached below with the final output as stored in Exports Folder QUERY 1 SELECT "Province/State" As "Region", Deaths, Recovered, Confirmed FROM "public"."coronavirus_updated" WHERE Recovered>(Deaths/2) AND Deaths>0 Description: How will we estimate where Coronavirus has infiltrated, but there is effective recovery amongst patients? We can view those places by having Recovery twice more than the Death Toll.

    Query 2 SELECT country, sum(confirmed) as "Confirmed Count", sum(Recovered) as "Recovered Count", sum(Deaths) as "Death Toll" FROM "public"."coronavirus_updated" WHERE Recovered>(Deaths/2) AND Confirmed>0 GROUP BY country

    Description: Coronavirus Epidemic has infiltrated multiple countries, and the only way to be safe is by knowing the countries which have confirmed Coronavirus Cases. So here is a list of those countries

    Query 3 SELECT country as "Countries where Coronavirus has reached" FROM "public"."coronavirus_updated" WHERE confirmed>0 GROUP BY country Description: Coronavirus Epidemic has infiltrated multiple countries, and the only way to be safe is by knowing the countries which have confirmed Coronavirus Cases. So here is a list of those countries.

    Query 4 SELECT country, sum(suspected) as "Suspected Cases under potential CoronaVirus outbreak" FROM "public"."coronavirus_updated" WHERE suspected>0 AND deaths=0 AND confirmed=0 GROUP BY country ORDER BY sum(suspected) DESC

    Description: Coronavirus is spreading at alarming rate. In order to know which countries are newly getting the virus is important because in these countries if timely measures are taken, it could prevent any causalities. Here is a list of suspected cases with no virus resulted deaths.

    Query 5 SELECT country, sum(suspected) as "Coronavirus uncontrolled spread count and human life loss", 100*sum(suspected)/(SELECT sum((suspected)) FROM "public"."coronavirus_updated") as "Global suspected Exposure of Coronavirus in percentage" FROM "public"."coronavirus_updated" WHERE suspected>0 AND deaths=0 GROUP BY country ORDER BY sum(suspected) DESC Description: Coronavirus is getting stronger in particular countries, but how will we measure that? We can measure it by knowing the percentage of suspected patients amongst countries which still doesn’t have any Coronavirus related deaths. The following is a list.

    Data Provided by: SRK, Data Scientist at H2O.ai, Chennai, India

  19. o

    COVID-19 development in Vietnam and 10 neighboring countries in Asia -...

    • data.opendevelopmentmekong.net
    Updated Jul 30, 2020
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    (2020). COVID-19 development in Vietnam and 10 neighboring countries in Asia - Dataset OD Mekong Datahub [Dataset]. https://data.opendevelopmentmekong.net/dataset/covid-19-increasing-in-vietnam-and-14-neighboring-countries-in-asia
    Explore at:
    Dataset updated
    Jul 30, 2020
    License

    Attribution-ShareAlike 4.0 (CC BY-SA 4.0)https://creativecommons.org/licenses/by-sa/4.0/
    License information was derived automatically

    Area covered
    Asia, Mekong River, Vietnam
    Description

    The data set provides numbers of COVID-19 infections in Vietnam and some neighboring countries in Asia. In the data set in Vietnam, there is a classification of the total number of new cases, new cases and new infections in the community. The number of infections in the community is the number of unknown infections. The case data is published on a daily basis and is aggregated to the time of current statistics.

  20. T

    CORONAVIRUS CASES by Country in AMERICA

    • tradingeconomics.com
    csv, excel, json, xml
    Updated Mar 8, 2020
    + more versions
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    TRADING ECONOMICS (2020). CORONAVIRUS CASES by Country in AMERICA [Dataset]. https://tradingeconomics.com/country-list/coronavirus-cases?continent=america
    Explore at:
    excel, csv, xml, jsonAvailable download formats
    Dataset updated
    Mar 8, 2020
    Dataset authored and provided by
    TRADING ECONOMICS
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    2025
    Area covered
    United States
    Description

    This dataset provides values for CORONAVIRUS CASES reported in several countries. The data includes current values, previous releases, historical highs and record lows, release frequency, reported unit and currency.

Share
FacebookFacebook
TwitterTwitter
Email
Click to copy link
Link copied
Close
Cite
GHOST5612 (2025). Novel Covid-19 Dataset [Dataset]. https://www.kaggle.com/datasets/ghost5612/novel-covid-19-dataset
Organization logo

Novel Covid-19 Dataset

Day level Info On Covid-19 affected cases Worldwide

Explore at:
CroissantCroissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
Dataset updated
Sep 18, 2025
Dataset provided by
Kagglehttp://kaggle.com/
Authors
GHOST5612
License

MIT Licensehttps://opensource.org/licenses/MIT
License information was derived automatically

Description

Context:

From World Health Organization - On 31 December 2019, WHO was alerted to several cases of pneumonia in Wuhan City, Hubei Province of China. The virus did not match any other known virus. This raised concern because when a virus is new, we do not know how it affects people.

So daily level information on the affected people can give some interesting insights when it is made available to the broader data science community.

Johns Hopkins University has made an excellent dashboard using the affected cases data. Data is extracted from the google sheets associated and made available here.

Edited:

Now data is available as csv files in the Johns Hopkins Github repository. Please refer to the github repository for the Terms of Use details. Uploading it here for using it in Kaggle kernels and getting insights from the broader DS community.

Content

2019 Novel Coronavirus (2019-nCoV) is a virus (more specifically, a coronavirus) identified as the cause of an outbreak of respiratory illness first detected in Wuhan, China. Early on, many of the patients in the outbreak in Wuhan, China reportedly had some link to a large seafood and animal market, suggesting animal-to-person spread. However, a growing number of patients reportedly have not had exposure to animal markets, indicating person-to-person spread is occurring. At this time, it’s unclear how easily or sustainably this virus is spreading between people - CDC

This dataset has daily level information on the number of affected cases, deaths and recovery from 2019 novel coronavirus. Please note that this is a time series data and so the number of cases on any given day is the cumulative number.

The data is available from 22 Jan, 2020.

Here’s a polished version suitable for a professional Kaggle dataset description:

Dataset Description

This dataset contains time-series and case-level records of the COVID-19 pandemic. The primary file is covid_19_data.csv, with supporting files for earlier records and individual-level line list data.

Files and Columns

1. covid_19_data.csv (Main File)

This is the primary dataset and contains aggregated COVID-19 statistics by location and date.

  • Sno – Serial number of the record
  • ObservationDate – Date of the observation (MM/DD/YYYY)
  • Province/State – Province or state of the observation (may be missing for some entries)
  • Country/Region – Country of the observation
  • Last Update – Timestamp (UTC) when the record was last updated (not standardized, requires cleaning before use)
  • Confirmed – Cumulative number of confirmed cases on that date
  • Deaths – Cumulative number of deaths on that date
  • Recovered – Cumulative number of recoveries on that date

2. 2019_ncov_data.csv (Legacy File)

This file contains earlier COVID-19 records. It is no longer updated and is provided only for historical reference. For current analysis, please use covid_19_data.csv.

3. COVID_open_line_list_data.csv

This file provides individual-level case information, obtained from an open data source. It includes patient demographics, travel history, and case outcomes.

4. COVID19_line_list_data.csv

Another individual-level case dataset, also obtained from public sources, with detailed patient-level information useful for micro-level epidemiological analysis.

✅ Use covid_19_data.csv for up-to-date aggregated global trends.

✅ Use the line list datasets for detailed, individual-level case analysis.

Country level datasets:

If you are interested in knowing country level data, please refer to the following Kaggle datasets:

India - https://www.kaggle.com/sudalairajkumar/covid19-in-india

South Korea - https://www.kaggle.com/kimjihoo/coronavirusdataset

Italy - https://www.kaggle.com/sudalairajkumar/covid19-in-italy

Brazil - https://www.kaggle.com/unanimad/corona-virus-brazil

USA - https://www.kaggle.com/sudalairajkumar/covid19-in-usa

Switzerland - https://www.kaggle.com/daenuprobst/covid19-cases-switzerland

Indonesia - https://www.kaggle.com/ardisragen/indonesia-coronavirus-cases

Acknowledgements :

Johns Hopkins University for making the data available for educational and academic research purposes

MoBS lab - https://www.mobs-lab.org/2019ncov.html

World Health Organization (WHO): https://www.who.int/

DXY.cn. Pneumonia. 2020. http://3g.dxy.cn/newh5/view/pneumonia.

BNO News: https://bnonews.com/index.php/2020/02/the-latest-coronavirus-cases/

National Health Commission of the People’s Republic of China (NHC): http://www.nhc.gov.cn/xcs/yqtb/list_gzbd.shtml

China CDC (CCDC): http://weekly.chinacdc.cn/news/TrackingtheEpidemic.htm

Hong Kong Department of Health: https://www.chp.gov.hk/en/features/102465.html

Macau Government: https://www.ssm.gov.mo/portal/

Taiwan CDC: https://sites.google....

Search
Clear search
Close search
Google apps
Main menu