100+ datasets found
  1. COVID-19 Dataset

    • kaggle.com
    zip
    Updated Nov 13, 2022
    + more versions
  2. Coronavirus (Covid-19) Data in the United States

    • github.com
    • openicpsr.org
    • +3 more
    csv
  3. Covid 19 Time Series Data

    • kaggle.com
    zip
    Updated Nov 10, 2020
  4. Daily COVID-19 Data (2020-2024)

    • kaggle.com
    zip
    Updated Aug 26, 2024
  5. COVID-19 Cases and Deaths by Age Group - ARCHIVE

    • catalog.data.gov
    • data.ct.gov
    Updated Aug 12, 2023
  6. Data from: COVID-19 Case Surveillance Public Use Data with Geography

    • catalog.data.gov
    • data.fr.virginia.gov
    • +15 more
    Updated May 8, 2021
  7. Data from: Hospitalizations

    • health.google.com
    Updated Oct 7, 2021
  8. COVID-19 Case Surveillance Restricted Access Detailed Data

    • data.cdc.gov
    • data.vi-vn.virginia.gov
    • +13 more
    csv, xlsx, xml
    Updated Nov 20, 2020
    + more versions
  9. Our World in Data COVID-19 Dataset

    • ieee-dataport.org
    Updated Aug 16, 2023
  10. COVID 19 Dataset

    • kaggle.com
    zip
    Updated Oct 23, 2024
  11. COVID-19 County Level Data - Archive

    • catalog.data.gov
    • data.ct.gov
    • +1 more
    Updated Jun 21, 2025
  12. Public Health Infobase - Data on COVID-19 in Canada

    • open.canada.ca
    • datasets.ai
    csv
    Updated Nov 21, 2025
  13. COVID-19 Dataset

    • kaggle.com
    zip
    Updated May 26, 2024
  14. COVID-19 Global Dataset

    • kaggle.com
    zip
    Updated Jun 2, 2024
    + more versions
  15. Weather

    • health.google.com
    Updated Oct 7, 2021
  16. COVID-19 Dataset: Global Data for Analysis

    • kaggle.com
    zip
    Updated Jul 9, 2023
  17. Corona Virus Covid-19 US Counties

    • kaggle.com
    zip
    Updated Aug 22, 2022
  18. COVID-19: Daily Cases Data

    • dataful.in
    Updated Mar 12, 2026
  19. COVID-19 Case Surveillance Public Use Data

    • catalog.data.gov
    • data.es.virginia.gov
    • +18 more
    Updated Mar 3, 2022
  20. COVID-19 Weekly Cases and Deaths by Age, Race/Ethnicity, and Sex - ARCHIVED

    • healthdata.gov
    • data.fr.virginia.gov
    • +12 more
    csv, xlsx, xml
    Updated Dec 24, 2022
    + more versions
Cite
Meir Nizri (2022). COVID-19 Dataset [Dataset]. https://www.kaggle.com/datasets/meirnizri/covid19-dataset

COVID-19 Dataset

COVID-19 patient's symptoms, status, and medical history.

27 scholarly articles cite this dataset (View in Google Scholar)
Available download formats: zip (4890659 bytes)
Dataset updated
Nov 13, 2022
Authors
Meir Nizri
License

https://creativecommons.org/publicdomain/zero/1.0/

Description

Context

Coronavirus disease (COVID-19) is an infectious disease caused by a newly discovered coronavirus. Most people infected with the COVID-19 virus will experience mild to moderate respiratory illness and recover without requiring special treatment. Older people and those with underlying medical problems like cardiovascular disease, diabetes, chronic respiratory disease, and cancer are more likely to develop serious illness. Throughout the pandemic, one of the main problems that healthcare providers have faced is the shortage of medical resources and the lack of a proper plan to distribute them efficiently. In these tough times, being able to predict what kind of resource an individual might require when they test positive, or even earlier, would be of immense help to the authorities, who could then procure and arrange the resources necessary to save that patient's life.

The main goal of this project is to build a machine learning model that, given a Covid-19 patient's current symptoms, status, and medical history, will predict whether the patient is at high risk or not.

Content

The dataset was provided by the Mexican government (link). It contains a large amount of anonymized patient-related information, including pre-existing conditions. The raw dataset consists of 21 unique features and 1,048,576 unique patients. In the Boolean features, 1 means "yes" and 2 means "no". Values of 97 and 99 denote missing data.

  • sex: 1 for female and 2 for male.
  • age: age of the patient.
  • classification: covid test findings. Values 1-3 mean that the patient was diagnosed with covid in different degrees. 4 or higher means that the patient is not a carrier of covid or that the test is inconclusive.
  • patient type: type of care the patient received in the unit. 1 for returned home and 2 for hospitalization.
  • pneumonia: whether the patient already has air sac inflammation or not.
  • pregnancy: whether the patient is pregnant or not.
  • diabetes: whether the patient has diabetes or not.
  • copd: whether the patient has chronic obstructive pulmonary disease or not.
  • asthma: whether the patient has asthma or not.
  • inmsupr: whether the patient is immunosuppressed or not.
  • hypertension: whether the patient has hypertension or not.
  • cardiovascular: whether the patient has a heart or blood vessel related disease or not.
  • renal chronic: whether the patient has chronic renal disease or not.
  • other disease: whether the patient has another disease or not.
  • obesity: whether the patient is obese or not.
  • tobacco: whether the patient is a tobacco user or not.
  • usmr: whether the patient was treated in medical units of the first, second, or third level.
  • medical unit: type of institution of the National Health System that provided the care.
  • intubed: whether the patient was connected to a ventilator or not.
  • icu: whether the patient was admitted to an intensive care unit or not.
  • date died: the date of death if the patient died, and 9999-99-99 otherwise.
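
The coding scheme described above can be sketched in plain Python. This is a minimal illustration rather than code from the dataset author: the dictionary keys follow the feature names as written in the list, which may not match the actual column headers in the file, and the helper names `decode_boolean` and `decode_record` as well as the derived keys `covid positive` and `died` are hypothetical. Treating every code other than 1 or 2 as missing is also an assumption that goes slightly beyond the documented 97 and 99.

```python
from typing import Optional

# Placeholder the description gives for patients who did not die.
NO_DEATH_DATE = "9999-99-99"

# Boolean features named in the list above. The real column headers may be
# spelled differently, so these keys are assumptions.
BOOLEAN_FEATURES = [
    "pneumonia", "pregnancy", "diabetes", "copd", "asthma", "inmsupr",
    "hypertension", "cardiovascular", "renal chronic", "other disease",
    "obesity", "tobacco", "intubed", "icu",
]


def decode_boolean(code: int) -> Optional[bool]:
    """Map 1 to True and 2 to False; any other code (such as 97 or 99) to None."""
    if code == 1:
        return True
    if code == 2:
        return False
    return None


def decode_record(raw: dict) -> dict:
    """Return a copy of one raw record with the documented codes decoded."""
    decoded = dict(raw)
    for name in BOOLEAN_FEATURES:
        if name in raw:
            decoded[name] = decode_boolean(int(raw[name]))
    if "sex" in raw:
        decoded["sex"] = {1: "female", 2: "male"}.get(int(raw["sex"]))
    if "patient type" in raw:
        care = {1: "returned home", 2: "hospitalized"}
        decoded["patient type"] = care.get(int(raw["patient type"]))
    if "classification" in raw:
        # Values 1-3 mean a covid diagnosis; 4 or higher means the patient is
        # not a carrier or the test is inconclusive.
        decoded["covid positive"] = 1 <= int(raw["classification"]) <= 3
    if "date died" in raw:
        decoded["died"] = raw["date died"] != NO_DEATH_DATE
    return decoded


# Usage with a made-up record (not taken from the dataset).
sample = {
    "sex": 1, "age": 65, "classification": 3, "patient type": 2,
    "diabetes": 1, "icu": 97, "date died": "9999-99-99",
}
print(decode_record(sample))
```

Decoding missing values to `None` instead of leaving 97 and 99 in place keeps them from being mistaken for real measurements when the features are later fed to a model.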