Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset contains 20 labeled COVID-19 CT scans. Left lung, right lung, and infections are labeled by two radiologists and verified by an experienced radiologist.
To promote the study of annotation-efficient deep learning methods, we set up three segmentation benchmark tasks based on this dataset: https://gitee.com/junma11/COVID-19-CT-Seg-Benchmark.
In particular, we focus on learning to segment left lung, right lung, and infections using
https://www.cancerimagingarchive.net/data-usage-policies-and-restrictions/
Background
The COVID-19 pandemic is a global healthcare emergency. Prediction models for COVID-19 imaging are rapidly being developed to support medical decision making in imaging. However, inadequate availability of a diverse annotated dataset has limited the performance and generalizability of existing models.
Purpose
To create the first multi-institutional, multi-national expert annotated COVID-19 imaging dataset made freely available to the machine learning community as a research and educational resource for COVID-19 chest imaging. The Radiological Society of North America (RSNA) assembled the RSNA International COVID-19 Open Radiology Database (RICORD) collection of COVID-related imaging datasets and expert annotations to support research and education. RICORD data will be incorporated in the Medical Imaging and Data Resource Center (MIDRC), a multi-institutional research data repository funded by the National Institute of Biomedical Imaging and Bioengineering of the National Institutes of Health.
Materials and Methods
This dataset was a collaboration between the RSNA and Society of Thoracic Radiology (STR).
Results
The RSNA International COVID-19 Open Annotated Radiology Database (RICORD) release 1b consists of 120 thoracic computed tomography (CT) scans of COVID negative patients from four international sites.
Patient Selection: Patients at least 18 years of age who received a negative diagnosis for COVID-19.
Data Abstract
120 de-identified Thoracic CT scans from COVID negative patients.
Supporting clinical variables: MRN*, Age, Exam Date/Time*, Exam Description, Sex, Study UID*, Image Count, Modality, Symptomatic, Testing Result, Specimen Source (* pseudonymous values).
Research Benefits
As this is a public dataset, RICORD is available for non-commercial use (and further enrichment) by the research and education communities which may include development of educational resources for COVID-19, use of RICORD to create AI systems for diagnosis and quantification, benchmarking performance for existing solutions, exploration of distributed/federated learning, further annotation or data augmentation efforts, and evaluation of the examinations for disease entities beyond COVID-19 pneumonia. Deliberate consideration of the detailed annotation schema, demographics, and other included meta-data will be critical when generating cohorts with RICORD, particularly as more public COVID-19 imaging datasets are made available via complementary and parallel efforts. It is important to emphasize that there are limitations to the clinical “ground truth” as the SARS-CoV-2 RT-PCR tests have widely documented limitations and are subject to both false-negative and false-positive results which impact the distribution of the included imaging data, and may have led to an unknown epidemiologic distortion of patients based on the inclusion criteria. These limitations notwithstanding, RICORD has achieved the stated objectives for data complexity, heterogeneity, and high-quality expert annotations as a comprehensive COVID-19 thoracic imaging data resource.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This COVID-19 dataset consists of Non-COVID and COVID cases of both X-ray and CT images. The associated dataset is augmented with different augmentation techniques to generate about 17099 X-ray and CT images. The dataset contains two main folders, one for the X-ray images, which includes two separate sub-folders of 5500 Non-COVID images and 4044 COVID images. The other folder contains the CT images. It includes two separate sub-folders of 2628 Non-COVID images and 5427 COVID images.
A CT scan dataset about COVID-19
We built a large lung CT scan dataset for COVID-19 by curating data from 7 public datasets. Three of these datasets had shared COVID-19 lesion masks. This dataset merges the COVID-19 lesion masks and their corresponding frames of these 3 public datasets, with 2729 image and ground truth mask pairs. All different types of lesions are mapped to white color for consistency across datasets.
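The mapping of all lesion types to white can be sketched as follows. This is only a minimal illustration, assuming each source mask is an integer array in which 0 is background and any nonzero value marks some lesion type. The function name `to_white_mask` and the sample label values are hypothetical and are not taken from the dataset's own code.

```python
import numpy as np

def to_white_mask(mask: np.ndarray) -> np.ndarray:
    """Map every nonzero lesion label to white (255); keep background at 0."""
    return np.where(mask != 0, 255, 0).astype(np.uint8)

# Hypothetical mask with two lesion label values (1 and 2) on a background of 0.
example = np.array([[0, 1, 1],
                    [0, 2, 0]])
white = to_white_mask(example)
```

After this step, masks from different sources share one binary convention, so a single segmentation loss can be applied to all of them.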
A curated dataset of CT scan images of COVID-19 from India, used for training and testing machine learning models.
This public use dataset has 11 data elements reflecting COVID-19 community levels for all available counties. This dataset contains the same values used to display information available at https://www.cdc.gov/coronavirus/2019-ncov/science/community-levels-county-map.html. CDC looks at the combination of three metrics (new COVID-19 admissions per 100,000 population in the past 7 days, the percent of staffed inpatient beds occupied by COVID-19 patients, and total new COVID-19 cases per 100,000 population in the past 7 days) to determine the COVID-19 community level. The COVID-19 community level is determined by the higher of the new admissions and inpatient beds metrics, based on the current level of new cases per 100,000 population in the past 7 days. New COVID-19 admissions and the percent of staffed inpatient beds occupied represent the current potential for strain on the health system. Data on new cases act as an early warning indicator of potential increases in health system strain in the event of a COVID-19 surge. Using these data, the COVID-19 community level is classified as low, medium, or high. COVID-19 Community Levels can help communities and individuals make decisions based on their local context and their unique needs. Community vaccination coverage and other local information, such as early alerts from wastewater surveillance or the number of emergency department visits for COVID-19, when available, can also inform decision making for health officials and individuals. See https://www.cdc.gov/coronavirus/2019-ncov/science/community-levels.html for more information. Visit CDC's COVID Data Tracker County View (https://covid.cdc.gov/covid-data-tracker/#county-view) to learn more about the individual metrics used for CDC's COVID-19 community level in your county. Please note that county-level data are not available for territories.
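The rule described above (take the higher of the admissions and inpatient-bed indicators, with cut-offs that depend on the case rate) can be sketched roughly as below. The numeric thresholds are not given in this description; the values used here are an assumption recalled from CDC's published community level framework and should be checked against the CDC page before any use. The function and variable names are hypothetical.

```python
LEVELS = ["low", "medium", "high"]

def _indicator_level(value, medium_cut, high_cut):
    """Return 0 (low), 1 (medium) or 2 (high) for a single indicator."""
    if value >= high_cut:
        return 2
    if value >= medium_cut:
        return 1
    return 0

def community_level(cases_per_100k, admissions_per_100k, pct_beds):
    """Classify a county from the three metrics (assumed thresholds)."""
    if cases_per_100k < 200:
        # Assumed cut-offs for the lower case-rate band.
        adm = _indicator_level(admissions_per_100k, 10.0, 20.0)
        beds = _indicator_level(pct_beds, 10.0, 15.0)
    else:
        # Assumed: at 200 or more cases, the lowest reachable level is medium.
        adm = 2 if admissions_per_100k >= 10.0 else 1
        beds = 2 if pct_beds >= 10.0 else 1
    # The community level is the higher of the two indicator levels.
    return LEVELS[max(adm, beds)]
```

For example, under these assumed thresholds `community_level(150, 12, 5)` evaluates to `"medium"`, because the admissions indicator is the higher of the two.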
We built a large lung CT scan dataset for COVID-19 by curating data from 7 public datasets listed in the acknowledgements. These datasets have been publicly used in COVID-19 diagnosis literature and proven their efficiency in deep learning applications. Therefore, the merged dataset is expected to improve the generalization ability of deep learning methods by learning from all these resources together.
These datasets are made available in different formats. Our goal is to provide a large dataset of COVID-19, Normal, and CAP CT slices together with their corresponding metadata. Some of the datasets consist of categorized CT slices, and some include CT volumes with annotated lesion slices. Therefore, we used the slice-level annotations to extract axial slices from CT volumes. We then converted all the images to 8-bit to have a consistent depth.
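The description does not say how the 8-bit conversion was performed. One common approach, shown here purely as an assumed sketch, is a linear min-max rescale of each slice to the 0..255 range; a fixed intensity window would be an equally plausible choice. The function name `to_uint8` and the sample values are hypothetical.

```python
import numpy as np

def to_uint8(slice_2d: np.ndarray) -> np.ndarray:
    """Linearly rescale a 2D slice to the 0..255 range and cast to 8-bit."""
    data = slice_2d.astype(np.float64)
    lo, hi = data.min(), data.max()
    if hi == lo:
        # Constant image: avoid division by zero and return all zeros.
        return np.zeros(data.shape, dtype=np.uint8)
    scaled = (data - lo) / (hi - lo) * 255.0
    return np.round(scaled).astype(np.uint8)

# Hypothetical 16-bit slice with values resembling raw CT intensities.
raw = np.array([[-1000, 0], [500, 1000]], dtype=np.int16)
img8 = to_uint8(raw)
```

Per-slice min-max scaling discards absolute intensity information, which is one reason a fixed window is sometimes preferred when slices come from different scanners.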
To ensure dataset quality, we removed the closed-lung normal slices that do not carry information about inside-lung manifestations. Additionally, we did not include images lacking clear class labels or patient information. In total, we have gathered 7,593 COVID-19 images from 466 patients, 6,893 normal images from 604 patients, and 2,618 CAP images from 60 patients. All of our CAP images are from the Afshar et al. dataset, in which 25 cases are already annotated. Our radiologist has annotated the remaining 35 CT scan volumes. This is the largest COVID-19 lung CT dataset so far, to the best of our knowledge. https://github.com/maftouni/Curated_Covid_CT.git
If you use this dataset for your research, please cite:
Maftouni, M., Law, A.C, Shen, B., Zhou, Y., Yazdi, N., and Kong, Z.J. “A Robust Ensemble-Deep Learning Model for COVID-19 Diagnosis based on an Integrated CT Scan Images Database,” Proceedings of the 2021 Industrial and Systems Engineering Conference, Virtual Conference, May 22-25, 2021.
This dataset contains anonymised human lung computed tomography (CT) scans with COVID-19 related findings, as well as without such findings. In total, there are 1000 CT scans, each from a unique patient.
A subset of 50 studies has been annotated with binary pixel masks for segmentation depicting regions of interest (ground-glass opacifications and consolidations). CT scans were obtained between 1st of March, 2020 and 25th of April, 2020, and provided by medical hospitals in Moscow, Russia.
If you use this dataset in your research, please credit the authors
Morozov, S., Andreychenko, A., Blokhin, I., Vladzymyrskyy, A., Gelezhe, P., Gombolevskiy, V., Gonchar, A., Ledikhova, N., Pavlov, N., Chernina, V. MosMedData: Chest CT Scans with COVID-19 Related Findings, 2020, v. 1.0.
CC BY NC ND 3.0
https://www.cancerimagingarchive.net/data-usage-policies-and-restrictions/
This collection of cases was acquired at Stony Brook University from patients who tested positive for COVID-19. The collection includes images from different modalities and organ sites (chest radiographs, chest CTs, brain MRIs, etc.). Radiology imaging data is extremely important in COVID-19 from both a diagnostic and a monitoring perspective, given the crucial nature of COVID-19 pulmonary disease and its rapid phenotypic changes. The datasets are available for building AI systems for diagnostic and prognostic modeling.
This collection also includes associated clinical data for each patient. The clinical data consists of diagnoses, procedures, lab tests, COVID-19-specific data values (e.g., intubation status, symptoms at admission) and a set of derived data elements, which were used in analyses of this data. The clinical data is stored as a set of CSV files which comply with OMOP Common Data Model data elements.
The images on the right show automated identification of regions of prognostic importance on baseline chest radiographs. The regions of highest prognostic importance (as determined by the AI algorithm) are observed primarily in lower lung regions, consistent with clinical findings on the corresponding CXRs.
COVID-19 CT-scans segmentation datasets
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
COVID-19, CAP and Normal subjects are placed in separate folders, within which patients are arranged in folders, followed by CT scan slices in DICOM format. Index.csv covers the patients who have slice-level and lobe-level labels. The indices given to patients in Index.csv are then used in Slice-level-labels.npy and Lobe-level-labels.npy to indicate the slice and lobe labels.
Slice-level-labels.npy is a 2D binary NumPy array in which 1 indicates infection in a given slice and 0 indicates no infection. The first dimension represents the case index and the second represents the slice number.
Lobe-level-labels.npy is a 3D binary NumPy array in which 1 indicates infection in a given lobe and slice. As in the slice-level array, the first two dimensions represent the case index and slice number, respectively. The third dimension gives the lobe index, as follows:
0: Left Lower Lobe (LLL)
1: Left Upper Lobe (LUL)
2: Right Lower Lobe (RLL)
3: Right Middle Lobe (RML)
4: Right Upper Lobe (RUL)
CT slices are sorted by the "Slice Location" value stored in the corresponding DICOM tag "(0020,1041) - DS - Slice Location", and the slice-level and lobe-level labels follow that slice order. Researchers can, however, re-arrange the slices using other CT attributes, as long as they re-arrange the labels accordingly.
The COVID-CT-MD dataset is also accompanied by clinical data, stored in "Clinical-data.csv". Finally, to facilitate inter-observer reliability studies, the labels assigned by the three radiologists are provided separately in "Radiogists-seperated-labels.csv".
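The array layout described above can be illustrated with a short sketch. It uses small synthetic arrays instead of the real files (with the dataset at hand, `np.load("Slice-level-labels.npy")` and `np.load("Lobe-level-labels.npy")` would take their place). The shapes, the `instance_number` values used for re-ordering, and the helper `infected_lobes` are all hypothetical, and deriving the slice labels from the lobe labels is an assumption made only to keep the toy data consistent.

```python
import numpy as np

LOBES = ["LLL", "LUL", "RLL", "RML", "RUL"]  # lobe indices 0..4

# Synthetic stand-in for Lobe-level-labels.npy: (case, slice, lobe).
n_cases, n_slices = 2, 4
lobe_labels = np.zeros((n_cases, n_slices, len(LOBES)), dtype=np.uint8)
lobe_labels[0, 1, 2] = 1  # case 0, slice 1, Right Lower Lobe
lobe_labels[0, 2, 4] = 1  # case 0, slice 2, Right Upper Lobe

# Synthetic stand-in for Slice-level-labels.npy: (case, slice).
# Assumption for this toy data: a slice is infected if any lobe in it is.
slice_labels = lobe_labels.any(axis=2).astype(np.uint8)

def infected_lobes(case: int, slice_idx: int) -> list:
    """Names of the lobes marked as infected in one slice of one case."""
    return [LOBES[i] for i in np.flatnonzero(lobe_labels[case, slice_idx])]

# Re-ordering the slices of case 0 by another (hypothetical) attribute:
# the same permutation must be applied to both label arrays.
instance_number = np.array([3, 1, 2, 4])
order = np.argsort(instance_number)
reordered_slice_labels = slice_labels[0][order]
reordered_lobe_labels = lobe_labels[0][order]
```

Applying one `order` array to the slices and to both label arrays is what keeps the labels aligned after a re-sort.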
This dataset provides the following measures related to COVID-19 in Connecticut (CT) public and private PK-12 schools for the latest week-long reporting period:
Number of staff cases and change from the previous reporting period
Number of student cases and change from the previous reporting period
Number of student cases by learning model (fully in-person, hybrid, fully remote, or unknown) and change from the previous reporting period
As of 6/24/2021, COVID-19 school-based surveillance activities for the 2020-2021 academic year have ended. The Connecticut Department of Public Health and the Connecticut State Department of Education are planning to resume these activities at the start of the 2021-2022 academic year. Data for the 2021-2022 school year is available here: https://data.ct.gov/Health-and-Human-Services/COVID-19-Cases-in-CT-Schools-State-Summary-2021-20/r6vy-dvtz
Attribution-NonCommercial 3.0 (CC BY-NC 3.0) https://creativecommons.org/licenses/by-nc/3.0/
License information was derived automatically
This dataset includes CT data and segmentation masks from patients diagnosed with COVID-19, as well as data from subjects without the infection.
This study is approved under the ethical approval codes of IR.TUMS.IKHC.REC.1399.255 and IR.TUMS.VCR.REC.1399.488 at Tehran University of Medical Sciences.
Please use the following citations:
1- Sotoudeh-Paima, Saman, et al. "A Multi-centric Evaluation of Deep Learning Models for Segmentation of COVID-19 Lung Lesions on Chest CT Scans." Iranian Journal of Radiology 19.4 (2022).
2- Hasanzadeh, Navid, et al. "Segmentation of COVID-19 Infections on CT: Comparison of four UNet-based networks." 2020 27th National and 5th International Iranian Conference on Biomedical Engineering (ICBME). IEEE, 2020.
The CT images of COVID-19 dataset is a collection of 349 CT images of COVID-19 patients and 397 non-COVID-19 CT images.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Classification results: COVID-19 vs. common pneumonia CT slices.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Details of the datasets.
We describe a publicly available multiclass CT scan dataset for SARS-CoV-2 infection identification. It currently contains 4173 CT scans of 210 different patients, of which 2168 correspond to 80 patients infected with SARS-CoV-2 and confirmed by RT-PCR. These data were collected at the Public Hospital of the Government Employees of Sao Paulo (HSPM) and the Metropolitan Hospital of Lapa, both in Sao Paulo, Brazil. The dataset is composed of CT scans in PNG format, which are divided into:
758 CT scans for healthy patients (15 CT scans per patient on average).
2168 CT scans for patients infected by SARS-CoV-2 (27 CT scans per patient on average).
1247 CT scans for patients with other pulmonary diseases (16 CT scans per patient on average).
TOTAL: 4173 CT scans for 210 patients of Sao Paulo, Brazil (20 CT scans per patient on average).
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
This dataset contains key characteristics about the data described in the Data Descriptor COVID-CT-MD, COVID-19 computed tomography scan dataset applicable in machine learning and deep learning. Contents:
1. human readable metadata summary table in CSV format
2. machine readable metadata file in JSON format