DICOM files are given for 11 CT scans which were used in a research article. Each scan is stored in its own directory and contains about 1250 slices of 512x512 grayscale images. The low-numbered slices contain the diapers in the order D5 ... D1, followed by the vials of powder in the order V11 ... V2. The symbols correspond to the injected masses, which are given in the paper and repeated here in a file called "mass.txt". A file "README.txt" describes the directory structure.
https://www.cancerimagingarchive.net/data-usage-policies-and-restrictions/
This data set includes low-dose whole body CT images and tissue segmentations of thirty healthy adult research participants who underwent PET/CT imaging on the uEXPLORER total-body PET/CT system at UC Davis. Participants included in this study were healthy adults, 18 years of age or older, who were able to provide informed written consent. The participants' age, sex, weight, height, and body mass index are also provided.
Fifteen participants underwent PET/CT imaging at three timepoints during a 3-hour period (0 minutes, 90 minutes, and 180 minutes) after PET radiotracer injection, while the remaining 15 participants were imaged at six timepoints during a 12-hour period (additionally at 360 minutes, 540 minutes, and 720 minutes). The imaging timepoint is indicated in the Series Description DICOM tag, with a value of either 'dyn', '90min', '3hr', '6hr', '9hr', or '12hr', corresponding to the delay after PET tracer injection. CT images were acquired immediately before PET image acquisition. Currently, only CT images are included in the data set from either three or six timepoints. The tissue segmentations include 37 tissues consisting of 13 abdominal organs, 20 different bones, subcutaneous and visceral fat, skeletal and psoas muscle. Segmentations were automatically generated at the 90 minute timepoint for each participant using MOOSE, an AI segmentation tool for whole body data. The segmentations are provided in NIFTI format and may need to be re-oriented to correctly match the CT image data in DICOM format.
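As an illustration, here is a minimal sketch (assuming the nibabel Python package; the file name below is a hypothetical placeholder) of inspecting and canonicalizing a segmentation's orientation before comparing it against the CT volume; the exact re-orientation required to match the DICOM CT may differ depending on the conversion tools used:

    # Hypothetical sketch: inspect and canonicalize one NIfTI segmentation with nibabel.
    # The file name is a placeholder; the re-orientation actually needed to match the
    # DICOM CT geometry may differ per converter.
    import nibabel as nib

    seg = nib.load("moose_segmentation.nii.gz")        # placeholder path to one segmentation
    print("axis codes:", nib.aff2axcodes(seg.affine))  # current anatomical orientation

    seg_ras = nib.as_closest_canonical(seg)            # reorient to the closest RAS+ layout
    print("canonical axis codes:", nib.aff2axcodes(seg_ras.affine))
    print("voxel grid:", seg_ras.shape)                # compare against the 512x512x828 CT matrix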
The uEXPLORER CT scanner is an 80-row, 160-slice CT scanner typically used for anatomical imaging and attenuation correction for PET/CT. The CT scan obtained at 90 minutes was performed at 140 kVp with an average of 50 mAs for all subjects. At all other timepoints (0 minutes, 180 minutes, etc.) the CT scan was obtained at 140 kVp with an average of 5 mAs. CT images were reconstructed into a 512x512x828 image matrix with a 0.9766x0.9766x2.344 mm^3 voxel size.
A key is provided along with the segmentations download in the Data Access table which details the organ values.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Many research applications of neuroimaging use magnetic resonance imaging (MRI). As such, recommendations for image analysis and standardized imaging pipelines exist. Clinical imaging, however, relies heavily on X-ray computed tomography (CT) scans for diagnosis and prognosis. Currently, there is only one image processing pipeline for head CT, which focuses mainly on head CT data with lesions. We present tools and a complete pipeline for processing CT data, built on open-source solutions, that are aimed at head CT but applicable to most CT analyses. We describe going from raw DICOM data to a spatially normalized brain in CT space, presenting a full example with code. Overall, we recommend anonymizing data with Clinical Trials Processor, converting DICOM data to NIfTI using dcm2niix, using BET for brain extraction, and registration using a publicly-available CT template for analysis.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The two ZIP files contain images of an abdominal CT phantom. DualEnergy.zip contains files from acquisitions at 80 kVp and 140 kVp. The second file, HelicalAxialIterative.zip, contains acquisitions done in axial, helical (aka spiral), and iterative-reconstruction (SAFIRE) modes. Axial images have AbdSeq in the name, helical images have Abdomen and a B filter name, and iterative images have Abdomen and an I filter name. You can compare the exposure in mAs for each acquisition modality and see how it decreases from sequential/axial through helical/spiral to iterative. When programmed for iterative reconstruction, this scanner can further reduce exposure relative to a helical scan. The acquisition was done on a Siemens Healthineers Somatom Confidence CT scanner.
Antonio Ortiz Lora antonio.ortiz.lora.sspa@juntadeandalucia.es
Marcin Balcerzyk mbalcerzyk@us.es
https://www.cancerimagingarchive.net/data-usage-policies-and-restrictions/
Reconstructed images, patient age and gender, and pathology annotation are also provided for these de-identified data sets. The library consists of scans from various exam types, including non-contrast head CT scans acquired for acute cognitive or motor deficit, low-dose non-contrast chest scans acquired to screen high-risk patients for pulmonary nodules, and contrast-enhanced CT scans of the abdomen acquired to look for metastatic liver lesions.
2016 Low Dose CT Grand Challenge
The 2016 Low Dose CT Grand Challenge, sponsored by the AAPM, NIBIB, and Mayo Clinic, used 30 contrast-enhanced abdominal CT patient scans, 10 for training and 20 for testing. Thirteen of the 20 testing datasets from the Grand Challenge were subsequently included in this larger collection of CT image and projection data (TCIA LDCT-and-Projection-data). Because of the frequency of requests received by Mayo and the AAPM for the complete 2016 Grand Challenge dataset, on September 21, 2021 all 30 cases were updated to use the same projection data format as used for the TCIA data library and made publicly available in a single location. Please refer to the READ ME file at that location for a mapping between the case ID numbers used in the 2016 Grand Challenge and the case ID numbers used in the TCIA library for the 13 cases that exist in both libraries.
Additional information about the 2016 Low Dose CT Grand Challenge can be found on the AAPM website and in the Medical Physics paper by McCollough et al.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
CT scan data for titanosaurian sauropod braincase from the Upper Cretaceous (Turonian) Bissekty Formation of Uzbekistan (Sues, Averianov, Ridgely & Witmer)
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The Magnetic Resonance - Computed Tomography (MR-CT) Jordan University Hospital (JUH) dataset was collected after receiving Institutional Review Board (IRB) approval from the hospital, and consent forms were obtained from all patients. All procedures were carried out in accordance with The Code of Ethics of the World Medical Association (Declaration of Helsinki).
The dataset consists of 2D image slices extracted using the RadiAnt DICOM viewer software. The extracted images are transformed to DICOM image data format with a resolution of 256x256 pixels. There are a total of 179 2D axial image slices referring to 20 patient volumes (90 MR and 89 CT 2D axial image slices). The dataset contains MR and CT brain tumour images with corresponding segmentation masks. The MR images of each patient were acquired on a Siemens Verio 3T scanner with a 5.00 mm slice thickness, using a T2-weighted sequence without contrast agent, 3 fat-saturation (FS) pulses, TR of 2500-4000, TE of 20-30, and 90/180 flip angles. The CT images were acquired with a Siemens Somatom scanner with a dose-length product of 2.46 mGy·cm, 130 kV voltage, 113-327 mAs tube current, a topogram acquisition protocol, 64 dual source, one projection, and a slice thickness of 7.0 mm. Smooth and sharp filters have been applied to the CT images. The MR scans have a resolution of 0.7x0.6x5 mm^3, while the CT scans have a resolution of 0.6x0.6x7 mm^3.
More information and the application of the dataset can be found in the following research paper:
Alaa Abu-Srhan; Israa Almallahi; Mohammad Abushariah; Waleed Mahafza; Omar S. Al-Kadi. Paired-Unpaired Unsupervised Attention Guided GAN with Transfer Learning for Bidirectional Brain MR-CT Synthesis. Comput. Biol. Med. 136, 2021. doi: https://doi.org/10.1016/j.compbiomed.2021.104763.
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
This dataset consists of CT and PET-CT DICOM images of lung cancer subjects with XML Annotation files that indicate tumor location with bounding boxes. The images were retrospectively acquired from patients with suspicion of lung cancer, and who underwent standard-of-care lung biopsy and PET/CT. Subjects were grouped according to a tissue histopathological diagnosis. Patients with Names/IDs containing the letter 'A' were diagnosed with Adenocarcinoma, 'B' with Small Cell Carcinoma, 'E' with Large Cell Carcinoma, and 'G' with Squamous Cell Carcinoma.
The images were analyzed using the mediastinal (window width, 350 HU; level, 40 HU) and lung (window width, 1,400 HU; level, –700 HU) window settings. The reconstructions were made with a 2 mm slice thickness in lung settings. The CT slice interval varies from 0.625 mm to 5 mm. Scanning modes include plain, contrast, and 3D reconstruction.
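For illustration, here is a minimal sketch (plain numpy, with a placeholder image) of applying the quoted window width/level settings to an image in Hounsfield units:

    import numpy as np

    def apply_window(hu, level, width):
        """Clip Hounsfield-unit values to a display window and scale to [0, 1]."""
        low, high = level - width / 2.0, level + width / 2.0
        return (np.clip(hu, low, high) - low) / (high - low)

    hu_slice = np.zeros((512, 512))                       # placeholder HU image
    mediastinum = apply_window(hu_slice, level=40, width=350)
    lung = apply_window(hu_slice, level=-700, width=1400)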
Before the examination, patients fasted for at least 6 hours, and the blood glucose of each patient was less than 11 mmol/L. Whole-body emission scans were acquired 60 minutes after the intravenous injection of 18F-FDG (4.44 MBq/kg, 0.12 mCi/kg), with patients in the supine position in the PET scanner. FDG doses and uptake times were 168.72-468.79 MBq (295.8±64.8 MBq) and 27-171 min (70.4±24.9 minutes), respectively. 18F-FDG with a radiochemical purity of 95% was provided. Patients were allowed to breathe normally during PET and CT acquisitions. Attenuation correction of PET images was performed using CT data with the hybrid segmentation method. Attenuation corrections were performed using a CT protocol (180 mAs, 120 kV, pitch 1.0). Each study comprised one CT volume, one PET volume, and fused PET and CT images: the CT resolution was 512 × 512 pixels at 1 mm × 1 mm, and the PET resolution was 200 × 200 pixels at 4.07 mm × 4.07 mm, with a slice thickness and an interslice distance of 1 mm. Both volumes were reconstructed with the same number of slices. Three-dimensional (3D) emission and transmission scans were acquired from the base of the skull to mid-femur. The PET images were reconstructed via the TrueX TOF method with a slice thickness of 1 mm.
The location of each tumor was annotated by five academic thoracic radiologists with expertise in lung cancer to make this dataset a useful tool and resource for developing algorithms for medical diagnosis. Two of the radiologists had more than 15 years of experience and the others had more than 5 years of experience. After one of the radiologists labeled each subject, the other four radiologists performed a verification, resulting in all five radiologists reviewing each annotation file in the dataset. Annotations were captured using LabelImg. The image annotations are saved as XML files in PASCAL VOC format, which can be parsed using the PASCAL Development Toolkit: https://pypi.org/project/pascal-voc-tools/. Python code to visualize the annotation boxes on top of the DICOM images can be downloaded here.
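As an illustration, here is a minimal sketch of reading one PASCAL VOC annotation file with the Python standard library; the file name is a hypothetical placeholder, and the tag layout follows the standard VOC schema written by LabelImg:

    import xml.etree.ElementTree as ET

    tree = ET.parse("A0001_0001.xml")   # hypothetical annotation file name
    for obj in tree.getroot().findall("object"):
        label = obj.findtext("name")
        box = obj.find("bndbox")
        xmin, ymin = int(box.findtext("xmin")), int(box.findtext("ymin"))
        xmax, ymax = int(box.findtext("xmax")), int(box.findtext("ymax"))
        print(label, (xmin, ymin, xmax, ymax))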
Two deep learning researchers used the images and the corresponding annotation files to train several well-known detection models, which resulted in a mean average precision (mAP) of around 0.87 on the validation set.
Dataset link: https://wiki.cancerimagingarchive.net/pages/viewpage.action?pageId=70224216
An archive of medical images of cancer accessible for public download. All images are stored in DICOM file format and organized as Collections, typically patients related by a common disease (e.g. lung cancer), image modality (MRI, CT, etc.) or research focus. Neuroimaging data sets include clinical outcomes, pathology, and genomics in addition to DICOM images. Proposals to submit data are welcomed.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The dataset contains CT scans used as input for the algorithm developed in chapters 2 and 3 of the dissertation. Using a CT scan and a treatment plan as inputs, dose and dose change computations in regions of interest can be performed. Specifically, the dataset contains:
All the scans consist of CT slices and are stored in the DICOM format. To correctly read the different CT slices, relate them to each other, and process them further, appropriate DICOM-reading software (e.g., the pydicom Python package) must be used.
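For example, here is a minimal sketch (assuming pydicom and numpy; the directory name and the .dcm extension are hypothetical placeholders) of reading one scan's slices, ordering them spatially, and stacking them into a Hounsfield-unit volume:

    from pathlib import Path
    import numpy as np
    import pydicom

    series_dir = Path("scan_01")   # hypothetical directory holding one scan's DICOM slices
    slices = [pydicom.dcmread(p) for p in series_dir.glob("*.dcm")]
    # Order the slices along the scan axis using the z component of ImagePositionPatient
    slices.sort(key=lambda ds: float(ds.ImagePositionPatient[2]))
    # Stack into a 3D volume, converting stored values to Hounsfield units
    # (assumes the rescale tags are present in the files)
    volume = np.stack([ds.pixel_array * float(ds.RescaleSlope) + float(ds.RescaleIntercept)
                       for ds in slices])
    print(volume.shape)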
This part of the data release includes raw computed tomography (CT) images of sediment cores collected in 2009 offshore of Palos Verdes, California. It is one of seven files included in this U.S. Geological Survey data release that include data from a set of sediment cores acquired from the continental slope, offshore Los Angeles and the Palos Verdes Peninsula, adjacent to the Palos Verdes Fault. Gravity cores were collected by the USGS in 2009 (cruise ID S-I2-09-SC; http://cmgds.marine.usgs.gov/fan_info.php?fan=SI209SC), and vibracores were collected with the Monterey Bay Aquarium Research Institute's remotely operated vehicle (ROV) Doc Ricketts in 2010 (cruise ID W-1-10-SC; http://cmgds.marine.usgs.gov/fan_info.php?fan=W110SC). One spreadsheet (PalosVerdesCores_Info.xlsx) contains core name, location, and length. One spreadsheet (PalosVerdesCores_MSCLdata.xlsx) contains Multi-Sensor Core Logger P-wave velocity, gamma-ray density, and magnetic susceptibility whole-core logs. One zipped folder of .bmp files (PalosVerdesCores_Photos.zip) contains continuous core photographs of the archive half of each core. One spreadsheet (PalosVerdesCores_GrainSize.xlsx) contains laser particle grain size sample information and analytical results. One spreadsheet (PalosVerdesCores_Radiocarbon.xlsx) contains radiocarbon sample information, results, and calibrated ages. One zipped folder of DICOM files (PalosVerdesCores_CT.zip) contains raw computed tomography (CT) image files. One .pdf file (PalosVerdesCores_Figures.pdf) contains combined displays of data for each core, including graphic diagram descriptive logs. This particular metadata file describes the information contained in the file PalosVerdesCores_CT.zip. All cores are archived by the U.S. Geological Survey Pacific Coastal and Marine Science Center.
https://www.cancerimagingarchive.net/data-usage-policies-and-restrictions/
The Data Integration & Imaging Informatics (DI-Cubed) project explored the issue of lack of standardized data capture at the point of data creation, as reflected in the non-image data accompanying various TCIA breast cancer collections. The work addressed the desire for semantic interoperability between various NCI initiatives by aligning on common clinical metadata elements and supporting use cases that connect clinical, imaging, and genomics data. Accordingly, clinical and measurement data was imported into I2B2 and cross-mapped to industry-standard concepts for names and values, including those derived from BRIDG, CDISC SDTM, and DICOM Structured Reporting models, using NCI Thesaurus, SNOMED CT, and LOINC controlled terminology. A subset of the standardized data was then exported from I2B2 to CSV and thence converted to DICOM SR according to the DICOM Breast Imaging Report template [1], which supports description of patient characteristics, histopathology, receptor status, and clinical findings including measurements. The purpose was not to advocate DICOM SR as an appropriate format for interchange or storage of such information for query purposes, but rather to demonstrate that use of standard concepts harmonized across multiple collections could be transformed into an existing standard report representation. The DICOM SR can be stored and used together with the images in repositories such as TCIA and in image viewers that support rendering of DICOM SR content. During the project, various deficiencies in the DICOM Breast Imaging Report template were identified with respect to describing breast MR studies, laterality of findings versus procedures, more recently developed receptor types, and patient characteristics and status. These were addressed via DICOM CP 1838, finalized in Jan 2019, and this subset reflects those changes. DICOM Breast Imaging Report Templates are available from: http://dicom.nema.org/medical/dicom/current/output/chtml/part16/sect_BreastImagingReportTemplates.html
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset contributes volumetric segmentations of the anatomic regions in a subset of CT images available from NCI Imaging Data Commons [1] (https://imaging.datacommons.cancer.gov/), automatically generated using the TotalSegmentator model v1.5.6 [2]. The initial release includes segmentations for the majority of the CT scans included in the National Lung Screening Trial (NLST) collection [3], [4] already available in IDC. Direct link to open this analysis result dataset in IDC (available after release of IDC v18): https://portal.imaging.datacommons.cancer.gov/explore/filters/?analysis_results_id=TotalSegmentator-CT-Segmentations.
Specifically, for each of the CT series analyzed, we include segmentations as generated by TotalSegmentator, converted into DICOM Segmentation object format using dcmqi v1.3.0 [5], and first order and shape features for each of the segmented regions, as produced by pyradiomics v3.0.1 [6]. Radiomics features were converted to DICOM Structured Reporting documents following template TID1500 using dcmqi. TotalSegmentator analysis on the NLST cohort was executed using Terra platform [7]. Implementation of the workflow that was used for performing the analysis is available at https://github.com/ImagingDataCommons/CloudSegmentator [8].
Due to the large size of the files, they are stored in the cloud buckets maintained by IDC, and the attached files are the manifests that can be used to download the actual files.
If you use the files referenced in the attached manifests, we ask you to cite this dataset and the preprint describing how it was generated [9].
Each of the manifests includes instructions in its header on how to download the included files.
To download the TotalSegmentator segmentations (in DICOM SEG format) and pyradiomics measurements (in DICOM SR format) using the .s5cmd manifests:
1. Install the idc-index package: pip install --upgrade idc-index
2. Pass the .s5cmd manifest file to the idc download command, e.g., idc download totalsegmentator_ct_segmentations_aws.s5cmd
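If you prefer to script the download, here is a minimal Python sketch wrapping the same two steps (assuming the manifest file sits in the working directory):

    # Minimal sketch wrapping the two CLI steps above; the manifest file is assumed
    # to be in the current working directory.
    import subprocess
    import sys

    # Install/upgrade the idc-index package that provides the "idc" command
    subprocess.run([sys.executable, "-m", "pip", "install", "--upgrade", "idc-index"], check=True)
    # Download all files listed in the manifest
    subprocess.run(["idc", "download", "totalsegmentator_ct_segmentations_aws.s5cmd"], check=True)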
Other files included in the record are:
If you have any questions about this dataset, or if you experience any issues, please reach out to Imaging Data Commons support via support@canceridc.dev or (preferred) IDC Forum at https://discourse.canceridc.dev.
https://www.cancerimagingarchive.net/data-usage-policies-and-restrictions/
The Lung Image Database Consortium image collection (LIDC-IDRI) consists of diagnostic and lung cancer screening thoracic computed tomography (CT) scans with marked-up annotated lesions. It is a web-accessible international resource for development, training, and evaluation of computer-assisted diagnostic (CAD) methods for lung cancer detection and diagnosis. Initiated by the National Cancer Institute (NCI), further advanced by the Foundation for the National Institutes of Health (FNIH), and accompanied by the Food and Drug Administration (FDA) through active participation, this public-private partnership demonstrates the success of a consortium founded on a consensus-based process.
Seven academic centers and eight medical imaging companies collaborated to create this data set which contains 1018 cases. Each subject includes images from a clinical thoracic CT scan and an associated XML file that records the results of a two-phase image annotation process performed by four experienced thoracic radiologists. In the initial blinded-read phase, each radiologist independently reviewed each CT scan and marked lesions belonging to one of three categories ("nodule > or =3 mm," "nodule <3 mm," and "non-nodule > or =3 mm"). In the subsequent unblinded-read phase, each radiologist independently reviewed their own marks along with the anonymized marks of the three other radiologists to render a final opinion. The goal of this process was to identify as completely as possible all lung nodules in each CT scan without requiring forced consensus.
Note: The TCIA team strongly encourages users to review pylidc and the standardized DICOM representation of the TCIA LIDC-IDRI annotations (DICOM-LIDC-IDRI-Nodules) for the annotations/segmentations included in this dataset before developing custom tools to analyze the XML version.
CAT scans (computerized axial tomography) of deep-sea fish: Orange Roughy; Icefish (smoothed and sharp images); Blue Grenadier; Oreo Dories (Spiky, Smooth and Warty); and Warehou. All images were recorded at the Royal Hobart Hospital between 2000 and 2007. The images are in black and white, with mm resolution, from which it is possible to extract density and size measurements. The images are stored in the Digital Imaging and Communications in Medicine (DICOM) format.
Added 5th June 2014:
Zipped datasets (CT-scan data files of Orange Roughy) have been made publicly available. To view the data, download and unzip the files, then copy all contents to a CD and run the viewer application PCVCDVW.EXE from the CDVIEWER subdirectory on the CD. Note that the viewer application only runs from a CD.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Micro-CT of the human eye. This dataset includes: 1) part of the globe; 2) the anterior chamber angle (iris and ciliary body, in the two-part archive); 3) the aqueous humour outflow system; 4) the cornea; 5) the lens; 6) the retina; 7) an epiretinal membrane.

X-ray studies were conducted on equipment of the Research and Education Center "Materials" of Don State Technical University (http://nano.donstu.ru) using an Xradia Versa 520 unit (Carl Zeiss X-ray Microscopy, Inc., USA). Data were exported to DICOM file format.

A human sample was obtained from a patient who underwent enucleation of the eye due to uveal melanoma. One half of the sample underwent conventional histopathological examination, including fixation in 4% phosphate-buffered formalin, paraffin embedding, and hematoxylin-eosin staining. Thin slices obtained from the paraffin block were subsequently imaged by transmitted light microscopy and scanned using an Aperio2 histological slide scanner (Leica, Germany). The remaining half of the anterior segment of the eye, two samples of the retina, and the crystalline lens were selected during gross examination in the pathology department and subsequently placed in 4% phosphate-buffered formalin for 48 hours.

All samples underwent iodine staining, as described by Silva, J. M. S. et al., "Three-dimensional non-destructive soft-tissue visualization with X-ray staining micro-tomography," Sci. Rep. 5, 14088; doi: 10.1038/srep14088 (2015). Initially, a specimen is dehydrated in a graded series of ethanol solutions in distilled water (50%, 70%, 80%, 90%, 96%, 100%), for 1 hour with 1 hour between each step. The graded ethanol concentration limits sample shrinkage and increases cell permeability for the contrast medium. After dehydration, samples were placed in a 1 wt% I2 solution in absolute ethanol for 14 hours. At the final step of this protocol, samples were washed in absolute ethanol, placed in plastic containers filled with absolute ethanol, and stored at 5 °C.

The study was conducted in accordance with the Declaration of Helsinki and was approved by the institutional ethics committee (Institutional Ethics Committee of the National Medical Research Center of Oncology (Rostov-on-Don, Russia), protocol number 34, approval received 02/12/2019). Written informed consent was obtained from the included patient.

For each scan, the sample was placed at the closest possible distance to the X-ray source. The 2048x2048-pixel CCD camera was maintained at −59 °C, and the acquisition was performed with a camera binning factor of 2, which resulted in projection images of up to 1024x1024 pixels. The X-ray source filters were selected based on the observed transmittance values according to the recommendations of the Xradia Versa 520 User's Guide A003030 Rev. B. The exposure time was selected to maintain count (intensity) values > 5000 with the selected source parameters and filter. The Dynamic Ring Removal (DRR) option, which enables small random motions of the sample during acquisition, was enabled for all the projections. During each tomography procedure, 10 reference (air) X-ray images were acquired at equal time intervals, and their average was applied to each projection. A half-hour to one-hour warm-up scan was performed with the same source parameters before each acquisition.
Overview
The RAD-ChestCT dataset is a large medical imaging dataset developed by Duke MD/PhD student Rachel Draelos during her Computer Science PhD supervised by Lawrence Carin. The full dataset includes 35,747 chest CT scans from 19,661 adult patients. This Zenodo repository contains an initial release of 3,630 chest CT scans, approximately 10% of the dataset. This dataset is of significant interest to the machine learning and medical imaging research communities.
Papers
The following published paper includes a description of how the RAD-ChestCT dataset was created: Draelos et al., "Machine-Learning-Based Multiple Abnormality Prediction with Large-Scale Chest Computed Tomography Volumes," Medical Image Analysis 2021. DOI: 10.1016/j.media.2020.101857 https://pubmed.ncbi.nlm.nih.gov/33129142/
Two additional papers leveraging the RAD-ChestCT dataset are available as preprints:
"Use HiResCAM instead of Grad-CAM for faithful explanations of convolutional neural networks" (https://arxiv.org/abs/2011.08891)
"Explainable multiple abnormality classification of chest CT volumes with deep learning" (https://arxiv.org/abs/2111.12215)
Details about the files included in this data release
Metadata Files (4)
CT_Scan_Metadata_Complete_35747.csv: includes metadata about the whole dataset, with information extracted from DICOM headers.
Extrema_35747.csv: includes coordinates for lung bounding boxes for the whole dataset. Coordinates were derived computationally using a morphological image processing lung segmentation pipeline.
Indications_35747.csv: includes scan indications for the whole dataset. Indications were extracted from the free-text reports.
Summary_3630.csv: includes a listing of the 3,630 scans that are part of this repository.
Label Files (3)
The label files contain abnormality x location labels for the 3,630 shared CT volumes. Each CT volume is annotated with a matrix of 84 abnormality labels x 52 location labels. Labels were extracted from the free-text reports using the Sentence Analysis for Radiology Label Extraction (SARLE) framework. For each CT scan, the label matrix has been flattened, and the abnormalities and locations are separated by an asterisk in the CSV column headers (e.g. "mass*liver"). The labels can be used as the ground truth when training computer vision classifiers on the CT volumes. Label files include:
imgtrain_Abnormality_and_Location_Labels.csv (for the training set)
imgvalid_Abnormality_and_Location_Labels.csv (for the validation set)
imgtest_Abnormality_and_Location_Labels.csv (for the test set)
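For illustration, here is a minimal sketch (assuming pandas, and assuming the first CSV column holds the scan identifier) of unflattening the asterisk-separated column headers back into an abnormality x location matrix for one scan:

    import pandas as pd

    labels = pd.read_csv("imgtrain_Abnormality_and_Location_Labels.csv", index_col=0)
    # Split flattened headers such as "mass*liver" into (abnormality, location) pairs
    labels.columns = pd.MultiIndex.from_tuples(tuple(c.split("*")) for c in labels.columns)
    first_scan = labels.iloc[0].unstack()   # abnormality x location matrix for the first scan
    print(first_scan.shape)                 # expected (84, 52) when all pairs are present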
CT Volume Files (3,630)
Each CT scan is provided as a compressed 3D numpy array (npz format). The CT scans can be read using the Python package numpy, version 1.14.5 and above.
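For example, here is a minimal sketch of loading one volume with numpy; the file name and the key under which the array is stored are assumptions, so the available keys are printed first:

    import numpy as np

    npz = np.load("example_scan.npz")   # hypothetical file name
    print(npz.files)                    # keys of the arrays stored in the archive
    volume = npz[npz.files[0]]          # take the first (typically only) stored array
    print(volume.shape, volume.dtype)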
Related Code
Code related to RAD-ChestCT is publicly available on GitHub at https://github.com/rachellea.
Repositories of interest include:
https://github.com/rachellea/ct-net-models contains PyTorch code to load the RAD-ChestCT dataset and train convolutional neural network models for multiple abnormality prediction from whole CT volumes.
https://github.com/rachellea/ct-volume-preprocessing contains an end-to-end Python framework to convert CT scans from DICOM to numpy format. This code was used to prepare the RAD-ChestCT volumes.
https://github.com/rachellea/sarle-labeler contains the Python implementation of the SARLE label extraction framework used to generate the abnormality and location label matrix from the free text reports. SARLE has minimal dependencies and the abnormality and location vocabulary terms can be easily modified to adapt SARLE to different radiologic modalities, abnormalities, and anatomical locations.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The datasets contain CT scans of COVID-19 patients from the Faculty Hospital Královské Vinohrady, in DICOM (and TIFF) format, used in the paper "Estimation of Covid-19 lungs damage based on computer tomography images analysis", which presents the tool and is available on F1000Research (DOI: 10.12688/f1000research.109020.1). The tool used for the analysis of the dataset is published on Zenodo (10.5281/zenodo.5805990). Data were anonymized before exporting. Each patient has a folder with a unique ID; subfolders contain a TIFF image for each CT slice, and DICOM files are added whenever possible. All files contain the ID and data format in their name. A CT data overview for the whole dataset is provided in CSV.
Contributions: Martin SCHÄTZ: dataset preparation and curation. Olga RUBEŠOVÁ: data selection and cleaning. David GIRSA: data measuring and selection. Katarína NAĎOVA: data measuring and selection.
The work was funded by the Ministry of Education, Youth and Sports through the grant "Development of Advanced Computational Algorithms for Evaluating Post-surgery Rehabilitation", number LTAIN19007. The work was also supported by the Specific University Research grant No. FCHI 2022-001.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Micro-CT scans of cartonnage fragment E.133.1891 provided in two zip files.
https://www.cancerimagingarchive.net/data-usage-policies-and-restrictions/
This study describes a subset of the HNSCC collection on TCIA.