https://www.cancerimagingarchive.net/data-usage-policies-and-restrictions/
The Cancer Genome Atlas Lung Adenocarcinoma (TCGA-LUAD) data collection is part of a larger effort to build a research community focused on connecting cancer phenotypes to genotypes by providing clinical images matched to subjects from The Cancer Genome Atlas (TCGA). Clinical, genetic, and pathological data resides in the Genomic Data Commons (GDC) Data Portal while the radiological data is stored on The Cancer Imaging Archive (TCIA).
Matched TCGA patient identifiers allow researchers to explore the TCGA/TCIA databases for correlations between tissue genotype, radiological phenotype and patient outcomes. Tissues for TCGA were collected from many sites all over the world in order to reach their accrual targets, usually around 500 specimens per cancer type. For this reason the image data sets are also extremely heterogeneous in terms of scanner modalities, manufacturers and acquisition protocols. In most cases the images were acquired as part of routine care and not as part of a controlled research study or clinical trial.
Imaging Source Site (ISS) Groups are being populated and governed by participants from institutions that have provided imaging data to the archive for a given cancer type. Modeled after TCGA analysis groups, ISS groups are given the opportunity to publish a marker paper for a given cancer type per the guidelines in the table above. This opportunity will generate increased participation in building these multi-institutional data sets as they become an open community resource. Learn more about the TCGA Lung Phenotype Research Group.
https://www.cancerimagingarchive.net/data-usage-policies-and-restrictions/
The Lung Image Database Consortium image collection (LIDC-IDRI) consists of diagnostic and lung cancer screening thoracic computed tomography (CT) scans with marked-up annotated lesions. It is a web-accessible international resource for development, training, and evaluation of computer-assisted diagnostic (CAD) methods for lung cancer detection and diagnosis. Initiated by the National Cancer Institute (NCI), further advanced by the Foundation for the National Institutes of Health (FNIH), and accompanied by the Food and Drug Administration (FDA) through active participation, this public-private partnership demonstrates the success of a consortium founded on a consensus-based process.
Seven academic centers and eight medical imaging companies collaborated to create this data set which contains 1018 cases. Each subject includes images from a clinical thoracic CT scan and an associated XML file that records the results of a two-phase image annotation process performed by four experienced thoracic radiologists. In the initial blinded-read phase, each radiologist independently reviewed each CT scan and marked lesions belonging to one of three categories ("nodule > or =3 mm," "nodule <3 mm," and "non-nodule > or =3 mm"). In the subsequent unblinded-read phase, each radiologist independently reviewed their own marks along with the anonymized marks of the three other radiologists to render a final opinion. The goal of this process was to identify as completely as possible all lung nodules in each CT scan without requiring forced consensus.
Note: The TCIA team strongly encourages users to review pylidc and the Standardized representation of the TCIA LIDC-IDRI annotations using DICOM (DICOM-LIDC-IDRI-Nodules), which covers the annotations/segmentations included in this dataset, before developing custom tools to analyze the XML version.
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
This dataset consists of CT and PET-CT DICOM images of lung cancer subjects with XML Annotation files that indicate tumor location with bounding boxes. The images were retrospectively acquired from patients with suspicion of lung cancer, and who underwent standard-of-care lung biopsy and PET/CT. Subjects were grouped according to a tissue histopathological diagnosis. Patients with Names/IDs containing the letter 'A' were diagnosed with Adenocarcinoma, 'B' with Small Cell Carcinoma, 'E' with Large Cell Carcinoma, and 'G' with Squamous Cell Carcinoma.
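The letter-to-diagnosis grouping above can be sketched as a small lookup. This is only an illustration: the ID format (a group letter directly followed by digits at the end of the ID, as in the made-up `Lung_Dx-A0001`) and the function name are assumptions, not something the dataset description specifies.

```python
import re

# Letter-to-diagnosis mapping as stated in the dataset description.
HISTOLOGY_BY_LETTER = {
    "A": "Adenocarcinoma",
    "B": "Small Cell Carcinoma",
    "E": "Large Cell Carcinoma",
    "G": "Squamous Cell Carcinoma",
}

# Assumption: the group letter directly precedes the trailing digits of the ID,
# as in the hypothetical ID "Lung_Dx-A0001".
_ID_PATTERN = re.compile(r"([ABEG])\d+$")


def histology_from_id(patient_id):
    """Return the diagnosis implied by the ID's group letter, or None if no match."""
    match = _ID_PATTERN.search(patient_id)
    if match is None:
        return None
    return HISTOLOGY_BY_LETTER[match.group(1)]
```

Matching only the letter immediately before the trailing digits avoids false hits on letters elsewhere in the ID; adjust the pattern if the real IDs are shaped differently.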
The images were analyzed using the mediastinum (window width, 350 HU; level, 40 HU) and lung (window width, 1,400 HU; level, –700 HU) settings. Reconstructions were made with a 2 mm slice thickness in lung settings. The CT slice interval varies from 0.625 mm to 5 mm. Scanning modes include plain, contrast, and 3D reconstruction.
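As a rough illustration of what these window settings mean, a width/level pair maps to a displayed HU range of level ± width/2. The sketch below uses hypothetical helper names and an assumed 8-bit output range; it is not taken from the dataset's own tooling.

```python
def window_bounds(width, level):
    """Return the (lowest, highest) HU values shown by a width/level window."""
    half = width / 2.0
    return level - half, level + half


def apply_window(hu, width, level):
    """Clip one HU value to the window and rescale it to 0..255 (assumed 8-bit display)."""
    low, high = window_bounds(width, level)
    clipped = min(max(hu, low), high)
    return round(255 * (clipped - low) / (high - low))


# The two settings quoted in the text.
MEDIASTINUM = window_bounds(350, 40)    # (-135.0, 215.0)
LUNG = window_bounds(1400, -700)        # (-1400.0, 0.0)
```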
Before the examination, each patient fasted for at least 6 hours, and each patient's blood glucose was below 11 mmol/L. Whole-body emission scans were acquired 60 minutes after the intravenous injection of 18F-FDG (4.44 MBq/kg, 0.12 mCi/kg), with patients in the supine position in the PET scanner. FDG doses and uptake times were 168.72-468.79 MBq (295.8±64.8 MBq) and 27-171 min (70.4±24.9 min), respectively. 18F-FDG with a radiochemical purity of 95% was provided. Patients were allowed to breathe normally during PET and CT acquisitions. Attenuation correction of PET images was performed using CT data with the hybrid segmentation method, with a CT protocol of 180 mAs, 120 kV, and a pitch of 1.0. Each study comprised one CT volume, one PET volume, and fused PET and CT images: the CT resolution was 512 × 512 pixels at 1 mm × 1 mm, and the PET resolution was 200 × 200 pixels at 4.07 mm × 4.07 mm, with a slice thickness and an interslice distance of 1 mm. Both volumes were reconstructed with the same number of slices. Three-dimensional (3D) emission and transmission scans were acquired from the base of the skull to mid femur. The PET images were reconstructed via the TrueX TOF method with a slice thickness of 1 mm.
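The two per-kilogram dose figures quoted above are consistent with the standard conversion 1 mCi = 37 MBq. A short worked check follows; the 70 kg body weight is a made-up example value, not a figure from the dataset.

```python
MBQ_PER_MCI = 37.0          # standard conversion: 1 mCi = 37 MBq
DOSE_MBQ_PER_KG = 4.44      # weight-based dose quoted in the text


def injected_activity_mbq(weight_kg):
    """Weight-based 18F-FDG activity in MBq for a given body weight."""
    return DOSE_MBQ_PER_KG * weight_kg


def mbq_to_mci(activity_mbq):
    """Convert an activity from MBq to mCi."""
    return activity_mbq / MBQ_PER_MCI


# 4.44 MBq/kg expressed in mCi/kg; this matches the 0.12 mCi/kg in the text.
dose_mci_per_kg = mbq_to_mci(DOSE_MBQ_PER_KG)

# Hypothetical 70 kg patient: about 310.8 MBq, inside the reported 168.72-468.79 MBq range.
example_activity = injected_activity_mbq(70)
```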
The location of each tumor was annotated by five academic thoracic radiologists with expertise in lung cancer to make this dataset a useful tool and resource for developing algorithms for medical diagnosis. Two of the radiologists had more than 15 years of experience and the others had more than 5 years of experience. After one of the radiologists labeled each subject, the other four radiologists performed a verification, so that all five radiologists reviewed each annotation file in the dataset. Annotations were captured using LabelImg. The image annotations are saved as XML files in PASCAL VOC format, which can be parsed using the PASCAL Development Toolkit: https://pypi.org/project/pascal-voc-tools/. Python code to visualize the annotation boxes on top of the DICOM images can be downloaded here.
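PASCAL VOC annotation files can also be read with the Python standard library alone. The sketch below assumes the usual VOC layout (`object` elements containing `name` and `bndbox` with `xmin`, `ymin`, `xmax`, `ymax`); the sample XML and its label are invented for illustration and are not copied from the dataset.

```python
import xml.etree.ElementTree as ET

# Hypothetical annotation in the usual PASCAL VOC layout (not taken from the dataset).
SAMPLE_XML = """
<annotation>
  <filename>example.dcm</filename>
  <object>
    <name>A</name>
    <bndbox><xmin>120</xmin><ymin>200</ymin><xmax>180</xmax><ymax>260</ymax></bndbox>
  </object>
</annotation>
"""


def parse_voc_boxes(xml_text):
    """Return one dict per annotated object with its label and box corners."""
    root = ET.fromstring(xml_text)
    boxes = []
    for obj in root.iter("object"):
        bnd = obj.find("bndbox")
        boxes.append({
            "label": obj.findtext("name"),
            # Some tools write coordinates as floats, so convert via float first.
            "xmin": int(float(bnd.findtext("xmin"))),
            "ymin": int(float(bnd.findtext("ymin"))),
            "xmax": int(float(bnd.findtext("xmax"))),
            "ymax": int(float(bnd.findtext("ymax"))),
        })
    return boxes


boxes = parse_voc_boxes(SAMPLE_XML)
```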
Two deep learning researchers used the images and the corresponding annotation files to train several well-known detection models, which resulted in a mean average precision (mAP) of around 0.87 on the validation set.
Dataset link: https://wiki.cancerimagingarchive.net/pages/viewpage.action?pageId=70224216
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
Please cite our data paper published in "Data in Brief": https://www.sciencedirect.com/science/article/pii/S2352340923007473
Background
Liver cancer ranks as the third leading cause of cancer-related mortality worldwide [1] and alarmingly, both the incidence and mortality rates of liver cancer are increasing [2; 3]. Among the various types of primary liver cancer, hepatocellular carcinoma (HCC) stands out as the most prevalent, accounting for approximately 70-85% of liver cancer cases [4]. Leveraging the advantages of magnetic resonance (MR) imaging, HCC can be reliably detected and diagnosed without the requirement of an invasive biopsy [5]. MR imaging offers high tissue contrast, which can be further enhanced through contrast-enhanced multiphasic magnetic resonance imaging (mpMRI) techniques. This enables accurate identification and non-invasive diagnosis of HCC [6].
Objective
Precise segmentation of the liver plays a crucial role in volumetry assessment and serves as a vital pre-processing step for subsequent tumor detection algorithms [7]. However, accurate liver segmentation can be particularly challenging in patients with cancer-related tissue alterations and deformations in shape [8]. Accurate HCC tumor segmentation is essential for the extraction of quantitative imaging biomarkers such as radiomics; it can be used for studies on treatment response assessment and prognosis evaluation, and it provides critical information about the tumor biology. To enhance the reproducibility of liver and tumor segmentation, automated methods utilizing image analysis techniques and machine learning have been developed. These methods have demonstrated promising results [7; 8]; however, most algorithms were tested only on small internal test sets and therefore do not guarantee generalizable and consistent performance on external data. Publicly available datasets allow for fair and objective comparisons between different algorithms, techniques, or approaches. Researchers can evaluate the strengths and weaknesses of their methods in relation to existing solutions and establish benchmarks for performance evaluation. In addition to providing a benchmark with this dataset, we also assess the inter-rater variability between two different sets of tumor segmentations. This analysis serves as a measure of reproducibility for human segmentations, highlighting the consistency or variability that may exist among different human raters. Understanding the reproducibility of human segmentations is essential in assessing the reliability of manual annotations and establishing a baseline for algorithm performance comparison. By introducing LiverHccSeg, we aim to fill the gap left by the lack of publicly available mpMRI HCC datasets and offer researchers and developers a valuable resource for algorithmic evaluation on external data and imaging biomarker analyses.
Materials and Methods
Inclusion of Patients
All available scans from The Cancer Genome Atlas Liver Hepatocellular Carcinoma Collection (TCGA-LIHC) (https://wiki.cancerimagingarchive.net/pages/viewpage.action?pageId=6885436) were downloaded [9]. One multiphasic MRI scan (pre and triphasic post contrast) per patient was included. Patients who did not exhibit a tumor or residual tumor were excluded from the tumor segmentation dataset; however, they were included in the liver segmentation dataset.
MR Imaging Data
Subsequently, all imaging data was converted to the Neuroimaging Informatics Technology Initiative (NIfTI) format with the dcm2nii (v2.1.53) package [10] and available header information was extracted using the pydicom (v.2.1.2) package [11]. Multiparametric MR sequences were labeled with a consistent syntax ('pre', 'art', 'pv', 'del', for the pre-contrast, arterial, portal-venous and delayed contrast phases, respectively). All images were already de-identified by the TCIA website. Images were acquired between the years 1993 and 2007 on Philips and Siemens scanners with field strengths of 1.5 and 3 Tesla. Full details of the imaging parameters can be found in Table 5. Briefly, the median repetition time (TR) and median echo time (TE) were 365.8 ms and 26.4 ms, respectively. The median slice thickness was 9.5 mm, the median bandwidth 536.9 Hz.
Scientific Reading
After conversion, all images were read in a scientific reading by two board-certified abdominal radiologists (S.A. and S.H., with 9 and 10 years of experience, respectively). Any disagreement between the two raters was discussed in a consensus meeting. All HCC lesions were classified according to LI-RADS criteria [6].
Image Registration
The co-registration of pre-contrast, portal-venous, and delayed-phase images with arterial phase images was performed using the software BioImage Suite (v3.5) [12]. A non-rigid intensity-based registration approach was applied, employing a parameterized free-form deformation (FFD) with 3D B-splines [13]. The optimal FFD transformation was estimated by maximizing the normalized mutual information similarity metric [14] through gradient descent optimization. To enhance the optimization process, a multi-resolution image pyramid with three levels was utilized. The final B-spline control point spacing was set to 80 mm. The estimated transformation was then employed to warp the moving images (pre-contrast, portal-venous, and delayed-phase) into the reference image space, specifically the arterial phase image.
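To make the similarity metric concrete, the sketch below computes one commonly used definition of normalized mutual information, NMI = (H(A) + H(B)) / H(A, B), from two equally sized sequences of intensity bins. The source does not print the exact formula or binning it used, so both the definition and the assumption of pre-binned intensities are illustrative only; this is not BioImage Suite's implementation.

```python
import math
from collections import Counter


def _entropy(counts, n):
    """Shannon entropy (natural log) of a histogram with n samples in total."""
    return -sum((c / n) * math.log(c / n) for c in counts.values())


def normalized_mutual_information(image_a, image_b):
    """NMI = (H(A) + H(B)) / H(A, B) for two flat sequences of intensity bins."""
    if len(image_a) != len(image_b):
        raise ValueError("images must contain the same number of voxels")
    n = len(image_a)
    h_a = _entropy(Counter(image_a), n)
    h_b = _entropy(Counter(image_b), n)
    h_ab = _entropy(Counter(zip(image_a, image_b)), n)
    if h_ab == 0.0:
        raise ValueError("NMI is undefined when the joint entropy is zero")
    return (h_a + h_b) / h_ab
```

Under this definition the value is 2 when one image fully determines the other and 1 when the two are statistically independent, which is why a registration optimizer would seek to maximize it.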
Liver and Tumor Segmentation and Statistical Analysis
All livers and tumors were manually segmented under the supervision of two board-certified abdominal radiologists using the software 3D Slicer (v4.10.2) [15]. To compare the segmentation agreement between the two sets of liver and tumor segmentations, we calculated segmentation metrics using the Python package seg-metrics (v1.0.0) [16]. All segmentation metrics and statistics were calculated in Python (v3.7).
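The text does not list which agreement metrics were computed. As one widely used example of an overlap metric, the Dice coefficient between two binary masks can be sketched in plain Python as follows; this is an illustration, not the seg-metrics package's API.

```python
def dice_coefficient(mask_a, mask_b):
    """Dice overlap 2*|A and B| / (|A| + |B|) for two flat binary masks."""
    a = [bool(v) for v in mask_a]
    b = [bool(v) for v in mask_b]
    if len(a) != len(b):
        raise ValueError("masks must contain the same number of voxels")
    intersection = sum(1 for x, y in zip(a, b) if x and y)
    total = sum(a) + sum(b)
    if total == 0:
        # Convention chosen here: two empty masks count as perfect agreement.
        return 1.0
    return 2.0 * intersection / total
```

For 3D volumes, flatten each rater's mask to a single sequence of voxels before calling the function.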
Data description
The data in this article include:
dicoms.zip: This zip file contains all the raw MR images from The Cancer Genome Atlas Liver Hepatocellular Carcinoma Collection (TCGA-LIHC) [1] in the Digital Imaging and Communications in Medicine (DICOM) format used for the curation of this dataset. The data is structured as Patient-ID/DATE/SEQUENCE where Patient-ID is the unique de-identified patient ID, DATE is the date of the image acquisition, and SEQUENCE is the name of the MR sequence.
LiverHccSeg_MetaData.xlsx: This spreadsheet contains all the metadata from the DICOM headers along with the data from the scientific image readings.
nifti_and_segms.zip: This zip file contains all MR images along with the liver and tumor segmentations in the Neuroimaging Informatics Technology Initiative (NIfTI) format. The data is structured as Patient-ID/DATE/SEQUENCE where Patient-ID is the unique anonymized patient identifier, DATE is the date of the image acquisition, and SEQUENCE is the name of the MRI sequence or segmentation image.
The NIfTI files are named as follows:
pre.nii.gz: Pre-contrast T1-weighted MRI
art.nii.gz: Arterial-phase T1-weighted MRI
pv.nii.gz: Portal-venous-phase T1-weighted MRI
del.nii.gz: Delayed-phase T1-weighted MRI
art_pre.nii.gz: Pre-contrast T1-weighted MRI registered to the corresponding arterial-phase T1-weighted MRI
art_pv.nii.gz: Portal-venous-phase T1-weighted MRI registered to the corresponding arterial-phase T1-weighted MRI
art_del.nii.gz: Delayed-phase T1-weighted MRI registered to the corresponding arterial-phase T1-weighted MRI
The corresponding manual segmentations are named after the rater and the type of segmentation and follow the format 'RATER_ROI.nii.gz', where RATER denotes the human rater and ROI denotes the region of interest that was segmented, for example, 'rater1_liver.nii.gz', 'rater2_liver.nii.gz', 'rater1_tumor1.nii.gz', and 'rater2_tumor1.nii.gz'. For tumor segmentations, an integer indicates the tumor identification number for different tumor ROIs. The segmentations can be used for the arterial phase NIfTI file as well as the corresponding co-registered pre-contrast (art_pre.nii.gz), portal-venous (art_pv.nii.gz), and delayed-phase (art_del.nii.gz) images.
segm_metrics.xlsx: This spreadsheet summarizes the segmentation agreement between the two sets of liver and tumor segmentations by the two board-certified abdominal radiologists.
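The naming scheme described above can be turned into paths with a few lines of `pathlib`. This sketch assumes the files sit directly under `Patient-ID/DATE/`; the root folder, patient ID, and date used in the example are placeholders, and the helper names are hypothetical.

```python
from pathlib import Path

# Sequence names listed in the data description.
PHASES = ("pre", "art", "pv", "del")
REGISTERED = ("art_pre", "art_pv", "art_del")


def image_paths(root, patient_id, date):
    """Map each sequence name to its expected NIfTI path under Patient-ID/DATE/."""
    base = Path(root) / patient_id / date
    return {name: base / f"{name}.nii.gz" for name in PHASES + REGISTERED}


def segmentation_path(root, patient_id, date, rater, roi, tumor_index=None):
    """Build 'raterN_liver.nii.gz' or 'raterN_tumorK.nii.gz' following RATER_ROI.nii.gz."""
    if roi == "tumor":
        if tumor_index is None:
            raise ValueError("tumor segmentations need a tumor identification number")
        roi_name = f"tumor{tumor_index}"
    else:
        roi_name = roi
    return Path(root) / patient_id / date / f"rater{rater}_{roi_name}.nii.gz"


# Placeholder root, patient ID, and date, used only to show the resulting paths.
example = image_paths("nifti_and_segms", "PATIENT-ID", "DATE")
```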
References
1 Sung H, Ferlay J, Siegel RL et al (2021) Global Cancer Statistics 2020: GLOBOCAN Estimates of Incidence and Mortality Worldwide for 36 Cancers in 185 Countries. CA Cancer J Clin 71:209-249
2 Siegel RL, Miller KD, Jemal A (2019) Cancer statistics, 2019. CA Cancer J Clin 69:7-34
3 White DL, Thrift AP, Kanwal F, Davila J, El-Serag HB (2017) Incidence of Hepatocellular Carcinoma in All 50 United States, From 2000 Through 2012. Gastroenterology 152:812-820.e815
4 Perz JF, Armstrong GL, Farrington LA, Hutin YJ, Bell BP (2006) The contributions of hepatitis B virus and hepatitis C virus infections to cirrhosis and primary liver cancer worldwide. J Hepatol 45:529-538
5 Hamer OW, Schlottmann K, Sirlin CB, Feuerbach S (2007) Technology insight: advances in liver imaging. Nat Clin Pract Gastroenterol Hepatol 4:215-228
6 Chernyak V, Fowler KJ, Kamaya A et al (2018) Liver Imaging Reporting and Data System (LI-RADS) Version 2018: Imaging of Hepatocellular Carcinoma in At-Risk Patients. Radiology 289:816-830
7 Bousabarah K, Letzen B, Tefera J et al (2020) Automated detection and delineation of hepatocellular carcinoma on multiphasic contrast-enhanced MRI using deep learning. Abdom Radiol. 10.1007/s00261-020-02604-5
8 Gross M, Spektor M, Jaffe A et al (2021) Improved performance and consistency of deep learning 3D liver segmentation with heterogeneous cancer stages in magnetic resonance imaging. PLoS One 16:e0260630
9 Erickson BJ, Kirk S, Lee Y et al (2016) Radiology Data from The Cancer Genome Atlas
https://www.cancerimagingarchive.net/data-usage-policies-and-restrictions/
This collection contains subjects from the National Cancer Institute’s Clinical Proteomic Tumor Analysis Consortium CPTAC Ovarian Serous Cystadenocarcinoma cohort. CPTAC is a national effort to accelerate the understanding of the molecular basis of cancer through the application of large-scale proteome and genome analysis, or proteogenomics. Radiology and pathology images from CPTAC patients are being collected and made publicly available by The Cancer Imaging Archive to enable researchers to investigate cancer phenotypes which may correlate to corresponding proteomic, genomic and clinical data.
Imaging from each cancer type will be contained in its own TCIA Collection, with the collection name "CPTAC-cancertype". Radiology imaging is collected from standard of care imaging performed on patients immediately before the pathological diagnosis, and from follow-up scans where available. For this reason the radiology image data sets are heterogeneous in terms of scanner modalities, manufacturers and acquisition protocols. Pathology imaging is collected as part of the CPTAC qualification workflow.
All CPTAC cohorts are released as either a single combined cohort, or split into Discovery and Confirmatory where applicable. There are two main types of proteomic studies: discovery proteomics and targeted proteomics. The term "discovery proteomics" is in reference to "untargeted" identification and quantification of a maximal number of proteins in a biological or clinical sample. The term “targeted proteomics” refers to quantitative measurements on a defined subset of total proteins in a biological or clinical sample, often following the completion of discovery proteomics studies to confirm interesting targets selected. Commonly used proteomic technologies and platforms are different types of mass spectrometry and protein microarrays depending on the needs, throughput and sample input requirement of an analysis, with further development on nanotechnologies and automation in the pipeline in order to improve the detection of low abundance proteins, increase throughput, and selectively reach a target protein in vivo. Once the protein targets of interest are identified, high-throughput targeted assays are developed for confirmatory studies: tests to affirm that the initial tests were accurate. A summary of CPTAC imaging efforts can be found on the CPTAC Imaging Proteomics page.
You can join the CPTAC Imaging Special Interest Group to be notified of webinars & data releases, collaborate on common data wrangling tasks and seek out partners to explore research hypotheses! Artifacts from previous webinars such as slide decks and video recordings can be found on the CPTAC SIG Webinars page.
https://creativecommons.org/publicdomain/zero/1.0/
This dataset contains 9000 patches obtained from a few of the whole slide image files publicly available in "A dataset of histopathological whole slide images for classification of Treatment effectiveness to ovarian cancer" (Ovarian Bevacizumab Response). No labels are associated with the data, but the patches can be used in unsupervised tasks. Each patch is 1024 by 1024 pixels and was created from the highest magnification available in the dataset (20X).
https://www.cancerimagingarchive.net/data-usage-policies-and-restrictions/
This collection contains subjects from the National Cancer Institute’s Clinical Proteomic Tumor Analysis Consortium Uterine Corpus Endometrial Carcinoma (CPTAC-UCEC) cohort. CPTAC is a national effort to accelerate the understanding of the molecular basis of cancer through the application of large-scale proteome and genome analysis, or proteogenomics. Radiology and pathology images from CPTAC patients are being collected and made publicly available by The Cancer Imaging Archive to enable researchers to investigate cancer phenotypes which may correlate to corresponding proteomic, genomic and clinical data.
Imaging from each cancer type will be contained in its own TCIA Collection, with the collection name "CPTAC-cancertype". Radiology imaging is collected from standard of care imaging performed on patients immediately before the pathological diagnosis, and from follow-up scans where available. For this reason the radiology image data sets are heterogeneous in terms of scanner modalities, manufacturers and acquisition protocols. Pathology imaging is collected as part of the CPTAC qualification workflow.
All CPTAC cohorts are released as either a single combined cohort, or split into Discovery and Confirmatory where applicable. There are two main types of proteomic studies: discovery proteomics and targeted proteomics. The term "discovery proteomics" is in reference to "untargeted" identification and quantification of a maximal number of proteins in a biological or clinical sample. The term “targeted proteomics” refers to quantitative measurements on a defined subset of total proteins in a biological or clinical sample, often following the completion of discovery proteomics studies to confirm interesting targets selected. Commonly used proteomic technologies and platforms are different types of mass spectrometry and protein microarrays depending on the needs, throughput and sample input requirement of an analysis, with further development on nanotechnologies and automation in the pipeline in order to improve the detection of low abundance proteins, increase throughput, and selectively reach a target protein in vivo. Once the protein targets of interest are identified, high-throughput targeted assays are developed for confirmatory studies: tests to affirm that the initial tests were accurate. A summary of CPTAC imaging efforts can be found on the CPTAC Imaging Proteomics page.
You can join the CPTAC Imaging Special Interest Group to be notified of webinars & data releases, collaborate on common data wrangling tasks and seek out partners to explore research hypotheses! Artifacts from previous webinars such as slide decks and video recordings can be found on the CPTAC SIG Webinars page.
On January 14, 2020 Emily Kawaler presented the consortium's proteogenomic analyses of the CPTAC-UCEC. This deep dive into the UCEC genomic and proteomic datasets will help researchers better understand how they can be correlated with features derived from the imaging data. (Download the slides)
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
A lower limb bioheat COMSOL Multiphysics® (Massachusetts, USA) model. The model is based on a computed tomography dataset acquired from The Cancer Imaging Archive (Subject ID TCGA-CV-A6JU) [1,2,3].
This COMSOL model simulates peripheral thermal behavior using the Pennes bioheat equation and accounts for blood flow in the main arterial structure.
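For reference, the Pennes bioheat equation is commonly written in the form below. The symbols are the usual textbook ones rather than names taken from the COMSOL model: \rho, c, and k are the tissue density, specific heat, and thermal conductivity; \rho_b, c_b, and \omega_b are the blood density, blood specific heat, and perfusion rate; T_a is the arterial blood temperature; and Q_m is the metabolic heat generation.

```latex
\rho c \frac{\partial T}{\partial t}
  = \nabla \cdot \left( k \nabla T \right)
  + \rho_b c_b \omega_b \left( T_a - T \right)
  + Q_m
```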
[1] Zuley, M. L., Jarosz, R., Kirk, S., Lee, Y., Colen, R., Garcia, K., … Aredes, N. D. (2016). Radiology Data from The Cancer Genome Atlas Head-Neck Squamous Cell Carcinoma [TCGA-HNSC] collection. The Cancer Imaging Archive. http://doi.org/10.7937/K9/TCIA.2016.LXKQ47MS
[2] Clark K, Vendt B, Smith K, Freymann J, Kirby J, Koppel P, Moore S, Phillips S, Maffitt D, Pringle M, Tarbox L, Prior F. The Cancer Imaging Archive (TCIA): Maintaining and Operating a Public Information Repository, Journal of Digital Imaging, Volume 26, Number 6, December, 2013, pp 1045-1057.
https://www.cancerimagingarchive.net/data-usage-policies-and-restrictions/
At the time of our study, 108 cases with breast MRI data were available in The Cancer Genome Atlas Breast Invasive Carcinoma (TCGA-BRCA) collection. To minimize variations in image quality across the multi-institutional cases, we included only breast MRI studies acquired on GE 1.5 Tesla scanners (GE Medical Systems, Milwaukee, Wisconsin, USA), yielding a total of 93 cases. We then excluded cases that had missing images in the dynamic sequence (1 patient) or that did not, at the time, have gene expression analysis available in the TCGA Data Portal (8 patients). Applying these criteria resulted in a dataset of 84 breast cancer patients, with MRIs from four institutions: Memorial Sloan Kettering Cancer Center, the Mayo Clinic, the University of Pittsburgh Medical Center, and the Roswell Park Cancer Institute. The resulting cases contributed by each institution were 9 (date range 1999-2002), 5 (1999-2003), 46 (1999-2004), and 24 (1999-2002), respectively. The dataset of biopsy-proven invasive breast cancers included 74 (88%) ductal, 8 (10%) lobular, and 2 (2%) mixed. Of these, 73 (87%) were ER+, 67 (80%) were PR+, and 19 (23%) were HER2+. Various types of analyses were conducted using the combined imaging, genomic, and clinical data. Those analyses are described within several manuscripts created by the group (cited below). Additional information about the methodology behind the Radiologist Annotations file can be found on the TCGA Breast Image Feature Scoring Project page.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
These files contain the BRATS2013 brain tumour data and belong to the International BRATS 2013 Challenge in Image Segmentation from the MICCAI Conference of 2013.
Public data can be found at https://www.virtualskeleton.ch/BRATS/Start2013
More information can be found at https://wiki.cancerimagingarchive.net/display/Public/NCI-MICCAI+2013+Grand+Challenges+in+Image+Segmentation
The BRATS2013_CHALLENGE.zip file contains the 10 test cases released for the evaluation of the methods that participated in the Challenge.
The BRATS_Leaderboard.zip file contains the Leaderboard set used to perform an initial ranking of the best methods before the final test set evaluation.
This dataset contains the ONCOhabitats processing results for the patients with complete pre-surgical MRI (T1, T1-Gd, T2, FLAIR and DSC perfusion) included in the Ivy Glioblastoma Atlas Project (Ivy GAP) dataset.
The ONCOhabitats platform includes two main services:
1. The glioblastoma (GBM) segmentation service implements the MRI preprocessing and GBM morphological segmentation modules.
2. The GBM Hemodynamic Tissue Signature service implements the MRI preprocessing, GBM morphological segmentation, DSC quantification and the Hemodynamic Tissue Signature modules.
For each patient, we include a PDF report containing an analysis summary; two folders with the resulting images in MNI and native spaces; and a third folder with the transformation matrices.
Users of these results should cite the following references:
1. Juan-Albarracín, J., Fuster-Garcia, E., Pérez-Girbés, A., Aparici-Robles, F., Alberich-Bayarri, Á., Revert-Ventura, A., ... & García-Gómez, J. M. (2018). Glioblastoma: vascular habitats detected at preoperative dynamic susceptibility-weighted contrast-enhanced perfusion MR imaging predict survival. Radiology, 287(3), 944-954.
2. Álvarez‐Torres, M., Juan‐Albarracín, J., Fuster‐Garcia, E., Bellvís‐Bataller, F., Lorente, D., Reynés, G., ... & García‐Gómez, J. M. (2020). Robust association between vascular habitats and patient prognosis in glioblastoma: An international multicenter study. Journal of Magnetic Resonance Imaging, 51(5), 1478-1486.
The original data was presented in:
Shah, N., Feng, X., Lankerovich, M., Puchalski, R. B., & Keogh, B. (2016). Data from Ivy Glioblastoma Atlas Project (IvyGAP) [Data set]. The Cancer Imaging Archive. https://doi.org/10.7937/K9/TCIA.2016.XLwaN6nL
Puchalski RB, Shah N, Miller J, Dalley R, Nomura SR, Yoon J-G, Smith KA, Lankerovich M, Bertagnolli D, Bickley K, Boe AF, Brouner K, Butler S, Caldejon S, Chapin M, Datta S, Dee N, Desta T, Dolbeare T, Dotson N, Ebbert A, Feng D, Feng X, Fisher M, Gee G, Goldy J, Gourley L, Gregor BW, Gu G, Hejazinia N, Hohmann J, Hothi P, Howard R, Joines K, Kriedberg A, Kuan L, Lau C, Lee F, Lee H, Lemon T, Long F, Mastan N, Mott E, Murthy C, Ngo K, Olson E, Reding M, Riley Z, Rosen D, Sandman D, Shapovalova N, Slaughterbeck CR, Sodt A, Stockdale G, Szafer A, Wakeman W, Wohnoutka PE, White SJ, Marsh D, Rostomily RC, Ng L, Dang C, Jones A, Keogh B, Gittleman HR, Barnholtz-Sloan JS, Cimino PJ, Uppin MS, Keene CD, Farrokhi FR, Lathia JD, Berens ME, Iavarone A, Bernard A, Lein E, Phillips JW, Rostad SW, Cobbs C, Hawrylycz MJ, Foltz GD. (2018). An anatomic transcriptional atlas of human glioblastoma. Science, 360(6389), 660–663. https://doi.org/10.1126/science.aaf2666
Clark, K., Vendt, B., Smith, K., Freymann, J., Kirby, J., Koppel, P., Moore, S., Phillips, S., Maffitt, D., Pringle, M., Tarbox, L., & Prior, F. (2013). The Cancer Imaging Archive (TCIA): Maintaining and Operating a Public Information Repository. Journal of Digital Imaging, 26(6), 1045–1057. https://doi.org/10.1007/s10278-013-9622-7
https://www.cancerimagingarchive.net/data-usage-policies-and-restrictions/
This dataset contains image annotations derived from "The Clinical Proteomic Tumor Analysis Consortium Pancreatic Ductal Adenocarcinoma Collection (CPTAC-PDA)". This dataset was generated as part of a National Cancer Institute project to augment images from The Cancer Imaging Archive with tumor annotations that will improve their value for cancer researchers and artificial intelligence experts.
https://www.cancerimagingarchive.net/data-usage-policies-and-restrictions/
This dataset contains image annotations derived from "The Clinical Proteomic Tumor Analysis Consortium Clear Cell Renal Cell Carcinoma Collection (CPTAC-CCRCC)". This dataset was generated as part of a National Cancer Institute project to augment images from The Cancer Imaging Archive with annotations that will improve their value for cancer researchers and artificial intelligence experts.
https://www.cancerimagingarchive.net/data-usage-policies-and-restrictions/https://www.cancerimagingarchive.net/data-usage-policies-and-restrictions/
This dataset contains image annotations derived from the NCI Clinical Trial "ACRIN-HNSCC-FDG-PET-CT (ACRIN 6685)". This dataset was generated as part of an NCI project to augment TCIA datasets with annotations that will improve their value for cancer researchers and AI developers.
https://www.cancerimagingarchive.net/data-usage-policies-and-restrictions/
This dataset contains image annotations derived from "The Clinical Proteomic Tumor Analysis Consortium Uterine Corpus Endometrial Carcinoma Collection (CPTAC-UCEC)". This dataset was generated as part of a National Cancer Institute project to augment images from The Cancer Imaging Archive with annotations that will improve their value for cancer researchers and artificial intelligence experts.
https://www.cancerimagingarchive.net/data-usage-policies-and-restrictions/
This dataset contains image annotations derived from the NCI Clinical Trial "Rituximab and Combination Chemotherapy in Treating Patients With Diffuse Large B-Cell Non-Hodgkin's Lymphoma (CALGB50303)". This dataset was generated as part of an NCI project to augment TCIA datasets with annotations that will improve their value for cancer researchers and AI developers.
https://www.cancerimagingarchive.net/data-usage-policies-and-restrictions/
PURPOSE: To determine the variability of lesion size measurements in computed tomography data sets of patients imaged under a “no change” (“coffee break”) condition and to determine the impact of two reading paradigms on measurement variability. METHOD AND MATERIALS: Using data sets from 32 RIDER Lung CT patients and 8 RIDER Pilot patients scanned twice within 15 minutes (“no change”), measurements were performed by five radiologists in two phases: (1) independent reading of each computed tomography dataset (timepoint); (2) a locked, sequential reading of datasets. Readers performed measurements using several sizing methods, including one-dimensional (1D) longest in-slice dimension and 3D semi-automated segmented volume. Change in size was estimated by comparing measurements performed on both timepoints for the same lesion, for each reader and each measurement method. For each reading paradigm, results were pooled across lesions, across readers, and across both readers and lesions, for each measurement method. For additional information, please see https://qibawiki.rsna.org/index.php/VolCT_-_Group_1B and the Release Notes, from which the following may be especially useful: "Results are described in DICOM SR files, which in turn reference DICOM segmentation files that encode the region as a 3D raster, and presentation states that record the zoom, pan and window levels at the time of measurement" and "Readers are identified by number (from 1 through 5) ... and their actual identity recorded in the SR tree in observer context and worklist descriptions has been removed."
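The comparison described above (change in size between the two timepoints for the same lesion, per reader, then pooled) can be sketched as follows. This is only an illustration under stated assumptions: the records, the function names `percent_changes` and `pool`, and the choice of percent change with mean and standard deviation as the summary are all hypothetical and are not taken from the study or from the DICOM SR files.

```python
from collections import defaultdict
from statistics import mean, stdev

# Hypothetical records: (reader, lesion, timepoint, size).
# Real measurements would have to be extracted from the DICOM SR files.
measurements = [
    (1, "A", 1, 1000.0), (1, "A", 2, 1100.0),
    (2, "A", 1, 950.0),  (2, "A", 2, 900.0),
    (1, "B", 1, 400.0),  (1, "B", 2, 380.0),
    (2, "B", 1, 420.0),  (2, "B", 2, 441.0),
]

def percent_changes(records):
    """Return {(reader, lesion): percent change from timepoint 1 to 2}."""
    by_key = defaultdict(dict)
    for reader, lesion, timepoint, size in records:
        by_key[(reader, lesion)][timepoint] = size
    return {
        key: 100.0 * (sizes[2] - sizes[1]) / sizes[1]
        for key, sizes in by_key.items()
        if 1 in sizes and 2 in sizes  # skip incomplete pairs
    }

def pool(changes, index):
    """Pool changes by reader (index=0) or by lesion (index=1).

    Returns {group: (mean, sample standard deviation)}; each group
    needs at least two values for the standard deviation.
    """
    groups = defaultdict(list)
    for key, change in changes.items():
        groups[key[index]].append(change)
    return {name: (mean(vals), stdev(vals)) for name, vals in groups.items()}

changes = percent_changes(measurements)
per_reader = pool(changes, 0)   # pooled across lesions, one entry per reader
per_lesion = pool(changes, 1)   # pooled across readers, one entry per lesion
overall = (mean(changes.values()), stdev(changes.values()))
```

Because the patients were scanned under a "no change" condition, any nonzero change in this kind of table reflects measurement variability rather than real growth or shrinkage.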
https://www.cancerimagingarchive.net/data-usage-policies-and-restrictions/
This manuscript correlates patient survival with morphologic imaging features and hemodynamic parameters obtained from the nonenhancing region (NER) of glioblastoma (GBM), along with clinical and genomic markers. Forty-five patients with GBM underwent baseline imaging with contrast material-enhanced magnetic resonance (MR) imaging and dynamic susceptibility contrast-enhanced T2*-weighted perfusion MR imaging. See DSC T2* MR Perfusion Analysis for more information about the authors' perfusion analysis. Molecular and clinical predictors of survival were obtained. Single and multivariable models of overall survival (OS) and progression-free survival (PFS) were explored with Kaplan-Meier estimates, Cox regression, and random survival forests. Worsening OS (log-rank test, P = .0103) and PFS (log-rank test, P = .0223) were associated with increasing relative cerebral blood volume of NER (rCBV NER), which was higher with deep white matter involvement (t test, P = .0482) and poor NER margin definition (t test, P = .0147). NER crossing the midline was the only morphologic feature of NER associated with poor survival (log-rank test, P = .0125). Preoperative Karnofsky performance score (KPS) and resection extent (n = 30) were clinically significant OS predictors (log-rank test, P = .0176 and P = .0038, respectively). No genomic alterations were associated with survival, except patients with high rCBV NER and wild-type epidermal growth factor receptor (EGFR) mutation had significantly poor survival (log-rank test, P = .0306; area under the receiver operating characteristic curve = 0.62). Combining resection extent with rCBV NER marginally improved prognostic ability (permutation, P = .084). Random forest models of presurgical predictors indicated rCBV NER as the top predictor; also important were KPS, age at diagnosis, and NER crossing the midline.
A multivariable model containing rCBV NER, age at diagnosis, and KPS can be used to group patients with more than 1 year of difference in observed median survival (0.49-1.79 years). Conclusion: Patients with high rCBV NER and NER crossing the midline and those with high rCBV NER and wild-type EGFR mutation showed poor survival. In multivariable survival models, however, rCBV NER provided unique prognostic information that went above and beyond the assessment of all NER imaging features, as well as clinical and genomic features.
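For readers unfamiliar with the Kaplan-Meier estimates mentioned in the abstract above, the following is a minimal sketch of the product-limit estimator. The function name `kaplan_meier` and the follow-up data are hypothetical and chosen only for illustration; the code does not reproduce the authors' analysis, which also used Cox regression and random survival forests.

```python
def kaplan_meier(times, events):
    """Product-limit estimate of the survival function.

    times  -- follow-up durations (e.g. years), one per patient
    events -- 1 if the event (death or progression) was observed,
              0 if the patient was censored at that time
    Returns a list of (time, survival probability) pairs, one per
    distinct time at which at least one event occurred.
    """
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(set(times)):
        deaths = sum(1 for ti, e in zip(times, events) if ti == t and e == 1)
        leaving = sum(1 for ti in times if ti == t)  # events + censored at t
        if deaths:
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve

# Hypothetical follow-up data for five patients (not from the study).
times = [0.5, 1.0, 1.0, 1.5, 2.0]
events = [1, 1, 0, 1, 0]
curve = kaplan_meier(times, events)
```

Censored patients leave the at-risk set without lowering the curve, which is why the estimate drops only at times where an event was actually observed.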
https://www.cancerimagingarchive.net/data-usage-policies-and-restrictions/
Prostate cancer T1- and T2-weighted magnetic resonance images (MRIs) were acquired on a 1.5 T Philips Achieva using a combined surface and endorectal coil, including dynamic contrast-enhanced images obtained prior to, during and after I.V. administration of 0.1 mmol/kg body weight of Gadolinium-DTPA (pentetic acid). Corresponding clinical metadata (XLS format) and 3D segmentation files (NRRD format) are offered as a supplement to this image collection. The XLS file contains pathology biopsy and excised gland tissue reports and the MRI radiology report for most subjects.
The Multi-component NRRD Segmentations allow visualization and downstream analysis in 3D Slicer of the following prostate components: prostate gland boundary; internal capsule; central gland; peripheral zone; seminal vesicles; urethra; cancer – dominant nodule; neurovascular bundle; penile bulb; ejaculatory duct; verumontanum; and rectum. See our tutorial on Using 3D Slicer with the Prostate-Diagnosis data if you are not familiar with using this kind of data.
The Seminal Vesicles (SV) and Neurovascular Bundle (NVB) Segmentations delineate the neurovascular bundle and seminal vesicles as MHA files. These were provided as part of a planned challenge competition that did not materialize.
The Third Party Analysis dataset mentioned beneath the Data Access table was added later as part of the NCI-ISBI 2013 Challenge - Automated Segmentation of Prostate Structures. It includes NRRD-format segmentations for 30 Prostate-Diagnosis subjects that mark the boundaries of the central gland and peripheral zone.