49 datasets found

Genomic Data Commons Data Portal (GDC Data Portal)
rrid.site
scicrunch.org
Updated Jan 29, 2022
Cite
(2022). Genomic Data Commons Data Portal (GDC Data Portal) [Dataset]. http://identifiers.org/RRID:SCR_014514
Unique identifier
https://identifiers.org/RRID:SCR_014514
Dataset updated
Jan 29, 2022
Description
A unified data repository of the National Cancer Institute (NCI)'s Genomic Data Commons (GDC) that enables data sharing across cancer genomic studies in support of precision medicine. The GDC supports several cancer genome programs at the NCI Center for Cancer Genomics (CCG), including The Cancer Genome Atlas (TCGA), Therapeutically Applicable Research to Generate Effective Treatments (TARGET), and the Cancer Genome Characterization Initiative (CGCI). The GDC Data Portal provides a platform for efficiently querying and downloading high quality and complete data. The GDC also provides a GDC Data Transfer Tool and a GDC API for programmatic access.
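As an illustration of the programmatic access mentioned above, the following is a minimal sketch of how a query against the GDC API could be assembled. It assumes the public base URL `https://api.gdc.cancer.gov`, the `/files` endpoint, and the `filters`, `fields`, `format`, and `size` query parameters; the helper names, the chosen fields, and the example values are illustrative and should be checked against the GDC API documentation before use. The sketch only builds the request and does not send it.

```python
import json
from urllib.parse import urlencode

# Assumed public base URL of the GDC API.
GDC_API = "https://api.gdc.cancer.gov"


def build_files_params(project_id, data_category, size=10):
    """Build query parameters for a file search (hypothetical helper)."""
    # Nested filter: both conditions must hold ("and" of two "in" clauses).
    filters = {
        "op": "and",
        "content": [
            {"op": "in",
             "content": {"field": "cases.project.project_id",
                         "value": [project_id]}},
            {"op": "in",
             "content": {"field": "data_category",
                         "value": [data_category]}},
        ],
    }
    return {
        "filters": json.dumps(filters),          # filters travel as a JSON string
        "fields": "file_id,file_name,data_type",  # fields to return per hit
        "format": "JSON",
        "size": str(size),                        # maximum number of hits
    }


def build_files_url(project_id, data_category, size=10):
    """Return the full GET URL for the assumed /files endpoint."""
    params = build_files_params(project_id, data_category, size)
    return f"{GDC_API}/files?{urlencode(params)}"


if __name__ == "__main__":
    print(build_files_url("TCGA-BRCA", "Transcriptome Profiling", size=5))
```

The resulting URL can be passed to any HTTP client; controlled-access data additionally requires an authentication token, which this sketch does not cover.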
Genomic Data Commons Data Portal
bioregistry.io
Updated Apr 23, 2021
Cite
(2021). Genomic Data Commons Data Portal [Dataset]. https://bioregistry.io/gdc
Dataset updated
Apr 23, 2021
Description
The GDC Data Portal is a robust data-driven platform that allows cancer researchers and bioinformaticians to search and download cancer data for analysis.
The Cancer Genome Atlas Breast Invasive Carcinoma Collection
cancerimagingarchive.net
dicom, n/a
Updated Feb 2, 2014
Cite
The Cancer Imaging Archive (2014). The Cancer Genome Atlas Breast Invasive Carcinoma Collection [Dataset]. http://doi.org/10.7937/K9/TCIA.2016.AB2NAZRP
Available download formats: n/a, dicom
Unique identifier
https://doi.org/10.7937/K9/TCIA.2016.AB2NAZRP
Dataset updated
Feb 2, 2014
Dataset authored and provided by
The Cancer Imaging Archive
License
https://www.cancerimagingarchive.net/data-usage-policies-and-restrictions/
Time period covered
May 29, 2020
Dataset funded by
National Cancer Institute (http://www.cancer.gov/)
Description
The Cancer Genome Atlas Breast Invasive Carcinoma (TCGA-BRCA) data collection is part of a larger effort to build a research community focused on connecting cancer phenotypes to genotypes by providing clinical images matched to subjects from The Cancer Genome Atlas (TCGA). Clinical, genetic, and pathological data resides in the Genomic Data Commons (GDC) Data Portal while the radiological data is stored on The Cancer Imaging Archive (TCIA).
Matched TCGA patient identifiers allow researchers to explore the TCGA/TCIA databases for correlations between tissue genotype, radiological phenotype and patient outcomes. Tissues for TCGA were collected from many sites all over the world in order to reach their accrual targets, usually around 500 specimens per cancer type. For this reason the image data sets are also extremely heterogeneous in terms of scanner modalities, manufacturers and acquisition protocols. In most cases the images were acquired as part of routine care and not as part of a controlled research study or clinical trial.
CIP TCGA Radiology Initiative
Imaging Source Site (ISS) Groups are being populated and governed by participants from institutions that have provided imaging data to the archive for a given cancer type. Modeled after TCGA analysis groups, ISS groups are given the opportunity to publish a marker paper for a given cancer type per the guidelines in the table above. This opportunity will generate increased participation in building these multi-institutional data sets as they become an open community resource. Learn more about the TCGA Breast Phenotype Research Group.
Cancer Research Data Commons
rrid.site
scicrunch.org
Updated Sep 18, 2025
Cite
(2025). Cancer Research Data Commons [Dataset]. http://identifiers.org/RRID:SCR_019128
Unique identifier
https://identifiers.org/RRID:SCR_019128
Dataset updated
Sep 18, 2025
Description
Cloud-based data science infrastructure that provides secure access to cancer research data from NCI programs and key external cancer programs. Serves as a coordinated resource for public data sharing of NCI-funded programs. Users can explore and use analytical and visualization tools for data analysis. Enables users to search and aggregate data across repositories including Cancer Data Service, Clinical Trial Data Commons, Genomic Data Commons, Imaging Data Commons, Integrated Canine Data Commons, and Proteomic Data Commons.
The Cancer Genome Atlas Rectum Adenocarcinoma Collection
stage.cancerimagingarchive.net
cancerimagingarchive.net
dicom, n/a
Updated May 29, 2020
Cite
The Cancer Imaging Archive (2020). The Cancer Genome Atlas Rectum Adenocarcinoma Collection [Dataset]. http://doi.org/10.7937/K9/TCIA.2016.F7PPNPNU
Available download formats: n/a, dicom
Unique identifier
https://doi.org/10.7937/K9/TCIA.2016.F7PPNPNU
Dataset updated
May 29, 2020
Dataset authored and provided by
The Cancer Imaging Archive
License
https://www.cancerimagingarchive.net/data-usage-policies-and-restrictions/
Time period covered
May 29, 2020
Dataset funded by
National Cancer Institute (http://www.cancer.gov/)
Description
The Cancer Genome Atlas Rectum Adenocarcinoma (TCGA-READ) data collection is part of a larger effort to build a research community focused on connecting cancer phenotypes to genotypes by providing clinical images matched to subjects from The Cancer Genome Atlas (TCGA). Clinical, genetic, and pathological data resides in the Genomic Data Commons (GDC) Data Portal while the radiological data is stored on The Cancer Imaging Archive (TCIA).
Matched TCGA patient identifiers allow researchers to explore the TCGA/TCIA databases for correlations between tissue genotype, radiological phenotype and patient outcomes. Tissues for TCGA were collected from many sites all over the world in order to reach their accrual targets, usually around 500 specimens per cancer type. For this reason the image data sets are also extremely heterogeneous in terms of scanner modalities, manufacturers and acquisition protocols. In most cases the images were acquired as part of routine care and not as part of a controlled research study or clinical trial.
CIP TCGA Radiology Initiative
Imaging Source Site (ISS) Groups are being populated and governed by participants from institutions that have provided imaging data to the archive for a given cancer type. Modeled after TCGA analysis groups, ISS groups are given the opportunity to publish a marker paper for a given cancer type per the guidelines in the table above. This opportunity will generate increased participation in building these multi-institutional data sets as they become an open community resource. Learn more about the CIP TCGA Radiology Initiative.
The Cancer Genome Atlas (TCGA) RNA-seq meta-analysis
figshare.com
xlsx
Updated Feb 2, 2018
Cite
Namshik Han (2018). The Cancer Genome Atlas (TCGA) RNA-seq meta-analysis [Dataset]. http://doi.org/10.6084/m9.figshare.5851743.v1
Available download formats: xlsx
Unique identifier
https://doi.org/10.6084/m9.figshare.5851743.v1
Dataset updated
Feb 2, 2018
Dataset provided by
figshare
Authors
Namshik Han
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
TCGA RNA-seq V2 Level3 data were downloaded from the TCGA Genomic Data Commons Data Portal (https://gdc-portal.nci.nih.gov), consisting of 11,303 samples in 34 cancer projects (33 cancer types). Nine cancer types that do not have corresponding non-tumour samples were filtered out, and the analysis focused on the tumour versus non-tumour comparison. 24 cancer types were used in this meta-analysis: BLCA, BRCA, CESC, CHOL, COAD, ESCA, GBM, HNSC, KICH, KIRC, KIRP, LIHC, LUAD, LUSC, PAAD, PCPG, PRAD, READ, SARC, SKCM, STAD, THCA, THYM, UCEC (https://gdc-portal.nci.nih.gov). The nine filtered cancer types were ACC, DLBC, LAML, LGG, MESO, OV, TGCT, UCS and UVM.
To extract expression values from TCGA RNA-seq data, we used genomic coordinates to retrieve UCSC Transcript IDs that correspond to the identifiers in TCGA RNA-seq V2 Level3 data (isoform level). The GAF (General Annotation Format) file was used to map the coordinates to UCSC Transcript IDs, and it was downloaded from https://tcga-data.nci.nih.gov/docs/GAF/GAF.hg19.June2011.bundle/outputs/TCGA.hg19.June2011.gaf. This file contains genomic annotations shared by all TCGA projects. More details of the GAF file format can be found at https://tcga-data.nci.nih.gov/docs/GAF/GAF3.0/GAF_v3_file_description.docx. We filtered out any coding exons overlapping UCSC Transcript IDs to eliminate expression values of coding genes and evaluate lncRNA expression.
We could find the expression values of 443 pcRNAs and 203 tapRNAs in TCGA data, as many non-coding regions are not yet fully annotated in the TCGA RNA-seq V2 Level3 data. The expression values of pcRNAs and tapRNAs were extracted and clustered by an unsupervised Pearson correlation method (Supplementary Figure 18A). The expression values of tapRNA-associated coding genes were also extracted and used to generate the heat-map (Supplementary Figure 18B), which shows a pattern of expression similar to that of tapRNAs across the cancer types.
To show that tapRNAs and associated coding genes have similar expression profiles in cancers, we generated a Spearman's Rank-Order Correlation heatmap (Figure 6A) between tapRNAs and their associated coding genes based on the TCGA RNA-seq data. We used the MATLAB function corr to calculate Spearman's rho. This function takes two matrices X (197-by-8,850 expression profiling matrix of tapRNA) and Y (197-by-8,850 expression profiling matrix of tapRNA-associated coding gene) and returns an 8,850-by-8,850 matrix containing the pairwise correlation coefficient between each pair of 8,850 columns (TCGA cancer samples in Supplementary Figure 18A and B). Thus, the rank-order correlation matrix that we computed from the matrices of expression profiling data (Supplementary Figure S18A and B) allowed us to compare the correlation between two column vectors, i.e. cancer samples. This function also returns a matrix of p-values for testing the hypothesis of no correlation against the alternative that there is a nonzero correlation. Each element of the p-value matrix is the p-value for the corresponding element of Spearman's rho. The p-values for Spearman's rho are calculated using large-sample approximations. To check the significance level of the correlation between a tapRNA and its associated coding gene, the diagonal of the p-value matrix was extracted and used. The median is 1.31 × 10^-11 and the mean is 1.03 × 10^-4 with standard deviation 0.0029.
To identify cancer-specific tapRNAs, we considered not only the global expression pattern of a given tapRNA in each cancer type, but also the expression pattern of any specific sub-group that is significantly distinct, to take into account cancer sample heterogeneity. Thus, two conditions were applied: (1) the average expression level of a tapRNA in a given cancer type is in the top 10% or bottom 10%, and (2) a tapRNA has at least 10% of samples in a given cancer type that are significantly up-regulated (Z-score > 2) or down-regulated (Z-score < -2).
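The two computational steps in this description, the column-wise Spearman correlation and the Z-score criterion in condition (2), can be sketched in plain Python. This is an illustrative stand-in, not the authors' MATLAB code: it computes only the diagonal of the correlation matrix (column i of X against column i of Y), it does not compute p-values, and it assumes Z-scores have already been calculated. Reading condition (2) as "either direction alone reaches 10%" is also an assumption, as are all function names.

```python
from statistics import mean


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(a, b):
    """Pearson correlation; raises ZeroDivisionError if an input is constant."""
    ma, mb = mean(a), mean(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    var_a = sum((x - ma) ** 2 for x in a)
    var_b = sum((y - mb) ** 2 for y in b)
    return cov / (var_a * var_b) ** 0.5


def spearman_rho(a, b):
    """Spearman's rho is the Pearson correlation of the ranks."""
    return pearson(average_ranks(a), average_ranks(b))


def diagonal_rho(x_columns, y_columns):
    """Rho for each matched pair of columns (the diagonal of the full matrix)."""
    return [spearman_rho(x, y) for x, y in zip(x_columns, y_columns)]


def passes_condition_two(z_scores, threshold=2.0, min_fraction=0.10):
    """True if at least min_fraction of samples are up- or down-regulated."""
    n = len(z_scores)
    up = sum(z > threshold for z in z_scores) / n
    down = sum(z < -threshold for z in z_scores) / n
    return up >= min_fraction or down >= min_fraction
```

Here `x_columns` and `y_columns` are lists of per-sample expression vectors, so each entry of the result compares the tapRNA profile of one sample with the coding-gene profile of the same sample.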
TCGA-LUAD
kaggle.com
opendatalab.com
zip
Updated Jul 28, 2021
Cite
Nahin Kumar Dey (2021). TCGA-LUAD [Dataset]. https://www.kaggle.com/nahin333/tcgaluad
Available download formats: zip (10283785426 bytes)
Dataset updated
Jul 28, 2021
Authors
Nahin Kumar Dey
Description
The Cancer Genome Atlas Lung Adenocarcinoma (TCGA-LUAD) data collection is part of a larger effort to build a research community focused on connecting cancer phenotypes to genotypes by providing clinical images matched to subjects from The Cancer Genome Atlas (TCGA). Clinical, genetic, and pathological data resides in the Genomic Data Commons (GDC) Data Portal while the radiological data is stored on The Cancer Imaging Archive (TCIA).

Matched TCGA patient identifiers allow researchers to explore the TCGA/TCIA databases for correlations between tissue genotype, radiological phenotype and patient outcomes. Tissues for TCGA were collected from many sites all over the world in order to reach their accrual targets, usually around 500 specimens per cancer type. For this reason the image data sets are also extremely heterogeneous in terms of scanner modalities, manufacturers and acquisition protocols. In most cases the images were acquired as part of routine care and not as part of a controlled research study or clinical trial.

https://wiki.cancerimagingarchive.net/display/Public/TCGA-LUAD
Proteomic Data Commons
dknet.org
scicrunch.org
Updated Apr 15, 2024
Cite
(2024). Proteomic Data Commons [Dataset]. http://identifiers.org/RRID:SCR_018273
Unique identifier
https://identifiers.org/RRID:SCR_018273
Dataset updated
Apr 15, 2024
Description
Portal to make cancer related proteomic datasets easily accessible to public. Facilitates multiomic integration in support of precision medicine through interoperability with other resources. Developed to advance our understanding of how proteins help to shape risk, diagnosis, development, progression, and treatment of cancer. One of several repositories within NCI Cancer Research Data Commons which enables researchers to link proteomic data with other data sets (e.g., genomic and imaging data) and to submit, collect, analyze, store, and share data throughout cancer data ecosystem. PDC provides access to highly curated and standardized biospecimen, clinical, and proteomic data, intuitive interface to filter, query, search, visualize and download data and metadata. Provides common data harmonization pipeline to uniformly analyze all PDC data and provides advanced visualization of quantitative information. Cloud based (Amazon Web Services) infrastructure facilitates interoperability with AWS based data analysis tools and platforms natively. Application programming interface (API) provides cloud-agnostic data access and allows third parties to extend functionality beyond PDC. Structured workspace that serves as private user data store and also data submission portal. Distributes controlled access data, such as patient-specific protein fasta sequence databases, with dbGaP authorization and eRA Commons authentication.
The Cancer Genome Atlas Liver Hepatocellular Carcinoma Collection
cancerimagingarchive.net
dicom, n/a
Updated May 29, 2020
Cite
The Cancer Imaging Archive (2020). The Cancer Genome Atlas Liver Hepatocellular Carcinoma Collection [Dataset]. http://doi.org/10.7937/K9/TCIA.2016.IMMQW8UQ
Available download formats: n/a, dicom
Unique identifier
https://doi.org/10.7937/K9/TCIA.2016.IMMQW8UQ
Dataset updated
May 29, 2020
Dataset authored and provided by
The Cancer Imaging Archive
License
https://www.cancerimagingarchive.net/data-usage-policies-and-restrictions/
Time period covered
May 29, 2020
Dataset funded by
National Cancer Institute (http://www.cancer.gov/)
Description
The Cancer Genome Atlas Liver Hepatocellular Carcinoma (TCGA-LIHC) data collection is part of a larger effort to build a research community focused on connecting cancer phenotypes to genotypes by providing clinical images matched to subjects from The Cancer Genome Atlas (TCGA). Clinical, genetic, and pathological data resides in the Genomic Data Commons (GDC) Data Portal while the radiological data is stored on The Cancer Imaging Archive (TCIA).
Matched TCGA patient identifiers allow researchers to explore the TCGA/TCIA databases for correlations between tissue genotype, radiological phenotype and patient outcomes. Tissues for TCGA were collected from many sites all over the world in order to reach their accrual targets, usually around 500 specimens per cancer type. For this reason the image data sets are also extremely heterogeneous in terms of scanner modalities, manufacturers and acquisition protocols. In most cases the images were acquired as part of routine care and not as part of a controlled research study or clinical trial.
CIP TCGA Radiology Initiative
Imaging Source Site (ISS) Groups are being populated and governed by participants from institutions that have provided imaging data to the archive for a given cancer type. Modeled after TCGA analysis groups, ISS groups are given the opportunity to publish a marker paper for a given cancer type per the guidelines in the table above. This opportunity will generate increased participation in building these multi-institutional data sets as they become an open community resource. Learn more about the CIP TCGA Radiology Initiative.
DICOM converted Slide Microscopy images for the TCGA-PRAD collection
zenodo.org
bin
Updated Aug 20, 2024
Cite
David Clunie; William Clifford; David Pot; Ulrike Wagner; Keyvan Farahani; Erika Kim; Andrey Fedorov (2024). DICOM converted Slide Microscopy images for the TCGA-PRAD collection [Dataset]. http://doi.org/10.5281/zenodo.12689936
Available download formats: bin
Unique identifier
https://doi.org/10.5281/zenodo.12689936
Dataset updated
Aug 20, 2024
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
David Clunie; William Clifford; David Pot; Ulrike Wagner; Keyvan Farahani; Erika Kim; Andrey Fedorov
License
Attribution 3.0 (CC BY 3.0): https://creativecommons.org/licenses/by/3.0/
License information was derived automatically
Description
This dataset corresponds to a collection of images and/or image-derived data available from National Cancer Institute Imaging Data Commons (IDC) [1]. This dataset was converted into DICOM representation and ingested by the IDC team. You can explore and visualize the corresponding images using IDC Portal here: TCGA-PRAD. You can use the manifests included in this Zenodo record to download the content of the collection following the Download instructions below.

Collection description

The Cancer Imaging Program (CIP) is working directly with primary investigators from institutes participating in TCGA to obtain and load images relating to the genomic, clinical, and pathological data being stored within the TCGA Data Portal. Currently this image collection of prostate adenocarcinoma (PRAD) patients can be matched by each unique case identifier with the extensive gene and expression data of the same case from The Cancer Genome Atlas Data Portal to research the link between clinical phenome and tissue genome.

Please see the TCGA-PRAD page to learn more about the images and to obtain any supporting metadata for this collection.

Files included

A manifest file's name indicates the IDC data release in which a version of collection data was first introduced. For example, collection_id-idc_v8-aws.s5cmd corresponds to the contents of the collection_id collection introduced in IDC data release v8. If there is a subsequent version of this Zenodo page, it will indicate when a subsequent version of the corresponding collection was introduced.

tcga_prad-idc_v8-aws.s5cmd: manifest of files available for download from public IDC Amazon Web Services buckets

tcga_prad-idc_v8-gcs.s5cmd: manifest of files available for download from public IDC Google Cloud Storage buckets

tcga_prad-idc_v8-dcf.dcf: Gen3 manifest (for details see https://learn.canceridc.dev/data/organization-of-data/guids-and-uuids)

Note that manifest files that end in -aws.s5cmd reference files stored in Amazon Web Services (AWS) buckets, while -gcs.s5cmd reference files in Google Cloud Storage. The actual files are identical and are mirrored between AWS and GCP.
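The naming convention described above can be decoded mechanically. The following is a small sketch that splits a manifest file name into collection, IDC release, and storage target; the regular expression is inferred only from the three file names listed here, and the function name is hypothetical, so other naming variants may need adjustments.

```python
import re

# Pattern inferred from names such as "tcga_prad-idc_v8-aws.s5cmd".
MANIFEST_NAME = re.compile(
    r"^(?P<collection>.+)-idc_v(?P<release>\d+)"
    r"-(?P<target>aws|gcs|dcf)\.(?P<ext>s5cmd|dcf)$"
)

# Human-readable meaning of each suffix, as described in the listing.
TARGETS = {
    "aws": "Amazon Web Services",
    "gcs": "Google Cloud Storage",
    "dcf": "Gen3 manifest",
}


def parse_manifest_name(name):
    """Split a manifest file name into its components (hypothetical helper)."""
    match = MANIFEST_NAME.match(name)
    if match is None:
        raise ValueError(f"unrecognized manifest name: {name!r}")
    return {
        "collection": match["collection"],
        "release": int(match["release"]),  # IDC data release number
        "target": TARGETS[match["target"]],
    }


if __name__ == "__main__":
    for file_name in ("tcga_prad-idc_v8-aws.s5cmd", "tcga_prad-idc_v8-dcf.dcf"):
        print(file_name, "->", parse_manifest_name(file_name))
```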

Download instructions

Each of the manifests includes instructions in the header on how to download the included files.

To download the files using .s5cmd manifests:

install idc-index package: pip install --upgrade idc-index

download the files referenced by manifests included in this dataset by passing the .s5cmd manifest file: idc download manifest.s5cmd.

To download the files using .dcf manifest, see manifest header.

Acknowledgments

Imaging Data Commons team has been funded in whole or in part with Federal funds from the National Cancer Institute, National Institutes of Health, under Task Order No. HHSN26110071 under Contract No. HHSN261201500003l.

References

[1] Fedorov, A., Longabaugh, W. J. R., Pot, D., Clunie, D. A., Pieper, S. D., Gibbs, D. L., Bridge, C., Herrmann, M. D., Homeyer, A., Lewis, R., Aerts, H. J. W., Krishnaswamy, D., Thiriveedhi, V. K., Ciausu, C., Schacherer, D. P., Bontempi, D., Pihl, T., Wagner, U., Farahani, K., Kim, E. & Kikinis, R. National Cancer Institute Imaging Data Commons: Toward Transparency, Reproducibility, and Scalability in Imaging Artificial Intelligence. RadioGraphics (2023). https://doi.org/10.1148/rg.230180
Small NBL Dataset for Analysis of AKT Signaling
zenodo.org
csv
Updated Jul 15, 2023
Cite
Greg Hyde (2023). Small NBL Dataset for Analysis of AKT Signaling [Dataset]. http://doi.org/10.5281/zenodo.8148323
Available download formats: csv
Unique identifier
https://doi.org/10.5281/zenodo.8148323
Dataset updated
Jul 15, 2023
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Greg Hyde
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
ExpressionData.csv

----------------------------

A small subset of transcriptomics data (30 genes) curated for learning Gene Regulatory Networks (GRNs) pertaining to signaling by the ALK pathway. Genes were selected by referencing the "signaling by ALK" pathway from Reactome (https://reactome.org/content/detail/R-HSA-201556). This subset of data belongs to the TARGET-NBL project (https://portal.gdc.cancer.gov/projects/TARGET-NBL), hosted via the Genomic Data Commons Data Portal (https://portal.gdc.cancer.gov/). Please refer to the GDC's data access policies (https://gdc.cancer.gov/about-gdc/gdc-policies) if planning to use the data.

refNetwork.csv

----------------------

Contains a reference network of known pairwise regulatory relationships among the genes for which transcriptomics data are available in "ExpressionData.csv". These relationships were again determined by referencing the "signaling by ALK" pathway from Reactome (https://reactome.org/content/detail/R-HSA-201556).
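A pairwise reference network of this kind is commonly loaded as an adjacency mapping. The sketch below shows one way to do that with the standard library; the column names `Gene1` and `Gene2` and the placeholder gene names are hypothetical, since the actual header of refNetwork.csv is not given in this listing.

```python
import csv
import io
from collections import defaultdict

# Placeholder content standing in for refNetwork.csv; the header and the
# gene names are assumptions made for illustration only.
SAMPLE = """Gene1,Gene2
GENE_A,GENE_B
GENE_A,GENE_C
GENE_B,GENE_C
"""


def load_reference_network(handle, source_col="Gene1", target_col="Gene2"):
    """Read regulator/target pairs into a dict of regulator -> set of targets."""
    network = defaultdict(set)
    for row in csv.DictReader(handle):
        network[row[source_col]].add(row[target_col])
    return dict(network)


network = load_reference_network(io.StringIO(SAMPLE))
for regulator, targets in sorted(network.items()):
    print(regulator, "->", sorted(targets))
```

To read the real file, replace `io.StringIO(SAMPLE)` with an open file handle and adjust the column names to match its header.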
GDC-QAG-genes-mutations
huggingface.co
Cite
Center for Translational Data Science, GDC-QAG-genes-mutations [Dataset]. https://huggingface.co/datasets/uc-ctds/GDC-QAG-genes-mutations
Dataset authored and provided by
Center for Translational Data Science
License
Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
License information was derived automatically
Description
Dataset Card for uc-ctds/GDC-QAG-genes-mutations

This dataset contains genes and somatic mutations observed in various cancers. It is scraped from the /ssms endpoint in the Genomic Data Commons (GDC). This data is used to run the Query Augmented Generation (QAG) tool on the GDC. GDC QAG is currently deployed in HuggingFace Spaces as a web app.

Dataset Details

Dataset Description

This dataset contains around 5.6 million somatic mutations (protein… See the full description on the dataset page: https://huggingface.co/datasets/uc-ctds/GDC-QAG-genes-mutations.
Metadata record for the manuscript: The CINSARC signature predicts the...
springernature.figshare.com
datasetcatalog.nlm.nih.gov
xlsx
Updated Jun 1, 2023
Cite
Anthony GONCALVES; Pascal FINETTI; Daniel BIRNBAUM; François BERTUCCI (2023). Metadata record for the manuscript: The CINSARC signature predicts the clinical outcome in patients with Luminal B breast cancer [Dataset]. http://doi.org/10.6084/m9.figshare.14350871.v1
Available download formats: xlsx
Unique identifier
https://doi.org/10.6084/m9.figshare.14350871.v1
Dataset updated
Jun 1, 2023
Dataset provided by
Figshare (http://figshare.com/)
Authors
Anthony GONCALVES; Pascal FINETTI; Daniel BIRNBAUM; François BERTUCCI
License
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Description
Summary

This metadata record provides details of the data supporting the claims of the related manuscript: “The CINSARC signature predicts the clinical outcome in patients with Luminal B breast cancer”.

The related study tested the prognostic value for disease-free survival (DFS) of CINSARC, a multigene expression signature originally developed in sarcomas and shown to have prognostic impact in various cancers, in a series of 6035 early-stage invasive primary breast cancers.

Type of data: prognostic value for DFS of CINSARC

Subject of data: Homo sapiens

Sample size: 6035

Population characteristics: All cases were invasive breast carcinomas profiled using DNA microarrays or RNA-sequencing with expression and clinicopathological data available. All samples are pre-treatment samples (operative specimen or diagnostic biopsy before neo-adjuvant chemotherapy). The detailed characteristics of patients and tumours analysed in the present study are available in Supplementary Table 10.

Recruitment: publicly available transcriptomic data of invasive primary breast cancer enrolled in 36 retrospective studies published over a 10-year period between 2002 and 2012.

Data access

All data sets of primary breast cancer were downloaded from the Gene Expression Omnibus (GEO, https://www.ncbi.nlm.nih.gov/geo/), ArrayExpress (https://www.ebi.ac.uk/arrayexpress/), Genomic Data Commons (GDC, https://portal.gdc.cancer.gov/) and cBioPortal (https://www.cbioportal.org/) databases. All accession IDs are provided in Supplementary Table 10 (Table S10 revised.xlsx), which is included with this data record.

The data underlying the figures and tables of the related article are contained in the files ‘Goncalves_supporting_data.xlsx’ and ‘Table S8.xlsx’, which are included with this data record.

A detailed list of the data underlying each figure and table of the related article is available in the file ‘Goncalves_2021_underlying_data_list.xlsx’, which is included with this data record.

Corresponding author(s) for this study

Pr François BERTUCCI, MD PhD, Département d’Oncologie Médicale, Institut Paoli-Calmettes, 232 Bd. Ste-Marguerite, 13009 Marseille, France. E-mail: bertuccif@ipc.unicancer.fr; Phone: +33 4 91 22 35 37; Fax: +33 4 91 22 36 70

Study approval

The details of Institutional Review Board and Ethical Committee approval and patients’ consent for the 36 studies analysed in the related study are present in their corresponding publications, which are listed in Supplementary Table 10 of the related article.
The Cancer Genome Atlas Lung Squamous Cell Carcinoma Collection
stage.cancerimagingarchive.net
cancerimagingarchive.net
dicom, n/a
Updated May 29, 2020
Cite
The Cancer Imaging Archive (2020). The Cancer Genome Atlas Lung Squamous Cell Carcinoma Collection [Dataset]. http://doi.org/10.7937/K9/TCIA.2016.TYGKKFMQ
Available download formats: dicom, n/a
Unique identifier
https://doi.org/10.7937/K9/TCIA.2016.TYGKKFMQ
Dataset updated
May 29, 2020
Dataset authored and provided by
The Cancer Imaging Archive
License
https://www.cancerimagingarchive.net/data-usage-policies-and-restrictions/
Time period covered
May 29, 2020
Dataset funded by
National Cancer Institute (http://www.cancer.gov/)
Description
The Cancer Genome Atlas Lung Squamous Cell Carcinoma (TCGA-LUSC) data collection is part of a larger effort to build a research community focused on connecting cancer phenotypes to genotypes by providing clinical images matched to subjects from The Cancer Genome Atlas (TCGA). Clinical, genetic, and pathological data resides in the Genomic Data Commons (GDC) Data Portal while the radiological data is stored on The Cancer Imaging Archive (TCIA).
Matched TCGA patient identifiers allow researchers to explore the TCGA/TCIA databases for correlations between tissue genotype, radiological phenotype and patient outcomes. Tissues for TCGA were collected from many sites all over the world in order to reach their accrual targets, usually around 500 specimens per cancer type. For this reason the image data sets are also extremely heterogeneous in terms of scanner modalities, manufacturers and acquisition protocols. In most cases the images were acquired as part of routine care and not as part of a controlled research study or clinical trial.
CIP TCGA Radiology Initiative
Imaging Source Site (ISS) Groups are being populated and governed by participants from institutions that have provided imaging data to the archive for a given cancer type. Modeled after TCGA analysis groups, ISS groups are given the opportunity to publish a marker paper for a given cancer type per the guidelines in the table above. This opportunity will generate increased participation in building these multi-institutional data sets as they become an open community resource. Learn more about the TCGA Lung Phenotype Research Group.
Bioinformatics analysis of CEP350 tumor suppression in human TCGA cutaneous...
nih.figshare.com
datasetcatalog.nlm.nih.gov
pdf
Updated Jul 9, 2022
Cite
Michael Mann; Mik Black (2022). Bioinformatics analysis of CEP350 tumor suppression in human TCGA cutaneous melanoma | Datasets Supporting: Tumor Suppressive Functions of CEP350 in Cutaneous Melanoma Cells [Dataset]. http://doi.org/10.35092/yhjc.12636119.v1
Available download formats: pdf
Unique identifier
https://doi.org/10.35092/yhjc.12636119.v1
Dataset updated
Jul 9, 2022
Dataset provided by
The NIH Figshare Archive
Authors
Michael Mann; Mik Black
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Bioinformatics analysis of CEP350 tumor suppression in human TCGA cutaneous melanoma. Publicly available datasets were downloaded from the Genomic Data Commons Data Portal, using data from The Cancer Genome Atlas Program and the International Cancer Genome Consortium Data Portal, for statistical analyses using R and Shiny.
Supplementary datasets and other information accompanying the manuscript: Tumor Suppressive Functions of CEP350 in Cutaneous Melanoma Cells by Aziz Aiderus, Bin Fang, John M. Koomen and Michael B. Mann.
Abstract: We previously identified Cep350 as a novel haploinsufficient melanoma tumor suppressor gene using SB transposon-mediated mutagenesis to drive melanoma progression in Braf(V600E) mutant (SB|Braf) mice and functionally demonstrated that the human CEP350 ortholog is a new melanoma tumor-suppressor gene in human cancer cell lines (Mann et al., Nature Genetics, 2015). Further dissection of the latent tumor suppressive functions of CEP350 in cutaneous melanoma cells is essential for understanding its role in melanoma initiation and progression. In this work, we investigated the novel tumor suppressive functions of CEP350 in cutaneous melanoma cells using comparative informatics, molecular oncology, and proteomics approaches to demonstrate that CEP350 acts via altered cytoskeletal dynamics to contribute to BRAF-V600E driven melanoma.
Supplementary Materials for A Linked Data Representation for Summary...
dataverse.harvard.edu
Updated Aug 28, 2019
James McCusker (2019). Supplementary Materials for A Linked Data Representation for Summary Statistics and Grouping Criteria [Dataset]. http://doi.org/10.7910/DVN/OK0BUG
Available formats: Croissant (a format for machine-learning datasets; learn more at mlcommons.org/croissant)
Unique identifier
https://doi.org/10.7910/DVN/OK0BUG
Dataset updated
Aug 28, 2019
Dataset provided by
Harvard Dataverse
Authors
James McCusker
License
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Description
Summary statistics are fundamental to data science and are the building blocks of statistical reasoning. Most of the data and statistics made available on government web sites are aggregate; however, until now, we have not had a suitable linked data representation available. We propose a way to express summary statistics across aggregate groups as linked data using Web Ontology Language (OWL) Class based sets, where members of the set contribute to the overall aggregate value. Additionally, many clinical studies in the biomedical field rely on demographic summaries of their study cohorts and the patients assigned to each arm. While most data query languages, including SPARQL, allow for computation of summary statistics, they do not provide a way to integrate those values back into the RDF graphs they were computed from. We represent this knowledge, which would otherwise be lost, through the use of OWL 2 punning semantics, the expression of aggregate grouping criteria as OWL classes with variables, and constructs from the Semanticscience Integrated Ontology (SIO) and the World Wide Web Consortium's provenance ontology, PROV-O, providing interoperable representations that are well supported across the web of Linked Data. We evaluate these semantics using a Resource Description Framework (RDF) representation of patient case information from the Genomic Data Commons, a data portal from the National Cancer Institute.
CCDI-MCI: DICOM converted whole slide hematoxylin and eosin stained images...
zenodo.org
bin
Updated Oct 16, 2024
David Clunie; Doug Hawkins; Malcolm Smith; Erin Rudzinski; Nilsa Ramirez; Diana Thomas; Elaine Mardis; Catherine Cottrell; Sarah Leary; Maryam Fouladi; John Shern; Rajkumar Venkatramani; Ted Laetsch; Kenneth Chen; Meredeth Irwin; William Clifford; David Pot; Ulrike Wagner; Erika Kim; Granger Sutton; Andrey Fedorov (2024). CCDI-MCI: DICOM converted whole slide hematoxylin and eosin stained images from the Molecular Characterization Initiative of the National Cancer Institute's Childhood Cancer Data Initiative [Dataset]. http://doi.org/10.5281/zenodo.11099087
Available download formats: bin
Unique identifier
https://doi.org/10.5281/zenodo.11099087
Dataset updated
Oct 16, 2024
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
David Clunie; Doug Hawkins; Malcolm Smith; Erin Rudzinski; Nilsa Ramirez; Diana Thomas; Elaine Mardis; Catherine Cottrell; Sarah Leary; Maryam Fouladi; John Shern; Rajkumar Venkatramani; Ted Laetsch; Kenneth Chen; Meredeth Irwin; William Clifford; David Pot; Ulrike Wagner; Erika Kim; Granger Sutton; Andrey Fedorov
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This dataset corresponds to a collection of images and/or image-derived data available from National Cancer Institute Imaging Data Commons (IDC) [1]. This dataset was converted into DICOM representation and ingested by the IDC team. You can explore and visualize the corresponding images using IDC Portal here: CCDI-MCI. You can use the manifests included in this Zenodo record to download the content of the collection following the Download instructions below.

Collection description

The Molecular Characterization Initiative (MCI) [2] is a component of the National Cancer Institute’s (NCI) Childhood Cancer Data Initiative (CCDI). It offers state-of-the-art molecular testing at no cost to newly diagnosed children, adolescents, and young adults (AYAs) with central nervous system (CNS) tumors, soft tissue sarcomas (STS), certain rare childhood cancers (RAR), and certain neuroblastomas (NBL) treated at a Children’s Oncology Group (COG)–affiliated hospital. The goal of MCI is to enhance the understanding of genetic factors in pediatric cancers and to provide timely, clinically relevant findings to doctors and families to aid in treatment decisions and determine eligibility for certain planned COG clinical trials.

The original images in vendor-specific format were collected on IRB-approved clinical trials or tissue banking studies from Children’s Oncology Group (COG) patients enrolled in EveryChild APEC14B1 protocol.

Those images, augmented with the metadata describing their content, were provided to the IDC team for the purposes of archival, and were converted into DICOM Whole Slide Microscopy (SM) representation [3,4] using custom open source scripts and tools as described in [5]. The resulting converted images were released in IDC in the CCDI-MCI collection with the IDC data release v19.

To learn how to access related clinical and genomic data accompanying this collection please see the CCDI-MCI page and CCDI Hub.

Files included

A manifest file's name indicates the IDC data release in which a version of collection data was first introduced. For example, collection_id-idc_v8-aws.s5cmd corresponds to the contents of the collection_id collection introduced in IDC data release v8. If there is a subsequent version of this Zenodo page, it will indicate when a subsequent version of the corresponding collection was introduced.

ccdi_mci-idc_v19-aws.s5cmd: manifest of files available for download from public IDC Amazon Web Services buckets

ccdi_mci-idc_v19-gcs.s5cmd: manifest of files available for download from public IDC Google Cloud Storage buckets

ccdi_mci-idc_v19-dcf.dcf: Gen3 manifest (for details see https://learn.canceridc.dev/data/organization-of-data/guids-and-uuids)

Note that manifest files that end in -aws.s5cmd reference files stored in Amazon Web Services (AWS) buckets, while those that end in -gcs.s5cmd reference files stored in Google Cloud Storage buckets. The actual files are identical and are mirrored between AWS and GCP.
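The naming convention described above can be illustrated with a short Python sketch. It assumes that every manifest name follows the pattern shown in this record (`<collection_id>-idc_v<release>-<aws|gcs|dcf>.<s5cmd|dcf>`); the function and class names are hypothetical and are not part of any IDC tool.

```python
import re
from typing import NamedTuple

# Pattern inferred from the file names listed in this record (an assumption):
#   <collection_id>-idc_v<release>-<aws|gcs|dcf>.<s5cmd|dcf>
_MANIFEST_RE = re.compile(
    r"^(?P<collection>.+)-idc_v(?P<release>\d+)-(?P<target>aws|gcs|dcf)"
    r"\.(?P<ext>s5cmd|dcf)$"
)

# Human-readable meaning of each suffix, as described in the text above.
_TARGETS = {
    "aws": "Amazon Web Services",
    "gcs": "Google Cloud Storage",
    "dcf": "Gen3 manifest",
}


class ManifestName(NamedTuple):
    """Hypothetical container for the parts of a manifest file name."""
    collection: str
    release: int
    target: str


def parse_manifest_name(filename: str) -> ManifestName:
    """Split a manifest file name into collection, IDC release, and target."""
    match = _MANIFEST_RE.match(filename)
    if match is None:
        raise ValueError(f"not a recognised manifest name: {filename!r}")
    return ManifestName(
        collection=match["collection"],
        release=int(match["release"]),
        target=_TARGETS[match["target"]],
    )
```

For example, `parse_manifest_name("ccdi_mci-idc_v19-gcs.s5cmd")` yields the collection `ccdi_mci`, release `19`, and the Google Cloud Storage target.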

Download instructions

Each of the manifests includes instructions in the header on how to download the included files.

To download the files using .s5cmd manifests:

install idc-index package: pip install --upgrade idc-index

download the files referenced by manifests included in this dataset by passing the .s5cmd manifest file: idc download manifest.s5cmd

To download the files using .dcf manifest, see manifest header.
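The two .s5cmd steps above could be scripted roughly as follows. This is only a sketch: the helper names are hypothetical, and invoking pip as `python -m pip` is a substitution for the plain `pip install` command quoted above. The commands are only printed unless `dry_run=False` is passed.

```python
import subprocess
import sys
from typing import List


def build_download_commands(manifest_path: str) -> List[List[str]]:
    """Return the two commands from the instructions as argument lists."""
    return [
        # Step 1: install or upgrade the idc-index package.
        [sys.executable, "-m", "pip", "install", "--upgrade", "idc-index"],
        # Step 2: download the files referenced by the .s5cmd manifest.
        ["idc", "download", manifest_path],
    ]


def run_download(manifest_path: str, dry_run: bool = True) -> None:
    """Print each command and, unless dry_run is set, execute it in order."""
    for argv in build_download_commands(manifest_path):
        print(" ".join(argv))
        if not dry_run:
            # check=True stops at the first failing command.
            subprocess.run(argv, check=True)
```

Calling `run_download("ccdi_mci-idc_v19-aws.s5cmd")` prints the commands without running them, which is a convenient way to review them before starting a large download.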

Acknowledgments

Imaging Data Commons team has been funded in whole or in part with Federal funds from the National Cancer Institute, National Institutes of Health, under Task Order No. HHSN26110071 under Contract No. HHSN261201500003l.

References

[1] Fedorov, A., Longabaugh, W. J. R., Pot, D., Clunie, D. A., Pieper, S. D., Gibbs, D. L., Bridge, C., Herrmann, M. D., Homeyer, A., Lewis, R., Aerts, H. J. W. L., Krishnaswamy, D., Thiriveedhi, V. K., Ciausu, C., Schacherer, D. P., Bontempi, D., Pihl, T., Wagner, U., Farahani, K., Kim, E. & Kikinis, R. National cancer institute imaging data commons: Toward transparency, reproducibility, and scalability in imaging artificial intelligence. Radiographics 43, (2023).

[2] https://www.cancer.gov/research/areas/childhood/childhood-cancer-data-initiative/programs/molecular-characterization

[3] National Electrical Manufacturers Association (NEMA). DICOM PS3.3 - Information Object Definitions: A.32.8 VL Whole Slide Microscopy Image IOD. at <https://dicom.nema.org/medical/dicom/current/output/html/part03.html#sect_A.32.8>

[4] Herrmann, M. D., Clunie, D. A., Fedorov, A., Doyle, S. W., Pieper, S., Klepeis, V., Le, L. P., Mutter, G. L., Milstone, D. S., Schultz, T. J., Kikinis, R., Kotecha, G. K., Hwang, D. H., Andriole, K. P., Iafrate, A. J., Brink, J. A., Boland, G. W., Dreyer, K. J., Michalski, M., Golden, J. A., Louis, D. N. & Lennerz, J. K. Implementing the DICOM standard for digital pathology. J. Pathol. Inform. 9, 37 (2018).

[5] Clunie, D., Fedorov, A. & Herrmann, M. D. ImagingDataCommons/idc-wsi-conversion: Initial release. (Zenodo, 2023). doi:10.5281/ZENODO.8240154
The Cancer Genome Atlas Stomach Adenocarcinoma Collection
cancerimagingarchive.net
dicom, n/a
Updated Feb 2, 2014
The Cancer Imaging Archive (2014). The Cancer Genome Atlas Stomach Adenocarcinoma Collection [Dataset]. http://doi.org/10.7937/K9/TCIA.2016.GDHL9KIM
Available download formats: dicom, n/a
Unique identifier
https://doi.org/10.7937/K9/TCIA.2016.GDHL9KIM
Dataset updated
Feb 2, 2014
Dataset authored and provided by
The Cancer Imaging Archive
License
https://www.cancerimagingarchive.net/data-usage-policies-and-restrictions/
Time period covered
May 29, 2020
Dataset funded by
National Cancer Institute (http://www.cancer.gov/)
Description
The Cancer Genome Atlas Stomach Adenocarcinoma (TCGA-STAD) data collection is part of a larger effort to build a research community focused on connecting cancer phenotypes to genotypes by providing clinical images matched to subjects from The Cancer Genome Atlas (TCGA). Clinical, genetic, and pathological data reside in the Genomic Data Commons (GDC) Data Portal, while the radiological data are stored on The Cancer Imaging Archive (TCIA).
Matched TCGA patient identifiers allow researchers to explore the TCGA/TCIA databases for correlations between tissue genotype, radiological phenotype and patient outcomes. Tissues for TCGA were collected from many sites all over the world in order to reach their accrual targets, usually around 500 specimens per cancer type. For this reason the image data sets are also extremely heterogeneous in terms of scanner modalities, manufacturers and acquisition protocols. In most cases the images were acquired as part of routine care and not as part of a controlled research study or clinical trial.
CIP TCGA Radiology Initiative
Imaging Source Site (ISS) Groups are being populated and governed by participants from institutions that have provided imaging data to the archive for a given cancer type. Modeled after TCGA analysis groups, ISS groups are given the opportunity to publish a marker paper for a given cancer type per the guidelines in the table above. This opportunity will generate increased participation in building these multi-institutional data sets as they become an open community resource. Learn more about the CIP TCGA Radiology Initiative.
Table_1_A novel risk score based on immune-related genes for hepatocellular...
frontiersin.figshare.com
datasetcatalog.nlm.nih.gov
docx
Updated Jun 13, 2023
Meiying Long; Zihan Zhou; Xueyan Wei; Qiuling Lin; Moqin Qiu; Yunxiang Zhou; Peiqin Chen; Yanji Jiang; Qiuping Wen; Yingchun Liu; Runwei Li; Xianguo Zhou; Hongping Yu (2023). Table_1_A novel risk score based on immune-related genes for hepatocellular carcinoma as a reliable prognostic biomarker and correlated with immune infiltration.docx [Dataset]. http://doi.org/10.3389/fimmu.2022.1023349.s003
Available download formats: docx
Unique identifier
https://doi.org/10.3389/fimmu.2022.1023349.s003
Dataset updated
Jun 13, 2023
Dataset provided by
Frontiers
Authors
Meiying Long; Zihan Zhou; Xueyan Wei; Qiuling Lin; Moqin Qiu; Yunxiang Zhou; Peiqin Chen; Yanji Jiang; Qiuping Wen; Yingchun Liu; Runwei Li; Xianguo Zhou; Hongping Yu
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Background

Immunological-related genes (IRGs) play a critical role in the immune microenvironment of tumors. Our study aimed to develop an IRG-based survival prediction model for hepatocellular carcinoma (HCC) patients and to investigate the impact of IRGs on the immune microenvironment.

Methods

Differentially expressed IRGs were obtained from The Genomic Data Commons Data Portal (TCGA) and the immunology database and analysis portal (ImmPort). Univariate Cox regression was used to identify the IRGs linked to overall survival (OS), and a Lasso-regularized Cox proportional hazard model was constructed. The International Cancer Genome Consortium (ICGC) database was used to verify the prediction model. ESTIMATE and CIBERSORT were used to estimate immune cell infiltration in the tumor immune microenvironment (TIME). RNA sequencing was performed on HCC tissue specimens to confirm mRNA expression.

Results

A total of 401 differentially expressed IRGs were identified, and univariate Cox regression analyses found 63 IRGs related to OS among the 237 up-regulated IRGs. Finally, five IRGs were selected by the LASSO Cox model: SPP1, BIRC5, STC2, GLP1R, and RAET1E. This prognostic model demonstrated satisfactory predictive value in the ICGC dataset. The risk score was an independent predictor for OS in HCC patients. Immune-related analysis showed that the immune infiltration level in the high-risk group was higher, suggesting that the 5-IRG signature may play an important role in mediating immune escape and immune resistance in the TIME of HCC. Finally, we confirmed that the 5-IRG signature is highly expressed in 65 HCC patients, with good predictive power.

Conclusion

We established and verified a new prognosis model for HCC patients based on survival-related IRGs, and the signature could provide new insights into the prognosis of HCC.
Raw and processed data for studying chaperon-client interactions in 12...
figshare.com
zip
Updated Aug 14, 2023
Shai Pilosof; Barak Rotblat; Geut Galai (2023). Raw and processed data for studying chaperon-client interactions in 12 cancer types. [Dataset]. http://doi.org/10.6084/m9.figshare.22779755.v2
Available download formats: zip
Unique identifier
https://doi.org/10.6084/m9.figshare.22779755.v2
Dataset updated
Aug 14, 2023
Dataset provided by
Figshare (http://figshare.com/)
Authors
Shai Pilosof; Barak Rotblat; Geut Galai
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This data accompanies the paper "Ecological network analysis reveals cancer-dependent chaperone-client interaction structure and robustness", by Geut Galai, Xie He, Barak Rotblat, Shai Pilosof, published in Nature Communications. Please cite the paper when using the data.

All users must read the paper to understand how the data were obtained and processed, and their limitations. Data comes without warranty. Licence is CC BY-NC-SA (Attribution-NonCommercial-ShareAlike): This license lets you remix, tweak, and build upon this work non-commercially, as long as you credit the authors and license the new creations under the identical terms.

All the computational processes related to data derivation and analysis are in the GitHub repository that accompanies the paper.

Raw data (raw.zip)

Gene-level transcriptome profiling (RNA-Seq) data (in the form of HTSeq - FPKM) that was downloaded from The Cancer Genome Atlas (TCGA) using the Genomic Data Commons Data Portal (https://portal.gdc.cancer.gov).

Human protein expression data that was downloaded from the string-db.org database, and from published papers as follows.

File: 12192_2020_1080_MOESM4_ESM.xlsx. Source: Bie AS, Cömert C, Körner R, Corydon TJ, Palmfeldt J, Hipp MS, et al. An inventory of interactors of the human HSP60/HSP10 chaperonin in the mitochondrial matrix space. Cell Stress Chaperones. 2020;25: 407–416. doi:10.1007/s12192-020-01080-6

File: 41467_2013_BFncomms3139_MOESM481_ESM.xls. Source: Chae YC, Angelin A, Lisanti S, Kossenkov AV, Speicher KD, Wang H, et al. Landscape of the mitochondrial Hsp90 metabolome in tumours. Nat Commun. 2013;4: 2139. doi:10.1038/ncomms3139

File: 12915_2020_740_MOESM8_ESM.xlsx. Source: Joshi A, Dai L, Liu Y, Lee J, Ghahhari NM, Segala G, et al. The mitochondrial HSP90 paralog TRAP1 forms an OXPHOS-regulated tetramer and is involved in mitochondrial metabolic homeostasis. BMC Biol. 2020;18: 10. doi:10.1186/s12915-020-0740-7

File: mmc2.xlsx. Source: Ishizawa J, Zarabi SF, Davis RE, Halgas O, Nii T, Jitkova Y, et al. Mitochondrial ClpP-Mediated Proteolysis Induces Selective Cancer Cell Lethality. Cancer Cell. 2019;35: 721–737.e9. doi:10.1016/j.ccell.2019.03.014

Processed data (processed.zip)

The network data. Rows are chaperones, columns are clients.

Source data

File: Source Data for Figures and Tables.zip

This is the source data underlying the figures and tables, as requested by Nature Communications.


(2022). Genomic Data Commons Data Portal (GDC Data Portal) [Dataset]. http://identifiers.org/RRID:SCR_014514

Genomic Data Commons Data Portal (GDC Data Portal)

RRID:SCR_014514, Genomic Data Commons Data Portal (GDC Data Portal) (RRID:SCR_014514), Genomic Data Commons Data Portal, GDC Data Portal

Explore at:

81 scholarly articles cite this dataset (View in Google Scholar)

Unique identifier

https://identifiers.org/RRID:SCR_014514

Dataset updated

Jan 29, 2022

Description

A unified data repository of the National Cancer Institute (NCI)'s Genomic Data Commons (GDC) that enables data sharing across cancer genomic studies in support of precision medicine. The GDC supports several cancer genome programs at the NCI Center for Cancer Genomics (CCG), including The Cancer Genome Atlas (TCGA), Therapeutically Applicable Research to Generate Effective Treatments (TARGET), and the Cancer Genome Characterization Initiative (CGCI). The GDC Data Portal provides a platform for efficiently querying and downloading high quality and complete data. The GDC also provides a GDC Data Transfer Tool and a GDC API for programmatic access.
