100+ datasets found

The Cancer Genome Atlas Breast Invasive Carcinoma Collection
cancerimagingarchive.net
dicom, n/a
Updated Feb 2, 2014
Cite
The Cancer Imaging Archive (2014). The Cancer Genome Atlas Breast Invasive Carcinoma Collection [Dataset]. http://doi.org/10.7937/K9/TCIA.2016.AB2NAZRP
Explore at:
Available download formats: n/a, dicom
Unique identifier
https://doi.org/10.7937/K9/TCIA.2016.AB2NAZRP
Dataset updated
Feb 2, 2014
Dataset authored and provided by
The Cancer Imaging Archive
License
https://www.cancerimagingarchive.net/data-usage-policies-and-restrictions/
Time period covered
May 29, 2020
Dataset funded by
National Cancer Institute: http://www.cancer.gov/
Description
The Cancer Genome Atlas Breast Invasive Carcinoma (TCGA-BRCA) data collection is part of a larger effort to build a research community focused on connecting cancer phenotypes to genotypes by providing clinical images matched to subjects from The Cancer Genome Atlas (TCGA). Clinical, genetic, and pathological data resides in the Genomic Data Commons (GDC) Data Portal while the radiological data is stored on The Cancer Imaging Archive (TCIA).
Matched TCGA patient identifiers allow researchers to explore the TCGA/TCIA databases for correlations between tissue genotype, radiological phenotype and patient outcomes. Tissues for TCGA were collected from many sites all over the world in order to reach their accrual targets, usually around 500 specimens per cancer type. For this reason the image data sets are also extremely heterogeneous in terms of scanner modalities, manufacturers and acquisition protocols. In most cases the images were acquired as part of routine care and not as part of a controlled research study or clinical trial.
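The matching of patient identifiers described above can be sketched as follows. This is a minimal illustration, assuming identifiers follow the usual TCGA barcode layout in which the first three dash-separated fields (for example `TCGA-A2-A0T2`) name the patient. The function names and the example barcodes are hypothetical and are not taken from TCIA or GDC tooling.

```python
def patient_id(barcode: str) -> str:
    """Return the patient-level prefix of a TCGA barcode (first three fields)."""
    parts = barcode.strip().upper().split("-")
    if len(parts) < 3 or parts[0] != "TCGA":
        raise ValueError(f"not a TCGA barcode: {barcode!r}")
    return "-".join(parts[:3])


def match_patients(imaging_ids, sample_barcodes):
    """Map each patient found in both sources to that patient's sample barcodes."""
    imaging = {patient_id(i) for i in imaging_ids}
    matched = {}
    for barcode in sample_barcodes:
        pid = patient_id(barcode)
        if pid in imaging:
            matched.setdefault(pid, []).append(barcode)
    return matched


# Hypothetical identifiers, for illustration only.
imaging_ids = ["TCGA-A2-A0T2"]
sample_barcodes = ["TCGA-A2-A0T2-01A-11R-A084-07", "TCGA-BH-A0B3-01A"]
print(match_patients(imaging_ids, sample_barcodes))
```

Keying on the patient-level prefix rather than the full barcode lets several samples from one patient attach to the same imaging subject.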
CIP TCGA Radiology Initiative
Imaging Source Site (ISS) Groups are being populated and governed by participants from institutions that have provided imaging data to the archive for a given cancer type. Modeled after TCGA analysis groups, ISS groups are given the opportunity to publish a marker paper for a given cancer type per the guidelines in the table above. This opportunity will generate increased participation in building these multi-institutional data sets as they become an open community resource. Learn more about the TCGA Breast Phenotype Research Group.
TCGA-Cancer-Variant-and-Clinical-Data
huggingface.co
Updated Jan 6, 2026
+ more versions
Cite
Hammad Ali (2026). TCGA-Cancer-Variant-and-Clinical-Data [Dataset]. https://huggingface.co/datasets/hammad655/TCGA-Cancer-Variant-and-Clinical-Data
Explore at:
Dataset updated
Jan 6, 2026
Authors
Hammad Ali
License
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
Description
TCGA Cancer Variant and Clinical Data

Dataset Description

This dataset combines genetic variant information at the protein level with clinical data from The Cancer Genome Atlas (TCGA) project, curated by the International Cancer Genome Consortium (ICGC). It provides a comprehensive view of protein-altering mutations and clinical characteristics across various cancer types.

Dataset Summary

The dataset includes:

Protein sequence data for both mutated and… See the full description on the dataset page: https://huggingface.co/datasets/hammad655/TCGA-Cancer-Variant-and-Clinical-Data.
The Cancer Genome Atlas Rectum Adenocarcinoma Collection
cancerimagingarchive.net
dicom, n/a
Updated Jan 5, 2016
Cite
The Cancer Imaging Archive (2016). The Cancer Genome Atlas Rectum Adenocarcinoma Collection [Dataset]. http://doi.org/10.7937/K9/TCIA.2016.F7PPNPNU
Explore at:
Available download formats: dicom, n/a
Unique identifier
https://doi.org/10.7937/K9/TCIA.2016.F7PPNPNU
Dataset updated
Jan 5, 2016
Dataset authored and provided by
The Cancer Imaging Archive
License
https://www.cancerimagingarchive.net/data-usage-policies-and-restrictions/
Time period covered
May 29, 2020
Dataset funded by
National Cancer Institute: http://www.cancer.gov/
Description
The Cancer Genome Atlas Rectum Adenocarcinoma (TCGA-READ) data collection is part of a larger effort to build a research community focused on connecting cancer phenotypes to genotypes by providing clinical images matched to subjects from The Cancer Genome Atlas (TCGA). Clinical, genetic, and pathological data resides in the Genomic Data Commons (GDC) Data Portal while the radiological data is stored on The Cancer Imaging Archive (TCIA).
Matched TCGA patient identifiers allow researchers to explore the TCGA/TCIA databases for correlations between tissue genotype, radiological phenotype and patient outcomes. Tissues for TCGA were collected from many sites all over the world in order to reach their accrual targets, usually around 500 specimens per cancer type. For this reason the image data sets are also extremely heterogeneous in terms of scanner modalities, manufacturers and acquisition protocols. In most cases the images were acquired as part of routine care and not as part of a controlled research study or clinical trial.
CIP TCGA Radiology Initiative
Imaging Source Site (ISS) Groups are being populated and governed by participants from institutions that have provided imaging data to the archive for a given cancer type. Modeled after TCGA analysis groups, ISS groups are given the opportunity to publish a marker paper for a given cancer type per the guidelines in the table above. This opportunity will generate increased participation in building these multi-institutional data sets as they become an open community resource. Learn more about the CIP TCGA Radiology Initiative.
TCGA Kidney Renal Clear Cell Carcinoma (KIRC) Clinical Data
data.niaid.nih.gov
data-staging.niaid.nih.gov
+1 more
Updated Jul 29, 2023
+ more versions
Cite
Swati Baskiyar (2023). TCGA Kidney Renal Clear Cell Carcinoma (KIRC) Clinical Data [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_8190145
Explore at:
Dataset updated
Jul 29, 2023
Authors
Swati Baskiyar
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Abstract:

The Cancer Genome Atlas (TCGA) was a large-scale collaborative project initiated by the National Cancer Institute (NCI) and the National Human Genome Research Institute (NHGRI). It aimed to comprehensively characterize the genomic and molecular landscape of various cancer types. This dataset includes curated survival data from the Pan-cancer Atlas paper titled "An Integrated TCGA Pan-Cancer Clinical Data Resource (TCGA-CDR) to drive high quality survival outcome analytics". The paper highlights four types of carefully curated survival endpoints, and recommends the use of the endpoints of OS, PFI, DFI, and DSS for each TCGA cancer type. The dataset also includes phenotypic information about KIRC. The Sample IDs are unique identifiers, which can be paired with the gene expression dataset.

Inspiration:

This dataset was uploaded to U-BRITE for the GTKB project.

Instruction:

The survival and phenotype data were merged into one file. Empty columns were removed. Columns with the same value for every sample were also removed.
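The three preparation steps above (merge, drop empty columns, drop constant columns) can be sketched as below. This is an illustrative reconstruction only, not the uploader's actual script. The join key `sample` and the column names in the example rows are assumptions, and tables are represented as lists of row dictionaries to keep the sketch dependency-free.

```python
def merge_and_prune(survival, phenotype, key="sample"):
    """Join two lists of row dicts on `key`, then drop empty and constant columns."""
    merged = {}
    for row in list(survival) + list(phenotype):
        merged.setdefault(row[key], {}).update(row)
    rows = list(merged.values())

    # Collect column names in first-seen order.
    columns = []
    for row in rows:
        for name in row:
            if name not in columns:
                columns.append(name)

    kept = []
    for name in columns:
        values = [row.get(name) for row in rows]
        is_empty = all(v in (None, "") for v in values)
        is_constant = len(set(values)) == 1
        if name == key or not (is_empty or is_constant):
            kept.append(name)
    return [{name: row.get(name) for name in kept} for row in rows]


# Hypothetical rows, for illustration only.
survival = [{"sample": "s1", "OS": 1, "OS.time": 100},
            {"sample": "s2", "OS": 0, "OS.time": 250}]
phenotype = [{"sample": "s1", "gender": "F", "notes": None, "project": "KIRC"},
             {"sample": "s2", "gender": "M", "notes": None, "project": "KIRC"}]
for row in merge_and_prune(survival, phenotype):
    print(row)
```

Note that with a single sample every column is trivially constant, so a filter like this only makes sense on tables with at least two rows.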

Acknowledgments:

Goldman, M.J., Craft, B., Hastie, M. et al. Visualizing and interpreting cancer genomics data via the Xena platform. Nat Biotechnol (2020). https://doi.org/10.1038/s41587-020-0546-8

Liu, Jianfang, Caesar-Johnson, Samantha J. et al. An Integrated TCGA Pan-Cancer Clinical Data Resource to Drive High-Quality Survival Outcome Analytics. Cell, Volume 173, Issue 2, 400 - 416.e11. https://doi.org/10.1016/j.cell.2018.02.052

The Cancer Genome Atlas Research Network., Weinstein, J., Collisson, E. et al. The Cancer Genome Atlas Pan-Cancer analysis project. Nat Genet 45, 1113–1120 (2013). https://doi.org/10.1038/ng.2764

U-BRITE last update: 07/13/2023
Cancer Categories and clinical research figures
kaggle.com
zip
Updated Oct 16, 2025
Cite
DrAHung (2025). Cancer Categories and clinical research figures [Dataset]. https://www.kaggle.com/datasets/drahung/cancer-categories-and-clinical-research-figures
Explore at:
Available download formats: zip (996208 bytes)
Dataset updated
Oct 16, 2025
Authors
DrAHung
Description
This dataset integrates open public data from multiple biomedical sources to provide a structured, queryable database of cancer classifications and clinical data from The Cancer Genome Atlas (TCGA).

All data are de-identified and publicly available via the U.S. National Cancer Institute (NCI) Genomic Data Commons (GDC) API, ensuring full compliance with NIH open-access guidelines.
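As a rough sketch of what a request to the GDC API for one project's clinical records might look like, the snippet below only builds the query and does not send it. The endpoint `https://api.gdc.cancer.gov/cases`, the filter field `project.project_id`, and the requested field names are assumptions about the public GDC API, not details given in this dataset's description; check them against the GDC documentation before use.

```python
import json
from urllib.parse import urlencode

# Assumed public GDC endpoint for case (patient) records.
GDC_CASES_ENDPOINT = "https://api.gdc.cancer.gov/cases"


def build_cases_params(project_id, fields, size=10):
    """Build query parameters selecting cases from a single project."""
    filters = {
        "op": "in",
        "content": {"field": "project.project_id", "value": [project_id]},
    }
    return {
        "filters": json.dumps(filters),
        "fields": ",".join(fields),
        "format": "JSON",
        "size": str(size),
    }


def build_cases_url(project_id, fields, size=10):
    """Return the full GET URL for the query (nothing is sent over the network)."""
    return f"{GDC_CASES_ENDPOINT}?{urlencode(build_cases_params(project_id, fields, size))}"


# Field names below are illustrative assumptions.
print(build_cases_url("TCGA-KIRC", ["submitter_id", "demographic.gender"], size=5))
```

Keeping query construction separate from the HTTP call makes it easy to inspect or log the request before fetching any data.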

Included tables:

cancer_category: Disease Ontology (DOID) categories and hierarchical labels (including English + Chinese translations).
patient_tcga_clinical: De-identified patient clinical records per TCGA project (demographics, stage, grade, survival, treatment).
tcga_project_summary: Per-project summary statistics (case counts, survival averages, tumor stage/grade coverage, and mapped cancer type).
tcga_project: TCGA project metadata with links to DOID cancer categories.

Data source is from The Cancer Genome Atlas (TCGA).

[Figure: a snapshot of clinical data.]
TCGA Gene Expression Datasets
data.niaid.nih.gov
Updated Jul 29, 2023
Cite
Swati Baskiyar (2023). TCGA Gene Expression Datasets [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_8192915
Explore at:
Dataset updated
Jul 29, 2023
Authors
Swati Baskiyar
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Abstract:

The Cancer Genome Atlas (TCGA) was a large-scale collaborative project initiated by the National Cancer Institute (NCI) and the National Human Genome Research Institute (NHGRI). It aimed to comprehensively characterize the genomic and molecular landscape of various cancer types. These datasets contain gene expression profiles of bladder urothelial carcinoma (BLCA), cervical squamous cell carcinoma and endocervical adenocarcinoma (CESC), glioblastoma multiforme (GBM), head & neck squamous cell carcinoma (HNSC), kidney renal clear cell carcinoma (KIRC), and lower grade glioma (LGG).

The gene expression profiles for BLCA, CESC, HNSC, KIRC, and LGG were measured experimentally using the Illumina HiSeq 2000 RNA Sequencing platform by the University of North Carolina TCGA genome characterization center. The gene expression profile of the GBM dataset was measured experimentally using the Affymetrix HT Human Genome U133a microarray platform by the Broad Institute of MIT and Harvard University cancer genomic characterization center.

Inspiration:

This dataset was uploaded to U-BRITE for the GTKB project.

Instruction:

The log2(x+1) normalization was removed, and z-normalization was performed on the BLCA, CESC, HNSC, KIRC, and LGG datasets.

The log2(x) normalization was removed, and z-normalization was performed on the GBM dataset.
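The two transformations described above can be sketched as follows: inverting y = log2(x + 1) gives x = 2^y - 1 (or x = 2^y for plain log2), after which values are centred and scaled. This is a hedged illustration only. The description does not say whether z-normalization was applied per gene or per sample, nor which standard deviation was used, so the population standard deviation over a single list of values is an assumption here.

```python
import statistics


def undo_log2(values, pseudocount=1.0):
    """Invert y = log2(x + pseudocount); pass pseudocount=0 for plain log2(x)."""
    return [2.0 ** y - pseudocount for y in values]


def z_normalize(values):
    """Centre to mean 0 and scale to unit (population) standard deviation."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    if sd == 0:
        # A constant vector carries no variation; map it to zeros.
        return [0.0 for _ in values]
    return [(v - mean) / sd for v in values]


# Hypothetical log2(x + 1) values for one gene across three samples.
logged = [0.0, 1.0, 2.0]
raw = undo_log2(logged)          # RNA-seq style: log2(x + 1)
print(z_normalize(raw))

# Microarray style: plain log2(x), so no pseudocount is subtracted.
print(z_normalize(undo_log2(logged, pseudocount=0.0)))
```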

Acknowledgments:

Goldman, M.J., Craft, B., Hastie, M. et al. Visualizing and interpreting cancer genomics data via the Xena platform. Nat Biotechnol (2020). https://doi.org/10.1038/s41587-020-0546-8.

The Cancer Genome Atlas Research Network., Weinstein, J., Collisson, E. et al. The Cancer Genome Atlas Pan-Cancer analysis project. Nat Genet 45, 1113–1120 (2013). https://doi.org/10.1038/ng.2764.

U-BRITE last update: 07/13/2023
TCGA Lower Grade Glioma (LGG) Clinical Data
zenodo.org
data-staging.niaid.nih.gov
csv
Updated Jul 29, 2023
Cite
Swati Baskiyar (2023). TCGA Lower Grade Glioma (LGG) Clinical Data [Dataset]. http://doi.org/10.5281/zenodo.8190154
Explore at:
Available download formats: csv
Unique identifier
https://doi.org/10.5281/zenodo.8190154
Dataset updated
Jul 29, 2023
Dataset provided by
Zenodo: http://zenodo.org/
Authors
Swati Baskiyar
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Abstract:

The Cancer Genome Atlas (TCGA) was a large-scale collaborative project initiated by the National Cancer Institute (NCI) and the National Human Genome Research Institute (NHGRI). It aimed to comprehensively characterize the genomic and molecular landscape of various cancer types. This dataset includes curated survival data from the Pan-cancer Atlas paper titled "An Integrated TCGA Pan-Cancer Clinical Data Resource (TCGA-CDR) to drive high quality survival outcome analytics". The paper highlights four types of carefully curated survival endpoints, and recommends the use of the endpoints of OS, PFI, DFI, and DSS for each TCGA cancer type. The dataset also includes phenotypic information about LGG. The Sample IDs are unique identifiers, which can be paired with the gene expression dataset.

Inspiration:

This dataset was uploaded to U-BRITE for the GTKB project.

Instruction:

The survival and phenotype data were merged into one file. Empty columns were removed. Columns with the same value for every sample were also removed.

Acknowledgments:

Goldman, M.J., Craft, B., Hastie, M. et al. Visualizing and interpreting cancer genomics data via the Xena platform. Nat Biotechnol (2020). https://doi.org/10.1038/s41587-020-0546-8

Liu, Jianfang, Caesar-Johnson, Samantha J. et al. An Integrated TCGA Pan-Cancer Clinical Data Resource to Drive High-Quality Survival Outcome Analytics. Cell, Volume 173, Issue 2, 400 - 416.e11. https://doi.org/10.1016/j.cell.2018.02.052

The Cancer Genome Atlas Research Network., Weinstein, J., Collisson, E. et al. The Cancer Genome Atlas Pan-Cancer analysis project. Nat Genet 45, 1113–1120 (2013). https://doi.org/10.1038/ng.2764

U-BRITE last update: 07/13/2023
The Cancer Genome Atlas Lung Adenocarcinoma Collection
cancerimagingarchive.net
dicom, n/a
Updated Jan 30, 2017
+ more versions
Cite
The Cancer Imaging Archive (2017). The Cancer Genome Atlas Lung Adenocarcinoma Collection [Dataset]. http://doi.org/10.7937/K9/TCIA.2016.JGNIHEP5
Explore at:
Available download formats: n/a, dicom
Unique identifier
https://doi.org/10.7937/K9/TCIA.2016.JGNIHEP5
Dataset updated
Jan 30, 2017
Dataset authored and provided by
The Cancer Imaging Archive
License
https://www.cancerimagingarchive.net/data-usage-policies-and-restrictions/
Time period covered
May 29, 2020
Dataset funded by
National Cancer Institute: http://www.cancer.gov/
Description
The Cancer Genome Atlas Lung Adenocarcinoma (TCGA-LUAD) data collection is part of a larger effort to build a research community focused on connecting cancer phenotypes to genotypes by providing clinical images matched to subjects from The Cancer Genome Atlas (TCGA). Clinical, genetic, and pathological data resides in the Genomic Data Commons (GDC) Data Portal while the radiological data is stored on The Cancer Imaging Archive (TCIA).
Matched TCGA patient identifiers allow researchers to explore the TCGA/TCIA databases for correlations between tissue genotype, radiological phenotype and patient outcomes. Tissues for TCGA were collected from many sites all over the world in order to reach their accrual targets, usually around 500 specimens per cancer type. For this reason the image data sets are also extremely heterogeneous in terms of scanner modalities, manufacturers and acquisition protocols. In most cases the images were acquired as part of routine care and not as part of a controlled research study or clinical trial.
CIP TCGA Radiology Initiative
Imaging Source Site (ISS) Groups are being populated and governed by participants from institutions that have provided imaging data to the archive for a given cancer type. Modeled after TCGA analysis groups, ISS groups are given the opportunity to publish a marker paper for a given cancer type per the guidelines in the table above. This opportunity will generate increased participation in building these multi-institutional data sets as they become an open community resource. Learn more about the TCGA Lung Phenotype Research Group.
TCGA Head & Neck Squamous Cell Carcinoma (HNSC) Clinical Data
data.niaid.nih.gov
zenodo.org
Updated Jul 29, 2023
+ more versions
Cite
Swati Baskiyar (2023). TCGA Head & Neck Squamous Cell Carcinoma (HNSC) Clinical Data [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_8190126
Explore at:
Dataset updated
Jul 29, 2023
Authors
Swati Baskiyar
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Abstract:

The Cancer Genome Atlas (TCGA) was a large-scale collaborative project initiated by the National Cancer Institute (NCI) and the National Human Genome Research Institute (NHGRI). It aimed to comprehensively characterize the genomic and molecular landscape of various cancer types. This dataset includes curated survival data from the Pan-cancer Atlas paper titled "An Integrated TCGA Pan-Cancer Clinical Data Resource (TCGA-CDR) to drive high quality survival outcome analytics". The paper highlights four types of carefully curated survival endpoints, and recommends the use of the endpoints of OS, PFI, DFI, and DSS for each TCGA cancer type. The dataset also includes phenotypic information about HNSC. The Sample IDs are unique identifiers, which can be paired with the gene expression dataset.

Inspiration:

This dataset was uploaded to U-BRITE for the GTKB project.

Instruction:

The survival and phenotype data were merged into one file. Empty columns were removed. Columns with the same value for every sample were also removed.

Acknowledgments:

Goldman, M.J., Craft, B., Hastie, M. et al. Visualizing and interpreting cancer genomics data via the Xena platform. Nat Biotechnol (2020). https://doi.org/10.1038/s41587-020-0546-8

Liu, Jianfang, Caesar-Johnson, Samantha J. et al. An Integrated TCGA Pan-Cancer Clinical Data Resource to Drive High-Quality Survival Outcome Analytics. Cell, Volume 173, Issue 2, 400 - 416.e11. https://doi.org/10.1016/j.cell.2018.02.052

The Cancer Genome Atlas Research Network., Weinstein, J., Collisson, E. et al. The Cancer Genome Atlas Pan-Cancer analysis project. Nat Genet 45, 1113–1120 (2013). https://doi.org/10.1038/ng.2764

U-BRITE last update: 07/13/2023
A collection of Whole-genome sequencing files from the Cancer Genome Atlas...
datacatalog.mskcc.org
Updated Jul 26, 2021
Cite
The Cancer Genome Atlas (TCGA) (2021). A collection of Whole-genome sequencing files from the Cancer Genome Atlas program on Adenocarcinoma, filtered from the GDC Data Portal [Dataset]. https://datacatalog.mskcc.org/dataset/10777
Explore at:
Dataset updated
Jul 26, 2021
Dataset provided by
The Cancer Genome Atlas (TCGA)
MSK Library
Description
The GDC Data Portal is a robust data-driven platform that allows cancer researchers and bioinformaticians to search and download cancer data for analysis. This dataset is a filtered search result in the GDC Data Portal for TCGA Project, Adenocarcinoma, Whole Genome Sequencing Reads. It consists of 196 BAM files and 99 cases.
TCGA Cervical Squamous Cell Carcinoma and Endocervical Adenocarcinoma (CESC)...
data.niaid.nih.gov
Updated Jul 29, 2023
+ more versions
Cite
Swati Baskiyar (2023). TCGA Cervical Squamous Cell Carcinoma and Endocervical Adenocarcinoma (CESC) Clinical Data [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_8190025
Explore at:
Dataset updated
Jul 29, 2023
Authors
Swati Baskiyar
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Abstract:

The Cancer Genome Atlas (TCGA) was a large-scale collaborative project initiated by the National Cancer Institute (NCI) and the National Human Genome Research Institute (NHGRI). It aimed to comprehensively characterize the genomic and molecular landscape of various cancer types. This dataset includes curated survival data from the Pan-cancer Atlas paper titled "An Integrated TCGA Pan-Cancer Clinical Data Resource (TCGA-CDR) to drive high quality survival outcome analytics". The paper highlights four types of carefully curated survival endpoints, and recommends the use of the endpoints of OS, PFI, DFI, and DSS for each TCGA cancer type. The dataset also includes phenotypic information about CESC. The Sample IDs are unique identifiers, which can be paired with the gene expression dataset.

Inspiration:

This dataset was uploaded to U-BRITE for the GTKB project.

Instruction:

The survival and phenotype data were merged into one file.

Acknowledgments:

Goldman, M.J., Craft, B., Hastie, M. et al. Visualizing and interpreting cancer genomics data via the Xena platform. Nat Biotechnol (2020). https://doi.org/10.1038/s41587-020-0546-8

Liu, Jianfang, Caesar-Johnson, Samantha J. et al. An Integrated TCGA Pan-Cancer Clinical Data Resource to Drive High-Quality Survival Outcome Analytics. Cell, Volume 173, Issue 2, 400 - 416.e11. https://doi.org/10.1016/j.cell.2018.02.052

The Cancer Genome Atlas Research Network., Weinstein, J., Collisson, E. et al. The Cancer Genome Atlas Pan-Cancer analysis project. Nat Genet 45, 1113–1120 (2013). https://doi.org/10.1038/ng.2764

U-BRITE last update: 07/13/2023
TCGA-PAAD
huggingface.co
Updated Dec 3, 2025
Cite
HLMCC (2025). TCGA-PAAD [Dataset]. https://huggingface.co/datasets/HLMCC/TCGA-PAAD
Explore at:
Croissant: Croissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
Dataset updated
Dec 3, 2025
Authors
HLMCC
Description
Dataset Card for TCGA-PAAD Clinical Data

Dataset Summary

The TCGA-PAAD (The Cancer Genome Atlas - Pancreatic Adenocarcinoma) clinical dataset contains clinical data related to pancreatic adenocarcinoma patients. This dataset is part of the broader TCGA project, aimed at providing comprehensive genomic and clinical data for various types of cancer. The clinical data includes information such as patient demographics, treatment history, survival data, and other clinical… See the full description on the dataset page: https://huggingface.co/datasets/HLMCC/TCGA-PAAD.
The Cancer Genome Atlas Ovarian Cancer Collection
cancerimagingarchive.net
dicom, n/a
Updated May 29, 2020
Cite
The Cancer Imaging Archive (2020). The Cancer Genome Atlas Ovarian Cancer Collection [Dataset]. http://doi.org/10.7937/K9/TCIA.2016.NDO1MDFQ
Explore at:
Available download formats: n/a, dicom
Unique identifier
https://doi.org/10.7937/K9/TCIA.2016.NDO1MDFQ
Dataset updated
May 29, 2020
Dataset authored and provided by
The Cancer Imaging Archive
License
https://www.cancerimagingarchive.net/data-usage-policies-and-restrictions/
Time period covered
May 29, 2020
Dataset funded by
National Cancer Institute: http://www.cancer.gov/
Description
The Cancer Genome Atlas Ovarian Cancer (TCGA-OV) data collection is part of a larger effort to build a research community focused on connecting cancer phenotypes to genotypes by providing clinical images matched to subjects from The Cancer Genome Atlas (TCGA). Clinical, genetic, and pathological data resides in the Genomic Data Commons (GDC) Data Portal while the radiological data is stored on The Cancer Imaging Archive (TCIA).
Matched TCGA patient identifiers allow researchers to explore the TCGA/TCIA databases for correlations between tissue genotype, radiological phenotype and patient outcomes. Tissues for TCGA were collected from many sites all over the world in order to reach their accrual targets, usually around 500 specimens per cancer type. For this reason the image data sets are also extremely heterogeneous in terms of scanner modalities, manufacturers and acquisition protocols. In most cases the images were acquired as part of routine care and not as part of a controlled research study or clinical trial.
CIP TCGA Radiology Initiative
Imaging Source Site (ISS) Groups are being populated and governed by participants from institutions that have provided imaging data to the archive for a given cancer type. Modeled after TCGA analysis groups, ISS groups are given the opportunity to publish a marker paper for a given cancer type per the guidelines in the table above. This opportunity will generate increased participation in building these multi-institutional data sets as they become an open community resource. Learn more about the TCGA Ovarian Phenotype Research Group.
TCGA Bladder Urothelial Carcinoma (BLCA) Clinical Data
data-staging.niaid.nih.gov
Updated Jul 29, 2023
Cite
Swati Baskiyar (2023). TCGA Bladder Urothelial Carcinoma (BLCA) Clinical Data [Dataset]. https://data-staging.niaid.nih.gov/resources?id=zenodo_8189913
Explore at:
Dataset updated
Jul 29, 2023
Authors
Swati Baskiyar
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Abstract:

The Cancer Genome Atlas (TCGA) was a large-scale collaborative project initiated by the National Cancer Institute (NCI) and the National Human Genome Research Institute (NHGRI). It aimed to comprehensively characterize the genomic and molecular landscape of various cancer types. This dataset includes curated survival data from the Pan-cancer Atlas paper titled "An Integrated TCGA Pan-Cancer Clinical Data Resource (TCGA-CDR) to drive high quality survival outcome analytics". The paper highlights four types of carefully curated survival endpoints, and recommends the use of the endpoints of OS, PFI, DFI, and DSS for each TCGA cancer type. The dataset also includes phenotypic information about BLCA. The Sample IDs are unique identifiers, which can be paired with the gene expression dataset.

Inspiration:

This dataset was uploaded to U-BRITE for the GTKB project.

Instruction:

The survival and phenotype data were merged into one file.

Acknowledgments:

Liu, Jianfang, Caesar-Johnson, Samantha J. et al. An Integrated TCGA Pan-Cancer Clinical Data Resource to Drive High-Quality Survival Outcome Analytics. Cell, Volume 173, Issue 2, 400 - 416.e11. https://doi.org/10.1016/j.cell.2018.02.052

Goldman, M.J., Craft, B., Hastie, M. et al. Visualizing and interpreting cancer genomics data via the Xena platform. Nat Biotechnol (2020). https://doi.org/10.1038/s41587-020-0546-8

The Cancer Genome Atlas Research Network., Weinstein, J., Collisson, E. et al. The Cancer Genome Atlas Pan-Cancer analysis project. Nat Genet 45, 1113–1120 (2013). https://doi.org/10.1038/ng.2764

U-BRITE last update: 07/13/2023
TCGA Expedition Modules and associated TCGA Datatypes managed.
plos.figshare.com
xls
Updated May 31, 2023
Cite
Uma R. Chandran; Olga P. Medvedeva; M. Michael Barmada; Philip D. Blood; Anish Chakka; Soumya Luthra; Antonio Ferreira; Kim F. Wong; Adrian V. Lee; Zhihui Zhang; Robert Budden; J. Ray Scott; Annerose Berndt; Jeremy M. Berg; Rebecca S. Jacobson (2023). TCGA Expedition Modules and associated TCGA Datatypes managed. [Dataset]. http://doi.org/10.1371/journal.pone.0165395.t001
Explore at:
Available download formats: xls
Unique identifier
https://doi.org/10.1371/journal.pone.0165395.t001
Dataset updated
May 31, 2023
Dataset provided by
PLOS: http://plos.org/
Authors
Uma R. Chandran; Olga P. Medvedeva; M. Michael Barmada; Philip D. Blood; Anish Chakka; Soumya Luthra; Antonio Ferreira; Kim F. Wong; Adrian V. Lee; Zhihui Zhang; Robert Budden; J. Ray Scott; Annerose Berndt; Jeremy M. Berg; Rebecca S. Jacobson
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
TCGA Expedition Modules and associated TCGA Datatypes managed.
TCGA Pan-Cancer expression and mutation data for Project Cognoma
figshare.com
bz2
Updated Jun 3, 2023
+ more versions
Cite
Daniel Himmelstein; Gregory Way; Casey Greene (2023). TCGA Pan-Cancer expression and mutation data for Project Cognoma [Dataset]. http://doi.org/10.6084/m9.figshare.3487685.v1
Explore at:
Available download formats: bz2
Unique identifier
https://doi.org/10.6084/m9.figshare.3487685.v1
Dataset updated
Jun 3, 2023
Dataset provided by
Figshare: http://figshare.com/
Authors
Daniel Himmelstein; Gregory Way; Casey Greene
License
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Description
The following datasets were created for Project Cognoma:

expression-matrix.tsv.bz2 is a sample × gene matrix indicating a gene's expression level for a given sample. This dataset will be the feature/x/predictor for Project Cognoma.
mutation-matrix.tsv.bz2 is a sample × gene matrix indicating whether a gene is mutated for a given sample. Select columns (or unions of several columns) in this dataset will be the status/y/outcome for Project Cognoma.

These are preliminary datasets for development use and machine learning. The data was retrieved from the UCSC Xena Browser. All original work in the data is released under CC0. However, the license of TCGA and Xena data is currently unclear.

These two datasets are from this GitHub directory linked to below, although they were not tracked due to large file size.
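To illustrate how an outcome could be derived from a union of mutation-matrix columns, here is a minimal sketch. It assumes the files are tab-separated with a header row of gene identifiers and the sample identifier in the first column; that layout, the helper names, and the gene labels in the usage comment are assumptions, not details confirmed by the dataset description.

```python
import bz2
import csv


def read_matrix(path):
    """Read a sample x gene TSV (optionally .bz2) into {sample: {gene: value}}."""
    opener = bz2.open if str(path).endswith(".bz2") else open
    with opener(path, "rt", newline="") as handle:
        reader = csv.reader(handle, delimiter="\t")
        genes = next(reader)[1:]  # header row; first cell labels the sample column
        return {row[0]: dict(zip(genes, map(float, row[1:]))) for row in reader}


def union_status(mutations, genes):
    """Label a sample 1 if any of the selected genes is mutated, else 0."""
    return {
        sample: int(any(row[gene] for gene in genes))
        for sample, row in mutations.items()
    }


# Usage (hypothetical column labels; the real matrix may use other gene identifiers):
# mutations = read_matrix("mutation-matrix.tsv.bz2")
# y = union_status(mutations, ["TP53", "KRAS"])
```

Loading the full matrix into nested dictionaries is only practical for small files; for the real data a dataframe library would likely be a better fit.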
TCGA-PANCAN
kaggle.com
zip
Updated Mar 2, 2024
Cite
Ahmed Hamdi906 (2024). TCGA-PANCAN [Dataset]. https://www.kaggle.com/datasets/ahmedhamdi906/tcga-pancan/data
Explore at:
Available download formats: zip (70267089135 bytes)
Dataset updated
Mar 2, 2024
Authors
Ahmed Hamdi906
License
https://creativecommons.org/publicdomain/zero/1.0/
Description
The dataset contains multi-omics data from 32 TCGA Pan-Cancer projects, plus control data. Each case has RNA-seq, DNA methylation, and miRNA expression data, all in Parquet format. The data was collected from the GDC portal. Annotations include cancer type, tumor subtype, and tissue site. The annotations were collected from cBioPortal, and the tissue site data was cleaned. The starter notebook shows how to get started with the data.
TCGA BRCA cancer dataset
zenodo.org
portalinvestigacion.udc.gal
+1 more
bin
Updated Dec 11, 2020
Cite
Carlos Fernandez-Lozano (2020). TCGA BRCA cancer dataset [Dataset]. http://doi.org/10.5281/zenodo.4309168
Explore at:
Available download formats: bin
Unique identifier
https://doi.org/10.5281/zenodo.4309168
Dataset updated
Dec 11, 2020
Dataset provided by
Zenodo: http://zenodo.org/
Authors
Carlos Fernandez-Lozano
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Following the same steps that we used in the previous course, we downloaded the TCGA-BRCA data using R and Bioconductor, in particular the TCGAbiolinks package. We downloaded transcriptome profiling data (gene expression quantification) where the experimental strategy is RNA-Seq and the workflow type is HTSeq-FPKM-UQ, restricted to primary solid tumor data of the affymetrix GPL86 profile, together with clinical data.
The Cancer Genome Atlas Low Grade Glioma Collection
cancerimagingarchive.net
dicom, n/a
Updated Jan 5, 2016
Cite
The Cancer Imaging Archive (2016). The Cancer Genome Atlas Low Grade Glioma Collection [Dataset]. http://doi.org/10.7937/K9/TCIA.2016.L4LTD3TK
Explore at:
Available download formats: n/a, dicom
Unique identifier
https://doi.org/10.7937/K9/TCIA.2016.L4LTD3TK
Dataset updated
Jan 5, 2016
Dataset authored and provided by
The Cancer Imaging Archive
License
https://www.cancerimagingarchive.net/data-usage-policies-and-restrictions/
Time period covered
May 29, 2020
Dataset funded by
National Cancer Institute (http://www.cancer.gov/)
Description
The Cancer Genome Atlas Low Grade Glioma (TCGA-LGG) data collection is part of a larger effort to build a research community focused on connecting cancer phenotypes to genotypes by providing clinical images matched to subjects from The Cancer Genome Atlas (TCGA). Clinical, genetic, and pathological data resides in the Genomic Data Commons (GDC) Data Portal while the radiological data is stored on The Cancer Imaging Archive (TCIA).
Matched TCGA patient identifiers allow researchers to explore the TCGA/TCIA databases for correlations between tissue genotype, radiological phenotype and patient outcomes. Tissues for TCGA were collected from many sites all over the world in order to reach their accrual targets, usually around 500 specimens per cancer type. For this reason the image data sets are also extremely heterogeneous in terms of scanner modalities, manufacturers and acquisition protocols. In most cases the images were acquired as part of routine care and not as part of a controlled research study or clinical trial.
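As a rough illustration of matching on patient identifiers, the sketch below assumes that an imaging subject ID equals the first three hyphen-separated fields of a TCGA sample barcode (project, tissue source site, participant). The identifiers and the helper names `patient_id` and `match_subjects` are hypothetical placeholders, not real records or an official API.

```python
import re

# Patient-level prefix of a TCGA barcode: TCGA-<site>-<participant>.
BARCODE_PATIENT = re.compile(r"^(TCGA-[A-Z0-9]{2}-[A-Z0-9]{4})")


def patient_id(barcode):
    """Return the patient-level prefix of a barcode, or None if malformed."""
    match = BARCODE_PATIENT.match(barcode.strip().upper())
    return match.group(1) if match else None


def match_subjects(imaging_ids, sample_barcodes):
    """Map each imaging subject ID to the genomic sample barcodes it shares."""
    by_patient = {}
    for barcode in sample_barcodes:
        pid = patient_id(barcode)
        if pid is not None:
            by_patient.setdefault(pid, []).append(barcode)
    return {pid: by_patient[pid] for pid in imaging_ids if pid in by_patient}


# Hypothetical identifiers for illustration only.
imaging_subjects = ["TCGA-AA-0001", "TCGA-BB-0002"]
genomic_samples = [
    "TCGA-AA-0001-01A-11R-A034-07",
    "TCGA-CC-0003-01A-31R-A137-07",
    "not-a-barcode",
]

matched = match_subjects(imaging_subjects, genomic_samples)
```

Subjects with imaging but no genomic sample (here `TCGA-BB-0002`) are simply absent from the result, which makes gaps between the two archives easy to spot.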
CIP TCGA Radiology Initiative
Imaging Source Site (ISS) Groups are being populated and governed by participants from institutions that have provided imaging data to the archive for a given cancer type. Modeled after TCGA analysis groups, ISS groups are given the opportunity to publish a marker paper for a given cancer type per the guidelines in the table above. This opportunity will generate increased participation in building these multi-institutional data sets as they become an open community resource. Learn more about the TCGA Glioma Phenotype Research Group.
The Cancer Genome Atlas Sarcoma Collection
cancerimagingarchive.net
dicom, n/a
Updated Jan 5, 2016
Cite
The Cancer Imaging Archive (2016). The Cancer Genome Atlas Sarcoma Collection [Dataset]. http://doi.org/10.7937/K9/TCIA.2016.CX6YLSUX
Explore at:
Available download formats: dicom, n/a
Unique identifier
https://doi.org/10.7937/K9/TCIA.2016.CX6YLSUX
Dataset updated
Jan 5, 2016
Dataset authored and provided by
The Cancer Imaging Archive
License
https://www.cancerimagingarchive.net/data-usage-policies-and-restrictions/
Time period covered
May 29, 2020
Dataset funded by
National Cancer Institute (http://www.cancer.gov/)
Description
The Cancer Genome Atlas Sarcoma (TCGA-SARC) data collection is part of a larger effort to build a research community focused on connecting cancer phenotypes to genotypes by providing clinical images matched to subjects from The Cancer Genome Atlas (TCGA). Clinical, genetic, and pathological data resides in the Genomic Data Commons (GDC) Data Portal while the radiological data is stored on The Cancer Imaging Archive (TCIA).
Matched TCGA patient identifiers allow researchers to explore the TCGA/TCIA databases for correlations between tissue genotype, radiological phenotype and patient outcomes. Tissues for TCGA were collected from many sites all over the world in order to reach their accrual targets, usually around 500 specimens per cancer type. For this reason the image data sets are also extremely heterogeneous in terms of scanner modalities, manufacturers and acquisition protocols. In most cases the images were acquired as part of routine care and not as part of a controlled research study or clinical trial.
CIP TCGA Radiology Initiative
Imaging Source Site (ISS) Groups are being populated and governed by participants from institutions that have provided imaging data to the archive for a given cancer type. Modeled after TCGA analysis groups, ISS groups are given the opportunity to publish a marker paper for a given cancer type per the guidelines in the table above. This opportunity will generate increased participation in building these multi-institutional data sets as they become an open community resource. Learn more about the CIP TCGA Radiology Initiative.



The Cancer Genome Atlas Breast Invasive Carcinoma Collection

TCGA-BRCA

Explore at:

89 scholarly articles cite this dataset (View in Google Scholar)


