100+ datasets found
  1. The Cancer Genome Atlas Rectum Adenocarcinoma Collection

    • cancerimagingarchive.net
    dicom, n/a
    Updated Jan 5, 2016
    Cite
    The Cancer Imaging Archive (2016). The Cancer Genome Atlas Rectum Adenocarcinoma Collection [Dataset]. http://doi.org/10.7937/K9/TCIA.2016.F7PPNPNU
    Available download formats: dicom, n/a
    Dataset updated
    Jan 5, 2016
    Dataset authored and provided by
    The Cancer Imaging Archive
    License

    https://www.cancerimagingarchive.net/data-usage-policies-and-restrictions/

    Time period covered
    May 29, 2020
    Dataset funded by
    National Cancer Institute (http://www.cancer.gov/)
    Description

    The Cancer Genome Atlas Rectum Adenocarcinoma (TCGA-READ) data collection is part of a larger effort to build a research community focused on connecting cancer phenotypes to genotypes by providing clinical images matched to subjects from The Cancer Genome Atlas (TCGA). Clinical, genetic, and pathological data resides in the Genomic Data Commons (GDC) Data Portal while the radiological data is stored on The Cancer Imaging Archive (TCIA).

    Matched TCGA patient identifiers allow researchers to explore the TCGA/TCIA databases for correlations between tissue genotype, radiological phenotype and patient outcomes. Tissues for TCGA were collected from many sites all over the world in order to reach their accrual targets, usually around 500 specimens per cancer type. For this reason the image data sets are also extremely heterogeneous in terms of scanner modalities, manufacturers and acquisition protocols. In most cases the images were acquired as part of routine care and not as part of a controlled research study or clinical trial.
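The identifier matching described above can be sketched as follows. This is a minimal illustration, not the archive's own tooling: it assumes the imaging records carry the patient-level TCGA barcode (project, tissue source site, participant) and that genomic sample barcodes extend that prefix. The function names and the example barcodes are hypothetical.

```python
def patient_id(barcode):
    """Return the patient-level part of a TCGA barcode (TCGA-site-participant)."""
    parts = barcode.strip().upper().split("-")
    if len(parts) < 3 or parts[0] != "TCGA":
        raise ValueError(f"not a TCGA barcode: {barcode!r}")
    return "-".join(parts[:3])


def match_patients(imaging_ids, genomic_barcodes):
    """Map each imaging patient ID to the genomic barcodes of the same patient.

    Imaging IDs without any genomic barcode are left out of the result.
    """
    by_patient = {}
    for barcode in genomic_barcodes:
        by_patient.setdefault(patient_id(barcode), []).append(barcode)
    return {
        pid: by_patient[patient_id(pid)]
        for pid in imaging_ids
        if patient_id(pid) in by_patient
    }


# Hypothetical identifiers, for illustration only.
matches = match_patients(
    ["TCGA-AA-0001", "TCGA-AA-0002"],
    ["TCGA-AA-0001-01A-11R-A123-07", "TCGA-AA-0003-01A"],
)
```

Grouping by the patient-level prefix keeps every sample of a patient together, so one imaging study can be paired with several genomic samples.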

    CIP TCGA Radiology Initiative

    Imaging Source Site (ISS) Groups are being populated and governed by participants from institutions that have provided imaging data to the archive for a given cancer type. Modeled after TCGA analysis groups, ISS groups are given the opportunity to publish a marker paper for a given cancer type per the guidelines in the table above. This opportunity will generate increased participation in building these multi-institutional data sets as they become an open community resource. Learn more about the CIP TCGA Radiology Initiative.

  2. The Cancer Genome Atlas Breast Invasive Carcinoma Collection

    • cancerimagingarchive.net
    dicom, n/a
    Updated Feb 2, 2014
    Cite
    The Cancer Imaging Archive (2014). The Cancer Genome Atlas Breast Invasive Carcinoma Collection [Dataset]. http://doi.org/10.7937/K9/TCIA.2016.AB2NAZRP
    Available download formats: n/a, dicom
    Dataset updated
    Feb 2, 2014
    Dataset authored and provided by
    The Cancer Imaging Archive
    License

    https://www.cancerimagingarchive.net/data-usage-policies-and-restrictions/

    Time period covered
    May 29, 2020
    Dataset funded by
    National Cancer Institute (http://www.cancer.gov/)
    Description

    The Cancer Genome Atlas Breast Invasive Carcinoma (TCGA-BRCA) data collection is part of a larger effort to build a research community focused on connecting cancer phenotypes to genotypes by providing clinical images matched to subjects from The Cancer Genome Atlas (TCGA). Clinical, genetic, and pathological data resides in the Genomic Data Commons (GDC) Data Portal while the radiological data is stored on The Cancer Imaging Archive (TCIA).

    Matched TCGA patient identifiers allow researchers to explore the TCGA/TCIA databases for correlations between tissue genotype, radiological phenotype and patient outcomes. Tissues for TCGA were collected from many sites all over the world in order to reach their accrual targets, usually around 500 specimens per cancer type. For this reason the image data sets are also extremely heterogeneous in terms of scanner modalities, manufacturers and acquisition protocols. In most cases the images were acquired as part of routine care and not as part of a controlled research study or clinical trial.

    CIP TCGA Radiology Initiative

    Imaging Source Site (ISS) Groups are being populated and governed by participants from institutions that have provided imaging data to the archive for a given cancer type. Modeled after TCGA analysis groups, ISS groups are given the opportunity to publish a marker paper for a given cancer type per the guidelines in the table above. This opportunity will generate increased participation in building these multi-institutional data sets as they become an open community resource. Learn more about the TCGA Breast Phenotype Research Group.

  3. Data from: TCGA

    • huggingface.co
    Updated May 13, 2024
    Cite
    Lab-Rasool (2024). TCGA [Dataset]. https://huggingface.co/datasets/Lab-Rasool/TCGA
    Dataset updated
    May 13, 2024
    Dataset authored and provided by
    Lab-Rasool
    License

    Attribution-NonCommercial-NoDerivs 4.0 (CC BY-NC-ND 4.0), https://creativecommons.org/licenses/by-nc-nd/4.0/
    License information was derived automatically

    Description

    Dataset Card for The Cancer Genome Atlas (TCGA) Multimodal Dataset

    The Cancer Genome Atlas (TCGA) Multimodal Dataset is a comprehensive collection of clinical data, pathology reports, slide images, molecular data, and radiology images for cancer patients. This dataset aims to facilitate research in multimodal machine learning for oncology by providing embeddings generated using state-of-the-art models including GatorTron, MedGemma, Qwen, Llama, UNI, SeNMo, REMEDIS, and… See the full description on the dataset page: https://huggingface.co/datasets/Lab-Rasool/TCGA.

  4. Acute Myeloid Leukemia (TCGA, PanCancer Atlas) data

    • datacatalog.mskcc.org
    Updated Nov 20, 2019
    Cite
    The Cancer Genome Atlas (TCGA) (2019). Acute Myeloid Leukemia (TCGA, PanCancer Atlas) data [Dataset]. https://datacatalog.mskcc.org/dataset/10406
    Dataset updated
    Nov 20, 2019
    Dataset provided by
    MSK Library
    The Cancer Genome Atlas (TCGA)
    Description

    This dataset contains summary data visualizations and clinical data from a broad sampling of over 200 acute myeloid leukemias from 200 patients. The data was gathered as part of the PanCancer Atlas initiative, which aims to answer big, overarching questions about cancer by examining the full set of tumors characterized in the robust TCGA dataset. The clinical data includes mutation count, information about mutated genes, patient demographics, disease status, tumor typing, and chromosomal gain or loss. The data set also includes copy-number segment data downloadable as .seg files and viewable via the Integrative Genomics Viewer.

  5. The Cancer Genome Atlas (TCGA) RNA-seq meta-analysis

    • figshare.com
    xlsx
    Updated Feb 2, 2018
    Cite
    Namshik Han (2018). The Cancer Genome Atlas (TCGA) RNA-seq meta-analysis [Dataset]. http://doi.org/10.6084/m9.figshare.5851743.v1
    Available download formats: xlsx
    Dataset updated
    Feb 2, 2018
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Namshik Han
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    TCGA RNA-seq V2 Level3 data were downloaded from the TCGA Genomic Data Commons Data Portal (https://gdc-portal.nci.nih.gov), consisting of 11,303 samples in 34 cancer projects (33 cancer types). Nine cancer types that do not have corresponding non-tumour samples were filtered out, and the analysis focused on the tumour versus non-tumour comparison. 24 cancer types were used in this meta-analysis: BLCA, BRCA, CESC, CHOL, COAD, ESCA, GBM, HNSC, KICH, KIRC, KIRP, LIHC, LUAD, LUSC, PAAD, PCPG, PRAD, READ, SARC, SKCM, STAD, THCA, THYM, UCEC (https://gdc-portal.nci.nih.gov). The nine filtered cancer types were ACC, DLBC, LAML, LGG, MESO, OV, TGCT, UCS and UVM.

    To extract expression values from TCGA RNA-seq data, we used genomic coordinates to retrieve UCSC Transcript IDs that correspond to the identifiers in TCGA RNA-seq V2 Level3 data (isoform level). The GAF (General Annotation Format) file was used to map the coordinates to UCSC Transcript IDs, and it was downloaded from https://tcga-data.nci.nih.gov/docs/GAF/GAF.hg19.June2011.bundle/outputs/TCGA.hg19.June2011.gaf. This file contains genomic annotations shared by all TCGA projects. More details of the GAF file format can be found at https://tcga-data.nci.nih.gov/docs/GAF/GAF3.0/GAF_v3_file_description.docx. We filtered out any coding exons overlapping UCSC Transcript IDs to eliminate the expression values of coding genes and evaluate lncRNA expression. We could find the expression values of 443 pcRNAs and 203 tapRNAs in TCGA data, as many non-coding regions are not yet fully annotated in the TCGA RNA-seq V2 Level3 data. The expression values of pcRNAs and tapRNAs were extracted and clustered by an unsupervised Pearson correlation method (Supplementary Figure 18A). The expression values of tapRNA-associated coding genes were also extracted and used to generate the heat-map (Supplementary Figure 18B), which shows a pattern of expression similar to that of tapRNAs across the cancer types.

    To show that tapRNAs and associated coding genes have similar expression profiles in cancers, we generated a Spearman's Rank-Order Correlation heatmap (Figure 6A) between tapRNAs and their associated coding genes based on the TCGA RNA-seq data. We used the MATLAB function corr to calculate Spearman's rho. This function takes two matrices X (197-by-8,850 expression profiling matrix of tapRNAs) and Y (197-by-8,850 expression profiling matrix of tapRNA-associated coding genes) and returns an 8,850-by-8,850 matrix containing the pairwise correlation coefficient between each pair of the 8,850 columns (TCGA cancer samples in Supplementary Figure 18A and B). Thus, the rank-order correlation matrix that we computed from the matrices of expression profiling data (Supplementary Figure 18A and B) allowed us to compare the correlation between two column vectors, i.e. cancer samples. This function also returns a matrix of p-values for testing the hypothesis of no correlation against the alternative that there is a nonzero correlation. Each element of the p-value matrix is the p-value for the corresponding element of Spearman's rho. The p-values for Spearman's rho are calculated using large-sample approximations. To check the significance level of the correlation between a tapRNA and its associated coding gene, the diagonal of the p-value matrix was extracted and used. The median is 1.31 × 10^-11 and the mean is 1.03 × 10^-4, with a standard deviation of 0.0029.

    To identify cancer-specific tapRNAs, we considered not only the global expression pattern of a given tapRNA in each cancer type, but also the expression pattern of any specific sub-group that is significantly distinct, to take into account cancer sample heterogeneity. Thus, two conditions were applied: (1) the average expression level of a tapRNA in a given cancer type is in the top 10% or bottom 10%, and (2) a tapRNA has at least 10% of samples in a given cancer type that are significantly up-regulated (Z-score > 2) or down-regulated (Z-score < -2).
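The correlation step and condition (2) can be illustrated with a small standard-library Python sketch. It is not the authors' code: the original analysis used the MATLAB function corr, whereas this is a plain reimplementation of Spearman's rho (Pearson correlation of average ranks) applied to matched columns, which corresponds to the diagonal of the full correlation matrix. P-values are omitted. All function names are hypothetical, and treating up- and down-regulated fractions separately in condition (2) is an assumption, since the description does not say whether they are pooled.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of positions i..j, converted to 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation computed on average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def matched_column_rho(X, Y):
    """Rho for column j of X against column j of Y (diagonal of the full matrix).

    X and Y are lists of rows with the same shape; columns are samples.
    """
    n_columns = len(X[0])
    return [
        spearman_rho([row[j] for row in X], [row[j] for row in Y])
        for j in range(n_columns)
    ]


def has_distinct_subgroup(z_scores, cutoff=2.0, min_fraction=0.10):
    """Condition (2): enough samples with Z-score > cutoff or < -cutoff.

    Assumption: the up- and down-regulated fractions are tested separately.
    """
    n = len(z_scores)
    up = sum(z > cutoff for z in z_scores)
    down = sum(z < -cutoff for z in z_scores)
    return up / n >= min_fraction or down / n >= min_fraction
```

Computing only the matched columns avoids building the full 8,850-by-8,850 matrix when just the diagonal is needed.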

  6. TCGA-Cancer-Variant-and-Clinical-Data

    • huggingface.co
    Updated Oct 10, 2024
    + more versions
    Cite
    Seq-to-Pheno (2024). TCGA-Cancer-Variant-and-Clinical-Data [Dataset]. https://huggingface.co/datasets/seq-to-pheno/TCGA-Cancer-Variant-and-Clinical-Data
    Available download formats: Croissant (a format for machine-learning datasets; see mlcommons.org/croissant)
    Dataset updated
    Oct 10, 2024
    Dataset authored and provided by
    Seq-to-Pheno
    License

    MIT License, https://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    TCGA Cancer Variant and Clinical Data

    Dataset Description

    This dataset combines genetic variant information at the protein level with clinical data from The Cancer Genome Atlas (TCGA) project, curated by the International Cancer Genome Consortium (ICGC). It provides a comprehensive view of protein-altering mutations and clinical characteristics across various cancer types.

    Dataset Summary

    The dataset includes:

    Protein sequence data for both mutated and… See the full description on the dataset page: https://huggingface.co/datasets/seq-to-pheno/TCGA-Cancer-Variant-and-Clinical-Data.

  7. TCGA Chemotherapy Response Dataset

    • zenodo.org
    • data.niaid.nih.gov
    • +1 more
    csv, zip
    Updated Nov 1, 2021
    Cite
    Dalibor Hrg; Balthasar Huber; Lukas A. Huber (2021). TCGA Chemotherapy Response Dataset [Dataset]. http://doi.org/10.5281/zenodo.3719291
    Available download formats: zip, csv
    Dataset updated
    Nov 1, 2021
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Dalibor Hrg; Balthasar Huber; Lukas A. Huber
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Dataset on chemotherapeutic drug responses in TCGA cancer patients, cross-referenced for a hit in the TCIA.at database, consisting of clinical (TCGA), cancer tissue gene-expression (TCGA) and tumor-immunome (TCIA) features. The dataset covers 5 common chemotherapy agents: 3 CRC agents (FOLFOX, 5FU, Oxaliplatin) and 2 lung agents (Carboplatin, Cisplatin). FOLFOX, as a combination therapy or regimen, was compiled from the timings of monotherapies given to patients and as such is a novel dataset derived from TCGA data. The FOLFOX dataset is primarily first-line treatment, while the other drugs are not to be interpreted as first-line treatments. The drug datasets are individually available in their own CSV files.

    Citation
    Dalibor Hrg, Balthasar Huber, Lukas A. Huber. (2020). TCGA Chemotherapy Response Dataset. Zenodo. http://doi.org/10.5281/zenodo.3719291

    The results here are in whole or part based upon data generated by the TCGA Research Network: https://www.cancer.gov/tcga.

    License
    CC BY-SA 4.0 International https://creativecommons.org/licenses/by-sa/4.0. Authors take no liability for any use of this data.

    Contributions
    D. Hrg and B. Huber acknowledge major and equal work effort: data understanding, data science and dataset preparation (monotherapies and FOLFOX); L. A. Huber: help with the dictionary of drug names and curation/cleaning of FOLFOX entries, clinical validation.

    Contact & Maintenance
    dalibor.hrg@gmail.com
    dalibor.hrg@i-med.ac.at

  8. The Cancer Genome Atlas Lung Adenocarcinoma Collection

    • cancerimagingarchive.net
    dicom, n/a
    Updated Jan 30, 2017
    + more versions
    Cite
    The Cancer Imaging Archive (2017). The Cancer Genome Atlas Lung Adenocarcinoma Collection [Dataset]. http://doi.org/10.7937/K9/TCIA.2016.JGNIHEP5
    Available download formats: n/a, dicom
    Dataset updated
    Jan 30, 2017
    Dataset authored and provided by
    The Cancer Imaging Archive
    License

    https://www.cancerimagingarchive.net/data-usage-policies-and-restrictions/

    Time period covered
    May 29, 2020
    Dataset funded by
    National Cancer Institute (http://www.cancer.gov/)
    Description

    The Cancer Genome Atlas Lung Adenocarcinoma (TCGA-LUAD) data collection is part of a larger effort to build a research community focused on connecting cancer phenotypes to genotypes by providing clinical images matched to subjects from The Cancer Genome Atlas (TCGA). Clinical, genetic, and pathological data resides in the Genomic Data Commons (GDC) Data Portal while the radiological data is stored on The Cancer Imaging Archive (TCIA).

    Matched TCGA patient identifiers allow researchers to explore the TCGA/TCIA databases for correlations between tissue genotype, radiological phenotype and patient outcomes. Tissues for TCGA were collected from many sites all over the world in order to reach their accrual targets, usually around 500 specimens per cancer type. For this reason the image data sets are also extremely heterogeneous in terms of scanner modalities, manufacturers and acquisition protocols. In most cases the images were acquired as part of routine care and not as part of a controlled research study or clinical trial.

    CIP TCGA Radiology Initiative

    Imaging Source Site (ISS) Groups are being populated and governed by participants from institutions that have provided imaging data to the archive for a given cancer type. Modeled after TCGA analysis groups, ISS groups are given the opportunity to publish a marker paper for a given cancer type per the guidelines in the table above. This opportunity will generate increased participation in building these multi-institutional data sets as they become an open community resource. Learn more about the TCGA Lung Phenotype Research Group.

  9. Colorectal Adenocarcinoma (TCGA, PanCancer Atlas) data

    • datacatalog.mskcc.org
    Updated Nov 20, 2019
    Cite
    The Cancer Genome Atlas (TCGA) (2019). Colorectal Adenocarcinoma (TCGA, PanCancer Atlas) data [Dataset]. https://datacatalog.mskcc.org/dataset/10411
    Dataset updated
    Nov 20, 2019
    Dataset provided by
    MSK Library
    The Cancer Genome Atlas (TCGA)
    Description

    This dataset contains summary data visualizations and clinical data from a broad sampling of 594 colorectal adenocarcinomas from 594 patients. The data was gathered as part of the PanCancer Atlas initiative, which aims to answer big, overarching questions about cancer by examining the full set of tumors characterized in the robust TCGA dataset. The clinical data includes mutation count, information about mutated genes, patient demographics, disease status, tumor typing, and chromosomal gain or loss. The data set also includes copy-number segment data downloadable as .seg files and viewable via the Integrative Genomics Viewer.

  10. Lung Adenocarcinoma (TCGA, PanCancer Atlas)

    • datacatalog.mskcc.org
    Updated Nov 21, 2019
    Cite
    The Cancer Genome Atlas (TCGA) (2019). Lung Adenocarcinoma (TCGA, PanCancer Atlas) [Dataset]. https://datacatalog.mskcc.org/dataset/10420
    Dataset updated
    Nov 21, 2019
    Dataset provided by
    MSK Library
    The Cancer Genome Atlas (TCGA)
    Description

    This dataset contains summary data visualizations and clinical data from a broad sampling of 566 lung adenocarcinomas from 566 patients. The data was gathered as part of the PanCancer Atlas initiative, which aims to answer big, overarching questions about cancer by examining the full set of tumors characterized in the robust TCGA dataset. The clinical data includes mutation count, information about mutated genes, patient demographics, disease status, tumor typing, and chromosomal gain or loss. The data set also includes copy-number segment data downloadable as .seg files and viewable via the Integrative Genomics Viewer.

  11. Results of GSVA for TCGA-COAD.

    • plos.figshare.com
    xls
    Updated Jul 18, 2025
    Cite
    Yongling Wang; Zan Yuan; Yi Lao; Jiangtao He; Shufen Mo; Kangbiao Chen; Yanyan Ye; Lu Huang (2025). Results of GSVA for TCGA-COAD. [Dataset]. http://doi.org/10.1371/journal.pone.0328560.t005
    Available download formats: xls
    Dataset updated
    Jul 18, 2025
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Yongling Wang; Zan Yuan; Yi Lao; Jiangtao He; Shufen Mo; Kangbiao Chen; Yanyan Ye; Lu Huang
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Background: The exact mechanisms driving colorectal cancer (CRC) are yet to be fully elucidated. This study aims to confirm the reliability of a prognostic model for colon adenocarcinoma (COAD) by analyzing the varied expression levels of Glycolysis & Pyroptosis-Related Differentially Expressed Genes (G&PRDEGs) in COAD using bioinformatics tools.

    Methods: We retrieved gene expression data and clinical details for COAD patients from the Cancer Genome Atlas (TCGA) database. These data were analyzed to categorize the samples into pyroptosis-positive and pyroptosis-negative groups based on their expression of G&PRDEGs. A prognostic model for COAD was then developed using LASSO Cox regression analysis, focusing on these differentially expressed genes (DEGs). Kaplan-Meier curves were plotted to assess the differences in survival between the two groups. Furthermore, we conducted multivariate Cox regression analyses to evaluate the influence of clinical parameters and model-derived risk scores. Analyses of pathway enrichment were performed using R software, alongside single-sample gene-set enrichment analysis (ssGSEA) to explore the role of immune cells and functions associated with G&PRDEGs.

    Results: A predictive model was developed using 53 G&PRDEGs that were expressed differentially. An examination of survival rates revealed that the high-risk groups exhibited a noticeably diminished overall survival (OS) in comparison to the low-risk groups in the TCGA database (P

  12. TCGA Lower Grade Glioma (LGG) Clinical Data

    • zenodo.org
    csv
    Updated Jul 29, 2023
    Cite
    Swati Baskiyar (2023). TCGA Lower Grade Glioma (LGG) Clinical Data [Dataset]. http://doi.org/10.5281/zenodo.8190154
    Available download formats: csv
    Dataset updated
    Jul 29, 2023
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Swati Baskiyar
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Abstract:

    The Cancer Genome Atlas (TCGA) was a large-scale collaborative project initiated by the National Cancer Institute (NCI) and the National Human Genome Research Institute (NHGRI). It aimed to comprehensively characterize the genomic and molecular landscape of various cancer types. This dataset includes curated survival data from the Pan-cancer Atlas paper titled "An Integrated TCGA Pan-Cancer Clinical Data Resource (TCGA-CDR) to drive high quality survival outcome analytics". The paper highlights four types of carefully curated survival endpoints, and recommends the use of the endpoints of OS, PFI, DFI, and DSS for each TCGA cancer type. The dataset also includes phenotypic information about LGG. The Sample IDs are unique identifiers, which can be paired with the gene expression dataset.

    Inspiration:

    This dataset was uploaded to U-BRITE for the GTKB project.

    Instruction:

    The survival and phenotype data were merged into one file. Empty columns were removed. Columns with the same value for every sample were also removed.
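The preparation steps above can be sketched in standard-library Python. This is an illustration under assumptions, not the uploader's script: rows are modelled as dictionaries of strings (as csv.DictReader would yield), the shared ID column is assumed to be named "sample", and the merge is assumed to keep only samples present in both tables. The function names and the example values are hypothetical.

```python
def merge_by_sample(survival_rows, phenotype_rows, key="sample"):
    """Join two lists of dict rows on the sample ID column.

    Assumption: only samples present in both tables are kept (inner join).
    """
    phenotype = {row[key]: row for row in phenotype_rows}
    merged = []
    for row in survival_rows:
        other = phenotype.get(row[key])
        if other is not None:
            merged.append({**other, **row})
    return merged


def drop_uninformative_columns(rows, key="sample"):
    """Remove columns that are empty everywhere or identical for every sample."""
    columns = []
    for row in rows:
        for name in row:
            if name not in columns:
                columns.append(name)
    keep = []
    for name in columns:
        # A missing value counts as empty; one distinct value means the
        # column is either empty or constant, so it carries no information.
        distinct = {row.get(name, "") for row in rows}
        if name == key or len(distinct) > 1:
            keep.append(name)
    return [{name: row.get(name, "") for name in keep} for row in rows]


# Hypothetical rows, for illustration only.
survival = [
    {"sample": "S1", "OS": "1", "notes": ""},
    {"sample": "S2", "OS": "0", "notes": ""},
]
phenotype = [
    {"sample": "S1", "histology": "LGG", "grade": "G2"},
    {"sample": "S2", "histology": "LGG", "grade": "G3"},
]
cleaned = drop_uninformative_columns(merge_by_sample(survival, phenotype))
```

In the example, "notes" is dropped because it is empty and "histology" because it holds the same value for every sample.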

    Acknowledgments:

    Goldman, M.J., Craft, B., Hastie, M. et al. Visualizing and interpreting cancer genomics data via the Xena platform. Nat Biotechnol (2020). https://doi.org/10.1038/s41587-020-0546-8

    Liu, Jianfang, Caesar-Johnson, Samantha J. et al. An Integrated TCGA Pan-Cancer Clinical Data Resource to Drive High-Quality Survival Outcome Analytics. Cell, Volume 173, Issue 2, 400 - 416.e11. https://doi.org/10.1016/j.cell.2018.02.052

    The Cancer Genome Atlas Research Network., Weinstein, J., Collisson, E. et al. The Cancer Genome Atlas Pan-Cancer analysis project. Nat Genet 45, 1113–1120 (2013). https://doi.org/10.1038/ng.2764

    U-BRITE last update: 07/13/2023

  13. Skin Cutaneous Melanoma (TCGA, PanCancer Atlas)

    • datacatalog.mskcc.org
    Updated Nov 21, 2019
    Cite
    The Cancer Genome Atlas (TCGA) (2019). Skin Cutaneous Melanoma (TCGA, PanCancer Atlas) [Dataset]. https://datacatalog.mskcc.org/dataset/10428
    Dataset updated
    Nov 21, 2019
    Dataset provided by
    MSK Library
    The Cancer Genome Atlas (TCGA)
    Description

    This dataset contains summary data visualizations and clinical data from a broad sampling of 442 samples of skin cutaneous melanomas from 488 patients. The data was gathered as part of the PanCancer Atlas initiative, which aims to answer big, overarching questions about cancer by examining the full set of tumors characterized in the robust TCGA dataset. The clinical data includes mutation count, information about mutated genes, patient demographics, disease status, tumor typing, and chromosomal gain or loss. The data set also includes copy-number segment data downloadable as .seg files and viewable via the Integrative Genomics Viewer.

  14. TCGA_PanCancerRNAseq

    • zenodo.org
    Updated Mar 10, 2022
    Cite
    Ramyar Molania (2022). TCGA_PanCancerRNAseq [Dataset]. http://doi.org/10.1101/2021.11.01.466731
    Dataset updated
    Mar 10, 2022
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Ramyar Molania
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset includes the harmonised version of all The Cancer Genome Atlas (TCGA) RNA-Seq data (33 cancer types, ~11,000 samples). Each file is a "SummarizedExperiment" object that contains: 1) three assays (raw gene-level counts, FPKM and FPKM.UQ), 2) sample and batch information from different resources, and 3) several details for individual genes. The data can also be explored with an R Shiny app published in our pre-print paper: https://doi.org/10.1101/2021.11.01.466731.

  15. The clinical information and mRNA expression data from the TCGA database of...

    • plos.figshare.com
    xlsx
    Updated Nov 29, 2023
    Cite
    Hong Gyu Yoon; Jin Hwan Cheong; Je Il Ryu; Yu Deok Won; Kyueng-Whan Min; Myung-Hoon Han (2023). The clinical information and mRNA expression data from the TCGA database of 525 GBM cases. [Dataset]. http://doi.org/10.1371/journal.pone.0295061.s001
    Available download formats: xlsx
    Dataset updated
    Nov 29, 2023
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Hong Gyu Yoon; Jin Hwan Cheong; Je Il Ryu; Yu Deok Won; Kyueng-Whan Min; Myung-Hoon Han
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The clinical information and mRNA expression data from the TCGA database of 525 GBM cases.

  16. Raw clinical data for TCGA-THCA samples.

    • datasetcatalog.nlm.nih.gov
    Updated May 7, 2025
    Cite
    Zhang, Yuting; Luo, Yulou; Ye, Yinghui; Chen, Yan; Saibaidoula, Yilina (2025). Raw clinical data for TCGA-THCA samples. [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0002098517
    Dataset updated
    May 7, 2025
    Authors
    Zhang, Yuting; Luo, Yulou; Ye, Yinghui; Chen, Yan; Saibaidoula, Yilina
    Description

    The Proteasome 20S subunit beta 8 (PSMB8) is an integral element of the immunoproteasome complex, playing a pivotal role in antigen processing. Despite its significance, the contributory role of PSMB8 in oncogenesis, particularly in thyroid carcinoma (THCA), has not been well-characterized. To address this gap in knowledge, our study endeavored to delineate the potential associations between PSMB8 and THCA. Transcriptomic profiles and clinical data of patients with THCA were retrieved from The Cancer Genome Atlas (TCGA) database to facilitate comprehensive analysis. Complementary resources from additional online databases were utilized to augment the study. Logistic regression analysis was employed to elucidate the relationship between PSMB8 and various clinicopathological parameters. Uni/multivariate Cox regression analyses were conducted to ascertain the independent prognostic factors for THCA patient outcomes. Quantitative polymerase chain reaction (qPCR) and western blot assays were employed to verify the expression level of PSMB8 in vitro. Our study demonstrated that PSMB8 was significantly upregulated in THCA, with its overexpression correlating with lymph node metastasis, extrathyroidal extension, and favorable prognosis. Immunohistochemistry substantiated a higher PSMB8 protein presence in THCA tissue compared to the normal, supporting its potential role as a moderately accurate diagnostic biomarker. Logistic regression analysis identified PSMB8 as a significant indicator of the N1 stage, classical histological subtype, and extrathyroidal extension. Age, T stage, and PSMB8 were further determined as independent prognostic factors for THCA. Functional investigations linked PSMB8 to immune processes, evidenced by its association with increased immune cell infiltration and higher stromal/immune scores, as well as a positive co-expression with several immune checkpoints. 
    A predicted competing endogenous RNA (ceRNA) network implicated PSMB8 in complex post-transcriptional regulation. Finally, in vitro assays confirmed the upregulation of PSMB8, underscoring its relevance in THCA and as a target for future research. Our work has preliminarily appraised PSMB8 as a biomarker with certain prognostic and diagnostic significance, and as a potential target for immunotherapy in THCA.

  17. The Cancer Genome Atlas Thyroid Cancer Collection

    • cancerimagingarchive.net
    dicom, n/a
    Updated Jan 5, 2016
    Cite
    The Cancer Imaging Archive (2016). The Cancer Genome Atlas Thyroid Cancer Collection [Dataset]. http://doi.org/10.7937/K9/TCIA.2016.9ZFRVF1B
    Available download formats: dicom, n/a
    Dataset updated
    Jan 5, 2016
    Dataset authored and provided by
    The Cancer Imaging Archive
    License

    https://www.cancerimagingarchive.net/data-usage-policies-and-restrictions/

    Time period covered
    May 29, 2020
    Dataset funded by
    National Cancer Institute (http://www.cancer.gov/)
    Description

    The Cancer Genome Atlas Thyroid Cancer (TCGA-THCA) data collection is part of a larger effort to build a research community focused on connecting cancer phenotypes to genotypes by providing clinical images matched to subjects from The Cancer Genome Atlas (TCGA). Clinical, genetic, and pathological data resides in the Genomic Data Commons (GDC) Data Portal while the radiological data is stored on The Cancer Imaging Archive (TCIA).

    Matched TCGA patient identifiers allow researchers to explore the TCGA/TCIA databases for correlations between tissue genotype, radiological phenotype and patient outcomes. Tissues for TCGA were collected from many sites all over the world in order to reach their accrual targets, usually around 500 specimens per cancer type. For this reason the image data sets are also extremely heterogeneous in terms of scanner modalities, manufacturers and acquisition protocols. In most cases the images were acquired as part of routine care and not as part of a controlled research study or clinical trial.

    CIP TCGA Radiology Initiative

    Imaging Source Site (ISS) Groups are being populated and governed by participants from institutions that have provided imaging data to the archive for a given cancer type. Modeled after TCGA analysis groups, ISS groups are given the opportunity to publish a marker paper for a given cancer type per the guidelines in the table above. This opportunity will generate increased participation in building these multi-institutional data sets as they become an open community resource. Learn more about the CIP TCGA Radiology Initiative.

  18. Data from: The Cancer Genome Atlas (TCGA)

    • resodate.org
    Updated Dec 2, 2024
    K. Tomczak; P. Czerwińska; M. Wiznerowicz (2024). The Cancer Genome Atlas (TCGA) [Dataset]. https://resodate.org/resources/aHR0cHM6Ly9zZXJ2aWNlLnRpYi5ldS9sZG1zZXJ2aWNlL2RhdGFzZXQvdGhlLWNhbmNlci1nZW5vbWUtYXRsYXMtLXRjZ2Et
    Dataset updated: Dec 2, 2024
    Dataset provided by: Leibniz Data Manager
    Authors: K. Tomczak; P. Czerwińska; M. Wiznerowicz
    Description

    The TCGA dataset contains gene expression data from 20,000 primary cancer and matched normal samples spanning 33 cancer types.

  19. Interaction Network for Acute Myeloid Leukemia (AML) using The Cancer Genome...

    • figshare.com
    txt
    Updated Jan 18, 2016
    Yishai Shimoni; Mariano Alvarez (2016). Interaction Network for Acute Myeloid Leukemia (AML) using The Cancer Genome Atlas (TCGA) database [Dataset]. http://doi.org/10.6084/m9.figshare.871521.v5
    Available download formats: txt
    Dataset updated: Jan 18, 2016
    Dataset provided by: Figshare (http://figshare.com/)
    Authors: Yishai Shimoni; Mariano Alvarez
    License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    RNA-seq data were downloaded from the TCGA website (Nov 14, 2013) and include 128 DESeq-normalized samples. The data were derived from primary samples of sorted whole blood obtained from AML patients. 100 bootstraps of ARACNe were run using adaptive partitioning, a p-value cutoff of 1e-8, and full DPI. The regulator list includes all genes with a GO annotation of transcriptional regulation or DNA binding (Shimoni, Yishai; Alvarez, Mariano (2013): TF list. figshare. http://dx.doi.org/10.6084/m9.figshare.871524).

  20. Cancer Categories and clinical research figures

    • kaggle.com
    zip
    Updated Oct 16, 2025
    DrAHung (2025). Cancer Categories and clinical research figures [Dataset]. https://www.kaggle.com/datasets/drahung/cancer-categories-and-clinical-research-figures
    Available download formats: zip (996208 bytes)
    Dataset updated: Oct 16, 2025
    Authors: DrAHung
    Description

    This dataset integrates open public data from multiple biomedical sources to provide a structured, queryable database of cancer classifications and clinical data from The Cancer Genome Atlas (TCGA).

    All data are de-identified and publicly available via the U.S. National Cancer Institute (NCI) Genomic Data Commons (GDC) API, ensuring full compliance with NIH open-access guidelines.

    Included Tables (table: description)

    cancer_category: Disease Ontology (DOID) categories and hierarchical labels (including English + Chinese translations).
    patient_tcga_clinical: De-identified patient clinical records per TCGA project (demographics, stage, grade, survival, treatment).
    tcga_project_summary: Per-project summary statistics (case counts, survival averages, tumor stage/grade coverage, and mapped cancer type).
    tcga_project: TCGA project metadata with links to DOID cancer categories.
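
    As a rough illustration of how the tables listed above might be queried together, the sketch below builds a small in-memory SQLite database and counts cases per project. Only the table names tcga_project and patient_tcga_clinical come from the description; every column name (project_id, doid, case_id, stage) and every row is a hypothetical placeholder, not the dataset's actual schema or real patient data.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database mimicking two of the described tables.

    Column names and rows are hypothetical placeholders for illustration;
    they are not the dataset's real schema or real patient records.
    """
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE tcga_project (
            project_id TEXT PRIMARY KEY,  -- hypothetical column
            doid       TEXT               -- hypothetical link to cancer_category
        );
        CREATE TABLE patient_tcga_clinical (
            case_id    TEXT PRIMARY KEY,  -- hypothetical column
            project_id TEXT REFERENCES tcga_project(project_id),
            stage      TEXT               -- hypothetical column
        );
        """
    )
    # Placeholder project rows; the DOID values are dummy strings.
    conn.executemany(
        "INSERT INTO tcga_project VALUES (?, ?)",
        [("TCGA-READ", "doid-placeholder-1"), ("TCGA-THCA", "doid-placeholder-2")],
    )
    # Made-up toy cases, used only so the query has something to count.
    conn.executemany(
        "INSERT INTO patient_tcga_clinical VALUES (?, ?, ?)",
        [
            ("toy-case-1", "TCGA-READ", "Stage I"),
            ("toy-case-2", "TCGA-READ", "Stage II"),
            ("toy-case-3", "TCGA-THCA", "Stage I"),
        ],
    )
    conn.commit()
    return conn


def cases_per_project(conn):
    """Return {project_id: number of clinical records} via a LEFT JOIN."""
    rows = conn.execute(
        """
        SELECT p.project_id, COUNT(c.case_id)
        FROM tcga_project AS p
        LEFT JOIN patient_tcga_clinical AS c ON c.project_id = p.project_id
        GROUP BY p.project_id
        ORDER BY p.project_id
        """
    ).fetchall()
    return dict(rows)


if __name__ == "__main__":
    print(cases_per_project(build_demo_db()))
```

    The LEFT JOIN keeps projects that have no clinical rows, so they would appear with a count of zero rather than being dropped. With the real files, the column names would need to be replaced by whatever the downloaded tables actually use.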

    The data source is The Cancer Genome Atlas (TCGA).

    A snapshot of clinical data: https://www.googleapis.com/download/storage/v1/b/kaggle-user-content/o/inbox%2F29334708%2F0049f6224420593507bfc8072df3e0e4%2Fsample.png?generation=1760586452165254&alt=media

The Cancer Imaging Archive (2016). The Cancer Genome Atlas Rectum Adenocarcinoma Collection [Dataset]. http://doi.org/10.7937/K9/TCIA.2016.F7PPNPNU

The Cancer Genome Atlas Rectum Adenocarcinoma Collection

TCGA-READ

8 scholarly articles cite this dataset (View in Google Scholar)
Available download formats: dicom, n/a
Dataset updated: Jan 5, 2016
Dataset authored and provided by: The Cancer Imaging Archive
License: https://www.cancerimagingarchive.net/data-usage-policies-and-restrictions/
Time period covered: May 29, 2020
Dataset funded by: National Cancer Institute (http://www.cancer.gov/)
