27 datasets found
  1. The Cancer Genome Atlas (TCGA) RNA-seq meta-analysis

    • figshare.com
    xlsx
    Updated Feb 2, 2018
    Cite
    Namshik Han (2018). The Cancer Genome Atlas (TCGA) RNA-seq meta-analysis [Dataset]. http://doi.org/10.6084/m9.figshare.5851743.v1
    Available download formats: xlsx
    Dataset updated
    Feb 2, 2018
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Namshik Han
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    TCGA RNA-seq V2 Level3 data were downloaded from the TCGA Genomic Data Commons Data Portal (https://gdc-portal.nci.nih.gov), consisting of 11,303 samples in 34 cancer projects (33 cancer types). Nine cancer types that do not have corresponding non-tumour samples were filtered out, and the analysis was focused on tumour versus non-tumour comparison. 24 cancer types were used in this meta-analysis: BLCA, BRCA, CESC, CHOL, COAD, ESCA, GBM, HNSC, KICH, KIRC, KIRP, LIHC, LUAD, LUSC, PAAD, PCPG, PRAD, READ, SARC, SKCM, STAD, THCA, THYM, UCEC (https://gdc-portal.nci.nih.gov). The nine filtered cancer types were ACC, DLBC, LAML, LGG, MESO, OV, TGCT, UCS and UVM.

    To extract expression values from TCGA RNA-seq data, we used genomic coordinates to retrieve UCSC Transcript IDs that correspond to the identifiers in TCGA RNA-seq V2 Level3 data (isoform level). The GAF (General Annotation Format) file was used to map the coordinates to UCSC Transcript IDs, and it was downloaded from https://tcga-data.nci.nih.gov/docs/GAF/GAF.hg19.June2011.bundle/outputs/TCGA.hg19.June2011.gaf. This file contains genomic annotations shared by all TCGA projects. More details of the GAF file format can be found at https://tcga-data.nci.nih.gov/docs/GAF/GAF3.0/GAF_v3_file_description.docx. We filtered out any coding exons overlapping UCSC Transcript IDs to eliminate expression values of coding genes and evaluate lncRNA expression. We could find the expression values of 443 pcRNAs and 203 tapRNAs in TCGA data, as many non-coding regions are not yet fully annotated in the TCGA RNA-seq V2 Level3 data. The expression values of pcRNAs and tapRNAs were extracted and clustered by an unsupervised Pearson correlation method (Supplementary Figure 18A). The expression values of tapRNA-associated coding genes were also extracted and used to generate the heat-map (Supplementary Figure 18B), which shows a pattern of expression across the cancer types similar to that of the tapRNAs.

    To show that tapRNAs and associated coding genes have similar expression profiles in cancers, we generated a Spearman's Rank-Order Correlation heatmap (Figure 6A) between tapRNAs and their associated coding genes based on the TCGA RNA-seq data. We used the MATLAB function corr to calculate Spearman's rho. This function takes two matrices X (197-by-8,850 expression profiling matrix of tapRNAs) and Y (197-by-8,850 expression profiling matrix of tapRNA-associated coding genes) and returns an 8,850-by-8,850 matrix containing the pairwise correlation coefficient between each pair of 8,850 columns (TCGA cancer samples in Supplementary Figure 18A and B). Thus, the rank-order correlation matrix that we computed from the matrices of expression profiling data (Supplementary Figure 18A and B) allowed us to compare the correlation between two column vectors, i.e. cancer samples. This function also returns a matrix of p-values for testing the hypothesis of no correlation against the alternative that there is a nonzero correlation. Each element of the matrix of p-values is the p-value for the corresponding element of Spearman's rho. The p-values for Spearman's rho are calculated using large-sample approximations. To check the significance level of the correlation between a tapRNA and its associated coding gene, the diagonal of the p-value matrix was extracted and used. The median is 1.31 × 10^-11 and the mean is 1.03 × 10^-4 with standard deviation 0.0029.

    To identify cancer-specific tapRNAs, we considered not only the global expression pattern of a given tapRNA in each cancer type, but also the expression pattern of specific sub-groups that are significantly distinct, to take into account cancer sample heterogeneity. Thus, two conditions were applied: (1) the average expression level of a tapRNA in a given cancer type is in the top 10% or bottom 10% and (2) a tapRNA has at least 10% of samples in a given cancer type that are significantly up-regulated (Z-score > 2) or down-regulated (Z-score < -2).
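    The Spearman step in the description above can be illustrated with a small sketch. It is not the authors' MATLAB code: it is a pure-Python approximation that computes only the diagonal entries (column j of X against column j of Y), and the function names (average_ranks, spearman, diagonal_spearman) are hypothetical. The p-value uses the normal approximation rho * sqrt(n - 1) ~ N(0, 1) under the null, which is one large-sample approximation and is an assumption here; MATLAB's corr may use a different one.

```python
import math
from statistics import NormalDist


def average_ranks(values):
    """Rank values from 1..n; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # mean of the 1-based positions i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(a, b):
    """Pearson correlation of two equal-length sequences (non-constant)."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def spearman(a, b):
    """Return (rho, two-sided p) using a normal large-sample approximation."""
    rho = pearson(average_ranks(a), average_ranks(b))
    z = abs(rho) * math.sqrt(len(a) - 1)
    p_value = 2 * (1 - NormalDist().cdf(z))
    return rho, p_value


def diagonal_spearman(X, Y):
    """X, Y: lists of rows (genes) by columns (samples).

    Returns one (rho, p) pair per column, i.e. the diagonal of the
    full column-by-column correlation matrix.
    """
    cols_x = [list(col) for col in zip(*X)]
    cols_y = [list(col) for col in zip(*Y)]
    return [spearman(cx, cy) for cx, cy in zip(cols_x, cols_y)]


# Toy data: 5 gene pairs (rows) by 2 samples (columns).
X = [[1, 5], [2, 4], [3, 3], [4, 2], [5, 1]]
Y = [[10, 1], [20, 2], [30, 3], [40, 4], [50, 5]]
for rho, p in diagonal_spearman(X, Y):
    print(round(rho, 3), p)
```

    With real data, the rows would be the 197 tapRNA / coding-gene pairs and the columns the 8,850 samples; a constant column would need separate handling because its rank variance is zero.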

  2. Glioblastoma Multiforme (TCGA, Firehose Legacy)

    • datacatalog.mskcc.org
    Updated Oct 1, 2020
    Cite
    Broad Institute (2020). Glioblastoma Multiforme (TCGA, Firehose Legacy) [Dataset]. https://datacatalog.mskcc.org/dataset/10472
    Dataset updated
    Oct 1, 2020
    Dataset provided by
    Broad Institute
    MSK Library
    Description

    TCGA Glioblastoma Multiforme. Source data from GDAC Firehose. Previously known as TCGA Provisional. This dataset contains summary data visualizations and clinical data from a broad sampling of 619 glioblastoma multiforme samples from 606 patients. The data was gathered as part of the Broad Institute of MIT and Harvard Firehose initiative, a cancer analysis pipeline. The clinical data includes mutation count, information about mutated genes, patient demographics, disease status, tumor typing, chromosomal gain or loss, Adjuvant Postoperative Pharmaceutical Therapy Administered Indicator, Days to Sample Collection, whether the patient started adjuvant postoperative radiotherapy, Disease Free (Months), Disease Free Status, and First Pathologic Diagnosis Biospecimen Acquisition Method Type. The dataset includes Next-Generation Clustered Heat Maps (NG-CHM) viewable via an embedded NG-CHM Heat Map Viewer, provided by MD Anderson Cancer Center, which provides a graphical environment for exploration of clustered or non-clustered heat map data. The data set also includes copy-number segment data downloadable as .seg files and viewable via the Integrative Genomics Viewer.

  3. Acute Myeloid Leukemia (TCGA, Firehose Legacy)

    • datacatalog.mskcc.org
    Updated Nov 12, 2020
    Cite
    Broad Institute (2020). Acute Myeloid Leukemia (TCGA, Firehose Legacy) [Dataset]. https://datacatalog.mskcc.org/dataset/10484
    Dataset updated
    Nov 12, 2020
    Dataset provided by
    Broad Institute
    MSK Library
    Description

    TCGA Acute Myeloid Leukemia. Source data from GDAC Firehose. Previously known as TCGA Provisional. This dataset contains summary data visualizations and clinical data from a broad sampling of 200 samples from 200 patients. The data was gathered as part of the Broad Institute of MIT and Harvard Firehose initiative, a cancer analysis pipeline. The clinical data includes mutation count, information about mutated genes, patient demographics, sample type, disease code, Abnormal Lymphocyte Percent, Atra Exposure, Basophils Cell Count, Blast Count, Cytogenetic abnormality type, and FAB. The dataset includes Next-Generation Clustered Heat Maps (NG-CHM) viewable via an embedded NG-CHM Heat Map Viewer, provided by MD Anderson Cancer Center, which provides a graphical environment for exploration of clustered or non-clustered heat map data. The data set also includes copy-number segment data downloadable as .seg files and viewable via the Integrative Genomics Viewer.

  4. Heatmap of the compounds with sensitivity ranging between picomolar to ≦...

    • plos.figshare.com
    xls
    Updated Aug 26, 2024
    Cite
    Umesh Kathad; Neha Biyani; Raniero L. Peru y Colón De Portugal; Jianli Zhou; Harry Kochat; Kishor Bhatia (2024). Heatmap of the compounds with sensitivity ranging between picomolar to ≦ 10nM. [Dataset]. http://doi.org/10.1371/journal.pone.0308604.s007
    Available download formats: xls
    Dataset updated
    Aug 26, 2024
    Dataset provided by
    PLOS ONE
    Authors
    Umesh Kathad; Neha Biyani; Raniero L. Peru y Colón De Portugal; Jianli Zhou; Harry Kochat; Kishor Bhatia
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Compounds with sensitivity ranging between picomolar to ≦ 10nM (compounds exhibiting overlapping sensitivity with both picomolar as well as low nanomolar range across 9 NCI60 cancer indications). This data is used to generate Fig 9. (XLS)

  5. Table_2_Screening the Cancer Genome Atlas Database for Genes of Prognostic...

    • frontiersin.figshare.com
    xlsx
    Updated Jun 12, 2023
    + more versions
    Cite
    Jie Ni; Yang Wu; Feng Qi; Xiao Li; Shaorong Yu; Siwen Liu; Jifeng Feng; Yuxiao Zheng (2023). Table_2_Screening the Cancer Genome Atlas Database for Genes of Prognostic Value in Acute Myeloid Leukemia.XLSX [Dataset]. http://doi.org/10.3389/fonc.2019.01509.s005
    Available download formats: xlsx
    Dataset updated
    Jun 12, 2023
    Dataset provided by
    Frontiers
    Authors
    Jie Ni; Yang Wu; Feng Qi; Xiao Li; Shaorong Yu; Siwen Liu; Jifeng Feng; Yuxiao Zheng
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Objective: To identify genes of prognostic value that are associated with the tumor microenvironment (TME) in acute myeloid leukemia (AML).

    Methods and Materials: Level 3 gene transcriptome profiles of AML patients were downloaded from The Cancer Genome Atlas (TCGA) database. Clinical characteristics and survival data were extracted from the Genomic Data Commons (GDC) tool. The limma package was then used for normalization. The ESTIMATE algorithm was used to calculate immune, stromal and ESTIMATE scores. We examined the distribution of these scores across the Cancer and Acute Leukemia Group B (CALGB) cytogenetics risk categories. Kaplan-Meier (K-M) curves were used to evaluate the relationship between immune scores, stromal scores, ESTIMATE scores and overall survival. We performed clustering analysis and screened differentially expressed genes (DEGs) by using heatmaps, volcano plots and Venn plots. After pathway enrichment analysis and gene set enrichment analysis (GSEA), a protein-protein interaction (PPI) network was constructed and hub genes were screened. We explored the prognostic value of hub genes by calculating risk scores (RS) and performing survival analysis. Finally, we verified the expression level, association with overall survival and gene interactions of hub genes in the Vizome database.

    Results: We enrolled 173 AML samples from the TCGA database in our study. A higher immune score was associated with a higher risk rating in the CALGB cytogenetics risk category (P = 0.0396) and worse overall survival outcomes (P = 0.0224). In Venn plots, 827 intersecting genes were screened with differential analysis. Functional enrichment clustering analysis revealed a significant association between the intersecting genes and the immune response. After PPI network analysis, 18 TME-related hub genes were identified. RS was calculated and the survival analysis results revealed that high RS was related to poor overall survival (P < 0.0001). In addition, the survival receiver operating characteristic (ROC) curve showed superior predictive accuracy (area under the curve = 0.725). Finally, the heatmap from the Vizome database demonstrated that the 18 hub genes showed high expression in patient samples.

    Conclusion: We identified 18 TME-related genes that are significantly associated with overall survival in AML patients from the TCGA database.
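    The Kaplan-Meier curves mentioned in the methods can be illustrated with a minimal sketch. This is a generic product-limit estimator in pure Python, not the study's actual analysis code; the function name kaplan_meier and the toy follow-up data are hypothetical. In a high-versus-low score comparison, each group would get its own curve.

```python
def kaplan_meier(times, events):
    """Product-limit survival estimate.

    times:  follow-up duration per patient
    events: 1 if death was observed at that time, 0 if censored
    Returns [(time, survival_probability)] at each distinct event time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Collect everyone whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving  # deaths and censored patients both leave the risk set
    return curve


# Toy data (hypothetical): four patients, the second one censored at t = 2.
curve = kaplan_meier([1, 2, 3, 4], [1, 0, 1, 1])
for t, s in curve:
    print(t, s)
```

    Censored patients reduce the number at risk without producing a step in the curve, which is why the second step in the toy example is computed over two patients rather than three.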

  6. Testicular Germ Cell Cancer (TCGA Firehose Legacy)

    • datacatalog.mskcc.org
    Updated Nov 19, 2020
    Cite
    Broad Institute (2020). Testicular Germ Cell Cancer (TCGA Firehose Legacy) [Dataset]. https://datacatalog.mskcc.org/dataset/10492
    Dataset updated
    Nov 19, 2020
    Dataset provided by
    Broad Institute
    MSK Library
    Description

    TCGA Testicular Germ Cell Cancer. Source data from GDAC Firehose. Previously known as TCGA Provisional. This dataset contains summary data visualizations and clinical data from a broad sampling of 156 tumor samples from 150 patients. The data was gathered as part of the Broad Institute of MIT and Harvard Firehose initiative, a cancer analysis pipeline. The clinical data includes mutation count, information about mutated genes, patient demographics, sample type, disease code, Adjuvant Postoperative Pharmaceutical Therapy Administered Indicator, American Joint Committee on Cancer Lymph Node Stage Code, American Joint Committee on Cancer Lymph Node Stage Code.1, American Joint Committee on Cancer Metastasis Stage Code, American Joint Committee on Cancer Publication Version Type, American Joint Committee on Cancer Tumor Stage Code, Bilateral Diagnosis Timing Type, Cause of death source, and Days to bilateral tumor dx. The dataset includes Next-Generation Clustered Heat Maps (NG-CHM) viewable via an embedded NG-CHM Heat Map Viewer, provided by MD Anderson Cancer Center, which provides a graphical environment for exploration of clustered or non-clustered heat map data. The data set also includes copy-number segment data downloadable as .seg files and viewable via the Integrative Genomics Viewer.

  7. Esophageal Carcinoma (TCGA, Firehose Legacy)

    • datacatalog.mskcc.org
    Updated Sep 29, 2020
    Cite
    Broad Institute (2020). Esophageal Carcinoma (TCGA, Firehose Legacy) [Dataset]. https://datacatalog.mskcc.org/dataset/10470
    Dataset updated
    Sep 29, 2020
    Dataset provided by
    Broad Institute
    MSK Library
    Description

    This dataset contains summary data visualizations and clinical data from a broad sampling of 182 esophageal adenocarcinomas.
    TCGA Esophageal Carcinoma. Source data from GDAC Firehose. Previously known as TCGA Provisional. The data was gathered as part of the Broad Institute of MIT and Harvard Firehose initiative, a cancer analysis pipeline. This dataset contains summary data visualizations and clinical data from a broad sampling of 186 carcinomas from 185 patients. The clinical data includes mutation count, information about mutated genes, patient demographics, disease status, tumor typing, numbers of samples per patient, Adjuvant Postoperative Pharmaceutical Therapy Administered Indicator, Alcohol Consumption Frequency, Alcohol History Documented, American Joint Committee on Cancer Lymph Node Stage Code, American Joint Committee on Cancer Metastasis Stage Code, American Joint Committee on Cancer Publication Version Type, American Joint Committee on Cancer Tumor Stage Code, Antireflux treatment type, and the presence of Barrett's esophagus. The dataset includes Next-Generation Clustered Heat Maps (NG-CHM) viewable via an embedded NG-CHM Heat Map Viewer, provided by MD Anderson Cancer Center, which provides a graphical environment for exploration of clustered or non-clustered heat map data. The data set also includes copy-number segment data downloadable as .seg files and viewable via the Integrative Genomics Viewer.

  8. Additional file 2 of Polyamine metabolism patterns characterized tumor...

    • figshare.com
    zip
    Updated Feb 23, 2024
    Cite
    Enkui Zhang; Chengsheng Ding; Shuchun Li; Batuer Aikemu; Xueliang Zhou; Xiaodong Fan; Jing Sun; Xiao Yang; Minhua Zheng (2024). Additional file 2 of Polyamine metabolism patterns characterized tumor microenvironment, prognosis, and response to immunotherapy in colorectal cancer [Dataset]. http://doi.org/10.6084/m9.figshare.22949331.v1
    Available download formats: zip
    Dataset updated
    Feb 23, 2024
    Dataset provided by
    figshare
    Authors
    Enkui Zhang; Chengsheng Ding; Shuchun Li; Batuer Aikemu; Xueliang Zhou; Xiaodong Fan; Jing Sun; Xiao Yang; Minhua Zheng
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Additional file 2: Figure S1. Boxplot of the expression of PAM genes in tumor and normal samples in the TCGA cohort. Heatmap of significantly different PAM genes in tumor and normal samples. GO enrichment analysis for PAM genes. KEGG enrichment analysis for PAM genes. Forest plot of prognostic genes with univariate Cox regression analysis. Protein–protein interaction for PAM-related genes. Venn diagram showing PAM genes after intersection of datasets. Figure S2. Survival prognostic analysis of each PAM gene with high and low expression using Kaplan–Meier analysis. Figure S3. Consensus clustering of the 55 PAM genes matrix for k = 3 of 1224 patients in the TCGA cohort combined with the GEO cohort. The relevant CDF curve and tracking plot of consensus clustering. Figure S4. Consensus clustering of 328 PAM prognostic genes in the meta dataset. GO enrichment analysis and KEGG enrichment analysis of 328 PAM prognostic genes. Expression heatmap of 328 PAM prognostic genes in geneCluster A and B subgroups. Forest plots for univariate and multivariate Cox analysis of PAMscore. Figure S5. Survival analysis of high and low score subgroups of PAMscore for different genders, different T, N, M stages and AJCC stages in CRC patients. Figure S6. RNA expression of ACAT2, SPHK1, SNED1, KPNA2, BZW2 and KIF15 in tumor and normal tissues in the TCGA dataset. IHC cell staining intensity in normal and tumor tissues of the CRC cohort. Figure S7. Expression levels of marker genes in the 6 cell types. Expression levels of marker genes in high and low cell groups.

  9. Table_1_Proteomics analysis of cancer tissues identifies IGF2R as a...

    • figshare.com
    docx
    Updated Jun 13, 2023
    + more versions
    Cite
    Bing Liu; Yuqiang Hu; Lixia Wan; Luan Wang; Liangjun Cheng; Hai Sun; Yaran Liu; Di Wu; Jiefei Zhu; Xiu Hong; Yang Li; Chong Zhou (2023). Table_1_Proteomics analysis of cancer tissues identifies IGF2R as a potential therapeutic target in laryngeal carcinoma.docx [Dataset]. http://doi.org/10.3389/fendo.2022.1031210.s001
    Available download formats: docx
    Dataset updated
    Jun 13, 2023
    Dataset provided by
    Frontiers
    Authors
    Bing Liu; Yuqiang Hu; Lixia Wan; Luan Wang; Liangjun Cheng; Hai Sun; Yaran Liu; Di Wu; Jiefei Zhu; Xiu Hong; Yang Li; Chong Zhou
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Background: Laryngeal cancer (LC) is a prevalent head and neck malignancy; however, the essential pathophysiological mechanism underlying its tumorigenesis and progression remains elusive. Due to the perduring scarcity of effective targeted drugs for laryngeal cancer, insights into the disease’s pathophysiological mechanisms would substantially impact the treatment landscape of laryngeal cancer.

    Methods: To ensure quality consistency, 10 tumor and 9 non-tumor samples underwent proteomic analysis on a single mass spectrometer using a label-free technique. Subsequently, gene expression variations between laryngeal squamous cell carcinoma and normal tissues were analyzed using The Cancer Genome Atlas (TCGA) database. Immunohistochemical expressions of insulin-like growth factor 2 receptor (IGF2R), fibronectin (FN), vimentin, and α-smooth muscle actin (SMA) in LC tissues and normal tissues were determined.

    Results: In the tumor group, significant variations were detected for 433 upregulated and 61 downregulated proteins. Moreover, the heatmap revealed that the expressions of RNA translation-related proteins and proteins involved in RNA metabolism, such as IGF2R, tenascin C (TNC), periostin (POSTN), proteasome 26S subunit ATPase 4 (PSMC4), serpin family A member 3 (SERPINA3), heat shock protein family B (small) member 6 (HSPB6), osteoglycin (OGN), chaperonin containing TCP1 subunit 6A (CCT6A), and chaperonin containing TCP1 subunit 6B (CCT6B), were prominently elevated in the tumor group. Nonsense-mediated RNA decay (NMD), RNA translation, and protein stability were significantly altered in LC tumors. IGF2R was remarkably upregulated in LC tumors. In the TCGA database, the IGF2R mRNA level was significantly upregulated in LSCC tissues. Additionally, IGF2R mRNA expression was lowest in clinical grade 1 samples, with no significant difference between grades 2 and 3. In LSCC patients, a significant positive correlation between IGF2R expression and the stromal score was detected using the ESTIMATE algorithm to estimate the immune score, stromal score, and tumor purity in the tumor microenvironment. Lastly, immunohistochemical analysis revealed that IGF2R is overexpressed in LC.

    Conclusion: These results demonstrate the vital role of IGF2R in LC carcinogenesis and progression and may facilitate the identification of new therapeutic targets for the prevention and treatment of LC.

  10. Additional file 2 of Epigenetic alterations are associated with tumor...

    • figshare.com
    application/x-rar
    Updated Feb 13, 2024
    Cite
    Liangliang Cai; Hua Bai; Jianchun Duan; Zhijie Wang; Shugeng Gao; Di Wang; Shuhang Wang; Jun Jiang; Jiefei Han; Yanhua Tian; Xue Zhang; Hao Ye; Minghui Li; Bingding Huang; Jie He; Jie Wang (2024). Additional file 2 of Epigenetic alterations are associated with tumor mutation burden in non-small cell lung cancer [Dataset]. http://doi.org/10.6084/m9.figshare.9119249.v1
    Available download formats: application/x-rar
    Dataset updated
    Feb 13, 2024
    Dataset provided by
    figshare
    Authors
    Liangliang Cai; Hua Bai; Jianchun Duan; Zhijie Wang; Shugeng Gao; Di Wang; Shuhang Wang; Jun Jiang; Jiefei Han; Yanhua Tian; Xue Zhang; Hao Ye; Minghui Li; Bingding Huang; Jie He; Jie Wang
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Additional file 2: Figure S1. Cell type composition by sample type. Figure S2. The comparison of DMPs in high TMB or low TMB based on lung cancer type LUAD or LUSC in our cohort. Figure S3. The comparison of DMPs in high TMB or low TMB in TCGA NSCLCs. Figure S4. The correlation of ASs with NOMs in TCGA database. Figure S5. Methylation status of 4 HOX family gene clusters in our cohort. Figure S6. Heatmap plot of the methylation probes of all 437 genes in TCGA database. Figure S7. Subgroup analysis of NOMs based on ethnicity (ASIAN, BLACK and WHITE) in TCGA database. Figure S8. The first 20 high frequency mutated genes of ASIAN, BLACK or WHITE population in TCGA database. (RAR 16686 kb)

  11. Correlation of AR and UPR gene expression in prostate cancer cohorts: Figure...

    • search.sourcedata.io
    zip
    Updated Jun 1, 2015
    Cite
    Sheng X; Arnoldussen YJ; Storm M; Tesikova M; Nenseth HZ; Zhao S; Fazli L; Rennie P; Risberg B; W; Danielsen H; Mills IG; Jin Y; Hotamisligil G; Saatcioglu F (2015). Correlation of AR and UPR gene expression in prostate cancer cohorts: Figure 1-A [Dataset]. https://search.sourcedata.io/panel/cache/20501
    Available download formats: zip
    Dataset updated
    Jun 1, 2015
    Authors
    Sheng X; Arnoldussen YJ; Storm M; Tesikova M; Nenseth HZ; Zhao S; Fazli L; Rennie P; Risberg B; W; Danielsen H; Mills IG; Jin Y; Hotamisligil G; Saatcioglu F
    License

    Attribution 2.0 (CC BY 2.0): https://creativecommons.org/licenses/by/2.0/
    License information was derived automatically

    Variables measured
    Tumors
    Description

    Possible correlation between AR- and UPR-associated gene expression was assessed in the global gene expression data available in the TCGA Prostate Adenocarcinoma cohort (n = 190) (http://www.cbioportal.org/public-portal/index.do). Tumors were stratified according to AR status into three groups, that is ARlow (n = 60), ARmedium (n = 70), and ARhigh (n = 60). The levels of UPR gene expression in the three groups were compared using Pearson's correlation analysis by the R software and presented as a heatmap. There were significant differences between the three groups (Supplementary Table S2). List of tagged entities: Tumors, real-time PCR (bao:BAO_0002084)
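    A stratification of this kind can be sketched as follows. The description does not say how AR status was turned into groups, so ranking tumors by AR expression and taking the lowest 60 and highest 60 is an assumption made here for illustration; the original analysis was done in R, and the function names (stratify_by_rank, pearson, group_means) and the synthetic expression values are hypothetical.

```python
import math


def stratify_by_rank(values, n_low, n_high):
    """Label the n_low smallest values 'low', the n_high largest 'high',
    and everything in between 'medium'."""
    if n_low + n_high > len(values):
        raise ValueError("group sizes exceed the number of samples")
    order = sorted(range(len(values)), key=lambda i: values[i])
    labels = ["medium"] * len(values)
    for i in order[:n_low]:
        labels[i] = "low"
    for i in order[len(order) - n_high:]:
        labels[i] = "high"
    return labels


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def group_means(values, labels):
    """Mean of values within each label."""
    sums, counts = {}, {}
    for v, g in zip(values, labels):
        sums[g] = sums.get(g, 0.0) + v
        counts[g] = counts.get(g, 0) + 1
    return {g: sums[g] / counts[g] for g in sums}


# Synthetic AR expression for 190 tumors (a shuffled 0.0..18.9 grid) and a
# synthetic UPR gene that tracks AR linearly; neither is real TCGA data.
ar = [((i * 37) % 190) / 10 for i in range(190)]
upr = [2 * a + 1 for a in ar]

labels = stratify_by_rank(ar, n_low=60, n_high=60)
print({g: labels.count(g) for g in ("low", "medium", "high")})
print(group_means(upr, labels))
print(pearson(ar, upr))
```

    Fixed group sizes reproduce the 60/70/60 split reported above; expression thresholds would be an equally plausible alternative and would generally give different group sizes.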

  12. File S1 - A 16-Gene Signature Distinguishes Anaplastic Astrocytoma from...

    • plos.figshare.com
    • figshare.com
    pdf
    Updated Jun 3, 2023
    Cite
    Soumya Alige Mahabala Rao; Sujaya Srinivasan; Irene Rosita Pia Patric; Alangar Sathyaranjandas Hegde; Bangalore Ashwathnarayanara Chandramouli; Arivazhagan Arimappamagan; Vani Santosh; Paturu Kondaiah; Manchanahalli R. Sathyanarayana Rao; Kumaravel Somasundaram (2023). File S1 - A 16-Gene Signature Distinguishes Anaplastic Astrocytoma from Glioblastoma [Dataset]. http://doi.org/10.1371/journal.pone.0085200.s001
    Available download formats: pdf
    Dataset updated
    Jun 3, 2023
    Dataset provided by
    PLOS ONE
    Authors
    Soumya Alige Mahabala Rao; Sujaya Srinivasan; Irene Rosita Pia Patric; Alangar Sathyaranjandas Hegde; Bangalore Ashwathnarayanara Chandramouli; Arivazhagan Arimappamagan; Vani Santosh; Paturu Kondaiah; Manchanahalli R. Sathyanarayana Rao; Kumaravel Somasundaram
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Supplementary Text: Supplementary Methods and Supplementary Results. Supplementary Tables: Table S1. Primers used for RT-qPCR. Table S2. List of genes selected for expression analysis by PCR array. Table S3. Number of AA and GBM patient samples in training set, test set and three independent cohorts of patient samples (TCGA, GSE1993 and GSE4422). Table S4. Expression of 16 genes in AA (n = 20) and GBM (n = 54) samples of the test set. Table S5. Expression of 16 genes in Grade III glioma (n = 27) and GBM (n = 152) samples of the TCGA dataset. Table S6. Expression of 16 genes in AA (n = 19) and GBM (n = 39) samples of GSE1993 dataset. Table S7. Expression of 16 genes in AA (n = 5) and GBM (n = 71) samples of the GSE4422 dataset. Supplementary Figures: Figure S1. Heat map of one-way hierarchical clustering of 16 PAM-identified genes in AA (n = 20) and GBM (n = 54) patient samples in the test set. A dual-color code was used, with red and green indicating up- and down regulation, respectively. Figure S2. Heat map of one-way hierarchical clustering of 16 PAM-identified genes in grade III glioma (n = 27) and GBM (n = 152) patient samples in TCGA dataset. A dual-color code was used, with red and green indicating up- and down regulation, respectively. Figure S3. A. Heat map of one-way hierarchical clustering of 16 PAM-identified genes in AA (n = 19) and GBM (n = 39) patient samples in GSE1993 dataset. A dual-color code was used, with red and green indicating up- and down regulation, respectively. B. PCA was performed using expression values of 16-PAM identified genes between AA and GBM samples in GSE1993 dataset. A scatter plot is generated using the first two principal components for each sample. The color of the samples is as indicated. C. The detailed probabilities of 10-fold cross-validation for the samples of GSE1993 dataset based on the expression values of 16 genes are shown. 
For each sample, its probability as AA (orange color) and GBM (blue color) are shown and it was predicted by the PAM program as either AA or GBM based on which grade's probability is higher. The original histological grade of the samples is shown on the top. Figure S4. A. Heat map of one-way hierarchical clustering of 16 PAM-identified genes in AA (n = 5) and GBM (n = 71) patient samples in GSE4422 dataset. A dual-color code was used, with red and green indicating up- and down regulation, respectively. B. PCA was performed using expression values of 16-PAM identified genes between AA and GBM samples in GSE4422 dataset. A scatter plot is generated using the first two principal components for each sample. The color of the samples is as indicated. C. The detailed probabilities of 10-fold cross-validation for the samples of GSE4422 dataset based on the expression values of 16 genes are shown. For each sample, its probability as AA (orange color) and GBM (blue color) are shown and it was predicted by the PAM program as either AA or GBM based on which grade's probability is higher. The original histological grade of the samples is shown on the top. Figure S5. A. The detailed probabilities of 10-fold cross-validation for the samples of GSE4271 dataset based on the expression values of 16 genes are shown. For each sample, its probability as AA (orange color) and GBM (blue color) are shown and it was predicted by the PAM program as either AA or GBM based on which grade's probability is higher. The original histological grade of the samples is shown on the top. B. The average Age at Diagnosis along with standard deviation is plotted for Authentic AAs (n = 12), Authentic GBMs (n = 68), Discordant AAs (n = 10) and Discordant GBMs (n = 8) of GSE4271 dataset. C. The Kaplan Meier survival analysis of samples of GSE4271 dataset. Figure S6. PAM analysis of the Petalidis-gene signature in TCGA dataset. A. Plot showing classification error for the Petalidis gene set in TCGA dataset. 
The threshold value of 0.0 corresponded to all 54 genes which classified AA (n = 27) and GBM (n = 604) samples with classification error of 0.000. B. The detailed probabilities of 10-fold cross-validation for the samples of TCGA dataset based on Petalidis gene set are shown. For each sample, its probability as AA (green color) and GBM (red color) are shown and it was predicted by the PAM program as either AA or GBM based on which grade's probability is higher. The original histological grade of the samples is shown on the top. Figure S7. PAM analysis of the Phillips gene signature in our dataset. A. Plot showing classification error for the Phillips gene set in our dataset. The threshold value of 0.0 that correspond to all 5 genes which classified AA (n = 50) and GBM (n = 132) samples with classification error of 0.159. B. The detailed probabilities of 10-fold cross-validation for the samples of our dataset based on Phillips gene set are shown. For each sample, its probability as AA (orange color) and GBM (blue color) are shown and it was predicted by the PAM program as either AA or GBM based on which grade's probability is higher. The original histological grade of the samples is shown on the top. Figure S8. PAM analysis of the Phillips gene signature in Phillips dataset. A. Plot showing classification error for the Phillips gene set in Phillips dataset. The threshold value of 0.0 that correspond to all 8 genes which classified AA (n = 24) and GBM (n = 76) samples with classification error of 0.169. B. The detailed probabilities of 10-fold cross-validation for the samples of our dataset based on Phillips gene set are shown. For each sample, its probability as AA (orange color) and GBM (blue color) are shown and it was predicted by the PAM program as either AA or GBM based on which grade's probability is higher. The original histological grade of the samples is shown on the top. Figure S9. PAM analysis of the Phillips gene signature in GSE4422 dataset. A. 
Plot showing classification error for the Phillips gene set in GSE4422 dataset. The threshold value of 0.0 corresponded to all 8 genes, which classified AA (n = 5) and GBM (n = 76) samples with classification error of 0.065. B. The detailed probabilities of 10-fold cross-validation for the samples of our dataset based on Phillips gene set are shown. For each sample, its probability as AA (orange color) and GBM (blue color) are shown and it was predicted by the PAM program as either AA or GBM based on which grade's probability is higher. The original histological grade of the samples is shown on the top. Figure S10. PAM analysis of the Phillips-gene signature in TCGA dataset. A. Plot showing classification error for the Phillips gene set in TCGA dataset. The threshold value of 0.0 corresponded to all 8 genes which classified AA (n = 27) and GBM (n = 604) samples with classification error of 0.008. B. The detailed probabilities of 10-fold cross-validation for the samples of TCGA dataset based on Phillips gene set are shown. For each sample, its probability as AA (orange color) and GBM (blue color) are shown and it was predicted by the PAM program as either AA or GBM based on which grade's probability is higher. The original histological grade of the samples is shown on the top. Figure S11. Network obtained by using 16-genes of classification signature as input genes to Bisogenet plugin in Cytoscape. The generated network had 252 nodes (genes) and 1498 edges (interactions between genes/proteins). This network consisted of the seed proteins with their immediate interacting neighbors. The nodes corresponding to the input genes are highlighted by the bigger node size as compared to the rest of the interacting partners. The color code is as indicated in the scale. (PDF)

  13. Probabilities of source aberration with downstream target for colon cancer...

    • plos.figshare.com
    xls
    Updated Jun 10, 2023
    Qian Ke; Wikum Dinalankara; Laurent Younes; Donald Geman; Luigi Marchionni (2023). Probabilities of source aberration with downstream target for colon cancer subtypes. [Dataset]. http://doi.org/10.1371/journal.pcbi.1008944.t008
    Available download formats: xls
    Dataset updated: Jun 10, 2023
    Dataset provided by: PLOS (http://plos.org/)
    Authors: Qian Ke; Wikum Dinalankara; Laurent Younes; Donald Geman; Luigi Marchionni
    License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    For CRIS-class subtypes of colon cancer, the heatmap represents the probabilities that the indicated gene is a DNA-aberrant source gene with some downstream RNA-aberrant target. The sources are selected from the set of core genes for coverings of the given tissue; the selection criterion is that the probability of a DNA-aberration is high for at least one of the subtypes for that tissue.

  14. Probabilities of source aberration with downstream target for breast cancer...

    • plos.figshare.com
    xls
    Updated Jun 10, 2023
    Qian Ke; Wikum Dinalankara; Laurent Younes; Donald Geman; Luigi Marchionni (2023). Probabilities of source aberration with downstream target for breast cancer subtypes. [Dataset]. http://doi.org/10.1371/journal.pcbi.1008944.t006
    Available download formats: xls
    Dataset updated: Jun 10, 2023
    Dataset provided by: PLOS (http://plos.org/)
    Authors: Qian Ke; Wikum Dinalankara; Laurent Younes; Donald Geman; Luigi Marchionni
    License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    For PAM50 subtypes of breast cancer, the heatmap represents the probabilities that the indicated gene is a DNA-aberrant source gene with some downstream RNA-aberrant target. The sources are selected from the set of core genes for coverings of the given tissue; the selection criterion is that the probability of a DNA-aberration is high for at least one of the subtypes for that tissue. Core sources with varying probabilities present interesting candidates for discrimination between subtypes. For example, the DNA-aberration frequency of TP53 is much higher in the HER2-enriched and Basal-like subtypes than in Luminal A and Luminal B, whereas an aberration in PIK3CA is less frequent among basal-like samples than among the other subtypes.

  15. Additional file 3 of An integrative gene expression signature analysis...

    • springernature.figshare.com
    xlsx
    Updated Jun 3, 2023
    Mingli Yang; Thomas B. Davis; Lance Pflieger; Michael V. Nebozhyn; Andrey Loboda; Heiman Wang; Michael J. Schell; Ramya Thota; W. Jack Pledger; Timothy J. Yeatman (2023). Additional file 3 of An integrative gene expression signature analysis identifies CMS4 KRAS-mutated colorectal cancers sensitive to combined MEK and SRC targeted therapy [Dataset]. http://doi.org/10.6084/m9.figshare.19343150.v1
    Available download formats: xlsx
    Dataset updated: Jun 3, 2023
    Dataset provided by: figshare
    Authors: Mingli Yang; Thomas B. Davis; Lance Pflieger; Michael V. Nebozhyn; Andrey Loboda; Heiman Wang; Michael J. Schell; Ramya Thota; W. Jack Pledger; Timothy J. Yeatman
    License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Additional file 3. Fig S1-S11.pdf. Fig. S1. The 5-gene dasatinib sensitivity (Dasa-S) signature score predicted the trend of dasatinib sensitivity in multiple CRC cell lines (n=50). Fig. S2. The EMT genes were strongly associated with regional metastasis and disease recurrence in 2202 CRC tumors. Fig. S3. The CMS4 subtype was strongly correlated with the 13-gene MEKi “bypass”-resistance (13-gene BP), PC1, EMT, SRC activation and 5-gene Dasa-S signature scores in 1485 primary CRC tumors. Fig. S4. The CMS4 subtype was strongly correlated with the 13-gene MEKi “bypass”-resistance (13-gene BP), PC1, EMT, SRC activation and 5-gene Dasa-S signature scores in 764 metastatic CRC tumors. Fig. S5. Scatter plots of Hu-Lgr5-ISC, Hu-EphB2-ISC, Hu-Late TA, and Hu-Proliferation vs EMT signature scores, respectively, in Marisa 585 CRCs. Fig. S6. Scatter plots of Hu-Lgr5-ISC, Hu-EphB2-ISC, Hu-Late TA, and Hu-Proliferation vs EMT signature scores in TCGA 677 CRCs. Fig. S7. Spearman correlation heatmap of CMS1-4* scores, signature scores and EMT-associated genes in Marisa 585 CRC tumors. Fig. S8. Spearman correlation heatmap of CMS1-4* scores, signature scores and EMT-associated genes in TCGA 677 CRC tumors. Fig. S9. No distinct association of MUT vs WT APC/TP53 tumors with the CMS subtypes. Fig. S10. Correlation analysis of 154 CRC cell lines and in vitro drug treatment of HCT116 cells with MEKi + SRCi in CSC vs. non-CSC media. Fig. S11. In vitro drug treatment of LIM2405 and HT29 cells with MEKi + SRCi in CSC vs. non-CSC media.

  16. A 4-miRNA signature to predict survival in glioblastomas

    • plos.figshare.com
    tiff
    Updated Jun 3, 2023
    Simon K. Hermansen; Mia D. Sørensen; Anker Hansen; Steen Knudsen; Alvaro G. Alvarado; Justin D. Lathia; Bjarne W. Kristensen (2023). A 4-miRNA signature to predict survival in glioblastomas [Dataset]. http://doi.org/10.1371/journal.pone.0188090
    Available download formats: tiff
    Dataset updated: Jun 3, 2023
    Dataset provided by: PLOS ONE
    Authors: Simon K. Hermansen; Mia D. Sørensen; Anker Hansen; Steen Knudsen; Alvaro G. Alvarado; Justin D. Lathia; Bjarne W. Kristensen
    License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Glioblastomas are among the most lethal cancers; however, recent advances in survival have increased the need for better prognostic markers. microRNAs (miRNAs) hold great prognostic potential being deregulated in glioblastomas and highly stable in stored tissue specimens. Moreover, miRNAs control multiple genes representing an additional level of gene regulation possibly more prognostically powerful than a single gene. The aim of the study was to identify a novel miRNA signature with the ability to separate patients into prognostic subgroups. Samples from 40 glioblastoma patients were included retrospectively; patients were comparable on all clinical aspects except overall survival enabling patients to be categorized as short-term or long-term survivors based on median survival. A miRNome screening was employed, and a prognostic profile was developed using leave-one-out cross-validation. We found that expression patterns of miRNAs; particularly the four miRNAs: hsa-miR-107_st, hsa-miR-548x_st, hsa-miR-3125_st and hsa-miR-331-3p_st could determine short- and long-term survival with a predicted accuracy of 78%. Heatmap dendrograms dichotomized glioblastomas into prognostic subgroups with a significant association to survival in univariate (HR 8.50; 95% CI 3.06–23.62; p

  17. File S1 - Expression of VEGF and Semaphorin Genes Define Subgroups of Triple...

    • figshare.com
    pdf
    Updated Jun 1, 2023
    R. Joseph Bender; Feilim Mac Gabhann (2023). File S1 - Expression of VEGF and Semaphorin Genes Define Subgroups of Triple Negative Breast Cancer [Dataset]. http://doi.org/10.1371/journal.pone.0061788.s001
    Available download formats: pdf
    Dataset updated: Jun 1, 2023
    Dataset provided by: PLOS ONE
    Authors: R. Joseph Bender; Feilim Mac Gabhann
    License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Supporting information for this study. This file contains Tables S1–S7, which list the datasets and genes analyzed in this study, basic statistics on the gene expression measurements, clinical trial results with anti-angiogenic agents, and genes correlated with PC3a and PC4a. It also contains Figures S1–S13, which contain information on the relationship of principal components to breast cancer subgroups, relationships between the different principal component analyses performed on various datasets, and additional details on K-means clustering; plus heatmaps of the TCGA datasets and survival analyses of several of the breast cancer subgroups considered in this study. (PDF)

  18. Breast cancer validation information.

    • plos.figshare.com
    xls
    Updated Mar 21, 2024
    Marta Lovino; Elisa Ficarra; Loredana Martignetti (2024). Breast cancer validation information. [Dataset]. http://doi.org/10.1371/journal.pone.0289699.t005
    Available download formats: xls
    Dataset updated: Mar 21, 2024
    Dataset provided by: PLOS ONE
    Authors: Marta Lovino; Elisa Ficarra; Loredana Martignetti
    License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    MicroRNAs (miRNAs) are small molecules that play an essential role in regulating gene expression by post-transcriptional gene silencing. Their study is crucial in revealing the fundamental processes underlying pathologies and, in particular, cancer. To date, most studies on miRNA regulation consider the effect of specific miRNAs on specific target mRNAs, providing wet-lab validation. However, few tools have been developed to explain the miRNA-mediated regulation at the protein level. In this paper, the MoPC computational tool is presented, which relies on the partial correlation between mRNAs and proteins conditioned on the miRNA expression to predict miRNA-target interactions in multi-omic datasets. MoPC returns the list of significant miRNA-target interactions and plots the significant correlations on a heatmap in which the miRNAs and targets are ordered by chromosomal location. The software was applied to three TCGA/CPTAC datasets (breast, glioblastoma, and lung cancer), returning enriched results in three independent target databases.
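
    The description above mentions the partial correlation between mRNA and protein levels conditioned on miRNA expression. As a rough illustration only (this is not MoPC's code, and the function names and toy vectors below are hypothetical), the textbook first-order partial correlation for a single conditioning variable can be sketched in plain Python:

```python
from math import sqrt


def pearson(a, b):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / sqrt(var_a * var_b)


def partial_corr(x, y, z):
    """First-order partial correlation of x and y given a single variable z.

    Uses r_xy.z = (r_xy - r_xz * r_yz) / sqrt((1 - r_xz^2) * (1 - r_yz^2)).
    """
    r_xy, r_xz, r_yz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (r_xy - r_xz * r_yz) / sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Hypothetical toy values: mRNA and protein levels for one gene across four
# samples, plus the expression of one miRNA in the same samples.
mrna = [1.0, 1.0, -1.0, -1.0]
protein = [2.0, 1.0, -1.0, -2.0]
mirna = [1.0, -1.0, -1.0, 1.0]

print(partial_corr(mrna, protein, mirna))
```

    In this toy example the miRNA vector is uncorrelated with both other vectors, so conditioning on it leaves the mRNA-protein correlation unchanged; a real analysis would also need a significance test, which is omitted here.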

  19. Additional file 1 of Dissecting cellular states of infiltrating...

    • springernature.figshare.com
    zip
    Updated Aug 16, 2024
    Aiai Shi; Min Yan; Bo Pang; Lin Pang; Yihan Wang; Yujia Lan; Xinxin Zhang; Jinyuan Xu; Yanyan Ping; Jing Hu (2024). Additional file 1 of Dissecting cellular states of infiltrating microenvironment cells in melanoma by integrating single-cell and bulk transcriptome analysis [Dataset]. http://doi.org/10.6084/m9.figshare.24792276.v1
    Available download formats: zip
    Dataset updated: Aug 16, 2024
    Dataset provided by: figshare
    Authors: Aiai Shi; Min Yan; Bo Pang; Lin Pang; Yihan Wang; Yujia Lan; Xinxin Zhang; Jinyuan Xu; Yanyan Ping; Jing Hu
    License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Additional file 1: Figure S1. Immune cell expression heterogeneity and cell subsets distribution across patients, related to Fig. 1. (A) UMAP projection of 2068 single T cells (left), 515 B cells (middle) and 126 macrophages (right) from 19 patients. Each dot corresponds to one single cell, colored according to cell cluster. (B) Heatmap of T cell clusters (left), B cell clusters (middle) and macrophage clusters (right) with unique signature genes. Top 20 specifically expressed genes are marked alongside, if available. (C-E) Bar plots showing the number (left panel) and fraction (right panel) of cells originating from the 19 patients for each subcluster of T cells (C), B cells (D) and macrophages (E). (F) The fractions of the 15 subclusters, NK cells, CAFs and endothelial cells in each patient. Figure S2. Cell subcluster characterization of functional status. (A) Top 100 ranked (based on fold change) differentially expressed genes indicative of the functional status in each T-cell cluster (top) and z-score normalized mean expression of known functional marker sets across single T cells (bottom). The numbers in parentheses correspond to the ranks and the key markers (Table S1) are highlighted by red color. (B) Heatmap showing the log2-transformed expression of selected T cell function-associated genes in single cells. (C) Violin plots showing the expression profile of selected genes involved in T-cell cytotoxicity (top) and exhaustion (bottom), stratified by T-cell clusters. (D) Top 100 ranked (based on fold change) differentially expressed genes indicative of the functional status in each cluster (C1, C2 and C3 for B cells; C0 and C1 for macrophages). The numbers in parentheses correspond to the ranks and the key markers (Table S1) are highlighted by red color. 
(E) Z-score normalized mean expression of known functional marker sets across single B cells (top) and the log2-transformed expression of selected B cell function-associated genes in single cells (bottom). (F-G) Heatmaps showing the z-score normalized mean expression of known functional marker sets across single macrophages and their log2-transformed expression in single cells. Blue boxes highlight the key markers and the numbers in brackets represent the total times appeared in literature. Figure S3. MM17 reference profile and performance assessment. (A) Heatmap of MM17 reference profile depicting z-score normalized expression of each gene across 17 tumor microenvironment (TME) cell subsets. (B-C) Correlation between predicted proportions and true proportions for each individual cell state (B) and for each individual patient (C). (D) Confusion matrix of all TME cell states. Figure S4. Functional associations of tumor microenvironment (TME) cell states. (A-D) Enriched GO biological processes of T_CD8_Cytotoxic (A), B_Non-regulatory (B), T_CD8_Mixed (C) and CAF (D) based on gene set enrichment analysis (GSEA). Figure S5. Associations between cell states and clinico-pathological variables. (A-C) Associations of molecular and clinical features with cell states. (A) Boxplots showing the cell fraction distribution of each cell state stratified by tumor type (left), gender (middle) and tumor status (right). (B) Boxplots showing the cell fraction distribution of each cell state stratified by integrative age (left), tumor stage (middle), and race (right). (C) The fraction distribution of cell states stratified by TCGA subtypes. Median value difference of cell fraction among subtypes was evaluated using Mood’s test. Wilcoxon rank sum tests were used to examine the significance of the differences between two groups. 
For tumor stage, patients with Stage 0, Stage I, IA, IB, Stage II, IIA, IIB and IIC are grouped as “LOW” (n=154), Stage III, IIIA, IIIB, IIIC and Stage IV are grouped as “HIGH” (n=162). * P < 0.05, ** P < 0.01, *** P < 0.001, **** P < 0.0001. Figure S6. Associations between cell states and immune phenotypes, related to Fig. 4. (A) Scatterplots showing relationships between T_CD8_Cytotoxic and M_M2 (top), B_Regulatory and T_CD4_Exhausted (middle), CAF and T_CD8_Mixed (Cytotoxic and Exhausted) (bottom). Pearson correlations and p values are indicated. For significant correlations, linear models are shown as blue lines. (B) Contributions of the cell states to CA-1 (top) and CA-2 (bottom). (C) Scatter chart of the Pearson correlations of CA-1 and CA-2 with cell states. Different colors indicate whether or not significant associations between CA scores and cell states were observed (p < 0.05). (D) Boxplots showing the cell fraction distribution of each cell state stratified by the median values of CA-1 (top) and CA-2 (bottom), respectively. Wilcoxon rank sum tests were used to examine the significance of the differences between two groups. (E) The distribution of cell states across the three immunophenotype groups classified by median values of CA-1 and CA-2. Median value difference of cell fraction among groups was evaluated using Mood’s test. Then the statistical significance between any two groups was evaluated by Wilcoxon rank sum test and p values are shown at the top of each panel. * P < 0.05, ** P < 0.01, *** P < 0.001, **** P < 0.0001. Figure S7. Assessment on association between tumor microenvironment immune phenotypes (TMIP) and response to immune checkpoint blockade (ICB) in melanoma. (A) Box plots showing differences of CA-1 (upper panel) and CA-2 (middle panel) scores between responders and non-responders in patients under immunotherapy in TCGA data. 
Bar charts showing numbers of responders and non-responders with different TMIPs in those patients (lower panel). (B) Projection of each patient of Riaz et al. dataset onto the first and second component of the correspondence analysis. Left panel showed pre-treatment samples and right panel denoted on-treatment patients. Non-responders were colored blue, and responders were colored orange. Points denoted Ipi-naive patients, and triangles denoted Ipi-progressed patients. (C) Box plots showing differences of CA-2 scores between responders and non-responders in anti-PD1 pre-treatment patients (upper panel) and on-treatment patients (lower panel) who progressed after a first-line anti-CTLA4 treatment (Ipi-progressed) in Riaz et al. data. (D-E) Comparison of each cell state proportion between responders and non-responders in Ipi-progressed patients based on pre-treatment (D) and on-treatment (E) transcriptomic profiles. ns: not significant; *: p < 0.05. Table S1. Gene lists used for functional analyses. Table S3. Demographics and characteristics of patients with melanoma. Table S4. Uni- and multivariate analysis for progress-free survival (316 samples). Table S5. Uni- and multivariate analysis for overall survival (316 samples).

  20. Additional file 4 of Systematic interrogation of mutation groupings reveals...

    • springernature.figshare.com
    application/x-gzip
    Updated Jun 11, 2023
    Michal R. Grzadkowski; Hannah D. Holly; Julia Somers; Emek Demir (2023). Additional file 4 of Systematic interrogation of mutation groupings reveals divergent downstream expression programs within key cancer genes [Dataset]. http://doi.org/10.6084/m9.figshare.14551655.v1
    Available download formats: application/x-gzip
    Dataset updated: Jun 11, 2023
    Dataset provided by: figshare
    Authors: Michal R. Grzadkowski; Hannah D. Holly; Julia Somers; Emek Demir
    License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Additional file 4: Figure S3. Clustering subgrouping model coefficients reveals structure of mutation heterogeneity. We applied hierarchical clustering to examine the average regression model gene coefficients across all forty cross-validation folds for each of our subgrouping tasks. Subgroupings with task AUCs of below 0.7 were omitted, as were genes that did not have an absolute model coefficient ranked in the top five for any of the remaining tasks. Distances between subgroupings were computed by taking the inverse of the Spearman correlation across all gene coefficients; these were then used to cluster subgroupings into five groups. To facilitate presentation, here we only show these clusterings for subgroupings which did not have another subgrouping in the same cluster with a higher AUC and a Jaccard index of at least 0.9 with respect to the subgroupings’ mutated samples. The subgroupings with the highest AUC in each cluster are bolded, as is the gene-wide task. An asterisk is placed next to the AUCs of subgroupings with cv-significantly better performance than that of the gene-wide task. We include here these heatmaps for GATA3, TP53, and PIK3CA in METABRIC-(LumA) as well as KRAS in TCGA-LUAD. The corresponding figures for the remaining cases can be found at our data portal under Figures/S3 - Gene Coefficient Heatmaps. The names of these figures have the format (expr-source)_(cohort)_(gene)_auto-heatmap_Ridge.svg.

Namshik Han (2018). The Cancer Genome Atlas (TCGA) RNA-seq meta-analysis [Dataset]. http://doi.org/10.6084/m9.figshare.5851743.v1

The Cancer Genome Atlas (TCGA) RNA-seq meta-analysis

2 scholarly articles cite this dataset (View in Google Scholar)
Available download formats: xlsx
Dataset updated: Feb 2, 2018
Dataset provided by: Figshare (http://figshare.com/)
Authors: Namshik Han
License: Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically

Description

TCGA RNA-seq V2 Level3 data were downloaded from TCGA Genomic Data Commons Data Portal (https://gdc-portal.nci.nih.gov), consisting of 11,303 samples in 34 cancer projects (33 cancer types). Nine cancer types that do not have corresponding non-tumour samples were filtered out, and the analysis was focused on tumour versus non-tumour comparison. 24 cancer types were used in this meta-analysis: BLCA, BRCA, CESC, CHOL, COAD, ESCA, GBM, HNSC, KICH, KIRC, KIRP, LIHC, LUAD, LUSC, PAAD, PCPG, PRAD, READ, SARC, SKCM, STAD, THCA, THYM, UCEC (https://gdc-portal.nci.nih.gov). The nine filtered cancer types were ACC, DLBC, LAML, LGG, MESO, OV, TGCT, UCS and UVM. To extract expression values from TCGA RNA-seq data, we used genomic coordinates to retrieve UCSC Transcript IDs that correspond to the identifiers in TCGA RNA-seq V2 Level3 data (isoform level). The GAF (General Annotation Format) file was used to map the coordinates to UCSC Transcript IDs, and it was downloaded from https://tcga-data.nci.nih.gov/docs/GAF/GAF.hg19.June2011.bundle/outputs/TCGA.hg19.June2011.gaf. This file contains genomic annotations shared by all TCGA projects. More details of the GAF file format can be found at https://tcga-data.nci.nih.gov/docs/GAF/GAF3.0/GAF_v3_file_description.docx. We filtered out any coding exons overlapping UCSC Transcript IDs to eliminate expression values of coding genes and evaluate lncRNA expression. We could find the expression values of 443 pcRNAs and 203 tapRNAs in TCGA data, as many non-coding regions are not yet fully annotated in the TCGA RNA-seq V2 Level3 data. The expression values of pcRNAs and tapRNAs were extracted and clustered by the unsupervised Pearson correlation method (Supplementary Figure 18A). 
The expression values of tapRNA-associated coding genes were also extracted and used to generate the heat-map (Supplementary Figure 18B), which shows a similar pattern of expression to that of tapRNAs across the cancer types. To show that tapRNAs and associated coding genes have similar expression profiles in cancers we generated a Spearman's Rank-Order Correlation heatmap (Figure 6A) between tapRNAs and their associated coding genes based on the TCGA RNA-seq data. We used the MATLAB function corr to calculate the Spearman's rho. This function takes two matrices X (197-by-8,850 expression profiling matrix of tapRNA) and Y (197-by-8,850 expression profiling matrix of tapRNA-associated coding gene) and returns an 8,850-by-8,850 matrix containing the pairwise correlation coefficient between each pair of 8,850 columns (TCGA cancer samples in Supplementary Figure 18A and B). Thus, the rank-order correlation matrix that we computed from the matrices of expression profiling data (Supplementary Figure S18A and B) allowed us to compare the correlation between two column vectors, i.e. cancer samples. This function also returns a matrix of p-values for testing the hypothesis of no correlation against the alternative that there is a nonzero correlation. Each element of the matrix of p-values is the p-value for the corresponding element of Spearman's rho. The p-values for Spearman's rho are calculated using large-sample approximations. To check the significance level of the correlation between a tapRNA and its associated coding gene, the diagonal of the p-value matrix was extracted and used. The median is 1.31 × 10⁻¹¹ and the mean is 1.03 × 10⁻⁴ with standard deviation 0.0029. To identify cancer-specific tapRNAs, we considered not only the global expression pattern of a given tapRNA in each cancer type, but also the expression pattern of specific sub-groups that are significantly distinct, to take into account cancer sample heterogeneity. 
Thus, two conditions were applied: (1) the average expression level of a tapRNA in a given cancer type is in the top 10% or bottom 10% and (2) a tapRNA has at least 10% of samples in a given cancer type that are significantly up-regulated (Z-score > 2) or down-regulated (Z-score < -2).
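
The Spearman step in the description can be illustrated with a small sketch. The original analysis used the MATLAB function corr; the Python below is only an assumed, standard-library equivalent of the diagonal of that correlation matrix, i.e. one rho per sample (column) computed between the tapRNA profile and the matching coding-gene profile. The function names and the toy 4-by-2 matrices are hypothetical, and the p-value computation is not reproduced.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the value at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions i..j are 0-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(a, b):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / sqrt(var_a * var_b)


def spearman(a, b):
    """Spearman's rho: Pearson correlation of the rank-transformed data."""
    return pearson(average_ranks(a), average_ranks(b))


def per_sample_rho(x, y):
    """Rho for each column (sample) of x against the same column of y.

    x and y are lists of rows (tapRNA / coding-gene pairs), each row holding
    one value per sample, mirroring the rows-by-samples layout in the text.
    """
    n_samples = len(x[0])
    return [
        spearman([row[s] for row in x], [row[s] for row in y])
        for s in range(n_samples)
    ]


# Hypothetical toy data: 4 tapRNA/coding-gene pairs (rows) in 2 samples (columns).
tap = [[1.0, 5.0], [2.0, 3.0], [3.0, 4.0], [4.0, 1.0]]
coding = [[10.0, 2.0], [20.0, 3.0], [30.0, 1.0], [40.0, 9.0]]

rhos = per_sample_rho(tap, coding)
print(rhos)
```

Computing only the matched columns avoids building the full 8,850-by-8,850 matrix when just the diagonal is needed; whether the original analysis did so is not stated in the description.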
