Background: Smoking was strongly associated with breast cancer in previous studies, but whether smoking promotes breast cancer through DNA methylation remains unknown. Methods: Two-sample Mendelian randomization (MR) analyses were conducted to assess the causal effect of smoking-related DNA methylation on breast cancer risk. We used 436 smoking-related CpG sites extracted from 846 middle-aged women in the ARIES project as exposure data. We collected summary data on breast cancer from one of the largest meta-analyses, including 69,501 cases of ER+ breast cancer and 21,468 cases of ER− breast cancer. A total of 485 single-nucleotide polymorphisms (SNPs) were selected as instrumental variables (IVs) for smoking-related DNA methylation. We further performed an MR Steiger test to estimate the likely direction of the causal effect between DNA methylation and breast cancer. We also conducted colocalization analysis to evaluate whether smoking-related CpG sites shared a common causal SNP with breast cancer in a given region. Results: Four associations remained significant after multiple testing correction: the CpG sites cg2583948 [OR = 0.94, 95% CI (0.91–0.97)], cg0760265 [OR = 1.07, 95% CI (1.03–1.11)], cg0420946 [OR = 0.95, 95% CI (0.93–0.98)], and cg2037583 [OR = 1.09, 95% CI (1.04–1.15)] were associated with the risk of ER+ breast cancer. In the MR Steiger test, the instruments for all four smoking-related CpG sites explained more variance in DNA methylation than in ER+ breast cancer (all p < 1.83 × 10^-11). Colocalization analysis provided strong evidence (PPH4 > 0.8) for a common causal SNP between the CpG site cg2583948 [with IMP3 expression (PPH4 = 0.958)] and ER+ breast cancer. There were no causal associations between smoking-related DNA methylation and ER− breast cancer. Conclusions: These findings highlight potential targets for the prevention of ER+ breast cancer. Tissue-specific epigenetic data are required to confirm these results.
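The abstract does not say which MR estimator was applied to each CpG site. As a minimal sketch, assuming a single SNP instrument per CpG and the standard Wald ratio estimator (an assumption, not the authors' confirmed method), the odds ratio and its 95% confidence interval could be derived from summary statistics as follows. The function name and all input numbers are hypothetical.

```python
import math

def wald_ratio_or(beta_exposure, beta_outcome, se_outcome, z=1.96):
    """Wald ratio MR estimate expressed as an odds ratio with a confidence interval.

    beta_exposure: SNP effect on DNA methylation (exposure)
    beta_outcome:  SNP effect on breast cancer (log odds)
    se_outcome:    standard error of beta_outcome
    The standard error uses the first-order approximation that ignores
    uncertainty in beta_exposure.
    """
    if beta_exposure == 0:
        raise ValueError("instrument has no effect on the exposure")
    beta_mr = beta_outcome / beta_exposure      # causal log odds per unit of methylation
    se_mr = se_outcome / abs(beta_exposure)     # approximate standard error
    lower = math.exp(beta_mr - z * se_mr)
    upper = math.exp(beta_mr + z * se_mr)
    return math.exp(beta_mr), lower, upper

# Hypothetical summary statistics, for illustration only
odds_ratio, ci_low, ci_high = wald_ratio_or(
    beta_exposure=0.5, beta_outcome=-0.03, se_outcome=0.008
)
print(f"OR = {odds_ratio:.2f}, 95% CI ({ci_low:.2f}-{ci_high:.2f})")
```

With several instruments per CpG site, an inverse-variance weighted combination of the per-SNP ratios would be the usual extension.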
Co-occurrence and mutual exclusivity (COME) of DNA methylation refer to two or more genes whose DNA methylation tends to be positively or negatively correlated across samples. Although COME of gene mutations in pan-cancer has been well explored, little is known about the COME of DNA methylation in pan-cancer. Here, we systematically explored the COME of DNA methylation profiles in diverse human cancers. A total of 5,128,332 COME events were identified in 14 main cancer types in The Cancer Genome Atlas (TCGA). We also identified functional epigenetic modules of the zinc finger gene family in six cancer types by integrating the gene expression data, the DNA methylation data and the network of frequently occurring COME events. Interestingly, most of the genes in those functional epigenetic modules are epigenetically repressed. Strikingly, those frequently occurring COME events could be used to classify patients into several subtypes with significantly different clinical outcomes in six cancers as well as in pan-cancer (p-value ≤ 0.05). Moreover, we observed significant associations between different COME subtypes and clinical features (e.g., age, gender, histological type, neoplasm histologic grade, and pathologic stage) in distinct cancers. Taken together, we identified millions of COME events of DNA methylation in pan-cancer and detected functional epigenetic COME events that could separate tumor patients into different subtypes, which may benefit the diagnosis and prognosis of pan-cancer.
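The definition of COME above can be illustrated with a small sketch. The abstract does not state which statistic or cutoff the authors used, so the Pearson correlation and the 0.8 threshold below are assumptions, and the gene names and beta values are hypothetical.

```python
import math
from itertools import combinations

def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def come_events(methylation, threshold=0.8):
    """Label gene pairs as co-occurring or mutually exclusive.

    methylation maps a gene name to its methylation values, one per sample,
    with samples in the same order for every gene.
    """
    events = []
    for gene_a, gene_b in combinations(sorted(methylation), 2):
        r = pearson(methylation[gene_a], methylation[gene_b])
        if r >= threshold:
            events.append((gene_a, gene_b, "co-occurrence"))
        elif r <= -threshold:
            events.append((gene_a, gene_b, "mutual exclusivity"))
    return events

# Hypothetical beta values for three genes across four samples
example = {
    "GENE_A": [0.10, 0.20, 0.80, 0.90],
    "GENE_B": [0.15, 0.25, 0.75, 0.95],
    "GENE_C": [0.90, 0.80, 0.20, 0.10],
}
events = come_events(example)
for event in events:
    print(event)
```

A real analysis over all gene pairs would also need a significance test and multiple testing correction, which this sketch omits.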
https://www.scilifelab.se/data/restricted-access/
This dataset contains genome-wide DNA methylation data generated from 142 pediatric acute myeloid leukemia (AML) samples originating from bone marrow or peripheral blood samples taken at AML diagnosis (N = 123) or relapse (N = 19). Further details regarding the samples are available in Supplementary Table S1 from Krali and Palle et al., 2021 (https://doi.org/10.3390/genes12060895). Genome-wide DNA methylation was analyzed at the SNP&SEQ Technology Platform, SciLifeLab, National Genomics Infrastructure Uppsala, Sweden. 200 ng of bisulfite-converted DNA was amplified, fragmented and hybridised to the Illumina Infinium Human Methylation450k BeadChip using the standard protocol from Illumina (iScan SQ instrument). This metadata record contains information about the raw idat files generated from the Infinium DNA methylation arrays. The Methylprep Python library was used to generate and normalize the beta-value matrix (https://pypi.org/project/methylprep/1.3.3/). The raw idat files, along with a samplesheet, the processed beta-value matrix, an annotation file for CpG annotation, and the signal intensities matrix, will be made available upon request. Limited phenotype information is available in Supplemental Table 1 of the manuscript. All scripts that give a walk-through from preprocessing of the raw idat files to the machine learning modelling process can be found in the following GitHub repository: https://github.com/Molmed/Krali-Palle_2021. Terms for access: The DNA methylation dataset is only to be used for research that seeks to advance the understanding of the influence of epigenetic factors on leukemia etiology and biology. The data should not be used for other purposes, i.e. investigating epigenetic signatures that may lead to the identification of a person. To retrieve the data used for the scope of this publication, please contact datacentre@scilifelab.se.
Background: Abnormal DNA methylation is well established for breast cancer and contributes to its progression by silencing tumor suppressor genes. DNA methylation profiling platforms might provide an alternative approach to expression microarrays for accurate breast tumor subtyping. We sought to determine whether the distinction of the inflammatory breast cancer (IBC) phenotype from the non-IBC phenotype by transcriptomics could be sustained by methylomics. Methodology/principal findings: We performed methylation profiling on a cohort of IBC (N = 19) and non-IBC (N = 43) samples using the Illumina Infinium Methylation Assay. These results were correlated with gene expression profiles. Methylation values allowed separation of breast tumor samples into high and low methylation groups. This separation was significantly related to DNMT3B mRNA levels. The high methylation group was enriched for breast tumor samples from patients with distant metastasis and poor prognosis, as predicted by the 70-gene prognostic signature. Furthermore, this tumor group tended to be enriched for IBC samples (54% vs. 24%) and samples with a high genomic grade index (67% vs. 38%). A set of 16 CpG loci (14 genes) correctly classified 97% of samples into the low or high methylation group. Differentially methylated genes appeared to be mainly related to focal adhesion, cytokine-cytokine receptor interactions, the Wnt signaling pathway, chemokine signaling pathways and metabolic processes. Comparison of IBC with non-IBC led to the identification of only four differentially methylated genes (TJP3, MOGAT2, NTSR2 and AGT). A significant correlation between methylation values and gene expression was shown for 4,981 of 6,605 (75%) genes. Conclusions/significance: A subset of clinical samples of breast cancer was characterized by high methylation levels, which coincided with increased DNMT3B expression. Furthermore, an association was observed with molecular signatures indicative of poor patient prognosis. The results of the current study also suggest that aberrant DNA methylation is not the main force driving the molecular biology of IBC.
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Cancer is an aging-associated disease, but the underlying molecular links between these processes are still largely unknown. Gene promoters that become hypermethylated in aging and cancer share a common chromatin signature in ES cells. In addition, there is global DNA hypomethylation in both processes. However, the similarity of the regions where this loss of DNA methylation occurs is not well characterized, nor is it known whether such regions also share a common chromatin signature in aging and cancer. To address this issue, we analysed TCGA DNA methylation data from a total of 2,311 samples, including control and cancer cases from patients with breast, kidney, thyroid, skin, brain and lung tumors and healthy blood, and integrated the results with histone, chromatin state and transcription factor binding site data from the NIH Roadmap Epigenomics and ENCODE projects. We identified 98,857 CpG sites differentially methylated in aging, and 286,746 in cancer. Hyper- and hypomethylated changes in both processes each had a similar genomic distribution across tissues and displayed tissue-independent alterations. The identified hypermethylated regions in aging and cancer shared a similar bivalent chromatin signature. In contrast, hypomethylated DNA sequences occurred in very different chromatin contexts. DNA hypomethylated sequences were enriched at genomic regions marked with the activating histone posttranslational modification H3K4me1 in aging, whilst in cancer, loss of DNA methylation was primarily associated with the repressive H3K9me3 mark.
A database for studying the interplay of DNA methylation, gene expression and cancer. It hosts both highly integrated data on DNA methylation, cancer-related genes, mutations and cancer information from public resources, and the CpG island (CGI) clones derived from our large-scale sequencing. Interconnections between different data types were analyzed and presented. The search tool and the graphical MethyView are provided to help users access all the data and data connections and view DNA methylation in the context of genomics and genetics data. As part of the Cancer Epigenomics Project in China, MethyCancer serves as a platform for sharing data and analytical results from the Cancer Genome/Epigenome Project in China with colleagues all over the world.
This dataset contains supplemental tables and tracks for the study entitled: "Partially methylated domains are hypervariable in breast cancer and fuel widespread CpG island hypermethylation".

Files

PMDs_CGIs
The included files contain:
- Genome positions of detected PMDs with their mean methylation (weighted mean, see Methods)
- Genome positions of CpG islands with their mean methylation (weighted mean)
- The "Brinkman" directory contains files from breast cancer data produced in this study
- The "normals" directory contains files from normal tissues (external data) analyzed in this study
- The "tumors" directory contains files from tumors (external data) analyzed in this study
All genome positions are based on GRCh37/hg19. All files are TAB-delimited text files (.tsv).

DNAme_bigwigs
The included files are BIGWIG files for viewing the DNA methylation profiles in a genome browser such as UCSC (http://genome.ucsc.edu). Each file represents a whole-genome bisulfite sequencing (WGBS) DNA methylation profile from one tumor used in this study. The genome build used was GRCh37/hg19. For every CpG with a coverage of at least 4 reads, the DNA methylation value (range: 0-1) is included.

Methods

Detection of PMDs
Detection of partially methylated domains (PMDs) in all whole-genome bisulfite sequencing (WGBS) methylation profiles throughout this study was done using the MethylSeekR package for R (1). Before PMD calling, CpGs overlapping common SNPs (dbSNP build 137) were removed. The alpha distribution (1) was used to determine whether PMDs were present at all, along with visual inspection of WGBS profiles. After PMD calling, the resulting PMDs were further filtered by removing regions overlapping with centromeres (undetermined sequence content).

Mean methylation inside CpG islands
Mean methylation values from WGBS inside CGIs were calculated using the 'weighted methylation level' (2).

Mean methylation inside PMDs
Mean methylation values from WGBS inside PMDs were calculated using the 'weighted methylation level' (2). Calculation of mean methylation within PMDs involved removing all CpGs overlapping with CpG islands, CpG island shores and promoters, as the high CpG densities within these elements yield unbalanced mean methylation values that are not representative of PMD methylation.

References
(1) Burger, L., Gaidatzis, D., Schübeler, D. & Stadler, M. B. Identification of active regulatory regions from DNA methylation data. Nucleic Acids Research 41 (2013).
(2) Schultz, M. D., Schmitz, R. J. & Ecker, J. R. 'Leveling' the playing field for analyses of single-base resolution DNA methylomes. Trends in Genetics 28, 583–585 (2012).
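The 'weighted methylation level' of reference (2) pools reads across all CpGs in a region, so that sites with deeper coverage contribute more than they would in a plain average of per-site fractions. A minimal sketch follows; the function name and read counts are hypothetical, and reusing the 4-read coverage cutoff mentioned for the BIGWIG tracks is an assumption rather than a documented step of the mean calculation.

```python
def weighted_methylation(cpgs, min_coverage=4):
    """Weighted methylation level of a region.

    cpgs: iterable of (methylated_reads, total_reads), one pair per CpG.
    Returns the sum of methylated reads divided by the sum of all reads
    over CpGs passing the coverage filter, or None if no CpG passes.
    """
    methylated = total = 0
    for meth_reads, all_reads in cpgs:
        if all_reads < min_coverage:
            continue  # skip poorly covered CpGs (assumed cutoff)
        methylated += meth_reads
        total += all_reads
    if total == 0:
        return None
    return methylated / total

# Hypothetical region with three CpGs; the last one fails the coverage filter
region = [(8, 10), (1, 20), (3, 3)]
level = weighted_methylation(region)
print(level)
```

For this example the pooled value differs from the unweighted mean of the two retained per-site fractions (0.8 and 0.05), because the second CpG carries twice the read depth of the first.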
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Global loss of DNA methylation in mammalian genomes occurs cumulatively as a mitotic process during aging and cancer, primarily in Partially Methylated Domains (PMDs). It has been shown that local sequence context (100bp) has a strong effect on the rate of demethylation of individual CpG dinucleotides within PMDs. Here, we train a deep learning model to characterize this sequence dependence further, finding that methylation loss can be predicted from a CpG’s 150bp sequence context alone with an AUC of 0.95. We use re-methylation rates of newly synthesized DNA to show that CpGs with fast-loss sequence context are inefficiently re-methylated. Interestingly, we find that the 10% of CpGs predicted to have the “slowest” rate of loss lose almost no DNA methylation in healthy cell types. These same slow-loss CpGs lose a significant amount of DNA methylation in cancer, suggesting that they could be responsible for deregulation of genes and transposable elements that are associated with DNA hypomethylation in cancer.
This directory contains the Sep. 20, 2020 version of the human (hg19) CpG hypomethylation Neural network scores in one gzip-compressed tsv file per chromosome.
The Sep. 2020 Neural network score provides a prediction of the probability of each sequence to be a fast hypomethylation CpG, which was produced by a neural network model that used two independent input training datasets.
Files included in this directory:
- chr*.tsv.gz: Neural network score of each CpG in each chromosome, using hg19 coordinates. chrX and chrY are omitted.
Each row is a CpG and provides (1) chromosome, (2) the corresponding C coordinate on the forward (Watson) strand of the reference genome in one-based coordinates, (3) Neural network score, (4) number of CpGs within the 150 bp sequence centered on this CpG, including the center CpG, (5) whether the CpG is within a CpG island (0, no; 1, yes), (6) whether the CpG is within the ENCODE blacklist (0, no; 1, yes).
Here the CpG islands are the union set of Irizarry (Irizarry et al. 2009, Nat Genet), Takai-Jones (Takai et al. 2002, PNAS) and Gardiner-Garden (Gardiner-Garden et al. 1987, J Mol Biol) CGIs. The blacklist was downloaded from https://github.com/Boyle-Lab/Blacklist/tree/master/lists.
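The row layout described above can be read with the standard library alone. The sketch below assumes six tab-separated columns in the order listed and skips any row whose second field is not an integer (for example a header line, whose presence is not stated in the description). The record type, function name and example values are hypothetical.

```python
import csv
import gzip
import io
from typing import Iterator, NamedTuple, TextIO

class CpGScore(NamedTuple):
    chrom: str
    pos: int            # one-based C coordinate on the forward strand (hg19)
    score: float        # neural network score
    n_cpg_150bp: int    # CpGs in the 150 bp window, including the center CpG
    in_cgi: bool
    in_blacklist: bool

def read_scores(handle: TextIO) -> Iterator[CpGScore]:
    """Yield one CpGScore per data row of an open, text-mode TSV handle."""
    for row in csv.reader(handle, delimiter="\t"):
        if len(row) < 6 or not row[1].isdigit():
            continue  # blank line, short row, or a header line if one exists
        yield CpGScore(row[0], int(row[1]), float(row[2]),
                       int(row[3]), row[4] == "1", row[5] == "1")

# In-memory stand-in for a chr*.tsv.gz file, with made-up values
raw = "chr1\t10469\t0.91\t12\t0\t0\nchr1\t10471\t0.12\t12\t1\t0\n"
buffer = io.BytesIO(gzip.compress(raw.encode()))
with gzip.open(buffer, "rt", newline="") as fh:
    records = list(read_scores(fh))
print(records[0])
```

For a real file, the same function can be applied to `gzip.open(path, "rt", newline="")`, where `path` points to one of the per-chromosome files.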
https://ega-archive.org/dacs/EGAC00001000145
Illumina 450K DNA methylation profiles of 314 fresh-frozen colorectal mucosa, adenoma or adenocarcinoma samples.
https://www.scilifelab.se/data/restricted-access/
This dataset contains genome-wide DNA methylation data generated from 384 pediatric acute lymphoblastic leukemia (ALL) samples originating from bone marrow or peripheral blood samples taken at ALL diagnosis (n = 384). Further details regarding the samples are available in Supplementary Table S2 from Krali et al., 2023 (https://doi.org/10.1038/s41698-023-00479-5). Genome-wide DNA methylation was analyzed at the SNP&SEQ Technology Platform, SciLifeLab, National Genomics Infrastructure Uppsala, Sweden. 250 ng of bisulfite-converted DNA was amplified, fragmented and hybridised to the Illumina Infinium Human Methylation450k BeadChip using the standard protocol from Illumina (iScan SQ instrument). This metadata record contains information about the raw idat files generated from the Infinium DNA methylation arrays. The raw idat files were processed with the Methylation Module (1.8.5) software in Genome Studio (V2010.3). Peak-based correction was used to normalize the beta-value matrix. The raw idat files, along with a samplesheet, the processed beta-value matrix and an annotation file for CpG annotation, will be made available upon request. Limited phenotype information is available in Supplemental Table S2 of the manuscript. All scripts that give a walk-through of our project, including the machine learning modelling process, can be found in our GitHub repository. Terms for access: The DNA methylation dataset is only to be used for research that seeks to advance the understanding of the influence of epigenetic factors on leukemia etiology and biology. The data should not be used for other purposes, i.e. investigating epigenetic signatures that may lead to the identification of a person. To retrieve the data used for the scope of this publication, please contact datacentre@scilifelab.se.
DNA methylation is a vital epigenetic change that regulates gene transcription and helps to keep the genome stable. Aberrant DNA methylation is a hallmark of human cancer; it is critical for tumor formation and controls the expression of several tumor-associated genes. In various cancers, and especially in breast cancer, methylation changes such as tumor suppressor gene hypermethylation and oncogene hypomethylation are critical for tumor occurrence. Detecting DNA methylation-driven genes and understanding their molecular features could thus enhance our understanding of the pathogenesis and molecular mechanisms of breast cancer, facilitating the development of precision medicine and drug discovery. In the present study, we retrospectively analyzed over one thousand breast cancer patients and established a robust prognostic signature based on DNA methylation-driven genes. We then calculated immune cell abundance in each patient and found lower immune activity in high-risk patients. The expression of human leukocyte antigen (HLA) family genes and immune checkpoint genes was consistent with these results. In addition, more mutated genes were observed in the high-risk group. Furthermore, an in silico screening of druggable targets and compounds from the CTRP and PRISM databases was performed, resulting in the identification of five target genes (HMMR, CCNB1, CDC25C, AURKA, and CENPE) and five agents (oligomycin A, panobinostat, (+)-JQ1, voxtalisib, and arcyriaflavin A) that might have therapeutic potential in treating high-risk breast cancer patients. Further in vitro evaluation confirmed that (+)-JQ1 had the best cancer cell selectivity and exerted its anti-breast cancer activity through CENPE. In conclusion, our study provides new insights into personalized prognostication and may inspire the integration of risk stratification and precision therapy.
Additional file 7: Table S5. Correlations among DNA methylation-related enzymes in blood and leukemia. The RNA-Seq gene expression data of 7 DNA methylation-related enzymes were obtained from the GTEx and TCGA dataset. The correlations among the expression levels of the 7 enzymes are analyzed and shown.
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Aberrant DNA methylation is a frequent epigenetic alteration in cancer cells that has emerged as a pivotal mechanism for tumorigenesis. Accordingly, novel therapies targeting the epigenome are being explored with the aim of restoring normal DNA methylation patterns on oncogenes and tumor suppressor genes. A limited number of studies indicate that the dietary compound resveratrol modulates DNA methylation of several cancer-related genes; however, a complete view of the methylome changes induced by resveratrol has not yet been reported. In this study, we performed a genome-wide survey of DNA methylation signatures in triple negative breast cancer cells exposed to resveratrol. Our data showed that resveratrol treatment for 24 h and 48 h decreased gene promoter hypermethylation and increased DNA hypomethylation. Of 2,476 hypermethylated genes in control cells, 1,459 and 1,547 were differentially hypomethylated after 24 h and 48 h, respectively. Remarkably, resveratrol did not induce widespread non-specific DNA hyper- or hypomethylation, as changes in methylation were found in only 12.5% of 27,728 CpG loci. Moreover, resveratrol restores the hypomethylated and hypermethylated status of key tumor suppressor genes and oncogenes, respectively. Importantly, the integrative analysis of methylome and transcriptome profiles in response to resveratrol showed that methylation alterations were concordant with changes in mRNA expression. Our findings reveal for the first time the impact of resveratrol on the methylome of breast cancer cells and identify novel potential targets for epigenetic therapy. We propose that resveratrol may be considered a dietary epidrug, as it may exert its anti-tumor activities by modifying the methylation status of cancer-related genes, which deserves further in vivo characterization.
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
DNA methylation status is closely associated with diverse diseases and is generally more stable than gene expression; abnormal DNA methylation could therefore provide important biomarkers for tumor diagnosis, treatment and prognosis. However, DNA methylation signatures for pan-cancer diagnosis and prognosis are less explored. Here we systematically analyzed the genome-wide DNA methylation patterns in diverse TCGA cancers with machine learning. We identified seven CpG sites that could effectively discriminate tumor samples from adjacent normal tissue samples for 12 main cancers of TCGA (1216 samples, AUC > 0.99). Those seven potential diagnostic biomarkers were further validated in 9 other TCGA cancers and 4 independent datasets (AUC > 0.92). Three of the seven CpG sites were correlated with cell division, DNA replication and cell cycle. We also identified 12 CpG sites that can effectively distinguish 26 different cancers (7605 samples), and the result was reproducible in independent datasets as well as in two disparate tumors with metastases (micro-average AUC > 0.89). Furthermore, a series of potential signatures that could significantly predict the prognosis of tumor patients for 7 different cancers were identified via survival analysis (p-value < 1e-4). Collectively, DNA methylation patterns vary greatly between tumor and adjacent normal tissues, as well as among different types of cancers. Our identified signatures may aid clinical decisions on diagnosis and prognosis for pan-cancer, and the potential cancer-specific biomarkers could be used to predict the primary site of metastatic breast and prostate cancers.
Gastric cancer (GC) is one of the leading types of fatal cancer worldwide. Epigenetic manipulation of cancer cells is a useful tool to better understand gene expression regulatory mechanisms and contributes to the discovery of novel biomarkers. Our research group recently reported a list of 83 genes that are potentially modulated by DNA methylation in GC cell lines. Herein, we further explored the regulation of one of these genes, LRRC37A2, in clinical samples. LRRC37A2 expression was evaluated by RT-qPCR, and DNA methylation was studied using next-generation bisulphite sequencing in 36 GC and paired adjacent nonneoplastic tissue samples. We showed that both reduced LRRC37A2 mRNA levels and increased LRRC37A2 exon methylation were associated with undifferentiated and poorly differentiated tumours. Moreover, LRRC37A2 gene expression and methylation levels were inversely correlated at the +45 exon CpG site. We suggest that DNA hypermethylation may contribute to reducing LRRC37A2 expression in undifferentiated and poorly differentiated GC. Therefore, our results show how some genes may be useful to stratify patients who are more likely to benefit from epigenetic therapy. Abbreviations: AR: androgen receptor; 5-AZAdC: 5-aza-2'-deoxycytidine; B2M: beta-2-microglobulin; GAPDH: glyceraldehyde-3-phosphate dehydrogenase; GC: gastric cancer; GLM: general linear model; LRRC37A2: leucine-rich repeat containing 37 member A2; SD: standard deviation; TFII-I: general transcription factor II-I; TSS: transcription start site; XBP1: X-box binding protein 1
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Metastatic cancer of unknown primary (CUP) site remains a leading cause of cancer death with few therapeutic options. Aberrant DNA methylation (DNAm) is the most important risk factor for cancer and shows a degree of tissue specificity. However, how DNAm alterations in tumors vary within multi-omics regulatory networks remains largely unexplored. There is therefore room to improve the accuracy of tumor origin prediction and a need to better understand the underlying mechanisms. In our study, an integrative analysis based on multi-omics data and a molecular regulatory network uncovered genome-wide methylation mechanisms and identified 23 epi-driver genes. Apart from the promoter region, we also found that aberrant methylation within the gene body or intergenic region was significantly associated with gene expression. Enrichment analysis indicated that the epi-driver genes were highly related to cellular mechanisms of tumorigenesis, including T-cell differentiation, cell proliferation, and signal transduction. Based on an ensemble algorithm, six CpG sites located in five epi-driver genes were selected to construct a tissue-specific classifier with high accuracy (>95%) on TCGA datasets. In the independent datasets and the metastatic cancer datasets from GEO, the accuracy of distinguishing tumor subtypes or original sites was more than 90%, indicating robustness and stability. In summary, the integrative analysis of large-scale omics data revealed complex regulation of DNAm across various cancer types and identified the epi-driver genes participating in tumorigenesis. Based on the aberrant methylation status of sites located in epi-driver genes, we established a classifier that traced metastatic cancers back to their primary sites with high accuracy. Our study provides a comprehensive, multi-omics view of DNAm-associated changes across cancer types and has potential for clinical application.
Background: Smoking was strongly associated with breast cancer in previous studies. Whether smoking promotes breast cancer through DNA methylation remains unknown. Methods: Two-sample Mendelian randomization (MR) analyses were conducted to assess the causal effect of smoking-related DNA methylation on breast cancer risk. We used 436 smoking-related CpG sites extracted from 846 middle-aged women in the ARIES project as exposure data. We collected summary data of breast cancer from one of the largest meta-analyses, including 69,501 cases for ER+ breast cancer and 21,468 cases for ER− breast cancer. A total of 485 single-nucleotide polymorphisms (SNPs) were selected as instrumental variables (IVs) for smoking-related DNA methylation. We further performed an MR Steiger test to estimate the likely direction of the causal effect between DNA methylation and breast cancer. We also conducted colocalization analysis to evaluate whether smoking-related CpG sites shared a common genetic causal SNP with breast cancer in a given region. Results: We established four significant associations after multiple testing correction: the CpG sites of cg2583948 [OR = 0.94, 95% CI (0.91–0.97)], cg0760265 [OR = 1.07, 95% CI (1.03–1.11)], cg0420946 [OR = 0.95, 95% CI (0.93–0.98)], and cg2037583 [OR = 1.09, 95% CI (1.04–1.15)] were associated with the risk of ER+ breast cancer. In the MR Steiger test, the instruments for all four smoking-related CpG sites explained more variance in DNA methylation than in ER+ breast cancer (all p < 1.83 × 10⁻¹¹). Further colocalization analysis showed strong evidence (based on PPH4 > 0.8) supporting a common genetic causal SNP between the CpG site of cg2583948 [with IMP3 expression (PPH4 = 0.958)] and ER+ breast cancer. There were no causal associations between smoking-related DNA methylation and ER− breast cancer. Conclusions: These findings highlight potential targets for the prevention of ER+ breast cancer. Tissue-specific epigenetic data are required to confirm these results.
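To illustrate how two-sample MR turns SNP-level summary statistics into an odds ratio with a confidence interval, the sketch below uses the Wald ratio for a single instrument and a fixed-effect inverse-variance weighted (IVW) estimate for several. This is an assumption for illustration: the abstract does not state which estimators the authors used, and the SNP effect sizes below are hypothetical.

```python
from math import exp, sqrt


def wald_ratio(beta_exposure, beta_outcome, se_outcome):
    """Single-instrument causal estimate (log-odds per unit of exposure)
    with a first-order standard error."""
    return beta_outcome / beta_exposure, se_outcome / abs(beta_exposure)


def ivw(beta_exposure, beta_outcome, se_outcome):
    """Fixed-effect IVW estimate across independent instruments."""
    num = sum(bx * by / se ** 2
              for bx, by, se in zip(beta_exposure, beta_outcome, se_outcome))
    den = sum(bx ** 2 / se ** 2
              for bx, se in zip(beta_exposure, se_outcome))
    return num / den, sqrt(1.0 / den)


def odds_ratio_ci(beta, se, z=1.96):
    """Convert a log-odds estimate into an OR with a 95% confidence interval."""
    return exp(beta), exp(beta - z * se), exp(beta + z * se)


# Hypothetical summary statistics for two SNPs (not study data):
# SNP effect on methylation, SNP effect on outcome (log-odds), and its SE.
bx = [0.2, 0.4]
by = [-0.02, -0.04]
se_y = [0.01, 0.01]

beta_ivw, se_ivw = ivw(bx, by, se_y)
or_est, ci_low, ci_high = odds_ratio_ci(beta_ivw, se_ivw)
print(f"OR = {or_est:.2f}, 95% CI ({ci_low:.2f}-{ci_high:.2f})")
```

With one instrument the IVW estimate reduces to the Wald ratio, which is a useful sanity check. An odds ratio below 1, like two of the four reported above, corresponds to a negative log-odds estimate, meaning higher methylation at that site is associated with lower risk.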