100+ datasets found
  1. Data from: Predicting locus-specific DNA methylation levels in cancer and...

    • tandf.figshare.com
    xlsx
    Updated May 23, 2024
  2. Data from: DNA methylation protects cancer cells against senescence

    • datasetcatalog.nlm.nih.gov
    Updated May 20, 2025
  3. Table_1_Co-occurrence and Mutual Exclusivity Analysis of DNA Methylation...

    • datasetcatalog.nlm.nih.gov
    Updated Jan 29, 2020
    + more versions
  4. Data from: Distinct chromatin signatures of DNA hypomethylation in aging and...

    • zenodo.org
    • produccioncientifica.ucm.es
    bin, pdf, xls
    Updated Aug 2, 2024
  5. Genome-wide DNA methylation analysis of breast cancer

    • ebi.ac.uk
    Updated Feb 28, 2016
  6. Data from: DNA methylation signatures predict cytogenetic subtype and...

    • figshare.scilifelab.se
    • datasetcatalog.nlm.nih.gov
    • +1 more
    xlsx
    Updated Jan 15, 2025
  7. Additional file 2 of DNA methylation and cancer incidence:...

    • datasetcatalog.nlm.nih.gov
    Updated Feb 26, 2021
  8. Table6_Pan-Cancer DNA Methylation Analysis and Tumor Origin Identification...

    • frontiersin.figshare.com
    xlsx
    Updated May 31, 2023
    + more versions
  9. Data from: DNA methylation database for gynecological cancer detection,...

    • ega-archive.org
  10. PMD hypomethylation human (hg19) neural network scores

    • zenodo.org
    application/gzip
    Updated Jun 13, 2022
  11. Table 1_Development and validation of a 14-CpG DNA methylation signature and...

    • datasetcatalog.nlm.nih.gov
    Updated Mar 19, 2025
  12. Table1_A Prognostic Model for Breast Cancer Based on Cancer...

    • datasetcatalog.nlm.nih.gov
    • frontiersin.figshare.com
    Updated Jan 3, 2022
  13. Data from: Conservation of aging and cancer epigenetic signatures across...

    • zenodo.org
    • produccioncientifica.ucm.es
    • +1 more
    bin, zip
    Updated Mar 17, 2021
  14. TCGA Breast Cancer 450K Methylation Data

    • zenodo.org
    application/gzip
    Updated Jan 24, 2020
  15. Additional file 1 of DNA methylation and cancer incidence:...

    • datasetcatalog.nlm.nih.gov
    Updated Feb 26, 2021
  16. Data from: Integrative analysis identifies potential DNA methylation...

    • tandf.figshare.com
    docx
    Updated Feb 16, 2024
  17. DNA methylation profiling in breast cancer discordant identical twins

    • ebi.ac.uk
    • omicsdi.org
    Updated Oct 16, 2012
  18. Data from: Multimodal classification of molecular subtypes in pediatric...

    • figshare.scilifelab.se
    Updated Jan 15, 2025
    + more versions
  19. MethyCancer

    • rrid.site
    • neuinfo.org
    • +2 more
    Updated Aug 4, 2025
  20. Table_3_Smoking, DNA Methylation, and Breast Cancer: A Mendelian...

    • datasetcatalog.nlm.nih.gov
    • frontiersin.figshare.com
    Updated Sep 28, 2021
    + more versions
Cite
Shuzheng Zhang; Baoshan Ma; Yu Liu; Yiwen Shen; Di Li; Shuxin Liu; Fengju Song (2024). Predicting locus-specific DNA methylation levels in cancer and paracancer tissues [Dataset]. http://doi.org/10.6084/m9.figshare.25889005.v1

Data from: Predicting locus-specific DNA methylation levels in cancer and paracancer tissues

Available download formats
xlsx
Dataset updated
May 23, 2024
Dataset provided by
Taylor & Francis: https://taylorandfrancis.com/
Authors
Shuzheng Zhang; Baoshan Ma; Yu Liu; Yiwen Shen; Di Li; Shuxin Liu; Fengju Song
License

Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically

Description

Aim: To predict base-resolution DNA methylation in cancerous and paracancerous tissues.

Material & methods: We collected six cancer DNA methylation datasets from The Cancer Genome Atlas and five cancer datasets from Gene Expression Omnibus and established machine learning models using paired cancerous and paracancerous tissues. Tenfold cross-validation and independent validation were performed to demonstrate the effectiveness of the proposed method.

Results: The developed cross-tissue prediction models can substantially increase the accuracy at more than 68% of CpG sites and contribute to enhancing the statistical power of differential methylation analyses. An XGBoost model leveraging multiple correlating CpGs may elevate the prediction accuracy.

Conclusion: This study provides a powerful tool for DNA methylation analysis and has the potential to gain new insights into cancer research from epigenetics.

The authors employed machine learning models to predict genome-wide DNA methylation (DNAm) levels in cancerous tissues (CTs) and paracancerous tissues (PTs) when one of them is difficult to obtain. The proposed model based on a single CpG site achieves an improvement of mean absolute error at more than 68% of CpGs. A multiple-CpG-based XGBoost model can further improve the predictive performance when there is considerable variability between individuals. The detected CpG sites in differential methylation analysis are statistically more significant by combining the measured and predicted PTs to enlarge the sample size. When using CTs as predictors instead of PTs, the prediction models have better performance. The aggressiveness of cancers and patient outcome may be predictable using well-predicted DNAm profiles in CT/PT. Functional enrichment analysis based on highly correlated CpG sites identified important pathways involved in cancer progression. The cross-tumor DNAm prediction model has the potential to be applied to an external cancer dataset for a subset of probes with high correlation in both cancers.
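
The description mentions a single-CpG model evaluated with tenfold cross-validation and mean absolute error (MAE). The sketch below illustrates that general idea only and is not the authors' implementation. It assumes a simple per-site linear regression from CT to PT methylation, a training-set-mean baseline for comparison, and simulated beta values; the helper names (`fit_line`, `kfold_indices`, `cv_mae`, `simulate_site`) are hypothetical.

```python
import random
import statistics


def fit_line(x, y):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    if sxx == 0:  # constant predictor: fall back to predicting the mean
        return my, 0.0
    b = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    return my - b * mx, b


def kfold_indices(n, k, seed=0):
    """Shuffle sample indices 0..n-1 and split them into k folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def cv_mae(ct, pt, k=10, seed=0):
    """Cross-validated MAE for one CpG site: (model MAE, baseline MAE).

    Assumption: the baseline predicts the training-set mean of PT.
    """
    model_err, base_err = [], []
    for fold in kfold_indices(len(ct), k, seed):
        held_out = set(fold)
        train = [i for i in range(len(ct)) if i not in held_out]
        a, b = fit_line([ct[i] for i in train], [pt[i] for i in train])
        base = statistics.fmean([pt[i] for i in train])
        for i in fold:
            # Methylation beta values lie in [0, 1], so clip predictions.
            pred = min(1.0, max(0.0, a + b * ct[i]))
            model_err.append(abs(pred - pt[i]))
            base_err.append(abs(base - pt[i]))
    return statistics.fmean(model_err), statistics.fmean(base_err)


def simulate_site(n, slope, noise, rng):
    """Simulated paired CT/PT beta values for one CpG (illustrative only)."""
    ct = [rng.random() for _ in range(n)]
    pt = [min(1.0, max(0.0, 0.2 + slope * c + rng.gauss(0, noise))) for c in ct]
    return ct, pt


rng = random.Random(42)
sites = [simulate_site(60, slope, 0.05, rng) for slope in (0.6, 0.3, 0.0)]
results = [cv_mae(ct, pt) for ct, pt in sites]
improved = sum(model < base for model, base in results)
print(f"{improved}/{len(results)} simulated sites improve on the baseline")
```

Counting the sites where the model MAE falls below the baseline MAE mirrors the kind of "fraction of CpGs improved" summary quoted in the description, although the baseline used in the study is not stated here.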
