The dataset was downloaded from https://amp.pharm.mssm.edu/archs4/download.html. The methods are described in a Nature Communications paper: https://www.nature.com/articles/s41467-018-03751-6
ARCHS4 provides user-friendly access to gene expression data from the GEO database (https://www.ncbi.nlm.nih.gov/geo/). Whereas GEO stores most data in raw formats, ARCHS4 provides prepared count-matrix expression data. Whereas GEO stores data separately for each research paper, ARCHS4 collects everything in a single matrix. Consult the main site for further information.
The main data files are in HDF5 (H5, Hierarchical Data Format) format: https://en.wikipedia.org/wiki/Hierarchical_Data_Format. They contain expression data as well as annotation data and further meta-information. There are several auxiliary files, such as a 3D t-SNE projection (in CSV format) and gene correlation matrices for human and mouse in Feather format.
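Because the description does not spell out the internal layout of the H5 file, a reasonable first step is to list what it contains. The following is a minimal sketch, assuming the third-party h5py package is installed; the function name list_datasets is hypothetical, and no group or dataset names inside the ARCHS4 file are assumed.

```python
import h5py


def list_datasets(path):
    """Return {dataset name: (shape, dtype)} for every dataset in an HDF5 file.

    Only metadata is read; the (potentially very large) arrays stay on disk.
    """
    found = {}

    def visit(name, obj):
        # visititems calls this for every group and dataset; keep datasets only.
        if isinstance(obj, h5py.Dataset):
            found[name] = (obj.shape, str(obj.dtype))

    with h5py.File(path, "r") as handle:
        handle.visititems(visit)
    return found
```

Calling list_datasets("human_matrix.h5") on a downloaded copy would show which entries hold the expression matrix and which hold gene and sample annotations, without loading any of them into memory.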
The main file for human, human_matrix.h5, contains the data matrix (238522 samples by 35238 genes) as well as meta-information: gene names, sample information (tissue, etc.), and references to the GEO database IDs where all details can be found.
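The matrix dimensions above give a sense of why the file is usually read in slices rather than loaded whole. The sketch below estimates the size of the matrix if it were held densely in memory. The value widths (4-byte and 8-byte) are assumptions for illustration, since the description does not state the stored data type, and the helper name dense_size_gib is hypothetical.

```python
# Dimensions quoted in the dataset description.
N_SAMPLES = 238_522
N_GENES = 35_238


def dense_size_gib(n_rows, n_cols, bytes_per_value):
    """Size in GiB of a dense n_rows x n_cols array with fixed-width values."""
    return n_rows * n_cols * bytes_per_value / 2**30


if __name__ == "__main__":
    # Assumed value widths; the actual storage type is not given in the text.
    for label, width in [("4-byte values", 4), ("8-byte values", 8)]:
        size = dense_size_gib(N_SAMPLES, N_GENES, width)
        print(f"{label}: {size:.1f} GiB")
```

With 4-byte values the dense matrix would take roughly 31 GiB, so selecting the samples or genes of interest before loading is the practical approach on most machines.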
Similar data is available for mouse, along with CSV files with t-SNE images and gene correlation matrices.
The ARCHS4 project is by:
'Alexander Lachmann', 'alexander.lachmann@mssm.edu', update: '2020-02-06'
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
gabrielaltay/archs4-human-gene-v2.5-protein-coding-expression dataset hosted on Hugging Face and contributed by the HF Datasets community
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0) https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
orthosData is the companion database to the orthos software for mechanistic studies using differential gene expression experiments. It currently encompasses data for over 100,000 differential gene expression mouse and human experiments distilled and compiled from the ARCHS4 database, as well as associated pre-trained variational models. Together with orthos, it was developed to provide a better understanding of the effects of experimental treatments on gene expression and to help map treatments to mechanisms of action.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Background: Chronic pressure overload triggers pathological cardiac hypertrophy that eventually leads to heart failure. Effective biomarkers and therapeutic targets for heart failure remain to be defined. The aim of this study is to identify key genes associated with pathological cardiac hypertrophy by combining bioinformatics analyses with molecular biology experiments.
Methods: Comprehensive bioinformatics tools were used to screen genes related to pressure overload-induced cardiac hypertrophy. We identified differentially expressed genes (DEGs) by overlapping three Gene Expression Omnibus (GEO) datasets (GSE5500, GSE1621, and GSE36074). Correlation analysis and the BioGPS online tool were used to detect the genes of interest. A mouse model of cardiac remodeling induced by transverse aortic constriction (TAC) was established to verify the expression of the gene of interest during cardiac remodeling by RT-PCR and western blot. Using RNA interference technology, the effect of transcription elongation factor A3 (Tcea3) silencing on PE-induced hypertrophy of neonatal rat ventricular myocytes (NRVMs) was detected. Next, gene set enrichment analysis (GSEA) and the online tool ARCHS4 were used to predict the possible signaling pathways, and the fatty acid oxidation-relevant pathways were enriched and then verified in NRVMs. Furthermore, the changes in long-chain fatty acid respiration in NRVMs were detected using the Seahorse XFe24 Analyzer. Finally, MitoSOX staining was used to detect the effect of Tcea3 on mitochondrial oxidative stress, and the contents of NADP(H) and GSH/GSSG were detected by relevant kits.
Results: A total of 95 DEGs were identified, and Tcea3 was negatively correlated with Nppa, Nppb and Myh7. The expression level of Tcea3 was downregulated during cardiac remodeling both in vivo and in vitro. Knockdown of Tcea3 aggravated cardiomyocyte hypertrophy induced by PE in NRVMs. GSEA and the online tool ARCHS4 predicted that Tcea3 is involved in fatty acid oxidation (FAO). Subsequently, RT-PCR results showed that knockdown of Tcea3 up-regulated Ces1d and Pla2g5 mRNA expression levels. In PE-induced cardiomyocyte hypertrophy, Tcea3 silencing resulted in decreased fatty acid utilization, decreased ATP synthesis and increased mitochondrial oxidative stress.
Conclusion: Our study identifies Tcea3 as a novel anti-cardiac remodeling target by regulating FAO and governing mitochondrial oxidative stress.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Note: These are the limma results of the analysis. See https://doi.org/10.5281/zenodo.8162229 for the DESeq2/DEXSeq results.
This dataset contains results from paired differential expression and differential splicing analyses as well as gene-set over-representation analysis results for 199 baseline vs. case comparisons across 100 randomly curated datasets with accompanying metadata (article).
All results were computed using the R package pairedGSEA, which utilized Limma (Ritchie et al., 2015) and fgsea (Korotkevich et al., 2019).
Each .RDS file contains a list with four objects: A 'metadata' object with the metadata of the respective raw data, a 'genes' object with gene-level differential splicing and expression results, a 'gene_set' object with over-representation results, and 'experiment' with the experiment title.
The filenames follow this pattern: "[dataset ID]_[GEO accession number]_[Manually assigned comparison title].RDS".
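The filename pattern above can be split back into its three fields. The following is a minimal sketch, assuming that the dataset ID and the GEO accession number contain no underscores while the comparison title may; the function name and the example filename in the usage note are hypothetical.

```python
from pathlib import Path


def parse_result_filename(filename):
    """Split '[dataset ID]_[GEO accession]_[comparison title].RDS' into fields."""
    name = Path(filename).name
    if not name.upper().endswith(".RDS"):
        raise ValueError(f"not an .RDS file: {filename!r}")
    stem = name[: -len(".RDS")]

    # Split on the first two underscores only, so that underscores inside the
    # comparison title are preserved (assumption: the first two fields have none).
    parts = stem.split("_", 2)
    if len(parts) != 3:
        raise ValueError(f"unexpected filename pattern: {filename!r}")

    dataset_id, geo_accession, comparison_title = parts
    return {
        "dataset_id": dataset_id,
        "geo_accession": geo_accession,
        "comparison_title": comparison_title,
    }
```

For a made-up name such as "1_GSE12345_Treated_vs_Control.RDS", the function returns the dataset ID "1", the accession "GSE12345", and the title "Treated_vs_Control".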
All datasets were obtained from a local copy of the ARCHS4 v11 database of transcript counts (Lachmann et al., 2018).
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Additional file 2: Table S1. Differentially bound sites (DBSs) obtained from MCF7 cell line treated with 10nM E2 for 45 minutes in GSE94023 study. Table S2. Differentially bound sites (DBSs) obtained from MCF7 cell line treated with 10nM E2 for 45 minutes in GSE99626 study. Table S3. Differentially bound sites (DBSs) obtained from MCF7 cell line treated with 10nM E2 for 45 minutes in GSE67295 study. Table S4. Differentially bound sites (DBSs) obtained from MCF7 cell line treated with 10nM E2 for 45 minutes in GSE115607 study. Table S5. Differentially bound sites (DBSs) obtained from T47D cell line treated with 10nM E2 for 45 minutes in GSE80367 study. Table S6. Differentially bound sites (DBSs) obtained from T47D cell line treated with 100nM E2 for 45 minutes in GSE23893 study. Table S7. Differentially bound sites (DBSs) obtained from MCF7 cell line treated with 100nM E2 for 45 minutes in GSE23893 study. Table S8. Differentially bound sites (DBSs) obtained from MCF7 cell line treated with 100nM E2 for 45 minutes in GSE54855 study. Table S9. Differentially bound sites (DBSs) obtained from MCF7 cell line treated with 100nM E2 for 45 minutes in GSE59530 study. Table S10. Default binding affinity matrix of 6 samples by the 63,612 sites that overlap in at least two of the samples using DiffBind in (GSE94023, GSE99626, GSE67295, & GSE115607) MCF7 cell line treated with 10nM E2 for 45 minutes. Table S11. Default binding affinity matrix of 6 samples by the 23,517 sites that overlap in at least two of the samples using DiffBind in (GSE23893, GSE54855, & GSE59530) MCF7 cell line treated with 100nM E2 for 45 minutes. Table S12. Meta-differentially bound sites (meta-DBSs) obtained from a meta-analysis on (GSE94023, GSE99626, GSE67295, & GSE115607) MCF7 cell line treated with 10nM E2 for 45 minutes. Table S13. Meta-differentially bound sites (meta-DBSs) obtained from a meta-analysis on (GSE23893, GSE54855, & GSE59530) MCF7 cell line treated with 100nM E2 for 45 minutes. 
Table S14. literature_ChIP-seq. Table S15. Enrichr. Table S16. ARCHS4—Coexpression. Table S17. ENCODE--ChIP-seq. Table S18. ReMap--ChIP-seq. Table S19. GTEx—Coexpression. Table S20. Integrated_topRank. Table S21. Integrated_meanRank. Table S22. Gene Ontology (GO) for 7,308 meta-DBSs related to 617 common genes among MCF7 & T47D cell lines using Cistrome-GO. Table S23. KEGG pathways analysis for 7,308 meta-DBSs related to 617 common genes among MCF7 & T47D cell lines using Cistrome-GO. Table S24. Differentially expressed genes (DEGs) identified from GRO-seq data in the MCF7 cell line treated with 100nM E2 for 40 minutes in the GSE27463 study.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset contains various files essential for understanding and employing the GeneRAIN models, as described in the accompanying manuscript. GeneRAIN models use bulk RNA-seq data and a 'Binning-By-Gene' normalization method. These models aim to improve upon existing methods in understanding biological information and include a vector representation of genes called GeneRAIN-vec. After thorough testing, these models have shown their effectiveness in predicting a wide range of biological characteristics, including for long non-coding RNAs. This shows their usefulness and potential in bioinformatics and computational biology. The provided dataset includes:
Anatomical structure and cell type biomarker annotations from the HuBMAP ASCT+B tables, augmented with RNA-seq coexpression data from ARCHS4
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Additional file 2: Table S1. Subject information; Table S2. Full list of the main 251 DEGs; Table S3. Cell-type enrichment analysis using single-cell level gene expression profiles obtained from murine habenula; Table S4. Top-10 GO-BP terms and related genes; Table S5. Top-10 ARCHS4 co-expressed TFs and putative target DEGs; Table S6. List of EC markers that did not exhibit significant changes in mRNA expression levels.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This repository contains supplementary materials for the research paper "Association of copy number alterations with the immune transcriptomic landscape in cancer". The materials are organized in the following folders:
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Table S2. Gene enrichment analyses (EnrichR) identify highly expressed genes in tissues derived from all three germ layers using the ARCHS4 enrichment tool. (XLSX 11 kb)
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset contains results from paired differential expression and differential splicing analyses as well as gene-set over-representation analysis results for 199 baseline vs. case comparisons across 100 randomly curated datasets with accompanying metadata (article).
All results were computed using the R package pairedGSEA, which utilized DESeq2 (Love et al., 2014), DEXSeq (Anders et al., 2012), and fgsea (Korotkevich et al., 2019).
See limma results here: https://doi.org/10.5281/zenodo.8162214
Each .RDS file contains a list with four objects: A 'metadata' object with the metadata of the respective raw data, a 'genes' object with gene-level differential splicing and expression results, a 'gene_set' object with over-representation results, and 'experiment' with the experiment title.
The filenames follow this pattern: "[dataset ID]_[GEO accession number]_[Manually assigned comparison title].RDS".
All datasets were obtained from a local copy of the ARCHS4 v11 database of transcript counts (Lachmann et al., 2018).
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Background: Inappropriate repair of DNA damage drives carcinogenesis. Lymphoid-specific helicase (HELLS) is an important component of the chromatin remodeling complex that helps repair DNA through various mechanisms such as DNA methylation, histone posttranslational modification, and nucleosome remodeling. Its role in human cancer initiation and progression has garnered recent attention. Our study aims to provide a more systematic and comprehensive understanding of the role of HELLS in the development and progression of multiple malignancies through analysis of HELLS in cancers.
Methods: We explored the role of HELLS in cancers using The Cancer Genome Atlas (TCGA) and Genotype-Tissue Expression (GTEx) database. Multiple web platforms and software were used for data analysis, including R, Cytoscape, HPA, Archs4, TISIDB, cBioPortal, STRING, GSCALite, and CancerSEA.
Results: High HELLS expression was found in a variety of cancers and differentially expressed across molecular and immune subtypes. HELLS was involved in many cancer pathways. Its expression positively correlated with Th2 and Tcm cells in most cancers. It also correlated with genetic markers of immunomodulators in various cancers.
Conclusions: Our study elucidates the role HELLS plays in promotion, inhibition, and treatment of different cancers. HELLS is a potential cancer diagnostic and prognostic biomarker with immune, targeted, or cytotoxic therapeutic value. This work is a prerequisite to clinical validation and treatment of HELLS in cancers.