5 datasets found
  1. Automated Retrieval GEO GE Data MicroarrayRNASeq

    • kaggle.com
    zip
    Updated Nov 29, 2025
    Dr. Nagendra (2025). Automated Retrieval GEO GE Data MicroarrayRNASeq [Dataset]. https://www.kaggle.com/datasets/mannekuntanagendra/automated-retrieval-geo-ge-data-microarrayrnaseq
    Download format: zip (2393 bytes)
    Dataset updated: Nov 29, 2025
    Authors: Dr. Nagendra
    License: MIT License (https://opensource.org/licenses/MIT)
    License information was derived automatically

    Description

    This dataset provides a comprehensive pipeline for automated retrieval of gene expression data.

    Supports both Microarray and RNA-Seq datasets from the NCBI GEO database.

    Designed for researchers and bioinformaticians to streamline GEO data analysis.

    Includes R scripts to download, process, and export GEO datasets efficiently.

    Handles metadata extraction, sample annotation, and expression matrix generation.

    Facilitates downstream analyses such as differential gene expression and visualization.

    Compatible with various GEO platforms, reducing manual data curation efforts.

    Enables reproducible research by standardizing data retrieval and processing.

    Useful for comparative studies, functional genomics, and biomarker discovery.

    Reduces the technical barrier for users unfamiliar with GEO data structures.
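
    The dataset's own R scripts are not reproduced in this listing. As an illustration only of the metadata extraction and expression matrix generation described above, here is a minimal Python sketch that splits text in the GEO series-matrix layout into metadata and a numeric table. The function name `parse_series_matrix` and the toy input are hypothetical and are not taken from the dataset.

```python
import csv


def _to_float(value):
    """Convert a table cell to float; empty or non-numeric cells become None."""
    try:
        return float(value)
    except ValueError:
        return None


def parse_series_matrix(text):
    """Split GEO series-matrix text into (metadata, header, rows).

    Metadata lines start with "!" and are tab-separated. The expression
    table sits between the "!series_matrix_table_begin" and
    "!series_matrix_table_end" marker lines.
    """
    metadata = {}
    header = None
    rows = []
    in_table = False
    for line in text.splitlines():
        if not line.strip():
            continue
        if line.startswith("!series_matrix_table_begin"):
            in_table = True
            continue
        if line.startswith("!series_matrix_table_end"):
            in_table = False
            continue
        # csv.reader strips the double quotes GEO puts around text fields.
        fields = next(csv.reader([line], delimiter="\t"))
        if in_table:
            if header is None:
                header = fields  # first table line: ID_REF plus sample IDs
            else:
                rows.append([fields[0]] + [_to_float(v) for v in fields[1:]])
        elif line.startswith("!"):
            key = fields[0].lstrip("!")
            metadata.setdefault(key, []).extend(fields[1:])
    return metadata, header, rows


# Toy input (hypothetical values) in the series-matrix layout.
SAMPLE = (
    '!Series_title\t"Toy series"\n'
    '!Sample_geo_accession\t"GSM1"\t"GSM2"\n'
    "!series_matrix_table_begin\n"
    '"ID_REF"\t"GSM1"\t"GSM2"\n'
    '"probe_a"\t7.5\t8.25\n'
    '"probe_b"\t\t3.0\n'
    "!series_matrix_table_end\n"
)

metadata, header, rows = parse_series_matrix(SAMPLE)
```

    Missing cells are kept as `None` rather than dropped, so each row stays aligned with the sample columns in `header`.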

  2. Gene expression data sources for in silico approach to assessing activation...

    • springernature.figshare.com
    application/gzip
    Updated May 31, 2023
    Sylvain Brohee; Amir Sonnenblick; David Venet (2023). Gene expression data sources for in silico approach to assessing activation of AKT/mTOR signalling pathway in ER-positive early Breast Cancer [Dataset]. http://doi.org/10.6084/m9.figshare.7461776.v1
    Download format: application/gzip
    Dataset updated: May 31, 2023
    Dataset provided by: Figshare (http://figshare.com/)
    Authors: Sylvain Brohee; Amir Sonnenblick; David Venet
    License: CC0 1.0 Universal Public Domain Dedication (https://creativecommons.org/publicdomain/zero/1.0/)
    License information was derived automatically

    Description

    This dataset contains data files and identifiers for original data sources for 39 gene expression datasets from over 7,000 individuals with estrogen receptor positive (ER-positive) Breast Cancer (BC).

    Background

    The related study developed a novel in silico approach to assess activation of different signalling pathways. The phosphatidylinositol 3-kinase (PI3K)/AKT/mTOR signalling pathway mediates key cellular functions, including growth, proliferation and survival and is frequently involved in carcinogenesis, tumor progression and metastases. This research seeks to target relative contribution of AKT and mTOR (downstream of PI3K) in BC outcomes using the in silico approach via integrated reverse phase protein array (RPPA) and matched gene expression.

    Methods and sample size

    The methodology includes the development of gene signatures that reflect level of expression of pAKT and p-mTOR separately. Pooled analysis of gene expression data from over 7,000 patients with ER-positive BC was then performed. This data record holds links to the repositories holding these data, as well as the R-data files for each data record used in the analysis. All gene signatures developed are captured in Supplementary Data Sonnenblick.pdf.xlsx

    Data sources

    The dataset name, relevant DOI, accession number or access requirements are listed alongside the file type and repository name or other source where applicable.

    GEO = Gene Expression Omnibus
    EGA = European Genome-phenome Archive

    This data table is available to download as NPJBCANCER-00304R1-data-sources.xlsx including more detailed information and web urls to each data source. data_db.tab contains more detailed technical metadata for each data source.

    Dataset | Data location | Permanent identifier/URL
    NKI | CCB NKI | http://ccb.nki.nl/data/van-t-Veer_Nature_2002/
    UCSF | GEO | GSE123833
    STNO2 | GEO | GSE4335
    NCI | Research Article (Supplementary files) | 10.1073/pnas.1732912100
    UNC4 | GEO | GSE18229
    CAL | Array Express | E-TABM-158
    MDA4 | GEO | GSE123832
    KOO | GEO | GSE123831
    HLP | Array Express | E-TABM-543
    EXPO | GEO | GSE2109
    VDX | GEO | GSE2034/GSE5327
    MSK | GEO | GSE2603
    UPP | GEO | GSE3494
    STK | GEO | GSE1456
    UNT | GEO | GSE2990
    DUKE | GEO | GSE3143
    TRANSBIG | GEO | GSE7390
    DUKE2 | GEO | GSE6961
    MAINZ | GEO | GSE11121
    LUND2 | GEO | GSE5325
    LUND | GEO | GSE5325
    FNCLCC | GEO | GSE7017
    EMC2 | GEO | GSE12276
    MUG | GEO | GSE10510
    NCCS | GEO | GSE5364
    MCCC | GEO | GSE19177
    EORTC10994 | GEO | GSE1561
    DFHCC | GEO | GSE19615
    DFHCC2 | GEO | GSE18864
    DFHCC3 | GEO | GSE3744
    DFHCC4 | GEO | GSE5460
    MAQC2 | GEO | GSE20194
    TAM | GEO | GSE6532/GSE9195
    MDA5 | GEO | GSE17705
    VDX3 | GEO | GSE12093
    METABRIC | EGA | EGAS00000000083
    TCGA | TCGA | https://tcga-data.nci.nih.gov/docs/publications/brca_2012/
    DNA methylation (Dedeurwaerder et al. 2011) | GEO | https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE20713
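
    Most GEO rows in the table above give bare series accessions, sometimes two joined by a slash, while the last row gives a full query URL. As a small sketch, assuming the same `acc.cgi?acc=` URL pattern used in that last row applies to the other GSE accessions, the identifiers can be expanded into links like this. The function name `geo_urls` is hypothetical.

```python
import re

# URL prefix copied from the GSE20713 entry in the table above.
GEO_QUERY_URL = "https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc="


def geo_urls(identifier):
    """Return one GEO query URL per GSE accession found in an identifier cell.

    Handles combined cells such as "GSE2034/GSE5327" and cells that are
    already full URLs. Non-GEO identifiers yield an empty list.
    """
    accessions = re.findall(r"\bGSE\d+\b", identifier)
    return [GEO_QUERY_URL + accession for accession in accessions]


combined = geo_urls("GSE2034/GSE5327")
array_express = geo_urls("E-TABM-158")
already_url = geo_urls(GEO_QUERY_URL + "GSE20713")
```

    ArrayExpress, EGA, and DOI identifiers contain no GSE accession, so they fall through as empty lists and need their own handling.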

  3. DGE GO Enrichment Analysis Microarray Data GDS2778

    • kaggle.com
    zip
    Updated Nov 29, 2025
    Dr. Nagendra (2025). DGE GO Enrichment Analysis Microarray Data GDS2778 [Dataset]. https://www.kaggle.com/datasets/mannekuntanagendra/dge-go-enrichment-analysis-microarray-data-gds2778
    Download format: zip (6820264 bytes)
    Dataset updated: Nov 29, 2025
    Authors: Dr. Nagendra
    License: MIT License (https://opensource.org/licenses/MIT)
    License information was derived automatically

    Description

    This dataset is based on National Center for Biotechnology Information (NCBI) Gene Expression Omnibus (GEO) DataSet accession GDS2778.

    The dataset originates from a microarray experiment measuring global gene expression under specific experimental conditions.

    Raw and processed expression data (for all probes/genes) are included, enabling downstream analysis such as normalization, differential expression, and clustering.

    The dataset has been used to perform differential gene expression (DGE) analysis to identify genes that are up- or down-regulated under the experimental condition compared to control.

    Data processing steps typically include normalization (e.g., log-transformation), quality control, probe-to-gene mapping, and statistical testing for significance (e.g., using packages such as limma or other DGE tools).

    Resulting differentially expressed genes (DEGs) include statistics such as log fold change (logFC), adjusted p‑values (adj.P.Val), and possibly other metrics (e.g., B-statistic), allowing assessment of both magnitude and significance of changes.

    The dataset also includes a visualization file (heatmap image) that displays expression patterns of DEGs (or top variable genes) across samples — enabling clustering and pattern recognition across samples and genes.

    The heatmap helps illustrate sample-wise and gene-wise expression variation: clustering groups together samples (e.g. control vs treatment) and genes with similar expression dynamics.

    This dataset is suitable for further bioinformatics analysis: e.g. functional enrichment (GO/Pathway), co‑expression analysis, gene signature identification, or integration with other datasets.

    Users who download this dataset can reproduce or extend analyses, such as re-normalization, alternative clustering, custom DEG thresholds, or downstream biological interpretation (pathway, network analysis).
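
    To make the idea of custom DEG thresholds concrete, here is a minimal Python sketch that splits a results table into up- and down-regulated genes using the logFC and adj.P.Val columns named above. The cutoffs (|logFC| of at least 1, adjusted p-value below 0.05) are common conventions assumed for illustration, not values stated by this dataset, and the gene names and numbers are invented toy data.

```python
def filter_degs(results, lfc_cutoff=1.0, padj_cutoff=0.05):
    """Split DGE results into up- and down-regulated gene lists.

    `results` is a list of dicts with keys "gene", "logFC" and "adj.P.Val".
    Both cutoffs are assumptions that callers are expected to adjust.
    """
    up, down = [], []
    for row in results:
        if row["adj.P.Val"] >= padj_cutoff:
            continue  # not significant after multiple-testing correction
        if row["logFC"] >= lfc_cutoff:
            up.append(row["gene"])
        elif row["logFC"] <= -lfc_cutoff:
            down.append(row["gene"])
    return up, down


# Toy results table (hypothetical genes and statistics).
toy_results = [
    {"gene": "geneA", "logFC": 2.3, "adj.P.Val": 0.001},
    {"gene": "geneB", "logFC": -1.7, "adj.P.Val": 0.02},
    {"gene": "geneC", "logFC": 0.4, "adj.P.Val": 0.0005},  # too small a change
    {"gene": "geneD", "logFC": 3.1, "adj.P.Val": 0.30},    # not significant
]

up_genes, down_genes = filter_degs(toy_results)
strict_up, strict_down = filter_degs(toy_results, lfc_cutoff=2.0, padj_cutoff=0.01)
```

    Tightening either cutoff only shrinks the lists, which makes it easy to check how sensitive a downstream enrichment result is to the threshold choice.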

  4. Table3_In-depth exploration of the shared genetic signature and molecular...

    • frontiersin.figshare.com
    xlsx
    Updated Nov 24, 2023
    Weijuan Lou; Wenhui Li; Ming Yang; Chong Yuan; Rui Jing; Shunjie Chen; Cheng Fang (2023). Table3_In-depth exploration of the shared genetic signature and molecular mechanisms between end-stage renal disease and osteoporosis.XLSX [Dataset]. http://doi.org/10.3389/fgene.2023.1159868.s003
    Download format: xlsx
    Dataset updated: Nov 24, 2023
    Dataset provided by: Frontiers Media (http://www.frontiersin.org/)
    Authors: Weijuan Lou; Wenhui Li; Ming Yang; Chong Yuan; Rui Jing; Shunjie Chen; Cheng Fang
    License: Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    Background: Osteoporosis (OS) and fractures are common in patients with end-stage renal disease (ESRD) and in maintenance dialysis patients. However, diagnosing osteoporosis in this population is challenging. The aim of this research is to explore the common genetic profile and potential molecular mechanisms of ESRD and OS.

    Methods and results: Microarray data for ESRD and OS were downloaded from the Gene Expression Omnibus (GEO) database. Weighted correlation network analysis (WGCNA) was used to identify co-expression modules associated with ESRD and OS. Random Forest (RF) and Lasso regression were performed to identify candidate genes, and consensus clustering was used for hierarchical analysis. In addition, miRNAs shared in ESRD and OS were identified by differential analysis, and their target genes were predicted by TargetScan. Finally, we constructed a common miRNAs-mRNAs network with candidate genes and shared miRNAs. By WGCNA, two important modules of ESRD and one important module of OS were identified, and the functions of three major clusters were identified, including ribosome, RAS pathway, and MAPK pathway. Eight gene signatures obtained with the RF and Lasso machine learning methods had area under the curve (AUC) values greater than 0.7 in both ESRD and OS, confirming their diagnostic performance. Consensus clustering successfully stratified ESRD patients, and C1 patients, who had more severe ESRD and OS phenotypes, were defined as the “OS-prone group”.

    Conclusion: Our work identifies biological processes and underlying mechanisms shared by ESRD and OS, and identifies new candidate genes that can be used as biomarkers or potential therapeutic targets, revealing molecular alterations in susceptibility to OS in ESRD patients.

  5. Alzheimer Microarray Analysis

    • kaggle.com
    zip
    Updated Dec 11, 2019
    Andrew Gao (2019). Alzheimer Microarray Analysis [Dataset]. https://www.kaggle.com/andrewgao/alzheimer-microarray-analysis
    Download format: zip (1393807 bytes)
    Dataset updated: Dec 11, 2019
    Authors: Andrew Gao
    Description

    https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE28146 "Microarray analyses of laser-captured hippocampus reveal distinct gray and white matter signatures associated with incipient Alzheimer’s disease" Blalock et al.

    [Image: GSE28146 value distribution]

    Disease state: control, incipient, moderate, and severe. Patients are male and female, around 80-90 years old.

    8 controls, 7 incipient, 8 moderate, 7 severe

    "Alzheimer's disease (AD) is a devastating neurodegenerative disorder that threatens to reach epidemic proportions as our population ages. Although much research has examined molecular pathways associated with AD, relatively few studies have focused on critical early stages. Our prior microarray study correlated gene expression in human hippocampus with AD markers. Results suggested a new model of early-stage AD in which pathology spreads along myelinated axons, orchestrated by upregulated transcription and epigenetic factors related to growth and tumor suppression (Blalock et al., 2004). However, the microarray analyses were performed on RNA from fresh frozen hippocampal tissue blocks containing both gray and white matter, potentially obscuring region-specific changes. In the present study, we used laser capture microdissection to exclude major white matter tracts and selectively collect CA1 hippocampal gray matter from formalin-fixed, paraffin-embedded (FFPE) hippocampal sections of the same subjects assessed in our prior study. Microarray analyses of this gray matter-enriched tissue revealed many correlations similar to those seen in our prior study, particularly for neuron-related genes. Nonetheless, in the laser-captured tissue, we found a striking paucity of the AD-associated epigenetic and transcription factor genes that had been strongly overrepresented in the prior tissue block study. In addition, we identified novel pathway alterations that may have considerable mechanistic implications, including downregulation of genes stabilizing ryanodine receptor Ca2+ release and upregulation of vascular development genes. We conclude that FFPE tissue can be a reliable resource for microarray studies, that upregulation of growth-related epigenetic/transcription factors with incipient AD is predominantly localized to white matter, further supporting our prior findings and model, and that alterations in vascular and ryanodine receptor-related pathways in gray matter are closely associated with incipient AD."

    Enjoy!
