Background: In microarray data analysis, the comparison of gene-expression profiles with respect to different conditions and the selection of biologically interesting genes are crucial tasks. Multivariate statistical methods have been applied to analyze these large datasets. Less work has been published concerning the assessment of the reliability of gene-selection procedures. Here we describe a method to assess reliability in multivariate microarray data analysis using permutation-validated principal components analysis (PCA). The approach is designed for microarray data with a group structure.
Results: We used PCA to detect the major sources of variance underlying the hybridization conditions followed by gene selection based on PCA-derived and permutation-based test statistics. We validated our method by applying it to well characterized yeast cell-cycle data and to two datasets from our laboratory. We could describe the major sources of variance, select informative genes and visualize the relationship of genes and arrays. We observed differences in the level of the explained variance and the interpretability of the selected genes.
Conclusions: Combining data visualization and permutation-based gene selection, permutation-validated PCA enables one to illustrate gene-expression variance between several conditions and to select genes by taking into account the relationship of between-group to within-group variance of genes. The method can be used to extract the leading sources of variance from microarray data, to visualize relationships between genes and hybridizations and to select informative genes in a statistically reliable manner. This selection accounts for the level of reproducibility of replicates or group structure as well as gene-specific scatter. Visualization of the data can support a straightforward biological interpretation.
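The idea of weighing between-group against within-group variance and validating it by permutation can be illustrated with a small sketch. This is not the authors' PCA-derived statistic: it assumes a plain one-way F ratio for a single gene and shuffles the group labels to obtain an empirical p-value. The function names, the default of 999 permutations, and the toy values are all assumptions made for illustration.

```python
import random
from statistics import mean


def f_ratio(values, labels):
    """One-way F ratio: between-group mean square over within-group mean square."""
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(label, []).append(value)
    k, n = len(groups), len(values)
    if k < 2 or n <= k:
        raise ValueError("need at least two groups and some replication")
    grand = mean(values)
    ss_between = sum(len(vs) * (mean(vs) - grand) ** 2 for vs in groups.values())
    ss_within = sum((v - mean(vs)) ** 2 for vs in groups.values() for v in vs)
    if ss_within == 0:
        return float("inf") if ss_between > 0 else 0.0
    return (ss_between / (k - 1)) / (ss_within / (n - k))


def permutation_p_value(values, labels, n_perm=999, seed=0):
    """Empirical p-value: share of label shufflings with an F ratio >= the observed one."""
    rng = random.Random(seed)
    observed = f_ratio(values, labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if f_ratio(values, shuffled) >= observed:
            hits += 1
    # Add one to both counts so the p-value is never exactly zero.
    return (hits + 1) / (n_perm + 1)


if __name__ == "__main__":
    # Hypothetical log-expression values for one gene: 4 replicates in each of 2 groups.
    gene = [1.0, 1.1, 0.9, 1.0, 3.0, 3.1, 2.9, 3.0]
    groups = ["a"] * 4 + ["b"] * 4
    print(permutation_p_value(gene, groups))
```

A gene whose replicates agree within groups but differ between groups receives a small p-value, while a gene with large replicate scatter does not, which is the sense in which such a selection accounts for reproducibility.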
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This example includes two tissues with three replicates apiece, downloaded from GTEx. The complete .csv file is available here: https://github.com/5c077/ExpressionDB/tree/master/data.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Package versions can also be found online: https://github.com/5c077/ExpressionDB/tree/master/data.
THIS RESOURCE IS NO LONGER IN SERVICE. Documented on September 6, 2023. Many laboratories choose to design and print their own microarrays. At present, choosing the genes to include on a given microarray is a very laborious process requiring a high level of expertise. The Onto-Design database assists the designers of custom microarrays by providing the means to select genes based on their experiment. Design custom microarrays based on GO terms of interest. User account required. Platform: Online tool
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This example comprises human annotations downloaded from Entrez Gene. Complete .csv files in the appropriate format for many commonly studied organisms can be downloaded here: https://github.com/5c077/ExpressionDB/tree/master/data.
A tool for mapping transcriptome data and for creating a database with an overview of the entire pathway. It is a web-based resource consisting of a web application that visualizes complex omics data on KEGG pathways, giving an overview of all entities in the context of cellular pathways, and of databases created with the software to visualize a series of microarray data. The web application accepts transcriptome, proteome, or metabolome data, or a combination of these, as input, and because of this scalability it is advantageous for the visualization of cell simulation results. Several databases of transcriptome data obtained at Mori Laboratory, Nara Institute of Science and Technology, Japan, are also presented.
MIT License https://opensource.org/licenses/MIT
License information was derived automatically
This dataset is based on GEO series GSE5583.
The experiment compares gene expression profiles between wild‑type mouse embryonic stem cells (ES cells) and ES cells in which Histone deacetylase 1 (HDAC1) has been knocked out.
The organism used is mouse (Mus musculus).
Microarray technology was employed to measure transcript abundance across the genome, aiming to identify putative HDAC1 target genes.
The dataset includes processed expression data (after normalization and log2 transformation), allowing for downstream exploratory data analysis (EDA) and differential gene expression (DGE) analysis.
As part of EDA, sample‑wise distribution plots (e.g. boxplots) are provided to assess normalization across all arrays.
The dataset also includes downstream visualizations and analysis results, such as boxplots, which help in evaluating the consistency and quality of the processed data.
Researchers can use this dataset to perform differential expression analysis between HDAC1 knockout vs wild‑type ES cells, investigate epigenetic regulation, or explore downstream effects of histone deacetylation loss.
Additionally, the dataset can serve as a reference example for microarray data preprocessing, normalization, transformation (e.g. log2), and exploratory visualization workflows.
The dataset is publicly available and sourced from a trusted repository (GEO), ensuring transparency and reproducibility of the experiment.
MIT License https://opensource.org/licenses/MIT
License information was derived automatically
This dataset is based on National Center for Biotechnology Information (NCBI) Gene Expression Omnibus (GEO) DataSet accession GDS2778.
The dataset originates from a microarray experiment measuring global gene expression under specific experimental conditions.
Raw and processed expression data (for all probes/genes) are included, enabling downstream analysis such as normalization, differential expression, and clustering.
The dataset has been used to perform differential gene expression (DGE) analysis to identify genes that are up- or down-regulated under the experimental condition compared to control.
Data processing steps typically include normalization (e.g., log-transformation), quality control, probe-to-gene mapping, and statistical testing for significance (e.g., using packages such as limma or other DGE tools).
Resulting differentially expressed genes (DEGs) include statistics such as log fold change (logFC), adjusted p‑values (adj.P.Val), and possibly other metrics (e.g., B-statistic), allowing assessment of both magnitude and significance of changes.
The dataset also includes a visualization file (heatmap image) that displays expression patterns of DEGs (or top variable genes) across samples — enabling clustering and pattern recognition across samples and genes.
The heatmap helps illustrate sample-wise and gene-wise expression variation: clustering groups together samples (e.g. control vs treatment) and genes with similar expression dynamics.
This dataset is suitable for further bioinformatics analysis: e.g. functional enrichment (GO/Pathway), co‑expression analysis, gene signature identification, or integration with other datasets.
Users who download this dataset can reproduce or extend analyses, such as re-normalization, alternative clustering, custom DEG thresholds, or downstream biological interpretation (pathway, network analysis).
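The two statistics named above, log fold change and adjusted p-values, can be sketched in a few lines. This is a minimal illustration, not the pipeline used to build the dataset: it assumes expression values are already on the log2 scale and uses the Benjamini-Hochberg procedure (the default adjustment behind limma's adj.P.Val column). The function names and example numbers are hypothetical.

```python
from statistics import mean


def log_fold_change(treated, control):
    """logFC for one gene, assuming both inputs are log2 expression values."""
    return mean(treated) - mean(control)


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


if __name__ == "__main__":
    # Hypothetical raw p-values for four genes.
    raw = [0.01, 0.04, 0.03, 0.005]
    for p, q in zip(raw, benjamini_hochberg(raw)):
        print(f"p={p:.3f}  adjusted={q:.3f}")
    print(log_fold_change([8.0, 9.0], [6.0, 7.0]))
```

A custom DEG threshold then amounts to filtering on both values, for example keeping genes whose absolute logFC exceeds a chosen cutoff and whose adjusted p-value falls below 0.05.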
Microarrays are at the center of a revolution in biotechnology, allowing researchers to screen tens of thousands of genes simultaneously. Typically, they have been used in exploratory research to help formulate hypotheses. In most cases, this phase is followed by a more focused, hypothesis driven stage in which certain specific biological processes and pathways are thought to be involved. Since a single biological process can still involve hundreds of genes, microarrays are still the preferred approach as proven by the availability of focused arrays from several manufacturers. Since focused arrays from different manufacturers use different sets of genes, each array will represent any given regulatory pathway to a different extent. We argue that a functional analysis of the arrays available should be the most important criterion used in the array selection. We developed Onto-Compare as a database that can provide this functionality, based on the GO nomenclature. Compare commercially available microarrays based on GO. User account required. Platform: Online tool
THIS RESOURCE IS NO LONGER IN SERVICE. Documented on May 2nd, 2023. TMAD stores raw and processed data from Tissue Microarray experiments along with their corresponding stained tissue images. In addition, TMAD provides methods for data retrieval, grouping of data, analysis and visualization as well as export to standard formats. Researchers at the Stanford University School of Medicine and their collaborators worldwide have constructed many tissue microarrays for use in basic research.
Database for microarray data storage, retrieval, analysis, and visualization.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Supplementary Materials. (DOCX)
Proliferative and replicative senescent fibroblasts from aged human donors were reprogrammed towards pluripotency, re-differentiated into fibroblasts, and then further analyzed for rejuvenation assessment. Comparisons of microarrays were performed by non-hierarchical clustering and visualized with Treeview software.
Bio Resource for array genes is a free online resource for easy access to collective and integrated information from various public biological resources for human, mouse, rat, fly and C. elegans genes. The resource includes information about the genes that are represented in Unigene clusters. This resource provides interactive tools to selectively view, analyze and interpret gene expression patterns against the background of gene and protein functional information. Different query options are provided to mine the biological relationships represented in the underlying database. The Search button will take you to the list of query tools available. This Bio resource is a platform designed as an online resource to assist researchers in analyzing results of microarray experiments and developing a biological interpretation of the results. The site is mainly intended for interpreting the unique gene expression patterns found as biological changes that can lead to new diagnostic procedures and drug targets. This interactive site allows users to selectively view a variety of information about gene functions that is stored in an underlying database. Although there are other online resources that provide a comprehensive annotation and summary of genes, this resource differs from these by further enabling researchers to mine biological relationships amongst the genes captured in the database using new query tools, thus providing a unique way of interpreting microarray data results based on the knowledge provided for the cellular roles of genes and proteins. A total of six different query tools are provided, and each offers different search features, analysis options and different forms of display and visualization of data. The data are collected in a relational database from public resources: Unigene, Locus link, OMIM, NCBI dbEST, protein domains from NCBI CDD, Gene Ontology, Pathways (Kegg, Genmapp and Biocarta) and BIND (Protein interactions). 
Data is dynamically collected and compiled twice a week from public databases. Search options offer capability to organize and cluster genes based on their Interactions in biological pathways, their association with Gene Ontology terms, Tissue/organ specific expression or any other user-chosen functional grouping of genes. A color coding scheme is used to highlight differential gene expression patterns against a background of gene functional information. Concept hierarchies (Anatomy and Diseases) of MESH (Medical Subject Heading) terms are used to organize and display the data related to Tissue specific expression and Diseases. Sponsors: BioRag database is maintained by the Bioinformatics group at Arizona Cancer Center. The material presented here is compiled from different public databases. BioRag is hosted by the Biotechnology Computing Facility of the University of Arizona. 2002,2003 University of Arizona.
The datasets presented here describe the dataset Brain cancer gene expression - CuMiD. They include the total count, mean, standard deviation, minimum, percentiles (25%, 50%, 70%), and maximum for Normal samples and the cancer types Ependymoma, Glioblastoma, Medulloblastoma, and Pilocytic Astrocytoma. These datasets could help with visualizing the large CuMiD dataset.
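The summary statistics listed above can be reproduced for any single class of samples with a short sketch. It assumes linear interpolation between neighbouring order statistics for the percentiles, which may differ from the convention used to produce the published tables; the function names and the sample values are hypothetical.

```python
from statistics import mean, stdev


def percentile(sorted_values, q):
    """Linearly interpolated percentile of pre-sorted values, with q in [0, 100]."""
    position = (len(sorted_values) - 1) * q / 100
    lower = int(position)
    upper = min(lower + 1, len(sorted_values) - 1)
    fraction = position - lower
    return sorted_values[lower] + (sorted_values[upper] - sorted_values[lower]) * fraction


def describe(values, percentiles=(25, 50, 70)):
    """Count, mean, sample standard deviation, min, percentiles and max (needs >= 2 values)."""
    ordered = sorted(values)
    summary = {
        "count": len(ordered),
        "mean": mean(ordered),
        "std": stdev(ordered),
        "min": ordered[0],
    }
    for q in percentiles:
        summary[f"{q}%"] = percentile(ordered, q)
    summary["max"] = ordered[-1]
    return summary


if __name__ == "__main__":
    # Hypothetical expression values of one gene across samples of one class.
    glioblastoma = [7.2, 8.1, 6.9, 7.7, 8.4, 7.0]
    for name, value in describe(glioblastoma).items():
        print(name, value)
```

Running `describe` once per class (Normal and each cancer type) yields one summary per class, which is the shape of the description tables.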
An experiment in web-database access to large multi-dimensional data sets using a standardized experimental platform to determine if the larger scientific community can be given simple, intuitive, and user-friendly web-based access to large microarray data sets. All data in PEPR is also available via NCBI GEO. The structure and goals of PEPR differ from other mRNA expression profiling databases in a number of important ways.
* The experimental platform in PEPR is standardized, and is an Affymetrix-only database. All microarrays available in the PEPR web database should ascribe to quality control and standard operating procedures. A recent publication has described the QC/SOP criteria utilized in PEPR profiles (The Tumor Analysis Best Practices Working Group 2004).
* PEPR permits gene-based queries of large Affymetrix array data sets without any specialized software. For example, a number of large time series projects are available within PEPR, containing 40-60 microarrays, yet these can be simply queried via a dynamic web interface with no prior knowledge of microarray data analysis.
* Projects in PEPR originate from scientists world-wide, but all data has been generated by the Research Center for Genetic Medicine, Children's National Medical Center, Washington DC. Future developments of PEPR will allow remote entry of Affymetrix data ascribing to the same QC/SOP protocols.
They have previously described an initial implementation of PEPR, and a dynamic web-queried time series graphical interface (Chen et al. 2004). A publication showing the utility of PEPR for pharmacodynamic data has recently been published (Almon et al. 2003).
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset provides the inputs used in the Galaxy Training Network (GTN) training 'End-to-End Tissue Microarray Image Analysis with Galaxy-ME'. The tutorial demonstrates how to use the Galaxy-ME tool suite for primary image processing, data analysis, and interactive visualization of multiple tissue imaging datasets. Original data was published by Schapiro et al.
A high performance search engine for gene expression that integrates thousands of manually curated public microarray and RNAseq experiments and visualizes gene expression across different biological contexts (diseases, drugs, tissues, cancers, genotypes, etc.). There are two basic analysis approaches: (1) for a gene of interest, identify which conditions affect its expression; (2) for condition(s) of interest, identify which genes are specifically expressed in this/these conditions. Genevestigator builds on the deep integration of data, both at the level of data normalization and on the level of sample annotations. This deep integration allows scientists to ask new types of questions that cannot be addressed using conventional tools.
MIT License https://opensource.org/licenses/MIT
License information was derived automatically
• This dataset contains expression matrix handling and normalization results derived from GEO dataset GSE32138.
• It includes raw gene expression values processed using standardized bioinformatics workflows.
• The dataset demonstrates quantile normalization applied to microarray-based expression data.
• It provides visualization outputs used to assess data distribution before and after normalization.
• The goal of this dataset is to support reproducible analysis of GSE32138 preprocessing and quality control.
• Researchers can use the files for practice in normalization, exploratory data analysis, and visualization.
• This dataset is useful for learning microarray preprocessing techniques in R or Python.
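For readers practising the technique, quantile normalization can be sketched in plain Python. This is a minimal illustration and not the workflow used to produce the dataset files: it assumes a complete genes-by-samples matrix with no missing values and breaks ties by row order rather than averaging tied ranks. The function name and the toy matrix are hypothetical.

```python
def quantile_normalize(matrix):
    """Quantile-normalize a matrix given as a list of gene rows, one value per sample."""
    n_rows = len(matrix)
    n_cols = len(matrix[0])
    columns = [[row[j] for row in matrix] for j in range(n_cols)]

    # Reference distribution: mean of the k-th smallest value across all samples.
    sorted_columns = [sorted(col) for col in columns]
    reference = [sum(col[k] for col in sorted_columns) / n_cols for k in range(n_rows)]

    # Replace each value by the reference value at that value's rank within its sample.
    result = [[0.0] * n_cols for _ in range(n_rows)]
    for j, col in enumerate(columns):
        order = sorted(range(n_rows), key=lambda i: col[i])
        for rank, i in enumerate(order):
            result[i][j] = reference[rank]
    return result


if __name__ == "__main__":
    # Hypothetical expression values: 4 genes (rows) by 3 samples (columns).
    raw = [
        [5.0, 4.0, 3.0],
        [2.0, 1.0, 4.0],
        [3.0, 3.0, 6.0],
        [4.0, 2.0, 8.0],
    ]
    for row in quantile_normalize(raw):
        print(row)
```

After normalization every sample column contains the same set of values, so per-sample boxplots drawn before and after should move from differing distributions to identical ones.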
bioPIXIE is a novel system for biological data integration and visualization. It allows you to discover interaction networks and pathways in which your gene(s) (e.g. BNI1, YFL039C) of interest participate.