MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
This dataset is based on National Center for Biotechnology Information (NCBI) Gene Expression Omnibus (GEO) DataSet accession GDS2778.
The dataset originates from a microarray experiment measuring global gene expression under specific experimental conditions.
Raw and processed expression data (for all probes/genes) are included, enabling downstream analysis such as normalization, differential expression, and clustering.
The dataset has been used to perform differential gene expression (DGE) analysis to identify genes that are up- or down-regulated under the experimental condition compared to control.
Data processing steps typically include normalization (e.g., log-transformation), quality control, probe-to-gene mapping, and statistical testing for significance (e.g., using packages such as limma or other DGE tools).
Resulting differentially expressed genes (DEGs) include statistics such as log fold change (logFC), adjusted p‑values (adj.P.Val), and possibly other metrics (e.g., B-statistic), allowing assessment of both magnitude and significance of changes.
The dataset also includes a visualization file (heatmap image) that displays expression patterns of DEGs (or top variable genes) across samples — enabling clustering and pattern recognition across samples and genes.
The heatmap helps illustrate sample-wise and gene-wise expression variation: clustering groups together samples (e.g. control vs treatment) and genes with similar expression dynamics.
This dataset is suitable for further bioinformatics analysis: e.g. functional enrichment (GO/Pathway), co‑expression analysis, gene signature identification, or integration with other datasets.
Users who download this dataset can reproduce or extend analyses, such as re-normalization, alternative clustering, custom DEG thresholds, or downstream biological interpretation (pathway, network analysis).
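The DEG statistics mentioned above (logFC and adj.P.Val) can be illustrated with a minimal sketch. It assumes that logFC is the difference of mean log2 intensities and that p-values are adjusted with the Benjamini-Hochberg procedure, which is a common choice; the description does not state which adjustment was used for this dataset. The function names and toy values are hypothetical.

```python
import math


def log2_fold_change(treatment, control):
    """Difference of mean log2 intensities (treatment minus control)."""
    mean_t = sum(math.log2(x) for x in treatment) / len(treatment)
    mean_c = sum(math.log2(x) for x in control) / len(control)
    return mean_t - mean_c


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical toy values, not taken from GDS2778.
print(log2_fold_change([8.0, 8.0], [2.0, 2.0]))
print(benjamini_hochberg([0.01, 0.04, 0.03, 0.5]))
```

A custom DEG threshold then amounts to filtering on both columns, for example keeping genes with an absolute logFC of at least 1 and an adjusted p-value below 0.05.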
Background
Microarray technologies are emerging as a promising tool for genomic studies. The challenge now is how to analyze the resulting large amounts of data. Clustering techniques have been widely applied in analyzing microarray gene-expression data. However, normal mixture model-based cluster analysis has not been widely used for such data, although it has a solid probabilistic foundation. Here, we introduce and illustrate its use in detecting differentially expressed genes. In particular, we do not cluster gene-expression patterns but a summary statistic, the t-statistic.
Results
The method is applied to a data set containing expression levels of 1,176 genes of rats with and without pneumococcal middle-ear infection. Three clusters were found, two of which contain more than 95% of the genes with almost no altered gene-expression levels, whereas the third one has 30 genes with more or less differential gene-expression levels.
Conclusions
Our results indicate that model-based clustering of t-statistics (and possibly other summary statistics) can be a useful statistical tool to exploit differential gene expression for microarray data.
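As a sketch of the summary statistic that is clustered here, the following computes one two-sample t-statistic per gene. The abstract does not say which form of the t-statistic was used; Welch's unequal-variance form is assumed for illustration, and the gene names and values are hypothetical. Fitting the normal mixture model to the resulting statistics is not shown.

```python
import math
from statistics import mean, variance


def welch_t(group_a, group_b):
    """Welch two-sample t-statistic (does not assume equal variances)."""
    se = math.sqrt(variance(group_a) / len(group_a)
                   + variance(group_b) / len(group_b))
    return (mean(group_a) - mean(group_b)) / se


# Hypothetical log-expression values: (infected, control) per gene.
genes = {
    "gene_1": ([2.0, 4.0, 6.0], [1.0, 2.0, 3.0]),
    "gene_2": ([5.1, 4.9, 5.0], [5.0, 5.2, 4.8]),
}

# One t-statistic per gene; these values would be the input to the clustering.
t_stats = {name: welch_t(a, b) for name, (a, b) in genes.items()}
print(t_stats)
```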
A robust semi-parametric normalization technique has been developed, based on the assumption that the large majority of genes will not have their relative expression levels changed from one treatment group to the next, and on the assumption that departures of the response from linearity are small and slowly varying. The method was tested using data simulated under various error models and it performs well.
Background
Microarray experiments offer a potent solution to the problem of making and comparing large numbers of gene expression measurements either in different cell types or in the same cell type under different conditions. Inferences about the biological relevance of observed changes in expression depend on the statistical significance of the changes. In lieu of many replicates with which to determine accurate intensity means and variances, reliable estimates of statistical significance remain problematic. Without such estimates, overly conservative choices for significance must be enforced.
Results
A simple statistical method for estimating variances from microarray control data which does not require multiple replicates is presented. Comparison of datasets from two commercial entities using this difference-averaging method demonstrates that the standard deviation of the signal scales at a level intermediate between the signal intensity and its square root. Application of the method to a dataset related to the β-catenin pathway yields a larger number of biologically reasonable genes whose expression is altered than the ratio method.
Conclusions
The difference-averaging method enables determination of variances as a function of signal intensities by averaging over the entire dataset. The method also provides a platform-independent view of important statistical properties of microarray data.
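The scaling result reported above can be written compactly. This is only a notational sketch: the symbols σ (standard deviation of the signal), I (signal intensity), and α (scaling exponent) are introduced here as assumptions, and a power law is just one way of expressing "intermediate between the signal intensity and its square root".

```latex
\sigma(I) \propto I^{\alpha}, \qquad \tfrac{1}{2} < \alpha < 1
```

Under this reading, α = 1/2 would correspond to square-root scaling and α = 1 to a standard deviation proportional to the intensity itself.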
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Over the last decade, many analytical methods and tools have been developed for microarray data. The detection of differentially expressed genes (DEGs) among different treatment groups is often a primary purpose of microarray data analysis. In addition, association studies investigating the relationship between genes and a phenotype of interest such as survival time are also popular in microarray data analysis. Phenotype association analysis provides a list of phenotype-associated genes (PAGs). However, it is sometimes necessary to identify genes that are both DEGs and PAGs. We consider the joint identification of DEGs and PAGs in microarray data analyses. The first approach we used was a naïve approach that detects DEGs and PAGs separately and then identifies the genes in an intersection of the list of PAGs and DEGs. The second approach we considered was a hierarchical approach that detects DEGs first and then chooses PAGs from among the DEGs or vice versa. In this study, we propose a new model-based approach for the joint identification of DEGs and PAGs. Unlike the previous two-step approaches, the proposed method identifies genes simultaneously that are DEGs and PAGs. This method uses standard regression models but adopts different null hypothesis from ordinary regression models, which allows us to perform joint identification in one-step. The proposed model-based methods were evaluated using experimental data and simulation studies. The proposed methods were used to analyze a microarray experiment in which the main interest lies in detecting genes that are both DEGs and PAGs, where DEGs are identified between two diet groups and PAGs are associated with four phenotypes reflecting the expression of leptin, adiponectin, insulin-like growth factor 1, and insulin. Model-based approaches provided a larger number of genes, which are both DEGs and PAGs, than other methods. Simulation studies showed that they have more power than other methods. 
Through analysis of data from experimental microarrays and simulation studies, the proposed model-based approach was shown to provide a more powerful result than the naïve approach and the hierarchical approach. Since our approach is model-based, it is very flexible and can easily handle different types of covariates.
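The two baseline strategies described above (the naïve intersection and the hierarchical selection) can be sketched with plain set operations. This is an illustration under stated assumptions: each gene already has one p-value for differential expression and one for phenotype association, and a Bonferroni correction is applied at each step. The abstract does not specify the multiple-testing procedure, and the proposed model-based method is not shown. All names and values are hypothetical.

```python
def bonferroni_hits(p_values, alpha=0.05):
    """Genes whose Bonferroni-corrected p-value falls below alpha."""
    m = len(p_values)
    return {gene for gene, p in p_values.items() if p * m < alpha}


def naive_joint(deg_p, pag_p, alpha=0.05):
    """Detect DEGs and PAGs separately, then intersect the two lists."""
    return bonferroni_hits(deg_p, alpha) & bonferroni_hits(pag_p, alpha)


def hierarchical_joint(deg_p, pag_p, alpha=0.05):
    """Detect DEGs first, then test phenotype association only among them."""
    degs = bonferroni_hits(deg_p, alpha)
    restricted = {gene: pag_p[gene] for gene in degs}
    return bonferroni_hits(restricted, alpha)


# Hypothetical p-values for four genes.
deg_p = {"g1": 0.001, "g2": 0.002, "g3": 0.5, "g4": 0.9}
pag_p = {"g1": 0.001, "g2": 0.02, "g3": 0.001, "g4": 0.8}

print(sorted(naive_joint(deg_p, pag_p)))         # ['g1']
print(sorted(hierarchical_joint(deg_p, pag_p)))  # ['g1', 'g2']
```

In this toy case the hierarchical approach recovers one more gene because the second step corrects for two tests instead of four.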
https://www.datainsightsmarket.com/privacy-policy
The global gene expression software market is booming, projected to reach $8.17 billion by 2033 with a 15% CAGR. Driven by NGS advancements and personalized medicine, key players like Agilent and Illumina are shaping this rapidly evolving landscape. Discover market trends, growth drivers, and competitive insights in this comprehensive analysis.
Background
In microarray data analysis, the comparison of gene-expression profiles with respect to different conditions and the selection of biologically interesting genes are crucial tasks. Multivariate statistical methods have been applied to analyze these large datasets. Less work has been published concerning the assessment of the reliability of gene-selection procedures. Here we describe a method to assess reliability in multivariate microarray data analysis using permutation-validated principal components analysis (PCA). The approach is designed for microarray data with a group structure.
Results
We used PCA to detect the major sources of variance underlying the hybridization conditions followed by gene selection based on PCA-derived and permutation-based test statistics. We validated our method by applying it to well characterized yeast cell-cycle data and to two datasets from our laboratory. We could describe the major sources of variance, select informative genes and visualize the relationship of genes and arrays. We observed differences in the level of the explained variance and the interpretability of the selected genes.
Conclusions
Combining data visualization and permutation-based gene selection, permutation-validated PCA enables one to illustrate gene-expression variance between several conditions and to select genes by taking into account the relationship of between-group to within-group variance of genes. The method can be used to extract the leading sources of variance from microarray data, to visualize relationships between genes and hybridizations and to select informative genes in a statistically reliable manner. This selection accounts for the level of reproducibility of replicates or group structure as well as gene-specific scatter. Visualization of the data can support a straightforward biological interpretation.
The summaries of these datasets:
Table 1: Summary of datasets
Data sets                #Attributes/Gene   #Instances   #Classes
Lung cancer              12533              181          2
SRBCT                    2308               83           4
Colon                    2000               62           2
MLL                      12582              72           3
Central Nervous System   7129               60           2
ALLAML                   7129               72           2
ALLAML-3                 7129               72           3
ALL-AML-4                7129               72           4
Ovarian Cancer           15154              253          2
Breast Cancer            24482              97           2
Lymphoma                 4026               62           3
THIS RESOURCE IS NO LONGER IN SERVICE, documented May 10, 2017. A pilot effort that has developed a centralized, web-based biospecimen locator that presents biospecimens collected and stored at participating Arizona hospitals and biospecimen banks, which are available for acquisition and use by researchers. Researchers may use this site to browse, search and request biospecimens to use in qualified studies. The development of the ABL was guided by the Arizona Biospecimen Consortium (ABC), a consortium of hospitals and medical centers in the Phoenix area, and is now being piloted by this Consortium under the direction of ABRC. You may browse by type (cells, fluid, molecular, tissue) or disease. Common data elements decided by the ABC Standards Committee, based on data elements on the National Cancer Institute's (NCI's) Common Biorepository Model (CBM), are displayed. These describe the minimum set of data elements that the NCI determined were most important for a researcher to see about a biospecimen. The ABL currently does not display information on whether or not clinical data is available to accompany the biospecimens. However, a requester has the ability to solicit clinical data in the request. Once a request is approved, the biospecimen provider will contact the requester to discuss the request (and the requester's questions) before finalizing the invoice and shipment. The ABL is available to the public to browse. In order to request biospecimens from the ABL, the researcher will be required to submit the requested required information. Upon submission of the information, shipment of the requested biospecimen(s) will be dependent on the scientific and institutional review approval. Account required. Registration is open to everyone. Documented on August 26, 2019.
Database of published microarray gene expression data, and a software tool for comparing that published data to a user's own microarray results.
It is very simple to use: all you need is a web browser and a list of the probes that went up or down in your experiment. If you find L2L useful please consider contributing your published data to the L2L Microarray Database in the form of list files. L2L finds true biological patterns in gene expression data by systematically comparing your own list of genes to lists of genes that have been experimentally determined to be co-expressed in response to a particular stimulus, in other words, published lists of microarray results. The patterns it finds can point to the underlying disease process or affected molecular function that actually generated the observed change in gene expression. Its insights are far more systematic than critical gene analyses, and more biologically relevant than pure Gene Ontology-based analyses. The publications included in the L2L MDB initially reflected topics thought to be related to Cockayne syndrome: aging, cancer, and DNA damage. Since then, the scope of the publications included has expanded considerably, to include chromatin structure, immune and inflammatory mediators, the hypoxic response, adipogenesis, growth factors, hormones, cell cycle regulators, and others. Despite the parochial origins of the database, the wide range of topics covered will make L2L of general interest to any investigator using microarrays to study human biology. In addition to the L2L Microarray Database, L2L contains three sets of lists derived from Gene Ontology categories: Biological Process, Cellular Component, and Molecular Function. As with the L2L MDB, each GO sub-category is represented by a text file that contains annotation information and a list of the HUGO symbols of the genes assigned to that sub-category or any of its descendants. You don't need to download L2L to use it to analyze your microarray data.
There is an easy-to-use web-based analysis tool, and you have the option of downloading your results so you can view them at any time on your own computer, using any web browser. However, if you prefer, the entire L2L project, and all of its components, can be downloaded from the download page. Platform: Online tool, Windows compatible, Mac OS X compatible, Linux compatible, Unix compatible
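The list-to-list comparison described above can be sketched as an overlap test between a user's gene list and one published list. The description does not say which statistic L2L itself uses; a hypergeometric tail probability is swapped in here as a common choice for this kind of comparison. The gene symbols, the universe size, and the function name are hypothetical.

```python
from math import comb


def overlap_p_value(universe_size, list_a_size, list_b_size, overlap):
    """P(X >= overlap) when list B is drawn at random from the universe.

    X follows a hypergeometric distribution: universe_size genes in total,
    list_a_size of them "marked", list_b_size drawn without replacement.
    """
    total = comb(universe_size, list_b_size)
    upper = min(list_a_size, list_b_size)
    tail = sum(
        comb(list_a_size, k) * comb(universe_size - list_a_size, list_b_size - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


# Hypothetical HUGO symbols: the user's list and one published list.
user_list = {"TP53", "ERCC6", "ERCC8", "CDKN1A"}
published_list = {"ERCC6", "ERCC8", "CDKN1A", "GADD45A", "MDM2"}
shared = user_list & published_list

# Assumes a universe of 1,000 measured genes, purely for illustration.
p = overlap_p_value(1000, len(user_list), len(published_list), len(shared))
print(sorted(shared), p)
```

Repeating the test against every list in a database would call for a multiple-testing correction on the resulting p-values.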
THIS RESOURCE IS NO LONGER IN SERVICE, documented on 6/12/25. ITTACA is a database created for Integrated Tumor Transcriptome Array and Clinical data Analysis. ITTACA centralizes public datasets containing both gene expression and clinical data and currently focuses on the types of cancer that are of particular interest to the Institut Curie: breast carcinoma, bladder carcinoma, and uveal melanoma. ITTACA is developed by the Institut Curie Bioinformatics group and the Molecular Oncology group of UMR144 CNRS/Institut Curie. A web interface allows users to carry out different class comparison analyses, including comparison of expression distribution profiles, tests for differential expression, and patient survival analyses; users can also define their own patient groups according to clinical data or gene expression levels. The different functionalities implemented in ITTACA are:
- Testing whether one or more genes of your choice are differentially expressed between two groups of samples exhibiting distinct phenotypes (Student and Wilcoxon tests).
- Detecting genes differentially expressed between two groups of samples (Significance Analysis of Microarrays).
- Creating histograms that represent the expression level according to a clinical parameter for each sample.
- Computing Kaplan-Meier survival curves for each group.
ITTACA has been developed to be a useful tool for comparing personal results to the existing results in the field of transcriptome studies with microarrays.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Background
MicroRNA is an endogenous non-coding small RNA that negatively regulates and controls gene expression, and increasing evidence links microRNA to oncogenesis and the pathogenesis of cancer. The goal of this study was to explore the potential molecular mechanism of miR-375 in various cancers.
Methods
MiR-375 overexpression in different tumor cell lines was probed with microarray data from Gene Expression Omnibus (GEO). The common target genes of miR-375 were obtained by Robust Rank Aggregation (RRA), and identified by miRWalk2.0 software for target gene prediction. Additionally, we conducted in silico analyses including Protein-Protein Interactions (PPI) analysis, gene ontology (GO) enrichment analysis and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway annotation to provide a summary of the function of miR-375 in various carcinomas. Finally, data obtained from The Cancer Genome Atlas (TCGA) were used for validation in 7 cancers.
Results
Nine miR-375-related microarray datasets were acquired from GEO. Five down-regulated genes from the 9 available microarray datasets overlapped with the potential target genes predicted by miRWalk2.0 software. The target genes were strongly enriched in amino acid biosynthetic and metabolic processes (GO biological process) and in cysteine and methionine metabolism (KEGG analysis). In view of these approaches, VASN, MAT2B, HERPUD1, TPAPPC6B and TAT are probably the most important miR-375 targets. In addition, miR-375 was negatively correlated with MAT2B, which was verified in 5 tumors of TCGA.
Conclusion
In summary, this study based on common target genes provides an innovative perspective for exploring the molecular mechanism of miR-375 in human tumors.
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
It is well-known that correlations in microarray data represent a serious nuisance deteriorating the performance of gene selection procedures. This paper is intended to demonstrate that the correlation structure of microarray data provides a rich source of useful information. We discuss distinct correlation substructures revealed in microarray gene expression data by an appropriate ordering of genes. These substructures include stochastic proportionality of expression signals in a large percentage of all gene pairs, negative correlations hidden in ordered gene triples, and a long sequence of weakly dependent random variables associated with ordered pairs of genes. The reported striking regularities are of general biological interest and they also have far-reaching implications for theory and practice of statistical methods of microarray data analysis. We illustrate the latter point with a method for testing differential expression of non-overlapping gene pairs. While designed for testing a different null hypothesis, this method provides an order of magnitude more accurate control of type 1 error rate compared to conventional methods of individual gene expression profiling. In addition, this method is robust to the technical noise. Quantitative inference of the correlation structure has the potential to extend the analysis of microarray data far beyond currently practiced methods.
According to our latest research, the global microarray analysis market size reached USD 5.8 billion in 2024, reflecting robust adoption across research and clinical diagnostics. The market is projected to expand at a CAGR of 7.1% during the forecast period, reaching approximately USD 10.8 billion by 2033. This growth is primarily attributed to the rising prevalence of chronic diseases, increasing investment in genomics and proteomics research, and the growing demand for personalized medicine. The microarray analysis market continues to benefit from technological advancements, expanding applications in drug discovery, and robust support from government and private funding. As per our latest research, these factors collectively drive the consistent growth trajectory of the market.
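The projection quoted above can be checked with the standard compound-growth formula. This is a small arithmetic sketch that assumes nine compounding periods (2024 to 2033) and annual compounding; neither assumption is stated explicitly in the text, and the function name is hypothetical.

```python
def project_value(start_value, cagr, years):
    """Compound a starting value at a constant annual growth rate."""
    return start_value * (1 + cagr) ** years


# USD 5.8 billion in 2024, 7.1% CAGR, projected to 2033 (assumed 9 periods).
projected = project_value(5.8, 0.071, 2033 - 2024)
print(f"{projected:.2f}")  # about 10.75, in line with "approximately USD 10.8 billion"
```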
A significant growth factor for the microarray analysis market is the escalating focus on personalized medicine and genomics research. With the advent of precision medicine, healthcare providers and researchers are increasingly leveraging microarray technologies to analyze gene expression patterns, identify disease biomarkers, and tailor treatments to individual genetic profiles. The growing adoption of microarray analysis in oncology, rare disease diagnostics, and pharmacogenomics further underscores its critical role in advancing personalized healthcare. Additionally, government initiatives and funding programs aimed at promoting genomics research have catalyzed the deployment of microarray platforms in both academic and clinical settings. The integration of microarray analysis with next-generation sequencing and bioinformatics tools is also enhancing the accuracy and efficiency of genetic profiling, thereby fueling market growth.
Another major driver propelling the microarray analysis market is the expanding application landscape in drug discovery and disease diagnostics. Pharmaceutical and biotechnology companies are increasingly utilizing microarray platforms to accelerate target identification, validate drug candidates, and monitor gene expression changes during preclinical and clinical studies. This technology enables high-throughput screening of thousands of genes or proteins simultaneously, significantly reducing the time and cost associated with traditional drug development processes. In the clinical diagnostics domain, microarray analysis is gaining traction for the early detection of genetic disorders, infectious diseases, and cancer. The ability to simultaneously analyze multiple biomarkers enhances diagnostic accuracy and supports the development of multiplex assays, which are becoming standard in advanced diagnostic laboratories.
Technological advancements in microarray platforms and data analysis software have also played a pivotal role in market expansion. The introduction of high-density arrays, improved labeling techniques, and automated instrumentation has increased the sensitivity, throughput, and reproducibility of microarray experiments. Furthermore, advancements in data analytics, machine learning, and artificial intelligence are enabling researchers to extract actionable insights from complex microarray datasets. These innovations are not only improving research outcomes but are also lowering barriers to entry for smaller laboratories and emerging markets. The convergence of microarray analysis with digital health technologies and cloud-based data management solutions is expected to further broaden its adoption in the coming years.
The development of the Point-of-Care Microarray Chip is revolutionizing the accessibility and speed of microarray analysis. This innovative technology allows for rapid, on-site genetic testing, which is particularly beneficial in settings where traditional laboratory infrastructure is limited or unavailable. By integrating microarray capabilities into a compact, user-friendly device, healthcare providers can perform complex genetic analyses directly at the point of care. This advancement not only enhances the efficiency of diagnostic processes but also facilitates timely decision-making in clinical settings. The Point-of-Care Microarray Chip is poised to significantly impact personalized medicine by enabling real-time monitoring of genetic markers, thus supporting more tailored therapeutic interventions.
From a regional perspective, North America conti
https://www.datainsightsmarket.com/privacy-policy
The size of the Microarray Analysis Industry market was valued at USD XX Million in 2023 and is projected to reach USD XXX Million by 2032, with an expected CAGR of 5.00% during the forecast period. Microarray is a powerful technology that lets you study thousands of genes in one go. It involves attaching DNA probes to a solid surface – a microarray chip – and then hybridising fluorescently labelled DNA or RNA to those probes. The intensity of the fluorescence at each spot on the chip tells you how much of the associated gene is in the sample. This can be used to find genes that are differentially expressed under different conditions – diseased vs healthy tissues or drug treated vs untreated cells. But this is a complex technique and requires careful thought about experimental design and data interpretation. Microarray has changed many areas of research (cancer, drug discovery and genetic diagnostics). In cancer research it helps find genes involved in tumour development and progression. This can lead to biomarkers for early detection and therapeutic targets. In drug discovery it’s used to screen large libraries of compounds to see which ones modulate gene expression; so can find new drug candidates. In genetic diagnostics it’s used to find genetic variations associated with inherited diseases like cystic fibrosis and Huntington’s disease. Overall microarray is a key tool to understand the interactions of genes and their products in biological systems. Although it’s used in many areas, the impact it has in medicine and biotech is biggest. Recent developments include: In June 2022, Ariceum Therapeutics launched with EUR 25M Series A to advance its lead asset, Satoreotide, for the treatment of low- and high-grade neuroendocrine cancers., In May 2022, Pfizer Inc. 
and Biohaven Pharmaceutical Holding Company Ltd reported that the companies entered a definitive agreement under which Pfizer will acquire Biohaven, the maker of NURTEC ODT, an innovative dual-acting migraine therapy approved for both acute treatment and episodic prevention of migraine in adults.. Key drivers for this market are: Growing Burden of Chronic Diseases, Technological Advancements in Diagnostic Testing. Potential restraints include: Reimbursement Issues. Notable trends are: The Instrument Segment is Expected to Hold a Major Market Share in the Peptide Microarray Market.
This dataset tracks the updates made on the dataset "A simple method for statistical analysis of intensity differences in microarray-derived gene expression data" as a repository for previous versions of the data and metadata.
Background
Pipeline comparisons for gene expression data are highly valuable for applied real data analyses, as they enable the selection of suitable analysis strategies for the dataset at hand. Such pipelines for RNA-Seq data should include mapping of reads, counting and differential gene expression analysis or preprocessing, normalization and differential gene expression in case of microarray analysis, in order to give a global insight into pipeline performances.
Methods
Four commonly used RNA-Seq pipelines (STAR/HTSeq-Count/edgeR, STAR/RSEM/edgeR, Sailfish/edgeR, TopHat2/Cufflinks/CuffDiff) were investigated on multiple levels (alignment and counting) and cross-compared with the microarray counterpart on the level of gene expression and gene ontology enrichment. For these comparisons we generated two matched microarray and RNA-Seq datasets: Burkitt Lymphoma cell line data and rectal cancer patient data.
Results
The overall mapping rate of STAR was 98.98% for the cell line dataset and 98.49% for the patient dataset. TopHat's overall mapping rate was 97.02% and 96.73%, respectively, while Sailfish had an overall mapping rate of only 84.81% and 54.44%. The correlation of gene expression in microarray and RNA-Seq data was moderately worse for the patient dataset (ρ = 0.67–0.69) than for the cell line dataset (ρ = 0.87–0.88). The exception was Cufflinks, whose correlations were substantially lower (ρ = 0.21–0.29 and 0.34–0.53). For both datasets we identified very low numbers of differentially expressed genes using the microarray platform. For RNA-Seq we checked the agreement of differentially expressed genes identified in the different pipelines and of GO-term enrichment results.
Conclusion
In conclusion, the combination of STAR aligner with HTSeq-Count, followed by STAR aligner with RSEM and Sailfish, generated differentially expressed genes best suited for the dataset at hand and in agreement with most of the other transcriptomics pipelines.
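The cross-platform comparison above reports correlations between microarray and RNA-Seq expression values. As a sketch of how such a value can be computed, the following implements a rank correlation in plain Python. It assumes that ρ denotes Spearman's rank correlation, which the abstract does not state, and the expression values are hypothetical.

```python
import math


def ranks(values):
    """1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Hypothetical per-gene expression on the two platforms.
microarray = [7.1, 8.3, 9.0, 10.2, 11.5]
rnaseq = [35.0, 20.0, 150.0, 90.0, 400.0]
print(round(spearman(microarray, rnaseq), 3))  # 0.8
```

A rank-based measure is a natural fit here because microarray intensities and read counts live on different scales.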
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Motivation
When we were asked for help with high-level microarray data analysis (on Affymetrix HGU-133A microarray), we faced the problem of selecting an appropriate method. We wanted to select a method that would yield "the best result" (detecting as many "really" differentially expressed genes (DEGs) as possible, without false positives and false negatives). However, life scientists could not help us: they use their "favorite" method without special argumentation. We also did not find any norm or recommendation. Therefore, we decided to examine it for our own purpose. We considered whether the results obtained using different methods of high-level microarray data analyses (Significance Analysis of Microarrays, Rank Products, Bland-Altman, Mann-Whitney test, T test and the Linear Models for Microarray Data) would be in agreement. Initially, we conducted a comparative analysis of the results on eight real data sets from microarray experiments (from the Array Express database). The results were surprising. On the same array set, the sets of DEGs found by different methods were significantly different. We also applied the methods to artificial data sets and determined some measures that allow the preparation of the overall scoring of tested methods for future recommendation.
Results
We found a very low level of concordance of results from tested methods on real array sets. The number of common DEGs (detected by all six methods on fixed array sets, checked on eight array sets) ranged from 6 to 433 (22,283 total array readings). Results on artificial data sets were better than those on the real data. However, they were not fully satisfying. We scored tested methods on accuracy, recall, precision, f-measure and Matthews correlation coefficient. Based on the overall scoring, the best methods were SAM and LIMMA. We also found TT to be acceptable. The worst scoring was MW. Based on our study, we recommend:
1. Carefully taking into account the need for study when choosing a method.
2. Making high-level analysis with more than one method and then only taking the genes that are common to all methods (which seems to be reasonable).
3. Being very careful (while summarizing facts) about sets of differentially expressed genes: different methods discover different sets of DEGs.
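The scoring measures listed above, and the second recommendation (keeping only genes common to all methods), can be sketched as follows. This assumes an artificial data set where the truly differential genes are known, and that every predicted gene belongs to the full gene set. The function name, method labels, and gene identifiers are hypothetical, and the abstract does not describe how the individual measures were combined into an overall score.

```python
import math


def score_method(predicted, truth, all_genes):
    """Confusion-matrix scores for one method's DEG list."""
    tp = len(predicted & truth)
    fp = len(predicted - truth)
    fn = len(truth - predicted)
    tn = len(all_genes) - tp - fp - fn
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f_measure = (2 * precision * recall / (precision + recall)
                 if precision + recall else 0.0)
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {
        "accuracy": (tp + tn) / len(all_genes),
        "precision": precision,
        "recall": recall,
        "f_measure": f_measure,
        "mcc": mcc,
    }


# Hypothetical artificial data set: 10 genes, 4 truly differential.
all_genes = {f"g{i}" for i in range(1, 11)}
truth = {"g1", "g2", "g3", "g4"}
method_results = {
    "method_a": {"g1", "g2", "g3", "g5"},
    "method_b": {"g1", "g2", "g4", "g6"},
}

scores = {name: score_method(degs, truth, all_genes)
          for name, degs in method_results.items()}

# Recommendation 2: keep only genes reported by every method.
common_degs = set.intersection(*method_results.values())
print(scores["method_a"], sorted(common_degs))
```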
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Abstract There are still numerous challenges to be overcome in microarray data analysis because advanced, state-of-the-art analyses are restricted to programming users. Here we present the Gene Expression Analysis Platform, a versatile, customizable, optimized, and portable software developed for microarray analysis. GEAP was developed in C# for the graphical user interface, data querying, storage, results filtering and dynamic plotting, and R for data processing, quality analysis, and differential expression. Through a new automated system that identifies microarray file formats, retrieves contents, detects file corruption, and solves dependencies, GEAP deals with datasets independently of platform. GEAP covers 32 statistical options, supports quality assessment, differential expression from single and dual-channel experiments, and gene ontology. Users can explore results by different plots and filtering options. Finally, the entire data can be saved and organized through storage features, optimized for memory and data retrieval, with faster performance than R. These features, along with other new options, are not yet present in any microarray analysis software. GEAP accomplishes data analysis in a faster, straightforward, and friendlier way than other similar software, while keeping the flexibility for sophisticated procedures. By developing optimizations, unique customizations and new features, GEAP is destined for both advanced and non-programming users.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Next generation sequencing (NGS) is increasingly being used for transcriptome-wide analysis of differential gene expression. NGS data are multidimensional count data. Therefore, most of the statistical methods developed for microarray data analysis are not applicable to transcriptomic data. For this reason, a variety of new statistical methods based on count data of transcript reads have been proposed. However, due to high cost and limited biological resources, current NGS data are still generated from only a few replicate libraries, and some of the existing methods do not always perform well on such count data. Here we developed a very powerful and robust statistical method based on beta and binomial distributions. Our method (mBeta t-test) is specifically applicable to sequence count data from small samples. On both simulated and real transcriptomic data, the mBeta t-test significantly outperformed the existing top statistical methods chosen in all 12 given scenarios and performed with high efficiency and high stability. The differentially expressed genes found by our method in real transcriptomic data were validated by qPCR experiments. Our method shows high power in finding truly differential expression, conservative FDR estimation, and high stability in RNA sequence count data derived from small samples. Our method can also be extended to genome-wide detection of differential splicing events.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
A summary of each individual microarray dataset from the different GEO datasets.