100+ datasets found
  1. Data from: Normalization and analysis of DNA microarray data by...

    • catalog.data.gov
    • healthdata.gov
    • +1 more
    Updated Sep 6, 2025
    Cite
    National Institutes of Health (2025). Normalization and analysis of DNA microarray data by self-consistency and local regression [Dataset]. https://catalog.data.gov/dataset/normalization-and-analysis-of-dna-microarray-data-by-self-consistency-and-local-regression
    Explore at:
    Dataset updated
    Sep 6, 2025
    Dataset provided by
    National Institutes of Health
    Description

    A robust semi-parametric normalization technique has been developed, based on the assumption that the large majority of genes will not have their relative expression levels changed from one treatment group to the next, and on the assumption that departures of the response from linearity are small and slowly varying. The method was tested using data simulated under various error models and it performs well.

  2. Data from: Profound effect of normalization on detection of differentially...

    • catalog.data.gov
    • data.virginia.gov
    • +1 more
    Updated Sep 6, 2025
    + more versions
    Cite
    National Institutes of Health (2025). Profound effect of normalization on detection of differentially expressed genes in oligonucleotide microarray data analysis [Dataset]. https://catalog.data.gov/dataset/profound-effect-of-normalization-on-detection-of-differentially-expressed-genes-in-oligonu
    Explore at:
    Dataset updated
    Sep 6, 2025
    Dataset provided by
    National Institutes of Health
    Description

    A number of procedures for normalization and detection of differentially expressed genes have been proposed. Four different normalization methods and all possible combinations with three different statistical algorithms have been used for detection of differentially expressed genes on a dataset. The number of genes detected as differentially expressed differs by a factor of about three.

  3. The sensitivity of transcriptomics BMD modeling to the methods used for...

    • catalog.data.gov
    Updated Aug 20, 2021
    Cite
    U.S. EPA Office of Research and Development (ORD) (2021). The sensitivity of transcriptomics BMD modeling to the methods used for microarray data normalization [Dataset]. https://catalog.data.gov/dataset/the-sensitivity-of-transcriptomics-bmd-modeling-to-the-methods-used-for-microarray-data-no
    Explore at:
    Dataset updated
    Aug 20, 2021
    Dataset provided by
    United States Environmental Protection Agency (http://www.epa.gov/)
    Description

    This dataset is a project file generated by BMDExpress 2.2 SW (Sciome, Research Triangle Park, NC). It contains gene expression data for livers of rats exposed to 4 chemicals (crude MCHM, neat MCHM, DMPT, p-toluidine) and kidneys of rats exposed to PPH. The project file includes normalized expression data (GeneChip Rat 230 2.0 Array) using 7 different pre-processing methods (RMA, GCRMA, MAS5.0, MAS5.0_noA calls, PLIER, PLIER16, and PLIER16_noA calls); differentially expressed probe-sets detected by William's method (p<0.05, and minimum fold change of 1.5); probeset-level and pathway-level BMD and BMDL values from transcriptomic dose-response modeling. This dataset is associated with the following publication: Mezencev, R., and S. Auerbach. The sensitivity of transcriptomics BMD modeling to the methods used for microarray data normalization. PLOS ONE. Public Library of Science, San Francisco, CA, USA, 15(5): e0232955, (2020).

  4. Data_Sheet_1_NormExpression: An R Package to Normalize Gene Expression Data...

    • frontiersin.figshare.com
    application/cdfv2
    Updated Jun 1, 2023
    + more versions
    Cite
    Zhenfeng Wu; Weixiang Liu; Xiufeng Jin; Haishuo Ji; Hua Wang; Gustavo Glusman; Max Robinson; Lin Liu; Jishou Ruan; Shan Gao (2023). Data_Sheet_1_NormExpression: An R Package to Normalize Gene Expression Data Using Evaluated Methods.doc [Dataset]. http://doi.org/10.3389/fgene.2019.00400.s001
    Explore at:
    Available download formats: application/cdfv2
    Dataset updated
    Jun 1, 2023
    Dataset provided by
    Frontiers
    Authors
    Zhenfeng Wu; Weixiang Liu; Xiufeng Jin; Haishuo Ji; Hua Wang; Gustavo Glusman; Max Robinson; Lin Liu; Jishou Ruan; Shan Gao
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Data normalization is a crucial step in gene expression analysis, as it ensures the validity of downstream analyses. Although many metrics have been designed to evaluate the existing normalization methods, different metrics, or different datasets evaluated by the same metric, yield inconsistent results, particularly for single-cell RNA sequencing (scRNA-seq) data. In the worst case, a method evaluated as the best by one metric is evaluated as the poorest by another metric, or a method evaluated as the best using one dataset is evaluated as the poorest using another dataset. This raises an open question: principles need to be established to guide the evaluation of normalization methods. In this study, we propose a principle that a normalization method evaluated as the best by one metric should also be evaluated as the best by another metric (the consistency of metrics), and a method evaluated as the best using scRNA-seq data should also be evaluated as the best using bulk RNA-seq data or microarray data (the consistency of datasets). We then designed a new metric named Area Under normalized CV threshold Curve (AUCVC) and applied it together with another metric, mSCC, to evaluate 14 commonly used normalization methods using both scRNA-seq data and bulk RNA-seq data, satisfying the consistency of metrics and the consistency of datasets. Our findings pave the way for future studies on the normalization of gene expression data and its evaluation. The raw gene expression data, normalization methods, and evaluation metrics used in this study have been included in an R package named NormExpression. NormExpression provides a framework and a fast and simple way for researchers to select the best method for normalizing their gene expression data, based on the evaluation of different methods (particularly some data-driven methods or their own methods) under the principle of the consistency of metrics and the consistency of datasets.

  5. Normalization and analysis of DNA microarray data by self-consistency and...

    • healthdata.gov
    csv, xlsx, xml
    Updated Sep 10, 2025
    Cite
    (2025). Normalization and analysis of DNA microarray data by self-consistency and local regression - exke-snpu - Archive Repository [Dataset]. https://healthdata.gov/dataset/Normalization-and-analysis-of-DNA-microarray-data-/3y33-7tbf
    Explore at:
    Available download formats: csv, xml, xlsx
    Dataset updated
    Sep 10, 2025
    Description

    This dataset tracks the updates made on the dataset "Normalization and analysis of DNA microarray data by self-consistency and local regression" as a repository for previous versions of the data and metadata.

  6. GSE65194 Data Normalization and Subtype Analysis

    • kaggle.com
    zip
    Updated Nov 29, 2025
    Cite
    Dr. Nagendra (2025). GSE65194 Data Normalization and Subtype Analysis [Dataset]. https://www.kaggle.com/datasets/mannekuntanagendra/gse65194-data-normalization-and-subtype-analysis
    Explore at:
    Available download formats: zip (54989436 bytes)
    Dataset updated
    Nov 29, 2025
    Authors
    Dr. Nagendra
    License

    MIT License: https://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    Raw and preprocessed microarray expression data from the GSE65194 cohort.

    Includes samples from triple-negative breast cancer (TNBC), other breast cancer subtypes, and normal breast tissues.

    Expression profiles generated using the “Affymetrix Human Genome U133 Plus 2.0 Array (GPL570)” platform.

    Provides normalized gene expression values suitable for downstream analyses such as differential expression, subtype classification, and clustering.

    Supports the identification of differentially expressed genes (DEGs) between TNBC, non-TNBC subtypes, and normal tissue.

    Useful for transcriptomic analyses in breast cancer research, including subtype analysis, biomarker discovery, and comparative studies.

  7. GSE206848 Data Normalization and Subtype Analysis

    • kaggle.com
    zip
    Updated Nov 29, 2025
    Cite
    Dr. Nagendra (2025). GSE206848 Data Normalization and Subtype Analysis [Dataset]. https://www.kaggle.com/datasets/mannekuntanagendra/gse206848-data-normalization-and-subtype-analysis
    Explore at:
    Available download formats: zip (2631363 bytes)
    Dataset updated
    Nov 29, 2025
    Authors
    Dr. Nagendra
    License

    MIT License: https://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    Dataset for human osteoarthritis (OA): microarray gene expression (Affymetrix GPL570).

    Contains expression data for 7 healthy control (normal) tissue samples and 7 osteoarthritis patient tissue samples from synovial / joint tissue.

    Pre-processed for normalization (background correction, log-transformation, normalization) to remove technical variation.

    Suitable for downstream analyses: differential gene expression (normal vs OA), subtype- or phenotype-based classification, machine learning.

    Can act as a validation dataset when combining with other GEO datasets to increase sample size or test reproducibility.

    Useful for biomarker discovery, pathway enrichment analysis (e.g., GO, KEGG), immune infiltration analysis, and subtype analysis in osteoarthritis research.

  8. Data from: A new non-linear normalization method for reducing variability in...

    • catalog.data.gov
    • odgavaprod.ogopendata.com
    • +2 more
    Updated Sep 7, 2025
    + more versions
    Cite
    National Institutes of Health (2025). A new non-linear normalization method for reducing variability in DNA microarray experiments [Dataset]. https://catalog.data.gov/dataset/a-new-non-linear-normalization-method-for-reducing-variability-in-dna-microarray-experimen
    Explore at:
    Dataset updated
    Sep 7, 2025
    Dataset provided by
    National Institutes of Health
    Description

    A simple and robust non-linear method is presented for normalization using array signal distribution analysis and cubic splines. Both the regression and spline-based methods described performed better than existing linear methods when assessed on the variability of replicate arrays.

  9. Cross-platform comparability of microarray data

    • omicsdi.org
    Updated Jun 10, 2010
    Cite
    (2010). Cross-platform comparability of microarray data [Dataset]. https://www.omicsdi.org/dataset/biostudies/E-GEOD-2458
    Explore at:
    Dataset updated
    Jun 10, 2010
    Variables measured
    Unknown
    Description

    To facilitate collaborative research efforts between multi-investigator teams using DNA microarrays, we identified sources of error and data variability between laboratories and across microarray platforms and methods to accommodate this variability. RNA expression data were generated in seven laboratories, comparing two standard RNA samples using twelve microarray platforms. At least two standard microarray types (one spotted, one commercial) were used by all laboratories. Reproducibility for most platforms within any laboratory was typically good, but reproducibility between platforms and across laboratories was generally poor. Reproducibility between laboratories dramatically increased when standardized protocols were implemented for RNA labeling, hybridization, microarray processing, data acquisition and data normalization. Nonetheless, concordance could be found across different laboratories and platforms when data were analyzed in terms of enriched Gene Ontology categories. These findings indicate that microarray results generated by multiple sites and platforms can be comparable, and that multi-investigator teams will maximize data comparability by adopting a common platform and a common set of procedures to generate compatible data. Keywords: other

  10. Profound effect of normalization on detection of differentially expressed...

    • healthdata.gov
    csv, xlsx, xml
    Updated Sep 10, 2025
    Cite
    (2025). Profound effect of normalization on detection of differentially expressed genes in oligonucleotide microarray data analysis - her4-nexh - Archive Repository [Dataset]. https://healthdata.gov/dataset/Profound-effect-of-normalization-on-detection-of-d/8jd2-z2ui
    Explore at:
    Available download formats: csv, xlsx, xml
    Dataset updated
    Sep 10, 2025
    Description

    This dataset tracks the updates made on the dataset "Profound effect of normalization on detection of differentially expressed genes in oligonucleotide microarray data analysis" as a repository for previous versions of the data and metadata.

  11. Novel R Pipeline for Analyzing Biolog Phenotypic Microarray Data

    • plos.figshare.com
    • data.niaid.nih.gov
    • +3 more
    pdf
    Updated Jun 5, 2023
    Cite
    Minna Vehkala; Mikhail Shubin; Thomas R Connor; Nicholas R Thomson; Jukka Corander (2023). Novel R Pipeline for Analyzing Biolog Phenotypic Microarray Data [Dataset]. http://doi.org/10.1371/journal.pone.0118392
    Explore at:
    Available download formats: pdf
    Dataset updated
    Jun 5, 2023
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Minna Vehkala; Mikhail Shubin; Thomas R Connor; Nicholas R Thomson; Jukka Corander
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Data produced by Biolog Phenotype MicroArrays are longitudinal measurements of cells’ respiration on distinct substrates. We introduce a three-step pipeline to analyze phenotypic microarray data with novel procedures for grouping, normalization and effect identification. Grouping and normalization are standard problems in the analysis of phenotype microarrays defined as categorizing bacterial responses into active and non-active, and removing systematic errors from the experimental data, respectively. We expand existing solutions by introducing an important assumption that active and non-active bacteria manifest completely different metabolism and thus should be treated separately. Effect identification, in turn, provides new insights into detecting differing respiration patterns between experimental conditions, e.g. between different combinations of strains and temperatures, as not only the main effects but also their interactions can be evaluated. In the effect identification, the multilevel data are effectively processed by a hierarchical model in the Bayesian framework. The pipeline is tested on a data set of 12 phenotypic plates with bacterium Yersinia enterocolitica. Our pipeline is implemented in R language on the top of opm R package and is freely available for research purposes.

  12. DGE GO Enrichment Analysis Microarray Data GDS2778

    • kaggle.com
    zip
    Updated Nov 29, 2025
    Cite
    Dr. Nagendra (2025). DGE GO Enrichment Analysis Microarray Data GDS2778 [Dataset]. https://www.kaggle.com/datasets/mannekuntanagendra/dge-go-enrichment-analysis-microarray-data-gds2778
    Explore at:
    Available download formats: zip (6820264 bytes)
    Dataset updated
    Nov 29, 2025
    Authors
    Dr. Nagendra
    License

    MIT License: https://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    This dataset is based on National Center for Biotechnology Information (NCBI) Gene Expression Omnibus (GEO) DataSet accession GDS2778.

    The dataset originates from a microarray experiment measuring global gene expression under specific experimental conditions.

    Raw and processed expression data (for all probes/genes) are included, enabling downstream analysis such as normalization, differential expression, and clustering.

    The dataset has been used to perform differential gene expression (DGE) analysis to identify genes that are up- or down-regulated under the experimental condition compared to control.

    Data processing steps typically include normalization (e.g., log-transformation), quality control, probe-to-gene mapping, and statistical testing for significance (e.g., using packages such as limma or other DGE tools).

    Resulting differentially expressed genes (DEGs) include statistics such as log fold change (logFC), adjusted p‑values (adj.P.Val), and possibly other metrics (e.g., B-statistic), allowing assessment of both magnitude and significance of changes.

    The dataset also includes a visualization file (heatmap image) that displays expression patterns of DEGs (or top variable genes) across samples — enabling clustering and pattern recognition across samples and genes.

    The heatmap helps illustrate sample-wise and gene-wise expression variation: clustering groups together samples (e.g. control vs treatment) and genes with similar expression dynamics.

    This dataset is suitable for further bioinformatics analysis: e.g. functional enrichment (GO/Pathway), co‑expression analysis, gene signature identification, or integration with other datasets.

    Users who download this dataset can reproduce or extend analyses, such as re-normalization, alternative clustering, custom DEG thresholds, or downstream biological interpretation (pathway, network analysis).

  13. Data from: hemaClass.org: Online One-By-One Microarray Normalization and...

    • datasetcatalog.nlm.nih.gov
    • figshare.com
    Updated Oct 5, 2016
    Cite
    Young, Ken H.; Nielsen, Kasper Lindblad; Bilgrau, Anders Ellern; Brøndum, Rasmus Froberg; Johnsen, Hans Erik; El-Galaly, Tarec Christoffer; Have, Jonas; Dybkær, Karen; Schmitz, Alexander; Bøgsted, Martin; Falgreen, Steffen; Jakobsen, Lasse Hjort; Bødker, Julie Støve (2016). hemaClass.org: Online One-By-One Microarray Normalization and Classification of Hematological Cancers for Precision Medicine [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0001602916
    Explore at:
    Dataset updated
    Oct 5, 2016
    Authors
    Young, Ken H.; Nielsen, Kasper Lindblad; Bilgrau, Anders Ellern; Brøndum, Rasmus Froberg; Johnsen, Hans Erik; El-Galaly, Tarec Christoffer; Have, Jonas; Dybkær, Karen; Schmitz, Alexander; Bøgsted, Martin; Falgreen, Steffen; Jakobsen, Lasse Hjort; Bødker, Julie Støve
    Description

    Background: Dozens of omics-based cancer classification systems have been introduced with prognostic, diagnostic, and predictive capabilities. However, they often employ complex algorithms and are only applicable to whole cohorts of patients, making them difficult to apply in a personalized clinical setting. Results: This prompted us to create hemaClass.org, an online web application providing an easy interface to one-by-one RMA normalization of microarrays and subsequent risk classifications of diffuse large B-cell lymphoma (DLBCL) into cell-of-origin and chemotherapeutic sensitivity classes. Classification results for one-by-one array pre-processing with and without a laboratory-specific RMA reference dataset were compared to cohort-based classifiers in 4 publicly available datasets. Classifications showed high agreement between one-by-one and whole-cohort pre-processed data when a laboratory-specific reference set was supplied. The website is essentially the R package hemaClass accompanied by a Shiny web application. The well-documented package can be used to run the website locally or to use the developed methods programmatically. Conclusions: The website and R package are relevant for biological and clinical lymphoma researchers using Affymetrix U-133 Plus 2 arrays, as they provide reliable and swift methods for calculation of disease subclasses. The proposed one-by-one pre-processing method is relevant for all researchers using microarrays.

  14. Additional file 4: of DBNorm: normalizing high-density oligonucleotide...

    • springernature.figshare.com
    txt
    Updated May 30, 2023
    Cite
    Qinxue Meng; Daniel Catchpoole; David Skillicorn; Paul Kennedy (2023). Additional file 4: of DBNorm: normalizing high-density oligonucleotide microarray data based on distributions [Dataset]. http://doi.org/10.6084/m9.figshare.5648956.v1
    Explore at:
    Available download formats: txt
    Dataset updated
    May 30, 2023
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Qinxue Meng; Daniel Catchpoole; David Skillicorn; Paul Kennedy
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    DBNorm installation. Describes how to install DBNorm via devtools in R. (TXT 4 kb)

  15. GEOExpressionMatrixHandling Normalization-GSE57691

    • kaggle.com
    zip
    Updated Nov 29, 2025
    Cite
    Dr. Nagendra (2025). GEOExpressionMatrixHandling Normalization-GSE57691 [Dataset]. https://www.kaggle.com/datasets/mannekuntanagendra/geoexpressionmatrixhandling-normalization-gse57691
    Explore at:
    Available download formats: zip (31247125 bytes)
    Dataset updated
    Nov 29, 2025
    Authors
    Dr. Nagendra
    License

    MIT License: https://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    GEO series: GSE57691 microarray dataset from human arterial tissue samples.

    Platform: Illumina HumanHT-12 V4.0 expression beadchip (GPL10558).

    Contains gene expression data for 59 samples: 49 samples from patients with abdominal aortic aneurysm (AAA) and 10 healthy control arterial-wall samples.

    The raw / probe-level data from GEO have been processed: probes annotated to official gene symbols; duplicate probes collapsed by median (or similar) expression.

    Data have been normalized and batch-corrected to produce a clean gene × sample expression matrix for downstream analyses.

    This dataset is suitable for differential expression analysis, co-expression network analysis (e.g. WGCNA), immune infiltration analysis, or other transcriptomic investigations of vascular disease (AAA) vs normal.
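
    The probe-collapsing step mentioned in this description (several probes mapped to one gene symbol, reduced by the median) can be illustrated with a minimal sketch. The probe IDs, gene symbols, and the function name `collapse_probes_by_median` are hypothetical and chosen only for illustration; the listing does not say how the dataset author actually implemented this step.

```python
from collections import defaultdict
from statistics import median


def collapse_probes_by_median(probe_rows, probe_to_gene):
    """Collapse probe-level rows into gene-level rows using the per-sample median.

    probe_rows:    dict of probe ID -> list of expression values (one per sample)
    probe_to_gene: dict of probe ID -> gene symbol; unannotated probes are skipped
    """
    grouped = defaultdict(list)
    for probe_id, values in probe_rows.items():
        gene = probe_to_gene.get(probe_id)
        if gene:
            grouped[gene].append(values)

    collapsed = {}
    for gene, rows in grouped.items():
        # zip(*rows) walks the samples column by column across all probes of a gene
        collapsed[gene] = [median(column) for column in zip(*rows)]
    return collapsed


# Hypothetical toy input: two probes for GENE1, one for GENE2, one unannotated
probe_rows = {
    "probe_a": [8.0, 9.0],
    "probe_b": [10.0, 7.0],
    "probe_c": [6.0, 6.5],
    "probe_d": [5.0, 5.5],
}
probe_to_gene = {"probe_a": "GENE1", "probe_b": "GENE1", "probe_c": "GENE2"}

gene_matrix = collapse_probes_by_median(probe_rows, probe_to_gene)
```

    Other summaries (mean, maximum, most variable probe) are also used in practice; the median is shown here only because the description names it.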

  16. A comparison of per sample global scaling and per gene normalization methods...

    • plos.figshare.com
    pdf
    Updated Jun 5, 2023
    Cite
    Xiaohong Li; Guy N. Brock; Eric C. Rouchka; Nigel G. F. Cooper; Dongfeng Wu; Timothy E. O’Toole; Ryan S. Gill; Abdallah M. Eteleeb; Liz O’Brien; Shesh N. Rai (2023). A comparison of per sample global scaling and per gene normalization methods for differential expression analysis of RNA-seq data [Dataset]. http://doi.org/10.1371/journal.pone.0176185
    Explore at:
    Available download formats: pdf
    Dataset updated
    Jun 5, 2023
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Xiaohong Li; Guy N. Brock; Eric C. Rouchka; Nigel G. F. Cooper; Dongfeng Wu; Timothy E. O’Toole; Ryan S. Gill; Abdallah M. Eteleeb; Liz O’Brien; Shesh N. Rai
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Normalization is an essential step with considerable impact on high-throughput RNA sequencing (RNA-seq) data analysis. Although there are numerous methods for read count normalization, it remains a challenge to choose an optimal method due to multiple factors contributing to read count variability that affects the overall sensitivity and specificity. In order to properly determine the most appropriate normalization methods, it is critical to compare the performance and shortcomings of a representative set of normalization routines based on different dataset characteristics. Therefore, we set out to evaluate the performance of the commonly used methods (DESeq, TMM-edgeR, FPKM-CuffDiff, TC, Med UQ and FQ) and two new methods we propose: Med-pgQ2 and UQ-pgQ2 (per-gene normalization after per-sample median or upper-quartile global scaling). Our per-gene normalization approach allows for comparisons between conditions based on similar count levels. Using the benchmark Microarray Quality Control Project (MAQC) and simulated datasets, we performed differential gene expression analysis to evaluate these methods. When evaluating MAQC2 with two replicates, we observed that Med-pgQ2 and UQ-pgQ2 achieved a slightly higher area under the Receiver Operating Characteristic Curve (AUC), a specificity rate > 85%, the detection power > 92% and an actual false discovery rate (FDR) under 0.06 given the nominal FDR (≤0.05). Although the top commonly used methods (DESeq and TMM-edgeR) yield a higher power (>93%) for MAQC2 data, they trade off with a reduced specificity (
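
    As a rough illustration of the per-sample global scaling that this description refers to, the sketch below applies a generic upper-quartile scaling to read counts. It is not the authors' Med-pgQ2 or UQ-pgQ2 procedure, and it omits the per-gene step entirely. The function name, the sample names, the choice to exclude zero counts, and the rescaling to the mean upper quartile are all assumptions made for this example.

```python
from statistics import mean, quantiles


def upper_quartile_scale(counts_by_sample):
    """Scale each sample so that its upper quartile of non-zero counts matches
    the mean upper quartile across samples (a generic global-scaling sketch)."""
    upper_quartiles = {}
    for sample, counts in counts_by_sample.items():
        nonzero = [c for c in counts if c > 0]
        # quantiles(..., n=4) returns the three quartile cut points; index 2 is Q3
        upper_quartiles[sample] = quantiles(nonzero, n=4)[2]

    target = mean(list(upper_quartiles.values()))
    return {
        sample: [c * target / upper_quartiles[sample] for c in counts]
        for sample, counts in counts_by_sample.items()
    }


# Hypothetical counts: sample_b was sequenced at twice the depth of sample_a
counts = {
    "sample_a": [0, 10, 20, 30, 40, 50],
    "sample_b": [0, 20, 40, 60, 80, 100],
}
scaled = upper_quartile_scale(counts)
```

    In this toy input the two samples differ only by sequencing depth, so the scaled profiles coincide; real data will not line up this cleanly.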

  17. GEO ExpressionMatrixHandlingNormalization GSE32138

    • kaggle.com
    zip
    Updated Nov 29, 2025
    Cite
    Dr. Nagendra (2025). GEO ExpressionMatrixHandlingNormalization GSE32138 [Dataset]. https://www.kaggle.com/datasets/mannekuntanagendra/geo-expressionmatrixhandlingnormalization-gse32138
    Explore at:
    Available download formats: zip (8536153 bytes)
    Dataset updated
    Nov 29, 2025
    Authors
    Dr. Nagendra
    License

    MIT License: https://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    • This dataset contains expression matrix handling and normalization results derived from GEO dataset GSE32138.
    • It includes raw gene expression values processed using standardized bioinformatics workflows.
    • The dataset demonstrates quantile normalization applied to microarray-based expression data.
    • It provides visualization outputs used to assess data distribution before and after normalization.
    • The goal of this dataset is to support reproducible analysis of GSE32138 preprocessing and quality control.
    • Researchers can use the files for practice in normalization, exploratory data analysis, and visualization.
    • This dataset is useful for learning microarray preprocessing techniques in R or Python.
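
    Quantile normalization, as named in this description, forces every sample to share the same value distribution. The sketch below is a minimal pure-Python version under simplifying assumptions: there are no missing values, and ties are broken by row order rather than averaged, which differs from some library implementations. The function name and the toy matrix are hypothetical, not taken from the dataset.

```python
def quantile_normalize(matrix):
    """Quantile-normalize a gene x sample matrix given as a list of rows.

    Each sample (column) is replaced by the reference distribution, i.e. the
    mean of the sorted columns, assigned back according to each value's rank.
    """
    n_genes = len(matrix)
    n_samples = len(matrix[0])
    columns = [[row[j] for row in matrix] for j in range(n_samples)]

    # Reference distribution: mean across samples at each rank
    sorted_columns = [sorted(col) for col in columns]
    reference = [
        sum(col[rank] for col in sorted_columns) / n_samples
        for rank in range(n_genes)
    ]

    normalized = [[0.0] * n_samples for _ in range(n_genes)]
    for j, col in enumerate(columns):
        # Gene indices ordered from smallest to largest value in this sample
        order = sorted(range(n_genes), key=lambda i: col[i])
        for rank, gene_index in enumerate(order):
            normalized[gene_index][j] = reference[rank]
    return normalized


# Hypothetical toy matrix: 4 genes (rows) x 3 samples (columns)
expression = [
    [5.0, 4.0, 3.0],
    [2.0, 1.0, 4.0],
    [3.0, 3.0, 6.0],
    [4.0, 2.0, 8.0],
]
normalized = quantile_normalize(expression)
```

    After normalization every column contains the same set of values, so boxplots of the samples line up; only the ranking of genes within each sample is preserved.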

  18. Additional file 3: of DBNorm: normalizing high-density oligonucleotide...

    • springernature.figshare.com
    txt
    Updated Nov 30, 2017
    Cite
    Qinxue Meng; Daniel Catchpoole; David Skillicorn; Paul Kennedy (2017). Additional file 3: of DBNorm: normalizing high-density oligonucleotide microarray data based on distributions [Dataset]. http://doi.org/10.6084/m9.figshare.5648932.v1
    Explore at:
    Available download formats: txt
    Dataset updated
    Nov 30, 2017
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Qinxue Meng; Daniel Catchpoole; David Skillicorn; Paul Kennedy
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    DBNorm test script. Code showing how we test the DBNorm package. (TXT 2 kb)

  19. Data Preprocessing EDA Microarray GE Data GSE5583

    • kaggle.com
    zip
    Updated Nov 29, 2025
    Cite
    Dr. Nagendra (2025). Data Preprocessing EDA Microarray GE Data GSE5583 [Dataset]. https://www.kaggle.com/datasets/mannekuntanagendra/data-preprocessing-eda-microarray-ge-data-gse5583
    Explore at:
    Available download formats: zip (3144708 bytes)
    Dataset updated
    Nov 29, 2025
    Authors
    Dr. Nagendra
    License

    MIT License: https://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    This dataset is based on GEO series GSE5583.

    The experiment compares gene expression profiles between wild‑type mouse embryonic stem cells (ES cells) and ES cells in which Histone deacetylase 1 (HDAC1) has been knocked out.

    The organism used is mouse (Mus musculus).

    Microarray technology was employed to measure transcript abundance across the genome, aiming to identify putative HDAC1 target genes.

    The dataset includes processed expression data (after normalization and log2 transformation), allowing for downstream exploratory data analysis (EDA) and differential gene expression (DGE) analysis.

    As part of EDA, sample‑wise distribution plots (e.g. boxplots) are provided to assess normalization across all arrays.

    The dataset also includes downstream visualizations and analysis results, such as boxplots, which help in evaluating the consistency and quality of the processed data.

    Researchers can use this dataset to perform differential expression analysis between HDAC1 knockout vs wild‑type ES cells, investigate epigenetic regulation, or explore downstream effects of histone deacetylation loss.

    Additionally, the dataset can serve as a reference example for microarray data preprocessing, normalization, transformation (e.g. log2), and exploratory visualization workflows.

    The dataset is publicly available and sourced from a trusted repository (GEO), ensuring transparency and reproducibility of the experiment.

  20. A new non-linear normalization method for reducing variability in DNA...

    • healthdata.gov
    csv, xlsx, xml
    Updated Sep 10, 2025
    Cite
    (2025). A new non-linear normalization method for reducing variability in DNA microarray experiments - qasd-zvnh - Archive Repository [Dataset]. https://healthdata.gov/dataset/A-new-non-linear-normalization-method-for-reducing/gbaz-64pm
    Explore at:
    Available download formats: csv, xml, xlsx
    Dataset updated
    Sep 10, 2025
    Description

    This dataset tracks the updates made on the dataset "A new non-linear normalization method for reducing variability in DNA microarray experiments" as a repository for previous versions of the data and metadata.

