100+ datasets found
  1. The CO-Regulation Database (CORD): A Tool to Identify Coordinately Expressed...

    • plos.figshare.com
    txt
    Updated May 31, 2023
    Cite
    John P. Fahrenbach; Jorge Andrade; Elizabeth M. McNally (2023). The CO-Regulation Database (CORD): A Tool to Identify Coordinately Expressed Genes [Dataset]. http://doi.org/10.1371/journal.pone.0090408
    Available download formats
    txt
    Dataset updated
    May 31, 2023
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    John P. Fahrenbach; Jorge Andrade; Elizabeth M. McNally
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Background

    Meta-analysis of gene expression array databases has the potential to reveal information about gene function. The identification of gene-gene interactions may be inferred from gene expression information but such meta-analysis is often limited to a single microarray platform. To address this limitation, we developed a gene-centered approach to analyze differential expression across thousands of gene expression experiments and created the CO-Regulation Database (CORD) to determine which genes are correlated with a queried gene.

    Results

    Using the GEO and ArrayExpress databases, we analyzed over 120,000 group by group experiments from gene microarrays to determine the correlating genes for over 30,000 different genes or hypothesized genes. CORD output data is presented for sample queries with focus on genes with well-known interaction networks including p16 (CDKN2A), vimentin (VIM), and MyoD (MYOD1). CDKN2A, VIM, and MYOD1 all displayed gene correlations consistent with known interacting genes.

    Conclusions

    We developed a facile, web-enabled program to determine gene-gene correlations across different gene expression microarray platforms. Using well-characterized genes, we illustrate how CORD's identification of co-expressed genes contributes to a better understanding of a gene's potential function. The website is found at http://cord-db.org.

  2. Gene expression data sources for in silico approach to assessing activation...

    • springernature.figshare.com
    application/gzip
    Updated May 31, 2023
    Cite
    Sylvain Brohee; Amir Sonnenblick; David Venet (2023). Gene expression data sources for in silico approach to assessing activation of AKT/mTOR signalling pathway in ER-positive early Breast Cancer [Dataset]. http://doi.org/10.6084/m9.figshare.7461776.v1
    Available download formats
    application/gzip
    Dataset updated
    May 31, 2023
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Sylvain Brohee; Amir Sonnenblick; David Venet
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    This dataset contains data files and identifiers for original data sources for 39 gene expression datasets from over 7,000 individuals with estrogen receptor positive (ER-positive) Breast Cancer (BC).

    Background

    The related study developed a novel in silico approach to assess activation of different signalling pathways. The phosphatidylinositol 3-kinase (PI3K)/AKT/mTOR signalling pathway mediates key cellular functions, including growth, proliferation and survival and is frequently involved in carcinogenesis, tumor progression and metastases. This research seeks to target relative contribution of AKT and mTOR (downstream of PI3K) in BC outcomes using the in silico approach via integrated reverse phase protein array (RPPA) and matched gene expression.

    Methods and sample size

    The methodology includes the development of gene signatures that reflect level of expression of pAKT and p-mTOR separately. Pooled analysis of gene expression data from over 7,000 patients with ER-positive BC was then performed. This data record holds links to the repositories holding these data, as well as the R-data files for each data record used in the analysis. All gene signatures developed are captured in Supplementary Data Sonnenblick.pdf.xlsx

    Data sources

    The dataset name, relevant DOI, accession number or access requirements are listed alongside the file type and repository name or other source where applicable.

    GEO = Gene Expression Omnibus
    EGA = European Genome-phenome Archive

    This data table is available to download as NPJBCANCER-00304R1-data-sources.xlsx including more detailed information and web urls to each data source. data_db.tab contains more detailed technical metadata for each data source.

    Dataset | Data location | Permanent identifier/url
    NKI | CCB NKI | http://ccb.nki.nl/data/van-t-Veer_Nature_2002/
    UCSF | GEO | GSE123833
    STNO2 | GEO | GSE4335
    NCI | Research Article (Supplementary files) | 10.1073/pnas.1732912100
    UNC4 | GEO | GSE18229
    CAL | Array Express | E-TABM-158
    MDA4 | GEO | GSE123832
    KOO | GEO | GSE123831
    HLP | Array Express | E-TABM-543
    EXPO | GEO | GSE2109
    VDX | GEO | GSE2034/GSE5327
    MSK | GEO | GSE2603
    UPP | GEO | GSE3494
    STK | GEO | GSE1456
    UNT | GEO | GSE2990
    DUKE | GEO | GSE3143
    TRANSBIG | GEO | GSE7390
    DUKE2 | GEO | GSE6961
    MAINZ | GEO | GSE11121
    LUND2 | GEO | GSE5325
    LUND | GEO | GSE5325
    FNCLCC | GEO | GSE7017
    EMC2 | GEO | GSE12276
    MUG | GEO | GSE10510
    NCCS | GEO | GSE5364
    MCCC | GEO | GSE19177
    EORTC10994 | GEO | GSE1561
    DFHCC | GEO | GSE19615
    DFHCC2 | GEO | GSE18864
    DFHCC3 | GEO | GSE3744
    DFHCC4 | GEO | GSE5460
    MAQC2 | GEO | GSE20194
    TAM | GEO | GSE6532/GSE9195
    MDA5 | GEO | GSE17705
    VDX3 | GEO | GSE12093
    METABRIC | EGA | EGAS00000000083
    TCGA | TCGA | https://tcga-data.nci.nih.gov/docs/publications/brca_2012/
    DNA methylation (Dedeurwaerder et al. 2011) | GEO | https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE20713
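The GEO series accessions listed in the data-source table can be turned into landing-page URLs by reusing the `acc.cgi?acc=` pattern that appears in the table's last row. The following is a minimal Python sketch under that assumption; the helper name `geo_urls` is hypothetical, and the sketch only recognizes `GSE` series accessions (ArrayExpress and EGA identifiers are ignored).

```python
import re

# URL prefix copied from the GSE20713 link in the data-source table.
GEO_BASE = "https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc="


def geo_urls(identifier):
    """Return one GEO landing-page URL per GSE accession found in `identifier`.

    Combined entries such as "GSE2034/GSE5327" yield two URLs; identifiers
    without a GSE accession (for example "E-TABM-158") yield an empty list.
    """
    accessions = re.findall(r"GSE\d+", identifier)
    return [GEO_BASE + accession for accession in accessions]


if __name__ == "__main__":
    for entry in ["GSE123833", "GSE2034/GSE5327", "E-TABM-158"]:
        print(entry, "->", geo_urls(entry))
```

Splitting on the accession pattern rather than on "/" keeps the helper tolerant of entries that mix accessions with other text.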

  3. Data from: Gene Expression Omnibus (GEO)

    • catalog.data.gov
    • data.virginia.gov
    Updated Jul 26, 2023
    Cite
    National Institutes of Health (NIH) (2023). Gene Expression Omnibus (GEO) [Dataset]. https://catalog.data.gov/dataset/gene-expression-omnibus-geo
    Dataset updated
    Jul 26, 2023
    Dataset provided by
    National Institutes of Health (NIH)
    Description

    Gene Expression Omnibus is a public functional genomics data repository supporting MIAME-compliant submissions of array- and sequence-based data. Tools are provided to help users query and download experiments and curated gene expression profiles.

  4. PD991 Gene Expression Arrays

    • data.niaid.nih.gov
    Updated Dec 26, 2023
    Cite
    Hoog JW (2023). PD991 Gene Expression Arrays [Dataset]. https://data.niaid.nih.gov/resources?id=gse93204
    Dataset updated
    Dec 26, 2023
    Dataset provided by
    Washington University
    Authors
    Hoog JW
    Description

    Agilent gene expression arrays were used for intrinsic subtyping and to measure changes after anastrozole (C1D1) and after combination of anastrozole and palbociclib (C1D15). 118 breast cancer patient samples (baseline, n=32; C1D1, n=33; C1D15, n=29; surgery, n=24) from clinical trial NCT01723774 were arrayed, subtyped, and used to detect changes with treatment.

  5. Top 20 Genes Co-expressed with vimentin (VIM) identified by CORD.

    • plos.figshare.com
    xls
    Updated Jun 1, 2023
    Cite
    John P. Fahrenbach; Jorge Andrade; Elizabeth M. McNally (2023). Top 20 Genes Co-expressed with vimentin (VIM) identified by CORD. [Dataset]. http://doi.org/10.1371/journal.pone.0090408.t001
    Available download formats
    xls
    Dataset updated
    Jun 1, 2023
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    John P. Fahrenbach; Jorge Andrade; Elizabeth M. McNally
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Top 20 Genes Co-expressed with vimentin (VIM) identified by CORD.

  6. Medulloblastoma omics data

    • kaggle.com
    zip
    Updated Feb 22, 2023
    Cite
    Alexander Chervov (2023). Medulloblastoma omics data [Dataset]. https://www.kaggle.com/alexandervc/medulloblastoma-omics-data
    Available download formats
    zip (2278448493 bytes)
    Dataset updated
    Feb 22, 2023
    Authors
    Alexander Chervov
    Description

    Collection of gene expression and similar datasets related to brain tumors. In particular Medulloblastoma. Medulloblastoma is the most common malignant brain tumor in childhood. Typically csv files genes x samples.

    GSE124814 WOW! Integration of many (all?) medulloblastoma datasets(!): 1641 samples, of which 1350 samples represent primary medulloblastomas and 291 samples represent normal brain

    https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE124814 Weishaupt H, Johansson P, Sundström A, Lubovac-Pilav Z et al. Batch-normalization of cerebellar and medulloblastoma gene expression datasets utilizing empirically defined negative control genes. Bioinformatics 2019 Sep 15;35(18):3357-3364. PMID: 30715209 https://doi.org/10.1093/bioinformatics/btz066 We downloaded a total of 1796 CEL files from previously published GEO or ArrayExpress records: GSE85217(n=763), GSE25219(n=154), GSE60862(n=130), GSE12992(n=40), GSE67850(n=22), GSE10327(n=62), GSE30074(n=30), E-MTAB-292(n=19), GSE74195(n=30), GSE37418(n=76), GSE4036(n=14), GSE62803(n=52), GSE21140(n=103), GSE37382(n=50), GSE22569(n=24), GSE35974(n=50), GSE73038(n=46), GSE50161(n=24), GSE3526(n=9), GSE50765(n=12), GSE49243(n=58), GSE41842(n=19), GSE44971(n=9). After preprocessing of all CEL files, we averaged the expression profiles of samples that mapped to the same patient in a single dataset, producing a final expression array comprising 1641 samples, of which 1350 samples represent primary medulloblastomas and 291 samples represent normal brain (cerebellum/upper rhombic lip). Also discussed in paper: A transcriptome-based classifier to determine molecular subtypes in medulloblastoma https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1008263
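The per-patient averaging step described for GSE124814 (samples that map to the same patient are averaged into one profile) can be sketched in plain Python. This is an illustration only, not the authors' pipeline: the function name `average_by_patient`, the dictionary layout, and the sample and patient identifiers are all hypothetical.

```python
from collections import defaultdict
from statistics import mean


def average_by_patient(profiles, sample_to_patient):
    """Average expression profiles of samples that belong to the same patient.

    profiles:          {sample_id: [expression value per gene]}
    sample_to_patient: {sample_id: patient_id}
    Returns {patient_id: [mean expression value per gene]}.
    """
    grouped = defaultdict(list)
    for sample_id, values in profiles.items():
        grouped[sample_to_patient[sample_id]].append(values)

    # zip(*rows) walks the rows gene by gene, so each gene is averaged
    # across all samples of one patient.
    return {
        patient: [mean(per_gene) for per_gene in zip(*rows)]
        for patient, rows in grouped.items()
    }


if __name__ == "__main__":
    profiles = {"s1": [1.0, 2.0], "s2": [3.0, 4.0], "s3": [5.0, 6.0]}
    mapping = {"s1": "P1", "s2": "P1", "s3": "P2"}
    print(average_by_patient(profiles, mapping))
```

A patient with a single sample keeps that sample's profile unchanged, which matches the idea that averaging only collapses duplicates.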

    GSE85217 (Cavalli ... Taylor) 768 samples, 2016 (Affymetrix Human Gene 1.1 ST Array) Cavalli FMG, Remke M, Rampasek L, Peacock J et al. Intertumoral Heterogeneity within Medulloblastoma Subgroups. Cancer Cell 2017 Jun 12;31(6):737-754.e6. PMID: 28609654 Ramaswamy V, Taylor MD. Bioinformatic Strategies for the Genomic and Epigenomic Characterization of Brain Tumors. Methods Mol Biol 2019;1869:37-56. PMID: 30324512 https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE85217

    GSE202043 (Pomeroy) 214 samples, 2011 (Expression profiling by array) Cho YJ, Tsherniak A, Tamayo P, Santagata S et al. Integrative genomic analysis of medulloblastoma identifies a molecular subgroup that drives poor clinical outcome. J Clin Oncol 2011 Apr 10;29(11):1424-30. PMID: 21098324 https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE202043

    GSE12992 (Fattet ... Delattre) 72 samples, 2009 (Expression profiling by array) Fattet S, Haberler C, Legoix P, Varlet P et al. Beta-catenin status in paediatric medulloblastomas: correlation of immunohistochemical expression with mutational status, genetic profiles, and clinical characteristics. J Pathol 2009 May;218(1):86-94. PMID: 19197950 A series of 72 pediatric medulloblastoma tumors has been studied at the genomic level (array-CGH), screened for CTNNB1 mutations and beta-catenin expression (immunohistochemistry). A subset of 40 tumor samples has been analyzed at the RNA expression level (Affymetrix HG U133 Plus 2.0). https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE12992

    GSE37382 (Northcott ... Taylor) 2012 (Expression profiling by array, Affymetrix Human Gene 1.1 ST Array profiling of 285 primary medulloblastoma samples.) Northcott PA, Shih DJ, Peacock J, Garzia L et al. Subgroup-specific structural variation across 1,000 medulloblastoma genomes. Nature 2012 Aug 2;488(7409):49-56. PMID: 22832581 https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE37382

    GSE10327 (M. Kool) 62 samples, 2008 (Expression profiling by array) (beware: it is sometimes referred to as GSE10237 in the original paper and several references; that accession is an error). Kool M, Koster J, Bunt J, Hasselt NE et al. Integrated genomics identifies five medulloblastoma subtypes with distinct genetic profiles, pathway signatures and clinicopathological features. PLoS One 2008 Aug 28;3(8):e3088. PMID: 18769486 Rack PG, Ni J, Payumo AY, Nguyen V et al. Arhgap36-dependent activation of Gli transcription factors. Proc Natl Acad Sci U S A 2014 Jul 29;111(30):11061-6. PMID: 25024229 https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE10327

    Other datasets (not yet loaded):

    (47.1 Gb, 2012) (Expression profiling by array, Genome variation profiling by SNP array, SNP genotyping by SNP array ) Northcott PA, Shih DJ, Peacock J, Garzia L et al. Subgroup-specific structural variation across 1,000 medulloblastoma genomes. Nature 2012 Aug 2;488(7409):49-56. PMID: 22832581 Here we report somatic copy number aberrations (SCNAs) in 1087 unique medulloblastomas. https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE37385

  7. [HG_U95Av2] Affymetrix Human Genome U95 Version 2 Array

    • ebi.ac.uk
    Updated Jan 25, 2022
    Cite
    (2022). [HG_U95Av2] Affymetrix Human Genome U95 Version 2 Array [Dataset]. https://www.ebi.ac.uk/biostudies/arrayexpress/arrays/A-GEOD-8300
    Dataset updated
    Jan 25, 2022
    Description

    Array Manufacturer: Affymetrix, Distribution: commercial, Technology: in situ oligonucleotide, Affymetrix submissions are typically submitted to GEO using the GEOarchive method described at http://www.ncbi.nlm.nih.gov/projects/geo/info/geo_affy.html Based on this UniGene build and associated annotations, the HG-U95Av2 array represents approximately 10,000 full-length genes.

  8. GDS4399

    • kaggle.com
    zip
    Updated Oct 26, 2025
    Cite
    Bassam165 (2025). GDS4399 [Dataset]. https://www.kaggle.com/datasets/bassam165/gds4399
    Available download formats
    zip (11496559 bytes)
    Dataset updated
    Oct 26, 2025
    Authors
    Bassam165
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    This dataset contains microarray-based gene expression profiles of granulosa cells collected from women diagnosed with Polycystic Ovary Syndrome (PCOS) and from healthy controls. It originates from the NCBI GEO DataSet GDS4399, which was generated to study the molecular mechanisms underlying PCOS pathogenesis and its relationship to insulin resistance, steroidogenesis, and oocyte maturation.

    The data were collected using the Affymetrix Human Genome U133 Plus 2.0 Array (GPL570 platform). Each sample corresponds to an RNA expression profile of granulosa cells isolated from ovarian aspirates of PCOS and non-PCOS women undergoing in-vitro fertilization (IVF).

    Key Details

    Source: Gene Expression Omnibus (GEO), NCBI
    GEO Accession: GDS4399
    Title: Polycystic ovary syndrome: granulosa cells
    Platform: Affymetrix Human Genome U133 Plus 2.0 Array (GPL570)
    Authors: Wood JR, et al. (Original study contributors)
    National Center for Biotechnology Information, U.S. National Library of Medicine.
    Available at: https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GDS4399

    Recommended citation style (IEEE): [1] J. R. Wood et al., “Polycystic ovary syndrome: granulosa cells,” Gene Expression Omnibus (GEO), GDS4399, NCBI, Bethesda, MD, USA. [Online]. Available: https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GDS4399

    License: This dataset is part of the public NCBI GEO database and is distributed under the Public Domain / CC0 License for research and educational use. Please cite the original GEO entry when reusing this dataset.

  9. Differentially Expressed Genes in the Pre-Eclamptic Placenta: A Systematic...

    • plos.figshare.com
    docx
    Updated May 31, 2023
    Cite
    C. Emily Kleinrouweler; Miranda van Uitert; Perry D. Moerland; Carrie Ris-Stalpers; Joris A. M. van der Post; Gijs B. Afink (2023). Differentially Expressed Genes in the Pre-Eclamptic Placenta: A Systematic Review and Meta-Analysis [Dataset]. http://doi.org/10.1371/journal.pone.0068991
    Available download formats
    docx
    Dataset updated
    May 31, 2023
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    C. Emily Kleinrouweler; Miranda van Uitert; Perry D. Moerland; Carrie Ris-Stalpers; Joris A. M. van der Post; Gijs B. Afink
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Objective

    To systematically review the literature on human gene expression data of placental tissue in pre-eclampsia and to characterize a meta-signature of differentially expressed genes in order to identify novel putative diagnostic markers.

    Data Sources

    Medline through 11 February 2011 using MeSH terms and keywords related to placenta, gene expression and gene expression arrays; GEO database using the term “placent*”; and reference lists of eligible primary studies, without constraints.

    Methods

    From 1068 studies retrieved from the search, we included original publications that had performed gene expression array analyses of placental tissue in the third trimester and that reported on differentially expressed genes in pre-eclampsia versus normotensive controls. Two reviewers independently identified eligible studies, extracted descriptive and gene expression data and assessed study quality. Using a vote-counting method based on a comparative meta-profiling algorithm, we determined a meta-signature that characterizes the significant intersection of differentially expressed genes from the collection of independent gene signatures.

    Results

    We identified 33 eligible gene expression array studies of placental tissue in the 3rd trimester comprising 30 datasets on mRNA expression and 4 datasets on microRNA expression. The pre-eclamptic placental meta-signature consisted of 40 annotated gene transcripts and 17 microRNAs. At least half of the mRNA transcripts encode a protein that is secreted from the cell and could potentially serve as a biomarker.

    Conclusions

    In addition to well-known and validated genes, we identified 14 transcripts not reported previously in relation to pre-eclampsia of which the majority is also expressed in the 1st trimester placenta, and three encode a secreted protein.
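The vote-counting idea mentioned in the methods can be illustrated with a small Python sketch. It is a deliberate simplification under stated assumptions: it only counts how many independent signatures report a gene in the same direction and applies a hypothetical `min_votes` threshold. It does not reproduce the significance assessment of the comparative meta-profiling algorithm used in the study, and the gene names and signatures below are made-up placeholders.

```python
from collections import Counter


def vote_count(signatures, min_votes=2):
    """Return (gene, direction) pairs reported by at least `min_votes` signatures.

    signatures: list of dicts mapping gene -> "up" or "down", one dict per
                independent study. `min_votes` is an illustrative threshold,
                not a value taken from the study.
    """
    votes = Counter()
    for signature in signatures:
        for gene, direction in signature.items():
            votes[(gene, direction)] += 1
    return sorted(pair for pair, count in votes.items() if count >= min_votes)


if __name__ == "__main__":
    example = [
        {"GENE_A": "up", "GENE_B": "down"},
        {"GENE_A": "up", "GENE_C": "up"},
        {"GENE_A": "up", "GENE_B": "down"},
    ]
    print(vote_count(example))
```

Counting direction together with the gene keeps a gene that is reported up in one study and down in another from accumulating votes it has not earned.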

  10. Flu vaccinated blood samples

    • kaggle.com
    zip
    Updated Jan 9, 2020
    Cite
    Janis (2020). Flu vaccinated blood samples [Dataset]. https://www.kaggle.com/janiscorona/flu-vaccinated-blood-samples
    Available download formats
    zip (7111137 bytes)
    Dataset updated
    Jan 9, 2020
    Authors
    Janis
    Description

    Context

    No matter how much you wash your hands, you are still susceptible to flu airborne viruses or cold viruses in close proximity to others who have a cold or flu. The flu vaccine is a treatment many folks get in hopes of not getting sick that cold/flu season. The flu vaccine is somewhat of a math cheat sheet for your body preparing for a math course final without having to know all of the formulas off hand, but only the ones that are on the exam. If you have a crooked teacher/TA that decided not to allow the cheat sheet to be a good representation of what the content of the final exam is, then you could assume that is how your body will be with a flu vaccine that doesn't have the strand(s) of flu your body is likely to encounter that flu season. I found this data set munging the GEO database sets of NCBI while searching for 'flu vaccines' and wanted some microarray gene expression data sets that I could also compare those values to other blood micro array samples from separate studies on females using EGCG for obesity, and males who do/don't have heart disease. This data can be blended with the other data sets here or in my github repositories at janjanjan2018.

    Content

    Blood gene expressions of microarray samples.

    Acknowledgements

    NCBI and the GEO grant funded data repositories of gene expression data.

    Inspiration

    Sick people.

  11. Data from: Design and use of multiplexed chemostat arrays

    • data.niaid.nih.gov
    Updated Feb 15, 2018
    Cite
    Miller AW; Befort C; Kerr EO; Dunham MJ (2018). Design and use of multiplexed chemostat arrays [Dataset]. https://data.niaid.nih.gov/resources?id=gse36691
    Dataset updated
    Feb 15, 2018
    Dataset provided by
    University of Washington
    Authors
    Miller AW; Befort C; Kerr EO; Dunham MJ
    Description

    We developed and validated a small-footprint array of miniature chemostats built from readily available parts for low cost. Physiological and experimental evolution results were similar to larger volume chemostats. The ministat array provides a compact, inexpensive, and accessible platform for traditional chemostat experiments, functional genomics, and chemical screening applications. Three experiments are gene expression comparisons between three ministat cultures and a single Sixfors sample. The four CGH arrays are individual clones evolved in four sulfate limitation ministats compared to a wt ancestor strain.

  12. Immunological Genome Project data Phase 1

    • omicsdi.org
    xml
    Cite
    Richard Cruse, Immunological Genome Project data Phase 1 [Dataset]. https://www.omicsdi.org/dataset/arrayexpress-repository/E-GEOD-15907
    Available download formats
    xml
    Authors
    Richard Cruse
    Variables measured
    Genomics, Multiomics
    Description

    Gene-expression microarray datasets generated as part of the Immunological Genome Project (ImmGen). Primary cells from multiple immune lineages are isolated ex-vivo, primarily from young adult B6 male mice, and double-sorted to >99% purity. RNA is extracted from cells in a centralized manner, amplified and hybridized to Affymetrix 1.0 ST MuGene arrays. Protocols are rigorously standardized for all sorting and RNA preparation. Data is released monthly in batches of cell populations. This Series record provides access to Immunological Genome Project data submitted to GEO.

  13. arrayMap: A Reference Resource for Genomic Copy Number Imbalances in Human...

    • plos.figshare.com
    pdf
    Updated Jun 1, 2023
    Cite
    Haoyang Cai; Nitin Kumar; Michael Baudis (2023). arrayMap: A Reference Resource for Genomic Copy Number Imbalances in Human Malignancies [Dataset]. http://doi.org/10.1371/journal.pone.0036944
    Available download formats
    pdf
    Dataset updated
    Jun 1, 2023
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Haoyang Cai; Nitin Kumar; Michael Baudis
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Background

    The delineation of genomic copy number abnormalities (CNAs) from cancer samples has been instrumental for identification of tumor suppressor genes and oncogenes and proven useful for clinical marker detection. An increasing number of projects have mapped CNAs using high-resolution microarray based techniques. So far, no single resource provides a global collection of readily accessible oncogenomic array data.

    Methodology/Principal Findings

    We here present arrayMap, a curated reference database and bioinformatics resource targeting copy number profiling data in human cancer. The arrayMap database provides a platform for meta-analysis and systems level data integration of high-resolution oncogenomic CNA data. To date, the resource incorporates more than 40,000 arrays in 224 cancer types extracted from several resources, including the NCBI’s Gene Expression Omnibus (GEO), EBI’s ArrayExpress (AE), The Cancer Genome Atlas (TCGA), publication supplements and direct submissions. For the majority of the included datasets, probe level and integrated visualization facilitate gene level and genome wide data review. Results from multi-case selections can be connected to downstream data analysis and visualization tools.

    Conclusions/Significance

    To our knowledge, currently no data source provides an extensive collection of high resolution oncogenomic CNA data which could readily be used for genomic feature mining, across a representative range of cancer entities. arrayMap represents our effort for providing a long term platform for oncogenomic CNA data independent of specific platform considerations or specific project dependence. The online database can be accessed at http://www.arraymap.org.

  14. [Ce25b_MR] Affymetrix C. elegans Tiling 1.0R Array

    • ebi.ac.uk
    Updated Jul 19, 2007
    Cite
    (2007). [Ce25b_MR] Affymetrix C. elegans Tiling 1.0R Array [Dataset]. https://www.ebi.ac.uk/biostudies/arrayexpress/arrays/A-GEOD-5634
    Dataset updated
    Jul 19, 2007
    Description

    Array Manufacturer: Affymetrix, Distribution: commercial, Technology: in situ oligonucleotide, Tiling array submissions are typically submitted to GEO using the GEOarchive method described at http://www.ncbi.nlm.nih.gov/projects/geo/info/geo_affy.html The GeneChip C. elegans Tiling 1.0R Array is designed for identifying novel transcripts or mapping sites of protein/DNA interaction in chromatin immunoprecipitation (ChIP) experiments, or other Caenorhabditis elegans whole-genome experiments. The C. elegans 1.0R Array is a single array comprised of over 3.2 million perfect match/mismatch probe pairs tiled through the complete non-repetitive Caenorhabditis elegans genome. Sequences used in the design of the C. elegans Tiling 1.0R Array were selected from the WormBase web site, www.wormbase.org, release WS140, March 26, 2005. Probes are tiled at an average resolution of 25 base pair, as measured from the central position of adjacent 25-mer oligos. BPMAP and other files can be downloaded from the Affymetrix Web site below.

  15. MicroRNA array analysis from infected and uninfected PBMC samples

    • data.niaid.nih.gov
    Updated Jan 31, 2019
    Cite
    Manchon L; Tazi J (2019). MicroRNA array analysis from infected and uninfected PBMC samples [Dataset]. https://data.niaid.nih.gov/resources?id=gse116148
    Dataset updated
    Jan 31, 2019
    Dataset provided by
    CNRS
    Authors
    Manchon L; Tazi J
    Description

    ABX464, a new drug for curing HIV and treating inflammatory diseases, induces upregulation of the anti-inflammatory miR-124. We used microarrays to show the implication of ABX464 in the biogenesis of small noncoding RNAs. We therefore evaluated whether miRNAs or small nucleolar RNAs (snoRNAs) were differentially regulated by ABX464. We performed a microarray analysis for these RNAs from the PBMCs of 6 donors. Cells that were infected with the YU2 strain, followed by treatment with ABX464, were compared with uninfected and untreated controls. A total of 104 human miRNAs and 40 snoRNAs were significantly differentially expressed in infected PBMCs when compared to uninfected PBMCs (data file S4), with a false discovery rate lower than 0.05 and fold change higher than 1.5.
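The two cutoffs quoted in this description (false discovery rate below 0.05, fold change above 1.5) can be applied with a short Python sketch. Two points are assumptions rather than facts from the record: the description does not name the FDR procedure, so Benjamini-Hochberg adjustment is used here as a common choice, and the fold change is treated symmetrically so that a ratio of 1/1.5 or lower also counts. The function names are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so each adjusted value is the
    # minimum of p * m / rank over all ranks at or above the current one.
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, pvalues[index] * m / rank)
        adjusted[index] = running_min
    return adjusted


def is_differential(fold_change, fdr, fc_cutoff=1.5, fdr_cutoff=0.05):
    """Apply the cutoffs; `fold_change` is a positive ratio (treated / control)."""
    magnitude = max(fold_change, 1.0 / fold_change)  # symmetric: up or down
    return fdr < fdr_cutoff and magnitude > fc_cutoff


if __name__ == "__main__":
    pvalues = [0.01, 0.04, 0.03, 0.005]
    fold_changes = [2.0, 1.2, 0.5, 1.8]
    for fc, fdr in zip(fold_changes, benjamini_hochberg(pvalues)):
        print(fc, round(fdr, 4), is_differential(fc, fdr))
```

Keeping the adjustment and the filter in separate functions makes it easy to swap in FDR values that were computed elsewhere.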

  16. A microarray meta-dataset of prostate cancer

    • omicsdi.org
    xml
    Updated Apr 11, 2019
    Cite
    Su Bin Lim (2019). A microarray meta-dataset of prostate cancer [Dataset]. https://www.omicsdi.org/dataset/arrayexpress-repository/E-MTAB-6694
    Available download formats
    xml
    Dataset updated
    Apr 11, 2019
    Authors
    Su Bin Lim
    Variables measured
    Transcriptomics
    Description

    We present a meta-dataset comprising a total of 237 samples, including both primary tumors and tumor-free prostate tissues, from six independent GEO datasets. To minimise inter-platform variation, only datasets generated from the GPL570 platform (Affymetrix Human Genome U133 Plus 2.0 Array) were processed to develop the meta-dataset. Using multiple open source R packages implemented in our previously developed bioinformatics pipeline, each dataset has been preprocessed with RMA normalisation, merged, and batch effect-corrected via the ComBat method. With increased sample size, the present meta-dataset serves as an excellent 'discovery cohort' for discovering differentially expressed genes in the diseased phenotype.

  17. Data from: A Clinically Relevant Gene Signature in Triple-Negative and...

    • datamed.org
    Updated Oct 1, 2011
    Cite
    (2011). A Clinically Relevant Gene Signature in Triple-Negative and Basal-Like Breast Cancer [Dataset]. https://datamed.org/display-item.php?repository=0044&idName=ID&id=5841d9225152c649505fc23c
    Dataset updated
    Oct 1, 2011
    Description

    Current prognostic gene expression profiles for breast cancer mainly reflect proliferation status and are most useful in ER-positive cancers. Triple-negative breast cancers (TNBCs) are clinically heterogeneous, and prognostic markers and biology-based therapies are needed to better treat this disease. We assembled Affymetrix gene expression data for 579 TNBCs and performed unsupervised analysis to define metagenes that distinguish molecular subsets within TNBC. We used n=394 cases for discovery and n=185 cases for validation. Sixteen metagenes emerged that identified basal-like, apocrine and claudin-low molecular subtypes, or reflected various non-neoplastic cell populations including immune cells, blood, adipocytes, stroma, angiogenesis, and inflammation within the cancer. The expressions of these metagenes were correlated with survival and multivariate analysis was performed including routine clinical and pathological variables. 73% of TNBCs displayed basal-like molecular subtype that correlated with high histological grade and younger age. Survival of basal-like TNBC was not different from non-basal-like TNBC. High expression of immune cell metagenes was associated with good and high expression of inflammation and angiogenesis-related metagenes were associated with poor prognosis. A ratio of high B-cell and low IL-8 metagenes identified 32% of TNBC with good prognosis (HR 0.37, 95% CI 0.22-0.61; P<0.001) and was the only significant predictor in multivariate analysis including routine clincopathological variables. We describe a ratio of high B-cell presence and low IL-8 activity as a powerful new prognostic marker for TNBC. Inhibition of the IL-8 pathway also represents an attractive novel therapeutic target for this disease. Analysis of primary breast cancer biopsies from patients before treatment. No replicates. No control or reference samples are included. 
    The set of 579 TNBCs includes: (1) 67 new GEO Samples (GSM782523-GSM782589), (2) 489 re-analyzed GEO Samples (see 'Relation' links below), and (3) 23 re-analyzed ArrayExpress Samples.

    Cohorts:
    HH = University of Hamburg
    FRA = University of Frankfurt, adjuvant chemotherapy
    FRA-2 = University of Frankfurt, neoadjuvant chemotherapy
    FRA-3 = University of Frankfurt, no adjuvant chemotherapy

    Data processing of the 579 TNBC Samples: MAS5 values were taken from GEO if available. For samples with no MAS5 values, CEL files were downloaded from GEO and the affy package from Bioconductor was used to generate MAS5 values. Next, MAS5 values corresponding only to the 22283 probesets from the U133A array were compiled. Subsequently, normalization of MAS5 data was performed using the command line version of the program CLUSTER 3.0 (Michael Eisen; updated by Michiel de Hoon; http://bonsai.hgc.jp/~mdehoon/software/cluster/command.txt). The following three steps were performed in the following order:
    1. log2 transformation of MAS5 values
    2. median centering of arrays
    3. magnitude normalization of arrays

    These three steps correspond to the following commands:
    cluster.com filename -l
    cluster.com filename -ca m
    cluster.com filename -na

    The resulting dataset, which is linked below as a supplementary file, was used for the subsequent analyses.
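    The three normalization steps named in this description (log2 transformation, median centering of arrays, magnitude normalization of arrays) can be illustrated with a small sketch. This is not the CLUSTER 3.0 implementation and not the authors' code. The function name `normalize_arrays` is hypothetical, each array is represented as a plain list of positive MAS5 values, and magnitude normalization is assumed to mean scaling each array so that the sum of its squared values equals 1.

```python
import math
from statistics import median


def normalize_arrays(arrays):
    """Apply log2, median centering, and magnitude normalization per array.

    `arrays` is a list of arrays (samples); each array is a list of
    positive MAS5 values. Hypothetical helper for illustration only.
    """
    result = []
    for values in arrays:
        # Step 1: log2 transformation of the MAS5 values.
        logged = [math.log2(v) for v in values]
        # Step 2: median centering, so the array's median becomes 0.
        center = median(logged)
        centered = [x - center for x in logged]
        # Step 3: magnitude normalization (assumed: sum of squares = 1).
        magnitude = math.sqrt(sum(x * x for x in centered))
        if magnitude == 0:
            # A constant array has no magnitude to rescale; leave it at 0.
            result.append(centered)
        else:
            result.append([x / magnitude for x in centered])
    return result


normalized = normalize_arrays([[1, 2, 4], [8, 8, 8]])
```

    The order matters: centering after the log transform removes array-wide intensity differences on the log scale, and the final scaling puts all arrays on a comparable magnitude.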

  18. Gene expression arrays comparing Kluyveromyces lactis wild-type a cells to...

    • data.niaid.nih.gov
    Updated Dec 20, 2012
    Cite
    Booth LN; Baker CR; Sorrells TR; Johnson AD (2012). Gene expression arrays comparing Kluyveromyces lactis wild-type a cells to MATa2 knock-out a cells and wild-type alpha cells to MATalpha2 knock-out alpha cells [Dataset]. https://data.niaid.nih.gov/resources?id=gse39027
    Dataset updated
    Dec 20, 2012
    Dataset provided by
    University of California, San Francisco
    Authors
    Booth LN; Baker CR; Sorrells TR; Johnson AD
    Description

    We examine how different transcriptional network structures can evolve from a common, ancestral network. We show that regulatory protein modularity, conversion of one cis-regulatory sequence to another, distribution of binding energy among protein-protein and protein-DNA interactions, and exploitation of ancestral network features all contribute to the evolution of a novel mode of regulation at a conserved gene set. The formation of this derived mode of regulation did not disrupt the ancestral mode and thereby created a hybrid regulatory state where both means of transcription regulation (ancestral and derived) contribute to the conserved expression pattern of the network. Finally, we show how this hybrid regulatory state has resolved in different ways in different lineages to generate the diversity of regulatory network structures observed in modern species. a2 KO and alpha2 KO mRNA abundance was measured relative to a WT cell of the same mating type. 2 replicates each. Dye-swaps were performed.

  19. Agilent-016097 UNIFE_HGDMDAntisense_44K_V1.0

    • ebi.ac.uk
    Updated Jun 4, 2011
    Cite
    (2011). Agilent-016097 UNIFE_HGDMDAntisense_44K_V1.0 [Dataset]. https://www.ebi.ac.uk/biostudies/arrayexpress/arrays/A-GEOD-13121?query=type%3A%22array%22
    Dataset updated
    Jun 4, 2011
    Description

    Array Manufacturer: Agilent. Distribution: custom-commercial. Technology: in situ oligonucleotide. We tiled the entire DMD gene, in both sense and antisense directions, using the web-based Agilent eArray database, Version 4.5 (Agilent Technologies), with 60-mer oligos every 66 bp of repeat-masked genome sequence. We defined probe sets for both orientations, encompassing the DMD exons, promoters, introns, predicted miRNAs (identified by PromiRII) and conserved non-coding sequences (CNSs) identified within dystrophin introns using the VISTA programme (http://genome.lbl.gov/vista/index.shtml). Two specific sets of probes were designed to cover, in both directions, the cDNA sequences of a group of control genes (Supplementary Table S1) identified in the Gene Expression Omnibus (GEO) database (http://www.ncbi.nlm.nih.gov/geo/) and expressed equally in both normal and dystrophic muscles. Each probe set was appropriately distributed and replicated several times in order to obtain two 4x44k microarrays, referred to as DMD GEx Sense and DMD GEx Antisense, respectively, able to detect transcripts in the same and opposite directions as that of DMD gene transcription.

  20. Empirical Annotation of the Daphnia pulex genome; Experiment B

    • data.niaid.nih.gov
    Updated Jun 25, 2012
    Cite
    Colbourne JK; Lopez J; Choi J (2012). Empirical Annotation of the Daphnia pulex genome; Experiment B [Dataset]. https://data.niaid.nih.gov/resources?id=gse25852
    Dataset updated
    Jun 25, 2012
    Dataset provided by
    University of Notre Dame
    Authors
    Colbourne JK; Lopez J; Choi J
    Description

    Experiments conducted on this tiling array are used to (1) validate the frozen gene sets of the current genome annotation, (2) improve the predicted gene structures by empirically determining UTRs and intron-exon boundaries, identifying missing upstream, internal, and downstream exons and alternative transcripts, (3) propose gene structure models in transcribed regions containing no predicted genes and (4) delineate transcriptionally active regions of the genome from intergenic, intronic and genic regions. Signal-to-background ratios were determined by first calling probes that fluoresced at intensities greater than 99% of the random probes’ signal intensities; therefore, only 1% of fluorescing experimental probes should be false positives. We conducted two-color competitive hybridizations that measure differential expression from three replicates, each using RNA from independent biological extractions. Transcriptionally active regions (TARs) were defined by stringing together overlapping probes showing fluorescence above a 1% false positive rate (FPR). Positive probes were joined into a TAR if they were adjacent (maxgap=0, no intervening non-positive probe), and a TAR’s length had to be at least 45 bp (minrun=45, measured from the mid-point of the first positive probe to the mid-point of the last positive probe, resulting in at least 3 adjacent positive probes for a TAR). The data analysis to measure differential expression of genes and of unannotated TARs was performed using the statistical software package R and Bioconductor with additions and modifications. The signal distributions across chips, samples and replicates were adjusted to be equal according to the mean fluorescence of the random probes on each array. All probes including random probes were quantile-normalized across replicates. Expression-level scores were assigned for each predicted gene based on the median log2 fluorescence over background intensity of probes falling within the exon boundaries.
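    The TAR rule in this description (join adjacent positive probes with no gap, then keep runs spanning at least 45 bp from first to last probe mid-point) can be sketched as follows. This is an illustrative reading of the stated rule, not the authors' R/Bioconductor code. The function name `call_tars`, the tuple layout, and the 25 bp probe spacing in the usage example are assumptions made for the sketch.

```python
def call_tars(probes, minrun=45):
    """Join adjacent positive probes into TARs (hypothetical helper).

    `probes` is a list of (midpoint_bp, is_positive) tuples sorted by
    genomic position. Returns (first_midpoint, last_midpoint) tuples.
    maxgap=0 is implied: any non-positive probe ends the current run.
    """
    tars = []
    run = []  # mid-points of the current unbroken stretch of positive probes
    # A trailing non-positive sentinel closes the final run.
    for midpoint, is_positive in list(probes) + [(None, False)]:
        if is_positive:
            run.append(midpoint)
            continue
        # Keep the run only if it spans at least `minrun` bp,
        # measured mid-point to mid-point.
        if run and run[-1] - run[0] >= minrun:
            tars.append((run[0], run[-1]))
        run = []
    return tars


# Assumed 25 bp spacing between probe mid-points, for illustration only.
example = [
    (0, True), (25, True), (50, True),      # spans 50 bp: kept
    (75, False),
    (100, True), (125, True),               # spans 25 bp: dropped
    (150, False),
    (175, True), (200, True), (225, True), (250, True),  # spans 75 bp: kept
]
tars = call_tars(example)
```

    With this assumed spacing, two adjacent positive probes span only 25 bp, so at least three are needed to reach the 45 bp minimum, in line with the description above.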
