60 datasets found

geoquery
huggingface.co
Updated Aug 8, 2025
Cite
Giwon Hong (2025). geoquery [Dataset]. https://huggingface.co/datasets/GWHed/geoquery
Dataset updated
Aug 8, 2025
Authors
Giwon Hong
Description
GWHed/geoquery dataset hosted on Hugging Face and contributed by the HF Datasets community
Extended data tables to Haering and Habermann, F1000Res, RNfuzzyApp: an R...
data.niaid.nih.gov
search.dataone.org
zip
Updated Jul 9, 2021
Cite
Bianca Habermann; Margaux Haering (2021). Extended data tables to Haering and Habermann, F1000Res, RNfuzzyApp: an R shiny RNA-seq data analysis app for visualisation, differential expression analysis, time-series clustering and enrichment analysis [Dataset]. http://doi.org/10.5061/dryad.8pk0p2nnd
Available download formats: zip
Unique identifier
https://doi.org/10.5061/dryad.8pk0p2nnd
Dataset updated
Jul 9, 2021
Dataset provided by
Institut de Biologie du Développement Marseille
Authors
Bianca Habermann; Margaux Haering
License
https://spdx.org/licenses/CC0-1.0.html
Description
Background

RNA-seq is a widely adopted, affordable method for large-scale gene expression profiling. However, user-friendly and versatile tools that let wet-lab biologists analyse RNA-seq data beyond standard analyses such as differential expression are rare. The analysis of time-series data is especially difficult for wet-lab biologists lacking advanced computational training. Furthermore, most meta-analysis tools are tailored to model organisms and are not easily adaptable to other species.

Results

With RNfuzzyApp, we provide a user-friendly, web-based R-shiny app for differential expression analysis, as well as time-series analysis of RNA-seq data. RNfuzzyApp offers several methods for normalization and differential expression analysis of RNA-seq data, providing easy-to-use toolboxes, interactive plots and downloadable results. For time-series analysis, RNfuzzyApp presents the first web-based, automated pipeline for soft clustering with the Mfuzz R package, including methods to aid in cluster number selection, Mfuzz loop computations, cluster overlap analysis, as well as cluster enrichments.

Conclusion

RNfuzzyApp is an intuitive, easy to use and interactive R shiny app for RNA-seq differential expression and time-series analysis, offering a rich selection of interactive plots, providing a quick overview of raw data and generating rapid analysis results. Furthermore, its orthology assignment, enrichment analysis, as well as ID conversion functions are accessible to non-model organisms.

Methods

Haering_etal_extendedData_DmdevLeg_GSE143430_mean.txt: mean values calculated from raw reads of replicates, downloaded from Gene Expression Omnibus (dataset GSE143430, https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE143430).

Haering_etal_extendedDatatable_1a_Tabulamurissenis_3vs12m_DEA.txt: Tabula muris senis limb muscle data (https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE132040) from 3-, 12- and 27-month-old males, processed with RNfuzzyApp (https://gitlab.com/habermann_lab/rna-seq-analysis-app)

Haering_etal_extendedDatatable_1b_Tabulamurissenis_3vs27m_DEA.txt: Tabula muris senis limb muscle data (https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE132040) from 3-, 12- and 27-month-old males, processed with RNfuzzyApp (https://gitlab.com/habermann_lab/rna-seq-analysis-app)

Haering_etal_extendedDatatable_1c_Tabulamurissenis_12vs27m_DEA.txt: Tabula muris senis limb muscle data (https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE132040) from 3-, 12- and 27-month-old males, processed with RNfuzzyApp (https://gitlab.com/habermann_lab/rna-seq-analysis-app)

Haering_etal_extendedDatatable_1d_Tabulamurissenis_3vs12m_gpofiler.txt: Tabula muris senis limb muscle data (https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE132040) from 3-, 12- and 27-month-old males, processed with RNfuzzyApp (https://gitlab.com/habermann_lab/rna-seq-analysis-app)

Haering_etal_extendedDatatable_1e_Tabulamurissenis_3vs27m_gpofiler.txt: Tabula muris senis limb muscle data (https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE132040) from 3-, 12- and 27-month-old males, processed with RNfuzzyApp (https://gitlab.com/habermann_lab/rna-seq-analysis-app)

Haering_etal_extendedDatatable_1f_Tabulamurissenis_12vs27m_gpofiler.txt: Tabula muris senis limb muscle data (https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE132040) from 3-, 12- and 27-month-old males, processed with RNfuzzyApp (https://gitlab.com/habermann_lab/rna-seq-analysis-app)

Haering_etal_extendedDatatable_2a_Tabulamurissenis_cluster1_gpofiler.txt: Tabula muris senis limb muscle data (https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE132040) from 3-, 12- and 27-month-old males, processed with RNfuzzyApp (https://gitlab.com/habermann_lab/rna-seq-analysis-app)

Haering_etal_extendedDatatable_2b_Tabulamurissenis_cluster2_gpofiler.txt: Tabula muris senis limb muscle data (https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE132040) from 3-, 12- and 27-month-old males, processed with RNfuzzyApp (https://gitlab.com/habermann_lab/rna-seq-analysis-app)

Haering_etal_extendedDatatable_2c_Tabulamurissenis_cluster3_gpofiler.txt: Tabula muris senis limb muscle data (https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE132040) from 3-, 12- and 27-month-old males, processed with RNfuzzyApp (https://gitlab.com/habermann_lab/rna-seq-analysis-app)

Haering_etal_extendedDatatable_2d_Tabulamurissenis_cluster4_gpofiler.txt: Tabula muris senis limb muscle data (https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE132040) from 3-, 12- and 27-month-old males, processed with RNfuzzyApp (https://gitlab.com/habermann_lab/rna-seq-analysis-app)

Haering_etal_extendedDatatable_2e_Tabulamurissenis_cluster5_gpofiler.txt: Tabula muris senis limb muscle data (https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE132040) from 3-, 12- and 27-month-old males, processed with RNfuzzyApp (https://gitlab.com/habermann_lab/rna-seq-analysis-app)

Haering_etal_extendedDatatable_3a_DmLeg_cluster1_gpofiler.txt: Haering_etal_extendedData_DmdevLeg_GSE143430_mean.txt processed with RNfuzzyApp (https://gitlab.com/habermann_lab/rna-seq-analysis-app)

Haering_etal_extendedDatatable_3b_DmLeg_cluster2_gpofiler.txt: Haering_etal_extendedData_DmdevLeg_GSE143430_mean.txt processed with RNfuzzyApp (https://gitlab.com/habermann_lab/rna-seq-analysis-app)

Haering_etal_extendedDatatable_3c_DmLeg_cluster3_gpofiler.txt: Haering_etal_extendedData_DmdevLeg_GSE143430_mean.txt processed with RNfuzzyApp (https://gitlab.com/habermann_lab/rna-seq-analysis-app)

Haering_etal_extendedDatatable_3d_DmLeg_cluster4_gpofiler.txt: Haering_etal_extendedData_DmdevLeg_GSE143430_mean.txt processed with RNfuzzyApp (https://gitlab.com/habermann_lab/rna-seq-analysis-app)

Haering_etal_extendedDatatable_3e_DmLeg_cluster5_gpofiler.txt: Haering_etal_extendedData_DmdevLeg_GSE143430_mean.txt processed with RNfuzzyApp (https://gitlab.com/habermann_lab/rna-seq-analysis-app)

Haering_etal_extendedDatatable_3f_DmLeg_cluster6_gpofiler.txt: Haering_etal_extendedData_DmdevLeg_GSE143430_mean.txt processed with RNfuzzyApp (https://gitlab.com/habermann_lab/rna-seq-analysis-app)

Haering_etal_extendedDatatable_3g_DmLeg_cluster7_gpofiler.txt: Haering_etal_extendedData_DmdevLeg_GSE143430_mean.txt processed with RNfuzzyApp (https://gitlab.com/habermann_lab/rna-seq-analysis-app)

Haering_etal_extendedDatatable_3h_DmLeg_cluster8_gpofiler.txt: Haering_etal_extendedData_DmdevLeg_GSE143430_mean.txt processed with RNfuzzyApp (https://gitlab.com/habermann_lab/rna-seq-analysis-app)

Haering_etal_extendedDatatable_3i_DmLeg_cluster9_gpofiler.txt: Haering_etal_extendedData_DmdevLeg_GSE143430_mean.txt processed with RNfuzzyApp (https://gitlab.com/habermann_lab/rna-seq-analysis-app)

Haering_etal_extendedDatatable_3j_DmLeg_cluster10_gpofiler.txt: Haering_etal_extendedData_DmdevLeg_GSE143430_mean.txt processed with RNfuzzyApp (https://gitlab.com/habermann_lab/rna-seq-analysis-app)

Haering_etal_extendedDatatable_3k_DmLeg_cluster11_gpofiler.txt: Haering_etal_extendedData_DmdevLeg_GSE143430_mean.txt processed with RNfuzzyApp (https://gitlab.com/habermann_lab/rna-seq-analysis-app)

Haering_etal_extendedDatatable_3l_DmLeg_cluster12_gpofiler.txt: Haering_etal_extendedData_DmdevLeg_GSE143430_mean.txt processed with RNfuzzyApp (https://gitlab.com/habermann_lab/rna-seq-analysis-app)
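
The first file in the list above is described as mean values calculated from the raw reads of replicates. As a rough illustration of that averaging step (not the authors' script), the sketch below computes per-condition means for each gene. The gene names, sample labels and the `replicate_means` helper are hypothetical.

```python
from statistics import mean


def replicate_means(counts, groups):
    """Average raw counts over the replicates of each condition.

    counts: {gene: {sample_name: raw_count}}
    groups: {condition: [sample_name, ...]}  (hypothetical layout)
    """
    return {
        gene: {
            condition: mean(by_sample[sample] for sample in samples)
            for condition, samples in groups.items()
        }
        for gene, by_sample in counts.items()
    }


# Made-up counts for two genes, two conditions, two replicates each.
counts = {
    "geneA": {"t0_rep1": 10, "t0_rep2": 14, "t24_rep1": 30, "t24_rep2": 34},
    "geneB": {"t0_rep1": 0, "t0_rep2": 2, "t24_rep1": 5, "t24_rep2": 7},
}
groups = {"t0": ["t0_rep1", "t0_rep2"], "t24": ["t24_rep1", "t24_rep2"]}

means = replicate_means(counts, groups)
print(means["geneA"])  # per-condition means for geneA
```

A table of such means, one row per gene and one column per time point, is the kind of input a time-series clustering step would typically consume.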
Medulloblastoma omics data
kaggle.com
zip
Updated Feb 22, 2023
Cite
Alexander Chervov (2023). Medulloblastoma omics data [Dataset]. https://www.kaggle.com/alexandervc/medulloblastoma-omics-data
Available download formats: zip (2278448493 bytes)
Dataset updated
Feb 22, 2023
Authors
Alexander Chervov
Description
Collection of gene expression and similar datasets related to brain tumors, in particular medulloblastoma, the most common malignant brain tumor in childhood. Files are typically CSV matrices of genes x samples.

GSE124814 WOW! Integration of many (all?) medulloblastoma datasets(!): 1641 samples, of which 1350 samples represent primary medulloblastomas and 291 samples represent normal brain

https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE124814
Weishaupt H, Johansson P, Sundström A, Lubovac-Pilav Z et al. Batch-normalization of cerebellar and medulloblastoma gene expression datasets utilizing empirically defined negative control genes. Bioinformatics 2019 Sep 15;35(18):3357-3364. PMID: 30715209 https://doi.org/10.1093/bioinformatics/btz066
We downloaded a total of 1796 CEL files from previously published GEO or ArrayExpress records: GSE85217(n=763), GSE25219(n=154), GSE60862(n=130), GSE12992(n=40), GSE67850(n=22), GSE10327(n=62), GSE30074(n=30), E-MTAB-292(n=19), GSE74195(n=30), GSE37418(n=76), GSE4036(n=14), GSE62803(n=52), GSE21140(n=103), GSE37382(n=50), GSE22569(n=24), GSE35974(n=50), GSE73038(n=46), GSE50161(n=24), GSE3526(n=9), GSE50765(n=12), GSE49243(n=58), GSE41842(n=19), GSE44971(n=9). After preprocessing of all CEL files, we averaged the expression profiles of samples that mapped to the same patient in a single dataset, producing a final expression array comprising 1641 samples, of which 1350 samples represent primary medulloblastomas and 291 samples represent normal brain (cerebellum/upper rhombic lip).
Also discussed in the paper: A transcriptome-based classifier to determine molecular subtypes in medulloblastoma https://journals.plos.org/ploscompbiol/article?id=10.1371/journal.pcbi.1008263
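
The sample counts quoted in the GSE124814 description can be cross-checked with a few lines of arithmetic. The sketch below parses the per-record counts exactly as listed and confirms that they add up to the stated 1796 CEL files, and that 1350 tumor plus 291 normal samples give the stated 1641. The regular expression is an assumption about the `ACCESSION(n=COUNT)` notation used in the text.

```python
import re

# Per-record counts copied from the description above.
listing = (
    "GSE85217(n=763), GSE25219(n=154), GSE60862(n=130), GSE12992(n=40), "
    "GSE67850(n=22), GSE10327(n=62), GSE30074(n=30), E-MTAB-292(n=19), "
    "GSE74195(n=30), GSE37418(n=76), GSE4036(n=14), GSE62803(n=52), "
    "GSE21140(n=103), GSE37382(n=50), GSE22569(n=24), GSE35974(n=50), "
    "GSE73038(n=46), GSE50161(n=24), GSE3526(n=9), GSE50765(n=12), "
    "GSE49243(n=58), GSE41842(n=19), GSE44971(n=9)"
)

# Accessions are letters, digits and hyphens, followed by "(n=<count>)".
sizes = {acc: int(n) for acc, n in re.findall(r"([\w-]+)\(n=(\d+)\)", listing)}

total_cel_files = sum(sizes.values())
final_samples = 1350 + 291  # primary medulloblastomas + normal brain

print(len(sizes), total_cel_files, final_samples)  # 23 1796 1641
```

The gap between 1796 input files and 1641 final samples is consistent with the averaging of profiles that mapped to the same patient.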

GSE85217 (Cavalli ... Taylor), 768 samples, 2016 (Affymetrix Human Gene 1.1 ST Array) Cavalli FMG, Remke M, Rampasek L, Peacock J et al. Intertumoral Heterogeneity within Medulloblastoma Subgroups. Cancer Cell 2017 Jun 12;31(6):737-754.e6. PMID: 28609654 Ramaswamy V, Taylor MD. Bioinformatic Strategies for the Genomic and Epigenomic Characterization of Brain Tumors. Methods Mol Biol 2019;1869:37-56. PMID: 30324512 https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE85217

GSE202043 (Pomeroy) 214 samples, 2011 (Expression profiling by array) Cho YJ, Tsherniak A, Tamayo P, Santagata S et al. Integrative genomic analysis of medulloblastoma identifies a molecular subgroup that drives poor clinical outcome. J Clin Oncol 2011 Apr 10;29(11):1424-30. PMID: 21098324 https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE202043

GSE12992 (Fattet ... Delattre) 72 samples, 2009 (Expression profiling by array) Fattet S, Haberler C, Legoix P, Varlet P et al. Beta-catenin status in paediatric medulloblastomas: correlation of immunohistochemical expression with mutational status, genetic profiles, and clinical characteristics. J Pathol 2009 May;218(1):86-94. PMID: 19197950 A series of 72 pediatric medulloblastoma tumors has been studied at the genomic level (array-CGH), screened for CTNNB1 mutations and beta-catenin expression (immunohistochemistry). A subset of 40 tumor samples has been analyzed at the RNA expression level (Affymetrix HG U133 Plus 2.0). https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE12992

GSE37382 (Northcott ... Taylor) 2012 (Expression profiling by array, Affymetrix Human Gene 1.1 ST Array profiling of 285 primary medulloblastoma samples.) Northcott PA, Shih DJ, Peacock J, Garzia L et al. Subgroup-specific structural variation across 1,000 medulloblastoma genomes. Nature 2012 Aug 2;488(7409):49-56. PMID: 22832581 https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE37382

GSE10327 (M. Kool), 62 samples, 2008 (Expression profiling by array). Beware: the original paper and several references sometimes cite it as GSE10237, which is an erroneous accession. Kool M, Koster J, Bunt J, Hasselt NE et al. Integrated genomics identifies five medulloblastoma subtypes with distinct genetic profiles, pathway signatures and clinicopathological features. PLoS One 2008 Aug 28;3(8):e3088. PMID: 18769486 Rack PG, Ni J, Payumo AY, Nguyen V et al. Arhgap36-dependent activation of Gli transcription factors. Proc Natl Acad Sci U S A 2014 Jul 29;111(30):11061-6. PMID: 25024229 https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE10327

Other datasets (not yet loaded):

GSE37385 (47.1 Gb, 2012) (Expression profiling by array, Genome variation profiling by SNP array, SNP genotyping by SNP array) Northcott PA, Shih DJ, Peacock J, Garzia L et al. Subgroup-specific structural variation across 1,000 medulloblastoma genomes. Nature 2012 Aug 2;488(7409):49-56. PMID: 22832581 Here we report somatic copy number aberrations (SCNAs) in 1087 unique medulloblastomas. https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE37385
The Parathyroid Hormone-Regulated Transcriptome in Osteocytes: Parallel...
ebi.ac.uk
Updated Nov 13, 2014
Cite
Mark Meyer; Hillary St. John; J Pike (2014). The Parathyroid Hormone-Regulated Transcriptome in Osteocytes: Parallel Actions with 1,25-Dihydroxyvitamin D3 to Oppose Gene Expression Changes During Differentiation and to Promote Mature Cell Function [RNA-Seq] [Dataset]. https://www.ebi.ac.uk/biostudies/studies/E-GEOD-62979
Dataset updated
Nov 13, 2014
Authors
Mark Meyer; Hillary St. John; J Pike
Description
Although localized to the mineralized matrix of bone, osteocytes are able to respond to systemic factors such as the calciotropic hormones 1,25(OH)2D3 and PTH. In the present studies, we examine the transcriptomic response to PTH in an osteocyte cell model and found that this hormone regulated an extensive panel of genes. Surprisingly, PTH uniquely modulated two cohorts of genes, one that was expressed and associated with the osteoblast to osteocyte transition and the other a cohort that was expressed only in the mature osteocyte. Interestingly, PTH's effects were largely to oppose the expression of differentiation-related genes in the former cohort, while potentiating the expression of osteocyte-specific genes in the latter cohort. A comparison of the transcriptional effects of PTH with those obtained previously with 1,25(OH)2D3 revealed a subset of genes that was strongly overlapping. While 1,25(OH)2D3 potentiated the expression of osteocyte-specific genes similar to that seen with PTH, the overlap between the two hormones was more limited. Additional experiments identified the PKA-activated phospho-CREB (pCREB) cistrome, revealing that while many of the differentiation-related PTH regulated genes were apparent targets of a PKA-mediated signaling pathway, a reduction in pCREB binding at sites associated with osteocyte-specific PTH targets appeared to involve alternative PTH activation pathways. That pCREB binding activities positioned near important hormone-regulated gene cohorts were localized to control regions of genes was reinforced by the presence of epigenetic enhancer signatures exemplified by unique modifications at histones H3 and H4. These studies suggest that both PTH and 1,25(OH)2D3 may play important and perhaps cooperative roles in limiting osteocyte differentiation from its precursors while simultaneously exerting distinct roles in regulating mature osteocyte function.
Our results provide new insight into transcription factor-associated mechanisms through which PTH and 1,25(OH)2D3 regulate a plethora of genes important to the osteoblast/osteocyte lineage. Fully differentiated IDG-SW3 cells were treated in biological triplicate with 100 nM PTH for 24 hours prior to mRNA isolation and sequencing. Vehicle-treated samples were previously published in GSE54783: http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSM1323967 http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSM1323968 http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSM1323969
Integrated and Annotated Single Cell Object
zenodo.org
bin
Updated Oct 7, 2023
Cite
Sofoklis Keisaris; George Gavriilids (2023). Integrated and Annotated Single Cell Object [Dataset]. http://doi.org/10.5281/zenodo.8413934
Available download formats: bin
Unique identifier
https://doi.org/10.5281/zenodo.8413934
Dataset updated
Oct 7, 2023
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Sofoklis Keisaris; George Gavriilids
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Integrated and annotated Seurat object created with the following script: https://doi.org/10.5281/zenodo.8413883
using the following studies: https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE162577
https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE142016
https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE135779
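
Every GEO link on this page follows the same pattern: the `acc.cgi` endpoint with the accession passed in the `acc` query parameter. A minimal sketch of moving between accessions and links is shown below; the helper names `accession_from_url` and `url_from_accession` are hypothetical, and the endpoint constant is simply copied from the URLs above.

```python
from urllib.parse import parse_qs, urlencode, urlparse

# Endpoint as it appears in the links listed above.
GEO_ACC_ENDPOINT = "https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi"


def accession_from_url(url):
    """Return the value of the 'acc' query parameter, or None if absent."""
    values = parse_qs(urlparse(url).query).get("acc", [])
    return values[0] if values else None


def url_from_accession(accession):
    """Build an acc.cgi link for a GEO accession such as GSE162577."""
    return f"{GEO_ACC_ENDPOINT}?{urlencode({'acc': accession})}"


studies = [
    "https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE162577",
    "https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE142016",
    "https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE135779",
]
accessions = [accession_from_url(u) for u in studies]
print(accessions)  # ['GSE162577', 'GSE142016', 'GSE135779']
```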
zebrafish GSE223922 scRNA data set objects
datasetcatalog.nlm.nih.gov
Updated Jul 11, 2023
Cite
Ullrich, Kristian (2023). zebrafish GSE223922 scRNA data set objects [Dataset]. http://doi.org/10.5281/zenodo.8133569
Unique identifier
https://doi.org/10.5281/zenodo.8133569
Dataset updated
Jul 11, 2023
Authors
Ullrich, Kristian
Description
scRNA data from https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE223922 (Sur et al. 2023); see a detailed description of the study here: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10055256/ Data were downloaded from https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE223922 to create an R Seurat object, which was then converted into an AnnData (h5ad) file so that it can be analysed with, e.g., the Python scanpy package. If you use this data, please cite Sur et al. 2023.
Systematic integrative analysis of gene expression identifies HNF4A as the...
plos.figshare.com
tiff
Updated Jun 2, 2023
Cite
Cristina Baciu; Elisa Pasini; Marc Angeli; Katherine Schwenger; Jenifar Afrin; Atul Humar; Sandra Fischer; Keyur Patel; Johane Allard; Mamatha Bhat (2023). Systematic integrative analysis of gene expression identifies HNF4A as the central gene in pathogenesis of non-alcoholic steatohepatitis [Dataset]. http://doi.org/10.1371/journal.pone.0189223
Available download formats: tiff
Unique identifier
https://doi.org/10.1371/journal.pone.0189223
Dataset updated
Jun 2, 2023
Dataset provided by
PLOS (http://plos.org/)
Authors
Cristina Baciu; Elisa Pasini; Marc Angeli; Katherine Schwenger; Jenifar Afrin; Atul Humar; Sandra Fischer; Keyur Patel; Johane Allard; Mamatha Bhat
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Non-alcoholic fatty liver disease (NAFLD) is the most common chronic liver disease in the Western world, and encompasses a spectrum from simple steatosis to steatohepatitis (NASH). There is currently no approved pharmacologic therapy against NASH, partly due to an incomplete understanding of its molecular basis. The goal of this study was to determine the key differentially expressed genes (DEGs), as well as those genes and pathways central to its pathogenesis. We performed an integrative computational analysis of publicly available gene expression data in NASH from GEO (GSE17470, GSE24807, GSE37031, GSE89632). The DEGs were identified using GEOquery, and only the genes present in at least three of the studies, to a total of 190 DEGs, were considered for further analyses. The pathways, networks, molecular interactions, functional analyses were generated through the use of Ingenuity Pathway Analysis (IPA). For selected networks, we computed the centrality using igraph package in R. Among the statistically significant predicted networks (p-val < 0.05), three were of most biological interest: the first is involved in antimicrobial response, inflammatory response and immunological disease, the second in cancer, organismal injury and development and the third in metabolic diseases. We discovered that HNF4A is the central gene in the network of NASH connected to metabolic diseases and that it regulates HNF1A, an additional transcription regulator also involved in lipid metabolism. Therefore, we show, for the first time to our knowledge, that HNF4A is central to the pathogenesis of NASH. This adds to previous literature demonstrating that HNF4A regulates the transcription of genes involved in the progression of NAFLD, and that HNF4A genetic variants play a potential role in NASH progression.
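
The description keeps only genes that were differentially expressed in at least three of the four GEO studies. One way to express that filter is sketched below. It is an illustration rather than the authors' pipeline: the gene labels are placeholders and `recurrent_genes` is a hypothetical helper.

```python
from collections import Counter


def recurrent_genes(deg_sets, min_studies=3):
    """Return genes reported as DEGs in at least `min_studies` studies."""
    counts = Counter(gene for genes in deg_sets.values() for gene in set(genes))
    return sorted(gene for gene, n in counts.items() if n >= min_studies)


# Placeholder DEG lists keyed by the accessions named in the description.
deg_sets = {
    "GSE17470": {"GENE_A", "GENE_B", "GENE_C"},
    "GSE24807": {"GENE_A", "GENE_B"},
    "GSE37031": {"GENE_A", "GENE_C", "GENE_D"},
    "GSE89632": {"GENE_A", "GENE_B", "GENE_D"},
}

shared = recurrent_genes(deg_sets, min_studies=3)
print(shared)  # ['GENE_A', 'GENE_B']
```

Counting each gene once per study (via `set`) keeps duplicated probe-level hits within a study from inflating the tally.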
barechey/PredictIO.data:
zenodo.org
zip
Updated Sep 8, 2022
Cite
Yacine Bareche (2022). barechey/PredictIO.data: [Dataset]. http://doi.org/10.5281/zenodo.7044234
Available download formats: zip
Unique identifier
https://doi.org/10.5281/zenodo.7044234
Dataset updated
Sep 8, 2022
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Yacine Bareche
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Data for our paper titled "Leveraging Big Data of Immune Checkpoint Blockade Response Identifies Novel Potential Targets".

Bareche et al., Annals of Oncology (2022); https://doi.org/10.1016/j.annonc.2022.08.084

----------------------------------------------------------------------------------------------------------------------------------------------
----------------------------------------------------------------------------------------------------------------------------------------------

Background: The development of immune checkpoint blockade (ICB) has changed the way we treat various cancers. While ICB produces durable survival benefits in a number of malignancies, a large proportion of treated patients do not derive clinical benefit. Recent clinical profiling studies have shed light on molecular features and mechanisms that modulate response to ICB. Nevertheless, none of these identified molecular features were investigated in large enough cohorts to be of clinical value.

Materials and methods: A literature review was performed to identify relevant studies including clinical datasets of patients treated with ICB (anti-PD1/L1, anti-CTLA4 or the combination) and available sequencing data. Tumor mutational burden (TMB) and 37 previously reported gene expression (GE) signatures were computed as described in the original publications. Biomarker association with ICB response (IR) and survival (PFS/OS) was investigated separately within each study and combined for meta-analysis.

Results: We performed a comparative meta-analysis of genomic and transcriptomic biomarkers of immune-checkpoint blockade (ICB) responses in over 3,600 patients across 12 tumor types and implemented an open-source web-application (predictIO.ca) for exploration. Tumor mutation burden (TMB) and 21/37 gene signatures were predictive of ICB responses across tumor types. We next developed a de novo gene expression signature (PredictIO) from our pan-cancer analysis and demonstrated its superior predictive value over other biomarkers. To identify novel targets, we computed the T-cell dysfunction score for each gene within PredictIO and their ability to predict dual PD-1/CTLA-4 blockade in mice. Two genes, F2RL1 (encoding protease-activated receptor-2) and RBFOX2 (encoding RNA-binding motif protein 9), were concurrently associated with worse ICB clinical outcomes, T cell dysfunction in ICB-naive patients and resistance to dual PD-1/CTLA-4 blockade in preclinical models.

Conclusions: Our study highlights the potential of large-scale meta-analyses in identifying novel biomarkers and potential therapeutic targets for cancer immunotherapy.

----------------------------------------------------------------------------------------------------------------------------------------------
----------------------------------------------------------------------------------------------------------------------------------------------

Data description

mouseModel:

Chen: Expression data of the TNBC mouse model study from Chen et al. (PMID:32907939)

Meskini: Expression data of the Melanoma mouse model study from Meskini et al. (https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE172320)

Zemek: Expression data of the AB1 & Renca mouse model study from Zemek et al. (https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE117358)

Discovery_cohort:
Expression and SNV data of the discovery cohort

Validation_cohort:
Expression and SNV data of the validation cohort
Raw mouse mammary RNA-Seq data (fastq)
zenodo.org
application/gzip, bin +1
Updated Nov 6, 2020
Cite
Helena Rasche (2020). Raw mouse mammary RNA-Seq data (fastq) [Dataset]. http://doi.org/10.5281/zenodo.4249516
Available download formats: bin, application/gzip, txt
Unique identifier
https://doi.org/10.5281/zenodo.4249516
Dataset updated
Nov 6, 2020
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Helena Rasche
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
12 fastq files with 1000 reads each, 4 index files for chr 1 for mm10, targets files with sample information.
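
A quick way to confirm a read count such as "1000 reads each" is to count FASTQ records. The sketch below assumes the common four-line record layout (header, sequence, `+`, qualities) with no wrapped sequences; `count_fastq_reads` is a hypothetical helper and the two reads are made up.

```python
import io


def count_fastq_reads(handle):
    """Count reads in an open FASTQ text handle, assuming 4 lines per read."""
    lines = sum(1 for line in handle if line.strip())
    if lines % 4 != 0:
        raise ValueError("line count is not a multiple of 4; file may be truncated")
    return lines // 4


# Two made-up reads held in memory instead of a real file.
example = io.StringIO("@read1\nACGT\n+\nIIII\n@read2\nTTGA\n+\nIIII\n")
n_reads = count_fastq_reads(example)
print(n_reads)  # 2
```

For a gzip-compressed file, the same function can be given a handle opened with `gzip.open(path, "rt")`.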

References

http://www.ncbi.nlm.nih.gov/pubmed/25730472

http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE60450

http://hgdownload.soe.ucsc.edu/goldenPath/mm10/chromosomes/

http://www.ncbi.nlm.nih.gov/pubmed/20921232

http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE18508
Taxol Drug Resistance cell lines in Breast Cancer
kaggle.com
zip
Updated Apr 12, 2023
Cite
Ali Abedi Madiseh (2023). Taxol Drug Resistance cell lines in Breast Cancer [Dataset]. https://www.kaggle.com/datasets/aliabedimadiseh/taxol-drug-resistance-cell-lines-in-breast-cancer/discussion
Available download formats: zip (247688 bytes)
Dataset updated
Apr 12, 2023
Authors
Ali Abedi Madiseh
License
Open Database License (ODbL) v1.0, https://www.opendatacommons.org/licenses/odbl/1.0/
License information was derived automatically
Description
This dataset was collected from the following NCBI GEO datasets:
- GSE144113 (https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE144113)
- GSE76200 (https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE76200)
- GSE12791 (https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE12791)

These datasets cover four paclitaxel-resistant cell lines: BAS, HS578T, MCF7 and MDA-MB-231.

Gene expression analysis comparing control cells with drug-resistant cells was performed in R for each dataset. Identifiers were converted into gene symbols using different bioinformatics databases. Genes with a p-value of less than 0.05 were also removed.
Table_5_Integrated Multichip Analysis Identifies Potential Key Genes in the...
datasetcatalog.nlm.nih.gov
Updated Nov 26, 2020
Cite
Li, Yating; Xu, Chunquan; Zhou, Cui; Lin, Yishuai; Ye, Wanchun; Zhou, Tieli; Zhao, Yajie; Wang, Qing; Chen, Lijiang; Bai, Fumao; Ye, Jianzhong; Sun, Yao; Wu, Qing (2020). Table_5_Integrated Multichip Analysis Identifies Potential Key Genes in the Pathogenesis of Nonalcoholic Steatohepatitis.docx [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0000537075
Dataset updated
Nov 26, 2020
Authors
Li, Yating; Xu, Chunquan; Zhou, Cui; Lin, Yishuai; Ye, Wanchun; Zhou, Tieli; Zhao, Yajie; Wang, Qing; Chen, Lijiang; Bai, Fumao; Ye, Jianzhong; Sun, Yao; Wu, Qing
Description
Background

Nonalcoholic steatohepatitis (NASH) is rapidly becoming a major chronic liver disease worldwide. However, little is known concerning the pathogenesis and progression mechanism of NASH. Our aim here is to identify key genes and elucidate their biological function in the progression from hepatic steatosis to NASH.

Methods

Gene expression datasets containing NASH patients, hepatic steatosis patients, and healthy subjects were downloaded from the Gene Expression Omnibus database, using the R packages biobase and GEOquery. Differentially expressed genes (DEGs) were identified using the R limma package. Functional annotation and enrichment analysis of DEGs were undertaken using the R package ClusterProfile. Protein-protein interaction (PPI) networks were constructed using the STRING database.

Results

Three microarray datasets GSE48452, GSE63067 and GSE89632 were selected. They included 45 NASH patients, 31 hepatic steatosis patients, and 43 healthy subjects. Two up-regulated and 24 down-regulated DEGs were found in both NASH patients vs. healthy controls and in steatosis subjects vs. healthy controls. The most significantly differentially expressed genes were FOSB (P = 3.43×10^-15), followed by CYP7A1 (P = 2.87×10^-11), and FOS (P = 6.26×10^-11). Proximal promoter DNA-binding transcription activator activity, RNA polymerase II-specific (P = 1.30×10^-5) was the most significantly enriched functional term in the gene ontology analysis. KEGG pathway enrichment analysis indicated that the MAPK signaling pathway (P = 3.11×10^-4) was significantly enriched.

Conclusion

This study characterized hub genes of the liver transcriptome, which may contribute functionally to NASH progression from hepatic steatosis.
Inferelator 3.0 Yeast Single-Cell Benchmarking Data
zenodo.org
application/gzip, tsv
Updated Aug 27, 2021
Cite
Christopher Jackson (2021). Inferelator 3.0 Yeast Single-Cell Benchmarking Data [Dataset]. http://doi.org/10.5281/zenodo.5272314
Available download formats: application/gzip, tsv
Unique identifier
https://doi.org/10.5281/zenodo.5272314
Dataset updated
Aug 27, 2021
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Christopher Jackson
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Yeast single-cell gene expression data, database-derived prior knowledge network, and hand-curated gold standard network. Used to benchmark the Inferelator 3.0, SCENIC, and CellOracle.

Expression data (GSE144820_GSE125162.tsv.gz) is an integer count matrix [44343 rows x 6763 columns] with an index column (0) assembled from GSE144820 and GSE125162. Included is a paired metadata file (GSE144820_GSE125162_META_DATA.tsv.gz).

A database-derived prior knowledge network (YEASTRACT_20190713_BOTH.tsv) is a boolean connectivity matrix [6885 rows x 220 columns] with an index column (0) obtained from the YEASTRACT database on 07/13/2019. It consists of edges that have both DNA localization evidence and evidence of changes to gene expression after TF perturbation.

A curated gold standard network (Tchourine_2018_yeast_gold_standard.tsv) is a signed connectivity matrix [993 rows x 98 columns] with an index column (0). Details of its construction have been published.
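
Each of the three files above is described as a matrix with an index column (0). A minimal standard-library sketch of reading such a gzip-compressed TSV and checking its shape follows. The tiny in-memory table stands in for the real files, the row and column labels are illustrative, and the assumption that the header row has an empty first field for the index column is not confirmed by the description.

```python
import csv
import gzip
import io


def read_indexed_tsv(handle):
    """Read a TSV whose first column is a row index and whose values are integers."""
    reader = csv.reader(handle, delimiter="\t")
    header = next(reader)
    columns = header[1:]  # skip the (assumed empty) index-column header
    index, rows = [], []
    for record in reader:
        index.append(record[0])
        rows.append([int(value) for value in record[1:]])
    return index, columns, rows


# Stand-in data: 3 rows x 2 columns, compressed in memory.
raw = "\tYAL001C\tYAL002W\ncell_1\t0\t3\ncell_2\t5\t1\ncell_3\t2\t0\n"
compressed = gzip.compress(raw.encode("utf-8"))

with gzip.open(io.BytesIO(compressed), "rt", newline="") as handle:
    index, columns, rows = read_indexed_tsv(handle)

shape = (len(rows), len(columns))
print(shape)  # (3, 2)
```

Comparing `shape` with the dimensions quoted in the description, for example 44343 x 6763 for the expression matrix, is a cheap check that a download is complete.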
Geo_Benchmark
huggingface.co
Cite
Rodrigo Ferreira Rodrigues, Geo_Benchmark [Dataset]. https://huggingface.co/datasets/rfr2003/Geo_Benchmark
Explore at:
Authors
Rodrigo Ferreira Rodrigues
License
https://choosealicense.com/licenses/other/
Description
Dataset Card for Geo-Benchmark

Dataset Summary

Geo-Benchmark aims to assess the geographical abilities of Large Language Models (LLMs) across a multitude of tasks. It is built from 12 datasets split across 8 different tasks:

Knowledge/Coordinates Prediction: GeoQuestions1089
Knowledge/Yes|No questions: GeoQuestions1089
Knowledge/Regression questions: GeoQuestions1089, GeoQuery
Knowledge/Place Prediction: GeoQuestions1089, GeoQuery, Ms Marco
Reasoning/Scenario Complex QA:…

See the full description on the dataset page: https://huggingface.co/datasets/rfr2003/Geo_Benchmark.
Strand-specific RNA-seq of nine mouse tissues
ebi.ac.uk
Updated Dec 20, 2012
Cite
Jason Merkin; Christopher Burge (2012). Strand-specific RNA-seq of nine mouse tissues [Dataset]. https://www.ebi.ac.uk/arrayexpress/experiments/E-MTAB-2801/
Explore at:
Dataset updated
Dec 20, 2012
Authors
Jason Merkin; Christopher Burge
Description
This experiment contains mouse organism part samples and strand-specific RNA-seq data from experiment E-GEOD-41637 (https://www.ebi.ac.uk/arrayexpress/experiments/E-GEOD-41637/), which aimed to assess tissue-specific transcriptome variation across mammals, with chicken used as an outgroup in evolutionary analyses. Each organism part (with the exception of heart) was sourced from animals of three different strains: C57BL/6, DBA/2J and CD1. (There are no heart data from the C57BL/6 strain.) This data set was originally submitted to NCBI Gene Expression Omnibus under accession number GSE41637 (http://www.ncbi.nlm.nih.gov/projects/geo/query/acc.cgi?acc=GSE41637) and later imported to ArrayExpress as E-GEOD-41637.
AT-hook transcription factors restrict petiole growth by antagonizing PIFs....
data.mendeley.com
Updated Mar 20, 2020
Cite
David Favero (2020). AT-hook transcription factors restrict petiole growth by antagonizing PIFs. Favero et al. [Dataset]. http://doi.org/10.17632/hp76dxbmwh.1
Explore at:
Unique identifier
https://doi.org/10.17632/hp76dxbmwh.1
Dataset updated
Mar 20, 2020
Authors
David Favero
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Data used to generate figures in the manuscript, excluding RNA-seq and ChIP-seq data, which can instead be downloaded from Gene Expression Omnibus: https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE122456.
Genetic accessions, treatment information, and methodology from laboratory...
search.dataone.org
Updated Mar 9, 2025
Cite
Neelakanteswar Aluru; Mark Hahn (2025). Genetic accessions, treatment information, and methodology from laboratory experiments studying transcriptomic responses to saxitoxin in zebrafish (Danio rerio) [Dataset]. http://doi.org/10.26008/1912/bco-dmo.881469.1
Explore at:
Unique identifier
https://doi.org/10.26008/1912/bco-dmo.881469.1
Dataset updated
Mar 9, 2025
Dataset provided by
Biological and Chemical Oceanography Data Management Office (BCO-DMO)
Authors
Neelakanteswar Aluru; Mark Hahn
Time period covered
Jan 1, 2022
Description
The data discussed in this publication have been deposited in NCBI's Gene Expression Omnibus (GEO) and are accessible through GEO Series accession number GSE204989 (https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE204989).

Sequence Read Archive (SRA) data, BioSamples, and GEO holdings can be accessed from the NCBI BioProject PRJNA843039 (http://www.ncbi.nlm.nih.gov/bioproject/PRJNA843039).
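GEO Series links like the one above follow a fixed pattern, so they can be built from the accession alone. The sketch below is illustrative: the helper names `geo_series_url` and `find_geo_series` are hypothetical, the URL pattern is copied from the link in this description, and the assumption that a Series accession is "GSE" followed by digits is based on the accessions quoted in these records.

```python
import re
from urllib.parse import urlencode

# Base URL taken from the GEO link quoted in the description above.
GEO_QUERY_URL = "https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi"

# Assumption: a GEO Series accession is "GSE" followed by one or more digits.
GSE_PATTERN = re.compile(r"\bGSE\d+\b")


def geo_series_url(accession):
    """Build the GEO query URL for a Series accession such as GSE204989."""
    if not GSE_PATTERN.fullmatch(accession):
        raise ValueError(f"not a GEO Series accession: {accession!r}")
    return f"{GEO_QUERY_URL}?{urlencode({'acc': accession})}"


def find_geo_series(text):
    """Return the distinct GSE accessions mentioned in text, in order."""
    return list(dict.fromkeys(GSE_PATTERN.findall(text)))


description = (
    "accessible through GEO Series accession number GSE204989 "
    "(https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE204989)"
)
for accession in find_geo_series(description):
    print(accession, geo_series_url(accession))
```

Validating the accession before building the URL keeps malformed identifiers from silently producing a link to an empty GEO page.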
Gene expression profiles of WT and alh-4(-) obtained by RNA-sequencing...
datasetcatalog.nlm.nih.gov
Updated Jul 8, 2021
Cite
Cheung, Tom H.; Preusch, Christopher B.; He, Gary J.; Qu, Jianan; Mak, Ho Yi; Zeng, Lidan; Xu, Ningyi; Li, Xuesong (2021). Gene expression profiles of WT and alh-4(-) obtained by RNA-sequencing (sheet 1) and the down-stream analysis. [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0000875184
Explore at:
Dataset updated
Jul 8, 2021
Authors
Cheung, Tom H.; Preusch, Christopher B.; He, Gary J.; Qu, Jianan; Mak, Ho Yi; Zeng, Lidan; Xu, Ningyi; Li, Xuesong
Description
1. Differentially expressed genes (DEGs) analysis (Sheet 2); 2. Gene ontology and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis (Sheet 3); 3. Motif discovery of the promoter region of DEGs (Sheet 4); 4. The transcription factor candidates and RNAi results (Sheet 5). The RNA-sequencing dataset has been deposited in NCBI’s Gene Expression Omnibus and is accessible through GEO Series accession number GSE162792 (https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE162792). (XLSX)
Biomarker Benchmark - GSE38958
search.datacite.org
Updated Oct 28, 2016
Cite
Anna Guyer; Stephen Piccolo (2016). Biomarker Benchmark - GSE38958 [Dataset]. http://doi.org/10.6084/m9.figshare.2069708
Explore at:
Unique identifier
https://doi.org/10.6084/m9.figshare.2069708
Dataset updated
Oct 28, 2016
Dataset provided by
Figshare (http://figshare.com/)
DataCite
Authors
Anna Guyer; Stephen Piccolo
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description

[NOTICE: This data set has been deprecated. Please see our new version of the data (and additional data sets) here: https://osf.io/mhk93 ]

"Idiopathic pulmonary fibrosis (IPF) is a specific form of chronic, progressive fibrosing interstitial disease of unknown cause. It remains impractical to conduct early diagnosis and predict IPF progression just based on gene expression information. Moreover, the relationship between gene expression and quantitative phenotypic value in IPF keeps controversial. To identify biomarkers to predict survival in IPF, we profiled protein-coding gene expression in peripheral blood mononuclear cells (PBMCs). We linked the gene expression level with the quantitative phenotypic variation in IPF, including diffusing capacity of the lung for carbon monoxide (DLCO) and forced vital capacity (FVC) percent predicted. In silico analyses on the expression profiles and quantitative phenotypic data allowed for the generation of a set of IPF molecular signature that predicted survival of IPF effectively."
http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE38958
We have included gene-expression data, the outcome (class) being predicted, and any clinical covariates. When gene-expression data were processed in multiple batches, we have provided batch information. Each data set is organized into a file set, where each contains all pertinent files for an individual dataset. The gene expression files have been normalized using both the SCAN and UPC methods using the SCAN.UPC package in Bioconductor (https://www.bioconductor.org/packages/release/bioc/html/SCAN.UPC.html). We summarized the data at the gene level using the BrainArray resource (http://brainarray.mbni.med.umich.edu/Brainarray/Database/CustomCDF/20.0.0/ensg.asp). We used Ensembl identifiers. The class, clinical, and batch data were hand curated to ensure consistency ("tidy data" formatting). In addition, the data files have been formatted to be imported easily into the ML-Flex machine learning package (http://mlflex.sourceforge.net/).
Repository for Single Cell RNA Sequencing Analysis of The EMT6 Dataset
data.niaid.nih.gov
Updated Nov 20, 2023
Cite
Hsu, Jonathan; Stoop, Allart (2023). Repository for Single Cell RNA Sequencing Analysis of The EMT6 Dataset [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_10011621
Explore at:
Dataset updated
Nov 20, 2023
Authors
Hsu, Jonathan; Stoop, Allart
Description
Table of Contents

1. Main Description
2. File Descriptions
3. Linked Files
4. Installation and Instructions

1. Main Description

This is the Zenodo repository for the manuscript titled "A TCR β chain-directed antibody-fusion molecule that activates and expands subsets of T cells and promotes antitumor activity." The code in the file marengo_code_for_paper_jan_2023.R was used to generate the figures from the single-cell RNA sequencing data. The following libraries are required to run the script:

Seurat, scRepertoire, ggplot2, stringr, dplyr, ggridges, ggrepel, ComplexHeatmap

2. File Descriptions

The code can be downloaded and opened in RStudio. The "marengo_code_for_paper_jan_2023.R" file contains all the code needed to reproduce the figures in the paper. The "Marengo_newID_March242023.rds" file is available at the following address: https://zenodo.org/badge/DOI/10.5281/zenodo.7566113.svg (Zenodo DOI: 10.5281/zenodo.7566113). The "all_res_deg_for_heat_updated_march2023.txt" file contains the unfiltered results from the DGE analysis, which were also used to create the DGE heatmap and volcano plots. The "genes_for_heatmap_fig5F.xlsx" file contains the genes included in the heatmap in Figure 5F.

3. Linked Files

This repository contains code for the analysis of a single-cell RNA-seq dataset. The dataset contains raw FASTQ files as well as the aligned files that were deposited in GEO. The "Rdata" or "Rds" file was deposited in Zenodo. Descriptions of the linked datasets are provided below:

Gene Expression Omnibus (GEO) ID: GSE223311 (https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE223311)

Title: Gene expression profile at single cell level of CD4+ and CD8+ tumor infiltrating lymphocytes (TIL) originating from the EMT6 tumor model from mSTAR1302 treatment.
Description: This submission contains the "matrix.mtx", "barcodes.tsv", and "genes.tsv" files for each replicate and condition, corresponding to the aligned files for single cell sequencing data.
Submission type: Private. In order to gain access to the repository, you must use a reviewer token (https://www.ncbi.nlm.nih.gov/geo/info/reviewer.html).

Sequence read archive (SRA) repository ID: SRX19088718 and SRX19088719

Title: Gene expression profile at single cell level of CD4+ and CD8+ tumor infiltrating lymphocytes (TIL) originating from the EMT6 tumor model from mSTAR1302 treatment.
Description: This submission contains the raw sequencing or .fastq.gz files, which are tab delimited text files.
Submission type: Private. In order to gain access to the repository, you must use a reviewer token (https://www.ncbi.nlm.nih.gov/geo/info/reviewer.html).

Zenodo DOI: 10.5281/zenodo.7566113 (https://zenodo.org/record/7566113#.ZCcmvC2cbrJ)

Title: A TCR β chain-directed antibody-fusion molecule that activates and expands subsets of T cells and promotes antitumor activity.
Description: This submission contains the "Rdata" or ".Rds" file, which is an R object file. This file is required to use the code.
Submission type: Restricted Access. In order to gain access to the repository, you must contact the author.

4. Installation and Instructions

The code included in this submission requires several essential packages, as listed above. Please follow these instructions for installation:

1. Ensure you have R version 4.1.2 or higher for compatibility.

2. Although it is not essential, you can use RStudio (version 2022.12.0+353) to access and execute the code.

3. Download the "Rdata" or ".Rds" file from Zenodo (https://zenodo.org/record/7566113#.ZCcmvC2cbrJ) (Zenodo DOI: 10.5281/zenodo.7566113).

4. Open RStudio (https://www.rstudio.com/tags/rstudio-ide/) or a similar integrated development environment (IDE) for R.

5. Set your working directory to where the following files are located:

   marengo_code_for_paper_jan_2023.R
   Install_Packages.R
   Marengo_newID_March242023.rds
   genes_for_heatmap_fig5F.xlsx
   all_res_deg_for_heat_updated_march2023.txt

   You can use the following code to set the working directory in R:

   setwd(directory)

6. Open the file titled "Install_Packages.R" and execute it in your R IDE. This script will attempt to install all the necessary packages and their dependencies, setting up an environment in which the code in "marengo_code_for_paper_jan_2023.R" can be executed.

7. Once the "Install_Packages.R" script has been successfully executed, restart RStudio or your IDE of choice.

8. Open the file "marengo_code_for_paper_jan_2023.R" in RStudio or your IDE of choice.

9. Execute the commands in "marengo_code_for_paper_jan_2023.R" in RStudio or your IDE of choice to generate the plots.
Genes Downregulated in the Lungs of PBS-treated DDAH1-transgenic Mice versus...
datasetcatalog.nlm.nih.gov
Updated Jan 10, 2014
Cite
Deng, Jingyuan; Figueroa, Julio A. Landero; Day, Brandy P.; Medvedovic, Mario; Hershey, Gurjit K. Khurana; Chen, Weiguo; Gibson, Aaron M.; Kinker, Kayla G.; Bass, Stacey A. (2014). Genes Downregulated in the Lungs of PBS-treated DDAH1-transgenic Mice versus PBS-treated Wild Type Mice. [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0001207496
Explore at:
Dataset updated
Jan 10, 2014
Authors
Deng, Jingyuan; Figueroa, Julio A. Landero; Day, Brandy P.; Medvedovic, Mario; Hershey, Gurjit K. Khurana; Chen, Weiguo; Gibson, Aaron M.; Kinker, Kayla G.; Bass, Stacey A.
Description
NCBI Gene Expression Omnibus accession number: GSE49047 (http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE49047).