https://choosealicense.com/no-permission/
Human RNA-Seq data set GSM2819712 stored in NCBI (GEO)
https://choosealicense.com/no-permission/
Human DNA methylation data stored in NCBI (GEO) Dataset GSM2819626; liver tissue sample 7137_CV_RRBS https://seek.lisym.org/samples/236
https://choosealicense.com/no-permission/
Human RNA-Seq data set GSM2819698 stored in NCBI (GEO)
Liver tissue sample: 6922_IZ_RNA
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
We analyzed the field of expression profiling by high-throughput sequencing (HT-seq) in terms of replicability and reproducibility, using data from the NCBI GEO (Gene Expression Omnibus) repository.
The archived dataset contains the following files:
- output/parsed_suppfiles.csv, p-value histograms, histogram classes, estimated proportion of true null hypotheses (pi0).
- output/document_summaries.csv, document summaries of NCBI GEO series
- output/publications.csv, publication info of NCBI GEO series
- output/scopus_citedbycount.csv, Scopus citation info of NCBI GEO series
- output/single-cell.csv, single-cell experiments
- spots.csv, NCBI SRA sequencing run metadata
- suppfilenames.txt, list of all supplementary file names of NCBI GEO submissions. One filename per row.
- suppfilenames_filtered.txt, list of supplementary file names used for downloading files from NCBI GEO. One filename per row.
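The pi0 value listed for output/parsed_suppfiles.csv is conventionally the proportion of true null hypotheses behind a set of p-values. As a rough illustration only, the sketch below applies a Storey-type fixed-threshold estimator. Whether the archived values were computed this way is not stated here, and the function name `estimate_pi0` and the default threshold `lam=0.5` are assumptions made for the example.

```python
def estimate_pi0(pvalues, lam=0.5):
    """Storey-type estimate of the proportion of true null hypotheses.

    Under the null hypothesis p-values are uniform on [0, 1], so the
    fraction of p-values above `lam`, divided by (1 - lam), approximates
    pi0. `lam` is an illustrative tuning choice, not a dataset value.
    """
    if not pvalues:
        raise ValueError("need at least one p-value")
    if not 0.0 <= lam < 1.0:
        raise ValueError("lam must lie in [0, 1)")
    above = sum(1 for p in pvalues if p > lam)
    pi0 = above / (len(pvalues) * (1.0 - lam))
    return min(pi0, 1.0)  # cap at 1 because pi0 is a proportion
```

A histogram dominated by small p-values yields a pi0 well below 1, while a flat (uniform) histogram yields a value close to 1.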
https://choosealicense.com/no-permission/
Human DNA methylation data set GSM2819642 stored in NCBI (GEO); liver tissue sample: 7012_IZ_RRBS
We introduce a new high-throughput transcriptomics (HTTr) platform comprising a collagen sandwich primary rat hepatocyte culture and the TempO-Seq assay for screening and prioritizing potential hepatotoxicants. We selected 14 chemicals based on their risk of drug-induced liver injury (DILI) and tested them in hepatocytes at two treatment concentrations. HTTr data were generated using the TempO-Seq whole transcriptome and S1500+ assays. The HTTr platform exhibited high reproducibility between technical replicates (r>0.9), but biological replication was greater for TempO-Seq S1500+ (r>0.85) than for the whole transcriptome (r>0.7). Reproducibility between biological replicates was dependent on the strength of transcriptional effects induced by a chemical treatment. Despite targeting a smaller number of genes, the S1500+ assay clustered chemical treatments and produced gene set enrichment analysis (GSEA) scores comparable to those of the whole transcriptome. Connectivity mapping showed a high level of reproducibility between TempO-Seq data and Affymetrix GeneChip data from the Open TG-GATES project, with high concordance between the S1500+ gene set and whole transcriptome. Taken together, our results provide guidance on selecting the number of technical and biological replicates and support the use of the TempO-Seq S1500+ assay in a high-throughput platform for screening hepatotoxicants. FASTQ files and read count data have been deposited in the National Center for Biotechnology Information Gene Expression Omnibus (GEO) (GSE152128).
This dataset is associated with the following publication: Lee, F., I. Shah, Y.T. Soong, J. Xing, I.C. Ng, F. Tasnim, and H. Yu. Reproducibility and Robustness of High-Throughput S1500+ Transcriptomics on Primary Rat Hepatocytes for Chemical-Induced Hepatotoxicity Assessment. Current Research in Toxicology. Elsevier B.V., Amsterdam, NETHERLANDS, 2: 282-295, (2021).
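The replicate reproducibility figures quoted above (r>0.9, r>0.85, r>0.7) are correlation coefficients. The abstract does not say which coefficient was used; the sketch below assumes a Pearson correlation between two replicate expression vectors, and the function name `pearson_r` is hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equally long expression vectors."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two vectors of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of cross-products and sums of squares around the means.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant vector")
    return cov / math.sqrt(var_x * var_y)
```

In practice the two vectors would hold the (typically log-transformed) read counts of the same genes in two replicates; that preprocessing choice is likewise an assumption.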
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
*NCBI Gene Expression Omnibus accession number; it can be used to retrieve the microarray experiment data via http://www.ncbi.nlm.nih.gov/geo/.
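A GEO accession can also be turned into a direct link to its accession display page. The sketch below assumes the `acc.cgi?acc=` URL pattern used by GEO accession display pages; the helper name `geo_accession_url` and the accepted prefixes (GSE, GSM, GPL, GDS) are choices made for this example.

```python
import re
from urllib.parse import urlencode

GEO_ACC_ENDPOINT = "https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi"
# Series, sample, platform and curated DataSet accessions.
_ACCESSION = re.compile(r"(GSE|GSM|GPL|GDS)\d+")


def geo_accession_url(accession):
    """Return the GEO accession display URL for an accession such as GSE152128."""
    acc = accession.strip().upper()
    if not _ACCESSION.fullmatch(acc):
        raise ValueError(f"not a recognised GEO accession: {accession!r}")
    return f"{GEO_ACC_ENDPOINT}?{urlencode({'acc': acc})}"
```

Validating the accession before building the URL keeps malformed identifiers from producing links that silently resolve to a GEO error page.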
https://choosealicense.com/no-permission/
Human DNA methylation data set stored in NCBI (GEO); data set GSM2819620
GSM2819620_CV_6610_207 7344_PP_RNA; Homo sapiens; RNA-Seq SEEK Sample: 6610_CV_RRBS
https://seek.lisym.org/samples/130
This link contains 3 files:
- GSM2819620_CV_6610_207.MCSv3.20161129.hs37.cpg.filtered.CG.bed.gz, 109.4 Mb, BED
- GSM2819620_CV_6610_207.MCSv3.20161129.hs37.cpg.filtered.CG.bw, 85.6 Mb, BW
- GSM2819620_CV_6610_207.MCSv3.20161129.hs37.cpg.filtered.CG.ct_coverage.bw, 86.8 Mb, BW
Functional genomics data repository supporting MIAME-compliant data submissions. Includes microarray-based experiments measuring the abundance of mRNA, genomic DNA, and protein molecules, as well as non-array-based technologies such as serial analysis of gene expression (SAGE) and mass spectrometry proteomic technology. Array- and sequence-based data are accepted. Collection of curated gene expression DataSets, as well as original Series and Platform records. The database can be searched using keywords, organism, DataSet type and authors. DataSet records contain additional resources including cluster tools and differential expression queries.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Summary of gene expression profiling studies containing gene expression profiles for six datasets.
Attribution-NonCommercial 4.0 (CC BY-NC 4.0) https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
Multiple myeloma (MM) is a common hematological malignancy with poorly understood recurrence and relapse mechanisms. Notably, bortezomib resistance leading to relapse makes MM treatment significantly challenging. To clarify the drug resistance mechanism, we employed a quantitative proteomics approach to identify differentially expressed protein candidates implicated in bortezomib-resistant recurrent and relapsed MM (RRMM). Bone marrow aspirates from five patients newly diagnosed with MM (NDMM) were compared with those from five patients diagnosed with bortezomib-resistant RRMM using tandem mass tag-mass spectrometry (TMT-MS). Subcellular localization and functional classification of the differentially expressed proteins were determined by gene ontology, Kyoto Encyclopedia of Genes and Genomes pathway, and hierarchical clustering analyses. The top candidates identified were validated with parallel reaction monitoring (PRM) analysis using tissue samples from 11 NDMM and 8 RRMM patients, followed by comparison with the NCBI Gene Expression Omnibus (GEO) dataset of 10 MM patients and 10 healthy controls (accession no.: GSE80608). Thirty-four differentially expressed proteins in RRMM, including proteinase inhibitor 9 (SERPINB9), were identified by TMT-MS. Subsequent functional enrichment analyses of the identified protein candidates indicated their involvement in regulating cellular metabolism, apoptosis, programmed cell death, lymphocyte-mediated immunity, and defense response pathways in RRMM. The top protein candidate SERPINB9 was confirmed by PRM analysis and western blotting as well as by comparison with an NCBI GEO dataset. We elucidated the proteome landscape of bortezomib-resistant RRMM and identified SERPINB9 as a promising novel therapeutic target. Our results provide a resource for future studies on the mechanism of RRMM.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The third column reports the genes in the overlapping gene sets. The fourth column gives the false discovery rate (FDR) analog of the hypergeometric p-value after correction for multiple hypothesis testing according to Benjamini and Hochberg [43]. The table shows the top three significant enrichments.
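To make the FDR column concrete, the following sketch computes an upper-tail hypergeometric p-value for a gene-set overlap and then applies the Benjamini-Hochberg adjustment to a list of such p-values. It is a generic textbook formulation, not the code used to build the table; the function names and the example numbers in the usage note are made up for illustration.

```python
from math import comb


def hypergeom_upper_tail(population, successes, draws, overlap):
    """P(X >= overlap) when `draws` genes are sampled without replacement
    from `population` genes, of which `successes` belong to the gene set."""
    upper = min(successes, draws)
    favourable = sum(
        comb(successes, i) * comb(population - successes, draws - i)
        for i in range(max(overlap, 0), upper + 1)
    )
    return favourable / comb(population, draws)


def benjamini_hochberg(pvalues):
    """Return BH-adjusted p-values in the same order as the input."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so each adjusted value is the
    # minimum of p * m / rank over all ranks at or above the current one.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted
```

For example, `benjamini_hochberg([hypergeom_upper_tail(20000, 150, 300, 12), ...])` would adjust the overlap p-values of all tested gene sets together; the numbers 20000, 150, 300 and 12 are placeholders.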
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
NCBI accession numbers and related metadata from a study of transcriptomic response of Emiliania huxleyi to 2-heptyl-4-quinolone (HHQ). Sequences from this study are available at the NCBI GEO under accession series GSE131846 https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?&acc=GSE131846
https://spdx.org/licenses/CC0-1.0.html
Background
RNA-seq is a widely adopted affordable method for large scale gene expression profiling. However, user-friendly and versatile tools for wet-lab biologists to analyse RNA-seq data beyond standard analyses such as differential expression, are rare. Especially, the analysis of time-series data is difficult for wet-lab biologists lacking advanced computational training. Furthermore, most meta-analysis tools are tailored for model organisms and not easily adaptable to other species.
Results
With RNfuzzyApp, we provide a user-friendly, web-based R-shiny app for differential expression analysis, as well as time-series analysis of RNA-seq data. RNfuzzyApp offers several methods for normalization and differential expression analysis of RNA-seq data, providing easy-to-use toolboxes, interactive plots and downloadable results. For time-series analysis, RNfuzzyApp presents the first web-based, automated pipeline for soft clustering with the Mfuzz R package, including methods to aid in cluster number selection, Mfuzz loop computations, cluster overlap analysis, as well as cluster enrichments.
Conclusion
RNfuzzyApp is an intuitive, easy to use and interactive R shiny app for RNA-seq differential expression and time-series analysis, offering a rich selection of interactive plots, providing a quick overview of raw data and generating rapid analysis results. Furthermore, its orthology assignment, enrichment analysis, as well as ID conversion functions are accessible to non-model organisms.
Methods
Haering_etal_extendedData_DmdevLeg_GSE143430_mean.txt: mean values calculated from raw reads of replicates, downloaded from Gene Expression Omnibus (dataset GSE143430 https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE143430).
Haering_etal_extendedDatatable_1a_Tabulamurissenis_3vs12m_DEA.txt: Tabula muris senis limb muscle data (https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE132040) from 3-, 12- and 27-month-old males, processed with RNfuzzyApp (https://gitlab.com/habermann_lab/rna-seq-analysis-app)
Haering_etal_extendedDatatable_1b_Tabulamurissenis_3vs27m_DEA.txt: Tabula muris senis limb muscle data (https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE132040) from 3-, 12- and 27-month-old males, processed with RNfuzzyApp (https://gitlab.com/habermann_lab/rna-seq-analysis-app)
Haering_etal_extendedDatatable_1c_Tabulamurissenis_12vs27m_DEA.txt: Tabula muris senis limb muscle data (https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE132040) from 3-, 12- and 27-month-old males, processed with RNfuzzyApp (https://gitlab.com/habermann_lab/rna-seq-analysis-app)
Haering_etal_extendedDatatable_1d_Tabulamurissenis_3vs12m_gpofiler.txt: Tabula muris senis limb muscle data (https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE132040) from 3-, 12- and 27-month-old males, processed with RNfuzzyApp (https://gitlab.com/habermann_lab/rna-seq-analysis-app)
Haering_etal_extendedDatatable_1e_Tabulamurissenis_3vs27m_gpofiler.txt: Tabula muris senis limb muscle data (https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE132040) from 3-, 12- and 27-month-old males, processed with RNfuzzyApp (https://gitlab.com/habermann_lab/rna-seq-analysis-app)
Haering_etal_extendedDatatable_1f_Tabulamurissenis_12vs27m_gpofiler.txt: Tabula muris senis limb muscle data (https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE132040) from 3-, 12- and 27-month-old males, processed with RNfuzzyApp (https://gitlab.com/habermann_lab/rna-seq-analysis-app)
Haering_etal_extendedDatatable_2a_Tabulamurissenis_cluster1_gpofiler.txt: Tabula muris senis limb muscle data (https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE132040) from 3-, 12- and 27-month-old males, processed with RNfuzzyApp (https://gitlab.com/habermann_lab/rna-seq-analysis-app)
Haering_etal_extendedDatatable_2b_Tabulamurissenis_cluster2_gpofiler.txt: Tabula muris senis limb muscle data (https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE132040) from 3-, 12- and 27-month-old males, processed with RNfuzzyApp (https://gitlab.com/habermann_lab/rna-seq-analysis-app)
Haering_etal_extendedDatatable_2c_Tabulamurissenis_cluster3_gpofiler.txt: Tabula muris senis limb muscle data (https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE132040) from 3-, 12- and 27-month-old males, processed with RNfuzzyApp (https://gitlab.com/habermann_lab/rna-seq-analysis-app)
Haering_etal_extendedDatatable_2d_Tabulamurissenis_cluster4_gpofiler.txt: Tabula muris senis limb muscle data (https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE132040) from 3-, 12- and 27-month-old males, processed with RNfuzzyApp (https://gitlab.com/habermann_lab/rna-seq-analysis-app)
Haering_etal_extendedDatatable_2e_Tabulamurissenis_cluster5_gpofiler.txt: Tabula muris senis limb muscle data (https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE132040) from 3-, 12- and 27-month-old males, processed with RNfuzzyApp (https://gitlab.com/habermann_lab/rna-seq-analysis-app)
Haering_etal_extendedDatatable_3a_DmLeg_cluster1_gpofiler.txt: Haering_etal_extendedData_DmdevLeg_GSE143430_mean.txt processed with RNfuzzyApp (https://gitlab.com/habermann_lab/rna-seq-analysis-app)
Haering_etal_extendedDatatable_3b_DmLeg_cluster2_gpofiler.txt: Haering_etal_extendedData_DmdevLeg_GSE143430_mean.txt processed with RNfuzzyApp (https://gitlab.com/habermann_lab/rna-seq-analysis-app)
Haering_etal_extendedDatatable_3c_DmLeg_cluster3_gpofiler.txt: Haering_etal_extendedData_DmdevLeg_GSE143430_mean.txt processed with RNfuzzyApp (https://gitlab.com/habermann_lab/rna-seq-analysis-app)
Haering_etal_extendedDatatable_3d_DmLeg_cluster4_gpofiler.txt: Haering_etal_extendedData_DmdevLeg_GSE143430_mean.txt processed with RNfuzzyApp (https://gitlab.com/habermann_lab/rna-seq-analysis-app)
Haering_etal_extendedDatatable_3e_DmLeg_cluster5_gpofiler.txt: Haering_etal_extendedData_DmdevLeg_GSE143430_mean.txt processed with RNfuzzyApp (https://gitlab.com/habermann_lab/rna-seq-analysis-app)
Haering_etal_extendedDatatable_3f_DmLeg_cluster6_gpofiler.txt: Haering_etal_extendedData_DmdevLeg_GSE143430_mean.txt processed with RNfuzzyApp (https://gitlab.com/habermann_lab/rna-seq-analysis-app)
Haering_etal_extendedDatatable_3g_DmLeg_cluster7_gpofiler.txt: Haering_etal_extendedData_DmdevLeg_GSE143430_mean.txt processed with RNfuzzyApp (https://gitlab.com/habermann_lab/rna-seq-analysis-app)
Haering_etal_extendedDatatable_3h_DmLeg_cluster8_gpofiler.txt: Haering_etal_extendedData_DmdevLeg_GSE143430_mean.txt processed with RNfuzzyApp (https://gitlab.com/habermann_lab/rna-seq-analysis-app)
Haering_etal_extendedDatatable_3i_DmLeg_cluster9_gpofiler.txt: Haering_etal_extendedData_DmdevLeg_GSE143430_mean.txt processed with RNfuzzyApp (https://gitlab.com/habermann_lab/rna-seq-analysis-app)
Haering_etal_extendedDatatable_3j_DmLeg_cluster10_gpofiler.txt: Haering_etal_extendedData_DmdevLeg_GSE143430_mean.txt processed with RNfuzzyApp (https://gitlab.com/habermann_lab/rna-seq-analysis-app)
Haering_etal_extendedDatatable_3k_DmLeg_cluster11_gpofiler.txt: Haering_etal_extendedData_DmdevLeg_GSE143430_mean.txt processed with RNfuzzyApp (https://gitlab.com/habermann_lab/rna-seq-analysis-app)
Haering_etal_extendedDatatable_3l_DmLeg_cluster12_gpofiler.txt: Haering_etal_extendedData_DmdevLeg_GSE143430_mean.txt processed with RNfuzzyApp (https://gitlab.com/habermann_lab/rna-seq-analysis-app)
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Tuberculosis (TB) is among the leading causes of death by infectious diseases. An epidemiological association between Mycobacterium tuberculosis infection and autoimmune diseases like rheumatoid arthritis (RA) has been reported but it remains unclear if there is a causal relationship, and if so, which molecular pathways and regulatory mechanisms contribute to it. Here we used a computational biology approach by global gene expression meta-analysis to identify candidate genes and pathways that may link TB and RA. Data were collected from public expression databases such as NCBI GEO. Studies were selected that analyzed mRNA-expression in whole blood or blood cell populations in human case control studies at comparable conditions. Six TB and RA datasets (41 active TB patients, 33 RA patients, and 67 healthy controls) were included in the downstream analysis. This approach allowed the identification of deregulated genes that had not been identified in the single analysis of TB or RA patients and that were co-regulated in TB and RA patients compared to healthy subjects. The genes encoding TLR5, TNFSF10/TRAIL, PPP1R16B/TIMAP, SIAH1, PIK3IP1, and IL17RA were among the genes that were most significantly deregulated in TB and RA. Pathway enrichment analysis revealed ‘T cell receptor signaling pathway’, ‘Toll-like receptor signaling pathway,’ and ‘virus defense related pathways’ among the pathways most strongly associated with both diseases. The identification of a common gene signature and pathways substantiates the observation of an epidemiological association of TB and RA and provides clues on the mechanistic basis of this association. Newly identified genes may be a basis for future functional and epidemiological studies.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The most significant biomarker from the analyzed datasets.
Table of Contents
1. Main Description
---------------------------
This is the Zenodo repository for the manuscript titled "A TCR β chain-directed antibody-fusion molecule that activates and expands subsets of T cells and promotes antitumor activity." The code included in the file titled `marengo_code_for_paper_jan_2023.R` was used to generate the figures from the single-cell RNA sequencing data.
The following libraries are required for script execution:
File Descriptions
---------------------------
Linked Files
---------------------
This repository contains code for the analysis of a single-cell RNA-seq dataset. The dataset contains raw FASTQ files as well as the aligned files, which were deposited in GEO. The "Rdata" or "Rds" file was deposited in Zenodo. Descriptions of the linked datasets are provided below:
Gene Expression Omnibus (GEO) ID: GSE223311 (https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE223311)
Sequence Read Archive (SRA) repository ID: SRX19088718 and SRX19088719
Zenodo DOI: 10.5281/zenodo.7566113 (https://zenodo.org/record/7566113#.ZCcmvC2cbrJ)
Installation and Instructions
--------------------------------------
The code included in this submission requires several essential packages, as listed above. Please follow these instructions for installation:
> Ensure you have R version 4.1.2 or higher for compatibility.
> Although it is not essential, you can use RStudio (version 2022.12.0+353) to access and execute the code.
1. Download the "Rdata" or ".Rds" file from Zenodo (https://zenodo.org/record/7566113#.ZCcmvC2cbrJ) (Zenodo DOI: 10.5281/zenodo.7566113).
2. Open RStudio (https://www.rstudio.com/tags/rstudio-ide/) or a similar integrated development environment (IDE) for R.
3. Set your working directory to where the following files are located:
You can use the following code to set the working directory in R:
> setwd(directory)
4. Open the file titled "Install_Packages.R" and execute it in your R IDE. This script will attempt to install all the necessary packages and their dependencies in order to set up an environment where the code in "marengo_code_for_paper_jan_2023.R" can be executed.
5. Once the "Install_Packages.R" script has been successfully executed, restart RStudio or your IDE of choice.
6. Open the file "marengo_code_for_paper_jan_2023.R" in RStudio or your IDE of choice.
7. Execute the commands in the file titled "marengo_code_for_paper_jan_2023.R" in RStudio or your IDE of choice to generate the plots.
Affymetrix Bovine GeneChip® Gene 1.0 ST Array RNA expression analysis was performed on four somatic ovarian cell types: the granulosa cells (GCs) and theca cells (TCs) of the dominant follicle and the large luteal cells (LLCs) and small luteal cells (SLCs) of the corpus luteum. The normalized linear microarray data were deposited in the NCBI GEO repository (GSE83524). Subsequent ANOVA determined genes that were enriched (≥2 fold more) or decreased (≤−2 fold less) in one cell type compared to all three other cell types, and these analyzed and filtered datasets are presented as tables. Genes that shared enriched expression in both follicular cell types (GCs and TCs) or in both luteal cell types (LLCs and SLCs) are also reported in tables. The standard deviation of the analyzed array data in relation to the log of the expression values is shown as a figure. These data have been further analyzed and interpreted in the companion article "Gene expression profiling of ovarian follicular and luteal cells provides insight into cellular identities and functions", Romereim et al., (2017) Mol. Cell. Endocrinol. 439:379-394. https://doi.org/10.1016/j.mce.2016.09.029
Resources in this dataset: Resource Title: RNA Expression Data from Four Isolated Bovine Ovarian Somatic Cell Types. File Name: Web Page, url: https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE83524 NCBI Gene Expression Omnibus (GEO) Accession Display. Analysis of the RNA present in each bovine cell type using Affymetrix microarrays yielded new cell-specific genetic markers, functional insight into the behavior of each cell type via Gene Ontology Annotations and Ingenuity Pathway Analysis, and evidence of small and large luteal cell lineages using Principal Component Analysis. Enriched expression of select genes for each cell type was validated by qPCR. This expression analysis offers insight into the lineage and differentiation process that transforms somatic follicular cells into luteal cells.
The original Affymetrix .CEL files and the normalized linear expression data are included in this submission.
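The enrichment criterion described above (expression at least 2-fold higher in one cell type than in each of the three others) can be sketched as a simple filter on linear expression values. This covers only the fold-change threshold, not the ANOVA step, and the function name, data layout and gene names are hypothetical.

```python
def enriched_genes(expression, target, min_fold=2.0):
    """Return genes whose linear expression in `target` is at least
    `min_fold` times the expression in every other cell type.

    `expression` maps gene -> {cell_type: linear expression value}.
    """
    hits = []
    for gene, by_type in expression.items():
        others = [value for cell, value in by_type.items() if cell != target]
        if others and all(by_type[target] >= min_fold * v for v in others):
            hits.append(gene)
    return hits


# Made-up values for two placeholder genes across the four cell types.
example = {
    "geneA": {"GC": 100.0, "TC": 40.0, "LLC": 30.0, "SLC": 20.0},
    "geneB": {"GC": 50.0, "TC": 40.0, "LLC": 30.0, "SLC": 20.0},
}
gc_enriched = enriched_genes(example, "GC")
```

Comparing the target against every other cell type individually, rather than against their mean, matches the "compared to all three other cell types" wording.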
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The data discussed in this publication have been deposited in NCBI's Gene Expression Omnibus (GEO) and are accessible through GEO Series accession number GSE204989 (https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE204989).
Sequence Read Archive (SRA) data, BioSamples, and GEO holdings can be accessed from the NCBI BioProject PRJNA843039 (http://www.ncbi.nlm.nih.gov/bioproject/PRJNA843039).
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The best-performing pipeline for each modelling strategy is highlighted in bold.