https://rightsstatements.org/vocab/UND/1.0/
Modern humans spend most of their time having eaten recently. The purpose of the current project is to understand how the blood, which contains immune cells, responds in the hours after eating a meal that is moderately high in fat. We used a sequencing method to observe the expression of all the genes in blood cells in five participants who were each fed a high-fat meal on three separate days. The results are reported in the manuscript, “Temporal changes in postprandial blood transcriptomes reveal subject-specific pattern of expression of innate immunity genes after a high-fat meal.” Overall design: Five participants were each fed a high-fat meal on three separate days, resulting in 45 whole blood transcriptomes. For each sample, 3 mL of venous whole blood was drawn into a Tempus Blood RNA tube, shaken vigorously, and then frozen at -80°C until use. Total RNA was purified with the Tempus Spin RNA Isolation Kit with minor modifications to the manufacturer’s protocol. To remove residual genomic DNA, RNA samples were treated on-column with RNase-Free DNase per the manufacturer’s instructions. RNA quantity, quality, and integrity were assessed with a NanoDrop 1000 and a 2100 Bioanalyzer. All isolated RNA had A260/A280 ratios greater than 2 and RNA integrity numbers higher than 7.3. RNA-Seq libraries were constructed at the DNA Technologies and Expression Core at the University of California, Davis, using the Ovation Human Blood RNA-Seq Library System (NuGEN Technologies). Sequencing was performed in a 2x100bp format with 45 samples multiplexed on 3 lanes on an Illumina HiSeq 4000. Analysis of the data is reported in the same manuscript.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
We analysed the field of expression profiling by high throughput sequencing, or HT-seq, in terms of replicability and reproducibility, using data from the NCBI GEO (Gene Expression Omnibus) repository.
- This release includes GEO series up to Dec-31, 2020;
- Fixed the missing optional xlrd dependency, which affected the import of some xls files. Previously we were using only openpyxl (thanks to an anonymous reviewer);
- All files in supplementary _RAW.tar files were checked for p values. Previously, _RAW.tar files were omitted completely (thanks to an anonymous reviewer).
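The import fix noted above can be illustrated with a minimal sketch. It is not the project's actual code: the helper name `pick_excel_engine` is hypothetical, and the only assumption is the usual pandas convention that legacy .xls files are read with the `xlrd` engine while .xlsx files are read with `openpyxl`.

```python
from pathlib import Path

# Hypothetical lookup table: file extension -> pandas reader engine name.
ENGINES = {".xls": "xlrd", ".xlsx": "openpyxl", ".xlsm": "openpyxl"}


def pick_excel_engine(filename):
    """Return the engine name for a spreadsheet file, or None for other files."""
    suffix = Path(filename.lower()).suffix
    return ENGINES.get(suffix)
```

With pandas installed, the result could be passed along as `pandas.read_excel(path, engine=pick_excel_engine(path))`, so that old-format .xls supplementary files are no longer skipped when only openpyxl is available.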
The archived dataset contains the following files:
- output/parsed_suppfiles.csv, p-value histograms, histogram classes, estimated proportion of true null hypotheses (pi0)
- output/document_summaries.csv, document summaries of NCBI GEO series
- output/publications.csv, publication info of NCBI GEO series
- output/scopus_citedbycount.csv, Scopus citation info of NCBI GEO series
- output/single-cell.csv, single cell experiments
- spots.csv, NCBI SRA sequencing run metadata
- suppfilenames.txt, list of all supplementary file names of NCBI GEO submissions. One filename per row.
- suppfilenames_filtered.txt, list of supplementary file names used for downloading files from NCBI GEO. One filename per row.
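To make the p-value histogram and pi0 columns more concrete, here is a minimal sketch of how such quantities can be derived from a list of p-values. The source does not say which pi0 estimator was used; the fixed-threshold Storey-type estimator below is an assumption chosen for illustration, and both function names are hypothetical.

```python
def pvalue_histogram(pvalues, bins=20):
    """Count p-values in equal-width bins covering [0, 1]."""
    counts = [0] * bins
    for p in pvalues:
        if not 0.0 <= p <= 1.0:
            raise ValueError(f"p-value out of range: {p}")
        # A p-value of exactly 1.0 is placed in the last bin.
        index = min(int(p * bins), bins - 1)
        counts[index] += 1
    return counts


def estimate_pi0(pvalues, lam=0.5):
    """Storey-type estimate: pi0 = #{p > lam} / (m * (1 - lam)), capped at 1."""
    m = len(pvalues)
    if m == 0:
        raise ValueError("no p-values given")
    above = sum(1 for p in pvalues if p > lam)
    return min(1.0, above / (m * (1.0 - lam)))
```

The idea behind the estimator is that p-values from true null hypotheses are uniformly distributed, so the density of p-values above the threshold `lam` reflects the share of nulls. A flat histogram gives pi0 close to 1, while a histogram with a strong peak near zero gives a smaller pi0.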
Gene Expression Omnibus is a public functional genomics data repository supporting MIAME-compliant submissions of array- and sequence-based data. Tools are provided to help users query and download experiments and curated gene expression profiles.
This template is for recording genome data from the NimbleGen platform. This template was taken from the GEO website (http://www.ncbi.nlm.nih.gov/geo/info/spreadsheet.html) and modified to conform to the SysMO-JERM (Just enough Results Model) for transcriptomics. Using these templates will mean easier submission to GEO/ArrayExpress and greater consistency of data in SEEK.
This is the GitHub repository for the single cell RNA sequencing data analysis for the human manuscript.

The following essential libraries are required for script execution:
- Seurat
- scRepertoire
- ggplot2
- dplyr
- ggridges
- ggrepel
- ComplexHeatmap

Linked Files
--------------------------------------
This repository contains code for the analysis of a single cell RNA-seq dataset. The dataset contains raw FASTQ files as well as the aligned files that were deposited in GEO. Provided below are descriptions of the linked datasets:

1. Gene Expression Omnibus (GEO) ID: GSE229626
- Title: Gene expression profile at single cell level of human T cells stimulated via antibodies against the T Cell Receptor (TCR)
- Description: This submission contains the matrix.mtx, barcodes.tsv, and genes.tsv files for each replicate and condition, corresponding to the aligned files for single cell sequencing data.
- Submission type: Private. To gain access to the repository, you must use a "reviewer token" (https://www.ncbi.nlm.nih.gov/geo/info/reviewer.html).

2. Sequence Read Archive (SRA) repository
- Title: Gene expression profile at single cell level of human T cells stimulated via antibodies against the T Cell Receptor (TCR)
- Description: This submission contains the raw sequencing (.fastq.gz) files, which are gzip-compressed text files of sequencing reads.
- Submission type: Private. To gain access to the repository, you must use a "reviewer token" (https://www.ncbi.nlm.nih.gov/geo/info/reviewer.html).

Please note that since the GSE submission is private, the raw data deposited at SRA may not be accessible until the embargo on GSE229626 has been lifted.

Installation and Instructions
--------------------------------------
The code included in this submission requires several essential packages, as listed above. Please follow these instructions for installation:
- Ensure you have R version 4.1.2 or higher for compatibility.
- Although it is not essential, you can use RStudio (version 2022.12.0+353) for accessing and executing the code.

The following code can be used to set the working directory in R:
setwd(directory)

Steps:
1. Download the "Human_code_April2023.R" and "Install_Packages.R" R scripts, and the processed data from GSE229626.
2. Open RStudio (https://www.rstudio.com/tags/rstudio-ide/) or a similar integrated development environment (IDE) for R.
3. Set your working directory to where the following files are located:
- Human_code_April2023.R
- Install_Packages.R
4. Open the file titled Install_Packages.R and execute it in the R IDE. This script will attempt to install all the necessary packages and their dependencies.
5. Open the Human_code_April2023.R script and execute commands as necessary.
GEO (Gene Expression Omnibus) is a public functional genomics data repository supporting MIAME-compliant data submissions. There are also tools provided to help users query and download experiments and curated gene expression profiles.
Table of Contents
- Main Description
- File Descriptions
- Linked Files
- Installation and Instructions
This is the Zenodo repository for the manuscript titled "A TCR β chain-directed antibody-fusion molecule that activates and expands subsets of T cells and promotes antitumor activity." The code included in the file titled marengo_code_for_paper_jan_2023.R was used to generate the figures from the single-cell RNA sequencing data.
The following libraries are required for script execution:
Seurat, scRepertoire, ggplot2, stringr, dplyr, ggridges, ggrepel, ComplexHeatmap
The code can be downloaded and opened in RStudio. The "marengo_code_for_paper_jan_2023.R" file contains all the code needed to reproduce the figures in the paper. The "Marengo_newID_March242023.rds" file is available at the following address: https://zenodo.org/badge/DOI/10.5281/zenodo.7566113.svg (Zenodo DOI: 10.5281/zenodo.7566113). The "all_res_deg_for_heat_updated_march2023.txt" file contains the unfiltered results from the DGE analysis, which were also used to create the DGE heatmap and volcano plots. The "genes_for_heatmap_fig5F.xlsx" file contains the genes included in the heatmap in figure 5F.
This repository contains code for the analysis of a single cell RNA-seq dataset. The dataset contains raw FASTQ files as well as the aligned files that were deposited in GEO. The "Rdata" or "Rds" file was deposited in Zenodo. Provided below are descriptions of the linked datasets:
Gene Expression Omnibus (GEO) ID: GSE223311 (https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE223311)
Title: Gene expression profile at single cell level of CD4+ and CD8+ tumor infiltrating lymphocytes (TIL) originating from the EMT6 tumor model from mSTAR1302 treatment.
Description: This submission contains the "matrix.mtx", "barcodes.tsv", and "genes.tsv" files for each replicate and condition, corresponding to the aligned files for single cell sequencing data.
Submission type: Private. In order to gain access to the repository, you must use a reviewer token (https://www.ncbi.nlm.nih.gov/geo/info/reviewer.html).
Sequence read archive (SRA) repository ID: SRX19088718 and SRX19088719
Title: Gene expression profile at single cell level of CD4+ and CD8+ tumor infiltrating lymphocytes (TIL) originating from the EMT6 tumor model from mSTAR1302 treatment.
Description: This submission contains the raw sequencing (.fastq.gz) files, which are gzip-compressed text files of sequencing reads.
Submission type: Private. In order to gain access to the repository, you must use a reviewer token (https://www.ncbi.nlm.nih.gov/geo/info/reviewer.html).
Zenodo DOI: 10.5281/zenodo.7566113 (https://zenodo.org/record/7566113#.ZCcmvC2cbrJ)
Title: A TCR β chain-directed antibody-fusion molecule that activates and expands subsets of T cells and promotes antitumor activity.
Description: This submission contains the "Rdata" or ".Rds" file, which is an R object file. This file is required to run the code.
Submission type: Restricted Access. In order to gain access to the repository, you must contact the author.
The code included in this submission requires several essential packages, as listed above. Please follow these instructions for installation:
Ensure you have R version 4.1.2 or higher for compatibility.
Although it is not essential, you can use RStudio (version 2022.12.0+353) for accessing and executing the code.
Set your working directory to where the following files are located:
- marengo_code_for_paper_jan_2023.R
- Install_Packages.R
- Marengo_newID_March242023.rds
- genes_for_heatmap_fig5F.xlsx
- all_res_deg_for_heat_updated_march2023.txt
You can use the following code to set the working directory in R:
setwd(directory)
PubMed Central reuse of GEO datasets deposited in 2007
This is the raw data behind the analysis. It contains one row for every mention of a 2007 GEO dataset in PubMed Central. Each row identifies the mentioned GEO dataset, the PubMed Central article that mentions the dataset's accession number, whether the authors of the dataset and the attributing article overlap, and whether this is considered an instance of third-party data reuse.
PMC_reuse_of_2007_GEO_datasets.csv
Aggregate Table Data
Aggregate table data behind the figures and results in the README associated with the main dataset. Includes baseline metrics used for extrapolating PubMed Central (PMC) results to PubMed, the number of mentions of a 2007 GEO dataset by authors who submitted the dataset, and the number of mentions of a dataset by authors who DID NOT submit the dataset across 2007-2010.
tables.csv
Funding agencies are reluctant to support data archiving, even though large research funders such as the National Science Foundation (NSF) and the National Institutes of Health acknowledge its importance for scientific progress. Our quantitative estimates of data reuse indicate that ongoing financial investment in data-archiving infrastructure provides a high scientific return.
Administration of public lands includes controlling rights of access, surface rights and subsurface rights, or mineral rights. Public lands are administered for all Albertans through the issuance of dispositions. A disposition must be obtained for any access to or activity on public lands. Applicants for a disposition must submit the appropriate plan type that meets the requirements for the activity and purpose of the disposition being applied for. Disposition plans are submitted digitally, and digital plan submissions are to be appropriately geo-referenced.
For the 2021 redistricting cycle, members of the public could use ESRI's online redistricting application to submit their own Congressional and legislative plans for consideration by the Oregon Redistricting Committees. The deadline for submissions was September 8, 2021 at 5 PM Oregon time. This .zip file contains shapefiles for each of the submissions and any documentation submitters provided with their plans. Users submitted 59 congressional plans, 10 Oregon house plans, and 8 Oregon senate plans.
Oregon Redistricting Website: https://www.oregonlegislature.gov/redistricting/
GIS Downloads: Oregon Redistricting (arcgis.com)
This proposal addressed the theme of “impact of oil spills on public health”. Specifically, the proposal addressed the general hypothesis that oil/dispersant respiratory exposure leads to a higher carcinogenic potential of lung tissue.
To test this hypothesis, we profiled and confirmed the existence of molecular signatures of carcinogenesis through RNA-seq analysis of a mouse model treated with instilled oil/dispersants. We exposed wild-type C57BL/6 (B6) mice to BP crude oil, dispersant 9500, dispersant 9527, oil + 9527, oil + 9500 and H2O (as control) by intratracheal instillation for 2 weeks. We then performed RNA-seq analysis of the lung tissue from the mice to identify differentially expressed genes (DEGs) in the treated mice vs. the control mice. These DEGs were functionally annotated to search for GO terms and pathways related to carcinogenesis.
For each treatment group, 3 male and 3 female mice were used. Therefore, we generated RNA-seq data for a total of 36 animals (6 animals/group x 6 treatment groups).
We have submitted the RNA-seq data to NCBI's GEO (Gene Expression Omnibus) online database (https://www.ncbi.nlm.nih.gov/geo/). The dataset has been assigned the GEO series number GSE137204.
On the GEO website, RNA-seq data are organized under three types: the metadata, the processed data and the raw data files. The metadata describes the treatment group and other information related to a sample. The processed data files contain raw counts of sequencing reads for transcripts. The raw data files are the raw fastq data files generated in the RNA-seq experiments.
This dataset supports the publication: Liu, Yao-Zhong; Charles A. Miller; Yan Zhuang; Sudurika S. Mukhopadhyay; Shigeki Saito; Edward B. Overton; and Gilbert F. Morris. 2020. The Impact of the Deepwater Horizon Oil Spill upon Lung Health—Mouse Model-Based RNA-Seq Analyses. International Journal of Environmental Research and Public Health, 17(15), 5466. doi:10.3390/ijerph17155466
The C2H2 zinc finger is the most prevalent DNA-binding motif in the mammalian proteome, with DNA-binding domains usually containing more tandem fingers than are needed for stable sequence-specific DNA recognition. To examine the reason for the frequent presence of multiple zinc fingers, we generated mice lacking finger 1 or finger 4 of the 4-finger DNA-binding domain of Ikaros, a critical regulator of lymphopoiesis and leukemogenesis. Each mutant strain exhibited a specific subset of the phenotypes observed with Ikaros null mice. Of particular relevance, fingers 1 and 4 contributed to distinct stages of B- and T-cell development and finger 4 was selectively required for tumor suppression in thymocytes and in a new model of BCR-ABL+ acute lymphoblastic leukemia. These results, combined with transcriptome profiling (this GEO submission: RNA-Seq of whole thymus from wt and the two ZnF mutants), reveal that different subsets of fingers within multi-finger transcription factors can regulate distinct target genes and biological functions, and they demonstrate that selective mutagenesis can facilitate efforts to elucidate the functions and mechanisms of action of this prevalent class of factors. Overall design: RNA-Seq from whole thymus comparing wt (3 replicates), Ikaros-ZnF1-/- mutant (2 replicates) and Ikaros-ZnF4-/- mutant (2 replicates). RPKM_Thymocytes.txt (linked below as a supplementary file) reports the relative mRNA expression levels (RPKM values) for all annotated RefSeq genes that had at least one read in at least one of the samples, with duplicates for the same gene (different transcripts for the same gene) filtered out. RPKM values (Mortazavi et al., 2008) were calculated based on exonic reads obtained by using the software SeqMonk (Babraham Bioinformatics) and reference genome annotations from NCBI (mm9).
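As a brief illustration of the RPKM values mentioned above, the sketch below applies the standard definition from Mortazavi et al. (2008): reads per kilobase of exon per million mapped reads. It is not SeqMonk's implementation, and the function names and the example gene counts are hypothetical.

```python
def rpkm(exonic_reads, exon_length_bp, total_mapped_reads):
    """RPKM = 1e9 * exonic reads / (exon length in bp * total mapped reads)."""
    if exon_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("exon length and total mapped reads must be positive")
    return exonic_reads * 1e9 / (exon_length_bp * total_mapped_reads)


def keep_detected_genes(counts_by_gene):
    """Keep genes that have at least one read in at least one sample."""
    return {
        gene: counts
        for gene, counts in counts_by_gene.items()
        if any(n > 0 for n in counts)
    }
```

For example, a gene with 500 exonic reads and 2,000 bp of exon sequence in a library of 25 million mapped reads has an RPKM of 10. The second helper mirrors the filter described in the text, where genes without a single read in any sample are left out of the table.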
https://rightsstatements.org/vocab/UND/1.0/
The powdery mildew fungus, Blumeria graminis, is an obligate biotrophic pathogen of cereals and has significant impact on food security (Dean et al., 2012. Molecular Plant Pathology 13 (4): 414-430. DOI: 10.1111/j.1364-3703.2011.00783.x). Blumeria graminis f. sp. hordei (Bgh) is the causal agent of powdery mildew on barley (Hordeum vulgare L.). We sought to identify small RNA-derived transcript cleavage sites from both barley and Bgh that regulate gene expression at the post-transcriptional level both within species and cross-kingdom. Overall design: 6 timepoints (0, 16, 20, 24, 32, and 48 hours after inoculation with Bgh 5874) pooled for each of 5 genotypes. This experiment used the identical split-plot design, tissue, and source RNA as GEO submission GSE101304.
Astronauts are exposed to a unique combination of stressors during spaceflight, which leads to alterations in their physiology and potentially increases their susceptibility to infectious pathogens. Here we report the first microarray evaluation of any astronaut tissue sample, specifically whole blood, before and after spaceflight, using an array comprising 234 well-characterized stress response genes. Differentially regulated genes included those important for DNA repair, oxidative stress, and protein folding/degradation. Microarrays comprising 234 well-characterized stress-related genes were used to profile transcriptomic changes in six astronauts before and after short-duration spaceflight. Blood samples were collected for analysis from each astronaut 10 days prior to spaceflight and 2-3 hours after return. Data submitted for platform GPL140 contain genes that have been pre-filtered by the analytical software to remove values of low certainty, resulting in missing values for some samples. Unfortunately, the original data are no longer available due to physical damage at Tulane University during Hurricane Katrina, but the processed values were retained in redundant locations and these are submitted for upload to GEO.
For the 2021 redistricting cycle, members of the public could use ESRI's online redistricting application to submit their own Congressional and legislative plans for consideration by the Oregon Redistricting Committees. The deadline for submissions was September 8, 2021 at 5 PM Pacific time. This feature service contains each of the 10 submitted Oregon house plans.
Oregon Redistricting Website: https://www.oregonlegislature.gov/redistricting/
GIS Downloads: Oregon Redistricting (arcgis.com)
This experiment contains rat organism part samples and strand-specific RNA-seq data from experiment E-GEOD-41637 (https://www.ebi.ac.uk/arrayexpress/experiments/E-GEOD-41637/), which aimed at assessing tissue-specific transcriptome variation across mammals, with chicken used as an outgroup in evolutionary analyses. Each organism part was sourced from animals of three different strains (Sprague-Dawley, BN/SsNHsd and F344/Cr1). This data set was originally submitted to NCBI Gene Expression Omnibus under accession number GSE41637 (http://www.ncbi.nlm.nih.gov/projects/geo/query/acc.cgi?acc=GSE41637) and later imported to ArrayExpress as E-GEOD-41637.
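The accession pair above follows a simple naming pattern: the numeric part of the GEO series accession is reused after the "E-GEOD-" prefix. A minimal sketch of that mapping is shown below. The function name is hypothetical, and the sketch only rewrites the identifier; it does not check whether a given series was actually imported into ArrayExpress.

```python
import re


def geo_to_arrayexpress(gse_accession):
    """Rewrite a GEO series accession (GSEnnn) as an E-GEOD-nnn identifier."""
    match = re.fullmatch(r"GSE(\d+)", gse_accession.strip().upper())
    if match is None:
        raise ValueError(f"not a GEO series accession: {gse_accession!r}")
    return f"E-GEOD-{match.group(1)}"
```

Calling `geo_to_arrayexpress("GSE41637")` returns the identifier used in this record, "E-GEOD-41637".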
A publicly accessible database containing data on Affymetrix DNA microarray experiments and Serial Analysis of Gene Expression, mostly on human and mouse stem cell samples and their derivatives, to facilitate the discovery of gene functions relevant to stem cell control and differentiation. It has grown in both size and scope into a system with analysis tools that examine either the whole database at once, or slices of data, based on tissue type, cell type or gene of interest. There are currently more than 210 stem cell samples in 60 different experiments, with more being added regularly. The samples originated from researchers of the Stem Cell Network and were processed at the Core Facility of Stemcore Laboratories under the management of Ms. Pearl Campbell within the framework of the Stem Cell Genomics Project. Periodically, new expression data is submitted to the Gene Expression Omnibus (GEO) repository at the National Center for Biotechnology Information, in order to allow researchers to compare the data deposited in StemBase to a large amount of gene expression data sets. StemBase is different from GEO in both focus and scope. StemBase is concerned exclusively with stem cell related data, and we have made a significant effort to ensure the quality and consistency of the data included. This allows us to offer more specialized analysis tools related to stem cell data. GEO is intended as a large-scale public archive. Deposition in a public repository such as GEO is required by most major scientific journals, and it is advantageous for further dissemination of the data since GEO is more broadly used than StemBase.
To better understand the heat production, electricity generation performance, and economic viability of closed-loop geothermal systems in hot-dry rock, the Closed-Loop Geothermal Working Group, a consortium of several national labs and academic institutions, has tabulated time-dependent numerical solutions and levelized cost results of two popular closed-loop heat exchanger designs (u-tube and co-axial). The heat exchanger designs were evaluated for two working fluids (water and supercritical CO2) while varying seven continuous independent parameters of interest (mass flow rate, vertical depth, horizontal extent, borehole diameter, formation gradient, formation conductivity, and injection temperature). The corresponding numerical solutions (approximately 1.2 million per heat exchanger design) are stored as multi-dimensional HDF5 datasets and can be queried at off-grid points using multi-dimensional linear interpolation. A Python script was developed to query this database, estimate time-dependent electricity generation using an organic Rankine cycle (for water) or direct turbine expansion cycle (for CO2), and perform a cost assessment. This document aims to give an overview of the HDF5 database file and highlights how to read, visualize, and query quantities of interest (e.g., levelized cost of electricity, levelized cost of heat) using the accompanying Python scripts. Details regarding the capital, operation, and maintenance costs and the levelized cost calculation using the techno-economic analysis script are provided. This data submission will contain results from the Closed-Loop Geothermal Working Group study that are within the public domain, including publications, simulation results, databases, and computer codes. GeoCLUSTER is a Python-based web application created using Dash, an open-source framework built on top of Flask that streamlines the building of data dashboards.
GeoCLUSTER provides users with a collection of interactive methods for streamlining the exploration and visualization of an HDF5 dataset. The GeoCLUSTER app and database are contained in the compressed file geocluster_vx.zip, where the "x" refers to the version number. For example, geocluster_v1.zip is Version 1 of the app. This zip file also contains installation instructions. To use the GeoCLUSTER app in the cloud, click the link to "GeoCLUSTER on AWS" in the Resources section below. To use the GeoCLUSTER app locally, download geocluster_vx.zip to your computer and uncompress this file. When uncompressed, this file comprises two directories and the geocluster_installation.pdf file. The geo-data directory contains the HDF5 database in condensed format, and the GeoCLUSTER directory contains the GeoCLUSTER app in the subdirectory dash_app, as app.py. The geocluster_installation.pdf file provides instructions on installing Python, the needed Python modules, and then executing the app.
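The off-grid queries mentioned above rely on multi-dimensional linear interpolation over a regular parameter grid. The sketch below shows the technique in plain Python on a small in-memory nested list. It is not the working group's script: the function names are hypothetical, the real data live in HDF5 files, and a production version would more likely use an array library.

```python
from bisect import bisect_right


def interpolate(axes, values, point):
    """Multi-linear interpolation on a regular grid.

    axes   : one sorted list of grid coordinates per dimension (length >= 2)
    values : nested lists indexed as values[i0][i1]... matching the axes
    point  : query coordinates, one per dimension, inside the grid
    """
    if len(axes) != len(point):
        raise ValueError("point must have one coordinate per axis")
    return _interp(axes, values, point, 0)


def _interp(axes, values, point, dim):
    if dim == len(axes):
        return values  # all dimensions consumed: this is a scalar grid value
    axis, x = axes[dim], point[dim]
    if not axis[0] <= x <= axis[-1]:
        raise ValueError(f"coordinate {x} is outside the grid on axis {dim}")
    # Find the pair of neighbouring grid nodes that bracket x.
    hi = min(bisect_right(axis, x), len(axis) - 1)
    lo = hi - 1
    weight = (x - axis[lo]) / (axis[hi] - axis[lo])
    # Interpolate the remaining dimensions at both nodes, then blend.
    below = _interp(axes, values[lo], point, dim + 1)
    above = _interp(axes, values[hi], point, dim + 1)
    return (1 - weight) * below + weight * above
```

With seven parameters, each query blends the 2^7 = 128 grid values surrounding the requested point. The recursion handles any number of dimensions, so the same function covers a toy two-parameter table and a seven-parameter one.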
Gene-expression microarray datasets generated as part of the Immunological Genome Project (ImmGen). Primary cells from multiple immune lineages are isolated ex-vivo, primarily from young adult B6 male mice, and double-sorted to >99% purity. RNA is extracted from cells in a centralized manner, amplified and hybridized to Affymetrix 1.0 ST MuGene arrays. Protocols are rigorously standardized for all sorting and RNA preparation. Data is released monthly in batches of cell populations. Overall design: This Series record provides access to Immunological Genome Project data submitted to GEO.