100+ datasets found

Repository for Single Cell RNA Sequencing Analysis of The EMT6 Dataset
data.niaid.nih.gov
zenodo.org
Updated Nov 20, 2023
Cite
Stoop, Allart (2023). Repository for Single Cell RNA Sequencing Analysis of The EMT6 Dataset [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_10011621
Explore at:
Dataset updated
Nov 20, 2023
Dataset provided by
Hsu, Jonathan
Stoop, Allart
Description
Table of Contents

Main Description
File Descriptions
Linked Files
Installation and Instructions

1. Main Description

This is the Zenodo repository for the manuscript titled "A TCR β chain-directed antibody-fusion molecule that activates and expands subsets of T cells and promotes antitumor activity." The code in the file marengo_code_for_paper_jan_2023.R was used to generate the figures from the single-cell RNA sequencing data. The following libraries are required to run the script:

Seurat, scRepertoire, ggplot2, stringr, dplyr, ggridges, ggrepel, ComplexHeatmap

File Descriptions

The code can be downloaded and opened in RStudio.

The "marengo_code_for_paper_jan_2023.R" file contains all the code needed to reproduce the figures in the paper.
The "Marengo_newID_March242023.rds" file is available at the following address: https://zenodo.org/badge/DOI/10.5281/zenodo.7566113.svg (Zenodo DOI: 10.5281/zenodo.7566113).
The "all_res_deg_for_heat_updated_march2023.txt" file contains the unfiltered results from the differential gene expression (DGE) analysis, which were also used to create the DGE heatmap and volcano plots.
The "genes_for_heatmap_fig5F.xlsx" file contains the genes included in the heatmap in Figure 5F.

Linked Files

This repository contains code for the analysis of a single-cell RNA-seq dataset. The dataset comprises raw FASTQ files as well as the aligned files that were deposited in GEO. The "Rdata" or "Rds" file was deposited in Zenodo. The linked datasets are described below:

Gene Expression Omnibus (GEO) ID: GSE223311 (https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE223311)

Title: Gene expression profile at single cell level of CD4+ and CD8+ tumor infiltrating lymphocytes (TIL) originating from the EMT6 tumor model from mSTAR1302 treatment.
Description: This submission contains the "matrix.mtx", "barcodes.tsv", and "genes.tsv" files for each replicate and condition, corresponding to the aligned files for single cell sequencing data.
Submission type: Private. To gain access to the repository, you must use a reviewer token (https://www.ncbi.nlm.nih.gov/geo/info/reviewer.html).
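To make the three-file layout concrete, here is a minimal sketch of reading a "matrix.mtx" / "genes.tsv" / "barcodes.tsv" triplet with the Python standard library only. It assumes the usual layout in which matrix rows are genes and columns are cell barcodes; that orientation, the toy file contents, and all identifiers below are illustrative assumptions, not values taken from GSE223311.

```python
import io


def read_mtx_coordinate(handle):
    """Parse a MatrixMarket coordinate file into (n_rows, n_cols, entries)."""
    header = handle.readline()
    if not header.startswith("%%MatrixMarket matrix coordinate"):
        raise ValueError("not a MatrixMarket coordinate file")
    line = handle.readline()
    while line.startswith("%"):  # skip comment lines before the size line
        line = handle.readline()
    n_rows, n_cols, n_nonzero = (int(x) for x in line.split())
    entries = {}
    for line in handle:
        if not line.strip():
            continue
        row, col, value = line.split()
        # MatrixMarket indices are 1-based; store them 0-based.
        entries[(int(row) - 1, int(col) - 1)] = float(value)
    if len(entries) != n_nonzero:
        raise ValueError("entry count does not match the size line")
    return n_rows, n_cols, entries


def read_first_column(handle):
    """Return the first tab-separated field of every non-empty line."""
    return [line.rstrip("\n").split("\t")[0] for line in handle if line.strip()]


# Toy stand-ins for the three files (made-up identifiers).
TOY_MTX = """%%MatrixMarket matrix coordinate integer general
%
3 2 3
1 1 5
3 1 2
2 2 7
"""
TOY_GENES = "GENE_ID_1\tGeneA\nGENE_ID_2\tGeneB\nGENE_ID_3\tGeneC\n"
TOY_BARCODES = "AAAC-1\nAAAG-1\n"

n_genes, n_cells, counts = read_mtx_coordinate(io.StringIO(TOY_MTX))
genes = read_first_column(io.StringIO(TOY_GENES))
barcodes = read_first_column(io.StringIO(TOY_BARCODES))

# Total counts per cell barcode, assuming columns index cells.
totals = {barcode: 0.0 for barcode in barcodes}
for (gene_idx, cell_idx), value in counts.items():
    totals[barcodes[cell_idx]] += value
print(totals)
```

For real data the files would be opened from disk (and decompressed first if they are gzipped); the repository's own R code reads them through its listed R packages instead.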

Sequence Read Archive (SRA) repository ID: SRX19088718 and SRX19088719

Title: Gene expression profile at single cell level of CD4+ and CD8+ tumor infiltrating lymphocytes (TIL) originating from the EMT6 tumor model from mSTAR1302 treatment.
Description: This submission contains the raw sequencing (.fastq.gz) files, which are gzip-compressed FASTQ text files.
Submission type: Private. To gain access to the repository, you must use a reviewer token (https://www.ncbi.nlm.nih.gov/geo/info/reviewer.html).

Zenodo DOI: 10.5281/zenodo.7566113 (https://zenodo.org/record/7566113#.ZCcmvC2cbrJ)

Title: A TCR β chain-directed antibody-fusion molecule that activates and expands subsets of T cells and promotes antitumor activity.
Description: This submission contains the "Rdata" or ".Rds" file, which is an R object file. This file is required to run the code.
Submission type: Restricted Access. To gain access to the repository, you must contact the author.

Installation and Instructions

The code included in this submission requires several essential packages, as listed above. Please follow these instructions for installation:

Ensure you have R version 4.1.2 or higher for compatibility.

Although it is not essential, you can use RStudio (version 2022.12.0+353) to access and execute the code.

Download the "Rdata" or ".Rds" file from Zenodo (https://zenodo.org/record/7566113#.ZCcmvC2cbrJ) (Zenodo DOI: 10.5281/zenodo.7566113).

Open RStudio (https://www.rstudio.com/tags/rstudio-ide/) or a similar integrated development environment (IDE) for R.

Set your working directory to where the following files are located:

marengo_code_for_paper_jan_2023.R
Install_Packages.R
Marengo_newID_March242023.rds
genes_for_heatmap_fig5F.xlsx
all_res_deg_for_heat_updated_march2023.txt

You can use the following code to set the working directory in R:

setwd(directory)

Open the file titled "Install_Packages.R" and execute it in your R IDE. This script will attempt to install all the necessary packages and their dependencies to set up an environment in which the code in "marengo_code_for_paper_jan_2023.R" can be executed.

Once the "Install_Packages.R" script has been successfully executed, restart RStudio or your IDE of choice.

Open the file "marengo_code_for_paper_jan_2023.R" in RStudio or your IDE of choice.

Execute the commands in "marengo_code_for_paper_jan_2023.R" in RStudio or your IDE of choice to generate the plots.
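Before starting R, it can help to confirm that the working directory really holds every file the steps above list. The following is a minimal pre-flight sketch in Python; the helper name `missing_files` is hypothetical and is not part of the repository, and the check only looks at file names, not contents.

```python
from pathlib import Path

# File names copied from the installation steps above.
REQUIRED_FILES = [
    "marengo_code_for_paper_jan_2023.R",
    "Install_Packages.R",
    "Marengo_newID_March242023.rds",
    "genes_for_heatmap_fig5F.xlsx",
    "all_res_deg_for_heat_updated_march2023.txt",
]


def missing_files(directory, required=REQUIRED_FILES):
    """Return the required file names that are absent from `directory`."""
    base = Path(directory)
    return [name for name in required if not (base / name).is_file()]
```

Calling `missing_files(".")` from the intended working directory returns an empty list when everything is in place; any names it returns still need to be downloaded.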
Field-wide assessment of differential HT-seq from NCBI GEO database
zenodo.org
application/gzip
Updated Jan 13, 2023
Cite
Taavi Päll; Hannes Luidalepp; Tanel Tenson; Ülo Maiväli (2023). Field-wide assessment of differential HT-seq from NCBI GEO database [Dataset]. http://doi.org/10.5281/zenodo.5068928
Explore at:
Available download formats: application/gzip
Unique identifier
https://doi.org/10.5281/zenodo.5068928
Dataset updated
Jan 13, 2023
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Taavi Päll; Hannes Luidalepp; Tanel Tenson; Ülo Maiväli
License
Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
License information was derived automatically
Description
We analyzed the field of expression profiling by high throughput sequencing, or HT-seq, in terms of replicability and reproducibility, using data from the NCBI GEO (Gene Expression Omnibus) repository. Our work puts an upper bound of 62% to field-wide reproducibility, based on the types of files submitted to GEO.

The archived dataset contains the following files:

- output/parsed_suppfiles.csv, p-value histograms, histogram classes, estimated number of true null hypotheses (pi0).

- output/document_summaries.csv, document summaries of NCBI GEO series

- output/publications.csv, publication info of NCBI GEO series

- output/scopus_citedbycount.csv, Scopus citation info of NCBI GEO series

- output/single-cell.csv, single cell experiments

- spots.csv, NCBI SRA sequencing run metadata

- suppfilenames.txt, list of all supplementary file names of NCBI GEO submissions. One filename per row.

- suppfilenames_filtered.txt, list of supplementary file names used for downloading files from NCBI GEO. One filename per row.
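As a reminder of what a pi0 value measures, the sketch below computes one common estimator of the proportion of true null hypotheses from a set of p-values, Storey's fixed-threshold estimator pi0(λ) = #{p > λ} / (m · (1 − λ)). This is an illustrative assumption: the description above does not state which estimator produced the values in parsed_suppfiles.csv, and the function name and toy p-values are hypothetical.

```python
def storey_pi0(p_values, lam=0.5):
    """Estimate the proportion of true nulls with Storey's fixed-lambda rule.

    P-values above `lam` are assumed to come almost entirely from true
    nulls, which are uniform on [0, 1], so their density estimates pi0.
    """
    if not 0.0 <= lam < 1.0:
        raise ValueError("lam must be in [0, 1)")
    m = len(p_values)
    if m == 0:
        raise ValueError("need at least one p-value")
    above = sum(1 for p in p_values if p > lam)
    # The raw estimate can exceed 1 by chance, so cap it.
    return min(1.0, above / (m * (1.0 - lam)))


# Evenly spread p-values look like pure noise: the estimate is 1.0.
flat = [0.05 + 0.1 * i for i in range(10)]
# Six very small p-values out of ten pull the estimate down to 0.6.
mixed = [0.001] * 6 + [0.3, 0.6, 0.7, 0.9]
print(storey_pi0(flat), storey_pi0(mixed))
```

A flat p-value histogram therefore gives an estimate near 1, while a histogram with a spike near zero gives a smaller one.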
Repository for the single cell RNA sequencing data analysis for the human...
explore.openaire.eu
Updated Aug 26, 2023
Cite
Jonathan; Andrew; Pierre; Allart; Adrian (2023). Repository for the single cell RNA sequencing data analysis for the human manuscript. [Dataset]. http://doi.org/10.5281/zenodo.8286134
Explore at:
Unique identifier
https://doi.org/10.5281/zenodo.8286134
Dataset updated
Aug 26, 2023
Authors
Jonathan; Andrew; Pierre; Allart; Adrian
Description
This is the GitHub repository for the single cell RNA sequencing data analysis for the human manuscript. The following essential libraries are required for script execution:

Seurat, scRepertoire, ggplot2, dplyr, ggridges, ggrepel, ComplexHeatmap

Linked File
--------------------------------------
This repository contains code for the analysis of a single-cell RNA-seq dataset. The dataset comprises raw FASTQ files as well as the aligned files that were deposited in GEO. The linked datasets are described below:

1. Gene Expression Omnibus (GEO) ID: GSE229626
- Title: Gene expression profile at single cell level of human T cells stimulated via antibodies against the T Cell Receptor (TCR)
- Description: This submission contains the matrix.mtx, barcodes.tsv, and genes.tsv files for each replicate and condition, corresponding to the aligned files for single cell sequencing data.
- Submission type: Private. To gain access to the repository, you must use a "reviewer token" (https://www.ncbi.nlm.nih.gov/geo/info/reviewer.html).

2. Sequence Read Archive (SRA) repository
- Title: Gene expression profile at single cell level of human T cells stimulated via antibodies against the T Cell Receptor (TCR)
- Description: This submission contains the "raw sequencing" or .fastq.gz files, which are gzip-compressed FASTQ text files.
- Submission type: Private. To gain access to the repository, you must use a "reviewer token" (https://www.ncbi.nlm.nih.gov/geo/info/reviewer.html).

Please note that since the GSE submission is private, the raw data deposited at SRA may not be accessible until the embargo on GSE229626 has been lifted.

Installation and Instructions
--------------------------------------
The code included in this submission requires several essential packages, as listed above. Please follow these instructions for installation:

> Ensure you have R version 4.1.2 or higher for compatibility.
> Although it is not essential, you can use RStudio (version 2022.12.0+353) to access and execute the code.

The following code can be used to set the working directory in R:
> setwd(directory)

Steps:
1. Download the "Human_code_April2023.R" and "Install_Packages.R" R scripts, and the processed data from GSE229626.
2. Open "RStudio" (https://www.rstudio.com/tags/rstudio-ide/) or a similar integrated development environment (IDE) for R.
3. Set your working directory to where the following files are located:
- Human_code_April2023.R
- Install_Packages.R
4. Open the file titled Install_Packages.R and execute it in your R IDE. This script will attempt to install all the necessary packages and their dependencies.
5. Open the Human_code_April2023.R script and execute commands as necessary.
Field-wide assessment of differential HT-seq from NCBI GEO database
data.niaid.nih.gov
zenodo.org
Updated Jan 13, 2023
Cite
Tenson, Tanel (2023). Field-wide assessment of differential HT-seq from NCBI GEO database [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_3747112
Explore at:
Dataset updated
Jan 13, 2023
Dataset provided by
Luidalepp, Hannes
Tenson, Tanel
Päll, Taavi
Maiväli, Ülo
License
Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
License information was derived automatically
Description
We analysed the field of expression profiling by high throughput sequencing, or HT-seq, in terms of replicability and reproducibility, using data from the NCBI GEO (Gene Expression Omnibus) repository.

This release includes GEO series published up to Dec-31, 2020.

The geo-htseq.tar.gz archive contains the following files:

output/parsed_suppfiles.csv, p-value histograms, histogram classes, estimated number of true null hypotheses (pi0).

output/document_summaries.csv, document summaries of NCBI GEO series.

output/suppfilenames.txt, list of all supplementary file names of NCBI GEO submissions.

output/suppfilenames_filtered.txt, list of supplementary file names used for downloading files from NCBI GEO.

output/publications.csv, publication info of NCBI GEO series.

output/scopus_citedbycount.csv, Scopus citation info of NCBI GEO series

output/spots.csv, NCBI SRA sequencing run metadata.

output/cancer.csv, cancer related experiment accessions.

output/transcription_factor.csv, TF related experiment accessions.

output/single-cell.csv, single cell experiment accessions.

blacklist.txt, list of supplementary files that were either too large to import or were causing computing environment crash during import.

The workflow used to produce this dataset is available on GitHub at rstats-tartu/geo-htseq.

The geo-htseq-updates.tar.gz archive contains the following files:

results/detools_from_pmc.csv, differential expression analysis programs inferred from published articles

results/n_data.csv, manually curated sample size info for NCBI GEO HT-seq series

results/simres_df_parsed.csv, pi0 values estimated from differential expression results obtained from simulated RNA-seq data

results/data/parsed_suppfiles_rerun.csv, pi0 values estimated using smoother method from anti-conservative p-value sets
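The archives listed above can be inspected without unpacking them to disk. The following minimal sketch uses Python's standard `tarfile` and `csv` modules; the helper names are hypothetical, and it assumes the archive has already been downloaded locally and that the CSV members are UTF-8 encoded.

```python
import csv
import io
import tarfile


def list_csv_members(archive_path):
    """Return the sorted names of all .csv files inside a .tar.gz archive."""
    with tarfile.open(archive_path, mode="r:gz") as archive:
        return sorted(
            member.name
            for member in archive.getmembers()
            if member.isfile() and member.name.endswith(".csv")
        )


def read_csv_header(archive_path, member_name):
    """Return the first row (column names) of one CSV member."""
    with tarfile.open(archive_path, mode="r:gz") as archive:
        raw = archive.extractfile(member_name)
        if raw is None:
            raise ValueError(f"{member_name} is not a regular file")
        with io.TextIOWrapper(raw, encoding="utf-8", newline="") as text:
            return next(csv.reader(text))
```

For example, `list_csv_members("geo-htseq.tar.gz")` would be expected to include members such as output/parsed_suppfiles.csv, and `read_csv_header` can then show which columns a file provides before loading it in full.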
Field-wide assessment of differential HT-seq from NCBI GEO database
zenodo.org
application/gzip
Updated Jan 13, 2023
Cite
Taavi Päll; Hannes Luidalepp; Tanel Tenson; Ülo Maiväli (2023). Field-wide assessment of differential HT-seq from NCBI GEO database [Dataset]. http://doi.org/10.5281/zenodo.5356064
Explore at:
Available download formats: application/gzip
Unique identifier
https://doi.org/10.5281/zenodo.5356064
Dataset updated
Jan 13, 2023
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Taavi Päll; Hannes Luidalepp; Tanel Tenson; Ülo Maiväli
License
Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
License information was derived automatically
Description
We analysed the field of expression profiling by high throughput sequencing, or HT-seq, in terms of replicability and reproducibility, using data from the NCBI GEO (Gene Expression Omnibus) repository.

- This release includes GEO series up to Dec-31, 2020;

- Fixed xlrd missing optional dependency, which affected import of some xls files, previously we were using only openpyxl (thanks to anonymous reviewer);

- All files in supplementary _RAW.tar files were checked for p values, previously _RAW.tar files were completely omitted, alas (thanks to anonymous reviewer).

The archived dataset contains the following files:

- output/parsed_suppfiles.csv, p-value histograms, histogram classes, estimated number of true null hypotheses (pi0).

- output/document_summaries.csv, document summaries of NCBI GEO series

- output/publications.csv, publication info of NCBI GEO series

- output/scopus_citedbycount.csv, Scopus citation info of NCBI GEO series

- output/single-cell.csv, single cell experiments

- spots.csv, NCBI SRA sequencing run metadata

- suppfilenames.txt, list of all supplementary file names of NCBI GEO submissions. One filename per row.

- suppfilenames_filtered.txt, list of supplementary file names used for downloading files from NCBI GEO. One filename per row.
Single cell RNA-seq data of human hESCs to evaluate SCnorm: robust...
data.niaid.nih.gov
Updated May 15, 2019
Cite
Bacher R; Chu L; Kendziorski C; Swanson S (2019). Single cell RNA-seq data of human hESCs to evaluate SCnorm: robust normalization of single-cell rna-seq data [Dataset]. https://data.niaid.nih.gov/resources?id=gse85917
Explore at:
Dataset updated
May 15, 2019
Dataset provided by
University of Florida
Authors
Bacher R; Chu L; Kendziorski C; Swanson S
Description
Normalization of RNA-sequencing data is essential for accurate downstream inference, but the assumptions upon which most methods are based do not hold in the single-cell setting. Consequently, applying existing normalization methods to single-cell RNA-seq data introduces artifacts that bias downstream analyses. To address this, we introduce SCnorm for accurate and efficient normalization of scRNA-seq data. A total of 183 single cells (92 H1 cells, 91 H9 cells), each sequenced twice, were used to evaluate SCnorm in normalizing single-cell RNA-seq experiments. A total of 48 bulk H1 samples were used to compare bulk and single-cell properties. For single-cell RNA-seq, the identical single-cell indexed and fragmented cDNA was pooled at 96 cells per lane or at 24 cells per lane to test the effects of sequencing depth, resulting in approximately 1 million and 4 million mapped reads per cell in the two pooling groups, respectively.
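The two depths quoted above are consistent with simple arithmetic: if each lane yields roughly the same number of mapped reads, pooling a quarter as many cells gives each cell about four times the depth. The short sketch below works this through; the equal-lane-yield assumption and the implied figure of about 96 million mapped reads per lane are inferences for illustration, not numbers stated in the description.

```python
def reads_per_cell(lane_reads, cells_per_lane):
    """Average mapped reads per cell when a lane is shared equally."""
    return lane_reads / cells_per_lane


# 96 cells at ~1 million mapped reads each implies ~96 million per lane.
implied_lane_reads = 96 * 1_000_000

shallow = reads_per_cell(implied_lane_reads, 96)  # 96 cells per lane
deep = reads_per_cell(implied_lane_reads, 24)     # 24 cells per lane
print(shallow, deep, deep / shallow)
```

Under these assumptions the 24-cell pooling reaches the roughly 4 million reads per cell reported for the deeper group.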
Neonatal hematopoietic stem and progenitor cell ontogeny at single cell...
data.niaid.nih.gov
Updated Nov 23, 2020
Cite
Morris S (2020). Neonatal hematopoietic stem and progenitor cell ontogeny at single cell resolution (scRNA-seq) [Dataset]. https://data.niaid.nih.gov/resources?id=gse128761
Explore at:
Dataset updated
Nov 23, 2020
Dataset provided by
Washington University School of Medicine
Authors
Morris S
Description
Single-cell RNA-seq analysis of HSCs and HPCs during fetal, neonatal and adult stages of development. HSCs and HPCs were isolated at the stated ages for single-cell sequencing. The strain background for all experiments is B6.
Table1_Prognostic value of autophagy-related genes based on single-cell...
frontiersin.figshare.com
docx
Updated Jun 21, 2023
Cite
Yuqi Luo; Xuesong Deng; Weihua Liao; Yiwen Huang; Caijie Lu (2023). Table1_Prognostic value of autophagy-related genes based on single-cell RNA-sequencing in colorectal cancer.docx [Dataset]. http://doi.org/10.3389/fgene.2023.1109683.s004
Explore at:
Available download formats: docx
Unique identifier
https://doi.org/10.3389/fgene.2023.1109683.s004
Dataset updated
Jun 21, 2023
Dataset provided by
Frontiers
Authors
Yuqi Luo; Xuesong Deng; Weihua Liao; Yiwen Huang; Caijie Lu
License
Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
License information was derived automatically
Description
Background: Colorectal cancer (CRC) is the second most common cancer in China. Autophagy plays an important role in the initiation and development of CRC. Here, we assessed the prognostic value and potential functions of autophagy-related genes (ARGs) through an integrated analysis of single-cell RNA sequencing (scRNA-seq) data from the Gene Expression Omnibus (GEO) and RNA sequencing (RNA-seq) data from The Cancer Genome Atlas (TCGA).

Methods: We analyzed scRNA-seq data from GEO using various single-cell techniques, including cell clustering and identification of differentially expressed genes (DEGs) in different cell types. Additionally, we performed gene set variation analysis (GSVA). The differentially expressed ARGs among different cell types and those between CRC and normal tissues were identified using TCGA RNA-seq data, and the hub ARGs were screened. Finally, a prognostic model based on the hub ARGs was constructed and validated, patients with CRC in the TCGA datasets were divided into high- and low-risk groups based on their risk score, and immune cell infiltration and drug sensitivity analyses were performed between the two groups.

Results: We obtained single-cell expression profiles of 16,270 cells and clustered them into seven cell types. GSVA revealed that the DEGs among the seven cell types were enriched in many signaling pathways associated with cancer development. We screened 55 differentially expressed ARGs and identified 11 hub ARGs. Our prognostic model revealed that the 11 hub ARGs, including CTSB, ITGA6, and S100A8, had good predictive ability. Moreover, immune cell infiltration in CRC tissues differed between the two groups, and the hub ARGs were significantly correlated with the enrichment of immune cell infiltration. The drug sensitivity analysis revealed that patients in the two risk groups differed in their response to anti-cancer drugs.

Conclusion: We developed a novel prognostic 11-hub ARG risk model, and these hub genes may act as potential therapeutic targets for CRC.
Single cell RNA-seq of brain from young and old mice
data.niaid.nih.gov
Updated Nov 26, 2022
Cite
Schafer MJ; Zhang X; LeBrasseur NK (2022). Single cell RNA-seq of brain from young and old mice [Dataset]. https://data.niaid.nih.gov/resources?id=gse178957
Explore at:
Dataset updated
Nov 26, 2022
Dataset provided by
Mayo Clinic
Authors
Schafer MJ; Zhang X; LeBrasseur NK
Description
We report the scRNA-seq of cells isolated from brain of young and old mice. We isolated single cells from brain of young (6 months, n=2) and old (24 months, n=4) mice and performed the scRNA-seq using 10X Genomics system.
GSE120180
huggingface.co
Updated Jul 12, 2025
Cite
iris lee (2025). GSE120180 [Dataset]. https://huggingface.co/datasets/Iris8090/GSE120180
Explore at:
Dataset updated
Jul 12, 2025
Authors
iris lee
Description
🧬 GSE120180 – Single-Cell Transcriptomics of Aging Human Skin

This dataset contains single-cell RNA-seq profiles from aging human skin, originally published as part of the GEO Series GSE120180. The dataset has been converted to .parquet format for faster I/O and compatibility with machine learning pipelines.

📂 Dataset Overview

Original Source: GEO: GSE120180
Species: Homo sapiens
Tissue: Human skin
Technique: 10x Genomics scRNA-seq
Format: .parquet (converted… See the full description on the dataset page: https://huggingface.co/datasets/Iris8090/GSE120180.
Single-Cell Gene Expression Profiles for Classification Problems
explore.openaire.eu
Updated Mar 15, 2021
Cite
Stefano Gualandi; Andrea Codegoni; Eleonora Vercesi (2021). Single-Cell Gene Expression Profiles for Classification Problems [Dataset]. http://doi.org/10.5281/zenodo.4604569
Explore at:
Unique identifier
https://doi.org/10.5281/zenodo.4604569
Dataset updated
Mar 15, 2021
Authors
Stefano Gualandi; Andrea Codegoni; Eleonora Vercesi
Description
This repository contains a collection of three datasets that we use to introduce the Gene Mover's Distance in [1], described below. The three datasets are exported in a basic text-based format (.csv file), like other public datasets widely used in the Machine Learning community. The three datasets are extracted from the Gene Expression Omnibus (GEO) database [2], where they appear, respectively, under accession numbers GSE116256 (blood leukemia, [3]), GSE84133 (human pancreas, [4]), and GSE67835 (human brain, [5]). In GEO, the datasets are split into several files, which contain much more detail than is reported in this version. However, the proposed format should make it easier for other researchers to use this data.

The Gene Mover's Distance (GMD) is a measure of similarity between a pair of cells based on their gene expression profiles obtained via single-cell RNA sequencing. The underlying idea of GMD is to interpret the gene expression array of a single cell as a discrete probability measure. The distance between two cells is then computed by solving an Optimal Transport problem between the two corresponding discrete measures. The Gene Mover's Distance can be used, for instance, to solve two classification problems: the classification of cells according to their condition and according to their type. The repository contains a Python script to check the basic statistics of the data.

[1] Bellazzi, R., Codegoni, A., Gualandi, S., Nicora, G., Vercesi, E. The Gene Mover's Distance: Single-cell similarity via Optimal Transport. https://arxiv.org/abs/2102.01218
[2] Gene Expression Omnibus (GEO) database, http://www.ncbi.nlm.nih.gov/geo
[3] van Galen, P., Hovestadt, V., Wadsworth II, M.H., Hughes, T.K., Griffin, G.K., Battaglia, S., Verga, J.A., Stephansky, J., Pastika, T.J., Story, J.L. and Pinkus, G.S., 2019. Single-cell RNA-seq reveals AML hierarchies relevant to disease progression and immunity. Cell, 176(6), pp.1265-1281.
[4] Baron, M., Veres, A., Wolock, S.L., Faust, A.L., Gaujoux, R., Vetere, A., Ryu, J.H., Wagner, B.K., Shen-Orr, S.S., Klein, A.M. and Melton, D.A., 2016. A single-cell transcriptomic map of the human and mouse pancreas reveals inter- and intra-cell population structure. Cell Systems, 3(4), pp.346-360.
[5] Darmanis, S., Sloan, S.A., Zhang, Y., Enge, M., Caneda, C., Shuer, L.M., Gephart, M.G.H., Barres, B.A. and Quake, S.R., 2015. A survey of human brain transcriptome diversity at the single cell level. Proceedings of the National Academy of Sciences, 112(23), pp.7285-7290.
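The two ideas in the description (treat an expression array as a discrete probability measure, then compare two cells by optimal transport) can be illustrated with a deliberately simplified toy. The sketch below is not the Gene Mover's Distance from [1]: it assumes, purely for illustration, that genes sit at consecutive integer positions on a line, in which case the optimal transport cost has a closed form as the summed absolute difference of the two cumulative distributions. The ground cost between genes used in [1] is not reproduced here, and all function names are hypothetical.

```python
from itertools import accumulate


def to_measure(expression):
    """Normalize non-negative expression values so they sum to 1."""
    total = sum(expression)
    if total <= 0:
        raise ValueError("expression array must have positive total")
    return [value / total for value in expression]


def transport_cost_1d(cell_a, cell_b):
    """Optimal transport cost between two cells on a toy line metric.

    Genes are placed at positions 0, 1, 2, ... (an assumption), so the
    cost equals the sum of |CDF_a - CDF_b| over the gaps between
    neighbouring positions.
    """
    if len(cell_a) != len(cell_b):
        raise ValueError("cells must be measured on the same genes")
    p, q = to_measure(cell_a), to_measure(cell_b)
    cdf_gap = list(accumulate(pi - qi for pi, qi in zip(p, q)))
    # The last cumulative gap is zero up to rounding, so it is dropped.
    return sum(abs(gap) for gap in cdf_gap[:-1])


print(transport_cost_1d([1, 0, 0], [0, 0, 1]))  # all mass moves two steps
print(transport_cost_1d([2, 2], [1, 1]))        # same profile after scaling
```

Because each cell is normalized first, two cells with proportional expression arrays are at distance zero; with a general gene-to-gene cost, the same comparison requires a linear-programming or dedicated optimal transport solver.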
A field-wide assessment of differential RNAseq reveals ubiquitous bias
zenodo.org
application/gzip
Updated Jan 13, 2023
Cite
Taavi Päll; Hannes Luidalepp; Tanel Tenson; Ülo Maiväli (2023). A field-wide assessment of differential RNAseq reveals ubiquitous bias [Dataset]. http://doi.org/10.5281/zenodo.3778160
Explore at:
Available download formats: application/gzip
Unique identifier
https://doi.org/10.5281/zenodo.3778160
Dataset updated
Jan 13, 2023
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Taavi Päll; Hannes Luidalepp; Tanel Tenson; Ülo Maiväli
License
Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
License information was derived automatically
Description
We analyzed the field of expression profiling by high throughput sequencing, or RNA-seq, in terms of replicability and reproducibility, using data from the NCBI GEO (Gene Expression Omnibus) repository. Our work puts an upper bound of 56% to field-wide reproducibility, based on the types of files submitted to GEO.

The archived dataset contains the following files:

- output/parsed_suppfiles.csv, p-value histograms, histogram classes, estimated number of true null hypotheses (pi0).

- output/document_summaries.csv, document summaries of GEO series

- output/publications.csv, publication info of GEO series

- output/scopus_citedbycount.csv, Scopus citation info of GEO series

- output/single-cell.csv, single cell experiments

- spots.csv, sequencing run metadata: number of spots and bases

- suppfilenames.txt, list of all supplementary file names of GEO submissions. One filename per row.

- suppfilenames_filtered.txt, list of supplementary file names used for downloading files from NCBI GEO. One filename per row.
GSE208154
huggingface.co
Cite
2025 Longevity x AI Hackathon, GSE208154 [Dataset]. https://huggingface.co/datasets/longevity-db/GSE208154
Explore at:
Dataset authored and provided by
2025 Longevity x AI Hackathon
Description
GSE208154 – Complete Cell Atlas of Aging C. elegans

📄 Description

This dataset provides a comprehensive single-cell RNA sequencing (scRNA-seq) profile of C. elegans across six adult time points. It is a highly curated developmental and aging atlas, useful for longevity, developmental biology, and transcriptomic studies.

📊 Dataset Summary

Species: Caenorhabditis elegans
Type: Single-cell RNA-seq (scRNA-seq)
Samples: 11 GSM samples from GEO
Format:… See the full description on the dataset page: https://huggingface.co/datasets/longevity-db/GSE208154.
Single-cell RNA-seq exposed to multiple compounds
kaggle.com
zip
Updated Mar 11, 2021
Cite
Alexander Chervov (2021). Single-cell RNA-seq exposed to multiple compounds [Dataset]. https://www.kaggle.com/alexandervc/singlecell-rnaseq-exposed-to-multiple-compounds
Explore at:
Available download formats: zip (4314885205 bytes)
Dataset updated
Mar 11, 2021
Authors
Alexander Chervov
Description
Remark 1: For cell cycle analysis, see the paper "Computational challenges of cell cycle analysis using single cell transcriptomics" by Alexander Chervov and Andrei Zinovyev: https://arxiv.org/abs/2208.05229

Remark 2: The same data is available at https://www.kaggle.com/datasets/alexandervc/scrnaseq-exposed-to-multiple-compounds as pieces extracted from the huge file here, which are easier to load and work with.

Data and Context

The data are the results of single-cell RNA sequencing: rows correspond to cells and columns to genes (or vice versa). Each value of the matrix shows how strongly the corresponding gene is expressed in the corresponding cell. https://en.wikipedia.org/wiki/Single-cell_transcriptomics

Particular data:

The data are scRNA-seq expression profiles for several cell lines treated with drugs at different doses and durations.

The data come from https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE139944

Status: Public on Dec 05, 2019
Title: Massively multiplex chemical transcriptomics at single cell resolution
Organisms: Homo sapiens; Mus musculus
Experiment type: Expression profiling by high throughput sequencing
Summary: Single-cell RNA-seq libraries were generated using two and three level single-cell combinatorial indexing RNA sequencing (sci-RNA-seq) of untreated or small molecule inhibitor exposed HEK293T, NIH3T3, A549, MCF7 and K562 cells. Different cells and different treatment were hashed and pooled prior to sci-RNA-seq using a nuclear barcoding strategy. This nuclear barcoding strategy relies on fixation of barcode containing well-specific oligos that are specific to a given cell type, replicate or treatment condition.

The corresponding paper is here: https://pubmed.ncbi.nlm.nih.gov/31806696/ Science. 2020 Jan 3;367(6473):45-51 "Massively multiplex chemical transcriptomics at single-cell resolution" Sanjay R Srivatsan, ... , Cole Trapnell

The authors split the data into 4 subdatasets (see sciPlex1, sciPlex2, sciPlex3, and sciPlex4 in the filenames). The main dataset is sciPlex3, which contains about 600K cells.

The data split into small parts, each of which can easily be loaded into memory, can be found at https://www.kaggle.com/alexandervc/scrnaseq-exposed-to-multiple-compounds

Related datasets:

Other single cell RNA-seq datasets can be found on Kaggle: see https://www.kaggle.com/alexandervc/datasets or search Kaggle for "scRNA-seq".

A collection of some bioinformatics-related resources on Kaggle: https://www.kaggle.com/general/203136

Inspiration

Single cell RNA sequencing is an important technology in modern biology; see e.g. "Eleven grand challenges in single-cell data science" https://genomebiology.biomedcentral.com/articles/10.1186/s13059-020-1926-6

Also see the review by P. Kharchenko in Nature Methods: "The triumphs and limitations of computational methods for scRNA-seq" https://www.nature.com/articles/s41592-021-01171-x

A Google Scholar search for "challenges in single cell rna sequencing" (https://scholar.google.fr/scholar?q=challenges+in+single+cell+rna+sequencing&hl=en&as_sdt=0&as_vis=1&oi=scholart) returns many interesting and highly cited articles:

(Cited 968) Computational and analytical challenges in single-cell transcriptomics Oliver Stegle, Sarah A. Teichmann, John C. Marioni Nat. Rev. Genet., 16 (3) (2015), pp. 133-145 https://www.nature.com/articles/nrg3833

Challenges in unsupervised clustering of single-cell RNA-seq data https://www.nature.com/articles/s41576-018-0088-9 Review Article 07 January 2019 Vladimir Yu Kiselev, Tallulah S. Andrews & Martin Hemberg Nature Reviews Genetics volume 20, pages273–282 (2019)

Challenges and emerging directions in single-cell analysis https://link.springer.com/article/10.1186/s13059-017-1218-y Published: 08 May 2017 Guo-Cheng Yuan, Long Cai, Michael Elowitz, Tariq Enver, Guoping Fan, Guoji Guo, Rafael Irizarry, Peter Kharchenko, Junhyong Kim, Stuart Orkin, John Quackenbush, Assieh Saadatpour, Timm Schroeder, Ramesh Shivdasani & Itay Tirosh Genome Biology volume 18, Article number: 84 (2017)

Single-Cell RNA Sequencing in Cancer: Lessons Learned and Emerging Challenges https://www.sciencedirect.com/science/article/pii/S1097276519303569 Molecular Cell Volume 75, Issue 1, 11 July 2019, Pages 7-12
Data for 'Comparative Analysis of Single-Cell RNA Sequencing Methods'
zenodo.org
application/gzip
Updated Jan 24, 2020
Cite
Christoph Ziegenhain; Wolfgang Enard (2020). Data for 'Comparative Analysis of Single-Cell RNA Sequencing Methods' [Dataset]. http://doi.org/10.5281/zenodo.2574044
Available download formats: application/gzip
Unique identifier
https://doi.org/10.5281/zenodo.2574044
Dataset updated
Jan 24, 2020
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Christoph Ziegenhain; Wolfgang Enard
License
Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0): https://creativecommons.org/licenses/by-nc-sa/4.0/
License information was derived automatically
Description
Raw sequencing data for "Comparative Analysis of Single-Cell RNA Sequencing Methods".

https://www.ncbi.nlm.nih.gov/pubmed/28212749

In addition to the GEO submission (https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE75790), this record provides raw BAM files for the UMI-based methods, tagged with cell barcode and UMI sequences.

MD5 checksum: f10825509952fffd9c4dc0c1dcb9eb8e
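
A downloaded archive can be checked against the MD5 value quoted above before use. The following is a minimal R sketch: the helper name `verify_md5` and the local file name are assumptions made for illustration, and the record does not state which file the checksum refers to.

```r
# Compare a file's MD5 digest with an expected value (case-insensitive).
# `verify_md5` is a hypothetical helper, not part of the Zenodo record.
verify_md5 <- function(path, expected) {
  stopifnot(file.exists(path))
  actual <- unname(tools::md5sum(path))
  identical(tolower(actual), tolower(expected))
}

expected_md5 <- "f10825509952fffd9c4dc0c1dcb9eb8e"  # value quoted in the record
archive <- "zenodo_2574044_download.gz"             # hypothetical local file name

if (file.exists(archive)) {
  if (verify_md5(archive, expected_md5)) {
    message("Checksum matches.")
  } else {
    warning("Checksum mismatch: the download may be incomplete or corrupted.")
  }
}
```

`tools::md5sum` ships with base R, so no extra packages are needed for this check.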
Data from: Human Bone Marrow Assessment by Single Cell RNA Sequencing, Mass...
data.niaid.nih.gov
Updated Dec 23, 2018
Cite
NIH CHI Program (2018). Human Bone Marrow Assessment by Single Cell RNA Sequencing, Mass Cytometry and Flow Cytometry [Dataset]. https://data.niaid.nih.gov/resources?id=gse120446
Dataset updated
Dec 23, 2018
Dataset provided by
National Heart, Lung, and Blood Institute (https://www.nhlbi.nih.gov/)
NIH CHI Program
Description
This SuperSeries is composed of the SubSeries listed below. Refer to individual Series
Human nasal epithelial and immune cell responses to SARS-CoV-2 versus...
data.niaid.nih.gov
Updated Jul 24, 2023
Cite
Wang JP; Derr AG; Gao K; Nundel K; Marshak-Rothstein A; Finberg RW (2023). Human nasal epithelial and immune cell responses to SARS-CoV-2 versus influenza A virus [Dataset]. https://data.niaid.nih.gov/resources?id=gse176269
Dataset updated
Jul 24, 2023
Dataset provided by
University of Massachusetts Medical School
Authors
Wang JP; Derr AG; Gao K; Nundel K; Marshak-Rothstein A; Finberg RW
Description
We performed single-cell RNA sequencing (scRNA-Seq) on nasal wash cells freshly collected from adults with COVID-19, influenza A, or no disease (healthy). Major cell types and subtypes were defined using cluster analysis and classic transcriptional markers. Seq-Well single-cell RNA-Seq analysis of cells taken from nasal wash samples from healthy donors and patients diagnosed with either COVID-19 or influenza A
Data from: Human iPSC-derived neural stem cells displaying radial glia...
ordr.hsr.it
Updated Apr 17, 2025
Cite
Marco Luciani (2025). Human iPSC-derived neural stem cells displaying radial glia signature exhibit long-term safety in mice [Dataset]. http://doi.org/10.17632/3wg6kztvvd.1
Unique identifier
https://doi.org/10.17632/3wg6kztvvd.1
Dataset updated
Apr 17, 2025
Authors
Marco Luciani
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
The dataset includes the raw data of graphs shown in figures 1, 2, 3, 5, 6, and 7 and in supplementary figures 1, 2, 3, 4, 5, 7, and 8 of the paper: Human iPSC-derived neural stem cells displaying radial glia signature exhibit long-term safety in mice. Luciani M, Garsia C, Beretta S, Cifola I, Peano C, Merelli I, Petiti L, Miccio A, Meneghini V, Gritti A. Nat Commun. 2024 Nov 1;15(1):9433. doi: 10.1038/s41467-024-53613-7. PMID: 39487141 The bulk RNA-seq, SREBF1-deficient RNA-seq, and ChIP-seq data generated in this study have been deposited at GEO under accession number GSE239446. The single-cell RNA-seq data generated in this study have been deposited at GEO under accession number GSE238206. The processed RNA-seq, ChIP-seq, and scRNA-seq data (list of DEGs and GO terms) are available in Supplementary Data 1-4 files.

Human induced pluripotent stem cell-derived neural stem/progenitor cells (hiPSC-NSCs) hold promise for treating neurodegenerative and demyelinating disorders. However, comprehensive studies on their identity and safety remain limited. In this study, we demonstrate that hiPSC-NSCs adopt a radial glia-associated signature, sharing key epigenetic and transcriptional characteristics with human fetal neural stem cells (hfNSCs) while exhibiting divergent profiles from glioblastoma stem cells. Long-term transplantation studies in mice showed robust and stable engraftment of hiPSC-NSCs, with predominant differentiation into glial cells and no evidence of tumor formation. Additionally, we identified the Sterol Regulatory Element Binding Transcription Factor 1 (SREBF1) as a regulator of astroglial differentiation in hiPSC-NSCs. These findings provide valuable transcriptional and epigenetic reference datasets to prospectively define the maturation stage of NSCs derived from different hiPSC sources and demonstrate the long-term safety of hiPSC-NSCs, reinforcing their potential as a viable alternative to hfNSCs for clinical applications.
Reprogramming liver metastasis-associated macrophages towards an...
ordr.hsr.it
Updated May 8, 2025
Cite
Chiara Bresesti (2025). Reprogramming liver metastasis-associated macrophages towards an anti-tumoral phenotype through enforced miR-342 expression [Dataset]. http://doi.org/10.17632/4gpbv5vpcr.1
Unique identifier
https://doi.org/10.17632/4gpbv5vpcr.1
Dataset updated
May 8, 2025
Authors
Chiara Bresesti
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Upon metastatic seeding in the liver, liver macrophages, including Kupffer cells, acquire a transcriptional profile typical of tumor-associated macrophages (TAMs), which support tumor progression. MicroRNAs (miRNAs) fine-tune TAM pro-tumoral functions, making their modulation a promising strategy for macrophage reprogramming into an anti-tumoral phenotype. Here, we analyze the transcriptomic profiles of liver and splenic macrophages, identifying miR-342-3p as a key regulator of liver macrophage function. miR-342-3p is highly active in healthy liver macrophages but significantly downregulated in colorectal cancer liver metastases (CRLMs). Lentiviral vector-engineered liver macrophages enforcing miR-342-3p expression acquire a pro-inflammatory phenotype and reduce CRLM growth. We identify Slc7a11, a cysteine-glutamate antiporter linked to pro-tumoral activity, as a direct miR-342-3p target, which may be at least partially responsible for TAM phenotypic reprogramming. Our findings highlight the potential of in vivo miRNA modulation as a therapeutic strategy for TAM reprogramming, offering an approach to enhance cancer immunotherapy.

Data and code availability: Next-generation sequencing data are deposited at the Gene Expression Omnibus (GEO) with the following accession numbers: GEO: GSE274043 (scRNA-seq), GSE274044 (RNA-seq on iKCs), GSE274045 (small RNA-seq on splenic and hepatic cell populations), and GSE274046 (bulk RNA-seq on splenic and hepatic cell populations). The code is available at GitLab: http://www.bioinfotiget.it/gitlab/custom/Bresesti_Cell_Reports_2025.
Data from: Regional epithelial cell diversity in the small intestine of pigs...
datasets.ai
agdatacommons.nal.usda.gov
Updated Sep 11, 2024
Cite
Department of Agriculture (2024). Regional epithelial cell diversity in the small intestine of pigs [Dataset]. https://datasets.ai/datasets/regional-epithelial-cell-diversity-in-the-small-intestine-of-pigs-059d5
Dataset updated
Sep 11, 2024
Dataset authored and provided by
Department of Agriculture
Description
Single-cell suspensions enriched for epithelial cells were obtained from the duodenum, jejunum, and ileum of a 7.5-week-old pig and subjected to single-cell RNA sequencing (scRNA-seq). scRNA-seq was performed to provide transcriptomic profiles of epithelial cells, with 695 cells annotated into 6 cell types. Deeper interrogation of the data revealed previously undescribed cells in porcine intestine, and region-specific gene expression profiles within specific cell subsets. The data herein include .h5seurat files of the epithelial cell subsets analyzed. The files may be used to reconstruct different analyses and perform further data queries. Scripts for the original data analyses are found at https://github.com/USDA-FSEPRU/scRNAseqEpSI_Pilot. Raw data are available at GEO accession GSE208613. Data are available for online query at https://singlecell.broadinstitute.org/single_cell/study/SCP1936/regional-epithelial-cell-diversity-in-the-small-intestine-of-pigs.

Resources in this dataset:

Resource Title: .h5Seurat object - epithelial cells.

File Name: EpithelialCells.tar
Resource Description: Epithelial cells used for data analysis, available in .h5Seurat file format. Untar file before use.
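
The "untar file before use" step can be done from within R. This is a minimal sketch using base R's `utils::untar`; the helper name `extract_archive` and the output directory are assumptions for illustration, and the names of the files inside the archive are not stated in the record.

```r
# Extract a tar archive into a target directory and return the extracted paths.
# `extract_archive` is a hypothetical helper name.
extract_archive <- function(tar_path, out_dir = "EpithelialCells") {
  stopifnot(file.exists(tar_path))
  dir.create(out_dir, showWarnings = FALSE, recursive = TRUE)
  utils::untar(tar_path, exdir = out_dir)
  list.files(out_dir, recursive = TRUE, full.names = TRUE)
}

tar_file <- "EpithelialCells.tar"  # file name given in the record
if (file.exists(tar_file)) {
  extracted <- extract_archive(tar_file)
  print(extracted)
  # Loading an extracted .h5seurat file typically needs the SeuratDisk package,
  # for example SeuratDisk::LoadH5Seurat() on the extracted path.
}
```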


Stoop, Allart (2023). Repository for Single Cell RNA Sequencing Analysis of The EMT6 Dataset [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_10011621

Repository for Single Cell RNA Sequencing Analysis of The EMT6 Dataset

Dataset updated

Nov 20, 2023

Dataset provided by

Hsu, Jonathan
Stoop, Allart

Description

Table of Contents

1. Main Description
2. File Descriptions
3. Linked Files
4. Installation and Instructions

1. Main Description

This is the Zenodo repository for the manuscript titled "A TCR β chain-directed antibody-fusion molecule that activates and expands subsets of T cells and promotes antitumor activity". The code in the file marengo_code_for_paper_jan_2023.R was used to generate the figures from the single-cell RNA sequencing data. The following libraries are required to run the script:

Seurat, scRepertoire, ggplot2, stringr, dplyr, ggridges, ggrepel, ComplexHeatmap
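
Before running the script, it can help to see which of the listed libraries are already installed. The sketch below assumes the list refers to the scRepertoire package; `missing_packages` is a hypothetical helper name, not part of the repository.

```r
# Libraries named in the repository description
# (assumption: "scReportoire" in the listing means scRepertoire).
required_pkgs <- c("Seurat", "scRepertoire", "ggplot2", "stringr",
                   "dplyr", "ggridges", "ggrepel", "ComplexHeatmap")

# Return the entries of `pkgs` whose namespace cannot be loaded.
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

to_install <- missing_packages(required_pkgs)
if (length(to_install) > 0) {
  message("Missing packages: ", paste(to_install, collapse = ", "))
} else {
  message("All required packages are installed.")
}
```

Note that ComplexHeatmap and scRepertoire are distributed through Bioconductor rather than CRAN, so a plain `install.packages()` call may not find them.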

2. File Descriptions

The code can be downloaded and opened in RStudio.

The "marengo_code_for_paper_jan_2023.R" file contains all the code needed to reproduce the figures in the paper.
The "Marengo_newID_March242023.rds" file is available at the following address: https://zenodo.org/badge/DOI/10.5281/zenodo.7566113.svg (Zenodo DOI: 10.5281/zenodo.7566113).
The "all_res_deg_for_heat_updated_march2023.txt" file contains the unfiltered results from the differential gene expression (DGE) analysis, and was also used to create the DGE heatmap and volcano plots.
The "genes_for_heatmap_fig5F.xlsx" file contains the genes included in the heatmap in Figure 5F.

3. Linked Files

This repository contains code for the analysis of a single-cell RNA-seq dataset. The dataset comprises raw FASTQ files as well as the aligned files, which were deposited in GEO. The "Rdata" or "Rds" file was deposited in Zenodo. Descriptions of the linked datasets are provided below:

Gene Expression Omnibus (GEO) ID: GSE223311 (https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE223311)

Title: Gene expression profile at single cell level of CD4+ and CD8+ tumor infiltrating lymphocytes (TIL) originating from the EMT6 tumor model from mSTAR1302 treatment.
Description: This submission contains the "matrix.mtx", "barcodes.tsv", and "genes.tsv" files for each replicate and condition, corresponding to the aligned files for single cell sequencing data.
Submission type: Private. In order to gain access to the repository, you must use a reviewer token (https://www.ncbi.nlm.nih.gov/geo/info/reviewer.html).

Sequence Read Archive (SRA) repository IDs: SRX19088718 and SRX19088719

Title: Gene expression profile at single cell level of CD4+ and CD8+ tumor infiltrating lymphocytes (TIL) originating from the EMT6 tumor model from mSTAR1302 treatment.
Description: This submission contains the raw sequencing (.fastq.gz) files, which are gzip-compressed text files in FASTQ format.
Submission type: Private. In order to gain access to the repository, you must use a reviewer token (https://www.ncbi.nlm.nih.gov/geo/info/reviewer.html).

Zenodo DOI: 10.5281/zenodo.7566113 (https://zenodo.org/record/7566113#.ZCcmvC2cbrJ)

Title: A TCR β chain-directed antibody-fusion molecule that activates and expands subsets of T cells and promotes antitumor activity.
Description: This submission contains the "Rdata" or ".Rds" file, which is an R object file. This file is required to run the code.
Submission type: Restricted Access. To gain access to the repository, you must contact the author.
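
For readers who obtain the GEO files, the "matrix.mtx", "barcodes.tsv", and "genes.tsv" triplet described above can be read into a sparse count matrix. This is a minimal sketch using the Matrix package, not the authors' loading code. The helper name `read_count_triplet` and the sample directory name are hypothetical, and the sketch assumes uncompressed files with a gene label in the last column of genes.tsv.

```r
library(Matrix)

# Read one sample directory holding matrix.mtx, barcodes.tsv and genes.tsv
# into a sparse genes x cells matrix with row and column names.
read_count_triplet <- function(sample_dir) {
  counts   <- Matrix::readMM(file.path(sample_dir, "matrix.mtx"))
  barcodes <- read.delim(file.path(sample_dir, "barcodes.tsv"),
                         header = FALSE, stringsAsFactors = FALSE)
  genes    <- read.delim(file.path(sample_dir, "genes.tsv"),
                         header = FALSE, stringsAsFactors = FALSE)
  # The three files must agree on the matrix dimensions.
  stopifnot(nrow(counts) == nrow(genes), ncol(counts) == nrow(barcodes))
  # Use the last column of genes.tsv as the gene label; make.unique()
  # guards against repeated gene symbols.
  rownames(counts) <- make.unique(genes[[ncol(genes)]])
  colnames(counts) <- barcodes[[1]]
  counts
}

sample_dir <- "GSE223311_sample1"  # hypothetical directory name
if (dir.exists(sample_dir)) {
  counts <- read_count_triplet(sample_dir)
  print(dim(counts))
}
```

With Seurat installed, `Seurat::Read10X()` reads the same three-file layout in a single call and may be the more convenient route.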

4. Installation and Instructions

The code included in this submission requires several essential packages, as listed above. Please follow these instructions for installation:

Ensure you have R version 4.1.2 or higher for compatibility.

Although it is not essential, you can use RStudio (version 2022.12.0+353) to access and execute the code.

Download the "Rdata" or ".Rds" file from Zenodo (https://zenodo.org/record/7566113#.ZCcmvC2cbrJ) (Zenodo DOI: 10.5281/zenodo.7566113).
Open RStudio (https://www.rstudio.com/tags/rstudio-ide/) or a similar integrated development environment (IDE) for R.
Set your working directory to where the following files are located:

marengo_code_for_paper_jan_2023.R
Install_Packages.R
Marengo_newID_March242023.rds
genes_for_heatmap_fig5F.xlsx
all_res_deg_for_heat_updated_march2023.txt

You can use the following code to set the working directory in R:

setwd(directory)

Open the file titled "Install_Packages.R" and execute it in your R IDE. This script will attempt to install all the necessary packages and their dependencies, setting up an environment in which the code in "marengo_code_for_paper_jan_2023.R" can be executed.
Once the "Install_Packages.R" script has executed successfully, restart RStudio or your IDE of choice.
Open the "marengo_code_for_paper_jan_2023.R" file in RStudio or your IDE of choice.
Execute the commands in "marengo_code_for_paper_jan_2023.R" to generate the plots.
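
The setup steps above can be sanity-checked before running the main script. The following sketch checks the R version and the presence of the listed files in the working directory, then loads the .rds object. The helper `missing_files` and the variable name `marengo` are assumptions for illustration and do not come from the repository.

```r
# Files the instructions expect in the working directory.
required_files <- c(
  "marengo_code_for_paper_jan_2023.R",
  "Install_Packages.R",
  "Marengo_newID_March242023.rds",
  "genes_for_heatmap_fig5F.xlsx",
  "all_res_deg_for_heat_updated_march2023.txt"
)

# Return the names from `files` that are absent from `dir`.
missing_files <- function(files, dir = getwd()) {
  files[!file.exists(file.path(dir, files))]
}

# The instructions ask for R 4.1.2 or higher.
r_version_ok <- getRversion() >= "4.1.2"
if (!r_version_ok) warning("R ", getRversion(), " is older than 4.1.2.")

absent <- missing_files(required_files)
if (length(absent) > 0) {
  message("Not found in ", getwd(), ": ", paste(absent, collapse = ", "))
} else {
  # Load the R object once every expected file is in place.
  marengo <- readRDS("Marengo_newID_March242023.rds")
  print(class(marengo))
}
```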
