100+ datasets found

Data, R code and output Seurat Objects for single cell RNA-seq analysis of...
figshare.com
application/gzip
Updated May 31, 2023
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Yunshun Chen; Gordon Smyth (2023). Data, R code and output Seurat Objects for single cell RNA-seq analysis of human breast tissues [Dataset]. http://doi.org/10.6084/m9.figshare.17058077.v1
Explore at:
application/gzipAvailable download formats
Unique identifier
https://doi.org/10.6084/m9.figshare.17058077.v1
Dataset updated
May 31, 2023
Dataset provided by
figshare
Figsharehttp://figshare.com/
Authors
Yunshun Chen; Gordon Smyth
License
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This dataset contains all the Seurat objects that were used for generating all the figures in Pal et al. 2021 (https://doi.org/10.15252/embj.2020107333). All the Seurat objects were created under R v3.6.1 using the Seurat package v3.1.1. The detailed information of each object is listed in a table in Chen et al. 2021.
Repository for Single Cell RNA Sequencing Analysis of The EMT6 Dataset
zenodo.org
data.niaid.nih.gov
bin, txt
Updated Nov 20, 2023
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Jonathan Hsu; Allart Stoop; Jonathan Hsu; Allart Stoop (2023). Repository for Single Cell RNA Sequencing Analysis of The EMT6 Dataset [Dataset]. http://doi.org/10.5281/zenodo.10011622
Explore at:
bin, txtAvailable download formats
Unique identifier
https://doi.org/10.5281/zenodo.10011622
Dataset updated
Nov 20, 2023
Dataset provided by
Zenodohttp://zenodo.org/
Authors
Jonathan Hsu; Allart Stoop; Jonathan Hsu; Allart Stoop
Description
Table of Contents
Main Description
File Descriptions
Linked Files
Installation and Instructions

1. Main Description
---------------------------
This is the Zenodo repository for the manuscript titled "A TCR β chain-directed antibody-fusion molecule that activates and expands subsets of T cells and promotes antitumor activity.". The code included in the file titled `marengo_code_for_paper_jan_2023.R` was used to generate the figures from the single-cell RNA sequencing data.
The following libraries are required for script execution:
Seurat
scReportoire
ggplot2
stringr
dplyr
ggridges
ggrepel
ComplexHeatmap

File Descriptions
---------------------------
The code can be downloaded and opened in RStudios.
The "marengo_code_for_paper_jan_2023.R" contains all the code needed to reproduce the figues in the paper
The "Marengo_newID_March242023.rds" file is available at the following address: https://zenodo.org/badge/DOI/10.5281/zenodo.7566113.svg (Zenodo DOI: 10.5281/zenodo.7566113).
The "all_res_deg_for_heat_updated_march2023.txt" file contains the unfiltered results from DGE anlaysis, also used to create the heatmap with DGE and volcano plots.
The "genes_for_heatmap_fig5F.xlsx" contains the genes included in the heatmap in figure 5F.

Linked Files
---------------------

This repository contains code for the analysis of single cell RNA-seq dataset. The dataset contains raw FASTQ files, as well as, the aligned files that were deposited in GEO. The "Rdata" or "Rds" file was deposited in Zenodo. Provided below are descriptions of the linked datasets:

Gene Expression Omnibus (GEO) ID: GSE223311(https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE223311)
Title: Gene expression profile at single cell level of CD4+ and CD8+ tumor infiltrating lymphocytes (TIL) originating from the EMT6 tumor model from mSTAR1302 treatment.
Description: This submission contains the "matrix.mtx", "barcodes.tsv", and "genes.tsv" files for each replicate and condition, corresponding to the aligned files for single cell sequencing data.
Submission type: Private. In order to gain access to the repository, you must use a reviewer token (https://www.ncbi.nlm.nih.gov/geo/info/reviewer.html).

Sequence read archive (SRA) repository ID: SRX19088718 and SRX19088719
Title: Gene expression profile at single cell level of CD4+ and CD8+ tumor infiltrating lymphocytes (TIL) originating from the EMT6 tumor model from mSTAR1302 treatment.
Description: This submission contains the **raw sequencing** or `.fastq.gz` files, which are tab delimited text files.
Submission type: Private. In order to gain access to the repository, you must use a reviewer token (https://www.ncbi.nlm.nih.gov/geo/info/reviewer.html).

Zenodo DOI: 10.5281/zenodo.7566113(https://zenodo.org/record/7566113#.ZCcmvC2cbrJ)
Title: A TCR β chain-directed antibody-fusion molecule that activates and expands subsets of T cells and promotes antitumor activity.
Description: This submission contains the "Rdata" or ".Rds" file, which is an R object file. This is a necessary file to use the code.
Submission type: Restricted Acess. In order to gain access to the repository, you must contact the author.

Installation and Instructions
--------------------------------------
The code included in this submission requires several essential packages, as listed above. Please follow these instructions for installation:

> Ensure you have R version 4.1.2 or higher for compatibility.

> Although it is not essential, you can use R-Studios (Version 2022.12.0+353 (2022.12.0+353)) for accessing and executing the code.

1. Download the *"Rdata" or ".Rds" file from Zenodo (https://zenodo.org/record/7566113#.ZCcmvC2cbrJ) (Zenodo DOI: 10.5281/zenodo.7566113).
2. Open R-Studios (https://www.rstudio.com/tags/rstudio-ide/) or a similar integrated development environment (IDE) for R.
3. Set your working directory to where the following files are located:
marengo_code_for_paper_jan_2023.R
Install_Packages.R
Marengo_newID_March242023.rds
genes_for_heatmap_fig5F.xlsx
all_res_deg_for_heat_updated_march2023.txt

You can use the following code to set the working directory in R:
> setwd(directory)

4. Open the file titled "Install_Packages.R" and execute it in R IDE. This script will attempt to install all the necessary pacakges, and its dependencies in order to set up an environment where the code in "marengo_code_for_paper_jan_2023.R" can be executed.
5. Once the "Install_Packages.R" script has been successfully executed, re-start R-Studios or your IDE of choice.
6. Open the file "marengo_code_for_paper_jan_2023.R" file in R-studios or your IDE of choice.
7. Execute commands in the file titled "marengo_code_for_paper_jan_2023.R" in R-Studios or your IDE of choice to generate the plots.
o
Repository for the single cell RNA sequencing data analysis for the human...
explore.openaire.eu
Updated Aug 26, 2023
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Jonathan; Andrew; Pierre; Allart; Adrian (2023). Repository for the single cell RNA sequencing data analysis for the human manuscript. [Dataset]. http://doi.org/10.5281/zenodo.8286134
Explore at:
Unique identifier
https://doi.org/10.5281/zenodo.8286134
Dataset updated
Aug 26, 2023
Authors
Jonathan; Andrew; Pierre; Allart; Adrian
Description
This is the GitHub repository for the single cell RNA sequencing data analysis for the human manuscript. The following essential libraries are required for script execution: Seurat scReportoire ggplot2 dplyr ggridges ggrepel ComplexHeatmap Linked File: -------------------------------------- This repository contains code for the analysis of single cell RNA-seq dataset. The dataset contains raw FASTQ files, as well as, the aligned files that were deposited in GEO. Provided below are descriptions of the linked datasets: 1. Gene Expression Omnibus (GEO) ID: GSE229626 - Title: Gene expression profile at single cell level of human T cells stimulated via antibodies against the T Cell Receptor (TCR) - Description: This submission contains the matrix.mtx, barcodes.tsv, and genes.tsv files for each replicate and condition, corresponding to the aligned files for single cell sequencing data. - Submission type: Private. In order to gain access to the repository, you must use a "reviewer token"(https://www.ncbi.nlm.nih.gov/geo/info/reviewer.html). 2. Sequence read archive (SRA) repository - Title: Gene expression profile at single cell level of human T cells stimulated via antibodies against the T Cell Receptor (TCR) - Description: This submission contains the "raw sequencing" or .fastq.gz files, which are tab delimited text files. - Submission type: Private. In order to gain access to the repository, you must use a "reviewer token" (https://www.ncbi.nlm.nih.gov/geo/info/reviewer.html). Please note that since the GSE submission is private, the raw data deposited at SRA may not be accessible until the embargo on GSE229626 has been lifted. Installation and Instructions -------------------------------------- The code included in this submission requires several essential packages, as listed above. Please follow these instructions for installation: > Ensure you have R version 4.1.2 or higher for compatibility. > Although it is not essential, you can use R-Studios (Version 2022.12.0+353 (2022.12.0+353)) for accessing and executing the code. The following code can be used to set working directory in R: > setwd(directory) Steps: 1. Download the "Human_code_April2023.R" and "Install_Packages.R" R scripts, and the processed data from GSE229626. 2. Open "R-Studios"(https://www.rstudio.com/tags/rstudio-ide/) or a similar integrated development environment (IDE) for R. 3. Set your working directory to where the following files are located: - Human_code_April2023.R - Install_Packages.R 4. Open the file titled Install_Packages.R and execute it in R IDE. This script will attempt to install all the necessary pacakges, and its dependencies. 5. Open the Human_code_April2023.R R script and execute commands as necessary.
Scripts for Analysis
figshare.com
txt
Updated Jul 18, 2018
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Sneddon Lab UCSF (2018). Scripts for Analysis [Dataset]. http://doi.org/10.6084/m9.figshare.6783569.v2
Explore at:
txtAvailable download formats
Unique identifier
https://doi.org/10.6084/m9.figshare.6783569.v2
Dataset updated
Jul 18, 2018
Dataset provided by
Figsharehttp://figshare.com/
Authors
Sneddon Lab UCSF
License
MIT Licensehttps://opensource.org/licenses/MIT
License information was derived automatically
Description
Scripts used for analysis of V1 and V2 Datasets.seurat_v1.R - initialize seurat object from 10X Genomics cellranger outputs. Includes filtering, normalization, regression, variable gene identification, PCA analysis, clustering, tSNE visualization. Used for v1 datasets. merge_seurat.R - merge two or more seurat objects into one seurat object. Perform linear regression to remove batch effects from separate objects. Used for v1 datasets. subcluster_seurat_v1.R - subcluster clusters of interest from Seurat object. Determine variable genes, perform regression and PCA. Used for v1 datasets.seurat_v2.R - initialize seurat object from 10X Genomics cellranger outputs. Includes filtering, normalization, regression, variable gene identification, and PCA analysis. Used for v2 datasets. clustering_markers_v2.R - clustering and tSNE visualization for v2 datasets. subcluster_seurat_v2.R - subcluster clusters of interest from Seurat object. Determine variable genes, perform regression and PCA analysis. Used for v2 datasets.seurat_object_analysis_v1_and_v2.R - downstream analysis and plotting functions for seurat object created by seurat_v1.R or seurat_v2.R. merge_clusters.R - merge clusters that do not meet gene threshold. Used for both v1 and v2 datasets. prepare_for_monocle_v1.R - subcluster cells of interest and perform linear regression, but not scaling in order to input normalized, regressed values into monocle with monocle_seurat_input_v1.R monocle_seurat_input_v1.R - monocle script using seurat batch corrected values as input for v1 merged timecourse datasets. monocle_lineage_trace.R - monocle script using nUMI as input for v2 lineage traced dataset. monocle_object_analysis.R - downstream analysis for monocle object - BEAM and plotting. CCA_merging_v2.R - script for merging v2 endocrine datasets with canonical correlation analysis and determining the number of CCs to include in downstream analysis. CCA_alignment_v2.R - script for downstream alignment, clustering, tSNE visualization, and differential gene expression analysis.
o
WORKSHOP: Single cell RNAseq analysis in R
explore.openaire.eu
Updated Sep 26, 2023
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Sarah Williams; Adele Barugahare; Paul Harrison; Laura Perlaza Jimenez; Nicholas Matigan; Valentine Murigneux; Magdalena Antczak; Uwe Winter (2023). WORKSHOP: Single cell RNAseq analysis in R [Dataset]. http://doi.org/10.5281/zenodo.10042918
Explore at:
Unique identifier
https://doi.org/10.5281/zenodo.10042918
Dataset updated
Sep 26, 2023
Authors
Sarah Williams; Adele Barugahare; Paul Harrison; Laura Perlaza Jimenez; Nicholas Matigan; Valentine Murigneux; Magdalena Antczak; Uwe Winter
Description
This record includes training materials associated with the Australian BioCommons workshop 'Single cell RNAseq analysis in R'. This workshop took place over two, 3.5 hour sessions on 26 and 27 October 2023. Event description Analysis and interpretation of single cell RNAseq (scRNAseq) data requires dedicated workflows. In this hands-on workshop we will show you how to perform single cell analysis using Seurat - an R package for QC, analysis, and exploration of single-cell RNAseq data. We will discuss the 'why' behind each step and cover reading in the count data, quality control, filtering, normalisation, clustering, UMAP layout and identification of cluster markers. We will also explore various ways of visualising single cell expression data. This workshop is presented by the Australian BioCommons, Queensland Cyber Infrastructure Foundation (QCIF) and the Monash Genomics and Bioinformatics Platform with the assistance of a network of facilitators from the national Bioinformatics Training Cooperative. Lead trainers: Sarah Williams, Adele Barugahare, Paul Harrison, Laura Perlaza Jimenez Facilitators: Nick Matigan, Valentine Murigneux, Magdalena (Magda) Antczak Infrastructure provision: Uwe Winter Coordinator: Melissa Burke Training materials Materials are shared under a Creative Commons Attribution 4.0 International agreement unless otherwise specified and were current at the time of the event. Files and materials included in this record: Event metadata (PDF): Information about the event including, description, event URL, learning objectives, prerequisites, technical requirements etc. Index of training materials (PDF): List and description of all materials associated with this event including the name, format, location and a brief description of each file. scRNAseq_Schedule (PDF): A breakdown of the topics and timings for the workshop Materials shared elsewhere: This workshop follows the tutorial 'scRNAseq Analysis in R with Seurat' https://swbioinf.github.io/scRNAseqInR_Doco/index.html Slides used to introduce key topics are available via GitHub https://github.com/swbioinf/scRNAseqInR_Doco/tree/main/slides This material is based on the introductory Guided Clustering Tutorial tutorial from Seurat. It is also drawing from a similar workshop held by Monash Bioinformatics Platform Single-Cell-Workshop, with material here.
Z
Data Repository: Single-cell mapper (scMappR): using scRNA-seq to infer...
data.niaid.nih.gov
zenodo.org
Updated Feb 12, 2021
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Melissa M. Holmes (2021). Data Repository: Single-cell mapper (scMappR): using scRNA-seq to infer cell-type specificities of differentially expressed genes [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_4278129
Explore at:
Dataset updated
Feb 12, 2021
Dataset provided by
Lauren Erdman
Michael D Wilson
Anna Goldenberg
Helen Zhu
Mariela Faykoo-Martinez
Cadia Chan
Huayun Hou
Dustin Sokolowski
Melissa M. Holmes
License
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Data repository for the scMappR manuscript:

Abstract from biorXiv (https://www.biorxiv.org/content/10.1101/2020.08.24.265298v1.full).

RNA sequencing (RNA-seq) is widely used to identify differentially expressed genes (DEGs) and reveal biological mechanisms underlying complex biological processes. RNA-seq is often performed on heterogeneous samples and the resulting DEGs do not necessarily indicate the cell types where the differential expression occurred. While single-cell RNA-seq (scRNA-seq) methods solve this problem, technical and cost constraints currently limit its widespread use. Here we present single cell Mapper (scMappR), a method that assigns cell-type specificity scores to DEGs obtained from bulk RNA-seq by integrating cell-type expression data generated by scRNA-seq and existing deconvolution methods. After benchmarking scMappR using RNA-seq data obtained from sorted blood cells, we asked if scMappR could reveal known cell-type specific changes that occur during kidney regeneration. We found that scMappR appropriately assigned DEGs to cell-types involved in kidney regeneration, including a relatively small proportion of immune cells. While scMappR can work with any user supplied scRNA-seq data, we curated scRNA-seq expression matrices for ∼100 human and mouse tissues to facilitate its use with bulk RNA-seq data alone. Overall, scMappR is a user-friendly R package that complements traditional differential expression analysis available at CRAN.
l
cellCounts
opal.latrobe.edu.au
researchdata.edu.au
bin
Updated Dec 19, 2022
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Yang Liao; Dinesh Raghu; Bhupinder Pal; Lisa Mielke; Wei Shi (2022). cellCounts [Dataset]. http://doi.org/10.26181/21588276.v3
Explore at:
binAvailable download formats
Unique identifier
https://doi.org/10.26181/21588276.v3
Dataset updated
Dec 19, 2022
Dataset provided by
La Trobe
Authors
Yang Liao; Dinesh Raghu; Bhupinder Pal; Lisa Mielke; Wei Shi
License
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This page includes the data and code necessary to reproduce the results of the following paper: Yang Liao, Dinesh Raghu, Bhupinder Pal, Lisa Mielke and Wei Shi. cellCounts: fast and accurate quantification of 10x Chromium single-cell RNA sequencing data. Under review. A Linux computer running an operating system of CentOS 7 (or later) or Ubuntu 20.04 (or later) is recommended for running this analysis. The computer should have >2 TB of disk space and >64 GB of RAM. The following software packages need to be installed before running the analysis. Software executables generated after installation should be included in the $PATH environment variable.

R (v4.0.0 or newer) https://www.r-project.org/ Rsubread (v2.12.2 or newer) http://bioconductor.org/packages/3.16/bioc/html/Rsubread.html CellRanger (v6.0.1) https://support.10xgenomics.com/single-cell-gene-expression/software/overview/welcome STARsolo (v2.7.10a) https://github.com/alexdobin/STAR sra-tools (v2.10.0 or newer) https://github.com/ncbi/sra-tools Seurat (v3.0.0 or newer) https://satijalab.org/seurat/ edgeR (v3.30.0 or newer) https://bioconductor.org/packages/edgeR/ limma (v3.44.0 or newer) https://bioconductor.org/packages/limma/ mltools (v0.3.5 or newer) https://cran.r-project.org/web/packages/mltools/index.html

Reference packages generated by 10x Genomics are also required for this analysis and they can be downloaded from the following link (2020-A version for individual human and mouse reference packages should be selected): https://support.10xgenomics.com/single-cell-gene-expression/software/downloads/latest After all these are done, you can simply run the shell script ‘test-all-new.bash’ to perform all the analyses carried out in the paper. This script will automatically download the mixture scRNA-seq data from the SRA database, and it will output a text file called ‘test-all.log’ that contains all the screen outputs and speed/accuracy results of CellRanger, STARsolo and cellCounts.
c
Data from: Reference transcriptomics of porcine peripheral immune cells...
s.cnmilf.com
agdatacommons.nal.usda.gov
+3more
Updated Jun 5, 2025
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Agricultural Research Service (2025). Data from: Reference transcriptomics of porcine peripheral immune cells created through bulk and single-cell RNA sequencing [Dataset]. https://s.cnmilf.com/user74170196/https/catalog.data.gov/dataset/data-from-reference-transcriptomics-of-porcine-peripheral-immune-cells-created-through-bul-e667c
Explore at:
Dataset updated
Jun 5, 2025
Dataset provided by
Agricultural Research Service
Description
This dataset contains files reconstructing single-cell data presented in 'Reference transcriptomics of porcine peripheral immune cells created through bulk and single-cell RNA sequencing' by Herrera-Uribe & Wiarda et al. 2021. Samples of peripheral blood mononuclear cells (PBMCs) were collected from seven pigs and processed for single-cell RNA sequencing (scRNA-seq) in order to provide a reference annotation of porcine immune cell transcriptomics at enhanced, single-cell resolution. Analysis of single-cell data allowed identification of 36 cell clusters that were further classified into 13 cell types, including monocytes, dendritic cells, B cells, antibody-secreting cells, numerous populations of T cells, NK cells, and erythrocytes. Files may be used to reconstruct the data as presented in the manuscript, allowing for individual query by other users. Scripts for original data analysis are available at https://github.com/USDA-FSEPRU/PorcinePBMCs_bulkRNAseq_scRNAseq. Raw data are available at https://www.ebi.ac.uk/ena/browser/view/PRJEB43826. Funding for this dataset was also provided by NRSP8: National Animal Genome Research Program (https://www.nimss.org/projects/view/mrp/outline/18464). Resources in this dataset:Resource Title: Herrera-Uribe & Wiarda et al. PBMCs - All Cells 10X Format. File Name: PBMC7_AllCells.zipResource Description: Zipped folder containing PBMC counts matrix, gene names, and cell IDs. Files are as follows: matrix of gene counts* (matrix.mtx.gx) gene names (features.tsv.gz) cell IDs (barcodes.tsv.gz) *The ‘raw’ count matrix is actually gene counts obtained following ambient RNA removal. During ambient RNA removal, we specified to calculate non-integer count estimations, so most gene counts are actually non-integer values in this matrix but should still be treated as raw/unnormalized data that requires further normalization/transformation. Data can be read into R using the function Read10X().Resource Title: Herrera-Uribe & Wiarda et al. PBMCs - All Cells Metadata. File Name: PBMC7_AllCells_meta.csvResource Description: .csv file containing metadata for cells included in the final dataset. Metadata columns include: nCount_RNA = the number of transcripts detected in a cell nFeature_RNA = the number of genes detected in a cell Loupe = cell barcodes; correspond to the cell IDs found in the .h5Seurat and 10X formatted objects for all cells prcntMito = percent mitochondrial reads in a cell Scrublet = doublet probability score assigned to a cell seurat_clusters = cluster ID assigned to a cell PaperIDs = sample ID for a cell celltypes = cell type ID assigned to a cellResource Title: Herrera-Uribe & Wiarda et al. PBMCs - All Cells PCA Coordinates. File Name: PBMC7_AllCells_PCAcoord.csvResource Description: .csv file containing first 100 PCA coordinates for cells. Resource Title: Herrera-Uribe & Wiarda et al. PBMCs - All Cells t-SNE Coordinates. File Name: PBMC7_AllCells_tSNEcoord.csvResource Description: .csv file containing t-SNE coordinates for all cells.Resource Title: Herrera-Uribe & Wiarda et al. PBMCs - All Cells UMAP Coordinates. File Name: PBMC7_AllCells_UMAPcoord.csvResource Description: .csv file containing UMAP coordinates for all cells.Resource Title: Herrera-Uribe & Wiarda et al. PBMCs - CD4 T Cells t-SNE Coordinates. File Name: PBMC7_CD4only_tSNEcoord.csvResource Description: .csv file containing t-SNE coordinates for only CD4 T cells (clusters 0, 3, 4, 28). A dataset of only CD4 T cells can be re-created from the PBMC7_AllCells.h5Seurat, and t-SNE coordinates used in publication can be re-assigned using this .csv file.Resource Title: Herrera-Uribe & Wiarda et al. PBMCs - CD4 T Cells UMAP Coordinates. File Name: PBMC7_CD4only_UMAPcoord.csvResource Description: .csv file containing UMAP coordinates for only CD4 T cells (clusters 0, 3, 4, 28). A dataset of only CD4 T cells can be re-created from the PBMC7_AllCells.h5Seurat, and UMAP coordinates used in publication can be re-assigned using this .csv file.Resource Title: Herrera-Uribe & Wiarda et al. PBMCs - Gamma Delta T Cells UMAP Coordinates. File Name: PBMC7_GDonly_UMAPcoord.csvResource Description: .csv file containing UMAP coordinates for only gamma delta T cells (clusters 6, 21, 24, 31). A dataset of only gamma delta T cells can be re-created from the PBMC7_AllCells.h5Seurat, and UMAP coordinates used in publication can be re-assigned using this .csv file.Resource Title: Herrera-Uribe & Wiarda et al. PBMCs - Gamma Delta T Cells t-SNE Coordinates. File Name: PBMC7_GDonly_tSNEcoord.csvResource Description: .csv file containing t-SNE coordinates for only gamma delta T cells (clusters 6, 21, 24, 31). A dataset of only gamma delta T cells can be re-created from the PBMC7_AllCells.h5Seurat, and t-SNE coordinates used in publication can be re-assigned using this .csv file.Resource Title: Herrera-Uribe & Wiarda et al. PBMCs - Gene Annotation Information. File Name: UnfilteredGeneInfo.txtResource Description: .txt file containing gene nomenclature information used to assign gene names in the dataset. 'Name' column corresponds to the name assigned to a feature in the dataset.Resource Title: Herrera-Uribe & Wiarda et al. PBMCs - All Cells H5Seurat. File Name: PBMC7.tarResource Description: .h5Seurat object of all cells in PBMC dataset. File needs to be untarred, then read into R using function LoadH5Seurat().
Source data for Manuscript (R result)
figshare.com
application/gzip
Updated Feb 22, 2023
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Hongzhou Lin (2023). Source data for Manuscript (R result) [Dataset]. http://doi.org/10.6084/m9.figshare.22139771.v1
Explore at:
application/gzipAvailable download formats
Unique identifier
https://doi.org/10.6084/m9.figshare.22139771.v1
Dataset updated
Feb 22, 2023
Dataset provided by
figshare
Figsharehttp://figshare.com/
Authors
Hongzhou Lin
License
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
singlecell seq result.rds : The Seurat Object which contains the single cell seq result in R. cellchat result.csv : The Ligand and receptor pairs by cellchat

molecular dock.pdb : The molecular dock result of BMP7 and adriamycin

Quantitative results for WB and qPCR.pzfx :

Quantitative results for WB and qPCR in Prism

Original Images for Westernblot.pdf:

Original Images for Westernblot (PDF Version)
f
MOESM11 of Benchmarking principal component analysis for large-scale...
springernature.figshare.com
application/x-gzip
Updated May 30, 2023
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Koki Tsuyuzaki; Hiroyuki Sato; Kenta Sato; Itoshi Nikaido (2023). MOESM11 of Benchmarking principal component analysis for large-scale single-cell RNA-sequencing [Dataset]. http://doi.org/10.6084/m9.figshare.11662101.v1
Explore at:
application/x-gzipAvailable download formats
Unique identifier
https://doi.org/10.6084/m9.figshare.11662101.v1
Dataset updated
May 30, 2023
Dataset provided by
figshare
Authors
Koki Tsuyuzaki; Hiroyuki Sato; Kenta Sato; Itoshi Nikaido
License
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Additional file 11 Pair plots of all the pCA (Brain) implementations.
Data used in SeuratIntegrate paper
zenodo.org
application/gzip, bin +2
Updated May 23, 2025
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Florian Specque; Florian Specque; Macha Nikolski; Macha Nikolski; Domitille Chalopin; Domitille Chalopin (2025). Data used in SeuratIntegrate paper [Dataset]. http://doi.org/10.5281/zenodo.15496601
Explore at:
bin, pdf, txt, application/gzipAvailable download formats
Unique identifier
https://doi.org/10.5281/zenodo.15496601
Dataset updated
May 23, 2025
Dataset provided by
Zenodohttp://zenodo.org/
Authors
Florian Specque; Florian Specque; Macha Nikolski; Macha Nikolski; Domitille Chalopin; Domitille Chalopin
License
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This repository gathers the data and code used to generate hepatocellular carcinoma analyses in the paper presenting SeuratIntegrate. It contains the scripts to reproduce the figures presented in the article. Some figures are also available as pdf files.

To be able to fully reproduce the results from the paper, one shoud:

download all the files

install R 4.3.3, with correspondig base R packages (stats, graphics, grDevices, utils, datasets, methods and base)

install R packages listed in the file sessionInfo.txt

install the provided version of SeuratIntegrate. In an R session, run:

remotes::install_local("path/to/SeuratIntegrate_0.4.1.tar.gz")

install (mini)conda if necessary (we used miniconda version 23.11.0)

install the conda environments (if it fails with the *package-list.yml files, use the *package-list-from-history.yml files instead):

conda env create --file SeuratIntegrate_bbknn_package-list.yml conda env create --file SeuratIntegrate_scanorama_package-list.yml conda env create --file SeuratIntegrate_scvi-tools_package-list.yml conda env create --file SeuratIntegrate_trvae_package-list.yml

open an R session to make the conda environments usable by SeuratIntegrate:

library(SeuratIntegrate) UpdateEnvCache("bbknn", conda.env = "SeuratIntegrate_bbknn", conda.env.is.path = FALSE) UpdateEnvCache("scanorama", conda.env = "SeuratIntegrate_scanorama", conda.env.is.path = FALSE) UpdateEnvCache("scvi", conda.env = "SeuratIntegrate_scvi-tools", conda.env.is.path = FALSE) UpdateEnvCache("trvae", conda.env = "SeuratIntegrate_trvae", conda.env.is.path = FALSE)

Once done, running the code in integrate.R should produce reproducible results. Note that lines 3 to 6 from integrate.R should be adapted to the user's setup.
integrate.R is subdivided into six main parts:

Preparation: lines 1-56

Preprocessing: lines 58-74

Integration: lines 76-121

Processing of integration outputs: lines 126-267

Scoring of integration outputs: lines 269-353

Plotting: lines 380-507

Intermediate SeuratObjects have been saved between steps 3 and 4 and 5 and 6 (liver10k_integrated_object.RDS and liver10k_integrated_scored_object.RDS respectively). It is possible to start with these intermediate SeuratObjects to avoid the preceding steps, given that the Preparation step is always run before.
Single cell T-cell Atlas (V3)
repository.uantwerpen.be
zenodo.org
Updated 2024
+ more versions
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Mullan, Kerry A. (2024). Single cell T-cell Atlas (V3) [Dataset]. http://doi.org/10.5281/ZENODO.12606320
Explore at:
Unique identifier
https://doi.org/10.5281/ZENODO.12606320
Dataset updated
2024
Dataset provided by
Zenodohttp://zenodo.org/
University of Antwerp
Faculty of Sciences. Mathematics and Computer Science
Authors
Mullan, Kerry A.
Description
The attached datasets comprised of the merging of 21 high quality single cell T cell based dataset that had both the TCR-seq and GEx. The object contains ~1.3 paired TCR-seq with GEx in the Seurat Object (supercluster_added_ID-240531.rds). We also included the original identifiers in the Sup_Update_labels.csv a. See our https://stegor.readthedocs.io/en/latest/ for how we processed the 12 datasets (V2) and decided on the current 47 T cell annotation models using scGate (TcellFunction). Additionally, based on collaborator recommendataion, we have also now included a simpler T cell annotion model in STEGO.R process (Tsimplefunctions). This is the accompanying data set for the paper entitled ‘T cell receptor-centric approach to streamline multimodal single-cell data analysis.’, which is currently available as a preprint (https://www.biorxiv.org/content/10.1101/2023.09.27.559702v2). Details on the origin of the datasets, and processing steps can be found there. The purpose of this atlas both the full dataset and down sampling version is to aid in improving the interpretability of other T cell based datasets. This can be done by adding in the down sampled object that contains up to 500 cells per annotation model. This dataset aims to improve the capacity to identify TCR-specific signature by ensuring a well covered background, which will improve the robustness of the FindMarker Function in Seurat package.
Data from: Robust clustering and interpretation of scRNA-seq data using...
zenodo.org
data.niaid.nih.gov
bin, txt, zip
Updated May 27, 2021
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Florian Schmidt; Florian Schmidt; Bobby Ranjan; Bobby Ranjan (2021). Robust clustering and interpretation of scRNA-seq data using reference component analysis [Dataset]. http://doi.org/10.5281/zenodo.4021967
Explore at:
bin, txt, zipAvailable download formats
Unique identifier
https://doi.org/10.5281/zenodo.4021967
Dataset updated
May 27, 2021
Dataset provided by
Zenodohttp://zenodo.org/
Authors
Florian Schmidt; Florian Schmidt; Bobby Ranjan; Bobby Ranjan
License
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Datasets and Code accompanying the new release of RCA, RCA2. The R-package for RCA2 is available at GitHub: https://github.com/prabhakarlab/RCAv2/

The datasets included here are:

Datasets required for a characterization of batch effects:

merged_rna_seurat.rds

de_list.rds

mergedRCAObj.rds

merged_rna_integrated.rds

10X_PBMCs.RDS: Processed 10X PBMC data RCA2 object (10X PBMC example data sets )

NBM_RDS_Files.zip: Several RDS files containing RCA2 object of Normal Bone Marrow (NBM) data, umap coordinates, doublet finder results and metadata information (Normal Bone Marrow use case)

Dataset used for the Covid19 example:

blish_covid.seu.rds

rownames_of_glocal_projection_immune_cells.txt

Data sets used to outline the ability of supervised clustering to detect disease states:

809653.seurat.rds

blish_covid.seu.rds

Performance benchmarking results:

Memory_consumption.txt

rca_time_list.rds

The R script provides R code to regenerate the main paper Figures 2 to 7 modulo some visual modifications performed in Inkscape.

Provided R scripts are:

ComputePairWiseDE_v2.R (Required code for pairwise DE computation)

RCA_Figure_Reproduction.R
n
Data from: Single cell RNA-seq analysis reveals that prenatal arsenic...
data.niaid.nih.gov
datadryad.org
+1more
zip
Updated Jun 1, 2020
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Britton Goodale; Kevin Hsu; Kenneth Ely; Thomas Hampton; Bruce Stanton; Richard Enelow (2020). Single cell RNA-seq analysis reveals that prenatal arsenic exposure results in long-term, adverse effects on immune gene expression in response to Influenza A infection [Dataset]. http://doi.org/10.5061/dryad.vt4b8gtp6
Explore at:
zipAvailable download formats
Unique identifier
https://doi.org/10.5061/dryad.vt4b8gtp6
Dataset updated
Jun 1, 2020
Dataset provided by
Dartmouth College
Dartmouth–Hitchcock Medical Center
Authors
Britton Goodale; Kevin Hsu; Kenneth Ely; Thomas Hampton; Bruce Stanton; Richard Enelow
License
https://spdx.org/licenses/CC0-1.0.htmlhttps://spdx.org/licenses/CC0-1.0.html
Description
Arsenic exposure via drinking water is a serious environmental health concern. Epidemiological studies suggest a strong association between prenatal arsenic exposure and subsequent childhood respiratory infections, as well as morbidity from respiratory diseases in adulthood, long after systemic clearance of arsenic. We investigated the impact of exclusive prenatal arsenic exposure on the inflammatory immune response and respiratory health after an adult influenza A (IAV) lung infection. C57BL/6J mice were exposed to 100 ppb sodium arsenite in utero, and subsequently infected with IAV (H1N1) after maturation to adulthood. Assessment of lung tissue and bronchoalveolar lavage fluid (BALF) at various time points post IAV infection reveals greater lung damage and inflammation in arsenic exposed mice versus control mice. Single-cell RNA sequencing analysis of immune cells harvested from IAV infected lungs suggests that the enhanced inflammatory response is mediated by dysregulation of innate immune function of monocyte derived macrophages, neutrophils, NK cells, and alveolar macrophages. Our results suggest that prenatal arsenic exposure results in lasting effects on the adult host innate immune response to IAV infection, long after exposure to arsenic, leading to greater immunopathology. This study provides the first direct evidence that exclusive prenatal exposure to arsenic in drinking water causes predisposition to a hyperinflammatory response to IAV infection in adult mice, which is associated with significant lung damage.

Methods Whole lung homogenate preparation for single cell RNA sequencing (scRNA-seq).

Lungs were perfused with PBS via the right ventricle, harvested, and mechanically disassociated prior to straining through 70- and 30-µm filters to obtain a single-cell suspension. Dead cells were removed (annexin V EasySep kit, StemCell Technologies, Vancouver, Canada), and samples were enriched for cells of hematopoetic origin by magnetic separation using anti-CD45-conjugated microbeads (Miltenyi, Auburn, CA). Single-cell suspensions of 6 samples were loaded on a Chromium Single Cell system (10X Genomics) to generate barcoded single-cell gel beads in emulsion, and scRNA-seq libraries were prepared using Single Cell 3’ Version 2 chemistry. Libraries were multiplexed and sequenced on 4 lanes of a Nextseq 500 sequencer (Illumina) with 3 sequencing runs. Demultiplexing and barcode processing of raw sequencing data was conducted using Cell Ranger v. 3.0.1 (10X Genomics; Dartmouth Genomics Shared Resource Core). Reads were aligned to mouse (GRCm38) and influenza A virus (A/PR8/34, genome build GCF_000865725.1) genomes to generate unique molecular index (UMI) count matrices. Gene expression data have been deposited in the NCBI GEO database and are available at accession # GSE142047.

Preprocessing of single cell RNA sequencing (scRNA-seq) data

Count matrices produced using Cell Ranger were analyzed in the R statistical working environment (version 3.6.1). Preliminary visualization and quality analysis were conducted using scran (v 1.14.3, Lun et al., 2016) and Scater (v. 1.14.1, McCarthy et al., 2017) to identify thresholds for cell quality and feature filtering. Sample matrices were imported into Seurat (v. 3.1.1, Stuart., et al., 2019) and the percentage of mitochondrial, hemoglobin, and influenza A viral transcripts calculated per cell. Cells with < 1000 or > 20,000 unique molecular identifiers (UMIs: low quality and doublets), fewer than 300 features (low quality), greater than 10% of reads mapped to mitochondrial genes (dying) or greater than 1% of reads mapped to hemoglobin genes (red blood cells) were filtered from further analysis. Total cells per sample after filtering ranged from 1895-2482, no significant difference in the number of cells was observed in arsenic vs. control. Data were then normalized using SCTransform (Hafemeister et al., 2019) and variable features identified for each sample. Integration anchors between samples were identified using canonical correlation analysis (CCA) and mutual nearest neighbors (MNNs), as implemented in Seurat V3 (Stuart., et al., 2019) and used to integrate samples into a shared space for further comparison. This process enables identification of shared populations of cells between samples, even in the presence of technical or biological differences, while also allowing for non-overlapping populations that are unique to individual samples.

Clustering and reference-based cell identity labeling of single immune cells from IAV-infected lung with scRNA-seq

Principal components were identified from the integrated dataset and were used for Uniform Manifold Approximation and Projection (UMAP) visualization of the data in two-dimensional space. A shared-nearest-neighbor (SNN) graph was constructed using default parameters, and clusters identified using the SLM algorithm in Seurat at a range of resolutions (0.2-2). The first 30 principal components were used to identify 22 cell clusters ranging in size from 25 to 2310 cells. Gene markers for clusters were identified with the findMarkers function in scran. To label individual cells with cell type identities, we used the singleR package (v. 3.1.1) to compare gene expression profiles of individual cells with expression data from curated, FACS-sorted leukocyte samples in the Immgen compendium (Aran D. et al., 2019; Heng et al., 2008). We manually updated the Immgen reference annotation with 263 sample group labels for fine-grain analysis and 25 CD45+ cell type identities based on markers used to sort Immgen samples (Guilliams et al., 2014). The reference annotation is provided in Table S2, cells that were not labeled confidently after label pruning were assigned “Unknown”.

Differential gene expression by immune cells

Differential gene expression within individual cell types was performed by pooling raw count data from cells of each cell type on a per-sample basis to create a pseudo-bulk count table for each cell type. Differential expression analysis was only performed on cell types that were sufficiently represented (>10 cells) in each sample. In droplet-based scRNA-seq, ambient RNA from lysed cells is incorporated into droplets, and can result in spurious identification of these genes in cell types where they aren’t actually expressed. We therefore used a method developed by Young and Behjati (Young et al., 2018) to estimate the contribution of ambient RNA for each gene, and identified genes in each cell type that were estimated to be > 25% ambient-derived. These genes were excluded from analysis in a cell-type specific manner. Genes expressed in less than 5 percent of cells were also excluded from analysis. Differential expression analysis was then performed in Limma (limma-voom with quality weights) following a standard protocol for bulk RNA-seq (Law et al., 2014). Significant genes were identified using MA/QC criteria of P < .05, log2FC >1.

Analysis of arsenic effect on immune cell gene expression by scRNA-seq.

Sample-wide effects of arsenic on gene expression were identified by pooling raw count data from all cells per sample to create a count table for pseudo-bulk gene expression analysis. Genes with less than 20 counts in any sample, or less than 60 total counts were excluded from analysis. Differential expression analysis was performed using limma-voom as described above.
Single-cell RNA-seq exposed to multiple compounds
kaggle.com
zip
Updated Mar 11, 2021
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Alexander Chervov (2021). Single-cell RNA-seq exposed to multiple compounds [Dataset]. https://www.kaggle.com/alexandervc/singlecell-rnaseq-exposed-to-multiple-compounds
Explore at:
zip(4314885205 bytes)Available download formats
Dataset updated
Mar 11, 2021
Authors
Alexander Chervov
Description
Remark 1: for cell cycle analysis - see paper https://arxiv.org/abs/2208.05229 "Computational challenges of cell cycle analysis using single cell transcriptomics" Alexander Chervov, Andrei Zinovyev

Remark 2: See same data at: https://www.kaggle.com/datasets/alexandervc/scrnaseq-exposed-to-multiple-compounds extracted pieces from huge file here - more easy to load and work.

Data and Context

Data - results of single cell RNA sequencing, i.e. rows - correspond to cells, columns to genes (or vice versa). value of the matrix shows how strong is "expression" of the corresponding gene in the corresponding cell. https://en.wikipedia.org/wiki/Single-cell_transcriptomics

Particular data:

Data - scRNA expressions for several cell lines affected by drugs with different doses/durations.

The data from https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE139944 Status Public on Dec 05, 2019 Title Massively multiplex chemical transcriptomics at single cell resolution Organisms Homo sapiens; Mus musculus Experiment type Expression profiling by high throughput sequencing Summary Single-cell RNA-seq libraries were generated using two and three level single-cell combinatorial indexing RNA sequencing (sci-RNA-seq) of untreated or small molecule inhibitor exposed HEK293T, NIH3T3, A549, MCF7 and K562 cells. Different cells and different treatment were hashed and pooled prior to sci-RNA-seq using a nuclear barcoding strategy. This nuclear barcoding strategy relies on fixation of barcode containing well-specific oligos that are specific to a given cell type, replicate or treatment condition.

The corresponding paper is here: https://pubmed.ncbi.nlm.nih.gov/31806696/ Science. 2020 Jan 3;367(6473):45-51 "Massively multiplex chemical transcriptomics at single-cell resolution" Sanjay R Srivatsan, ... , Cole Trapnell

The authors splitted data into 4 subdatasets - see sciPlex1, sciPlex2, sciPlex3,sciPlex4 in filenames. The main dataset is the sciPlex3 which contains about 600K cells.

The data splitted into small parts - which one can be easily loaded into memory can be found in https://www.kaggle.com/alexandervc/scrnaseq-exposed-to-multiple-compounds

Related datasets:

Other single cell RNA seq datasets can be found on kaggle: Look here: https://www.kaggle.com/alexandervc/datasets Or search kaggle for "scRNA-seq"

A collection of some bioinformatics related resources on kaggle: https://www.kaggle.com/general/203136

Inspiration

Single cell RNA sequencing is important technology in modern biology, see e.g. "Eleven grand challenges in single-cell data science" https://genomebiology.biomedcentral.com/articles/10.1186/s13059-020-1926-6

Also see review : Nature. P. Kharchenko: "The triumphs and limitations of computational methods for scRNA-seq" https://www.nature.com/articles/s41592-021-01171-x

Search scholar.google "challenges in single cell rna sequencing" https://scholar.google.fr/scholar?q=challenges+in+single+cell+rna+sequencing&hl=en&as_sdt=0&as_vis=1&oi=scholart gives many interesting and highly cited articles

(Cited 968) Computational and analytical challenges in single-cell transcriptomics Oliver Stegle, Sarah A. Teichmann, John C. Marioni Nat. Rev. Genet., 16 (3) (2015), pp. 133-145 https://www.nature.com/articles/nrg3833

Challenges in unsupervised clustering of single-cell RNA-seq data https://www.nature.com/articles/s41576-018-0088-9 Review Article 07 January 2019 Vladimir Yu Kiselev, Tallulah S. Andrews & Martin Hemberg Nature Reviews Genetics volume 20, pages273–282 (2019)

Challenges and emerging directions in single-cell analysis https://link.springer.com/article/10.1186/s13059-017-1218-y Published: 08 May 2017 Guo-Cheng Yuan, Long Cai, Michael Elowitz, Tariq Enver, Guoping Fan, Guoji Guo, Rafael Irizarry, Peter Kharchenko, Junhyong Kim, Stuart Orkin, John Quackenbush, Assieh Saadatpour, Timm Schroeder, Ramesh Shivdasani & Itay Tirosh Genome Biology volume 18, Article number: 84 (2017)

Single-Cell RNA Sequencing in Cancer: Lessons Learned and Emerging Challenges https://www.sciencedirect.com/science/article/pii/S1097276519303569 Molecular Cell Volume 75, Issue 1, 11 July 2019, Pages 7-12 Journal home page for Molecular Cell
MOESM15 of Benchmarking principal component analysis for large-scale...
springernature.figshare.com
txt
Updated Jun 1, 2023
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Koki Tsuyuzaki; Hiroyuki Sato; Kenta Sato; Itoshi Nikaido (2023). MOESM15 of Benchmarking principal component analysis for large-scale single-cell RNA-sequencing [Dataset]. http://doi.org/10.6084/m9.figshare.11662113.v1
Explore at:
txtAvailable download formats
Unique identifier
https://doi.org/10.6084/m9.figshare.11662113.v1
Dataset updated
Jun 1, 2023
Dataset provided by
Figsharehttp://figshare.com/
Authors
Koki Tsuyuzaki; Hiroyuki Sato; Kenta Sato; Itoshi Nikaido
License
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Additional file 15 Crashed jobs caused by out-of-memory errors.
Data from: MOESM8 of Benchmarking principal component analysis for...
springernature.figshare.com
application/x-gzip
Updated May 31, 2023
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Koki Tsuyuzaki; Hiroyuki Sato; Kenta Sato; Itoshi Nikaido (2023). MOESM8 of Benchmarking principal component analysis for large-scale single-cell RNA-sequencing [Dataset]. http://doi.org/10.6084/m9.figshare.11662170.v1
Explore at:
application/x-gzipAvailable download formats
Unique identifier
https://doi.org/10.6084/m9.figshare.11662170.v1
Dataset updated
May 31, 2023
Dataset provided by
Figsharehttp://figshare.com/
Authors
Koki Tsuyuzaki; Hiroyuki Sato; Kenta Sato; Itoshi Nikaido
License
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Additional file 8 Pair plots of all the pCA (PBMCs) implementations.
o
Datasets associated with the publication of the "satuRn" R package
explore.openaire.eu
data.niaid.nih.gov
+1more
Updated Jan 14, 2021
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Jeroen Gilis; Kristoffer Vitting-Seerup; Koen Van den Berge; Lieven Clement (2021). Datasets associated with the publication of the "satuRn" R package [Dataset]. http://doi.org/10.5281/zenodo.6826603
Explore at:
Unique identifier
https://doi.org/10.5281/zenodo.6826603
Dataset updated
Jan 14, 2021
Authors
Jeroen Gilis; Kristoffer Vitting-Seerup; Koen Van den Berge; Lieven Clement
Description
On this Zenodo link, we share the data that is required to reproduce all the analyses from our publication "satuRn: Scalable Analysis of differential Transcript Usage for bulk and single-cell RNA-sequencing applications". This repository includes input transcript-level expression matrices and metadata for all datasets, as well as intermediate results and final outputs of the respective DTU analyses. For a more elaborate description of the data, we refer to the companion GitHub for our publications; https://github.com/statOmics/satuRnPaper. Note that this is version 1.0.3 of the data (uploaded on 2022-07-08). If any changes were to be made to the datasets in the future, this will also be communicated on our companion GitHub page.
f
Processed naive T cell single-cell RNA-seq, Seurat object
figshare.com
application/gzip
Updated Jan 5, 2021
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Daniel Bunis (2021). Processed naive T cell single-cell RNA-seq, Seurat object [Dataset]. http://doi.org/10.6084/m9.figshare.11886891.v2
Explore at:
application/gzipAvailable download formats
Unique identifier
https://doi.org/10.6084/m9.figshare.11886891.v2
Dataset updated
Jan 5, 2021
Dataset provided by
figshare
Authors
Daniel Bunis
License
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Processed naive CD4 and CD8 T cell single-cell RNAseq data from human samples. The file contains a Seurat object stored as an .rds file which can be read into R with the readRDS() function. It was generated using the raw data of similar name in this project, as well as the code stored here: https://github.com/dtm2451/ProgressiveHematopoiesis
Single-cell and spatial transcriptomics of stricturing Crohn's disease
zenodo.org
application/gzip, bin
Updated Sep 1, 2025
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Lingjia Kong; Sathish Subramanian; Asa Segerstolpe; Vy Tran; Angela Shih; Grace Carter; Hiroko Kunitake; Shaina Tardus; Jasmine Li; Marco Kaper; Christy Cauley; Shivam Gandhi; Eric Chen; Caroline Porter; Toni Delorey; Liliana Bordeianou; Rocco Ricciardi; Ashwin Ananthakrishnan; Helena Lau; Richard Hodin; Jacques Deguine; Chris Smillie; Ramnik Xavier; Lingjia Kong; Sathish Subramanian; Asa Segerstolpe; Vy Tran; Angela Shih; Grace Carter; Hiroko Kunitake; Shaina Tardus; Jasmine Li; Marco Kaper; Christy Cauley; Shivam Gandhi; Eric Chen; Caroline Porter; Toni Delorey; Liliana Bordeianou; Rocco Ricciardi; Ashwin Ananthakrishnan; Helena Lau; Richard Hodin; Jacques Deguine; Chris Smillie; Ramnik Xavier (2025). Single-cell and spatial transcriptomics of stricturing Crohn's disease [Dataset]. http://doi.org/10.5281/zenodo.14509802
Explore at:
application/gzip, binAvailable download formats
Unique identifier
https://doi.org/10.5281/zenodo.14509802
Dataset updated
Sep 1, 2025
Dataset provided by
Zenodohttp://zenodo.org/
Authors
Lingjia Kong; Sathish Subramanian; Asa Segerstolpe; Vy Tran; Angela Shih; Grace Carter; Hiroko Kunitake; Shaina Tardus; Jasmine Li; Marco Kaper; Christy Cauley; Shivam Gandhi; Eric Chen; Caroline Porter; Toni Delorey; Liliana Bordeianou; Rocco Ricciardi; Ashwin Ananthakrishnan; Helena Lau; Richard Hodin; Jacques Deguine; Chris Smillie; Ramnik Xavier; Lingjia Kong; Sathish Subramanian; Asa Segerstolpe; Vy Tran; Angela Shih; Grace Carter; Hiroko Kunitake; Shaina Tardus; Jasmine Li; Marco Kaper; Christy Cauley; Shivam Gandhi; Eric Chen; Caroline Porter; Toni Delorey; Liliana Bordeianou; Rocco Ricciardi; Ashwin Ananthakrishnan; Helena Lau; Richard Hodin; Jacques Deguine; Chris Smillie; Ramnik Xavier
License
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This folder contains the spatial transcriptomics data + code. This code was generated by members of the Smillie Lab @ MGH and Harvard Medical School.

github.tar.gz: spatial analysis code and data

anndata.h5ad: anndata object (scanpy)

V*tar.gz: raw spatial transcriptomics files

The github.tar.gz folder contains everything you need to reproduce the spatial transcriptomics figures. It is structured as follows:

1.BayesPrism: code for running BayesPrism on spatial data

2.SparCC: code for running SparCC on spatial data

3.Lasso: code for running lasso regression on spatial data

4.Analysis: code for reproducing all figures in the paper

4.Analysis/1.analysis.r: script to reproduce all figures in the paper ***

code: code library containing all necessary functions

load_data.r: code to load the single-cell and spatial datasets

sco.rds: single-cell analysis object (10X Chromium) formatted as an R list

vis.rds: spatial analysis object (10X Visium) formatted as an R list

All scripts are numbered. You need to run everything in order. For convenience, we include the output files for 1.BayesPrism, 2.SparCC, and 3.Lasso, allowing you to skip straight to the analysis code in 4.Analysis.

To reproduce all figures in the paper, you need to do the following:

Edit your PROJECT_FOLDER in the header of load_data.r

Install the packages listed at the top of load_data.r

Go to the 4.Analysis directory, start an interactive R session, and type:
> source('1.analysis.r')

This will load the beginning of the 1.analysis.r script (until the stop() statement on line 68). You can run the code in two different ways:

You can step through the code line by line in your interactive R session (starting at line 68)

Alternatively, remove the stop() statement from the script, then run the code start to finish

If you encounter any errors, try to debug them using a combination of Google+ChatGPT. If you still have trouble, please contact the Smillie Lab.

Note: the single-cell and spatial code are also available on GitHub. However, the spatial analysis requires large files that cannot be hosted on GitHub. Therefore, it is better to download the code + files from Zenodo. The GitHub link is provided below:

https://github.com/LJ-Kong/fibrosis_scRNA_stRNA

Facebook

Twitter

Click to copy link

Link copied

Cite

Yunshun Chen; Gordon Smyth (2023). Data, R code and output Seurat Objects for single cell RNA-seq analysis of human breast tissues [Dataset]. http://doi.org/10.6084/m9.figshare.17058077.v1

Data, R code and output Seurat Objects for single cell RNA-seq analysis of human breast tissues

Explore at:

application/gzipAvailable download formats

Unique identifier

https://doi.org/10.6084/m9.figshare.17058077.v1

Dataset updated

May 31, 2023

Dataset provided by

figshare
Figsharehttp://figshare.com/

Authors

Yunshun Chen; Gordon Smyth

License

Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically

Description

This dataset contains all the Seurat objects that were used for generating all the figures in Pal et al. 2021 (https://doi.org/10.15252/embj.2020107333). All the Seurat objects were created under R v3.6.1 using the Seurat package v3.1.1. The detailed information of each object is listed in a table in Chen et al. 2021.

Clear search

Close search

Google apps

Main menu

Data, R code and output Seurat Objects for single cell RNA-seq analysis of...

Repository for Single Cell RNA Sequencing Analysis of The EMT6 Dataset

Repository for the single cell RNA sequencing data analysis for the human...

Scripts for Analysis

WORKSHOP: Single cell RNAseq analysis in R

Data Repository: Single-cell mapper (scMappR): using scRNA-seq to infer...

cellCounts

Data from: Reference transcriptomics of porcine peripheral immune cells...

Source data for Manuscript (R result)

MOESM11 of Benchmarking principal component analysis for large-scale...

Data used in SeuratIntegrate paper

Single cell T-cell Atlas (V3)

Data from: Robust clustering and interpretation of scRNA-seq data using...

Data from: Single cell RNA-seq analysis reveals that prenatal arsenic...

Single-cell RNA-seq exposed to multiple compounds

Data and Context

Particular data:

Related datasets:

Inspiration

MOESM15 of Benchmarking principal component analysis for large-scale...

Data from: MOESM8 of Benchmarking principal component analysis for...

Datasets associated with the publication of the "satuRn" R package

Processed naive T cell single-cell RNA-seq, Seurat object

Single-cell and spatial transcriptomics of stricturing Crohn's disease

Data, R code and output Seurat Objects for single cell RNA-seq analysis of human breast tissues