18 datasets found

Scripts for Analysis
figshare.com
txt
Updated Jul 18, 2018
Cite
Sneddon Lab UCSF (2018). Scripts for Analysis [Dataset]. http://doi.org/10.6084/m9.figshare.6783569.v2
Available download formats: txt
Unique identifier
https://doi.org/10.6084/m9.figshare.6783569.v2
Dataset updated
Jul 18, 2018
Dataset provided by
Figshare: http://figshare.com/
Authors
Sneddon Lab UCSF
License
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
Description
Scripts used for analysis of V1 and V2 Datasets.
seurat_v1.R - initialize seurat object from 10X Genomics cellranger outputs. Includes filtering, normalization, regression, variable gene identification, PCA analysis, clustering, tSNE visualization. Used for v1 datasets.
merge_seurat.R - merge two or more seurat objects into one seurat object. Perform linear regression to remove batch effects from separate objects. Used for v1 datasets.
subcluster_seurat_v1.R - subcluster clusters of interest from Seurat object. Determine variable genes, perform regression and PCA. Used for v1 datasets.
seurat_v2.R - initialize seurat object from 10X Genomics cellranger outputs. Includes filtering, normalization, regression, variable gene identification, and PCA analysis. Used for v2 datasets.
clustering_markers_v2.R - clustering and tSNE visualization for v2 datasets.
subcluster_seurat_v2.R - subcluster clusters of interest from Seurat object. Determine variable genes, perform regression and PCA analysis. Used for v2 datasets.
seurat_object_analysis_v1_and_v2.R - downstream analysis and plotting functions for seurat object created by seurat_v1.R or seurat_v2.R.
merge_clusters.R - merge clusters that do not meet gene threshold. Used for both v1 and v2 datasets.
prepare_for_monocle_v1.R - subcluster cells of interest and perform linear regression, but not scaling, in order to input normalized, regressed values into monocle with monocle_seurat_input_v1.R.
monocle_seurat_input_v1.R - monocle script using seurat batch corrected values as input for v1 merged timecourse datasets.
monocle_lineage_trace.R - monocle script using nUMI as input for v2 lineage traced dataset.
monocle_object_analysis.R - downstream analysis for monocle object - BEAM and plotting.
CCA_merging_v2.R - script for merging v2 endocrine datasets with canonical correlation analysis and determining the number of CCs to include in downstream analysis.
CCA_alignment_v2.R - script for downstream alignment, clustering, tSNE visualization, and differential gene expression analysis.
Processed CODEX Data (Seurat Objects)
plus.figshare.com
application/gzip
Updated Apr 12, 2024
Cite
Shovik Bandyopadhyay; Jonathan Sussman; Kyung Jin Ahn; Kai Tan (2024). Processed CODEX Data (Seurat Objects) [Dataset]. http://doi.org/10.25452/figshare.plus.25127657.v1
Available download formats: application/gzip
Unique identifier
https://doi.org/10.25452/figshare.plus.25127657.v1
Dataset updated
Apr 12, 2024
Dataset provided by
Figshare+
Authors
Shovik Bandyopadhyay; Jonathan Sussman; Kyung Jin Ahn; Kai Tan
License
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Description
Seurat objects containing the raw and normalized data for:
Normal bone marrow (NBM) atlas: contains all cells obtained through segmentation after filtering and QC. Includes coarse and fine levels of annotation that were obtained through an iterative process of subclustering. Neighborhood analysis results are included as a metadata column. Also includes additional Osteo-MSC and Fibro-MSC cells that were manually annotated.
AML/NSM CODEX data: contains all cells after filtering for 3 diagnostic and 2 post-therapy AML samples as well as 3 negative staging marrow samples. Cell labels were derived through reciprocal principal component analysis (RPCA) reference mapping onto the normal bone marrow atlas. Neighborhood analysis was conducted separately for AML Diagnostic, AML Post-Therapy, and NSM samples. Neighborhoods were manually annotated for each set. The results of the neighborhood analysis were merged and included in the metadata of the Seurat object.
All normalized data is stored in the Seurat assay object. Markers that were not included in normalization and downstream analysis are included with raw values as a metadata column.
Full source code used to generate these objects can be found on GitHub: https://github.com/shovikb94/spatial-bonemarrow-atlas/tree/main
See related materials in Collection at: https://doi.org/10.25452/figshare.plus.c.7174914
scRNA-seq_huang2019
search.dataone.org
dataverse.harvard.edu
Updated Nov 22, 2023
Cite
Huang, Kee Wui (2023). scRNA-seq_huang2019 [Dataset]. http://doi.org/10.7910/DVN/QB5CC8
Unique identifier
https://doi.org/10.7910/DVN/QB5CC8
Dataset updated
Nov 22, 2023
Dataset provided by
Harvard Dataverse
Authors
Huang, Kee Wui
Description
Serialized R data files (.rds) associated with the inDrop single-cell RNA-seq analysis in Huang et al., 2019. Each file has a single Seurat object containing a subset of clusters from the full processed dataset, which were separated into different objects due to file size limitations. Raw data (UMIFM counts) are included in the corresponding slot in each Seurat object. Seurat objects can be re-merged into a single object containing the full dataset using the MergeSeurat function.
scRNAseq_Dataset Merge AMI d5 (CD45+Fibroblast) + AAA Kinetik +...
zenodo.org
Updated Mar 28, 2023
Cite
Alexander Lang (2023). scRNAseq_Dataset Merge AMI d5 (CD45+Fibroblast) + AAA Kinetik + Cite-Seq_Dataset AG Gerdes [Dataset]. http://doi.org/10.5281/zenodo.7774809
Unique identifier
https://doi.org/10.5281/zenodo.7774809
Dataset updated
Mar 28, 2023
Dataset provided by
Zenodo: http://zenodo.org/
Authors
Alexander Lang
Description
Integration script:

library(Seurat)
library(tidyverse)
library(Matrix)

#cite <- readRDS("C:/Users/alex/sciebo/ALL_NGS/scRNAseq/scRNAseq/Merge AAA mit Cite AAA/Cite_seq_v0.41.rds")
#CD45 <- readRDS("C:/Users/alex/sciebo/ALL_NGS/scRNAseq/scRNAseq/Schrader/Fertige_Analysen/TS_d5_paper/CD45.rds")
AAA <- readRDS("C:/Users/alex/sciebo/AAA_Zhao_v4.rds")
cite <- readRDS("C:/Users/alex/sciebo/CITE_Seq_v0.5.rds")
all4 <- readRDS("C:/Users/alex/sciebo/ALL_NGS/scRNAseq/scRNAseq/Schrader/Fertige_Analysen/Schrader_All4_Rohanalyse/all4_220228.rds")

#fuse lists
#the list is named obj.list rather than c so that it does not mask base::c()
obj.list <- list(cite = cite, all4 = all4, AAA = AAA)

pancreas.list <- obj.list[c("cite", "all4", "AAA")]
for (i in seq_along(pancreas.list)) {
  pancreas.list[[i]] <- SCTransform(pancreas.list[[i]], verbose = FALSE)
}

pancreas.features <- SelectIntegrationFeatures(object.list = pancreas.list, nfeatures = 3000)
#options(future.globals.maxSize= 6091289600)
#pancreas.list <- PrepSCTIntegration(object.list = pancreas.list, anchor.features = pancreas.features,
#verbose = FALSE) #future.globals.maxsize was to low. changed it to options(future.globals.maxSize= 1091289600)
#identify anchors

#alternative from tutorial (https://satijalab.org/seurat/articles/integration_introduction.html)
#memory.limit(9999999999)
features <- SelectIntegrationFeatures(object.list = pancreas.list, nfeatures = 3000)
pancreas.list <- PrepSCTIntegration(object.list = pancreas.list, anchor.features = features)
pancreas.anchors <- FindIntegrationAnchors(object.list = pancreas.list, normalization.method = "SCT", anchor.features = pancreas.features, verbose = FALSE)
pancreas.integrated <- IntegrateData(anchorset = pancreas.anchors, normalization.method = "SCT",
verbose = FALSE)

setwd("C:/Users/alex/sciebo/ALL_NGS/scRNAseq/scRNAseq/Schrader/Fertige_Analysen/TS_d5_paper")

saveRDS(pancreas.integrated, file = "integrated_AAA_Cite_AMI.rds")

#cd45 is not defined anywhere in this script, so the call below would fail; left disabled
#saveRDS(cd45, file = "integrated_AAA_Cite_CD45.rds")

seurat <- pancreas.integrated

#seurat <- readRDS("C:/Users/alex/sciebo/ALL_NGS/scRNAseq/scRNAseq/Schrader/Fertige_Analysen/TS_d5_paper/integrated_d5_cite.rds")

DefaultAssay(object = seurat) <- "integrated"
seurat <- FindVariableFeatures(seurat, selection.method = "vst", nfeatures = 3000)
seurat <- ScaleData(seurat, verbose = FALSE)
seurat <- RunPCA(seurat, npcs = 30, verbose = FALSE)
seurat <- FindNeighbors(seurat, dims = 1:30)
seurat <- FindClusters(seurat, resolution = 0.5)
seurat <- RunUMAP(seurat, reduction = "pca", dims = 1:30)
DimPlot(seurat, reduction = "umap", split.by = "treatment") + NoLegend()

DimPlot(seurat, label = T, repel = T) + NoLegend()

DefaultAssay(object = seurat) <- "ADT"
adt_marker_integrated <- FindAllMarkers(seurat, logfc.threshold = 0.3)
write.csv(adt_marker_integrated, file = "adt_marker_all4_integrated.csv")

DefaultAssay(object = seurat) <- "RNA"
RNA_marker_integrated <- FindAllMarkers(seurat, logfc.threshold = 0.5)
write.csv(RNA_marker_integrated, file = "RNA_marker_all4_integrated.csv")

DimPlot(seurat, label = T, repel = T, split.by = "tissue") + NoLegend()

FeaturePlot(seurat, features = "Cd40", order = T, label = T)
FeaturePlot(seurat, features = "Ms.CD40", order = T, label = T)

#####
#cleanup:
seurat@meta.data[["sen_score1"]] <- NULL
seurat@meta.data[["sen_score2"]] <- NULL
seurat@meta.data[["sen_score3"]] <- NULL
seurat@meta.data[["sen_score4"]] <- NULL
seurat@meta.data[["sen_score5"]] <- NULL
seurat@meta.data[["sen_score6"]] <- NULL
seurat@meta.data[["sen_score7"]] <- NULL
seurat@meta.data[["pANN_0.25_0.1_1211"]] <- NULL
seurat@meta.data[["DF.classifications_0.25_0.1_1211"]] <- NULL
seurat@meta.data[["DF.classifications_0.25_0.1_466"]] <- NULL
seurat@assays[["prediction.score.celltype"]] <- NULL
seurat@meta.data[["predicted.celltype"]] <- NULL
seurat@meta.data[["DF.classifications_0.25_0.1_184"]] <- NULL
seurat@meta.data[["DF.classifications_0.25_0.1_953"]] <- NULL
seurat@meta.data[["integrated_snn_res.3"]] <- NULL
seurat@meta.data[["RNA_snn_res.3"]] <- NULL
seurat@meta.data[["SingleR"]] <- NULL
seurat@meta.data[["SingleR_fine"]] <- NULL
seurat@meta.data[["ImmGen"]] <- NULL
seurat@meta.data[["ImmGen_fine"]] <- NULL
seurat@meta.data[["percent.mt"]] <- NULL
seurat@meta.data[["nCount_integrated"]] <- NULL
seurat@meta.data[["nFeature_integrated"]] <- NULL
seurat@meta.data[["S.Score"]] <- NULL
seurat@meta.data[["G2M.Score"]] <- NULL
seurat@meta.data[["Phase"]] <- NULL
seurat@meta.data[["sen_score8"]] <- NULL
seurat@meta.data[["sen_score9"]] <- NULL
seurat@meta.data[["sen_score10"]] <- NULL
seurat@meta.data[["sen_score11"]] <- NULL
seurat@meta.data[["sen_score12"]] <- NULL
seurat@meta.data[["sen_score13"]] <- NULL
seurat@meta.data[["sen_score14"]] <- NULL
seurat@meta.data[["sen_score15"]] <- NULL
seurat@meta.data[["sen_score16"]] <- NULL
seurat@meta.data[["sen_score17"]] <- NULL
seurat@meta.data[["sen_score18"]] <- NULL
seurat@meta.data[["sen_score19"]] <- NULL
seurat@meta.data[["pANN_0.25_0.1_184"]] <- NULL
seurat@meta.data[["pANN_0.25_0.1_953"]] <- NULL
seurat@meta.data[["pANN_0.25_0.1_466"]] <- NULL
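
The column-by-column cleanup above could probably be written more compactly by matching column names against patterns. The sketch below is only an illustration in base R: `meta` is a small stand-in data frame for `seurat@meta.data` (an assumption, so the example runs without the original objects), its values are made up, and the patterns cover only three of the column families removed in the script.

```r
# Stand-in for seurat@meta.data (hypothetical values, for illustration only)
meta <- data.frame(
  nCount_RNA = c(1200, 3400, 560),
  sen_score1 = c(0.1, 0.2, 0.3),
  sen_score19 = c(0.4, 0.5, 0.6),
  pANN_0.25_0.1_184 = c(0.01, 0.02, 0.03),
  DF.classifications_0.25_0.1_184 = c("Singlet", "Doublet", "Singlet"),
  treatment = c("ctrl", "AMI", "AAA"),
  check.names = FALSE
)

# Patterns for column families that the script deletes one by one
drop_patterns <- c("^sen_score[0-9]+$", "^pANN_", "^DF\\.classifications_")
drop_cols <- grep(paste(drop_patterns, collapse = "|"), colnames(meta), value = TRUE)

# Assigning NULL to a set of columns removes them from the data frame
meta[drop_cols] <- NULL
print(colnames(meta))
```

With the real object, the same assignment would presumably be applied to `seurat@meta.data`; the assay removed in the script (`prediction.score.celltype`) lives in `seurat@assays` and would still need its own line.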
utility: Collection of Tumor-Infiltrating Lymphocyte Single-Cell Experiments...
zenodo.org
zip
Updated Apr 6, 2022
Cite
Nicholas Borcherding (2022). utility: Collection of Tumor-Infiltrating Lymphocyte Single-Cell Experiments with TCR [Dataset]. http://doi.org/10.5281/zenodo.4995299
Available download formats: zip
Unique identifier
https://doi.org/10.5281/zenodo.4995299
Dataset updated
Apr 6, 2022
Dataset provided by
Zenodo: http://zenodo.org/
Authors
Nicholas Borcherding
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Introduction

The original intent of assembling a data set of publicly available tumor-infiltrating T cells (TILs) with paired TCR sequencing was to expand and improve the scRepertoire R package. However, after some discussion, we decided to release the data set for everyone. A complete summary of the sequencing runs and the sample information can be found in the meta data of the Seurat object. This repository contains the code for the initial processing and annotating of the data set (we are calling this version 0.0.1). This involves several steps: 1) loading the respective GE data, 2) harmonizing the data by sample and cohort information, 3) iterating through automatic annotation, 4) unifying annotation via manual inspection and enrichment analysis, and 5) adding the TCR information.

Methods

Single-Cell Data Processing

The filtered gene matrices output by the Cell Ranger align function from individual sequencing runs (10x Genomics, Pleasanton, CA) were loaded into the R global environment. For each sequencing run, cell barcodes were modified to contain a unique prefix to prevent issues with duplicate barcodes. The results were then ported into individual Seurat objects (citation), where the cells with > 10% mitochondrial genes and/or 2.5x natural log distribution of counts were excluded for quality control purposes. At the individual sequencing run level, doublets were estimated using the scDblFinder (v1.4.0) R package. All the sequencing runs across experiments were merged into a single Seurat object using the merge() function. All the data were then normalized using the default settings, and 2,000 variable genes were identified using the "vst" method. Next, the data were scaled with the default settings and principal components were calculated for 40 components. Data were integrated using the harmony (v1.0.0) R package (citation), using both cohort and sample information to correct for batch effect with up to 20 iterations. The UMAP was created using the RunUMAP() function in Seurat, using 20 dimensions of the harmony calculations.
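
The barcode-prefixing step described above can be illustrated with a short base R sketch. The run names and barcodes are hypothetical, and the underscore separator is an assumption; the repository's actual prefix format is not stated here.

```r
# Hypothetical barcodes from two sequencing runs; the same barcode occurs in both
runs <- list(
  run1 = c("AAACCTGAGAAACCAT-1", "AAACCTGAGAAACCGC-1"),
  run2 = c("AAACCTGAGAAACCAT-1", "AAACGGGTCTGATACG-1")
)

# Prefix each barcode with its run name so barcodes stay unique after merging
prefixed <- unlist(
  lapply(names(runs), function(run) paste(run, runs[[run]], sep = "_")),
  use.names = FALSE
)

print(prefixed)
print(anyDuplicated(prefixed) == 0)
```

Without the prefix, the first barcode of each run would collide once the runs are merged into one object.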

Annotation of Cells

Automatic annotation was performed using the SingleR (v1.4.1) R package (citation) with the HPCA (citation) and DICE (citation) data sets as references and the fine label discriminators. Individual sequencing runs were subsetted to run through the SingleR algorithm in order to reduce memory demands. The outputs of all the SingleR analyses were collated and appended to the meta data of the Seurat object. Likewise, the ProjecTILs (v0.4.1) R package (citation) was used for automatic annotation as a partially orthogonal approach. Consensus annotation was derived from all 3 databases (HPCA, DICE, ProjecTILs) using a majority approach. No annotation designation was assigned to cells that returned NA for both SingleR and ProjecTILs. Cells were designated as mixed annotations when SingleR identified them as non-T cells and ProjecTILs identified them as T cells. Cell type designations with fewer than 100 cells in the entire cohort were reduced to "other". Automated annotations were checked manually using canonical marker genes and gene enrichment analysis performed using the UCell (v1.0.0) R package (citation).
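
A majority-based consensus and the collapsing of rare labels to "other" might look roughly like the base R sketch below. This is not the repository's code: the function names, the example labels, and the handling of ties (returning NA) are all assumptions made for illustration.

```r
# Majority vote across annotation sources; NA votes are ignored.
# Returns NA when no source gives a label or when the top labels tie (assumed behaviour).
consensus_label <- function(labels) {
  labels <- labels[!is.na(labels)]
  if (length(labels) == 0) return(NA_character_)
  counts <- sort(table(labels), decreasing = TRUE)
  if (length(counts) > 1 && counts[[1]] == counts[[2]]) return(NA_character_)
  names(counts)[1]
}

# Replace labels seen in fewer than min_cells cells with "other"; NA stays NA
collapse_rare <- function(labels, min_cells = 100) {
  counts <- table(labels)
  rare <- names(counts)[counts < min_cells]
  labels[labels %in% rare] <- "other"
  labels
}

# Hypothetical per-cell labels from the three sources
ann <- data.frame(
  HPCA       = c("CD8 T", "CD4 T", "B cell", NA),
  DICE       = c("CD8 T", "CD4 T", "NK", NA),
  ProjecTILs = c("CD8 T", "Treg", "CD8 T", NA),
  stringsAsFactors = FALSE
)
ann$consensus <- unname(apply(ann[, c("HPCA", "DICE", "ProjecTILs")], 1, consensus_label))
print(ann)

# Tiny threshold so the effect is visible on toy data
print(collapse_rare(c("CD8 T", "CD8 T", "Treg"), min_cells = 2))
```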

Addition of TCR data

The filtered contig annotation T cell receptor (TCR) data for available sequencing runs were loaded into the R global environment. Individual contigs were combined using the combineTCR() function of the scRepertoire (v1.3.2) R package (citation). Clonotypes were assigned to barcodes and, where multiple duplicate chains were present for an individual cell, filtered to select the top-expressing contig by read count. The clonotype data were then added to the Seurat object, with proportion across individual patients being used to calculate frequency.
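
The idea of keeping only the top contig by read count for each cell and chain can be sketched in base R as follows. This illustrates the logic only and is not how scRepertoire implements it; the column names and values are hypothetical.

```r
# Hypothetical contig table: cell1 has two TRA contigs
contigs <- data.frame(
  barcode = c("cell1", "cell1", "cell1", "cell2"),
  chain   = c("TRA", "TRA", "TRB", "TRB"),
  reads   = c(120, 45, 300, 80),
  stringsAsFactors = FALSE
)

# Order by descending read count, then keep the first row per barcode/chain pair
ordered <- contigs[order(-contigs$reads), ]
top <- ordered[!duplicated(ordered[, c("barcode", "chain")]), ]

# Restore a stable display order
top <- top[order(top$barcode, top$chain), ]
rownames(top) <- NULL
print(top)
```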

Citations

As of right now, there is no citation associated with the assembled data set. However, if using the data, please find the corresponding manuscript for each data set in the meta.data of the single-cell object. In addition, if using the processed data, feel free to modify the language in the methods section (above) and please cite the appropriate manuscripts of the software or references that were used.

Itemized List of the Software Used

Seurat v4.0.3 - citation

harmony v1.0 - citation

SingleR v1.4.1 - citation

ProjecTILs v0.4.1 - citation

UCell v1.0.0 - citation

scRepertoire v1.3.2 - citation

Itemized List of Reference Data Used

Human Primary Cell Atlas (HPCA) - citation

Database Immune Cell Expression (DICE) - citation

Immune-related Gene Sets - citation

Future Directions

Data Hosting for Interactive Analysis

Easy Submission Portal for Researchers to Add Data

Using the Data to Build a Reference Atlas

These are areas that we are actively hoping to develop to further improve the usefulness of the data set. If you have other suggestions, please reach out using the contact information below.

Contact

Questions, comments, suggestions, please feel free to contact Nick Borcherding via this repository, email, or using twitter.
scRNAseq Seurat and SCE objects
figshare.com
application/gzip
Updated Jul 7, 2025
Cite
Guillermo Turiel (2025). scRNAseq Seurat and SCE objects [Dataset]. http://doi.org/10.6084/m9.figshare.29493215.v1
Available download formats: application/gzip
Unique identifier
https://doi.org/10.6084/m9.figshare.29493215.v1
Dataset updated
Jul 7, 2025
Dataset provided by
figshare
Authors
Guillermo Turiel
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
SingleCellExperiment (SCE) and Seurat objects in RDS format for direct use in R. The ECs+MoMac files contain the merged dataset of ECs and MoMac cells used in the cell communication analysis.
Data from: Continuous expression of TOX safeguards exhausted CD8 T cell...
search.dataone.org
datadryad.org
Updated Mar 15, 2025
Cite
Yinghui Jane Huang; John Wherry; Sasikanth Manne (2025). Continuous expression of TOX safeguards exhausted CD8 T cell epigenetic fate [Dataset]. http://doi.org/10.5061/dryad.8kprr4xx9
Unique identifier
https://doi.org/10.5061/dryad.8kprr4xx9
Dataset updated
Mar 15, 2025
Dataset provided by
Dryad Digital Repository
Authors
Yinghui Jane Huang; John Wherry; Sasikanth Manne
Description
CD8 T cell exhaustion is a major barrier limiting anti-tumor therapy. Though checkpoint blockade temporarily improves exhausted CD8 T cell (Tex) function, the underlying epigenetic landscape of Tex remains largely unchanged, preventing their durable “reinvigoration.” Whereas the transcription factor (TF) TOX has been identified as a critical initiator of Tex epigenetic programming, it remains unclear whether TOX plays an ongoing role in preserving Tex biology after cells commit to exhaustion. Here, we decoupled the role of TOX in the initiation versus maintenance of CD8 T cell exhaustion by temporally deleting TOX in established Tex. Induced TOX ablation in committed Tex resulted in apoptotic-driven loss of Tex, reduced expression of inhibitory receptors including PD-1, and a pronounced decrease in terminally differentiated subsets of Tex cells. Simultaneous gene expression and epigenetic profiling revealed a critical role for TOX in ensuring ongoing chromatin accessibility and transcri...

Cells from inducible-Cre (Rosa26CreERT2/+Toxfl/fl P14) mice where TOX was temporally deleted from mature populations of LCMV-specific T exhausted cells after establishment of chronic LCMV infection 5 days post infection were subjected to scRNA and scATACseq coassay; naive cells and WT cells were used as controls. Analysis pipeline developed by Josephine Giles and vignettes published by Satija and Stuart labs. Transcript count and peak accessibility matrices deposited in GSE255042, GSE255043. Seurat/Signac was used to process the scRNA and scATACseq coassay data. The processed Seurat/Signac object above was subsequently used for downstream RNA and ATAC analyses as described below: DEGs between TOX WT and iKO cells within each subset were identified using FindMarkers (Seurat, Signac), with a log2-fold-change threshold of 0, using the SCT assay. DACRs were identified using FindMarkers using the "LR" test, with a log2-fold-change threshold of 0.1, a min.pct of 0.05, and included the number of c...

# Continuous expression of TOX safeguards exhausted CD8 T cell epigenetic fate

https://doi.org/10.5061/dryad.8kprr4xx9

Seurat/Signac pipeline for multiomic scRNA-seq and scATAC-seq dataset, generated following inducible TOX deletion in LCMV-Cl13

Author

Yinghui Jane Huang

Script information

Purpose: Generate and process Seurat/Signac object for downstream analyses
Written: Nov 2021 through Oct 2022
Adapted from: Analysis pipeline developed by Josephine Giles and vignettes published by Satija and Stuart labs
Input dataset: Transcript count and peak accessibility matrices deposited in GSE255042, GSE255043

Signac Object Generation

1) Create individual signac objects for each sample from the raw 10x cellranger output.

2) Merge individual objects to create one seurat object.

3) Add metadata to merged seurat object.

Following are the steps in the attached html file for analysis of the paired data (ATAC+RNA)

Load fr...,
Islet DEGAS v1
data.mendeley.com
Updated Jan 22, 2024
Cite
Michael Kalwat (2024). Islet DEGAS v1 [Dataset]. http://doi.org/10.17632/3sdxv5tzbd.1
Unique identifier
https://doi.org/10.17632/3sdxv5tzbd.1
Dataset updated
Jan 22, 2024
Authors
Michael Kalwat
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Merged Seurat Object containing read count matrices and metadata for GSE84133, GSE85241, GSE86469, E-MTAB-5061, GSE81608.
Data for Altered Glia-Neuron Communication in Alzheimer's Disease Affects...
data.niaid.nih.gov
zenodo.org
Updated Nov 28, 2023
Cite
Soelter, Tabea (2023). Data for Altered Glia-Neuron Communication in Alzheimer's Disease Affects WNT, p53, and NFkB Signaling Determined by snRNA-seq [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_10214496
Dataset updated
Nov 28, 2023
Dataset provided by
Clark, Amanda D.
Oza, Vishal H.
Soelter, Tabea
Howton, Timothy C.
Lasseigne, Brittany
License
MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
Description
data.tar.gz contains all files from the data directory associated with the 230313_TS_CCCinHumanAD GitHub project and includes the following:

CellRangerCounts/
  GSE157827/
    post_soupX/ : contains 21 directories for 21 samples, which each contain 3 files obtained from ambient RNA removal with soupX. Below is a representative example, but this repo contains 1 directory per sample:
      SAMN16100290_S01_AD/
        barcodes.tsv
        genes.tsv
        matrix.mtx
    pre_soupX/ : contains 21 directories for 21 samples, which each contain 2 files obtained from Cell Ranger after aligning fastq files to the reference genome. Below is a representative example, but this repo contains 1 directory per sample:
      SAMN16100290_S01_AD/
        filtered_feature_bc_matrix.h5
        Raw_feature_bc_matrix.h5
  GSE174367/ : contains 19 directories for 19 samples, which contain 3 files each from Cell Ranger alignment of fastq files to the reference genome. Below is a representative example, but this repo contains 1 directory per sample:
    SAMN19128610_S1_CTRL/
      barcodes.tsv
      genes.tsv
      Matrix.mtx

ccc/
  nichenet_grn/
    gr_network_human_21122021.rds : accessed in October 2023, gene regulation network – gene regulatory information from MultiNicheNet
    ligand_tf_matrix_nsga2r_final.rds : accessed in October 2023, ligand tf matrix for signaling path determination from MultiNicheNet
    signaling_network_human_21122021.rds : accessed in October 2023, signaling network – protein-protein interaction information from MultiNicheNet
    weighted_networks_nsga2r_final.rds : accessed in October 2023, networks weighted by literature evidence from MultiNicheNet
  nichenet_prior/
    ligand_target_matrix.rds : accessed in April 2023, ligand to target matrix from NicheNet
    lr_network.rds : accessed in April 2023, ligand-receptor matrix from NicheNet
  nichenet_v2_prior/
    ligand_target_matrix_nsga2r_final.rds : accessed in June 2023, ligand to target matrix from MultiNicheNet used to predict target genes.
    lr_network_human_21122021.rds : accessed in June 2023, ligand-receptor matrix from MultiNicheNet used to predict ligand-receptor pairs.
  geo_multinichenet_output.rds : MultiNicheNet output for Morabito et al., 2021 data
  geo_signaling_igraph_objects.rds : list of igraph objects for 17 overlapping LRTs and their signaling mediators in the Morabito et al., 2021 dataset.
  gse_multinichenet_output.rds : MultiNicheNet output for Lau et al., 2020 data
  gse_signaling_igraph_objects.rds : list of igraph objects for 17 overlapping LRTs and their signaling mediators in the Lau et al., 2020 dataset

seurat_preprocessing/
  geo_filtered_seurat.rds : merged and filtered seurat object of Morabito et al., 2021 data
  geo_integrated_seurat.rds : seurat object integrated using harmony of Morabito et al., 2021 data
  geo_clustered_seurat.rds : clustered seurat object of Morabito et al., 2021 data
  geo_processed_seurat.rds : processed seurat object with final cell type assignments at specified resolution of Morabito et al., 2021 data
  gse_filtered_seurat.rds : merged and filtered seurat object of Lau et al., 2020 data
  gse_integrated_seurat.rds : seurat object integrated using harmony of Lau et al., 2020 data
  gse_clustered_seurat.rds : clustered seurat object of Lau et al., 2020 data
  gse_processed_seurat.rds : processed seurat object with final cell type assignments at specified resolution of Lau et al., 2020 data
V1 Datasets Seurat Objects
figshare.com
application/gzip
Updated Aug 14, 2018
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Sneddon Lab UCSF (2018). V1 Datasets Seurat Objects [Dataset]. http://doi.org/10.6084/m9.figshare.6783506.v1
Available download formats: application/gzip
Unique identifier
https://doi.org/10.6084/m9.figshare.6783506.v1
Dataset updated
Aug 14, 2018
Dataset provided by
figshare
Authors
Sneddon Lab UCSF
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
R objects for the V1 Datasets. Created with R package Seurat.
E14_allcells_seur_ob.Rdata - E14.5 V1 Dataset. Includes all cells. Grouped by "ordered_manuscript" in @data.info slot. Corresponds to Fig. 1c.
E14_mesenchyme_seur_ob.Rdata - E14.5 V1 Dataset. Only mesenchymal cells. Grouped by "ordered_manuscript" in @data.info slot. Corresponds to Fig. 2a.
merged_mesenchyme_seur_ob.Rdata - E12.5, E14.5, E17.5 merged V1 Dataset. Only mesenchymal cells. Grouped by "ordered_manuscript" in @data.info slot. Corresponds to Fig. 3a.
E14_epithelial_seur_ob.Rdata - E14.5 V1 Dataset. Only epithelial cells. Grouped by "ordered" in @data.info slot. Corresponds to Fig. 4a.
E14_endocrine_seur_ob.Rdata - E14.5 V1 Dataset. Only endocrine cells. Grouped by "ordered_res1_5" in @data.info slot. Corresponds to Fig. 4f.
merged_epithelial_seur_ob.Rdata - E12.5, E14.5, E17.5 merged V1 Dataset. Only epithelial cells. Grouped by "ordered" in @data.info slot. Corresponds to Supplementary Fig. 5a.
Data from: Spatial Transcriptomics in Breast Cancer Reveals Tumour...
data.niaid.nih.gov
zenodo.org
Updated Nov 29, 2024
Cite
Jiménez-Santos, María José (2024). Spatial Transcriptomics in Breast Cancer Reveals Tumour Microenvironment-Driven Drug Responses and Clonal Therapeutic Heterogeneity [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_10638905
Dataset updated
Nov 29, 2024
Dataset provided by
Gómez-López, Gonzalo
Jiménez-Santos, María José
Rubio-Fernández, Marcos
García-Martín, Santiago
Al-Shahrour, Fátima
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
We acquired 10x Visium spatial transcriptomics (ST) data from 9 patients with invasive adenocarcinomas [1–5] to explore the role of the tumour microenvironment (TME) on intratumor heterogeneity (ITH) and drug response in breast cancer. By leveraging a new version of Beyondcell [6], a tool for identifying tumour cell subpopulations with distinct drug response patterns, we predicted sensitivity to over 1,200 drugs while accounting for the spatial context and interaction between the tumour and TME compartments. Moreover, we also used Beyondcell to compute spot-wise functional enrichment scores and identify niche-specific biological functions.

Here, you can find:

In signatures folder:

SSc breast: Collection of gene signatures used to predict sensitivity to > 1,200 drugs derived from breast cancer cell lines.

Functional signatures: Collection of gene signatures used to compute enrichment in different biological pathways.

In visium folder:

Visium objects: Processed ST Seurat objects with deconvoluted spots, SCTransform-normalised counts, and clonal composition predicted with SCEVAN [7]. These objects, together with the signatures, were used to compute the Beyondcell objects.

In single-cell folder:

Single-cell objects: Raw and filtered merged single-cell RNA-seq (scRNA-seq) Seurat objects with unnormalised counts used as a reference for spot deconvolution.

In beyondcell folder:

Beyondcell sensitivity objects with prediction scores for all drug response signatures in SSc breast.

Beyondcell functional objects with enrichment scores for all functional signatures.
Data from: Human tau mutations in cerebral organoids induce a progressive...
data.niaid.nih.gov
zenodo.org
zip
Updated Jan 30, 2023
Cite
Stella M.K. Glasauer; Susan K. Goderie; Jennifer N. Rauch; Elmer Guzman; Morgane Audouard; Taylor Bertucci; Shona Joy; Emma Rommelfanger; Gabriel Luna; Erica Keane-Rivera; Steven Lotz; Susan Borden; Aaron M. Armando; Oswald Quehenberger; Sally Temple; Kenneth S. Kosik (2023). Human tau mutations in cerebral organoids induce a progressive dyshomeostasis of cholesterol [Dataset]. http://doi.org/10.25349/D95898
Available download formats: zip
Unique identifier
https://doi.org/10.25349/D95898
Dataset updated
Jan 30, 2023
Dataset provided by
University of California, San Diego
University of California, Santa Barbara
Neural Stem Cell Institute
Authors
Stella M.K. Glasauer; Susan K. Goderie; Jennifer N. Rauch; Elmer Guzman; Morgane Audouard; Taylor Bertucci; Shona Joy; Emma Rommelfanger; Gabriel Luna; Erica Keane-Rivera; Steven Lotz; Susan Borden; Aaron M. Armando; Oswald Quehenberger; Sally Temple; Kenneth S. Kosik
License
https://spdx.org/licenses/CC0-1.0.html
Description
Single cell RNA sequencing (drop-seq) data of forebrain organoids carrying pathogenic MAPT R406W and V337M mutations. Organoids were generated from 5 heterozygous donor lines (two R406W lines and three V337M lines) and their respective CRISPR-corrected isogenic controls. Organoids were also generated from one homozygous R406W donor line. Single-cell sequencing was performed at 1, 2, 3, 4, 6 and 8 months of organoid maturation.
Methods
Single-cell transcriptomes were obtained using drop-seq (Macosko et al., 2015, https://doi.org/10.1016/j.cell.2015.05.002). Counts matrices were generated using the Drop-seq tools package (Macosko et al. 2015), with full details available online (https://github.com/broadinstitute/Drop-seq/files/2425535/Drop-seqAlignmentCookbookv1.2Jan2016.pdf). Briefly, raw reads were converted to BAM files, cell barcodes and UMIs were extracted, and low-quality reads were removed. Adapter sequences and polyA tails were trimmed, and reads were converted to Fastq for STAR alignment (STAR version 2.6). Mapping to the human genome (hg19 build) was performed with default settings. Reads mapped to exons were kept and tagged with gene names, bead synthesis errors were corrected, and a digital gene expression matrix was extracted from the aligned library. We extracted data from twice as many cell barcodes as the number of cells targeted (NUM_CORE_BARCODES = 2x # targeted cells).
Downstream analysis was performed using Seurat 3.0 in R version 3.6.3. An individual Seurat object was generated for each sample, and each was filtered and clustered individually. Cells with < 300 genes detected were filtered out, as were cells with > 10% mitochondrial gene content. Counts data were log-normalized using the default NormalizeData function and the default scale of 1e4. Then, the top 2000 variable genes were identified using the Seurat FindVariableFeatures function (selection.method = "vst", nfeatures = 2000), followed by scaling and centering using the default ScaleData function.
Principal Components Analysis was carried out on the scaled expression values of the 2000 top variable genes, and the cells were clustered using the first 50 principal components (PCs) as input in the FindNeighbors function, and a resolution of 0.4 in the FindClusters function. Non-linear dimensionality reduction was performed by running UMAP on the first 50 PCs.
Following clustering and dimensionality reduction, putative cell doublets were identified using DoubletFinder (McGinnis et al. 2019; https://doi.org/10.1016/j.cels.2019.03.003), assuming a doublet formation rate of 5%. For each sample, the optimal pK value was identified based on the results of the paramSweep_v3, summarizeSweep and find.pK functions of the DoubletFinder package. Instead of using the default paramSweep_v3 function, we extended the upper range of computed pK values to 1.2. We visually verified that cells identified as doublets had high nFeatures (number of genes expressed) by plotting the pANN metric against nFeatures. For samples not showing this correlation, we adjusted the pK value to the next highest peak in the pK/BCmetric plot. Finally, the individual Seurat objects were merged.
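
The filtering and normalization thresholds quoted above can be illustrated with a small plain-Python sketch. This is not the authors' code (the analysis was run with Seurat in R); the function names and the per-cell dictionary layout are hypothetical, and the normalization formula log1p(count / total × scale) is assumed to match Seurat's documented LogNormalize method.

```python
import math

def passes_qc(counts, mito_genes, min_genes=300, max_mito_frac=0.10):
    """Return True if one cell passes the thresholds quoted in the description.

    counts: dict mapping gene name -> raw UMI count for a single cell.
    mito_genes: set of mitochondrial gene names (assumed to be supplied by the caller).
    """
    total = sum(counts.values())
    if total == 0:
        return False
    n_genes = sum(1 for c in counts.values() if c > 0)
    mito = sum(c for g, c in counts.items() if g in mito_genes)
    # Drop cells with < 300 detected genes or > 10% mitochondrial counts.
    return n_genes >= min_genes and mito / total <= max_mito_frac

def log_normalize(counts, scale_factor=1e4):
    """Assumed LogNormalize: log1p of counts scaled to scale_factor per cell.

    Expects a cell with a non-zero total count.
    """
    total = sum(counts.values())
    return {g: math.log1p(c / total * scale_factor) for g, c in counts.items()}
```

In practice the mitochondrial gene set would be derived from the annotation (for human data, commonly genes prefixed with "MT-"), which is also an assumption here.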
Data for Evaluation of altered cell-cell communication between glia and...
zenodo.org
application/gzip
Updated Apr 23, 2024
Cite
Tabea M. Soelter; Timothy C. Howton; Elizabeth J. Wilk; Jordan H. Whitlock; Amanda D. Clark; Allison Birnbaum; Dalton C. Patterson; Constanza J. Cortes; Brittany N. Lasseigne (2024). Data for Evaluation of altered cell-cell communication between glia and neurons in the hippocampus of 3xTg-AD mice at two time points [Dataset]. http://doi.org/10.5281/zenodo.11043321
Explore at:
Available download formats: application/gzip
Unique identifier
https://doi.org/10.5281/zenodo.11043321
Dataset updated
Apr 23, 2024
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Tabea M. Soelter; Timothy C. Howton; Elizabeth J. Wilk; Jordan H. Whitlock; Amanda D. Clark; Allison Birnbaum; Dalton C. Patterson; Constanza J. Cortes; Brittany N. Lasseigne
License
MIT License (https://opensource.org/licenses/MIT)
License information was derived automatically
Description
processed_data.tar.gz contains all files from the data directory associated with the 230418_TS_AgingCCC GitHub project and includes the following:

CellRangerCounts/
  post_soupX/ : contains 12 directories, one per sample, each containing 3 files obtained from ambient RNA removal with SoupX. A representative example:
    S01_6m_AD/
      barcodes.tsv
      genes.tsv
      matrix.mtx
  pre_soupX/ : contains 12 directories, one per sample, each containing 2 files obtained from Cell Ranger after aligning fastq files to the reference genome. A representative example:
    S01_6m_AD/outs/
      filtered_feature_bc_matrix.h5
      raw_feature_bc_matrix.h5
PANDA_inputs/
  PANDA_exp_files_array.txt: text file with file paths to expression inputs for PANDA gene regulatory networks.
  mm10_TFmotifs.txt: mouse TF motif input for PANDA gene regulatory networks. Previously published in Whitlock et al. 2023.
  mm10_ppi.txt: mouse protein-protein interaction information from StringDB, used as input for PANDA gene regulatory networks. Previously published in Whitlock et al. 2023.
ccc/
  nichenet_v2_prior/
    gr_network_mouse_21122021.rds : accessed in December 2023, gene regulation network (gene regulatory information) from MultiNicheNet
    ligand_target_matrix_nsga2r_final_mouse.rds : accessed in December 2023, ligand target matrix for mouse from MultiNicheNet
    ligand_tf_matrix_nsga2r_final_mouse.rds : accessed in December 2023, mouse ligand tf matrix for signaling path determination from MultiNicheNet
    lr_network_mouse_21122021.rds : accessed in December 2023, ligand-receptor matrix from MultiNicheNet
    signaling_network_mouse_21122021.rds : accessed in December 2023, signaling network (protein-protein interaction information) from MultiNicheNet for mouse
    weighted_networks_nsga2r_final_mouse.rds : accessed in October 2023, networks weighted by literature evidence from MultiNicheNet for mouse
  multinichenet_output.rds : MultiNicheNet output for 3xTg-AD snRNA-seq data
  12m_signaling_igraph_objects.rds : list of igraph objects for 93 LRTs and their signaling mediators at 12 months
  6m_signaling_igraph_objects.rds : list of igraph objects for 2 LRTs and their signaling mediators at 6 months
elisa/ : CSV files of measured OD values for every ELISA.
  240319_ELISA_Ab40.csv: OD measurements for Ab40
  240319_ELISA_Ab42.csv: OD measurements for Ab42
  240319_ELISA_total_tau.csv: OD measurements for Total Tau
panda/ : PANDA gene regulatory networks for each time point and condition in excitatory and inhibitory neurons. Used for differential gene targeting.
  excitatory_neurons_AD12.Rdata
  excitatory_neurons_AD6.Rdata
  excitatory_neurons_WT12.Rdata
  excitatory_neurons_WT6.Rdata
  inhibitory_neurons_AD12.Rdata
  inhibitory_neurons_AD6.Rdata
  inhibitory_neurons_WT12.Rdata
  inhibitory_neurons_WT6.Rdata
pseudobulk/ : pseudobulk matrices for every cell type, used for downstream analyses. Each matrix includes time point and condition metadata.
  all_counts_ls.rds: list of all the pseudobulk matrices below
  astrocytes.rds: pseudobulk matrix for astrocytes
  endothelial_cells.rds: pseudobulk matrix for endothelial cells
  ependymal_cells.rds: pseudobulk matrix for ependymal cells
  excitatory_neurons.rds: pseudobulk matrix for excitatory neurons
  fibroblasts.rds: pseudobulk matrix for fibroblasts
  inhibitory_neurons.rds: pseudobulk matrix for inhibitory neurons
  meningeal_cells.rds: pseudobulk matrix for meningeal cells
  microglia.rds: pseudobulk matrix for microglia
  oligodendrocytes.rds: pseudobulk matrix for oligodendrocytes
  opcs.rds: pseudobulk matrix for oligodendrocyte progenitor cells
  percicytes.rds: pseudobulk matrix for pericytes
  rgcs.rds: pseudobulk matrix for retinal ganglion cells
pseudobulk_split/ : pseudobulk count matrices split by time point and condition. Used as input to PANDA for gene regulatory network construction.
  excitatory_neurons_AD12.Rdata
  excitatory_neurons_AD6.Rdata
  excitatory_neurons_WT12.Rdata
  excitatory_neurons_WT6.Rdata
  inhibitory_neurons_AD12.Rdata
  inhibitory_neurons_AD6.Rdata
  inhibitory_neurons_WT12.Rdata
  inhibitory_neurons_WT6.Rdata
seurat_preprocessing/
  filtered_seurat.rds : merged and filtered Seurat object
  integrated_seurat.rds : Seurat object integrated using harmony
  clustered_seurat.rds : clustered Seurat object
  processed_seurat.rds : processed Seurat object with final cell type assignments at specified resolution

Raw data publicly available on GEO under series accession: GSE261596
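
For readers unfamiliar with the pseudobulk matrices listed above, the idea can be sketched in a few lines of Python. The record does not state how the matrices were computed; summing raw counts over all cells of one cell type within one sample is a common convention and is assumed here. The function name and input layout are hypothetical.

```python
from collections import defaultdict

def pseudobulk(cells):
    """Sum raw counts over cells sharing the same (cell_type, sample).

    cells: iterable of (cell_type, sample, counts) tuples, where counts maps
    gene name -> raw count for one cell.
    Returns {cell_type: {sample: {gene: summed_count}}}.
    """
    sums = defaultdict(lambda: defaultdict(lambda: defaultdict(int)))
    for cell_type, sample, counts in cells:
        for gene, n in counts.items():
            sums[cell_type][sample][gene] += n
    # Convert nested defaultdicts to plain dicts for a predictable return type.
    return {
        ct: {s: dict(genes) for s, genes in by_sample.items()}
        for ct, by_sample in sums.items()
    }
```

Each cell type then yields one gene-by-sample matrix, which matches the one-file-per-cell-type layout of the pseudobulk/ directory.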
Processed data of single cell RNA-sequencing of 16 NPM1-mutated Acute...
figshare.com
bin
Updated Jun 16, 2025
Cite
Emin Onur Karakaslar (2025). Processed data of single cell RNA-sequencing of 16 NPM1-mutated Acute Myeloid Leukemia samples [Dataset]. http://doi.org/10.6084/m9.figshare.26189771.v1
Explore at:
Available download formats: bin
Unique identifier
https://doi.org/10.6084/m9.figshare.26189771.v1
Dataset updated
Jun 16, 2025
Dataset provided by
figshare
Authors
Emin Onur Karakaslar
License
Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
License information was derived automatically
Description
TLDR
Seurat object of the 16 NPM1-mutated AML samples (n = 83,162 cells).
AML samples
All sixteen peripheral blood and bone marrow samples were obtained from patients with AML at diagnosis (n=15) or relapse after chemotherapy (n=1) with written informed consent according to the Declaration of Helsinki. Mononuclear cells were isolated by Ficoll-Isopaque density gradient centrifugation and cryopreserved in the Leiden University Medical Center (LUMC) Biobank for Hematological Diseases after approval by the LUMC Institutional Review Board (protocol no. B18.047).
Upstream processing pipeline
CellRanger v7.0.0 was run on all samples with the human reference genome hg38. Seurat v4 was used for all QC (ref. 15). Our QC pipeline had three steps per sample: 1) soft filtering, 2) low quality cluster removal, and 3) doublet detection. In soft filtering, Seurat objects were created with cells expressing at least 200 genes and genes expressed in at least 3 cells. Then, the standard Seurat command list with default parameters was run to detect low quality clusters. Clusters with >15% mitochondrial and 15% mitochondrial mRNA. We used standard Seurat commands to scale and normalize the data on integrated features. The first 30 principal components were used to create UMAP plots. We used clustree to determine the optimal cluster number, based on FindClusters with resolutions sweeping from 0 to 1.2. We chose res=0.5, as clusters became stable. Next, we merged two clusters (CC5 and CC12) into one GMP-like cluster, as one of these clusters (CC12) had high expression of HSP genes yet still retained its cell-type specific properties.
Note: The file was processed with Seurat v4 but the object is updated for v5. Uploaded in .qs file format for faster reading. To read the file: qs::qread("path/to/data.qs")
This data is available for research use only and cannot be used for commercial purposes.
For further queries please refer to our paper:
Single cell T cell atlas
zenodo.org
bin, csv
Updated Jul 27, 2024
Cite
Kerry A Mullan (2024). Single cell T cell atlas [Dataset]. http://doi.org/10.5281/zenodo.12569981
Explore at:
Available download formats: bin, csv
Unique identifier
https://doi.org/10.5281/zenodo.12569981
Dataset updated
Jul 27, 2024
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Kerry A Mullan
License
Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
License information was derived automatically
Description
The attached dataset merges 12 high-quality single-cell T cell datasets that each contain both TCR-seq and gene expression (GEx) data. The Seurat object (supercluster_added_ID-240531.rds) contains ~500K paired TCR-seq and GEx profiles. The original identifiers are included in Sup_Update_labels.csv. See our documentation (https://stegor.readthedocs.io/en/latest/) for how we processed the 12 datasets and decided on the current 47 T cell annotation models using scGate.

This is the accompanying data set for the paper entitled ‘T cell receptor-centric approach to streamline multimodal single-cell data analysis.’, which is currently available as a preprint (https://www.biorxiv.org/content/10.1101/2023.09.27.559702v2). Details on the origin of the datasets, and processing steps can be found there.

The purpose of this atlas, in both its full and down-sampled versions, is to improve the interpretability of other T cell datasets. This can be done by adding either the down-sampled object (up to 500 cells per annotation model) or all 12 datasets to your new sample. The atlas aims to improve the capacity to identify TCR-specific signatures by providing a well-covered background, which improves the robustness of the FindMarkers function in the Seurat package.
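
The down-sampling described above (up to 500 cells per annotation) can be sketched as follows. This is an illustrative sketch, not the procedure used to build the released object; the function name, the input mapping, and the fixed random seed are assumptions.

```python
import random
from collections import defaultdict

def downsample_per_label(cell_labels, max_cells=500, seed=0):
    """Keep at most max_cells randomly chosen cells per annotation label.

    cell_labels: dict mapping cell identifier -> annotation label.
    Returns a sorted list of the retained cell identifiers.
    """
    rng = random.Random(seed)  # fixed seed so the subset is reproducible
    groups = defaultdict(list)
    for cell_id, label in cell_labels.items():
        groups[label].append(cell_id)
    kept = []
    for label in sorted(groups):
        ids = sorted(groups[label])
        if len(ids) > max_cells:
            ids = rng.sample(ids, max_cells)
        kept.extend(ids)
    return sorted(kept)
```

Capping per label rather than sampling globally keeps rare annotations fully represented in the background set.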
Data from: Pre-ciliated tubal epithelial cells are prone to initiation of...
data.niaid.nih.gov
datadryad.org
zip
Updated Oct 17, 2024
Cite
Coulter Ralston; Alexander Nikitin; Benjamin Cosgrove (2024). Pre-ciliated tubal epithelial cells are prone to initiation of high-grade serous ovarian carcinoma [Dataset]. http://doi.org/10.5061/dryad.4mw6m90hm
Explore at:
Available download formats: zip
Unique identifier
https://doi.org/10.5061/dryad.4mw6m90hm
Dataset updated
Oct 17, 2024
Dataset provided by
Cornell University
Authors
Coulter Ralston; Alexander Nikitin; Benjamin Cosgrove
License
https://spdx.org/licenses/CC0-1.0.html
Description
The distal region of the uterine (Fallopian) tube is commonly associated with high-grade serous carcinoma (HGSC), the predominant and most aggressive form of ovarian or extra-uterine cancer. Specific cell states and lineage dynamics of the adult tubal epithelium (TE) remain insufficiently understood, hindering efforts to determine the cell of origin for HGSC. Here, we report a comprehensive census of cell types and states of the mouse uterine tube. We show that distal TE cells expressing the stem/progenitor cell marker Slc1a3 can differentiate into both secretory (Ovgp1+) and ciliated (Fam183b+) cells. Inactivation of Trp53 and Rb1, whose pathways are commonly altered in HGSC, leads to elimination of targeted Slc1a3+ cells by apoptosis, thereby preventing their malignant transformation. In contrast, pre-ciliated cells (Krt5+, Prom1+, Trp73+) remain cancer-prone and give rise to serous tubal intraepithelial carcinomas and overt HGSC. These findings identify transitional pre-ciliated cells as a previously unrecognized cancer-prone cell state and point to pre-ciliation mechanisms as novel diagnostic and therapeutic targets.
Methods

Single-cell RNA-sequencing library preparation
For TE single cell expression and transcriptome analysis we isolated TE from C57BL6 adult estrous female mice. In 3 independent experiments a total of 62 uterine tubes were collected. Each uterine tube was placed in sterile PBS containing 100 IU ml⁻¹ of penicillin and 100 µg ml⁻¹ streptomycin (Corning, 30-002-Cl), and separated into distal and proximal regions. Tissues from the same region were combined in a 40 µl drop of the same PBS solution, cut open lengthwise, and minced into 1.5-2.5 mm pieces with 25G needles. Minced tissues were transferred with the help of a sterile wide bore 200 µl pipette tip into a 1.8 ml cryo vial containing 1.2 ml A-mTE-D1 (300 IU ml⁻¹ collagenase IV mixed with 100 IU ml⁻¹ hyaluronidase; Stem Cell Technologies, 07912, in DMEM Ham’s F12, Hyclone, SH30023.FS). Tissues were incubated with a loose cap for 1 h at 37°C in a 5% CO2 incubator. During the incubation, tubes were taken out 4 times and tissues were suspended with a wide bore 200 µl pipette tip. At the end of incubation, the tissue-cell suspension from each tube was transferred into 1 ml TrypLE (Invitrogen, 12604013) pre-warmed to 37°C and suspended 70 times with a 1000 µl pipette tip; 5 ml A-SM [DMEM Ham’s F12 containing 2% fetal bovine serum (FBS)] was added to the mix, and TE cells were pelleted by centrifugation at 300 × g for 10 minutes at 25°C. Pellets were then suspended in 1 ml A-mTE-D2 pre-warmed to 37°C (7 mg ml⁻¹ Dispase II, Worthington NPRO2, and 10 µg ml⁻¹ Deoxyribonuclease I, Stem Cell Technologies, 07900), and mixed 70 times with a 1000 µl pipette tip. 5 ml A-mTE-D2 was added, and samples were passed through a 40 µm cell strainer and pelleted by centrifugation at 300 × g for 7 minutes at +4°C. Pellets were suspended in 100 µl microbeads per 10^7 total cells or fewer, and dead cells were removed with the Dead Cell Removal Kit (Miltenyi Biotec, 130-090-101) according to the manufacturer’s protocol.
Pelleted live cell fractions were collected in 1.5 ml low binding centrifuge tubes, kept on ice, and suspended in 50 µl ice-cold A-Ri-Buffer (5% FBS, 1% GlutaMAX-I, Invitrogen, 35050-079, 9 µM Y-27632, Millipore, 688000, and 100 IU ml⁻¹ penicillin and 100 μg ml⁻¹ streptomycin in DMEM Ham’s F12). Cell aliquots were stained with trypan blue for live and dead cell calculation. Live cell preparations with a target cell recovery of 5,000-6,000 were loaded on a Chromium controller (10X Genomics, Single Cell 3’ v2 chemistry) to perform single cell partitioning and barcoding using the microfluidic platform device. After preparation of barcoded, next-generation sequencing cDNA libraries, samples were sequenced on an Illumina NextSeq500 System.

Download and alignment of single-cell RNA sequencing data
For sequence alignment, a custom reference for mm39 was built using the cellranger (v6.1.2, 10x Genomics) mkref function. The mm39.fa soft-masked assembly sequence and the mm39.ncbiRefSeq.gtf (release 109) genome annotation last updated 2020-10-27 were used to form the custom reference. The raw sequencing reads were aligned to the custom reference and quantified using the cellranger count function.

Preprocessing and batch correction
All preprocessing and data analysis was conducted in R (v.4.1.1 (2021-08-10)). The cellranger count outs were first modified with the autoEstCont and adjustCounts functions from SoupX (v.1.6.1) to output a corrected matrix with the ambient RNA signal (soup) removed (https://github.com/constantAmateur/SoupX). To preprocess the corrected matrices, the Seurat (v.4.1.1) NormalizeData, FindVariableFeatures, ScaleData, RunPCA, FindNeighbors, and RunUMAP functions were used to create a Seurat object for each sample (https://github.com/satijalab/seurat). The number of principal components used to construct a shared nearest-neighbor graph was chosen to account for 95% of the total variance. To detect possible doublets, we used the package DoubletFinder (v.2.0.3) with inputs specific to each Seurat object. DoubletFinder creates artificial doublets and calculates the proportion of artificial k nearest neighbors (pANN) for each cell from a merged dataset of the artificial and actual data. To maximize DoubletFinder’s predictive power, the mean-variance normalized bimodality coefficient (BCMVN) was used to determine the optimal pK value for each dataset. To establish a threshold for pANN values to distinguish between singlets and doublets, the estimated multiplet rates for each sample were calculated by interpolating between the target cell recovery values according to the 10x Chromium user manual. Homotypic doublets were identified using unannotated Seurat clusters in each dataset with the modelHomotypic function. After doublets were identified, all distal and proximal samples were merged separately. Cells with greater than 30% mitochondrial genes, cells with fewer than 750 nCount_RNA, and cells with fewer than 200 nFeature_RNA were removed from the merged datasets. To correct for any batch effects between sample runs, we used the harmony (v.0.1.0) integration method (github.com/immunogenomics/harmony).
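
The multiplet-rate interpolation mentioned above can be sketched as a simple linear interpolation between tabulated points. The table values below are placeholders chosen for illustration, not copied from the 10x Chromium manual, and should be replaced with the table for the chemistry in use; the function name and the choice to clamp outside the tabulated range are also assumptions.

```python
from bisect import bisect_right

# Placeholder (cells recovered, multiplet rate) pairs; NOT the manual's values.
ILLUSTRATIVE_TABLE = [(1000, 0.008), (2000, 0.016), (5000, 0.039), (10000, 0.076)]

def multiplet_rate(cells_recovered, table=ILLUSTRATIVE_TABLE):
    """Linearly interpolate the expected multiplet rate for a recovered cell count.

    table must be sorted by cell count. Values outside the tabulated range are
    clamped to the nearest endpoint (a design choice of this sketch).
    """
    xs = [x for x, _ in table]
    if cells_recovered <= xs[0]:
        return table[0][1]
    if cells_recovered >= xs[-1]:
        return table[-1][1]
    i = bisect_right(xs, cells_recovered)
    (x0, y0), (x1, y1) = table[i - 1], table[i]
    return y0 + (y1 - y0) * (cells_recovered - x0) / (x1 - x0)
```

The interpolated rate multiplied by the number of cells in a sample gives the expected number of doublets, which is the kind of quantity used to threshold pANN values.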

Clustering parameters and annotations
After merging the datasets and batch correction, the dimensions reflecting 95% of the total variance were input into Seurat’s FindNeighbors function with a k.param of 70. Louvain clustering was then conducted using Seurat’s FindClusters with a resolution of 0.7. The resulting 19 clusters were annotated based on the expression of canonical genes and the results of differential gene expression (Wilcoxon Rank Sum test) analysis. One cluster expressing lymphatic and epithelial markers was omitted from later analysis as it only contained 2 cells suspected to be doublets. To better understand the epithelial populations, we reclustered 6 epithelial populations and reapplied harmony batch correction. For this reclustering, a k.param of 50 was used for FindNeighbors and a resolution of 0.7 was used for FindClusters. The resulting 9 clusters within the epithelial subset were further annotated using differential expression analysis and canonical markers.
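
Choosing the number of dimensions that reflect 95% of the variance can be sketched as a cumulative sum over per-component variances. This is a minimal sketch, not the authors' code: the function name is hypothetical, and it assumes "total variance" means the variance captured by the computed components (squared per-component standard deviations), which the record does not specify.

```python
def n_pcs_for_variance(stdevs, target=0.95):
    """Return the smallest number of leading PCs whose variance share >= target.

    stdevs: per-component standard deviations, in decreasing order.
    """
    variances = [s * s for s in stdevs]
    total = sum(variances)
    running = 0.0
    for k, v in enumerate(variances, start=1):
        running += v
        if running / total >= target:
            return k
    # Fall back to all components if rounding keeps the share below target.
    return len(variances)
```

The returned count would then define the range of dimensions passed to the neighbor-graph step.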

Pseudotime analysis
Potential of heat diffusion for affinity-based transition embedding (PHATE) is a dimensionality reduction method for more accurately visualizing continual progressions found in biological data [35]. A modified version of Seurat (v4.1.1) was developed to include the ‘RunPHATE’ function for converting a Seurat Object to a PHATE embedding. This was built on the phateR package (v.1.0.7) (https://github.com/scottgigante/seurat/tree/patch/add-PHATE-again). In addition to PHATE, pseudotime values were calculated with Monocle3 (v.1.2.7), which computes trajectories with an origin set by the user [36,55–57]. The origin was set to be a progenitor cell state confirmed with lineage tracing experiments.
35. Moon, K. R. et al. Visualizing structure and transitions in high-dimensional biological data. Nat Biotechnol 37, 1482–1492 (2019). doi:10.1038/s41587-019-0336-3
36. Cao, J. et al. The single-cell transcriptional landscape of mammalian organogenesis. Nature 566, 496–502 (2019). doi:10.1038/s41586-019-0969-x
55. Trapnell, C. et al. The dynamics and regulators of cell fate decisions are revealed by pseudotemporal ordering of single cells. Nature Biotechnology 32, 381–386 (2014). doi:10.1038/nbt.2859
56. Qiu, X. et al. Single-cell mRNA quantification and differential analysis with Census. Nature Methods 14, 309–315 (2017). doi:10.1038/nmeth.4150
57. Qiu, X. et al. Reversed graph embedding resolves complex single-cell trajectories. Nat Methods 14, 979–982 (2017). doi:10.1038/nmeth.4402
Analysis Products: Transcription factor stoichiometry, motif affinity and...
zenodo.org
tsv, zip
Updated Nov 11, 2023
Cite
Surag Nair; Mohamed Ameen; Kevin Wang; Anshul Kundaje (2023). Analysis Products: Transcription factor stoichiometry, motif affinity and syntax regulate single cell chromatin dynamics during fibroblast reprogramming to pluripotency [Dataset]. http://doi.org/10.5281/zenodo.8313962
Explore at:
Available download formats: zip, tsv
Unique identifier
https://doi.org/10.5281/zenodo.8313962
Dataset updated
Nov 11, 2023
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Surag Nair; Mohamed Ameen; Kevin Wang; Anshul Kundaje
License
Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
License information was derived automatically
Description
This record contains analysis products for the paper "Transcription factor stoichiometry, motif affinity and syntax regulate single cell chromatin dynamics during fibroblast reprogramming to pluripotency" by Nair, Ameen et al. Please refer to the READMEs in the directories, which are summarized below.

The record contains the following files:

`clusters.tsv`: contains the cluster id, name and colour of clusters in the paper

scATAC.zip

Analysis products for the single-cell ATAC-seq data. Contains:

- `cells.tsv`: list of barcodes that pass QC. Columns include:
  - `barcode`
  - `sample`: (time point)
  - `umap1`
  - `umap2`
  - `cluster`
  - `dpt_pseudotime_fibr_root`: pseudotime values treating a fibroblast cell as root
  - `dpt_pseudotime_xOSK_root`: pseudotime values treating xOSK cell as root
- `peaks.bed`: list of peaks of 500bp across all cell states. 4th column contains the peak set label. Note that ~5000 peaks are not assigned to any peak set and are marked as NA.
- `features.tsv`: 50 dimensional representation of each cell
- `cell_x_peak.mtx.gz`: sparse matrix of fragment counts within peaks. Load using scipy.io.mmread in python or readMM in R. Columns correspond to cells from `cells.tsv` (combine sample + barcode). Rows correspond to peaks in `peaks.bed`
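
Before loading `cell_x_peak.mtx.gz` with scipy.io.mmread, it can be useful to confirm that the matrix dimensions agree with `cells.tsv` and `peaks.bed`. The sketch below uses only the standard library and reads the size line of the Matrix Market header. It assumes that `cells.tsv` has a single header row and that `peaks.bed` has none; both are assumptions, as are the helper names.

```python
import gzip

def mtx_shape(path):
    """Read (n_rows, n_cols, n_nonzero) from a gzipped Matrix Market file."""
    with gzip.open(path, "rt") as fh:
        for line in fh:
            # Header and comment lines start with '%'; the first other line is the size line.
            if line.startswith("%") or not line.strip():
                continue
            rows, cols, nnz = (int(x) for x in line.split()[:3])
            return rows, cols, nnz
    raise ValueError("no size line found in " + path)

def count_records(path, has_header):
    """Count non-empty lines in a text file, optionally skipping one header row."""
    with open(path) as fh:
        n = sum(1 for line in fh if line.strip())
    return n - 1 if has_header else n

def check_alignment(mtx_path, cells_path, peaks_path):
    """True if rows match peaks.bed and columns match cells.tsv."""
    n_rows, n_cols, _ = mtx_shape(mtx_path)
    n_cells = count_records(cells_path, has_header=True)    # assumption: header row
    n_peaks = count_records(peaks_path, has_header=False)   # assumption: no header
    return n_rows == n_peaks and n_cols == n_cells
```

If the check fails, the header assumptions are the first thing to revisit before suspecting the matrix itself.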

scATAC_clusters.zip

Analysis products corresponding to cluster pseudo-bulks of the single-cell ATAC-seq data.

- `clusters.tsv`: contains the cluster id, name and colour used in the paper
- `peaks`: contains `overlap_reproducibilty/overlap.optimal_peak` peaks called using ENCODE bulk ATAC-seq pipeline in the narrowPeak format.
- `fragments`: contains per cluster fragment files

scATAC_scRNA_integration.zip

Analysis products from the integration of scATAC with scRNA. Contains:

- `peak_gene_links_fdr1e-4.tsv`: file with peak gene links passing FDR 1e-4. For analyses in the paper, we filter to peaks with absolute correlation >0.45.
- `harmony.cca.30.feat.tsv`: 30 dimensional co-embedding for scATAC and scRNA cells obtained by CCA followed by applying Harmony over assay type.
- `harmony.cca.metadata.tsv`: UMAP coordinates for scATAC and scRNA cells derived from the Harmony CCA embedding. First column contains barcode.

scRNA.zip

Analysis products for the single-cell RNA-seq data. Contains:

- `seurat.rds`: Seurat object that contains expression data (raw counts, normalized, and scaled), reductions (umap, pca), knn graphs, and all associated metadata. Note that the barcode suffix (1-9) corresponds to samples D0, D2, ..., D14, iPSC.
- `genes.txt`: list of all genes
- `cells.tsv`: list of barcodes that pass QC across samples. Contains:
  - `barcode_sample`: barcode with index of sample (1-9 corresponding to D0, D2, ..., D14, iPSC)
  - `sample`: sample name (D0, D2, .., D14, iPSC)
  - `umap1`
  - `umap2`
  - `nCount_RNA`
  - `nFeature_RNA`
  - `cluster`
  - `percent.mt`: percent of mitochondrial transcripts in cell
  - `percent.oskm`: percent of OSKM transcripts in cell
- `gene_x_cell.mtx.gz`: sparse matrix of gene counts. Load using scipy.io.mmread in python or readMM in R. Columns correspond to cells from `cells.tsv` (barcode suffix contains sample information). Rows correspond to genes in `genes.txt`
- `pca.tsv`: first 50 PCs of each cell
- `oskm_endo_sendai.tsv`: estimated raw counts (cts, may not be integers) and log(1+ tp10k) normalized expression (norm) for endogenous and exogenous (Sendai derived) counts of POU5F1 (OCT4), SOX2, KLF4 and MYC genes. Rows are consistent with `seurat.rds` and `cells.tsv`

multiome.zip

multiome/snATAC:

These files are derived from the integration of nuclei from multiome (D1M and D2M), with cells from day 2 of scATAC-seq (labeled D2).

- `cells.tsv`: This is the list of nuclei barcodes that pass QC from multiome AND also cell barcodes from D2 of scATAC-seq. Includes:
  - `barcode`
  - `umap1`: These are the coordinates used for the figures involving multiome in the paper.
  - `umap2`: as for `umap1`
  - `sample`: D1M and D2M correspond to multiome, D2 corresponds to day 2 of scATAC-seq
  - `cluster`: For multiome barcodes, these are labels transferred from scATAC-seq. For D2 scATAC-seq, these are the original cluster labels.
- `peaks.bed`: This is the same file as scATAC/peaks.bed. List of peaks of 500bp. 4th column contains the peak set label. Note that ~5000 peaks are not assigned to any peak set and are marked as NA.
- `cell_x_peak.mtx.gz`: sparse matrix of fragment counts within peaks. Load using scipy.io.mmread in python or readMM in R. Columns correspond to cells from `cells.tsv` (combine sample + barcode). Rows correspond to peaks in `peaks.bed`.
- `features.no.harmony.50d.tsv`: 50 dimensional representation of each cell prior to running Harmony (to correct for batch effect between D2 scATAC and D1M,D2M snMultiome). Rows correspond to cells from `cells.tsv`.
- `features.harmony.10d.tsv`: 10 dimensional representation of each cell after running Harmony. Rows correspond to cells from `cells.tsv`.

multiome/snRNA:

- `seurat.rds`: Seurat object that contains expression data (raw counts, normalized, and scaled), reductions (umap, pca), and associated metadata. Note that the barcode suffix (1, 2) corresponds to samples D1M, D2M. Please use the UMAP/features from snATAC/ for consistency.
- `genes.txt`: list of all genes (this is different from the list in scRNA analysis)
- `cells.tsv`: list of barcodes that pass QC across samples. Contains:
  - `barcode_sample`: barcode with index of sample (1,2 corresponding to D1M, D2M respectively)
  - `sample`: sample name (D1M, D2M)
  - `nCount_RNA`
  - `nFeature_RNA`
  - `percent.oskm`: percent of OSKM genes in cell
- `gene_x_cell.mtx.gz`: sparse matrix of gene counts. Load using scipy.io.mmread in python or readMM in R. Columns correspond to cells from `cells.tsv` (barcode suffix contains sample information). Rows correspond to genes in `genes.txt`

Single-cell RNA-Seq and TCR-Seq analysis of PD-1+ CD8+ T-cells responding to...

zenodo.org

bin, csv, zip

Updated Oct 24, 2024

Cite

Bertram Bengsch; Sagar; Zhen Zhang (2024). Single-cell RNA-Seq and TCR-Seq analysis of PD-1+ CD8+ T-cells responding to anti-PD-1 and anti-PD-1/CTLA-4 immunotherapy in melanoma [Dataset]. http://doi.org/10.5281/zenodo.13971562

Explore at:

Available download formats: bin, csv, zip

Unique identifier

https://doi.org/10.5281/zenodo.13971562

Dataset updated

Oct 24, 2024

Dataset provided by

Zenodo

Authors

Bertram Bengsch; Sagar; Zhen Zhang

License

Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
License information was derived automatically

Description

This dataset details the scRNASeq and TCR-Seq analysis of sorted PD-1+ CD8+ T cells from patients with melanoma treated with checkpoint therapy (anti-PD-1 monotherapy and anti-PD-1 & anti-CTLA-4 combination therapy) at baseline and after the first cycle of therapy. A major publication using this dataset is accessible here: (reference)

*experimental design

Single-cell RNA sequencing was performed using 10x Genomics with feature barcoding technology to multiplex cell samples from different patients undergoing mono or dual therapy, so that they could be loaded into one well to reduce costs and minimize technical variability. Hashtag oligomers (oligos) were obtained as purified and already oligo-conjugated in TotalSeq-C format from BioLegend. Cells were thawed and counted, and 20 million cells per patient and time point were used for staining. Cells were stained with barcoded antibodies together with a staining solution containing antibodies against CD3, CD4, CD8, PD-1/IgG4 and fixable viability dye (eBioscience) prior to FACS sorting. Barcoded antibody concentrations used were 0.5 µg per million cells, as recommended by the manufacturer (BioLegend) for flow cytometry applications. After staining, cells were washed twice in PBS containing 2% BSA and 0.01% Tween 20, followed by centrifugation (300 × g for 5 min at 4 °C) and supernatant exchange. After the final wash, cells were resuspended in PBS, filtered through 40 µm cell strainers, and sorted. Sorted cells were counted and approximately 75,000 cells were processed through the 10x Genomics single-cell V(D)J workflow according to the manufacturer’s instructions. Gene expression, hashing and TCR libraries were pooled to desired quantities to obtain sequencing depths of 15,000 reads per cell for gene expression libraries and 5,000 reads per cell for hashing and TCR libraries. Libraries were sequenced on a NovaSeq 6000 flow cell in a 2×100 paired-end format.

*extract protocol

PBMCs were thawed and counted, and 20 million cells per patient and time point were used for staining. Cells were stained with barcoded antibodies together with a staining solution containing antibodies against CD3, CD4, CD8, PD-1/IgG4 and fixable viability dye (eBioscience) prior to FACS sorting. Barcoded antibody concentrations used were 0.5 µg per million cells, as recommended by the manufacturer (BioLegend) for flow cytometry applications. After staining, cells were washed twice in PBS containing 2% BSA and 0.01% Tween 20, followed by centrifugation (300 × g for 5 min at 4 °C) and supernatant exchange. After the final wash, cells were resuspended in PBS, filtered through 40 µm cell strainers, and sorted. Sorted cells were counted and approximately 75,000 cells were processed through the 10x Genomics single-cell V(D)J workflow according to the manufacturer’s instructions.

*library construction protocol

Sorted cells were counted and approximately 75,000 cells were processed through the 10x Genomics single-cell V(D)J workflow according to the manufacturer’s instructions. Gene expression, hashing and TCR libraries were pooled in the quantities needed to obtain sequencing depths of 15,000 reads per cell for gene expression libraries and 5,000 reads per cell for hashing and TCR libraries. Libraries were sequenced on a NovaSeq 6000 flow cell in a 2 × 100 paired-end format.
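The target depths above imply a read budget per library and a pooling ratio. The following is a minimal sketch of that arithmetic. It assumes that every one of the roughly 75,000 processed cells is recovered, which is a simplification made here and not a claim from the protocol; the function and variable names are illustrative.

```python
# Values taken from the protocol text. Treating all processed cells as
# recovered cells is an assumption made only for this illustration.
CELLS = 75_000
READS_PER_CELL = {"gene_expression": 15_000, "hashing": 5_000, "tcr": 5_000}


def read_budget(n_cells, reads_per_cell):
    """Return the number of reads each library needs to reach its target depth."""
    return {library: n_cells * depth for library, depth in reads_per_cell.items()}


budget = read_budget(CELLS, READS_PER_CELL)
total = sum(budget.values())
# Fraction of the sequencing pool each library should occupy.
fractions = {library: reads / total for library, reads in budget.items()}

for library, reads in budget.items():
    print(f"{library}: {reads:,} reads ({fractions[library]:.0%} of the pool)")
```

Under these assumptions the gene expression, hashing and TCR libraries would be pooled at roughly 3:1:1.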

*library strategy

scRNA-seq and scTCR-seq

*data processing step

Pre-processing of sequencing results to generate count matrices (gene expression and HTO barcode counts) was performed using the 10x Genomics Cell Ranger pipeline.

Further processing was done with Seurat (cell and gene filtering, hashtag identification, clustering, differential gene expression analysis based on gene expression).
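To illustrate what hashtag identification does conceptually, here is a deliberately simplified sketch in Python. It is not the Seurat procedure used by the authors: it assigns each cell to its dominant hashtag using two hypothetical thresholds (`min_count` and `min_ratio`) chosen only for the example.

```python
def classify_cell(hto_counts, min_count=50, min_ratio=5.0):
    """Assign a cell to a hashtag, or label it 'negative' or 'doublet'.

    hto_counts maps hashtag name -> UMI count for one cell and must contain
    at least two hashtags. Both thresholds are hypothetical example values.
    """
    ranked = sorted(hto_counts.items(), key=lambda item: item[1], reverse=True)
    (top_tag, top_count), (_, second_count) = ranked[0], ranked[1]
    if top_count < min_count:
        return "negative"  # no hashtag signal strong enough to call
    if second_count > 0 and top_count / second_count < min_ratio:
        return "doublet"   # two hashtags with comparable signal
    return top_tag


# Toy cells with made-up counts.
cells = {
    "cell_a": {"HASH_1": 400, "HASH_2": 12, "HASH_3": 8},
    "cell_b": {"HASH_1": 300, "HASH_2": 250, "HASH_3": 5},
    "cell_c": {"HASH_1": 3, "HASH_2": 1, "HASH_3": 0},
}
calls = {name: classify_cell(counts) for name, counts in cells.items()}
print(calls)
```

Seurat's own demultiplexing works on normalized counts with data-driven cutoffs, so real thresholds should not be taken from this sketch.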

*genome build/assembly

Alignment was performed using prebuilt Cell Ranger human reference GRCh38.

*processed data files format and content

RNA counts and HTO counts are in sparse matrix format and TCR clonotypes are in csv format.

Datasets were merged and analyzed by Seurat and the analyzed objects are in rds format.
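The sketch below shows one way the sparse count matrices could be read and split into RNA and HTO counts. It assumes the zip archives follow the standard Cell Ranger layout (a Matrix Market `matrix.mtx` file plus a features table whose feature type column distinguishes "Gene Expression" from "Antibody Capture"); this layout is not stated in the record. The toy matrix is made up.

```python
import io


def read_mtx(handle):
    """Parse Matrix Market coordinate text into (shape, {(row, col): value}).

    Indices are converted from 1-based (file) to 0-based (Python).
    """
    shape = None
    entries = {}
    for line in handle:
        parts = line.split()
        if not parts or line.startswith("%"):
            continue  # skip blank lines and comment/header lines
        if shape is None:
            shape = (int(parts[0]), int(parts[1]))  # rows, columns
            continue
        row, col, value = int(parts[0]), int(parts[1]), int(parts[2])
        entries[(row - 1, col - 1)] = value
    return shape, entries


def split_by_feature_type(entries, feature_types):
    """Group non-zero entries by the feature type of their row."""
    grouped = {}
    for (row, col), value in entries.items():
        grouped.setdefault(feature_types[row], {})[(row, col)] = value
    return grouped


# Toy input: 3 features (rows) x 2 cells (columns).
toy_mtx = io.StringIO(
    "%%MatrixMarket matrix coordinate integer general\n"
    "3 2 4\n"
    "1 1 5\n"
    "3 1 120\n"
    "2 2 1\n"
    "3 2 7\n"
)
toy_feature_types = ["Gene Expression", "Gene Expression", "Antibody Capture"]

shape, entries = read_mtx(toy_mtx)
grouped = split_by_feature_type(entries, toy_feature_types)
print(shape, grouped)
```

In practice a sparse matrix library would replace the dictionary; the plain-Python version is used here only to keep the example self-contained.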

file name	file checksum
PD1CD8_160421_filtered_feature_bc_matrix.zip	da2e006d2b39485fd8cf8701742c6d77
PD1CD8_190421_filtered_feature_bc_matrix.zip	e125fc5031899bba71e1171888d78205
PD1CD8_160421_filtered_contig_annotations.csv	927241805d507204fbe9ef7045d0ccf4
PD1CD8_190421_filtered_contig_annotations.csv	8ca544d27f06e66592b567d3ab86551e
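
Downloaded files can be checked against the checksums listed above. The sketch below assumes the checksums are MD5 digests, which is inferred from their 32-character hexadecimal form and is not stated in the record; `verify_downloads` is a hypothetical helper name.

```python
import hashlib
from pathlib import Path

# Checksums copied from the table above. MD5 is an assumption based on length.
EXPECTED_MD5 = {
    "PD1CD8_160421_filtered_feature_bc_matrix.zip": "da2e006d2b39485fd8cf8701742c6d77",
    "PD1CD8_190421_filtered_feature_bc_matrix.zip": "e125fc5031899bba71e1171888d78205",
    "PD1CD8_160421_filtered_contig_annotations.csv": "927241805d507204fbe9ef7045d0ccf4",
    "PD1CD8_190421_filtered_contig_annotations.csv": "8ca544d27f06e66592b567d3ab86551e",
}


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_downloads(directory, expected=EXPECTED_MD5):
    """Return {file name: True, False, or None if the file is absent}."""
    results = {}
    for name, checksum in expected.items():
        path = Path(directory) / name
        results[name] = md5_of_file(path) == checksum if path.exists() else None
    return results
```

Calling `verify_downloads(".")` in the download directory would report, per file, whether the digest matches, does not match, or the file is missing.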

*processed data file	antibodies/tags
PD1CD8_160421_filtered_feature_bc_matrix.zip	none
PD1CD8_160421_filtered_feature_bc_matrix.zip	TotalSeq™-C0251 anti-human Hashtag 1 Antibody (HASH_1): M1_base_monotherapy; TotalSeq™-C0252 anti-human Hashtag 2 Antibody (HASH_2): M1_post_monotherapy; TotalSeq™-C0253 anti-human Hashtag 3 Antibody (HASH_3): C1_base_combined_therapy; TotalSeq™-C0254 anti-human Hashtag 4 Antibody (HASH_4): C1_post_combined_therapy; TotalSeq™-C0255 anti-human Hashtag 5 Antibody (HASH_5): C2_base_combined_therapy; TotalSeq™-C0256 anti-human Hashtag 6 Antibody (HASH_6): C2_post_combined_therapy
PD1CD8_160421_filtered_contig_annotations.csv	none
PD1CD8_190421_filtered_feature_bc_matrix.zip	none
PD1CD8_190421_filtered_feature_bc_matrix.zip	TotalSeq™-C0251 anti-human Hashtag 1 Antibody (HASH_1): M2_base_monotherapy; TotalSeq™-C0252 anti-human Hashtag 2 Antibody (HASH_2): M2_post_monotherapy; TotalSeq™-C0253 anti-human Hashtag 3 Antibody (HASH_3): M3_base_monotherapy; TotalSeq™-C0254 anti-human Hashtag 4 Antibody (HASH_4): M3_post_monotherapy; TotalSeq™-C0255 anti-human Hashtag 5 Antibody (HASH_5): C3_base_combined_therapy; TotalSeq™-C0256 anti-human Hashtag 6 Antibody (HASH_6): C3_post_combined_therapy
PD1CD8_190421_filtered_contig_annotations.csv	none
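
The hashtag assignments listed above can be expressed as a lookup table for annotating demultiplexed cells. In this sketch the run keys (`PD1CD8_160421`, `PD1CD8_190421`) are taken from the file name prefixes, and the split of each sample label into patient, time point and therapy is an interpretation of the labels rather than something the record defines.

```python
# Hashtag-to-sample assignments copied from the table above, keyed by the
# file name prefix of each run (the choice of key is an assumption).
HASHTAG_SAMPLES = {
    "PD1CD8_160421": {
        "HASH_1": "M1_base_monotherapy",
        "HASH_2": "M1_post_monotherapy",
        "HASH_3": "C1_base_combined_therapy",
        "HASH_4": "C1_post_combined_therapy",
        "HASH_5": "C2_base_combined_therapy",
        "HASH_6": "C2_post_combined_therapy",
    },
    "PD1CD8_190421": {
        "HASH_1": "M2_base_monotherapy",
        "HASH_2": "M2_post_monotherapy",
        "HASH_3": "M3_base_monotherapy",
        "HASH_4": "M3_post_monotherapy",
        "HASH_5": "C3_base_combined_therapy",
        "HASH_6": "C3_post_combined_therapy",
    },
}


def parse_sample(label):
    """Split a label such as 'C1_base_combined_therapy' into its three parts."""
    patient, timepoint, therapy = label.split("_", 2)
    return {"patient": patient, "timepoint": timepoint, "therapy": therapy}


def annotate(run, hashtag):
    """Look up the sample metadata for a hashtag call within a given run."""
    return parse_sample(HASHTAG_SAMPLES[run][hashtag])


patients = sorted(
    {parse_sample(label)["patient"]
     for samples in HASHTAG_SAMPLES.values()
     for label in samples.values()}
)
print(annotate("PD1CD8_160421", "HASH_3"))
print(patients)
```

Because the same six hashtags are reused in both runs, a hashtag call is only meaningful together with the run it came from.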


