MIT License https://opensource.org/licenses/MIT
License information was derived automatically
Scripts used for analysis of V1 and V2 Datasets.
seurat_v1.R - initialize seurat object from 10X Genomics cellranger outputs. Includes filtering, normalization, regression, variable gene identification, PCA analysis, clustering, tSNE visualization. Used for v1 datasets.
merge_seurat.R - merge two or more seurat objects into one seurat object. Perform linear regression to remove batch effects from separate objects. Used for v1 datasets.
subcluster_seurat_v1.R - subcluster clusters of interest from Seurat object. Determine variable genes, perform regression and PCA. Used for v1 datasets.
seurat_v2.R - initialize seurat object from 10X Genomics cellranger outputs. Includes filtering, normalization, regression, variable gene identification, and PCA analysis. Used for v2 datasets.
clustering_markers_v2.R - clustering and tSNE visualization for v2 datasets.
subcluster_seurat_v2.R - subcluster clusters of interest from Seurat object. Determine variable genes, perform regression and PCA analysis. Used for v2 datasets.
seurat_object_analysis_v1_and_v2.R - downstream analysis and plotting functions for seurat object created by seurat_v1.R or seurat_v2.R.
merge_clusters.R - merge clusters that do not meet gene threshold. Used for both v1 and v2 datasets.
prepare_for_monocle_v1.R - subcluster cells of interest and perform linear regression, but not scaling in order to input normalized, regressed values into monocle with monocle_seurat_input_v1.R.
monocle_seurat_input_v1.R - monocle script using seurat batch corrected values as input for v1 merged timecourse datasets.
monocle_lineage_trace.R - monocle script using nUMI as input for v2 lineage traced dataset.
monocle_object_analysis.R - downstream analysis for monocle object - BEAM and plotting.
CCA_merging_v2.R - script for merging v2 endocrine datasets with canonical correlation analysis and determining the number of CCs to include in downstream analysis.
CCA_alignment_v2.R - script for downstream alignment, clustering, tSNE visualization, and differential gene expression analysis.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Seurat objects of overall merges of unsorted endothelial and perivascular cells
-> part of the manuscript: Single-cell atlas of the human brain vasculature across development, adulthood and disease
https://www.nature.com/articles/s41586-024-07493-y
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-i) Overall merge of all unsorted endothelial and perivascular cells_seurat object.rds:
-> this seurat object is the overall merge of unsorted endothelial and perivascular cells isolated from fetal brain, adult/control brains (temporal lobes), brain tumors (lower-grade glioma, high-grade glioma (glioblastoma), brain metastasis, meningiomas) and brain vascular malformations (brain arteriovenous malformations).
-ii) Overall merge of pathological unsorted endothelial and perivascular cells_seurat object.rds:
-> this seurat object is the overall merge of unsorted endothelial and perivascular cells isolated from brain tumors (lower-grade glioma, high-grade glioma (glioblastoma), brain metastasis, meningiomas) and brain vascular malformations (brain arteriovenous malformations).
Integration script:
library(Seurat)
library(tidyverse)
library(Matrix)
#cite <- readRDS("C:/Users/alex/sciebo/ALL_NGS/scRNAseq/scRNAseq/Merge AAA mit Cite AAA/Cite_seq_v0.41.rds")
#CD45 <- readRDS("C:/Users/alex/sciebo/ALL_NGS/scRNAseq/scRNAseq/Schrader/Fertige_Analysen/TS_d5_paper/CD45.rds")
AAA <- readRDS("C:/Users/alex/sciebo/AAA_Zhao_v4.rds")
cite <- readRDS("C:/Users/alex/sciebo/CITE_Seq_v0.5.rds")
all4 <- readRDS("C:/Users/alex/sciebo/ALL_NGS/scRNAseq/scRNAseq/Schrader/Fertige_Analysen/Schrader_All4_Rohanalyse/all4_220228.rds")
#fuse lists (named list; avoids shadowing the base function c())
pancreas.list <- list(cite = cite, all4 = all4, AAA = AAA)
#normalize each dataset separately with SCTransform
for (i in seq_along(pancreas.list)) {
  pancreas.list[[i]] <- SCTransform(pancreas.list[[i]], verbose = FALSE)
}
pancreas.features <- SelectIntegrationFeatures(object.list = pancreas.list, nfeatures = 3000)
#options(future.globals.maxSize= 6091289600)
#pancreas.list <- PrepSCTIntegration(object.list = pancreas.list, anchor.features = pancreas.features,
#verbose = FALSE) #future.globals.maxsize was to low. changed it to options(future.globals.maxSize= 1091289600)
#identify anchors
#alternative from tutorial (https://satijalab.org/seurat/articles/integration_introduction.html)
#memory.limit(9999999999)
#reuse pancreas.features selected above instead of selecting the same features a second time
pancreas.list <- PrepSCTIntegration(object.list = pancreas.list, anchor.features = pancreas.features)
pancreas.anchors <- FindIntegrationAnchors(object.list = pancreas.list, normalization.method = "SCT", anchor.features = pancreas.features, verbose = FALSE)
pancreas.integrated <- IntegrateData(anchorset = pancreas.anchors, normalization.method = "SCT",
verbose = FALSE)
setwd("C:/Users/alex/sciebo/ALL_NGS/scRNAseq/scRNAseq/Schrader/Fertige_Analysen/TS_d5_paper")
saveRDS(pancreas.integrated, file = "integrated_AAA_Cite_AMI.rds")
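#cd45 is not defined in this script (the CD45 readRDS call above is commented out); load it before enabling this line
#saveRDS(cd45, file = "integrated_AAA_Cite_CD45.rds")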
seurat <- pancreas.integrated
#seurat <- readRDS("C:/Users/alex/sciebo/ALL_NGS/scRNAseq/scRNAseq/Schrader/Fertige_Analysen/TS_d5_paper/integrated_d5_cite.rds")
DefaultAssay(object = seurat) <- "integrated"
seurat <- FindVariableFeatures(seurat, selection.method = "vst", nfeatures = 3000)
seurat <- ScaleData(seurat, verbose = FALSE)
seurat <- RunPCA(seurat, npcs = 30, verbose = FALSE)
seurat <- FindNeighbors(seurat, dims = 1:30)
seurat <- FindClusters(seurat, resolution = 0.5)
seurat <- RunUMAP(seurat, reduction = "pca", dims = 1:30)
DimPlot(seurat, reduction = "umap", split.by = "treatment") + NoLegend()
DimPlot(seurat, label = TRUE, repel = TRUE) + NoLegend()
DefaultAssay(object = seurat) <- "ADT"
adt_marker_integrated <- FindAllMarkers(seurat, logfc.threshold = 0.3)
write.csv(adt_marker_integrated, file = "adt_marker_all4_integrated.csv")
DefaultAssay(object = seurat) <- "RNA"
RNA_marker_integrated <- FindAllMarkers(seurat, logfc.threshold = 0.5)
write.csv(RNA_marker_integrated, file = "RNA_marker_all4_integrated.csv")
DimPlot(seurat, label = TRUE, repel = TRUE, split.by = "tissue") + NoLegend()
FeaturePlot(seurat, features = "Cd40", order = TRUE, label = TRUE)
FeaturePlot(seurat, features = "Ms.CD40", order = TRUE, label = TRUE)
#####
#cleanup: drop metadata columns and the prediction assay that are no longer needed
drop.cols <- c(
  paste0("sen_score", 1:19),
  paste0("pANN_0.25_0.1_", c(1211, 184, 953, 466)),
  paste0("DF.classifications_0.25_0.1_", c(1211, 466, 184, 953)),
  "predicted.celltype", "integrated_snn_res.3", "RNA_snn_res.3",
  "SingleR", "SingleR_fine", "ImmGen", "ImmGen_fine",
  "percent.mt", "nCount_integrated", "nFeature_integrated",
  "S.Score", "G2M.Score", "Phase"
)
for (col in drop.cols) {
  seurat@meta.data[[col]] <- NULL
}
seurat@assays[["prediction.score.celltype"]] <- NULL
CD8 T cell exhaustion is a major barrier limiting anti-tumor therapy. Though checkpoint blockade temporarily improves exhausted CD8 T cell (Tex) function, the underlying epigenetic landscape of Tex remains largely unchanged, preventing their durable “reinvigoration.” Whereas the transcription factor (TF) TOX has been identified as a critical initiator of Tex epigenetic programming, it remains unclear whether TOX plays an ongoing role in preserving Tex biology after cells commit to exhaustion. Here, we decoupled the role of TOX in the initiation versus maintenance of CD8 T cell exhaustion by temporally deleting TOX in established Tex. Induced TOX ablation in committed Tex resulted in apoptotic-driven loss of Tex, reduced expression of inhibitory receptors including PD-1, and a pronounced decrease in terminally differentiated subsets of Tex cells. Simultaneous gene expression and epigenetic profiling revealed a critical role for TOX in ensuring ongoing chromatin accessibility and transcri...
Cells from inducible-Cre (Rosa26CreERT2/+Toxfl/fl P14) mice, in which TOX was temporally deleted from mature populations of LCMV-specific exhausted T cells after establishment of chronic LCMV infection 5 days post infection, were subjected to scRNA and scATACseq coassay; naive cells and WT cells were used as controls. The analysis was adapted from a pipeline developed by Josephine Giles and vignettes published by the Satija and Stuart labs. Transcript count and peak accessibility matrices are deposited in GSE255042, GSE255043. Seurat/Signac was used to process the scRNA and scATACseq coassay data. The processed Seurat/Signac object above was subsequently used for downstream RNA and ATAC analyses as described below: DEGs between TOX WT and iKO cells within each subset were identified using FindMarkers (Seurat, Signac), with a log2-fold-change threshold of 0, using the SCT assay. DACRs were identified using FindMarkers using the "LR" test, with a log2-fold-change threshold of 0.1, a min.pct of 0.05, and included the number of c...
Continuous expression of TOX safeguards exhausted CD8 T cell epigenetic fate
https://doi.org/10.5061/dryad.8kprr4xx9
Seurat/Signac pipeline for multiomic scRNA-seq and scATAC-seq dataset, generated following inducible TOX deletion in LCMV-Cl13
Author
Yinghui Jane Huang
Purpose: Generate and process Seurat/Signac object for downstream analyses
Written: Nov 2021 through Oct 2022
Adapted from: Analysis pipeline developed by Josephine Giles and vignettes published by Satija and Stuart labs
Input dataset: Transcript count and peak accessibility matrices deposited in GSE255042, GSE255043
1) Create individual signac objects for each sample from the raw 10x cellranger output.
2) Merge individual objects to create one seurat object.
3) Add metadata to merged seurat object.
Following are the steps in the attached html file for analysis of the paired data (ATAC+RNA)
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Introduction
The original intent of assembling a data set of publicly-available tumor-infiltrating T cells (TILs) with paired TCR sequencing was to expand and improve the scRepertoire R package. However, after some discussion, we decided to release the data set for everyone. A complete summary of the sequencing runs and the sample information can be found in the meta data of the Seurat object. This repository contains the code for the initial processing and annotation of the data set (we are calling this version 0.0.1). This involves several steps: 1) loading the respective GE data, 2) harmonizing the data by sample and cohort information, 3) iterating through automatic annotation, 4) unifying annotation via manual inspection and enrichment analysis, and 5) adding the TCR information.
Methods
Single-Cell Data Processing
The filtered gene matrices output from the Cell Ranger align function for individual sequencing runs (10x Genomics, Pleasanton, CA) were loaded into the R global environment. For each sequencing run, a unique prefix was added to the cell barcodes to prevent issues with duplicate barcodes. The results were then ported into individual Seurat objects (citation), where cells with > 10% mitochondrial genes and/or 2.5x natural log distribution of counts were excluded for quality control purposes. At the individual sequencing run level, doublets were estimated using the scDblFinder (v1.4.0) R package. All the sequencing runs across experiments were merged into a single Seurat object using the merge() function. All the data were then normalized using the default settings, and 2,000 variable genes were identified using the "vst" method. Next, the data were scaled with the default settings and 40 principal components were calculated. Data were integrated using the harmony (v1.0.0) R package (citation), using both cohort and sample information to correct for batch effects with up to 20 iterations. The UMAP was created using the RunUMAP() function in Seurat, using 20 dimensions of the harmony calculations.
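The barcode prefixing and mitochondrial filter described above can be illustrated with a minimal base-R sketch. It is not the repository's code: the count values, the run label `run1`, and the human-style `MT-` gene prefix are assumptions made for illustration, and the count-distribution filter and doublet estimation are left out.

```r
# Toy count matrix: 5 genes x 4 cells (values are made up for illustration).
counts <- matrix(
  c(50, 30, 15,  3,  2,   # cell 1:  5% mitochondrial
    40, 20, 32,  4,  4,   # cell 2:  8% mitochondrial
    20, 10, 10, 35, 25,   # cell 3: 60% mitochondrial
    60, 25, 10,  2,  3),  # cell 4:  5% mitochondrial
  nrow = 5,
  dimnames = list(
    c("CD3E", "CD8A", "GZMB", "MT-CO1", "MT-ND1"),
    c("AAAC", "AAAG", "AACC", "AACG")
  )
)

# Hypothetical run label; prefixing keeps barcodes unique across runs.
run_id <- "run1"
colnames(counts) <- paste(run_id, colnames(counts), sep = "_")

# Percentage of each cell's counts that come from mitochondrial genes
# (assumes human gene symbols starting with "MT-").
mito_genes <- grepl("^MT-", rownames(counts))
percent_mt <- 100 * colSums(counts[mito_genes, , drop = FALSE]) / colSums(counts)

# Exclude cells with more than 10% mitochondrial counts.
keep <- percent_mt <= 10
filtered <- counts[, keep, drop = FALSE]
```

In this toy input only the third cell exceeds the 10% threshold, so `filtered` keeps the other three prefixed barcodes.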
Annotation of Cells
Automatic annotation was performed using the SingleR (v1.4.1) R package (citation) with the HPCA (citation) and DICE (citation) data sets as references and the fine label discriminators. Individual sequencing runs were subsetted to run through the SingleR algorithm in order to reduce memory demands. The outputs of all the SingleR analyses were collated and appended to the meta data of the Seurat object. Likewise, the ProjecTILs (v0.4.1) R package (citation) was used for automatic annotation as a partially orthogonal approach. Consensus annotation was derived from all 3 databases (HPCA, DICE, ProjecTILs) using a majority approach. No annotation designation was assigned to cells that returned NA for both SingleR and ProjecTILs. Mixed annotations were designated where SingleR identified non-T cells and ProjecTILs identified T cells. Cell type designations with fewer than 100 cells in the entire cohort were reduced to "other". Automated annotations were checked manually using canonical marker genes and gene enrichment analysis performed using the UCell (v1.0.0) R package (citation).
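A majority-vote consensus with rare-label collapsing can be sketched in base R as below. This is an illustration under stated assumptions, not the authors' implementation: the labels are invented, ties are broken by first appearance (an arbitrary choice), the "mixed" designation is not modeled, and the helper names `majority_label` and `collapse_rare` are hypothetical.

```r
# Hypothetical per-cell labels from the three references (made-up values).
labels <- data.frame(
  HPCA       = c("CD8 T", "CD8 T", "CD4 T", "NK",    NA),
  DICE       = c("CD8 T", "CD8 T", "CD8 T", "NK",    NA),
  ProjecTILs = c("CD8 T", "CD4 T", "CD4 T", "CD8 T", NA),
  stringsAsFactors = FALSE
)

# Most frequent non-NA label in a row; NA when no reference assigned a label.
# Ties go to the label that appears first (arbitrary for this sketch).
majority_label <- function(x) {
  x <- x[!is.na(x)]
  if (length(x) == 0) return(NA_character_)
  tab <- table(factor(x, levels = unique(x)))
  names(tab)[which.max(tab)]
}

consensus <- unname(apply(labels, 1, majority_label))

# Replace labels seen in fewer than min_cells cells with "other".
# The text above uses 100 cells; the toy data need a much smaller cutoff.
collapse_rare <- function(x, min_cells) {
  n <- table(x)
  rare <- names(n)[n < min_cells]
  x[x %in% rare] <- "other"
  x
}

final_labels <- collapse_rare(consensus, min_cells = 2)
```

Cells with no label from any reference stay `NA` rather than being folded into "other", which mirrors the "no annotation" designation described above.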
Addition of TCR data
The filtered contig annotation T cell receptor (TCR) data for available sequencing runs were loaded into the R global environment. Individual contigs were combined using the combineTCR() function of the scRepertoire (v1.3.2) R package (citation). Clonotypes were assigned to barcodes, and where multiple duplicate chains were present for an individual cell, they were filtered to select the top-expressing contig by read count. The clonotype data were then added to the Seurat object, with the proportion across individual patients being used to calculate frequency.
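The "top contig by read count" filter can be shown with a small base-R sketch. It does not reproduce scRepertoire's internals: the contig table is invented, and its column names (`barcode`, `chain`, `cdr3`, `reads`) are assumed to follow the Cell Ranger filtered contig annotation layout.

```r
# Hypothetical contig table (made-up sequences and read counts).
contigs <- data.frame(
  barcode = c("run1_AAAC", "run1_AAAC", "run1_AAAC", "run1_AAAG", "run1_AAAG"),
  chain   = c("TRA", "TRA", "TRB", "TRA", "TRB"),
  cdr3    = c("CAVSG", "CALGD", "CASSL", "CAASN", "CASRP"),
  reads   = c(120, 950, 800, 300, 410),
  stringsAsFactors = FALSE
)

# Sort so the contig with the most reads comes first within each
# barcode/chain pair, then keep only the first row of each pair.
ordered <- contigs[order(contigs$barcode, contigs$chain, -contigs$reads), ]
top <- ordered[!duplicated(ordered[, c("barcode", "chain")]), ]
rownames(top) <- NULL
```

Here the first barcode carries two TRA contigs, and only the one with 950 reads survives.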
Citations
As of right now, there is no citation associated with the assembled data set. However if using the data, please find the corresponding manuscript for each data set in the meta.data of the single-cell object. In addition, if using the processed data, feel free to modify the language in the methods section (above) and please cite the appropriate manuscripts of the software or references that were used.
Itemized List of the Software Used
Itemized List of Reference Data Used
Future Directions
There are areas we are actively hoping to develop to make the data set more useful - if you have other suggestions, please reach out using the contact information below.
Contact
Questions, comments, suggestions, please feel free to contact Nick Borcherding via this repository, email, or using twitter.
This dataset comprises results from one single-cell spatial experiment conducted on mouse brains. This experiment was performed using the Bruker Nanostring CosMx technology on 10 µm coronal brain sections from 8-month-old female WT, WT ACAN cKO, 5xFAD, and 5x ACAN cKO mice. The dataset is provided as two separate RDS files split by flowcell which include raw and corrected counts for the RNA data, along with comprehensive metadata. Metadata includes mouse genotype, sample ID, cell type annotations, and X-Y coordinates of each cell.
Sample preparation: Isopentane fresh-frozen brain hemispheres were embedded in optimal cutting temperature (OCT) compound (Tissue-Tek, Sakura Finetek, Torrance, CA), and 10 µm thick coronal sections were prepared using a cryostat (CM1950, Leica Biosystems, Deer Park, IL). Six hemibrains were mounted onto each VWR Superfrost Plus microscope slide (Avantor, 48311-703) and kept at -80°C until fixation. For transcriptomic analysis, n=3 mice per genotype were used for both the 5xFAD and 5xFAD ACAN cKO groups, while n=4 for WT and n=2 for WT ACAN cKO. Tissues were processed according to the Nanostring CosMx fresh-frozen slide preparation manual for RNA assay (NanoString University).
Data processing: Spatial transcriptomics datasets were filtered using the AtoMx RNA Quality Control module to flag outlier negative probes (control probes targeting non-existent sequences to quantify non-specific hybridization), lowly-expressing cells, FOVs, and target genes. Datasets were then normal...
Single-cell spatial transcriptomics of ACAN cKO in WT and 5xFAD mice
Dataset DOI: 10.5061/dryad.z612jm6pw
Due to the large file size, the R object has been split by flowcell into two separate files (5xACANcKO_RNA_slide1.rds and 5xACANcKO_RNA_slide2.rds). The two files should be loaded into the R workspace and combined using the merge() function. For merging and downstream analysis, we recommend using a high performance computing system and at least 64GB of RAM for optimal performance. Data were analyzed using the R package Seurat. Sample metadata are stored in seurat@meta.data.
Single-cell spatial transcriptomics dataset
Rownames of metadata (accessed using rownames(seurat@meta.data)) contain unique identifiers for each single cell, formatted as c_[slide][fov][cell]. Additional metadata columns are described below:
https://spdx.org/licenses/CC0-1.0.html
Development of the dorsal aorta is a key step in the establishment of the adult blood-forming system since hematopoietic stem and progenitor cells (HSPCs) arise from ventral aortic endothelium in all vertebrate animals studied. Work in zebrafish has demonstrated that arterial and venous endothelial precursors arise from distinct subsets of lateral plate mesoderm. Here, we profile the transcriptome of the earliest detectable endothelial cells (ECs) during zebrafish embryogenesis to demonstrate that tissue-specific EC programs initiate much earlier than previously appreciated, by the end of gastrulation. Classic studies in the chick embryo showed that paraxial mesoderm generates a subset of somite-derived endothelial cells (SDECs) that incorporate into the dorsal aorta to replace HSPCs as they exit the aorta and enter circulation. We describe a conserved program in the zebrafish, where a rare population of endothelial precursors delaminates from the dermomyotome to incorporate exclusively into the developing dorsal aorta. Although SDECs lack hematopoietic potential, they act as a local niche to support the emergence of HSPCs from neighboring hemogenic endothelium. Thus, at least three subsets of ECs contribute to the developing dorsal aorta: vascular ECs, hemogenic ECs, and SDECs. Taken together, our findings indicate that the distinct spatial origins of endothelial precursors dictate different cellular potentials within the developing dorsal aorta.
Methods
Single-cell RNA sample preparation
After FACS, total cell concentration and viability were ascertained using a TC20 Automated Cell Counter (Bio-Rad). Samples were then resuspended in 1XPBS with 10% BSA at a concentration between 800-3000 per ml. Samples were loaded on the 10X Chromium system and processed as per manufacturer’s instructions (10X Genomics). Single cell libraries were prepared as per the manufacturer’s instructions using the Single Cell 3’ Reagent Kit v2 (10X Genomics).
Single cell RNA-seq libraries and barcode amplicons were sequenced on an Illumina HiSeq platform.
Single-cell RNA sequencing analysis
The Chromium 3’ sequencing libraries were generated using Chromium Single Cell 3’ Chip kit v3 and sequenced (the instrument is not stated). The Illumina FASTQ files were used to generate filtered matrices using CellRanger (10X Genomics) with default parameters and imported into R for exploration and statistical analysis using the Seurat package (La Manno et al., 2018). Counts were normalized according to total expression, multiplied by a scale factor (10,000), and log-transformed. For cell cluster identification and visualization, gene expression values were also scaled according to highly variable genes after controlling for unwanted variation generated by sample identity. Cell clusters were identified based on UMAP of the first 14 principal components of PCA using Seurat’s FindClusters method, with the original Louvain algorithm and a resolution parameter value of 0.5. To find cluster marker genes, Seurat’s FindAllMarkers method was used. Only genes exhibiting a significant (adjusted p-value < 0.05) minimal average absolute log2-fold change of 0.2 between each of the clusters and the rest of the dataset were considered differentially expressed. To merge individual datasets and to remove batch effects, the Seurat v3 Integration and Label Transfer standard workflow (Stuart et al., 2019) was used.
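The normalization and marker-filtering rules described above can be made concrete with a short base-R sketch. It is illustrative only: the counts and marker statistics are invented, `log1p` (natural log of 1 + x) is assumed to be the log transform because that is what Seurat's log-normalization uses, and the marker table's column names are hypothetical stand-ins for a FindAllMarkers-style result.

```r
# Toy raw count matrix: 3 genes x 2 cells (made-up values).
counts <- matrix(
  c(1, 3, 6,
    0, 5, 5),
  nrow = 3,
  dimnames = list(c("geneA", "geneB", "geneC"), c("cell1", "cell2"))
)

# Divide each cell by its total counts, multiply by the scale factor,
# then apply log1p.
scale_factor <- 10000
normalized <- log1p(sweep(counts, 2, colSums(counts), "/") * scale_factor)

# Hypothetical marker table (values invented for illustration).
markers <- data.frame(
  gene       = c("geneA", "geneB", "geneC"),
  avg_log2FC = c(0.85, -0.10, -0.40),
  p_val_adj  = c(0.001, 0.020, 0.300),
  stringsAsFactors = FALSE
)

# Keep genes with adjusted p-value < 0.05 and |log2 fold change| >= 0.2.
significant <- markers[markers$p_val_adj < 0.05 &
                         abs(markers$avg_log2FC) >= 0.2, ]
```

In the toy table, geneB fails the fold-change cutoff and geneC fails the p-value cutoff, so only geneA is retained.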
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The power of single-cell RNA sequencing (scRNA-seq) in detecting cell heterogeneity or developmental process is becoming more and more evident every day. The granularity of this knowledge is further propelled when combining two batches of scRNA-seq into a single large dataset. This strategy is however hampered by technical differences between these batches. Typically, these batch effects are resolved by matching similar cells across the different batches. Current approaches, however, do not take into account that we can constrain this matching further as cells can also be matched on their cell type identity. We use an auto-encoder to embed two batches in the same space such that cells are matched. To accomplish this, we use a loss function that preserves: (1) cell-cell distances within each of the two batches, as well as (2) cell-cell distances between two batches when the cells are of the same cell-type. The cell-type guidance is unsupervised, i.e., a cell-type is defined as a cluster in the original batch. We evaluated the performance of our cluster-guided batch alignment (CBA) using pancreas and mouse cell atlas datasets, against six state-of-the-art single cell alignment methods: Seurat v3, BBKNN, Scanorama, Harmony, LIGER, and BERMUDA. Compared to other approaches, CBA preserves the cluster separation in the original datasets while still being able to align the two datasets. We confirm that this separation is biologically meaningful by identifying relevant differential expression of genes for these preserved clusters.