51 datasets found

Processed, annotated, seurat object
data.niaid.nih.gov
zenodo.org
Updated Nov 16, 2023
Cenk Celik (2023). Processed, annotated, seurat object [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_7608211
Dataset updated
Nov 16, 2023
Dataset provided by
Guillaume Thibault
Cenk Celik
Description
The dataset contains an integrated, annotated Seurat v4 object. One can load the dataset into the R environment using the code below:

seurat_obj <- readRDS('PATH/TO/DOWNLOAD/seurat.rds')

The object has three assays: (I) RNA, (II) SCT and (III) integrated.
Individual-donor scRNA-Seq datasets, as Seurat 4.0.5 objects
explore.openaire.eu
data.niaid.nih.gov
Updated Mar 26, 2022
Alexandros Sountoulidis; Christos Samakovlis (2022). Individual-donor scRNA-Seq datasets, as Seurat 4.0.5 objects [Dataset]. http://doi.org/10.5281/zenodo.6386451
Unique identifier
https://doi.org/10.5281/zenodo.6386451
Dataset updated
Mar 26, 2022
Authors
Alexandros Sountoulidis; Christos Samakovlis
Description
The provided datasets correspond to the analyses of individual-donor single-cell RNA sequencing (scRNA-Seq) datasets, before their integration. The datasets have been saved as Seurat v4.0.5 objects. For clustering, we used default settings in Seurat 4.0.5 (resolution 0.8) and increased the resolution, if necessary, to separate proximal and distal epithelium. The *_clusters.pdf files show the suggested clusters in the individual datasets and the _indiv_anno1.pdf files show the cell annotations according to the 84 cell states described in the study titled "Developmental origins of cell heterogeneity in the human lung" (1st preprint version doi: https://doi.org/10.1101/2022.01.11.475631). The "_cluster_annotations.csv" files provide information about the suggested annotations of the clusters. The "*_object_raw_and_log_counts.RData" objects contain the metadata and the UMI counts [raw and log2(counts+1)] for each donor scRNA-Seq dataset.
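
The log2(counts+1) transform mentioned above can be illustrated with a minimal Python sketch. The function name and the toy counts are assumptions made for illustration; this is not code from the dataset authors, who worked in R with Seurat.

```python
import math

def log2_counts(counts):
    """Return log2(count + 1) for each raw UMI count."""
    return [math.log2(c + 1) for c in counts]

# Hypothetical raw UMI counts for one gene across four cells.
raw = [0, 1, 3, 7]
logged = log2_counts(raw)
print(logged)
```

Adding 1 before taking the logarithm keeps zero counts defined: a count of 0 maps to 0.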
pbmc single cell RNA-seq matrix
zenodo.org
csv
Updated May 4, 2021
Samuel Buchet; Francesco Carbone; Morgan Magnin; Mickaël Ménager; Olivier Roux (2021). pbmc single cell RNA-seq matrix [Dataset]. http://doi.org/10.5281/zenodo.4730807
Available download formats: csv
Unique identifier
https://doi.org/10.5281/zenodo.4730807
Dataset updated
May 4, 2021
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Samuel Buchet; Francesco Carbone; Morgan Magnin; Mickaël Ménager; Olivier Roux
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Single cell RNA-sequencing dataset of peripheral blood mononuclear cells (pbmc: T, B, NK and monocytes) extracted from two healthy donors.

Cells labeled as C26 come from a 30-year-old female and cells labeled as C27 come from a 53-year-old male. Cells were isolated from blood using Ficoll. Samples were sequenced using standard 3' v3 chemistry protocols by 10x Genomics. Cell Ranger v4.0.0 was used for the processing, and reads were aligned to the Ensembl GRCh38 human genome (GRCg38_r98-ensembl_Sept2019). QC metrics were calculated on the count matrix generated by Cell Ranger (filtered_feature_bc_matrix). Cells with fewer than 3 genes per cell, fewer than 500 reads per cell and more than 20% mitochondrial genes were discarded.
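
The three QC thresholds above can be written as a small filter. This is a minimal Python sketch, not the authors' R code; it assumes a cell is discarded as soon as it fails any one threshold, and the function name, parameter names, and per-cell metrics are hypothetical.

```python
def passes_qc(n_genes, n_reads, pct_mito,
              min_genes=3, min_reads=500, max_pct_mito=20.0):
    """Return True if a cell survives all three thresholds."""
    return (n_genes >= min_genes
            and n_reads >= min_reads
            and pct_mito <= max_pct_mito)

# Hypothetical per-cell metrics: (genes detected, reads, % mitochondrial).
cells = {
    "cell_a": (1200, 4500, 5.0),
    "cell_b": (2, 800, 3.0),       # too few genes
    "cell_c": (900, 300, 4.0),     # too few reads
    "cell_d": (1500, 6000, 35.0),  # too much mitochondrial signal
}
kept = [name for name, metrics in cells.items() if passes_qc(*metrics)]
print(kept)
```

Keeping the thresholds as keyword arguments makes it easy to reuse the same check with other cut-offs.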

The processing steps were performed with the R package Seurat (https://satijalab.org/seurat/), including sample integration, data normalisation and scaling, dimensional reduction, and clustering. The SCTransform method was adopted for the normalisation and scaling steps. The clustered cells were manually annotated using known cell type markers.

Files content:

- raw_dataset.csv: raw gene counts

- normalized_dataset.csv: normalized gene counts (single cell matrix)

- cell_types.csv: cell types identified from annotated cell clusters

- cell_types_macro.csv: cell macro types

- UMAP_coordinates.csv: 2d cell coordinates computed with UMAP algorithm in Seurat
Skin sc-RNASeq from seven body sites (face, scalp, axilla, palmoplantar,...
plus.figshare.com
bin
Updated Mar 11, 2025
Lam C Tsoi; Rachael Bogle; Johann Gudjonsson; Meri Oliva; Bridget Riley-Gillis (2025). Skin sc-RNASeq from seven body sites (face, scalp, axilla, palmoplantar, arm, leg, and back) [Dataset]. http://doi.org/10.25452/figshare.plus.25696620.v2
Available download formats: bin
Unique identifier
https://doi.org/10.25452/figshare.plus.25696620.v2
Dataset updated
Mar 11, 2025
Dataset provided by
Figshare+
Authors
Lam C Tsoi; Rachael Bogle; Johann Gudjonsson; Meri Oliva; Bridget Riley-Gillis
License
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Description
This sc-RNAseq dataset is composed of disease-unaffected epidermal samples from 96 skin biopsies: 18 from published datasets (GSE173706, GSE249279) and 78 newly generated ones. Biopsy sample and protocol details, and curated cell-type signature genes, are available in the scRNASeq_source_info_FigShare spreadsheet of this dataset. Processed Seurat objects are provided herein. Raw data are available in SRA (id PRJNA1054546). Biopsies originated from seven body sites (face, scalp, axilla, palmoplantar, arm, leg, and back). The skin biopsies were separated into epidermis and dermis before being dissociated and enriched for various cell fractions (keratinocytes, fibroblasts, and endothelial cells) and immune cells (myeloid and lymphoid cells) to upsample rare cell types. In total, across body sites, 274,834 cells were profiled, including 96,194 keratinocytes.

Seurat v3.0 was used to normalize, scale, and reduce the dimensionality of the data. Low-quality cells containing fewer than 200 genes per cell or more than 5,000 genes per cell were filtered out. Cells containing more mitochondrial genes than the permitted quantile of 0.05 were removed. Ambient RNA was removed using the R package SoupX v1.6.2. Doublets were removed using scDblFinder v1.12.0. Principal components (PCs) were obtained from the top 2,000 variable genes, and the Uniform Manifold Approximation and Projection (UMAP) dimensional reduction technique was applied to the dataset reduced to the top 30 PCs. Batch effect correction was performed with harmony v1.0, using donor as batch. After batch correction, cells were clustered using shared nearest neighbor modularity optimization-based clustering. Cluster marker genes were identified with FindAllMarkers; the cell type corresponding to each cluster was identified by comparing marker genes to curated cell-type signature genes.

Differential expression by keratinocyte subtype was performed with the Seurat (v4.3.0) FindMarkers function by comparing each keratinocyte subtype to non-keratinocyte clusters. The log fold-change of the average expression between a keratinocyte subtype cluster and the rest of the clusters is used as the keratinocyte-subtype gene expression statistic.
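
As a rough illustration of a log fold-change of average expression between one cluster and the rest, here is a minimal Python sketch. It is not Seurat's FindMarkers implementation: the pseudocount of 1, the base-2 logarithm, the function names, and the toy values are all assumptions made for illustration.

```python
import math

def mean(values):
    """Arithmetic mean of a non-empty list."""
    return sum(values) / len(values)

def log2_fold_change(in_cluster, rest, pseudocount=1.0):
    """log2 ratio of average expression, with a pseudocount to avoid log(0)."""
    return (math.log2(mean(in_cluster) + pseudocount)
            - math.log2(mean(rest) + pseudocount))

# Hypothetical expression of one gene in a subtype cluster vs. all other cells.
subtype_expr = [3.0, 3.0]
other_expr = [1.0, 1.0]
lfc = log2_fold_change(subtype_expr, other_expr)
print(lfc)
```

A positive value means the gene is, on average, more highly expressed in the subtype cluster than in the comparison group.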
Seurat objects for multiome analysis of neuroblastoma cell lines - 4/4
data.mendeley.com
Updated Jul 25, 2024
Richard Guyer (2024). Seurat objects for multiome analysis of neuroblastoma cell lines - 4/4 [Dataset]. http://doi.org/10.17632/cp4d7t74vb.1
Unique identifier
https://doi.org/10.17632/cp4d7t74vb.1
Dataset updated
Jul 25, 2024
Authors
Richard Guyer
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
RDS files containing processed Seurat objects for multiome analysis of neuroblastoma cell lines. File names reflect the cell line.
Dan R Laks Code of Seurat analysis 4 Primary GBM from Yuan, Sims, 2018
dataverse.harvard.edu
Updated Nov 21, 2021
Dan Laks (2021). Dan R Laks Code of Seurat analysis 4 Primary GBM from Yuan, Sims, 2018 [Dataset]. http://doi.org/10.7910/DVN/SYP8LH
Unique identifier
https://doi.org/10.7910/DVN/SYP8LH
Dataset updated
Nov 21, 2021
Dataset provided by
Harvard Dataverse
Authors
Dan Laks
License
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Description
RStudio and Seurat package analysis of 4 primary GBM
Data from: Large-scale integration of single-cell transcriptomic data...
data.niaid.nih.gov
dataone.org
zip
Updated Dec 14, 2021
David McKellar; Iwijn De Vlaminck; Benjamin Cosgrove (2021). Large-scale integration of single-cell transcriptomic data captures transitional progenitor states in mouse skeletal muscle regeneration [Dataset]. http://doi.org/10.5061/dryad.t4b8gtj34
Available download formats: zip
Unique identifier
https://doi.org/10.5061/dryad.t4b8gtj34
Dataset updated
Dec 14, 2021
Dataset provided by
Cornell University
Authors
David McKellar; Iwijn De Vlaminck; Benjamin Cosgrove
License
https://spdx.org/licenses/CC0-1.0.html
Description
Skeletal muscle repair is driven by the coordinated self-renewal and fusion of myogenic stem and progenitor cells. Single-cell gene expression analyses of myogenesis have been hampered by the poor sampling of rare and transient cell states that are critical for muscle repair, and do not inform the spatial context that is important for myogenic differentiation. Here, we demonstrate how large-scale integration of single-cell and spatial transcriptomic data can overcome these limitations. We created a single-cell transcriptomic dataset of mouse skeletal muscle by integration, consensus annotation, and analysis of 23 newly collected scRNAseq datasets and 88 publicly available single-cell (scRNAseq) and single-nucleus (snRNAseq) RNA-sequencing datasets. The resulting dataset includes more than 365,000 cells and spans a wide range of ages, injury, and repair conditions. Together, these data enabled identification of the predominant cell types in skeletal muscle, and resolved cell subtypes, including endothelial subtypes distinguished by vessel-type of origin, fibro/adipogenic progenitors defined by functional roles, and many distinct immune populations. The representation of different experimental conditions and the depth of transcriptome coverage enabled robust profiling of sparsely expressed genes. We built a densely sampled transcriptomic model of myogenesis, from stem cell quiescence to myofiber maturation and identified rare, transitional states of progenitor commitment and fusion that are poorly represented in individual datasets. We performed spatial RNA sequencing of mouse muscle at three time points after injury and used the integrated dataset as a reference to achieve a high-resolution, local deconvolution of cell subtypes. We also used the integrated dataset to explore ligand-receptor co-expression patterns and identify dynamic cell-cell interactions in muscle injury response. 
We provide a public web tool to enable interactive exploration and visualization of the data. Our work supports the utility of large-scale integration of single-cell transcriptomic data as a tool for biological discovery.

Methods

Mice. The Cornell University Institutional Animal Care and Use Committee (IACUC) approved all animal protocols, and experiments were performed in compliance with its institutional guidelines. Adult C57BL/6J mice (Mus musculus) were obtained from Jackson Laboratories (#000664; Bar Harbor, ME) and were used at 4-7 months of age. Aged C57BL/6J mice were obtained from the National Institute of Aging (NIA) Rodent Aging Colony and were used at 20 months of age. For new scRNAseq experiments, female mice were used in each experiment.

Mouse injuries and single-cell isolation. To induce muscle injury, both tibialis anterior (TA) muscles of old (20 months) C57BL/6J mice were injected with 10 µl of notexin (10 µg/ml; Latoxan; France). At 0, 1, 2, 3.5, 5, or 7 days post-injury (dpi), mice were sacrificed and TA muscles were collected and processed independently to generate single-cell suspensions. Muscles were digested with 8 mg/ml Collagenase D (Roche; Switzerland) and 10 U/ml Dispase II (Roche; Switzerland), followed by manual dissociation to generate cell suspensions. Cell suspensions were sequentially filtered through 100 and 40 μm filters (Corning Cellgro #431752 and #431750) to remove debris. Erythrocytes were removed through incubation in erythrocyte lysis buffer (IBI Scientific #89135-030).

Single-cell RNA-sequencing library preparation. After digestion, single-cell suspensions were washed and resuspended in 0.04% BSA in PBS at a concentration of 10^6 cells/ml. Cells were counted manually with a hemocytometer to determine their concentration. Single-cell RNA-sequencing libraries were prepared using the Chromium Single Cell 3’ reagent kit v3 (10x Genomics, PN-1000075; Pleasanton, CA) following the manufacturer’s protocol. Cells were diluted into the Chromium Single Cell A Chip to yield a recovery of 6,000 single-cell transcriptomes. After preparation, libraries were sequenced on a NextSeq 500 (Illumina; San Diego, CA) using 75-cycle high output kits (Index 1 = 8, Read 1 = 26, and Read 2 = 58). Details on estimated sequencing saturation and the number of reads per sample are shown in Sup. Data 1.

Spatial RNA sequencing library preparation. Tibialis anterior muscles of adult (5 mo) C57BL/6J mice were injected with 10 µl notexin (10 µg/ml) at 2, 5, and 7 days prior to collection. Upon collection, tibialis anterior muscles were isolated, embedded in OCT, and frozen fresh in liquid nitrogen. Spatially tagged cDNA libraries were built using the Visium Spatial Gene Expression 3’ Library Construction v1 Kit (10x Genomics, PN-1000187; Pleasanton, CA) (Fig. S7). Optimal tissue permeabilization time for 10 µm thick sections was found to be 15 minutes using the 10x Genomics Visium Tissue Optimization Kit (PN-1000193). H&E stained tissue sections were imaged using a Zeiss PALM MicroBeam laser capture microdissection system, and the images were stitched and processed using Fiji ImageJ software. cDNA libraries were sequenced on an Illumina NextSeq 500 using 150-cycle high output kits (Read 1=28bp, Read 2=120bp, Index 1=10bp, and Index 2=10bp). Frames around the capture area on the Visium slide were aligned manually and spots covering the tissue were selected using Loop Browser v4.0.0 software (10x Genomics). Sequencing data was then aligned to the mouse reference genome (mm10) using the spaceranger v1.0.0 pipeline to generate a feature-by-spot-barcode expression matrix (10x Genomics).

Download and alignment of single-cell RNA sequencing data. For all samples available via SRA, parallel-fastq-dump (github.com/rvalieris/parallel-fastq-dump) was used to download raw .fastq files. Samples which were only available as .bam files were converted to .fastq format using bamtofastq from 10x Genomics (github.com/10XGenomics/bamtofastq). Raw reads were aligned to the mm10 reference using cellranger (v3.1.0).

Preprocessing and batch correction of single-cell RNA sequencing datasets. First, ambient RNA signal was removed using the default SoupX (v1.4.5) workflow (autoEstCounts and adjustCounts; github.com/constantAmateur/SoupX). Samples were then preprocessed using the standard Seurat (v3.2.1) workflow (NormalizeData, ScaleData, FindVariableFeatures, RunPCA, FindNeighbors, FindClusters, and RunUMAP; github.com/satijalab/seurat). Cells with fewer than 750 features, fewer than 1000 transcripts, or more than 30% of unique transcripts derived from mitochondrial genes were removed. After preprocessing, DoubletFinder (v2.0) was used to identify putative doublets in each dataset, individually. BCmvn optimization was used for pK parameterization. Estimated doublet rates were computed by fitting the total number of cells after quality filtering to a linear regression of the expected doublet rates published in the 10x Chromium handbook. Estimated homotypic doublet rates were also accounted for using the modelHomotypic function. The default pN value (0.25) was used. Putative doublets were then removed from each individual dataset. After preprocessing and quality filtering, we merged the datasets and performed batch correction with three tools, independently: Harmony (github.com/immunogenomics/harmony) (v1.0), Scanorama (github.com/brianhie/scanorama) (v1.3), and BBKNN (github.com/Teichlab/bbknn) (v1.3.12). We then used Seurat to process the integrated data. After initial integration, we removed the noisy cluster and re-integrated the data using each of the three batch-correction tools.
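
The doublet-rate estimate described above (a linear regression of expected doublet rate on the number of recovered cells) might look roughly like the following Python sketch. The table values are placeholders chosen for illustration, not the numbers published by 10x Genomics, and the function names are hypothetical; the authors' own analysis was done in R.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def estimate_doublet_rate(n_cells, slope, intercept):
    """Predict the expected doublet rate for a dataset of n_cells cells."""
    return slope * n_cells + intercept

# Placeholder table (cells recovered, expected doublet rate); replace with
# the values from the 10x Chromium handbook before real use.
table_cells = [1000, 2000, 4000]
table_rates = [0.01, 0.02, 0.04]

slope, intercept = fit_line(table_cells, table_rates)
rate = estimate_doublet_rate(3000, slope, intercept)
print(round(rate, 4))
```

The fitted rate for a dataset would then be supplied to the doublet-detection tool as the expected proportion of doublets.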

Cell type annotation. Cell types were determined for each integration method independently. For Harmony and Scanorama, dimensions accounting for 95% of the total variance were used to generate SNN graphs (Seurat::FindNeighbors). Louvain clustering was then performed on the output graphs (including the corrected graph output by BBKNN) using Seurat::FindClusters. A clustering resolution of 1.2 was used for Harmony (25 initial clusters), BBKNN (28 initial clusters), and Scanorama (38 initial clusters). Cell types were determined based on expression of canonical genes (Fig. S3). Clusters which had similar canonical marker gene expression patterns were merged.
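
Choosing the dimensions that account for 95% of the total variance can be sketched as a cumulative sum over per-dimension variances. This Python example is an assumption-laden illustration: it takes "total variance" to mean the sum over the computed dimensions, and the function name and standard deviations are hypothetical.

```python
def n_dims_for_variance(stdevs, threshold=0.95):
    """Smallest number of leading dimensions whose variance share meets threshold."""
    variances = [s * s for s in stdevs]
    total = sum(variances)
    running = 0.0
    for i, variance in enumerate(variances, start=1):
        running += variance
        if running / total >= threshold:
            return i
    return len(variances)

# Hypothetical per-dimension standard deviations, sorted in decreasing order.
stdevs = [3.0, 2.0, 1.0, 1.0]
n_dims = n_dims_for_variance(stdevs)
print(n_dims)
```

The returned count would then determine which embedding dimensions are passed on to graph construction.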

Pseudotime workflow. Cells were subset based on the consensus cell types between all three integration methods. Harmony embedding values from the dimensions accounting for 95% of the total variance were used for further dimensional reduction with PHATE, using phateR (v1.0.4) (github.com/KrishnaswamyLab/phateR).

Deconvolution of spatial RNA sequencing spots. Spot deconvolution was performed using the deconvolution module in BayesPrism (previously known as “Tumor microEnvironment Deconvolution”, TED, v1.0; github.com/Danko-Lab/TED). First, myogenic cells were re-labeled, according to binning along the first PHATE dimension, as “Quiescent MuSCs” (bins 4-5), “Activated MuSCs” (bins 6-7), “Committed Myoblasts” (bins 8-10), and “Fusing Myocytes” (bins 11-18). Culture-associated muscle stem cells were ignored and myonuclei labels were retained as “Myonuclei (Type IIb)” and “Myonuclei (Type IIx)”. Next, highly and differentially expressed genes across the 25 groups of cells were identified with differential gene expression analysis using Seurat (FindAllMarkers, using Wilcoxon Rank Sum Test; results in Sup. Data 2). The resulting genes were filtered based on average log2-fold change (avg_logFC > 1) and the percentage of cells within the cluster which express each gene (pct.expressed > 0.5), yielding 1,069 genes. Mitochondrial and ribosomal protein genes were also removed from this list, in line with recommendations in the BayesPrism vignette. For each of the cell types, mean raw counts were calculated across the 1,069 genes to generate a gene expression profile for BayesPrism. Raw counts for each spot were then passed to the run.Ted function, using
A Spatial Transcriptomics Atlas of the Malaria-infected Liver Indicates a...
data.niaid.nih.gov
zenodo.org
Updated Sep 20, 2023
Tales Pascini (2023). A Spatial Transcriptomics Atlas of the Malaria-infected Liver Indicates a Crucial Role for Lipid Metabolism and Hotspots of Inflammatory Cell Infiltration [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_8328678
Dataset updated
Sep 20, 2023
Dataset provided by
Miren Urrutia Iturritza
Franziska Hildebrandt
Mengxiao He
Elisa Semle
Joel Vega-Rodriguez
Tales Pascini
Johan Ankarklev
Emma R. Andersson
Sami Saarenpää
Noémi Van Hul
Charlotte L. Scott
Joakim Lundeberg
Bavo Vanneste
Christian Zwicker
Description
Dataset created in the study "A Spatial Transcriptomics Atlas of the Malaria-infected Liver Indicates a Crucial Role for Lipid Metabolism and Hotspots of Inflammatory Cell Infiltration"

Structure

ST_berghei_liver

contains data generated during stpipeline analysis and imaging on the 2k arrays Spatial Transcriptomics platform as well as data necessary for and from hepaquery analysis. These samples include 38 sections in total, of which 8 are from mice (n=4) infected with sporozoites for 12h, 5 sections from control mice (n=3) at 12h, 7 sections from mice (n=4) infected with sporozoites for 24h and 4 sections from control mice (n=3) for 24h as well as 8 samples of mice (n=2) infected with sporozoites for 38h and control mice (n=2) for 38h.

count contains gene expression matrix output from stpipeline in .tsv format

spotfiles contains coordinate files for count matrices

images contains scaled H&E, Fluorescence (FL) and annotated H&E images (from FL annotations) scaled to 10% of the original image size.

masks contains image masks for hepaquery analysis

distances contains distance measurements from original section sorted by timepoint as well as combined across timepoints

cluster contains clustering information across spatial positions used in spatial enrichment analysis

STUtiility_mus_pb_ST.RDS describes seurat object generated using the STUtility package using ST data of the 38 liver sections of which the data is stored in ST_berghei_liver

visium_berghei_liver

contains data generated with the spaceranger pipeline and imaging using the Visium spatial transcriptomics platform. These samples include 8 sections in total, of which 1 was infected with sporozoites for 12h, 1 control section at 12h, 1 section infected with sporozoites for 24h and 1 control section at 24h as well as 2 sporozoite-infected sections and 2 control sections at 38h.

V10S29-135_A1 contains spaceranger output for section 1 for infected and control sections at 38h post-infection

V10S29-135_B1 contains spaceranger output for section 1 for infected and control sections at 12h post-infection

V10S29-135_C1 contains spaceranger output for section 1 for infected and control sections at 24h post-infection

V10S29-135_D1 contains spaceranger output for section 2 for infected and control sections at 38h post-infection

se_visium.RDS describes seurat object generated using the STUtility package using ST data of the 38 liver sections of which the data is stored in visium_berghei_liver

snSeq_berghei_liver

contains data generated with the cellranger pipeline and imaging using the Visium spatial transcriptomics platform. These samples include single nuclei of 2 infected and control mice after 12h, 2 infected and control mice after 24h, 2 infected and control mice after 38h, and 2 uninfected mice prior to a challenge.

cellranger_cnt_out contains feature count matrix information from cell ranger output

final_merged_curated_annotations_270623.RDS describes seurat object generated using the STUtility package using ST data of the 38 liver sections of which the data is stored in snSeq_berghei_liver.tar.gz

raw images.zip contains raw images for supplementary figures 20-22

adjusted images.zip contains brightness and contrast adjusted images for supplementary figures 20-22
utility: Collection of Tumor-Infiltrating Lymphocyte Single-Cell Experiments...
zenodo.org
zip
Updated Apr 6, 2022
Nicholas Borcherding (2022). utility: Collection of Tumor-Infiltrating Lymphocyte Single-Cell Experiments with TCR [Dataset]. http://doi.org/10.5281/zenodo.4995299
Available download formats: zip
Unique identifier
https://doi.org/10.5281/zenodo.4995299
Dataset updated
Apr 6, 2022
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Nicholas Borcherding
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Introduction

The original intent of assembling a data set of publicly available tumor-infiltrating T cells (TILs) with paired TCR sequencing was to expand and improve the scRepertoire R package. However, after some discussion, we decided to release the data set for everyone. A complete summary of the sequencing runs and the sample information can be found in the meta data of the Seurat object. This repository contains the code for the initial processing and annotating of the data set (we are calling this version 0.0.1). This involves several steps: 1) loading the respective GE data, 2) harmonizing the data by sample and cohort information, 3) iterating through automatic annotation, 4) unifying annotation via manual inspection and enrichment analysis, and 5) adding the TCR information.

Methods

Single-Cell Data Processing

The filtered gene matrices output by the Cell Ranger align function for individual sequencing runs (10x Genomics, Pleasanton, CA) were loaded into the R global environment. For each sequencing run, cell barcodes were given a unique prefix to prevent issues with duplicate barcodes. The results were then ported into individual Seurat objects (citation), where cells with > 10% mitochondrial genes and/or 2.5x natural log distribution of counts were excluded for quality control purposes. At the individual sequencing run level, doublets were estimated using the scDblFinder (v1.4.0) R package. All the sequencing runs across experiments were merged into a single Seurat object using the merge() function. All the data were then normalized using the default settings, and 2,000 variable genes were identified using the "vst" method. Next, the data were scaled with the default settings and 40 principal components were calculated. Data were integrated using the harmony (v1.0.0) R package (citation), using both cohort and sample information to correct for batch effect with up to 20 iterations. The UMAP was created using the RunUMAP() function in Seurat, using 20 dimensions of the harmony calculations.
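
The barcode-prefixing step above exists because the same barcode sequence can occur in more than one sequencing run. A minimal Python sketch of the idea follows; the underscore separator, the run identifiers, the barcodes, and the function name are assumptions for illustration, not the repository's R code.

```python
def prefix_barcodes(barcodes, run_id):
    """Prepend a run-specific prefix so barcodes stay unique after merging."""
    return [f"{run_id}_{barcode}" for barcode in barcodes]

# Two hypothetical runs that share one raw barcode.
run1 = prefix_barcodes(["AAACCTG-1", "AAACGGG-1"], "run1")
run2 = prefix_barcodes(["AAACCTG-1", "TTTGTCA-1"], "run2")

merged = run1 + run2
print(merged)
```

After prefixing, the shared raw barcode yields two distinct cell identifiers, so merging the runs does not conflate cells.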

Annotation of Cells

Automatic annotation was performed using the SingleR (v1.4.1) R package (citation) with the HPCA (citation) and DICE (citation) data sets as references and the fine label discriminators. Individual sequencing runs were subsetted to run through the SingleR algorithm in order to reduce memory demands. The outputs of all the SingleR analyses were collated and appended to the meta data of the Seurat object. Likewise, the ProjecTILs (v0.4.1) R package (citation) was used for automatic annotation as a partially orthogonal approach. Consensus annotation was derived from all 3 databases (HPCA, DICE, ProjecTILs) using a majority approach. No annotation was assigned to cells that returned NA for both SingleR and ProjecTILs. Mixed annotations were assigned where SingleR identified non-T cells and ProjecTILs identified T cells. Cell type designations with fewer than 100 cells in the entire cohort were reduced to "other". Automated annotations were checked manually using canonical marker genes and gene enrichment analysis performed using the UCell (v1.0.0) R package (citation).
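
A majority-vote consensus followed by collapsing rare labels could be sketched as below. This is a simplified Python illustration, not the repository's procedure: the tie handling ("unresolved"), the placeholder label strings, the function names, and the toy calls are assumptions, and the mixed-annotation rule is not modeled.

```python
from collections import Counter

def consensus_label(labels):
    """Majority vote over per-reference labels; None means no call."""
    votes = Counter(label for label in labels if label is not None)
    if not votes:
        return "no annotation"
    label, count = votes.most_common(1)[0]
    # Assumption: a label needs at least two votes to count as a majority.
    return label if count >= 2 else "unresolved"

def collapse_rare(labels, min_cells=100,
                  protected=("no annotation", "unresolved")):
    """Rename labels carried by fewer than min_cells cells to 'other'."""
    counts = Counter(labels)
    return [label if label in protected or counts[label] >= min_cells
            else "other" for label in labels]

# Hypothetical calls per cell from three references.
calls = [
    ("CD8 T", "CD8 T", "CD4 T"),
    ("CD8 T", "CD8 T", "CD8 T"),
    ("NK", "NK", None),
    (None, None, None),
    ("B", "CD4 T", "NK"),
]
consensus = [consensus_label(c) for c in calls]
collapsed = collapse_rare(consensus, min_cells=2)  # small cutoff for the toy data
print(collapsed)
```

With real data the cutoff would be the 100 cells stated above; the toy example lowers it so the effect is visible on five cells.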

Addition of TCR data

The filtered contig annotation T cell receptor (TCR) data for available sequencing runs were loaded into the R global environment. Individual contigs were combined using the combineTCR() function of the scRepertoire (v1.3.2) R package (citation). Clonotypes were assigned to barcodes, and where multiple duplicate chains were present for an individual cell, they were filtered to select the top-expressing contig by read count. The clonotype data were then added to the Seurat object, with the proportion across individual patients being used to calculate frequency.
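
Selecting the top-expressing contig per cell and chain by read count can be illustrated with a short Python sketch. It does not reproduce scRepertoire's combineTCR(); the record fields (barcode, chain, reads, contig_id), the function name, and the records themselves are hypothetical.

```python
def top_contig_per_chain(contigs):
    """Keep, for each (barcode, chain) pair, the contig with the most reads."""
    best = {}
    for contig in contigs:
        key = (contig["barcode"], contig["chain"])
        if key not in best or contig["reads"] > best[key]["reads"]:
            best[key] = contig
    return list(best.values())

# Hypothetical contig records; the first cell has two TRA contigs.
contigs = [
    {"barcode": "run1_AAACCTG-1", "chain": "TRA", "reads": 120, "contig_id": "c1"},
    {"barcode": "run1_AAACCTG-1", "chain": "TRA", "reads": 300, "contig_id": "c2"},
    {"barcode": "run1_AAACCTG-1", "chain": "TRB", "reads": 50, "contig_id": "c3"},
    {"barcode": "run1_AAACGGG-1", "chain": "TRB", "reads": 10, "contig_id": "c4"},
]
selected_ids = [c["contig_id"] for c in top_contig_per_chain(contigs)]
print(selected_ids)
```

Keying on the barcode and chain together preserves one contig per chain, so a cell can still carry both a TRA and a TRB sequence.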

Citations

As of right now, there is no citation associated with the assembled data set. However if using the data, please find the corresponding manuscript for each data set in the meta.data of the single-cell object. In addition, if using the processed data, feel free to modify the language in the methods section (above) and please cite the appropriate manuscripts of the software or references that were used.

Itemized List of the Software Used

Seurat v4.0.3 - citation

harmony v1.0 - citation

SingleR v1.4.1 - citation

ProjecTILs v0.4.1 - citation

UCell v1.0.0 - citation

scRepertoire v1.3.2 - citation

Itemized List of Reference Data Used

Human Primary Cell Atlas (HPCA) - citation

Database Immune Cell Expression (DICE) - citation

Immune-related Gene Sets - citation

Future Directions

Data Hosting for Interactive Analysis

Easy Submission Portal for Researchers to Add Data

Using the Data to Build a Reference Atlas

These are areas that we are actively hoping to develop to further increase the usefulness of the data set. If you have other suggestions, please reach out using the contact information below.

Contact

For questions, comments, or suggestions, please feel free to contact Nick Borcherding via this repository, email, or Twitter.
Dataset summary providing data modality, sequencing platform, and number of...
plos.figshare.com
xls
Updated Jun 10, 2023
Sumeer Ahmad Khan; Robert Lehmann; Xabier Martinez-de-Morentin; Alberto Maillo; Vincenzo Lagani; Narsis A. Kiani; David Gomez-Cabrero; Jesper Tegner (2023). Dataset summary providing data modality, sequencing platform, and number of cells employed for integration after pre-processing. [Dataset]. http://doi.org/10.1371/journal.pone.0281315.t001
Available download formats: xls
Unique identifier
https://doi.org/10.1371/journal.pone.0281315.t001
Dataset updated
Jun 10, 2023
Dataset provided by
PLOS (http://plos.org/)
Authors
Sumeer Ahmad Khan; Robert Lehmann; Xabier Martinez-de-Morentin; Alberto Maillo; Vincenzo Lagani; Narsis A. Kiani; David Gomez-Cabrero; Jesper Tegner
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Dataset summary providing data modality, sequencing platform, and number of cells employed for integration after pre-processing.
cellCounts
opal.latrobe.edu.au
researchdata.edu.au
bin
Updated Dec 19, 2022
Yang Liao; Dinesh Raghu; Bhupinder Pal; Lisa Mielke; Wei Shi (2022). cellCounts [Dataset]. http://doi.org/10.26181/21588276.v3
Available download formats: bin
Unique identifier
https://doi.org/10.26181/21588276.v3
Dataset updated
Dec 19, 2022
Dataset provided by
La Trobe
Authors
Yang Liao; Dinesh Raghu; Bhupinder Pal; Lisa Mielke; Wei Shi
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This page includes the data and code necessary to reproduce the results of the following paper: Yang Liao, Dinesh Raghu, Bhupinder Pal, Lisa Mielke and Wei Shi. cellCounts: fast and accurate quantification of 10x Chromium single-cell RNA sequencing data. Under review.

A Linux computer running CentOS 7 (or later) or Ubuntu 20.04 (or later) is recommended for running this analysis. The computer should have >2 TB of disk space and >64 GB of RAM. The following software packages need to be installed before running the analysis. Software executables generated after installation should be included in the $PATH environment variable.

- R (v4.0.0 or newer) https://www.r-project.org/

- Rsubread (v2.12.2 or newer) http://bioconductor.org/packages/3.16/bioc/html/Rsubread.html

- CellRanger (v6.0.1) https://support.10xgenomics.com/single-cell-gene-expression/software/overview/welcome

- STARsolo (v2.7.10a) https://github.com/alexdobin/STAR

- sra-tools (v2.10.0 or newer) https://github.com/ncbi/sra-tools

- Seurat (v3.0.0 or newer) https://satijalab.org/seurat/

- edgeR (v3.30.0 or newer) https://bioconductor.org/packages/edgeR/

- limma (v3.44.0 or newer) https://bioconductor.org/packages/limma/

- mltools (v0.3.5 or newer) https://cran.r-project.org/web/packages/mltools/index.html

Reference packages generated by 10x Genomics are also required for this analysis and they can be downloaded from the following link (2020-A version for individual human and mouse reference packages should be selected): https://support.10xgenomics.com/single-cell-gene-expression/software/downloads/latest After all these are done, you can simply run the shell script ‘test-all-new.bash’ to perform all the analyses carried out in the paper. This script will automatically download the mixture scRNA-seq data from the SRA database, and it will output a text file called ‘test-all.log’ that contains all the screen outputs and speed/accuracy results of CellRanger, STARsolo and cellCounts.
Data from: Pre-ciliated tubal epithelial cells are prone to initiation of...
data.niaid.nih.gov
datadryad.org
zip
Updated Oct 17, 2024
Cite
Coulter Ralston; Alexander Nikitin; Benjamin Cosgrove (2024). Pre-ciliated tubal epithelial cells are prone to initiation of high-grade serous ovarian carcinoma [Dataset]. http://doi.org/10.5061/dryad.4mw6m90hm
Available download formats: zip
Unique identifier
https://doi.org/10.5061/dryad.4mw6m90hm
Dataset updated
Oct 17, 2024
Dataset provided by
Cornell University
Authors
Coulter Ralston; Alexander Nikitin; Benjamin Cosgrove
License
https://spdx.org/licenses/CC0-1.0.html
Description
The distal region of the uterine (Fallopian) tube is commonly associated with high-grade serous carcinoma (HGSC), the predominant and most aggressive form of ovarian or extra-uterine cancer. Specific cell states and lineage dynamics of the adult tubal epithelium (TE) remain insufficiently understood, hindering efforts to determine the cell of origin for HGSC. Here, we report a comprehensive census of cell types and states of the mouse uterine tube. We show that distal TE cells expressing the stem/progenitor cell marker Slc1a3 can differentiate into both secretory (Ovgp1+) and ciliated (Fam183b+) cells. Inactivation of Trp53 and Rb1, whose pathways are commonly altered in HGSC, leads to elimination of targeted Slc1a3+ cells by apoptosis, thereby preventing their malignant transformation. In contrast, pre-ciliated cells (Krt5+, Prom1+, Trp73+) remain cancer-prone and give rise to serous tubal intraepithelial carcinomas and overt HGSC. These findings identify transitional pre-ciliated cells as a previously unrecognized cancer-prone cell state and point to pre-ciliation mechanisms as novel diagnostic and therapeutic targets.

Methods

Single-cell RNA-sequencing library preparation
For TE single-cell expression and transcriptome analysis, we isolated TE from C57BL6 adult estrous female mice. In 3 independent experiments, a total of 62 uterine tubes were collected. Each uterine tube was placed in sterile PBS containing 100 IU ml⁻¹ of penicillin and 100 µg ml⁻¹ streptomycin (Corning, 30-002-Cl), and separated into distal and proximal regions. Tissues from the same region were combined in a 40 µl drop of the same PBS solution, cut open lengthwise, and minced into 1.5-2.5 mm pieces with 25G needles. Minced tissues were transferred with the help of a sterile wide-bore 200 µl pipette tip into a 1.8 ml cryo vial containing 1.2 ml A-mTE-D1 (300 IU ml⁻¹ collagenase IV mixed with 100 IU ml⁻¹ hyaluronidase; Stem Cell Technologies, 07912, in DMEM Ham’s F12, Hyclone, SH30023.FS). Tissues were incubated with a loose cap for 1 h at 37°C in a 5% CO2 incubator. During the incubation, tubes were taken out 4 times and tissues were suspended with a wide-bore 200 µl pipette tip. At the end of incubation, the tissue-cell suspension from each tube was transferred into 1 ml TrypLE (Invitrogen, 12604013) pre-warmed to 37°C and suspended 70 times with a 1000 µl pipette tip; 5 ml A-SM [DMEM Ham’s F12 containing 2% fetal bovine serum (FBS)] was added to the mix, and TE cells were pelleted by centrifugation at 300 × g for 10 minutes at 25°C. Pellets were then suspended in 1 ml A-mTE-D2 pre-warmed to 37°C (7 mg ml⁻¹ Dispase II, Worthington NPRO2, and 10 µg ml⁻¹ Deoxyribonuclease I, Stem Cell Technologies, 07900), and mixed 70 times with a 1000 µl pipette tip. 5 ml A-mTE-D2 was added, and samples were passed through a 40 µm cell strainer and pelleted by centrifugation at 300 × g for 7 minutes at +4°C. Pellets were suspended in 100 µl microbeads per 10⁷ total cells or fewer, and dead cells were removed with the Dead Cell Removal Kit (Miltenyi Biotec, 130-090-101) according to the manufacturer’s protocol. Pelleted live cell fractions were collected in 1.5 ml low-binding centrifuge tubes, kept on ice, and suspended in 50 µl ice-cold A-Ri-Buffer (5% FBS, 1% GlutaMAX-I, Invitrogen, 35050-079, 9 µM Y-27632, Millipore, 688000, and 100 IU ml⁻¹ penicillin and 100 µg ml⁻¹ streptomycin in DMEM Ham’s F12). Cell aliquots were stained with trypan blue for live and dead cell calculation. Live cell preparations with a target cell recovery of 5,000-6,000 were loaded on a Chromium controller (10X Genomics, Single Cell 3’ v2 chemistry) to perform single-cell partitioning and barcoding using the microfluidic platform device. After preparation of barcoded, next-generation sequencing cDNA libraries, samples were sequenced on an Illumina NextSeq500 System.

Download and alignment of single-cell RNA sequencing data
For sequence alignment, a custom reference for mm39 was built using the cellranger (v6.1.2, 10x Genomics) mkref function. The mm39.fa soft-masked assembly sequence and the mm39.ncbiRefSeq.gtf (release 109) genome annotation last updated 2020-10-27 were used to form the custom reference. The raw sequencing reads were aligned to the custom reference and quantified using the cellranger count function.

Preprocessing and batch correction
All preprocessing and data analysis was conducted in R (v.4.1.1 (2021-08-10)). The cellranger count outs were first modified with the autoEstCont and adjustCounts functions from SoupX (v.1.6.1) to output a corrected matrix with the ambient RNA signal (soup) removed (https://github.com/constantAmateur/SoupX). To preprocess the corrected matrices, the Seurat (v.4.1.1) NormalizeData, FindVariableFeatures, ScaleData, RunPCA, FindNeighbors, and RunUMAP functions were used to create a Seurat object for each sample (https://github.com/satijalab/seurat). The number of principal components used to construct a shared nearest-neighbor graph was chosen to account for 95% of the total variance. To detect possible doublets, we used the package DoubletFinder (v.2.0.3) with inputs specific to each Seurat object. DoubletFinder creates artificial doublets and calculates the proportion of artificial k nearest neighbors (pANN) for each cell from a merged dataset of the artificial and actual data. To maximize DoubletFinder’s predictive power, the mean-variance normalized bimodality coefficient (BCMVN) was used to determine the optimal pK value for each dataset. To establish a threshold for pANN values to distinguish between singlets and doublets, the estimated multiplet rates for each sample were calculated by interpolating between the target cell recovery values according to the 10x Chromium user manual. Homotypic doublets were identified using unannotated Seurat clusters in each dataset with the modelHomotypic function. After doublets were identified, all distal and proximal samples were merged separately. Cells with greater than 30% mitochondrial genes, cells with nCount_RNA below 750, and cells with nFeature_RNA below 200 were removed from the merged datasets. To correct for any batch effects between sample runs, we used the harmony (v.0.1.0) integration method (github.com/immunogenomics/harmony).
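
Two of the steps described above, choosing the number of principal components that covers 95% of the variance and applying the cell-level QC thresholds, can be sketched in base R. This is an illustration on made-up toy values rather than the authors' code; the column name `percent.mt` and the toy numbers are assumptions, and in practice the standard deviations would come from the PCA reduction of the Seurat object, so "total variance" here means the variance captured by the computed PCs.

```r
# Toy per-PC standard deviations (made-up values for illustration).
pc_sdev <- c(5, 3, 2, 1, 0.5)

# Smallest number of PCs whose cumulative variance reaches 95% of the total.
var_explained <- pc_sdev^2 / sum(pc_sdev^2)
n_pcs <- which(cumsum(var_explained) >= 0.95)[1]

# Toy per-cell metadata; `percent.mt` is an assumed column name.
meta <- data.frame(
  percent.mt   = c(5, 35, 10, 2),
  nCount_RNA   = c(1200, 900, 500, 3000),
  nFeature_RNA = c(600, 400, 300, 150),
  row.names    = paste0("cell", 1:4)
)

# Remove cells with >30% mitochondrial content, <750 counts, or <200 features.
keep <- meta$percent.mt <= 30 &
  meta$nCount_RNA >= 750 &
  meta$nFeature_RNA >= 200
filtered <- meta[keep, ]

print(n_pcs)
print(rownames(filtered))
```

In this toy example only the first cell passes all three thresholds; each of the other cells fails exactly one of them.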

Clustering parameters and annotations
After merging the datasets and batch correction, the dimensions reflecting 95% of the total variance were input into Seurat’s FindNeighbors function with a k.param of 70. Louvain clustering was then conducted using Seurat’s FindClusters with a resolution of 0.7. The resulting 19 clusters were annotated based on the expression of canonical genes and the results of differential gene expression (Wilcoxon Rank Sum test) analysis. One cluster expressing lymphatic and epithelial markers was omitted from later analysis as it only contained 2 cells suspected to be doublets. To better understand the epithelial populations, we reclustered 6 epithelial populations and reapplied harmony batch correction. For this subset, FindNeighbors was run with a k.param of 50, and a resolution of 0.7 was used for FindClusters. The resulting 9 clusters within the epithelial subset were further annotated using differential expression analysis and canonical markers.

Pseudotime analysis
Potential of heat diffusion for affinity-based transition embedding (PHATE) is a dimensionality reduction method to more accurately visualize continual progressions found in biological data [35]. A modified version of Seurat (v4.1.1) was developed to include the ‘RunPHATE’ function for converting a Seurat object to a PHATE embedding. This was built on the phateR package (v.1.0.7) (https://github.com/scottgigante/seurat/tree/patch/add-PHATE-again). In addition to PHATE, pseudotime values were calculated with Monocle3 (v.1.2.7), which computes trajectories with an origin set by the user [36, 55–57]. The origin was set to be a progenitor cell state confirmed with lineage tracing experiments.

35. Moon, K. R. et al. Visualizing structure and transitions in high-dimensional biological data. Nat Biotechnol 37, 1482–1492 (2019). doi:10.1038/s41587-019-0336-3
36. Cao, J. et al. The single-cell transcriptional landscape of mammalian organogenesis. Nature 566, 496–502 (2019). doi:10.1038/s41586-019-0969-x
55. Trapnell, C. et al. The dynamics and regulators of cell fate decisions are revealed by pseudotemporal ordering of single cells. Nat Biotechnol 32, 381–386 (2014). doi:10.1038/nbt.2859
56. Qiu, X. et al. Single-cell mRNA quantification and differential analysis with Census. Nat Methods 14, 309–315 (2017). doi:10.1038/nmeth.4150
57. Qiu, X. et al. Reversed graph embedding resolves complex single-cell trajectories. Nat Methods 14, 979–982 (2017). doi:10.1038/nmeth.4402
Processed Seurat Objects for Localized Marker Detector (Cluster-Independent...
figshare.com
application/gzip
Updated Jun 10, 2025
Cite
Ruiqi Li; Peggy Myung (2025). Processed Seurat Objects for Localized Marker Detector (Cluster-Independent Multiscale Marker Identification in Single-cell RNA-seq Data using Localized Marker Detector) [Dataset]. http://doi.org/10.6084/m9.figshare.26507098.v2
Available download formats: application/gzip
Unique identifier
https://doi.org/10.6084/m9.figshare.26507098.v2
Dataset updated
Jun 10, 2025
Dataset provided by
Figshare (http://figshare.com/)
Authors
Ruiqi Li; Peggy Myung
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
These are processed Seurat objects for the biological datasets in Localized Marker Detector (https://github.com/KlugerLab/LocalizedMarkerDetector):

Tabula Muris bone marrow dataset (FACS-based and Droplet-based)
We used publicly available scRNA-seq mouse bone marrow datasets (FACS and Droplet-based) from the Tabula Muris Consortium, which were already pre-processed and annotated according to their workflow. In addition, we applied ALRA imputation to generate a denoised assay alra and added several cell annotations: (1) Cell cycle annotation using CellCycleScoring with the updated 2019 cell cycle gene set; (2) Module Activity Scores for the gene modules listed in our paper.

Mouse embryo skin dataset
We separated dermal cell populations from newly collected mouse embryo skin samples (aligned to the mouse genome mm10 using CellRanger (v.6.1.2)). Cells from the wildtype and SmoM2YFP mutant (SmoM2) for two consecutive days (embryonic day 13.5 and 14.5) were pooled for analysis. To avoid batch effects from pooling or integrating, we analyzed each condition separately: E13.5 SmoM2, E13.5 WT, E14.5 SmoM2, and E14.5 WT. For each condition, we performed standard normalization, selected the top 2,000 highly variable genes, and scaled the data using the Seurat v4 R package. We then applied PCA, retaining the number of PCs determined by the elbow plot: E13.5 SmoM2 (14 PCs), E13.5 WT (12 PCs), E14.5 SmoM2 (12 PCs), and E14.5 WT (11 PCs).
Processed Seurat Object of scRNAseq data from wildtype and CaMKK2 KO immune...
zenodo.org
bin
Updated Jun 17, 2022
Cite
William Tomaszewski (2022). Processed Seurat Object of scRNAseq data from wildtype and CaMKK2 KO immune infiltrate of CT2a preclinical murine glioma [Dataset]. http://doi.org/10.5281/zenodo.6654420
Available download formats: bin
Unique identifier
https://doi.org/10.5281/zenodo.6654420
Dataset updated
Jun 17, 2022
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
William Tomaszewski
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This repository contains the processed Seurat objects generated from the raw data deposited at the Gene Expression Omnibus (GEO) under GSE197879.

Details about the experiment and sequencing are available under GSE197879.

Information on how the Seurat objects were created can be found in this GitHub repository https://github.com/wht10/CT2A_scRNAseq_CaMKK2KOvWT .

Notable metadata within each Seurat object:

1. Processed_CD45_Live_Fig2b.rds

Genotype - whether the cell is from a WT or CaMKK2 KO mouse

HTO_maxID - The biological replicate that the cell came from (4 biological replicates per genotype)

MouseID - A concatenation between the genotype and HTO_maxID, providing a unique identifier for each biological replicate

Cell.Type - The cell type annotations for each cell. Can be assigned to "Idents()" to change the name of the cell identities.

Geno.Ident - A concatenation between Genotype and Cell.Type. By re-assigning this to "Idents()", "FindMarkers()" can be used to investigate differentially expressed genes within a cell type between genotypes.

2. Reclustered_TILs_Fig3a.rds

Genotype - whether the cell is from a WT or CaMKK2 KO mouse

HTO_maxID - The biological replicate that the cell came from (4 biological replicates per genotype)

MouseID - A concatenation between the genotype and HTO_maxID, providing a unique identifier for each biological replicate

Celltype - The cell type annotations for each cell. Can be assigned to "Idents()" to change the name of the cell identities.

Geno_Ident - A concatenation between Genotype and Celltype. By re-assigning this to "Idents()", "FindMarkers()" can be used to investigate differentially expressed genes within a cell type between genotypes.
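
The combined genotype and cell-type label described above can be illustrated with a small base R sketch. The genotype values, cell-type names, and the underscore separator below are hypothetical placeholders, since the actual labels stored in the objects are not listed here; the Seurat calls are shown only as comments because they need the downloaded object.

```r
# Toy metadata standing in for the per-cell metadata of a Seurat object.
# All values and the "_" separator are hypothetical.
meta <- data.frame(
  Genotype  = c("WT", "KO", "WT", "KO"),
  Cell.Type = c("Microglia", "Microglia", "CD8 T", "CD8 T"),
  stringsAsFactors = FALSE
)

# Concatenate genotype and cell type into one identity label per cell.
meta$Geno.Ident <- paste(meta$Genotype, meta$Cell.Type, sep = "_")
print(table(meta$Geno.Ident))

# With Seurat loaded and `obj` read from one of the .rds files, a comparison
# within one cell type between genotypes would look roughly like:
#   Idents(obj) <- "Geno.Ident"
#   markers <- FindMarkers(obj, ident.1 = "KO_Microglia", ident.2 = "WT_Microglia")
```

The identity names passed to `FindMarkers()` have to match the labels actually present in the object, which can be checked with `levels(Idents(obj))`.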
Dataset and Code for "A single-cell map of hypertension"
zenodo.org
Updated Nov 2, 2024
Cite
Qiongzi Qiu (2024). Dataset and Code for "A single-cell map of hypertension" [Dataset]. http://doi.org/10.5281/zenodo.14027488
Unique identifier
https://doi.org/10.5281/zenodo.14027488
Dataset updated
Nov 2, 2024
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Qiongzi Qiu
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This dataset and accompanying code support the manuscript "A single-cell map of hypertension." The files include all necessary intermediate files and code required to reproduce the primary figures in the manuscript. For single-cell data, users can access the data through our web portal at https://viz.datascience.arizona.edu/content/2d089f1d-2904-4888-b86f-70ca2fe7a297/.

Package versions:

CellRanger: v7.0.0
CellBender: v0.2.0
DoubletFinder: v2.0.3
Seurat: v4.3.0.1
Harmony: v1.2.0
Clustree: v0.5.1
SingleR: v2.4.1
miloR: v1.6.0
CellChat: v1.6.1
GSVA: v1.46.0
ArchR: v1.0.2
Data and program codes for Maeda et al. 2022 PCP
figshare.com
zip
Updated May 31, 2023
Cite
Taro Maeda (2023). Data and program codes for Maeda et al. 2022 PCP [Dataset]. http://doi.org/10.6084/m9.figshare.20375205.v4
Available download formats: zip
Unique identifier
https://doi.org/10.6084/m9.figshare.20375205.v4
Dataset updated
May 31, 2023
Dataset provided by
Figshare (http://figshare.com/)
Authors
Taro Maeda
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Data set and scripts for the Arabidopsis leaf single-cell RNA-seq analysis in "Single-cell RNA sequencing of Arabidopsis leaf tissues identifies multiple specialized cell types", Plant and Cell Physiology, https://doi.org/10.1093/pcp/pcac167

Single-cell RNA-Seq and TCR-Seq analysis of PD-1+ CD8+ T-cells responding to...

zenodo.org

bin, csv, zip

Updated Oct 24, 2024

Cite

Bertram Bengsch; Sagar; Zhen Zhang (2024). Single-cell RNA-Seq and TCR-Seq analysis of PD-1+ CD8+ T-cells responding to anti-PD-1 and anti-PD-1/CTLA-4 immunotherapy in melanoma [Dataset]. http://doi.org/10.5281/zenodo.13971562

Available download formats: bin, csv, zip

Unique identifier

https://doi.org/10.5281/zenodo.13971562

Dataset updated

Oct 24, 2024

Dataset provided by

Zenodo

Authors

Bertram Bengsch; Sagar; Zhen Zhang

License

Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically

Description

This dataset details the scRNASeq and TCR-Seq analysis of sorted PD-1+ CD8+ T cells from patients with melanoma treated with checkpoint therapy (anti-PD-1 monotherapy and anti-PD-1 & anti-CTLA-4 combination therapy) at baseline and after the first cycle of therapy. A major publication using this dataset is accessible here: (reference)

*experimental design

Single-cell RNA sequencing was performed using 10x Genomics with feature barcoding technology to multiplex cell samples from different patients undergoing mono or dual therapy so that they could be loaded on one well to reduce costs and minimize technical variability. Hashtag oligomers (oligos) were obtained purified and already oligo-conjugated in TotalSeq-C format from BioLegend. Cells were thawed and counted, and 20 million cells per patient and time point were used for staining. Cells were stained with barcoded antibodies together with a staining solution containing antibodies against CD3, CD4, CD8, PD-1/IgG4 and fixable viability dye (eBioscience) prior to FACS sorting. Barcoded antibody concentrations used were 0.5 µg per million cells, as recommended by the manufacturer (BioLegend) for flow cytometry applications. After staining, cells were washed twice in PBS containing 2% BSA and 0.01% Tween 20, followed by centrifugation (300 × g, 5 min, 4 °C) and supernatant exchange. After the final wash, cells were resuspended in PBS, filtered through 40 µm cell strainers, and sorted. Sorted cells were counted and approximately 75,000 cells were processed through the 10x Genomics single-cell V(D)J workflow according to the manufacturer’s instructions. Gene expression, hashing and TCR libraries were pooled to the desired quantities to obtain sequencing depths of 15,000 reads per cell for gene expression libraries and 5,000 reads per cell for hashing and TCR libraries. Libraries were sequenced on a NovaSeq 6000 flow cell in a 2×100 paired-end format.

*extract protocol

PBMCs were thawed and counted, and 20 million cells per patient and time point were used for staining. Cells were stained with barcoded antibodies together with a staining solution containing antibodies against CD3, CD4, CD8, PD-1/IgG4 and fixable viability dye (eBioscience) prior to FACS sorting. Barcoded antibody concentrations used were 0.5 µg per million cells, as recommended by the manufacturer (BioLegend) for flow cytometry applications. After staining, cells were washed twice in PBS containing 2% BSA and 0.01% Tween 20, followed by centrifugation (300 × g, 5 min, 4 °C) and supernatant exchange. After the final wash, cells were resuspended in PBS, filtered through 40 µm cell strainers, and sorted. Sorted cells were counted and approximately 75,000 cells were processed through the 10x Genomics single-cell V(D)J workflow according to the manufacturer’s instructions.

*library construction protocol

Sorted cells were counted and approximately 75,000 cells were processed through the 10x Genomics single-cell V(D)J workflow according to the manufacturer’s instructions. Gene expression, hashing and TCR libraries were pooled to the desired quantities to obtain sequencing depths of 15,000 reads per cell for gene expression libraries and 5,000 reads per cell for hashing and TCR libraries. Libraries were sequenced on a NovaSeq 6000 flow cell in a 2×100 paired-end format.

*library strategy

scRNA-seq and scTCR-seq

*data processing step

Pre-processing of sequencing results to generate count matrices (gene expression and HTO barcode counts) was performed using the 10x Genomics Cell Ranger pipeline.

Further processing was done with Seurat (cell and gene filtering, hashtag identification, clustering, differential gene expression analysis based on gene expression).

*genome build/assembly

Alignment was performed using prebuilt Cell Ranger human reference GRCh38.

*processed data files format and content

RNA counts and HTO counts are in sparse matrix format and TCR clonotypes are in csv format.

Datasets were merged and analyzed by Seurat and the analyzed objects are in rds format.

file name	file checksum
PD1CD8_160421_filtered_feature_bc_matrix.zip	da2e006d2b39485fd8cf8701742c6d77
PD1CD8_190421_filtered_feature_bc_matrix.zip	e125fc5031899bba71e1171888d78205
PD1CD8_160421_filtered_contig_annotations.csv	927241805d507204fbe9ef7045d0ccf4
PD1CD8_190421_filtered_contig_annotations.csv	8ca544d27f06e66592b567d3ab86551e
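
The checksums in the table above are 32 hexadecimal characters long, which matches the MD5 format; assuming they are MD5 sums, downloaded files can be verified with `tools::md5sum()` from base R. The helper `verify_md5` below is a hypothetical name introduced for illustration.

```r
# Expected checksums copied from the table above (assumed to be MD5).
expected <- c(
  "PD1CD8_160421_filtered_feature_bc_matrix.zip"  = "da2e006d2b39485fd8cf8701742c6d77",
  "PD1CD8_190421_filtered_feature_bc_matrix.zip"  = "e125fc5031899bba71e1171888d78205",
  "PD1CD8_160421_filtered_contig_annotations.csv" = "927241805d507204fbe9ef7045d0ccf4",
  "PD1CD8_190421_filtered_contig_annotations.csv" = "8ca544d27f06e66592b567d3ab86551e"
)

# Hypothetical helper: compare the MD5 of each named file in `dir` with its
# expected value. Returns a named logical vector; NA means file not found.
verify_md5 <- function(expected, dir = ".") {
  paths <- file.path(dir, names(expected))
  observed <- unname(tools::md5sum(paths))  # NA for missing or unreadable files
  setNames(observed == unname(expected), names(expected))
}
```

Calling `verify_md5(expected, dir = "PATH/TO/DOWNLOAD")` would return `TRUE` for each file whose checksum matches, `FALSE` for a mismatch, and `NA` for a file that is not present in that directory.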

*processed data file	antibodies/tags
PD1CD8_160421_filtered_feature_bc_matrix.zip	none
PD1CD8_160421_filtered_feature_bc_matrix.zip	TotalSeq™-C0251 anti-human Hashtag 1 Antibody - (HASH_1) - M1_base_monotherapy; TotalSeq™-C0252 anti-human Hashtag 2 Antibody - (HASH_2) - M1_post_monotherapy; TotalSeq™-C0253 anti-human Hashtag 3 Antibody - (HASH_3) - C1_base_combined_therapy; TotalSeq™-C0254 anti-human Hashtag 4 Antibody - (HASH_4) - C1_post_combined_therapy; TotalSeq™-C0255 anti-human Hashtag 5 Antibody - (HASH_5) - C2_base_combined_therapy; TotalSeq™-C0256 anti-human Hashtag 6 Antibody - (HASH_6) - C2_post_combined_therapy
PD1CD8_160421_filtered_contig_annotations.csv	none
PD1CD8_190421_filtered_feature_bc_matrix.zip	none
PD1CD8_190421_filtered_feature_bc_matrix.zip	TotalSeq™-C0251 anti-human Hashtag 1 Antibody - (HASH_1) - M2_base_monotherapy; TotalSeq™-C0252 anti-human Hashtag 2 Antibody - (HASH_2) - M2_post_monotherapy; TotalSeq™-C0253 anti-human Hashtag 3 Antibody - (HASH_3) - M3_base_monotherapy; TotalSeq™-C0254 anti-human Hashtag 4 Antibody - (HASH_4) - M3_post_monotherapy; TotalSeq™-C0255 anti-human Hashtag 5 Antibody - (HASH_5) - C3_base_combined_therapy; TotalSeq™-C0256 anti-human Hashtag 6 Antibody - (HASH_6) - C3_post_combined_therapy
PD1CD8_190421_filtered_contig_annotations.csv	none

Processed snRNA-seq data from "Divergent single cell transcriptome and...
data.niaid.nih.gov
Updated Jul 29, 2023
Cite
Alexey Kozlenkov (2023). Processed snRNA-seq data from "Divergent single cell transcriptome and epigenome alterations in ALS and FTD patients with C9orf72 mutation" [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_8190316
Dataset updated
Jul 29, 2023
Dataset provided by
Mahammad Gardashli
Junhao Li
Stella Dracheva
Dennis W. Dickson
Eran A Mukamel
Veronique V. Belzil
Jo-Fan Chien
Manoj K Jaiswal
Ping Zhou
Erica Engelberg-Cook
Luc J. Pregent
Alexey Kozlenkov
Jinyoung Jung
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Processed snRNA-seq data from "Divergent single cell transcriptome and epigenome alterations in ALS and FTD patients with C9orf72 mutation". All nuclei passed QC and were corrected for background noise using CellBender. Files are R objects saved in RDS (R Data Serialization) format. This repo contains one Seurat v4 object and one gene-by-cell raw RNA count matrix in sparse matrix format (dgCMatrix).
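
As a rough illustration of working with a sparse count matrix stored as RDS, the sketch below writes a toy `dgCMatrix` to a temporary file and reads it back with `readRDS()`. The toy matrix, its gene and cell names, and the temporary path are all stand-ins; the real file names in the repository are not given here.

```r
library(Matrix)  # provides the dgCMatrix class

# Toy 3-gene x 4-cell count matrix standing in for the deposited matrix.
counts <- sparseMatrix(
  i = c(1, 2, 3, 1), j = c(1, 2, 3, 4), x = c(5, 1, 2, 7),
  dims = c(3, 4),
  dimnames = list(paste0("gene", 1:3), paste0("cell", 1:4))
)

# Placeholder path; replace with the path to the downloaded .rds file.
path <- tempfile(fileext = ".rds")
saveRDS(counts, path)

# Read the matrix back and inspect it.
mat <- readRDS(path)
print(inherits(mat, "dgCMatrix"))  # is it a column-compressed sparse matrix?
print(dim(mat))                    # genes x cells
print(colSums(mat))                # total counts per cell
```

Keeping the matrix in sparse form avoids materializing the many zero entries typical of single-nucleus count data.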
S3 Fig -
plos.figshare.com
zip
Updated Jun 10, 2023
Cite
Sumeer Ahmad Khan; Robert Lehmann; Xabier Martinez-de-Morentin; Alberto Maillo; Vincenzo Lagani; Narsis A. Kiani; David Gomez-Cabrero; Jesper Tegner (2023). S3 Fig - [Dataset]. http://doi.org/10.1371/journal.pone.0281315.s003
Available download formats: zip
Unique identifier
https://doi.org/10.1371/journal.pone.0281315.s003
Dataset updated
Jun 10, 2023
Dataset provided by
PLOS ONE
Authors
Sumeer Ahmad Khan; Robert Lehmann; Xabier Martinez-de-Morentin; Alberto Maillo; Vincenzo Lagani; Narsis A. Kiani; David Gomez-Cabrero; Jesper Tegner
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Integration results for cross-platform data from CelSeq2 and SmartSeq, with quantitative comparison. a) scAEGAN outperforms AE-Concatenated when integrating data from the CelSeq2 and SmartSeq platforms. b) AE-Concatenated performs poorly when integrating the datasets from the CelSeq2 and SmartSeq platforms. c) scAEGAN outperforms AE-Concatenated, Seurat and cGAN when integrating data across different platforms. (ZIP)
Processed data of single cell RNA-sequencing of 16 NPM1-mutated Acute...
figshare.com
bin
Updated Jun 16, 2025
Cite
Emin Onur Karakaslar (2025). Processed data of single cell RNA-sequencing of 16 NPM1-mutated Acute Myeloid Leukemia samples [Dataset]. http://doi.org/10.6084/m9.figshare.26189771.v1
Available download formats: bin
Unique identifier
https://doi.org/10.6084/m9.figshare.26189771.v1
Dataset updated
Jun 16, 2025
Dataset provided by
figshare
Authors
Emin Onur Karakaslar
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
TLDR
Seurat object of the 16 NPM1-mutated AML samples (n = 83,162 cells).

AML samples
All sixteen peripheral blood and bone marrow samples were obtained from patients with AML at diagnosis (n=15) or relapse after chemotherapy (n=1) with written informed consent according to the Declaration of Helsinki. Mononuclear cells were isolated by Ficoll-Isopaque density gradient centrifugation and cryopreserved in the Leiden University Medical Center (LUMC) Biobank for Hematological Diseases after approval by the LUMC Institutional Review Board (protocol no. B18.047).

Upstream processing pipeline
CellRanger v7.0.0 was run on all samples with the human reference genome hg38. Seurat v4 was used for all QC [15]. Our QC pipeline had three steps per sample: 1) soft filtering, 2) low-quality cluster removal, and 3) doublet detection. In soft filtering, Seurat objects were created with cells expressing at least 200 genes and with genes expressed in at least 3 cells. Then, the standard Seurat command list with default parameters was run to detect low-quality clusters. Clusters with >15% mitochondrial and 15% mitochondrial mRNA. We used standard Seurat commands to scale and normalize the data on integrated features. The first 30 principal components were used to create UMAP plots. We used clustree to determine the optimal cluster number, based on FindClusters with resolutions sweeping from 0 to 1.2. We chose res=0.5, as clusters became stable. Next, we merged two clusters (CC5 and CC12) into one GMP-like cluster, as one of these clusters (CC12) had high expression of HSP genes yet still retained its cell-type-specific properties.

Note: The file was processed with Seurat v4, but the object is updated for v5. It is uploaded in .qs file format for faster reading. To read the file: qs::qread("path/to/data.qs")

This data is available for research use only and cannot be used for commercial purposes.

For further queries please refer to our paper:
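
The soft-filtering step described above (cells expressing at least 200 genes, genes expressed in at least 3 cells) can be sketched in base R on a toy count matrix. This is an illustration, not the authors' pipeline; the toy data are made up, and the order used here (cells filtered first, then genes) is an assumption about how the two thresholds were applied.

```r
# Toy gene-by-cell count matrix (rows = genes, columns = cells).
counts <- matrix(0, nrow = 250, ncol = 5,
                 dimnames = list(paste0("gene", 1:250), paste0("cell", 1:5)))
counts[1:220, 1:4] <- 1   # four cells expressing 220 genes each
counts[1:50, 5] <- 1      # one low-complexity cell expressing only 50 genes

min_features <- 200  # a cell must express at least this many genes
min_cells <- 3       # a gene must be expressed in at least this many cells

# Step 1: drop cells with too few detected genes.
keep_cells <- colSums(counts > 0) >= min_features
counts <- counts[, keep_cells, drop = FALSE]

# Step 2: drop genes detected in too few of the remaining cells.
keep_genes <- rowSums(counts > 0) >= min_cells
filtered <- counts[keep_genes, , drop = FALSE]

print(dim(filtered))
```

In the toy data the low-complexity cell is removed in step 1, and the 30 genes with no counts are removed in step 2.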