100+ datasets found

Data, R code and output Seurat Objects for single cell RNA-seq analysis of...
figshare.com
application/gzip
Updated May 31, 2023
Cite
Yunshun Chen; Gordon Smyth (2023). Data, R code and output Seurat Objects for single cell RNA-seq analysis of human breast tissues [Dataset]. http://doi.org/10.6084/m9.figshare.17058077.v1
Available download formats: application/gzip
Unique identifier
https://doi.org/10.6084/m9.figshare.17058077.v1
Dataset updated
May 31, 2023
Dataset provided by
Figshare (http://figshare.com/)
Authors
Yunshun Chen; Gordon Smyth
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This dataset contains the Seurat objects used to generate all the figures in Pal et al. 2021 (https://doi.org/10.15252/embj.2020107333). All the Seurat objects were created under R v3.6.1 using the Seurat package v3.1.1. Details of each object are listed in a table in Chen et al. 2021.
Transcription start site analysis for heterogenous CD4+ T cells using 5′...
data.niaid.nih.gov
datadryad.org
zip
Updated Apr 22, 2024
Cite
Akiko Oguchi; Yasuhiro Murakawa (2024). Transcription start site analysis for heterogenous CD4+ T cells using 5′ scRNA-seq [Dataset]. http://doi.org/10.5061/dryad.gtht76hv9
Available download formats: zip
Unique identifier
https://doi.org/10.5061/dryad.gtht76hv9
Dataset updated
Apr 22, 2024
Dataset provided by
RIKEN Center for Integrative Medical Sciences
Authors
Akiko Oguchi; Yasuhiro Murakawa
License
CC0 1.0, https://spdx.org/licenses/CC0-1.0.html
Description
These datasets were generated by ReapTEC (read-level pre-filtering and transcribed enhancer call) using 5′ single-cell RNA-seq data on human heterogeneous CD4+ T cells. By taking advantage of a unique “cap signature” derived from the 5′-end of a transcript, ReapTEC simultaneously profiles gene expression and enhancer activity at nucleotide resolution using 5′-end single-cell RNA-sequencing (5′ scRNA-seq). The ReapTEC pipeline is described in detail at https://github.com/MurakawaLab/ReapTEC.
Scripts for Analysis
figshare.com
txt
Updated Jul 18, 2018
Cite
Sneddon Lab UCSF (2018). Scripts for Analysis [Dataset]. http://doi.org/10.6084/m9.figshare.6783569.v2
Available download formats: txt
Unique identifier
https://doi.org/10.6084/m9.figshare.6783569.v2
Dataset updated
Jul 18, 2018
Dataset provided by
Figshare (http://figshare.com/)
Authors
Sneddon Lab UCSF
License
MIT License, https://opensource.org/licenses/MIT
License information was derived automatically
Description
Scripts used for analysis of V1 and V2 Datasets.

seurat_v1.R - initialize seurat object from 10X Genomics cellranger outputs. Includes filtering, normalization, regression, variable gene identification, PCA analysis, clustering, tSNE visualization. Used for v1 datasets.
merge_seurat.R - merge two or more seurat objects into one seurat object. Perform linear regression to remove batch effects from separate objects. Used for v1 datasets.
subcluster_seurat_v1.R - subcluster clusters of interest from Seurat object. Determine variable genes, perform regression and PCA. Used for v1 datasets.
seurat_v2.R - initialize seurat object from 10X Genomics cellranger outputs. Includes filtering, normalization, regression, variable gene identification, and PCA analysis. Used for v2 datasets.
clustering_markers_v2.R - clustering and tSNE visualization for v2 datasets.
subcluster_seurat_v2.R - subcluster clusters of interest from Seurat object. Determine variable genes, perform regression and PCA analysis. Used for v2 datasets.
seurat_object_analysis_v1_and_v2.R - downstream analysis and plotting functions for seurat object created by seurat_v1.R or seurat_v2.R.
merge_clusters.R - merge clusters that do not meet gene threshold. Used for both v1 and v2 datasets.
prepare_for_monocle_v1.R - subcluster cells of interest and perform linear regression, but not scaling, in order to input normalized, regressed values into monocle with monocle_seurat_input_v1.R.
monocle_seurat_input_v1.R - monocle script using seurat batch corrected values as input for v1 merged timecourse datasets.
monocle_lineage_trace.R - monocle script using nUMI as input for v2 lineage traced dataset.
monocle_object_analysis.R - downstream analysis for monocle object - BEAM and plotting.
CCA_merging_v2.R - script for merging v2 endocrine datasets with canonical correlation analysis and determining the number of CCs to include in downstream analysis.
CCA_alignment_v2.R - script for downstream alignment, clustering, tSNE visualization, and differential gene expression analysis.
Data from: Large-scale integration of single-cell transcriptomic data...
dataone.org
data.niaid.nih.gov
Updated May 2, 2025
Cite
David McKellar; Iwijn De Vlaminck; Benjamin Cosgrove (2025). Large-scale integration of single-cell transcriptomic data captures transitional progenitor states in mouse skeletal muscle regeneration [Dataset]. http://doi.org/10.5061/dryad.t4b8gtj34
Unique identifier
https://doi.org/10.5061/dryad.t4b8gtj34
Dataset updated
May 2, 2025
Dataset provided by
Dryad Digital Repository
Authors
David McKellar; Iwijn De Vlaminck; Benjamin Cosgrove
Time period covered
Oct 22, 2021
Description
Skeletal muscle repair is driven by the coordinated self-renewal and fusion of myogenic stem and progenitor cells. Single-cell gene expression analyses of myogenesis have been hampered by the poor sampling of rare and transient cell states that are critical for muscle repair, and do not inform the spatial context that is important for myogenic differentiation. Here, we demonstrate how large-scale integration of single-cell and spatial transcriptomic data can overcome these limitations. We created a single-cell transcriptomic dataset of mouse skeletal muscle by integration, consensus annotation, and analysis of 23 newly collected scRNAseq datasets and 88 publicly available single-cell (scRNAseq) and single-nucleus (snRNAseq) RNA-sequencing datasets. The resulting dataset includes more than 365,000 cells and spans a wide range of ages, injury, and repair conditions. Together, these data enabled identification of the predominant cell types in skeletal muscle, and resolved cell subtypes, in...
Processed naive T cell single-cell RNA-seq, Seurat object
figshare.com
application/gzip
Updated Jan 5, 2021
Cite
Daniel Bunis (2021). Processed naive T cell single-cell RNA-seq, Seurat object [Dataset]. http://doi.org/10.6084/m9.figshare.11886891.v2
Available download formats: application/gzip
Unique identifier
https://doi.org/10.6084/m9.figshare.11886891.v2
Dataset updated
Jan 5, 2021
Dataset provided by
figshare
Authors
Daniel Bunis
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Processed naive CD4 and CD8 T cell single-cell RNAseq data from human samples. The file contains a Seurat object stored as an .rds file which can be read into R with the readRDS() function. It was generated using the raw data of similar name in this project, as well as the code stored here: https://github.com/dtm2451/ProgressiveHematopoiesis
Data used in SeuratIntegrate paper
zenodo.org
application/gzip, bin +2
Updated May 23, 2025
Cite
Florian Specque; Macha Nikolski; Domitille Chalopin (2025). Data used in SeuratIntegrate paper [Dataset]. http://doi.org/10.5281/zenodo.15496601
Available download formats: bin, pdf, txt, application/gzip
Unique identifier
https://doi.org/10.5281/zenodo.15496601
Dataset updated
May 23, 2025
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Florian Specque; Macha Nikolski; Domitille Chalopin
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This repository gathers the data and code used to generate hepatocellular carcinoma analyses in the paper presenting SeuratIntegrate. It contains the scripts to reproduce the figures presented in the article. Some figures are also available as pdf files.

To be able to fully reproduce the results from the paper, one should:

download all the files

install R 4.3.3, with corresponding base R packages (stats, graphics, grDevices, utils, datasets, methods and base)

install R packages listed in the file sessionInfo.txt

install the provided version of SeuratIntegrate. In an R session, run:

remotes::install_local("path/to/SeuratIntegrate_0.4.1.tar.gz")

install (mini)conda if necessary (we used miniconda version 23.11.0)

install the conda environments (if it fails with the *package-list.yml files, use the *package-list-from-history.yml files instead):

conda env create --file SeuratIntegrate_bbknn_package-list.yml
conda env create --file SeuratIntegrate_scanorama_package-list.yml
conda env create --file SeuratIntegrate_scvi-tools_package-list.yml
conda env create --file SeuratIntegrate_trvae_package-list.yml

open an R session to make the conda environments usable by SeuratIntegrate:

library(SeuratIntegrate)
UpdateEnvCache("bbknn", conda.env = "SeuratIntegrate_bbknn", conda.env.is.path = FALSE)
UpdateEnvCache("scanorama", conda.env = "SeuratIntegrate_scanorama", conda.env.is.path = FALSE)
UpdateEnvCache("scvi", conda.env = "SeuratIntegrate_scvi-tools", conda.env.is.path = FALSE)
UpdateEnvCache("trvae", conda.env = "SeuratIntegrate_trvae", conda.env.is.path = FALSE)

Once done, running the code in integrate.R should produce reproducible results. Note that lines 3 to 6 from integrate.R should be adapted to the user's setup.
integrate.R is subdivided into six main parts:

Preparation: lines 1-56

Preprocessing: lines 58-74

Integration: lines 76-121

Processing of integration outputs: lines 126-267

Scoring of integration outputs: lines 269-353

Plotting: lines 380-507

Intermediate SeuratObjects were saved between steps 3 and 4 (liver10k_integrated_object.RDS) and between steps 5 and 6 (liver10k_integrated_scored_object.RDS). It is possible to start from these intermediate SeuratObjects and skip the preceding steps, provided that the Preparation step is always run first.
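
The line ranges listed for integrate.R can be used to assemble a partial script, for example Preparation plus Plotting when starting from an intermediate object. The following is a minimal Python sketch, assuming the deposited integrate.R is unmodified so that the stated line numbers still apply. `SECTIONS` and `extract_sections` are hypothetical names introduced here for illustration and are not part of SeuratIntegrate; loading the intermediate RDS file is left to the user.

```python
# Line ranges (1-based, inclusive) copied from the description of integrate.R.
SECTIONS = {
    "Preparation": (1, 56),
    "Preprocessing": (58, 74),
    "Integration": (76, 121),
    "Processing of integration outputs": (126, 267),
    "Scoring of integration outputs": (269, 353),
    "Plotting": (380, 507),
}


def extract_sections(script_text, names):
    """Return the lines of the named sections, concatenated in the order given."""
    lines = script_text.splitlines()
    picked = []
    for name in names:
        first, last = SECTIONS[name]
        # Convert the 1-based inclusive range to a Python slice.
        picked.extend(lines[first - 1:last])
    return "\n".join(picked)
```

For instance, `extract_sections(open("integrate.R").read(), ["Preparation", "Plotting"])` would return only the code needed to set up the session and draw the figures, under the assumptions above.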
Repository for Single Cell RNA Sequencing Analysis of The EMT6 Dataset
data.niaid.nih.gov
zenodo.org
Updated Nov 20, 2023
Cite
Stoop, Allart (2023). Repository for Single Cell RNA Sequencing Analysis of The EMT6 Dataset [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_10011621
Dataset updated
Nov 20, 2023
Dataset provided by
Stoop, Allart
Hsu, Jonathan
Description
Table of Contents

1. Main Description
2. File Descriptions
3. Linked Files
4. Installation and Instructions

1. Main Description

This is the Zenodo repository for the manuscript titled "A TCR β chain-directed antibody-fusion molecule that activates and expands subsets of T cells and promotes antitumor activity". The code included in the file titled marengo_code_for_paper_jan_2023.R was used to generate the figures from the single-cell RNA sequencing data. The following libraries are required for script execution:

Seurat
scRepertoire
ggplot2
stringr
dplyr
ggridges
ggrepel
ComplexHeatmap

2. File Descriptions

The code can be downloaded and opened in RStudio.

The "marengo_code_for_paper_jan_2023.R" file contains all the code needed to reproduce the figures in the paper.
The "Marengo_newID_March242023.rds" file is available at the following address: https://zenodo.org/badge/DOI/10.5281/zenodo.7566113.svg (Zenodo DOI: 10.5281/zenodo.7566113).
The "all_res_deg_for_heat_updated_march2023.txt" file contains the unfiltered results from the DGE analysis, also used to create the heatmap with DGE and volcano plots.
The "genes_for_heatmap_fig5F.xlsx" file contains the genes included in the heatmap in figure 5F.

3. Linked Files

This repository contains code for the analysis of a single cell RNA-seq dataset. The dataset contains raw FASTQ files as well as the aligned files that were deposited in GEO. The "Rdata" or "Rds" file was deposited in Zenodo. Descriptions of the linked datasets are provided below:

Gene Expression Omnibus (GEO) ID: GSE223311 (https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE223311)

Title: Gene expression profile at single cell level of CD4+ and CD8+ tumor infiltrating lymphocytes (TIL) originating from the EMT6 tumor model from mSTAR1302 treatment.
Description: This submission contains the "matrix.mtx", "barcodes.tsv", and "genes.tsv" files for each replicate and condition, corresponding to the aligned files for single cell sequencing data.
Submission type: Private. In order to gain access to the repository, you must use a reviewer token (https://www.ncbi.nlm.nih.gov/geo/info/reviewer.html).

Sequence read archive (SRA) repository ID: SRX19088718 and SRX19088719

Title: Gene expression profile at single cell level of CD4+ and CD8+ tumor infiltrating lymphocytes (TIL) originating from the EMT6 tumor model from mSTAR1302 treatment.
Description: This submission contains the raw sequencing (.fastq.gz) files, which are gzip-compressed text files.
Submission type: Private. In order to gain access to the repository, you must use a reviewer token (https://www.ncbi.nlm.nih.gov/geo/info/reviewer.html).

Zenodo DOI: 10.5281/zenodo.7566113 (https://zenodo.org/record/7566113#.ZCcmvC2cbrJ)

Title: A TCR β chain-directed antibody-fusion molecule that activates and expands subsets of T cells and promotes antitumor activity.
Description: This submission contains the "Rdata" or ".Rds" file, which is an R object file. This file is required to run the code.
Submission type: Restricted Access. In order to gain access to the repository, you must contact the author.

4. Installation and Instructions

The code included in this submission requires several essential packages, as listed above. Please follow these instructions for installation:

Ensure you have R version 4.1.2 or higher for compatibility.

Although it is not essential, you can use RStudio (version 2022.12.0+353) for accessing and executing the code.

Download the "Rdata" or ".Rds" file from Zenodo (https://zenodo.org/record/7566113#.ZCcmvC2cbrJ) (Zenodo DOI: 10.5281/zenodo.7566113).

Open RStudio (https://www.rstudio.com/tags/rstudio-ide/) or a similar integrated development environment (IDE) for R.

Set your working directory to where the following files are located:

marengo_code_for_paper_jan_2023.R
Install_Packages.R
Marengo_newID_March242023.rds
genes_for_heatmap_fig5F.xlsx
all_res_deg_for_heat_updated_march2023.txt

You can use the following code to set the working directory in R:

setwd(directory)

Open the file titled "Install_Packages.R" and execute it in your R IDE. This script will attempt to install all the necessary packages and their dependencies in order to set up an environment where the code in "marengo_code_for_paper_jan_2023.R" can be executed.

Once the "Install_Packages.R" script has been successfully executed, restart RStudio or your IDE of choice.

Open the file "marengo_code_for_paper_jan_2023.R" in RStudio or your IDE of choice.

Execute the commands in "marengo_code_for_paper_jan_2023.R" in RStudio or your IDE of choice to generate the plots.
WORKSHOP: Single cell RNAseq analysis in R
explore.openaire.eu
Updated Sep 26, 2023
Cite
Sarah Williams; Adele Barugahare; Paul Harrison; Laura Perlaza Jimenez; Nicholas Matigan; Valentine Murigneux; Magdalena Antczak; Uwe Winter (2023). WORKSHOP: Single cell RNAseq analysis in R [Dataset]. http://doi.org/10.5281/zenodo.10042918
Unique identifier
https://doi.org/10.5281/zenodo.10042918
Dataset updated
Sep 26, 2023
Authors
Sarah Williams; Adele Barugahare; Paul Harrison; Laura Perlaza Jimenez; Nicholas Matigan; Valentine Murigneux; Magdalena Antczak; Uwe Winter
Description
This record includes training materials associated with the Australian BioCommons workshop 'Single cell RNAseq analysis in R'. This workshop took place over two 3.5 hour sessions on 26 and 27 October 2023.

Event description

Analysis and interpretation of single cell RNAseq (scRNAseq) data requires dedicated workflows. In this hands-on workshop we will show you how to perform single cell analysis using Seurat - an R package for QC, analysis, and exploration of single-cell RNAseq data. We will discuss the 'why' behind each step and cover reading in the count data, quality control, filtering, normalisation, clustering, UMAP layout and identification of cluster markers. We will also explore various ways of visualising single cell expression data.

This workshop is presented by the Australian BioCommons, Queensland Cyber Infrastructure Foundation (QCIF) and the Monash Genomics and Bioinformatics Platform with the assistance of a network of facilitators from the national Bioinformatics Training Cooperative.

Lead trainers: Sarah Williams, Adele Barugahare, Paul Harrison, Laura Perlaza Jimenez
Facilitators: Nick Matigan, Valentine Murigneux, Magdalena (Magda) Antczak
Infrastructure provision: Uwe Winter
Coordinator: Melissa Burke

Training materials

Materials are shared under a Creative Commons Attribution 4.0 International agreement unless otherwise specified and were current at the time of the event.

Files and materials included in this record:

Event metadata (PDF): Information about the event including description, event URL, learning objectives, prerequisites, technical requirements etc.
Index of training materials (PDF): List and description of all materials associated with this event including the name, format, location and a brief description of each file.
scRNAseq_Schedule (PDF): A breakdown of the topics and timings for the workshop

Materials shared elsewhere:

This workshop follows the tutorial 'scRNAseq Analysis in R with Seurat' https://swbioinf.github.io/scRNAseqInR_Doco/index.html
Slides used to introduce key topics are available via GitHub https://github.com/swbioinf/scRNAseqInR_Doco/tree/main/slides
This material is based on the introductory Guided Clustering Tutorial from Seurat. It also draws on a similar workshop held by Monash Bioinformatics Platform, Single-Cell-Workshop, with material here.
cellCounts
opal.latrobe.edu.au
researchdata.edu.au
bin
Updated Dec 19, 2022
Cite
Yang Liao; Dinesh Raghu; Bhupinder Pal; Lisa Mielke; Wei Shi (2022). cellCounts [Dataset]. http://doi.org/10.26181/21588276.v3
Available download formats: bin
Unique identifier
https://doi.org/10.26181/21588276.v3
Dataset updated
Dec 19, 2022
Dataset provided by
La Trobe
Authors
Yang Liao; Dinesh Raghu; Bhupinder Pal; Lisa Mielke; Wei Shi
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This page includes the data and code necessary to reproduce the results of the following paper: Yang Liao, Dinesh Raghu, Bhupinder Pal, Lisa Mielke and Wei Shi. cellCounts: fast and accurate quantification of 10x Chromium single-cell RNA sequencing data. Under review.

A Linux computer running CentOS 7 (or later) or Ubuntu 20.04 (or later) is recommended for running this analysis. The computer should have >2 TB of disk space and >64 GB of RAM.

The following software packages need to be installed before running the analysis. Software executables generated after installation should be included in the $PATH environment variable.

R (v4.0.0 or newer) https://www.r-project.org/
Rsubread (v2.12.2 or newer) http://bioconductor.org/packages/3.16/bioc/html/Rsubread.html
CellRanger (v6.0.1) https://support.10xgenomics.com/single-cell-gene-expression/software/overview/welcome
STARsolo (v2.7.10a) https://github.com/alexdobin/STAR
sra-tools (v2.10.0 or newer) https://github.com/ncbi/sra-tools
Seurat (v3.0.0 or newer) https://satijalab.org/seurat/
edgeR (v3.30.0 or newer) https://bioconductor.org/packages/edgeR/
limma (v3.44.0 or newer) https://bioconductor.org/packages/limma/
mltools (v0.3.5 or newer) https://cran.r-project.org/web/packages/mltools/index.html

Reference packages generated by 10x Genomics are also required for this analysis and they can be downloaded from the following link (2020-A version for individual human and mouse reference packages should be selected): https://support.10xgenomics.com/single-cell-gene-expression/software/downloads/latest After all these are done, you can simply run the shell script ‘test-all-new.bash’ to perform all the analyses carried out in the paper. This script will automatically download the mixture scRNA-seq data from the SRA database, and it will output a text file called ‘test-all.log’ that contains all the screen outputs and speed/accuracy results of CellRanger, STARsolo and cellCounts.
Data from: Harnessing single cell RNA sequencing to identify dendritic cell...
zenodo.org
csv
Updated Dec 31, 2022
Cite
Ammar Sabir Cheema; Kaibo Duan; Marc Dalod; Thien-Phong Vu Manh (2022). Harnessing single cell RNA sequencing to identify dendritic cell types, characterize their biological states and infer their activation trajectory [Dataset]. http://doi.org/10.5281/zenodo.5511975
Available download formats: csv
Unique identifier
https://doi.org/10.5281/zenodo.5511975
Dataset updated
Dec 31, 2022
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Ammar Sabir Cheema; Kaibo Duan; Marc Dalod; Thien-Phong Vu Manh
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Summary: Dendritic cells (DCs) orchestrate innate and adaptive immunity, by translating the sensing of distinct danger signals into the induction of different effector lymphocyte responses, to induce different defense mechanisms suited to face distinct types of threats. Hence, DCs are very plastic, which results from two key characteristics. First, DCs encompass distinct cell types specialized in different functions. Second, each DC type can undergo different activation states, fine-tuning its functions depending on its tissue microenvironment and the pathophysiological context, by adapting the output signals it delivers to the input signals it receives. Hence, to better understand DC biology and harness it in the clinic, we must determine which combinations of DC types and activation states mediate which functions, and how.
To decipher the nature, functions and regulation of DC types and their physiological activation states, one of the methods that can be harnessed most successfully is ex vivo single cell RNA sequencing (scRNAseq). However, for new users of this approach, determining which analytics strategy and computational tools to choose can be quite challenging, considering the rapid evolution and broad burgeoning of the field. In addition, awareness must be raised on the need for specific, robust and tractable strategies to annotate cells for cell type identity and activation states. It is also important to emphasize the necessity of examining whether similar cell activation trajectories are inferred by using different, complementary methods. In this chapter, we take these issues into account for providing a pipeline for scRNAseq analysis and illustrating it with a tutorial reanalyzing a public dataset of mononuclear phagocytes isolated from the lungs of naïve or tumor-bearing mice. We describe this pipeline step-by-step, including data quality controls, dimensionality reduction, cell clustering, cell cluster annotation, inference of the cell activation trajectories and investigation of the underpinning molecular regulation. It is accompanied with a more complete tutorial on Github. We anticipate that this method will be helpful for both wet lab and bioinformatics researchers interested in harnessing scRNAseq data for deciphering the biology of DCs or other cell types, and that it will contribute to establishing high standards in the field.

Data:

1. negative_cDC1_relative_signatures.csv : Negative signatures for performing Connectivity Map (cMAP) Analysis

2. positive_cDC1_relative_signatures.csv : Positive signatures for performing Connectivity Map (cMAP) Analysis

Single-cell RNA-Seq and TCR-Seq analysis of PD-1+ CD8+ T-cells responding to...

zenodo.org

bin, csv, zip

Updated Oct 24, 2024

Cite

Bertram Bengsch; Sagar; Zhen Zhang (2024). Single-cell RNA-Seq and TCR-Seq analysis of PD-1+ CD8+ T-cells responding to anti-PD-1 and anti-PD-1/CTLA-4 immunotherapy in melanoma [Dataset]. http://doi.org/10.5281/zenodo.13971562

Available download formats: bin, csv, zip

Unique identifier

https://doi.org/10.5281/zenodo.13971562

Dataset updated

Oct 24, 2024

Dataset provided by

Zenodo

Authors

Bertram Bengsch; Sagar; Zhen Zhang

License

Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically

Description

This dataset details the scRNASeq and TCR-Seq analysis of sorted PD-1+ CD8+ T cells from patients with melanoma treated with checkpoint therapy (anti-PD-1 monotherapy and anti-PD-1 & anti-CTLA-4 combination therapy) at baseline and after the first cycle of therapy. A major publication using this dataset is accessible here: (reference)

*experimental design

Single-cell RNA sequencing was performed using 10x Genomics with feature barcoding technology to multiplex cell samples from different patients undergoing mono or dual therapy, so that they could be loaded on one well to reduce costs and minimize technical variability. Hashtag oligomers (oligos) were obtained purified and already oligo-conjugated in TotalSeq-C format from BioLegend. Cells were thawed and counted, and 20 million cells per patient and time point were used for staining. Cells were stained with barcoded antibodies together with a staining solution containing antibodies against CD3, CD4, CD8, PD-1/IgG4 and fixable viability dye (eBioscience) prior to FACS sorting. Barcoded antibody concentrations used were 0.5 µg per million cells, as recommended by the manufacturer (BioLegend) for flow cytometry applications. After staining, cells were washed twice in PBS containing 2% BSA and 0.01% Tween 20, followed by centrifugation (300 × g for 5 min at 4 °C) and supernatant exchange. After the final wash, cells were resuspended in PBS, filtered through 40 µm cell strainers and sorted. Sorted cells were counted and approximately 75,000 cells were processed through the 10x Genomics single-cell V(D)J workflow according to the manufacturer’s instructions. Gene expression, hashing and TCR libraries were pooled to desired quantities to obtain sequencing depths of 15,000 reads per cell for gene expression libraries and 5,000 reads per cell for hashing and TCR libraries. Libraries were sequenced on a NovaSeq 6000 flow cell in a 2 × 100 paired-end format.

*extract protocol

PBMCs were thawed and counted, and 20 million cells per patient and time point were used for staining. Cells were stained with barcoded antibodies together with a staining solution containing antibodies against CD3, CD4, CD8, PD-1/IgG4 and fixable viability dye (eBioscience) prior to FACS sorting. Barcoded antibody concentrations used were 0.5 µg per million cells, as recommended by the manufacturer (BioLegend) for flow cytometry applications. After staining, cells were washed twice in PBS containing 2% BSA and 0.01% Tween 20, followed by centrifugation (300 × g for 5 min at 4 °C) and supernatant exchange. After the final wash, cells were resuspended in PBS, filtered through 40 µm cell strainers and sorted. Sorted cells were counted and approximately 75,000 cells were processed through the 10x Genomics single-cell V(D)J workflow according to the manufacturer’s instructions.

*library construction protocol

Sorted cells were counted and approximately 75,000 cells were processed through the 10x Genomics single-cell V(D)J workflow according to the manufacturer’s instructions. Gene expression, hashing and TCR libraries were pooled to desired quantities to obtain sequencing depths of 15,000 reads per cell for gene expression libraries and 5,000 reads per cell for hashing and TCR libraries. Libraries were sequenced on a NovaSeq 6000 flow cell in a 2 × 100 paired-end format.
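
The per-cell depths quoted above translate into total read targets by simple multiplication. The sketch below is illustrative only. It assumes that all of the approximately 75,000 processed cells are recovered (real recovery is normally lower, so these are upper bounds) and that the 5,000 reads per cell apply to the hashing and TCR libraries separately. `target_reads` is a hypothetical helper name, not part of any 10x Genomics tool.

```python
# Figures taken from the protocol text; the interpretation is an assumption.
CELLS_PROCESSED = 75_000
READS_PER_CELL = {
    "gene expression": 15_000,
    "hashing": 5_000,
    "TCR": 5_000,
}


def target_reads(cells, reads_per_cell):
    """Return the total read target per library type for a given cell count."""
    return {library: cells * depth for library, depth in reads_per_cell.items()}


targets = target_reads(CELLS_PROCESSED, READS_PER_CELL)
for library, reads in targets.items():
    print(f"{library}: {reads:,} reads")
print(f"total: {sum(targets.values()):,} reads")
```

Under these assumptions the gene expression library needs about 1.125 billion reads and the hashing and TCR libraries about 375 million reads each.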

*library strategy

scRNA-seq and scTCR-seq

*data processing step

Pre-processing of sequencing results to generate count matrices (gene expression and HTO barcode counts) was performed using the 10x Genomics Cell Ranger pipeline.

Further processing was done with Seurat (cell and gene filtering, hashtag identification, clustering, differential gene expression analysis based on gene expression).

*genome build/assembly

Alignment was performed using prebuilt Cell Ranger human reference GRCh38.

*processed data files format and content

RNA counts and HTO counts are in sparse matrix format and TCR clonotypes are in csv format.

Datasets were merged and analyzed by Seurat and the analyzed objects are in rds format.

file name	file checksum
PD1CD8_160421_filtered_feature_bc_matrix.zip	da2e006d2b39485fd8cf8701742c6d77
PD1CD8_190421_filtered_feature_bc_matrix.zip	e125fc5031899bba71e1171888d78205
PD1CD8_160421_filtered_contig_annotations.csv	927241805d507204fbe9ef7045d0ccf4
PD1CD8_190421_filtered_contig_annotations.csv	8ca544d27f06e66592b567d3ab86551e
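
The checksums in the table above are 32 hexadecimal digits long, which is the length of an MD5 digest. The following is a minimal Python sketch for checking downloaded files against them, assuming the values are indeed MD5 (the listing does not name the algorithm). `md5_of_file` and `check_downloads` are hypothetical helper names introduced for illustration.

```python
import hashlib
import os

# Checksums copied from the table above; assumed to be MD5 hex digests.
EXPECTED_MD5 = {
    "PD1CD8_160421_filtered_feature_bc_matrix.zip": "da2e006d2b39485fd8cf8701742c6d77",
    "PD1CD8_190421_filtered_feature_bc_matrix.zip": "e125fc5031899bba71e1171888d78205",
    "PD1CD8_160421_filtered_contig_annotations.csv": "927241805d507204fbe9ef7045d0ccf4",
    "PD1CD8_190421_filtered_contig_annotations.csv": "8ca544d27f06e66592b567d3ab86551e",
}


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def check_downloads(directory):
    """Return 'ok', 'mismatch' or 'missing' for each expected file."""
    report = {}
    for name, expected in EXPECTED_MD5.items():
        path = os.path.join(directory, name)
        if not os.path.exists(path):
            report[name] = "missing"
        elif md5_of_file(path) == expected:
            report[name] = "ok"
        else:
            report[name] = "mismatch"
    return report
```

Calling `check_downloads("path/to/downloads")` returns a dictionary that can be inspected before loading the matrices; the directory path here is a placeholder.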

*processed data file	antibodies/tags
PD1CD8_160421_filtered_feature_bc_matrix.zip	none
PD1CD8_160421_filtered_feature_bc_matrix.zip	TotalSeq™-C0251 anti-human Hashtag 1 Antibody - (HASH_1) - M1_base_monotherapy; TotalSeq™-C0252 anti-human Hashtag 2 Antibody - (HASH_2) - M1_post_monotherapy; TotalSeq™-C0253 anti-human Hashtag 3 Antibody - (HASH_3) - C1_base_combined_therapy; TotalSeq™-C0254 anti-human Hashtag 4 Antibody - (HASH_4) - C1_post_combined_therapy; TotalSeq™-C0255 anti-human Hashtag 5 Antibody - (HASH_5) - C2_base_combined_therapy; TotalSeq™-C0256 anti-human Hashtag 6 Antibody - (HASH_6) - C2_post_combined_therapy
PD1CD8_160421_filtered_contig_annotations.csv	none
PD1CD8_190421_filtered_feature_bc_matrix.zip	none
PD1CD8_190421_filtered_feature_bc_matrix.zip	TotalSeq™-C0251 anti-human Hashtag 1 Antibody - (HASH_1) - M2_base_monotherapy; TotalSeq™-C0252 anti-human Hashtag 2 Antibody - (HASH_2) - M2_post_monotherapy; TotalSeq™-C0253 anti-human Hashtag 3 Antibody - (HASH_3) - M3_base_monotherapy; TotalSeq™-C0254 anti-human Hashtag 4 Antibody - (HASH_4) - M3_post_monotherapy; TotalSeq™-C0255 anti-human Hashtag 5 Antibody - (HASH_5) - C3_base_combined_therapy; TotalSeq™-C0256 anti-human Hashtag 6 Antibody - (HASH_6) - C3_post_combined_therapy
PD1CD8_190421_filtered_contig_annotations.csv	none
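
The hashtag assignments in the table above can be captured as a lookup for annotating demultiplexed cells. The following Python sketch copies the labels from the table; the run keys ("160421", "190421") are taken from the file names, and reading "base" as baseline and "post" as after the first therapy cycle is an assumption based on the dataset description. `HASHTAG_SAMPLES` and `describe_sample` are hypothetical names introduced for illustration.

```python
# Hashtag-to-sample labels copied from the table, keyed by run (from file names).
HASHTAG_SAMPLES = {
    "160421": {
        "HASH_1": "M1_base_monotherapy",
        "HASH_2": "M1_post_monotherapy",
        "HASH_3": "C1_base_combined_therapy",
        "HASH_4": "C1_post_combined_therapy",
        "HASH_5": "C2_base_combined_therapy",
        "HASH_6": "C2_post_combined_therapy",
    },
    "190421": {
        "HASH_1": "M2_base_monotherapy",
        "HASH_2": "M2_post_monotherapy",
        "HASH_3": "M3_base_monotherapy",
        "HASH_4": "M3_post_monotherapy",
        "HASH_5": "C3_base_combined_therapy",
        "HASH_6": "C3_post_combined_therapy",
    },
}


def describe_sample(run, hashtag):
    """Split a sample label into patient, time point and therapy fields."""
    label = HASHTAG_SAMPLES[run][hashtag]
    # Labels look like "<patient>_<timepoint>_<therapy>"; the therapy part
    # may itself contain an underscore, so split at most twice.
    patient, timepoint, therapy = label.split("_", 2)
    return {"patient": patient, "timepoint": timepoint, "therapy": therapy}
```

For example, `describe_sample("160421", "HASH_3")` yields patient "C1", time point "base" and therapy "combined_therapy".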

Integrated data with normalized counts
figshare.com
hdf
Updated Dec 15, 2021
Cite
Kai Huang (2021). Integrated data with normalized counts [Dataset]. http://doi.org/10.6084/m9.figshare.17203928.v1
Available download formats: hdf
Unique identifier
https://doi.org/10.6084/m9.figshare.17203928.v1
Dataset updated
Dec 15, 2021
Dataset provided by
Figshare (http://figshare.com/)
Authors
Kai Huang
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This dataset contains the integrated and annotated Seurat object used in the paper's analyses. It contains both raw and normalized counts as well as cell-level metadata, and facilitates direct analysis. Raw data from the original studies can be found at GSE145926 and https://www.covid19cellatlas.org/#wilk20
Single-cell Atlas Reveals Diagnostic Features Predicting Progressive Drug...
explore.openaire.eu
data.niaid.nih.gov
Updated Aug 6, 2021
Cite
Vaidehi Krishnan; Florian Schmidt; Zahid Nawaz; Prasanna Nori Venkatesh; Lee Kian Leong; Chan Zhu En; Alice Man Sze Cheung; Sudipto Bari; Meera Makheja; Ahmad Lajam; Pavanish Kumar; John Ouyang; Owen Rackham; William Ying Khee Hwang; Salvatore Albani; Charles Chuah; Shyam Prabhakar; Sin Tiong Ong (2021). Single-cell Atlas Reveals Diagnostic Features Predicting Progressive Drug Resistance in Chronic Myeloid Leukemia [Dataset]. http://doi.org/10.5281/zenodo.7337398
Explore at:
Unique identifier
https://doi.org/10.5281/zenodo.7337398
Dataset updated
Aug 6, 2021
Authors
Vaidehi Krishnan; Florian Schmidt; Zahid Nawaz; Prasanna Nori Venkatesh; Lee Kian Leong; Chan Zhu En; Alice Man Sze Cheung; Sudipto Bari; Meera Makheja; Ahmad Lajam; Pavanish Kumar; John Ouyang; Owen Rackham; William Ying Khee Hwang; Salvatore Albani; Charles Chuah; Shyam Prabhakar; Sin Tiong Ong
Description
This archive contains scRNA-seq and CyTOF data in the form of Seurat objects, txt and csv files, as well as R scripts for data analysis and figure generation. A summary of the content is provided below.

R scripts
- Script to run machine learning models predicting group-specific marker genes: CML_Find_Markers_Zenodo.R
- Script to reproduce the majority of Main and Supplementary Figures shown in the manuscript: CML_Paper_Figures_Zenodo.R
- Script to run inferCNV analysis: inferCNV_Zenodo.R
- Script to plot NATMI analysis results: NATMI_CvsA_FC0.32_Updown_Column_plot_Zenodo.R
- Script to conduct sub-clustering and filtering of NK cells: NK_Marker_Detection_Zenodo.R
- Helper scripts for plotting and DEG calculation: ComputePairWiseDE_v2.R, Seurat_DE_Heatmap_RCA_Style.R

RDS files

General scRNA-seq Seurat objects:
- scRNA-seq Seurat object after QC and cell type annotation, used for most analyses in the manuscript: DUKE_DataSet_Doublets_Removed_Relabeled.RDS
- scRNA-seq object including findings, e.g. from the NK analysis, used in the shiny app: DUKE_final_for_Shiny_App.rds
- Neighborhood enrichment score computed for group A across all HSPCs: Enrichment_score_global_groupA.RDS
- UMAP coordinates used in the article: Layout_2D_nNeighbours_25_Metric_cosine_TCU_removed.RDS

SCENIC files:
- Regulon set used in SCENIC: 2.6_regulons_asGeneSet.Rds
- AUC values computed for regulons: 3.4_regulonAUC.Rds
- Metadata used in SCENIC: cellInfo.Rds
- Group-specific regulons for LSC: groupSpecificRegulonsBCRAblP.RDS
- Patient-specific regulons for LSC: patientSpecificRegulonsBCRAblP.RDS
- Patient specificity score for LSC: PatientSpecificRegulonSpecificityScoreBCRAblP.RDS
- Regulon specificity score for LSC: RegulonSpecificityScoreBCRAblP.RDS

BCR-ABL1 inference:
- HSC with inferred BCR-ABL1 label: HSCs_CML_with_BCR-Abl_label.RDS
- UMAP for HSC with inferred BCR-ABL1 label: HSCs_CML_with_BCR-Abl_label_UMAP.RDS
- HSPCs with BCR-ABL1 module scores: HSPC_metacluster_74K_with_modscore_27thmay.RDS

NK sub-clustering and filtering:
- NK object with module scores: NK_8617cells_with_modscore_1stjune.RDS
- Feature genes for NK cells computed with DubStepR: NK_Cells_DubStepR
- NK cells Seurat object excluding contaminating T and B cells: NK_cells_T_B_17_removed.RDS
- NK Seurat object including neighbourhood enrichment score calculations: NK_seurat_object_with_enrichment_labels_V2.RDS

txt and csv files:
- Proportions per cluster calculated from CyTOF: CyTOF_Proportions.txt
- Correlation between scRNAseq and CyTOF cell type abundance: scRNAseq_Cor_Cytof.txt
- Correlation between manual gating and FlowSOM clustering: Manual_vs_FlowSOM.txt
- GSEA results for HSPC, HSC and LSC: FINAL_GSEA_DATA_For_GGPLOT.txt
- GSEA results for NK: NK_For_Plotting.txt
- TFRC and HLA expression: TFRC_and_HLA_Values.txt
- NATMI result files: UP-regulated_mean.csv, DOWN-regulated_mean.csv
- Gene position file used in inferCNV: inferCNV_gene_positions_hg38.txt
- Module scores for NK subclusters per cell: NK_Supplementary_Module_Scores.csv

Compressed folders:
- All CyTOF raw data files: CyTOF_Data_raw.zip
- Results of the patient-based classifier: PatientwiseClassifier.zip
- Results of the single-cell based classifier: SingleCellClassifierResults.zip

For new analyses, we recommend that readers use the Seurat object stored in DUKE_final_for_Shiny_App.rds, or use the shiny app (http://scdbm.ddnetbio.com/) and perform further analysis from there. Raw data is available at EGA upon request using Study ID: EGAS00001005509

Revision
The for_CML_manuscript_revision.tar.gz folder contains scripts and data for the paper revision, including 1) detection of the BCR-ABL fusion with long read sequencing; 2) identification of BCR-ABL junction reads with scRNAseq; 3) detection of expressed mutations using scRNAseq.
Seurat objects for multiome analysis of neuroblastoma cell lines - 4/4
data.mendeley.com
Updated Jul 25, 2024
Cite
Richard Guyer (2024). Seurat objects for multiome analysis of neuroblastoma cell lines - 4/4 [Dataset]. http://doi.org/10.17632/cp4d7t74vb.1
Explore at:
Unique identifier
https://doi.org/10.17632/cp4d7t74vb.1
Dataset updated
Jul 25, 2024
Authors
Richard Guyer
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
RDS files containing processed Seurat objects for multiome analysis of neuroblastoma cell lines. File names reflect the cell line.
ProjecTILs murine reference atlas of tumor-infiltrating T cells, version 1
figshare.com
application/gzip
Updated Jun 29, 2023
Cite
Massimo Andreatta; Santiago Carmona (2023). ProjecTILs murine reference atlas of tumor-infiltrating T cells, version 1 [Dataset]. http://doi.org/10.6084/m9.figshare.12478571.v2
Explore at:
Available download formats: application/gzip
Unique identifier
https://doi.org/10.6084/m9.figshare.12478571.v2
Dataset updated
Jun 29, 2023
Dataset provided by
figshare
Authors
Massimo Andreatta; Santiago Carmona
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
We have developed ProjecTILs, a computational approach to project new data sets into a reference map of T cells, enabling their direct comparison in a stable, annotated system of coordinates. Because new cells are embedded in the same space of the reference, ProjecTILs enables the classification of query cells into annotated, discrete states, but also over a continuous space of intermediate states. By comparing multiple samples over the same map, and across alternative embeddings, the method allows exploring the effect of cellular perturbations (e.g. as the result of therapy or genetic engineering) and identifying genetic programs significantly altered in the query compared to a control set or to the reference map. We illustrate the projection of several data sets from recent publications over two cross-study murine T cell reference atlases: the first describing tumor-infiltrating T lymphocytes (TILs), the second characterizing acute and chronic viral infection.

To construct the reference TIL atlas, we obtained single-cell gene expression matrices from the following GEO entries: GSE124691, GSE116390, GSE121478, GSE86028; and entry E-MTAB-7919 from Array-Express. Data from GSE124691 contained samples from tumor and from tumor-draining lymph nodes, and were therefore treated as two separate datasets. For the TIL projection examples (OVA Tet+, miR-155 KO and Regnase-KO), we obtained the gene expression counts from entries GSE122713, GSE121478 and GSE137015, respectively.

Prior to dataset integration, single-cell data from individual studies were filtered using TILPRED-1.0 (https://github.com/carmonalab/TILPRED), which removes cells not enriched in T cell markers (e.g. Cd2, Cd3d, Cd3e, Cd3g, Cd4, Cd8a, Cd8b1) and cells enriched in non T cell genes (e.g. Spi1, Fcer1g, Csf1r, Cd19). Dataset integration was performed using STACAS (https://github.com/carmonalab/STACAS), a batch-correction algorithm based on Seurat 3.

For the TIL reference map, we specified 600 variable genes per dataset, excluding cell cycling genes, mitochondrial, ribosomal and non-coding genes, as well as genes expressed in less than 0.1% or more than 90% of the cells of a given dataset. For integration, a total of 800 variable genes were derived as the intersection of the 600 variable genes of individual datasets, prioritizing genes found in multiple datasets and, in case of draws, those derived from the largest datasets. We determined pairwise dataset anchors using STACAS with default parameters, and filtered anchors using an anchor score threshold of 0.8. Integration was performed using the IntegrateData function in Seurat 3, providing the anchor set determined by STACAS, and a custom integration tree to initiate alignment from the largest and most heterogeneous datasets.

Next, we performed unsupervised clustering of the integrated cell embeddings using the Shared Nearest Neighbor (SNN) clustering method implemented in Seurat 3 with parameters {resolution=0.6, reduction=”umap”, k.param=20}. We then manually annotated individual clusters (merging clusters when necessary) based on several criteria: i) average expression of key marker genes in individual clusters; ii) gradients of gene expression over the UMAP representation of the reference map; iii) gene-set enrichment analysis to determine over- and under-expressed genes per cluster using MAST. In order to have access to predictive methods for UMAP, we recomputed PCA and UMAP embeddings independently of Seurat 3 using respectively the prcomp function from the base R package “stats”, and the “umap” R package (https://github.com/tkonopka/umap).
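
The gene prioritization described above (prefer genes found in multiple datasets, break draws by dataset size) can be illustrated with a small sketch. This is not the STACAS implementation. It is one plausible reading, under the assumption that each gene is ranked first by the number of datasets listing it as variable and then by the size of the largest such dataset. The function name, its arguments, and the toy inputs are hypothetical.

```python
from collections import defaultdict


def select_integration_genes(variable_genes, dataset_sizes, n_genes):
    """Rank genes by how many datasets list them, then by largest dataset size.

    variable_genes: {dataset_name: [gene, ...]} per-dataset variable genes
    dataset_sizes:  {dataset_name: number_of_cells}
    n_genes:        number of genes to keep for integration
    """
    n_datasets = defaultdict(int)  # in how many datasets each gene is variable
    largest = defaultdict(int)     # size of the largest dataset listing the gene
    for dataset, genes in variable_genes.items():
        for gene in set(genes):
            n_datasets[gene] += 1
            largest[gene] = max(largest[gene], dataset_sizes[dataset])
    # The gene name is the final sort key so the ordering is deterministic.
    ranked = sorted(n_datasets, key=lambda g: (-n_datasets[g], -largest[g], g))
    return ranked[:n_genes]
```

With the numbers from the description, this would be called with 600 genes per dataset and `n_genes=800`; the toy call `select_integration_genes({"A": ["Cd8a", "Tox"], "B": ["Cd8a"]}, {"A": 5000, "B": 1000}, 1)` keeps only `Cd8a`, the gene shared by both datasets.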
Test Data for Galaxy tutorial "Batch Correction and Integration" - Seurat...
ordo.open.ac.uk
zenodo.org
bin
Updated Apr 28, 2025
Cite
Marisa Loach (2025). Test Data for Galaxy tutorial "Batch Correction and Integration" - Seurat version [Dataset]. http://doi.org/10.5281/zenodo.14713816
Explore at:
Available download formats: bin
Unique identifier
https://doi.org/10.5281/zenodo.14713816
Dataset updated
Apr 28, 2025
Dataset provided by
The Open University
Authors
Marisa Loach
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This data is used for the Seurat version of the batch correction and integration tutorial on the Galaxy Training Network. The input data was provided by Seurat in the 'Integrative Analysis in Seurat v5' tutorial. The input dataset provided here has been filtered to include only cells for which nFeature_RNA > 1000. The other datasets were produced on Galaxy. The original dataset was published as: Ding, J., Adiconis, X., Simmons, S.K. et al. Systematic comparison of single-cell and single-nucleus RNA-sequencing methods. Nat Biotechnol 38, 737–746 (2020). https://doi.org/10.1038/s41587-020-0465-8.
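
The nFeature_RNA > 1000 filter mentioned above keeps cells in which more than 1000 genes have a non-zero count. As a minimal sketch of that logic outside Seurat, assuming each cell is represented as a plain dictionary of gene counts (the data layout and function names are hypothetical):

```python
def n_features(counts_for_cell):
    """Number of genes detected (count > 0) in one cell."""
    return sum(1 for count in counts_for_cell.values() if count > 0)


def filter_cells(cells, min_features=1000):
    """Keep cells whose detected-gene count is strictly above the threshold.

    cells: {barcode: {gene: count}}
    """
    return {
        barcode: counts
        for barcode, counts in cells.items()
        if n_features(counts) > min_features
    }
```

The comparison is strict, matching the `>` in the description, so a cell with exactly 1000 detected genes is dropped.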
Single-cell RNA-seq in monozygotic twins discordant for multiple sclerosis
seek.synergy-munich.de
Updated Oct 11, 2024
Cite
(2024). Single-cell RNA-seq in monozygotic twins discordant for multiple sclerosis [Dataset]. https://seek.synergy-munich.de/data_files/18
Explore at:
Dataset updated
Oct 11, 2024
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
scRNA-seq. Library preparation: Smart-seq2; alignment: to UCSC hg38 using HISAT2; gene counts: featureCounts (Liao et al.); single-cell analysis: Seurat // PRJNA513835 - SRP180896
Data from: A single-cell atlas characterizes dysregulation of the bone...
zenodo.org
Updated Jan 14, 2025
Cite
William Pilcher; William Pilcher (2025). A single-cell atlas characterizes dysregulation of the bone marrow immune microenvironment associated with outcomes in multiple myeloma [Dataset]. http://doi.org/10.5281/zenodo.14624955
Explore at:
Unique identifier
https://doi.org/10.5281/zenodo.14624955
Dataset updated
Jan 14, 2025
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
William Pilcher; William Pilcher
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Time period covered
May 8, 2024
Description
This repository contains R Seurat objects associated with our study titled "A single-cell atlas characterizes dysregulation of the bone marrow immune microenvironment associated with outcomes in multiple myeloma".

Single cell data contained within this object comes from MMRF Immune Atlas Consortium work.

The .rds files contain a Seurat object saved with version 4.3. This can be loaded in R with the readRDS command.

Two .RDS files are included in this version of the release.

Discovery object: MMRF_ImmuneAtlas_Full_With_Corrected_Censored_Metadata.rds contains all aliquots belonging to the 'discovery' cohort as used in the initial paper. This represents the dataset used for initial clustering, cell annotation, and analysis.

Discovery + Validation object: COMBINED_VALIDATION_MMRF_ImmuneAtlas_Full_Censored_Metadata.rds contains both aliquots belonging to the initial 'discovery' cohort, and aliquots belonging to the 'validation' cohort. The group each cell is derived from is listed under the 'cohort' variable. Labels related to cell annotation, including doublet status, are derived from a label transfer process as described in the paper. Labels for the original 'discovery' cohort are unchanged. UMAPs have been reconstructed with both the discovery and validation cohorts integrated.

--

The discovery object contains two assays:

"RNA" - The raw count matrix

"RNA_Batch_Corrected" - Counts adjusted for the combination of 'Study_Site' and 'Batch'.

Analysis should prefer the original RNA assay, unless using pipelines which do not support adjusting for technical covariates.

Currently, the validation object only includes the uncorrected RNA assay.

--

The object contains two umaps in the reduction slot:

umap - will render the UMAP for the full object with all cells.

umap.sub - contains the UMAP embeddings for individual 'compartments', as indicated by 'subcluster_V03072023_compartment'

--

Each sample has three different identifiers:

public_id

Indicates a specific patient (n=263).

MMRF_####

This is a standard identifier which is used across all MMRF CoMMpass datasets

public_ids can map to multiple d_visit_specimen_ids and aliquot_ids

As of now, all public_ids have a single sample collected at Baseline.

This can be accessed by filtering for 'collection_event' %in% c("Baseline", "Screening") or VJ_INTERVAL == 'Baseline'

d_visit_specimen_id

Indicates a specific visit by a patient (n=358)

MMRF_####_Y

Y is a number indicating that this is the Y-th sample obtained from the patient. This does not correspond to a specific timepoint.

This is a standard identifier, which is used across all MMRF CoMMpass datasets

The purpose of the visit is indicated in 'collection_event' (Baseline, Relapse, Remission, etc.). The approximate interval the visit corresponds to is in "VJ_INTERVAL"

d_visit_specimen_id uniquely maps to one public_id

d_visit_specimen_id can map to multiple aliquot_ids

aliquot_id

Refers to the specific bone marrow aliquot sample processed (n=361)

MMRFA-######

This is a unique identifier for each processed scRNA-seq sample.

As of now, this uniquely maps to a combination of d_visit_specimen_id, Study_Site, and Batch

As of now, this is an identifier specific to the MMRF ImmuneAtlas
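
The identifier hierarchy above (one public_id, several d_visit_specimen_ids per patient) and the baseline filter can be sketched as follows. This assumes that each '#' in the patterns stands for a single digit and that per-sample metadata is available as plain dictionaries. The example IDs are invented for illustration and the function names are hypothetical.

```python
import re

# Assumption: '#' in MMRF_#### stands for one digit.
VISIT_ID = re.compile(r"^(MMRF_\d{4})_(\d+)$")


def public_id_from_visit(d_visit_specimen_id):
    """Recover the patient-level public_id from a visit-level identifier."""
    match = VISIT_ID.match(d_visit_specimen_id)
    if match is None:
        raise ValueError(f"unexpected d_visit_specimen_id: {d_visit_specimen_id!r}")
    return match.group(1)


def is_baseline(sample):
    """Mirror the filter quoted above: collection_event in Baseline/Screening,
    or VJ_INTERVAL equal to 'Baseline'."""
    return (
        sample.get("collection_event") in {"Baseline", "Screening"}
        or sample.get("VJ_INTERVAL") == "Baseline"
    )


# Invented example rows, not taken from the dataset.
samples = [
    {"d_visit_specimen_id": "MMRF_0001_1", "collection_event": "Baseline", "VJ_INTERVAL": "Baseline"},
    {"d_visit_specimen_id": "MMRF_0001_2", "collection_event": "Relapse/Progression", "VJ_INTERVAL": "Year 2"},
]
baseline_patients = {public_id_from_visit(s["d_visit_specimen_id"]) for s in samples if is_baseline(s)}
```

Because each d_visit_specimen_id maps to exactly one public_id, stripping the trailing visit number is enough to group visits by patient under this assumed format.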

Each cell has the following annotation information:

subcluster_V03072023

These refer to an individual cluster derived from 'Seurat'.

Format is 'Compartment'.'Compartment-cluster'.'Compartment-subcluster'

For example, 'NkT.2.2' indicates that this cell is in the 'Natural Killer + T Cell' compartment, was originally part of 'Cluster 2', and was then further separated into the refined subcluster '2.2'.

If a parent cluster did not need to be further separated, the 'Compartment-subcluster' part is omitted (e.g., 'NkT.6')

As of now, this uniquely maps to a specific cellID_short annotation.

Clustering was done on a per compartment basis

For most immune cell types, clustering was based on embeddings corrected for 'siteXbatch'. For Plasma, clustering was performed on embeddings corrected on a per-sample basis.

In the combined validation object, DISCOVERY.subcluster_V03072023 will contain values only for the discovery cohort, and have NA values for validation samples.

subcluster_V03072023_compartment

These refer to one of five major compartments as identified roughly on the original UMAP. Clustering was performed on a per-compartment basis following a first pass rough annotation.

The possible compartments are

NkT (T cell + Natural Killer Cells)

Myeloid (Monocytes, Macrophages, Dendritic cells, Neutrophil/Granulocyte populations)

BEry (B Cell, Erythroblasts, bone marrow progenitor populations, pDCs)

Ery (Erythrocyte population)

Plasma (Plasma cell populations)

Each compartment has its own UMAP generated, which can be accessed in the 'umap.sub' reduction

One cluster was isolated from all other populations, and was not assigned to a compartment. This cluster is labeled as 'Full.23'.

In the combined validation object, DISCOVERY.subcluster_V03072023_compartment will contain values only for the discovery cohort, and have NA values for validation samples.

cellID_short

This is the individual annotation for each cluster.

Please see the 'Cell Population Annotation Dictionary' for further details.

If different seurat clusters were assigned similar annotations, the celltype annotation will be appended with a distinct cluster gene, or with '_b', '_c'

lineage_group

This is an annotation driven grouping of clusters into major immune populations, as shown in Figure 2.

This includes "CD8", "CD4", "M" (Myeloid), "B" (B cell), "E" (Erythroid), "P" (Plasma), "Other" (HSC, Fibro, pDC_a), "LQ" (Doublet)

isDoublet

This is a binary 'True' or 'False' derived from manual review of clusters following doublet analysis, as described in the paper.

True indicates the cluster was determined to be a doublet population.

This is derived from 'doublet_pred', in which 'dblet_cluster' and 'poss_dblet_cluster' were flagged as doublet populations for subsequent analysis.

In the validation object, the doublet status of new samples was inferred from label transfer: a cell is flagged as a doublet if label transfer from the discovery cohort mapped it to one of the previously identified doublet populations. The raw doublet scores from DoubletFinder, Pegasus, or Scrublet are not included in this release.
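
The 'Compartment'.'Compartment-cluster'.'Compartment-subcluster' label format described in the annotation list above can be split into its parts. This is a minimal sketch, assuming the parts are always separated by periods and that the third part is absent when a parent cluster was not subdivided. The `Subcluster` type and `parse_subcluster` function are hypothetical names.

```python
from typing import NamedTuple, Optional


class Subcluster(NamedTuple):
    compartment: str           # e.g. 'NkT', 'Myeloid', 'BEry', 'Ery', 'Plasma', or 'Full'
    cluster: str               # cluster number within the compartment
    subcluster: Optional[str]  # refinement index, or None if the cluster was not split


def parse_subcluster(label):
    """Split a subcluster_V03072023 label such as 'NkT.2.2' or 'NkT.6'."""
    parts = label.split(".")
    if len(parts) == 2:
        return Subcluster(parts[0], parts[1], None)
    if len(parts) == 3:
        return Subcluster(parts[0], parts[1], parts[2])
    raise ValueError(f"unexpected subcluster label: {label!r}")
```

For instance, `parse_subcluster("NkT.2.2")` yields compartment `NkT`, cluster `2`, and subcluster `2`, while the unassigned cluster `Full.23` parses with compartment `Full` and no subcluster.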

--

Each sample has the following information indicating shipment batches, for batch correction

Study_Site

The center which processed a specific aliquot_id

EMORY, MSSM, WashU, MAYO

Batch

The shipment batch the sample was associated with

Valued 1 to 3 for EMORY, MSSM, MAYO, and 1 to 4 for WashU

siteXbatch

A combination of the above two variables, to be used for batch correction

(Combined Validation Object only): cohort

Indicates if the sample was involved in the 'discovery' cohort, or 'validation' cohort. Samples in the 'validation' cohort will have labels inferred from label mapping
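
The siteXbatch variable combines Study_Site and Batch into one batch-correction factor. The sketch below shows one way to build and validate such a key. The underscore separator is an assumption, since the actual encoding used in the object is not stated here. The batch ranges follow the description above and the function name is hypothetical.

```python
# Highest batch number per processing center, per the description above.
MAX_BATCH = {"EMORY": 3, "MSSM": 3, "MAYO": 3, "WashU": 4}


def site_x_batch(study_site, batch):
    """Combine site and shipment batch into a single factor level.

    The '<site>_<batch>' format is assumed for illustration only.
    """
    if study_site not in MAX_BATCH:
        raise ValueError(f"unknown Study_Site: {study_site!r}")
    if not 1 <= batch <= MAX_BATCH[study_site]:
        raise ValueError(f"Batch {batch} out of range for {study_site}")
    return f"{study_site}_{batch}"


# All site/batch combinations implied by the stated ranges.
ALL_LEVELS = [
    site_x_batch(site, batch)
    for site, top in MAX_BATCH.items()
    for batch in range(1, top + 1)
]
```

Under the stated ranges there are 3 + 3 + 3 + 4 = 13 possible levels, which is the number of batch groups a correction method would see if every combination is populated.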

--

Each public_id has limited demographic information based on publicly available information in the MMRF CoMMpass study.

d_pt_sex

Patient sex (not self-identified). Male or Female

d_pt_race_1

Patient self-identified race

d_pt_ethnicity

Patient self-identified ethnicity

d_dx_amm_age

Patient age at diagnosis.

Not reported for patients above 90 at diagnosis

d_dx_amm_bmi

Patient BMI at diagnosis

d_pt_height_cm

Patient height at diagnosis, in centimeters.

d_dx_amm_weight_kg

Patient weight at diagnosis, in kilograms

Each d_visit_specimen_id has two data points providing limited information about the visit

collection_event

Description of why the sample was collected

e.g., 'Baseline' and 'Screening' indicate the sample was obtained prior to therapy

'Relapse/Progression' indicates the sample was collected due to disease progression based on clinical assessment

'Remission/Response' indicates the sample was collected due to patient entering remission based on clinical assessment

Samples may be collected for reasons independent of the above, such as 'Pre' or 'Post' ASCT, or for other unspecified reasons

VJ_INTERVAL

Indicates the rough interval following start of therapy the sample is assigned to

"Baseline", "Month 3", "Year 2", etc.

All the single-cell raw data, along with outcome and cytogenetic information, is available at MMRF’s VLAB shared resource. Requests to access these data will be reviewed by the data access committee at MMRF, and any data shared will be released under a data transfer agreement that will protect the identities of patients involved in the study. Other information from the CoMMpass trial can also generally be
Repository for the single cell RNA sequencing data analysis for the human...
explore.openaire.eu
Updated Aug 26, 2023
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Jonathan; Andrew; Pierre; Allart; Adrian (2023). Repository for the single cell RNA sequencing data analysis for the human manuscript. [Dataset]. http://doi.org/10.5281/zenodo.8286134
Explore at:
Unique identifier
https://doi.org/10.5281/zenodo.8286134
Dataset updated
Aug 26, 2023
Authors
Jonathan; Andrew; Pierre; Allart; Adrian
Description
This is the GitHub repository for the single cell RNA sequencing data analysis for the human manuscript.

The following essential libraries are required for script execution:
- Seurat
- scRepertoire
- ggplot2
- dplyr
- ggridges
- ggrepel
- ComplexHeatmap

Linked File:
--------------------------------------
This repository contains code for the analysis of a single cell RNA-seq dataset. The dataset contains raw FASTQ files, as well as the aligned files that were deposited in GEO. Provided below are descriptions of the linked datasets:

1. Gene Expression Omnibus (GEO) ID: GSE229626
- Title: Gene expression profile at single cell level of human T cells stimulated via antibodies against the T Cell Receptor (TCR)
- Description: This submission contains the matrix.mtx, barcodes.tsv, and genes.tsv files for each replicate and condition, corresponding to the aligned files for single cell sequencing data.
- Submission type: Private. In order to gain access to the repository, you must use a "reviewer token" (https://www.ncbi.nlm.nih.gov/geo/info/reviewer.html).

2. Sequence read archive (SRA) repository
- Title: Gene expression profile at single cell level of human T cells stimulated via antibodies against the T Cell Receptor (TCR)
- Description: This submission contains the "raw sequencing" or .fastq.gz files, which are tab delimited text files.
- Submission type: Private. In order to gain access to the repository, you must use a "reviewer token" (https://www.ncbi.nlm.nih.gov/geo/info/reviewer.html).

Please note that since the GSE submission is private, the raw data deposited at SRA may not be accessible until the embargo on GSE229626 has been lifted.

Installation and Instructions
--------------------------------------
The code included in this submission requires several essential packages, as listed above. Please follow these instructions for installation:

> Ensure you have R version 4.1.2 or higher for compatibility.
> Although it is not essential, you can use RStudio (Version 2022.12.0+353) for accessing and executing the code. The following code can be used to set the working directory in R:
> setwd(directory)

Steps:
1. Download the "Human_code_April2023.R" and "Install_Packages.R" R scripts, and the processed data from GSE229626.
2. Open RStudio (https://www.rstudio.com/tags/rstudio-ide/) or a similar integrated development environment (IDE) for R.
3. Set your working directory to where the following files are located:
- Human_code_April2023.R
- Install_Packages.R
4. Open the file titled Install_Packages.R and execute it in the R IDE. This script will attempt to install all the necessary packages and their dependencies.
5. Open the Human_code_April2023.R R script and execute commands as necessary.
Droplet-based, high-throughput single cell transcriptional analysis of adult...
figshare.com
Updated Mar 6, 2019
Cite
Sarthak Sinha; Jo Anne Stratton (2019). Droplet-based, high-throughput single cell transcriptional analysis of adult mouse tissue using 10X Genomics' Chromium Single Cell 3' (v2) system: From tissue preparation to bioinformatic analysis [Dataset]. http://doi.org/10.6084/m9.figshare.6626927.v1
Explore at:
Unique identifier
https://doi.org/10.6084/m9.figshare.6626927.v1
Dataset updated
Mar 6, 2019
Dataset provided by
figshare
Authors
Sarthak Sinha; Jo Anne Stratton
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
The attached R scripts supplement our protocol paper currently under editorial review at the Journal of Visualized Experiments.

Scope of the article: This protocol describes the general processes and quality control checks necessary for preparing healthy adult single cells for droplet-based, high-throughput single cell RNA-Seq analysis using the 10X Genomics' Chromium System. We also describe sequencing parameters, alignment and downstream single-cell bioinformatic analysis.
