60 datasets found
  1. Data from: Large-scale integration of single-cell transcriptomic data...

    • data.niaid.nih.gov
    • data-staging.niaid.nih.gov
    • +2 more
    zip
    Updated Dec 14, 2021
    Cite
    David McKellar; Iwijn De Vlaminck; Benjamin Cosgrove (2021). Large-scale integration of single-cell transcriptomic data captures transitional progenitor states in mouse skeletal muscle regeneration [Dataset]. http://doi.org/10.5061/dryad.t4b8gtj34
    Dataset provided by
    Cornell University
    Authors
    David McKellar; Iwijn De Vlaminck; Benjamin Cosgrove
    License

    https://spdx.org/licenses/CC0-1.0.html

    Description

    Skeletal muscle repair is driven by the coordinated self-renewal and fusion of myogenic stem and progenitor cells. Single-cell gene expression analyses of myogenesis have been hampered by the poor sampling of rare and transient cell states that are critical for muscle repair, and do not inform the spatial context that is important for myogenic differentiation. Here, we demonstrate how large-scale integration of single-cell and spatial transcriptomic data can overcome these limitations. We created a single-cell transcriptomic dataset of mouse skeletal muscle by integration, consensus annotation, and analysis of 23 newly collected scRNAseq datasets and 88 publicly available single-cell (scRNAseq) and single-nucleus (snRNAseq) RNA-sequencing datasets. The resulting dataset includes more than 365,000 cells and spans a wide range of ages, injury, and repair conditions. Together, these data enabled identification of the predominant cell types in skeletal muscle, and resolved cell subtypes, including endothelial subtypes distinguished by vessel-type of origin, fibro/adipogenic progenitors defined by functional roles, and many distinct immune populations. The representation of different experimental conditions and the depth of transcriptome coverage enabled robust profiling of sparsely expressed genes. We built a densely sampled transcriptomic model of myogenesis, from stem cell quiescence to myofiber maturation and identified rare, transitional states of progenitor commitment and fusion that are poorly represented in individual datasets. We performed spatial RNA sequencing of mouse muscle at three time points after injury and used the integrated dataset as a reference to achieve a high-resolution, local deconvolution of cell subtypes. We also used the integrated dataset to explore ligand-receptor co-expression patterns and identify dynamic cell-cell interactions in muscle injury response. 
We provide a public web tool to enable interactive exploration and visualization of the data. Our work supports the utility of large-scale integration of single-cell transcriptomic data as a tool for biological discovery.

    Methods

    Mice. The Cornell University Institutional Animal Care and Use Committee (IACUC) approved all animal protocols, and experiments were performed in compliance with its institutional guidelines. Adult C57BL/6J mice (Mus musculus) were obtained from Jackson Laboratories (#000664; Bar Harbor, ME) and were used at 4-7 months of age. Aged C57BL/6J mice were obtained from the National Institute on Aging (NIA) Rodent Aging Colony and were used at 20 months of age. For new scRNAseq experiments, female mice were used in each experiment.

    Mouse injuries and single-cell isolation. To induce muscle injury, both tibialis anterior (TA) muscles of old (20 months) C57BL/6J mice were injected with 10 µl of notexin (10 µg/ml; Latoxan; France). At 0, 1, 2, 3.5, 5, or 7 days post-injury (dpi), mice were sacrificed and TA muscles were collected and processed independently to generate single-cell suspensions. Muscles were digested with 8 mg/ml Collagenase D (Roche; Switzerland) and 10 U/ml Dispase II (Roche; Switzerland), followed by manual dissociation to generate cell suspensions. Cell suspensions were sequentially filtered through 100 and 40 μm filters (Corning Cellgro #431752 and #431750) to remove debris. Erythrocytes were removed through incubation in erythrocyte lysis buffer (IBI Scientific #89135-030).

    Single-cell RNA-sequencing library preparation. After digestion, single-cell suspensions were washed and resuspended in 0.04% BSA in PBS at a concentration of 10^6 cells/ml. Cells were counted manually with a hemocytometer to determine their concentration. Single-cell RNA-sequencing libraries were prepared using the Chromium Single Cell 3’ reagent kit v3 (10x Genomics, PN-1000075; Pleasanton, CA) following the manufacturer’s protocol. Cells were diluted into the Chromium Single Cell A Chip to yield a recovery of 6,000 single-cell transcriptomes. After preparation, libraries were sequenced on a NextSeq 500 (Illumina; San Diego, CA) using 75 cycle high output kits (Index 1 = 8, Read 1 = 26, and Read 2 = 58). Details on estimated sequencing saturation and the number of reads per sample are shown in Sup. Data 1.

    Spatial RNA sequencing library preparation. Tibialis anterior muscles of adult (5 mo) C57BL/6J mice were injected with 10 µl notexin (10 µg/ml) at 2, 5, and 7 days prior to collection. Upon collection, tibialis anterior muscles were isolated, embedded in OCT, and frozen fresh in liquid nitrogen. Spatially tagged cDNA libraries were built using the Visium Spatial Gene Expression 3’ Library Construction v1 Kit (10x Genomics, PN-1000187; Pleasanton, CA) (Fig. S7). Optimal tissue permeabilization time for 10 µm thick sections was found to be 15 minutes using the 10x Genomics Visium Tissue Optimization Kit (PN-1000193). H&E stained tissue sections were imaged using a Zeiss PALM MicroBeam laser capture microdissection system and the images were stitched and processed using Fiji ImageJ software. cDNA libraries were sequenced on an Illumina NextSeq 500 using 150 cycle high output kits (Read 1=28bp, Read 2=120bp, Index 1=10bp, and Index 2=10bp). Frames around the capture area on the Visium slide were aligned manually and spots covering the tissue were selected using Loop Browser v4.0.0 software (10x Genomics). Sequencing data was then aligned to the mouse reference genome (mm10) using the spaceranger v1.0.0 pipeline to generate a feature-by-spot-barcode expression matrix (10x Genomics).

    Download and alignment of single-cell RNA sequencing data. For all samples available via SRA, parallel-fastq-dump (github.com/rvalieris/parallel-fastq-dump) was used to download raw .fastq files. Samples which were only available as .bam files were converted to .fastq format using bamtofastq from 10x Genomics (github.com/10XGenomics/bamtofastq). Raw reads were aligned to the mm10 reference using cellranger (v3.1.0).

    Preprocessing and batch correction of single-cell RNA sequencing datasets. First, ambient RNA signal was removed using the default SoupX (v1.4.5) workflow (autoEstCounts and adjustCounts; github.com/constantAmateur/SoupX). Samples were then preprocessed using the standard Seurat (v3.2.1) workflow (NormalizeData, ScaleData, FindVariableFeatures, RunPCA, FindNeighbors, FindClusters, and RunUMAP; github.com/satijalab/seurat). Cells with fewer than 750 features, fewer than 1000 transcripts, or more than 30% of unique transcripts derived from mitochondrial genes were removed. After preprocessing, DoubletFinder (v2.0) was used to identify putative doublets in each dataset individually. BCmvn optimization was used for pK parameterization. Estimated doublet rates were computed by fitting the total number of cells after quality filtering to a linear regression of the expected doublet rates published in the 10x Chromium handbook. Estimated homotypic doublet rates were also accounted for using the modelHomotypic function. The default pN value (0.25) was used. Putative doublets were then removed from each individual dataset. After preprocessing and quality filtering, we merged the datasets and performed batch correction with three tools, independently: Harmony (github.com/immunogenomics/harmony) (v1.0), Scanorama (github.com/brianhie/scanorama) (v1.3), and BBKNN (github.com/Teichlab/bbknn) (v1.3.12). We then used Seurat to process the integrated data. After initial integration, we removed the noisy cluster and re-integrated the data using each of the three batch-correction tools.
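
    The cell-level quality filter described above can be sketched in a few lines. This is a minimal illustration in plain Python, not the authors' Seurat code: the function name, the dictionary representation of a cell, and the "mt-" prefix used to recognise mouse mitochondrial genes are all assumptions made for the example. Only the three thresholds (750 features, 1000 transcripts, 30% mitochondrial) come from the text.

```python
# Hypothetical helper, not part of Seurat or of the authors' pipeline.
# Assumption: mouse mitochondrial gene symbols start with "mt-".
MITO_PREFIX = "mt-"


def passes_qc(cell_counts, min_features=750, min_transcripts=1000,
              max_mito_fraction=0.30):
    """Return True if one cell passes the three thresholds quoted in the text.

    cell_counts maps gene symbol -> transcript count for a single cell.
    """
    # Features = genes detected with at least one transcript.
    n_features = sum(1 for count in cell_counts.values() if count > 0)
    n_transcripts = sum(cell_counts.values())
    if n_transcripts == 0:
        return False
    n_mito = sum(count for gene, count in cell_counts.items()
                 if gene.lower().startswith(MITO_PREFIX))
    mito_fraction = n_mito / n_transcripts
    # A cell is removed if it has fewer than 750 features, fewer than
    # 1000 transcripts, or more than 30% mitochondrial transcripts.
    return (n_features >= min_features
            and n_transcripts >= min_transcripts
            and mito_fraction <= max_mito_fraction)


def filter_cells(cells):
    """Keep only the cells (barcode -> counts dict) that pass QC."""
    return {barcode: counts for barcode, counts in cells.items()
            if passes_qc(counts)}
```

    In practice the same filter would be applied to a sparse count matrix rather than to per-cell dictionaries; the dictionary form is used here only to keep the sketch self-contained.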

    Cell type annotation. Cell types were determined for each integration method independently. For Harmony and Scanorama, dimensions accounting for 95% of the total variance were used to generate SNN graphs (Seurat::FindNeighbors). Louvain clustering was then performed on the output graphs (including the corrected graph output by BBKNN) using Seurat::FindClusters. A clustering resolution of 1.2 was used for Harmony (25 initial clusters), BBKNN (28 initial clusters), and Scanorama (38 initial clusters). Cell types were determined based on expression of canonical genes (Fig. S3). Clusters which had similar canonical marker gene expression patterns were merged.
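
    Selecting the "dimensions accounting for 95% of the total variance" amounts to a cumulative sum over per-component variances. The sketch below is one possible reading, written in plain Python: the function name is hypothetical, and it assumes that "total variance" means the variance summed over the computed components (for example, the squared standard deviations reported for each principal component), which the text does not state explicitly.

```python
def dims_for_variance(stdevs, target=0.95):
    """Smallest number of leading components whose cumulative share of
    variance reaches `target`.

    stdevs: per-component standard deviations, largest first.
    Assumption: total variance is the sum over the supplied components.
    """
    variances = [s * s for s in stdevs]
    total = sum(variances)
    cumulative = 0.0
    for k, variance in enumerate(variances, start=1):
        cumulative += variance
        if cumulative / total >= target:
            return k
    return len(variances)


# Toy input: variances 9, 4 and 1, so the cumulative shares are
# 9/14, 13/14 and 14/14.
n_dims = dims_for_variance([3.0, 2.0, 1.0], target=0.95)
```

    The first `n_dims` components of the corrected embedding would then be passed to the neighbor-graph step.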

    Pseudotime workflow. Cells were subset based on the consensus cell types between all three integration methods. Harmony embedding values from the dimensions accounting for 95% of the total variance were used for further dimensional reduction with PHATE, using phateR (v1.0.4) (github.com/KrishnaswamyLab/phateR).

    Deconvolution of spatial RNA sequencing spots. Spot deconvolution was performed using the deconvolution module in BayesPrism (previously known as “Tumor microEnvironment Deconvolution”, TED, v1.0; github.com/Danko-Lab/TED). First, myogenic cells were re-labeled, according to binning along the first PHATE dimension, as “Quiescent MuSCs” (bins 4-5), “Activated MuSCs” (bins 6-7), “Committed Myoblasts” (bins 8-10), and “Fusing Myocytes” (bins 11-18). Culture-associated muscle stem cells were ignored and myonuclei labels were retained as “Myonuclei (Type IIb)” and “Myonuclei (Type IIx)”. Next, highly and differentially expressed genes across the 25 groups of cells were identified with differential gene expression analysis using Seurat (FindAllMarkers, using Wilcoxon Rank Sum Test; results in Sup. Data 2). The resulting genes were filtered based on average log2-fold change (avg_logFC > 1) and the percentage of cells within the cluster which express each gene (pct.expressed > 0.5), yielding 1,069 genes. Mitochondrial and ribosomal protein genes were also removed from this list, in line with recommendations in the BayesPrism vignette. For each of the cell types, mean raw counts were calculated across the 1,069 genes to generate a gene expression profile for BayesPrism. Raw counts for each spot were then passed to the run.Ted function, using

  2. cellCounts

    • researchdata.edu.au
    • opal.latrobe.edu.au
    Updated Dec 19, 2022
    Cite
    Shi Wei; Mielke Lisa; Pal Bhupinder; Raghu Dinesh; Liao Yang; Yang Liao; Wei Shi; Lisa Mielke; Dinesh Raghu; Bhupinder Pal (2022). cellCounts [Dataset]. http://doi.org/10.26181/21588276.V3
    Dataset provided by
    La Trobe University
    Authors
    Shi Wei; Mielke Lisa; Pal Bhupinder; Raghu Dinesh; Liao Yang; Yang Liao; Wei Shi; Lisa Mielke; Dinesh Raghu; Bhupinder Pal
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This page includes the data and code necessary to reproduce the results of the following paper:


    Yang Liao, Dinesh Raghu, Bhupinder Pal, Lisa Mielke and Wei Shi. cellCounts: fast and accurate quantification of 10x Chromium single-cell RNA sequencing data. Under review.


    A Linux computer running an operating system of CentOS 7 (or later) or Ubuntu 20.04 (or later) is recommended for running this analysis. The computer should have >2 TB of disk space and >64 GB of RAM. The following software packages need to be installed before running the analysis. Software executables generated after installation should be included in the $PATH environment variable.



    Reference packages generated by 10x Genomics are also required for this analysis and they can be downloaded from the following link (2020-A version for individual human and mouse reference packages should be selected):


    https://support.10xgenomics.com/single-cell-gene-expression/software/downloads/latest


    After all these are done, you can simply run the shell script ‘test-all-new.bash’ to perform all the analyses carried out in the paper. This script will automatically download the mixture scRNA-seq data from the SRA database, and it will output a text file called ‘test-all.log’ that contains all the screen outputs and speed/accuracy results of CellRanger, STARsolo and cellCounts.

  3. scRNA-seq Human Pluripotent Stem Cells Messmer2019

    • kaggle.com
    zip
    Updated May 1, 2022
    Cite
    Alexander Chervov (2022). scRNA-seq Human Pluripotent Stem Cells Messmer2019 [Dataset]. https://www.kaggle.com/datasets/alexandervc/scrnaseq-human-pluripotent-stem-cells-messmer2019
    Available download formats: zip (57267380 bytes)
    Authors
    Alexander Chervov
    Description

    Remark: for cell cycle analysis - see paper https://arxiv.org/abs/2208.05229 "Computational challenges of cell cycle analysis using single cell transcriptomics" Alexander Chervov, Andrei Zinovyev

    Data and Context

    Data: results of single-cell RNA sequencing, i.e. rows correspond to cells and columns to genes (or vice versa). Each value of the matrix shows how strongly the corresponding gene is expressed in the corresponding cell. https://en.wikipedia.org/wiki/Single-cell_transcriptomics

    Particular data: https://pubmed.ncbi.nlm.nih.gov/30673604/ Cell Rep. 2019 Jan 22;26(4):815-824.e4. doi: 10.1016/j.celrep.2018.12.099. Transcriptional Heterogeneity in Naive and Primed Human Pluripotent Stem Cells at Single-Cell Resolution. Tobias Messmer, Ferdinand von Meyenn, Aurora Savino, Fátima Santos, Hisham Mohammed, Aaron Tin Long Lun, John C Marioni, Wolf Reik

    Data in two variants: 1) scRNA-seq count matrix, downloaded from database of R-package "scRNAseq", see script: https://www.kaggle.com/alexandervc/rpackage-scrnaseq-downloads-datasets 2) Directly uploaded from E-MTAB-6819 https://www.ebi.ac.uk/arrayexpress/experiments/E-MTAB-6819/

    Related datasets:

    Other single cell RNA seq datasets can be found on kaggle: Look here: https://www.kaggle.com/alexandervc/datasets Or search kaggle for "scRNA-seq"

    Inspiration

    Single cell RNA sequencing is an important technology in modern biology, see e.g. "Eleven grand challenges in single-cell data science" https://genomebiology.biomedcentral.com/articles/10.1186/s13059-020-1926-6

    Also see the review in Nature Methods by P. Kharchenko: "The triumphs and limitations of computational methods for scRNA-seq" https://www.nature.com/articles/s41592-021-01171-x

    Search scholar.google "challenges in single cell rna sequencing" https://scholar.google.fr/scholar?q=challenges+in+single+cell+rna+sequencing&hl=en&as_sdt=0&as_vis=1&oi=scholart gives many interesting and highly cited articles

    (Cited 968) Computational and analytical challenges in single-cell transcriptomics Oliver Stegle, Sarah A. Teichmann, John C. Marioni Nat. Rev. Genet., 16 (3) (2015), pp. 133-145 https://www.nature.com/articles/nrg3833

    Challenges in unsupervised clustering of single-cell RNA-seq data https://www.nature.com/articles/s41576-018-0088-9 Review Article 07 January 2019 Vladimir Yu Kiselev, Tallulah S. Andrews & Martin Hemberg Nature Reviews Genetics volume 20, pages 273–282 (2019)

    Challenges and emerging directions in single-cell analysis https://link.springer.com/article/10.1186/s13059-017-1218-y Published: 08 May 2017 Guo-Cheng Yuan, Long Cai, Michael Elowitz, Tariq Enver, Guoping Fan, Guoji Guo, Rafael Irizarry, Peter Kharchenko, Junhyong Kim, Stuart Orkin, John Quackenbush, Assieh Saadatpour, Timm Schroeder, Ramesh Shivdasani & Itay Tirosh Genome Biology volume 18, Article number: 84 (2017)

    Single-Cell RNA Sequencing in Cancer: Lessons Learned and Emerging Challenges https://www.sciencedirect.com/science/article/pii/S1097276519303569 Molecular Cell Volume 75, Issue 1, 11 July 2019, Pages 7-12

  4. Repository for Single Cell RNA Sequencing Analysis of The EMT6 Dataset

    • data.niaid.nih.gov
    Updated Nov 20, 2023
    Cite
    Hsu, Jonathan; Stoop, Allart (2023). Repository for Single Cell RNA Sequencing Analysis of The EMT6 Dataset [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_10011621
    Authors
    Hsu, Jonathan; Stoop, Allart
    Description

    Table of Contents

    Main Description
    File Descriptions
    Linked Files
    Installation and Instructions

    1. Main Description

    This is the Zenodo repository for the manuscript titled "A TCR β chain-directed antibody-fusion molecule that activates and expands subsets of T cells and promotes antitumor activity.". The code included in the file titled marengo_code_for_paper_jan_2023.R was used to generate the figures from the single-cell RNA sequencing data. The following libraries are required for script execution:

    Seurat, scRepertoire, ggplot2, stringr, dplyr, ggridges, ggrepel, ComplexHeatmap

    File Descriptions

    The code can be downloaded and opened in RStudio.

    The "marengo_code_for_paper_jan_2023.R" file contains all the code needed to reproduce the figures in the paper.
    The "Marengo_newID_March242023.rds" file is available at the following address: https://zenodo.org/badge/DOI/10.5281/zenodo.7566113.svg (Zenodo DOI: 10.5281/zenodo.7566113).
    The "all_res_deg_for_heat_updated_march2023.txt" file contains the unfiltered results from the DGE analysis, also used to create the heatmap with DGE and the volcano plots.
    The "genes_for_heatmap_fig5F.xlsx" file contains the genes included in the heatmap in figure 5F.

    Linked Files

    This repository contains code for the analysis of a single cell RNA-seq dataset. The dataset contains raw FASTQ files, as well as the aligned files that were deposited in GEO. The "Rdata" or "Rds" file was deposited in Zenodo. Provided below are descriptions of the linked datasets:

    Gene Expression Omnibus (GEO) ID: GSE223311 (https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE223311)

    Title: Gene expression profile at single cell level of CD4+ and CD8+ tumor infiltrating lymphocytes (TIL) originating from the EMT6 tumor model from mSTAR1302 treatment. Description: This submission contains the "matrix.mtx", "barcodes.tsv", and "genes.tsv" files for each replicate and condition, corresponding to the aligned files for single cell sequencing data. Submission type: Private. In order to gain access to the repository, you must use a reviewer token (https://www.ncbi.nlm.nih.gov/geo/info/reviewer.html).

    Sequence read archive (SRA) repository ID: SRX19088718 and SRX19088719

    Title: Gene expression profile at single cell level of CD4+ and CD8+ tumor infiltrating lymphocytes (TIL) originating from the EMT6 tumor model from mSTAR1302 treatment. Description: This submission contains the raw sequencing or .fastq.gz files, which are tab delimited text files. Submission type: Private. In order to gain access to the repository, you must use a reviewer token (https://www.ncbi.nlm.nih.gov/geo/info/reviewer.html).

    Zenodo DOI: 10.5281/zenodo.7566113 (https://zenodo.org/record/7566113#.ZCcmvC2cbrJ)

    Title: A TCR β chain-directed antibody-fusion molecule that activates and expands subsets of T cells and promotes antitumor activity. Description: This submission contains the "Rdata" or ".Rds" file, which is an R object file. This is a necessary file to use the code. Submission type: Restricted Access. In order to gain access to the repository, you must contact the author.

    Installation and Instructions

    The code included in this submission requires several essential packages, as listed above. Please follow these instructions for installation:

    Ensure you have R version 4.1.2 or higher for compatibility.

    Although it is not essential, you can use RStudio (Version 2022.12.0+353) for accessing and executing the code.

    1. Download the "Rdata" or ".Rds" file from Zenodo (https://zenodo.org/record/7566113#.ZCcmvC2cbrJ) (Zenodo DOI: 10.5281/zenodo.7566113).
    2. Open RStudio (https://www.rstudio.com/tags/rstudio-ide/) or a similar integrated development environment (IDE) for R.
    3. Set your working directory to where the following files are located:

    marengo_code_for_paper_jan_2023.R
    Install_Packages.R
    Marengo_newID_March242023.rds
    genes_for_heatmap_fig5F.xlsx
    all_res_deg_for_heat_updated_march2023.txt

    You can use the following code to set the working directory in R:

    setwd(directory)

    4. Open the file titled "Install_Packages.R" and execute it in your R IDE. This script will attempt to install all the necessary packages and their dependencies in order to set up an environment where the code in "marengo_code_for_paper_jan_2023.R" can be executed.
    5. Once the "Install_Packages.R" script has been successfully executed, restart RStudio or your IDE of choice.
    6. Open the file "marengo_code_for_paper_jan_2023.R" in RStudio or your IDE of choice.
    7. Execute the commands in "marengo_code_for_paper_jan_2023.R" in RStudio or your IDE of choice to generate the plots.
  5. Repository for the single cell RNA sequencing data analysis for the human...

    • explore.openaire.eu
    Updated Aug 26, 2023
    Cite
    Jonathan; Andrew; Pierre; Allart; Adrian (2023). Repository for the single cell RNA sequencing data analysis for the human manuscript. [Dataset]. http://doi.org/10.5281/zenodo.8286134
    Authors
    Jonathan; Andrew; Pierre; Allart; Adrian
    Description

    This is the GitHub repository for the single cell RNA sequencing data analysis for the human manuscript. The following essential libraries are required for script execution:

    Seurat, scRepertoire, ggplot2, dplyr, ggridges, ggrepel, ComplexHeatmap

    Linked File

    This repository contains code for the analysis of a single cell RNA-seq dataset. The dataset contains raw FASTQ files, as well as the aligned files that were deposited in GEO. Provided below are descriptions of the linked datasets:

    1. Gene Expression Omnibus (GEO) ID: GSE229626

    Title: Gene expression profile at single cell level of human T cells stimulated via antibodies against the T Cell Receptor (TCR). Description: This submission contains the matrix.mtx, barcodes.tsv, and genes.tsv files for each replicate and condition, corresponding to the aligned files for single cell sequencing data. Submission type: Private. In order to gain access to the repository, you must use a reviewer token (https://www.ncbi.nlm.nih.gov/geo/info/reviewer.html).

    2. Sequence read archive (SRA) repository

    Title: Gene expression profile at single cell level of human T cells stimulated via antibodies against the T Cell Receptor (TCR). Description: This submission contains the raw sequencing or .fastq.gz files, which are tab delimited text files. Submission type: Private. In order to gain access to the repository, you must use a reviewer token (https://www.ncbi.nlm.nih.gov/geo/info/reviewer.html).

    Please note that since the GSE submission is private, the raw data deposited at SRA may not be accessible until the embargo on GSE229626 has been lifted.

    Installation and Instructions

    The code included in this submission requires several essential packages, as listed above. Please follow these instructions for installation:

    Ensure you have R version 4.1.2 or higher for compatibility.

    Although it is not essential, you can use RStudio (Version 2022.12.0+353) for accessing and executing the code.

    The following code can be used to set the working directory in R:

    setwd(directory)

    Steps:

    1. Download the "Human_code_April2023.R" and "Install_Packages.R" R scripts, and the processed data from GSE229626.
    2. Open RStudio (https://www.rstudio.com/tags/rstudio-ide/) or a similar integrated development environment (IDE) for R.
    3. Set your working directory to where the following files are located:
       - Human_code_April2023.R
       - Install_Packages.R
    4. Open the file titled Install_Packages.R and execute it in your R IDE. This script will attempt to install all the necessary packages and their dependencies.
    5. Open the Human_code_April2023.R script and execute commands as necessary.

  6. scRNA-seq Kolodziejczyk et al. (2015)

    • kaggle.com
    zip
    Updated Apr 30, 2022
    + more versions
    Cite
    Alexander Chervov (2022). scRNA-seq Kolodziejczyk et al. (2015) [Dataset]. https://www.kaggle.com/datasets/alexandervc/scrnaseq-kolodziejczyk-et-al-2015
    Available download formats: zip (13439744 bytes)
    Authors
    Alexander Chervov
    Description

    Remark: for cell cycle analysis - see paper https://arxiv.org/abs/2208.05229 "Computational challenges of cell cycle analysis using single cell transcriptomics" Alexander Chervov, Andrei Zinovyev

    Data and Context

    Data: results of single-cell RNA sequencing, i.e. rows correspond to cells and columns to genes (or vice versa). Each value of the matrix shows how strongly the corresponding gene is expressed in the corresponding cell. https://en.wikipedia.org/wiki/Single-cell_transcriptomics

    Particular data: Data from the paper: Kolodziejczyk, A. A., J. K. Kim, J. C. Tsang, T. Ilicic, J. Henriksson, K. N. Natarajan, A. C. Tuck, et al. 2015. “Single cell RNA-Sequencing of pluripotent states unlocks modular transcriptional variation.” Cell Stem Cell 17 (4): 471–85. https://pubmed.ncbi.nlm.nih.gov/26431182/

    scRNA-seq count matrix, downloaded from database of R-package "scRNAseq", see script: https://www.kaggle.com/alexandervc/rpackage-scrnaseq-downloads-datasets

    Related datasets:

    Other single cell RNA seq datasets can be found on kaggle: Look here: https://www.kaggle.com/alexandervc/datasets Or search kaggle for "scRNA-seq"

    Inspiration

    Single cell RNA sequencing is an important technology in modern biology, see e.g. "Eleven grand challenges in single-cell data science" https://genomebiology.biomedcentral.com/articles/10.1186/s13059-020-1926-6

    Also see the review in Nature Methods by P. Kharchenko: "The triumphs and limitations of computational methods for scRNA-seq" https://www.nature.com/articles/s41592-021-01171-x

    Search scholar.google "challenges in single cell rna sequencing" https://scholar.google.fr/scholar?q=challenges+in+single+cell+rna+sequencing&hl=en&as_sdt=0&as_vis=1&oi=scholart gives many interesting and highly cited articles

    (Cited 968) Computational and analytical challenges in single-cell transcriptomics Oliver Stegle, Sarah A. Teichmann, John C. Marioni Nat. Rev. Genet., 16 (3) (2015), pp. 133-145 https://www.nature.com/articles/nrg3833

    Challenges in unsupervised clustering of single-cell RNA-seq data https://www.nature.com/articles/s41576-018-0088-9 Review Article Published: 07 January 2019 Vladimir Yu Kiselev, Tallulah S. Andrews & Martin Hemberg Nature Reviews Genetics volume 20, pages 273–282 (2019)

    Challenges and emerging directions in single-cell analysis https://link.springer.com/article/10.1186/s13059-017-1218-y Published: 08 May 2017 Guo-Cheng Yuan, Long Cai, Michael Elowitz, Tariq Enver, Guoping Fan, Guoji Guo, Rafael Irizarry, Peter Kharchenko, Junhyong Kim, Stuart Orkin, John Quackenbush, Assieh Saadatpour, Timm Schroeder, Ramesh Shivdasani & Itay Tirosh Genome Biology volume 18, Article number: 84 (2017)

    Single-Cell RNA Sequencing in Cancer: Lessons Learned and Emerging Challenges https://www.sciencedirect.com/science/article/pii/S1097276519303569 Molecular Cell Volume 75, Issue 1, 11 July 2019, Pages 7-12

  7. Breast Cancer Single-Cell RNA-Seq Dataset

    • ega-archive.org
    Cite
    Breast Cancer Single-Cell RNA-Seq Dataset [Dataset]. https://ega-archive.org/datasets/EGAD00001007495
    License

    https://ega-archive.org/dacs/EGAC00001001974

    Description

    Single-cell RNA sequencing of 26 primary breast cancers from the Wu et al. (2021) study. Data were generated using the Chromium controller (10X Genomics) and sequenced on the NextSeq 500 platform.

  8. scPerturb Single-Cell Perturbation Data: RNA and protein h5ad files

    • plus.figshare.com
    hdf
    Updated Sep 29, 2023
    Cite
    Stefan Peidli; Tessa D. Green; Ciyue Shen; Torsten Gross; Joseph Min; Samuele Garda; Bo Yuan; Linus J. Schumacher; Jake P. Taylor-King; Debora S. Marks; Augustin Luna; Nils Blüthgen; Chris Sander (2023). scPerturb Single-Cell Perturbation Data: RNA and protein h5ad files [Dataset]. http://doi.org/10.25452/figshare.plus.24160713.v1
    Dataset provided by
    Figshare+
    Authors
    Stefan Peidli; Tessa D. Green; Ciyue Shen; Torsten Gross; Joseph Min; Samuele Garda; Bo Yuan; Linus J. Schumacher; Jake P. Taylor-King; Debora S. Marks; Augustin Luna; Nils Blüthgen; Chris Sander
    License

    CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    This repository contains the scRNA-seq and protein datasets from the scperturb database as h5ad files (saved with scanpy v1.9.1). In order to facilitate development and benchmarking of computational methods in systems biology, we collected a set of 44 publicly available single-cell perturbation-response datasets with molecular readouts, including transcriptomics, proteomics and epigenomics. We applied uniform quality control pipelines and harmonized feature annotations. The resulting information resource enables efficient development and testing of computational analysis methods, and facilitates direct comparison and integration across datasets. In addition, we describe E-statistics for perturbation effect quantification and significance testing, and demonstrate E-distance as a general distance measure for single-cell data, available as both a Python (scperturb on PyPI) and an R (scperturbR on CRAN) package. See the associated publication for info on how the data was handled. We also have an interactive table (Data Explorer) on our website with metadata per dataset.
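
    To make the idea of an E-distance concrete, the sketch below computes the classical energy distance between two groups of cells: twice the mean between-group distance minus the mean within-group distance of each group. It is written in plain Python and is not the scperturb package's API. The function names are hypothetical, and the choice of Euclidean distance and of averaging over all ordered pairs (including each cell with itself) are assumptions; the associated publication should be consulted for the exact definition used there.

```python
import math
from itertools import product


def mean_pairwise_distance(a, b):
    """Mean Euclidean distance over all ordered pairs (x in a, y in b)."""
    total = sum(math.dist(x, y) for x, y in product(a, b))
    return total / (len(a) * len(b))


def e_distance(x, y):
    """Energy distance between two groups of points.

    x, y: non-empty lists of equal-length coordinate tuples, e.g. cells
    in a low-dimensional embedding (assumed input, for illustration).
    """
    delta_xy = mean_pairwise_distance(x, y)   # between-group term
    sigma_x = mean_pairwise_distance(x, x)    # within-group term for x
    sigma_y = mean_pairwise_distance(y, y)    # within-group term for y
    return 2 * delta_xy - sigma_x - sigma_y


# Toy example: two "perturbation" groups in a 2-D embedding.
control = [(0.0, 0.0), (1.0, 0.0)]
perturbed = [(5.0, 0.0), (6.0, 0.0)]
distance = e_distance(control, perturbed)
```

    With this definition the distance between a group and itself is zero, and it grows as the two groups move apart relative to their internal spread.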

  9. Protocol data (R version)

    • figshare.com
    application/gzip
    Updated Oct 16, 2020
    + more versions
    Cite
    Jesse Gillis (2020). Protocol data (R version) [Dataset]. http://doi.org/10.6084/m9.figshare.13020569.v2
    Explore at:
    Available download formats: application/gzip
    Dataset updated
    Oct 16, 2020
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Jesse Gillis
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    We published 3 protocols illustrating how MetaNeighbor can be used to quantify cell type replicability across single cell transcriptomic datasets. The data files included here are needed to run the R version of the protocols available on Github (https://github.com/gillislab/MetaNeighbor-Protocol) in RMarkdown (.Rmd) and Jupyter (.ipynb) notebook format. To run the protocols, download the protocols on Github, download the data on Figshare, place the data and protocol files in the same directory, then run the notebooks in Rstudio or Jupyter. The scripts used to generate the data are included in the Github directory. Briefly:

    - full_biccn_hvg.rds contains a single cell transcriptomic dataset published by the Brain Initiative Cell Census Network (in SingleCellExperiment format). It combines data from 7 datasets obtained in the mouse primary motor cortex (https://www.biorxiv.org/content/10.1101/2020.02.29.970558v2). Note that this dataset only contains highly variable genes.
    - biccn_hvgs.txt: highly variable genes from the BICCN dataset described above (computed with the MetaNeighbor library).
    - biccn_gaba.rds: same dataset as full_biccn_hvg.rds, but restricted to GABAergic neurons. The dataset contains all genes common to the 7 BICCN datasets (not just highly variable genes).
    - go_mouse.rds: gene ontology annotations, stored as a list of gene symbols (one element per gene set).
    - functional_aurocs.txt: results of the MetaNeighbor functional analysis in protocol 3.

  10. Queryable single-cell RNA-seq (10x Genomics) datasets of Human and Mouse...

    • data.mendeley.com
    Updated Nov 7, 2018
    Cite
    Brian Hermann (2018). Queryable single-cell RNA-seq (10x Genomics) datasets of Human and Mouse spermatogenic cells [Dataset]. http://doi.org/10.17632/kxd5f8vpt4.1
    Explore at:
    Dataset updated
    Nov 7, 2018
    Authors
    Brian Hermann
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    To reveal distinct transcriptomes associated with various spermatogenic cells in both mouse and human testes, including spermatogonial stem cells (SSCs) and all of their subsequent progeny, we used the 10x Genomics Chromium (a commercialized Drop-Seq variant) to perform single-cell RNA-seq on various cell populations. Raw data and analyzed data (gene expression matrices) are deposited into the NIH GEO database. Here we include queryable, annotated and interactive files that can be used to compare single-cell transcriptomes.

    Spermatogonia from immature (P6) and adult Id4-Egfp transgenic mice were used. The GFP-bright and dim phenotypes exhibit distinct fates when assayed by transplantation, with ID4-EGFPbright cells highly enriched for SSCs, and ID4-EGFPdim cells enriched for progenitors. Corresponding human spermatogonia were enriched from human testicular tissue by multi-parameter FACS. For both human and mouse, StaPut gravity sedimentation enriched for meiotic spermatocytes and post-meiotic spermatids and we profiled unselected steady-state spermatogenic cells.

    The data from these experiments are stored in Loupe Cell Browser files (.cloupe) which are generated during analysis of 10x Genomics Single-cell data and can be opened and queried with the Loupe Cell Browser (10X Genomics). This software can be downloaded for free from https://support.10xgenomics.com/single-cell-gene-expression/software/downloads/latest. It is important to note that the companion manuscript for these data used additional analyses that are not represented in these files.

    The following datasets are available:

    1. Unselected or sorted P6 ID4-EGFP+ spermatogonia (sorted separately as EGFP-bright or EGFP-dim) were used for this study. Data are from 13094 cells and can be found in the following file: P6 Mouse Spermatogonia.cloupe (aggregate of three datasets, P6 ID4-EGFP bright/dim/unselected)

    2. Unselected or sorted Adult ID4-EGFP+ spermatogonia (sorted separately as EGFP-bright or EGFP-dim), three replicate preparations of steady-state unselected spermatogenic cells, and StaPut-enriched adult spermatocytes and spermatids were used for this study. Data are from 17491 cells and can be found in the following files: Adult Mouse Sorted Spermatogonia.cloupe (Aggregated Ad Spg- ID4-EGFP bright/dim/CD9bright); Mouse Unselected Spermatogenic cells.cloupe (3 replicates of steady-state spermatogenic cells); Mouse StaPut Spermatocytes.cloupe; Mouse StaPut Spermatids.cloupe

    3. Sorted adult Human spermatogonia, three replicates of steady-state unselected spermatogenic cells, and StaPut-enriched adult spermatocytes and spermatids were used. Data are from 32727 cells and can be found in the following files: Human Sorted Spermatogonia.cloupe (3 replicates); Human Unselected Spermatogenic Cells.cloupe (3 replicates of steady-state spermatogenic cells); Human StaPut Spermatocytes.cloupe (2 replicates); Human StaPut Spermatids.cloupe (2 replicates)

  11. gene-expression-single-cell-mouse

    • huggingface.co
    Updated Jun 17, 2025
    + more versions
    Cite
    2025 Longevity x AI Hackathon (2025). gene-expression-single-cell-mouse [Dataset]. https://huggingface.co/datasets/longevity-db/gene-expression-single-cell-mouse
    Explore at:
    Dataset updated
    Jun 17, 2025
    Dataset authored and provided by
    2025 Longevity x AI Hackathon
    License

    MIT License: https://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    A single-cell transcriptomic atlas characterizes ageing tissues in the mouse

    Code to download and process this dataset is available at: https://github.com/seanome/2025-longevity-x-ai-hackathon. The dataset structure originally comes from AnnData. Descriptions of each data file are below.

    Data Files

    This dataset contains multiple parquet files, one for each sheet in the original Excel file: gene-expression-single-cell-mouse_*.parquet - Data files containing gene expression and… See the full description on the dataset page: https://huggingface.co/datasets/longevity-db/gene-expression-single-cell-mouse.

  12. Data from "Single-cell integration and multi-modal profiling reveals...

    • zenodo.org
    bin, xz
    Updated Nov 13, 2025
    Cite
    Valentin Marteau; Niloofar Nemati; Kristina Handler; Deeksha Raju; Alexander Kirchmair; Dietmar Rieder; Erika Kvalem Soto; Georgios Fotakis; Glenn De Lange; Sandro Carollo; Nina Boeck; Alessia Rossi; Sophia Daum; Alexandra Scheiber; Arno Amann; Andreas Seeber; Elisabeth Gasser; Steffen Ormanns; Michael Günther; Agnieszka Martowicz; Zuzana Loncova; Giorgia Lamberti; Anne Krogsdam; Michela Carlet; Lena Horvath; Marie Theres Eling; Hassan Fazilaty; Tomas Valenta; Gregor Sturm; Sieghart Sopper; Andreas Pircher; Patrizia Stoitzner; Peter J. Wild; Patrick Welker; Pascal J. May; Paul Ziegler; Markus Tschurtschenthaler; Daniel Neureiter; Florian Huemer; Richard Greil; Lukas Weiss; Marieke Ijsselsteijn; Noel F.C.C. de Miranda; Dominik Wolf; Isabelle C. Arnold; Stefan Salcher; Zlatko Trajanoski (2025). Data from "Single-cell integration and multi-modal profiling reveals phenotypes and spatial organization of neutrophils in colorectal cancer" [Dataset]. http://doi.org/10.5281/zenodo.16631519
    Explore at:
    Available download formats: xz, bin
    Dataset updated
    Nov 13, 2025
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Valentin Marteau; Niloofar Nemati; Kristina Handler; Deeksha Raju; Alexander Kirchmair; Dietmar Rieder; Erika Kvalem Soto; Georgios Fotakis; Glenn De Lange; Sandro Carollo; Nina Boeck; Alessia Rossi; Sophia Daum; Alexandra Scheiber; Arno Amann; Andreas Seeber; Elisabeth Gasser; Steffen Ormanns; Michael Günther; Agnieszka Martowicz; Zuzana Loncova; Giorgia Lamberti; Anne Krogsdam; Michela Carlet; Lena Horvath; Marie Theres Eling; Hassan Fazilaty; Tomas Valenta; Gregor Sturm; Sieghart Sopper; Andreas Pircher; Patrizia Stoitzner; Peter J. Wild; Patrick Welker; Pascal J. May; Paul Ziegler; Markus Tschurtschenthaler; Daniel Neureiter; Florian Huemer; Richard Greil; Lukas Weiss; Marieke Ijsselsteijn; Noel F.C.C. de Miranda; Dominik Wolf; Isabelle C. Arnold; Stefan Salcher; Zlatko Trajanoski
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This archive provides all datasets needed to reproduce the single‐cell data integration detailed in the paper

    Single-cell integration and multi-modal profiling reveals phenotypes and spatial organization of neutrophils in colorectal cancer

    DOI: 10.1101/2024.08.26.609563


    The archive comprises the following files:

    • MUI_Innsbruck-adata.h5ad: In-house scRNA-seq dataset from CRC cohort I (n = 12) comprising matched peripheral blood, adjacent normal, and tumor samples generated using the BD Rhapsody platform.
    • input_datasets.tar.xz: Preprocessed input datasets in .h5ad format required to build the CRC scRNA-seq atlas.
    • crc_atlas_scanvi_model.tar.xz: Pretrained scArches reference model and matching .h5ad file (highly variable genes only), enabling projection of external data onto the CRC atlas.
    • downstream_analyses_de_analysis.tar.xz: DESeq2-based differential expression analyses on pseudobulked data by cell type for various matched comparisons within the CRC atlas. Includes RDS files, result TSV tables, and short summaries for each comparison.

    The CRC atlas is publicly available for download and interactive exploration through a cell-x-gene instance with standardized metadata, which allows custom analyses of the atlas. For more information, check out the

  13. scRNA-seq "Tabula sapiens" - human, 500 000+ cells

    • kaggle.com
    zip
    Updated Feb 5, 2022
    + more versions
    Cite
    Alexander Chervov (2022). scRNA-seq "Tabula sapiens" - human, 500 000+ cells [Dataset]. https://www.kaggle.com/datasets/alexandervc/scrnaseq-tabula-sapiens-human-500-000-cells
    Explore at:
    Available download formats: zip (14395870367 bytes)
    Dataset updated
    Feb 5, 2022
    Authors
    Alexander Chervov
    Description

    Remark 1: for cell cycle analysis - see paper https://arxiv.org/abs/2208.05229 "Computational challenges of cell cycle analysis using single cell transcriptomics" Alexander Chervov, Andrei Zinovyev

    Remark 2: For the second part of the data, see https://www.kaggle.com/alexandervc/scrnaseq-tabula-sapiens-human-part-2

    Data and Context

    Data: results of single cell RNA sequencing, i.e. rows correspond to cells and columns to genes (or vice versa). Each value of the matrix shows how strongly the corresponding gene is expressed in the corresponding cell. https://en.wikipedia.org/wiki/Single-cell_transcriptomics

    Particular data: "Tabula Sapiens" project: https://tabula-sapiens-portal.ds.czbiohub.org/ Data section for download: https://figshare.com/articles/dataset/Tabula_Sapiens_release_1_0/14267219 Paper: https://www.science.org/doi/10.1126/science.abl4896 https://www.biorxiv.org/content/10.1101/2021.07.19.452956v2

    Tabula Sapiens is a benchmark, first-draft human cell atlas of nearly 500,000 cells from 24 organs of 15 normal human subjects. This work is the product of the Tabula Sapiens Consortium. Special thanks to the Chan Zuckerberg Initiative for funding this project and to the CZI Science Technology team for creating cellxgene, the tool that makes the visualization of this research possible.

    See also tutorials:

    Course at the Sanger Institute: https://scrnaseq-course.cog.sanger.ac.uk/website/tabula-muris.html

    Course at CZ-hub: https://chanzuckerberg.github.io/scRNA-python-workshop/intro/about

    On kaggle - copies of the notebooks and data from the course above https://www.kaggle.com/aayush9753/singlecell-rnaseq-data-from-mouse-brain

    Inspiration

    Single cell RNA sequencing is an important technology in modern biology, see e.g. "Eleven grand challenges in single-cell data science" https://genomebiology.biomedcentral.com/articles/10.1186/s13059-020-1926-6

    Also see the review by P. Kharchenko in Nature Methods: "The triumphs and limitations of computational methods for scRNA-seq" https://www.nature.com/articles/s41592-021-01171-x

  14. Spotiphy enables single-cell spatial whole transcriptomics across the entire...

    • zenodo.org
    bin, csv, jpeg
    Updated Dec 29, 2024
    Cite
    Jiyuan Yang; Ziqian Zheng; Jiyang Yu (2024). Spotiphy enables single-cell spatial whole transcriptomics across the entire section [Dataset]. http://doi.org/10.5281/zenodo.10520022
    Explore at:
    Available download formats: bin, jpeg, csv
    Dataset updated
    Dec 29, 2024
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Jiyuan Yang; Ziqian Zheng; Jiyang Yu
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Spatial transcriptomics (ST) has advanced our understanding of tissue regionalization by enabling the visualization of gene expression within whole tissue sections, but the approach remains dogged by the challenge of achieving single-cell resolution without sacrificing whole genome coverage. Here we present Spotiphy (Spot imager with pseudo single-cell resolution histology), a novel computational toolkit that transforms sequencing-based ST data into single-cell-resolved whole-transcriptome images. In evaluations with Alzheimer’s disease (AD) and normal mouse brains, Spotiphy delivers the most precise cellular compositions. For the first time, Spotiphy reveals novel astrocyte regional specification in mouse brains. It distinguishes sub-populations of DAM (Disease-Associated Microglia) located in different AD mouse brain regions. Spotiphy also identifies multiple spatial domains as well as changes in the patterns of tumor-tumor microenvironment interactions using human breast ST data. Spotiphy enables visualization of cell localization and gene expression in tissue sections, offering key insights into the function of complex biological systems.

  15. Comparison of ScRDAVis and other popular single cell data analysis tools.

    • figshare.com
    xls
    Updated Nov 18, 2025
    Cite
    Sankarasubramanian Jagadesan; Chittibabu Guda (2025). Comparison of ScRDAVis and other popular single cell data analysis tools. [Dataset]. http://doi.org/10.1371/journal.pcbi.1013721.t001
    Explore at:
    Available download formats: xls
    Dataset updated
    Nov 18, 2025
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Sankarasubramanian Jagadesan; Chittibabu Guda
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Comparison of ScRDAVis and other popular single cell data analysis tools.

  16. Data from: A single-cell immune atlas of primary and secondary lymphoid...

    • agdatacommons.nal.usda.gov
    hdf
    Updated Sep 8, 2025
    + more versions
    Cite
    Jayne Wiarda; Muskan Kapoor; Sathesh K. Sivasankaran; Kristen A. Byrne; Crystal L. Loving; Christopher K. Tuggle (2025). Data from: A single-cell immune atlas of primary and secondary lymphoid organs in pigs [Dataset]. http://doi.org/10.15482/USDA.ADC/29492726.v1
    Explore at:
    Available download formats: hdf
    Dataset updated
    Sep 8, 2025
    Dataset provided by
    Ag Data Commons
    Authors
    Jayne Wiarda; Muskan Kapoor; Sathesh K. Sivasankaran; Kristen A. Byrne; Crystal L. Loving; Christopher K. Tuggle
    License

    Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
    License information was derived automatically

    Description

    Data objects used for analysis in "A single-cell immune atlas of primary and secondary lymphoid organs in pigs" by Wiarda et al. Data objects include .cloupe files for interactive query in Loupe Cell Browser (10X Genomics), .rds files to download for use with a Shiny app for data query, and .h5seurat files for computational query of data. Briefly, cells were isolated from bone marrow, thymus, lymph node, and spleen of two pigs and processed for single-cell RNA sequencing. Single-cell RNA sequencing data was analyzed to identify cell types in each tissue and perform comparisons across tissues and across datasets.

  17. PIAS: an interactive visualization platform for integrative analysis of...

    • figshare.com
    zip
    Updated May 31, 2021
    Cite
    Zhu Sheng; Yibo Zhuang; Lishan Ye; Feng Zeng; Xiaohui Wu; Guoli Ji (2021). PIAS: an interactive visualization platform for integrative analysis of multi-source single-cell RNA-seq datasets [Dataset]. http://doi.org/10.6084/m9.figshare.13205924.v4
    Explore at:
    Available download formats: zip
    Dataset updated
    May 31, 2021
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Zhu Sheng; Yibo Zhuang; Lishan Ye; Feng Zeng; Xiaohui Wu; Guoli Ji
    License

    https://www.gnu.org/licenses/gpl-3.0.html

    Description

    Download and unzip the RData data and place it under the path www/task/public of PIAS. PIAS is a web-based interactive platform for integrative analysis of multi-source single-cell RNA-seq datasets. Unlike many other single-cell RNA-seq analysis platforms or pipelines, which mainly focus on preprocessing or analysis of a single scRNA-seq dataset, PIAS integrates multi-source datasets and incorporates various metrics for comprehensively evaluating the result of data integration. Moreover, PIAS provides rich functions for data preprocessing, comprehensive analyses and visualization, including gene name transfer, quality control, normalization, highly variable gene identification, batch-effect removal, dimensionality reduction, clustering, differential expression analysis, cluster annotation, enrichment analysis, and single-cell trajectory construction. Users can freely choose to perform desired functions, visualize results, and transfer data through interactive operations with PIAS.

  18. Datasets accompanying scANANSE

    • zenodo.org
    • data.niaid.nih.gov
    application/gzip, bin
    Updated Mar 13, 2023
    Cite
    J.A. Arts; J.G.A. Smits (2023). Datasets accompanying scANANSE [Dataset]. http://doi.org/10.5281/zenodo.7446267
    Explore at:
    Available download formats: application/gzip, bin
    Dataset updated
    Mar 13, 2023
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    J.A. Arts; J.G.A. Smits
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description
  19. allen_brain.h5ad

    • figshare.com
    hdf
    Updated Jun 6, 2023
    Cite
    Daniel Dimitrov (2023). allen_brain.h5ad [Dataset]. http://doi.org/10.6084/m9.figshare.20338089.v4
    Explore at:
    Available download formats: hdf
    Dataset updated
    Jun 6, 2023
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Daniel Dimitrov
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Anndata format of the adult mouse brain atlas generated by researchers at the Allen Institute (Tasic et al.), together with inferred cell type colocalization information, as described in Dimitrov et al., 2022.

    Tasic, B. et al. Adult mouse cortical cell taxonomy revealed by single cell transcriptomics. Nat. Neurosci. 19, 335–346 (2016).

    Dimitrov, D., Türei, D., Garrido-Rodriguez, M., Burmedi, P.L., Nagai, J.S., Boys, C., Ramirez Flores, R.O., Kim, H., Szalai, B., Costa, I.G. and Valdeolivas, A., 2022. Comparison of methods and resources for cell-cell communication inference from single-cell RNA-Seq data. Nature Communications, 13(1), pp.1-13.

  20. Bulk-RNA-sequencing-and-single-nuclei-transcriptomics-and-epigenomics-of-brain-tissue-from-mice-flown-on-the-RR-10-mission...

    • osdr.nasa.gov
    • data.nasa.gov
    Updated Jul 21, 2025
    + more versions
    Cite
    Lauren Sanders; Eduardo Almeida; Sylvain Costes; Samrawit Gebre; Yi-Chun Chen; Valery Boyko; San-Huei Lai Polo; Kristen Peach; Amanda Saravia-Butler; Jonathan Oribello (2025). Bulk-RNA-sequencing-and-single-nuclei-transcriptomics-and-epigenomics-of-brain-tissue-from-mice-flown-on-the-RR-10-mission [Dataset]. https://osdr.nasa.gov/bio/repo/data/studies/OSD-612
    Explore at:
    Dataset updated
    Jul 21, 2025
    Dataset provided by
    NASA (http://nasa.gov/)
    Authors
    Lauren Sanders; Eduardo Almeida; Sylvain Costes; Samrawit Gebre; Yi-Chun Chen; Valery Boyko; San-Huei Lai Polo; Kristen Peach; Amanda Saravia-Butler; Jonathan Oribello
    License

    Attribution 1.0 (CC BY 1.0): https://creativecommons.org/licenses/by/1.0/
    License information was derived automatically

    Description

    The objective of the Rodent Research-10 mission (RR-10) was to investigate how spaceflight affects the cellular and molecular mechanisms of normal bone tissue regeneration in space. To this end, ten (10) 14-15 weeks-old female B6129SF2/J Wild Type (WT), and ten (10) 14-15 weeks-old female B6;129S2-Cdkn1atm1Tyj/J (p21-null) mice received a pre-flight subcutaneous injection of the bone marker (Alizarin Red), and were then delivered to the ISS aboard SpaceX-21. At 7 days before euthanasia, all 20 mice received an intraperitoneal (IP) injection with a bone formation marker (Calcein). At 48 +/- 2 hours before euthanasia, all 20 mice received an IP injection with a second dose of Calcein as well as a cell proliferation marker (BrdU). Then, following 28-29 days in microgravity, the Flight mice were euthanized. Following removal of hindlimbs, carcasses were wrapped in aluminum foil, preserved in the CryoChiller, and stored at -80 C or colder until return to Earth. In addition to the Flight group, three ground control groups were also part of the study: Basal (representing the pre-launch state), Vivarium (standard vivarium housing for the same duration of time as flight), and Ground (flight habitat in the International Space Station Environment Simulator, ISSES). Twenty mice (10 of each strain) were included in each of these control groups (except Vivarium which included 12 of each strain). These were treated, euthanized and processed on the same schedule and in the same manner as the flight samples. This study includes bulk RNA sequencing data from left cerebral hemispheres from 4 WT flight animals and 5 WT ground control animals, and single nuclei transcriptomics and epigenomics data from left cerebral hemispheres from 5 WT flight animals, and 5 WT ground control animals.

Cite
David McKellar; Iwijn De Vlaminck; Benjamin Cosgrove (2021). Large-scale integration of single-cell transcriptomic data captures transitional progenitor states in mouse skeletal muscle regeneration [Dataset]. http://doi.org/10.5061/dryad.t4b8gtj34

Data from: Large-scale integration of single-cell transcriptomic data captures transitional progenitor states in mouse skeletal muscle regeneration

Related Article
Explore at:
Available download formats: zip
Dataset updated
Dec 14, 2021
Dataset provided by
Cornell University
Authors
David McKellar; Iwijn De Vlaminck; Benjamin Cosgrove
License

https://spdx.org/licenses/CC0-1.0.html

Description

Skeletal muscle repair is driven by the coordinated self-renewal and fusion of myogenic stem and progenitor cells. Single-cell gene expression analyses of myogenesis have been hampered by the poor sampling of rare and transient cell states that are critical for muscle repair, and do not inform the spatial context that is important for myogenic differentiation. Here, we demonstrate how large-scale integration of single-cell and spatial transcriptomic data can overcome these limitations. We created a single-cell transcriptomic dataset of mouse skeletal muscle by integration, consensus annotation, and analysis of 23 newly collected scRNAseq datasets and 88 publicly available single-cell (scRNAseq) and single-nucleus (snRNAseq) RNA-sequencing datasets. The resulting dataset includes more than 365,000 cells and spans a wide range of ages, injury, and repair conditions. Together, these data enabled identification of the predominant cell types in skeletal muscle, and resolved cell subtypes, including endothelial subtypes distinguished by vessel-type of origin, fibro/adipogenic progenitors defined by functional roles, and many distinct immune populations. The representation of different experimental conditions and the depth of transcriptome coverage enabled robust profiling of sparsely expressed genes. We built a densely sampled transcriptomic model of myogenesis, from stem cell quiescence to myofiber maturation and identified rare, transitional states of progenitor commitment and fusion that are poorly represented in individual datasets. We performed spatial RNA sequencing of mouse muscle at three time points after injury and used the integrated dataset as a reference to achieve a high-resolution, local deconvolution of cell subtypes. We also used the integrated dataset to explore ligand-receptor co-expression patterns and identify dynamic cell-cell interactions in muscle injury response. 
We provide a public web tool to enable interactive exploration and visualization of the data. Our work supports the utility of large-scale integration of single-cell transcriptomic data as a tool for biological discovery.

Methods

Mice. The Cornell University Institutional Animal Care and Use Committee (IACUC) approved all animal protocols, and experiments were performed in compliance with its institutional guidelines. Adult C57BL/6J mice (Mus musculus) were obtained from Jackson Laboratories (#000664; Bar Harbor, ME) and were used at 4-7 months of age. Aged C57BL/6J mice were obtained from the National Institute of Aging (NIA) Rodent Aging Colony and were used at 20 months of age. For new scRNAseq experiments, female mice were used in each experiment.

Mouse injuries and single-cell isolation. To induce muscle injury, both tibialis anterior (TA) muscles of old (20 months) C57BL/6J mice were injected with 10 µl of notexin (10 µg/ml; Latoxan; France). At 0, 1, 2, 3.5, 5, or 7 days post-injury (dpi), mice were sacrificed and TA muscles were collected and processed independently to generate single-cell suspensions. Muscles were digested with 8 mg/ml Collagenase D (Roche; Switzerland) and 10 U/ml Dispase II (Roche; Switzerland), followed by manual dissociation to generate cell suspensions. Cell suspensions were sequentially filtered through 100 and 40 μm filters (Corning Cellgro #431752 and #431750) to remove debris. Erythrocytes were removed through incubation in erythrocyte lysis buffer (IBI Scientific #89135-030).

Single-cell RNA-sequencing library preparation. After digestion, single-cell suspensions were washed and resuspended in 0.04% BSA in PBS at a concentration of 10^6 cells/ml. Cells were counted manually with a hemocytometer to determine their concentration. Single-cell RNA-sequencing libraries were prepared using the Chromium Single Cell 3’ reagent kit v3 (10x Genomics, PN-1000075; Pleasanton, CA) following the manufacturer’s protocol. Cells were diluted into the Chromium Single Cell A Chip to yield a recovery of 6,000 single-cell transcriptomes. After preparation, libraries were sequenced on a NextSeq 500 (Illumina; San Diego, CA) using 75 cycle high output kits (Index 1 = 8, Read 1 = 26, and Read 2 = 58). Details on estimated sequencing saturation and the number of reads per sample are shown in Sup. Data 1.

Spatial RNA sequencing library preparation. Tibialis anterior muscles of adult (5 mo) C57BL/6J mice were injected with 10 µl notexin (10 µg/ml) at 2, 5, and 7 days prior to collection. Upon collection, tibialis anterior muscles were isolated, embedded in OCT, and frozen fresh in liquid nitrogen. Spatially tagged cDNA libraries were built using the Visium Spatial Gene Expression 3’ Library Construction v1 Kit (10x Genomics, PN-1000187; Pleasanton, CA) (Fig. S7). Optimal tissue permeabilization time for 10 µm thick sections was found to be 15 minutes using the 10x Genomics Visium Tissue Optimization Kit (PN-1000193). H&E stained tissue sections were imaged using a Zeiss PALM MicroBeam laser capture microdissection system, and the images were stitched and processed using Fiji ImageJ software. cDNA libraries were sequenced on an Illumina NextSeq 500 using 150 cycle high output kits (Read 1 = 28 bp, Read 2 = 120 bp, Index 1 = 10 bp, and Index 2 = 10 bp). Frames around the capture area on the Visium slide were aligned manually and spots covering the tissue were selected using Loupe Browser v4.0.0 software (10x Genomics). Sequencing data were then aligned to the mouse reference genome (mm10) using the spaceranger v1.0.0 pipeline to generate a feature-by-spot-barcode expression matrix (10x Genomics).

Download and alignment of single-cell RNA sequencing data. For all samples available via SRA, parallel-fastq-dump (github.com/rvalieris/parallel-fastq-dump) was used to download raw .fastq files. Samples which were only available as .bam files were converted to .fastq format using bamtofastq from 10x Genomics (github.com/10XGenomics/bamtofastq). Raw reads were aligned to the mm10 reference using cellranger (v3.1.0).

Preprocessing and batch correction of single-cell RNA sequencing datasets. First, ambient RNA signal was removed using the default SoupX (v1.4.5) workflow (autoEstCounts and adjustCounts; github.com/constantAmateur/SoupX). Samples were then preprocessed using the standard Seurat (v3.2.1) workflow (NormalizeData, ScaleData, FindVariableFeatures, RunPCA, FindNeighbors, FindClusters, and RunUMAP; github.com/satijalab/seurat). Cells with fewer than 750 features, fewer than 1000 transcripts, or more than 30% of unique transcripts derived from mitochondrial genes were removed. After preprocessing, DoubletFinder (v2.0) was used to identify putative doublets in each dataset individually. BCmvn optimization was used for pK parameterization. Estimated doublet rates were computed by fitting the total number of cells after quality filtering to a linear regression of the expected doublet rates published in the 10x Chromium handbook. Estimated homotypic doublet rates were also accounted for using the modelHomotypic function. The default pN value (0.25) was used. Putative doublets were then removed from each individual dataset. After preprocessing and quality filtering, we merged the datasets and performed batch correction with three tools, independently: Harmony (github.com/immunogenomics/harmony) (v1.0), Scanorama (github.com/brianhie/scanorama) (v1.3), and BBKNN (github.com/Teichlab/bbknn) (v1.3.12). We then used Seurat to process the integrated data. After initial integration, we removed the noisy cluster and re-integrated the data using each of the three batch-correction tools.
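
The three quality filters quoted above can be illustrated with a minimal sketch. This is plain Python written for illustration only; the pipeline described here runs in R with Seurat, and the class and field names below (CellQC, n_features, and so on) are hypothetical. Only the three thresholds come from the text.

```python
from dataclasses import dataclass

# Thresholds quoted in the text.
MIN_FEATURES = 750
MIN_TRANSCRIPTS = 1000
MAX_MITO_PERCENT = 30.0


@dataclass
class CellQC:
    """Per-cell summary statistics (hypothetical field names)."""
    barcode: str
    n_features: int          # number of genes detected in the cell
    n_transcripts: int       # total unique transcripts (UMIs) in the cell
    n_mito_transcripts: int  # unique transcripts from mitochondrial genes


def passes_qc(cell: CellQC) -> bool:
    """Return True if the cell survives all three filters."""
    if cell.n_features < MIN_FEATURES:
        return False
    if cell.n_transcripts < MIN_TRANSCRIPTS:
        return False
    # n_transcripts is at least MIN_TRANSCRIPTS here, so the division is safe.
    mito_percent = 100.0 * cell.n_mito_transcripts / cell.n_transcripts
    return mito_percent <= MAX_MITO_PERCENT


if __name__ == "__main__":
    cells = [
        CellQC("AAAC", 800, 1200, 100),    # kept
        CellQC("AAAG", 700, 5000, 10),     # too few features
        CellQC("AAAT", 2000, 4000, 1600),  # 40% mitochondrial
    ]
    print([c.barcode for c in cells if passes_qc(c)])
```

The sketch treats a cell at exactly 30% mitochondrial transcripts as retained, following the "more than 30%" wording.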

Cell type annotation. Cell types were determined for each integration method independently. For Harmony and Scanorama, dimensions accounting for 95% of the total variance were used to generate SNN graphs (Seurat::FindNeighbors). Louvain clustering was then performed on the output graphs (including the corrected graph output by BBKNN) using Seurat::FindClusters. A clustering resolution of 1.2 was used for Harmony (25 initial clusters), BBKNN (28 initial clusters), and Scanorama (38 initial clusters). Cell types were determined based on expression of canonical genes (Fig. S3). Clusters which had similar canonical marker gene expression patterns were merged.
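
One way to pick "the dimensions accounting for 95% of the total variance" is to accumulate per-dimension variance until the threshold is reached. The sketch below is plain Python for illustration and is not the authors' code. It assumes that the per-dimension standard deviations of the embedding are available and that "total variance" means the sum over the dimensions supplied; the text does not state how the total was defined. The function name dims_for_variance is hypothetical.

```python
from typing import Sequence


def dims_for_variance(stdevs: Sequence[float], fraction: float = 0.95) -> int:
    """Return the smallest number of leading dimensions whose cumulative
    variance reaches `fraction` of the total over the supplied dimensions."""
    if not stdevs:
        raise ValueError("stdevs must not be empty")
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must be in (0, 1]")

    variances = [s * s for s in stdevs]
    total = sum(variances)
    if total <= 0.0:
        raise ValueError("total variance must be positive")

    running = 0.0
    for n_dims, variance in enumerate(variances, start=1):
        running += variance
        if running >= fraction * total:
            return n_dims
    # Guard against floating-point round-off when fraction is 1.0.
    return len(variances)


if __name__ == "__main__":
    # Toy standard deviations, in decreasing order as a PCA would report them.
    print(dims_for_variance([3.0, 2.0, 1.0, 1.0]))
```

The returned count would then be used as the upper bound of the dimension range passed to the neighbor-graph step.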

Pseudotime workflow. Cells were subset based on the consensus cell types between all three integration methods. Harmony embedding values from the dimensions accounting for 95% of the total variance were used for further dimensional reduction with PHATE, using phateR (v1.0.4) (github.com/KrishnaswamyLab/phateR).

Deconvolution of spatial RNA sequencing spots. Spot deconvolution was performed using the deconvolution module in BayesPrism (previously known as “Tumor microEnvironment Deconvolution”, TED, v1.0; github.com/Danko-Lab/TED). First, myogenic cells were re-labeled, according to binning along the first PHATE dimension, as “Quiescent MuSCs” (bins 4-5), “Activated MuSCs” (bins 6-7), “Committed Myoblasts” (bins 8-10), and “Fusing Myocytes” (bins 11-18). Culture-associated muscle stem cells were ignored and myonuclei labels were retained as “Myonuclei (Type IIb)” and “Myonuclei (Type IIx)”. Next, highly and differentially expressed genes across the 25 groups of cells were identified with differential gene expression analysis using Seurat (FindAllMarkers, using Wilcoxon Rank Sum Test; results in Sup. Data 2). The resulting genes were filtered based on average log2-fold change (avg_logFC > 1) and the percentage of cells within the cluster which express each gene (pct.expressed > 0.5), yielding 1,069 genes. Mitochondrial and ribosomal protein genes were also removed from this list, in line with recommendations in the BayesPrism vignette. For each of the cell types, mean raw counts were calculated across the 1,069 genes to generate a gene expression profile for BayesPrism. Raw counts for each spot were then passed to the run.Ted function, using
