3 datasets found

n
Data from: Large-scale integration of single-cell transcriptomic data...
data.niaid.nih.gov
dataone.org
+1more
zip
Updated Dec 14, 2021
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
David McKellar; Iwijn De Vlaminck; Benjamin Cosgrove (2021). Large-scale integration of single-cell transcriptomic data captures transitional progenitor states in mouse skeletal muscle regeneration [Dataset]. http://doi.org/10.5061/dryad.t4b8gtj34
Explore at:
zipAvailable download formats
Unique identifier
https://doi.org/10.5061/dryad.t4b8gtj34
Dataset updated
Dec 14, 2021
Dataset provided by
Cornell University
Authors
David McKellar; Iwijn De Vlaminck; Benjamin Cosgrove
License
https://spdx.org/licenses/CC0-1.0.htmlhttps://spdx.org/licenses/CC0-1.0.html
Description
Skeletal muscle repair is driven by the coordinated self-renewal and fusion of myogenic stem and progenitor cells. Single-cell gene expression analyses of myogenesis have been hampered by the poor sampling of rare and transient cell states that are critical for muscle repair, and do not inform the spatial context that is important for myogenic differentiation. Here, we demonstrate how large-scale integration of single-cell and spatial transcriptomic data can overcome these limitations. We created a single-cell transcriptomic dataset of mouse skeletal muscle by integration, consensus annotation, and analysis of 23 newly collected scRNAseq datasets and 88 publicly available single-cell (scRNAseq) and single-nucleus (snRNAseq) RNA-sequencing datasets. The resulting dataset includes more than 365,000 cells and spans a wide range of ages, injury, and repair conditions. Together, these data enabled identification of the predominant cell types in skeletal muscle, and resolved cell subtypes, including endothelial subtypes distinguished by vessel-type of origin, fibro/adipogenic progenitors defined by functional roles, and many distinct immune populations. The representation of different experimental conditions and the depth of transcriptome coverage enabled robust profiling of sparsely expressed genes. We built a densely sampled transcriptomic model of myogenesis, from stem cell quiescence to myofiber maturation and identified rare, transitional states of progenitor commitment and fusion that are poorly represented in individual datasets. We performed spatial RNA sequencing of mouse muscle at three time points after injury and used the integrated dataset as a reference to achieve a high-resolution, local deconvolution of cell subtypes. We also used the integrated dataset to explore ligand-receptor co-expression patterns and identify dynamic cell-cell interactions in muscle injury response. We provide a public web tool to enable interactive exploration and visualization of the data. Our work supports the utility of large-scale integration of single-cell transcriptomic data as a tool for biological discovery.

Methods Mice. The Cornell University Institutional Animal Care and Use Committee (IACUC) approved all animal protocols, and experiments were performed in compliance with its institutional guidelines. Adult C57BL/6J mice (mus musculus) were obtained from Jackson Laboratories (#000664; Bar Harbor, ME) and were used at 4-7 months of age. Aged C57BL/6J mice were obtained from the National Institute of Aging (NIA) Rodent Aging Colony and were used at 20 months of age. For new scRNAseq experiments, female mice were used in each experiment.

Mouse injuries and single-cell isolation. To induce muscle injury, both tibialis anterior (TA) muscles of old (20 months) C57BL/6J mice were injected with 10 µl of notexin (10 µg/ml; Latoxan; France). At 0, 1, 2, 3.5, 5, or 7 days post-injury (dpi), mice were sacrificed and TA muscles were collected and processed independently to generate single-cell suspensions. Muscles were digested with 8 mg/ml Collagenase D (Roche; Switzerland) and 10 U/ml Dispase II (Roche; Switzerland), followed by manual dissociation to generate cell suspensions. Cell suspensions were sequentially filtered through 100 and 40 μm filters (Corning Cellgro #431752 and #431750) to remove debris. Erythrocytes were removed through incubation in erythrocyte lysis buffer (IBI Scientific #89135-030).

Single-cell RNA-sequencing library preparation. After digestion, single-cell suspensions were washed and resuspended in 0.04% BSA in PBS at a concentration of 106 cells/ml. Cells were counted manually with a hemocytometer to determine their concentration. Single-cell RNA-sequencing libraries were prepared using the Chromium Single Cell 3’ reagent kit v3 (10x Genomics, PN-1000075; Pleasanton, CA) following the manufacturer’s protocol. Cells were diluted into the Chromium Single Cell A Chip to yield a recovery of 6,000 single-cell transcriptomes. After preparation, libraries were sequenced using on a NextSeq 500 (Illumina; San Diego, CA) using 75 cycle high output kits (Index 1 = 8, Read 1 = 26, and Read 2 = 58). Details on estimated sequencing saturation and the number of reads per sample are shown in Sup. Data 1.

Spatial RNA sequencing library preparation. Tibialis anterior muscles of adult (5 mo) C57BL6/J mice were injected with 10µl notexin (10 µg/ml) at 2, 5, and 7 days prior to collection. Upon collection, tibialis anterior muscles were isolated, embedded in OCT, and frozen fresh in liquid nitrogen. Spatially tagged cDNA libraries were built using the Visium Spatial Gene Expression 3’ Library Construction v1 Kit (10x Genomics, PN-1000187; Pleasanton, CA) (Fig. S7). Optimal tissue permeabilization time for 10 µm thick sections was found to be 15 minutes using the 10x Genomics Visium Tissue Optimization Kit (PN-1000193). H&E stained tissue sections were imaged using Zeiss PALM MicroBeam laser capture microdissection system and the images were stitched and processed using Fiji ImageJ software. cDNA libraries were sequenced on an Illumina NextSeq 500 using 150 cycle high output kits (Read 1=28bp, Read 2=120bp, Index 1=10bp, and Index 2=10bp). Frames around the capture area on the Visium slide were aligned manually and spots covering the tissue were selected using Loop Browser v4.0.0 software (10x Genomics). Sequencing data was then aligned to the mouse reference genome (mm10) using the spaceranger v1.0.0 pipeline to generate a feature-by-spot-barcode expression matrix (10x Genomics).

Download and alignment of single-cell RNA sequencing data. For all samples available via SRA, parallel-fastq-dump (github.com/rvalieris/parallel-fastq-dump) was used to download raw .fastq files. Samples which were only available as .bam files were converted to .fastq format using bamtofastq from 10x Genomics (github.com/10XGenomics/bamtofastq). Raw reads were aligned to the mm10 reference using cellranger (v3.1.0).

Preprocessing and batch correction of single-cell RNA sequencing datasets. First, ambient RNA signal was removed using the default SoupX (v1.4.5) workflow (autoEstCounts and adjustCounts; github.com/constantAmateur/SoupX). Samples were then preprocessed using the standard Seurat (v3.2.1) workflow (NormalizeData, ScaleData, FindVariableFeatures, RunPCA, FindNeighbors, FindClusters, and RunUMAP; github.com/satijalab/seurat). Cells with fewer than 750 features, fewer than 1000 transcripts, or more than 30% of unique transcripts derived from mitochondrial genes were removed. After preprocessing, DoubletFinder (v2.0) was used to identify putative doublets in each dataset, individually. BCmvn optimization was used for PK parameterization. Estimated doublet rates were computed by fitting the total number of cells after quality filtering to a linear regression of the expected doublet rates published in the 10x Chromium handbook. Estimated homotypic doublet rates were also accounted for using the modelHomotypic function. The default PN value (0.25) was used. Putative doublets were then removed from each individual dataset. After preprocessing and quality filtering, we merged the datasets and performed batch-correction with three tools, independently- Harmony (github.com/immunogenomics/harmony) (v1.0), Scanorama (github.com/brianhie/scanorama) (v1.3), and BBKNN (github.com/Teichlab/bbknn) (v1.3.12). We then used Seurat to process the integrated data. After initial integration, we removed the noisy cluster and re-integrated the data using each of the three batch-correction tools.

Cell type annotation. Cell types were determined for each integration method independently. For Harmony and Scanorama, dimensions accounting for 95% of the total variance were used to generate SNN graphs (Seurat::FindNeighbors). Louvain clustering was then performed on the output graphs (including the corrected graph output by BBKNN) using Seurat::FindClusters. A clustering resolution of 1.2 was used for Harmony (25 initial clusters), BBKNN (28 initial clusters), and Scanorama (38 initial clusters). Cell types were determined based on expression of canonical genes (Fig. S3). Clusters which had similar canonical marker gene expression patterns were merged.

Pseudotime workflow. Cells were subset based on the consensus cell types between all three integration methods. Harmony embedding values from the dimensions accounting for 95% of the total variance were used for further dimensional reduction with PHATE, using phateR (v1.0.4) (github.com/KrishnaswamyLab/phateR).

Deconvolution of spatial RNA sequencing spots. Spot deconvolution was performed using the deconvolution module in BayesPrism (previously known as “Tumor microEnvironment Deconvolution”, TED, v1.0; github.com/Danko-Lab/TED). First, myogenic cells were re-labeled, according to binning along the first PHATE dimension, as “Quiescent MuSCs” (bins 4-5), “Activated MuSCs” (bins 6-7), “Committed Myoblasts” (bins 8-10), and “Fusing Myoctes” (bins 11-18). Culture-associated muscle stem cells were ignored and myonuclei labels were retained as “Myonuclei (Type IIb)” and “Myonuclei (Type IIx)”. Next, highly and differentially expressed genes across the 25 groups of cells were identified with differential gene expression analysis using Seurat (FindAllMarkers, using Wilcoxon Rank Sum Test; results in Sup. Data 2). The resulting genes were filtered based on average log2-fold change (avg_logFC > 1) and the percentage of cells within the cluster which express each gene (pct.expressed > 0.5), yielding 1,069 genes. Mitochondrial and ribosomal protein genes were also removed from this list, in line with recommendations in the BayesPrism vignette. For each of the cell types, mean raw counts were calculated across the 1,069 genes to generate a gene expression profile for BayesPrism. Raw counts for each spot were then passed to the run.Ted function, using
A Spatial Transcriptomics Atlas of the Malaria-infected Liver Indicates a...
zenodo.org
data.niaid.nih.gov
Updated Sep 20, 2023
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Franziska Hildebrandt; Franziska Hildebrandt; Miren Urrutia Iturritza; Miren Urrutia Iturritza; Christian Zwicker; Bavo Vanneste; Noémi Van Hul; Elisa Semle; Tales Pascini; Sami Saarenpää; Mengxiao He; Emma R. Andersson; Charlotte L. Scott; Joel Vega-Rodriguez; Joakim Lundeberg; Johan Ankarklev; Christian Zwicker; Bavo Vanneste; Noémi Van Hul; Elisa Semle; Tales Pascini; Sami Saarenpää; Mengxiao He; Emma R. Andersson; Charlotte L. Scott; Joel Vega-Rodriguez; Joakim Lundeberg; Johan Ankarklev (2023). A Spatial Transcriptomics Atlas of the Malaria-infected Liver Indicates a Crucial Role for Lipid Metabolism and Hotspots of Inflammatory Cell Infiltration [Dataset]. http://doi.org/10.5281/zenodo.8328679
Explore at:
Unique identifier
https://doi.org/10.5281/zenodo.8328679
Dataset updated
Sep 20, 2023
Dataset provided by
Zenodohttp://zenodo.org/
Authors
Franziska Hildebrandt; Franziska Hildebrandt; Miren Urrutia Iturritza; Miren Urrutia Iturritza; Christian Zwicker; Bavo Vanneste; Noémi Van Hul; Elisa Semle; Tales Pascini; Sami Saarenpää; Mengxiao He; Emma R. Andersson; Charlotte L. Scott; Joel Vega-Rodriguez; Joakim Lundeberg; Johan Ankarklev; Christian Zwicker; Bavo Vanneste; Noémi Van Hul; Elisa Semle; Tales Pascini; Sami Saarenpää; Mengxiao He; Emma R. Andersson; Charlotte L. Scott; Joel Vega-Rodriguez; Joakim Lundeberg; Johan Ankarklev
Description
Dataset created in the study "A Spatial Transcriptomics Atlas of the Malaria-infected Liver Indicates a Crucial Role for Lipid Metabolism and Hotspots of Inflammatory Cell Infiltration"

Structure

ST_berghei_liver

contains data generated during stpipeline analysis and imaging on 2k arrays Spatial Transcriptomics platform as well as data necessary for and from hepaquery analysis. These samples include 38 sections in total of which 8 are from mice (n=4) infected with sporozoites for 12h, 5 sections from control mice (n=3) at 12h, 7 sections from mice (n=4) infected with sporozoites for 24h and 4 sections from control mice (n=3) for 24 as well as 8 samples of mice (n=2) infected with sporozoites for 38h and control mice (n =2) for 38h.

count contains gene expression matrix output from stpipeline in .tsv format

spotfiles contains coordinate files for count matrices

images contains scaled H&E, Fluorescence (FL) and annotated H&E images (from FL annotations) scaled to 10% of the original image size.

masks contains image masks for hepaquery analysis

distances contains distance measurements from original section sorted by timepoint as well as combined across timepoints

cluster contains clustering information across spatial positions used in spatial enrichment analysis

STUtiility_mus_pb_ST.RDS describes seurat object generated using the STUtility package using ST data of the 38 liver sections of which the data is stored in ST_berghei_liver

visium_berghei_liver

contains data generated with the spaceranger pipeline and imaging using the Visium spatial transcriptomics platform. These samples include 8 sections in total, of which 1 was infected with sporozoites for 12h, 1 control section at 12h, 1 section infected with sporozoites for 24h and 1 control section at 24 as well as 2 sporozoite infected sections, and 2 control sections at 38h.

V10S29-135_A1 contains spaceranger output for section 1 for infected and control sections at 38h post-infection

V10S29-135_B1 contains spaceranger output for section 1 for infected and control sections at 12h post-infection

V10S29-135_C1 contains spaceranger output for section 1 for infected and control sections at 24h post-infection

V10S29-135_D1 contains spaceranger output for section 2 for infected and control sections at 38h post-infection

se_visium.RDS describes seurat object generated using the STUtility package using ST data of the 38 liver sections of which the data is stored in visium_berghei_liver

snSeq_berghei_liver

contains data generated with the cellranger pipeline and imaging using the Visium spatial transcriptomics platform. These samples include single nuclei of 2 infected and control mice after 12h, 2 infected and control mice after 24h, 2 infected and control mice after 38h, and 2 uninfected mice prior to a challenge.

cellranger_cnt_out contains feature count matrix information from cell ranger output

final_merged_curated_annotations_270623.RDS describes seurat object generated using the STUtility package using ST data of the 38 liver sections of which the data is stored in snSeq_berghei_liver.tar.gz

raw images.zip contains raw images for supplementary figures 20-22

adjusted images.zip contains brightness and contrast adjusted images for supplementary figures 20-22
Data from: Host-Pathogen Interactions in the Plasmodium-Infected Mouse Liver...
zenodo.org
application/gzip, bin +1
Updated Jan 5, 2024
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Franziska Hildebrandt; Franziska Hildebrandt; Miren Urrutia Iturritza; Miren Urrutia Iturritza; Christian Zwicker; Bavo Vanneste; Noémi Van Hul; Elisa Semle; Tales Pascini; Sami Saarenpää; Mengxiao He; Emma R. Andersson; Charlotte L. Scott; Joel Vega-Rodriguez; Joakim Lundeberg; Johan Ankarklev; Christian Zwicker; Bavo Vanneste; Noémi Van Hul; Elisa Semle; Tales Pascini; Sami Saarenpää; Mengxiao He; Emma R. Andersson; Charlotte L. Scott; Joel Vega-Rodriguez; Joakim Lundeberg; Johan Ankarklev (2024). Host-Pathogen Interactions in the Plasmodium-Infected Mouse Liver at Spatial and Single-Cell Resolution [Dataset]. http://doi.org/10.5281/zenodo.8386528
Explore at:
bin, zip, application/gzipAvailable download formats
Unique identifier
https://doi.org/10.5281/zenodo.8386528
Dataset updated
Jan 5, 2024
Dataset provided by
Zenodohttp://zenodo.org/
Authors
Franziska Hildebrandt; Franziska Hildebrandt; Miren Urrutia Iturritza; Miren Urrutia Iturritza; Christian Zwicker; Bavo Vanneste; Noémi Van Hul; Elisa Semle; Tales Pascini; Sami Saarenpää; Mengxiao He; Emma R. Andersson; Charlotte L. Scott; Joel Vega-Rodriguez; Joakim Lundeberg; Johan Ankarklev; Christian Zwicker; Bavo Vanneste; Noémi Van Hul; Elisa Semle; Tales Pascini; Sami Saarenpää; Mengxiao He; Emma R. Andersson; Charlotte L. Scott; Joel Vega-Rodriguez; Joakim Lundeberg; Johan Ankarklev
Description
Dataset created in the study "A Spatial Transcriptomics Atlas of the Malaria-infected Liver Indicates a Crucial Role for Lipid Metabolism and Hotspots of Inflammatory Cell Infiltration"
Structure
ST_berghei_liver
contains data generated during stpipeline analysis and imaging on 2k arrays Spatial Transcriptomics platform as well as data necessary for and from hepaquery analysis. These samples include 38 sections in total of which 8 are from mice (n=4) infected with sporozoites for 12h, 5 sections from control mice (n=3) at 12h, 7 sections from mice (n=4) infected with sporozoites for 24h and 4 sections from control mice (n=3) for 24 as well as 8 samples of mice (n=2) infected with sporozoites for 38h and control mice (n =2) for 38h.
count contains gene expression matrix output from stpipeline in .tsv format
spotfiles contains coordinate files for count matrices
images contains scaled H&E, Fluorescence (FL) and annotated H&E images (from FL annotations) scaled to 10% of the original image size.
masks contains image masks for hepaquery analysis
distances contains distance measurements from original section sorted by timepoint as well as combined across timepoints
cluster contains clustering information across spatial positions used in spatial enrichment analysis
STUtiility_mus_pb_ST.RDS describes seurat object generated using the STUtility package using ST data of the 38 liver sections of which the data is stored in ST_berghei_liver
h5ad
contains anndata files of ST data (normalized read counts), spot information, distance measurements, images and masks generated using the hepaquery package.
visium_berghei_liver
contains data generated with the spaceranger pipeline and imaging using the Visium spatial transcriptomics platform. These samples include 8 sections in total, of which 1 was infected with sporozoites for 12h, 1 control section at 12h, 1 section infected with sporozoites for 24h and 1 control section at 24 as well as 2 sporozoite infected sections, and 2 control sections at 38h.
V10S29-135_A1 contains spaceranger output for section 1 for infected and control sections at 38h post-infection
V10S29-135_B1 contains spaceranger output for section 1 for infected and control sections at 12h post-infection
V10S29-135_C1 contains spaceranger output for section 1 for infected and control sections at 24h post-infection
V10S29-135_D1 contains spaceranger output for section 2 for infected and control sections at 38h post-infection
se_visium.RDS describes seurat object generated using the STUtility package using ST data of the 38 liver sections of which the data is stored in visium_berghei_liver
snSeq_berghei_liver
contains data generated with the cellranger pipeline and imaging using the Visium spatial transcriptomics platform. These samples include single nuclei of 2 infected and control mice after 12h, 2 infected and control mice after 24h, 2 infected and control mice after 38h, and 2 uninfected mice prior to a challenge.
cellranger_cnt_out contains feature count matrix information from cell ranger output
final_merged_curated_annotations_270623.RDS describes seurat object generated using the STUtility package using ST data of the 38 liver sections of which the data is stored in snSeq_berghei_liver.tar.gz
raw images.zip contains raw images for supplementary figures 20-22
adjusted images.zip contains brightness and contrast adjusted images for supplementary figures 20-22
Not seeing a result you expected?
Learn how you can add new datasets to our index.

Facebook

Twitter

Click to copy link

Link copied

Cite

David McKellar; Iwijn De Vlaminck; Benjamin Cosgrove (2021). Large-scale integration of single-cell transcriptomic data captures transitional progenitor states in mouse skeletal muscle regeneration [Dataset]. http://doi.org/10.5061/dryad.t4b8gtj34

Data from: Large-scale integration of single-cell transcriptomic data captures transitional progenitor states in mouse skeletal muscle regeneration

Explore at:

zipAvailable download formats

Unique identifier

https://doi.org/10.5061/dryad.t4b8gtj34

Dataset updated

Dec 14, 2021

Dataset provided by

Cornell University

Authors

David McKellar; Iwijn De Vlaminck; Benjamin Cosgrove

License

https://spdx.org/licenses/CC0-1.0.htmlhttps://spdx.org/licenses/CC0-1.0.html

Description

Skeletal muscle repair is driven by the coordinated self-renewal and fusion of myogenic stem and progenitor cells. Single-cell gene expression analyses of myogenesis have been hampered by the poor sampling of rare and transient cell states that are critical for muscle repair, and do not inform the spatial context that is important for myogenic differentiation. Here, we demonstrate how large-scale integration of single-cell and spatial transcriptomic data can overcome these limitations. We created a single-cell transcriptomic dataset of mouse skeletal muscle by integration, consensus annotation, and analysis of 23 newly collected scRNAseq datasets and 88 publicly available single-cell (scRNAseq) and single-nucleus (snRNAseq) RNA-sequencing datasets. The resulting dataset includes more than 365,000 cells and spans a wide range of ages, injury, and repair conditions. Together, these data enabled identification of the predominant cell types in skeletal muscle, and resolved cell subtypes, including endothelial subtypes distinguished by vessel-type of origin, fibro/adipogenic progenitors defined by functional roles, and many distinct immune populations. The representation of different experimental conditions and the depth of transcriptome coverage enabled robust profiling of sparsely expressed genes. We built a densely sampled transcriptomic model of myogenesis, from stem cell quiescence to myofiber maturation and identified rare, transitional states of progenitor commitment and fusion that are poorly represented in individual datasets. We performed spatial RNA sequencing of mouse muscle at three time points after injury and used the integrated dataset as a reference to achieve a high-resolution, local deconvolution of cell subtypes. We also used the integrated dataset to explore ligand-receptor co-expression patterns and identify dynamic cell-cell interactions in muscle injury response. We provide a public web tool to enable interactive exploration and visualization of the data. Our work supports the utility of large-scale integration of single-cell transcriptomic data as a tool for biological discovery.

Methods Mice. The Cornell University Institutional Animal Care and Use Committee (IACUC) approved all animal protocols, and experiments were performed in compliance with its institutional guidelines. Adult C57BL/6J mice (mus musculus) were obtained from Jackson Laboratories (#000664; Bar Harbor, ME) and were used at 4-7 months of age. Aged C57BL/6J mice were obtained from the National Institute of Aging (NIA) Rodent Aging Colony and were used at 20 months of age. For new scRNAseq experiments, female mice were used in each experiment.

Mouse injuries and single-cell isolation. To induce muscle injury, both tibialis anterior (TA) muscles of old (20 months) C57BL/6J mice were injected with 10 µl of notexin (10 µg/ml; Latoxan; France). At 0, 1, 2, 3.5, 5, or 7 days post-injury (dpi), mice were sacrificed and TA muscles were collected and processed independently to generate single-cell suspensions. Muscles were digested with 8 mg/ml Collagenase D (Roche; Switzerland) and 10 U/ml Dispase II (Roche; Switzerland), followed by manual dissociation to generate cell suspensions. Cell suspensions were sequentially filtered through 100 and 40 μm filters (Corning Cellgro #431752 and #431750) to remove debris. Erythrocytes were removed through incubation in erythrocyte lysis buffer (IBI Scientific #89135-030).

Single-cell RNA-sequencing library preparation. After digestion, single-cell suspensions were washed and resuspended in 0.04% BSA in PBS at a concentration of 106 cells/ml. Cells were counted manually with a hemocytometer to determine their concentration. Single-cell RNA-sequencing libraries were prepared using the Chromium Single Cell 3’ reagent kit v3 (10x Genomics, PN-1000075; Pleasanton, CA) following the manufacturer’s protocol. Cells were diluted into the Chromium Single Cell A Chip to yield a recovery of 6,000 single-cell transcriptomes. After preparation, libraries were sequenced using on a NextSeq 500 (Illumina; San Diego, CA) using 75 cycle high output kits (Index 1 = 8, Read 1 = 26, and Read 2 = 58). Details on estimated sequencing saturation and the number of reads per sample are shown in Sup. Data 1.

Spatial RNA sequencing library preparation. Tibialis anterior muscles of adult (5 mo) C57BL6/J mice were injected with 10µl notexin (10 µg/ml) at 2, 5, and 7 days prior to collection. Upon collection, tibialis anterior muscles were isolated, embedded in OCT, and frozen fresh in liquid nitrogen. Spatially tagged cDNA libraries were built using the Visium Spatial Gene Expression 3’ Library Construction v1 Kit (10x Genomics, PN-1000187; Pleasanton, CA) (Fig. S7). Optimal tissue permeabilization time for 10 µm thick sections was found to be 15 minutes using the 10x Genomics Visium Tissue Optimization Kit (PN-1000193). H&E stained tissue sections were imaged using Zeiss PALM MicroBeam laser capture microdissection system and the images were stitched and processed using Fiji ImageJ software. cDNA libraries were sequenced on an Illumina NextSeq 500 using 150 cycle high output kits (Read 1=28bp, Read 2=120bp, Index 1=10bp, and Index 2=10bp). Frames around the capture area on the Visium slide were aligned manually and spots covering the tissue were selected using Loop Browser v4.0.0 software (10x Genomics). Sequencing data was then aligned to the mouse reference genome (mm10) using the spaceranger v1.0.0 pipeline to generate a feature-by-spot-barcode expression matrix (10x Genomics).

Download and alignment of single-cell RNA sequencing data. For all samples available via SRA, parallel-fastq-dump (github.com/rvalieris/parallel-fastq-dump) was used to download raw .fastq files. Samples which were only available as .bam files were converted to .fastq format using bamtofastq from 10x Genomics (github.com/10XGenomics/bamtofastq). Raw reads were aligned to the mm10 reference using cellranger (v3.1.0).

Preprocessing and batch correction of single-cell RNA sequencing datasets. First, ambient RNA signal was removed using the default SoupX (v1.4.5) workflow (autoEstCounts and adjustCounts; github.com/constantAmateur/SoupX). Samples were then preprocessed using the standard Seurat (v3.2.1) workflow (NormalizeData, ScaleData, FindVariableFeatures, RunPCA, FindNeighbors, FindClusters, and RunUMAP; github.com/satijalab/seurat). Cells with fewer than 750 features, fewer than 1000 transcripts, or more than 30% of unique transcripts derived from mitochondrial genes were removed. After preprocessing, DoubletFinder (v2.0) was used to identify putative doublets in each dataset, individually. BCmvn optimization was used for PK parameterization. Estimated doublet rates were computed by fitting the total number of cells after quality filtering to a linear regression of the expected doublet rates published in the 10x Chromium handbook. Estimated homotypic doublet rates were also accounted for using the modelHomotypic function. The default PN value (0.25) was used. Putative doublets were then removed from each individual dataset. After preprocessing and quality filtering, we merged the datasets and performed batch-correction with three tools, independently- Harmony (github.com/immunogenomics/harmony) (v1.0), Scanorama (github.com/brianhie/scanorama) (v1.3), and BBKNN (github.com/Teichlab/bbknn) (v1.3.12). We then used Seurat to process the integrated data. After initial integration, we removed the noisy cluster and re-integrated the data using each of the three batch-correction tools.

Cell type annotation. Cell types were determined for each integration method independently. For Harmony and Scanorama, dimensions accounting for 95% of the total variance were used to generate SNN graphs (Seurat::FindNeighbors). Louvain clustering was then performed on the output graphs (including the corrected graph output by BBKNN) using Seurat::FindClusters. A clustering resolution of 1.2 was used for Harmony (25 initial clusters), BBKNN (28 initial clusters), and Scanorama (38 initial clusters). Cell types were determined based on expression of canonical genes (Fig. S3). Clusters which had similar canonical marker gene expression patterns were merged.

Pseudotime workflow. Cells were subset based on the consensus cell types between all three integration methods. Harmony embedding values from the dimensions accounting for 95% of the total variance were used for further dimensional reduction with PHATE, using phateR (v1.0.4) (github.com/KrishnaswamyLab/phateR).

Deconvolution of spatial RNA sequencing spots. Spot deconvolution was performed using the deconvolution module in BayesPrism (previously known as “Tumor microEnvironment Deconvolution”, TED, v1.0; github.com/Danko-Lab/TED). First, myogenic cells were re-labeled, according to binning along the first PHATE dimension, as “Quiescent MuSCs” (bins 4-5), “Activated MuSCs” (bins 6-7), “Committed Myoblasts” (bins 8-10), and “Fusing Myoctes” (bins 11-18). Culture-associated muscle stem cells were ignored and myonuclei labels were retained as “Myonuclei (Type IIb)” and “Myonuclei (Type IIx)”. Next, highly and differentially expressed genes across the 25 groups of cells were identified with differential gene expression analysis using Seurat (FindAllMarkers, using Wilcoxon Rank Sum Test; results in Sup. Data 2). The resulting genes were filtered based on average log2-fold change (avg_logFC > 1) and the percentage of cells within the cluster which express each gene (pct.expressed > 0.5), yielding 1,069 genes. Mitochondrial and ribosomal protein genes were also removed from this list, in line with recommendations in the BayesPrism vignette. For each of the cell types, mean raw counts were calculated across the 1,069 genes to generate a gene expression profile for BayesPrism. Raw counts for each spot were then passed to the run.Ted function, using

Clear search

Close search

Google apps

Main menu

Data from: Large-scale integration of single-cell transcriptomic data...

A Spatial Transcriptomics Atlas of the Malaria-infected Liver Indicates a...

Data from: Host-Pathogen Interactions in the Plasmodium-Infected Mouse Liver...

Data from: Large-scale integration of single-cell transcriptomic data captures transitional progenitor states in mouse skeletal muscle regeneration