5 datasets found

Visium DLPFC preprocessed .rds
figshare.com
application/gzip
Updated Jun 11, 2023
Cite
Marco Varrone (2023). Visium DLPFC preprocessed .rds [Dataset]. http://doi.org/10.6084/m9.figshare.22004750.v1
Available download formats: application/gzip
Unique identifier
https://doi.org/10.6084/m9.figshare.22004750.v1
Dataset updated
Jun 11, 2023
Dataset provided by
Figshare (http://figshare.com/)
Authors
Marco Varrone
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Preprocessed .rds data from https://github.com/LieberInstitute/spatialLIBD used for the benchmarking of the spatial clustering methods in CellCharter.
Data from: Large-scale integration of single-cell transcriptomic data...
data.niaid.nih.gov
zip
Updated Dec 14, 2021
Cite
David McKellar; Iwijn De Vlaminck; Benjamin Cosgrove (2021). Large-scale integration of single-cell transcriptomic data captures transitional progenitor states in mouse skeletal muscle regeneration [Dataset]. http://doi.org/10.5061/dryad.t4b8gtj34
Available download formats: zip
Unique identifier
https://doi.org/10.5061/dryad.t4b8gtj34
Dataset updated
Dec 14, 2021
Dataset provided by
Cornell University
Authors
David McKellar; Iwijn De Vlaminck; Benjamin Cosgrove
License
CC0-1.0: https://spdx.org/licenses/CC0-1.0.html
Description
Skeletal muscle repair is driven by the coordinated self-renewal and fusion of myogenic stem and progenitor cells. Single-cell gene expression analyses of myogenesis have been hampered by the poor sampling of rare and transient cell states that are critical for muscle repair, and do not inform the spatial context that is important for myogenic differentiation. Here, we demonstrate how large-scale integration of single-cell and spatial transcriptomic data can overcome these limitations. We created a single-cell transcriptomic dataset of mouse skeletal muscle by integration, consensus annotation, and analysis of 23 newly collected scRNAseq datasets and 88 publicly available single-cell (scRNAseq) and single-nucleus (snRNAseq) RNA-sequencing datasets. The resulting dataset includes more than 365,000 cells and spans a wide range of ages, injury, and repair conditions. Together, these data enabled identification of the predominant cell types in skeletal muscle, and resolved cell subtypes, including endothelial subtypes distinguished by vessel-type of origin, fibro/adipogenic progenitors defined by functional roles, and many distinct immune populations. The representation of different experimental conditions and the depth of transcriptome coverage enabled robust profiling of sparsely expressed genes. We built a densely sampled transcriptomic model of myogenesis, from stem cell quiescence to myofiber maturation and identified rare, transitional states of progenitor commitment and fusion that are poorly represented in individual datasets. We performed spatial RNA sequencing of mouse muscle at three time points after injury and used the integrated dataset as a reference to achieve a high-resolution, local deconvolution of cell subtypes. We also used the integrated dataset to explore ligand-receptor co-expression patterns and identify dynamic cell-cell interactions in muscle injury response. 
We provide a public web tool to enable interactive exploration and visualization of the data. Our work supports the utility of large-scale integration of single-cell transcriptomic data as a tool for biological discovery.

Methods

Mice. The Cornell University Institutional Animal Care and Use Committee (IACUC) approved all animal protocols, and experiments were performed in compliance with its institutional guidelines. Adult C57BL/6J mice (Mus musculus) were obtained from Jackson Laboratories (#000664; Bar Harbor, ME) and were used at 4-7 months of age. Aged C57BL/6J mice were obtained from the National Institute on Aging (NIA) Rodent Aging Colony and were used at 20 months of age. For new scRNAseq experiments, female mice were used in each experiment.

Mouse injuries and single-cell isolation. To induce muscle injury, both tibialis anterior (TA) muscles of old (20 months) C57BL/6J mice were injected with 10 µl of notexin (10 µg/ml; Latoxan; France). At 0, 1, 2, 3.5, 5, or 7 days post-injury (dpi), mice were sacrificed and TA muscles were collected and processed independently to generate single-cell suspensions. Muscles were digested with 8 mg/ml Collagenase D (Roche; Switzerland) and 10 U/ml Dispase II (Roche; Switzerland), followed by manual dissociation to generate cell suspensions. Cell suspensions were sequentially filtered through 100 and 40 μm filters (Corning Cellgro #431752 and #431750) to remove debris. Erythrocytes were removed through incubation in erythrocyte lysis buffer (IBI Scientific #89135-030).

Single-cell RNA-sequencing library preparation. After digestion, single-cell suspensions were washed and resuspended in 0.04% BSA in PBS at a concentration of 10^6 cells/ml. Cells were counted manually with a hemocytometer to determine their concentration. Single-cell RNA-sequencing libraries were prepared using the Chromium Single Cell 3’ reagent kit v3 (10x Genomics, PN-1000075; Pleasanton, CA) following the manufacturer’s protocol. Cells were diluted into the Chromium Single Cell A Chip to yield a recovery of 6,000 single-cell transcriptomes. After preparation, libraries were sequenced on a NextSeq 500 (Illumina; San Diego, CA) using 75 cycle high output kits (Index 1 = 8, Read 1 = 26, and Read 2 = 58). Details on estimated sequencing saturation and the number of reads per sample are shown in Sup. Data 1.

Spatial RNA sequencing library preparation. Tibialis anterior muscles of adult (5 mo) C57BL/6J mice were injected with 10 µl notexin (10 µg/ml) at 2, 5, and 7 days prior to collection. Upon collection, tibialis anterior muscles were isolated, embedded in OCT, and frozen fresh in liquid nitrogen. Spatially tagged cDNA libraries were built using the Visium Spatial Gene Expression 3’ Library Construction v1 Kit (10x Genomics, PN-1000187; Pleasanton, CA) (Fig. S7). Optimal tissue permeabilization time for 10 µm thick sections was found to be 15 minutes using the 10x Genomics Visium Tissue Optimization Kit (PN-1000193). H&E stained tissue sections were imaged using a Zeiss PALM MicroBeam laser capture microdissection system and the images were stitched and processed using Fiji ImageJ software. cDNA libraries were sequenced on an Illumina NextSeq 500 using 150 cycle high output kits (Read 1=28bp, Read 2=120bp, Index 1=10bp, and Index 2=10bp). Frames around the capture area on the Visium slide were aligned manually and spots covering the tissue were selected using Loupe Browser v4.0.0 software (10x Genomics). Sequencing data were then aligned to the mouse reference genome (mm10) using the spaceranger v1.0.0 pipeline to generate a feature-by-spot-barcode expression matrix (10x Genomics).

Download and alignment of single-cell RNA sequencing data. For all samples available via SRA, parallel-fastq-dump (github.com/rvalieris/parallel-fastq-dump) was used to download raw .fastq files. Samples which were only available as .bam files were converted to .fastq format using bamtofastq from 10x Genomics (github.com/10XGenomics/bamtofastq). Raw reads were aligned to the mm10 reference using cellranger (v3.1.0).

Preprocessing and batch correction of single-cell RNA sequencing datasets. First, ambient RNA signal was removed using the default SoupX (v1.4.5) workflow (autoEstCounts and adjustCounts; github.com/constantAmateur/SoupX). Samples were then preprocessed using the standard Seurat (v3.2.1) workflow (NormalizeData, ScaleData, FindVariableFeatures, RunPCA, FindNeighbors, FindClusters, and RunUMAP; github.com/satijalab/seurat). Cells with fewer than 750 features, fewer than 1000 transcripts, or more than 30% of unique transcripts derived from mitochondrial genes were removed. After preprocessing, DoubletFinder (v2.0) was used to identify putative doublets in each dataset individually. BCmvn optimization was used for pK parameterization. Estimated doublet rates were computed by fitting the total number of cells after quality filtering to a linear regression of the expected doublet rates published in the 10x Chromium handbook. Estimated homotypic doublet rates were also accounted for using the modelHomotypic function. The default pN value (0.25) was used. Putative doublets were then removed from each individual dataset. After preprocessing and quality filtering, we merged the datasets and performed batch correction with three tools, independently: Harmony (github.com/immunogenomics/harmony) (v1.0), Scanorama (github.com/brianhie/scanorama) (v1.3), and BBKNN (github.com/Teichlab/bbknn) (v1.3.12). We then used Seurat to process the integrated data. After initial integration, we removed the noisy cluster and re-integrated the data using each of the three batch-correction tools.
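
The cell-level quality filter described above can be illustrated with a short sketch. This is a NumPy illustration under stated assumptions, not the authors' Seurat code: the function name `qc_keep_mask`, the argument names, and the toy matrix are hypothetical, and the mitochondrial fraction is taken here as mitochondrial counts divided by total counts per cell. The default thresholds are the ones quoted in the text.

```python
import numpy as np

def qc_keep_mask(counts, is_mito, min_features=750,
                 min_transcripts=1000, max_mito_frac=0.30):
    """Return a boolean mask over cells (rows) that pass all three thresholds.

    counts  : cells x genes array of raw counts (hypothetical input layout)
    is_mito : boolean array over genes marking mitochondrial genes
    """
    counts = np.asarray(counts)
    is_mito = np.asarray(is_mito, dtype=bool)
    n_features = (counts > 0).sum(axis=1)        # genes detected per cell
    n_transcripts = counts.sum(axis=1)           # total counts per cell
    mito_counts = counts[:, is_mito].sum(axis=1)
    # Guard against division by zero for empty cells.
    mito_frac = np.divide(mito_counts, n_transcripts,
                          out=np.zeros(counts.shape[0], dtype=float),
                          where=n_transcripts > 0)
    return ((n_features >= min_features)
            & (n_transcripts >= min_transcripts)
            & (mito_frac <= max_mito_frac))

# Toy data with scaled-down thresholds so the effect is visible.
toy_counts = np.array([[5, 0, 3, 2],    # 3 genes, 10 counts, 20% mito
                       [1, 0, 0, 0],    # too few genes and counts
                       [1, 1, 1, 7]])   # 70% mito
toy_is_mito = np.array([False, False, False, True])
toy_mask = qc_keep_mask(toy_counts, toy_is_mito,
                        min_features=3, min_transcripts=5)
print(toy_mask.tolist())
```

With the toy thresholds, only the first cell is kept; the second fails the feature and transcript minimums and the third exceeds the mitochondrial fraction limit.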

Cell type annotation. Cell types were determined for each integration method independently. For Harmony and Scanorama, dimensions accounting for 95% of the total variance were used to generate SNN graphs (Seurat::FindNeighbors). Louvain clustering was then performed on the output graphs (including the corrected graph output by BBKNN) using Seurat::FindClusters. A clustering resolution of 1.2 was used for Harmony (25 initial clusters), BBKNN (28 initial clusters), and Scanorama (38 initial clusters). Cell types were determined based on expression of canonical genes (Fig. S3). Clusters which had similar canonical marker gene expression patterns were merged.
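
Selecting the "dimensions accounting for 95% of the total variance" can be sketched as below. This is only an illustration: the function name `n_dims_for_variance` is hypothetical, and "total variance" is assumed to mean the variance summed over the computed embedding dimensions, which the description does not state explicitly.

```python
import numpy as np

def n_dims_for_variance(stdev, target=0.95):
    """Smallest number of leading dimensions whose cumulative share of
    variance reaches `target`.

    stdev : per-dimension standard deviations, ordered from largest to
            smallest (an assumed input; variance is stdev squared).
    """
    variance = np.asarray(stdev, dtype=float) ** 2
    cumulative = np.cumsum(variance) / variance.sum()
    # searchsorted gives the first index where cumulative >= target.
    return int(np.searchsorted(cumulative, target) + 1)

toy_stdev = [3.0, 2.0, 1.0, 1.0]   # variances 9, 4, 1, 1 (total 15)
print(n_dims_for_variance(toy_stdev, target=0.90))  # 3 dimensions (14/15)
print(n_dims_for_variance(toy_stdev))               # 4 dimensions
```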

Pseudotime workflow. Cells were subset based on the consensus cell types between all three integration methods. Harmony embedding values from the dimensions accounting for 95% of the total variance were used for further dimensional reduction with PHATE, using phateR (v1.0.4) (github.com/KrishnaswamyLab/phateR).

Deconvolution of spatial RNA sequencing spots. Spot deconvolution was performed using the deconvolution module in BayesPrism (previously known as “Tumor microEnvironment Deconvolution”, TED, v1.0; github.com/Danko-Lab/TED). First, myogenic cells were re-labeled, according to binning along the first PHATE dimension, as “Quiescent MuSCs” (bins 4-5), “Activated MuSCs” (bins 6-7), “Committed Myoblasts” (bins 8-10), and “Fusing Myocytes” (bins 11-18). Culture-associated muscle stem cells were ignored and myonuclei labels were retained as “Myonuclei (Type IIb)” and “Myonuclei (Type IIx)”. Next, highly and differentially expressed genes across the 25 groups of cells were identified with differential gene expression analysis using Seurat (FindAllMarkers, using Wilcoxon Rank Sum Test; results in Sup. Data 2). The resulting genes were filtered based on average log2-fold change (avg_logFC > 1) and the percentage of cells within the cluster which express each gene (pct.expressed > 0.5), yielding 1,069 genes. Mitochondrial and ribosomal protein genes were also removed from this list, in line with recommendations in the BayesPrism vignette. For each of the cell types, mean raw counts were calculated across the 1,069 genes to generate a gene expression profile for BayesPrism. Raw counts for each spot were then passed to the run.Ted function, using
Single-cell and spatial transcriptomics of stricturing Crohn's disease
zenodo.org
application/gzip, bin
Updated Sep 1, 2025
Cite
Lingjia Kong; Sathish Subramanian; Asa Segerstolpe; Vy Tran; Angela Shih; Grace Carter; Hiroko Kunitake; Shaina Tardus; Jasmine Li; Marco Kaper; Christy Cauley; Shivam Gandhi; Eric Chen; Caroline Porter; Toni Delorey; Liliana Bordeianou; Rocco Ricciardi; Ashwin Ananthakrishnan; Helena Lau; Richard Hodin; Jacques Deguine; Chris Smillie; Ramnik Xavier (2025). Single-cell and spatial transcriptomics of stricturing Crohn's disease [Dataset]. http://doi.org/10.5281/zenodo.14509802
Available download formats: application/gzip, bin
Unique identifier
https://doi.org/10.5281/zenodo.14509802
Dataset updated
Sep 1, 2025
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Lingjia Kong; Sathish Subramanian; Asa Segerstolpe; Vy Tran; Angela Shih; Grace Carter; Hiroko Kunitake; Shaina Tardus; Jasmine Li; Marco Kaper; Christy Cauley; Shivam Gandhi; Eric Chen; Caroline Porter; Toni Delorey; Liliana Bordeianou; Rocco Ricciardi; Ashwin Ananthakrishnan; Helena Lau; Richard Hodin; Jacques Deguine; Chris Smillie; Ramnik Xavier
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This folder contains the spatial transcriptomics data + code. This code was generated by members of the Smillie Lab @ MGH and Harvard Medical School.

github.tar.gz: spatial analysis code and data

anndata.h5ad: anndata object (scanpy)

V*tar.gz: raw spatial transcriptomics files

The github.tar.gz folder contains everything you need to reproduce the spatial transcriptomics figures. It is structured as follows:

1.BayesPrism: code for running BayesPrism on spatial data

2.SparCC: code for running SparCC on spatial data

3.Lasso: code for running lasso regression on spatial data

4.Analysis: code for reproducing all figures in the paper

4.Analysis/1.analysis.r: script to reproduce all figures in the paper ***

code: code library containing all necessary functions

load_data.r: code to load the single-cell and spatial datasets

sco.rds: single-cell analysis object (10X Chromium) formatted as an R list

vis.rds: spatial analysis object (10X Visium) formatted as an R list

All scripts are numbered. You need to run everything in order. For convenience, we include the output files for 1.BayesPrism, 2.SparCC, and 3.Lasso, allowing you to skip straight to the analysis code in 4.Analysis.
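
Because the folders are meant to be run in numeric order, a small sketch of ordering them by their numeric prefix may help when scripting the pipeline. The helper name `numeric_prefix` is hypothetical, and the folder names are taken from the listing above; sorting by the integer prefix rather than as plain strings is a design choice that keeps the order correct if a prefix ever reaches two digits.

```python
import re

def numeric_prefix(name):
    """Sort key: integer before the first dot, or infinity when there is none."""
    match = re.match(r"(\d+)\.", name)
    return int(match.group(1)) if match else float("inf")

# Folder names from the archive listing, deliberately shuffled.
steps = ["4.Analysis", "2.SparCC", "1.BayesPrism", "3.Lasso", "code"]
ordered = sorted(steps, key=numeric_prefix)
print(ordered)
```

Entries without a numeric prefix, such as the `code` library folder, sort to the end.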

To reproduce all figures in the paper, you need to do the following:

Edit your PROJECT_FOLDER in the header of load_data.r

Install the packages listed at the top of load_data.r

Go to the 4.Analysis directory, start an interactive R session, and type:
> source('1.analysis.r')

This will load the beginning of the 1.analysis.r script (until the stop() statement on line 68). You can run the code in two different ways:

You can step through the code line by line in your interactive R session (starting at line 68)

Alternatively, remove the stop() statement from the script, then run the code start to finish

If you encounter any errors, try to debug them using a combination of Google+ChatGPT. If you still have trouble, please contact the Smillie Lab.

Note: the single-cell and spatial code are also available on GitHub. However, the spatial analysis requires large files that cannot be hosted on GitHub. Therefore, it is better to download the code + files from Zenodo. The GitHub link is provided below:

https://github.com/LJ-Kong/fibrosis_scRNA_stRNA
Data from: Charting the Heterogeneity of Colorectal Cancer Consensus...
zenodo.org
zip
Updated Mar 22, 2023
Cite
Alberto Valdeolivas (2023). Charting the Heterogeneity of Colorectal Cancer Consensus Molecular Subtypes using Spatial Transcriptomics: datasets [Dataset]. http://doi.org/10.5281/zenodo.7551713
Available download formats: zip
Unique identifier
https://doi.org/10.5281/zenodo.7551713
Dataset updated
Mar 22, 2023
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Alberto Valdeolivas
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
You can find here the datasets used in the publication:

"Charting the Heterogeneity of Colorectal Cancer Consensus Molecular Subtypes using Spatial Transcriptomics"

This record contains the raw Spatial Transcriptomics data, the spot categorization made by a pathologist, and the intermediary files required to run the analysis described in our manuscript and available on GitHub:

https://github.com/alberto-valdeolivas/ST_CRC_CMS
Data from: Profiling the Heterogeneity of Colorectal Cancer Consensus...
zenodo.org
zip
Updated Dec 20, 2024
Cite
Alberto Valdeolivas (2024). Profiling the Heterogeneity of Colorectal Cancer Consensus Molecular Subtypes using Spatial Transcriptomics: datasets [Dataset]. http://doi.org/10.5281/zenodo.7760264
Available download formats: zip
Unique identifier
https://doi.org/10.5281/zenodo.7760264
Dataset updated
Dec 20, 2024
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Alberto Valdeolivas
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
You can find here the datasets used in the publication:

Valdeolivas, A., Amberg, B., Giroud, N. et al. Profiling the heterogeneity of colorectal cancer consensus molecular subtypes using spatial transcriptomics. npj Precis. Onc. 8, 10 (2024). https://doi.org/10.1038/s41698-023-00488-4

This record contains the raw Spatial Transcriptomics data, the spot categorization made by a pathologist, the results of the deconvolution, and the intermediary files required to run the analysis described in our manuscript and available on GitHub:

https://github.com/alberto-valdeolivas/ST_CRC_CMS

In particular, you will find here several zip compressed files with the following content:

- Intermediary_FileObjects.zip: The intermediary files generated in the scripts hosted in the github repo and required to run some later scripts.

- IntermediaryFiles_ST_CRC_LiverMetastasis.zip: The intermediary files generated in the scripts hosted in the github repo and required to run some of the scripts dealing with the external CRC ST dataset used in our manuscript.

- Pathology_SpotAnnotations.zip: The anatomical categories (tumor, stroma, non-neoplastic mucosa...) assigned by the pathologists to all the spots across our set of ST samples.

- SN048_A121573_Rep1.zip, SN048_A121573_Rep2.zip, SN048_A416371_Rep1.zip, SN048_A416371_Rep2.zip, SN123_A551763_Rep1.zip, SN123_A595688_Rep1.zip, SN123_A798015_Rep1.zip, SN123_A938797_Rep1_X.zip, SN124_A551763_Rep2.zip, SN124_A595688_Rep2.zip, SN124_A798015_Rep2.zip, SN124_A938797_Rep2.zip, SN84_A120838_Rep1.zip, SN84_A120838_Rep2.zip: The output of Space Ranger, including processed count data matrices and histological images, for the ST data generated in this study

- DeconvolutionResults_ST_CRC_BelgianCohort.zip, DeconvolutionResults_ST_CRC_KoreanCohort.zip, DeconvolutionResults_ST_CRC_LiverMetastasis.zip: These files contain the main results obtained when using the Cell2Location deconvolution approach in our samples (with two different references: Korean and Belgian cohorts) and in the external set of CRC ST samples (only Korean cohort)

- We have also uploaded the whole slide images (WSI). These are the files with an ndpi extension:

Visium Frozen_SN V10B01-048_new CRC_2021_02_16.ndp ... (samples A121573_Rep1, A121573_Rep2, A416371_Rep1 and A416371_Rep2), Visium Frozen_SN V19S23-084.ndpi (samples A120838_Rep1 and A120838_Rep2), Visium Frozen_SN V19S23-123.ndpi (samples A551763_Rep1, A595688_Rep1, A798015_Rep1, A938797_Rep1) and Visium Frozen_SN V19S23-124.ndpi (samples A551763_Rep2, A595688_Rep2, A798015_Rep2 and A938797_Rep2)

- We have now included the fastq and Bam files for the different samples, excluding replicate 1 of the A938797 sample whose fastq files are missing:

IMPORTANT: Fastq files are in version 1, while bam files are in version 2 of the dashboards reported below:

Sample S1_Cec (A551763)

Sample S2_Col_R (A595688)

Sample S3_Col_R (A416371)

Sample S4_Col_Sig (A120838)

Sample S5_Rec (A121573)

Sample S6_Rec (A938797)

Sample S7_Rec/Sig (A798015)
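
The Space Ranger archive names and the sample list above can be tied together with a small parsing sketch. This is only an illustration under assumptions: reading each name as slide, sample ID, and replicate is an interpretation of the naming pattern, the meaning of the trailing `_X` on one archive is not stated in the record, and the names `ARCHIVE_RE`, `SAMPLE_LABELS`, and `parse_archive` are hypothetical.

```python
import re

# Assumed pattern: <slide>_<sample>_Rep<n>[_<suffix>].zip
ARCHIVE_RE = re.compile(r"^(SN\d+)_(A\d+)_Rep(\d+)(?:_(\w+))?\.zip$")

# Sample labels copied from the list in the record description.
SAMPLE_LABELS = {
    "A551763": "S1_Cec",
    "A595688": "S2_Col_R",
    "A416371": "S3_Col_R",
    "A120838": "S4_Col_Sig",
    "A121573": "S5_Rec",
    "A938797": "S6_Rec",
    "A798015": "S7_Rec/Sig",
}

def parse_archive(name):
    """Split an archive file name into its assumed components."""
    match = ARCHIVE_RE.match(name)
    if match is None:
        raise ValueError(f"unrecognized archive name: {name!r}")
    slide, sample, replicate, suffix = match.groups()
    return {
        "slide": slide,
        "sample": sample,
        "replicate": int(replicate),
        "suffix": suffix,                    # None unless a trailing tag exists
        "label": SAMPLE_LABELS.get(sample),  # None for unknown sample IDs
    }

print(parse_archive("SN048_A121573_Rep1.zip"))
print(parse_archive("SN123_A938797_Rep1_X.zip"))
```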