100+ datasets found
  1. Single-cell Spatial Transcriptomics Data with Paired RNAseq for TISSUE...

    • data.niaid.nih.gov
    • zenodo.org
    Updated Jan 8, 2024
    Cite
    Sun, Eric (2024). Single-cell Spatial Transcriptomics Data with Paired RNAseq for TISSUE spatial gene expression prediction [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_8259941
    Dataset updated
    Jan 8, 2024
    Dataset authored and provided by
    Sun, Eric
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Dataset folders from "TISSUE: uncertainty-calibrated prediction of single-cell spatial transcriptomics improves downstream analyses". If using the processed data or TISSUE algorithm, please cite: https://doi.org/10.1101/2023.04.25.538326.

    The dataset directories are compressed in tar gzip format. The top level contains one folder per dataset, named after the dataset, and each folder holds the relevant data files, which include:

    • Spatial_count.txt --- a tab-delimited file containing spatial transcriptomics counts matrix

    • scRNA_count.txt --- a tab-delimited file containing RNAseq counts matrix

    • Locations.txt --- a tab-delimited file containing the (x,y) spatial coordinates of cells in the spatial transcriptomics data

    • Metadata.txt --- for some datasets, this is a comma-separated file containing the metadata table for the spatial transcriptomics data

    These files are formatted and organized to be read into AnnData objects using the native loading functions in the TISSUE package (https://github.com/sunericd/TISSUE). Some folders will also have additional accessory files such as gene lists corresponding to some experiments present in our manuscript and/or adjacency matrix objects.
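
    As a rough illustration of the file layout described above, the sketch below parses tab-delimited count and location tables using only the Python standard library. It is not the TISSUE loader. The header row, the one-row-per-cell orientation, the absence of an index column, and the sample values are all assumptions made for the example.

```python
import csv
import io


def read_matrix(handle, delimiter="\t"):
    """Read a delimited numeric table with a header row into (header, rows)."""
    reader = csv.reader(handle, delimiter=delimiter)
    header = next(reader)
    rows = [[float(value) for value in row] for row in reader if row]
    return header, rows


# In-memory stand-ins for Spatial_count.txt and Locations.txt (made-up values).
spatial_txt = "GeneA\tGeneB\n3\t0\n1\t5\n"
locations_txt = "x\ty\n10.5\t20.0\n11.0\t22.5\n"

genes, counts = read_matrix(io.StringIO(spatial_txt))
_, coords = read_matrix(io.StringIO(locations_txt))

# Assumed invariant: one (x, y) coordinate pair per row of the counts matrix.
if len(counts) != len(coords):
    raise ValueError("counts and locations disagree on the number of cells")

# Pair each cell's coordinates with its per-gene counts.
cells = [
    {"xy": tuple(xy), "counts": dict(zip(genes, row))}
    for xy, row in zip(coords, counts)
]
```

    For real files, replace the `io.StringIO` objects with `open(path, newline="")` and check the actual header and orientation before relying on this layout.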

    Also included are the two simulated spatial transcriptomics datasets that we generated using SRTsim.

    The SVZ folders contain our processed MERFISH spatial transcriptomics dataset on the adult mouse subventricular zone. Refer to the SVZFullFinal folder for the full dataset with TISSUE-informed cell labels. All other folders are processed data accessed from publicly available sources. The identity of numbered folders can be found in the Data Availability statement of the benchmarking paper from which they were retrieved: https://doi.org/10.1038/s41592-022-01480-9

    "svz_merfish_data.zip" includes the raw MERFISH dataset on the adult mouse subventricular zone.

  2. Data from: Spatial Transcriptomics in Breast Cancer Reveals Tumour...

    • data.niaid.nih.gov
    • zenodo.org
    Updated Nov 29, 2024
    Cite
    Al-Shahrour, Fátima (2024). Spatial Transcriptomics in Breast Cancer Reveals Tumour Microenvironment-Driven Drug Responses and Clonal Therapeutic Heterogeneity [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_10638905
    Dataset updated
    Nov 29, 2024
    Dataset provided by
    Al-Shahrour, Fátima
    García-Martín, Santiago
    Jiménez-Santos, María José
    Rubio-Fernández, Marcos
    Gómez-López, Gonzalo
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    We acquired 10x Visium spatial transcriptomics (ST) data from 9 patients with invasive adenocarcinomas [1–5] to explore the role of the tumour microenvironment (TME) in intratumour heterogeneity (ITH) and drug response in breast cancer. By leveraging a new version of Beyondcell [6], a tool for identifying tumour cell subpopulations with distinct drug response patterns, we predicted sensitivity to over 1,200 drugs while accounting for the spatial context and the interaction between the tumour and TME compartments. We also used Beyondcell to compute spot-wise functional enrichment scores and identify niche-specific biological functions.

    Here, you can find:

    In signatures folder:

    SSc breast: Collection of gene signatures used to predict sensitivity to > 1,200 drugs derived from breast cancer cell lines.

    Functional signatures: Collection of gene signatures used to compute enrichment in different biological pathways.

    In visium folder:

    Visium objects: Processed ST Seurat objects with deconvoluted spots, SCTransform-normalised counts, and clonal composition predicted with SCEVAN [7]. These objects, together with the signatures, were used to compute the Beyondcell objects.

    In single-cell folder:

    Single-cell objects: Raw and filtered merged single-cell RNA-seq (scRNA-seq) Seurat objects with unnormalised counts used as a reference for spot deconvolution.

    In beyondcell folder:

    Beyondcell sensitivity objects with prediction scores for all drug response signatures in SSc breast.

    Beyondcell functional objects with enrichment scores for all functional signatures.

  3. Spatial Multimodal Analysis (SMA) - Spatial Transcriptomics

    • figshare.scilifelab.se
    • researchdata.se
    json
    Updated Jan 15, 2025
    Cite
    Marco Vicari; Reza Mirzazadeh; Anna Nilsson; Patrik Bjärterot; Ludvig Larsson; Hower Lee; Mats Nilsson; Julia Foyer; Markus Ekvall; Paulo Czarnewski; Xiaoqun Zhang; Per Svenningsson; Per Andrén; Lukas Käll; Joakim Lundeberg (2025). Spatial Multimodal Analysis (SMA) - Spatial Transcriptomics [Dataset]. http://doi.org/10.17044/scilifelab.22778920.v1
    Available download formats: json
    Dataset updated
    Jan 15, 2025
    Dataset provided by
    KTH Royal Institute of Technology, Science for Life Laboratory
    Authors
    Marco Vicari; Reza Mirzazadeh; Anna Nilsson; Patrik Bjärterot; Ludvig Larsson; Hower Lee; Mats Nilsson; Julia Foyer; Markus Ekvall; Paulo Czarnewski; Xiaoqun Zhang; Per Svenningsson; Per Andrén; Lukas Käll; Joakim Lundeberg
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset contains Spatial Transcriptomics (ST) data matched with Matrix-Assisted Laser Desorption/Ionization Mass Spectrometry Imaging (MALDI-MSI). It is complementary to the other data in the same project: files with the same identifiers in the two datasets originate from the very same tissue section and can be combined in a multimodal ST-MSI object. For more information about the dataset, please see our manuscript posted on bioRxiv (doi: https://doi.org/10.1101/2023.01.26.525195).

    This dataset includes ST data from 19 tissue sections, including human post-mortem and mouse samples. The spatial transcriptomics data were generated using the Visium protocol (10x Genomics). The murine tissue sections come from three different mice unilaterally injected with 6-OHDA, a neurotoxin that, when injected into the brain, can selectively destroy dopaminergic neurons. We used this mouse model to show the applicability of the technology that we developed, named Spatial Multimodal Analysis (SMA). Using our technology on these mouse brain tissue sections, we were able to detect both dopamine with MALDI-MSI and the corresponding gene expression with ST.

    The dataset also includes one human post-mortem striatum sample that was placed on one Visium slide across the four capture areas. This sample was analyzed with a different ST protocol named RRST (Mirzazadeh, R., Andrusivova, Z., Larsson, L. et al. Spatially resolved transcriptomic profiling of degraded and challenging fresh frozen samples. Nat Commun 14, 509 (2023). https://doi.org/10.1038/s41467-023-36071-5), in which probes capturing the whole transcriptome are first hybridized in the tissue section and then spatially detected.

    Each tissue section in the dataset has been given a unique identifier composed of the Visium array ID and the capture area ID of the Visium slide on which the tissue section was placed. This unique identifier is included in the file names of all files relating to the same tissue section, including the MALDI-MSI files published in the other dataset of this project.

    In this dataset you will find the following files for each tissue section:

    • raw files: the read one fastq files (containing the pattern *R1*fastq.gz in the file name), the read two fastq files (containing the pattern *R2*fastq.gz in the file name) and the raw microscope images (containing the pattern Spot.jpg in the file name). These are the only files needed to run the Space Ranger pipeline, which is freely available to any user (please see the 10x Genomics website for information on how to install and run Space Ranger).

    • processed data files, of three types: a) Space Ranger outputs that were used to produce the figures in our publication; b) manual annotation tables in csv format produced using Loupe Browser 6 (file names ending in _RegionLoupe.csv, _filter.csv, _dopamine.csv, _lesion.csv or _region.csv); c) json files that we used as input for Space Ranger in the cases where the automatic tissue detection included in the pipeline failed to recognize the tissue or the fiducials.

    Using these processed files, the user can reproduce the figures of our publication without having to restart from the raw data files.

    The MALDI-MSI analyses preceding ST were performed with different matrices in different tissue sections. We used:

    1) 9-aminoacridine (9-AA) for detection of metabolites in negative ionization mode;
    2) 2,5-dihydroxybenzoic acid (DHB) for detection of metabolites in positive ionization mode;
    3) 4-(anthracen-9-yl)-2-fluoro-1-ethylpyridin-1-ium iodide (FMP-10), which charge-tags molecules with phenolic hydroxyls and/or primary amines, including neurotransmitters.

    Information about which matrix was sprayed on each tissue section, along with other sample information, is included in the metadata table.

    We also used three types of control samples:

    • standard Visium: samples processed with standard Visium (i.e. no matrix spraying, no MALDI-MSI, protocol as recommended by 10x Genomics with no exceptions)
    • internal controls (iCTRL): samples neither sprayed with any matrix nor processed with MALDI-MSI, but located on the same Visium slide where other samples were processed with MALDI-MSI
    • FMP-10-iCTRL: sample sprayed with FMP-10, and then processed as an iCTRL

    This and other information is provided in the metadata table.
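
    The file-naming conventions described for this dataset can be sketched as a small classifier that sorts file names by type and groups them by tissue-section identifier. This is only an illustration: the identifier `V00X00-000_A1` and all file names are hypothetical, and the read-two pattern `*R2*fastq.gz` is an assumption based on standard paired-end naming.

```python
from fnmatch import fnmatchcase

# Suffixes of the manual annotation tables listed in the description.
ANNOTATION_SUFFIXES = (
    "_RegionLoupe.csv", "_filter.csv", "_dopamine.csv", "_lesion.csv", "_region.csv",
)


def classify(name):
    """Map a file name to a coarse file type using the documented patterns."""
    if fnmatchcase(name, "*R1*fastq.gz"):
        return "read1_fastq"
    if fnmatchcase(name, "*R2*fastq.gz"):  # assumed pattern for read two
        return "read2_fastq"
    if "Spot.jpg" in name:
        return "raw_image"
    if name.endswith(ANNOTATION_SUFFIXES):
        return "annotation_csv"
    if name.endswith(".json"):
        return "alignment_json"
    return "other"


def group_by_section(names, section_ids):
    """Group file names by the section identifier embedded in each name."""
    groups = {sid: {} for sid in section_ids}
    for name in names:
        for sid in section_ids:
            if sid in name:
                groups[sid].setdefault(classify(name), []).append(name)
    return groups


# Hypothetical file names for one tissue section.
files = [
    "V00X00-000_A1_S1_L001_R1_001.fastq.gz",
    "V00X00-000_A1_S1_L001_R2_001.fastq.gz",
    "V00X00-000_A1_Spot.jpg",
    "V00X00-000_A1_region.csv",
    "V00X00-000_A1.json",
]
groups = group_by_section(files, ["V00X00-000_A1"])
```

    Grouping by identifier first makes it easy to confirm that every section has both read files and an image before launching Space Ranger.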

  4. Spatial transcriptomic data of breast cancer

    • ega-archive.org
    Updated Mar 7, 2024
    Cite
    (2024). Spatial transcriptomic data of breast cancer [Dataset]. https://ega-archive.org/datasets/EGAD50000000322
    Dataset updated
    Mar 7, 2024
    License

    https://ega-archive.org/dacs/EGAC00001000581

    Description

    Fastq files from spatial transcriptomics of breast cancer (BC), derived from 8 BC sections. Sample preparation: frozen BC samples were chosen based on tissue structure and RNA quality (RIN > 8). The “Visium Spatial Tissue Optimization Slide and Reagent Kit” (10X Genomics; #PN-1000193) was then used to optimize permeabilization conditions for BC tissues. Briefly, sections were fixed, stained and then permeabilized at different time points to capture mRNA, and reverse transcription was performed to generate fluorescently labeled cDNA. The permeabilization time that resulted in the highest fluorescence signal with the lowest background diffusion was chosen. The best permeabilization time for BC tissue was 18 min. Cryostat sections 10 μm thick were cut and placed on Visium Spatial Gene Expression slides (10X Genomics, PN-1000184). The slide was incubated for 1 min at 37°C, then fixed with methanol for 30 min at -20°C, followed by Hematoxylin and Eosin (H&E) staining, and images were taken under a high-resolution microscope. After imaging, the coverslip was detached by holding the slide in water and the slide was mounted in a plastic slide cassette. The spatial gene expression process, including tissue permeabilization, second strand synthesis and cDNA amplification, was performed according to the manufacturer’s instructions (10X Genomics; #CG000239). cDNA quality was then assessed using the Agilent High Sensitivity DNA Kit (Agilent, #5067-4626). The spatial gene libraries were constructed using the Visium Spatial Library Construction Kit (10X Genomics, PN-1000184).

  5. Datasets for single-cell-resolution spatial mapping by SC2Spa

    • figshare.com
    hdf
    Updated Mar 18, 2025
    Cite
    Linbu Liao (2025). Datasets for single-cell-resolution spatial mapping by SC2Spa [Dataset]. http://doi.org/10.6084/m9.figshare.21829905.v14
    Available download formats: hdf
    Dataset updated
    Mar 18, 2025
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Linbu Liao
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Title: Datasets for high resolution spatial mapping for SC2Spa
    Author(s): Linbu Liao
    Categories: Bioinformatic methods development
    Item type: dataset
    Keyword(s): Spatial transcriptomics, Spatial inference

    File description

    • AdataMH1.h5ad is the processed mouse hippocampus spatial transcriptomics data file of puck_200115_08 from the Slide-seqV2 paper [1].
    • AMB_HC.h5ad is a processed mouse hippocampus scRNA-seq data file [2]. The datasets are saved in AnnData format.
    • HC1_transfer_to_AMB.csv includes the predicted location of the scRNA-seq data (AMB_HC.h5ad). The columns "ClosestSC" and "Dis2ClosestBead" are used in the cell communication analysis tutorial.
    • ssHippo_RCTD.csv is the annotation for the AdataMH1.h5ad file by RCTD [3].
    • WDs_T2.csv includes the Wasserstein distance of genes between the scRNA-seq dataset [2] and the mouse hippocampus Slide-seqV2 [1] dataset.
    • SI_T2_WD.h5 is the trained model for mapping mouse hippocampus cells to space. The model is trained using genes selected according to Wasserstein distance.
    • SI_T2.h5 is the spatial inference model trained on all shared genes between the two datasets.
    • T2_stat.csv is a summary of SI_T2.h5. It contains each gene's contribution to location prediction and the Pearson correlation between predicted and true gene expression.
    • AdataMH2.h5ad is the processed mouse hippocampus spatial transcriptomics data file of puck_191204_01 from the Slide-seqV2 paper [1].
    • slideSeq_Puck190926_03_RCTD.csv is the annotation file for AdataEmbryo1.h5ad. AdataEmbryo1.h5ad is the preprocessed file of puck_190926_03 (a mouse embryo Slide-seqV2 [1] dataset).
    • C2L.zip is the cell2location [4] data used in the analysis of the SC2Spa manuscript.
    • Cell2Location_ST.h5ad is the processed Visium adult mouse brain spatial transcriptomics data ST8059048 from the Cell2Location paper [4].
    • Cell2Location_snRNAseq.h5ad is the processed snRNA-seq adult mouse brain data of 5705STDY8058280, 5705STDY8058281, 5705STDY8058282, 5705STDY8058283, 5705STDY8058284 and 5705STDY8058285 from the Cell2Location paper [4]. Samples 80, 81 and 82 are from mouse 1 (female); samples 83, 84 and 85 are from mouse 2 (male).
    • ModelCell2Location_snRNAseq2ST_SC2Spa_WD.h5 is the pretrained SC2Spa model for mapping snRNAseq to ST (genes with a Wasserstein distance greater than 0.1 were used for training).
    • WDs_snRNAseq2Visium.csv has the Wasserstein distance information of genes between the snRNA-seq data [4] and the Visium mouse brain data [4]. The datasets were used for the spatial mapping of SC2Spa.
    • CV_code.zip contains the benchmarking code along with scripts for cross-validation and cross-dataset validation.

    Repositories

    • The GitHub website of SC2Spa: https://github.com/linbuliao/SC2Spa
    • The GitHub repository for SC2Spa analysis: https://github.com/linbuliao/SC2Spa_Notebooks

    Documentation

    • The Read the Docs website of SC2Spa: https://sc2spa.readthedocs.io/en/latest/

    References

    [1] Stickels RR, Murray E, Kumar P, Li J, Marshall JL, Di Bella DJ, Arlotta P, Macosko EZ, Chen F: Highly sensitive spatial transcriptomics at near-cellular resolution with Slide-seqV2. Nature Biotechnology 2020.
    [2] Saunders A, Macosko EZ, Wysoker A, Goldman M, Krienen FM, de Rivera H, Bien E, Baum M, Bortolin L, Wang SY, et al: Molecular Diversity and Specializations among the Cells of the Adult Mouse Brain. Cell 2018, 174:1015-+.
    [3] Cable DM, Murray E, Zou LS, Goeva A, Macosko EZ, Chen F, Irizarry RA: Robust decomposition of cell type mixtures in spatial transcriptomics. Nat Biotechnol 2021.
    [4] Kleshchevnikov V, Shmatko A, Dann E, Aivazidis A, King HW, Li T, Elmentaite R, Lomakin A, Kedlian V, Gayoso A, et al: Cell2location maps fine-grained cell types in spatial transcriptomics. Nat Biotechnol 2022, 40:661-671.
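
    To make the gene-selection step concrete, here is a minimal sketch of a per-gene Wasserstein distance with the 0.1 cutoff mentioned above. It is not the SC2Spa implementation: the equal-sample-size shortcut, the toy expression values, and the lack of any normalization are all assumptions made for illustration.

```python
def wasserstein_1d(xs, ys):
    """Wasserstein-1 distance between two equal-sized 1-D samples.

    For equal sample sizes this is the mean absolute difference
    between the sorted values of the two samples.
    """
    if len(xs) != len(ys):
        raise ValueError("this sketch assumes equal sample sizes")
    return sum(abs(a - b) for a, b in zip(sorted(xs), sorted(ys))) / len(xs)


# Toy per-gene expression values in the two modalities (made-up numbers).
sc_expr = {"GeneA": [0.0, 1.0, 2.0, 3.0], "GeneB": [0.5, 0.5, 1.0, 1.0]}
st_expr = {"GeneA": [0.0, 0.0, 1.0, 1.0], "GeneB": [0.5, 0.6, 1.0, 1.0]}

THRESHOLD = 0.1  # cutoff quoted in the file description

distances = {gene: wasserstein_1d(sc_expr[gene], st_expr[gene]) for gene in sc_expr}

# Keep genes whose distance exceeds the cutoff, as the description states.
selected = sorted(gene for gene, dist in distances.items() if dist > THRESHOLD)
```

    With unequal sample sizes, a quantile-based or library implementation would be needed instead of the sorted-pairs shortcut.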

  6. Data from: Computational pathology annotation enhances the resolution and...

    • researchdata.se
    Updated Sep 1, 2025
    Cite
    Tianyi Li; Qiao Yang; Balazs Acs; Emmanouil G. Sifakis; Hosein Toosi; Camilla Engblom; Kim Thrane; Qirong Lin; Jeff E. Mold; Wenwen Sun; Ceren Boyaci; Sanna Steen; Jonas Frisén; Jens Lagergren; Joakim Lundeberg; Xinsong Chen; Johan Hartman (2025). Computational pathology annotation enhances the resolution and interpretation of breast cancer spatial transcriptomics data [Dataset]. http://doi.org/10.48723/f4v5-m008
    Available download formats: (1312), (1213)
    Dataset updated
    Sep 1, 2025
    Dataset provided by
    Karolinska Institutet
    Authors
    Tianyi Li; Qiao Yang; Balazs Acs; Emmanouil G. Sifakis; Hosein Toosi; Camilla Engblom; Kim Thrane; Qirong Lin; Jeff E. Mold; Wenwen Sun; Ceren Boyaci; Sanna Steen; Jonas Frisén; Jens Lagergren; Joakim Lundeberg; Xinsong Chen; Johan Hartman
    Area covered
    Sweden
    Description

    The samples in this dataset come from a study of breast cancer intratumoral heterogeneity using spatial transcriptomic data and computational pathology. The dataset contains 14 samples from 3 patients (one with triple-negative breast cancer and two with HER2-positive breast cancer). Multiple regions of each tumor were collected for analysis; each sample is one tumor region from one of the patients.

    Libraries for spatial transcriptomics were prepared using Visium spatial gene expression kits (10x Genomics). Sequencing was performed using the Illumina NovaSeq 6000 platform at the National Genomics Infrastructure, SciLifeLab in Solna, Sweden.

    The dataset contains 28 fastq files, compressed with GNU zip (gzip), from paired-end RNA sequencing (10x Visium spatial transcriptomics). The metadata are described in the SND_metadata.xlsx file. The md5sum.txt file is provided for validation of data integrity. The total size of the dataset is approximately 300 GB.
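
    An integrity check against md5sum.txt could look like the sketch below. It assumes the manifest follows the usual `md5sum` layout (hex digest, whitespace, file name) and sits in the same folder as the files. The demo file names and contents are made up.

```python
import hashlib
import os
import tempfile


def md5_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large fastq.gz files never sit fully in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_manifest(manifest_path):
    """Return {file name: True/False}, assuming '<hex digest>  <file name>' lines."""
    base_dir = os.path.dirname(manifest_path)
    results = {}
    with open(manifest_path, encoding="utf-8") as manifest:
        for line in manifest:
            if not line.strip():
                continue
            expected, name = line.split(maxsplit=1)
            name = name.strip().lstrip("*")  # md5sum marks binary mode with '*'
            results[name] = md5_of(os.path.join(base_dir, name)) == expected.lower()
    return results


# Demo on throwaway files (names and contents are made up).
with tempfile.TemporaryDirectory() as tmp:
    for name, payload in (("a.fastq.gz", b"reads-a"), ("b.fastq.gz", b"reads-b")):
        with open(os.path.join(tmp, name), "wb") as handle:
            handle.write(payload)
    with open(os.path.join(tmp, "md5sum.txt"), "w", encoding="utf-8") as manifest:
        manifest.write(hashlib.md5(b"reads-a").hexdigest() + "  a.fastq.gz\n")
        manifest.write("0" * 32 + "  b.fastq.gz\n")  # deliberately wrong digest
    demo_result = verify_manifest(os.path.join(tmp, "md5sum.txt"))
```

    Chunked hashing matters here because the full dataset is roughly 300 GB.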

  7. CCA Visium spatial transcriptomics data (4 CCA)

    • ega-archive.org
    Cite
    CCA Visium spatial transcriptomics data (4 CCA) [Dataset]. https://ega-archive.org/datasets/EGAD00001011997
    License

    https://ega-archive.org/dacs/EGAC00001003452

    Description

    Visium spatial transcriptomics (10X Genomics) performed on 4 CCA samples. Each sample has two paired-end sequencing runs: the first (I1 & I2) are a pair reading indexes; the second (R1 & R2) are a pair reading inserts, with R1 additionally reading 10X barcodes. For histology images, please contact authors.

  8. A Spatial Transcriptomics Atlas of the Malaria-infected Liver Indicates a...

    • zenodo.org
    • data.niaid.nih.gov
    Updated Sep 20, 2023
    Cite
    Franziska Hildebrandt; Miren Urrutia Iturritza; Christian Zwicker; Bavo Vanneste; Noémi Van Hul; Elisa Semle; Tales Pascini; Sami Saarenpää; Mengxiao He; Emma R. Andersson; Charlotte L. Scott; Joel Vega-Rodriguez; Joakim Lundeberg; Johan Ankarklev (2023). A Spatial Transcriptomics Atlas of the Malaria-infected Liver Indicates a Crucial Role for Lipid Metabolism and Hotspots of Inflammatory Cell Infiltration [Dataset]. http://doi.org/10.5281/zenodo.8328679
    Dataset updated
    Sep 20, 2023
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Franziska Hildebrandt; Miren Urrutia Iturritza; Christian Zwicker; Bavo Vanneste; Noémi Van Hul; Elisa Semle; Tales Pascini; Sami Saarenpää; Mengxiao He; Emma R. Andersson; Charlotte L. Scott; Joel Vega-Rodriguez; Joakim Lundeberg; Johan Ankarklev
    Description

    Dataset created in the study "A Spatial Transcriptomics Atlas of the Malaria-infected Liver Indicates a Crucial Role for Lipid Metabolism and Hotspots of Inflammatory Cell Infiltration"

    Structure

    ST_berghei_liver

    contains data generated during stpipeline analysis and imaging on the 2k-array Spatial Transcriptomics platform, as well as data necessary for and from hepaquery analysis. These samples include 38 sections in total, of which 8 are from mice (n=4) infected with sporozoites for 12h, 5 sections from control mice (n=3) at 12h, 7 sections from mice (n=4) infected with sporozoites for 24h and 4 sections from control mice (n=3) at 24h, as well as 8 samples of mice (n=2) infected with sporozoites for 38h and control mice (n=2) at 38h.

    • count contains gene expression matrix output from stpipeline in .tsv format
    • spotfiles contains coordinate files for count matrices
    • images contains scaled H&E, Fluorescence (FL) and annotated H&E images (from FL annotations) scaled to 10% of the original image size.
    • masks contains image masks for hepaquery analysis
    • distances contains distance measurements from original section sorted by timepoint as well as combined across timepoints
    • cluster contains clustering information across spatial positions used in spatial enrichment analysis

    STUtiility_mus_pb_ST.RDS describes seurat object generated using the STUtility package using ST data of the 38 liver sections of which the data is stored in ST_berghei_liver

    visium_berghei_liver

    contains data generated with the spaceranger pipeline and imaging using the Visium spatial transcriptomics platform. These samples include 8 sections in total, of which 1 was infected with sporozoites for 12h, 1 control section at 12h, 1 section infected with sporozoites for 24h and 1 control section at 24h, as well as 2 sporozoite-infected sections and 2 control sections at 38h.

    • V10S29-135_A1 contains spaceranger output for section 1 for infected and control sections at 38h post-infection
    • V10S29-135_B1 contains spaceranger output for section 1 for infected and control sections at 12h post-infection

    • V10S29-135_C1 contains spaceranger output for section 1 for infected and control sections at 24h post-infection

    • V10S29-135_D1 contains spaceranger output for section 2 for infected and control sections at 38h post-infection

    se_visium.RDS describes seurat object generated using the STUtility package using ST data of the 38 liver sections of which the data is stored in visium_berghei_liver

    snSeq_berghei_liver

    contains data generated with the cellranger pipeline and imaging using the Visium spatial transcriptomics platform. These samples include single nuclei of 2 infected and control mice after 12h, 2 infected and control mice after 24h, 2 infected and control mice after 38h, and 2 uninfected mice prior to a challenge.

    • cellranger_cnt_out contains feature count matrix information from cell ranger output

    final_merged_curated_annotations_270623.RDS describes seurat object generated using the STUtility package using ST data of the 38 liver sections of which the data is stored in snSeq_berghei_liver.tar.gz

    raw images.zip contains raw images for supplementary figures 20-22

    adjusted images.zip contains brightness and contrast adjusted images for supplementary figures 20-22

  9. Spatial transcriptomics data of mouse thymus

    • scidb.cn
    Updated Dec 13, 2024
    Cite
    Jingwei Ma; Liang Tang; Jingxuan Xiao; Bo Huang; Junwei Liu (2024). Spatial transcriptomics data of mouse thymus [Dataset]. http://doi.org/10.57760/sciencedb.18291
    Dataset updated
    Dec 13, 2024
    Dataset provided by
    Science Data Bank
    Authors
    Jingwei Ma; Liang Tang; Jingxuan Xiao; Bo Huang; Junwei Liu
    License

    Attribution-NonCommercial-ShareAlike 4.0 (CC BY-NC-SA 4.0): https://creativecommons.org/licenses/by-nc-sa/4.0/
    License information was derived automatically

    Description

    The thymus is a critical organ for T cell development and immune system function, but it undergoes significant structural and functional changes during aging, a process known as thymic involution. To investigate the spatial and cellular changes associated with thymic aging, we performed single-cell spatial transcriptomics on thymic tissues from young (4-week-old) and aged (52-week-old) C57BL/6J mice, with three biological replicates per age group. This dataset provides a detailed spatial and cellular map of the thymus at single-cell resolution, capturing changes in cell types, abundances, and spatial organization during aging. The results offer valuable insights into the cellular and spatial heterogeneity of the thymus and provide a resource for understanding immune system aging and potential therapeutic strategies.

  10. Supporting data for SpatialOne: End-to-End Analysis of Spatial...

    • data.niaid.nih.gov
    • zenodo.org
    Updated Jul 1, 2024
    Cite
    Pla Planas, Albert (2024). Supporting data for SpatialOne: End-to-End Analysis of Spatial Transcriptomics at Scale [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_10837967
    Dataset updated
    Jul 1, 2024
    Dataset authored and provided by
    Pla Planas, Albert
    Description

    Supplementary data supporting the SpatialOne: End-to-End Analysis of Spatial Transcriptomics at Scale publication

    To showcase the capabilities of SpatialOne, two human lung cancer formalin-fixed, paraffin-embedded (FFPE) samples are analyzed. These samples are prepared following the CG000495 protocol (Figure 1b), sequenced with the 10x Visium CytAssist, and processed using 10x SpaceRanger version 2. We also present analysis of two adult mouse samples sequenced using 10x Visium (one fresh frozen brain tissue section processed using SpaceRanger v2 and one FFPE kidney sample processed using SpaceRanger v1), and 75 internal samples.

    For the human lung cancer samples, single-cell data from the Lung Cancer Atlas (Salcher et al., 2022) is used as reference. This dataset is filtered to include only Chromium-generated data. For the mouse samples, the GSE107585 single-cell dataset serves as reference. In the human lung cancer datasets, a pathologist annotated regions of interest corresponding to tumors, blood vessels, and alveolar regions.

    Changelog:

    Added a README file describing the zip content.

  11. Data from: Library size confounds biology in spatial transcriptomics data

    • data.niaid.nih.gov
    • zenodo.org
    Updated Jan 31, 2024
    Cite
    Feher, Kristen (2024). Library size confounds biology in spatial transcriptomics data [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_7959786
    Dataset updated
    Jan 31, 2024
    Dataset provided by
    Marceaux, Claire
    Kharbanda, Malvika
    Putri, Givanna
    Feher, Kristen
    Chen, Jinjin
    Hickey, Theresa E
    Tan, Chin Wee
    Tilley, Wayne D
    Bhuva, Dharmesh D
    Davis, Melissa J
    Marie-Liesse Asselin-Labat
    Phipson, Belinda
    Jin, Xinyi
    Liu, Ning
    Salim, Agus
    Pickering, Marie
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset contains annotated, sub-cellularly localised spatial measurements from the Visium, Xenium and CosMx platforms. Specifically, it includes the datasets analysed in the publication Bhuva et al., 2023, titled "Library size confounds biology in spatial transcriptomics data". Raw transcript detections are presented. The data are best accessed through the accompanying SubcellularSpatialData R/Bioconductor package. The region files used to annotate individual transcript detections are provided as GeoJSON files.
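
    Annotating a transcript detection with a GeoJSON region amounts to a point-in-polygon test. The sketch below shows one way this could be done with a ray-casting check on each polygon's exterior ring. It is not the SubcellularSpatialData workflow: the `region` property name, the coordinates, and the choice to ignore polygon holes are assumptions for illustration.

```python
import json


def point_in_ring(x, y, ring):
    """Ray-casting test against a closed ring of [x, y] vertices."""
    inside = False
    for (x1, y1), (x2, y2) in zip(ring, ring[1:]):
        # Count edges that a horizontal ray cast from (x, y) to +infinity crosses.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def annotate(x, y, collection):
    """Return the 'region' property of the first Polygon containing the point."""
    for feature in collection["features"]:
        geometry = feature["geometry"]
        # Only the exterior ring is checked; holes are ignored in this sketch.
        if geometry["type"] == "Polygon" and point_in_ring(x, y, geometry["coordinates"][0]):
            return feature["properties"].get("region")  # hypothetical property name
    return None


# A made-up FeatureCollection with one square region.
regions = json.loads("""
{"type": "FeatureCollection", "features": [
  {"type": "Feature",
   "properties": {"region": "tumour"},
   "geometry": {"type": "Polygon",
                "coordinates": [[[0, 0], [10, 0], [10, 10], [0, 10], [0, 0]]]}}
]}
""")

inside_label = annotate(5, 5, regions)
outside_label = annotate(15, 5, regions)
```

    For millions of detections, a spatial index or a vectorised geometry library would be a better fit than this per-point loop.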

  12. NanoString GeoMx DSP dataset from 12 primary/recurrent IDH-mutant...

    • zenodo.org
    txt, zip
    Updated Oct 10, 2024
    Cite
    Levi Van Hijfte; Marjolein Geurts; Wies Vallentgoed; Paul Eilers; Peter Sillevis Smitt; Reno Debets; Pim French (2024). NanoString GeoMx DSP dataset from 12 primary/recurrent IDH-mutant astrocytoma samples [Dataset]. http://doi.org/10.5281/zenodo.13911761
    Available download formats: txt, zip
    Dataset updated
    Oct 10, 2024
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Levi Van Hijfte; Marjolein Geurts; Wies Vallentgoed; Paul Eilers; Peter Sillevis Smitt; Reno Debets; Pim French
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    NanoString GeoMx Digital Spatial Profiler data from 12 paired tumor resections of 6 IDH-mutant astrocytoma patients. All samples had an IDH-R132H mutation. All first resections were WHO 2016 grade II or III, and all second resections were WHO 2016 grade IV. The NanoString Cancer Transcriptome Atlas panel was used to measure RNA expression levels of ~1800 genes in 72 regions of interest. The ROIs folder contains all images of the regions of interest used in this study, separated by tumor pair. For information on methods, see the associated publication.

  13. Replication Data for: Spatial transcriptomics analysis in "Single-cell...

    • rdr.kuleuven.be
    csv, txt
    Updated Dec 22, 2022
    Cite
    Sam Kint (2022). Replication Data for: Spatial transcriptomics analysis in "Single-cell profiling reveals mechanisms of response to anti-PD-L1 versus anti-PD-L1 combined with anti-CTLA4 in head and neck squamous cell carcinoma" [Dataset]. http://doi.org/10.48804/992X8C
    Explore at:
    Available download formats: txt(619), txt(592), txt(1090), csv(12412)
    Dataset updated
    Dec 22, 2022
    Dataset provided by
    KU Leuven RDR
    Authors
    Sam Kint
    License

    https://www.kuleuven.be/rdm/en/rdr/custom-kuleuven

    Description

    This folder contains the fastq files generated during the Grand Challenge project using 10X Genomics Visium on head and neck squamous cell carcinoma samples. It contains 4 fastq files (R1 and R2 for each of the two sequencing lanes) per patient. For each patient, 2 samples (biopsy and resection) were collected, and the two samples of 1 patient (HNI40020) were analyzed twice.

  14. Additional file 5 of Seamless integration of image and molecular analysis...

    • springernature.figshare.com
    txt
    Updated Jun 1, 2023
    Cite
    Joseph Bergenstråhle; Ludvig Larsson; Joakim Lundeberg (2023). Additional file 5 of Seamless integration of image and molecular analysis for spatial transcriptomics workflows [Dataset]. http://doi.org/10.6084/m9.figshare.12667458.v1
    Explore at:
    Available download formats: txt
    Dataset updated
    Jun 1, 2023
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Joseph Bergenstråhle; Ludvig Larsson; Joakim Lundeberg
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Additional file 5: RMarkdown on how to use the RegionNeighbours function.

  15. A single-cell and spatially resolved atlas of human breast cancers | spatial...

    • zenodo.org
    • data.niaid.nih.gov
    application/gzip, pdf
    Updated Jul 19, 2024
    Cite
    Sunny Z Wu; Alexander Swarbrick (2024). A single-cell and spatially resolved atlas of human breast cancers | spatial transcriptomics data [Dataset]. http://doi.org/10.5281/zenodo.4739739
    Explore at:
    Available download formats: application/gzip, pdf
    Dataset updated
    Jul 19, 2024
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Sunny Z Wu; Alexander Swarbrick
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This repository contains spatial transcriptomics data related to the Wu et al. 2021 study "A single-cell and spatially resolved atlas of human breast cancers". It includes processed count matrices, brightfield H&E images (plain and annotated) and metadata (containing clinical information and spot pathological details) for 6 primary breast cancers profiled using the Visium assay (10X Genomics). If you use this dataset in your research, please consider citing the above study.

    The contents of the files are:
    raw_count_matrices.tar.gz - spaceranger processed raw count matrices.

    spatial.tar.gz - spaceranger processed spatial files (images, scalefactors, aligned fiducials, position lists)

    filtered_count_matrices.tar.gz - filtered count matrices.

    metadata.tar.gz - metadata for tissues and spots of filtered count matrices, including clinical subtype and pathological annotation of each spot.

    images.pdf - pdf detailing the H&E and annotation images.

  16. Data from: Sequencing-free whole genome spatial transcriptomics at molecular...

    • data.mendeley.com
    Updated Jul 15, 2025
    Cite
    Yubao Cheng (2025). Sequencing-free whole genome spatial transcriptomics at molecular resolution in intact tissue [Dataset]. http://doi.org/10.17632/8kbv637pxh.1
    Explore at:
    Dataset updated
    Jul 15, 2025
    Authors
    Yubao Cheng
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Recent breakthroughs in spatial transcriptomics technologies have enhanced our understanding of diverse cellular identities, spatial organizations, and functions. Yet existing spatial transcriptomics tools are still limited in either transcriptomic coverage or spatial resolution, hindering unbiased, hypothesis-free transcriptomic analyses at high spatial resolution. Here we develop Reverse-padlock Amplicon Encoding FISH (RAEFISH), an image-based spatial transcriptomics method with whole-genome coverage and single-molecule resolution in intact tissues. We demonstrate spatial profiling of 23,000 human or 22,000 mouse transcripts in single cells and tissue sections. Our analyses reveal transcript-specific subcellular localization, cell-type-specific and cell-type-invariant zonation-dependent transcriptomes, and gene programs underlying preferential cell-cell interactions. Finally, we further develop our technology for direct spatial readout of gRNAs in an image-based high-content CRISPR screen. Overall, these developments provide the research community with a broadly applicable technology that enables high-coverage, high-resolution spatial profiling of both long and short, native and engineered RNA species in many biomedical contexts.

  17. MERFISH and snRNAseq analysis of healthy and disease human liver

    • search.dataone.org
    • data.niaid.nih.gov
    • +1more
    Updated Feb 9, 2024
    Cite
    Jeffrey Moffitt; Brianna Watson; Biplab Paul; Alan Mullen (2024). MERFISH and snRNAseq analysis of healthy and disease human liver [Dataset]. http://doi.org/10.5061/dryad.37pvmcvsg
    Explore at:
    Dataset updated
    Feb 9, 2024
    Dataset provided by
    Dryad Digital Repository
    Authors
    Jeffrey Moffitt; Brianna Watson; Biplab Paul; Alan Mullen
    Time period covered
    Jan 1, 2024
    Description

    Single-cell RNA sequencing (scRNA-seq) has advanced our understanding of cell types and their heterogeneity within the human liver, but the spatial organization at single-cell resolution has not yet been described. Here we apply multiplexed error robust fluorescent in situ hybridization (MERFISH) to map the zonal distribution of hepatocytes, resolve subsets of macrophage and mesenchymal populations, and investigate the relationship between hepatocyte ploidy and gene expression within the healthy human liver. We next integrated spatial information from MERFISH with the more complete transcriptome produced by single-nucleus RNA sequencing (snRNA-seq), revealing zonally enriched receptor-ligand interactions. Finally, analysis of fibrotic liver samples identified two hepatocyte populations that are not restricted to zonal distribution and expand with injury. Together these spatial maps of the healthy and fibrotic liver provide a deeper understanding of the cellular and spatial remodeling t...

    Two measurement modalities were used to generate these data, including multiplexed error robust fluorescence in situ hybridization (MERFISH) and single-nucleus RNA sequencing (snRNAseq).

    # MERFISH and snRNAseq data from Watson, Paul et al

    This README file contains information on the data deposited for the manuscript "Spatial transcriptomics of healthy and fibrotic human liver at single-cell resolution" by Watson, Paul and colleagues.

    Anndata Structures

    Multiple anndata structures are provided as h5ad files for different datasets. These anndata structures were generated with the scanpy pipeline (v1.8.1) and can be loaded in Python with the associated tools. These include: (1) adata_healthy_merfish.h5ad (2) adata_healthy_diseased_merfish.h5ad (3) adata_healthy_merfish_nucseq.h5ad (4) adata_healthy_nucseq.h5ad

    Each anndata frame contains distinctive values for the respective data set as follows:

    (1) adata_healthy_merfish.h5ad This structure contains data from healthy patient samples which were imaged with MERFISH. Raw data is stored in the adata.raw.X while adata.X is normalized by the total counts per cell, scaled to a uniform value, and then converted to logarithm...
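    The normalization described for adata.X (total counts per cell, scaling to a uniform value, then a logarithm) can be illustrated with a minimal pure-Python sketch. The description above is truncated, so the target sum of 10,000 and the use of log(1 + x) are assumptions made for illustration, not values stated by the authors; the function name `normalize_log1p` is hypothetical.

```python
import math


def normalize_log1p(counts, target_sum=10_000.0):
    """Scale each cell (row) to `target_sum` total counts, then apply log(1 + x).

    `target_sum` is an assumed value; the README excerpt does not state it.
    """
    normalized = []
    for row in counts:
        total = sum(row)
        # Cells with zero total counts stay all-zero instead of dividing by zero.
        scale = target_sum / total if total > 0 else 0.0
        normalized.append([math.log1p(value * scale) for value in row])
    return normalized


# Toy raw counts: 3 cells x 3 genes (the last cell is empty).
raw = [[1, 3, 0], [10, 0, 10], [0, 0, 0]]
norm = normalize_log1p(raw)
```

    Undoing the logarithm with `math.expm1` on a row of `norm` recovers values that sum to the target, which is a quick sanity check that every cell was scaled to the same depth.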

  18. Data from: In silico spatial transcriptomic editing at single-cell...

    • data.niaid.nih.gov
    • zenodo.org
    Updated Jul 11, 2024
    Cite
    Viktor Koelzer (2024). In silico spatial transcriptomic editing at single-cell resolution [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_8186464
    Explore at:
    Dataset updated
    Jul 11, 2024
    Dataset provided by
    Jiqing Wu
    Viktor Koelzer
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The data for training the GAN (Inversion) model and reproducing the results reported in the paper.

  19. Spatial transcriptomic analysis of human dorsoal root ganglia neurons

    • search.dataone.org
    • data.niaid.nih.gov
    • +1more
    Updated Jul 24, 2025
    Cite
    Huasheng Yu (2025). Spatial transcriptomic analysis of human dorsoal root ganglia neurons [Dataset]. http://doi.org/10.5061/dryad.gf1vhhmxq
    Explore at:
    Dataset updated
    Jul 24, 2025
    Dataset provided by
    Dryad Digital Repository
    Authors
    Huasheng Yu
    Description

    The Xenium platform was used for the spatial transcriptomic analysis of human DRG neurons. One hundred marker genes were selected as the customized probe panel and hybridized to fresh frozen hDRG sections. Manual segmentation of each neuron soma was performed, based on expression of the pan-neuronal marker gene PGP9.5, the satellite glial cell marker FAB7B, and the corresponding H&E staining. In total, 1340 neurons were identified (excluding 75 regions of interest with poor or unclear neuronal soma morphology in H&E staining) and clustered into 16 groups. The 16 clusters were assigned to different cell types based on marker gene expression.

    In the study presented here, four dorsal root ganglia tissues from two healthy donors were used for Xenium spatial transcriptomics analysis. A 100-gene panel (including 87 neuronal genes from our single-soma sequencing dataset and 13 non-neuronal cell marker genes) was selected to perform spatial transcriptomics. The spatial distribution of these genes in neurons and non-neuronal cells was successfully profiled and quantified.

    # Spatial transcriptomic analysis of human dorsoal root ganglia neurons

    This dataset is associated with Yu & Nagi 2024 (https://doi.org/10.1038/s41593-024-01794-1). It contains human dorsal root ganglia (DRG) 10x Xenium spatial transcriptomics raw data. In total, four DRG tissue sections from two healthy donors were used for Xenium spatial transcriptomics analysis. A 100-gene panel (including 87 neuronal genes from our single-soma sequencing dataset and 13 non-neuronal cell marker genes) was selected to perform spatial transcriptomics. The spatial distribution of these genes in neurons and non-neuronal cells was successfully profiled and quantified.

    Description of the data and file structure

    Overview: The .rar file contains all of the 10x Xenium spatial transcriptomics raw data used for the data analysis and plots in the associated manuscript. Each .rar file contains the following contents.

    • Xenium experiment file: The `experiment...,

  20. Data from: Large-scale integration of single-cell transcriptomic data...

    • data.niaid.nih.gov
    • dataone.org
    • +1more
    zip
    Updated Dec 14, 2021
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    David McKellar; Iwijn De Vlaminck; Benjamin Cosgrove (2021). Large-scale integration of single-cell transcriptomic data captures transitional progenitor states in mouse skeletal muscle regeneration [Dataset]. http://doi.org/10.5061/dryad.t4b8gtj34
    Explore at:
    Available download formats: zip
    Dataset updated
    Dec 14, 2021
    Dataset provided by
    Cornell University
    Authors
    David McKellar; Iwijn De Vlaminck; Benjamin Cosgrove
    License

    https://spdx.org/licenses/CC0-1.0.html

    Description

    Skeletal muscle repair is driven by the coordinated self-renewal and fusion of myogenic stem and progenitor cells. Single-cell gene expression analyses of myogenesis have been hampered by the poor sampling of rare and transient cell states that are critical for muscle repair, and do not inform the spatial context that is important for myogenic differentiation. Here, we demonstrate how large-scale integration of single-cell and spatial transcriptomic data can overcome these limitations. We created a single-cell transcriptomic dataset of mouse skeletal muscle by integration, consensus annotation, and analysis of 23 newly collected scRNAseq datasets and 88 publicly available single-cell (scRNAseq) and single-nucleus (snRNAseq) RNA-sequencing datasets. The resulting dataset includes more than 365,000 cells and spans a wide range of ages, injury, and repair conditions. Together, these data enabled identification of the predominant cell types in skeletal muscle, and resolved cell subtypes, including endothelial subtypes distinguished by vessel-type of origin, fibro/adipogenic progenitors defined by functional roles, and many distinct immune populations. The representation of different experimental conditions and the depth of transcriptome coverage enabled robust profiling of sparsely expressed genes. We built a densely sampled transcriptomic model of myogenesis, from stem cell quiescence to myofiber maturation and identified rare, transitional states of progenitor commitment and fusion that are poorly represented in individual datasets. We performed spatial RNA sequencing of mouse muscle at three time points after injury and used the integrated dataset as a reference to achieve a high-resolution, local deconvolution of cell subtypes. We also used the integrated dataset to explore ligand-receptor co-expression patterns and identify dynamic cell-cell interactions in muscle injury response. 
    We provide a public web tool to enable interactive exploration and visualization of the data. Our work supports the utility of large-scale integration of single-cell transcriptomic data as a tool for biological discovery.

    Methods

    Mice. The Cornell University Institutional Animal Care and Use Committee (IACUC) approved all animal protocols, and experiments were performed in compliance with its institutional guidelines. Adult C57BL/6J mice (Mus musculus) were obtained from Jackson Laboratories (#000664; Bar Harbor, ME) and were used at 4-7 months of age. Aged C57BL/6J mice were obtained from the National Institute of Aging (NIA) Rodent Aging Colony and were used at 20 months of age. For new scRNAseq experiments, female mice were used in each experiment.

    Mouse injuries and single-cell isolation. To induce muscle injury, both tibialis anterior (TA) muscles of old (20 months) C57BL/6J mice were injected with 10 µl of notexin (10 µg/ml; Latoxan; France). At 0, 1, 2, 3.5, 5, or 7 days post-injury (dpi), mice were sacrificed and TA muscles were collected and processed independently to generate single-cell suspensions. Muscles were digested with 8 mg/ml Collagenase D (Roche; Switzerland) and 10 U/ml Dispase II (Roche; Switzerland), followed by manual dissociation to generate cell suspensions. Cell suspensions were sequentially filtered through 100 and 40 μm filters (Corning Cellgro #431752 and #431750) to remove debris. Erythrocytes were removed through incubation in erythrocyte lysis buffer (IBI Scientific #89135-030).

    Single-cell RNA-sequencing library preparation. After digestion, single-cell suspensions were washed and resuspended in 0.04% BSA in PBS at a concentration of 10^6 cells/ml. Cells were counted manually with a hemocytometer to determine their concentration. Single-cell RNA-sequencing libraries were prepared using the Chromium Single Cell 3’ reagent kit v3 (10x Genomics, PN-1000075; Pleasanton, CA) following the manufacturer’s protocol. Cells were diluted into the Chromium Single Cell A Chip to yield a recovery of 6,000 single-cell transcriptomes. After preparation, libraries were sequenced on a NextSeq 500 (Illumina; San Diego, CA) using 75 cycle high output kits (Index 1 = 8, Read 1 = 26, and Read 2 = 58). Details on estimated sequencing saturation and the number of reads per sample are shown in Sup. Data 1.

    Spatial RNA sequencing library preparation. Tibialis anterior muscles of adult (5 mo) C57BL/6J mice were injected with 10 µl notexin (10 µg/ml) at 2, 5, and 7 days prior to collection. Upon collection, tibialis anterior muscles were isolated, embedded in OCT, and frozen fresh in liquid nitrogen. Spatially tagged cDNA libraries were built using the Visium Spatial Gene Expression 3’ Library Construction v1 Kit (10x Genomics, PN-1000187; Pleasanton, CA) (Fig. S7). Optimal tissue permeabilization time for 10 µm thick sections was found to be 15 minutes using the 10x Genomics Visium Tissue Optimization Kit (PN-1000193). H&E stained tissue sections were imaged using a Zeiss PALM MicroBeam laser capture microdissection system and the images were stitched and processed using Fiji ImageJ software. cDNA libraries were sequenced on an Illumina NextSeq 500 using 150 cycle high output kits (Read 1=28bp, Read 2=120bp, Index 1=10bp, and Index 2=10bp). Frames around the capture area on the Visium slide were aligned manually and spots covering the tissue were selected using Loop Browser v4.0.0 software (10x Genomics). Sequencing data was then aligned to the mouse reference genome (mm10) using the spaceranger v1.0.0 pipeline to generate a feature-by-spot-barcode expression matrix (10x Genomics).

    Download and alignment of single-cell RNA sequencing data. For all samples available via SRA, parallel-fastq-dump (github.com/rvalieris/parallel-fastq-dump) was used to download raw .fastq files. Samples which were only available as .bam files were converted to .fastq format using bamtofastq from 10x Genomics (github.com/10XGenomics/bamtofastq). Raw reads were aligned to the mm10 reference using cellranger (v3.1.0).

    Preprocessing and batch correction of single-cell RNA sequencing datasets. First, ambient RNA signal was removed using the default SoupX (v1.4.5) workflow (autoEstCont and adjustCounts; github.com/constantAmateur/SoupX). Samples were then preprocessed using the standard Seurat (v3.2.1) workflow (NormalizeData, ScaleData, FindVariableFeatures, RunPCA, FindNeighbors, FindClusters, and RunUMAP; github.com/satijalab/seurat). Cells with fewer than 750 features, fewer than 1000 transcripts, or more than 30% of unique transcripts derived from mitochondrial genes were removed. After preprocessing, DoubletFinder (v2.0) was used to identify putative doublets in each dataset, individually. BCmvn optimization was used for pK parameterization. Estimated doublet rates were computed by fitting the total number of cells after quality filtering to a linear regression of the expected doublet rates published in the 10x Chromium handbook. Estimated homotypic doublet rates were also accounted for using the modelHomotypic function. The default pN value (0.25) was used. Putative doublets were then removed from each individual dataset. After preprocessing and quality filtering, we merged the datasets and performed batch correction with three tools, independently: Harmony (github.com/immunogenomics/harmony) (v1.0), Scanorama (github.com/brianhie/scanorama) (v1.3), and BBKNN (github.com/Teichlab/bbknn) (v1.3.12). We then used Seurat to process the integrated data. After initial integration, we removed the noisy cluster and re-integrated the data using each of the three batch-correction tools.
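    The cell-level quality filter described above (at least 750 features, at least 1000 transcripts, at most 30% mitochondrial transcripts) can be sketched in plain Python. This is an illustration of the stated thresholds only, not the authors' Seurat code; the function name `passes_qc`, the per-cell dictionary layout, and the toy gene names are assumptions.

```python
def passes_qc(cell_counts, mito_genes,
              min_features=750, min_transcripts=1000, max_mito_fraction=0.30):
    """Return True if one cell survives the three thresholds quoted in the text.

    cell_counts: dict mapping gene name -> transcript count for a single cell.
    mito_genes:  set of gene names treated as mitochondrial (assumed input).
    """
    n_features = sum(1 for count in cell_counts.values() if count > 0)
    n_transcripts = sum(cell_counts.values())
    if n_transcripts == 0:
        return False
    mito_counts = sum(count for gene, count in cell_counts.items()
                      if gene in mito_genes)
    return (n_features >= min_features
            and n_transcripts >= min_transcripts
            and mito_counts / n_transcripts <= max_mito_fraction)


# Toy cells with hypothetical gene names.
mito = {"mt-Co1"}
good_cell = {f"Gene{i}": 2 for i in range(800)}
good_cell["mt-Co1"] = 100                      # about 6% mitochondrial
high_mito_cell = {f"Gene{i}": 2 for i in range(800)}
high_mito_cell["mt-Co1"] = 1000                # about 38% mitochondrial
sparse_cell = {f"Gene{i}": 10 for i in range(200)}  # only 200 features

kept = [passes_qc(c, mito) for c in (good_cell, high_mito_cell, sparse_cell)]
```

    Only the first toy cell passes: the second exceeds the mitochondrial threshold and the third has too few detected features.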

    Cell type annotation. Cell types were determined for each integration method independently. For Harmony and Scanorama, dimensions accounting for 95% of the total variance were used to generate SNN graphs (Seurat::FindNeighbors). Louvain clustering was then performed on the output graphs (including the corrected graph output by BBKNN) using Seurat::FindClusters. A clustering resolution of 1.2 was used for Harmony (25 initial clusters), BBKNN (28 initial clusters), and Scanorama (38 initial clusters). Cell types were determined based on expression of canonical genes (Fig. S3). Clusters which had similar canonical marker gene expression patterns were merged.
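    Selecting the "dimensions accounting for 95% of the total variance" amounts to a cumulative-sum cutoff over per-dimension variances. The sketch below shows that cutoff in plain Python; it assumes the variances are sorted in decreasing order and that "total variance" means the sum over the computed dimensions, which the text does not specify. The function name `n_dims_for_variance` is hypothetical.

```python
def n_dims_for_variance(explained_variance, threshold=0.95):
    """Smallest number of leading dimensions whose variance share reaches `threshold`.

    explained_variance: per-dimension variances, assumed sorted high to low.
    """
    total = sum(explained_variance)
    cumulative = 0.0
    for index, variance in enumerate(explained_variance, start=1):
        cumulative += variance
        if cumulative / total >= threshold:
            return index
    return len(explained_variance)


# Toy variances: cumulative shares are 0.50, 0.80, 0.90, 0.96, ...
variances = [50, 30, 10, 6, 3, 1]
n_dims = n_dims_for_variance(variances)
```

    With these toy variances the first four dimensions are retained, since three dimensions cover only 90% of the total.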

    Pseudotime workflow. Cells were subset based on the consensus cell types between all three integration methods. Harmony embedding values from the dimensions accounting for 95% of the total variance were used for further dimensional reduction with PHATE, using phateR (v1.0.4) (github.com/KrishnaswamyLab/phateR).

    Deconvolution of spatial RNA sequencing spots. Spot deconvolution was performed using the deconvolution module in BayesPrism (previously known as “Tumor microEnvironment Deconvolution”, TED, v1.0; github.com/Danko-Lab/TED). First, myogenic cells were re-labeled, according to binning along the first PHATE dimension, as “Quiescent MuSCs” (bins 4-5), “Activated MuSCs” (bins 6-7), “Committed Myoblasts” (bins 8-10), and “Fusing Myoctes” (bins 11-18). Culture-associated muscle stem cells were ignored and myonuclei labels were retained as “Myonuclei (Type IIb)” and “Myonuclei (Type IIx)”. Next, highly and differentially expressed genes across the 25 groups of cells were identified with differential gene expression analysis using Seurat (FindAllMarkers, using Wilcoxon Rank Sum Test; results in Sup. Data 2). The resulting genes were filtered based on average log2-fold change (avg_logFC > 1) and the percentage of cells within the cluster which express each gene (pct.expressed > 0.5), yielding 1,069 genes. Mitochondrial and ribosomal protein genes were also removed from this list, in line with recommendations in the BayesPrism vignette. For each of the cell types, mean raw counts were calculated across the 1,069 genes to generate a gene expression profile for BayesPrism. Raw counts for each spot were then passed to the run.Ted function, using

Cite
Sun, Eric (2024). Single-cell Spatial Transcriptomics Data with Paired RNAseq for TISSUE spatial gene expression prediction [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_8259941

Single-cell Spatial Transcriptomics Data with Paired RNAseq for TISSUE spatial gene expression prediction

Explore at:
Dataset updated
Jan 8, 2024
Dataset authored and provided by
Sun, Eric
License

Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically

Description

Dataset folders from "TISSUE: uncertainty-calibrated prediction of single-cell spatial transcriptomics improves downstream analyses". If using the processed data or TISSUE algorithm, please cite: https://doi.org/10.1101/2023.04.25.538326.

The dataset directories are compressed in tar gzip format. The top level contains folders with dataset names, and within each of those folders are the relevant data files, which include:

  • Spatial_count.txt --- a tab-delimited file containing spatial transcriptomics counts matrix

  • scRNA_count.txt --- a tab-delimited file containing RNAseq counts matrix

  • Locations.txt --- a tab-delimited file containing the (x,y) spatial coordinates of cells in the spatial transcriptomics data

  • Metadata.txt --- for some datasets, this is a comma-separated file containing the metadata table for the spatial transcriptomics data

These files are formatted and organized to be read into AnnData objects using the native loading functions in the TISSUE package (https://github.com/sunericd/TISSUE). Some folders will also have additional accessory files such as gene lists corresponding to some experiments present in our manuscript and/or adjacency matrix objects.
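For readers who want to inspect the tab-delimited files without the TISSUE loaders, a minimal standard-library sketch is given here. The file layout is an assumption (one header row, then one numeric row per cell), the helper name `read_tab_matrix` is hypothetical, and the in-memory strings stand in for Spatial_count.txt and Locations.txt; check the actual files before relying on it.

```python
import csv
import io


def read_tab_matrix(handle):
    """Read a tab-delimited table with one header row.

    Returns (header, rows), where rows is a list of lists of floats.
    The layout is an assumption, not something the description guarantees.
    """
    reader = csv.reader(handle, delimiter="\t")
    header = next(reader)
    rows = [[float(value) for value in row] for row in reader if row]
    return header, rows


# Stand-ins for Spatial_count.txt and Locations.txt (toy values).
spatial_text = "GeneA\tGeneB\n3\t0\n1\t5\n"
locations_text = "x\ty\n10.5\t20.0\n11.0\t22.5\n"

genes, counts = read_tab_matrix(io.StringIO(spatial_text))
_, coords = read_tab_matrix(io.StringIO(locations_text))

# Each cell should have one row of counts and one (x, y) coordinate pair.
assert len(counts) == len(coords)
```

To read a real file, pass an open file handle (for example `open(path, newline="")`) in place of the `io.StringIO` object.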

Also included are the two simulated spatial transcriptomics datasets that we generated using SRTsim.

The SVZ folders contain our processed MERFISH spatial transcriptomics dataset on the adult mouse subventricular zone. Refer to the SVZFullFinal folder for the full dataset with TISSUE-informed cell labels. All other folders are processed data accessed from publicly available sources. The identity of numbered folders can be found in the Data Availability statement of the benchmarking paper from which they were retrieved: https://doi.org/10.1038/s41592-022-01480-9

"svz_merfish_data.zip" includes the raw MERFISH dataset on the adult mouse subventricular zone.
