Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Dataset folders from "TISSUE: uncertainty-calibrated prediction of single-cell spatial transcriptomics improves downstream analyses". If using the processed data or TISSUE algorithm, please cite: https://doi.org/10.1101/2023.04.25.538326.
The dataset directory is compressed in tar gzip format. The top level contains one folder per dataset, named after the dataset, and each folder holds the relevant data files, which include:
Spatial_count.txt --- a tab-delimited file containing spatial transcriptomics counts matrix
scRNA_count.txt --- a tab-delimited file containing RNAseq counts matrix
Locations.txt --- a tab-delimited file containing the (x,y) spatial coordinates of cells in the spatial transcriptomics data
Metadata.txt --- for some datasets, this is a comma-separated file containing the metadata table for the spatial transcriptomics data
These files are formatted and organized to be read into AnnData objects using the native loading functions in the TISSUE package (https://github.com/sunericd/TISSUE). Some folders will also have additional accessory files such as gene lists corresponding to some experiments present in our manuscript and/or adjacency matrix objects.
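The folder layout described above can also be read without the TISSUE package. The following is a minimal sketch using only the Python standard library, not the TISSUE loading functions themselves. It assumes that every file starts with a header row and that the rows of Spatial_count.txt and Locations.txt refer to the same cells in the same order; the function names `read_table` and `load_dataset_folder` are hypothetical.

```python
import csv
from pathlib import Path


def read_table(path, delimiter="\t"):
    """Read a delimited text file into (header, rows); values stay as strings."""
    with open(path, newline="") as handle:
        reader = csv.reader(handle, delimiter=delimiter)
        header = next(reader)
        rows = [row for row in reader if row]
    return header, rows


def load_dataset_folder(folder):
    """Load the spatial counts, coordinates and optional metadata of one folder.

    Assumption: the first line of each file is a header, and cells appear in
    the same order in Spatial_count.txt and Locations.txt.
    """
    folder = Path(folder)
    genes, counts = read_table(folder / "Spatial_count.txt")
    _, coords = read_table(folder / "Locations.txt")
    if len(counts) != len(coords):
        raise ValueError("count matrix and coordinates disagree on cell number")
    data = {
        "genes": genes,
        "counts": [[float(value) for value in row] for row in counts],
        "xy": [(float(row[0]), float(row[1])) for row in coords],
    }
    metadata = folder / "Metadata.txt"
    if metadata.exists():  # present only for some datasets, comma-separated
        data["metadata"] = read_table(metadata, delimiter=",")
    return data
```

For real analyses, the loaders in the TISSUE package remain the intended route; this sketch only illustrates how the files relate to one another.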
Also included are the two simulated spatial transcriptomics datasets that we generated using SRTsim.
The SVZ folders contain our processed MERFISH spatial transcriptomics dataset on the adult mouse subventricular zone. Refer to the SVZFullFinal folder for the full dataset with TISSUE-informed cell labels. All other folders are processed data accessed from publicly available sources. The identity of numbered folders can be found in the Data Availability statement of the benchmarking paper from which they were retrieved: https://doi.org/10.1038/s41592-022-01480-9
"svz_merfish_data.zip" includes the raw MERFISH dataset on the adult mouse subventricular zone.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset contains Spatial Transcriptomics (ST) data matched with Matrix-Assisted Laser Desorption/Ionization Mass Spectrometry Imaging (MALDI-MSI) data. It is complementary to the data contained in the same project: files with the same identifier in the two datasets originate from the very same tissue section and can be combined into a multimodal ST-MSI object. For more information about the dataset, please see our manuscript posted on bioRxiv (doi: https://doi.org/10.1101/2023.01.26.525195).
The dataset includes ST data from 19 tissue sections, including human post-mortem and mouse samples. The spatial transcriptomics data were generated using the Visium protocol (10x Genomics). The murine tissue sections come from three different mice unilaterally injected with 6-OHDA, a neurotoxin that selectively destroys dopaminergic neurons when injected into the brain. We used this mouse model to show the applicability of the technology that we developed, named Spatial Multimodal Analysis (SMA). Using SMA on these mouse brain tissue sections, we were able to detect both dopamine with MALDI-MSI and the corresponding gene expression with ST.
The dataset also includes one human post-mortem striatum sample that was placed on one Visium slide across the four capture areas. This sample was analyzed with a different ST protocol named RRST (Mirzazadeh, R., Andrusivova, Z., Larsson, L. et al. Spatially resolved transcriptomic profiling of degraded and challenging fresh frozen samples. Nat Commun 14, 509 (2023). https://doi.org/10.1038/s41467-023-36071-5), in which probes capturing the whole transcriptome are first hybridized in the tissue section and then spatially detected.
Each tissue section in the dataset has been given a unique identifier composed of the Visium array ID and the capture area ID of the Visium slide on which the section was placed.
This unique identifier is included in the names of all files belonging to the same tissue section, including the MALDI-MSI files published in the other dataset of this project. In this dataset you will find the following files for each tissue section:
- raw files: the read one fastq files (containing the pattern *R1*fastq.gz in the file name), the read two fastq files (containing the pattern *R2*fastq.gz in the file name) and the raw microscope images (containing the pattern Spot.jpg in the file name). These are the only files needed to run the Space Ranger pipeline, which is freely available to any user (please see the 10x Genomics website for information on how to install and run Space Ranger);
- processed data files, of three types: a) Space Ranger outputs that were used to produce the figures in our publication; b) manual annotation tables in csv format produced using Loupe Browser 6 (file names ending in _RegionLoupe.csv, _filter.csv, _dopamine.csv, _lesion.csv or _region.csv); c) json files that we used as input for Space Ranger in cases where the automatic tissue detection included in the pipeline failed to recognize the tissue or the fiducials. Using these processed files, the user can reproduce the figures of our publication without restarting from the raw data files.
The MALDI-MSI analyses preceding ST were performed with different matrices on different tissue sections. We used 1) 9-aminoacridine (9-AA) for detection of metabolites in negative ionization mode, 2) 2,5-dihydroxybenzoic acid (DHB) for detection of metabolites in positive ionization mode, and 3) 4-(anthracen-9-yl)-2-fluoro-1-ethylpyridin-1-ium iodide (FMP-10), which charge-tags molecules with phenolic hydroxyls and/or primary amines, including neurotransmitters. The matrix sprayed on each tissue section, together with other information about the samples, is recorded in the metadata table.
We also used three types of control samples:
- standard Visium: samples processed with standard Visium (i.e. no matrix spraying, no MALDI-MSI, protocol as recommended by 10x Genomics with no exceptions);
- internal controls (iCTRL): samples neither sprayed with any matrix nor processed with MALDI-MSI, but located on the same Visium slide where other samples were processed with MALDI-MSI;
- FMP-10-iCTRL: a sample sprayed with FMP-10 and then processed as an iCTRL.
This and other information is provided in the metadata table.
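The file-name patterns listed for this dataset lend themselves to a small sorting script. The sketch below classifies file names by those patterns; it assumes that read-two files carry R2 in their names, and the example file names and the `classify` and `group_by_kind` helpers are hypothetical rather than part of the deposited data.

```python
from fnmatch import fnmatchcase

# (kind, pattern) pairs based on the description of the dataset.
# The "*R2*fastq.gz" pattern for read two is an assumption.
PATTERNS = [
    ("read1", "*R1*fastq.gz"),
    ("read2", "*R2*fastq.gz"),
    ("image", "*Spot.jpg"),
    ("annotation", "*_RegionLoupe.csv"),
    ("annotation", "*_filter.csv"),
    ("annotation", "*_dopamine.csv"),
    ("annotation", "*_lesion.csv"),
    ("annotation", "*_region.csv"),
    ("alignment_json", "*.json"),
]


def classify(filename):
    """Return the kind of the first matching pattern, or 'other'."""
    for kind, pattern in PATTERNS:
        if fnmatchcase(filename, pattern):
            return kind
    return "other"


def group_by_kind(filenames):
    """Group file names into {kind: [names]} preserving input order."""
    groups = {}
    for name in filenames:
        groups.setdefault(classify(name), []).append(name)
    return groups
```

Grouping further by the unique tissue-section identifier would require knowing its exact format, which should be taken from the metadata table rather than guessed.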
MIT License https://opensource.org/licenses/MIT
License information was derived automatically
The Variable Neighborhood Search (VNS) method is a well-known metaheuristic that starts from one point in the search space, explores its neighborhoods, and repeats the process, moving to a better solution whenever one is found, until a stopping criterion is reached. Building on the well-established foundation of VNS, we first present a comprehensive solution for the cell clustering problem, formulated as an Integer Linear Programming (ILP) minimization problem based on p-median classification. The proposed algorithm organizes cells into clusters using information from both gene expression matrices and spatial coordinates.
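To make the idea concrete, here is a minimal sketch of a basic VNS applied to the p-median objective. It is a generic textbook-style illustration, not the authors' algorithm or their ILP formulation. It assumes a precomputed dissimilarity matrix `dist` (how gene expression and spatial distances are combined into it is left open here), and all function names and default parameters are hypothetical.

```python
import random


def cost(dist, medians):
    """p-median objective: each point pays the distance to its nearest median."""
    return sum(min(dist[i][m] for m in medians) for i in range(len(dist)))


def local_search(dist, medians):
    """Apply improving single swaps (median <-> non-median) until none is left."""
    medians = set(medians)
    best = cost(dist, medians)
    improved = True
    while improved:
        improved = False
        for out in sorted(medians):
            for new in range(len(dist)):
                if new in medians:
                    continue
                trial = (medians - {out}) | {new}
                trial_cost = cost(dist, trial)
                if trial_cost < best - 1e-12:
                    medians, best, improved = trial, trial_cost, True
                    break
            if improved:
                break
    return medians, best


def shake(medians, n, k, rng):
    """Random solution in the k-th neighborhood: swap k medians for k non-medians."""
    k = min(k, len(medians), n - len(medians))
    removed = rng.sample(sorted(medians), k)
    added = rng.sample(sorted(set(range(n)) - medians), k)
    return (medians - set(removed)) | set(added)


def vns_p_median(dist, p, k_max=3, max_iter=50, seed=0):
    """Basic VNS: shake, descend, move on improvement, else widen the neighborhood."""
    rng = random.Random(seed)
    n = len(dist)
    medians, best = local_search(dist, set(rng.sample(range(n), p)))
    for _ in range(max_iter):
        k = 1
        while k <= k_max:
            candidate, candidate_cost = local_search(dist, shake(medians, n, k, rng))
            if candidate_cost < best - 1e-12:
                medians, best, k = candidate, candidate_cost, 1  # restart at k = 1
            else:
                k += 1
    return sorted(medians), best
```

Each cell is then assigned to the cluster of its nearest median. Recomputing the full objective for every swap keeps the sketch short but is slow for large datasets; practical implementations update the cost incrementally.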
The samples in this dataset come from a study of breast cancer intratumoral heterogeneity using spatial transcriptomic data and computational pathology. The dataset contains 14 samples from 3 patients (one with triple-negative breast cancer and two with HER2-positive breast cancer). Multiple regions of each tumor were collected for analysis; each sample is one tumor region from one of the patients.
Libraries for spatial transcriptomics were prepared using Visium spatial gene expression kits (10x Genomics). Sequencing was performed using the Illumina NovaSeq 6000 platform at the National Genomics Infrastructure, SciLifeLab in Solna, Sweden.
The dataset contains 28 fastq files, compressed with GNU zip (gzip), from paired-end RNA sequencing (10x Visium spatial transcriptomics). The metadata are described in the SND_metadata.xlsx file. The md5sum.txt file is provided for validating data integrity. The total size of the dataset is approximately 300 GB.
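The integrity check mentioned above can be run with the `md5sum -c` command or sketched in Python as follows. This example assumes that md5sum.txt uses the usual `md5sum` layout (one `<hash>  <filename>` entry per line) and that the listed files sit in the same directory as the checksum file; neither detail is confirmed by the dataset description.

```python
import hashlib
from pathlib import Path


def md5_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large fastq.gz files never sit fully in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(checksum_file):
    """Return {filename: True/False} for every entry in the checksum file."""
    checksum_file = Path(checksum_file)
    results = {}
    for line in checksum_file.read_text().splitlines():
        if not line.strip():
            continue
        expected, name = line.split(maxsplit=1)
        name = name.strip().lstrip("*")  # md5sum marks binary mode with '*'
        target = checksum_file.parent / name
        results[name] = target.exists() and md5_of(target) == expected.lower()
    return results
```

A file that is missing or whose hash differs is reported as `False`, so a download can be checked with `all(verify("md5sum.txt").values())`.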
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Recent breakthroughs in spatial transcriptomics technologies have enhanced our understanding of diverse cellular identities, spatial organizations, and functions. Yet existing spatial transcriptomics tools are still limited in either transcriptomic coverage or spatial resolution, hindering unbiased, hypothesis-free transcriptomic analyses at high spatial resolution. Here we develop Reverse-padlock Amplicon Encoding FISH (RAEFISH), an image-based spatial transcriptomics method with whole-genome coverage and single-molecule resolution in intact tissues. We demonstrate spatial profiling of 23,000 human or 22,000 mouse transcripts in single cells and tissue sections. Our analyses reveal transcript-specific subcellular localization, cell-type-specific and cell-type-invariant zonation-dependent transcriptomes, and gene programs underlying preferential cell-cell interactions. Finally, we further develop our technology for direct spatial readout of gRNAs in an image-based high-content CRISPR screen. Overall, these developments provide the research community with a broadly applicable technology that enables high-coverage, high-resolution spatial profiling of both long and short, native and engineered RNA species in many biomedical contexts.
MIT License https://opensource.org/licenses/MIT
License information was derived automatically
All the analysis code and processed data required to produce the figures in the paper "Inference of cell-type composition and single-cell spatial maps from spatial transcriptomics data with SWOT".
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
U-2 OS MERFISH dataset prepared by the Han lab at UIUC, based on procedures developed in Moffitt et al., Proc. Natl. Acad. Sci. USA 113 (39), 11046–11051. The data comprise ~2 million spots from 130 genes, each with x, y, z location, cell assignment, and correction status.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset contains annotated, sub-cellularly localised spatial measurements from the Visium, Xenium and CosMx platforms. Specifically, it includes the datasets analysed in the publication Bhuva et al., 2023, titled "Library size confounds biology in spatial transcriptomics data". Raw transcript detections are presented. The data are best accessed through the accompanying SubcellularSpatialData R/Bioconductor package. The region files used to annotate individual transcript detections are provided as GeoJSON files.
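As an illustration of how GeoJSON regions can be used to annotate transcript detections, the sketch below labels (x, y) points by the polygon that contains them, using a standard ray-casting test. It is not the procedure used by the SubcellularSpatialData package. It assumes a FeatureCollection of simple Polygon geometries in the same coordinate system as the detections, and the property key `name` is hypothetical.

```python
import json


def point_in_ring(x, y, ring):
    """Ray-casting test of a point against one closed ring of [x, y] positions."""
    inside = False
    for start, end in zip(ring, ring[1:]):
        x1, y1 = start[0], start[1]
        x2, y2 = end[0], end[1]
        if (y1 > y) != (y2 > y):  # the edge crosses the horizontal line through y
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def annotate(detections, geojson_text, label_key="name"):
    """Return, for each (x, y) detection, the label of the first containing polygon."""
    features = json.loads(geojson_text)["features"]
    labels = []
    for x, y in detections:
        label = None
        for feature in features:
            geometry = feature["geometry"]
            if geometry["type"] != "Polygon":
                continue  # this sketch handles simple polygons only
            exterior = geometry["coordinates"][0]  # holes (later rings) are ignored
            if point_in_ring(x, y, exterior):
                label = feature["properties"].get(label_key)
                break
        labels.append(label)
    return labels
```

For millions of detections a spatial index or a dedicated geometry library would be preferable; the loop above is only meant to show the logic.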
https://ega-archive.org/dacs/EGAC50000000277
Spatial transcriptomics data (ST) from 32 human prostate tissue samples originating from 8 prostate cancer patients (5 patients with post-surgery relapse). The ST data was acquired using the Visium Spatial Gene Expression kit which resulted in over 20 000 spatially defined spots for the 32 tissue samples. The raw transcriptomics data is RNA-seq. The individual samples have information on patient origin and sample type (cancer, cancer-adjacent field-effect normal or normal sample far from cancer). Each ST spot has metadata such as sample origin, histology class (stroma, normal epithelium, cancer of various grading etc), number of cells and estimated cell type fractions. Patient metadata include information of age at surgery, time (months) until reported relapse, total follow-up time, pre-surgery PSA, post-surgery T-stage and metastasis status.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The data for training the GAN (Inversion) model and reproducing the results reported in the paper.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Clinical interventions and inflammatory signaling shape the transcriptional and cellular architecture of the early postnatal lung
Spatial Transcriptomics was performed using the 10X Xenium Platform with a custom-designed 480-probe set on 1 tissue section from 5 distinct early postnatal lung specimens. CSV files contain cell type identities as determined by label transfer.
.zip files should be unzipped to the same directory and can be viewed with Xenium Explorer.
.csv files contain cell type annotations as determined by label transfer to hand annotated single nuclei RNA-sequencing data from early postnatal lung. They can be added as a custom cell group in Xenium Explorer.
Code used in analysis of this data is available at: http://github.com/jason-spence-lab/Frum-et-al.-2025a.git
METHODS
Tissue Preparation for Xenium Spatial Transcriptomics Analysis
Xenium slides were removed from -20°C storage, allowed to come to room temperature for 30 minutes, then placed on a 42°C slide warmer and coated with DNase/RNase-free water (Corning, Cat# 46000CM). Small sections from multiple specimens were carefully placed within the sample placement area. Most of the water was removed once the sections had completely flattened. Slides were dried on the slide warmer for three hours before transport to the Advanced Genomics Core. Xenium slides were processed by the Advanced Genomics Core using the Xenium In Situ Gene Expression with Cell Segmentation workflow (10X, #CG000749).
Xenium Data Analysis
Preprocessing/QC Filtering
Centroid and segmentation coordinates and gene expression counts were determined by Xenium Onboard Analysis v4.0 and imported into R using Seurat::ReadXenium(). Gene expression counts were converted to a Seurat object using Seurat::CreateSeuratObject(). Coordinates for centroids and segmentations were first converted into a field of view using Seurat::CreateFOV() and then appended to the Seurat object. Segmentations with fewer than 25 gene expression counts were excluded from the analysis.
Label Transfer
To align the low-complexity 480-probe Xenium data with the higher-complexity snRNA-seq data, the reference data was transformed using Seurat::SCTransform() with 3000 variable features. Each specimen was processed individually, also undergoing SCTransformation using 250 variable features. Any Xenium probes expressed in over 95% of cells were excluded from analysis. Anchors between each specimen and the snRNA-seq reference were calculated using FindTransferAnchors() using the SCT assay of both datasets, 20 dimensions, k.filter = 200, and considering only the variable features from the Xenium specimen. Cell type annotations from the snRNA-seq data were then transferred to the Xenium specimen using TransferData(), with anchors weighted by the PCs of the Xenium specimen.
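The two exclusion rules in the methods above (segmentations with fewer than 25 counts, probes expressed in over 95% of cells) can be illustrated with a short sketch. It is written in plain Python for illustration and is not the Seurat code used in the analysis. It assumes a cells-by-probes matrix of raw counts and treats a probe as "expressed" in a cell when its count is above zero; both function names are hypothetical.

```python
def keep_cells(counts, min_counts=25):
    """Indices of cells (rows) whose total count is at least min_counts."""
    return [i for i, row in enumerate(counts) if sum(row) >= min_counts]


def keep_probes(counts, max_fraction=0.95):
    """Indices of probes (columns) detected in at most max_fraction of cells."""
    n_cells = len(counts)
    kept = []
    for j in range(len(counts[0])):
        detected = sum(1 for row in counts if row[j] > 0)
        if detected / n_cells <= max_fraction:
            kept.append(j)
    return kept


def apply_filters(counts, min_counts=25, max_fraction=0.95):
    """Filter cells first, then probes, and return the reduced matrix."""
    cells = keep_cells(counts, min_counts)
    filtered = [counts[i] for i in cells]
    probes = keep_probes(filtered, max_fraction)
    return cells, probes, [[row[j] for j in probes] for row in filtered]
```

Applying the cell filter before the probe filter is a choice made for this sketch; the methods text does not state the order in which the two were evaluated.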
Supplementary data supporting the SpatialOne: End-to-End Analysis of Spatial Transcriptomics at Scale publication
To showcase the capabilities of SpatialOne, two human lung cancer formalin-fixed, paraffin-embedded (FFPE) samples are analyzed. These samples were prepared following the CG000495 protocol (Figure 1b), sequenced with the 10x Visium CytAssist, and processed using 10x SpaceRanger version 2. We also present analyses of two adult mouse samples profiled using 10x Visium (one fresh frozen brain tissue section processed using SpaceRanger v2 and one FFPE kidney sample processed using SpaceRanger v1), and of 75 internal samples.
For the human lung cancer samples, single-cell data from the Lung Cancer Atlas (Salcher et al., 2022) is used as reference. This dataset is filtered to include only Chromium-generated data. For the mouse samples, the GSE107585 single-cell dataset serves as reference. In the human lung cancer datasets, a pathologist annotated regions of interest corresponding to tumors, blood vessels, and alveolar regions.
Changelog:
Added a README file describing the zip content.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This repository contains spatial transcriptomics data related to the Wu et al. 2021 study "A single-cell and spatially resolved atlas of human breast cancers". It provides processed count matrices, brightfield H&E images (plain and annotated) and metadata (containing clinical information and spot pathological details) for 6 primary breast cancers profiled using the Visium assay (10X Genomics). If you use this dataset in your research, please consider citing the above study.
The contents of the files are:
raw_count_matrices.tar.gz - spaceranger processed raw count matrices.
spatial.tar.gz - spaceranger processed spatial files (images, scalefactors, aligned fiducials, position lists)
filtered_count_matrices.tar.gz - filtered count matrices.
metadata.tar.gz - metadata for tissues and spots of filtered count matrices, including clinical subtype and pathological annotation of each spot.
images.pdf - pdf detailing the H&E and annotation images.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The morbidity of hepatocellular carcinoma (HCC) is highest in individuals with chronic liver diseases (CLD). However, the effects of cell composition on the progression of CLDs to HCC remain elusive. To gain a better understanding of the spatial distribution of cells and their interactions, we created spatial transcriptome data from two HCC tissues and their normal adjacent FFPE tissues using the 10x Visium platform. We processed the data using cellRanger and mapped it to the human Hg38 reference genome with GRCh38.p3 annotation. All data generated by cellRanger is provided here.
https://www.kuleuven.be/rdm/en/rdr/custom-kuleuven
This folder contains the fastq files that were generated during the Grand Challenge project using 10X Genomics Visium on head and neck squamous cell carcinoma samples. It contains 4 fastq files (R1 and R2 for each of the two sequencing lanes) per sample. For each patient, 2 samples (biopsy and resection) were collected, and the two samples of 1 patient (HNI40020) were analyzed twice.
https://dataintelo.com/privacy-and-policy
According to our latest research, the global spatial transcriptomics software market size in 2024 stands at USD 375 million, reflecting a robust expansion driven by the increasing adoption of spatial omics technologies in biomedical research. The market is anticipated to grow at a CAGR of 15.2% from 2025 to 2033, reaching a forecasted value of USD 1.23 billion by 2033. This remarkable growth trajectory is primarily attributed to the rising demand for high-throughput spatial gene expression analysis, advancements in imaging technologies, and the integration of artificial intelligence with bioinformatics platforms across research and clinical settings.
One of the primary growth factors propelling the spatial transcriptomics software market is the surging need for spatially resolved transcriptomic data in understanding complex biological processes, particularly in oncology and neuroscience. Researchers are increasingly recognizing the limitations of bulk RNA sequencing, which fails to capture the spatial context of gene expression within tissues. The ability of spatial transcriptomics software to map gene activity at a cellular level within intact tissue sections is revolutionizing research in tumor microenvironments, neurodegenerative diseases, and developmental biology. As a result, both academic and commercial entities are investing heavily in spatial transcriptomics platforms and software, further fueling market expansion.
Another significant driver is the rapid technological evolution in imaging and sequencing techniques, which has led to the generation of massive spatial omics datasets. This surge in data volume necessitates advanced computational tools for efficient analysis, visualization, and interpretation. Spatial transcriptomics software solutions are being enhanced with machine learning algorithms, scalable cloud-based architectures, and user-friendly interfaces to accommodate the growing complexity and size of datasets. These innovations are enabling researchers to extract actionable insights from spatial transcriptomics experiments, driving adoption across pharmaceutical, biotechnology, and diagnostic sectors.
Furthermore, the increasing collaboration between software developers, instrument manufacturers, and research institutions is accelerating the development of integrated spatial omics solutions. Strategic partnerships are resulting in the creation of comprehensive platforms that combine hardware, reagents, and software, streamlining the workflow from sample preparation to data analysis. This integrated approach not only improves efficiency and reproducibility but also lowers the barrier to entry for new users. The proliferation of open-source spatial transcriptomics software and the establishment of data-sharing consortia are also fostering innovation and standardization across the industry, contributing to sustained market growth.
From a regional perspective, North America currently dominates the spatial transcriptomics software market, owing to its strong presence of leading research institutions, well-established biotechnology and pharmaceutical industries, and high adoption of advanced omics technologies. Europe follows closely, supported by robust funding for life sciences research and a growing focus on precision medicine. The Asia Pacific region is rapidly emerging as a key growth area, driven by expanding investments in genomics infrastructure and increasing awareness of spatial omics applications. Meanwhile, Latin America and the Middle East & Africa are witnessing gradual adoption, propelled by improvements in healthcare infrastructure and rising research activities. The global landscape is poised for dynamic growth, with regional markets contributing uniquely to the evolution of spatial transcriptomics software.
The spatial transcriptomics software market is segmented by product type into standalone software and integrated software suites. Standalone software solutions are designed to perform specific analytical tasks such as image processing, spatial mapping, or gene expression quantification. These tools are favored by advanced users and specialized research groups who require customized workflows and the flexibility to integrate with other bioinformatics platforms. Standalone products often feature modular architectures, allowing users to select and deploy functionalities that align precisely with their experimental requirements. This segment is witnessing steady deman
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Raw sequencing data (fastq and BAM files) of stage III and IV HNSCC samples.
Spatial transcriptomics seeks to integrate single-cell transcriptomic data within the 3-dimensional space of multicellular biology. Current methods use glass substrates pre-seeded with matrices of barcodes or fluorescence hybridization of a limited number of probes. We developed an alternative approach, called ‘ZipSeq’, that uses patterned illumination and photocaged oligonucleotides to serially print barcodes (Zipcodes) onto live cells within intact tissues, in real-time and with on-the-fly selection of patterns. Using ZipSeq, we mapped gene expression in three settings: in-vitro wound healing, live lymph node sections and in a live tumor microenvironment (TME). In all cases, we discovered new gene expression patterns associated with histological structures. In the TME, this demonstrated a trajectory of myeloid and T cell differentiation, from periphery inward. A combinatorial variation of ZipSeq efficiently scales in number of regions defined, providing a pathway for complete mapping ...
Raw stitched images supporting main and supplementary figures from manuscript. Samples were prepared as detailed in the methods section.
Please refer to the metadata file.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Multiscale data integration of spatial transcriptomics and scRNA-seq data.
https://ega-archive.org/dacs/EGAC50000000173
This dataset contains raw FASTQ files and processed files from spatial transcriptomics of 6 EMD samples collected from MM patients. The processed files contain the h5 expression matrices, the image of the Visium slide, and a TSV of spatial coordinates. Patients included in this dataset are PT01A, PT01B, PT02, PT03, PT07, PT08, PT09, PT10, and PT11.