9 datasets found
  1. Test Data for Galaxy tutorial "Clustering 3k PBMCs with Seurat" -...

    • ordo.open.ac.uk
    bin
    Updated Nov 14, 2024
    + more versions
    Cite
    Marisa Loach (2024). Test Data for Galaxy tutorial "Clustering 3k PBMCs with Seurat" - SCTransform workflow [Dataset]. http://doi.org/10.5281/zenodo.14013637
    Available download formats: bin
    Dataset updated
    Nov 14, 2024
    Dataset provided by
    The Open University
    Authors
    Marisa Loach
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Test Data for Galaxy tutorial "Clustering 3k PBMCs with Seurat" - SCTransform workflow

  2. Data from: Spatial Transcriptomics in Breast Cancer Reveals Tumour...

    • data.niaid.nih.gov
    • zenodo.org
    Updated Nov 29, 2024
    Cite
    Jiménez-Santos, María José; García-Martín, Santiago; Rubio-Fernández, Marcos; Gómez-López, Gonzalo; Al-Shahrour, Fátima (2024). Spatial Transcriptomics in Breast Cancer Reveals Tumour Microenvironment-Driven Drug Responses and Clonal Therapeutic Heterogeneity [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_10638905
    Dataset updated
    Nov 29, 2024
    Dataset provided by
    Spanish National Cancer Research Centre
    Authors
    Jiménez-Santos, María José; García-Martín, Santiago; Rubio-Fernández, Marcos; Gómez-López, Gonzalo; Al-Shahrour, Fátima
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    We acquired 10x Visium spatial transcriptomics (ST) data from 9 patients with invasive adenocarcinomas [1–5] to explore the role of the tumour microenvironment (TME) in intratumour heterogeneity (ITH) and drug response in breast cancer. By leveraging a new version of Beyondcell [6], a tool for identifying tumour cell subpopulations with distinct drug response patterns, we predicted sensitivity to over 1,200 drugs while accounting for the spatial context and interaction between the tumour and TME compartments. We also used Beyondcell to compute spot-wise functional enrichment scores and identify niche-specific biological functions.

    Here, you can find:

    In signatures folder:

    SSc breast: Collection of gene signatures used to predict sensitivity to > 1,200 drugs derived from breast cancer cell lines.

    Functional signatures: Collection of gene signatures used to compute enrichment in different biological pathways.

    In visium folder:

    Visium objects: Processed ST Seurat objects with deconvoluted spots, SCTransform-normalised counts, and clonal composition predicted with SCEVAN [7]. These objects, together with the signatures, were used to compute the Beyondcell objects.

    In single-cell folder:

    Single-cell objects: Raw and filtered merged single-cell RNA-seq (scRNA-seq) Seurat objects with unnormalised counts used as a reference for spot deconvolution.

    In beyondcell folder:

    Beyondcell sensitivity objects with prediction scores for all drug response signatures in SSc breast.

    Beyondcell functional objects with enrichment scores for all functional signatures.

  3. endocrinogenesis_day15_sctransform.h5ad

    • figshare.com
    hdf
    Updated Jun 9, 2023
    Cite
    Anna Schaar (2023). endocrinogenesis_day15_sctransform.h5ad [Dataset]. http://doi.org/10.6084/m9.figshare.16656982.v4
    Available download formats: hdf
    Dataset updated
    Jun 9, 2023
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Anna Schaar
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Description adjusted from the sctransform documentation (https://satijalab.org/seurat/articles/sctransform_vignette.html): "The results of sctransform are stored in layers with the “SCT” prefix. SCT_normalized contains the residuals (normalized values), and is used directly as input to PCA. To assist with visualization and interpretation, we also convert Pearson residuals back to ‘corrected’ UMI counts. You can interpret these as the UMI counts we would expect to observe if all cells were sequenced to the same depth. The ‘corrected’ UMI counts are stored in SCT_corrected_UMI. We store log-normalized versions of these corrected counts in SCT_lognorm_corrected_UMI, which are very helpful for visualization. You can use the corrected log-normalized counts for differential expression and integration. However, in principle, it would be most optimal to perform these calculations directly on the residuals (stored in the SCT_normalized slot) themselves."
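The relationship between the corrected and log-normalized layers described above can be sketched in a few lines. This is a minimal illustration, assuming the log-normalized layer is the natural log of one plus the corrected count (as in Seurat's SCT assay); whether this file's SCT_lognorm_corrected_UMI layer was produced exactly this way is an assumption, and the function name and example counts are hypothetical.

```python
import math


def lognorm_corrected(corrected_umi):
    """Return log1p of each corrected UMI count (rows = cells, columns = genes)."""
    return [[math.log1p(count) for count in row] for row in corrected_umi]


# Hypothetical corrected UMI counts for two cells and three genes.
corrected = [[0, 1, 4], [2, 0, 9]]
lognorm = lognorm_corrected(corrected)
```

A zero count stays at zero after the transform, which is one reason log1p is a common choice for sparse count data.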

  4. Single-cell spatial transcriptomics and proteomics of APOE Christchurch in...

    • data.niaid.nih.gov
    • datadryad.org
    zip
    Updated Jan 24, 2025
    Cite
    Kristine Tran; Nellie Kwang; Kim Green (2025). Single-cell spatial transcriptomics and proteomics of APOE Christchurch in 5xFAD and PS19 mice [Dataset]. http://doi.org/10.5061/dryad.m63xsj4ck
    Available download formats: zip
    Dataset updated
    Jan 24, 2025
    Dataset provided by
    AMP Network
    Authors
    Kristine Tran; Nellie Kwang; Kim Green
    License

    CC0 1.0: https://spdx.org/licenses/CC0-1.0.html

    Description

    This collection of datasets comprises results from four single-cell spatial experiments conducted on mouse brains: two spatial transcriptomics experiments and two spatial proteomics experiments. These experiments were performed using the Bruker Nanostring CosMx technology on 10µm coronal brain sections from the following mouse models: (1) 14-month-old male 5xFAD;ApoeCh mice and genotype controls, and (2) 9-month-old PS19;ApoeCh mice and genotype controls. Each dataset is provided as an RDS file which includes raw and corrected counts for the RNA data and mean fluorescence intensity for the protein data, along with comprehensive metadata. Metadata includes mouse genotype, sample ID, cell type annotations, sex (for the PS19;ApoeCh dataset), and X-Y coordinates of each cell. Results from differential gene expression analysis for each cell type between genotypes using MAST are also included as .csv files.

    Methods

    Sample preparation: Isopentane fresh-frozen brain hemispheres were embedded in optimal cutting temperature (OCT) compound (Tissue-Tek, Sakura Finetek, Torrance, CA), and 10µm thick coronal sections were prepared using a cryostat (CM1950, Leica Biosystems, Deer Park, IL). Six hemibrains were mounted onto each VWR Superfrost Plus microscope slide (Avantor, 48311-703) and kept at -80°C until fixation. For both the 5xFAD (14 months old, males) and PS19 (9 months old, females and 1 male ApoeCh) models, n=3 mice per genotype (wild-type; ApoeCh HO; 5xFAD HEMI or PS19 HEMI; and 5xFAD HEMI;ApoeCh HO or PS19 HEMI;ApoeCh HO), except n=2 for PS19;ApoeCh, were used for transcriptomics and proteomics. The same mice were used for both transcriptomics and proteomics. Tissues were processed according to the Nanostring CosMx fresh-frozen slide preparation manual for RNA and protein assays (NanoString University).

    Data processing: Spatial transcriptomics datasets were filtered using the AtoMx RNA Quality Control module to flag outlier negative probes (control probes targeting non-existent sequences to quantify non-specific hybridization), lowly expressing cells, FOVs, and target genes. Datasets were then normalized and scaled using Seurat 5.0.1 SCTransform to account for differences in library size across cell types [31]. Principal component analysis (PCA) and uniform manifold approximation and projection (UMAP) analysis were performed to reduce dimensionality and visualize clusters in space. Unsupervised clustering at 1.0 resolution yielded 33 clusters for the 5xFAD dataset and 40 clusters for the PS19 dataset. Clusters were manually annotated based on gene expression and spatial location. Spatial proteomics data were filtered using the AtoMx Protein Quality Control module to flag unreliable cells based on segmented cell area, negative probe expression, and overly high/low protein expression. Mean fluorescence intensity data were hyperbolic arcsine transformed with the AtoMx Protein Normalization module. Cell types were automatically annotated based on marker gene expression using the CELESTA algorithm.
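For readers unfamiliar with the hyperbolic arcsine transform mentioned above, a rough sketch follows. It shows the generic transform asinh(x / cofactor); the cofactor parameter and its default of 1.0 are assumptions for illustration only, since the description does not state how the AtoMx Protein Normalization module scales intensities.

```python
import math


def arcsinh_transform(intensities, cofactor=1.0):
    """Apply asinh(x / cofactor) to each mean fluorescence intensity.

    The cofactor is a hypothetical scaling parameter; the value used by the
    AtoMx module is not given in the dataset description.
    """
    return [math.asinh(x / cofactor) for x in intensities]


# Hypothetical mean fluorescence intensities for one protein across four cells.
raw = [0.0, 5.0, 50.0, 5000.0]
transformed = arcsinh_transform(raw)
```

The transform is roughly linear near zero and logarithmic for large values, so it compresses bright outliers while leaving zero intensities at zero.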

  5. Investigating Highly Variable Genes in Single-cell RNA-seq Data across...

    • data.mendeley.com
    Updated May 16, 2023
    Cite
    Jantarika Kumar Arora (2023). Investigating Highly Variable Genes in Single-cell RNA-seq Data across Multiple Cell Types and Conditions [Dataset]. http://doi.org/10.17632/6ry3x7r8hf.3
    Dataset updated
    May 16, 2023
    Authors
    Jantarika Kumar Arora
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Peripheral blood mononuclear cell (PBMC) samples were collected from patients infected with dengue virus (DENV) at four time points: two days and one day before defervescence (febrile phase), at defervescence (critical phase), and at two-week convalescence. The raw and filtered matrix files were generated using CellRanger version 3.0.2 (10x Genomics, USA) with the reference human genome GRCh38 1.2.0. Potential contamination by ambient RNA was corrected using SoupX. Low-quality cells, including cells with more than 10% mitochondrial gene expression and doublets/multiplets, were excluded using Seurat and DoubletFinder, respectively. The individual samples were then integrated using the SCTransform method with 3,000 gene features. Principal component analysis (PCA) and clustering were performed using the Louvain algorithm with multi-level refinement. The gene expression level of each cell was normalized using the LogNormalize method in Seurat. Cell types were annotated using the canonical marker genes described in the original paper; see the related link below.
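The low-quality cell filter described above can be illustrated with a small sketch. It is not the authors' Seurat/DoubletFinder code: the record layout, field names, and barcodes are hypothetical, and the doublet flag is assumed to have been computed beforehand by a separate tool.

```python
def percent_mito(total_counts, mito_counts):
    """Percentage of a cell's counts that come from mitochondrial genes."""
    return 100.0 * mito_counts / total_counts if total_counts else 0.0


def keep_cell(cell, max_percent_mito=10.0):
    """Keep singlets whose mitochondrial percentage does not exceed the cutoff."""
    if cell["is_doublet"]:
        return False
    return percent_mito(cell["total"], cell["mito"]) <= max_percent_mito


# Hypothetical per-cell summaries; "is_doublet" stands in for a doublet caller's output.
cells = [
    {"barcode": "AAAC", "total": 1000, "mito": 50, "is_doublet": False},
    {"barcode": "CCCG", "total": 1000, "mito": 150, "is_doublet": False},
    {"barcode": "GGGT", "total": 2000, "mito": 100, "is_doublet": True},
]
kept = [cell["barcode"] for cell in cells if keep_cell(cell)]
```

Only the first cell passes: the second exceeds the 10% mitochondrial cutoff and the third is flagged as a doublet.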

  6. snRNA-seq, Primary-Recurrent GBM (Mikolajewicz Cohort)

    • figshare.com
    bin
    Updated Jun 4, 2024
    Cite
    Nicholas Mikolajewicz (2024). snRNA-seq, Primary-Recurrent GBM (Mikolajewicz Cohort) [Dataset]. http://doi.org/10.6084/m9.figshare.25917628.v1
    Available download formats: bin
    Dataset updated
    Jun 4, 2024
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Nicholas Mikolajewicz
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Summary. 10 primary GBM and 8 recurrent GBM samples (14/18 matched) profiled using single-nucleus RNA sequencing (sci-RNA-seq3 protocol).

    Data Format. Data is provided as a preprocessed dataset, stored in a Seurat object.

    Sample processing, sci-RNA-seq3 library generation, and sequencing. Snap-frozen patient pGBM and rGBM tissues were chopped with a razor blade or scissors before nucleus isolation. Nuclei extraction and fixation were performed as previously described (Cao 2019), except for the use of a modified CST lysis buffer [50] plus 1% of SUPERase-In RNase Inhibitor (Invitrogen, #AM2696). Lysis time and washing steps were further optimized based on human GBM tissue. Nuclei quality was checked with DAPI and Wheat Germ Agglutinin (WGA) staining. Sci-RNA-seq3 libraries were generated as previously described [49] using three-level combinatorial indexing. The final libraries were sequenced on Illumina NovaSeq as follows: read 1: 34bp, read 2: >=69bp, index 1: 10bp, index 2: 10bp.

    Demultiplexing and read alignments. Raw sequencing reads were first demultiplexed based on i5/i7 PCR barcodes. FASTQ files were then processed using the sci-RNA-Seq3 pipeline. After barcodes and unique molecular identifiers (UMIs) were extracted from read 1 of the FASTQ files, read alignment was performed using the STAR short-read aligner (v2.5.2b) with the human genome (hg19) and Gencode v24 gene annotations. After removing duplicate reads based on UMI, barcode, chromosome and alignment position, reads were summarized into a count matrix of M genes × N nuclei.

    Filtering, normalization, integration, and dimensional reduction. Raw count matrices were loaded into a Seurat object (version 4.0.1) and filtered to retain cells with (i) 200–9000 recovered genes per cell, (ii) less than 60% mitochondrial content, and (iii) an unmatched rate within 3 median absolute deviations of the median. To normalize the count matrix, we adopted the modeling framework previously described and implemented in sctransform (R package, version 0.3.2). In brief, count data were modelled by regularized negative binomial regression, using sequencing depth as a model covariate to regress out the influence of technical effects, and Pearson residuals were used as the normalized and variance-stabilized biological signal for downstream analysis. Data from each patient were integrated with the reciprocal PCA method (Seurat) using the top 2000 variable features. PCA was performed on the integrated dataset, and the top N components that accounted for 90% of the observed variance were used for UMAP embedding, RunUMAP(max_components = 2, n_neighbours = 50, min_dist = 0.1, metric = cosine).

    Contact. Contact Dr. Nicholas Mikolajewicz regarding any questions about the data or analysis (n.mikolajewicz@utoronto.ca)
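The Pearson residuals mentioned in the normalization step can be sketched as follows. This is an illustrative sketch rather than the sctransform implementation: it assumes the model log(mu) = beta0 + beta1 * log10(depth) with already-regularized coefficients, uses the negative binomial variance mu + mu^2 / theta, and clips residuals at the square root of the number of cells. The coefficient values and the clipping bound are assumptions, not values from the dataset.

```python
import math


def expected_count(beta0, beta1, depth):
    """Expected count under log(mu) = beta0 + beta1 * log10(sequencing depth)."""
    return math.exp(beta0 + beta1 * math.log10(depth))


def pearson_residual(count, mu, theta, n_cells):
    """Negative binomial Pearson residual, clipped to +/- sqrt(n_cells).

    The clipping bound is an assumption made for this sketch.
    """
    variance = mu + mu * mu / theta
    residual = (count - mu) / math.sqrt(variance)
    clip = math.sqrt(n_cells)
    return max(-clip, min(clip, residual))


# Hypothetical gene-level parameters and one nucleus with 1,000 total UMIs.
mu = expected_count(beta0=-2.0, beta1=1.0, depth=1000)
z = pearson_residual(count=5, mu=mu, theta=4.0, n_cells=100)
```

A residual near zero means the observed count matches what sequencing depth alone predicts; large positive values indicate expression above that technical expectation.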

  7. Single cell RNA-sequencing counts of mucosal biopsies from ulcerative...

    • zenodo.org
    Updated Jun 11, 2025
    Cite
    I.C.N. Fung (2025). Single cell RNA-sequencing counts of mucosal biopsies from ulcerative colitis patients [Dataset]. http://doi.org/10.5281/zenodo.14236828
    Dataset updated
    Jun 11, 2025
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    I.C.N. Fung
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Mucosal biopsy samples were obtained from patients during routine ileocolonoscopy at the Department of Gastroenterology and Hepatology of the Amsterdam UMC between December 2020 and April 2021. Patients were aged ≥16 years with an established diagnosis of ulcerative colitis (UC) determined through endoscopy and histopathology, where disease activity was scored by a trained gastroenterologist using the Mayo Endoscopic Score (MES) and the Ulcerative Colitis Endoscopic Index of Severity (UCEIS). In total, we acquired samples from 10 UC patients: 5 inflamed UC (MES≥1), 5 non-inflamed UC (MES=0). Exclusion criteria were ongoing malignancy, a history of colonic dysplasia, or colonic surgery. Control samples were obtained from resection specimens acquired from 4 patients with no established diagnosis of UC (CD, suspicion of rectum carcinoma, trans-anal total mesorectal excision or hemicolectomy), which were obtained from the biobank of the Amsterdam UMC.

    Raw reads were aligned to GRCh38 using Cellranger (v7.0.0) (10X Genomics) to obtain unique molecular identifier (UMI) counts. Samples were imported separately, processed, and analysed in the R programming environment (v4.2.1) using Seurat (v4.3.0) [43,44]. UMI counts were normalized by SCTransform (v) using default parameters.

  8. R-script for single-cell RNA-seq data analysis

    • figshare.com
    Updated Oct 24, 2025
    Cite
    Alaullah Sheikh (2025). R-script for single-cell RNA-seq data analysis [Dataset]. http://doi.org/10.6084/m9.figshare.30307015.v1
    Dataset updated
    Oct 24, 2025
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Alaullah Sheikh
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    R script for single-cell RNA-seq data analysis. The code includes steps for quality control, normalization using SCTransform, dimensional reduction (PCA and UMAP), clustering, differential gene expression analysis, and visualization of marker genes. Integration workflows were performed to combine control and LT-treated organoid datasets, followed by annotation of epithelial subtypes based on established marker genes. Additional scripts generate figures such as UMAP projections, heatmaps, dot plots, and violin plots.

  9. Single-cell VIPER activity matrix (internal signature)

    • figshare.com
    csv
    Updated Aug 28, 2025
    Cite
    Ester Calvo Fernandez (2025). Single-cell VIPER activity matrix (internal signature) [Dataset]. http://doi.org/10.6084/m9.figshare.30002785.v1
    Available download formats: csv
    Dataset updated
    Aug 28, 2025
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Ester Calvo Fernandez
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Single-cell protein activity was computed on the SCTransform-scaled and Anchor-Integrated gene expression signatures across metaCells by the metaVIPER function in the VIPER package (Bioconductor). Briefly, metaVIPER was developed as an adaptation of VIPER to single-cell data. Protein activity is inferred for a given gene expression signature using multiple networks which are integrated on a protein-by-protein basis using the square of the NES generated by each individual network. Since a non-relevant network would generate a protein activity score close to zero under the null model, networks that generate more extreme NES can be interpreted to more accurately match the given biological context and are thus weighted more heavily for each protein. VIPER-inferred protein activity was computed on the gene expression signatures of all the single cells using the gene expression cluster-based single-cell ARACNe networks, and on the gene expression signatures of the tumor compartment single cells using the six patient-specific tumor single-cell ARACNe networks. The VIPER matrix includes all significant Master Regulators (MR).
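One way to read the protein-by-protein integration described above is as a weighted average of the per-network NES values, with each network weighted by the square of its own NES. The sketch below illustrates that reading only; it is an assumption about the formula, not code from the VIPER package, and the function name and example scores are hypothetical.

```python
def integrate_nes(nes_values):
    """Combine per-network NES for one protein, weighting each by its squared NES."""
    weights = [nes * nes for nes in nes_values]
    total = sum(weights)
    if total == 0:
        # Every network returned a score of zero, so there is no evidence either way.
        return 0.0
    return sum(w * nes for w, nes in zip(weights, nes_values)) / total


# Hypothetical NES for one protein from three networks.
integrated = integrate_nes([0.2, -0.1, 4.0])
```

With these inputs the result lies close to 4.0, showing how a network that produces an extreme score dominates networks whose scores sit near zero.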
