100+ datasets found
  1. Processed, annotated, seurat object

    • data.niaid.nih.gov
    Updated Nov 16, 2023
    Cite
    Cenk Celik; Guillaume Thibault (2023). Processed, annotated, seurat object [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_7608211
    Dataset updated
    Nov 16, 2023
    Dataset provided by
    Nanyang Technological University
    Authors
    Cenk Celik; Guillaume Thibault
    Description

    The dataset contains an integrated, annotated Seurat v4 object. One can load the dataset into the R environment using the code below:

    seurat_obj <- readRDS('PATH/TO/DOWNLOAD/seurat.rds')

    The object has three assays: (I) RNA, (II) SCT and (III) integrated.

  2. Processed naive T cell single-cell RNA-seq, Seurat object

    • figshare.com
    application/gzip
    Updated Jan 5, 2021
    Cite
    Daniel Bunis (2021). Processed naive T cell single-cell RNA-seq, Seurat object [Dataset]. http://doi.org/10.6084/m9.figshare.11886891.v2
    Available download formats: application/gzip
    Dataset updated
    Jan 5, 2021
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Daniel Bunis
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Processed naive CD4 and CD8 T cell single-cell RNAseq data from human samples. The file contains a Seurat object stored as an .rds file which can be read into R with the readRDS() function. It was generated using the raw data of similar name in this project, as well as the code stored here: https://github.com/dtm2451/ProgressiveHematopoiesis

  3. Processed HSPCs single-cell RNA-seq, Seurat object

    • figshare.com
    application/gzip
    Updated Jan 5, 2021
    Cite
    Daniel Bunis (2021). Processed HSPCs single-cell RNA-seq, Seurat object [Dataset]. http://doi.org/10.6084/m9.figshare.11894691.v2
    Available download formats: application/gzip
    Dataset updated
    Jan 5, 2021
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Daniel Bunis
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Processed hematopoietic stem and progenitor cell (HSPC) single-cell RNAseq data from human samples. The file contains a Seurat object stored as an .rds file which can be read into R with the readRDS() function. It was generated using the raw data of similar name in this project, as well as the code stored here: https://github.com/dtm2451/ProgressiveHematopoiesis

  4. Dawnn benchmarking dataset: Simulated linear trajectories processing and...

    • rdr.ucl.ac.uk
    application/gzip
    Updated May 4, 2023
    + more versions
    Cite
    George Hall; Sergi Castellano Hereza (2023). Dawnn benchmarking dataset: Simulated linear trajectories processing and label simulation [Dataset]. http://doi.org/10.5522/04/22616611.v1
    Available download formats: application/gzip
    Dataset updated
    May 4, 2023
    Dataset provided by
    University College London
    Authors
    George Hall; Sergi Castellano Hereza
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    This project is a collection of files to allow users to reproduce the model development and benchmarking in "Dawnn: single-cell differential abundance with neural networks" (Hall and Castellano, under review). Dawnn is a tool for detecting differential abundance in single-cell RNAseq datasets. It is available as an R package here. Please contact us if you are unable to reproduce any of the analysis in our paper. The files in this collection correspond to the benchmarking dataset based on simulated linear trajectories.

    FILES: Data processing code

    adapted_traj_sim_milo_paper.R: Lightly adapted code from Dann et al. to simulate single-cell RNAseq datasets that form linear trajectories.
    generate_test_data_linear_traj_sim_milo_paper.R: R code to assign simulated labels to datasets generated from adapted_traj_sim_milo_paper.R. Seurat objects are saved as cells_sim_linear_traj_gex_seed_*.rds. Simulated labels are saved as benchmark_dataset_sim_linear_traj.csv.

    Resulting datasets

    cells_sim_linear_traj_gex_seed_*.rds: Seurat objects generated by generate_test_data_linear_traj_sim_milo_paper.R.
    benchmark_dataset_sim_linear_traj.csv: Cell labels generated by generate_test_data_linear_traj_sim_milo_paper.R.

  5. Human fetal retina FL scRNA-seq processed Seurat object

    • zenodo.org
    bin
    Updated Apr 16, 2025
    Cite
    Dominic WH Shayler; Kevin Stachelek; David Cobrinik (2025). Human fetal retina FL scRNA-seq processed Seurat object [Dataset]. http://doi.org/10.5281/zenodo.15231490
    Available download formats: bin
    Dataset updated
    Apr 16, 2025
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Dominic WH Shayler; Kevin Stachelek; David Cobrinik
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    Apr 16, 2025
    Description

    This processed Seurat object represents full-length single-cell RNA-sequencing data derived from human fetal retina. This dataset is associated with the eLife publication titled "Identification and characterization of early human photoreceptor states and cell-state-specific retinoblastoma-related features" (https://doi.org/10.7554/eLife.101918.1)

  6. Processed seurat object for CJRB-101 study

    • figshare.com
    Updated Dec 30, 2025
    + more versions
    Cite
    Hyunkyung Park (2025). Processed seurat object for CJRB-101 study [Dataset]. http://doi.org/10.6084/m9.figshare.30969202.v1
    Dataset updated
    Dec 30, 2025
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Hyunkyung Park
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This RDS file contains the processed Seurat object used for macrophage repolarization T cell analyses in the CJRB-101 study.

  7. Single cell T cell atlas

    • zenodo.org
    bin, csv
    Updated Jul 27, 2024
    + more versions
    Cite
    Kerry A Mullan (2024). Single cell T cell atlas [Dataset]. http://doi.org/10.5281/zenodo.12569981
    Available download formats: bin, csv
    Dataset updated
    Jul 27, 2024
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Kerry A Mullan
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description
    The attached dataset merges 12 high-quality single-cell T cell datasets that each had both TCR-seq and gene expression (GEx) data. The Seurat object (supercluster_added_ID-240531.rds) contains ~500K paired TCR-seq and GEx profiles. We also included the original identifiers in Sup_Update_labels.csv. See our documentation at https://stegor.readthedocs.io/en/latest/ for how we processed the 12 datasets and decided on the current 47 T cell annotation models using scGate.

    This is the accompanying data set for the paper entitled ‘T cell receptor-centric approach to streamline multimodal single-cell data analysis.’, which is currently available as a preprint (https://www.biorxiv.org/content/10.1101/2023.09.27.559702v2). Details on the origin of the datasets, and processing steps can be found there.

    The purpose of this atlas, in both its full and down-sampled versions, is to improve the interpretability of other T cell datasets. This can be done by adding either the down-sampled object, which contains up to 500 cells per annotation model, or all 12 datasets to your new sample. The dataset aims to improve the capacity to identify TCR-specific signatures by ensuring a well-covered background, which will improve the robustness of the FindMarkers function in the Seurat package.
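
    The down-sampling idea described above (keeping at most 500 cells per annotation model) can be sketched as follows. This is an illustrative Python sketch, not the authors' code. The function name, the label names, and the use of uniform random sampling with a fixed seed are all assumptions.

```python
import random
from collections import defaultdict


def downsample_per_label(labels, max_per_label=500, seed=0):
    """Return sorted cell indices, keeping at most max_per_label cells per label."""
    rng = random.Random(seed)  # fixed seed so the selection is reproducible
    by_label = defaultdict(list)
    for idx, label in enumerate(labels):
        by_label[label].append(idx)
    kept = []
    for indices in by_label.values():
        if len(indices) > max_per_label:
            # Sample without replacement from over-represented labels.
            indices = rng.sample(indices, max_per_label)
        kept.extend(indices)
    return sorted(kept)


# Hypothetical annotation labels: 1200 cells of one type, 300 of another.
labels = ["CD8_TEM"] * 1200 + ["CD4_Treg"] * 300
kept = downsample_per_label(labels, max_per_label=500)
print(len(kept))
```

    Labels with fewer cells than the cap are kept whole, so rare annotations are not diluted further.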

  8. Data from: Large-scale integration of single-cell transcriptomic data...

    • data.niaid.nih.gov
    zip
    Updated Dec 14, 2021
    Cite
    David McKellar; Iwijn De Vlaminck; Benjamin Cosgrove (2021). Large-scale integration of single-cell transcriptomic data captures transitional progenitor states in mouse skeletal muscle regeneration [Dataset]. http://doi.org/10.5061/dryad.t4b8gtj34
    Available download formats: zip
    Dataset updated
    Dec 14, 2021
    Dataset provided by
    Cornell University
    Authors
    David McKellar; Iwijn De Vlaminck; Benjamin Cosgrove
    License

    https://spdx.org/licenses/CC0-1.0.html

    Description

    Skeletal muscle repair is driven by the coordinated self-renewal and fusion of myogenic stem and progenitor cells. Single-cell gene expression analyses of myogenesis have been hampered by the poor sampling of rare and transient cell states that are critical for muscle repair, and do not inform the spatial context that is important for myogenic differentiation. Here, we demonstrate how large-scale integration of single-cell and spatial transcriptomic data can overcome these limitations. We created a single-cell transcriptomic dataset of mouse skeletal muscle by integration, consensus annotation, and analysis of 23 newly collected scRNAseq datasets and 88 publicly available single-cell (scRNAseq) and single-nucleus (snRNAseq) RNA-sequencing datasets. The resulting dataset includes more than 365,000 cells and spans a wide range of ages, injury, and repair conditions. Together, these data enabled identification of the predominant cell types in skeletal muscle, and resolved cell subtypes, including endothelial subtypes distinguished by vessel-type of origin, fibro/adipogenic progenitors defined by functional roles, and many distinct immune populations. The representation of different experimental conditions and the depth of transcriptome coverage enabled robust profiling of sparsely expressed genes. We built a densely sampled transcriptomic model of myogenesis, from stem cell quiescence to myofiber maturation and identified rare, transitional states of progenitor commitment and fusion that are poorly represented in individual datasets. We performed spatial RNA sequencing of mouse muscle at three time points after injury and used the integrated dataset as a reference to achieve a high-resolution, local deconvolution of cell subtypes. We also used the integrated dataset to explore ligand-receptor co-expression patterns and identify dynamic cell-cell interactions in muscle injury response. 
    We provide a public web tool to enable interactive exploration and visualization of the data. Our work supports the utility of large-scale integration of single-cell transcriptomic data as a tool for biological discovery.

    Methods

    Mice. The Cornell University Institutional Animal Care and Use Committee (IACUC) approved all animal protocols, and experiments were performed in compliance with its institutional guidelines. Adult C57BL/6J mice (Mus musculus) were obtained from Jackson Laboratories (#000664; Bar Harbor, ME) and were used at 4-7 months of age. Aged C57BL/6J mice were obtained from the National Institute of Aging (NIA) Rodent Aging Colony and were used at 20 months of age. For new scRNAseq experiments, female mice were used in each experiment.

    Mouse injuries and single-cell isolation. To induce muscle injury, both tibialis anterior (TA) muscles of old (20 months) C57BL/6J mice were injected with 10 µl of notexin (10 µg/ml; Latoxan; France). At 0, 1, 2, 3.5, 5, or 7 days post-injury (dpi), mice were sacrificed and TA muscles were collected and processed independently to generate single-cell suspensions. Muscles were digested with 8 mg/ml Collagenase D (Roche; Switzerland) and 10 U/ml Dispase II (Roche; Switzerland), followed by manual dissociation to generate cell suspensions. Cell suspensions were sequentially filtered through 100 and 40 μm filters (Corning Cellgro #431752 and #431750) to remove debris. Erythrocytes were removed through incubation in erythrocyte lysis buffer (IBI Scientific #89135-030).

    Single-cell RNA-sequencing library preparation. After digestion, single-cell suspensions were washed and resuspended in 0.04% BSA in PBS at a concentration of 10^6 cells/ml. Cells were counted manually with a hemocytometer to determine their concentration. Single-cell RNA-sequencing libraries were prepared using the Chromium Single Cell 3’ reagent kit v3 (10x Genomics, PN-1000075; Pleasanton, CA) following the manufacturer’s protocol. Cells were diluted into the Chromium Single Cell A Chip to yield a recovery of 6,000 single-cell transcriptomes. After preparation, libraries were sequenced on a NextSeq 500 (Illumina; San Diego, CA) using 75 cycle high output kits (Index 1 = 8, Read 1 = 26, and Read 2 = 58). Details on estimated sequencing saturation and the number of reads per sample are shown in Sup. Data 1.

    Spatial RNA sequencing library preparation. Tibialis anterior muscles of adult (5 mo) C57BL/6J mice were injected with 10 µl notexin (10 µg/ml) at 2, 5, and 7 days prior to collection. Upon collection, tibialis anterior muscles were isolated, embedded in OCT, and frozen fresh in liquid nitrogen. Spatially tagged cDNA libraries were built using the Visium Spatial Gene Expression 3’ Library Construction v1 Kit (10x Genomics, PN-1000187; Pleasanton, CA) (Fig. S7). Optimal tissue permeabilization time for 10 µm thick sections was found to be 15 minutes using the 10x Genomics Visium Tissue Optimization Kit (PN-1000193). H&E stained tissue sections were imaged using a Zeiss PALM MicroBeam laser capture microdissection system and the images were stitched and processed using Fiji ImageJ software. cDNA libraries were sequenced on an Illumina NextSeq 500 using 150 cycle high output kits (Read 1=28bp, Read 2=120bp, Index 1=10bp, and Index 2=10bp). Frames around the capture area on the Visium slide were aligned manually and spots covering the tissue were selected using Loop Browser v4.0.0 software (10x Genomics). Sequencing data was then aligned to the mouse reference genome (mm10) using the spaceranger v1.0.0 pipeline to generate a feature-by-spot-barcode expression matrix (10x Genomics).

    Download and alignment of single-cell RNA sequencing data. For all samples available via SRA, parallel-fastq-dump (github.com/rvalieris/parallel-fastq-dump) was used to download raw .fastq files. Samples which were only available as .bam files were converted to .fastq format using bamtofastq from 10x Genomics (github.com/10XGenomics/bamtofastq). Raw reads were aligned to the mm10 reference using cellranger (v3.1.0).

    Preprocessing and batch correction of single-cell RNA sequencing datasets. First, ambient RNA signal was removed using the default SoupX (v1.4.5) workflow (autoEstCounts and adjustCounts; github.com/constantAmateur/SoupX). Samples were then preprocessed using the standard Seurat (v3.2.1) workflow (NormalizeData, ScaleData, FindVariableFeatures, RunPCA, FindNeighbors, FindClusters, and RunUMAP; github.com/satijalab/seurat). Cells with fewer than 750 features, fewer than 1000 transcripts, or more than 30% of unique transcripts derived from mitochondrial genes were removed. After preprocessing, DoubletFinder (v2.0) was used to identify putative doublets in each dataset individually. BCmvn optimization was used for pK parameterization. Estimated doublet rates were computed by fitting the total number of cells after quality filtering to a linear regression of the expected doublet rates published in the 10x Chromium handbook. Estimated homotypic doublet rates were also accounted for using the modelHomotypic function. The default pN value (0.25) was used. Putative doublets were then removed from each individual dataset. After preprocessing and quality filtering, we merged the datasets and performed batch correction with three tools independently: Harmony (github.com/immunogenomics/harmony) (v1.0), Scanorama (github.com/brianhie/scanorama) (v1.3), and BBKNN (github.com/Teichlab/bbknn) (v1.3.12). We then used Seurat to process the integrated data. After initial integration, we removed the noisy cluster and re-integrated the data using each of the three batch-correction tools.
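
    The quality filter stated above (remove cells with fewer than 750 features, fewer than 1000 transcripts, or more than 30% mitochondrial transcripts) can be written as a simple predicate. The sketch below is in Python purely to illustrate the thresholds. The function name, field names, and example cells are hypothetical and are not taken from the authors' Seurat code.

```python
def passes_qc(n_features, n_transcripts, pct_mito,
              min_features=750, min_transcripts=1000, max_pct_mito=30.0):
    """Return True if a cell survives the three thresholds quoted in the text."""
    return (
        n_features >= min_features          # "fewer than 750 features" is removed
        and n_transcripts >= min_transcripts  # "fewer than 1000 transcripts" is removed
        and pct_mito <= max_pct_mito        # "more than 30%" mitochondrial is removed
    )


# Hypothetical per-cell summaries.
cells = [
    {"id": "cell_a", "n_features": 2000, "n_transcripts": 5000, "pct_mito": 5.0},
    {"id": "cell_b", "n_features": 600, "n_transcripts": 5000, "pct_mito": 5.0},
    {"id": "cell_c", "n_features": 2000, "n_transcripts": 900, "pct_mito": 5.0},
    {"id": "cell_d", "n_features": 2000, "n_transcripts": 5000, "pct_mito": 45.0},
]
kept_ids = [
    c["id"] for c in cells
    if passes_qc(c["n_features"], c["n_transcripts"], c["pct_mito"])
]
print(kept_ids)
```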

    Cell type annotation. Cell types were determined for each integration method independently. For Harmony and Scanorama, dimensions accounting for 95% of the total variance were used to generate SNN graphs (Seurat::FindNeighbors). Louvain clustering was then performed on the output graphs (including the corrected graph output by BBKNN) using Seurat::FindClusters. A clustering resolution of 1.2 was used for Harmony (25 initial clusters), BBKNN (28 initial clusters), and Scanorama (38 initial clusters). Cell types were determined based on expression of canonical genes (Fig. S3). Clusters which had similar canonical marker gene expression patterns were merged.
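
    One way to read "dimensions accounting for 95% of the total variance" is as the smallest number of leading components whose cumulative variance reaches that fraction. The Python sketch below illustrates that reading. It assumes the input is the per-component standard deviations and that "total variance" means the sum over the components that were computed. The function name is hypothetical, and the authors' exact procedure may differ.

```python
def n_dims_for_variance(stdevs, fraction=0.95):
    """Smallest number of leading components whose variance share reaches `fraction`."""
    variances = [s * s for s in stdevs]  # variance is the squared standard deviation
    total = sum(variances)
    cumulative = 0.0
    for n_dims, variance in enumerate(variances, start=1):
        cumulative += variance
        if cumulative / total >= fraction:
            return n_dims
    return len(variances)


# Hypothetical standard deviations for four components:
# variances are 16, 4, 1, 1 (total 22), so three components give 21/22.
print(n_dims_for_variance([4.0, 2.0, 1.0, 1.0], fraction=0.95))
```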

    Pseudotime workflow. Cells were subset based on the consensus cell types between all three integration methods. Harmony embedding values from the dimensions accounting for 95% of the total variance were used for further dimensional reduction with PHATE, using phateR (v1.0.4) (github.com/KrishnaswamyLab/phateR).

    Deconvolution of spatial RNA sequencing spots. Spot deconvolution was performed using the deconvolution module in BayesPrism (previously known as “Tumor microEnvironment Deconvolution”, TED, v1.0; github.com/Danko-Lab/TED). First, myogenic cells were re-labeled, according to binning along the first PHATE dimension, as “Quiescent MuSCs” (bins 4-5), “Activated MuSCs” (bins 6-7), “Committed Myoblasts” (bins 8-10), and “Fusing Myoctes” (bins 11-18). Culture-associated muscle stem cells were ignored and myonuclei labels were retained as “Myonuclei (Type IIb)” and “Myonuclei (Type IIx)”. Next, highly and differentially expressed genes across the 25 groups of cells were identified with differential gene expression analysis using Seurat (FindAllMarkers, using Wilcoxon Rank Sum Test; results in Sup. Data 2). The resulting genes were filtered based on average log2-fold change (avg_logFC > 1) and the percentage of cells within the cluster which express each gene (pct.expressed > 0.5), yielding 1,069 genes. Mitochondrial and ribosomal protein genes were also removed from this list, in line with recommendations in the BayesPrism vignette. For each of the cell types, mean raw counts were calculated across the 1,069 genes to generate a gene expression profile for BayesPrism. Raw counts for each spot were then passed to the run.Ted function, using

  9. Seurat Objects for CD27 Agonism Enhances Long-Lived CD4 T Cell Vaccine...

    • zenodo.org
    bin
    Updated Nov 12, 2025
    Cite
    Zachary Hartman (2025). Seurat Objects for CD27 Agonism Enhances Long-Lived CD4 T Cell Vaccine Responses Critical for Anti-Tumor Immunity [Dataset]. http://doi.org/10.5281/zenodo.17592233
    Available download formats: bin
    Dataset updated
    Nov 12, 2025
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Zachary Hartman
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    Nov 12, 2025
    Description

    Raw and processed Seurat Data objects for scRNA-seq analysis presented in CD27 Agonism Enhances Long-Lived CD4 T Cell Vaccine Responses Critical for Anti-Tumor Immunity

    Objects

    combined_seurat_benchmark_with_mito.rds - Raw Seurat object

    aCD27_final_pure_seurat.rds - Processed Seurat object

  10. Transcription start site analysis for heterogenous CD4+ T cells using 5′...

    • search.dataone.org
    Updated Jul 30, 2025
    + more versions
    Cite
    Akiko Oguchi; Yasuhiro Murakawa (2025). Transcription start site analysis for heterogenous CD4+ T cells using 5′ scRNA-seq [Dataset]. http://doi.org/10.5061/dryad.gtht76hv9
    Dataset updated
    Jul 30, 2025
    Dataset provided by
    Dryad Digital Repository
    Authors
    Akiko Oguchi; Yasuhiro Murakawa
    Description

    These datasets are generated by ReapTEC (read-level pre-filtering and transcribed enhancer call) using 5′ single-cell RNA-seq data on human heterogenous CD4+ T cells. By taking advantage of a unique “cap signature” derived from the 5′-end of a transcript, ReapTEC simultaneously profiles gene expression and enhancer activity at nucleotide resolution using 5′-end single-cell RNA-sequencing (5′ scRNA-seq). The ReapTEC pipeline is described in detail at https://github.com/MurakawaLab/ReapTEC.

    README: Transcription start site analysis for heterogenous CD4+ T cells using 5′ scRNA-seq

    https://doi.org/10.5061/dryad.gtht76hv9

    Description of the data and file structure

    Data_summary.xlsx.zip: Summary of single-cell experiments in this study.

    5scCTSSbed_All.zip: There are 102 files containing count data for analyzing transcription start site (TSS) signals. Details are as follows.

    Our original raw sequencing data and processed data of 5′ scRNA-seq have been deposited to National Bioscience Database Center (NBDC) Human Database (accession code: hum0350). Raw sequencing data originated from human subjects have been deposited to Japanese Genotype-phenotype Archive (JGA, accession code: JGAS000689). We retrieved 5′ scRNA-seq data for human memory CD4+ T cells stimulated with viral antigens from the Gene Expression Omnibus database (accession number GSE152522). In total, 102 5′ scRNA-seq datasets were processed by ReapTEC pipeline (https://github.com/MurakawaLab/ReapTEC)....

  11. Seurat objects for multiome analysis of neuroblastoma cell lines - 4/4

    • data.mendeley.com
    Updated Jul 25, 2024
    + more versions
    Cite
    Richard Guyer (2024). Seurat objects for multiome analysis of neuroblastoma cell lines - 4/4 [Dataset]. http://doi.org/10.17632/cp4d7t74vb.1
    Dataset updated
    Jul 25, 2024
    Authors
    Richard Guyer
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    RDS files containing processed Seurat objects for multiome analysis of neuroblastoma cell lines. File names reflect the cell line.

  12. HuPSA and MoPSA raw data in Seurat V5 format

    • datasetcatalog.nlm.nih.gov
    Updated Dec 9, 2024
    Cite
    Cheng, Siyuan (2024). HuPSA and MoPSA raw data in Seurat V5 format [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0001378286
    Dataset updated
    Dec 9, 2024
    Authors
    Cheng, Siyuan
    Description

    These are the raw data for HuPSA and MoPSA scRNAseq datasets. Both RDS files can be loaded into R and processed through the Seurat package. https://doi.org/10.1038/s41698-024-00667-x

  13. scRNA-seq_huang2019

    • dataverse.harvard.edu
    Updated Aug 21, 2019
    Cite
    Kee Wui Huang (2019). scRNA-seq_huang2019 [Dataset]. http://doi.org/10.7910/DVN/QB5CC8
    Dataset updated
    Aug 21, 2019
    Dataset provided by
    Harvard Dataverse
    Authors
    Kee Wui Huang
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    Serialized R data files (.rds) associated with the inDrop single-cell RNA-seq analysis in Huang et al., 2019. Each file has a single Seurat object containing a subset of clusters from the full processed dataset, which were separated into different objects due to file size limitations. Raw data (UMIFM counts) are included in the corresponding slot in each Seurat object. Seurat objects can be re-merged into a single object containing the full dataset using the MergeSeurat function.

  14. Annotated Seurat object of RFC1-KO zebrafish brains at 2dpf and 4dpf

    • zenodo.org
    bin
    Updated Sep 16, 2025
    Cite
    Sebastien Audet; Fanny Nobilleau; Martine Tetreault; ERIC SAMARUT (2025). Annotated Seurat object of RFC1-KO zebrafish brains at 2dpf and 4dpf [Dataset]. http://doi.org/10.5281/zenodo.15499729
    Available download formats: bin
    Dataset updated
    Sep 16, 2025
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Sebastien Audet; Fanny Nobilleau; Martine Tetreault; ERIC SAMARUT
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Annotated Seurat objects (RDS) of single-cell data from RFC1-KO zebrafish brains at 2 days and 4 days post-fertilization. These data are published to complement the publication "RFC1 regulates the expansion of neural progenitors in the developing zebrafish cerebellum" (Nobilleau et al.) and the raw data available in PRJNA1126282. Non-exhaustive R processing code (in Markdown) used to generate the data is also made available. This enables the regeneration of most of the presented figures, as well as additional analysis. More information on the generation of these data is available in the associated publication.

  15. Processed CITE-seq Seurat object for MPET

    • figshare.com
    application/gzip
    Updated Oct 24, 2025
    Cite
    Rekha Mudappathi (2025). Processed CITE-seq Seurat object for MPET [Dataset]. http://doi.org/10.6084/m9.figshare.30434290.v1
    Available download formats: application/gzip
    Dataset updated
    Oct 24, 2025
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Rekha Mudappathi
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset includes normalized RNA and ADT expression matrices, metadata (sample, group, donor information), and precomputed features used in the MPET (Modeling Protein Expression and Transport) framework.

  16. Bassez et al. (2021) Breast Cancer processed dataset

    • figshare.com
    application/gzip
    Updated Dec 20, 2023
    Cite
    Josep Garnica (2023). Bassez et al. (2021) Breast Cancer processed dataset [Dataset]. http://doi.org/10.6084/m9.figshare.24867018.v3
    Available download formats: application/gzip
    Dataset updated
    Dec 20, 2023
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Josep Garnica
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Processed Seurat objects in .rds format with single-cell dataset obtained from Bassez et al. (2021) Nature Medicine (https://www.nature.com/articles/s41591-021-01323-8).

    BassezA_2021_33958794_downsampled: Seurat object including all samples (42) with random downsampling to max 1000 cells per sample, without any further filtering.
    BassezA_2021_33958794_3patients: Seurat object with 3 patient samples from the original data set: "BIOKEY_11", "BIOKEY_30", and "BIOKEY_4" from "Pre" condition, without downsampling or further per sample filtering.

  17. snRNA-seq, Primary-Recurrent GBM (Mikolajewicz Cohort)

    • figshare.com
    bin
    Updated Jun 4, 2024
    Cite
    Nicholas Mikolajewicz (2024). snRNA-seq, Primary-Recurrent GBM (Mikolajewicz Cohort) [Dataset]. http://doi.org/10.6084/m9.figshare.25917628.v1
    Available download formats: bin
    Dataset updated
    Jun 4, 2024
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Nicholas Mikolajewicz
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Summary. 10 primary GBM and 8 recurrent GBM samples (14/18 matched) profiled using single-nucleus RNA-sequencing (sci-RNA-seq3 protocol).

    Data format. Data are provided as a preprocessed dataset, stored in a Seurat object.

    Sample processing, sci-RNA-seq3 library generation, and sequencing. Snap-frozen patient pGBM and rGBM tissues were chopped with a razor blade or scissors before nucleus isolation. Nuclei extraction and fixation were performed as previously described (Cao 2019), except for the use of a modified CST lysis buffer [50] plus 1% of SUPERase-In RNase Inhibitor (Invitrogen, #AM2696). Lysis time and washing steps were further optimized based on human GBM tissue. Nuclei quality was checked with DAPI and Wheat Germ Agglutinin (WGA) staining. Sci-RNA-seq3 libraries were generated as previously described [49] using three-level combinatorial indexing. The final libraries were sequenced on Illumina NovaSeq as follows: read 1: 34bp, read 2: >=69bp, index 1: 10bp, index 2: 10bp.

    Demultiplexing and read alignments. Raw sequencing reads were first demultiplexed based on i5/i7 PCR barcodes. FASTQ files were then processed using the sci-RNA-Seq3 pipeline. After barcodes and unique molecular identifiers (UMIs) were extracted from read 1 of the FASTQ files, read alignment was performed using the STAR short-read aligner (v2.5.2b) with the human genome (hg19) and Gencode v24 gene annotations. After removing duplicate reads based on UMI, barcode, chromosome and alignment position, reads were summarized into a count matrix of M genes × N nuclei.

    Filtering, normalization, integration, and dimensional reduction. Raw count matrices were loaded into a Seurat object (version 4.0.1) and filtered to retain cells with (i) 200 – 9000 recovered genes per cell, (ii) less than 60% mitochondrial content, and (iii) unmatched rate within 3 median absolute deviations of the median. To normalize the count matrix, we adopted the modeling framework previously described and implemented in sctransform (R package, version 0.3.2). In brief, count data were modelled by regularized negative binomial regression, using sequencing depth as a model covariate to regress out the influence of technical effects, and Pearson residuals were used as the normalized and variance-stabilized biological signal for downstream analysis. Data from each patient were integrated with the reciprocal PCA method (Seurat) using the top 2000 variable features. PCA was performed on the integrated dataset, and the top N components that accounted for 90% of the observed variance were used for UMAP embedding, RunUMAP(max_components = 2, n_neighbours = 50, min_dist = 01, metric = cosine).

    Contact. Contact Dr. Nicholas Mikolajewicz regarding any questions about the data or analysis (n.mikolajewicz@utoronto.ca)
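    The filtering thresholds and the 90% variance rule described above can be sketched in base R. This is a minimal illustration on toy numbers, not the authors' code: the column names (n_genes, pct_mito, unmatched_rate) and both helper functions are hypothetical, and the use of stats::mad() with its default scaling constant is an assumption, since the description does not say how the median absolute deviation was computed.

```r
# Toy per-nucleus QC table; the column names are hypothetical.
qc <- data.frame(
  n_genes        = c(150, 500, 3000, 9500, 1200, 800),
  pct_mito       = c(5, 10, 70, 8, 12, 3),
  unmatched_rate = c(0.10, 0.11, 0.12, 0.10, 0.50, 0.09)
)

# TRUE where x lies within k median absolute deviations of the median.
# Assumption: stats::mad() with its default 1.4826 scaling constant.
within_mad <- function(x, k = 3) {
  abs(x - median(x)) <= k * mad(x)
}

# Apply the three criteria from the description.
keep <- qc$n_genes >= 200 & qc$n_genes <= 9000 &
  qc$pct_mito < 60 &
  within_mad(qc$unmatched_rate)
qc_filtered <- qc[keep, ]

# Smallest number of leading components whose cumulative share of the
# variance reaches `target`; `sdev` holds per-component standard
# deviations, e.g. the sdev element returned by prcomp().
n_pcs_for_variance <- function(sdev, target = 0.9) {
  cum_var <- cumsum(sdev^2) / sum(sdev^2)
  which(cum_var >= target)[1]
}
n_pcs <- n_pcs_for_variance(c(3, 2, 1, 1))

print(qc_filtered)
print(n_pcs)
```

    Note that the variance share here is computed relative to the components supplied, so the result depends on how many components were calculated in the first place.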

  18. CPA-Perturb-seq: Multiplexed single-cell characterization of alternative...

    • zenodo.org
    application/gzip, bin
    Updated Feb 13, 2023
    + more versions
    Cite
    Madeline H Kowalski; Hans-Hermann Wessels; Johannes Linder; Saket Choudhary; Austin Hartman; Yuhan Hao; Isabella Mascio; Carol Dalgarno; Anshul Kundaje; Rahul Satija (2023). CPA-Perturb-seq: Multiplexed single-cell characterization of alternative polyadenylation regulators (Perturb-seq data) [Dataset]. http://doi.org/10.5281/zenodo.7619593
    Explore at:
    Available download formats: bin, application/gzip
    Dataset updated
    Feb 13, 2023
    Dataset provided by
    Zenodo http://zenodo.org/
    Authors
    Madeline H Kowalski; Hans-Hermann Wessels; Johannes Linder; Saket Choudhary; Austin Hartman; Yuhan Hao; Isabella Mascio; Carol Dalgarno; Anshul Kundaje; Rahul Satija
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This site provides access to datasets from the CPA-Perturb-seq manuscript Kowalski*, Wessels*, Linder* et al., including processed Perturb-seq datasets from HEK293FT and K562. We release these data as Seurat objects, where each object contains single-cell quantifications of gene expression (RNA assay), and in addition, quantifications of polyA site usage (polyA site assay). To explore these data, please install the PASTA (PolyA Site analysis using relative Transcript Abundance) package, which provides infrastructure and analytical tools to explore alternative polyadenylation at single-cell resolution. For each dataset, we also include a fragment file which enables visualization of read coverage plots across groups of cells.

    The files include:

    1. CPA_HEK293FT.Rds: Seurat object containing the HEK293 CPA-Perturb-seq dataset

    2. CPA_HEK293FT_fragments.tsv.gz : Fragment file for the HEK293 dataset

    3. CPA_HEK293FT_fragments.tsv.gz.tbi : Fragment file index for the HEK293 dataset

    4. CPA_K562.Rds : Seurat object containing the K562 CPA-Perturb-seq dataset

    5. CPA_K562_fragments.tsv.gz : Fragment file for the K562 dataset

    6. CPA_K562_fragments.tsv.gz.tbi : Fragment file index for the K562 dataset

    R code below:

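    library(Seurat)  # provides Cells()
    library(Signac)  # provides Fragments() and CreateFragmentObject()
    library(PASTA)
    
    # load the Seurat object for the HEK293FT dataset
    hek <- readRDS("CPA_HEK293FT.Rds")
    
    # remove fragment file information stored in the object
    Fragments(hek) <- NULL
    # update the path of the fragment file to the local download location
    Fragments(hek) <- CreateFragmentObject(path = "download/CPA_HEK293FT_fragments.tsv.gz", cells = Cells(hek))
    
    # visualize polyA site usage
    PolyACoveragePlot(hek, region = "7-26212195-26213351")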
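
    Before updating the fragment path, it can help to confirm that the object, the fragment file and its .tbi index were all downloaded to the same directory. The sketch below uses base R only; missing_files() is a hypothetical helper, and the requirement that the index sits next to the fragment file is an assumption based on how tabix-indexed files are normally read.

```r
# Files listed above for the HEK293FT dataset.
expected_files <- c(
  "CPA_HEK293FT.Rds",
  "CPA_HEK293FT_fragments.tsv.gz",
  "CPA_HEK293FT_fragments.tsv.gz.tbi"
)

# Hypothetical helper: return the expected files that are absent from `dir`.
missing_files <- function(dir, files) {
  files[!file.exists(file.path(dir, files))]
}

# Demonstration in a temporary directory with the index file left out.
demo_dir <- file.path(tempdir(), "cpa_demo")
dir.create(demo_dir, showWarnings = FALSE)
invisible(file.create(file.path(demo_dir, expected_files[1:2])))

absent <- missing_files(demo_dir, expected_files)
print(absent)
```

    In practice, `demo_dir` would be replaced by the real download directory (for example "download", as in the snippet above).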

  19. Processed data of single cell RNA-sequencing of 16 NPM1-mutated Acute...

    • figshare.com
    bin
    Updated Jun 16, 2025
    Cite
    Emin Onur Karakaslar (2025). Processed data of single cell RNA-sequencing of 16 NPM1-mutated Acute Myeloid Leukemia samples [Dataset]. http://doi.org/10.6084/m9.figshare.26189771.v1
    Explore at:
    Available download formats: bin
    Dataset updated
    Jun 16, 2025
    Dataset provided by
    Figshare http://figshare.com/
    Authors
    Emin Onur Karakaslar
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    TLDR. Seurat object of the 16 NPM1-mutated AML samples (n = 83,162 cells).

    AML samples. All sixteen peripheral blood and bone marrow samples were obtained from patients with AML at diagnosis (n=15) or relapse after chemotherapy (n=1) with written informed consent according to the Declaration of Helsinki. Mononuclear cells were isolated by Ficoll-Isopaque density gradient centrifugation and cryopreserved in the Leiden University Medical Center (LUMC) Biobank for Hematological Diseases after approval by the LUMC Institutional Review Board (protocol no. B18.047).

    Upstream processing pipeline. CellRanger v7.0.0 was run on all samples with the human reference genome hg38. Seurat v4 was used for all QC [15]. Our QC pipeline had three steps per sample: 1) soft filtering, 2) low-quality cluster removal, and 3) doublet detection. In soft filtering, Seurat objects were created with cells expressing at least 200 genes and with genes expressed in at least 3 cells. Then, the standard Seurat command list with default parameters was run to detect low-quality clusters. Clusters with >15% mitochondrial and 15% mitochondrial mRNA. We used standard Seurat commands to scale and normalize the data on integrated features. The first 30 principal components were used to create UMAP plots. We used clustree to determine the optimal cluster number, based on FindClusters with resolutions sweeping from 0 to 1.2. We chose res=0.5, as clusters became stable. Next, we merged two clusters (CC5 and CC12) into one GMP-like cluster, as one of these clusters (CC12) had high expression of HSP genes yet still retained its cell-type-specific properties.

    Note: The file was processed with Seurat v4 but the object is updated for v5. Uploaded in .qs file format for faster reading. To read the file: qs::qread("path/to/data.qs")

    This data is available for research use only and cannot be used for commercial purposes.

    For further queries please refer to our paper:
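
    The soft-filtering step described above (keep cells expressing a minimum number of genes, and genes expressed in a minimum number of cells) can be sketched on a plain count matrix in base R. This is an illustration only: soft_filter() is a hypothetical helper, the order of the two filters (cells first, then genes) is an assumption, and the toy call uses small thresholds so the effect is visible on a 4 × 5 matrix. The defaults mirror the values stated in the description (200 genes, 3 cells).

```r
# Toy genes x cells count matrix.
counts <- matrix(
  c(1, 0, 2, 1, 0,
    0, 0, 1, 3, 0,
    5, 0, 1, 1, 1,
    0, 1, 0, 0, 0),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("g", 1:4), paste0("c", 1:5))
)

# Hypothetical helper: drop cells with too few detected genes, then drop
# genes detected in too few of the remaining cells.
soft_filter <- function(counts, min_features = 200, min_cells = 3) {
  genes_per_cell <- colSums(counts > 0)
  counts <- counts[, genes_per_cell >= min_features, drop = FALSE]
  cells_per_gene <- rowSums(counts > 0)
  counts[cells_per_gene >= min_cells, , drop = FALSE]
}

# Small thresholds for the toy matrix only.
filtered <- soft_filter(counts, min_features = 2, min_cells = 3)
print(filtered)
```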

  20. Spatial Transcriptomics (10X Xenium) Data From Early Postnatal Lung...

    • zenodo.org
    csv, zip
    Updated Oct 18, 2025
    Cite
    Tristan FRUM; Jason Spence (2025). Spatial Transcriptomics (10X Xenium) Data From Early Postnatal Lung Specimens [Dataset]. http://doi.org/10.5281/zenodo.17155546
    Explore at:
    Available download formats: csv, zip
    Dataset updated
    Oct 18, 2025
    Dataset provided by
    Zenodo http://zenodo.org/
    Authors
    Tristan FRUM; Jason Spence
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Clinical interventions and inflammatory signaling shape the transcriptional and cellular architecture of the early postnatal lung

    Spatial Transcriptomics was performed using the 10X Xenium Platform with a 480 custom-designed probe set on 1 tissue section from 5 distinct early postnatal lung specimens. CSV files contain cell type identities as determined by label transfer.

    .zip files should be unzipped to the same directory and can be viewed with Xenium Explorer.

    .csv files contain cell type annotations as determined by label transfer to hand annotated single nuclei RNA-sequencing data from early postnatal lung. They can be added as a custom cell group in Xenium Explorer.

    Code used in analysis of this data is available at: http://github.com/jason-spence-lab/Frum-et-al.-2025a.git

    METHODS
    Tissue Preparation for Xenium Spatial Transcriptomics Analysis

    Xenium slides were removed from -20°C storage, allowed to come to room temperature for 30 minutes, then placed on a 42°C slide warmer and coated with DNAse/RNAse-free water (Corning, Cat# 46000CM). Small sections from multiple specimens were carefully placed within the sample placement area. Most of the water was removed once the sections had completely flattened. Slides dried on the slide warmer for three hours before transport to the Advanced Genomics Core. Xenium slides were processed by the Advanced Genomics Core using the Xenium In Situ Gene Expression with Cell Segmentation workflow (10X, #CG000749).

    Xenium Data Analysis
    Preprocessing/QC Filtering
    Centroid and segmentation coordinates and gene expression counts were determined by Xenium Onboard Analysis v4.0 and imported into R using Seurat::ReadXenium(). Gene expression counts were converted to a Seurat object using Seurat::CreateSeuratObject(). Coordinates for centroids and segmentations were first converted into a field of view using Seurat::CreateFOV() and then appended to the Seurat object. Segmentations with fewer than 25 gene expression counts were excluded from the analysis.
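
    The count threshold at the end of this step can be illustrated on a plain matrix in base R. This is a sketch on toy data rather than the code used for the dataset; the matrix, its names, and the variable min_counts are all hypothetical.

```r
# Toy probes x segmentations count matrix (columns are segmented cells).
xen_counts <- matrix(
  c(10, 10, 10,
     5,  5,  0,
    20,  5,  0,
    24,  0,  0),
  nrow = 3,
  dimnames = list(paste0("probe", 1:3), paste0("seg", 1:4))
)

# Exclude segmentations with fewer than 25 total counts.
min_counts <- 25
total_counts <- colSums(xen_counts)
xen_kept <- xen_counts[, total_counts >= min_counts, drop = FALSE]

print(total_counts)
print(colnames(xen_kept))
```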

    Label Transfer
    To align low-complexity 480 probe Xenium data with higher complexity snRNA-seq data the reference data was transformed using Seurat::SCTransform() with 3000 variable features. Each specimen was processed individually, also undergoing SCTransformation using 250 variable features. Any Xenium probes expressed in over 95% of cells were excluded from analysis. Anchors between each specimen and the snRNA-seq reference were calculated using FindTransferAnchors() using the SCT assay of both datasets, 20 dimensions, k.filter = 200, and considering only the variable features from the Xenium specimen. Cell type annotations from the snRNA-seq data were then transferred to the Xenium specimen using TransferData(), with anchors weighted by the PCs of the Xenium specimen.
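
    The probe exclusion rule mentioned above (drop probes expressed in over 95% of cells) can be sketched in base R. The toy matrix and names are hypothetical, and treating a probe as "expressed" when its count is greater than zero is an assumption, since the description does not define the criterion.

```r
# Toy probes x cells count matrix with 10 cells.
probe_counts <- rbind(
  probeA = rep(1, 10),          # detected in 10/10 cells
  probeB = c(rep(2, 9), 0),     # detected in 9/10 cells
  probeC = c(1, 1, rep(0, 8))   # detected in 2/10 cells
)

# Fraction of cells in which each probe is detected (count > 0).
frac_expressing <- rowMeans(probe_counts > 0)

# Keep probes detected in at most 95% of cells.
probes_kept <- probe_counts[frac_expressing <= 0.95, , drop = FALSE]

print(frac_expressing)
print(rownames(probes_kept))
```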
