100+ datasets found

u
Data from: Reference transcriptomics of porcine peripheral immune cells...
agdatacommons.nal.usda.gov
datasets.ai
+2more
zip
Updated Nov 21, 2025
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Juber Herrera-Uribe; Jayne Wiarda; Sathesh K. Sivasankaran; Lance Daharsh; Haibo Liu; Kristen A. Byrne; Timothy P. L. Smith; Joan K. Lunney; Crystal L. Loving; Christopher K. Tuggle (2025). Data from: Reference transcriptomics of porcine peripheral immune cells created through bulk and single-cell RNA sequencing [Dataset]. http://doi.org/10.15482/USDA.ADC/1522411
Explore at:
zipAvailable download formats
Unique identifier
https://doi.org/10.15482/USDA.ADC/1522411
Dataset updated
Nov 21, 2025
Dataset provided by
Ag Data Commons
Authors
Juber Herrera-Uribe; Jayne Wiarda; Sathesh K. Sivasankaran; Lance Daharsh; Haibo Liu; Kristen A. Byrne; Timothy P. L. Smith; Joan K. Lunney; Crystal L. Loving; Christopher K. Tuggle
License
Attribution-ShareAlike 4.0 (CC BY-SA 4.0)https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
Description
This dataset contains files reconstructing single-cell data presented in 'Reference transcriptomics of porcine peripheral immune cells created through bulk and single-cell RNA sequencing' by Herrera-Uribe & Wiarda et al. 2021. Samples of peripheral blood mononuclear cells (PBMCs) were collected from seven pigs and processed for single-cell RNA sequencing (scRNA-seq) in order to provide a reference annotation of porcine immune cell transcriptomics at enhanced, single-cell resolution. Analysis of single-cell data allowed identification of 36 cell clusters that were further classified into 13 cell types, including monocytes, dendritic cells, B cells, antibody-secreting cells, numerous populations of T cells, NK cells, and erythrocytes. Files may be used to reconstruct the data as presented in the manuscript, allowing for individual query by other users. Scripts for original data analysis are available at https://github.com/USDA-FSEPRU/PorcinePBMCs_bulkRNAseq_scRNAseq. Raw data are available at https://www.ebi.ac.uk/ena/browser/view/PRJEB43826. Funding for this dataset was also provided by NRSP8: National Animal Genome Research Program (https://www.nimss.org/projects/view/mrp/outline/18464). Resources in this dataset:Resource Title: Herrera-Uribe & Wiarda et al. PBMCs - All Cells 10X Format. File Name: PBMC7_AllCells.zipResource Description: Zipped folder containing PBMC counts matrix, gene names, and cell IDs. Files are as follows:

matrix of gene counts* (matrix.mtx.gx) gene names (features.tsv.gz) cell IDs (barcodes.tsv.gz)

*The ‘raw’ count matrix is actually gene counts obtained following ambient RNA removal. During ambient RNA removal, we specified to calculate non-integer count estimations, so most gene counts are actually non-integer values in this matrix but should still be treated as raw/unnormalized data that requires further normalization/transformation. Data can be read into R using the function Read10X().Resource Title: Herrera-Uribe & Wiarda et al. PBMCs - All Cells Metadata. File Name: PBMC7_AllCells_meta.csvResource Description: .csv file containing metadata for cells included in the final dataset. Metadata columns include:

nCount_RNA = the number of transcripts detected in a cell nFeature_RNA = the number of genes detected in a cell Loupe = cell barcodes; correspond to the cell IDs found in the .h5Seurat and 10X formatted objects for all cells prcntMito = percent mitochondrial reads in a cell Scrublet = doublet probability score assigned to a cell seurat_clusters = cluster ID assigned to a cell PaperIDs = sample ID for a cell celltypes = cell type ID assigned to a cellResource Title: Herrera-Uribe & Wiarda et al. PBMCs - All Cells PCA Coordinates. File Name: PBMC7_AllCells_PCAcoord.csvResource Description: .csv file containing first 100 PCA coordinates for cells. Resource Title: Herrera-Uribe & Wiarda et al. PBMCs - All Cells t-SNE Coordinates. File Name: PBMC7_AllCells_tSNEcoord.csvResource Description: .csv file containing t-SNE coordinates for all cells.Resource Title: Herrera-Uribe & Wiarda et al. PBMCs - All Cells UMAP Coordinates. File Name: PBMC7_AllCells_UMAPcoord.csvResource Description: .csv file containing UMAP coordinates for all cells.Resource Title: Herrera-Uribe & Wiarda et al. PBMCs - CD4 T Cells t-SNE Coordinates. File Name: PBMC7_CD4only_tSNEcoord.csvResource Description: .csv file containing t-SNE coordinates for only CD4 T cells (clusters 0, 3, 4, 28). A dataset of only CD4 T cells can be re-created from the PBMC7_AllCells.h5Seurat, and t-SNE coordinates used in publication can be re-assigned using this .csv file.Resource Title: Herrera-Uribe & Wiarda et al. PBMCs - CD4 T Cells UMAP Coordinates. File Name: PBMC7_CD4only_UMAPcoord.csvResource Description: .csv file containing UMAP coordinates for only CD4 T cells (clusters 0, 3, 4, 28). A dataset of only CD4 T cells can be re-created from the PBMC7_AllCells.h5Seurat, and UMAP coordinates used in publication can be re-assigned using this .csv file.Resource Title: Herrera-Uribe & Wiarda et al. PBMCs - Gamma Delta T Cells UMAP Coordinates. File Name: PBMC7_GDonly_UMAPcoord.csvResource Description: .csv file containing UMAP coordinates for only gamma delta T cells (clusters 6, 21, 24, 31). A dataset of only gamma delta T cells can be re-created from the PBMC7_AllCells.h5Seurat, and UMAP coordinates used in publication can be re-assigned using this .csv file.Resource Title: Herrera-Uribe & Wiarda et al. PBMCs - Gamma Delta T Cells t-SNE Coordinates. File Name: PBMC7_GDonly_tSNEcoord.csvResource Description: .csv file containing t-SNE coordinates for only gamma delta T cells (clusters 6, 21, 24, 31). A dataset of only gamma delta T cells can be re-created from the PBMC7_AllCells.h5Seurat, and t-SNE coordinates used in publication can be re-assigned using this .csv file.Resource Title: Herrera-Uribe & Wiarda et al. PBMCs - Gene Annotation Information. File Name: UnfilteredGeneInfo.txtResource Description: .txt file containing gene nomenclature information used to assign gene names in the dataset. 'Name' column corresponds to the name assigned to a feature in the dataset.Resource Title: Herrera-Uribe & Wiarda et al. PBMCs - All Cells H5Seurat. File Name: PBMC7.tarResource Description: .h5Seurat object of all cells in PBMC dataset. File needs to be untarred, then read into R using function LoadH5Seurat().
Data from: scRNA-seq Datasets
figshare.com
txt
Updated Apr 9, 2019
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Zhengtao Xiao (2019). scRNA-seq Datasets [Dataset]. http://doi.org/10.6084/m9.figshare.7174922.v2
Explore at:
txtAvailable download formats
Unique identifier
https://doi.org/10.6084/m9.figshare.7174922.v2
Dataset updated
Apr 9, 2019
Dataset provided by
figshare
Figsharehttp://figshare.com/
Authors
Zhengtao Xiao
License
CC0 1.0 Universal Public Domain Dedicationhttps://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Description
"*.csv" files contain the single cell gene expression values (log2(tpm+1)) for all genes in each cell from melanoma and squamous cell carcinoma of head and neck (HNSCC) tumors. The cell type and origin of tumor for each cell is also included in "*.csv" files.The "MalignantCellSubtypes.xlsx" defines the tumor subtype."CCLE_RNAseq_rsem_genes_tpm_20180929.zip" is downloaded from CCLE database.
scRNA-seq + scATAC-seq Challenge at NeurIPS 2021
kaggle.com
zip
Updated Sep 16, 2022
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Alexander Chervov (2022). scRNA-seq + scATAC-seq Challenge at NeurIPS 2021 [Dataset]. https://www.kaggle.com/datasets/alexandervc/scrnaseq-scatacseq-challenge-at-neurips-2021
Explore at:
zip(2917180928 bytes)Available download formats
Dataset updated
Sep 16, 2022
Authors
Alexander Chervov
Description
Context

Dataset from NeurIPS2021 challenge similar to Kaggle 2022 competition: https://www.kaggle.com/competitions/open-problems-multimodal "Open Problems - Multimodal Single-Cell Integration Predict how DNA, RNA & protein measurements co-vary in single cells"

It is https://en.wikipedia.org/wiki/ATAC-seq#Single-cell_ATAC-seq single cell ATAC-seq data. And single cell RNA-seq data: https://en.wikipedia.org/wiki/Single-cell_transcriptomics#Single-cell_RNA-seq

Single cell RNA sequencing, i.e. rows - correspond to cells, columns to genes (or vice versa). value of the matrix shows how strong is "expression" of the corresponding gene in the corresponding cell. https://en.wikipedia.org/wiki/Single-cell_transcriptomics

See tutorials: https://scanpy.readthedocs.io/en/stable/tutorials.html ("Scanpy" - main Python package to work with scRNA-seq data). Or https://satijalab.org/seurat/ "Seurat" - "R" package

(For companion dataset on CITE-seq = scRNA-seq + Proteomics, see: https://www.kaggle.com/datasets/alexandervc/citeseqscrnaseqproteins-challenge-neurips2021)

Particular data

https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE194122

Expression profiling by high throughput sequencing Genome binding/occupancy profiling by high throughput sequencing Summary Single-cell multiomics data collected from bone marrow mononuclear cells of 12 healthy human donors. Half the samples were measured using the 10X Multiome Gene Expression and Chromatin Accessability kit and half were measured using the 10X 3' Single-Cell Gene Expression kit with Feature Barcoding in combination with the BioLegend TotalSeq B Universal Human Panel v1.0. The dataset was generated to support Multimodal Single-Cell Data Integration Challenge at NeurIPS 2021. Samples were prepared using a standard protocol at four sites. The resulting data was then annotated to identify cell types and remove doublets. The dataset was designed with a nested batch layout such that some donor samples were measured at multiple sites with some donors measured at a single site. In the competition, participants were tasked with challenges including modality prediction, matching profiles from different modalities, and learning a joint embedding from multiple modalities.

Overall design Single-cell multiomics data collected from bone marrow mononuclear cells of 12 healthy human donors.

Contributor(s) Burkhardt DB, Lücken MD, Lance C, Cannoodt R, Pisco AO, Krishnaswamy S, Theis FJ, Bloom JM Citation https://datasets-benchmarks-proceedings.neurips.cc/paper/2021/hash/158f3069a435b314a80bdcb024f8e422-Abstract-round2.html

Related datasets:

Other single cell RNA seq datasets can be found on kaggle: Look here: https://www.kaggle.com/alexandervc/datasets Or search kaggle for "scRNA-seq"

Inspiration

Single cell RNA sequencing is important technology in modern biology, see e.g. "Eleven grand challenges in single-cell data science" https://genomebiology.biomedcentral.com/articles/10.1186/s13059-020-1926-6

Also see review : Nature. P. Kharchenko: "The triumphs and limitations of computational methods for scRNA-seq" https://www.nature.com/articles/s41592-021-01171-x

Search scholar.google "challenges in single cell rna sequencing" https://scholar.google.fr/scholar?q=challenges+in+single+cell+rna+sequencing&hl=en&as_sdt=0&as_vis=1&oi=scholart gives many interesting and highly cited articles

(Cited 968) Computational and analytical challenges in single-cell transcriptomics Oliver Stegle, Sarah A. Teichmann, John C. Marioni Nat. Rev. Genet., 16 (3) (2015), pp. 133-145 https://www.nature.com/articles/nrg3833
4
Scripts and data for the paper: Consequences and opportunities arising due...
data.4tu.nl
Updated Oct 15, 2024
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Gerard Bouland; Marcel Reinders; Ahmed Mahfouz (2024). Scripts and data for the paper: Consequences and opportunities arising due to sparser single-cell RNA-seq datasets [Dataset]. http://doi.org/10.4121/424eea7a-cce9-4dbb-b6ef-e5b47e132410.v1
Explore at:
Unique identifier
https://doi.org/10.4121/424eea7a-cce9-4dbb-b6ef-e5b47e132410.v1
Dataset updated
Oct 15, 2024
Dataset provided by
4TU.ResearchData
Authors
Gerard Bouland; Marcel Reinders; Ahmed Mahfouz
License
MIT Licensehttps://opensource.org/licenses/MIT
License information was derived automatically
Description
Scripts and data for the paper: Consequences and opportunities arising due to sparser single-cell RNA-seq datasets

With the number of cells measured in single-cell RNA sequencing (scRNA-seq) datasets increasing exponentially and concurrent increased sparsity due to more zero counts being measured for many genes, we demonstrate here that downstream analyses on binary-based gene expression give similar results as count-based analyses. Moreover, a binary representation scales up to ~ 50-fold more cells that can be analyzed using the same computational resources. We also highlight the possibilities provided by binarized scRNA-seq data. Development of specialized tools for bit-aware implementations of downstream analytical tasks will enable a more fine-grained resolution of biological heterogeneity.
Data from: Single-cell RNA-seq data from Smart-seq2 sequencing of FACS...
figshare.com
zip
Updated May 30, 2023
+ more versions
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Tabula Muris Consortium; James Webber; Joshua Batson; Angela Pisco (2023). Single-cell RNA-seq data from Smart-seq2 sequencing of FACS sorted cells (v2) [Dataset]. http://doi.org/10.6084/m9.figshare.5829687.v8
Explore at:
zipAvailable download formats
Unique identifier
https://doi.org/10.6084/m9.figshare.5829687.v8
Dataset updated
May 30, 2023
Dataset provided by
Figsharehttp://figshare.com/
Authors
Tabula Muris Consortium; James Webber; Joshua Batson; Angela Pisco
License
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Gene-count tables for FACS sorted cells sequenced with Smart-Seq2 from 20 organs of 7 mice. Cells are grouped by tissue of origin.Includes data for 53,760 cells, 44,879 of which passed a QC cutoff of at least 500 genes and 50,000 reads.Cell annotations using the Cell Ontology [1] controlled vocabulary are in a separate csv.This differs from v1 by renaming "Brain_Neurons" --> "Brain_Non-microglia" to be consistent with the manuscript.Update 2018-09-20: Updated annotations to latest manuscript versionUpdate 2018-02-16: Separated Diaphragm cells from Muscle cells, and Aorta cells from Heart cells.Update 2018-02-20: Aorta and Heart erroneously contained Diaphragm and Muscle data, and have now been corrected.Update 2018-03-09: Renamed tissues for nomenclature standards: "Colon" --> "Large_Intestine" "Muscle" --> "Limb_Muscle" "Mammary" --> "Mammary_Gland" "Brain_Microglia" --> "Brain_Myeloid" "Brain_Non-microglia" --> "Brain_Non-Myeloid"Update 2018-03-22: Renamed subtissues:- tissue: Heart, subtissue: ? --> tissue: Heart, subtissue: Unknown- tissue: Skin, subtissue: NA --> tissue: Skin, subtissue: TelogenUpdate 2018-03-23: Removed row numbers in first column of metadata_FACS.csvUpdate 2018-03-27: Added tissue tSNEs and cluster ids[1] http://purl.obolibrary.org/obo/cl.owl
Raw and processed (filtered and annotated) scRNAseq data
figshare.com
zip
Updated Jun 12, 2023
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Gabrielle Leclercq-Cohen; Sabrina Danilin; Llucia Alberti-Servera; Stephan Schmeing; Hélène Haegel; Sina Nassiri; Marina Bacac (2023). Raw and processed (filtered and annotated) scRNAseq data [Dataset]. http://doi.org/10.6084/m9.figshare.23499192.v1
Explore at:
zipAvailable download formats
Unique identifier
https://doi.org/10.6084/m9.figshare.23499192.v1
Dataset updated
Jun 12, 2023
Dataset provided by
Figsharehttp://figshare.com/
Authors
Gabrielle Leclercq-Cohen; Sabrina Danilin; Llucia Alberti-Servera; Stephan Schmeing; Hélène Haegel; Sina Nassiri; Marina Bacac
License
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Single cell RNA-seq data generated and reported as part of the manuscript entitled "Dissecting the mechanisms underlying the Cytokine Release Syndrome (CRS) mediated by T Cell Bispecific Antibodies" by Leclercq-Cohen et al 2023. Raw and processed (filtered and annotated) data are provided as AnnData objects which can be directly ingested to reproduce the findings of the paper or for ab initio data reuse: 1- raw.zip provides concatenated raw/unfiltered counts for the 20 samples in the standard Market Exchange Format (MEX) format. 2- 230330_sw_besca2_LowFil_raw.h5ad contains filtered cells and raw counts in the HDF5 format. 3- 221124_sw_besca2_LowFil.annotated.h5ad contains filtered cells and log normalized counts, along with cell type annotation in the HDF5 format.

scRNAseq data generation: Whole blood from 4 donors was treated with 0.2 μg/mL CD20-TCB, or incubated in the absence of CD20- TCB. At baseline (before addition of TCB) and assay endpoints (2, 4, 6, and 20 hrs), blood was collected for total leukocyte isolation using EasySepTM red blood cell depletion reagent (Stemcell). Briefly, cells were counted and processed for single cell RNA sequencing using the BD Rhapsody platform. To load several samples on a single BD Rhapsody cartridge, sample cells were labelled with sample tags (BD Human Single-Cell Multiplexing Kit) following the manufacturer’s protocol prior to pooling. Briefly, 1x106 cells from each sample were re-suspended in 180 μL FBS Stain Buffer (BD, PharMingen) and sample tags were added to the respective samples and incubated for 20 min at RT. After incubation, 2 successive washes were performed by addition of 2 mL stain buffer and centrifugation for 5 min at 300 g. Cells were then re- suspended in 620 μL cold BD Sample Buffer, stained with 3.1 μL of both 2 mM Calcein AM (Thermo Fisher Scientific) and 0.3 mM Draq7 (BD Biosciences) and finally counted on the BD Rhapsody scanner. Samples were then diluted and/or pooled equally in 650 μL cold BD Sample Buffer. The BD Rhapsody cartridges were then loaded with up to 40 000 – 50 000 cells. Single cells were isolated using Single-Cell Capture and cDNA Synthesis with the BD Rhapsody Express Single-Cell Analysis System according to the manufacturer’s recommendations (BD Biosciences). cDNA libraries were prepared using the Whole Transcriptome Analysis Amplification Kit following the BD Rhapsody System mRNA Whole Transcriptome Analysis (WTA) and Sample Tag Library Preparation Protocol (BD Biosciences). Indexed WTA and sample tags libraries were quantified and quality controlled on the Qubit Fluorometer using the Qubit dsDNA HS Assay, and on the Agilent 2100 Bioanalyzer system using the Agilent High Sensitivity DNA Kit. Sequencing was performed on a Novaseq 6000 (Illumina) in paired-end mode (64-8- 58) with Novaseq6000 S2 v1 or Novaseq6000 SP v1.5 reagents kits (100 cycles). scRNAseq data analysis: Sequencing data was processed using the BD Rhapsody Analysis pipeline (v 1.0 https://www.bd.com/documents/guides/user-guides/GMX_BD-Rhapsody-genomics- informatics_UG_EN.pdf) on the Seven Bridges Genomics platform. Briefly, read pairs with low sequencing quality were first removed and the cell label and UMI identified for further quality check and filtering. Valid reads were then mapped to the human reference genome (GRCh38-PhiX-gencodev29) using the aligner Bowtie2 v2.2.9, and reads with the same cell label, same UMI sequence and same gene were collapsed into a single raw molecule while undergoing further error correction and quality checks. Cell labels were filtered with a multi-step algorithm to distinguish those associated with putative cells from those associated with noise. After determining the putative cells, each cell was assigned to the sample of origin through the sample tag (only for cartridges with multiplex loading). Finally, the single-cell gene expression matrices were generated and a metrics summary was provided. After pre-processing with BD’s pipeline, the count matrices and metadata of each sample were aggregated into a single adata object and loaded into the besca v2.3 pipeline for the single cell RNA sequencing analysis (43). First, we filtered low quality cells with less than 200 genes, less than 500 counts or more than 30% of mitochondrial reads. This permissive filtering was used in order to preserve the neutrophils. We further excluded potential multiplets (cells with more than 5,000 genes or 20,000 counts), and genes expressed in less than 30 cells. Normalization, log-transformed UMI counts per 10,000 reads [log(CP10K+1)], was applied before downstream analysis. After normalization, technical variance was removed by regressing out the effects of total UMI counts and percentage of mitochondrial reads, and gene expression was scaled. The 2,507 most variable genes (having a minimum mean expression of 0.0125, a maximum mean expression of 3 and a minimum dispersion of 0.5) were used for principal component analysis. Finally, the first 50 PCs were used as input for calculating the 10 nearest neighbours and the neighbourhood graph was then embedded into the two-dimensional space using the UMAP algorithm at a resolution of 2. Cell type annotation was performed using the Sig-annot semi-automated besca module, which is a signature- based hierarchical cell annotation method. The used signatures, configuration and nomenclature files can be found at https://github.com/bedapub/besca/tree/master/besca/datasets. For more details, please refer to the publication.
f
Bulk and single-cell RNA-seq data
datasetcatalog.nlm.nih.gov
figshare.com
Updated Mar 8, 2023
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Wang, Haiting; Liu, Zhaoyuan; Li, Ziyi (2023). Bulk and single-cell RNA-seq data [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0000975499
Explore at:
Dataset updated
Mar 8, 2023
Authors
Wang, Haiting; Liu, Zhaoyuan; Li, Ziyi
Description
The bulk and single-cell RNA-seq data of DC lineage cells sorted from blood, spleen and bone marrow from mice.
s
Single-cell RNA sequencing data on primary samples from: Aberrant expression...
figshare.scilifelab.se
researchdata.se
+1more
hdf
Updated Oct 2, 2025
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Carl Sandén; Henrik Lilljebjörn; Thoas Fioretos (2025). Single-cell RNA sequencing data on primary samples from: Aberrant expression of SLAMF6 constitutes a targetable immune escape mechanism in acute myeloid leukemia [Dataset]. http://doi.org/10.17044/scilifelab.28263911.v2
Explore at:
hdfAvailable download formats
Unique identifier
https://doi.org/10.17044/scilifelab.28263911.v2
Dataset updated
Oct 2, 2025
Dataset provided by
Lund University
Authors
Carl Sandén; Henrik Lilljebjörn; Thoas Fioretos
License
Attribution-NonCommercial-NoDerivs 4.0 (CC BY-NC-ND 4.0)https://creativecommons.org/licenses/by-nc-nd/4.0/
License information was derived automatically
Description
This dataset includes single-cell RNA sequencing (scRNA-seq) data from primary AML (acute myeloid leukemia) samples. Libraries were produced using the 10X Genomics Chromium Single Cell 3ʹ Reagent Kits v3 and sequenced on an Illumina Novaseq 6000 system (Illumina). The dataset is available as raw sequencing reads (fastq; restricted access) or as an annotated matrix of scRNA count data (h5ad). Published in: Sandén et al, Nature Cancer, 2025: https://www.nature.com/articles/s43018-025-01054-6
Z
Data Repository: Single-cell mapper (scMappR): using scRNA-seq to infer...
data.niaid.nih.gov
Updated Feb 12, 2021
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Dustin Sokolowski; Mariela Faykoo-Martinez; Lauren Erdman; Huayun Hou; Cadia Chan; Helen Zhu; Melissa M. Holmes; Anna Goldenberg; Michael D Wilson (2021). Data Repository: Single-cell mapper (scMappR): using scRNA-seq to infer cell-type specificities of differentially expressed genes [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_4278129
Explore at:
Dataset updated
Feb 12, 2021
Dataset provided by
Department of Medical Biophysics, University of Toronto, Toronto, ON, M5G 1L7, Canada; Princess Margaret Cancer Center, University Health Network, Toronto, ON, M5G 2C1, Canada
Department of Molecular Genetics, 2Genetics and Genome Biology, SickKids Research Institute, Toronto, ON, M5G 0A4, CanadaUniversity of Toronto, Toronto, ON, M5S 1A8, Canada,
Department of Cell and Systems Biology, University of Toronto, Toronto, ON, M5S 3G5, Canada; Department of Psychology, University of Toronto Mississauga, Mississauga, ON, L5L 1C6
Genetics and Genome Biology, SickKids Research Institute, Toronto, ON, M5G 0A4, Canada; Department of Computer Science, University of Toronto, Toronto, ON, M5S 2E4, Canada; Vector Institute for Artificial Intelligence, MaRS Centre, Toronto, ON, M5G 1M1; CIFAR, MaRS Centre, Toronto, ON, M5G 1M1
Genetics and Genome Biology, SickKids Research Institute, Toronto, ON, M5G 0A4, Canada; Department of Cell and Systems Biology, University of Toronto, Toronto
Genetics and Genome Biology, SickKids Research Institute, Toronto, ON, M5G 0A4, Canada; Department of Computer Science, University of Toronto, Toronto, ON, M5S 2E4, Canada
Authors
Dustin Sokolowski; Mariela Faykoo-Martinez; Lauren Erdman; Huayun Hou; Cadia Chan; Helen Zhu; Melissa M. Holmes; Anna Goldenberg; Michael D Wilson
License
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Data repository for the scMappR manuscript:

Abstract from biorXiv (https://www.biorxiv.org/content/10.1101/2020.08.24.265298v1.full).

RNA sequencing (RNA-seq) is widely used to identify differentially expressed genes (DEGs) and reveal biological mechanisms underlying complex biological processes. RNA-seq is often performed on heterogeneous samples and the resulting DEGs do not necessarily indicate the cell types where the differential expression occurred. While single-cell RNA-seq (scRNA-seq) methods solve this problem, technical and cost constraints currently limit its widespread use. Here we present single cell Mapper (scMappR), a method that assigns cell-type specificity scores to DEGs obtained from bulk RNA-seq by integrating cell-type expression data generated by scRNA-seq and existing deconvolution methods. After benchmarking scMappR using RNA-seq data obtained from sorted blood cells, we asked if scMappR could reveal known cell-type specific changes that occur during kidney regeneration. We found that scMappR appropriately assigned DEGs to cell-types involved in kidney regeneration, including a relatively small proportion of immune cells. While scMappR can work with any user supplied scRNA-seq data, we curated scRNA-seq expression matrices for ∼100 human and mouse tissues to facilitate its use with bulk RNA-seq data alone. Overall, scMappR is a user-friendly R package that complements traditional differential expression analysis available at CRAN.
Z
Repository for Single Cell RNA Sequencing Analysis of The EMT6 Dataset
data.niaid.nih.gov
Updated Nov 20, 2023
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Hsu, Jonathan; Stoop, Allart (2023). Repository for Single Cell RNA Sequencing Analysis of The EMT6 Dataset [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_10011621
Explore at:
Dataset updated
Nov 20, 2023
Authors
Hsu, Jonathan; Stoop, Allart
Description
Table of Contents

Main Description File Descriptions Linked Files Installation and Instructions

1. Main Description

This is the Zenodo repository for the manuscript titled "A TCR β chain-directed antibody-fusion molecule that activates and expands subsets of T cells and promotes antitumor activity.". The code included in the file titled marengo_code_for_paper_jan_2023.R was used to generate the figures from the single-cell RNA sequencing data. The following libraries are required for script execution:

Seurat scReportoire ggplot2 stringr dplyr ggridges ggrepel ComplexHeatmap

File Descriptions

The code can be downloaded and opened in RStudios. The "marengo_code_for_paper_jan_2023.R" contains all the code needed to reproduce the figues in the paper The "Marengo_newID_March242023.rds" file is available at the following address: https://zenodo.org/badge/DOI/10.5281/zenodo.7566113.svg (Zenodo DOI: 10.5281/zenodo.7566113). The "all_res_deg_for_heat_updated_march2023.txt" file contains the unfiltered results from DGE anlaysis, also used to create the heatmap with DGE and volcano plots. The "genes_for_heatmap_fig5F.xlsx" contains the genes included in the heatmap in figure 5F.

Linked Files

This repository contains code for the analysis of single cell RNA-seq dataset. The dataset contains raw FASTQ files, as well as, the aligned files that were deposited in GEO. The "Rdata" or "Rds" file was deposited in Zenodo. Provided below are descriptions of the linked datasets:

Gene Expression Omnibus (GEO) ID: GSE223311(https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE223311)

Title: Gene expression profile at single cell level of CD4+ and CD8+ tumor infiltrating lymphocytes (TIL) originating from the EMT6 tumor model from mSTAR1302 treatment. Description: This submission contains the "matrix.mtx", "barcodes.tsv", and "genes.tsv" files for each replicate and condition, corresponding to the aligned files for single cell sequencing data. Submission type: Private. In order to gain access to the repository, you must use a reviewer token (https://www.ncbi.nlm.nih.gov/geo/info/reviewer.html).

Sequence read archive (SRA) repository ID: SRX19088718 and SRX19088719

Title: Gene expression profile at single cell level of CD4+ and CD8+ tumor infiltrating lymphocytes (TIL) originating from the EMT6 tumor model from mSTAR1302 treatment. Description: This submission contains the raw sequencing or .fastq.gz files, which are tab delimited text files. Submission type: Private. In order to gain access to the repository, you must use a reviewer token (https://www.ncbi.nlm.nih.gov/geo/info/reviewer.html).

Zenodo DOI: 10.5281/zenodo.7566113(https://zenodo.org/record/7566113#.ZCcmvC2cbrJ)

Title: A TCR β chain-directed antibody-fusion molecule that activates and expands subsets of T cells and promotes antitumor activity. Description: This submission contains the "Rdata" or ".Rds" file, which is an R object file. This is a necessary file to use the code. Submission type: Restricted Acess. In order to gain access to the repository, you must contact the author.

Installation and Instructions

The code included in this submission requires several essential packages, as listed above. Please follow these instructions for installation:

Ensure you have R version 4.1.2 or higher for compatibility.

Although it is not essential, you can use R-Studios (Version 2022.12.0+353 (2022.12.0+353)) for accessing and executing the code.

Download the *"Rdata" or ".Rds" file from Zenodo (https://zenodo.org/record/7566113#.ZCcmvC2cbrJ) (Zenodo DOI: 10.5281/zenodo.7566113).

Open R-Studios (https://www.rstudio.com/tags/rstudio-ide/) or a similar integrated development environment (IDE) for R.

Set your working directory to where the following files are located:

marengo_code_for_paper_jan_2023.R Install_Packages.R Marengo_newID_March242023.rds genes_for_heatmap_fig5F.xlsx all_res_deg_for_heat_updated_march2023.txt

You can use the following code to set the working directory in R:

setwd(directory)

Open the file titled "Install_Packages.R" and execute it in R IDE. This script will attempt to install all the necessary pacakges, and its dependencies in order to set up an environment where the code in "marengo_code_for_paper_jan_2023.R" can be executed.

Once the "Install_Packages.R" script has been successfully executed, re-start R-Studios or your IDE of choice.

Open the file "marengo_code_for_paper_jan_2023.R" file in R-studios or your IDE of choice.

Execute commands in the file titled "marengo_code_for_paper_jan_2023.R" in R-Studios or your IDE of choice to generate the plots.
m
Investigating Highly Variable Genes in Single-cell RNA-seq Data across...
data.mendeley.com
Updated May 16, 2023
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Jantarika Kumar Arora (2023). Investigating Highly Variable Genes in Single-cell RNA-seq Data across Multiple Cell Types and Conditions [Dataset]. http://doi.org/10.17632/6ry3x7r8hf.3
Explore at:
Unique identifier
https://doi.org/10.17632/6ry3x7r8hf.3
Dataset updated
May 16, 2023
Authors
Jantarika Kumar Arora
License
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
The peripheral blood immune cell (PBMC) samples were collected from patients infected with dengue virus (DENV) at four time points: two and one day(s) before defervescence (febrile phase), at defervescence (critical phase), and two-week convalescence. The raw and filtered matrix files were generated using CellRanger version 3.0.2 (10x Genomics, USA) with the reference human genome GRCh38 1.2.0. Potential contamination of ambient RNAs was corrected using SoupX. Low quality cells, including cells expressing mitochondrial genes higher than 10% and doublets/multiplets, were excluded using Seurat and doubletFinder, respectively. The individual samples were then integrated using the SCTransform method with 3,000 gene features. Principal component analysis (PCA) and clustering were performed with the Louvain algorithm applying multi-level refinement algorithm. The gene expression level of each cell was normalized using the LogNormalize method in Seurat. Cell types were annotated using the canonical marker genes described in the original paper, see related link below.
scRNA-seq Human Pluripotent Stem Cells Messmer2019
kaggle.com
zip
Updated May 1, 2022
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Alexander Chervov (2022). scRNA-seq Human Pluripotent Stem Cells Messmer2019 [Dataset]. https://www.kaggle.com/datasets/alexandervc/scrnaseq-human-pluripotent-stem-cells-messmer2019
Explore at:
zip(57267380 bytes)Available download formats
Dataset updated
May 1, 2022
Authors
Alexander Chervov
Description
Remark: for cell cycle analysis - see paper https://arxiv.org/abs/2208.05229 "Computational challenges of cell cycle analysis using single cell transcriptomics" Alexander Chervov, Andrei Zinovyev

Data and Context

Data - results of single cell RNA sequencing, i.e. rows - correspond to cells, columns to genes (or vice versa). value of the matrix shows how strong is "expression" of the corresponding gene in the corresponding cell. https://en.wikipedia.org/wiki/Single-cell_transcriptomics

Particular data: https://pubmed.ncbi.nlm.nih.gov/30673604/ Cell Rep. 2019 Jan 22;26(4):815-824.e4. doi: 10.1016/j.celrep.2018.12.099. Transcriptional Heterogeneity in Naive and Primed Human Pluripotent Stem Cells at Single-Cell Resolution Tobias Messmer 1, Ferdinand von Meyenn 2, Aurora Savino 3, Fátima Santos 3, Hisham Mohammed 3, Aaron Tin Long Lun 4, John C Marioni 5, Wolf Reik 6

Data in two variants: 1) scRNA-seq count matrix, downloaded from database of R-package "scRNAseq", see script: https://www.kaggle.com/alexandervc/rpackage-scrnaseq-downloads-datasets 2) Directly uploaded from E-MTAB-6819 https://www.ebi.ac.uk/arrayexpress/experiments/E-MTAB-6819/

Related datasets:

Other single cell RNA seq datasets can be found on kaggle: Look here: https://www.kaggle.com/alexandervc/datasets Or search kaggle for "scRNA-seq"

Inspiration

Single cell RNA sequencing is important technology in modern biology, see e.g. "Eleven grand challenges in single-cell data science" https://genomebiology.biomedcentral.com/articles/10.1186/s13059-020-1926-6

Also see review : Nature. P. Kharchenko: "The triumphs and limitations of computational methods for scRNA-seq" https://www.nature.com/articles/s41592-021-01171-x

Search scholar.google "challenges in single cell rna sequencing" https://scholar.google.fr/scholar?q=challenges+in+single+cell+rna+sequencing&hl=en&as_sdt=0&as_vis=1&oi=scholart gives many interesting and highly cited articles

(Cited 968) Computational and analytical challenges in single-cell transcriptomics Oliver Stegle, Sarah A. Teichmann, John C. Marioni Nat. Rev. Genet., 16 (3) (2015), pp. 133-145 https://www.nature.com/articles/nrg3833

Challenges in unsupervised clustering of single-cell RNA-seq data https://www.nature.com/articles/s41576-018-0088-9 Review Article 07 January 2019 Vladimir Yu Kiselev, Tallulah S. Andrews & Martin Hemberg Nature Reviews Genetics volume 20, pages273–282 (2019)

Challenges and emerging directions in single-cell analysis https://link.springer.com/article/10.1186/s13059-017-1218-y Published: 08 May 2017 Guo-Cheng Yuan, Long Cai, Michael Elowitz, Tariq Enver, Guoping Fan, Guoji Guo, Rafael Irizarry, Peter Kharchenko, Junhyong Kim, Stuart Orkin, John Quackenbush, Assieh Saadatpour, Timm Schroeder, Ramesh Shivdasani & Itay Tirosh Genome Biology volume 18, Article number: 84 (2017)

Single-Cell RNA Sequencing in Cancer: Lessons Learned and Emerging Challenges https://www.sciencedirect.com/science/article/pii/S1097276519303569 Molecular Cell Volume 75, Issue 1, 11 July 2019, Pages 7-12 Journal home page for Molecular Cell
u
Data from: Single-cell RNA sequencing data and resources from blood and milk...
agdatacommons.nal.usda.gov
gimi9.com
+1more
hdf
Updated Aug 27, 2025
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Jayne Wiarda; Kaitlyn Sarlo-Davila; Julian M. Trachsel; Crystal L. Loving; Paola M. Boggiatto; John D. Lippolis; Ellie J. Putz (2025). Single-cell RNA sequencing data and resources from blood and milk immune cells of Holstein cattle with chronic mastitis caused by experimental Staphylococcus aureus infection [Dataset]. http://doi.org/10.15482/USDA.ADC/26870506.v1
Explore at:
hdfAvailable download formats
Unique identifier
https://doi.org/10.15482/USDA.ADC/26870506.v1
Dataset updated
Aug 27, 2025
Dataset provided by
Ag Data Commons
Authors
Jayne Wiarda; Kaitlyn Sarlo-Davila; Julian M. Trachsel; Crystal L. Loving; Paola M. Boggiatto; John D. Lippolis; Ellie J. Putz
License
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This online resource provides supplementary items used to analyze data in the work, "Single-cell RNA sequencing characterization of Holstein cattle blood and milk immune cells during a chronic Staphylococcus aureus mastitis infection", by Wiarda et al. Cells were collected from milk and blood of three cattle with chronic mastitis infections caused by experimental Staphylococcus aureus challenge. Isolated cells were processed for single-cell RNA sequencing, resulting in a dataset of 35,338 cells distributed across 62 cells clusters. Cell clusters were classified as granulocytes, monocyte/macrophage/conventional dendritic cells, B cells/antibody-secreting cells, T cells/innate lymphoid cells, plasmacytoid dendritic cells, and non-immune cells. A data subset consisting of 30 granulocyte clusters was also created. Data objects of total cell and granulocyte datasets are included here (.h5seurat files), as well as results of pairwise differential gene expression of all cell clusters (resulting in over 4.3 million differentially expressed genes), and a data object containing cell neighborhoods used for differential abundance testing.
scRNA-seq Kolodziejczyk et al. (2015)
kaggle.com
zip
Updated Apr 30, 2022
+ more versions
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Alexander Chervov (2022). scRNA-seq Kolodziejczyk et al. (2015) [Dataset]. https://www.kaggle.com/datasets/alexandervc/scrnaseq-kolodziejczyk-et-al-2015
Explore at:
zip(13439744 bytes)Available download formats
Dataset updated
Apr 30, 2022
Authors
Alexander Chervov
Description
Remark: for cell cycle analysis - see paper https://arxiv.org/abs/2208.05229 "Computational challenges of cell cycle analysis using single cell transcriptomics" Alexander Chervov, Andrei Zinovyev

Data and Context

Data - results of single cell RNA sequencing, i.e. rows - correspond to cells, columns to genes (or vice versa). value of the matrix shows how strong is "expression" of the corresponding gene in the corresponding cell. https://en.wikipedia.org/wiki/Single-cell_transcriptomics

Particular data: Data from the paper: Kolodziejczyk, A. A., J. K. Kim, J. C. Tsang, T. Ilicic, J. Henriksson, K. N. Natarajan, A. C. Tuck, et al. 2015. “Single cell RNA-Sequencing of pluripotent states unlocks modular transcriptional variation.” Cell Stem Cell 17 (4): 471–85. https://pubmed.ncbi.nlm.nih.gov/26431182/

scRNA-seq count matrix, downloaded from database of R-package "scRNAseq", see script: https://www.kaggle.com/alexandervc/rpackage-scrnaseq-downloads-datasets

Related datasets:

Other single cell RNA seq datasets can be found on kaggle: Look here: https://www.kaggle.com/alexandervc/datasets Or search kaggle for "scRNA-seq"

Inspiration

Single cell RNA sequencing is important technology in modern biology, see e.g. "Eleven grand challenges in single-cell data science" https://genomebiology.biomedcentral.com/articles/10.1186/s13059-020-1926-6

Also see review : Nature. P. Kharchenko: "The triumphs and limitations of computational methods for scRNA-seq" https://www.nature.com/articles/s41592-021-01171-x

Search scholar.google "challenges in single cell rna sequencing" https://scholar.google.fr/scholar?q=challenges+in+single+cell+rna+sequencing&hl=en&as_sdt=0&as_vis=1&oi=scholart gives many interesting and highly cited articles

(Cited 968) Computational and analytical challenges in single-cell transcriptomics Oliver Stegle, Sarah A. Teichmann, John C. Marioni Nat. Rev. Genet., 16 (3) (2015), pp. 133-145 https://www.nature.com/articles/nrg3833

Challenges in unsupervised clustering of single-cell RNA-seq data https://www.nature.com/articles/s41576-018-0088-9 Review Article Published: 07 January 2019 Vladimir Yu Kiselev, Tallulah S. Andrews & Martin Hemberg Nature Reviews Genetics volume 20, pages273–282 (2019)

Challenges and emerging directions in single-cell analysis https://link.springer.com/article/10.1186/s13059-017-1218-y Published: 08 May 2017 Guo-Cheng Yuan, Long Cai, Michael Elowitz, Tariq Enver, Guoping Fan, Guoji Guo, Rafael Irizarry, Peter Kharchenko, Junhyong Kim, Stuart Orkin, John Quackenbush, Assieh Saadatpour, Timm Schroeder, Ramesh Shivdasani & Itay Tirosh Genome Biology volume 18, Article number: 84 (2017)

Single-Cell RNA Sequencing in Cancer: Lessons Learned and Emerging Challenges https://www.sciencedirect.com/science/article/pii/S1097276519303569 Molecular Cell Volume 75, Issue 1, 11 July 2019, Pages 7-12 Journal home page for Molecular Cell
CITE-seq=scRNA-seq+Proteins: Challenge NeurIPS2021
kaggle.com
zip
Updated Jan 22, 2023
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Alexander Chervov (2023). CITE-seq=scRNA-seq+Proteins: Challenge NeurIPS2021 [Dataset]. https://www.kaggle.com/datasets/alexandervc/citeseqscrnaseqproteins-challenge-neurips2021
Explore at:
zip(646191284 bytes)Available download formats
Dataset updated
Jan 22, 2023
Authors
Alexander Chervov
Description
Context

Dataset from NeurIPS2021 challenge similar to Kaggle 2022 competition: https://www.kaggle.com/competitions/open-problems-multimodal "Open Problems - Multimodal Single-Cell Integration Predict how DNA, RNA & protein measurements co-vary in single cells"

CITE-seq - joint single cell RNA sequencing + single cell measurements of CD** proteins. (https://en.wikipedia.org/wiki/CITE-Seq) (For companion dataset on scRNA-seq + scATAC-seq, see: https://www.kaggle.com/datasets/alexandervc/scrnaseq-scatacseq-challenge-at-neurips-2021 )

Single cell RNA sequencing, i.e. rows - correspond to cells, columns to genes (or vice versa). value of the matrix shows how strong is "expression" of the corresponding gene in the corresponding cell. https://en.wikipedia.org/wiki/Single-cell_transcriptomics

See tutorials: https://scanpy.readthedocs.io/en/stable/tutorials.html ("Scanpy" - main Python package to work with scRNA-seq data). Or https://satijalab.org/seurat/ "Seurat" - "R" package

Particular data

https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE194122

Expression profiling by high throughput sequencing Genome binding/occupancy profiling by high throughput sequencing Summary Single-cell multiomics data collected from bone marrow mononuclear cells of 12 healthy human donors. Half the samples were measured using the 10X Multiome Gene Expression and Chromatin Accessability kit and half were measured using the 10X 3' Single-Cell Gene Expression kit with Feature Barcoding in combination with the BioLegend TotalSeq B Universal Human Panel v1.0. The dataset was generated to support Multimodal Single-Cell Data Integration Challenge at NeurIPS 2021. Samples were prepared using a standard protocol at four sites. The resulting data was then annotated to identify cell types and remove doublets. The dataset was designed with a nested batch layout such that some donor samples were measured at multiple sites with some donors measured at a single site. In the competition, participants were tasked with challenges including modality prediction, matching profiles from different modalities, and learning a joint embedding from multiple modalities.

Overall design Single-cell multiomics data collected from bone marrow mononuclear cells of 12 healthy human donors.

Contributor(s) Burkhardt DB, Lücken MD, Lance C, Cannoodt R, Pisco AO, Krishnaswamy S, Theis FJ, Bloom JM Citation https://datasets-benchmarks-proceedings.neurips.cc/paper/2021/hash/158f3069a435b314a80bdcb024f8e422-Abstract-round2.html

Related datasets:

Other single cell RNA seq datasets can be found on kaggle: Look here: https://www.kaggle.com/alexandervc/datasets Or search kaggle for "scRNA-seq"

Inspiration

Single cell RNA sequencing is important technology in modern biology, see e.g. "Eleven grand challenges in single-cell data science" https://genomebiology.biomedcentral.com/articles/10.1186/s13059-020-1926-6

Also see review : Nature. P. Kharchenko: "The triumphs and limitations of computational methods for scRNA-seq" https://www.nature.com/articles/s41592-021-01171-x

Search scholar.google "challenges in single cell rna sequencing" https://scholar.google.fr/scholar?q=challenges+in+single+cell+rna+sequencing&hl=en&as_sdt=0&as_vis=1&oi=scholart gives many interesting and highly cited articles

(Cited 968) Computational and analytical challenges in single-cell transcriptomics Oliver Stegle, Sarah A. Teichmann, John C. Marioni Nat. Rev. Genet., 16 (3) (2015), pp. 133-145 https://www.nature.com/articles/nrg3833
u
Single-cell RNA sequencing data from 20 tumors
portalcientifico.unav.edu
plus.figshare.com
Updated 2025
+ more versions
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Li, Yilong; Nahas, Michelle; Stephens, Dennis; Froburg, Kate; Hintz, Emma; Champagne, Devin; Lochab, Amaneet; Brown, Markus; Braun, Jasper; Antonia Fortuno, Maria; Ocon, Maria-del-Mar; Pasquier, Andrea; Luque Vazquez, Ines; Moudgalya, Hita; Kivlehan, Sophie; Gjeci, Iliana; L. Korle, Stephanie; Campo, Arantza; Rodriguez, Maria; W. Seder, Christopher; Lizotte, Patrick H.; Bueno, Raphael; Borgia, Jeffrey A.; Seijo, Luis Miguel; Montuenga, Luis M.; Yelensky, Roman; Li, Yilong; Nahas, Michelle; Stephens, Dennis; Froburg, Kate; Hintz, Emma; Champagne, Devin; Lochab, Amaneet; Brown, Markus; Braun, Jasper; Antonia Fortuno, Maria; Ocon, Maria-del-Mar; Pasquier, Andrea; Luque Vazquez, Ines; Moudgalya, Hita; Kivlehan, Sophie; Gjeci, Iliana; L. Korle, Stephanie; Campo, Arantza; Rodriguez, Maria; W. Seder, Christopher; Lizotte, Patrick H.; Bueno, Raphael; Borgia, Jeffrey A.; Seijo, Luis Miguel; Montuenga, Luis M.; Yelensky, Roman (2025). Single-cell RNA sequencing data from 20 tumors [Dataset]. https://portalcientifico.unav.edu/documentos/688b602c17bb6239d2d49480
Explore at:
Dataset updated
2025
Authors
Li, Yilong; Nahas, Michelle; Stephens, Dennis; Froburg, Kate; Hintz, Emma; Champagne, Devin; Lochab, Amaneet; Brown, Markus; Braun, Jasper; Antonia Fortuno, Maria; Ocon, Maria-del-Mar; Pasquier, Andrea; Luque Vazquez, Ines; Moudgalya, Hita; Kivlehan, Sophie; Gjeci, Iliana; L. Korle, Stephanie; Campo, Arantza; Rodriguez, Maria; W. Seder, Christopher; Lizotte, Patrick H.; Bueno, Raphael; Borgia, Jeffrey A.; Seijo, Luis Miguel; Montuenga, Luis M.; Yelensky, Roman; Li, Yilong; Nahas, Michelle; Stephens, Dennis; Froburg, Kate; Hintz, Emma; Champagne, Devin; Lochab, Amaneet; Brown, Markus; Braun, Jasper; Antonia Fortuno, Maria; Ocon, Maria-del-Mar; Pasquier, Andrea; Luque Vazquez, Ines; Moudgalya, Hita; Kivlehan, Sophie; Gjeci, Iliana; L. Korle, Stephanie; Campo, Arantza; Rodriguez, Maria; W. Seder, Christopher; Lizotte, Patrick H.; Bueno, Raphael; Borgia, Jeffrey A.; Seijo, Luis Miguel; Montuenga, Luis M.; Yelensky, Roman
Description
Liquid biopsy is a promising non-invasive technology that is capable of diagnosing cancer. However, current ctDNA-based approaches detect only a minority of early-stage disease. We set out to improve the sensitivity of liquid biopsy by harnessing tumor recognition by T cells through the sequencing of the circulating T-cell receptor repertoire. We studied a cohort of 463 patients with lung cancer (86% stage I) and 587 subjects without cancer using gDNA extracted from blood buffy coats. We performed TCR β chain sequencing to yield a median of 113,571 TCR clonotypes per sample and built a TCR sequence similarity graph to cluster clonotypes into TCR repertoire functional units (RFUs). The TCR frequencies of RFUs were tested for association with cancer status and RFUs with a statistically significant association were combined into a cancer score using a support vector machine model. The model was evaluated by 10-fold cross-validation and compared with a ctDNA panel of 237 mutation hotspots in 154 lung cancer driver genes and 17 cancer related protein biomarkers in 85 subjects. We identified 327 cancer- associated TCR RFUs with a false discovery rate (FDR) ≤ 0.1, including 157 enriched in cancer samples and 170 enriched in controls. Levels of 247/327 (76%) RFUs were correlated with the presence of an HLA allele at FDR ≤ 0.1 and tumor-infiltrating lymphocyte TCRs from multiple RFUs bound HLA presented tumor antigen peptides, suggesting antigen recognition as a driver of the cancer-RFU associations found. The RFU cancer score detected nearly 50% of stage I lung cancers at a specificity of 80% and boosted the sensitivity by up to 20 percentage points when added to ctDNA and circulating proteins in a multi- analyte cancer screening test. Overall, we show that circulating TCR repertoire functional unit analysis can complement established analytes to improve liquid biopsy sensitivity for early-stage cancer.

This dataset contains the CellRanger output for 20 cancer patients. Please refer to https://www.10xgenomics.com/support/software/cell-ranger/latest for documentation.
For details on how the data was generated, please see Li Y. et al. 2025: Circulating T-cell Receptor Repertoire for Cancer Early Detection.
s
Single cell sequencing data from: The AML cellular state space unveils NPM1...
figshare.scilifelab.se
researchdata.se
+1more
hdf
Updated Oct 7, 2025
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Henrik Lilljebjörn; Thoas Fioretos (2025). Single cell sequencing data from: The AML cellular state space unveils NPM1 immune evasion subtypes with distinct clinical outcomes [Dataset]. http://doi.org/10.17044/scilifelab.23715648.v2
Explore at:
hdfAvailable download formats
Unique identifier
https://doi.org/10.17044/scilifelab.23715648.v2
Dataset updated
Oct 7, 2025
Dataset provided by
Lund University
Authors
Henrik Lilljebjörn; Thoas Fioretos
License
Attribution-NonCommercial-NoDerivs 4.0 (CC BY-NC-ND 4.0)https://creativecommons.org/licenses/by-nc-nd/4.0/
License information was derived automatically
Description
This dataset contains 10X single cell 3' RNA sequencing gene expression data from from 38 AML-samples from the subtypes NPM1 (n=12), AML-MR (n=11), TP53 (n=7), CBFB::MYH11 (n=3), RUNX1::RUNX1T1 (n=3), AML without class defining mutations (n=1), and AML meeting the criteria for two subtypes (n=1). In addition, reference samples from normal bone marrow mononuclear cells (n=5) and CD34 sorted cells (n=3) are included. The single cell libraries were constructed from viably frozen cells from bone marrow (n=29+8) or peripheral blood (n=9) using the Chromium Single Cell 3' Library & Gel Bead Kit v3 (10X genomics) and sequenced on a Novaseq 6000 or NextSeq 500.Data is available in h5 format for each sample, with raw count output from Cellranger, or as a processed Seurat object with scaled expression data, dimension reductions, and metadata.
q
Single Cell Insights Into Cancer Transcriptomes: A Five-Part Single-Cell...
qubeshub.org
Updated Nov 16, 2021
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Leigh Samsa*; Melissa Eslinger; Adam Kleinschmit; Amanda Solem; Carlos Goller* (2021). Single Cell Insights Into Cancer Transcriptomes: A Five-Part Single-Cell RNAseq Case Study Lesson [Dataset]. http://doi.org/10.24918/cs.2021.26
Explore at:
Unique identifier
https://doi.org/10.24918/cs.2021.26
Dataset updated
Nov 16, 2021
Dataset provided by
QUBES
Authors
Leigh Samsa*; Melissa Eslinger; Adam Kleinschmit; Amanda Solem; Carlos Goller*
Description
There is a growing need for integration of “Big Data” into undergraduate biology curricula. Transcriptomics is one venue to examine biology from an informatics perspective. RNA sequencing has largely replaced the use of microarrays for whole genome gene expression studies. Recently, single cell RNA sequencing (scRNAseq) has unmasked population heterogeneity, offering unprecedented views into the inner workings of individual cells. scRNAseq is transforming our understanding of development, cellular identity, cell function, and disease. As a ‘Big Data,’ scRNAseq can be intimidating for students to conceptualize and analyze, yet it plays an increasingly important role in modern biology. To address these challenges, we created an engaging case study that guides students through an exploration of scRNAseq technologies. Students work in groups to explore external resources, manipulate authentic data and experience how single cell RNA transcriptomics can be used for personalized cancer treatment. This five-part case study is intended for upper-level life science majors and graduate students in genetics, bioinformatics, molecular biology, cell biology, biochemistry, biology, and medical genomics courses. The case modules can be completed sequentially, or individual parts can be separately adapted. The first module can also be used as a stand-alone exercise in an introductory biology course. Students need an intermediate mastery of Microsoft Excel but do not need programming skills. Assessment includes both students’ self-assessment of their learning as answers to previous questions are used to progress through the case study and instructor assessment of final answers. This case provides a practical exercise in the use of high-throughput data analysis to explore the molecular basis of cancer at the level of single cells.
Additional file 2 of scTyper: a comprehensive pipeline for the cell typing...
springernature.figshare.com
xlsx
Updated May 30, 2023
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Ji-Hye Choi; Hye In Kim; Hyun Goo Woo (2023). Additional file 2 of scTyper: a comprehensive pipeline for the cell typing analysis of single-cell RNA-seq data [Dataset]. http://doi.org/10.6084/m9.figshare.12762703.v1
Explore at:
xlsxAvailable download formats
Unique identifier
https://doi.org/10.6084/m9.figshare.12762703.v1
Dataset updated
May 30, 2023
Dataset provided by
figshare
Figsharehttp://figshare.com/
Authors
Ji-Hye Choi; Hye In Kim; Hyun Goo Woo
License
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Additional file 2: Supplementary Table 2–3. This file contains the list of cell markers in each of scTyper.db (Table S2) and CellMarker DB (Table S3) and detailed information such as identifier, study name, species, cell type, gene symbol, and PMID.
Data, R code and output Seurat Objects for single cell RNA-seq analysis of...
figshare.com
application/gzip
Updated May 31, 2023
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Yunshun Chen; Gordon Smyth (2023). Data, R code and output Seurat Objects for single cell RNA-seq analysis of human breast tissues [Dataset]. http://doi.org/10.6084/m9.figshare.17058077.v1
Explore at:
application/gzipAvailable download formats
Unique identifier
https://doi.org/10.6084/m9.figshare.17058077.v1
Dataset updated
May 31, 2023
Dataset provided by
figshare
Figsharehttp://figshare.com/
Authors
Yunshun Chen; Gordon Smyth
License
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This dataset contains all the Seurat objects that were used for generating all the figures in Pal et al. 2021 (https://doi.org/10.15252/embj.2020107333). All the Seurat objects were created under R v3.6.1 using the Seurat package v3.1.1. The detailed information of each object is listed in a table in Chen et al. 2021.

Facebook

Twitter

Click to copy link

Link copied

Cite

Juber Herrera-Uribe; Jayne Wiarda; Sathesh K. Sivasankaran; Lance Daharsh; Haibo Liu; Kristen A. Byrne; Timothy P. L. Smith; Joan K. Lunney; Crystal L. Loving; Christopher K. Tuggle (2025). Data from: Reference transcriptomics of porcine peripheral immune cells created through bulk and single-cell RNA sequencing [Dataset]. http://doi.org/10.15482/USDA.ADC/1522411

Data from: Reference transcriptomics of porcine peripheral immune cells created through bulk and single-cell RNA sequencing

Explore at:

zipAvailable download formats

Unique identifier

https://doi.org/10.15482/USDA.ADC/1522411

Dataset updated

Nov 21, 2025

Dataset provided by

Ag Data Commons

Authors

Juber Herrera-Uribe; Jayne Wiarda; Sathesh K. Sivasankaran; Lance Daharsh; Haibo Liu; Kristen A. Byrne; Timothy P. L. Smith; Joan K. Lunney; Crystal L. Loving; Christopher K. Tuggle

License

Attribution-ShareAlike 4.0 (CC BY-SA 4.0)https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically

Description

This dataset contains files reconstructing single-cell data presented in 'Reference transcriptomics of porcine peripheral immune cells created through bulk and single-cell RNA sequencing' by Herrera-Uribe & Wiarda et al. 2021. Samples of peripheral blood mononuclear cells (PBMCs) were collected from seven pigs and processed for single-cell RNA sequencing (scRNA-seq) in order to provide a reference annotation of porcine immune cell transcriptomics at enhanced, single-cell resolution. Analysis of single-cell data allowed identification of 36 cell clusters that were further classified into 13 cell types, including monocytes, dendritic cells, B cells, antibody-secreting cells, numerous populations of T cells, NK cells, and erythrocytes. Files may be used to reconstruct the data as presented in the manuscript, allowing for individual query by other users. Scripts for original data analysis are available at https://github.com/USDA-FSEPRU/PorcinePBMCs_bulkRNAseq_scRNAseq. Raw data are available at https://www.ebi.ac.uk/ena/browser/view/PRJEB43826. Funding for this dataset was also provided by NRSP8: National Animal Genome Research Program (https://www.nimss.org/projects/view/mrp/outline/18464). Resources in this dataset:Resource Title: Herrera-Uribe & Wiarda et al. PBMCs - All Cells 10X Format. File Name: PBMC7_AllCells.zipResource Description: Zipped folder containing PBMC counts matrix, gene names, and cell IDs. Files are as follows:

matrix of gene counts* (matrix.mtx.gx) gene names (features.tsv.gz) cell IDs (barcodes.tsv.gz)

*The ‘raw’ count matrix is actually gene counts obtained following ambient RNA removal. During ambient RNA removal, we specified to calculate non-integer count estimations, so most gene counts are actually non-integer values in this matrix but should still be treated as raw/unnormalized data that requires further normalization/transformation. Data can be read into R using the function Read10X().Resource Title: Herrera-Uribe & Wiarda et al. PBMCs - All Cells Metadata. File Name: PBMC7_AllCells_meta.csvResource Description: .csv file containing metadata for cells included in the final dataset. Metadata columns include:

nCount_RNA = the number of transcripts detected in a cell nFeature_RNA = the number of genes detected in a cell Loupe = cell barcodes; correspond to the cell IDs found in the .h5Seurat and 10X formatted objects for all cells prcntMito = percent mitochondrial reads in a cell Scrublet = doublet probability score assigned to a cell seurat_clusters = cluster ID assigned to a cell PaperIDs = sample ID for a cell celltypes = cell type ID assigned to a cellResource Title: Herrera-Uribe & Wiarda et al. PBMCs - All Cells PCA Coordinates. File Name: PBMC7_AllCells_PCAcoord.csvResource Description: .csv file containing first 100 PCA coordinates for cells. Resource Title: Herrera-Uribe & Wiarda et al. PBMCs - All Cells t-SNE Coordinates. File Name: PBMC7_AllCells_tSNEcoord.csvResource Description: .csv file containing t-SNE coordinates for all cells.Resource Title: Herrera-Uribe & Wiarda et al. PBMCs - All Cells UMAP Coordinates. File Name: PBMC7_AllCells_UMAPcoord.csvResource Description: .csv file containing UMAP coordinates for all cells.Resource Title: Herrera-Uribe & Wiarda et al. PBMCs - CD4 T Cells t-SNE Coordinates. File Name: PBMC7_CD4only_tSNEcoord.csvResource Description: .csv file containing t-SNE coordinates for only CD4 T cells (clusters 0, 3, 4, 28). A dataset of only CD4 T cells can be re-created from the PBMC7_AllCells.h5Seurat, and t-SNE coordinates used in publication can be re-assigned using this .csv file.Resource Title: Herrera-Uribe & Wiarda et al. PBMCs - CD4 T Cells UMAP Coordinates. File Name: PBMC7_CD4only_UMAPcoord.csvResource Description: .csv file containing UMAP coordinates for only CD4 T cells (clusters 0, 3, 4, 28). A dataset of only CD4 T cells can be re-created from the PBMC7_AllCells.h5Seurat, and UMAP coordinates used in publication can be re-assigned using this .csv file.Resource Title: Herrera-Uribe & Wiarda et al. PBMCs - Gamma Delta T Cells UMAP Coordinates. File Name: PBMC7_GDonly_UMAPcoord.csvResource Description: .csv file containing UMAP coordinates for only gamma delta T cells (clusters 6, 21, 24, 31). A dataset of only gamma delta T cells can be re-created from the PBMC7_AllCells.h5Seurat, and UMAP coordinates used in publication can be re-assigned using this .csv file.Resource Title: Herrera-Uribe & Wiarda et al. PBMCs - Gamma Delta T Cells t-SNE Coordinates. File Name: PBMC7_GDonly_tSNEcoord.csvResource Description: .csv file containing t-SNE coordinates for only gamma delta T cells (clusters 6, 21, 24, 31). A dataset of only gamma delta T cells can be re-created from the PBMC7_AllCells.h5Seurat, and t-SNE coordinates used in publication can be re-assigned using this .csv file.Resource Title: Herrera-Uribe & Wiarda et al. PBMCs - Gene Annotation Information. File Name: UnfilteredGeneInfo.txtResource Description: .txt file containing gene nomenclature information used to assign gene names in the dataset. 'Name' column corresponds to the name assigned to a feature in the dataset.Resource Title: Herrera-Uribe & Wiarda et al. PBMCs - All Cells H5Seurat. File Name: PBMC7.tarResource Description: .h5Seurat object of all cells in PBMC dataset. File needs to be untarred, then read into R using function LoadH5Seurat().

Clear search

Close search

Google apps

Main menu

Data from: Reference transcriptomics of porcine peripheral immune cells...

Data from: scRNA-seq Datasets

scRNA-seq + scATAC-seq Challenge at NeurIPS 2021

Context

Particular data

Related datasets:

Inspiration

Scripts and data for the paper: Consequences and opportunities arising due...

Data from: Single-cell RNA-seq data from Smart-seq2 sequencing of FACS...

Raw and processed (filtered and annotated) scRNAseq data

Bulk and single-cell RNA-seq data

Single-cell RNA sequencing data on primary samples from: Aberrant expression...

Data Repository: Single-cell mapper (scMappR): using scRNA-seq to infer...

Repository for Single Cell RNA Sequencing Analysis of The EMT6 Dataset

1. Main Description

File Descriptions

Linked Files

Installation and Instructions

Investigating Highly Variable Genes in Single-cell RNA-seq Data across...

scRNA-seq Human Pluripotent Stem Cells Messmer2019

Data and Context

Related datasets:

Inspiration

Data from: Single-cell RNA sequencing data and resources from blood and milk...

scRNA-seq Kolodziejczyk et al. (2015)

Data and Context

Related datasets:

Inspiration

CITE-seq=scRNA-seq+Proteins: Challenge NeurIPS2021

Context

Particular data

Related datasets:

Inspiration

Single-cell RNA sequencing data from 20 tumors

Single cell sequencing data from: The AML cellular state space unveils NPM1...

Single Cell Insights Into Cancer Transcriptomes: A Five-Part Single-Cell...

Additional file 2 of scTyper: a comprehensive pipeline for the cell typing...

Data, R code and output Seurat Objects for single cell RNA-seq analysis of...

Data from: Reference transcriptomics of porcine peripheral immune cells created through bulk and single-cell RNA sequencing