100+ datasets found

Data from: Reference transcriptomics of porcine peripheral immune cells...
agdatacommons.nal.usda.gov
datasets.ai
zip
Updated Nov 21, 2025
Cite
Juber Herrera-Uribe; Jayne Wiarda; Sathesh K. Sivasankaran; Lance Daharsh; Haibo Liu; Kristen A. Byrne; Timothy P. L. Smith; Joan K. Lunney; Crystal L. Loving; Christopher K. Tuggle (2025). Data from: Reference transcriptomics of porcine peripheral immune cells created through bulk and single-cell RNA sequencing [Dataset]. http://doi.org/10.15482/USDA.ADC/1522411
Available download formats: zip
Unique identifier
https://doi.org/10.15482/USDA.ADC/1522411
Dataset updated
Nov 21, 2025
Dataset provided by
Ag Data Commons
Authors
Juber Herrera-Uribe; Jayne Wiarda; Sathesh K. Sivasankaran; Lance Daharsh; Haibo Liu; Kristen A. Byrne; Timothy P. L. Smith; Joan K. Lunney; Crystal L. Loving; Christopher K. Tuggle
License
Attribution-ShareAlike 4.0 (CC BY-SA 4.0) https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
Description
This dataset contains files reconstructing single-cell data presented in 'Reference transcriptomics of porcine peripheral immune cells created through bulk and single-cell RNA sequencing' by Herrera-Uribe & Wiarda et al. 2021. Samples of peripheral blood mononuclear cells (PBMCs) were collected from seven pigs and processed for single-cell RNA sequencing (scRNA-seq) in order to provide a reference annotation of porcine immune cell transcriptomics at enhanced, single-cell resolution. Analysis of single-cell data allowed identification of 36 cell clusters that were further classified into 13 cell types, including monocytes, dendritic cells, B cells, antibody-secreting cells, numerous populations of T cells, NK cells, and erythrocytes. Files may be used to reconstruct the data as presented in the manuscript, allowing for individual query by other users. Scripts for original data analysis are available at https://github.com/USDA-FSEPRU/PorcinePBMCs_bulkRNAseq_scRNAseq. Raw data are available at https://www.ebi.ac.uk/ena/browser/view/PRJEB43826. Funding for this dataset was also provided by NRSP8: National Animal Genome Research Program (https://www.nimss.org/projects/view/mrp/outline/18464).

Resources in this dataset:

Resource Title: Herrera-Uribe & Wiarda et al. PBMCs - All Cells 10X Format.
File Name: PBMC7_AllCells.zip
Resource Description: Zipped folder containing PBMC counts matrix, gene names, and cell IDs. Files are as follows:

matrix of gene counts* (matrix.mtx.gz)
gene names (features.tsv.gz)
cell IDs (barcodes.tsv.gz)

*The ‘raw’ count matrix is actually gene counts obtained following ambient RNA removal. During ambient RNA removal, non-integer count estimates were requested, so most gene counts in this matrix are non-integer values; they should still be treated as raw/unnormalized data that requires further normalization/transformation. Data can be read into R using the function Read10X().

Resource Title: Herrera-Uribe & Wiarda et al. PBMCs - All Cells Metadata.
File Name: PBMC7_AllCells_meta.csv
Resource Description: .csv file containing metadata for cells included in the final dataset. Metadata columns include:

nCount_RNA = the number of transcripts detected in a cell
nFeature_RNA = the number of genes detected in a cell
Loupe = cell barcodes; correspond to the cell IDs found in the .h5Seurat and 10X formatted objects for all cells
prcntMito = percent mitochondrial reads in a cell
Scrublet = doublet probability score assigned to a cell
seurat_clusters = cluster ID assigned to a cell
PaperIDs = sample ID for a cell
celltypes = cell type ID assigned to a cell

Resource Title: Herrera-Uribe & Wiarda et al. PBMCs - All Cells PCA Coordinates.
File Name: PBMC7_AllCells_PCAcoord.csv
Resource Description: .csv file containing first 100 PCA coordinates for cells.

Resource Title: Herrera-Uribe & Wiarda et al. PBMCs - All Cells t-SNE Coordinates.
File Name: PBMC7_AllCells_tSNEcoord.csv
Resource Description: .csv file containing t-SNE coordinates for all cells.

Resource Title: Herrera-Uribe & Wiarda et al. PBMCs - All Cells UMAP Coordinates.
File Name: PBMC7_AllCells_UMAPcoord.csv
Resource Description: .csv file containing UMAP coordinates for all cells.

Resource Title: Herrera-Uribe & Wiarda et al. PBMCs - CD4 T Cells t-SNE Coordinates.
File Name: PBMC7_CD4only_tSNEcoord.csv
Resource Description: .csv file containing t-SNE coordinates for only CD4 T cells (clusters 0, 3, 4, 28). A dataset of only CD4 T cells can be re-created from the PBMC7_AllCells.h5Seurat, and t-SNE coordinates used in publication can be re-assigned using this .csv file.

Resource Title: Herrera-Uribe & Wiarda et al. PBMCs - CD4 T Cells UMAP Coordinates.
File Name: PBMC7_CD4only_UMAPcoord.csv
Resource Description: .csv file containing UMAP coordinates for only CD4 T cells (clusters 0, 3, 4, 28). A dataset of only CD4 T cells can be re-created from the PBMC7_AllCells.h5Seurat, and UMAP coordinates used in publication can be re-assigned using this .csv file.

Resource Title: Herrera-Uribe & Wiarda et al. PBMCs - Gamma Delta T Cells UMAP Coordinates.
File Name: PBMC7_GDonly_UMAPcoord.csv
Resource Description: .csv file containing UMAP coordinates for only gamma delta T cells (clusters 6, 21, 24, 31). A dataset of only gamma delta T cells can be re-created from the PBMC7_AllCells.h5Seurat, and UMAP coordinates used in publication can be re-assigned using this .csv file.

Resource Title: Herrera-Uribe & Wiarda et al. PBMCs - Gamma Delta T Cells t-SNE Coordinates.
File Name: PBMC7_GDonly_tSNEcoord.csv
Resource Description: .csv file containing t-SNE coordinates for only gamma delta T cells (clusters 6, 21, 24, 31). A dataset of only gamma delta T cells can be re-created from the PBMC7_AllCells.h5Seurat, and t-SNE coordinates used in publication can be re-assigned using this .csv file.

Resource Title: Herrera-Uribe & Wiarda et al. PBMCs - Gene Annotation Information.
File Name: UnfilteredGeneInfo.txt
Resource Description: .txt file containing gene nomenclature information used to assign gene names in the dataset. 'Name' column corresponds to the name assigned to a feature in the dataset.

Resource Title: Herrera-Uribe & Wiarda et al. PBMCs - All Cells H5Seurat.
File Name: PBMC7.tar
Resource Description: .h5Seurat object of all cells in PBMC dataset. File needs to be untarred, then read into R using function LoadH5Seurat().
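
As a rough illustration of how the metadata and coordinate files could be combined, the Python sketch below selects CD4 T cells by cluster ID (0, 3, 4, 28) and attaches coordinates keyed by barcode. The `Loupe` and `seurat_clusters` column names come from the metadata description above; the coordinate column names (`barcode`, `tSNE_1`, `tSNE_2`) and all toy rows are assumptions made for illustration and may not match the real files.

```python
import csv
import io

# Cluster IDs listed above for CD4 T cells (kept as strings, as csv yields text).
CD4_CLUSTERS = {"0", "3", "4", "28"}


def select_cells(meta_file, coord_file, clusters):
    """Return {barcode: (x, y)} for cells whose cluster ID is in `clusters`."""
    # Assumed coordinate layout: barcode, tSNE_1, tSNE_2 (hypothetical names).
    coords = {
        row["barcode"]: (float(row["tSNE_1"]), float(row["tSNE_2"]))
        for row in csv.DictReader(coord_file)
    }
    selected = {}
    for row in csv.DictReader(meta_file):
        barcode = row["Loupe"]  # barcode column named in the metadata description
        if row["seurat_clusters"] in clusters and barcode in coords:
            selected[barcode] = coords[barcode]
    return selected


# Toy stand-ins for the real CSV files (invented values, not real data).
meta_csv = io.StringIO(
    "Loupe,seurat_clusters,celltypes\n"
    "AAAC-1,0,typeA\n"
    "AAAG-1,6,typeB\n"
    "AAAT-1,28,typeA\n"
)
coord_csv = io.StringIO(
    "barcode,tSNE_1,tSNE_2\n"
    "AAAC-1,1.5,-2.0\n"
    "AAAT-1,0.25,3.0\n"
)

cd4 = select_cells(meta_csv, coord_csv, CD4_CLUSTERS)
print(sorted(cd4))
```

With the real files, the two `io.StringIO` objects would be replaced by opened file handles, after checking the actual header names.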
SCimilarity Tutorial Data
zenodo.org
data.niaid.nih.gov
bin
Updated Sep 4, 2024
Cite
Graham Heimberg; Tony Kuo; Nathaniel Diamant; Omar Salem; Héctor Corrada Bravo; Jason Vander Heiden (2024). SCimilarity Tutorial Data [Dataset]. http://doi.org/10.5281/zenodo.13685881
Available download formats: bin
Unique identifier
https://doi.org/10.5281/zenodo.13685881
Dataset updated
Sep 4, 2024
Dataset provided by
Zenodo http://zenodo.org/
Authors
Graham Heimberg; Tony Kuo; Nathaniel Diamant; Omar Salem; Héctor Corrada Bravo; Jason Vander Heiden
Description
SCimilarity is a unifying representation of single-cell expression profiles that quantifies similarity between expression states and generalizes to represent new studies without additional training. This enables a novel cell search capability, which sifts through millions of profiles to find cells similar to a query cell state and allows researchers to quickly and systematically leverage massive public scRNA-seq atlases to learn about a cell state of interest.

This repository contains public datasets for SCimilarity tutorials, specifically:

A subsample of single-cell data from Adams, et al. Science Advances, 2020 (GSE136831) as an AnnData object in h5ad format.

Terms of GSE136831:

Used with permission. Research developed by TLC4PF and the Yale School of Medicine led by Dr. Naftali Kaminski. © 2023 Pulmonary Fibrosis Cell Atlas website and associated content. All rights reserved. Please see the project website for more information: www.IPFCellAtlas.com

In addition, please cite https://www.science.org/doi/10.1126/sciadv.aba1983, and for a description of the website creation methodology please cite https://doi.org/10.1152/ajplung.00451.2020.

Single-Cell RNA Data Portal for Alzheimer's Disease

zenodo.org

zip

Updated Apr 30, 2025


Cite

Theodoros Siozos; Christos Petrou; ATHANASIOS BALOMENOS; Yannis Kopsinis (2025). Single-Cell RNA Data Portal for Alzheimer's Disease [Dataset]. http://doi.org/10.5281/zenodo.15295744

Available download formats: zip

Unique identifier

https://doi.org/10.5281/zenodo.15295744

Dataset updated

Apr 30, 2025

Dataset provided by

Zenodo http://zenodo.org/

Authors

Theodoros Siozos; Christos Petrou; ATHANASIOS BALOMENOS; Yannis Kopsinis

License

Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically

Description

Single-Cell RNA Data Portal for Alzheimer's Disease

The single-cell Alzheimer's Disease Data Portal is an aggregated data portal created as part of the Enfield EU-funded program for research on the single-cell Generative Pretrained Transformer (scGPT-AD) model. The data portal contains data from the ssREAD data portal, along with single-cell AD data from recent studies (dharsini et al, pan et al, rexach et al). The data from the individual studies were accessed through the cellXgene data portal, a vast portal for single-cell data. The data have been uploaded in two separate .zip files (part1, part2).

The single-cell data follow the Annotated Data (AnnData) format. The core data for each sample is the gene-expression matrix, which gives the expression level of each gene in each cell. Additionally, the dataset contains the `.obs` attribute, which includes core cell metadata for each sample (cell type, brain region, braak stage, donor age, disease condition, donor gender, etc.), along with the gene names, accessed via the `.var` attribute.

The source data have been processed to create a unified data portal ready to be used as training dataset for a Transformer model. The main processing steps were:

convert ssREAD data from `.qsave` format to `.h5ad` format that aligns with the AnnData framework
discard some unprocessable data samples
standardize metadata column names
process categorical data to create a unified namespace (e.g.: merge `microglia` and `microgrial` cell type names into one)
standardize all gene names to be upper-cased
discard dimensionality reduction and clustering attributes, to make a lightweight version of the data portal, since they are not meant to be used in Transformer model training
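
Three of the steps above (standardizing metadata column names, merging variant category labels, and upper-casing gene names) can be sketched in plain Python as follows. This is only an illustration under stated assumptions: the alias tables, the source column names, and the decision to lower-case labels before merging are hypothetical and are not taken from the portal's actual processing code.

```python
# Hypothetical mapping from variant spellings to one canonical label.
CELL_TYPE_ALIASES = {"microgrial": "microglia"}

# Hypothetical mapping from source-specific column names to a unified name.
COLUMN_ALIASES = {"cellType": "cell_type", "Cell.Type": "cell_type"}


def harmonize_record(record):
    """Rename metadata columns and merge variant cell type labels."""
    out = {COLUMN_ALIASES.get(key, key): value for key, value in record.items()}
    label = out.get("cell_type")
    if label is not None:
        # Assumption: labels are compared case-insensitively.
        label = label.strip().lower()
        out["cell_type"] = CELL_TYPE_ALIASES.get(label, label)
    return out


def standardize_gene_names(genes):
    """Upper-case every gene name."""
    return [gene.upper() for gene in genes]


# Toy records standing in for per-cell metadata from two sources.
records = [
    {"cellType": "Microglia", "donor": "D1"},
    {"Cell.Type": "microgrial", "donor": "D2"},
]
harmonized = [harmonize_record(r) for r in records]
genes = standardize_gene_names(["Apoe", "TREM2", "mt-Co1"])
print(harmonized)
print(genes)
```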

Aggregated Data Statistics

Total Cells	2.3M
AD Cells	1.2M
Control Cells	1.1M
Unique Genes	91k
Donors	166

Characteristics of Dataset grouped by Data Source

Columns: Data Source, Unique Genes, Total Cells, AD Cells, Control Cells, Donors, Cell Type Label, Brain Region, Tissue Type, Braak Stage, Donors Id, Donor Gender, Donor Age

rexach et al: 30k, 217k, 118k, 99k, ✅, ✘, ✅, ✘, ✅
pan et al: 61k, 43k, 11k, 32k, ✅
dharsini et al: 61k, 425k, 311k, 114k, ✅
ssREAD: 62k, 2.42M, 1.14M, 1.28M, 135, ✅, ✘, ✅


Ageing_Exercise_Single_Cell
figshare.com
application/gzip
Updated Jul 3, 2024
Cite
Solal Chauquet (2024). Ageing_Exercise_Single_Cell [Dataset]. http://doi.org/10.6084/m9.figshare.21959516.v1
Available download formats: application/gzip
Unique identifier
https://doi.org/10.6084/m9.figshare.21959516.v1
Dataset updated
Jul 3, 2024
Dataset provided by
Figshare http://figshare.com/
Authors
Solal Chauquet
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Single-cell RNA-seq dataset in RDS format, readable using the R programming language.
10X single-cell RNA sequencing of bone marrow cells from MDS-RS patients and...
researchdata.se
Updated Nov 6, 2023
Cite
Pedro Luis Moura; Eva Hellström-Lindberg (2023). 10X single-cell RNA sequencing of bone marrow cells from MDS-RS patients and healthy donors [Dataset]. http://doi.org/10.48723/nq2a-1e03
Available download formats: (1107), (5665)
Unique identifier
https://doi.org/10.48723/nq2a-1e03
Dataset updated
Nov 6, 2023
Dataset provided by
Karolinska Institutet
Authors
Pedro Luis Moura; Eva Hellström-Lindberg
License
Attribution-NonCommercial-NoDerivs 4.0 (CC BY-NC-ND 4.0) https://creativecommons.org/licenses/by-nc-nd/4.0/
License information was derived automatically
Description
This dataset consists of single-cell RNA sequencing data of bone marrow cells (CD34+ stem cells, GPA+ erythroblasts, ring sideroblasts and mononuclear cells) obtained from multiple healthy bone marrow donors and MDS-RS patients. The objective of this data collection was to assess several parameters on how the bone marrow of MDS-RS patients differs from that of healthy donors.

This dataset includes raw sequencing data in .fastq format, processed count matrices and associated pseudonymized metadata.

Processing: All samples were loaded onto Chromium Single Cell Chips (10x Genomics, CA, USA) at a target capture rate of 10,000 cells per sample. Single cell libraries were prepared using Chromium Next GEM Single Cell 3ʹ Kits v3.1 (10x Genomics) as per the manufacturer’s instructions, except 1µl additive ADT primers were added to the initial cDNA PCR amplification buffer and ADT libraries prepared as described in the Total-Seq B protocol (BioLegend) from the initial cDNA SPRI clean up. Libraries were pooled and sequenced on an Illumina NovaSeq 6000 (Illumina). Read pseudoalignment was performed against the GRCh38.p13 human genome assembly through kallisto v0.46.1 and bustools v0.40.0 was used for barcode and UMI counting.

The dataset consists of 2 folders:
- Processed_Count_Matrices
- Raw_FASTQ

And one xlsx file:
- Sample_key.xlsx

The folder Processed_Count_Matrices contains 1 rds file, 1 tsv file, 9 mtx files, and 18 txt files. The folder Raw_FASTQ contains 27 GNU zipped fastq files, and 5 txt files.

The documentation file File_list_10x.txt contains a full list of the files in the dataset.

The total size of the dataset is approximately 21 GB.
Multiple Single Cell RNA Expressions ARCHS4
kaggle.com
zip
Updated Jul 25, 2021
Cite
Alexander Chervov (2021). Multiple Single Cell RNA Expressions ARCHS4 [Dataset]. https://www.kaggle.com/alexandervc/multiple-single-cell-rna-expressions-archs4
Available download formats: zip (23319014182 bytes)
Dataset updated
Jul 25, 2021
Authors
Alexander Chervov
Description
Remark: for cell cycle analysis - see paper https://arxiv.org/abs/2208.05229 "Computational challenges of cell cycle analysis using single cell transcriptomics" Alexander Chervov, Andrei Zinovyev

Context

The dataset was downloaded from https://amp.pharm.mssm.edu/archs4/download.html. The methods are described in the Nature Communications paper: https://www.nature.com/articles/s41467-018-03751-6

ARCHS4 provides user-friendly access to gene expression data from the GEO database (https://www.ncbi.nlm.nih.gov/geo/). While most data in GEO is stored in raw formats, ARCHS4 provides prepared count-matrix expression data. While GEO stores data separately for each research paper, ARCHS4 collects all the information in a single matrix. Consult the main site for further information.

The main data files are in HDF5 (Hierarchical Data Format) file format: https://en.wikipedia.org/wiki/Hierarchical_Data_Format. They contain expression data as well as annotation data and further meta-information. There are several auxiliary files, such as a 3D t-SNE projection (in CSV format) and gene correlation matrices for human and mouse in Feather format.

Content

The main file (for human), human_matrix.h5, contains the data matrix (238522 samples by 35238 genes) as well as various meta-information: gene names, sample information (tissue, etc.), and references to the GEO database IDs where all the details can be found.

There is also similar data for mouse, along with CSV files for the t-SNE images and correlation matrices for genes.

Acknowledgements

The ARCHS4 project is by Alexander Lachmann (alexander.lachmann@mssm.edu); update: 2020-02-06.
Single cell sequencing data from: The AML cellular state space unveils NPM1...
researchdata.se
figshare.scilifelab.se
Updated Oct 7, 2025
Cite
Henrik Lilljebjörn; Thoas Fioretos (2025). Single cell sequencing data from: The AML cellular state space unveils NPM1 immune evasion subtypes with distinct clinical outcomes [Dataset]. http://doi.org/10.17044/SCILIFELAB.23715648
Unique identifier
https://doi.org/10.17044/SCILIFELAB.23715648
Dataset updated
Oct 7, 2025
Dataset provided by
Lund University
Authors
Henrik Lilljebjörn; Thoas Fioretos
License
Attribution-NonCommercial-NoDerivs 4.0 (CC BY-NC-ND 4.0) https://creativecommons.org/licenses/by-nc-nd/4.0/
License information was derived automatically
Description
This dataset contains 10X single cell 3' RNA sequencing gene expression data from 38 AML samples from the subtypes NPM1 (n=12), AML-MR (n=11), TP53 (n=7), CBFB::MYH11 (n=3), RUNX1::RUNX1T1 (n=3), AML without class defining mutations (n=1), and AML meeting the criteria for two subtypes (n=1). In addition, reference samples from normal bone marrow mononuclear cells (n=5) and CD34 sorted cells (n=3) are included. The single cell libraries were constructed from viably frozen cells from bone marrow (n=29+8) or peripheral blood (n=9) using the Chromium Single Cell 3' Library & Gel Bead Kit v3 (10X genomics) and sequenced on a Novaseq 6000 or NextSeq 500.

Data is available in h5 format for each sample, with raw count output from Cellranger, or as a processed Seurat object with scaled expression data, dimension reductions, and metadata.
Single-cell analysed data
data.ncl.ac.uk
zip
Updated Jun 24, 2025
Cite
Ioana Nicorescu (2025). Single-cell analysed data [Dataset]. http://doi.org/10.25405/data.ncl.28359179.v1
Available download formats: zip
Unique identifier
https://doi.org/10.25405/data.ncl.28359179.v1
Dataset updated
Jun 24, 2025
Dataset provided by
Newcastle University
Authors
Ioana Nicorescu
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This folder contains the following files and datasets:

Flow Cytometry Data
Individual FCS files - Raw data files obtained following segmentation
Analysis file (pre-transformation) - Data analysis file before transformation, compatible with FCS Express
Analysis file (post-transformation) - Data analysis file after transformation, compatible with FCS Express
DNS format files - Processed files analyzed following data transformation

Statistical Analysis and Figures
Manuscript figures - All figures from the manuscript in GraphPad Prism format, accessible with Numbers, including statistical test results

Data Extraction and Spatial Analysis
Cluster percentages - Excel file containing individual cluster percentages extracted from the analysis file
Spatial neighborhood data - Excel file with all data used as starting point for spatial neighborhood map generation
Spatial interaction maps - ZIP archive containing heatmaps showing spatial interactions between individual clusters

Please see the collection for related records https://doi.org/10.25405/data.ncl.c.7890872
FedscGen: privacy-aware federated batch effect correction of single-cell RNA...
zenodo.org
bin
Updated Jun 30, 2025
Cite
Mohammad Bakhtiari (2025). FedscGen: privacy-aware federated batch effect correction of single-cell RNA sequencing data -- Preprocessed datasets [Dataset]. http://doi.org/10.5281/zenodo.11489844
Available download formats: bin
Unique identifier
https://doi.org/10.5281/zenodo.11489844
Dataset updated
Jun 30, 2025
Dataset provided by
Zenodo http://zenodo.org/
Authors
Mohammad Bakhtiari
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Time period covered
Jun 5, 2024
Description
This dataset accompanies the publication "FedscGen: Privacy-Aware Federated Batch Effect Correction of Single-Cell RNA Sequencing Data" and includes eight single-cell RNA sequencing (scRNA-seq) datasets used to benchmark the FedscGen and scGen methods. The datasets are provided in .h5ad format and include comprehensive metadata necessary for replication and further analysis.

Datasets

We analyze various datasets to compare FedscGen against scGen (centralized) in terms of batch correction. For simplicity, we refer to the dataset by abbreviations:

Cell Line (CL):

Derived from the 293t_jurkat experiment with three batches: Zheng et al., 2017.

Human Dendritic Cells (HDC):

scRNA-seq data of human dendritic cells across two batches: Villani et al., 2017.

Human Pancreas (HP):

Consolidated data from five sources with 14,767 cells each: Baron et al., 2016; Muraro et al., 2016; Segerstolpe et al., 2016; Wang et al., 2016; Xin et al., 2016.

Mouse Brain (MB):

Merged datasets with 691,600 and 141,606 cells: Saunders et al., 2018; Rosenberg et al., 2018.

Mouse Cell Atlas (MCA):

Data focusing on 11 cell types from various organs: Han et al., 2018; The Tabula Muris Consortium, 2018.

Mouse Hematopoietic Stem and Progenitor Cells (MHSPC):

Data from SMART-seq2 and MARS-seq protocols: Nestorowa et al., 2016; Paul et al., 2015.

Mouse Retina (MR):

Data from two unassociated laboratories with 26,830 and 44,808 cells: Macosko et al., 2015; Shekhar et al., 2016.

PBMC (human Peripheral Blood Mononuclear Cell):

scRNA-seq data with two batches: Zheng et al., 2017.

Usage Notes: Each dataset is provided in .h5ad format, compatible with common single-cell analysis tools such as Scanpy. Detailed metadata is included within each file.

Keywords: Single-cell RNA sequencing, scRNA-seq, Batch effect correction, Privacy-aware, Federated learning, scGen, FedscGen, Clinical multi-center studies, Genomics, Bioinformatics

Contact: For questions or further information, please contact Mohammad Bakhtiari at mohammad.bakhtiari@uni-hamburg.de.

License: Creative Commons Attribution 4.0 International (CC BY 4.0)
Additional file 3 of Pooling across cells to normalize single-cell RNA...
springernature.figshare.com
txt
Updated Jun 9, 2023
Cite
Aaron L. Lun; Karsten Bach; John Marioni (2023). Additional file 3 of Pooling across cells to normalize single-cell RNA sequencing data with many zero counts [Dataset]. http://doi.org/10.6084/m9.figshare.c.3629252_D2.v1
Available download formats: txt
Unique identifier
https://doi.org/10.6084/m9.figshare.c.3629252_D2.v1
Dataset updated
Jun 9, 2023
Dataset provided by
Figshare http://figshare.com/
Authors
Aaron L. Lun; Karsten Bach; John Marioni
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Enriched GO terms for library size normalization. This file is in a tab-separated format and contains the top 200 GO terms that were enriched in the set of DE genes unique to library size normalization. The fields are the same as described for Additional file 2. (13 KB PDF)
gtex-single-cell-rnaseq
huggingface.co
Updated Nov 22, 2025
Cite
Lviv Polytechnic National University – Department of Artificial Intelligence Systems (2025). gtex-single-cell-rnaseq [Dataset]. https://huggingface.co/datasets/ai-department-lpnu/gtex-single-cell-rnaseq
Dataset updated
Nov 22, 2025
Dataset authored and provided by
Lviv Polytechnic National University – Department of Artificial Intelligence Systems
Description
GTEx Single-Cell RNA-seq Dataset

This repository provides tools to create a Hugging Face dataset from GTEx single-nucleus RNA-seq data, transforming the hierarchical H5AD format into a flat, ML-ready structure.

Overview

Data Source

The data comes from GTEx's snRNA-seq atlas:

Source: GTEx Portal
Publication: Eraslan et al., Science 2022 - "Single-nucleus cross-tissue molecular reference maps toward understanding disease gene function"
Content: 209,126… See the full description on the dataset page: https://huggingface.co/datasets/ai-department-lpnu/gtex-single-cell-rnaseq.
utility: Collection of Tumor-Infiltrating Lymphocyte Single-Cell Experiments...
zenodo.org
Updated Jan 9, 2026
Share
Facebook
Twitter
Email
Click to copy link
Link copied
Cite
Nicholas Borcherding (2026). utility: Collection of Tumor-Infiltrating Lymphocyte Single-Cell Experiments with TCR [Dataset]. http://doi.org/10.5281/zenodo.17977149
Unique identifier
https://doi.org/10.5281/zenodo.17977149
Dataset updated
Jan 9, 2026
Dataset provided by
Zenodo http://zenodo.org/
Authors
Nicholas Borcherding
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Time period covered
Dec 11, 2025
Description
uTILity is a comprehensive, harmonized collection of publicly available single-cell RNA sequencing data from tumor-infiltrating T cells (TILs) with paired T cell receptor (TCR) sequencing. This resource aggregates data from 28 published studies spanning 13 tissue types, 420 unique patients, and over 2.6 million cells, with 1.8 million cells having associated TCR information.

Data Processing

All datasets were uniformly processed using the following pipeline:

Quality Control: Cells with >10% mitochondrial gene content and/or a number of features more than 2.5 standard deviations from the mean were excluded. Doublets were identified using scDblFinder.

Annotation: Automated cell type annotation was performed using:

SingleR with Human Primary Cell Atlas (HPCA) and Monaco reference datasets

Azimuth with the PBMC reference (providing L1, L2, and L3 annotations)

TCR Integration: T cell receptor data was processed using scRepertoire, with clonotypes assigned based on CDR3 amino acid sequences and gene usage.
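
The quality-control thresholds in the pipeline above can be illustrated with a small standard-library Python sketch. It is not the uTILity pipeline itself: the field names (`n_features`, `pct_mito`), the use of the sample standard deviation, and the toy cells are all assumptions made for illustration.

```python
import statistics


def qc_filter(cells, max_mito_pct=10.0, n_sd=2.5):
    """Keep cells with mito % <= max_mito_pct and a feature count within n_sd SDs of the mean."""
    features = [cell["n_features"] for cell in cells]
    mean = statistics.mean(features)
    # Sample SD is an assumption; a population SD is an equally plausible reading.
    sd = statistics.stdev(features)
    low, high = mean - n_sd * sd, mean + n_sd * sd
    return [
        cell
        for cell in cells
        if cell["pct_mito"] <= max_mito_pct and low <= cell["n_features"] <= high
    ]


# Toy cells (invented values): ten ordinary cells, one high-mito cell, one feature outlier.
cells = [{"id": f"c{i}", "n_features": 1000, "pct_mito": 2.0} for i in range(10)]
cells.append({"id": "high_mito", "n_features": 1000, "pct_mito": 15.0})
cells.append({"id": "outlier", "n_features": 5000, "pct_mito": 2.0})

kept = qc_filter(cells)
kept_ids = {cell["id"] for cell in kept}
print(len(kept), "cells kept")
```

Note that with very few cells a single outlier inflates the standard deviation enough that it can never exceed 2.5 SDs, so a threshold of this kind is only meaningful on reasonably large samples.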

Contents

This archive contains:

Seurat Objects (.rds): Fully processed R objects with gene expression, cell metadata, dimensional reductions, and TCR annotations

AnnData Files (.h5ad): Python-compatible exports for use with scanpy, scvi-tools, and related ecosystems

Processed Data: Intermediate files and per-cohort objects for users who wish to work with individual studies

Cancer Types Represented

Breast, Colorectal, Lung, Melanoma, Renal, Ovarian, HNSCC, Esophageal, Biliary, Endometrial, Merkel Cell, and multi-cancer cohorts.

Tissue Types

Tumor, Normal adjacent tissue, Peripheral blood, Lymph node, Metastatic lesions, and Juxtatumoral tissue.

Usage

This data is intended for researchers studying tumor immunology, T cell biology, and computational methods for single-cell analysis. Users can leverage the harmonized annotations and TCR data for:

Pan-cancer T cell phenotype analysis

TCR repertoire studies across cancer types

Benchmarking integration and annotation methods

Training and validating machine learning models

For analysis code and the processing pipeline, see the associated GitHub repository.

File Formats

.h5ad (Hierarchical Data Format) AnnData objects compatible with the Python single-cell ecosystem.

X: Raw count matrix (sparse CSR)

obs: Cell metadata

var: Gene metadata

obsm: Embeddings (PCA, UMAP, HARMONY, etc.)

Load in Python with:

import scanpy as sc
adata = sc.read_h5ad("adata.h5ad")

Load in R with:

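library(Seurat)
# readRDS() reads .rds files; point it at one of the archive's Seurat .rds objects rather than an .h5ad file
obj <- as.Seurat(readRDS("adata.h5ad"))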

Metadata Columns

See metadata_headers.txt in the GitHub repository for complete descriptions: https://github.com/ncborcherding/utility/blob/main/summary/metadata_headers.txt

Key columns:

orig.ident: Sample identifier (tumor type + tissue)

predicted.celltype.l1/l2/l3: Azimuth annotations

Monaco.labels / HPCA.labels: SingleR annotations

CTaa: Clonotype by CDR3 amino acid sequence

clonalFrequency: Clone count within sample

clonalProportion: Clone proportion within sample

SUGGESTED CITATION FORMAT

Borcherding, N. (2025). uTILity: Comprehensive Single-Cell Tumor-Infiltrating Lymphocyte Data with Paired TCR Sequencing (Version 1.0.0) [Dataset]. Zenodo. https://doi.org/10.5281/zenodo.10211240
Single cell data
resodate.org
Updated Feb 2, 2023
Cite
Stefan Bohn; Lorenz Hexemer; Zixin Huang; Laura Strohmaier; Sonja Lenhardt; Stefan Legewie; Alexander Loewer (2023). Single cell data [Dataset]. https://resodate.org/resources/aHR0cHM6Ly90dWRhdGFsaWIudWxiLnR1LWRhcm1zdGFkdC5kZS9oYW5kbGUvdHVkYXRhbGliLzM3MjUuMg==
Dataset updated
Feb 2, 2023
Dataset provided by
Technische Universität Darmstadt
TUdatalib
Authors
Stefan Bohn; Lorenz Hexemer; Zixin Huang; Laura Strohmaier; Sonja Lenhardt; Stefan Legewie; Alexander Loewer
Description
Time-resolved analysis of nuclear-to-cytoplasmic SMAD2 ratio in individual cells. For some datasets, data regarding motility and cell death is included as well.

Data is provided in CSV format and generally organized in time points (rows) and individual cells (columns). For each experiment, several files are provided:

_data.csv - nuc/cyt SMAD2 ratio
_conditions.csv - labeling of experimental conditions
_map.csv - vector mapping individual cells to experimental conditions; numbering follows the order given in the corresponding _conditions.csv file
_timeLine.csv - time points for measurements, given in minutes
_motility.csv - distance moved per time point, given in µm/h
_division.csv - number of divisions for each cell
_fractiondead.csv - fraction of dead cells per field of view; please note that this data is not resolved at the single-cell level!
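
To show how these files relate to one another, here is a minimal Python sketch that averages the SMAD2 ratio per condition at each time point. Several details are assumptions rather than facts about the dataset: that the files have no header rows, that _map.csv holds one condition index per cell in column order, and that the indices are 1-based (as would be natural for MATLAB). The toy values and condition names are invented.

```python
import csv
import io
from collections import defaultdict


def mean_ratio_by_condition(data_file, map_file, conditions_file):
    """Return {condition: [mean ratio at each time point]}."""
    conditions = [row[0] for row in csv.reader(conditions_file) if row]
    # Assumed: one 1-based condition index per cell, in the column order of _data.csv.
    cell_to_cond = [
        int(float(value))
        for row in csv.reader(map_file)
        for value in row
        if value.strip()
    ]
    result = defaultdict(list)
    for row in csv.reader(data_file):  # one row per time point
        if not row:
            continue
        sums = defaultdict(float)
        counts = defaultdict(int)
        for value, cond_index in zip(row, cell_to_cond):
            name = conditions[cond_index - 1]
            sums[name] += float(value)
            counts[name] += 1
        for name in conditions:
            if counts[name]:
                result[name].append(sums[name] / counts[name])
    return dict(result)


# Toy stand-ins: 2 time points (rows) x 3 cells (columns).
data_csv = io.StringIO("1.0,2.0,3.0\n2.0,4.0,6.0\n")
map_csv = io.StringIO("1,1,2\n")
conditions_csv = io.StringIO("condition_A\ncondition_B\n")

means = mean_ratio_by_condition(data_csv, map_csv, conditions_csv)
print(means)
```

Real files may contain missing values or headers, which this sketch does not handle.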

The MATLAB script "ReproduceFigures.m" reproduces most data panels from the publication and should help guide you through the data. Effect sizes need to be calculated separately using the function "permTest.m" and the parameters given in the publication.
CellxGene-1K
kaggle.com
zip
Updated Apr 5, 2025
Cite
Darien Schettler (2025). CellxGene-1K [Dataset]. https://www.kaggle.com/datasets/dschettler8845/cellxgene-1k/data
Available download formats: zip (64274758 bytes)
Dataset updated
Apr 5, 2025
Authors
Darien Schettler
License
MIT License https://opensource.org/licenses/MIT
License information was derived automatically
Description
CellXGene-1K

GCS PATH: gs://kds-2dfa91b267e9146f17786893547814ae5688af7ddeab756631a60ffa

DATASET OVERVIEW

A curated dataset of approximately 7,000 healthy human single cells (approx. 1,000 per tissue) sourced from the CellXGene Census, covering seven major tissues: heart, blood, brain, lung, kidney, intestine, and pancreas.

This is 1 of 4 datasets focusing on providing progressively larger, ready-to-use collections of healthy human single-cell RNA sequencing data in the H5AD format.

The goal is to offer standardized benchmarks/datasets derived from CellXGene for exploring fundamental scRNA-seq analysis, understanding multi-tissue cellular composition, developing and testing computational models, and evaluating method scalability across different orders of magnitude.

With its manageable size (approx. 7k total cells), this specific dataset serves as an excellent starting point for exploration, initial model development, or educational purposes.

Additional Information

This dataset provides a focused collection of single-cell transcriptomic profiles representing healthy human tissues, curated from the comprehensive CZ CELLxGENE Discover Census (CellXGene) from the latest (Jan 2025) stable release.

It includes data exclusively from Homo sapiens cells annotated as 'normal' or 'healthy' and in 'cell' suspension. The dataset is specifically balanced to contain approximately 1,000 cells from each of the following seven vital tissues: heart, blood, brain, lung, kidney, intestine, and pancreas.

With a total size of roughly 7,000 cells, this collection offers a manageable yet diverse snapshot of baseline cellular states across different organ systems. It is well-suited for comparative analyses of healthy cell types and gene expression signatures across these tissues, for benchmarking computational analysis tools on a multi-tissue dataset, or for educational exploration of single-cell data principles. This subset provides a representative sample while reducing the computational burden associated with analyzing the full CellXGene Census.
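The per-tissue balancing described above can be illustrated with a small stratified subsampling sketch. This is not the author's curation code; the function name, the toy cell IDs, and the fixed random seed are all assumptions made for the example, and a real workflow would draw cells from the CellXGene Census rather than from a generated list.

```python
import random
from collections import Counter, defaultdict


def balanced_subsample(cells, per_group, seed=0):
    """Pick up to `per_group` cells from every tissue.

    `cells` is a list of (cell_id, tissue) pairs. Tissues with fewer
    cells than `per_group` contribute everything they have.
    """
    rng = random.Random(seed)
    by_tissue = defaultdict(list)
    for cell_id, tissue in cells:
        by_tissue[tissue].append(cell_id)

    picked = []
    for tissue in sorted(by_tissue):
        ids = by_tissue[tissue]
        k = min(per_group, len(ids))
        picked.extend((cell_id, tissue) for cell_id in rng.sample(ids, k))
    return picked


# Toy input: the seven tissues named above, with unequal (invented) cell counts.
tissues = ["heart", "blood", "brain", "lung", "kidney", "intestine", "pancreas"]
toy_cells = [
    (f"{tissue}_{i}", tissue)
    for n, tissue in enumerate(tissues)
    for i in range(50 + 10 * n)
]

subset = balanced_subsample(toy_cells, per_group=20)
counts = Counter(tissue for _, tissue in subset)
```

Sampling without replacement inside each tissue keeps every tissue equally represented regardless of how many cells it had in the source.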
Single-cell Spatial Transcriptomics Data with Paired RNAseq for TISSUE...
zenodo.org
data.niaid.nih.gov
+1more
application/gzip, zip
Updated Jan 8, 2024
Cite
Eric Sun; Eric Sun (2024). Single-cell Spatial Transcriptomics Data with Paired RNAseq for TISSUE spatial gene expression prediction [Dataset]. http://doi.org/10.5281/zenodo.8259942
Explore at:
Available download formats: application/gzip, zip
Unique identifier
https://doi.org/10.5281/zenodo.8259942
Dataset updated
Jan 8, 2024
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Eric Sun; Eric Sun
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Dataset folders from "TISSUE: uncertainty-calibrated prediction of single-cell spatial transcriptomics improves downstream analyses". If using the processed data or TISSUE algorithm, please cite: https://doi.org/10.1101/2023.04.25.538326.

The directory of datasets is compressed in tar gzip format. The top level contains one folder per dataset, named after the dataset, and each folder holds the relevant data files, which include:

- Spatial_count.txt --- a tab-delimited file containing spatial transcriptomics counts matrix

- scRNA_count.txt --- a tab-delimited file containing RNAseq counts matrix

- Locations.txt --- a tab-delimited file containing the (x,y) spatial coordinates of cells in the spatial transcriptomics data

- Metadata.txt --- for some datasets, this is a comma-separated file containing the metadata table for the spatial transcriptomics data

These files are formatted and organized to be read into AnnData objects using the native loading functions in the TISSUE package (https://github.com/sunericd/TISSUE). Some folders will also have additional accessory files such as gene lists corresponding to some experiments present in our manuscript and/or adjacency matrix objects.
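For readers who want to inspect the tab-delimited files without the TISSUE package, a minimal standard-library sketch follows. It assumes a layout that the description does not state: a header row (gene names in the counts file, x and y in the locations file) and one row per cell, in the same order in both files. The gene names and values are invented toy data.

```python
import csv
import io

# Invented stand-ins for Spatial_count.txt and Locations.txt.
spatial_count_txt = "GeneA\tGeneB\tGeneC\n3\t0\t1\n0\t2\t5\n"
locations_txt = "x\ty\n10.5\t20.0\n11.0\t18.5\n"


def read_tsv(text):
    """Split a tab-delimited table into its header and numeric rows."""
    rows = [row for row in csv.reader(io.StringIO(text), delimiter="\t") if row]
    header, body = rows[0], rows[1:]
    return header, [[float(value) for value in row] for row in body]


genes, counts = read_tsv(spatial_count_txt)
_, coords = read_tsv(locations_txt)

# Both files are assumed to list cells in the same order.
if len(counts) != len(coords):
    raise ValueError("counts and locations describe different numbers of cells")

cells = [
    {"xy": tuple(xy), "counts": dict(zip(genes, row))}
    for xy, row in zip(coords, counts)
]
```

If a file turns out to have no header row or a transposed layout, only `read_tsv` needs to change.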

Also included are the two simulated spatial transcriptomics datasets that we generated using SRTsim.

The SVZ folders contain our processed MERFISH spatial transcriptomics dataset on the adult mouse subventricular zone. Refer to the SVZFullFinal folder for the full dataset with TISSUE-informed cell labels. All other folders are processed data accessed from publicly available sources. The identity of numbered folders can be found in the Data Availability statement of the benchmarking paper from which they were retrieved: https://doi.org/10.1038/s41592-022-01480-9

"svz_merfish_data.zip" includes the raw MERFISH dataset on the adult mouse subventricular zone.
Processed Chromium Single Cell GEX, CSP and VDJ data from intestinal plasma...
ega-archive.org
Updated Apr 18, 2024
Cite
(2024). Processed Chromium Single Cell GEX, CSP and VDJ data from intestinal plasma cells of untreated celiac disease patients [Dataset]. https://ega-archive.org/datasets/EGAD50000000339
Explore at:
Dataset updated
Apr 18, 2024
License
https://ega-archive.org/dacs/EGAC50000000162
Description
The dataset contains processed sequencing data from Chromium Single Cell 5’ gene expression, human B cell VDJ and feature barcode (CSP) sequencing from transglutaminase 2-specific and other small intestinal plasma cells isolated from four untreated celiac disease patients. The raw sequencing data has been processed with Cell Ranger v.6.0.2 with the multi and aggr functions using the pre-built Cell Ranger references GRCh38 version 2020-A for gene expression and GRCh38-alts-ensembl-5.0.0 for V(D)J analysis. The dataset consists of a gene expression and antibody capture expression matrix (cell barcodes and feature names in tsv.gz file, expression matrix in mtx.gz file) and VDJ sequences in AIRR format (csv file). A metadata file (csv file) details cells passing our custom quality control based on number of detected genes, UMIs, mitochondrial genes, immunoglobulin genes and a productively rearranged immunoglobulin heavy chain of the IgA isotype.
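The matrix layout described above (barcodes and feature names in tsv.gz files, values in an mtx.gz file) can be read with the Python standard library alone. The sketch below is an illustration rather than the submitters' workflow: the byte strings are invented toy data, and it relies on the Cell Ranger convention that matrix rows are features and columns are cell barcodes. In practice a dedicated reader (for example in Seurat or Scanpy) would normally be used.

```python
import gzip
import io


def gz_lines(raw_bytes):
    """Decompress gzip bytes and return the text lines without newlines."""
    with gzip.open(io.BytesIO(raw_bytes), "rt") as handle:
        return [line.rstrip("\n") for line in handle]


def read_mtx_gz(raw_bytes):
    """Parse a gzip-compressed Matrix Market coordinate file.

    Returns (n_rows, n_cols, entries), where entries maps zero-based
    (row, column) pairs to values. Indices in the file are 1-based.
    """
    lines = [ln for ln in gz_lines(raw_bytes) if ln and not ln.startswith("%")]
    n_rows, n_cols, n_entries = (int(x) for x in lines[0].split())
    entries = {}
    for ln in lines[1:]:
        i, j, value = ln.split()
        entries[(int(i) - 1, int(j) - 1)] = float(value)
    if len(entries) != n_entries:
        raise ValueError("entry count does not match the size line")
    return n_rows, n_cols, entries


# Invented toy files: 3 features x 2 cells.
toy_matrix = gzip.compress(
    b"%%MatrixMarket matrix coordinate integer general\n"
    b"3 2 3\n1 1 4\n3 1 1\n2 2 7\n"
)
toy_features = gzip.compress(
    b"ID_A\tGeneA\tGene Expression\n"
    b"ID_B\tGeneB\tGene Expression\n"
    b"ID_C\tTagC\tAntibody Capture\n"
)
toy_barcodes = gzip.compress(b"AAAC-1\nAAAG-1\n")

features = [ln.split("\t")[1] for ln in gz_lines(toy_features) if ln]
barcodes = [ln for ln in gz_lines(toy_barcodes) if ln]
n_rows, n_cols, entries = read_mtx_gz(toy_matrix)
assert (n_rows, n_cols) == (len(features), len(barcodes))

# Sum the counts in each column to get a per-cell total.
totals = {barcode: 0.0 for barcode in barcodes}
for (row, col), value in entries.items():
    totals[barcodes[col]] += value
```

Values are parsed as floats so the same reader also handles matrices that store non-integer counts.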
Single cell RNA-sequencing of Spike-ins and mESC using STRT-Seq on C1 System...
ebi.ac.uk
Updated Mar 14, 2017
+ more versions
Cite
Guy Emerton; Valentine Svensson (2017). Single cell RNA-sequencing of Spike-ins and mESC using STRT-Seq on C1 System [Dataset]. https://www.ebi.ac.uk/arrayexpress/experiments/E-MTAB-5482/
Explore at:
Dataset updated
Mar 14, 2017
Authors
Guy Emerton; Valentine Svensson
Description
In this study, we assess technical differences between commonly used single-cell RNA-sequencing (scRNA-seq) methods. We perform scRNA-seq on a homogeneous population of mouse embryonic stem cells along with two kinds of control spike-in molecules to assess the sensitivity and accuracy of these methods. In this dataset, we apply the STRT-seq method on the Fluidigm C1 system and generate single-cell libraries using the Nextera XT kit. Please note that the sample-data relationship format (SDRF) file for this submission contains only a high-level representation of all sample, library and run information, and not per-cell information. For metadata at the level of individual cells, please refer to the supplementary file single_cells_list.txt, which is included as part of this ArrayExpress submission.
Smart-seq3 and Smart-seq3xpress single-cell RNA sequencing of bone marrow...
researchdata.se
Updated Nov 6, 2023
+ more versions
Cite
Pedro Luis Moura; Eva Hellström-Lindberg (2023). Smart-seq3 and Smart-seq3xpress single-cell RNA sequencing of bone marrow cells from MDS-RS patients [Dataset]. http://doi.org/10.48723/0f0c-p816
Explore at:
Available download formats: (825), (1600), (830)
Unique identifier
https://doi.org/10.48723/0f0c-p816
Dataset updated
Nov 6, 2023
Dataset provided by
Karolinska Institutet
Authors
Pedro Luis Moura; Eva Hellström-Lindberg
License
Attribution-NonCommercial-NoDerivs 4.0 (CC BY-NC-ND 4.0): https://creativecommons.org/licenses/by-nc-nd/4.0/
License information was derived automatically
Description
This dataset consists of Smart-seq3 single-cell RNA sequencing data of purified RS from the bone marrow and peripheral blood of 2 MDS-RS patients; and Smart-seq3xpress single-cell RNA sequencing data of FACS-sorted hematopoietic stem cells (HSC), multipotent progenitors (MPP), megakaryocyte-erythroid progenitors (MEP) and erythroblasts from 1 MDS-RS patient. The objective of this data collection was to assess several parameters on how the bone marrow of MDS-RS patients differs from that of healthy donors.

This dataset includes raw sequencing data in .fastq format, processed count matrices and associated pseudonymized metadata.

Processing: In brief, cells were sorted into 384-well plates containing 3 µL Vapor-Lock (Qiagen) and 0.3 µL lysis buffer consisting of 0.125 µM OligodT30VN (5'-Biotin-ACGAGCATCAGCAGCATACGAT30VN-3'; IDT) adjusted to reverse transcription (RT) volume, 0.5 mM dNTPs each adjusted to RT volume, 0.1% Triton X-100, 5% PEG8000 adjusted to RT volume, and 0.4 U RNase Inhibitor (Takara Bio, 40 U/µL). After cell sorting, plates were briefly centrifuged before storage at -80 °C. Before RT, plates were denatured at 72 °C for 10 min, followed by addition of 0.1 µL of RT mix: 25 mM Tris-HCl pH 8.4 (Fisher Scientific), 30 mM NaCl (Ambion), 1 mM GTP (Thermo Fisher Scientific), 2.5 mM MgCl2 (Ambion), 8 mM DTT (Thermo Fisher Scientific), 0.25 U/µl RNase Inhibitor (Takara Bio), 0.75 µM Template Switching Oligo (TSO) (5′-Biotin-AGAGACAGATTGCGCAATGNNNNNNNNWWrGrGrG-3′; IDT) and 2 U/µl of Maxima H Minus reverse transcriptase (Thermo Fisher Scientific). Plates were quickly centrifuged after dispensing to ensure that the lysis and RT volumes merged. RT was incubated at 42 °C for 90 minutes, followed by ten cycles of 50 °C for 2 minutes and 42 °C for 2 minutes. After RT, 0.6 µL PCR mix was dispensed to each well, containing the following: 1× SeqAmp PCR buffer (Takara Bio), 0.025 U/µl of SeqAmp polymerase (Takara Bio) and 0.5 µM Smartseq3 forward and reverse primer. Plates were quickly spun down before being incubated as follows: 1 minute at 95 °C for initial denaturation, 14 cycles of 10 seconds at 98 °C, 30 seconds at 65 °C and 2–6 minutes at 68 °C. Final elongation was performed for 10 minutes at 72 °C.

The dataset consists of 2 folders:
- SS3_FACS_PB-BM_RS
- SS3xpress_FACS_HSC_MPP_MEP_EB

The folder SS3_FACS_PB-BM_RS contains 1 rds file, 3 txt files, and 1 compressed folder (tar.gz) with fastq files. The folder SS3xpress_FACS_HSC_MPP_MEP_EB contains 1 rds file, 7 txt files, and 2 GNU zipped fastq files.

The documentation file File_list_SS3_SS3xpress.txt contains a full list of the files in the dataset.
Scanpy Pipeline GSE145926 HDF5 Ingestion Plotly
kaggle.com
zip
Updated Dec 4, 2025
Cite
Dr. Nagendra (2025). Scanpy Pipeline GSE145926 HDF5 Ingestion Plotly [Dataset]. https://www.kaggle.com/datasets/mannekuntanagendra/scanpy-pipeline-gse145926-hdf5-ingestion-plotly
Explore at:
Available download formats: zip (4663836 bytes)
Dataset updated
Dec 4, 2025
Authors
Dr. Nagendra
License
MIT License (https://opensource.org/licenses/MIT)
License information was derived automatically
Description
This dataset contains single-cell RNA sequencing (scRNA-seq) data processed using the Scanpy pipeline.

It focuses on the GSE145926 dataset from publicly available sources.

The data has been ingested and stored in HDF5 format for easy access and manipulation.

It includes pre-processed expression matrices suitable for downstream analysis.

The dataset enables exploratory analysis using Plotly interactive visualizations.

It allows researchers to examine gene expression patterns at single-cell resolution.

Includes metadata annotations for cell types and experimental conditions.

Facilitates differential expression analysis and cell clustering investigations.

Supports visualization of key immune markers such as CD3E across cell populations.

Designed for bioinformaticians, computational biologists, and immunology researchers.

Provides an end-to-end demonstration of Scanpy workflow in Python.

Enables reproducibility and further expansion for custom analyses.
MOESM11 of Benchmarking principal component analysis for large-scale...
springernature.figshare.com
application/x-gzip
Updated May 30, 2023
Cite
Koki Tsuyuzaki; Hiroyuki Sato; Kenta Sato; Itoshi Nikaido (2023). MOESM11 of Benchmarking principal component analysis for large-scale single-cell RNA-sequencing [Dataset]. http://doi.org/10.6084/m9.figshare.11662101.v1
Explore at:
Available download formats: application/x-gzip
Unique identifier
https://doi.org/10.6084/m9.figshare.11662101.v1
Dataset updated
May 30, 2023
Dataset provided by
Figshare (http://figshare.com/)
Authors
Koki Tsuyuzaki; Hiroyuki Sato; Kenta Sato; Itoshi Nikaido
License
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Additional file 11: Pair plots of all the PCA (Brain) implementations.

Cite

Juber Herrera-Uribe; Jayne Wiarda; Sathesh K. Sivasankaran; Lance Daharsh; Haibo Liu; Kristen A. Byrne; Timothy P. L. Smith; Joan K. Lunney; Crystal L. Loving; Christopher K. Tuggle (2025). Data from: Reference transcriptomics of porcine peripheral immune cells created through bulk and single-cell RNA sequencing [Dataset]. http://doi.org/10.15482/USDA.ADC/1522411

Data from: Reference transcriptomics of porcine peripheral immune cells created through bulk and single-cell RNA sequencing

Explore at:

Available download formats: zip

Unique identifier

https://doi.org/10.15482/USDA.ADC/1522411

Dataset updated

Nov 21, 2025

Dataset provided by

Ag Data Commons

Authors

Juber Herrera-Uribe; Jayne Wiarda; Sathesh K. Sivasankaran; Lance Daharsh; Haibo Liu; Kristen A. Byrne; Timothy P. L. Smith; Joan K. Lunney; Crystal L. Loving; Christopher K. Tuggle

License

Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically

Description

This dataset contains files reconstructing single-cell data presented in 'Reference transcriptomics of porcine peripheral immune cells created through bulk and single-cell RNA sequencing' by Herrera-Uribe & Wiarda et al. 2021. Samples of peripheral blood mononuclear cells (PBMCs) were collected from seven pigs and processed for single-cell RNA sequencing (scRNA-seq) in order to provide a reference annotation of porcine immune cell transcriptomics at enhanced, single-cell resolution. Analysis of single-cell data allowed identification of 36 cell clusters that were further classified into 13 cell types, including monocytes, dendritic cells, B cells, antibody-secreting cells, numerous populations of T cells, NK cells, and erythrocytes. Files may be used to reconstruct the data as presented in the manuscript, allowing for individual query by other users. Scripts for original data analysis are available at https://github.com/USDA-FSEPRU/PorcinePBMCs_bulkRNAseq_scRNAseq. Raw data are available at https://www.ebi.ac.uk/ena/browser/view/PRJEB43826. Funding for this dataset was also provided by NRSP8: National Animal Genome Research Program (https://www.nimss.org/projects/view/mrp/outline/18464).

Resources in this dataset:

Resource Title: Herrera-Uribe & Wiarda et al. PBMCs - All Cells 10X Format.
File Name: PBMC7_AllCells.zip
Resource Description: Zipped folder containing PBMC counts matrix, gene names, and cell IDs. Files are as follows:

matrix of gene counts* (matrix.mtx.gz)
gene names (features.tsv.gz)
cell IDs (barcodes.tsv.gz)

*The ‘raw’ count matrix actually contains gene counts obtained after ambient RNA removal. During ambient RNA removal, we chose to calculate non-integer count estimates, so most gene counts in this matrix are non-integer values. They should still be treated as raw/unnormalized data that requires further normalization/transformation. Data can be read into R using the function Read10X().

Resource Title: Herrera-Uribe & Wiarda et al. PBMCs - All Cells Metadata.
File Name: PBMC7_AllCells_meta.csv
Resource Description: .csv file containing metadata for cells included in the final dataset. Metadata columns include:

nCount_RNA = the number of transcripts detected in a cell
nFeature_RNA = the number of genes detected in a cell
Loupe = cell barcodes; correspond to the cell IDs found in the .h5Seurat and 10X formatted objects for all cells
prcntMito = percent mitochondrial reads in a cell
Scrublet = doublet probability score assigned to a cell
seurat_clusters = cluster ID assigned to a cell
PaperIDs = sample ID for a cell
celltypes = cell type ID assigned to a cell

Resource Title: Herrera-Uribe & Wiarda et al. PBMCs - All Cells PCA Coordinates.
File Name: PBMC7_AllCells_PCAcoord.csv
Resource Description: .csv file containing first 100 PCA coordinates for cells.

Resource Title: Herrera-Uribe & Wiarda et al. PBMCs - All Cells t-SNE Coordinates.
File Name: PBMC7_AllCells_tSNEcoord.csv
Resource Description: .csv file containing t-SNE coordinates for all cells.

Resource Title: Herrera-Uribe & Wiarda et al. PBMCs - All Cells UMAP Coordinates.
File Name: PBMC7_AllCells_UMAPcoord.csv
Resource Description: .csv file containing UMAP coordinates for all cells.

Resource Title: Herrera-Uribe & Wiarda et al. PBMCs - CD4 T Cells t-SNE Coordinates.
File Name: PBMC7_CD4only_tSNEcoord.csv
Resource Description: .csv file containing t-SNE coordinates for only CD4 T cells (clusters 0, 3, 4, 28). A dataset of only CD4 T cells can be re-created from the PBMC7_AllCells.h5Seurat, and t-SNE coordinates used in publication can be re-assigned using this .csv file.

Resource Title: Herrera-Uribe & Wiarda et al. PBMCs - CD4 T Cells UMAP Coordinates.
File Name: PBMC7_CD4only_UMAPcoord.csv
Resource Description: .csv file containing UMAP coordinates for only CD4 T cells (clusters 0, 3, 4, 28). A dataset of only CD4 T cells can be re-created from the PBMC7_AllCells.h5Seurat, and UMAP coordinates used in publication can be re-assigned using this .csv file.

Resource Title: Herrera-Uribe & Wiarda et al. PBMCs - Gamma Delta T Cells UMAP Coordinates.
File Name: PBMC7_GDonly_UMAPcoord.csv
Resource Description: .csv file containing UMAP coordinates for only gamma delta T cells (clusters 6, 21, 24, 31). A dataset of only gamma delta T cells can be re-created from the PBMC7_AllCells.h5Seurat, and UMAP coordinates used in publication can be re-assigned using this .csv file.

Resource Title: Herrera-Uribe & Wiarda et al. PBMCs - Gamma Delta T Cells t-SNE Coordinates.
File Name: PBMC7_GDonly_tSNEcoord.csv
Resource Description: .csv file containing t-SNE coordinates for only gamma delta T cells (clusters 6, 21, 24, 31). A dataset of only gamma delta T cells can be re-created from the PBMC7_AllCells.h5Seurat, and t-SNE coordinates used in publication can be re-assigned using this .csv file.

Resource Title: Herrera-Uribe & Wiarda et al. PBMCs - Gene Annotation Information.
File Name: UnfilteredGeneInfo.txt
Resource Description: .txt file containing gene nomenclature information used to assign gene names in the dataset. 'Name' column corresponds to the name assigned to a feature in the dataset.

Resource Title: Herrera-Uribe & Wiarda et al. PBMCs - All Cells H5Seurat.
File Name: PBMC7.tar
Resource Description: .h5Seurat object of all cells in PBMC dataset. File needs to be untarred, then read into R using function LoadH5Seurat().
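The subset-and-reassign step described for the CD4 T cell resources can be sketched outside R as well. The example below is a hedged illustration in plain Python, not the authors' procedure: the metadata column names Loupe and seurat_clusters and the cluster IDs 0, 3, 4, 28 come from the description above, while the coordinate file layout (a barcode column followed by tSNE_1 and tSNE_2), the cell type labels, and all values are assumptions invented for the example.

```python
import csv
import io

# Cluster IDs listed above for CD4 T cells, kept as strings to match CSV fields.
CD4_CLUSTERS = {"0", "3", "4", "28"}

# Invented stand-in for PBMC7_AllCells_meta.csv (only three columns shown).
meta_csv = (
    "Loupe,seurat_clusters,celltypes\n"
    "AAAC-1,0,typeA\n"
    "AAAG-1,6,typeB\n"
    "AACC-1,28,typeA\n"
    "AAGG-1,12,typeC\n"
)

# Invented stand-in for PBMC7_CD4only_tSNEcoord.csv; the header is assumed.
coord_csv = (
    "barcode,tSNE_1,tSNE_2\n"
    "AAAC-1,1.5,-2.0\n"
    "AACC-1,0.25,3.0\n"
)

meta = list(csv.DictReader(io.StringIO(meta_csv)))

# Keep the barcodes of cells whose cluster belongs to the CD4 T cell set.
cd4_barcodes = [row["Loupe"] for row in meta if row["seurat_clusters"] in CD4_CLUSTERS]

coords = {
    row["barcode"]: (float(row["tSNE_1"]), float(row["tSNE_2"]))
    for row in csv.DictReader(io.StringIO(coord_csv))
}

# Report selected cells that have no published coordinate rather than dropping them silently.
missing = [barcode for barcode in cd4_barcodes if barcode not in coords]
cd4_tsne = {barcode: coords[barcode] for barcode in cd4_barcodes if barcode in coords}
```

The same pattern applies to the gamma delta T cell files by swapping in clusters 6, 21, 24 and 31 and the corresponding coordinate file.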

