40 datasets found
  1. Chan Zuckerberg CELLxGENE Collection

    • bioregistry.io
    Updated May 7, 2025
    Cite
    (2025). Chan Zuckerberg CELLxGENE Collection [Dataset]. https://bioregistry.io/cellxgene.collection
    Explore at:
    Dataset updated
    May 7, 2025
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Assigns identifiers to collections of datasets indexed by CELLxGENE.

    CELLxGENE is an interactive data visualization and exploration tool developed by the Chan Zuckerberg Initiative that enables researchers to analyze and share single-cell genomics datasets. It provides a user-friendly interface for biologists and computational scientists to interrogate gene expression patterns across different cell types.

  2. CZ CELLxGENE Discover

    • dknet.org
    • scicrunch.org
    • +1more
    Updated Jan 17, 2024
    Cite
    (2024). CZ CELLxGENE Discover [Dataset]. http://identifiers.org/RRID:SCR_024894/resolver?q=&i=rrid
    Explore at:
    Dataset updated
    Jan 17, 2024
    Description

    Portal for finding and downloading any of the datasets published on CELLxGENE. It lets users download and visually explore data to understand the functionality of human tissues at the cellular level, and is optimized for finding, exploring, and reusing single-cell data. The Collections page lists the collections hosted on CELLxGENE Discover, along with metadata that define the tissue, assay, disease, organism, and cell count for each collection. Once you find a published dataset of interest on CELLxGENE Discover, you can click the Explore button below the dataset description to explore the cells of that dataset using the CELLxGENE Explorer.

  3. CellxGene-100K

    • kaggle.com
    zip
    Updated Apr 5, 2025
    + more versions
    Cite
    Darien Schettler (2025). CellxGene-100K [Dataset]. https://www.kaggle.com/datasets/dschettler8845/cellxgene-100k
    Explore at:
    Available download formats: zip (5611138387 bytes)
    Dataset updated
    Apr 5, 2025
    Authors
    Darien Schettler
    License

    MIT License: https://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    CellXGene-100K

    GCS LINK: gs://kds-6860773353013302b6e19605df3e5195ee14d269d4d746edb218f8ff


    DATASET OVERVIEW

    A curated dataset of approximately 700,000 healthy human single cells (approx. 100,000 per tissue) sourced from the CellXGene Census, covering seven major tissues: heart, blood, brain, lung, kidney, intestine, and pancreas.

    This is 1 of 4 datasets focusing on providing progressively larger, ready-to-use collections of healthy human single-cell RNA sequencing data in the H5AD format.

    The goal is to offer standardized benchmarks/datasets derived from CellXGene for exploring fundamental scRNA-seq analysis, understanding multi-tissue cellular composition, developing and testing computational models, and evaluating method scalability across different orders of magnitude.

    This dataset provides a focused collection of single-cell transcriptomic profiles representing healthy human tissues, curated from the comprehensive CZ CELLxGENE Discover Census (CellXGene) from the latest (Jan 2025) stable release. It includes data exclusively from Homo sapiens cells annotated as 'normal' or 'healthy' and in 'cell' suspension.

    With its somewhat manageable size (approx. 700k total cells), this dataset serves as an excellent middle ground for exploration, model development, and scaling to larger use-cases.

  4. Chan Zuckerberg CELLxGENE Dataset

    • bioregistry.io
    Updated May 7, 2025
    Cite
    (2025). Chan Zuckerberg CELLxGENE Dataset [Dataset]. https://bioregistry.io/cellxgene.dataset
    Explore at:
    Dataset updated
    May 7, 2025
    Description

    Assigns identifiers to datasets indexed by CELLxGENE, such as those resulting from scRNA-seq experiments.

  5. Pretrained checkpoints of models by scCompass and CELLxGENE--scGPT

    • scidb.cn
    Updated Mar 14, 2025
    + more versions
    Cite
    Pengfei Wang (2025). Pretrained checkpoints of models by scCompass and CELLxGENE--scGPT [Dataset]. http://doi.org/10.57760/sciencedb.22054
    Explore at:
    Croissant (a format for machine-learning datasets; see mlcommons.org/croissant)
    Dataset updated
    Mar 14, 2025
    Dataset provided by
    Science Data Bank
    Authors
    Pengfei Wang
    License

    https://mit-license.org

    Description

    This project uses the scCompass and CELLxGENE datasets, at data scales of 100K, 200K, 500K, 1M, 2M, and 5M, to pre-train the scGPT model.

  6. Cellxgene VIP snRNA-seq demo dataset for visualization and DE analysis

    • data.niaid.nih.gov
    • data-staging.niaid.nih.gov
    Updated Apr 9, 2022
    Cite
    KEJIE LI; Zhengyu Ouyang (2022). Cellxgene VIP snRNA-seq demo dataset for visualization and DE analysis [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_6425901
    Explore at:
    Dataset updated
    Apr 9, 2022
    Dataset provided by
    BioinfoRx
    Biogen
    Authors
    KEJIE LI; Zhengyu Ouyang
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The h5ad file can be used as demo input for Cellxgene VIP. The dataset was reprocessed from the raw fastq files of the Schirmer et al. (Nature, 2019) paper. Details for reproducing the h5ad file can be found at https://github.com/interactivereport/cellxgene_VIP/blob/master/notebook/MS_Nature_Rowitch_snRNAseq.ipynb. Two rds files are also included; they are the input files for the sample differential expression (DE) analysis scripts (glmmTMB and Nebula).

  7. 10X Genomics Human Visium Spatial Transcriptomics Demo Dataset for Cellxgene...

    • data-staging.niaid.nih.gov
    • zenodo.org
    Updated Dec 8, 2021
    Cite
    Li, Kejie (2021). 10X Genomics Human Visium Spatial Transcriptomics Demo Dataset for Cellxgene VIP [Dataset]. https://data-staging.niaid.nih.gov/resources?id=zenodo_5524882
    Explore at:
    Dataset updated
    Dec 8, 2021
    Dataset provided by
    Biogen
    Authors
    Li, Kejie
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Four Visium Spatial Transcriptomics datasets downloaded from the 10X Genomics data site and organized for use as Cellxgene VIP input:

    • 10X_demo_data_Breast_Cancer_Block_A_Section_1
    • 10X_demo_data_Breast_Cancer_Block_A_Section_2
    • 10X_demo_data_Human_Heart
    • 10X_demo_data_Human_Lymph_Node

  8. ScCompass and CELLxGENE Training Datasets--scGPT

    • scidb.cn
    Updated Mar 14, 2025
    + more versions
    Cite
    Pengfei Wang (2025). ScCompass and CELLxGENE Training Datasets--scGPT [Dataset]. http://doi.org/10.57760/sciencedb.22043
    Explore at:
    Croissant (a format for machine-learning datasets; see mlcommons.org/croissant)
    Dataset updated
    Mar 14, 2025
    Dataset provided by
    Science Data Bank
    Authors
    Pengfei Wang
    License

    https://mit-license.org

    Description

    ScCompass and CELLxGENE Training Datasets: Human and Mouse for scGPT.

  9. scdrs.cellxgene

    • figshare.com
    hdf
    Updated Sep 6, 2021
    Cite
    Martin Jinye Zhang (2021). scdrs.cellxgene [Dataset]. http://doi.org/10.6084/m9.figshare.15065061.v1
    Explore at:
    Available download formats: hdf
    Dataset updated
    Sep 6, 2021
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Martin Jinye Zhang
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    h5ad objects for cellxgene visualization of scDRS results:

    • scdrs_tmsfacs_thin.h5ad: scDRS results for the TMS FACS data of 110,096 cells (gene count matrix removed to save space)
    • scdrs_demo.h5ad: demo scDRS results for 3 TMS FACS cell types and 3 diseases (gene count matrix removed to save space)

  10. Cell_Gene_Expression_Metadata

    • kaggle.com
    zip
    Updated Sep 24, 2025
    Cite
    Kazi Aishikuzzaman (2025). Cell_Gene_Expression_Metadata [Dataset]. https://www.kaggle.com/datasets/kaziaishikuzzaman/cell-gene-expression-metadata
    Explore at:
    Available download formats: zip (845887409 bytes)
    Dataset updated
    Sep 24, 2025
    Authors
    Kazi Aishikuzzaman
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Overview: This dataset contains comprehensive metadata from single-cell gene expression studies, providing researchers with structured information about cellular phenotypes, experimental conditions, and sample characteristics. The data is particularly valuable for bioinformatics research, machine learning applications in genomics, and comparative studies across different cell types and conditions.

    Dataset Description: The dataset comprises metadata associated with single-cell RNA sequencing (scRNA-seq) experiments, including:

    • Cell Type Information: Classification of different cell types and subtypes
    • Experimental Metadata: Details about experimental conditions, protocols, and methodologies
    • Sample Characteristics: Information about biological samples, including tissue origin, developmental stages, and treatment conditions
    • Quality Metrics: Data quality indicators and filtering parameters
    • Annotation Details: Standardized cell type annotations and biological classifications

    Data Source and Licensing: This dataset is derived from publicly available single-cell gene expression data, potentially sourced from:

    • CELLxGENE Data Portal (https://cellxgene.cziscience.com/)
    • Gene Expression Omnibus (GEO)
    • European Bioinformatics Institute (EBI)
    • Other public genomics repositories

    License: Creative Commons CC BY 4.0 (or specify the actual license) ✅ Commercial use allowed ✅ Modification allowed ✅ Distribution allowed ✅ Private use allowed ❗ Attribution required

    Research Applications:

    • Cell Type Discovery: Identify novel cell types and subtypes
    • Comparative Genomics: Study cellular differences across conditions, tissues, or species
    • Disease Research: Investigate cellular changes in disease states
    • Developmental Biology: Analyze cellular differentiation and development patterns

    Machine Learning Applications:

    • Classification Tasks: Predict cell types from gene expression data
    • Clustering Analysis: Discover cellular subpopulations and states
    • Dimensionality Reduction: Apply PCA, t-SNE, UMAP for visualization
    • Biomarker Discovery: Identify genes characteristic of specific cell types

    Educational Use: Teaching bioinformatics and computational biology concepts. Demonstrating single-cell analysis workflows. Training in data preprocessing and quality control.

    Data Quality and Preprocessing:

    • Quality Control: Metadata has been curated and standardized
    • Missing Values: [Specify how missing values are handled]
    • Standardization: Cell type annotations follow established ontologies (e.g., Cell Ontology)
    • Validation: Data has been cross-referenced with original publications

    Usage Guidelines, Getting Started:

    • Load the metadata files using pandas or your preferred data analysis tool.
    • Explore the cell type distributions and experimental conditions.
    • Filter data based on quality metrics as needed.
    • Join with corresponding gene expression data for comprehensive analysis.

    Best Practices:

    • Always cite original data sources and publications.
    • Consider batch effects when combining data from different experiments.
    • Validate findings with independent datasets when possible.
    • Follow established bioinformatics workflows for single-cell analysis.

    Citation and Acknowledgments : If you use this dataset in your research, please: Cite this dataset:[Kazi Aishikuzzaman]. (2024). Cell Gene Expression Metadata. Kaggle. https://www.kaggle.com/datasets/kaziaishikuzzaman/cell-gene-expression-metadata

    File Structure:

    dataset
    ─ metadata_summary.csv           # Main metadata file
    ─ cell_type_annotations.csv      # Detailed cell type information
    ─ experimental_conditions.csv    # Experiment-specific metadata
    ─ quality_metrics.csv            # Data quality indicators
    ─ README.txt                     # Detailed file descriptions

    Technical Specifications:

    • File Encoding: UTF-8
    • Separator: Comma-separated values (CSV)
    • Missing Values: Represented as 'NA' or empty cells
    • Data Types: Mixed (categorical, numerical, text)
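
    The technical specifications above (UTF-8, comma-separated, missing values written as 'NA' or left empty) can be illustrated with a minimal sketch using only the Python standard library. The sample rows and column names below are invented for illustration and are not taken from the actual files; the description itself suggests pandas, which would work equally well.

```python
import csv
import io

# Hypothetical sample standing in for metadata_summary.csv; the column
# names and values are invented for illustration only.
SAMPLE_CSV = """cell_id,cell_type,tissue,n_genes
c1,T cell,blood,1200
c2,NA,lung,
c3,microglia,brain,950
"""

# Tokens the dataset description says represent missing values.
MISSING_TOKENS = {"", "NA"}


def read_metadata(handle):
    """Read CSV rows as dicts, mapping 'NA' and empty cells to None."""
    rows = []
    for row in csv.DictReader(handle):
        rows.append({k: (None if v in MISSING_TOKENS else v) for k, v in row.items()})
    return rows


rows = read_metadata(io.StringIO(SAMPLE_CSV))

# Keep only rows with no missing fields, e.g. before joining with expression data.
complete = [r for r in rows if all(v is not None for v in r.values())]
```

    For a real file, the same function could be given a handle from `open(path, encoding="utf-8", newline="")`.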

    Contact and Support: For questions about this dataset:

    • Kaggle Profile: @kaziaishikuzzaman
    • Dataset Issues: Use Kaggle's discussion section
    • Collaboration: Open to research collaborations and improvements

    Version History:

    • v1.0: Initial release with comprehensive metadata collection
    • [Future versions]: Updates and additional annotations as available

    Related Datasets: Consider exploring these complementary datasets:

    • Single-cell gene expression data (companion to this metadata)
    • Cell atlas datasets from major consortiums
    • Disease-specific single-cell studies
    • Multi-omics datasets with matching cell types

    Keywords: single-cell, RNA-seq, genomics, cell types, metadata, bioinformatics, machine learning, computational biology
    Category: Biology > Genomics

  11. Mouse Brain snRNASeq Demo Dataset for Cellxgene VIP

    • data.niaid.nih.gov
    Updated Jun 10, 2022
    Cite
    Li, KEJIE; Sheehan, Mark; Zhang, Baohong (2022). Mouse Brain snRNASeq Demo Dataset for Cellxgene VIP [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_6626455
    Explore at:
    Dataset updated
    Jun 10, 2022
    Dataset provided by
    Biogen (http://biogen.com/)
    Authors
    Li, KEJIE; Sheehan, Mark; Zhang, Baohong
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    snRNASeq data generated at Biogen from 3 control mouse brains. Three brain regions were sampled from each brain.

    Animal IDs: 1, 4 and 7

    Brain region codes:

    • W: WhiteMatter
    • H: Hippo
    • G: GreyMatter

    10X standard mm10 (3.0.0) reference was used, on cellranger 5.0.0 with --include-introns on.

  12. Single-Cell RNA Data Portal for Alzheimer's Disease

    • zenodo.org
    zip
    Updated Apr 30, 2025
    Cite
    Theodoros Siozos; Christos Petrou; ATHANASIOS BALOMENOS; Yannis Kopsinis (2025). Single-Cell RNA Data Portal for Alzheimer's Disease [Dataset]. http://doi.org/10.5281/zenodo.15295744
    Explore at:
    Available download formats: zip
    Dataset updated
    Apr 30, 2025
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Theodoros Siozos; Christos Petrou; ATHANASIOS BALOMENOS; Yannis Kopsinis
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Single-Cell RNA Data Portal for Alzheimer's Disease

    The single cell Alzheimer's Disease Data Portal is an aggregated data portal created as part of the Enfield EU-funded program for research on the single-cell Generative Pretrained Transformer (scGPT-AD) model. The data portal contains data from the ssREAD data portal, along with single-cell AD data from recent studies (dharsini et al, pan et al, rexach et al). The data from the individual studies were accessed through the cellXgene data portal, a vast portal for single-cell data. The data have been uploaded in two separate .zip files (part1, part2).

    The single-cell data follow the Annotated Data (AnnData) format. The core data for each sample is the gene-expression matrix, which gives the expression level of each gene in each cell. Additionally, the dataset contains the `.obs` attribute, which includes core cell metadata for each sample (cell type, brain region, Braak stage, donor age, disease condition, donor gender, etc.), along with the gene names, accessed via the `.var` attribute.

    The source data have been processed to create a unified data portal ready to be used as training dataset for a Transformer model. The main processing steps were:

    • convert ssREAD data from `.qsave` format to `.h5ad` format that aligns with the AnnData framework
    • discard some unprocessable data samples
    • standardize metadata column names
    • process categorical data to create a unified namespace (e.g.: merge `microglia` and `microgrial` cell type names into one)
    • standardize all gene names to be upper-cased
    • discard dimensionality reduction and clustering attributes, to make a lightweight version of the data portal, since they are not meant to be used in Transformer model training
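
    Three of the processing steps listed above (standardizing metadata column names, unifying categorical labels, and upper-casing gene names) can be sketched in plain Python. This is an illustrative sketch only, not the authors' pipeline: the alias tables and function names are hypothetical, and lower-casing labels before lookup is an assumption made here.

```python
# Hypothetical alias tables; the portal's real mappings are not given in the
# description, apart from the `microglia` / `microgrial` example.
COLUMN_ALIASES = {"celltype": "cell_type", "Cell Type": "cell_type", "sex": "donor_gender"}
CELL_TYPE_ALIASES = {"microgrial": "microglia"}


def standardize_columns(record):
    """Rename metadata keys to a single agreed-upon column name."""
    return {COLUMN_ALIASES.get(key, key): value for key, value in record.items()}


def unify_cell_type(label):
    """Map variant spellings of a cell type onto one canonical label.

    Lower-casing before lookup is an assumption of this sketch.
    """
    key = label.strip().lower()
    return CELL_TYPE_ALIASES.get(key, key)


def uppercase_genes(genes):
    """Upper-case every gene name, as in the gene-name standardization step."""
    return [gene.upper() for gene in genes]


record = standardize_columns({"celltype": "microgrial", "donor_age": 81})
record["cell_type"] = unify_cell_type(record["cell_type"])
genes = uppercase_genes(["Apoe", "trem2"])
```

    Applied to real AnnData objects, the same renaming would operate on the `.obs` columns and the `.var` gene names rather than on plain dicts and lists.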

    Aggregated Data Statistics

    Total Cells | AD Cells | Control Cells | Unique Genes | Donors
    2.3M        | 1.2M     | 1.1M          | 91k          | 166

    Characteristics of Dataset grouped by Data Source

    Data Source    | Unique Genes | Total Cells | AD Cells | Control Cells | Donors
    rexach et al   | 30k          | 217k        | 118k     | 99k           | 20
    pan et al      | 61k          | 43k         | 11k      | 32k           | 7
    dharsini et al | 61k          | 425k        | 311k     | 114k          | 46
    ssREAD         | 62k          | 2.42M       | 1.14M    | 1.28M         | 135

    Additional table columns (per-source values not preserved): Cell Type Label, Brain Region, Tissue Type, Braak Stage, Donors Id, Donor Gender, Donor Age

  13. tabula-muris-senis-bladder-smartseq2

    • huggingface.co
    Updated Dec 26, 2025
    Cite
    2025 Longevity x AI Hackathon (2025). tabula-muris-senis-bladder-smartseq2 [Dataset]. https://huggingface.co/datasets/longevity-db/tabula-muris-senis-bladder-smartseq2
    Explore at:
    Dataset updated
    Dec 26, 2025
    Dataset authored and provided by
    2025 Longevity x AI Hackathon
    License

    MIT License: https://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    Bladder Tissue from Tabula Muris Senis

    Tabula Muris Senis is a mammalian aging single-cell gene expression dataset, downloaded from https://cellxgene.cziscience.com/collections/0b9d8a04-bb9d-44da-aa27-705bb65b54eb. This dataset represents the Bladder tissue, using the SmartSeq2 full-length mRNA library preparation method for single cells. Code to download and process this dataset is available in: https://github.com/seanome/2025-longevity-x-ai-hackathon

    Ageing is characterized by a… See the full description on the dataset page: https://huggingface.co/datasets/longevity-db/tabula-muris-senis-bladder-smartseq2.

  14. Single Cell Analysis Software Report

    • datainsightsmarket.com
    doc, pdf, ppt
    Updated Jun 25, 2025
    Cite
    Data Insights Market (2025). Single Cell Analysis Software Report [Dataset]. https://www.datainsightsmarket.com/reports/single-cell-analysis-software-1963380
    Explore at:
    Available download formats: ppt, pdf, doc
    Dataset updated
    Jun 25, 2025
    Dataset authored and provided by
    Data Insights Market
    License

    https://www.datainsightsmarket.com/privacy-policy

    Time period covered
    2026 - 2034
    Area covered
    Global
    Variables measured
    Market Size
    Description

    Discover the booming single-cell analysis software market! Our in-depth report reveals key trends, growth drivers, leading companies (Cellenics, BioTuring Browser, 10x Genomics Loupe Browser, etc.), and future projections through 2033. Learn about market segmentation and regional analysis to gain a competitive edge.

  15. Human Retina Cell Atlas reference model

    • zenodo.org
    bin, csv
    Updated Nov 8, 2024
    Cite
    Jin Li; Rui Chen (2024). Human Retina Cell Atlas reference model [Dataset]. http://doi.org/10.5281/zenodo.14014720
    Explore at:
    Available download formats: bin, csv
    Dataset updated
    Nov 8, 2024
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Jin Li; Rui Chen
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset hosts files needed to reproduce the Human Retina Cell Atlas (HRCA) reference model using scArches. The HRCA data can be accessed through several interactive browsers, including HCA Data Portal, CELLxGENE, UCSC Cell Browser, and the Broad Single Cell Portal. Please use these browsers for atlas exploration and visualization. For more information on HRCA, please refer to the HRCA paper (Li et al., bioRxiv 2023) and the Github repository at https://github.com/RCHENLAB/HRCA_reproducibility. This dataset has been used in the tutorial for the HRCA reference model at https://github.com/RCHENLAB/HRCA_reproducibility/tree/main/scArches.

    Data description:

    1. HRCA_snRNA_allcells_rawcounts.h5ad

    This file contains the cell-by-gene count matrix for over 3.1 million single nuclei and more than 36,000 gene features of the HRCA. Gene features are represented by gene symbols. Please refer to the interactive browsers for atlas exploration, where gene features are mapped to Ensembl IDs. In the cell metadata, "sampleid" indicates sample batches of cells, and "celltype" specifies 123 retina cell types.

    2. model.pt

    This file is the trained reference model using scArches, incorporating 10,000 highly variable features from the full count matrix. It can be directly used for cell type annotation of new retina samples.

    3. HRCA_snRNA_allcells_rawcounts_latent.h5ad

    This file contains the embeddings of all 3.1 million reference single nuclei generated by the trained reference model using scArches. These embeddings can be used to compare with the embeddings of query data for exploration.

    4. HRCA_reference_model_gene_id_and_symbol.csv

    This file contains the mapping of Ensembl IDs to gene symbols for the 10,000 features used in the reference model. This mapping can be used to convert the gene features in a query .h5ad file from gene IDs to gene symbols, allowing cell type labels to be predicted using the trained reference model, which uses gene symbols as gene features.
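
    The conversion described above, from Ensembl IDs in a query file to the gene symbols used by the reference model, might look like the following sketch. The column names `gene_id` and `gene_symbol` and the sample IDs are placeholders invented for illustration; the actual header of HRCA_reference_model_gene_id_and_symbol.csv should be checked before use.

```python
import csv
import io

# Invented stand-in for the mapping CSV; IDs, symbols and column names are
# placeholders, not real entries from the HRCA file.
SAMPLE_MAPPING = """gene_id,gene_symbol
ENSG00000000001,GENE_A
ENSG00000000002,GENE_B
"""


def load_mapping(handle, id_col="gene_id", symbol_col="gene_symbol"):
    """Build a dict from Ensembl ID to gene symbol."""
    return {row[id_col]: row[symbol_col] for row in csv.DictReader(handle)}


def to_symbols(gene_ids, mapping):
    """Return (kept_ids, symbols) for the query IDs present in the mapping.

    IDs that the reference model does not use are dropped, preserving order.
    """
    kept = [gene_id for gene_id in gene_ids if gene_id in mapping]
    return kept, [mapping[gene_id] for gene_id in kept]


mapping = load_mapping(io.StringIO(SAMPLE_MAPPING))
query_ids = ["ENSG00000000002", "ENSG00000000099", "ENSG00000000001"]
kept, symbols = to_symbols(query_ids, mapping)
```

    The `kept` list can then be used to subset the query matrix columns before renaming them with `symbols`.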

    5. query.h5ad

    This file contains a cell-by-gene count matrix for a query dataset, designed to support reproducibility in the HRCA reference model tutorial. The "majorclass" column includes pre-annotated major cell classes. Additional details on the tutorial are available at https://github.com/RCHENLAB/HRCA_reproducibility/tree/main/scArches.

    6. query_latent.h5ad

    This file contains the embeddings of the query data against the trained reference model. These embeddings can be compared with the reference data embeddings for exploration and visualization.

  16. scRNA-seq "Tabula sapiens" - human, Part 2

    • kaggle.com
    zip
    Updated Feb 5, 2022
    + more versions
    Cite
    Alexander Chervov (2022). scRNA-seq "Tabula sapiens" - human, Part 2 [Dataset]. https://www.kaggle.com/datasets/alexandervc/scrnaseq-tabula-sapiens-human-part-2
    Explore at:
    Available download formats: zip (7504637468 bytes)
    Dataset updated
    Feb 5, 2022
    Authors
    Alexander Chervov
    Description

    Remark 1: for cell cycle analysis - see paper https://arxiv.org/abs/2208.05229 "Computational challenges of cell cycle analysis using single cell transcriptomics" Alexander Chervov, Andrei Zinovyev

    Remark 2: For the first part of the data, see https://www.kaggle.com/alexandervc/scrnaseq-tabula-sapiens-human-500-000-cells

    Data and Context

    Data: results of single-cell RNA sequencing, i.e. rows correspond to cells and columns to genes (or vice versa). Each value of the matrix shows how strongly the corresponding gene is expressed in the corresponding cell. https://en.wikipedia.org/wiki/Single-cell_transcriptomics

    Particular data: "Tabula Sapiens" project: https://tabula-sapiens-portal.ds.czbiohub.org/ Data section for download: https://figshare.com/articles/dataset/Tabula_Sapiens_release_1_0/14267219 Paper: https://www.science.org/doi/10.1126/science.abl4896 https://www.biorxiv.org/content/10.1101/2021.07.19.452956v2

    Tabula Sapiens is a benchmark, first-draft human cell atlas of nearly 500,000 cells from 24 organs of 15 normal human subjects. This work is the product of the Tabula Sapiens Consortium. Special thanks to the Chan Zuckerberg Initiative for funding this project and to the CZI Science Technology team for creating cellxgene, the tool that makes the visualization of this research possible.

    See also tutorials:

    Course at Sanger's institute https://scrnaseq-course.cog.sanger.ac.uk/website/tabula-muris.html

    Course at CZ-hub: https://chanzuckerberg.github.io/scRNA-python-workshop/intro/about

    On kaggle - copies of the notebooks and data from the course above https://www.kaggle.com/aayush9753/singlecell-rnaseq-data-from-mouse-brain

    Inspiration

    Single-cell RNA sequencing is an important technology in modern biology; see e.g. "Eleven grand challenges in single-cell data science" https://genomebiology.biomedcentral.com/articles/10.1186/s13059-020-1926-6

    Also see review : Nature. P. Kharchenko: "The triumphs and limitations of computational methods for scRNA-seq" https://www.nature.com/articles/s41592-021-01171-x

  17. Single-cell atlas of human kidneys in health, chronic kidney disease and...

    • datasetcatalog.nlm.nih.gov
    Updated Jan 25, 2023
    Cite
    Jafree, Daniyal J; Long, David A; Stewart, Benjamin J; Clatworthy, Menna R (2023). Single-cell atlas of human kidneys in health, chronic kidney disease and transplant rejection [Dataset]. http://doi.org/10.5281/zenodo.7566982
    Explore at:
    Dataset updated
    Jan 25, 2023
    Authors
    Jafree, Daniyal J; Long, David A; Stewart, Benjamin J; Clatworthy, Menna R
    Description

    This single-cell RNA-sequencing (scRNA-seq) dataset comprises two files: an RData file (combined_data.RData), which can be loaded into RStudio to generate a Seurat object, and an h5ad object (annotated_combined_adata_full.h5ad) for downstream analysis in Scanpy or cellxgene. The dataset contains previously published data and five new samples derived from kidney allografts undergoing graft nephrectomies. Overall, 217,411 human kidney cells are included, including 151,038 ‘control’ cells from living donor biopsies or non-tumorous regions of tumour nephrectomies and 66,373 cells from diseased samples, including chronic kidney disease and different aetiologies of transplant rejection. For full information on generation of the dataset, please see the associated preprint, which has been uploaded to bioRxiv and is available at: https://www.biorxiv.org/content/10.1101/2022.10.28.514222v2. The code used for scRNA-seq analysis is available at: https://github.com/daniyal-jafree1995/

  18. Stack-CellxGene45M

    • huggingface.co
    Updated Jan 9, 2026
    Cite
    Arc Institute (2026). Stack-CellxGene45M [Dataset]. https://huggingface.co/datasets/arcinstitute/Stack-CellxGene45M
    Explore at:
    Dataset updated
    Jan 9, 2026
    Dataset authored and provided by
    Arc Institute
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    CellxGene 45M Collection

    A curated subset of CellxGene (~45M cells) used to align the Stack model after pretraining on full human scBaseCount.

    Selection Criteria

    • ≥ 50,000 cells per dataset
    • ≥ 5 donors per dataset
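
    The two selection criteria can be expressed as a simple filter. The sketch below is illustrative only: the `DatasetSummary` type and the example entries are hypothetical, and it does not reproduce how the Stack authors actually computed cell or donor counts.

```python
from dataclasses import dataclass

# Thresholds as stated in the selection criteria.
MIN_CELLS = 50_000
MIN_DONORS = 5


@dataclass
class DatasetSummary:
    """Hypothetical per-dataset summary; field names are illustrative."""
    name: str
    n_cells: int
    n_donors: int


def passes_selection(summary):
    """True when a dataset meets both the cell-count and donor-count thresholds."""
    return summary.n_cells >= MIN_CELLS and summary.n_donors >= MIN_DONORS


# Invented examples, one per outcome.
candidates = [
    DatasetSummary("large_many_donors", 120_000, 12),
    DatasetSummary("large_few_donors", 80_000, 3),
    DatasetSummary("small_many_donors", 20_000, 9),
]
selected = [c.name for c in candidates if passes_selection(c)]
```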

    Cell Type Annotations

    Author-annotated coarse-grained cell type labels were heuristically identified and transferred to adata.obs["author_cell_type"].

  19. Single cell and spatial analysis of immune-hot and immune-cold tumours...

    • zenodo.org
    bin
    Updated Dec 6, 2024
    Cite
    Benjamin Jenkins; Gareth Thomas (2024). Single cell and spatial analysis of immune-hot and immune-cold tumours identifies fibroblast subtypes associated with distinct immunological niches and positive immunotherapy response | scRNA-Seq data [Dataset]. http://doi.org/10.5281/zenodo.14284357
    Explore at:
    Available download formats: bin
    Dataset updated
    Dec 6, 2024
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Benjamin Jenkins; Gareth Thomas
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This repository contains scRNA-Seq data related to the Jenkins et al. 2024 study "Single cell and spatial analysis of immune-hot and immune-cold tumours identifies fibroblast subtypes associated with distinct immunological niches and positive immunotherapy response".

    HNSCC_fibroblasts_integ_srt.RDS - Seurat object containing fibroblasts from integrated analysis of EPG dataset (https://cellxgene.cziscience.com/collections/3c34e6f1-6827-47dd-8e19-9edcd461893f) with GSE164690 - Relating to Figure 2.

    PCFA_srt_obj.RDS - Seurat object containing Pan-Cancer Fibroblast Atlas (PCFA) - Relating to Figures 5-7.

  20. single-cell-lung-zarr

    • huggingface.co
    Updated Feb 4, 2026
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Fahad alghanim (2026). single-cell-lung-zarr [Dataset]. https://huggingface.co/datasets/KokosDev/single-cell-lung-zarr
    Explore at:
    Dataset updated
    Feb 4, 2026
    Authors
    Fahad alghanim
    License

    MIT License: https://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    Single-cell lung (CellxGene Census) — Zarr

    This dataset was exported from the CellxGene Census as a chunked + compressed Zarr store intended for easy streaming access.

    Source: CellxGene Census API
    Organism: Homo sapiens
    Filter: tissue_general == 'lung' and is_primary_data == True
    Shape: 100,000 cells × 61,497 genes
    Zarr path: lung.zarr

    Compression

    Uncompressed (dense float32): 22.91 GB
    Compressed Zarr: ~307 MB (322 MB on Hub)
    Compression ratio: ~76× (Blosc zstd on… See the full description on the dataset page: https://huggingface.co/datasets/KokosDev/single-cell-lung-zarr.
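
    The quoted sizes can be checked with a short calculation from the stated shape. This sketch assumes that "GB" and "MB" in the listing are binary units (GiB and MiB), which is the reading under which the numbers agree; with decimal units the ratio would come out closer to 80×.

```python
# Shape and dtype as stated in the dataset description.
n_cells = 100_000
n_genes = 61_497
bytes_per_value = 4  # float32

# Dense storage: one float32 per (cell, gene) entry.
dense_bytes = n_cells * n_genes * bytes_per_value
dense_gib = dense_bytes / 2**30

# Assumption: the "~307 MB" figure is in binary megabytes (MiB).
compressed_mib = 307
ratio = dense_gib * 1024 / compressed_mib

print(f"{dense_gib:.2f} GiB dense, ratio ~{ratio:.0f}x")
# prints "22.91 GiB dense, ratio ~76x"
```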

Cite
(2025). Chan Zuckerberg CELLxGENE Collection [Dataset]. https://bioregistry.io/cellxgene.collection

Chan Zuckerberg CELLxGENE Collection

Explore at:
2 scholarly articles cite this dataset (View in Google Scholar)