6 datasets found
  1. Single Cell RNA Sequencing Analysis PBMC 3k Scanpy

    • kaggle.com
    zip
    Updated Dec 4, 2025
    Cite
    Dr. Nagendra (2025). Single Cell RNA Sequencing Analysis PBMC 3k Scanpy [Dataset]. https://www.kaggle.com/datasets/mannekuntanagendra/single-cell-rna-sequencing-analysis-pbmc-3k-scanpy
    Available download formats: zip (8484350 bytes)
    Authors
    Dr. Nagendra
    License

    MIT License (https://opensource.org/licenses/MIT)
    License information was derived automatically

    Description

    • This dataset provides a comprehensive single-cell RNA sequencing (scRNA-seq) analysis of 3,000 Peripheral Blood Mononuclear Cells (PBMC 3k) using the Scanpy framework.
    • It includes a fully processed and annotated Jupyter Notebook workflow designed for beginners, intermediate users, and advanced researchers working in single-cell bioinformatics.
    • The dataset demonstrates key preprocessing steps including quality control, filtering, normalization, log transformation, and detection of highly variable genes.
    • It covers dimensionality reduction techniques such as PCA, neighborhood graph construction, UMAP, and t-SNE embeddings for intuitive visualization of cell populations.
    • The workflow includes clustering analysis using the Leiden algorithm to identify distinct immune cell types present in PBMC samples.
    • Detailed marker-gene identification and differential gene expression analysis are performed to classify major immune cell subsets.
    • The notebook integrates multiple visualization tools including Scanpy plots, violin plots, dot plots, rank-gene visualizations, and interactive embeddings.
    • It provides step-by-step code explanations to help users understand each stage of scRNA-seq data processing using Scanpy.
    • The dataset is suitable for researchers studying immunology, transcriptomics, and single-cell data exploration.
    • This dataset enables reproducible analysis and serves as a reference template for future single-cell workflows using Scanpy.
    • It is ideal for teaching, training, and hands-on learning in scRNA-seq analysis.
    • The included notebook demonstrates best practices for analyzing publicly available PBMC 3k data from the 10x Genomics platform.
    • Users can explore interactive visualizations to better interpret cellular heterogeneity and lineage relationships within PBMCs.
    • This resource aims to simplify single-cell analysis and make Scanpy workflows more accessible to the bioinformatics community.
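The stages named in this description (filtering, normalization, log transformation, highly variable genes, PCA, neighborhood graph, UMAP, Leiden clustering, marker genes) correspond to a conventional Scanpy pipeline. The following is a minimal sketch of that sequence, not the notebook shipped with the dataset. The thresholds are assumptions borrowed from the public Scanpy PBMC 3k tutorial and may differ from what the notebook uses. `run_standard_workflow` is a hypothetical helper name, and its `sc` argument exists only so the function can be exercised without Scanpy installed.

```python
def run_standard_workflow(adata, sc=None):
    """Apply a conventional Scanpy pipeline to an AnnData object in place.

    All thresholds are assumptions taken from the public Scanpy PBMC 3k
    tutorial, not from the notebook distributed with this dataset.
    """
    if sc is None:
        import scanpy as sc  # imported lazily; requires scanpy to be installed

    # Quality control: drop near-empty cells and rarely detected genes
    sc.pp.filter_cells(adata, min_genes=200)
    sc.pp.filter_genes(adata, min_cells=3)

    # Library-size normalization followed by log(1 + x)
    sc.pp.normalize_total(adata, target_sum=1e4)
    sc.pp.log1p(adata)

    # Flag highly variable genes (stored in adata.var["highly_variable"])
    sc.pp.highly_variable_genes(adata, min_mean=0.0125, max_mean=3, min_disp=0.5)

    # Dimensionality reduction and neighborhood graph
    sc.tl.pca(adata, svd_solver="arpack")
    sc.pp.neighbors(adata, n_neighbors=10, n_pcs=40)

    # Embedding, clustering, and per-cluster marker genes
    sc.tl.umap(adata)
    sc.tl.leiden(adata, key_added="leiden")
    sc.tl.rank_genes_groups(adata, groupby="leiden", method="wilcoxon")
    return adata
```

With Scanpy installed, calling `run_standard_workflow(adata)` on a freshly loaded count matrix should leave cluster labels in `adata.obs["leiden"]`. Leiden clustering additionally needs the `leidenalg` and `igraph` packages.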

  2. Single-cell transcriptome atlas of lamprey exploring Natterin induced white...

    • zenodo.org
    bin, zip
    Updated Dec 9, 2024
    Cite
    Kai Han (2024). Single-cell transcriptome atlas of lamprey exploring Natterin induced white adipose tissue browning: Code and processed scRNA-seq data [Dataset]. http://doi.org/10.5281/zenodo.14338297
    Available download formats: bin, zip
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Kai Han
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    This repository contains code and processed lamprey scRNA-seq data used to construct a comprehensive cell atlas comprising 604,460 cells/nuclei and 70 cell types from 14 tissues.

    lamprey_atlas.raw.h5ad:

    Python data set (.h5ad) containing raw counts matrix from all tissues and libraries.

    lamprey_atlas.scanpy_merge.h5ad:

    Python data set (.h5ad) containing scanpy processed matrix, used in projection of cells from all tissues into shared UMAP space. Only highly-variable genes calculated by scanpy are included.

    immune.h5ad:

    Python data set (.h5ad) containing scanpy processed matrix, used in re-clustering of immune cells. Only highly-variable genes calculated by scanpy are included.

    pancreas.evo.rds:

    R data set (.rds) containing integrated data of intestine, liver, pancreas from human and mouse, as well as intestine and liver from lamprey.

    lamprey-single-cell-atlas-1.0.0.zip:

    Code used in processing of scRNA-seq data.

  3. Joint embedding of vertebrate brain single-cell RNA-Seq using sequence or...

    • data-staging.niaid.nih.gov
    Updated Aug 18, 2023
    Cite
    Sun, Dennis (2023). Joint embedding of vertebrate brain single-cell RNA-Seq using sequence or structure [Dataset]. https://data-staging.niaid.nih.gov/resources?id=zenodo_7838975
    Dataset provided by
    Arcadia Science
    Authors
    Sun, Dennis
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    Embeddings of single-cell RNA-Seq data from three adult vertebrate brain datasets into Orthogroup feature space or Structural cluster feature space. Orthogroups were generated using OrthoFinder v5.5.0; Structural clusters were assigned by using FoldSeek to cluster AlphaFold-v4 structural predictions.

    The three datasets used as the basis for these embeddings were:

    sample "Brain8" from the Jiang et al. 2021 zebrafish cell atlas (files beginning with GSM3768152)

    sample "Brain1" from the Han et al. 2018 mouse cell atlas (files beginning with GSM2906405)

    sample "Xenopus_brain_COL65" from the Liao et al. 2022 Xenopus laevis adult cell atlas (files beginning with GSM6214268)

    For each dataset, we also generated a standardized cell type annotation file based on the cell type annotations originally provided by the authors. The first column is the cell barcode for that species, and the second column is the original study's cell type annotation for that cell.
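A two-column annotation file of this kind can be loaded into a barcode-to-cell-type mapping with the standard library alone. The sketch below assumes the file is tab-delimited and has no header row; the description states neither, so both are exposed as parameters. `load_annotations` is a hypothetical helper name.

```python
import csv


def load_annotations(path, delimiter="\t", has_header=False):
    """Map cell barcode -> cell type from a two-column annotation file.

    The delimiter and the absence of a header row are assumptions;
    adjust both arguments to match the actual files.
    """
    annotations = {}
    with open(path, newline="") as handle:
        reader = csv.reader(handle, delimiter=delimiter)
        if has_header:
            next(reader, None)  # skip the header line
        for row in reader:
            if len(row) < 2:
                continue  # ignore blank or malformed lines
            barcode, cell_type = row[0], row[1]
            annotations[barcode] = cell_type
    return annotations
```

The resulting dictionary can then be used to label cells after loading the matching expression matrix, for example by looking up each barcode in turn.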

    For the Xenopus brain data, we removed approximately 18k cells that were not annotated in the original data to simplify data analyses; this removal is reflected in the files with the "subsampled" suffix. Subsampled versions of the data are also available for the joint embedding space (prefixed with "DrerMmusXlae").

    For the final datasets used in our analyses, we also provide features x cell matrices as .h5ad files for smaller file sizes and faster loading using Scanpy.

    To visualize the UMAP plots of our top200 embedding space, we provide ".tsv" files containing a variety of metrics and the x and y positions of each cell in the UMAP. See "DrerMmusXlae_adultbrain_FoldSeek_plotlydata.tsv" and "DrerMmusXlae_adultbrain_OrthoFinder_plotlydata.tsv".

    These data are part of the Arcadia Science Pub titled "Comparing gene expression across species based on protein structure instead of sequence".

  4. Data from: A Single-Cell Tumor Immune Atlas for Precision Oncology

    • zenodo.org
    bin, csv
    Updated Mar 31, 2022
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Paula Nieto (2022). A Single-Cell Tumor Immune Atlas for Precision Oncology [Dataset]. http://doi.org/10.5281/zenodo.4263972
    Available download formats: bin, csv
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Paula Nieto
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    Preprint version of the Single-Cell Tumor Immune Atlas

    This upload contains:

    • TICAtlas.rds: an rds file containing a Seurat object with the whole Atlas (317111 cells, RNA and integrated assays, PCA and UMAP reductions)
    • TICAtlas.h5ad: an h5ad file with the whole Atlas (317111 cells, RNA assay, PCA and UMAP)
    • TICAtlas_RNA.rds: an rds file containing a Seurat object of the whole Atlas but only the RNA assay (317111 cells, UMAP embedding)
    • TICAtlas_downsampled_1000.rds: an rds file containing a downsampled version of the Seurat object of the whole Atlas (24834 cells, RNA and integrated assay, PCA and UMAP reductions)
    • TICAtlas_downsampled_1000.h5ad: an h5ad file containing a downsampled version of the whole Atlas (24834 cells, RNA assay, PCA and UMAP reductions)
    • TICAtlas_metadata.csv: a comma-separated text file with the metadata for each of the cells

    For the h5ad files, the .X slot contains the normalized data, while the .X.raw slot contains the raw counts as they were in the original datasets.
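In the AnnData API, a stored raw matrix is conventionally reached as `adata.raw.X`. The sketch below assumes that this is the location the description calls ".X.raw"; that reading has not been checked against the files. `get_normalized_and_raw` is a hypothetical helper name, written against any object exposing `.X` and `.raw` so it does not require Scanpy to run.

```python
def get_normalized_and_raw(adata):
    """Return (normalized, raw_counts) from an AnnData-like object.

    Assumes raw counts live under `adata.raw.X`, the usual AnnData
    location; raw_counts is None when no raw slot was stored.
    """
    normalized = adata.X
    raw = getattr(adata, "raw", None)
    raw_counts = raw.X if raw is not None else None
    return normalized, raw_counts


# Typical use (requires scanpy; file name taken from the list above):
# import scanpy as sc
# adata = sc.read_h5ad("TICAtlas_downsampled_1000.h5ad")
# normalized, raw_counts = get_normalized_and_raw(adata)
```

Checking for a missing raw slot first avoids an attribute error on files that were saved without one.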

    All the files contain the following patient/sample metadata variables:

    • patient: assigned patient identifiers
    • gender: the patient's gender (male/female/unknown)
    • source: dataset of origin
    • subtype: cancer type (abbreviations as indicated in the preprint)
    • cluster_kmeans_k6: patients clusters, NA if filtered out
    • cell_type: annotated cell type for each of the cells

    If you have any issues with the metadata you can use the TICAtlas_metadata.csv file.
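The metadata file can be summarized without loading any expression data. The sketch below tallies cells per value of one metadata column. It assumes TICAtlas_metadata.csv has a header row whose column names match the variables listed above, which the description implies but does not state. `count_cells` is a hypothetical helper name.

```python
import csv
from collections import Counter


def count_cells(path, column="cell_type"):
    """Count rows (cells) per distinct value of one metadata column.

    Assumes a comma-separated file with a header row containing `column`.
    """
    counts = Counter()
    with open(path, newline="") as handle:
        for row in csv.DictReader(handle):
            counts[row[column]] += 1
    return counts
```

For example, `count_cells("TICAtlas_metadata.csv", column="subtype")` would give the number of cells per cancer type, provided the column names match.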

    For more information, read our preprint and check our GitHub.

    h5ad files can be read with Python using Scanpy, rds files can be read in R using Seurat. For format conversion between AnnData and Seurat we recommend SeuratDisk. For other single-cell data formats you can use sceasy.

  5. Data from: Robust clustering and interpretation of scRNA-seq data using...

    • data.niaid.nih.gov
    Updated May 30, 2021
    Cite
    Florian Schmidt; Bobby Ranjan (2021). Robust clustering and interpretation of scRNA-seq data using reference component analysis [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_4021966
    Dataset provided by
    Genome Institute of Singapore (GIS), A*Star
    Authors
    Florian Schmidt; Bobby Ranjan
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    Datasets and Code accompanying the new release of RCA, RCA2. The R-package for RCA2 is available at GitHub: https://github.com/prabhakarlab/RCAv2/

    The datasets included here are:

    Datasets required for a characterization of batch effects:

    merged_rna_seurat.rds

    de_list.rds

    mergedRCAObj.rds

    merged_rna_integrated.rds

    10X_PBMCs.RDS: Processed 10X PBMC data RCA2 object (10X PBMC example data sets)

    NBM_RDS_Files.zip: Several RDS files containing RCA2 object of Normal Bone Marrow (NBM) data, umap coordinates, doublet finder results and metadata information (Normal Bone Marrow use case)

    Dataset used for the Covid19 example:

    blish_covid.seu.rds

    rownames_of_glocal_projection_immune_cells.txt

    Blish_RCA_no_QC_filtering_project_to_multiple_panels.rds

    Data sets used to outline the ability of supervised clustering to detect disease states:

    809653.seurat.rds

    blish_covid.seu.rds

    Performance benchmarking results:

    Memory_consumption.txt

    rca_time_list.rds

    Scanpy input files:

    input_data.zip

    The R script provides R code to regenerate the main paper Figures 2 to 7 modulo some visual modifications performed in Inkscape.

    Provided R scripts are:

    ComputePairWiseDE_v2.R (Required code for pairwise DE computation)

    RCA_Figure_Reproduction.R

    Provided Python code for Scanpy analysis:

    RA_Scanpy.ipynb

    CITESeq_Scanpy.ipynb

  6. Single-cell RNA-seq dataset of innate lymphoid cells

    • figshare.com
    hdf
    Updated Oct 8, 2024
    Cite
    Sijie Chen (2024). Single-cell RNA-seq dataset of innate lymphoid cells [Dataset]. http://doi.org/10.6084/m9.figshare.27190692.v1
    Available download formats: hdf
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Sijie Chen
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    The helper-like ILC contains various functional subsets, such as ILC1, ILC2, ILC3 and LTi cells, mediating the immune responses against viruses, parasites, and extracellular bacteria, respectively. Among them, LTi cells are also crucial for the formation of peripheral lymphoid tissues, such as lymph nodes. Our research, along with others’, indicates a high proportion of LTi cells in the fetal ILC pool, which significantly decreases after birth. Conversely, the proportion of non-LTi ILCs increases postnatally, corresponding to the need for LTi cells to mediate lymphoid tissue formation during fetal stages and other ILC subsets to combat diverse pathogen infections postnatally. However, the regulatory mechanism for this transition remains unclear. In this study, we observed a preference for fetal ILC progenitors to differentiate into LTi cells, while postnatal bone marrow ILC progenitors preferentially differentiate into non-LTi ILCs. Particularly, this differentiation shift occurs within the first week after birth in mice. Further analysis revealed that adult ILC progenitors exhibit stronger activation of the Notch signaling pathway compared to fetal counterparts, accompanied by elevated Gata3 expression and decreased Rorc expression, leading to a transition from fetal LTi cell-dominant states to adult non-LTi ILC-dominant states. This study suggests that the body can regulate ILC development by modulating the activation level of the Notch signaling pathway, thereby acquiring different ILC subsets to accommodate the varying demands within the body at different developmental stages.

    Data usage

    import scanpy as sc
    # read the data using scanpy
    adata = sc.read_h5ad('./220516-ABM.velo.h5ad')
    # draw umap for visualization. `ann0608` is the cell type label.
    sc.pl.umap(adata, color='ann0608')
    # get gene expression matrix
    adata.X


