91 datasets found
  1. Additional file 2 of scTyper: a comprehensive pipeline for the cell typing...

    • springernature.figshare.com
    xlsx
    Updated May 30, 2023
    Cite
    Ji-Hye Choi; Hye In Kim; Hyun Goo Woo (2023). Additional file 2 of scTyper: a comprehensive pipeline for the cell typing analysis of single-cell RNA-seq data [Dataset]. http://doi.org/10.6084/m9.figshare.12762703.v1
    Available download formats
    xlsx
    Dataset updated
    May 30, 2023
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Ji-Hye Choi; Hye In Kim; Hyun Goo Woo
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Additional file 2: Supplementary Table 2–3. This file contains the list of cell markers in each of scTyper.db (Table S2) and CellMarker DB (Table S3) and detailed information such as identifier, study name, species, cell type, gene symbol, and PMID.

  2. Data from: SCDevDB: A Database for Insights Into Single-Cell Gene Expression...

    • omicsdi.org
    Updated Jan 14, 2021
    + more versions
    Cite
    (2021). SCDevDB: A Database for Insights Into Single-Cell Gene Expression Profiles During Human Developmental Processes. [Dataset]. https://www.omicsdi.org/dataset/biostudies/S-EPMC6775478
    Dataset updated
    Jan 14, 2021
    Variables measured
    Unknown
    Description

    Single-cell RNA-seq studies profile thousands of cells in developmental processes. Current databases of human single-cell expression atlases only provide search and visualization functions for a selected gene in specific cell types or subpopulations. These databases are limited to technical properties or visualization of single-cell RNA-seq data without considering the biological relations of their collected cell groups. Here, we developed a database to investigate single-cell gene expression profiling during different developmental pathways (SCDevDB). In this database, we collected 10 human single-cell RNA-seq datasets, split these datasets into 176 developmental cell groups, and constructed 24 different developmental pathways. SCDevDB allows users to search the expression profiles of genes of interest across different developmental pathways. It also provides lists of differentially expressed genes during each developmental pathway, t-distributed stochastic neighbor embedding maps showing the relationships between developmental stages based on these differentially expressed genes, and Gene Ontology and Kyoto Encyclopedia of Genes and Genomes analysis results for these differentially expressed genes. This database is freely available at https://scdevdb.deepomics.org.

  3. Single Cell Developmental Database

    • dknet.org
    Updated Jan 29, 2022
    Cite
    (2022). Single Cell Developmental Database [Dataset]. http://identifiers.org/RRID:SCR_017546/resolver
    Dataset updated
    Jan 29, 2022
    Description

    Database for insights into single cell gene expression profiles during human developmental processes. Interactive database provides DE gene lists in each developmental pathway, t-SNE map, and GO and KEGG enrichment analysis based on these differential genes.

  4. Data from: Single-cell RNA-seq of the rare virosphere reveals the native...

    • data.niaid.nih.gov
    • search.dataone.org
    • +2more
    zip
    Updated Feb 29, 2024
    Cite
    Amir Fromm; Gur Hevroni; Daniella Schatz; Flora Vincent; Carolina A. Martinez-Gutierrez; Frank O. Aylward; Assaf Vardi (2024). Single-cell RNA-seq of the rare virosphere reveals the native hosts of giant viruses in the marine environment [Dataset]. http://doi.org/10.5061/dryad.s7h44j1c9
    Available download formats
    zip
    Dataset updated
    Feb 29, 2024
    Dataset provided by
    Weizmann Institute of Science
    Virginia Tech
    Authors
    Amir Fromm; Gur Hevroni; Daniella Schatz; Flora Vincent; Carolina A. Martinez-Gutierrez; Frank O. Aylward; Assaf Vardi
    License

    CC0 1.0 https://spdx.org/licenses/CC0-1.0.html

    Description

    Giant viruses (phylum Nucleocytoviricota) are globally distributed in aquatic ecosystems. They play significant roles as evolutionary drivers of eukaryotic plankton and regulators of global biogeochemical cycles. However, we lack knowledge about their native hosts, hindering our understanding of their lifecycle and ecological importance. Here, we used single-cell RNA-seq and samples from an induced E. huxleyi bloom during a mesocosm experiment to link giant viruses with their protist hosts. We observe active giant virus infections in multiple host lineages, including members of the algal groups Chrysophyceae and Prymnesiophyceae, as well as heterotrophic flagellates in the class Katablepharidaceae. Katablepharids were infected with a rare Imitevirales-07 giant virus lineage expressing cell fate regulation genes. Analysis of the temporal dynamics of this host-virus interaction indicated a role for the Imitevirales-07 in the collapse of the host Katablepharid population. Our results demonstrate that single-cell RNA-seq can be used to identify previously undescribed host-virus interactions and study their ecological relevance.

    Methods

    Mesocosm core setup and sampling procedure

    Samples were obtained during the AQUACOSM VIMS-Ehux mesocosm experiment in Raunefjorden near Bergen, Norway (60°16′11″N; 5°13′07″E), in May 2018. Seven bags were filled with 11 m³ of water from the fjord, containing natural plankton communities. Algal blooms were induced by nutrient addition and monitored for 24 days, as previously described [23]. Ten samples were collected from four bags, as follows: from bag 3, on days 15 and 20 (named B3T15 and B3T20, respectively); from bag 4, on days 13, 15, 19, and 20 (named B4T13, B4T15, B4T19, and B4T20, respectively); from bag 6, on day 17 (named B6T17); and from bag 7, on days 16, 17, and 18 (named B7T16, B7T17, and B7T18, respectively).

    Samples were initially filtered as follows: 2 liters of water were filtered with a 20 µm mesh and collected in a glass bottle. The cells were then concentrated through gentle gravity filtration on a 3 µm polycarbonate filter (Whatman), mounted on a reusable bottle top filter holder (Thermo Fisher). The biomass on the filter was regularly resuspended by gentle pipetting. For samples B7T16, B7T18, B4T15, B3T15, B6T17, B7T17, and B4T19, the 2 liters of seawater were concentrated down to 100 ml, distributed in two 50 ml tubes, which corresponds to a 200 times concentration. For B4T13, the concentration factor was 140 times. For B4T20 and B3T20, the concentration factor was 100 times. The different concentration factors are explained by filter clogging and various field constraints, including processing time. For all samples except B3T20, the 50 ml tubes were centrifuged for 4 min at 2500 × g, after which the supernatant was discarded. Pellets corresponding to the same day and same bag were pooled and resuspended in a final volume of 200 µl of chilled PBS. Then, 1800 µl of pre-chilled high-performance liquid chromatography (HPLC) grade 100% methanol was added drop by drop to the concentrated biomass. For B3T20, the concentrated biomass was centrifuged for 4 min at 2500 × g and resuspended in 100 µl of chilled PBS, to which 900 µl of chilled HPLC grade 100% methanol was added. Then, samples were incubated for 15 minutes on ice and stored at -80°C until further analysis.

    Library preparation and RNA-seq sequencing using 10X Genomics

    For analysis by 10X Genomics, tubes were defrosted and gently mixed, and 1.7 ml of each sample was transferred into an Eppendorf Lowbind tube and centrifuged at 4°C for 3 min at 3000 × g. The PBS/methanol mix was discarded and replaced by 400 µl of PBS. Cell concentration was measured using an iCyt Eclipse flow cytometer (SONY) based on forward scatter. Cell concentration ranged from 1044 cells ml⁻¹ to 9855 cells ml⁻¹.
    All concentrations were brought to 1000 cells ml⁻¹ to target 7000 cells recovery, according to the 10X Genomics Cell Suspension Volume Calculator Table provided in the user guide. The cellular suspension was loaded onto Next GEM Chip G targeting 7000 cells and then run on a Chromium Controller instrument to generate GEM emulsion (10x Genomics). Single-cell 3' RNA-seq libraries were generated according to the manufacturer's protocol (10x Genomics Chromium Single Cell 3' Reagent Kit User Guide v3/v3.1 Chemistry) on different occasions: B4T19 and B7T17 in January 2020, and B3T15, B3T20, B4T13, B4T15, B4T20, B6T17, B7T16, and B7T18 in August 2020, with 12 cycles for cDNA amplification and 15 cycles for library amplification. Library concentrations and quality were measured using the Qubit dsDNA High Sensitivity Assay kit (Life Technologies, Carlsbad, CA). Libraries were pooled according to targeted cell number, aiming for a minimum of 20,000 reads per cell. Pooled libraries were sequenced using the NextSeq® 500 High Output kit (75 cycles).

    Bioinformatic pipeline

    A step-by-step description of the bioinformatic pipeline from this step onward, including all in-house scripts used, is detailed in the GitHub repository under github.com/vardilab/host-virus-pairing.

    Detection of infected cells in the single-cell RNA-seq data using a custom viral genes database

    To detect viral transcripts, a reference was built from a database of highly conserved genes [6] from all NCLDV in the Giant Virus Database [9], such as family B DNA polymerase, RNA polymerase subunits, and the major capsid protein. The genes were clustered using CD-HIT v. 4.6.6 at 90% nucleotide identity to remove redundancy [43]. From this database of 34866 genes, a reference was created using the 10X Genomics Cell Ranger mkref command. The Cell Ranger Software Suite (v. 5.0.0) was used to perform barcode processing (demultiplexing) and single-cell unique molecular identifier (UMI) counting on the raw reads from 47391 cells using the count script (default parameters), with the deduplicated NCLDV database as a reference. For downstream analysis, 972 cells that highly expressed multiple NCLDV genes and were considered "highly infected" were selected. These "highly infected" cells were selected based on the following criteria: (a) the cell expresses ≥10 viral UMIs in total [22,24], (b) expression of more than one viral gene (>1), (c) expression of at least one gene with a UMI count greater than one (>1). Cell selection was wrapped using an in-house script (choose_cells.py).

    Identifying the taxonomy of individual cells by sequence homology to ribosomal RNA

    Raw reads from each cell were pulled by the cell's unique barcode identifier using seqtk v. 1.2. Reads were then trimmed (command: trim_galore --phred33 -j 8 --length 36 -q 5 --stringency 1 --fastqc -e 0.1), and poly-A was removed (command: trim_galore --polyA -j 1 --length 36), using TrimGalore (v. 0.6.5), a Cutadapt wrapper [44]. Trimmed reads from each cell were assembled using rnaSPAdes 3.15 [45] with kmer 21,33. Raw reads pulling, trimming, and assembly were wrapped using an in-house script (assemble_cells.sh). To identify the taxonomy of the cells, assembled contigs from each cell were matched against 18S rRNA sequences from the Protist Ribosomal Reference (PR2) [46] and metaPR2 [47]. To remove redundancy, the sequences in each database were clustered using CD-HIT v. 4.6.6 at 99% identity [43]. Contigs were filtered using SortMeRNA v. 4.3.6 [48] with default parameters against the PR2 database and then aligned to the PR2 and metaPR2 databases using Blastn [49], at 99% identity, E-value ≤ 10⁻¹⁰ and alignment length of at least 100 bp. Contigs were ranked by their bitscore, and only the best hit was kept for each contig.

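The hit-filtering rule just described (at least 99% identity, E-value ≤ 10⁻¹⁰, alignment length of at least 100 bp, best bitscore per contig) can be sketched as follows. This is only an illustration and not the authors' in-house script: it assumes the Blastn results were written in the standard 12-column tabular format (`-outfmt 6`), which the description does not state, and the function name `best_hits` and the toy hit table are hypothetical.

```python
import csv
import io

# Column order of BLAST tabular output (-outfmt 6).
# Using this output format is an assumption, not stated in the description.
FIELDS = ["qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
          "qstart", "qend", "sstart", "send", "evalue", "bitscore"]


def best_hits(handle, min_identity=99.0, max_evalue=1e-10, min_length=100):
    """Return {contig: hit}, keeping the highest-bitscore hit that passes all thresholds."""
    best = {}
    for row in csv.DictReader(handle, fieldnames=FIELDS, delimiter="\t"):
        hit = {
            "subject": row["sseqid"],
            "identity": float(row["pident"]),
            "length": int(row["length"]),
            "evalue": float(row["evalue"]),
            "bitscore": float(row["bitscore"]),
        }
        # Discard hits that fail any of the three thresholds.
        if (hit["identity"] < min_identity
                or hit["evalue"] > max_evalue
                or hit["length"] < min_length):
            continue
        current = best.get(row["qseqid"])
        if current is None or hit["bitscore"] > current["bitscore"]:
            best[row["qseqid"]] = hit
    return best


# Toy table (hypothetical IDs): contig_1 has two passing hits and one that is
# too short; contig_2 has only a hit below the identity threshold.
example = io.StringIO(
    "contig_1\tPR2_A\t99.5\t150\t1\t0\t1\t150\t1\t150\t1e-70\t270\n"
    "contig_1\tPR2_B\t99.8\t300\t1\t0\t1\t300\t1\t300\t1e-150\t540\n"
    "contig_1\tPR2_C\t100.0\t80\t0\t0\t1\t80\t1\t80\t1e-35\t148\n"
    "contig_2\tPR2_D\t97.0\t400\t12\t0\t1\t400\t1\t400\t1e-180\t650\n"
)
hits = best_hits(example)
```

Contigs with no passing hit are simply absent from the result, so a downstream step can treat them as unassigned.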
    Each contig was assigned to one of the following taxonomic groups that were prevalent in the sample: the classes Bacillariophyta, Prymnesiophyceae, Chrysophyceae, MAST-3, and Katablepharidaceae, and the divisions Pseudofungi, Lobosa (Amoebozoa), Ciliophora (Ciliates), Dinoflagellata, and Cercozoa. Contigs that matched other groups were assigned as "other eukaryotes". Contigs that matched more than one of these taxonomic groups were considered non-specific or chimeric and were therefore ignored. This downstream analysis of the Blast results was wrapped using an in-house script (Sankey_wrapper_extended.ipynb). To avoid detection of doublets and predators, cells that transcribe 18S rRNA transcripts homologous to more than one taxonomic group were conservatively omitted. Of the 972 infected cells detected, 418 (43%) were omitted because we could not assemble specific 18S rRNA contigs from them or because their identity was ambiguous. None of the cells that were assigned "other eukaryotes" had contigs with conflicting annotations (contigs matching different classes).

    Identifying the infecting virus using a homology search against a custom protein database

    To identify transcripts derived from giant viruses, reads from the detected 972 infected cells were compared to a custom protein database using a translated alignment approach. To ensure that as many giant viruses as possible were represented, a database was constructed by combining RefSeq v. 207 [50] with all predicted proteins in the Giant Virus Database [9]. The proteins were then masked with tantan [51] (using the -p option), and the database was generated with the lastdb command (using parameters -c, -p). To identify the infecting virus, the raw sequencing reads in each of the 972 single-cell transcriptomes were compared to the constructed database using LASTAL v. 959 [52] (parameters -m 100, -F 15, -u 2) with best matches retained. The same procedure was done for the assembled transcripts from each cell to identify viral transcripts.
    The results were analyzed at different taxonomic levels, consistent with the Giant Virus Database (for giant viruses) or NCBI taxonomy [33] (everything else). The 754 cells whose best matching virus was coccolithovirus were omitted from the downstream analysis since EhV-infected cells were already reported to be abundant in the algal bloom [25], and our analysis aims to explore other host-virus

  5. List of tumor microenvironment scRNA-seq datasets included in TMExplorer.

    • plos.figshare.com
    • figshare.com
    xls
    Updated Jun 16, 2023
    Cite
    Erik Christensen; Alaine Naidas; David Chen; Mia Husic; Parisa Shooshtari (2023). List of tumor microenvironment scRNA-seq datasets included in TMExplorer. [Dataset]. http://doi.org/10.1371/journal.pone.0272302.t002
    Available download formats
    xls
    Dataset updated
    Jun 16, 2023
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Erik Christensen; Alaine Naidas; David Chen; Mia Husic; Parisa Shooshtari
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    List of tumor microenvironment scRNA-seq datasets included in TMExplorer.

  6. SC2diseases

    • scicrunch.org
    Updated Dec 4, 2023
    Cite
    (2023). SC2diseases [Dataset]. http://identifiers.org/RRID:SCR_019093
    Dataset updated
    Dec 4, 2023
    Description

    Manually curated database of single cell transcriptome for human diseases. scRNA-seq database derived from numerous human studies. Provides researchers with encyclopedia of biomarkers at level of genes, cells, and diseases.

  7. Data_Sheet_1_SCSA: A Cell Type Annotation Tool for Single-Cell RNA-seq...

    • frontiersin.figshare.com
    xlsx
    Updated May 30, 2023
    Cite
    Yinghao Cao; Xiaoyue Wang; Gongxin Peng (2023). Data_Sheet_1_SCSA: A Cell Type Annotation Tool for Single-Cell RNA-seq Data.xlsx [Dataset]. http://doi.org/10.3389/fgene.2020.00490.s001
    Available download formats
    xlsx
    Dataset updated
    May 30, 2023
    Dataset provided by
    Frontiers
    Authors
    Yinghao Cao; Xiaoyue Wang; Gongxin Peng
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Currently, most methods rely on manual strategies to annotate cell types after clustering single-cell RNA sequencing (scRNA-seq) data. Such methods are labor-intensive and rely heavily on user expertise, which may lead to inconsistent results. We present SCSA, an automatic tool to annotate cell types from scRNA-seq data, based on a score annotation model combining differentially expressed genes (DEGs) and confidence levels of cell markers from both known and user-defined information. Evaluation on real scRNA-seq datasets from different sources, in comparison with other methods, shows that SCSA is able to assign cells to the correct types in a fully automated mode with desirable precision.

  8. Inferelator 3.0 Yeast Single-Cell Benchmarking Data

    • zenodo.org
    • data.niaid.nih.gov
    application/gzip, tsv
    Updated Aug 27, 2021
    Cite
    Christopher Jackson (2021). Inferelator 3.0 Yeast Single-Cell Benchmarking Data [Dataset]. http://doi.org/10.5281/zenodo.5272314
    Available download formats
    application/gzip, tsv
    Dataset updated
    Aug 27, 2021
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Christopher Jackson
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Yeast single-cell gene expression data, database-derived prior knowledge network, and hand-curated gold standard network. Used to benchmark the Inferelator 3.0, SCENIC, and CellOracle.

    Expression data (GSE144820_GSE125162.tsv.gz) is an integer count matrix [44343 rows x 6763 columns] with an index column (0) assembled from GSE144820 and GSE125162. Included is a paired metadata file (GSE144820_GSE125162_META_DATA.tsv.gz).

    A database-derived prior knowledge network (YEASTRACT_20190713_BOTH.tsv) is a boolean connectivity matrix [6885 rows x 220 columns] with an index column (0) obtained from the YEASTRACT database on July 13, 2019. It consists of edges that have both DNA localization evidence and evidence of changes to gene expression after TF perturbation.

    A curated gold standard network (Tchourine_2018_yeast_gold_standard.tsv) is a signed connectivity matrix [993 rows x 98 columns] with an index column (0). Details of its construction have been published.
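As a rough illustration of the file layout described above (tab-separated, optionally gzipped, with column 0 acting as the row index), the following sketch reads such a matrix using only the standard library. The function name `read_indexed_tsv`, the presence of a header row, and the toy labels are assumptions made for the example; for the full 44343 x 6763 count matrix, a dataframe library would likely be more practical.

```python
import csv
import gzip
import os
import tempfile


def read_indexed_tsv(path):
    """Read an (optionally gzipped) TSV whose column 0 is the row index.

    Returns (column_names, {row_label: [int, ...]}).
    Assumes the first line is a header row (not confirmed by the description).
    """
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt", newline="") as handle:
        reader = csv.reader(handle, delimiter="\t")
        header = next(reader)
        columns = header[1:]  # header[0] belongs to the index column
        rows = {}
        for record in reader:
            rows[record[0]] = [int(value) for value in record[1:]]
    return columns, rows


# Tiny stand-in file with hypothetical labels; the real matrix is far larger.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "demo_counts.tsv.gz")
    with gzip.open(demo_path, "wt", newline="") as out:
        out.write("\tgene_a\tgene_b\ncell_0\t3\t0\ncell_1\t0\t7\n")
    columns, rows = read_indexed_tsv(demo_path)
```

The same reader would apply to the uncompressed network files, although the signed gold standard would need `int` parsing that accepts negative values (which `int()` already does) and the boolean matrix may store values in a form other than 0/1.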

  9. Single cell reconstruction data from MouseLight database and related Matlab...

    • datasetcatalog.nlm.nih.gov
    • figshare.com
    • +1more
    Updated Sep 20, 2023
    Cite
    chen, susu; Svoboda, Karel; Li, Nuo (2023). Single cell reconstruction data from MouseLight database and related Matlab analysis scripts [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0000988154
    Dataset updated
    Sep 20, 2023
    Authors
    chen, susu; Svoboda, Karel; Li, Nuo
    Description

    This document describes how to identify and extract ALM neurons from the MouseLight database of individual reconstructed neurons (https://ml-neuronbrowser.janelia.org/) to define ALM projection zones (relevant to Chen, Liu et al., Cell, 2023). All scripts are in Matlab R2022b.

    /MouseLight_figshare/MouseLightComplete contains all reconstructed single neurons from the MouseLight data set, in .json and .swc formats.

    • Use ‘ExtractMouseLightNeuronsFromJsonFiles.m’ to extract MouseLight neuron ID, soma coordinates and annotation, and axon coordinates from the above directory.
    • Use ‘ExtractMouseLightALMneurons.m’ to identify and extract ALM neurons from the MouseLight data set. ALM neurons are defined based on functional maps of ALM (photoinhibition) in the CCF coordinate system, contained in ‘ALM_functionalData.nii’ (from Li, Daie, et al., Nature, 2016).
    • Use ‘ALMprojDensity.m’ to compute and generate an ALM projection map based on axonal density. The map is saved in ‘ALM_mask_150um3Dgauss_Bilateral.mat’ as smoothed (3D Gaussian, sigma = 150 um) axonal density in a 3D matrix: F_smooth. First axis: dorsal-ventral, second axis: medial-lateral, third axis: anterior-posterior.
    • Use ‘medial_lateral_ALMprojDensities.m’ to compute and generate medial and lateral ALM projection maps separately. Medial ALM soma locations are < 1.5 mm from the midline; lateral ALM soma locations are > 1.5 mm from the midline. The maps are saved in ‘medialALM_mask_150um3Dgauss_Bilateral.mat’ and ‘lateralALM_mask_150um3Dgauss_Bilateral.mat’, respectively, as smoothed (3D Gaussian, sigma = 150 um) axonal density in 3D matrices.
    • Use ‘PlotALMinCCF.m’ to plot voxels of ALM in CCF, defined by the functional maps in ‘ALM_functionalData.nii’.
    • Use ‘PlotMouseLightALMneurons.m’ to plot ALM neurons (all, medial, or lateral) in CCF; figures are saved in .tiff format.

    Other functions:

    • ‘loadTifFast.m’ is called to load the CCF .tif file (Annotation_new_10_ds222_16bit.tif).
    • ‘plotCCFbrain.m’ is called to plot an isosurface of the CCF brain (Annotation_new_10_ds222_16bit.tif).

  10. utility: Collection of Tumor-Infiltrating Lymphocyte Single-Cell Experiments...

    • zenodo.org
    zip
    Updated Apr 6, 2022
    Cite
    Nicholas Borcherding (2022). utility: Collection of Tumor-Infiltrating Lymphocyte Single-Cell Experiments with TCR [Dataset]. http://doi.org/10.5281/zenodo.5524577
    Available download formats
    zip
    Dataset updated
    Apr 6, 2022
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Nicholas Borcherding
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Introduction

    The original intent of assembling a data set of publicly available tumor-infiltrating T cells (TILs) with paired TCR sequencing was to expand and improve the scRepertoire R package. However, after some discussion, we decided to release the data set for everyone. A complete summary of the sequencing runs and the sample information can be found in the meta data of the Seurat object. This repository contains the code for the initial processing and annotating of the data set (we are calling this version 0.0.1). This involves several steps: 1) loading the respective GE data, 2) harmonizing the data by sample and cohort information, 3) iterating through automatic annotation, 4) unifying annotation via manual inspection and enrichment analysis, and 5) adding the TCR information.

    Methods

    Single-Cell Data Processing

    The filtered gene matrices output from the Cell Ranger align function from individual sequencing runs (10x Genomics, Pleasanton, CA) were loaded into the R global environment. For each sequencing run, cell barcodes were given a unique prefix to prevent issues with duplicate barcodes. The results were then ported into individual Seurat objects (citation), where the cells with > 10% mitochondrial genes and/or 2.5x natural log distribution of counts were excluded for quality control purposes. At the individual sequencing run level, doublets were estimated using the scDblFinder (v1.4.0) R package. All the sequencing runs across experiments were merged into a single Seurat object using the merge() function. All the data was then normalized using the default settings, and 2,000 variable genes were identified using the "vst" method. Next, the data was scaled with the default settings and principal components were calculated for 40 components. Data was integrated using the harmony (v1.0.0) R package (citation) using both cohort and sample information to correct for batch effect with up to 20 iterations. The UMAP was created using the RunUMAP() function in Seurat, using 20 dimensions of the harmony calculations.

    Annotation of Cells

    Automatic annotation was performed using the SingleR (v1.4.1) R package (citation) with the HPCA (citation) and DICE (citation) data sets as references and the fine label discriminators. Individual sequencing runs were subsetted to run through the SingleR algorithm in order to reduce memory demands. The outputs of all the SingleR analyses were collated and appended to the meta data of the Seurat object. Likewise, the ProjecTILs (v0.4.1) R package (citation) was used for automatic annotation as a partially orthogonal approach. Consensus annotation was derived from all 3 databases (HPCA, DICE, ProjecTILs) using a majority approach. No annotation designation was assigned to cells that returned NA for both SingleR and ProjecTILs. Mixed annotations were designated for cells that SingleR identified as non-T cells and ProjecTILs identified as T cells. Cell type designations with fewer than 100 cells in the entire cohort were reduced to "other". Automated annotations were checked manually using canonical marker genes and gene enrichment analysis performed using the UCell (v1.0.0) R package (citation).
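The majority-vote consensus and the collapsing of rare labels described above can be sketched as follows. This is a hedged illustration in Python rather than the repository's R code: the function names, the example labels, and in particular the handling of ties (returned here as "unresolved") are assumptions that the description does not specify.

```python
from collections import Counter


def consensus_label(labels):
    """Majority vote over per-reference labels; None marks a missing (NA) call."""
    votes = Counter(label for label in labels if label is not None)
    if not votes:
        return None  # no annotation from any reference
    (top, top_n), *rest = votes.most_common()
    if rest and rest[0][1] == top_n:
        return "unresolved"  # tie handling is an assumption, not from the source
    return top


def collapse_rare(cell_labels, min_cells=100):
    """Relabel cell types represented by fewer than min_cells cells as 'other'."""
    counts = Counter(label for label in cell_labels if label is not None)
    return [
        label if label is None or counts[label] >= min_cells else "other"
        for label in cell_labels
    ]


# One tuple per cell: hypothetical (HPCA, DICE, ProjecTILs) calls.
calls = [
    ("CD8 T", "CD8 T", "CD4 T"),
    ("B cell", None, "B cell"),
    (None, None, None),
]
consensus = [consensus_label(c) for c in calls]
collapsed = collapse_rare(["a", "a", "a", "b"], min_cells=2)
```

In practice the three references use different label vocabularies, so a harmonization step mapping them onto shared names would be needed before any vote is meaningful.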

    Addition of TCR data

    The filtered contig annotation T cell receptor (TCR) data for available sequencing runs were loaded into the R global environment. Individual contigs were combined using the combineTCR() function of the scRepertoire (v1.3.2) R package (citation). Clonotypes were assigned to barcodes, and where multiple duplicate chains were present for an individual cell, they were filtered to select the top-expressing contig by read count. The clonotype data was then added to the Seurat object, with the proportion across individual patients being used to calculate frequency.

    Citations

    As of right now, there is no citation associated with the assembled data set. However if using the data, please find the corresponding manuscript for each data set in the meta.data of the single-cell object. In addition, if using the processed data, feel free to modify the language in the methods section (above) and please cite the appropriate manuscripts of the software or references that were used.

    Itemized List of the Software Used

    Itemized List of Reference Data Used

    • Human Primary Cell Atlas (HPCA) - citation
    • Database Immune Cell Expression (DICE) - citation
    • Immune-related Gene Sets - citation

    Future Directions

    • Data Hosting for Interactive Analysis
    • Easy Submission Portal for Researchers to Add Data
    • Using the Data to Build a Reference Atlas

    These are areas that we are actively hoping to develop to further increase the usefulness of the data set. If you have other suggestions, please reach out using the contact information below.

    Contact

    Questions, comments, suggestions, please feel free to contact Nick Borcherding via this repository, email, or using twitter.

  11. Cell Health - Cell Painting Single Cell Profiles

    • nih.figshare.com
    bin
    Updated May 31, 2023
    Cite
    Gregory Way; Maria Kost-Alimova; Tsukasa Shibue; Will Harrington; Stanley Gill; Tim Becker; William C. Hahn; Anne Carpenter; Francisca Vazquez; Shantanu Singh (2023). Cell Health - Cell Painting Single Cell Profiles [Dataset]. http://doi.org/10.35092/yhjc.9995672.v5
    Available download formats
    bin
    Dataset updated
    May 31, 2023
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Gregory Way; Maria Kost-Alimova; Tsukasa Shibue; Will Harrington; Stanley Gill; Tim Becker; William C. Hahn; Anne Carpenter; Francisca Vazquez; Shantanu Singh
    License

    CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    Single Cell Databases of Cell Painting Profiles for the Cell Health Project. These data are used to aggregate profiles in a CRISPR knockout experiment. The data are used to predict cell health assays.

    Data

    We collected Cell Painting measurements on a CRISPR experiment. The experiment targeted 59 genes, which included 119 unique guides (~2 per gene), across 3 cell lines. The cell lines included A549, ES2, and HCC44. About 40% of all CRISPR guides were reproducible. This is ok since we are not actually interested in the CRISPR treatment specifically, but instead, just its corresponding readout in each cell health assay.

    Approach

    We performed the following approach:

    • Split data into 85% training and 15% test sets.
    • Normalized data by plate (z-score).
    • Selected optimal hyperparameters using 5-fold cross-validation.
    • Trained elastic net regression models to predict each of the 70 cell health assay readouts, independently.
    • Trained using shuffled data as well.
    • Report performance on training and test sets.

    We also trained logistic regression classifiers using the same approach above.

    See https://github.com/broadinstitute/cell-health for more details.
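The first two steps of the approach above (an 85%/15% split and per-plate z-scoring) can be illustrated with a small standard-library sketch. It is not the project's code: the helper names, the single `value` feature per profile, and the toy plates are hypothetical, and the description does not say whether plate statistics were computed before or after the split. The elastic net fitting and 5-fold cross-validation are omitted because they would require a modelling library.

```python
import random
import statistics
from collections import defaultdict


def zscore_by_plate(profiles):
    """Z-score each profile's value against the mean and std of its own plate."""
    by_plate = defaultdict(list)
    for profile in profiles:
        by_plate[profile["plate"]].append(profile["value"])
    stats = {
        plate: (statistics.mean(values), statistics.pstdev(values))
        for plate, values in by_plate.items()
    }
    normalized = []
    for profile in profiles:
        mean, std = stats[profile["plate"]]
        z = (profile["value"] - mean) / std if std > 0 else 0.0
        normalized.append({**profile, "value": z})
    return normalized


def train_test_split(items, test_fraction=0.15, seed=0):
    """Shuffle reproducibly and hold out test_fraction of the items."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]


# Toy data: 20 profiles on two hypothetical plates, one feature each.
profiles = [
    {"plate": "P1" if i < 10 else "P2", "value": float(i)} for i in range(20)
]
normalized = zscore_by_plate(profiles)
train, test = train_test_split(normalized)
```

Normalizing within each plate removes plate-level offsets so that the downstream regression sees comparable feature scales across plates.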

  12. Table1_Mechanism of action of paclitaxel for treating glioblastoma based on...

    • datasetcatalog.nlm.nih.gov
    • frontiersin.figshare.com
    Updated Nov 21, 2022
    + more versions
    Cite
    Jin, Jinghao; Xu, Fanjie; Shen, Chaodong; Rao, Changjun; Lu, Jianglong; Wang, Chengde; Li, Qun; Zhu, Zhangzhang (2022). Table1_Mechanism of action of paclitaxel for treating glioblastoma based on single-cell RNA sequencing data and network pharmacology.XLS [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0000259533
    Dataset updated
    Nov 21, 2022
    Authors
    Jin, Jinghao; Xu, Fanjie; Shen, Chaodong; Rao, Changjun; Lu, Jianglong; Wang, Chengde; Li, Qun; Zhu, Zhangzhang
    Description

    Paclitaxel is an herbal active ingredient used in clinical practice that shows anti-tumor effects. However, its biological activity, mechanism, and cancer cell-killing effects remain unknown. Information on the chemical-gene interactions of paclitaxel was obtained from the Comparative Toxicogenomics Database, SwissTargetPrediction, BindingDB, and TargetNet databases. Gene expression data were obtained from the GSE4290 dataset. Differential gene analysis, Kyoto Encyclopedia of Genes and Genomes, and Gene Ontology analyses were performed. Gene set enrichment analysis was performed to evaluate disease pathway activation; weighted gene co-expression network analysis with diff analysis was used to identify disease-associated genes, analyze differential genes, and identify drug targets via protein-protein interactions. The Molecular Complex Detection (MCODE) analysis of critical subgroup networks was conducted to identify essential genes affected by paclitaxel, assess crucial cluster gene expression differences in glioma versus standard samples, and perform receiver operating characteristic mapping. To evaluate the pharmacological targets and signaling pathways of paclitaxel in glioblastoma, the single-cell GSE148196 dataset was acquired from the Gene Expression Omnibus database and preprocessed using Seurat software. Based on the single-cell RNA-sequencing dataset, 24 cell clusters were identified, along with marker genes for the two different cell types in each cluster. Correlation analysis revealed that the mechanism of paclitaxel treatment involves effects on neurons. Paclitaxel may affect glioblastoma by improving glucose metabolism and processes involved in modulating immune function in the body.

  13. MDMcleaner reference database

    • data-staging.niaid.nih.gov
    • data.niaid.nih.gov
    • +1 more
    Updated Jan 28, 2022
    Cite
    John Vollmers (2022). MDMcleaner reference database [Dataset]. https://data-staging.niaid.nih.gov/resources?id=zenodo_5698994
    Explore at:
    Dataset updated
    Jan 28, 2022
    Dataset provided by
    Karlsruhe Institute of Technology
    Authors
    John Vollmers
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    MDMcleaner reference database used at time of publication.

    based on:

    GTDB release r95

    RefSeq release 203

    Silva version 138.1

  14. Rodrigo Quian Quiroga EEG ERP and single cell recordings database

    • dknet.org
    • scicrunch.org
    • +2 more
    Updated Jan 29, 2022
    Cite
    (2022). Rodrigo Quian Quiroga EEG ERP and single cell recordings database [Dataset]. http://identifiers.org/RRID:SCR_001580
    Explore at:
    Dataset updated
    Jan 29, 2022
    Description

    Five EEG, ERP, and single-cell recording data sets. Each file corresponds to the recording of a different subject at the left occipital electrode (O1), with a linked-earlobes reference. Each file contains several artifact-free trials of 512 data points each (256 pre- and 256 post-stimulation), stored with a sampling frequency of 250 Hz. Trials are stored consecutively in a one-column file. Data were pre-filtered in the range 0.1-70 Hz. All trials correspond to target stimulation with an oddball paradigm.

    STAR R based Data Sets Used
    * Dataset # 1: Human single-cell recording
    * Dataset # 2: Simulated extracellular recordings
    * Dataset # 3: EEG signals from rats
    * Dataset # 4: Pattern visual evoked potentials
    * Dataset # 5: Tonic-clonic (Grand Mal) seizures
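
    The file layout described for this entry (consecutive 512-point trials in a single column, sampled at 250 Hz) can be unpacked roughly as follows. This is a minimal sketch, not the database's own tooling: the function names are hypothetical, and it assumes the stimulus onset falls at sample index 256 of each trial, which the description implies but does not state outright.

```python
SAMPLING_HZ = 250        # sampling frequency given in the description
POINTS_PER_TRIAL = 512   # 256 pre- plus 256 post-stimulation points
PRE_STIM_POINTS = 256    # assumption: onset sits at index 256 of each trial


def split_trials(samples):
    """Split a flat, one-column list of samples into consecutive 512-point trials."""
    if len(samples) % POINTS_PER_TRIAL != 0:
        raise ValueError("sample count is not a multiple of 512")
    return [
        samples[start:start + POINTS_PER_TRIAL]
        for start in range(0, len(samples), POINTS_PER_TRIAL)
    ]


def time_axis_ms():
    """Time of each point relative to the assumed stimulus onset, in milliseconds."""
    return [
        (index - PRE_STIM_POINTS) * 1000.0 / SAMPLING_HZ
        for index in range(POINTS_PER_TRIAL)
    ]
```

    At 250 Hz each point spans 4 ms, so under the stated assumption a trial covers about 1 s before and 1 s after the stimulus.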

  15. DataSheet_5_A combined analysis of bulk and single-cell sequencing data...

    • datasetcatalog.nlm.nih.gov
    • frontiersin.figshare.com
    Updated Sep 15, 2022
    Cite
    Dong, Shaowei; Lin, Guanchuan; Zhao, Pan; Ma, Xiaoshi; Xu, Jing; Zhang, Hao; Zhang, Siyu; Zou, Chang; Hu, Jiliang (2022). DataSheet_5_A combined analysis of bulk and single-cell sequencing data reveals that depleted extracellular matrix and enhanced immune processes co-contribute to fluorouracil beneficial responses in gastric cancer.xlsx [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0000398302
    Explore at:
    Dataset updated
    Sep 15, 2022
    Authors
    Dong, Shaowei; Lin, Guanchuan; Zhao, Pan; Ma, Xiaoshi; Xu, Jing; Zhang, Hao; Zhang, Siyu; Zou, Chang; Hu, Jiliang
    Description

    Fluorouracil, also known as 5-FU, is one of the most commonly used chemotherapy drugs in the treatment of advanced gastric cancer (GC). However, the presence of innate or acquired resistance largely limits its survival benefit in GC patients. Although accumulating studies have demonstrated the involvement of tumor microenvironments (TMEs) in chemo-resistance induction, so far little is known about the relevance of GC TMEs in 5-FU resistance. To this end, in this study, we investigated the relationship between TME features and 5-FU responses in GC patients using a combined analysis involving both bulk sequencing data from the TCGA database and single-cell RNA sequencing data from the GEO database. We found that depleted extracellular matrix (ECM) components such as capillary/stroma cells and enhanced immune processes such as increased number of M1 polarized macrophages/Memory T cells/Natural Killer T cells/B cells and decreased number of regulatory T cells are two important features relating to 5-FU beneficial responses in GC patients, especially in diffuse-type patients. We further validated these two features in the tumor tissues of 5-FU-benefit GC patients using immunofluorescence staining experiments. Based on this finding, we also established a Pro (63 genes) and Con (199 genes) gene cohort that could predict 5-FU responses in GC with an AUC (area under curve) score of 0.90 in diffuse-type GC patients, and further proved the partial applicability of this gene panel pan-cancer-wide. Moreover, we identified possible communications mediated by heparanase and galectin-1 which could regulate ECM remodeling and tumor immune microenvironment (TIME) reshaping. Altogether, these findings deciphered the relationship between GC TMEs and 5-FU resistance for the first time, as well as provided potential therapeutic targets and predicting rationale to overcome this chemo-resistance, which could shed some light on developing novel precision treatment strategies in clinical practice.

  16. Additional file 4 of scTyper: a comprehensive pipeline for the cell typing...

    • figshare.com
    • datasetcatalog.nlm.nih.gov
    html
    Updated Aug 5, 2020
    Cite
    Ji-Hye Choi; Hye In Kim; Hyun Goo Woo (2020). Additional file 4 of scTyper: a comprehensive pipeline for the cell typing analysis of single-cell RNA-seq data [Dataset]. http://doi.org/10.6084/m9.figshare.12762709.v1
    Explore at:
    Available download formats: html
    Dataset updated
    Aug 5, 2020
    Dataset provided by
    figshare
    Authors
    Ji-Hye Choi; Hye In Kim; Hyun Goo Woo
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Additional file 4: Supplementary Data. An example report summary document of scTyper.

  17. Data from: Single cell RNA-seq analysis reveals that prenatal arsenic...

    • data.niaid.nih.gov
    • datadryad.org
    • +1 more
    zip
    Updated Jun 1, 2020
    Cite
    Britton Goodale; Kevin Hsu; Kenneth Ely; Thomas Hampton; Bruce Stanton; Richard Enelow (2020). Single cell RNA-seq analysis reveals that prenatal arsenic exposure results in long-term, adverse effects on immune gene expression in response to Influenza A infection [Dataset]. http://doi.org/10.5061/dryad.vt4b8gtp6
    Explore at:
    Available download formats: zip
    Dataset updated
    Jun 1, 2020
    Dataset provided by
    Dartmouth–Hitchcock Medical Center
    Dartmouth College
    Authors
    Britton Goodale; Kevin Hsu; Kenneth Ely; Thomas Hampton; Bruce Stanton; Richard Enelow
    License

    https://spdx.org/licenses/CC0-1.0.html

    Description

    Arsenic exposure via drinking water is a serious environmental health concern. Epidemiological studies suggest a strong association between prenatal arsenic exposure and subsequent childhood respiratory infections, as well as morbidity from respiratory diseases in adulthood, long after systemic clearance of arsenic. We investigated the impact of exclusive prenatal arsenic exposure on the inflammatory immune response and respiratory health after an adult influenza A (IAV) lung infection. C57BL/6J mice were exposed to 100 ppb sodium arsenite in utero, and subsequently infected with IAV (H1N1) after maturation to adulthood. Assessment of lung tissue and bronchoalveolar lavage fluid (BALF) at various time points post IAV infection reveals greater lung damage and inflammation in arsenic exposed mice versus control mice. Single-cell RNA sequencing analysis of immune cells harvested from IAV infected lungs suggests that the enhanced inflammatory response is mediated by dysregulation of innate immune function of monocyte derived macrophages, neutrophils, NK cells, and alveolar macrophages. Our results suggest that prenatal arsenic exposure results in lasting effects on the adult host innate immune response to IAV infection, long after exposure to arsenic, leading to greater immunopathology. This study provides the first direct evidence that exclusive prenatal exposure to arsenic in drinking water causes predisposition to a hyperinflammatory response to IAV infection in adult mice, which is associated with significant lung damage.

    Methods

    Whole lung homogenate preparation for single cell RNA sequencing (scRNA-seq)

    Lungs were perfused with PBS via the right ventricle, harvested, and mechanically dissociated prior to straining through 70- and 30-µm filters to obtain a single-cell suspension. Dead cells were removed (annexin V EasySep kit, StemCell Technologies, Vancouver, Canada), and samples were enriched for cells of hematopoietic origin by magnetic separation using anti-CD45-conjugated microbeads (Miltenyi, Auburn, CA). Single-cell suspensions of 6 samples were loaded on a Chromium Single Cell system (10X Genomics) to generate barcoded single-cell gel beads in emulsion, and scRNA-seq libraries were prepared using Single Cell 3’ Version 2 chemistry. Libraries were multiplexed and sequenced on 4 lanes of a NextSeq 500 sequencer (Illumina) with 3 sequencing runs. Demultiplexing and barcode processing of raw sequencing data were conducted using Cell Ranger v. 3.0.1 (10X Genomics; Dartmouth Genomics Shared Resource Core). Reads were aligned to mouse (GRCm38) and influenza A virus (A/PR8/34, genome build GCF_000865725.1) genomes to generate unique molecular index (UMI) count matrices. Gene expression data have been deposited in the NCBI GEO database and are available at accession # GSE142047.

    Preprocessing of single cell RNA sequencing (scRNA-seq) data

    Count matrices produced using Cell Ranger were analyzed in the R statistical working environment (version 3.6.1). Preliminary visualization and quality analysis were conducted using scran (v. 1.14.3, Lun et al., 2016) and scater (v. 1.14.1, McCarthy et al., 2017) to identify thresholds for cell quality and feature filtering. Sample matrices were imported into Seurat (v. 3.1.1, Stuart et al., 2019) and the percentages of mitochondrial, hemoglobin, and influenza A viral transcripts were calculated per cell. Cells with < 1000 or > 20,000 unique molecular identifiers (UMIs: low quality and doublets), fewer than 300 features (low quality), greater than 10% of reads mapped to mitochondrial genes (dying), or greater than 1% of reads mapped to hemoglobin genes (red blood cells) were filtered from further analysis. Total cells per sample after filtering ranged from 1895 to 2482; no significant difference in the number of cells was observed in arsenic vs. control. Data were then normalized using SCTransform (Hafemeister et al., 2019) and variable features identified for each sample. Integration anchors between samples were identified using canonical correlation analysis (CCA) and mutual nearest neighbors (MNNs), as implemented in Seurat V3 (Stuart et al., 2019), and used to integrate samples into a shared space for further comparison. This process enables identification of shared populations of cells between samples, even in the presence of technical or biological differences, while also allowing for non-overlapping populations that are unique to individual samples.
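
    The cell-quality thresholds quoted above can be written as a single predicate. The sketch below is illustrative only: the authors worked in R with Seurat, whereas this is plain Python, and the dictionary keys (`umis`, `features`, `pct_mito`, `pct_hb`) are hypothetical names for the per-cell metrics.

```python
def passes_qc(cell):
    """Return True if a cell survives the filtering thresholds quoted in the text.

    `cell` is a dict with hypothetical keys:
      umis     - total UMI count
      features - number of detected features (genes)
      pct_mito - percent of reads mapped to mitochondrial genes
      pct_hb   - percent of reads mapped to hemoglobin genes
    """
    return (
        1000 <= cell["umis"] <= 20000    # drop < 1000 (low quality) and > 20,000 (doublets)
        and cell["features"] >= 300      # drop fewer than 300 features
        and cell["pct_mito"] <= 10.0     # drop > 10% mitochondrial (dying)
        and cell["pct_hb"] <= 1.0        # drop > 1% hemoglobin (red blood cells)
    )


def filter_cells(cells):
    """Keep only the cells that pass every threshold."""
    return [cell for cell in cells if passes_qc(cell)]
```

    Values exactly at a threshold are kept here, because the text only excludes cells strictly below or above each cutoff.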

    Clustering and reference-based cell identity labeling of single immune cells from IAV-infected lung with scRNA-seq

    Principal components were identified from the integrated dataset and were used for Uniform Manifold Approximation and Projection (UMAP) visualization of the data in two-dimensional space. A shared-nearest-neighbor (SNN) graph was constructed using default parameters, and clusters were identified using the SLM algorithm in Seurat at a range of resolutions (0.2-2). The first 30 principal components were used to identify 22 cell clusters ranging in size from 25 to 2310 cells. Gene markers for clusters were identified with the findMarkers function in scran. To label individual cells with cell type identities, we used the SingleR package (v. 3.1.1) to compare gene expression profiles of individual cells with expression data from curated, FACS-sorted leukocyte samples in the Immgen compendium (Aran D. et al., 2019; Heng et al., 2008). We manually updated the Immgen reference annotation with 263 sample group labels for fine-grain analysis and 25 CD45+ cell type identities based on markers used to sort Immgen samples (Guilliams et al., 2014). The reference annotation is provided in Table S2; cells that were not labeled confidently after label pruning were assigned “Unknown”.

    Differential gene expression by immune cells

    Differential gene expression within individual cell types was performed by pooling raw count data from cells of each cell type on a per-sample basis to create a pseudo-bulk count table for each cell type. Differential expression analysis was only performed on cell types that were sufficiently represented (>10 cells) in each sample. In droplet-based scRNA-seq, ambient RNA from lysed cells is incorporated into droplets, and can result in spurious identification of these genes in cell types where they aren’t actually expressed. We therefore used a method developed by Young and Behjati (Young et al., 2018) to estimate the contribution of ambient RNA for each gene, and identified genes in each cell type that were estimated to be > 25% ambient-derived. These genes were excluded from analysis in a cell-type specific manner. Genes expressed in fewer than 5 percent of cells were also excluded from analysis. Differential expression analysis was then performed in limma (limma-voom with quality weights) following a standard protocol for bulk RNA-seq (Law et al., 2014). Significant genes were identified using MA/QC criteria of P < 0.05, log2FC > 1.
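
    The pseudo-bulk pooling and the ">10 cells in each sample" rule described above might look like this in outline. It is a sketch under assumptions rather than the authors' code (their analysis was done in R): the record layout (`sample`, `cell_type`, `counts`) and both function names are hypothetical.

```python
from collections import defaultdict


def pseudo_bulk(cells, cell_type):
    """Sum raw counts per sample over all cells of one cell type.

    `cells` is a list of dicts with hypothetical keys `sample`, `cell_type`,
    and `counts` (a dict mapping gene name to raw count).
    Returns (count table keyed by sample then gene, number of cells per sample).
    """
    table = defaultdict(lambda: defaultdict(int))
    n_cells = defaultdict(int)
    for cell in cells:
        if cell["cell_type"] != cell_type:
            continue
        sample = cell["sample"]
        n_cells[sample] += 1
        sample_counts = table[sample]   # also registers samples whose cells have no counts
        for gene, count in cell["counts"].items():
            sample_counts[gene] += count
    return {s: dict(genes) for s, genes in table.items()}, dict(n_cells)


def sufficiently_represented(n_cells, samples, min_cells=10):
    """True if the cell type has more than `min_cells` cells in every sample."""
    return all(n_cells.get(sample, 0) > min_cells for sample in samples)
```

    Pooling to one column per sample is what lets a bulk RNA-seq workflow such as limma-voom be applied afterwards, with samples rather than individual cells as the unit of replication.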

    Analysis of arsenic effect on immune cell gene expression by scRNA-seq.

    Sample-wide effects of arsenic on gene expression were identified by pooling raw count data from all cells per sample to create a count table for pseudo-bulk gene expression analysis. Genes with less than 20 counts in any sample, or less than 60 total counts were excluded from analysis. Differential expression analysis was performed using limma-voom as described above.
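
    The gene-level count filter in the preceding paragraph can be sketched as below. The wording "less than 20 counts in any sample" is ambiguous; this sketch assumes it means a gene is dropped if at least one sample falls below 20 counts, and the function name and defaults are hypothetical.

```python
def keep_gene(counts_per_sample, min_per_sample=20, min_total=60):
    """Decide whether a gene is retained for the sample-wide pseudo-bulk analysis.

    `counts_per_sample` holds the gene's pooled raw count in each sample.
    Assumed reading: exclude the gene if any single sample has fewer than
    `min_per_sample` counts, or if the total is below `min_total`.
    """
    return (
        min(counts_per_sample) >= min_per_sample
        and sum(counts_per_sample) >= min_total
    )
```

    Under this reading the total-count rule only matters when there are fewer than three samples, so the authors may have intended a looser per-sample criterion; the source does not say.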

  18. Allen Cell Types Database

    • neuinfo.org
    • dknet.org
    • +2 more
    Updated Dec 16, 2022
    Cite
    (2022). Allen Cell Types Database [Dataset]. http://identifiers.org/RRID:SCR_014806
    Explore at:
    Dataset updated
    Dec 16, 2022
    Description

    Database of neuronal cell types based on multimodal characterization of single cells to enable data-driven approaches to classification. It includes data such as electrophysiology recordings, imaging data, morphological reconstructions, and RNA and DNA sequencing data.

  19. Data Sheet 1_Identifying key genes associated with recurrence in non-small...

    • datasetcatalog.nlm.nih.gov
    • frontiersin.figshare.com
    Updated Jun 11, 2025
    Cite
    Cao, Chunxiao; Xie, Yuning; Li, Weiyuan; Shen, Jingxia; Han, Duo (2025). Data Sheet 1_Identifying key genes associated with recurrence in non-small cell lung cancer through TCGA and single-cell analysis.csv [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0002045446
    Explore at:
    Dataset updated
    Jun 11, 2025
    Authors
    Cao, Chunxiao; Xie, Yuning; Li, Weiyuan; Shen, Jingxia; Han, Duo
    Description

    Objective: This study aims to mine the TCGA database for differentially expressed genes in recurrent lung cancer tissues, determine the relationship between these recurrent genes and lung cancer at the single-cell level, and identify potential targets for lung cancer treatment.

    Methods: Data for lung adenocarcinoma (LUAD) and lung squamous cell carcinoma (LUSC) were obtained from the TCGA database and grouped based on clinical recurrence information. Single-cell data from GSE131907 were downloaded from the GEO database. R was utilized to screen for differentially expressed genes (DEGs), followed by weighted gene co-expression network analysis (WGCNA) of these DEGs. Additionally, the GSEA database was employed to visualize differential pathways and identify key genes. The relationship between the expression of these key genes and lung cancer recurrence was validated using the GSE131907 single-cell dataset.

    Results: A total of 2,239 differentially expressed genes were identified in the LUAD dataset, while 3,404 differentially expressed genes were found in the LUSC dataset. WGCNA revealed that the lapis lazuli module gene set was associated with recurrence. Validation at the single-cell level indicated that the FOXI1, FOXB1, and KCNA7 genes were linked to lung cancer progression.

    Conclusion: The differentially expressed genes primarily influence NSCLC recurrence through involvement in biological processes related to metabolism and hormone secretion pathways. Notably, the KCNA7 and FOX gene families were identified as critical for NSCLC recurrence. This study highlights specific genes within proliferation and cell cycle pathways as key therapeutic targets for managing NSCLC recurrence.

  20. Table 2_Integrative spatial and single-cell transcriptomics elucidate...

    • datasetcatalog.nlm.nih.gov
    • frontiersin.figshare.com
    Updated Jul 16, 2025
    Cite
    Tan, Jinmei; Zhao, Yutong; Liu, Jiawei; Li, Shumin; Zhou, Qi; Zhou, Caihong; Chen, Wenhao; Lei, Kai; Tan, Jiehui; Wu, Jian; Zhang, Yi (2025). Table 2_Integrative spatial and single-cell transcriptomics elucidate programmed cell death-driven tumor microenvironment dynamics in hepatocellular carcinoma.docx [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0002062124
    Explore at:
    Dataset updated
    Jul 16, 2025
    Authors
    Tan, Jinmei; Zhao, Yutong; Liu, Jiawei; Li, Shumin; Zhou, Qi; Zhou, Caihong; Chen, Wenhao; Lei, Kai; Tan, Jiehui; Wu, Jian; Zhang, Yi
    Description

    Purpose: Programmed cell death (PCD) mechanisms play crucial roles in cancer progression and treatment response. This study aims to develop a PCD scores prediction model to evaluate the prognosis of hepatocellular carcinoma (HCC) and elucidate the tumor microenvironment differences.

    Methods: We analyzed transcriptomic data from 363 HCC patients in the TCGA database and 221 patients in the GEO database to develop a PCD prediction model. Single-cell RNA sequencing (scRNA-seq) and spatial transcriptomics sequencing (ST-seq) data from HCC patients were analyzed to investigate the tumor microenvironment and functional disparities. The oncogenic role of the key gene UBE2E1 in the model was explored in HCC through various in vitro experiments.

    Results: Seventeen PCD-related genes were identified as significant prognostic indicators, forming the basis of our PCD prediction model. High-PCD scores correlated with poorer overall survival (OS) and exhibited significant predictive capabilities. scRNA-seq analysis revealed distinct tumor cell characteristics and immune microenvironment differences between high- and low-PCD groups. High-PCD tumors showed increased cell proliferation and malignancy-associated gene expression. T cells in high-PCD patients were more likely to be exhausted, with elevated expression of exhaustion markers. ST-seq data also confirmed these results. Among the genes associated with the PCD prognostic model, UBE2E1 was identified as a key oncogenic marker in HCC.

    Conclusions: The PCD prediction model effectively predicts prognosis in HCC patients and reveals critical insights into the tumor microenvironment and immune cell exhaustion. This study underscores the potential of PCD-related biomarkers in guiding personalized treatment strategies for HCC.
