11 datasets found

Data from: Single cell multiomic analysis identifies key genes...
data.niaid.nih.gov
search.dataone.org
zip
Updated Jul 2, 2024
Cite
Abhinav Kaushik; Kari Nadeau (2024). Single cell multiomic analysis identifies key genes differentially expressed in innate lymphoid cells from COVID-19 patients [Dataset]. http://doi.org/10.5061/dryad.8931zcrz4
Available download formats: zip
Unique identifier
https://doi.org/10.5061/dryad.8931zcrz4
Dataset updated
Jul 2, 2024
Dataset provided by
National Institute of Allergy and Infectious Diseases (http://www.niaid.nih.gov/)
Authors
Abhinav Kaushik; Kari Nadeau
License
CC0 1.0 (https://spdx.org/licenses/CC0-1.0.html)
Description
Innate lymphoid cells (ILCs) are enriched at mucosal surfaces where they respond rapidly to environmental stimuli and contribute to both tissue inflammation and healing. To gain insight into the role of ILCs in the pathology and recovery from COVID-19 infection, we employed a multi-omic approach consisting of Abseq and targeted mRNA sequencing to respectively probe the surface marker expression, transcriptional profile and heterogeneity of ILCs in peripheral blood of patients with COVID-19 compared with healthy controls. We found that the frequency of ILC1 and ILC2 cells was significantly increased in COVID-19 patients. Moreover, all ILC subsets displayed a significantly higher frequency of CD69-expressing cells, indicating a heightened state of activation. ILC2s from COVID-19 patients had the highest number of significantly differentially expressed (DE) genes. The most notable genes DE in COVID-19 vs healthy participants included a) genes associated with responses to virus infections and b) genes that support ILC self-proliferation, activation and homeostasis. In addition, differential gene regulatory network analysis revealed ILC-specific regulons and their interactions driving the differential gene expression in each ILC. Overall, this study provides mechanistic insights into the characteristics of ILC subsets activated during COVID-19 infection.

Methods

Study participants, blood draws and processing

Participants were recruited as described previously from adults who had a positive SARS-CoV-2 RT-PCR test at Stanford Health Care (NCT04373148). COVID-19 samples were collected between May and December 2020. The cohort used in this study consisted of asymptomatic (n=2), mild (n=17), and moderate (n=3) COVID-19 infections, some of whom developed long-term COVID-19 (n=15). The clinical case severities at the time of diagnosis were defined as asymptomatic, moderate or mild according to the guidelines released by NIH. Long-term (LT) COVID was defined as symptoms occurring 30 or more days after infection, consistent with CDC guidelines. Some participants in our study continued to have LT COVID symptoms 90 days after diagnosis (n=12). The exclusion criterion for the COVID sample study was an NIH severity diagnosis of severe or critical at the time of the positive COVID test. Samples selected for this study were obtained within 76 days of the positive PCR COVID-19 test date. Healthy controls were selected from participants whose samples were collected before 2020. Informed consent was obtained from all participants. All protocols were approved by the Stanford Administrative Panel on Human Subjects in Medical Research. Peripheral blood was drawn by venipuncture and, using validated and published procedures, peripheral blood mononuclear cells (PBMCs) were isolated by Ficoll-based density gradient centrifugation, frozen in aliquots and stored in liquid nitrogen at -80°C until thawing. A summary of participant demographics is presented in Supp. Table 1.

ILC enrichment, single cell captures for Abseq and targeted mRNAseq

Participant PBMCs were thawed, and each sample was stained with Sample Tag (BD #633781) at room temperature for 20 minutes. Samples were combined in healthy control or COVID-19 tubes. Cells were surface stained with a panel of fluorochrome-conjugated antibodies (Supp. Table 2) in buffer (PBS with 0.25% BSA and 1 mM EDTA) for 20 minutes at room temperature prior to immunomagnetic negative selection for ILCs. Following ILC enrichment using the EasySep human Pan-ILC enrichment kit (StemCell Technologies #17975), cells from healthy and COVID-19 recovered participants were counted and normalized before combining. ILCs were sorted using a BD FACS Aria at the Stanford FACS facility prior to incubation with AbSeq oligo-linked mAbs (Supp. Table 3). Sorted cells were processed by the Stanford Human Immune Monitoring Center (HIMC) using the BD Rhapsody platform. The library was prepared using the BD Immune Response Targeting Panel (BD Kit #633750) with the addition of custom gene panel reagents (Supp. Table 4) and sequenced on an Illumina NovaSeq 6000 at the Stanford Genomics Sequencing Center (SGSC). ILCs were identified as lineage-negative (CD3neg, CD14neg, CD34neg, CD19neg), NKG2Aneg, CD45+, further defined as CD127+CD161+, and divided into subsets: ILC1 (CD117negCRTH2neg), ILC2 (CRTH2+) and ILCp (CD117+CRTH2neg) (Supp. Fig. 1).

Computational data analysis

The above multi-modal setup allowed paired measurements of cellular transcriptome and cell surface protein abundance. The ILC1, ILC2 and ILCp cells were manually gated based on the abundance profile of CD127, CD117, CD161 and CRTH2 (Supp. Fig. 1). Before the integrative analysis, the complete multi-modal single cell dataset containing ILC subsets was converted into a single Seurat object. All subsequent protein-level and gene-level analyses were performed using the multimodal data analysis pipeline of the Seurat R package (version 4.0).

The normalized and scaled protein abundance profile was used for estimating the integrated harmony dimensions using the RunHarmony function in the Seurat R package (reduction = "apca" and group.by.vars = "batch"). The batch-corrected harmony embeddings were then used for computing the Uniform Manifold Approximation and Projection (UMAP) dimensions to visualize the clusters of ILC subsets. Differential marker analysis of surface proteins from the Abseq panels between two groups of cells (COVID-19 and healthy cohort) was computed with normalized and scaled expression values using the FindMarkers function from the Seurat R package (test.use = "wilcox"). Similarly, differential gene expression analysis was performed on normalized and scaled gene expression values between the two groups of cells (COVID-19 and healthy cohort) using the FindMarkers function from the Seurat R package (test.use = "MAST" and latent.vars = "batch"). Genes with log-fold change > 0.5 and adjusted p-value < 0.05 (method: Benjamini-Hochberg) were considered significant for further evaluation. The resulting adjusted p-value box plots were plotted using the ggplot2 R package (version 3.4.2) after computing the number of cells expressing a given protein or gene in each sample.

Pathway enrichment analysis of DE genes was performed using the Metascape web server (version 3.5). The AUCell score and gene regulatory network analysis were performed using the pySCENIC pipeline (version 0.12.1). The gene regulatory network was reconstructed using the GRNBoost2 algorithm, and the list of human TFs (genome version: hg38) was obtained from the cisTarget database (https://resources.aertslab.org/cistarget). Cellular enrichment (aka AUCell) analysis, which measures the activity of TFs or gene signatures across all single cells, was performed using the aucell function in the pySCENIC Python library. The ggplot2 R package (version 3.4.2) was used for boxplot visualization. The differential gene co-expression analysis was performed using the scSFMnet R package. Circular plots were generated using the R package circlize (version 0.4.15).
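The significance thresholds quoted in the description (log-fold change > 0.5 and Benjamini-Hochberg adjusted p-value < 0.05) can be illustrated with a small base R sketch. This is not the authors' script: the data frame, its column names (gene, log_fc, p_val) and the values are hypothetical, and in the actual analysis the statistics would come from the Seurat FindMarkers output rather than a hand-made table.

```r
# Hypothetical per-gene results; column names and values are illustrative only.
de <- data.frame(
  gene   = c("geneA", "geneB", "geneC", "geneD"),
  log_fc = c(1.20, 0.30, 0.80, -0.90),
  p_val  = c(0.001, 0.002, 0.040, 0.0005),
  stringsAsFactors = FALSE
)

# Benjamini-Hochberg adjustment across all tested genes.
de$p_adj <- p.adjust(de$p_val, method = "BH")

# Thresholds quoted in the description: log-fold change > 0.5, adjusted p < 0.05.
significant <- de[de$log_fc > 0.5 & de$p_adj < 0.05, ]
print(significant$gene)
```

With these made-up values, geneA and geneC pass both thresholds. geneD is excluded because the filter as described is one-sided (log-fold change > 0.5), and geneB because its effect size is too small.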
Data from: MOJITOO: a fast and universal method for integration of...
explore.openaire.eu
Updated Mar 12, 2022
Cite
Mingbo Cheng; Zhijian Li; Ivan G. Costa (2022). MOJITOO: a fast and universal method for integration of multimodal single cell data [Dataset]. http://doi.org/10.5281/zenodo.6348128
Unique identifier
https://doi.org/10.5281/zenodo.6348128
Dataset updated
Mar 12, 2022
Authors
Mingbo Cheng; Zhijian Li; Ivan G. Costa
Description
MOJITOO benchmarking Seurat R objects.
Single cell T cell atlas
zenodo.org
bin, csv
Updated Jul 27, 2024
Cite
Kerry A Mullan (2024). Single cell T cell atlas [Dataset]. http://doi.org/10.5281/zenodo.12569981
Available download formats: bin, csv
Unique identifier
https://doi.org/10.5281/zenodo.12569981
Dataset updated
Jul 27, 2024
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Kerry A Mullan
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
The attached datasets comprise a merge of 12 high-quality single-cell T cell datasets that had both TCR-seq and gene expression (GEx) data. The Seurat object (supercluster_added_ID-240531.rds) contains ~500K cells with paired TCR-seq and GEx. We also included the original identifiers in Sup_Update_labels.csv. See our documentation at https://stegor.readthedocs.io/en/latest/ for how we processed the 12 datasets and decided on the current 47 T cell annotation models using scGate.

This is the accompanying dataset for the paper entitled ‘T cell receptor-centric approach to streamline multimodal single-cell data analysis’, which is currently available as a preprint (https://www.biorxiv.org/content/10.1101/2023.09.27.559702v2). Details on the origin of the datasets and the processing steps can be found there.

The purpose of this atlas, in both the full and down-sampled versions, is to aid in improving the interpretability of other T cell datasets. This can be done by adding either the down-sampled object, which contains up to 500 cells per annotation model, or all 12 datasets to your new sample. This dataset aims to improve the capacity to identify TCR-specific signatures by ensuring a well-covered background, which will improve the robustness of the FindMarkers function in the Seurat package.
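One way to picture the down-sampled object described above, with at most 500 cells per annotation, is the following base R sketch. It is an assumption about how such capping could be done, not the authors' procedure; the metadata table, the labels typeA/typeB/typeC and the cell counts are invented for illustration.

```r
set.seed(1)  # fixed seed so the sketch is reproducible

# Hypothetical cell metadata: one row per cell with its annotation label.
meta <- data.frame(
  cell       = paste0("cell", 1:1300),
  annotation = rep(c("typeA", "typeB", "typeC"), times = c(900, 350, 50)),
  stringsAsFactors = FALSE
)

max_cells <- 500  # cap per annotation, as stated in the description

# Keep every cell of an annotation when it has at most max_cells cells,
# otherwise draw max_cells of them at random.
keep <- unlist(lapply(split(meta$cell, meta$annotation), function(cells) {
  if (length(cells) > max_cells) sample(cells, max_cells) else cells
}), use.names = FALSE)

downsampled <- meta[meta$cell %in% keep, ]
print(table(downsampled$annotation))
```

Only the over-represented annotation (typeA) is reduced; the smaller groups are kept whole, which is what "up to 500 cells" implies.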
Data from: Large-scale single-cell RNA-seq characterizes neural stem cells...
figshare.com
txt
Updated Sep 5, 2023
Cite
Muhammad Junaid; Han Kyoung Choe; Eun Jeong Lee; Su Bin Lim (2023). Large-scale single-cell RNA-seq characterizes neural stem cells and progenitor cells in postnatal and young adult mouse hypothalamus [Dataset]. http://doi.org/10.6084/m9.figshare.21981251.v1
Available download formats: txt
Unique identifier
https://doi.org/10.6084/m9.figshare.21981251.v1
Dataset updated
Sep 5, 2023
Dataset provided by
Figshare (http://figshare.com/)
Authors
Muhammad Junaid; Han Kyoung Choe; Eun Jeong Lee; Su Bin Lim
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Background In recent years, studies have demonstrated that neurogenesis can also occur in the adult mammalian hypothalamus. Although the hypothalamus is a critical brain region that plays a vital role in regulating homeostatic and survival-related behaviors, there is still limited knowledge about its intrinsic mechanisms of development.

Our Goal Our goal is to identify and extensively characterize the cell-type-specific features during neurogenesis in the hypothalamic region of mice, from postnatal to young adult stages. We processed and analyzed publicly available scRNA-seq transcriptomic data obtained from the hypothalamic regions of mice, using a uniform and optimized informatics pipeline.

Methods We obtained 10 scRNA-seq datasets from independent studies from NCBI GEO, which were further processed using the Seurat package in R (v4.2.1). A standard informatics pipeline was applied to each dataset for pre-processing and cell clustering for cell-level metadata standardization (see R script). Further, pseudotime trajectory analysis using the Monocle3 Alpha package in R (v. 2.99.1) and the slingshot package in R (v. 2.6.0) was performed to examine developmental differentiation in the hypothalamus during adult neurogenesis. We also generated a connect-seq-derived scRNA-seq dataset, Validation.rds (barcodes.tsv.gz, features.tsv.gz, matrix.mtx.gz), of 1,533 cells from the whole hypothalamus and the bed nucleus of the stria terminalis (BNST) for validation analysis of our integrated dataset, by performing anchor-based mapping and label transfer and merging the integrated and validation datasets (refquery.rds).

Result Our integrated dataset (final_hypo_ann.rds) revealed 30 distinct cell types that encompass all major cell types found in the hypothalamic regions, including glial-like cells such as ependymal cells, tanycytes, astrocytes, oligodendrocytes, and intermediate progenitor cells (IPCs). We also explored gene expression and cellular differentiation in the hypothalamus across various stages of development.
Datasets accompanying scANANSE
data.niaid.nih.gov
zenodo.org
Updated Mar 13, 2023
Cite
Smits, J.G.A. (2023). Datasets accompanying scANANSE [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_7446266
Dataset updated
Mar 13, 2023
Dataset provided by
Arts, J.A.
Smits, J.G.A.
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
The preprocessed Seurat object and the two Scanpy objects that can be used to run the scANANSE pipeline.

Seurat object: preprocessed_PBMC.Rds

Scanpy objects: rna_PBMC.h5ad, atac_PBMC.h5ad

Additional raw data, used to construct the preprocessed objects, supplemented from:

https://cf.10xgenomics.com/samples/cell-arc/1.0.0/pbmc_granulocyte_sorted_10k/pbmc_granulocyte_sorted_10k_filtered_feature_bc_matrix.h5, https://cf.10xgenomics.com/samples/cell-arc/1.0.0/pbmc_granulocyte_sorted_10k/pbmc_granulocyte_sorted_10k_atac_fragments.tsv.gz, https://cf.10xgenomics.com/samples/cell-arc/1.0.0/pbmc_granulocyte_sorted_10k/pbmc_granulocyte_sorted_10k_atac_fragments.tsv.gz.tbi, https://atlas.fredhutch.org/data/nygc/multimodal/pbmc_multimodal.h5seurat
Data from: An atlas of transcribed enhancers across helper T cell diversity...
data.niaid.nih.gov
datadryad.org
zip
Updated Apr 22, 2024
Cite
Yasuhiro Murakawa; Akiko Oguchi; Shuichiro Komatsu; Akari Suzuki; Chikashi Terao; Kazuhiko Yamamoto (2024). An atlas of transcribed enhancers across helper T cell diversity for decoding human diseases [Dataset]. http://doi.org/10.5061/dryad.pk0p2ngwx
Available download formats: zip
Unique identifier
https://doi.org/10.5061/dryad.pk0p2ngwx
Dataset updated
Apr 22, 2024
Dataset provided by
RIKEN Center for Integrative Medical Sciences
Authors
Yasuhiro Murakawa; Akiko Oguchi; Shuichiro Komatsu; Akari Suzuki; Chikashi Terao; Kazuhiko Yamamoto
License
CC0 1.0 (https://spdx.org/licenses/CC0-1.0.html)
Description
Transcribed enhancer maps can reveal nuclear interactions underpinning each cell type and connect specific cell types to diseases. Using a 5′ single-cell RNA sequencing approach, we defined transcription start sites of enhancer RNAs and other classes of coding and non-coding RNAs in human CD4+ T cells, revealing cellular heterogeneity and differentiation trajectories. Integration of these datasets with single-cell chromatin profiles showed that active enhancers with bidirectional RNA transcription are highly cell type–specific, and disease heritability is strongly enriched in these enhancers. The resulting cell type–resolved multimodal atlas of bidirectionally transcribed enhancers, which we linked with promoters using fine-scale chromatin contact maps, enabled us to systematically interpret genetic variants associated with a range of immune-mediated diseases.

Methods

All experiments using human samples were approved by the ethical review committee of RIKEN [approval no. H30-9(13)]. Written informed consent was obtained from all donors. CD4+ T cells were isolated by the immunomagnetic negative selection method. Stained CD4+ T cells were sorted using a FACSAria IIu Cell Sorter (BD Biosciences). Human CD4+ T cells and FACS-sorted heterogenous populations were processed with a Chromium Next GEM Single Cell 5′ kit (10x Genomics). Libraries were sequenced on an Illumina NovaSeq 6000 sequencing platform using 2 × 150 bp paired-end sequencing. Multiome assay (10x Genomics) was performed according to the manufacturer’s instructions. Multiome libraries were pooled and sequenced as above with 10 cycles for i7 index and 24 cycles for i5 index. Micro-C libraries were generated using a Dovetail Micro-C Kit (Cantata Bio, Cat#21006) and were sequenced on an Illumina NovaSeq 6000 platform using 2 × 150 bp paired-end sequencing.
Chromium scRNA-seq, snRNA-seq, and CITE-seq data were processed using Cell Ranger Software version 5.0.1 (10x Genomics) and R package Seurat version 5 (4.9.9.9067). Multiome data were processed by Cell Ranger ARC version 2.0.0 (10x Genomics), Seurat version 5 (4.9.9.9067), and Signac version 1.10.0. scRNA-seq, snRNA-seq, and Multiome 3′ snRNA-seq data were integrated using canonical correlation analysis. snATAC peaks were identified from fragment files of each cluster using MACS2 version 2.2.6 with default settings as implemented in Signac version 1.10.0. For ReapTEC, paired-end reads were mapped again using STAR (STARsolo) to obtain reads with unencoded G, which was tagged as a soft-clipped G by STARsolo. Reads were deduplicated, and those with the barcodes of each cell type were extracted. A count file was generated for each transcription start site (TSS) using the “bamToBed” function in BEDTools version 2.30.0. TSS peaks were generated by merging TSSs located within 10 bp of each other. To identify btcEnhs, TSS peak pairs were detected using scripts provided at https://github.com/anderssonrobin/enhancers/blob/master/scripts/bidir_enhancers with minor modifications. Micro-C data were processed with the dovetail_tools pipeline (Cantata Bio). Chromatin loop contacts were identified by the HiCCUPS algorithm using the Juicer Tools package version 2.20.0 and the scale-space representation algorithm using the Mustache package. Loops were called at a 1-kb resolution with SCALE-normalized contact matrices for HiCCUPS and with ICE-normalized contact matrices for Mustache, and were filtered for an FDR < 0.05.
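The TSS-merging step described above (TSSs located within 10 bp of each other are merged into one peak) can be sketched in base R. The positions are hypothetical, and the sketch assumes a chaining interpretation in which a TSS joins a peak if it lies within 10 bp of the previous TSS on the same chromosome and strand; the authors' exact implementation is not given in the description.

```r
# Hypothetical TSS positions (bp) on one chromosome and strand.
tss <- c(100, 105, 112, 300, 309, 500)

tss <- sort(tss)

# Start a new peak whenever the gap to the previous TSS exceeds 10 bp.
peak_id <- cumsum(c(TRUE, diff(tss) > 10))

# Summarise each peak by its boundaries and the number of TSSs it contains.
peaks <- data.frame(
  start = as.vector(tapply(tss, peak_id, min)),
  end   = as.vector(tapply(tss, peak_id, max)),
  n_tss = as.vector(table(peak_id))
)
print(peaks)
```

Under the chaining assumption, positions 100, 105 and 112 form a single peak even though the first and last are 12 bp apart; a stricter definition would split them differently.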
Data connected with "Cell signaling pathways discovery from multi-modal...
zenodo.org
zip
Updated Feb 13, 2025
Cite
Claire Simpson; Ian Cossentino; Changhan He; Darya Orlova (2025). Data connected with "Cell signaling pathways discovery from multi-modal data" [Dataset]. http://doi.org/10.5281/zenodo.14775408
Available download formats: zip
Unique identifier
https://doi.org/10.5281/zenodo.14775408
Dataset updated
Feb 13, 2025
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Claire Simpson; Ian Cossentino; Changhan He; Darya Orlova
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Input data used to run Incytr as described in "Cell signaling pathways discovery from multi-modal data":

mc38_input_data:

pr_10, pr_14: total proteomics for 10 and 14 day conditions deconvoluted by cell type

ps_10, ps_14: pS phosphoproteomics for 10 and 14 day conditions deconvoluted by cell type

py_10, py_14: pY phosphoproteomics for 10 and 14 day conditions deconvoluted by cell type

kldata: Kinase Library predictions for observed phosphosites

mc38b.rds: Seurat object with scRNA-seq data

input_gene_list: markers and highly expressed genes in transcriptomics data

5xfad_input_data:

processed_pr_5X_v2, processed_pr_WT_v2: total proteomics for 5X and WT conditions deconvoluted by cell type

processed_ps_5X_v2, processed_ps_WT_v2: pS phosphoproteomics for 5X and WT conditions deconvoluted by cell type

processed_py_5X_v2, processed_py_WT_v2: pY phosphoproteomics for 5X and WT conditions deconvoluted by cell type

kldata: Kinase Library predictions for observed phosphosites

transcriptomics is available from Zhou et al. (2020)

covid_input_data:

covid_bl_t_2024.rds: Seurat object with scRNA-seq data (T cells from BL group)

additional_input_genes: additional genes used as input to Incytr besides those indicated through high expression in the transcriptomic data

First cohort data available at Gene Expression Omnibus (GSE186267)
Data from: Phospho-seq: Integrated, multi-modal profiling of intracellular...
zenodo.org
application/gzip, bin
Updated Nov 22, 2023
Cite
John D Blair; Austin Hartman; Fides Zenk; Carol Dalgarno; Barbara Treutlein; Rahul Satija; Philipp Wahle; Giovanna Brancati (2023). Phospho-seq: Integrated, multi-modal profiling of intracellular protein dynamics in single cells [Dataset]. http://doi.org/10.5281/zenodo.10120360
Available download formats: bin, application/gzip
Unique identifier
https://doi.org/10.5281/zenodo.10120360
Dataset updated
Nov 22, 2023
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
John D Blair; Austin Hartman; Fides Zenk; Carol Dalgarno; Barbara Treutlein; Rahul Satija; Philipp Wahle; Giovanna Brancati
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description

Datasets to go along with the publication listed:
full_object.rds: Brain Organoid Phospho-Seq dataset with ATAC, Protein and imputed RNA data
rna_object.rds: Reference whole cell scRNA-Seq object on Brain organoids
multiome_object.rds: Bridge dataset containing RNA and ATAC modalities for Brain organoids
metacell_allnorm.rds: Metacell object for finding gene-peak-protein linkages in Brain organoid dataset
fullobject_fragments.tsv.gz: fragment file to go with the full object
fullobject_fragments.tsv.gz.tbi: index file for the full object fragment file
multiome_fragments.tsv.gz: fragment file to go with the multiome object
multiome_fragments.tsv.gz.tbi: index file for the multiome object fragment file
K562_Stem.rds: object corresponding to the pilot experiment including K562 cells and iPS cells
K562_stem_fragments.tsv.gz: fragment file to go with the K562_stem object
K562_stem_fragments.tsv.gz.tbi: index file for the K562_stem object fragment file
retina.rds: object corresponding to the retinal organoid phospho-seq experiment
retina_fragments.tsv.gz: fragment file to go with the retina object
retina_fragments.tsv.gz.tbi: index file for the retina object fragment file
retina_multi.rds: object corresponding to the retinal organoid phospho-seq-multiome experiment
retina_multi_fragments.tsv.gz: fragment file to go with the retina_multi object
retina_multi_fragments.tsv.gz.tbi: index file for the retina_multi object fragment file
To use the K562, multiome, retina and retina_multiome datasets provided, please use these lines of code to import the object into Signac/Seurat and change the fragment file path to the corresponding downloaded fragment file:
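# load the packages that provide Fragments(), CreateFragmentObject() and Cells()
library(Seurat)
library(Signac)
# load the object
obj <- readRDS("obj.rds")
# remove fragment file information
Fragments(obj) <- NULL
# update the path of the fragment file
Fragments(obj) <- CreateFragmentObject(path = "download/obj_fragments.tsv.gz", cells = Cells(obj))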
To use the "fullobject" dataset provided, please use these lines of code to import the object into Signac/Seurat and change the fragment file path to the corresponding downloaded fragment file:
# load the required packages (stringr for str_remove; Seurat and Signac for the object accessors)
library(stringr)
library(Seurat)
library(Signac)
# load the object
obj <- readRDS("obj.rds")
# remove fragment file information
Fragments(obj) <- NULL
# remove unwanted residual information and rename cells
obj@reductions$norm.adt.pca <- NULL
obj@reductions$norm.pca <- NULL
obj <- RenameCells(obj, new.names = str_remove(Cells(obj), "atac_"))
# update the path of the fragment file
Fragments(obj) <- CreateFragmentObject(path = "download/obj_fragments.tsv.gz", cells = Cells(obj))
DataSheet1_Benchmarking automated cell type annotation tools for single-cell...
frontiersin.figshare.com
docx
Updated Jun 21, 2023
Cite
Yuge Wang; Xingzhi Sun; Hongyu Zhao (2023). DataSheet1_Benchmarking automated cell type annotation tools for single-cell ATAC-seq data.docx [Dataset]. http://doi.org/10.3389/fgene.2022.1063233.s001
Available download formats: docx
Unique identifier
https://doi.org/10.3389/fgene.2022.1063233.s001
Dataset updated
Jun 21, 2023
Dataset provided by
Frontiers
Authors
Yuge Wang; Xingzhi Sun; Hongyu Zhao
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
As single-cell chromatin accessibility profiling methods advance, scATAC-seq has become ever more important in the study of candidate regulatory genomic regions and their roles underlying developmental, evolutionary, and disease processes. At the same time, cell type annotation is critical in understanding the cellular composition of complex tissues and identifying potential novel cell types. However, most existing methods that can perform automated cell type annotation are designed to transfer labels from an annotated scRNA-seq data set to another scRNA-seq data set, and it is not clear whether these methods are adaptable to annotate scATAC-seq data. Several methods have been recently proposed for label transfer from scRNA-seq data to scATAC-seq data, but there is a lack of benchmarking study on the performance of these methods. Here, we evaluated the performance of five scATAC-seq annotation methods on both their classification accuracy and scalability using publicly available single-cell datasets from mouse and human tissues including brain, lung, kidney, PBMC, and BMMC. Using the BMMC data as basis, we further investigated the performance of these methods across different data sizes, mislabeling rates, sequencing depths and the number of cell types unique to scATAC-seq. Bridge integration, which is the only method that requires additional multimodal data and does not need gene activity calculation, was overall the best method and robust to changes in data size, mislabeling rate and sequencing depth. Conos was the most time and memory efficient method but performed the worst in terms of prediction accuracy. scJoint tended to assign cells to similar cell types and performed relatively poorly for complex datasets with deep annotations but performed better for datasets only with major label annotations. 
The performance of scGCN and Seurat v3 was moderate, but scGCN was the most time-consuming method and had the most similar performance to random classifiers for cell types unique to scATAC-seq.
Datasets and supplemental information accompanying scANANSE
zenodo.org
application/gzip, bin, pdf
Updated Jul 12, 2024
Cite
J.A. Arts; J.G.A. Smits; S. Frölich; R.R. Snabel; B.M.A. Heuts; J.H.A. Martens; S.J. Van Heeringen; J.H. Zhou (2024). Datasets and supplemental information accompanying scANANSE [Dataset]. http://doi.org/10.5281/zenodo.7575107
Available download formats: pdf, bin, application/gzip
Unique identifier
https://doi.org/10.5281/zenodo.7575107
Dataset updated
Jul 12, 2024
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
J.A. Arts; J.G.A. Smits; S. Frölich; R.R. Snabel; B.M.A. Heuts; J.H.A. Martens; S.J. Van Heeringen; J.H. Zhou
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
This repository contains supplementary files from the paper: "scANANSE: gene regulatory network and motif analysis of single-cell clusters" as well as several accompanying datasets.

The supplementary files:

- Install_Rstudio.pdf

- AnanseScanpy_equivalent.pdf

The pre-processed Seurat object and the two pre-processed Scanpy objects that can be used to run the scANANSE pipeline:

- preprocessed_PBMC.Rds

- rna_PBMC.h5ad

- atac_PBMC.h5ad

Additional raw data, used to construct the pre-processed objects, supplemented from:

https://cf.10xgenomics.com/samples/cell-arc/1.0.0/pbmc_granulocyte_sorted_10k/pbmc_granulocyte_sorted_10k_filtered_feature_bc_matrix.h5, https://cf.10xgenomics.com/samples/cell-arc/1.0.0/pbmc_granulocyte_sorted_10k/pbmc_granulocyte_sorted_10k_atac_fragments.tsv.gz, https://cf.10xgenomics.com/samples/cell-arc/1.0.0/pbmc_granulocyte_sorted_10k/pbmc_granulocyte_sorted_10k_atac_fragments.tsv.gz.tbi,
https://atlas.fredhutch.org/data/nygc/multimodal/pbmc_multimodal.h5seura
TCRscape: A Single-Cell Multi-Omic TCR Profiling Toolkit
zenodo.org
bin, text/x-python, zip
Updated Apr 25, 2025
Cite
Roman Perik-Zavodskii; Olga Perik-Zavodskaia; Marina Volynets; Saleh Alrhmoun (2025). TCRscape: A Single-Cell Multi-Omic TCR Profiling Toolkit [Dataset]. http://doi.org/10.5281/zenodo.15280600
Available download formats: zip, bin, text/x-python
Unique identifier
https://doi.org/10.5281/zenodo.15280600
Dataset updated
Apr 25, 2025
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Roman Perik-Zavodskii; Olga Perik-Zavodskaia; Marina Volynets; Saleh Alrhmoun
License
Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Time period covered
Apr 25, 2025
Description
Single-cell multi-omics has transformed T-cell biology by enabling the simultaneous analysis of T-cell receptor (TCR) sequences, transcriptomes, and surface proteins at the resolution of individual cells. These capabilities are critical for identifying antigen-specific T-cells and accelerating the development of TCR-based immunotherapies. Here, we present TCRscape, an open-source Python 3 tool tailored for BD Rhapsody™ data, which performs high-resolution TCR clonotype discovery and quantification. TCRscape integrates full-length TCR sequence data with gene expression profiles and V(D)J gene segment usage to enable multimodal clustering of αβ and γδ T-cell populations. It outputs Seurat-compatible matrices, facilitating downstream visualization and analysis in standard single-cell analysis environments. By bridging clonotype detection with immune cell transcriptome and proteome profiling, TCRscape supports rapid identification of dominant T-cell clones and their functional phenotypes, offering a powerful resource for immune monitoring and TCR-engineered therapeutic development.
Abhinav Kaushik; Kari Nadeau (2024). Single cell multiomic analysis identifies key genes differentially expressed in innate lymphoid cells from COVID-19 patients [Dataset]. http://doi.org/10.5061/dryad.8931zcrz4

Data from: Single cell multiomic analysis identifies key genes differentially expressed in innate lymphoid cells from COVID-19 patients

Explore at:

Available download formats: zip

Unique identifier

https://doi.org/10.5061/dryad.8931zcrz4

Dataset updated

Jul 2, 2024

Dataset provided by

National Institute of Allergy and Infectious Diseases (http://www.niaid.nih.gov/)

Authors

Abhinav Kaushik; Kari Nadeau

License

https://spdx.org/licenses/CC0-1.0.html

Description

Innate lymphoid cells (ILCs) are enriched at mucosal surfaces where they respond rapidly to environmental stimuli and contribute to both tissue inflammation and healing. To gain insight into the role of ILCs in the pathology and recovery from COVID-19 infection, we employed a multi-omic approach consisting of Abseq and targeted mRNA sequencing to respectively probe the surface marker expression, transcriptional profile and heterogeneity of ILCs in peripheral blood of patients with COVID-19 compared with healthy controls. We found that the frequency of ILC1 and ILC2 cells was significantly increased in COVID-19 patients. Moreover, all ILC subsets displayed a significantly higher frequency of CD69-expressing cells, indicating a heightened state of activation. ILC2s from COVID-19 patients had the highest number of significantly differentially expressed (DE) genes. The most notable genes DE in COVID-19 vs healthy participants included a) genes associated with responses to virus infections and b) genes that support ILC self-proliferation, activation and homeostasis. In addition, differential gene regulatory network analysis revealed ILC-specific regulons and their interactions driving the differential gene expression in each ILC. Overall, this study provides mechanistic insights into the characteristics of ILC subsets activated during COVID-19 infection.

Methods

Study participants, blood draws and processing

Participants were recruited as described previously from adults who had a positive SARS-CoV-2 RT-PCR test at Stanford Health Care (NCT04373148). Collection of COVID samples occurred between May and December 2020. The cohort used in this study consisted of asymptomatic (n=2), mild (n=17), and moderate (n=3) COVID-19 infections, some of whom developed long term COVID-19 (n=15). The clinical case severities at the time of diagnosis were defined as asymptomatic, moderate or mild according to the guidelines released by NIH.
Long term (LT) COVID was defined as symptoms occurring 30 or more days after infection, consistent with CDC guidelines. Some participants in our study continued to have LT COVID symptoms 90 days after diagnosis (n=12). Participants with an NIH severity diagnosis of severe or critical at the time of their positive COVID test were excluded from the COVID sample study. Samples selected for this study were obtained within 76 days of the positive COVID-19 PCR test date. Healthy controls were selected from participants whose samples were collected before 2020. Informed consent was obtained from all participants. All protocols were approved by the Stanford Administrative Panel on Human Subjects in Medical Research. Peripheral blood was drawn by venipuncture and, using validated and published procedures, peripheral blood mononuclear cells (PBMCs) were isolated by Ficoll-based density gradient centrifugation, frozen in aliquots and stored in liquid nitrogen at -80°C until thawing. A summary of participant demographics is presented in Supp. Table 1.
ILC Enrichment, single cell captures for Abseq and targeted mRNAseq

Participant PBMCs were thawed, and each sample was stained with Sample Tag (BD #633781) at room temperature for 20 minutes. Samples were combined in healthy control or COVID-19 tubes. Cells were surface stained with a panel of fluorochrome-conjugated antibodies (Supp. Table 2) in buffer (PBS with 0.25% BSA and 1 mM EDTA) for 20 minutes at room temperature prior to immunomagnetic negative selection for ILCs. Following ILC enrichment using the EasySep human Pan-ILC enrichment kit (StemCell Technologies #17975), cells from healthy and COVID-19 recovered participants were counted and normalized before combining. ILCs were sorted using a BD FACS Aria at the Stanford FACS facility prior to incubation with AbSeq oligo-linked mAbs (Supp. Table 3). Sorted cells were processed by the Stanford Human Immune Monitoring Center (HIMC) using the BD Rhapsody platform. The library was prepared using the BD Immune Response Targeting Panel (BD Kit #633750) with the addition of custom gene panel reagents (Supp. Table 4) and sequenced on an Illumina NovaSeq 6000 at the Stanford Genomics Sequencing Center (SGSC). ILCs were identified as Lineage-neg (CD3-neg, CD14-neg, CD34-neg, CD19-neg), NKG2A-neg, CD45+, further defined as CD127+CD161+, and divided into subsets: ILC1 (CD117-neg CRTH2-neg), ILC2 (CRTH2+) and ILCp (CD117+ CRTH2-neg) (Supp. Fig. 1).

Computational data analysis

The above multi-modal setup allowed paired measurements of cellular transcriptome and cell surface protein abundance. The ILC1, ILC2 and ILCp cells were manually gated based on the abundance profile of CD127, CD117, CD161 and CRTH2 (Supp. Fig. 1). Before the integrative analysis, the complete multi-modal single cell dataset containing ILC subsets was converted into a single Seurat object. All the subsequent protein-level and gene-level analyses were performed using the multimodal data analysis pipeline of the Seurat R package version 4.0.
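The subset definitions above can be written as a small decision rule. This is only a sketch of the stated marker logic, not the authors' gating procedure: it assumes each marker has already been thresholded into positive or negative (the cutoffs are not given here), and that cells have already passed the Lineage-neg, NKG2A-neg, CD45+ selection. The function name `classify_ilc` is hypothetical.

```python
def classify_ilc(cd127, cd161, cd117, crth2):
    """Assign an ILC subset from boolean marker positivity.

    Assumes the cell is already Lineage-neg, NKG2A-neg and CD45+.
    Returns None for cells that are not CD127+CD161+.
    """
    if not (cd127 and cd161):
        return None        # outside the CD127+CD161+ ILC gate
    if crth2:
        return "ILC2"      # CRTH2+
    if cd117:
        return "ILCp"      # CD117+ CRTH2-neg
    return "ILC1"          # CD117-neg CRTH2-neg


labels = [
    classify_ilc(True, True, False, False),
    classify_ilc(True, True, False, True),
    classify_ilc(True, True, True, False),
    classify_ilc(False, True, True, False),
]
```

Checking CRTH2 first means a CD117+CRTH2+ cell is labeled ILC2, which follows the definition of ILC2 as simply CRTH2+.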
The normalized and scaled protein abundance profile was used for estimating the integrated harmony dimensions using the runHarmony function in the Seurat R package (reduction = 'apca' and group.by.vars = 'batch'). The batch-corrected harmony embeddings were then used for computing the Uniform Manifold Approximation and Projection (UMAP) dimensions to visualize the clusters of ILC subsets. Differential marker analysis of surface proteins from the Abseq panels, between two groups of cells (COVID-19 and Healthy cohort), was computed on normalized and scaled expression values using the FindMarkers function from the Seurat R package (test.use = 'wilcox'). Similarly, differential gene expression analysis was performed on normalized and scaled gene expression values between the two groups of cells (COVID-19 and Healthy cohort) using the FindMarkers function from the Seurat R package (test.use = 'MAST' and latent.vars = 'batch'). Genes with log-fold change > 0.5 and adjusted p-value < 0.05 (method: Benjamini-Hochberg) were considered significant for further evaluation. Box plots with the resulting adjusted p-values were plotted using the ggplot2 R package (version 3.4.2) after computing the number of cells expressing a given protein or gene in each sample. Pathway enrichment analysis of DE genes was performed using the Metascape web server (version 3.5). The AUCell score and gene regulatory network analysis were performed using the pySCENIC pipeline (version 0.12.1). The gene regulatory network was reconstructed using the GRNBoost2 algorithm, and the list of human transcription factors (TFs) (genome version: hg38) was obtained from the cisTarget database (https://resources.aertslab.org/cistarget). Cellular enrichment (aka AUCell) analysis, which measures the activity of TFs or gene signatures across all single cells, was performed using the aucell function in the pySCENIC Python library. The ggplot2 R package (version 3.4.2) was used for boxplot visualization. The differential gene co-expression analysis was performed using the scSFMnet R package.
Circular plots were generated using the R package circlize (version 0.4.15).
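To make the significance criterion stated in the methods concrete (log-fold change > 0.5 and Benjamini-Hochberg adjusted p-value < 0.05), here is a minimal pure-Python sketch. It is not the Seurat implementation: the function names, gene labels and input values are invented for illustration, and the log-fold-change cutoff is applied as written, without taking an absolute value, since the description does not say whether one was intended.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def significant_genes(results, lfc_cutoff=0.5, alpha=0.05):
    """results maps gene -> (log-fold change, raw p-value)."""
    genes = list(results)
    padj = benjamini_hochberg([results[g][1] for g in genes])
    return [g for g, q in zip(genes, padj)
            if results[g][0] > lfc_cutoff and q < alpha]


# Invented toy values: (log-fold change, raw p-value) per hypothetical gene.
toy_results = {
    "GENE_A": (1.2, 0.01),
    "GENE_B": (0.3, 0.04),
    "GENE_C": (0.8, 0.03),
    "GENE_D": (-0.9, 0.005),
}
selected = significant_genes(toy_results)
```

In this toy input, GENE_B fails the fold-change cutoff and GENE_D is excluded because its fold change is negative, even though both pass the adjusted p-value threshold.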