24 datasets found
  1. Data from: Single cell multiomic analysis identifies key genes...

    • data.niaid.nih.gov
    • search.dataone.org
    zip
    Updated Jul 2, 2024
    Cite
    Abhinav Kaushik; Kari Nadeau (2024). Single cell multiomic analysis identifies key genes differentially expressed in innate lymphoid cells from COVID-19 patients [Dataset]. http://doi.org/10.5061/dryad.8931zcrz4
    Dataset provided by
    National Institute of Allergy and Infectious Diseases (http://www.niaid.nih.gov/)
    Authors
    Abhinav Kaushik; Kari Nadeau
    License

    CC0-1.0 (https://spdx.org/licenses/CC0-1.0.html)

    Description

    Innate lymphoid cells (ILCs) are enriched at mucosal surfaces where they respond rapidly to environmental stimuli and contribute to both tissue inflammation and healing. To gain insight into the role of ILCs in the pathology and recovery from COVID-19 infection, we employed a multi-omic approach consisting of Abseq and targeted mRNA sequencing to respectively probe the surface marker expression, transcriptional profile and heterogeneity of ILCs in peripheral blood of patients with COVID-19 compared with healthy controls. We found that the frequency of ILC1 and ILC2 cells was significantly increased in COVID-19 patients. Moreover, all ILC subsets displayed a significantly higher frequency of CD69-expressing cells, indicating a heightened state of activation. ILC2s from COVID-19 patients had the highest number of significantly differentially expressed (DE) genes. The most notable genes DE in COVID-19 vs healthy participants included a) genes associated with responses to virus infections and b) genes that support ILC self-proliferation, activation and homeostasis. In addition, differential gene regulatory network analysis revealed ILC-specific regulons and their interactions driving the differential gene expression in each ILC. Overall, this study provides mechanistic insights into the characteristics of ILC subsets activated during COVID-19 infection. Methods Study participants, blood draws and processing Participants were recruited as described previously from adults who had a positive SARS-COV-2 RT-PCR test at Stanford Health Care (NCT04373148). Collection of Covid samples occurred between May to December 2020. The cohort used in this study consisted of asymptomatic (n=2), mild (n=17), and moderate (n=3) COVID-19 infections, some of whom developed long term COVID-19 (n=15). The clinical case severities at the time of diagnosis were defined as asymptomatic, moderate or mild according to the guidelines released by NIH. 
Long-term (LT) COVID was defined as symptoms occurring 30 or more days after infection, consistent with CDC guidelines. Some participants in our study continued to have LT COVID symptoms 90 days after diagnosis (n=12). The exclusion criterion for the COVID sample study was an NIH severity diagnosis of severe or critical at the time of the positive COVID test. Samples selected for this study were obtained within 76 days of the positive COVID-19 PCR test date. Healthy controls were selected from participants whose samples were collected before 2020. Informed consent was obtained from all participants. All protocols were approved by the Stanford Administrative Panel on Human Subjects in Medical Research. Peripheral blood was drawn by venipuncture. Using validated and published procedures, peripheral blood mononuclear cells (PBMCs) were isolated by Ficoll-based density gradient centrifugation, frozen in aliquots and stored in liquid nitrogen at -80°C until thawing. A summary of participant demographics is presented in Supp. Table 1.
    ILC Enrichment, single cell captures for Abseq and targeted mRNAseq Participant PBMCs were thawed, and each sample stained with Sample Tag (BD #633781) at room temperature for 20 minutes. Samples were combined in healthy control or COVID-19 tubes. Cells were surface stained with a panel of fluorochrome-conjugated antibodies (Supp. Table 2) in buffer (PBS with 0.25% BSA and 1mM EDTA) for 20 minutes at room temperature prior to immunomagnetic negative selection for ILCs. Following ILC enrichment using the EasySep human Pan-ILC enrichment kit (StemCell Technologies #17975), cells from healthy and COVID-19 recovered participants were counted and normalized before combining. ILCs were sorted using a BD FACS Aria at the Stanford FACS facility prior to incubation with AbSeq oligo-linked mAbs (Supp. Table 3). Sorted cells were processed by the Stanford Human Immune Monitoring Center (HIMC) using the BD Rhapsody platform. Library was prepared using the BD Immune Response Targeting Panel (BD Kit #633750) with addition of custom gene panel reagents (Supp. Table 4) and sequenced on Illumina NovaSeq 6000 at Stanford Genomics Sequencing Center (SGSC). ILCs were identified as Lineageneg (CD3neg, CD14neg, CD34neg, CD19neg), NKG2Aneg, CD45+ and ILCs further defined as CD127+CD161+ and as subsets: ILC1 (CD117negCRTH2neg), ILC2 (CRTH2+) and ILCp (CD117+CRTH2neg) (Supp. Fig. 1). Computational data analysis The above multi-modal setup allowed paired measurements of cellular transcriptome and cell surface protein abundance. The ILC1, ILC2 and ILCp cells were manually gated based on the abundance profile of CD127, CD117, CD161 and CRTH2 (Supp. Fig. 1). Before the integrative analysis, the complete multi-modal single cell dataset containing ILC subsets was converted into single Seurat object. All the subsequent protein-level and gene-level analyses were performed using multimodal data analysis pipeline of Seurat R package version 4.0. 
The normalized and scaled protein abundance profile was used for estimating the integrated harmony dimensions using the runHarmony function in the Seurat R package (reduction = 'apca' and group.by.vars = 'batch'). The batch-corrected harmony embeddings were then used for computing the Uniform Manifold Approximation and Projection (UMAP) dimensions to visualize the clusters of ILC subsets. Differential marker analysis of surface proteins from the AbSeq panels, between two groups of cells (COVID-19 and healthy cohort), was computed on normalized and scaled expression values using the FindMarkers function from the Seurat R package (test.use = 'wilcox'). Similarly, differential gene expression analysis between the two groups of cells (COVID-19 and healthy cohort) was performed on normalized and scaled gene expression values using the FindMarkers function from the Seurat R package (test.use = 'MAST' and latent.vars = 'batch'). Genes with log-fold change > 0.5 and adjusted p-value < 0.05 (method: Benjamini-Hochberg) were considered significant for further evaluation. Box plots with the resulting adjusted p-values were plotted using the ggplot2 R package (version 3.4.2) after computing the number of cells expressing a given protein or gene in each sample. Pathway enrichment analysis of DE genes was performed using the Metascape web server (version 3.5). AUCell scoring and gene regulatory network analysis were performed using the pySCENIC pipeline (version 0.12.1). The gene regulatory network was reconstructed using the GRNBoost2 algorithm, and the list of human TFs (genome version: hg38) was obtained from the cisTarget database (https://resources.aertslab.org/cistarget). Cellular enrichment (AUCell) analysis, which measures the activity of TFs or gene signatures across all single cells, was performed using the aucell function in the pySCENIC Python library. The ggplot2 R package (version 3.4.2) was used for boxplot visualization. The differential gene co-expression analysis was performed using the scSFMnet R package. 
Circular plots were generated using the R package circlize (version 0.4.15).
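
    The significance rule described above (log-fold change > 0.5 and Benjamini-Hochberg adjusted p-value < 0.05) can be sketched in plain Python. This is a minimal illustration, not the authors' Seurat code: the function names, the tuple layout and the toy values are hypothetical, and the log-fold change cutoff is applied to the signed value exactly as the sentence is written.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def significant_genes(results, lfc_cutoff=0.5, alpha=0.05):
    """results: list of (gene, log_fold_change, raw_p_value) tuples."""
    padj = benjamini_hochberg([p for _, _, p in results])
    return [
        gene
        for (gene, lfc, _), q in zip(results, padj)
        if lfc > lfc_cutoff and q < alpha
    ]


# Hypothetical toy input, not values from the dataset.
toy = [("GENE_A", 1.2, 0.001), ("GENE_B", 0.3, 0.002),
       ("GENE_C", 0.9, 0.04), ("GENE_D", 0.7, 0.5)]
print(significant_genes(toy))
```

    In the toy input, GENE_C has a raw p-value below 0.05 but is dropped after adjustment, which is the reason for adjusting before thresholding.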

  2. 217 shared genes in DEGs related to human age.

    • plos.figshare.com
    xlsx
    Updated Nov 26, 2024
    Cite
    Rong He; Qiang Zhang; Limei Wang; Yiwen Hu; Yue Qiu; Jia Liu; Dingyun You; Jishuai Cheng; Xue Cao (2024). 217 shared genes in DEGs related to human age. [Dataset]. http://doi.org/10.1371/journal.pone.0311374.s004
    Dataset provided by
    PLOS ONE
    Authors
    Rong He; Qiang Zhang; Limei Wang; Yiwen Hu; Yue Qiu; Jia Liu; Dingyun You; Jishuai Cheng; Xue Cao
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    Objective: To guide animal experiments, we investigated the similarities and differences between humans and mice in aging and Alzheimer’s disease (AD) at the single-nucleus RNA sequencing (snRNA-seq) or single-cell RNA sequencing (scRNA-seq) level.

    Methods: Microglia cells were extracted from dataset GSE198323 of human post-mortem hippocampus. The distributions and proportions of microglia subpopulation cell numbers related to AD or age were compared between GSE198323 for humans and GSE127892 for mice. The Seurat R package and harmony R package were used for data analysis and batch effect correction. Differentially expressed genes (DEGs) were identified by the FindMarkers function with the MAST test. Comparative analyses between human and mouse were conducted on shared genes in DEGs associated with age and AD using various bioinformatics techniques. Genes in DEGs related to age and genes in DEGs related to AD were each analyzed. Cross-species analyses were conducted using orthologous genes. Comparative analyses of pseudotime between humans and mice were performed using Monocle2.

    Results: (1) Similarities: The proportion of microglial subpopulation Cell_APOE/Apoe shows consistent trends, whether in AD or normal control (NC) groups, in both humans and mice. The proportion of Cell_CX3CR1/Cx3cr1, representing homeostatic microglia, remains stable with age in NC groups across species. Tuberculosis and Fc gamma R-mediated phagocytosis pathways are shared in microglia responses to age and AD across species, respectively. (2) Differences: IL1RAPL1 and SPP1 as marker genes are more identifiable in human microglia compared to their mouse counterparts. Most genes of DEGs associated with age or AD exhibit different trends between humans and mice. Pseudotime analyses demonstrate varying cell density trends in microglial subpopulations, depending on age or AD across species.

    Conclusions: Mouse Apoe and Cell_Apoe may serve as proxies for studying human AD, while Cx3cr1 and Cell_Cx3cr1 are suitable for human aging studies. However, AD mouse models (App_NL_G_F) have limitations in studying human genes like IL1RAPL1 and SPP1 related to AD. Thus, mouse models cannot fully replace human samples for AD and aging research.

  3. Seurat objects for the manuscript Single-cell consequences of X-linked...

    • search.dataone.org
    Updated Aug 28, 2025
    Cite
    Peter Price; Alison Wright (2025). Seurat objects for the manuscript Single-cell consequences of X-linked meiotic drive in stalk-eyed flies [Dataset]. http://doi.org/10.5061/dryad.q573n5twb
    Dataset provided by
    Dryad Digital Repository
    Authors
    Peter Price; Alison Wright
    Description

    This dataset contains the R Seurat objects used to reproduce the single-cell RNA-seq analyses for the manuscript Single-cell consequences of X-linked meiotic drive in stalk-eyed flies. Testis tissue from eight male Teleopsis dalmanni (drive and standard genotypes) was dissociated and sequenced using the 10X Genomics Chromium platform. Sequencing reads were processed with Cell Ranger v7.2.0, and downstream filtering, doublet removal, integration, and clustering were performed in Seurat v5.1.0. The final dataset (seurat_final.RData) comprises 12,217 high-quality cells expressing 12,454 genes, with cell types identified using orthologous markers from Drosophila melanogaster. Provided files include the filtered integrated Seurat object and a final processed object with reclustered and annotated cell types. These resources enable full reproducibility of the analyses and support future exploration of testis cell populations in stalk-eyed flies.

    Dataset DOI: 10.5061/dryad.q573n5twb

    Description of the data and file structure

    Sequencing data from Price et al. (2025; 10.5061/dryad.zkh1893kb) were processed using Cell Ranger v7.2.0. First, a custom reference was built from the T. dalmanni reference genome using mkref. Using Cell Ranger's count function, FASTQ reads were then aligned against the custom index and counted, creating gene-by-cell count matrices. Data filtering and downstream analyses were performed using Seurat v5.1.0 in R v4.3.2. Cells in each sample were removed from the analysis if they expressed fewer than 200 features and had more than 20% mitochondrial expression. Count data for each sample were also filtered by keeping only genes with expression (counts > 1) in at least three cells. We used DoubletFinder v2.0.4 in R with default parameters to identify and remove doublets. O...
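
    The cell and gene filters described above can be sketched as follows. This is only an illustration under stated assumptions: the function name and the dict-of-dicts layout are hypothetical, a cell is assumed to be dropped when it fails either criterion (the usual reading of such thresholds), and the gene filter is assumed to run on the cells that survive the cell filter. The original analysis used Seurat in R.

```python
def filter_counts(counts, mito_genes, min_features=200, max_mito_frac=0.20,
                  min_cells=3):
    """counts: dict mapping cell -> dict mapping gene -> integer count."""
    kept_cells = {}
    for cell, genes in counts.items():
        total = sum(genes.values())
        n_features = sum(1 for c in genes.values() if c > 0)
        mito = sum(c for g, c in genes.items() if g in mito_genes)
        mito_frac = mito / total if total else 0.0
        # Assumption: failing either criterion removes the cell.
        if n_features >= min_features and mito_frac <= max_mito_frac:
            kept_cells[cell] = genes
    # Keep genes with counts > 1 in at least `min_cells` retained cells.
    support = {}
    for genes in kept_cells.values():
        for g, c in genes.items():
            if c > 1:
                support[g] = support.get(g, 0) + 1
    kept_genes = {g for g, n in support.items() if n >= min_cells}
    return {cell: {g: c for g, c in genes.items() if g in kept_genes}
            for cell, genes in kept_cells.items()}
```

    The defaults mirror the thresholds quoted in the description; real count matrices would be sparse matrices rather than nested dictionaries.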

  4. Data from: Continuous expression of TOX safeguards exhausted CD8 T cell...

    • data.niaid.nih.gov
    • search.dataone.org
    zip
    Updated Mar 14, 2025
    Cite
    Yinghui Jane Huang; John Wherry; Sasikanth Manne (2025). Continuous expression of TOX safeguards exhausted CD8 T cell epigenetic fate [Dataset]. http://doi.org/10.5061/dryad.8kprr4xx9
    Dataset provided by
    University of Pennsylvania
    Authors
    Yinghui Jane Huang; John Wherry; Sasikanth Manne
    License

    CC0-1.0 (https://spdx.org/licenses/CC0-1.0.html)

    Description

    CD8 T cell exhaustion is a major barrier limiting anti-tumor therapy. Though checkpoint blockade temporarily improves exhausted CD8 T cell (Tex) function, the underlying epigenetic landscape of Tex remains largely unchanged, preventing their durable “reinvigoration.” Whereas the transcription factor (TF) TOX has been identified as a critical initiator of Tex epigenetic programming, it remains unclear whether TOX plays an ongoing role in preserving Tex biology after cells commit to exhaustion. Here, we decoupled the role of TOX in the initiation versus maintenance of CD8 T cell exhaustion by temporally deleting TOX in established Tex. Induced TOX ablation in committed Tex resulted in apoptotic-driven loss of Tex, reduced expression of inhibitory receptors including PD-1, and a pronounced decrease in terminally differentiated subsets of Tex cells. Simultaneous gene expression and epigenetic profiling revealed a critical role for TOX in ensuring ongoing chromatin accessibility and transcriptional patterns for key Tex gene modules in committed Tex cells. Moreover, when exposed to effector-driving conditions, inducibly TOX-deleted established Tex acquired an altered chromatin landscape with increased accessibility at cytotoxic genes typically accessible in Teff cells, thus undergoing partial reprogramming into a more functional state. Together, these findings suggest that continuous TOX expression in established Tex acts as a durable epigenetic barrier to reinforce the Tex developmental fate by simultaneously maintaining Tex epigenetic commitment while restraining differentiation into Teff. Manipulation of TOX even after Tex establishment could therefore provide a therapeutic opportunity to rewire Tex biology in settings of chronic infection or cancer. The secondary goal of this dissertation was to develop a novel Tex fate-mapping mouse model driven to track the fate of developing Tex and manipulate Tex in a lineage-restricted fashion. 
Given the selectively high expression of TOX in Tex, compared to other peripheral non-Tex CD8 lineages, we used the Tox locus to drive this model (termed ToxTREx), which was engineered with a T2A-hmKO2-P2A-CreERT2 cassette knocked into the Tox locus after the last exon. We confirmed intact TOX function, hmKO2 reporter detection and TOX-driven Cre recombinase activity in this model and identified further optimization that will be necessary to improve Cre efficiency and specificity of this model for the Tex lineage. Nevertheless, the ToxTREx model could enable insightful studies that address existing and emerging questions in Tex ontogeny, differentiation and function. Methods: Cells from inducible-Cre (Rosa26CreERT2/+Toxfl/fl P14) mice, in which TOX was temporally deleted from mature populations of LCMV-specific exhausted T cells after establishment of chronic LCMV infection 5 days post infection, were subjected to a scRNA-seq and scATAC-seq co-assay; naive cells and WT cells were used as controls. The analysis followed a pipeline developed by Josephine Giles and vignettes published by the Satija and Stuart labs. Transcript count and peak accessibility matrices are deposited in GSE255042 and GSE255043. Seurat/Signac was used to process the scRNA-seq and scATAC-seq co-assay data. The processed Seurat/Signac object was subsequently used for downstream RNA and ATAC analyses as described below. DEGs between TOX WT and iKO cells within each subset were identified using FindMarkers (Seurat, Signac), with a log2-fold-change threshold of 0, using the SCT assay. DACRs were identified using FindMarkers with the "LR" test, a log2-fold-change threshold of 0.1, a min.pct of 0.05, and the number of counts included as a latent variable. DEGs and DACRs were filtered with a false discovery rate (FDR) of less than 0.05, using the Benjamini–Hochberg method to adjust P-values. Gene ontology (GO) analysis of DEGs used Metascape (https://metascape.org/) with all expressed genes as the background gene list. 
AddModuleScore (Seurat, Signac) was used to calculate per-cell gene set enrichment scores and peak set enrichment scores. Gene set enrichment analysis (GSEA) was performed and visualized using clusterProfiler (4.8.1) and enrichplot (1.20.0). Tex-prog and Tex-term transcriptional signatures were generated from the Giles et al dataset (NI 2022), using FindMarkers to identify DEGs between the "Exh-Prog" cluster and the "Exh-Term" and "Exh-TermGzma" clusters, filtered with a FDR of less than 0.05 and log2 fold-change threshold of 0.25. Peak sets of subset transition ACRs were identified using FindMarkers to perform pairwise comparisons between TOX WT Tex-prog and Tex-int, Tex-klr, and Tex-term respectively (using the "LR" test, with a min.pct of 0.05, included the number of counts as a latent variable and cut-off at top 5000 ACRs by log2 fold-change threshold). Peak sets of naïve, Teff, Tmem and Tex-specific ACRs were generated as previously described (Khan et al, Nature, 2019). Bedtools intersect was used to identify overlapping peaks between different datasets. Genome coverage tracks were generated in Signac using CoveragePlot, PeakPlot, and AnnotationPlot. Genomic locations of peaks were determined using ClosestFeature (Signac) and nearestTSS (edgeR), with promoter-TSS defined as -1kB to +100bp. The average local chromatin accessibility at each gene was determined using GeneActivity (Signac). TF motif enrichment was calculated using the JASPAR2020 function getMatrixSet (species 9606) and Signac functions CreateMotifMatrix, CreateMotifObject, and FindMotifs. Open peaks in the clusters of interest were used as background, using AccessiblePeaks and MatchRegionStats (Signac).
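
    The promoter-TSS definition above (-1 kb to +100 bp) can be illustrated with a small strand-aware check. This is a hedged sketch only: the function name and coordinate conventions are hypothetical, and treating any overlap between peak and window as a promoter hit is an assumption, since the description does not say whether overlap or peak midpoint was used.

```python
def in_promoter(peak_start, peak_end, tss, strand, upstream=1000, downstream=100):
    """Return True if a peak overlaps the promoter-TSS window around `tss`.

    Upstream and downstream are relative to the direction of transcription,
    so the window is mirrored for genes on the minus strand.
    """
    if strand == "+":
        win_start, win_end = tss - upstream, tss + downstream
    elif strand == "-":
        win_start, win_end = tss - downstream, tss + upstream
    else:
        raise ValueError("strand must be '+' or '-'")
    # Two closed intervals overlap when each starts before the other ends.
    return peak_start <= win_end and win_start <= peak_end
```

    For example, a peak at 9000-9200 counts as promoter-proximal for a plus-strand TSS at 10000 but not for a minus-strand TSS at the same position.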

  5. Data from: Dermomyotome-derived endothelial cells migrate to the dorsal...

    • data.niaid.nih.gov
    • data-staging.niaid.nih.gov
    zip
    Updated Oct 4, 2023
    Cite
    David Traver; Pankaj Sahai-Hernandez; Claire Pouget; Shai Eyal; Ondrej Svoboda; Jose Chacon; Lin Grimm; Tor Gjøen (2023). Dermomyotome-derived endothelial cells migrate to the dorsal aorta to support hematopoietic stem cell emergence [Dataset]. http://doi.org/10.6075/J0GB22J0
    Dataset provided by
    University of Oslo
    University of California, San Diego
    Authors
    David Traver; Pankaj Sahai-Hernandez; Claire Pouget; Shai Eyal; Ondrej Svoboda; Jose Chacon; Lin Grimm; Tor Gjøen
    License

    CC0-1.0 (https://spdx.org/licenses/CC0-1.0.html)

    Description

    Development of the dorsal aorta is a key step in the establishment of the adult blood-forming system since hematopoietic stem and progenitor cells (HSPCs) arise from ventral aortic endothelium in all vertebrate animals studied. Work in zebrafish has demonstrated that arterial and venous endothelial precursors arise from distinct subsets of lateral plate mesoderm. Here, we profile the transcriptome of the earliest detectable endothelial cells (ECs) during zebrafish embryogenesis to demonstrate that tissue-specific EC programs initiate much earlier than previously appreciated, by the end of gastrulation. Classic studies in the chick embryo showed that paraxial mesoderm generates a subset of somite-derived endothelial cells (SDECs) that incorporate into the dorsal aorta to replace HSPCs as they exit the aorta and enter circulation. We describe a conserved program in the zebrafish, where a rare population of endothelial precursors delaminates from the dermomyotome to incorporate exclusively into the developing dorsal aorta. Although SDECs lack hematopoietic potential, they act as a local niche to support the emergence of HSPCs from neighboring hemogenic endothelium. Thus, at least three subsets of ECs contribute to the developing dorsal aorta: vascular ECs, hemogenic ECs, and SDECs. Taken together, our findings indicate that the distinct spatial origins of endothelial precursors dictate different cellular potentials within the developing dorsal aorta. Methods Single-cell RNA sample preparation After FACS, total cell concentration and viability were ascertained using a TC20 Automated Cell Counter (Bio-Rad). Samples were then resuspended in 1XPBS with 10% BSA at a concentration between 800-3000 per ml. Samples were loaded on the 10X Chromium system and processed as per manufacturer’s instructions (10X Genomics). Single cell libraries were prepared as per the manufacturer’s instructions using the Single Cell 3’ Reagent Kit v2 (10X Genomics). 
Single cell RNA-seq libraries and barcode amplicons were sequenced on an Illumina HiSeq platform. Single-cell RNA sequencing analysis: The Chromium 3’ sequencing libraries were generated using the Chromium Single Cell 3’ Chip kit v3; the sequencing instrument is not recorded in the source description. The Illumina FASTQ files were used to generate filtered matrices using CellRanger (10X Genomics) with default parameters and imported into R for exploration and statistical analysis using the Seurat package (La Manno et al., 2018). Counts were normalized according to total expression, multiplied by a scale factor (10,000), and log-transformed. For cell cluster identification and visualization, gene expression values were also scaled according to highly variable genes after controlling for unwanted variation generated by sample identity. Cell clusters were identified based on UMAP of the first 14 principal components of PCA using Seurat’s FindClusters method, with the original Louvain algorithm and a resolution parameter value of 0.5. Cluster marker genes were identified with Seurat’s FindAllMarkers method. Only genes exhibiting a significant (adjusted p-value < 0.05) minimal average absolute log2-fold change of 0.2 between each of the clusters and the rest of the dataset were considered differentially expressed. To merge individual datasets and to remove batch effects, the Seurat v3 Integration and Label Transfer standard workflow (Stuart et al., 2019) was used.
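
    The normalization step described above (divide by total expression, multiply by 10,000, log-transform) can be sketched per cell as below. The function name and the dictionary input are hypothetical, and the use of a natural log of one plus the scaled value is an assumption based on how Seurat's LogNormalize method behaves, since the description only says "log-transformed".

```python
import math


def log_normalize(cell_counts, scale_factor=10_000):
    """Normalize one cell's counts: log1p(count / total * scale_factor)."""
    total = sum(cell_counts.values())
    if total == 0:
        # An empty cell has nothing to scale; return zeros rather than divide by 0.
        return {gene: 0.0 for gene in cell_counts}
    return {
        gene: math.log1p(count / total * scale_factor)
        for gene, count in cell_counts.items()
    }


# Hypothetical toy cell with two genes.
print(log_normalize({"gene_a": 1, "gene_b": 3}))
```

    Adding one before taking the log keeps zero counts at zero, so the sparsity of the matrix is preserved.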

  6. Visium Spatial and snRNA data of Brain section from Parkinson Mouse Model...

    • zenodo.org
    bin, csv, zip
    Updated Jun 5, 2025
    Cite
    Jaehyun Lee; Jaehyun Lee (2025). Visium Spatial and snRNA data of Brain section from Parkinson Mouse Model based on inducible expression of human a-syn constructs: 20-months + snRNA 23 months dataset [Dataset]. http://doi.org/10.5281/zenodo.14988055
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Jaehyun Lee; Jaehyun Lee
    License

    Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Using 23-month-old mice of a Parkinson mouse model based on inducible expression of human a-syn constructs, we produced a single-nucleus RNA dataset by cutting from 0 mm Bregma to -5 mm Bregma. The Chromium 3’ Single Cell Library Kit (10x Genomics) was used and sequencing was performed on a NovaSeq 6000. From the same model we also used 20-month-old mice with the Visium Spatial V1 platform (10x Genomics). Sequencing was performed on a NovaSeq 6000. Both were PE150.

    snRNA pipeline: For the alignment of reads, a custom reference was created by adding the sequences of the V1S/SV2 transgene and the Camk2a promoter to the mm10 mouse reference genome. Count matrices generated by cellranger count 7.1 were loaded into an AnnData object and processed using the Python-based framework Scanpy 1.10.2. Integration with R, where needed, was facilitated through the rpy2 package. Raw count matrices were corrected for ambient RNA contamination using the SoupX 1.6.2. To remove potential doublets, scDblFinder 1.18.0 was employed with a fixed seed (123). Nuclei with nUMI and nGenes values exceeding three median absolute deviations (MADs) from the median were excluded. Genes detected in fewer than five nuclei across the dataset were excluded. The resulting dataset was normalized via scanpy.pp.normalize_total and scanpy.pp.log1p. Highly variable genes were identified using the function scanpy.pp.highly_variable_genes with the Seurat v3 flavor, selecting the top 4,000 genes. Dimensionality reduction was performed using principal component analysis (PCA) and batch effects were corrected using the python-implemented version of Harmony via the function scanpy.external.pp.harmony_integrate. Harmony embeddings were then used to construct a k-nearest neighbor (kNN) graph with scanpy.pp.neighbors. Clustering was performed using Leiden clustering with standard parameters via the function scanpy.tl.leiden. Clusters were annotated using literature, the mousebrain.org, and markers identified via the FindConservedMarkers function in Seurat. First, neurons and non-neuronal cells were distinguished using mainly canonical markers, such as but not limited to Rbfox3 (neurons), Mbp (oligodendrocytes), Acsbg1 (astrocytes), Pdgfra (oligodendrocyte precursor cells), Inpp5d (microglia), Colec12 (vascular cells), and Ttr (choroid plexus cells). 
Neurons were further classified into Vglut1 (Slc17a7), Vglut2 (Slc17a6), GABA (Gad2), cholinergic (Scube1), and dopaminergic (Th) neurons. Vglut1 and GABA neurons were further annotated into subtypes based on subclustering and FindConservedMarkers markers.

    Visium spatial pipeline: Sequences were fiducially aligned to spots using Loupe Browser ver. 8. All aligned sequences were mapped using spaceranger count 3.0.1 with a custom reference that added sequences for the promoter and transgene (Camk2aTTA, V1S/SV2) to the mouse genome mm39. We filtered each sample of the Visium Spatial dataset using MAD-based filtering of the number of reads (nUMI), number of genes (nGene), and percentage of mitochondrial genes (percent.mt). A spot was filtered out if it was outside 3x the MAD value in at least two metrics. Filtered samples were merged into one Seurat 5.1.0 object, and normalized counts were obtained with the SCTransform function of Seurat. Integration was performed using Harmony 1.2.0 on 50 PCA embeddings, and clustering was done using Leiden clustering based on 30 harmony embeddings. Integrated clusters were visualized using the UMAP method. Samples that were not successfully integrated (based on similarity measures of the harmony embeddings) and showed high percent.mt or low nUMI levels compared to other samples were removed from subsequent analysis. A final integration and clustering were performed after filtering. Regions were first annotated based on a 0.1 resolution clustering to get high-level region annotation (Cortex, Hippocampus, Subcortex). Each high-level region was further annotated based on either more granular resolutions or subclustering. Marker genes from mousebrain.org and literature were used in combination with the Allen mouse brain atlas to obtain anatomically relevant annotations.
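
    The spot filter described above (drop a spot that lies outside 3x MAD in at least two metrics) can be sketched as follows. The function names and input layout are hypothetical, and the sketch assumes the raw, unscaled median absolute deviation measured from the median of each metric; the description does not state whether a scaling constant was applied.

```python
from statistics import median


def mad_bounds(values, n_mads=3.0):
    """Return (low, high) bounds at median +/- n_mads * MAD."""
    med = median(values)
    mad = median([abs(v - med) for v in values])
    return med - n_mads * mad, med + n_mads * mad


def spots_to_remove(metrics, n_mads=3.0, min_failed=2):
    """metrics: dict of metric name -> per-spot values, all in the same order.

    Returns the indices of spots that are outliers in at least `min_failed`
    metrics.
    """
    n_spots = len(next(iter(metrics.values())))
    failed = [0] * n_spots
    for values in metrics.values():
        low, high = mad_bounds(values, n_mads)
        for i, v in enumerate(values):
            if v < low or v > high:
                failed[i] += 1
    return [i for i, n in enumerate(failed) if n >= min_failed]
```

    Requiring two failed metrics is more lenient than the snRNA rule, which makes sense for spots whose single unusual metric may reflect tissue region rather than poor quality; that interpretation is an editorial reading, not a claim from the source.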

  7. Data from: Single cell RNA-seq analysis reveals that prenatal arsenic...

    • data.niaid.nih.gov
    • datadryad.org
    zip
    Updated Jun 1, 2020
    Cite
    Britton Goodale; Kevin Hsu; Kenneth Ely; Thomas Hampton; Bruce Stanton; Richard Enelow (2020). Single cell RNA-seq analysis reveals that prenatal arsenic exposure results in long-term, adverse effects on immune gene expression in response to Influenza A infection [Dataset]. http://doi.org/10.5061/dryad.vt4b8gtp6
    Dataset provided by
    Dartmouth College
    Dartmouth–Hitchcock Medical Center
    Authors
    Britton Goodale; Kevin Hsu; Kenneth Ely; Thomas Hampton; Bruce Stanton; Richard Enelow
    License

    CC0-1.0 (https://spdx.org/licenses/CC0-1.0.html)

    Description

    Arsenic exposure via drinking water is a serious environmental health concern. Epidemiological studies suggest a strong association between prenatal arsenic exposure and subsequent childhood respiratory infections, as well as morbidity from respiratory diseases in adulthood, long after systemic clearance of arsenic. We investigated the impact of exclusive prenatal arsenic exposure on the inflammatory immune response and respiratory health after an adult influenza A (IAV) lung infection. C57BL/6J mice were exposed to 100 ppb sodium arsenite in utero, and subsequently infected with IAV (H1N1) after maturation to adulthood. Assessment of lung tissue and bronchoalveolar lavage fluid (BALF) at various time points post IAV infection reveals greater lung damage and inflammation in arsenic exposed mice versus control mice. Single-cell RNA sequencing analysis of immune cells harvested from IAV infected lungs suggests that the enhanced inflammatory response is mediated by dysregulation of innate immune function of monocyte derived macrophages, neutrophils, NK cells, and alveolar macrophages. Our results suggest that prenatal arsenic exposure results in lasting effects on the adult host innate immune response to IAV infection, long after exposure to arsenic, leading to greater immunopathology. This study provides the first direct evidence that exclusive prenatal exposure to arsenic in drinking water causes predisposition to a hyperinflammatory response to IAV infection in adult mice, which is associated with significant lung damage.

    Methods

    Whole lung homogenate preparation for single cell RNA sequencing (scRNA-seq)

    Lungs were perfused with PBS via the right ventricle, harvested, and mechanically dissociated prior to straining through 70- and 30-µm filters to obtain a single-cell suspension. Dead cells were removed (annexin V EasySep kit, StemCell Technologies, Vancouver, Canada), and samples were enriched for cells of hematopoietic origin by magnetic separation using anti-CD45-conjugated microbeads (Miltenyi, Auburn, CA). Single-cell suspensions of 6 samples were loaded on a Chromium Single Cell system (10X Genomics) to generate barcoded single-cell gel beads in emulsion, and scRNA-seq libraries were prepared using Single Cell 3’ Version 2 chemistry. Libraries were multiplexed and sequenced on 4 lanes of a NextSeq 500 sequencer (Illumina) with 3 sequencing runs. Demultiplexing and barcode processing of raw sequencing data were conducted using Cell Ranger v. 3.0.1 (10X Genomics; Dartmouth Genomics Shared Resource Core). Reads were aligned to mouse (GRCm38) and influenza A virus (A/PR8/34, genome build GCF_000865725.1) genomes to generate unique molecular index (UMI) count matrices. Gene expression data have been deposited in the NCBI GEO database and are available under accession GSE142047.

    Preprocessing of single cell RNA sequencing (scRNA-seq) data

    Count matrices produced using Cell Ranger were analyzed in the R statistical working environment (version 3.6.1). Preliminary visualization and quality analysis were conducted using scran (v. 1.14.3, Lun et al., 2016) and scater (v. 1.14.1, McCarthy et al., 2017) to identify thresholds for cell quality and feature filtering. Sample matrices were imported into Seurat (v. 3.1.1, Stuart et al., 2019) and the percentages of mitochondrial, hemoglobin, and influenza A viral transcripts were calculated per cell. Cells with < 1000 or > 20,000 unique molecular identifiers (UMIs; low quality and doublets, respectively), fewer than 300 features (low quality), greater than 10% of reads mapped to mitochondrial genes (dying) or greater than 1% of reads mapped to hemoglobin genes (red blood cells) were filtered from further analysis. Total cells per sample after filtering ranged from 1895 to 2482; no significant difference in the number of cells was observed in arsenic vs. control. Data were then normalized using SCTransform (Hafemeister et al., 2019) and variable features identified for each sample. Integration anchors between samples were identified using canonical correlation analysis (CCA) and mutual nearest neighbors (MNNs), as implemented in Seurat v3 (Stuart et al., 2019), and used to integrate samples into a shared space for further comparison. This process enables identification of shared populations of cells between samples, even in the presence of technical or biological differences, while also allowing for non-overlapping populations that are unique to individual samples.
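The per-cell filtering thresholds described above can be illustrated with a minimal Python sketch. This is not the authors' R/Seurat code; the `CellQC` record, the `passes_qc` function, and the toy values are hypothetical names and data chosen for illustration, and only the thresholds come from the description.

```python
from dataclasses import dataclass


@dataclass
class CellQC:
    """Hypothetical per-cell quality metrics (names are illustrative)."""
    n_umi: int             # total UMIs detected in the cell
    n_features: int        # number of genes detected
    pct_mito: float        # percent of reads mapped to mitochondrial genes
    pct_hemoglobin: float  # percent of reads mapped to hemoglobin genes


def passes_qc(cell: CellQC) -> bool:
    """Apply the thresholds stated in the description."""
    if cell.n_umi < 1000 or cell.n_umi > 20000:  # low quality or doublet
        return False
    if cell.n_features < 300:                    # low quality
        return False
    if cell.pct_mito > 10.0:                     # dying cell
        return False
    if cell.pct_hemoglobin > 1.0:                # red blood cell
        return False
    return True


# Toy data, not taken from the dataset.
cells = [
    CellQC(5000, 1500, 3.2, 0.1),   # passes every threshold
    CellQC(800, 400, 2.0, 0.0),     # too few UMIs
    CellQC(6000, 2000, 14.5, 0.0),  # high mitochondrial fraction
    CellQC(4000, 1200, 1.0, 5.0),   # high hemoglobin fraction
]
kept = [c for c in cells if passes_qc(c)]
```

Cells sitting exactly on a threshold (for example 10% mitochondrial reads) are retained here, following the "greater than" and "fewer than" wording of the text.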

    Clustering and reference-based cell identity labeling of single immune cells from IAV-infected lung with scRNA-seq

    Principal components were identified from the integrated dataset and were used for Uniform Manifold Approximation and Projection (UMAP) visualization of the data in two-dimensional space. A shared-nearest-neighbor (SNN) graph was constructed using default parameters, and clusters were identified using the SLM algorithm in Seurat at a range of resolutions (0.2-2). The first 30 principal components were used to identify 22 cell clusters ranging in size from 25 to 2310 cells. Gene markers for clusters were identified with the findMarkers function in scran. To label individual cells with cell type identities, we used the SingleR package (v. 3.1.1) to compare gene expression profiles of individual cells with expression data from curated, FACS-sorted leukocyte samples in the ImmGen compendium (Aran et al., 2019; Heng et al., 2008). We manually updated the ImmGen reference annotation with 263 sample group labels for fine-grain analysis and 25 CD45+ cell type identities based on markers used to sort ImmGen samples (Guilliams et al., 2014). The reference annotation is provided in Table S2. Cells that were not labeled confidently after label pruning were assigned “Unknown”.

    Differential gene expression by immune cells

    Differential gene expression within individual cell types was assessed by pooling raw count data from cells of each cell type on a per-sample basis to create a pseudo-bulk count table for each cell type. Differential expression analysis was only performed on cell types that were sufficiently represented (>10 cells) in each sample. In droplet-based scRNA-seq, ambient RNA from lysed cells is incorporated into droplets and can result in spurious identification of ambient-derived genes in cell types where they are not actually expressed. We therefore used a method developed by Young and Behjati (Young et al., 2018) to estimate the contribution of ambient RNA for each gene, and identified genes in each cell type that were estimated to be > 25% ambient-derived. These genes were excluded from analysis in a cell-type-specific manner. Genes expressed in fewer than 5 percent of cells were also excluded from analysis. Differential expression analysis was then performed in limma (limma-voom with quality weights) following a standard protocol for bulk RNA-seq (Law et al., 2014). Significant genes were identified using MA/QC criteria of P < 0.05 and log2FC > 1.
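The pseudo-bulk pooling step can be sketched in plain Python as follows. This is an illustrative sketch only, not the authors' pipeline: the `pseudobulk` function, its input layout, and the toy counts are assumptions, and the ambient-RNA and limma-voom steps are not reproduced.

```python
from collections import defaultdict


def pseudobulk(cells, min_cells_exclusive=10):
    """Pool raw counts per (cell type, sample).

    cells: iterable of (sample_id, cell_type, {gene: raw_count}) tuples.
    Returns {cell_type: {sample_id: {gene: summed_count}}}, keeping only
    cell types with more than `min_cells_exclusive` cells in every sample.
    """
    sums = defaultdict(lambda: defaultdict(int))
    n_cells = defaultdict(int)
    samples = set()
    for sample_id, cell_type, counts in cells:
        samples.add(sample_id)
        n_cells[(cell_type, sample_id)] += 1
        for gene, count in counts.items():
            sums[(cell_type, sample_id)][gene] += count

    cell_types = {cell_type for cell_type, _ in n_cells}
    table = {}
    for cell_type in cell_types:
        # Require sufficient representation in each sample (>10 cells).
        if all(n_cells.get((cell_type, s), 0) > min_cells_exclusive for s in samples):
            table[cell_type] = {s: dict(sums[(cell_type, s)]) for s in samples}
    return table


# Toy input: NK cells are well represented in both samples, macrophages are not.
toy_cells = (
    [("A", "NK", {"Gzma": 2})] * 11
    + [("B", "NK", {"Gzma": 1})] * 11
    + [("A", "Mac", {"Lyz2": 5})] * 11
    + [("B", "Mac", {"Lyz2": 5})] * 5
)
table = pseudobulk(toy_cells)
```

Summing raw counts per sample, rather than treating each cell as a replicate, is what lets a bulk RNA-seq method such as limma-voom be applied afterwards.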

    Analysis of arsenic effect on immune cell gene expression by scRNA-seq.

    Sample-wide effects of arsenic on gene expression were identified by pooling raw count data from all cells per sample to create a count table for pseudo-bulk gene expression analysis. Genes with fewer than 20 counts in any sample, or fewer than 60 total counts, were excluded from analysis. Differential expression analysis was performed using limma-voom as described above.
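As a small illustration of the gene-level count filter, the sketch below reads "any sample" literally, so a gene is dropped if even one sample falls under the per-sample threshold. The function name, gene labels, and counts are hypothetical; only the two thresholds come from the description.

```python
def keep_gene(counts_per_sample, min_per_sample=20, min_total=60):
    """Keep a gene only if every sample has at least `min_per_sample`
    counts and the summed count is at least `min_total`."""
    return (min(counts_per_sample) >= min_per_sample
            and sum(counts_per_sample) >= min_total)


# Toy pseudo-bulk table with six samples per gene (values are made up).
pseudo_bulk = {
    "GeneA": [25, 30, 22, 41, 28, 35],  # passes both thresholds
    "GeneB": [25, 30, 5, 41, 28, 35],   # one sample below 20 counts
    "GeneC": [0, 0, 0, 0, 0, 0],        # not detected
}
kept_genes = [gene for gene, counts in pseudo_bulk.items() if keep_gene(counts)]
```

Under this literal reading the total-count threshold only matters when there are fewer than three samples, since three samples at 20 counts each already reach 60.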

  8. Development of Ferret Reference Resources and Profiling Assays.

    • data.niaid.nih.gov
    • dev.immport.org
    • +1more
    url
    Updated Mar 27, 2025
    Cite
    Xinxia Peng (2025). Development of Ferret Reference Resources and Profiling Assays. [Dataset]. http://doi.org/10.21430/M3KVP4YEVN
    Available download formats: url
    Dataset updated
    Mar 27, 2025
    Dataset provided by
    National Institute of Allergy and Infectious Diseases (http://www.niaid.nih.gov/)
    Authors
    Xinxia Peng
    License

    https://www.immport.org/agreement

    Description

    This study designed new ferret-specific immune repertoire profiling assays by targeting positions in constant regions without allelic diversity. Transcriptome sequencing of ferret splenocyte and lymph node samples was performed to obtain Ig and T cell receptor transcripts. These improved resources and assays enable further studies to capture ferret immune diversity.

  9. Pathways from KEGG enrichment analysis with genes of cluster1 in the heatmap...

    • plos.figshare.com
    xlsx
    Updated Nov 26, 2024
    Cite
    Rong He; Qiang Zhang; Limei Wang; Yiwen Hu; Yue Qiu; Jia Liu; Dingyun You; Jishuai Cheng; Xue Cao (2024). Pathways from KEGG enrichment analysis with genes of cluster1 in the heatmap for mice (Fig 7H). [Dataset]. http://doi.org/10.1371/journal.pone.0311374.s014
    Available download formats: xlsx
    Dataset updated
    Nov 26, 2024
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Rong He; Qiang Zhang; Limei Wang; Yiwen Hu; Yue Qiu; Jia Liu; Dingyun You; Jishuai Cheng; Xue Cao
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Pathways from KEGG enrichment analysis with genes of cluster1 in the heatmap for mice (Fig 7H).

  10. DEGs caused by Six3 and Six6 dual deficiency in combined clusters 3 and 9.

    • datasetcatalog.nlm.nih.gov
    • plos.figshare.com
    Updated Oct 24, 2024
    Cite
    Liu, Wei; Zheng, Deyou; Shrestha, Rupendra; Zhang, Xusheng; Ferrena, Alexander (2024). DEGs caused by Six3 and Six6 dual deficiency in combined clusters 3 and 9. [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0001362129
    Dataset updated
    Oct 24, 2024
    Authors
    Liu, Wei; Zheng, Deyou; Shrestha, Rupendra; Zhang, Xusheng; Ferrena, Alexander
    Description

    Related to Fig 2. These DEGs were identified using the function FindMarkers in Seurat when DKO_CrePos cells and control cells in combined clusters 3 and 9 were compared. (CSV)

  11. Pathways from KEGG enrichment analysis with genes of cluster1 in the heatmap...

    • plos.figshare.com
    xlsx
    Updated Nov 26, 2024
    Cite
    Rong He; Qiang Zhang; Limei Wang; Yiwen Hu; Yue Qiu; Jia Liu; Dingyun You; Jishuai Cheng; Xue Cao (2024). Pathways from KEGG enrichment analysis with genes of cluster1 in the heatmap for humans (Fig 7C). [Dataset]. http://doi.org/10.1371/journal.pone.0311374.s012
    Available download formats: xlsx
    Dataset updated
    Nov 26, 2024
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Rong He; Qiang Zhang; Limei Wang; Yiwen Hu; Yue Qiu; Jia Liu; Dingyun You; Jishuai Cheng; Xue Cao
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Pathways from KEGG enrichment analysis with genes of cluster1 in the heatmap for humans (Fig 7C).

  12. Multiplexed histology of COVID-19 post-mortem lung samples - Single-cell...

    • zenodo.org
    csv
    Updated Apr 19, 2023
    Cite
    Anna Pascual Reguant; Ronja Mothes; Helena Radbruch; Anja E. Hauser (2023). Multiplexed histology of COVID-19 post-mortem lung samples - Single-cell Mean Fluorescence Intensities [Dataset]. http://doi.org/10.5281/zenodo.7839928
    Available download formats: csv
    Dataset updated
    Apr 19, 2023
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Anna Pascual Reguant; Ronja Mothes; Helena Radbruch; Anja E. Hauser
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Data table containing single-cell mean fluorescence intensities (MFI) of all markers analyzed by multiplexed histology in all COVID-19 post-mortem lung samples and non-COVID-related pneumonia controls (14 lung samples, stratified based on disease duration into control, acute, chronic and prolonged). It contains information at the single-cell level about approximately 50 proteins in around 40,000 lung cells.

    Data shown has been arcsin(h) transformed with a co-factor of 0.2. Additionally, cells expressing less than 0.15 MFI of all markers have been labeled as non-defined and excluded from the data set.
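For readers unfamiliar with the transform, the sketch below shows the conventional cytometry form of the inverse hyperbolic sine transform, asinh(x / cofactor). Whether the authors divided by the cofactor in exactly this way is an assumption; the function name and input values are illustrative.

```python
import math


def arcsinh_transform(values, cofactor=0.2):
    """Inverse hyperbolic sine transform with a cofactor.

    Assumes the common convention asinh(x / cofactor); the description
    only states that a co-factor of 0.2 was used.
    """
    return [math.asinh(v / cofactor) for v in values]


# Toy intensities, not taken from the dataset.
raw_mfi = [0.0, 0.2, 1.0, 5.0]
transformed = arcsinh_transform(raw_mfi)
```

The transform is close to linear near zero and logarithmic for large values, which compresses bright outliers while keeping dim signals distinguishable.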

    Seurat package 4.0.0 was used in R to perform mean centering and scaling, followed by PCA, reducing the data to the top 11 principal components. UMAP was initialized in this PCA space to visualize the data on reduced UMAP dimensions. The cells were clustered in PCA space using the SNN algorithm implemented as FindNeighbors and FindClusters with n.epochs = 500 and default parameters (res = 0.8). We obtained 26 clusters, which we merged based on canonical lineage markers to obtain populations relevant to our analysis. This yielded 8 cell clusters that were manually annotated based on cell-type-specific markers found to be differentially expressed.

  13. Pathways from KEGG enrichment analysis with genes of cluster2 in the heatmap...

    • figshare.com
    • plos.figshare.com
    xlsx
    Updated Nov 26, 2024
    Cite
    Rong He; Qiang Zhang; Limei Wang; Yiwen Hu; Yue Qiu; Jia Liu; Dingyun You; Jishuai Cheng; Xue Cao (2024). Pathways from KEGG enrichment analysis with genes of cluster2 in the heatmap for mice (Fig 6I). [Dataset]. http://doi.org/10.1371/journal.pone.0311374.s011
    Available download formats: xlsx
    Dataset updated
    Nov 26, 2024
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Rong He; Qiang Zhang; Limei Wang; Yiwen Hu; Yue Qiu; Jia Liu; Dingyun You; Jishuai Cheng; Xue Cao
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Pathways from KEGG enrichment analysis with genes of cluster2 in the heatmap for mice (Fig 6I).

  14. Data from: Pre-ciliated tubal epithelial cells are prone to initiation of...

    • data.niaid.nih.gov
    • datadryad.org
    zip
    Updated Oct 17, 2024
    Cite
    Coulter Ralston; Alexander Nikitin; Benjamin Cosgrove (2024). Pre-ciliated tubal epithelial cells are prone to initiation of high-grade serous ovarian carcinoma [Dataset]. http://doi.org/10.5061/dryad.4mw6m90hm
    Available download formats: zip
    Dataset updated
    Oct 17, 2024
    Dataset provided by
    Cornell University
    Authors
    Coulter Ralston; Alexander Nikitin; Benjamin Cosgrove
    License

    https://spdx.org/licenses/CC0-1.0.html

    Description

    The distal region of the uterine (Fallopian) tube is commonly associated with high-grade serous carcinoma (HGSC), the predominant and most aggressive form of ovarian or extra-uterine cancer. Specific cell states and lineage dynamics of the adult tubal epithelium (TE) remain insufficiently understood, hindering efforts to determine the cell of origin for HGSC. Here, we report a comprehensive census of cell types and states of the mouse uterine tube. We show that distal TE cells expressing the stem/progenitor cell marker Slc1a3 can differentiate into both secretory (Ovgp1+) and ciliated (Fam183b+) cells. Inactivation of Trp53 and Rb1, whose pathways are commonly altered in HGSC, leads to elimination of targeted Slc1a3+ cells by apoptosis, thereby preventing their malignant transformation. In contrast, pre-ciliated cells (Krt5+, Prom1+, Trp73+) remain cancer-prone and give rise to serous tubal intraepithelial carcinomas and overt HGSC. These findings identify transitional pre-ciliated cells as a previously unrecognized cancer-prone cell state and point to pre-ciliation mechanisms as novel diagnostic and therapeutic targets.

    Methods

    Single-cell RNA-sequencing library preparation

    For TE single cell expression and transcriptome analysis we isolated TE from C57BL6 adult estrous female mice. In 3 independent experiments a total of 62 uterine tubes were collected. Each uterine tube was placed in sterile PBS containing 100 IU ml-1 of penicillin and 100 µg ml-1 streptomycin (Corning, 30-002-Cl), and separated into distal and proximal regions. Tissues from the same region were combined in a 40 µl drop of the same PBS solution, cut open lengthwise, and minced into 1.5-2.5 mm pieces with 25G needles. Minced tissues were transferred with the help of a sterile wide bore 200 µl pipette tip into a 1.8 ml cryo vial containing 1.2 ml A-mTE-D1 (300 IU ml-1 collagenase IV mixed with 100 IU ml-1 hyaluronidase; Stem Cell Technologies, 07912, in DMEM Ham’s F12, Hyclone, SH30023.FS). Tissues were incubated with a loose cap for 1 h at 37°C in a 5% CO2 incubator. During the incubation tubes were taken out 4 times and tissues suspended with a wide bore 200 µl pipette tip. At the end of incubation, the tissue-cell suspension from each tube was transferred into 1 ml TrypLE (Invitrogen, 12604013) pre-warmed to 37°C and suspended 70 times with a 1000 µl pipette tip; 5 ml A-SM [DMEM Ham’s F12 containing 2% fetal bovine serum (FBS)] was added to the mix, and TE cells were pelleted by centrifugation at 300 x g for 10 minutes at 25°C. Pellets were then suspended in 1 ml A-mTE-D2 pre-warmed to 37°C (7 mg ml-1 Dispase II, Worthington NPRO2, and 10 µg ml-1 Deoxyribonuclease I, Stem Cell Technologies, 07900), and mixed 70 times with a 1000 µl pipette tip. 5 ml A-mTE-D2 was added and samples were passed through a 40 µm cell strainer, and pelleted by centrifugation at 300 x g for 7 minutes at +4°C. Pellets were suspended in 100 µl microbeads per 10^7 total cells or fewer, and dead cells were removed with the Dead Cell Removal Kit (Miltenyi Biotec, 130-090-101) according to the manufacturer’s protocol. Pelleted live cell fractions were collected in 1.5 ml low binding centrifuge tubes, kept on ice, and suspended in ice cold 50 µl A-Ri-Buffer (5% FBS, 1% GlutaMAX-I, Invitrogen, 35050-079, 9 µM Y-27632, Millipore, 688000, and 100 IU ml-1 penicillin 100 μg ml-1 streptomycin in DMEM Ham’s F12). Cell aliquots were stained with trypan blue for live and dead cell calculation. Live cell preparations with a target cell recovery of 5,000-6,000 were loaded on a Chromium controller (10X Genomics, Single Cell 3’ v2 chemistry) to perform single cell partitioning and barcoding using the microfluidic platform device. After preparation of barcoded, next-generation sequencing cDNA libraries, samples were sequenced on an Illumina NextSeq500 System.

    Download and alignment of single-cell RNA sequencing data

    For sequence alignment, a custom reference for mm39 was built using the cellranger (v6.1.2, 10x Genomics) mkref function. The mm39.fa soft-masked assembly sequence and the mm39.ncbiRefSeq.gtf (release 109) genome annotation last updated 2020-10-27 were used to form the custom reference. The raw sequencing reads were aligned to the custom reference and quantified using the cellranger count function.

    Preprocessing and batch correction

    All preprocessing and data analysis was conducted in R (v.4.1.1 (2021-08-10)). The cellranger count outputs were first modified with the autoEstCont and adjustCounts functions from SoupX (v.1.6.1) to output a corrected matrix with the ambient RNA signal (soup) removed (https://github.com/constantAmateur/SoupX). To preprocess the corrected matrices, the Seurat (v.4.1.1) NormalizeData, FindVariableFeatures, ScaleData, RunPCA, FindNeighbors, and RunUMAP functions were used to create a Seurat object for each sample (https://github.com/satijalab/seurat). The number of principal components used to construct a shared nearest-neighbor graph was chosen to account for 95% of the total variance. To detect possible doublets, we used the package DoubletFinder (v.2.0.3) with inputs specific to each Seurat object. DoubletFinder creates artificial doublets and calculates the proportion of artificial k nearest neighbors (pANN) for each cell from a merged dataset of the artificial and actual data. To maximize DoubletFinder’s predictive power, the mean-variance normalized bimodality coefficient (BCMVN) was used to determine the optimal pK value for each dataset. To establish a threshold for pANN values to distinguish between singlets and doublets, the estimated multiplet rates for each sample were calculated by interpolating between the target cell recovery values according to the 10x Chromium user manual. Homotypic doublets were identified using unannotated Seurat clusters in each dataset with the modelHomotypic function. After doublets were identified, all distal and proximal samples were merged separately. Cells with greater than 30% mitochondrial genes, cells with fewer than 750 nCount_RNA, and cells with fewer than 200 nFeature_RNA were removed from the merged datasets. To correct for any batch effects between sample runs, we used the harmony (v.0.1.0) integration method (github.com/immunogenomics/harmony).
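One plausible way to pick the number of principal components that accounts for 95% of the variance is sketched below. It assumes "total variance" means the variance captured by the computed components, which the description does not state explicitly; the function name and the toy standard deviations are hypothetical.

```python
def n_pcs_for_variance(stdevs, target=0.95):
    """Return the smallest number of leading components whose cumulative
    variance reaches `target` of the variance across all given components.

    stdevs: per-component standard deviations, in decreasing order.
    """
    variances = [s * s for s in stdevs]
    total = sum(variances)
    cumulative = 0.0
    for n, variance in enumerate(variances, start=1):
        cumulative += variance
        if cumulative / total >= target:
            return n
    return len(variances)


# Toy standard deviations: variances are 9, 4, 1, 1 (total 15).
toy_stdevs = [3.0, 2.0, 1.0, 1.0]
n_95 = n_pcs_for_variance(toy_stdevs)        # needs all four components
n_85 = n_pcs_for_variance(toy_stdevs, 0.85)  # first two cover 13/15
```

The selected count would then be passed as the range of dimensions to the neighbor-graph and UMAP steps.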

    Clustering parameters and annotations

    After merging the datasets and batch correction, the dimensions reflecting 95% of the total variance were input into Seurat’s FindNeighbors function with a k.param of 70. Louvain clustering was then conducted using Seurat’s FindClusters with a resolution of 0.7. The resulting 19 clusters were annotated based on the expression of canonical genes and the results of differential gene expression (Wilcoxon Rank Sum test) analysis. One cluster expressing lymphatic and epithelial markers was omitted from later analysis as it only contained 2 cells suspected to be doublets. To better understand the epithelial populations, we reclustered 6 epithelial populations and reapplied harmony batch correction. The reclustering used a k.param of 50 for FindNeighbors and a resolution of 0.7 for FindClusters. The resulting 9 clusters within the epithelial subset were further annotated using differential expression analysis and canonical markers.

    Pseudotime analysis

    Potential of heat diffusion for affinity-based transition embedding (PHATE) is a dimensionality reduction method to more accurately visualize continual progressions found in biological data [35]. A modified version of Seurat (v4.1.1) was developed to include the ‘RunPHATE’ function for converting a Seurat Object to a PHATE embedding. This was built on the phateR package (v.1.0.7) (https://github.com/scottgigante/seurat/tree/patch/add-PHATE-again). In addition to PHATE, pseudotime values were calculated with Monocle3 (v.1.2.7), which computes trajectories with an origin set by the user [36,55–57]. The origin was set to be a progenitor cell state confirmed with lineage tracing experiments.

    35. Moon, K. R. et al. Visualizing structure and transitions in high-dimensional biological data. Nat Biotechnol 37, 1482–1492 (2019). doi:10.1038/s41587-019-0336-3
    36. Cao, J. et al. The single-cell transcriptional landscape of mammalian organogenesis. Nature 566, 496–502 (2019). doi:10.1038/s41586-019-0969-x
    55. Trapnell, C. et al. The dynamics and regulators of cell fate decisions are revealed by pseudotemporal ordering of single cells. Nature Biotechnology 32, 381–386 (2014). doi:10.1038/nbt.2859
    56. Qiu, X. et al. Single-cell mRNA quantification and differential analysis with Census. Nature Methods 14, 309–315 (2017). doi:10.1038/nmeth.4150
    57. Qiu, X. et al. Reversed graph embedding resolves complex single-cell trajectories. Nat Methods 14, 979–982 (2017). doi:10.1038/nmeth.4402

  15. Pathways from KEGG enrichment analysis with genes of cluster3 in the heatmap...

    • plos.figshare.com
    xlsx
    Updated Nov 26, 2024
    Cite
    Rong He; Qiang Zhang; Limei Wang; Yiwen Hu; Yue Qiu; Jia Liu; Dingyun You; Jishuai Cheng; Xue Cao (2024). Pathways from KEGG enrichment analysis with genes of cluster3 in the heatmap for mice (Fig 7H). [Dataset]. http://doi.org/10.1371/journal.pone.0311374.s016
    Available download formats: xlsx
    Dataset updated
    Nov 26, 2024
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Rong He; Qiang Zhang; Limei Wang; Yiwen Hu; Yue Qiu; Jia Liu; Dingyun You; Jishuai Cheng; Xue Cao
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Pathways from KEGG enrichment analysis with genes of cluster3 in the heatmap for mice (Fig 7H).

  16. Pathways from KEGG enrichment analysis with genes of cluster2 in the heatmap...

    • plos.figshare.com
    xlsx
    Updated Nov 26, 2024
    Cite
    Rong He; Qiang Zhang; Limei Wang; Yiwen Hu; Yue Qiu; Jia Liu; Dingyun You; Jishuai Cheng; Xue Cao (2024). Pathways from KEGG enrichment analysis with genes of cluster2 in the heatmap for humans (Fig 7C). [Dataset]. http://doi.org/10.1371/journal.pone.0311374.s013
    Available download formats: xlsx
    Dataset updated
    Nov 26, 2024
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Rong He; Qiang Zhang; Limei Wang; Yiwen Hu; Yue Qiu; Jia Liu; Dingyun You; Jishuai Cheng; Xue Cao
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Pathways from KEGG enrichment analysis with genes of cluster2 in the heatmap for humans (Fig 7C).

  17. Pathways from KEGG enrichment analysis with genes of cluster3 in the heatmap...

    • plos.figshare.com
    xlsx
    Updated Nov 26, 2024
    Cite
    Rong He; Qiang Zhang; Limei Wang; Yiwen Hu; Yue Qiu; Jia Liu; Dingyun You; Jishuai Cheng; Xue Cao (2024). Pathways from KEGG enrichment analysis with genes of cluster3 in the heatmap for humans (Fig 6C). [Dataset]. http://doi.org/10.1371/journal.pone.0311374.s010
    Available download formats: xlsx
    Dataset updated
    Nov 26, 2024
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Rong He; Qiang Zhang; Limei Wang; Yiwen Hu; Yue Qiu; Jia Liu; Dingyun You; Jishuai Cheng; Xue Cao
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Pathways from KEGG enrichment analysis with genes of cluster3 in the heatmap for humans (Fig 6C).

  18. Processed Seurat objects for GeneTrajectory inference (Gene Trajectory...

    • figshare.com
    application/gzip
    Updated Feb 19, 2024
    Cite
    Rihao Qu; Peggy Myung (2024). Processed Seurat objects for GeneTrajectory inference (Gene Trajectory Inference for Single-cell Data by Optimal Transport Metrics) [Dataset]. http://doi.org/10.6084/m9.figshare.25243225.v1
    Available download formats: application/gzip
    Dataset updated
    Feb 19, 2024
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Rihao Qu; Peggy Myung
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    These are processed Seurat objects for the two biological datasets in GeneTrajectory inference (https://github.com/KlugerLab/GeneTrajectory/):

    Human myeloid dataset analysis

    Myeloid cells were extracted from a publicly available 10x scRNA-seq dataset (https://support.10xgenomics.com/single-cell-gene-expression/datasets/3.0.0/pbmc_10k_v3). QC was performed using the same workflow as in (https://github.com/satijalab/Integration2019/blob/master/preprocessing_scripts/pbmc_10k_v3.R). After standard normalization, highly-variable gene selection and scaling using the Seurat R package, we applied PCA and retained the top 30 principal components. Four sub-clusters of myeloid cells were identified based on Louvain clustering with a resolution of 0.3. Wilcoxon rank-sum test was employed to find cluster-specific gene markers for cell type annotation.

    For gene trajectory inference, we first applied Diffusion Map on the cell PC embedding (using a local-adaptive kernel, each bandwidth is determined by the distance to its k-nearest neighbor, k = 10) to generate a spectral embedding of cells. We constructed a cell-cell kNN (k = 10) graph based on their coordinates of the top 5 non-trivial Diffusion Map eigenvectors. Among the top 2,000 variable genes, genes expressed by 0.5% − 75% of cells were retained for pairwise gene-gene Wasserstein distance computation. The original cell graph was coarse-grained into a graph of size 1,000. We then built a gene-gene graph where the affinity between genes is transformed from the Wasserstein distance using a Gaussian kernel (local-adaptive, k = 5). Diffusion Map was employed to visualize the embedding of the gene graph. For trajectory identification, we used a series of time steps (11,21,8) to extract three gene trajectories.

    Mouse embryo skin data analysis

    We separated out dermal cell populations from the newly collected mouse embryo skin samples. Cells from the wildtype and the Wls mutant were pooled for analyses. After standard normalization, highly-variable gene selection and scaling using Seurat, we applied PCA and retained the top 30 principal components. Three dermal cell types were stratified based on the expression of canonical dermal markers, including Sox2, Dkk1, and Dkk2. For gene trajectory inference, we first applied Diffusion Map on the cell PC embedding (using a local-adaptive kernel bandwidth, k = 10) to generate a spectral embedding of cells. We constructed a cell-cell kNN (k = 10) graph based on their coordinates of the top 10 non-trivial Diffusion Map eigenvectors. Among the top 2,000 variable genes, genes expressed by 1% − 50% of cells were retained for pairwise gene-gene Wasserstein distance computation. The original cell graph was coarse-grained into a graph of size 1,000. We then built a gene-gene graph where the affinity between genes is transformed from the Wasserstein distance using a Gaussian kernel (local-adaptive, k = 5). Diffusion Map was employed to visualize the embedding of the gene graph. For trajectory identification, we used a series of time steps (9,16,5) to sequentially extract three gene trajectories. To compare the differences between the wildtype and the Wls mutant, we stratified Wnt-active UD cells into seven stages according to their expression profiles of the genes binned along the DC gene trajectory.
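The gene pre-selection by expression prevalence (0.5% to 75% of cells for the myeloid data, 1% to 50% for the skin data) can be sketched as below. This is not GeneTrajectory's own code; the function names, the inclusive bounds, the "count above zero" definition of expressed, and the toy matrix are all assumptions made for illustration.

```python
def expressed_fraction(counts):
    """Fraction of cells in which the gene has a non-zero count."""
    return sum(1 for c in counts if c > 0) / len(counts)


def select_genes(expression, low=0.005, high=0.75):
    """Keep genes expressed by between `low` and `high` of cells (inclusive).

    expression: {gene: [count in cell 1, count in cell 2, ...]}.
    """
    return [gene for gene, counts in expression.items()
            if low <= expressed_fraction(counts) <= high]


# Toy matrix with 200 cells per gene (gene labels are made up).
n_cells = 200
toy_expression = {
    "rare_gene": [1] * 2 + [0] * (n_cells - 2),           # 1% of cells
    "mid_gene": [3] * 100 + [0] * (n_cells - 100),        # 50% of cells
    "ubiquitous_gene": [5] * 190 + [0] * (n_cells - 190),  # 95% of cells
    "silent_gene": [0] * n_cells,                          # never detected
}
myeloid_style = select_genes(toy_expression, low=0.005, high=0.75)
skin_style = select_genes(toy_expression, low=0.01, high=0.50)
```

Dropping near-ubiquitous and near-silent genes keeps the pairwise Wasserstein distance computation focused on genes whose distribution over the cell graph actually varies.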

  19. Skin sc-RNASeq from seven body sites (face, scalp, axilla, palmoplantar,...

    • plus.figshare.com
    bin
    Updated Mar 11, 2025
    Cite
    Lam C Tsoi; Rachael Bogle; Johann Gudjonsson; Meri Oliva; Bridget Riley-Gillis (2025). Skin sc-RNASeq from seven body sites (face, scalp, axilla, palmoplantar, arm, leg, and back) [Dataset]. http://doi.org/10.25452/figshare.plus.25696620.v2
    Available download formats: bin
    Dataset updated
    Mar 11, 2025
    Dataset provided by
    Figshare+
    Authors
    Lam C Tsoi; Rachael Bogle; Johann Gudjonsson; Meri Oliva; Bridget Riley-Gillis
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    This sc-RNAseq dataset is composed of disease-unaffected epidermal samples from 96 skin biopsies: 18 from published datasets - GSE173706, GSE249279 – and 78 newly generated ones. Biopsy sample and protocol details, and curated cell-type signature genes, are available in the scRNASeq_source_info_FigShare spreadsheet of this dataset. Processed Seurat object are provided herein. Raw data are available in SRA (id PRJNA1054546). Biopsies originated from seven body sites (face, scalp, axilla, palmoplantar, arm, leg, and back). The skin biopsies were separated into epidermis and dermis before dissociated and enriched for various cell fractions (keratinocytes, fibroblasts, and endothelial cells) and immune cells (myeloid and lymphoid cells) to up sample rare cell types. In total, across body sites, 274,834 cells were profiled, including 96,194 keratinocytes. Seurat v3.0. was utilized to normalize, scale, and reduce the dimensionality of the data. Low quality cells containing less than 200 genes per cell as well as greater than 5,000 genes per cell were filtered out. Cells containing more mitochondrial genes than the permitted quantile of 0.05 were removed. Ambient RNA was removed using R package SoupX v1.6.2. Doublets were removed using scDblFinder v1.12.0. Principal components (PC) were obtained from the topmost 2,000 variable genes, and the Uniform Manifold Approximation and Projection (UMAP) dimensional reduction technique was applied to the 30 topmost variable PC-reduced dataset. Batch effect correction was performed utilizing harmony v1.0, using donor as batch. After batch correction, cells were clustered using shared nearest neighbor modularity optimization-based clustering. Cluster marker genes were identified with FindAllMarkers; cluster corresponding cell type was identified by comparing marker genes to curated cell-type signature genes. 
Differential expression by keratinocyte subtype was performed with the Seurat (v4.3.0) FindMarkers function by comparing each keratinocyte subtype to non-keratinocyte clusters. The log fold-change of the average expression between a keratinocyte subtype cluster and the remaining clusters is used as the keratinocyte-subtype gene expression statistic.
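As a rough illustration of a log fold-change of average expression, the Python sketch below compares the mean expression of one gene in a subtype cluster with its mean in the remaining cells. It is a generic formula under stated assumptions, not a reproduction of the FindMarkers computation: the base-2 logarithm, the pseudocount of 1, and the example values are all assumptions.

```python
import math

def log2_fold_change(group_values, rest_values, pseudocount=1.0):
    """Log2 ratio of mean expression (plus a pseudocount) between two groups of cells."""
    mean_group = sum(group_values) / len(group_values)
    mean_rest = sum(rest_values) / len(rest_values)
    # The pseudocount avoids division by zero when a gene is absent from one group.
    return math.log2((mean_group + pseudocount) / (mean_rest + pseudocount))

# Hypothetical expression values of one gene (illustrative only).
subtype_expr = [3.0, 5.0, 7.0]  # cells in the keratinocyte subtype cluster, mean 5
other_expr = [1.0, 2.0, 3.0]    # cells in the comparison clusters, mean 2

lfc = log2_fold_change(subtype_expr, other_expr)
print(lfc)
```

A positive value indicates higher average expression in the subtype cluster than in the comparison group.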

  20. ProjecTILs murine reference atlas of tumor-infiltrating T cells, version 1

    • figshare.com
    application/gzip
    Updated Jun 29, 2023
    Massimo Andreatta; Santiago Carmona (2023). ProjecTILs murine reference atlas of tumor-infiltrating T cells, version 1 [Dataset]. http://doi.org/10.6084/m9.figshare.12478571.v2
    Available download formats: application/gzip
    Dataset updated
    Jun 29, 2023
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Massimo Andreatta; Santiago Carmona
    License

    Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
    License information was derived automatically

    Description

    We have developed ProjecTILs, a computational approach to project new data sets into a reference map of T cells, enabling their direct comparison in a stable, annotated system of coordinates. Because new cells are embedded in the same space as the reference, ProjecTILs enables the classification of query cells not only into annotated, discrete states but also over a continuous space of intermediate states. By comparing multiple samples over the same map, and across alternative embeddings, the method allows exploring the effect of cellular perturbations (e.g. as the result of therapy or genetic engineering) and identifying genetic programs significantly altered in the query compared to a control set or to the reference map. We illustrate the projection of several data sets from recent publications over two cross-study murine T cell reference atlases: the first describing tumor-infiltrating T lymphocytes (TILs), the second characterizing acute and chronic viral infection. To construct the reference TIL atlas, we obtained single-cell gene expression matrices from the following GEO entries: GSE124691, GSE116390, GSE121478, GSE86028; and entry E-MTAB-7919 from ArrayExpress. Data from GSE124691 contained samples from tumor and from tumor-draining lymph nodes, and were therefore treated as two separate datasets. For the TIL projection examples (OVA Tet+, miR-155 KO and Regnase-KO), we obtained the gene expression counts from entries GSE122713, GSE121478 and GSE137015, respectively. Prior to dataset integration, single-cell data from individual studies were filtered using TILPRED-1.0 (https://github.com/carmonalab/TILPRED), which removes cells not enriched in T cell markers (e.g. Cd2, Cd3d, Cd3e, Cd3g, Cd4, Cd8a, Cd8b1) and cells enriched in non-T cell genes (e.g. Spi1, Fcer1g, Csf1r, Cd19). Dataset integration was performed using STACAS (https://github.com/carmonalab/STACAS), a batch-correction algorithm based on Seurat 3.
For the TIL reference map, we specified 600 variable genes per dataset, excluding cell cycling genes, mitochondrial, ribosomal and non-coding genes, as well as genes expressed in fewer than 0.1% or more than 90% of the cells of a given dataset. For integration, a total of 800 variable genes were derived as the intersection of the 600 variable genes of individual datasets, prioritizing genes found in multiple datasets and, in case of draws, those derived from the largest datasets. We determined pairwise dataset anchors using STACAS with default parameters, and filtered anchors using an anchor score threshold of 0.8. Integration was performed using the IntegrateData function in Seurat 3, providing the anchor set determined by STACAS, and a custom integration tree to initiate alignment from the largest and most heterogeneous datasets. Next, we performed unsupervised clustering of the integrated cell embeddings using the Shared Nearest Neighbor (SNN) clustering method implemented in Seurat 3 with parameters {resolution=0.6, reduction="umap", k.param=20}. We then manually annotated individual clusters (merging clusters when necessary) based on several criteria: i) average expression of key marker genes in individual clusters; ii) gradients of gene expression over the UMAP representation of the reference map; iii) gene-set enrichment analysis to determine over- and under-expressed genes per cluster using MAST. In order to have access to predictive methods for UMAP, we recomputed PCA and UMAP embeddings independently of Seurat 3 using, respectively, the prcomp function from the base R package "stats" and the "umap" R package (https://github.com/tkonopka/umap).
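The consensus selection of integration genes described above (prioritizing genes found in multiple datasets and breaking draws in favor of the largest datasets) can be sketched in Python as follows. This is one possible reading of that rule, not the STACAS implementation: the function name, the alphabetical final tie-break, and the toy gene lists and cell counts are assumptions made for illustration.

```python
def consensus_variable_genes(genes_by_dataset, cells_by_dataset, n_genes):
    """Rank genes by how many datasets list them, then by the size of the largest such dataset."""
    support = {}
    for dataset, genes in genes_by_dataset.items():
        for gene in set(genes):
            count, largest = support.get(gene, (0, 0))
            support[gene] = (count + 1, max(largest, cells_by_dataset[dataset]))
    # Sort by dataset count (descending), largest supporting dataset (descending),
    # then gene name so the result is deterministic (this last key is an assumption).
    ranked = sorted(support, key=lambda g: (-support[g][0], -support[g][1], g))
    return ranked[:n_genes]

# Hypothetical per-dataset variable genes and dataset sizes (illustrative only).
genes_by_dataset = {
    "study_A": ["Cd8a", "Gzmb", "Pdcd1"],
    "study_B": ["Cd8a", "Tcf7", "Gzmb"],
    "study_C": ["Cd8a", "Foxp3"],
}
cells_by_dataset = {"study_A": 5000, "study_B": 1200, "study_C": 300}

selected = consensus_variable_genes(genes_by_dataset, cells_by_dataset, n_genes=4)
print(selected)
```

In this toy input, Pdcd1 and Tcf7 each appear in one dataset, so Pdcd1 ranks first because it comes from the larger study.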

Abhinav Kaushik; Kari Nadeau (2024). Single cell multiomic analysis identifies key genes differentially expressed in innate lymphoid cells from COVID-19 patients [Dataset]. http://doi.org/10.5061/dryad.8931zcrz4

Data from: Single cell multiomic analysis identifies key genes differentially expressed in innate lymphoid cells from COVID-19 patients

Available download formats: zip
Dataset updated
Jul 2, 2024
Dataset provided by
National Institute of Allergy and Infectious Diseases (http://www.niaid.nih.gov/)
Authors
Abhinav Kaushik; Kari Nadeau
License

https://spdx.org/licenses/CC0-1.0.html

Description

Innate lymphoid cells (ILCs) are enriched at mucosal surfaces where they respond rapidly to environmental stimuli and contribute to both tissue inflammation and healing. To gain insight into the role of ILCs in the pathology of and recovery from COVID-19 infection, we employed a multi-omic approach consisting of AbSeq and targeted mRNA sequencing to probe, respectively, the surface marker expression and the transcriptional profile and heterogeneity of ILCs in peripheral blood of patients with COVID-19 compared with healthy controls. We found that the frequency of ILC1 and ILC2 cells was significantly increased in COVID-19 patients. Moreover, all ILC subsets displayed a significantly higher frequency of CD69-expressing cells, indicating a heightened state of activation. ILC2s from COVID-19 patients had the highest number of significantly differentially expressed (DE) genes. The most notable DE genes in COVID-19 vs healthy participants included a) genes associated with responses to virus infections and b) genes that support ILC self-proliferation, activation and homeostasis. In addition, differential gene regulatory network analysis revealed ILC-specific regulons and their interactions driving the differential gene expression in each ILC. Overall, this study provides mechanistic insights into the characteristics of ILC subsets activated during COVID-19 infection.

Methods

Study participants, blood draws and processing

Participants were recruited as described previously from adults who had a positive SARS-CoV-2 RT-PCR test at Stanford Health Care (NCT04373148). Collection of COVID samples occurred between May and December 2020. The cohort used in this study consisted of asymptomatic (n=2), mild (n=17), and moderate (n=3) COVID-19 infections, some of whom developed long term COVID-19 (n=15). The clinical case severities at the time of diagnosis were defined as asymptomatic, moderate or mild according to the guidelines released by NIH.
Long term (LT) COVID was defined as symptoms occurring 30 or more days after infection, consistent with CDC guidelines. Some participants in our study continued to have LT COVID symptoms 90 days after diagnosis (n=12). The exclusion criterion for the COVID sample study was an NIH severity diagnosis of severe or critical at the time of the positive COVID test. Samples selected for this study were obtained within 76 days of the positive PCR COVID-19 test date. Healthy controls were selected from participants whose samples were collected before 2020. Informed consent was obtained from all participants. All protocols were approved by the Stanford Administrative Panel on Human Subjects in Medical Research. Peripheral blood was drawn by venipuncture. Using validated and published procedures, peripheral blood mononuclear cells (PBMCs) were isolated by Ficoll-based density gradient centrifugation, frozen in aliquots and stored in liquid nitrogen at -80°C until thawing. A summary of participant demographics is presented in Supp. Table 1.
ILC enrichment, single cell captures for AbSeq and targeted mRNAseq

Participant PBMCs were thawed, and each sample was stained with Sample Tag (BD #633781) at room temperature for 20 minutes. Samples were combined in healthy control or COVID-19 tubes. Cells were surface stained with a panel of fluorochrome-conjugated antibodies (Supp. Table 2) in buffer (PBS with 0.25% BSA and 1mM EDTA) for 20 minutes at room temperature prior to immunomagnetic negative selection for ILCs. Following ILC enrichment using the EasySep human Pan-ILC enrichment kit (StemCell Technologies #17975), cells from healthy and COVID-19 recovered participants were counted and normalized before combining. ILCs were sorted using a BD FACS Aria at the Stanford FACS facility prior to incubation with AbSeq oligo-linked mAbs (Supp. Table 3). Sorted cells were processed by the Stanford Human Immune Monitoring Center (HIMC) using the BD Rhapsody platform. The library was prepared using the BD Immune Response Targeting Panel (BD Kit #633750) with the addition of custom gene panel reagents (Supp. Table 4) and sequenced on an Illumina NovaSeq 6000 at the Stanford Genomics Sequencing Center (SGSC). ILCs were identified as Lineageneg (CD3neg, CD14neg, CD34neg, CD19neg), NKG2Aneg, CD45+, further defined as CD127+CD161+, and divided into subsets: ILC1 (CD117negCRTH2neg), ILC2 (CRTH2+) and ILCp (CD117+CRTH2neg) (Supp. Fig. 1).

Computational data analysis

The above multi-modal setup allowed paired measurements of cellular transcriptome and cell surface protein abundance. The ILC1, ILC2 and ILCp cells were manually gated based on the abundance profile of CD127, CD117, CD161 and CRTH2 (Supp. Fig. 1). Before the integrative analysis, the complete multi-modal single cell dataset containing ILC subsets was converted into a single Seurat object. All subsequent protein-level and gene-level analyses were performed using the multimodal data analysis pipeline of the Seurat R package version 4.0.
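The subset definitions given above (ILC1, ILC2 and ILCp within the CD127+CD161+ population) can be written as a small decision rule. The Python sketch below assumes that marker positivity has already been decided for each cell, for example by manual gating, and that the upstream lineage, NKG2A and CD45 gates have been applied; the function and argument names are hypothetical.

```python
def classify_ilc(cd127_pos, cd161_pos, cd117_pos, crth2_pos):
    """Assign an ILC subset from boolean marker calls; return None for non-ILC cells."""
    # ILCs are defined here as CD127+CD161+ (upstream gates are assumed to be applied).
    if not (cd127_pos and cd161_pos):
        return None
    # ILC2 is defined by CRTH2 positivity alone.
    if crth2_pos:
        return "ILC2"
    # Among CRTH2-negative cells, CD117 separates ILCp (positive) from ILC1 (negative).
    return "ILCp" if cd117_pos else "ILC1"

print(classify_ilc(True, True, False, False))  # CD117neg CRTH2neg
print(classify_ilc(True, True, True, True))    # CRTH2+
print(classify_ilc(True, True, True, False))   # CD117+ CRTH2neg
```

Because CRTH2 is checked first, a CD117+CRTH2+ cell is labeled ILC2, which follows from defining ILC2 by CRTH2 alone.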
The normalized and scaled protein abundance profile was used to estimate the integrated harmony dimensions using the RunHarmony function (harmony R package, applied to the Seurat object; reduction = 'apca' and group.by.vars = 'batch'). The batch-corrected harmony embeddings were then used to compute the Uniform Manifold Approximation and Projection (UMAP) dimensions to visualize the clusters of ILC subsets. Differential marker analysis of surface proteins from the AbSeq panels between two groups of cells (COVID-19 and healthy cohorts) was computed on normalized and scaled expression values using the FindMarkers function from the Seurat R package (test.use = 'wilcox'). Similarly, differential gene expression analysis was performed on normalized and scaled gene expression values between the two groups of cells (COVID-19 and healthy cohorts) using the FindMarkers function from the Seurat R package (test.use = 'MAST' and latent.vars = 'batch'). Genes with log-fold change > 0.5 and adjusted p-value < 0.05 (method: Benjamini-Hochberg) were considered significant for further evaluation. The resulting adjusted p-value box plots were generated using the ggplot2 R package (version 3.4.2) after computing the number of cells expressing a given protein or gene in each sample. Pathway enrichment analysis of DE genes was performed using the Metascape web server (version 3.5). The AUCell scoring and gene regulatory network analysis were performed using the pySCENIC pipeline (version 0.12.1). The gene regulatory network was reconstructed using the GRNBoost2 algorithm, and the list of human TFs (genome version: hg38) was obtained from the cisTarget database (https://resources.aertslab.org/cistarget). Cellular enrichment (AUCell) analysis, which measures the activity of TFs or gene signatures across all single cells, was performed using the aucell function in the pySCENIC Python library. The ggplot2 R package (version 3.4.2) was used for box plot visualization. The differential gene co-expression analysis was performed using the scSFMnet R package.
Circular plots were generated using the R package circlize (version 0.4.15).
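The significance rule stated above (log-fold change > 0.5 and Benjamini-Hochberg adjusted p-value < 0.05) can be sketched in plain Python. This is an illustration of the thresholding step only, not the authors' Seurat/MAST pipeline: the function names and the example statistics are hypothetical, and the cut-off is applied to the signed log-fold change exactly as written, which is an assumption about how the threshold was used.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotone adjusted values.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted

def significant_genes(results, lfc_cutoff=0.5, alpha=0.05):
    """Select genes passing both the log-fold change and adjusted p-value cut-offs."""
    genes = list(results)
    adj = benjamini_hochberg([results[g][1] for g in genes])
    return [g for g, q in zip(genes, adj) if results[g][0] > lfc_cutoff and q < alpha]

# Hypothetical (log-fold change, raw p-value) pairs, illustrative only.
results = {
    "gene_1": (1.2, 0.001),
    "gene_2": (0.3, 0.002),
    "gene_3": (0.9, 0.04),
    "gene_4": (0.7, 0.5),
}
hits = significant_genes(results)
print(hits)
```

In this toy input, gene_3 has a raw p-value below 0.05 but is excluded because its adjusted value (0.04 × 4 / 3) exceeds the cut-off.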
