Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Although an essential step, the functional annotation of cells often proves particularly challenging in the analysis of single-cell transcriptional data. Several methods have been developed to accomplish this task. However, in most cases, these rely on techniques initially developed for bulk RNA sequencing or simply make use of marker genes identified from cell clustering followed by supervised annotation. To overcome these limitations and automatise the process, we have developed two novel methods, the single-cell gene set enrichment analysis (scGSEA) and the single cell mapper (scMAP). scGSEA combines latent data representations and gene set enrichment scores to detect coordinated gene activity at single-cell resolution. scMAP uses transfer learning techniques to repurpose and contextualise new cells into a reference cell atlas. Using both simulated and real datasets, we show that scGSEA effectively recapitulates recurrent patterns of pathways’ activity shared by cells from different experimental conditions. At the same time, we show that scMAP can reliably map and contextualise new single cell profiles on a breast cancer atlas we recently released. Both tools are provided in an effective and straightforward workflow providing a framework to determine cell function and significantly improve annotation and interpretation of scRNA-seq data.
https://spdx.org/licenses/CC0-1.0.html
Skeletal muscle repair is driven by the coordinated self-renewal and fusion of myogenic stem and progenitor cells. Single-cell gene expression analyses of myogenesis have been hampered by the poor sampling of rare and transient cell states that are critical for muscle repair, and do not inform the spatial context that is important for myogenic differentiation. Here, we demonstrate how large-scale integration of single-cell and spatial transcriptomic data can overcome these limitations. We created a single-cell transcriptomic dataset of mouse skeletal muscle by integration, consensus annotation, and analysis of 23 newly collected scRNAseq datasets and 88 publicly available single-cell (scRNAseq) and single-nucleus (snRNAseq) RNA-sequencing datasets. The resulting dataset includes more than 365,000 cells and spans a wide range of ages, injury, and repair conditions. Together, these data enabled identification of the predominant cell types in skeletal muscle, and resolved cell subtypes, including endothelial subtypes distinguished by vessel-type of origin, fibro/adipogenic progenitors defined by functional roles, and many distinct immune populations. The representation of different experimental conditions and the depth of transcriptome coverage enabled robust profiling of sparsely expressed genes. We built a densely sampled transcriptomic model of myogenesis, from stem cell quiescence to myofiber maturation and identified rare, transitional states of progenitor commitment and fusion that are poorly represented in individual datasets. We performed spatial RNA sequencing of mouse muscle at three time points after injury and used the integrated dataset as a reference to achieve a high-resolution, local deconvolution of cell subtypes. We also used the integrated dataset to explore ligand-receptor co-expression patterns and identify dynamic cell-cell interactions in muscle injury response. 
We provide a public web tool to enable interactive exploration and visualization of the data. Our work supports the utility of large-scale integration of single-cell transcriptomic data as a tool for biological discovery.
Methods
Mice. The Cornell University Institutional Animal Care and Use Committee (IACUC) approved all animal protocols, and experiments were performed in compliance with its institutional guidelines. Adult C57BL/6J mice (Mus musculus) were obtained from Jackson Laboratories (#000664; Bar Harbor, ME) and were used at 4-7 months of age. Aged C57BL/6J mice were obtained from the National Institute on Aging (NIA) Rodent Aging Colony and were used at 20 months of age. Female mice were used in all new scRNAseq experiments.
Mouse injuries and single-cell isolation. To induce muscle injury, both tibialis anterior (TA) muscles of old (20 months) C57BL/6J mice were injected with 10 µl of notexin (10 µg/ml; Latoxan; France). At 0, 1, 2, 3.5, 5, or 7 days post-injury (dpi), mice were sacrificed and TA muscles were collected and processed independently to generate single-cell suspensions. Muscles were digested with 8 mg/ml Collagenase D (Roche; Switzerland) and 10 U/ml Dispase II (Roche; Switzerland), followed by manual dissociation to generate cell suspensions. Cell suspensions were sequentially filtered through 100 and 40 μm filters (Corning Cellgro #431752 and #431750) to remove debris. Erythrocytes were removed through incubation in erythrocyte lysis buffer (IBI Scientific #89135-030).
Single-cell RNA-sequencing library preparation. After digestion, single-cell suspensions were washed and resuspended in 0.04% BSA in PBS at a concentration of 10^6 cells/ml. Cells were counted manually with a hemocytometer to determine their concentration. Single-cell RNA-sequencing libraries were prepared using the Chromium Single Cell 3’ reagent kit v3 (10x Genomics, PN-1000075; Pleasanton, CA) following the manufacturer’s protocol. Cells were diluted into the Chromium Single Cell A Chip to yield a recovery of 6,000 single-cell transcriptomes. After preparation, libraries were sequenced on a NextSeq 500 (Illumina; San Diego, CA) using 75 cycle high output kits (Index 1 = 8, Read 1 = 26, and Read 2 = 58). Details on estimated sequencing saturation and the number of reads per sample are shown in Sup. Data 1.
Spatial RNA sequencing library preparation. Tibialis anterior muscles of adult (5 mo) C57BL/6J mice were injected with 10 µl notexin (10 µg/ml) at 2, 5, and 7 days prior to collection. Upon collection, tibialis anterior muscles were isolated, embedded in OCT, and frozen fresh in liquid nitrogen. Spatially tagged cDNA libraries were built using the Visium Spatial Gene Expression 3’ Library Construction v1 Kit (10x Genomics, PN-1000187; Pleasanton, CA) (Fig. S7). Optimal tissue permeabilization time for 10 µm thick sections was found to be 15 minutes using the 10x Genomics Visium Tissue Optimization Kit (PN-1000193). H&E stained tissue sections were imaged using a Zeiss PALM MicroBeam laser capture microdissection system, and the images were stitched and processed using Fiji ImageJ software. cDNA libraries were sequenced on an Illumina NextSeq 500 using 150 cycle high output kits (Read 1 = 28 bp, Read 2 = 120 bp, Index 1 = 10 bp, and Index 2 = 10 bp). Frames around the capture area on the Visium slide were aligned manually and spots covering the tissue were selected using Loop Browser v4.0.0 software (10x Genomics). Sequencing data were then aligned to the mouse reference genome (mm10) using the spaceranger v1.0.0 pipeline (10x Genomics) to generate a feature-by-spot-barcode expression matrix.
Download and alignment of single-cell RNA sequencing data. For all samples available via SRA, parallel-fastq-dump (github.com/rvalieris/parallel-fastq-dump) was used to download raw .fastq files. Samples which were only available as .bam files were converted to .fastq format using bamtofastq from 10x Genomics (github.com/10XGenomics/bamtofastq). Raw reads were aligned to the mm10 reference using cellranger (v3.1.0).
Preprocessing and batch correction of single-cell RNA sequencing datasets. First, ambient RNA signal was removed using the default SoupX (v1.4.5) workflow (autoEstCont and adjustCounts; github.com/constantAmateur/SoupX). Samples were then preprocessed using the standard Seurat (v3.2.1) workflow (NormalizeData, ScaleData, FindVariableFeatures, RunPCA, FindNeighbors, FindClusters, and RunUMAP; github.com/satijalab/seurat). Cells with fewer than 750 features, fewer than 1000 transcripts, or more than 30% of unique transcripts derived from mitochondrial genes were removed. After preprocessing, DoubletFinder (v2.0) was used to identify putative doublets in each dataset individually. BCmvn optimization was used for pK parameterization. Estimated doublet rates were computed by fitting the total number of cells after quality filtering to a linear regression of the expected doublet rates published in the 10x Chromium handbook. Estimated homotypic doublet rates were also accounted for using the modelHomotypic function. The default pN value (0.25) was used. Putative doublets were then removed from each individual dataset. After preprocessing and quality filtering, we merged the datasets and performed batch correction with three tools independently: Harmony (github.com/immunogenomics/harmony) (v1.0), Scanorama (github.com/brianhie/scanorama) (v1.3), and BBKNN (github.com/Teichlab/bbknn) (v1.3.12). We then used Seurat to process the integrated data. After initial integration, we removed the noisy cluster and re-integrated the data using each of the three batch-correction tools.
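The quality filter and the doublet-rate estimate described above can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' code (the original workflow was run in R with Seurat and DoubletFinder). The names `CellQC`, `passes_qc`, `fit_line`, and `estimated_doublet_rate` are hypothetical, and the doublet-rate table is a placeholder assuming roughly 0.8% doublets per 1,000 recovered cells; the actual values should be taken from the 10x Genomics documentation.

```python
from dataclasses import dataclass


@dataclass
class CellQC:
    """Per-cell QC metrics (hypothetical container for this sketch)."""
    n_features: int      # number of detected genes
    n_transcripts: int   # number of unique transcripts (UMIs)
    pct_mito: float      # percent of transcripts from mitochondrial genes


def passes_qc(cell, min_features=750, min_transcripts=1000, max_pct_mito=30.0):
    """Keep a cell unless it has fewer than 750 features, fewer than
    1000 transcripts, or more than 30% mitochondrial transcripts."""
    return (cell.n_features >= min_features
            and cell.n_transcripts >= min_transcripts
            and cell.pct_mito <= max_pct_mito)


def fit_line(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# ASSUMPTION: placeholder table (cells recovered -> expected doublet rate).
# Replace with the values published by 10x Genomics before real use.
CELLS_RECOVERED = [1000, 2000, 4000, 8000]
EXPECTED_DOUBLET_RATE = [0.008, 0.016, 0.032, 0.064]


def estimated_doublet_rate(n_cells_after_qc):
    """Estimate the doublet rate for a dataset from its post-QC cell count."""
    slope, intercept = fit_line(CELLS_RECOVERED, EXPECTED_DOUBLET_RATE)
    return slope * n_cells_after_qc + intercept


cells = [CellQC(800, 1200, 10.0), CellQC(700, 1200, 10.0), CellQC(900, 2000, 35.0)]
kept = [c for c in cells if passes_qc(c)]
rate = estimated_doublet_rate(len(kept))
```

In the workflow described above, the estimated rate would be computed per dataset and passed to the doublet caller together with the homotypic adjustment; that step is not reproduced here.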
Cell type annotation. Cell types were determined for each integration method independently. For Harmony and Scanorama, dimensions accounting for 95% of the total variance were used to generate SNN graphs (Seurat::FindNeighbors). Louvain clustering was then performed on the output graphs (including the corrected graph output by BBKNN) using Seurat::FindClusters. A clustering resolution of 1.2 was used for Harmony (25 initial clusters), BBKNN (28 initial clusters), and Scanorama (38 initial clusters). Cell types were determined based on expression of canonical genes (Fig. S3). Clusters which had similar canonical marker gene expression patterns were merged.
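Selecting the dimensions that account for 95% of the total variance can be illustrated as follows. This is a minimal sketch, assuming that "total variance" means the summed variance of the computed components and that per-component standard deviations are available; the function name `dims_for_variance` is hypothetical and not part of Seurat.

```python
def dims_for_variance(stdevs, threshold=0.95):
    """Return the smallest number of leading components whose cumulative
    variance reaches `threshold` of the variance across all components.

    `stdevs` holds per-component standard deviations, ordered from the
    first component onward (ASSUMPTION: this is what the reduction reports).
    """
    variances = [s * s for s in stdevs]
    total = sum(variances)
    cumulative = 0.0
    for n_dims, variance in enumerate(variances, start=1):
        cumulative += variance
        if cumulative / total >= threshold:
            return n_dims
    return len(variances)


# Toy example: four components with decreasing standard deviation.
n_dims = dims_for_variance([3.0, 2.0, 1.0, 0.5])
```

The returned count would then define the range of dimensions passed to the neighbor-graph step.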
Pseudotime workflow. Cells were subset based on the consensus cell types between all three integration methods. Harmony embedding values from the dimensions accounting for 95% of the total variance were used for further dimensional reduction with PHATE, using phateR (v1.0.4) (github.com/KrishnaswamyLab/phateR).
Deconvolution of spatial RNA sequencing spots. Spot deconvolution was performed using the deconvolution module in BayesPrism (previously known as “Tumor microEnvironment Deconvolution”, TED, v1.0; github.com/Danko-Lab/TED). First, myogenic cells were re-labeled, according to binning along the first PHATE dimension, as “Quiescent MuSCs” (bins 4-5), “Activated MuSCs” (bins 6-7), “Committed Myoblasts” (bins 8-10), and “Fusing Myocytes” (bins 11-18). Culture-associated muscle stem cells were ignored and myonuclei labels were retained as “Myonuclei (Type IIb)” and “Myonuclei (Type IIx)”. Next, highly and differentially expressed genes across the 25 groups of cells were identified with differential gene expression analysis using Seurat (FindAllMarkers, using Wilcoxon Rank Sum Test; results in Sup. Data 2). The resulting genes were filtered based on average log2-fold change (avg_logFC > 1) and the percentage of cells within the cluster which express each gene (pct.expressed > 0.5), yielding 1,069 genes. Mitochondrial and ribosomal protein genes were also removed from this list, in line with recommendations in the BayesPrism vignette. For each of the cell types, mean raw counts were calculated across the 1,069 genes to generate a gene expression profile for BayesPrism. Raw counts for each spot were then passed to the run.Ted function, using
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Additional file 1: Supplementary Table S1. Detailed comparison of multiple single-cell RNA-seq data processing workflows.
This record includes training materials associated with the Australian BioCommons workshop 'Single cell RNAseq analysis in R'. This workshop took place over two 3.5-hour sessions on 26 and 27 October 2023.
Event description
Analysis and interpretation of single cell RNAseq (scRNAseq) data requires dedicated workflows. In this hands-on workshop we will show you how to perform single cell analysis using Seurat, an R package for QC, analysis, and exploration of single-cell RNAseq data. We will discuss the 'why' behind each step and cover reading in the count data, quality control, filtering, normalisation, clustering, UMAP layout and identification of cluster markers. We will also explore various ways of visualising single cell expression data. This workshop is presented by the Australian BioCommons, Queensland Cyber Infrastructure Foundation (QCIF) and the Monash Genomics and Bioinformatics Platform with the assistance of a network of facilitators from the national Bioinformatics Training Cooperative.
Lead trainers: Sarah Williams, Adele Barugahare, Paul Harrison, Laura Perlaza Jimenez
Facilitators: Nick Matigan, Valentine Murigneux, Magdalena (Magda) Antczak
Infrastructure provision: Uwe Winter
Coordinator: Melissa Burke
Training materials
Materials are shared under a Creative Commons Attribution 4.0 International agreement unless otherwise specified and were current at the time of the event.
Files and materials included in this record:
Event metadata (PDF): Information about the event including description, event URL, learning objectives, prerequisites, technical requirements etc.
Index of training materials (PDF): List and description of all materials associated with this event including the name, format, location and a brief description of each file.
scRNAseq_Schedule (PDF): A breakdown of the topics and timings for the workshop.
Materials shared elsewhere:
This workshop follows the tutorial 'scRNAseq Analysis in R with Seurat' https://swbioinf.github.io/scRNAseqInR_Doco/index.html
Slides used to introduce key topics are available via GitHub https://github.com/swbioinf/scRNAseqInR_Doco/tree/main/slides
This material is based on the introductory Guided Clustering Tutorial from Seurat. It also draws on a similar workshop held by the Monash Bioinformatics Platform (Single-Cell-Workshop).
https://www.archivemarketresearch.com/privacy-policy
The single-cell RNA sequencing (scRNA-seq) kits market is experiencing robust growth, driven by advancements in genomics research, personalized medicine initiatives, and the increasing need for understanding cellular heterogeneity in various biological systems. The market, estimated at $2 billion in 2025, is projected to exhibit a compound annual growth rate (CAGR) of 15% from 2025 to 2033, reaching a substantial market size. This expansion is fueled by several key factors. Firstly, the rising adoption of scRNA-seq in diverse research areas, including oncology, immunology, and neuroscience, is significantly boosting market demand. Secondly, technological innovations leading to improved kit performance, reduced costs, and simplified workflows are making scRNA-seq more accessible to a wider range of researchers and laboratories. The development of user-friendly benchtop kits is further contributing to market expansion, especially among smaller research institutions and biotechnology companies. Finally, the growing availability of bioinformatics tools and data analysis platforms is simplifying the complex data generated by scRNA-seq, facilitating wider adoption and interpretation. The market segmentation reveals a strong preference for benchtop kits due to their ease of use and affordability, particularly among smaller research facilities. The application segment shows a significant contribution from research institutions, closely followed by bioscience companies. While North America and Europe currently dominate the market, Asia-Pacific is poised for significant growth, driven by increasing investment in life sciences research and infrastructure development in emerging economies like China and India. However, factors such as the high cost of scRNA-seq, the need for specialized expertise in data analysis, and regulatory hurdles in some regions may partially restrain market growth. 
Nevertheless, the continued advancements in technology and the increasing demand for precise cellular analysis are anticipated to outweigh these constraints, ensuring consistent market expansion throughout the forecast period.
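As a worked example of the growth figures quoted above, the following sketch compounds the stated 2025 estimate at the stated CAGR. The source does not give the 2033 value, so the result is only what the two quoted numbers imply under the assumption of annual compounding over 2025 to 2033; the function name `project_market_size` is hypothetical.

```python
def project_market_size(base_value, cagr, base_year, target_year):
    """Compound `base_value` at annual rate `cagr` from base_year to target_year."""
    years = target_year - base_year
    return base_value * (1 + cagr) ** years


# Figures quoted in the text: $2 billion in 2025, 15% CAGR through 2033.
size_2033 = project_market_size(2.0, 0.15, 2025, 2033)
print(f"Implied 2033 market size: ${size_2033:.2f} billion")
```

Under these assumptions the implied 2033 market size is roughly $6.1 billion.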
Single cell (sc) RNA-sequencing is a powerful technology to discover new cell types and study biological processes in complex biological samples. A current challenge is to predict transcription factor (TF) regulation from scRNA data. Here, we propose a novel approach for predicting gene expression at the single cell level using cis-regulatory motifs as well as epigenetic features. We designed a tree-guided multi-task learning framework that considers each cell as a task. Through this framework we were able to explain the single cell gene expression values using either TF binding affinities or TF ChIP-seq data measured at specific genomic regions. TFs identified using these models could be validated against the literature. Our proposed method allows us to identify distinct TFs that show cell-type specific regulation. This approach is not limited to TFs, but can use any type of data that can potentially explain gene expression at the single cell level, to study factors that drive differentiation or show abnormal regulation in disease. The implementation of our workflow can be accessed under an MIT license via GitHub.
https://dataverse.nl/api/datasets/:persistentId/versions/2.0/customlicense?persistentId=doi:10.34894/TYHGEF
These are the single-cell RNAseq data from the Athero-Express Biobank Study as used after quality control in the paper referenced below; abstract below.
Background
Genome-wide association studies (GWAS) have discovered hundreds of common genetic variants for atherosclerotic disease and cardiovascular risk factors. The translation of susceptibility loci into biological mechanisms and targets for drug discovery remains challenging. Intersecting genetic and gene expression data has led to identification of candidate genes. However, the assayed tissues are often non-diseased and heterogeneous in cell composition, confounding the candidate prioritization. We collected single-cell transcriptomics (scRNA-seq) from atherosclerotic plaques and aimed to identify cell-type-specific expression of disease-associated genes.
Methods and Results
To identify disease-associated candidate genes, we applied gene-based analyses using GWAS summary statistics from 46 atherosclerotic, cardiometabolic, and other traits. Next, we intersected these candidates with single-cell transcriptomics (scRNA-seq) to identify those genes that are specifically expressed in individual cell (sub)populations of atherosclerotic plaques. We derive an enrichment score and show that loci associated with coronary artery disease demonstrated a prominent substrate in plaque smooth muscle cells (SKI, KANK2, SORT1), endothelial cells (SLC44A1, ATP2B1), and macrophages (APOE, HNRNPUL1). Further subclustering of smooth muscle cell (SMC) subtypes revealed genes in risk loci for coronary calcification specifically enriched in a synthetic cluster of SMCs. To verify the robustness of our approach, we used liver-derived scRNAseq data and showed enrichment of circulating lipids-associated loci in hepatocytes.
Conclusion
We confirm known gene-cell pairs relevant for atherosclerotic disease, and discover novel pairs pointing to new biological mechanisms amenable for therapy. We present an intuitive single-cell transcriptomics driven workflow rooted in human large-scale genetic studies to identify putative candidate genes and affected cells associated with cardiovascular traits.
Athero-Express Biobank Study
The AE started in 2002 and now includes over 3,500 patients who underwent surgery to remove atherosclerotic plaques (endarterectomy) from one (or more) of their major arteries (majority carotids and femorals). The study design and staining protocols are described by Verhoeven et al.
GitHub
A link to the public GitHub repository: https://github.com/CirculatoryHealth/gwas2single. This contains all scripts used for the data, which is pseudonymized and shared here.
Additional data
Additional clinical data is available upon discussion and signing a Data Sharing Agreement (see Terms of Access).
PlaqView
In collaboration with the http://millerlab.org from the University of Virginia (USA) we created PlaqView.com. You can query any gene of interest in many carotid-plaque datasets, including ours. In our experience, this usually suffices for most research questions and avoids the lengthy process of obtaining these data through a DSA.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The Postmortem-derived Brain Sequencing Collection is a harmonized repository of scRNAseq sequencing data contributed by two ASAP CRN teams (Team Lee and Team Hafler). These samples were derived from the middle frontal gyrus, hippocampus, substantia nigra, and pre-frontal cortex regions of healthy control, Parkinson’s Disease, and Alzheimer’s Disease brains. The samples have been harmonized across 10x sequenced data aligned into count tables with Cell Ranger.
The current collection represents the minimum viable product and will be expanded and improved as additional data is uploaded into the ASAP CRN Cloud. When complete, the collection will provide data generated from 1,800+ samples using proteomics, transcriptomics, and sequencing (single-nucleus RNAseq, single-cell RNAseq, bulk RNA-seq, ATAC-seq, long read WGS, and single-nucleus multiome sequencing (paired snRNAseq, snATACseq)) techniques.
The analysis workflow for this MVP dataset is available at https://github.com/ASAP-CRN/harmonized-wf-dev
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Single-cell RNA sequencing (scRNA-seq) technologies have precipitated the development of bioinformatic tools to reconstruct cell lineage specification and differentiation processes with single-cell precision. However, current start-up costs and recommended data volumes for statistical analysis remain prohibitively expensive, preventing scRNA-seq technologies from becoming mainstream. Here, we introduce single-cell amalgamation by latent semantic analysis (SALSA), a versatile workflow that combines measurement reliability metrics with latent variable extraction to infer robust expression profiles from ultra-sparse sc-RNAseq data. SALSA uses a matrix focusing approach that starts by identifying facultative genes with expression levels greater than experimental measurement precision and ends with cell clustering based on a minimal set of Profiler genes, each one a putative biomarker of cluster-specific expression profiles. To benchmark how SALSA performs in experimental settings, we used the publicly available 10X Genomics PBMC 3K dataset, a pre-curated silver standard from human frozen peripheral blood comprising 2,700 single-cell barcodes, and identified 7 major cell groups matching transcriptional profiles of peripheral blood cell types and driven agnostically by < 500 Profiler genes. Finally, we demonstrate successful implementation of SALSA in a replicative scRNA-seq scenario by using previously published DropSeq data from a multi-batch mouse retina experimental design, thereby identifying 10 transcriptionally distinct cell types from > 64,000 single cells across 7 independent biological replicates based on < 630 Profiler genes. With these results, SALSA demonstrates that robust pattern detection from scRNA-seq expression matrices only requires a fraction of the accrued data, suggesting that single-cell sequencing technologies can become affordable and widespread if meant as hypothesis-generation tools to extract large-scale differential expression effects.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Evaluation of different DEA methods on two real scRNA-seq datasets.
Adult zebrafish possess a strong regenerative capacity. Focusing on the heterogeneity of zebrafish optic nerve regeneration, single-cell sequencing analysis of the optic nerve of adult zebrafish two weeks after injury identified five major cell types: fibroblasts, vascular wall cells, immune cells, mature oligodendrocytes, and oligodendrocytes in the process of myelination.
Experimental Procedure
Dissociation of optic nerve to a single-cell suspension
Optic nerve samples from n=18 zebrafish (half male, half female) at 2 weeks post-injury (wpi) were pooled to create a single-cell suspension, ensuring sufficient cell capture while minimizing inter-individual variability. Consequently, there are no biological replicates, owing to the small tissue size of the zebrafish optic nerve and its low mRNA expression (Yu et al., 2024). Optic nerves were dissected, minced with a sterile scalpel into 1 mm fragments, suspended in 5 ml of digestion buffer consisting of 2 mg/ml Collagenase type II and 200 U/ml DNase I in RPMI medium, and incubated in a 37℃ water bath with shaking for 30 min. The suspension was passed through a 100 μm filter and centrifuged (400g, 10 min, 4℃). Pelleted cells were resuspended in red blood cell lysis buffer, incubated for 2 min, passed through a 40 μm filter, collected by centrifugation (400g, 10 min, 4℃) and resuspended in PBS containing 0.04% BSA. Cells were manually counted with Trypan blue and AO-PI (LUNA, D23001) after each centrifugation (400g, 10 min, 4℃) and resuspension. Single cells were processed using the Chromium Controller (10X Genomics) according to the manufacturer’s protocol.
Single-cell RNA sequencing
Single-cell 3ʹ gene expression profiling was performed using the Chromium Next GEM Single Cell 3ʹ Kit v3.1 and the Chromium Next GEM Chip G Single Cell Kit. The cell suspension was loaded onto the Chromium single cell controller (10x Genomics) to generate single-cell gel beads in emulsion (GEMs) according to the manufacturer’s protocol. Captured cells were lysed and the released RNA was barcoded through reverse transcription in individual GEMs. Cell-barcoded 3ʹ gene expression libraries were sequenced on an Illumina NovaSeq6000 system (Illumina, USA) by Shanghai Biochip Co., Ltd., China.
Single-cell RNA sequencing and data analysis
The raw single-cell RNA sequencing (scRNA-Seq) data were processed as described in a previous paper (Yu et al., 2024). Cells were retained if they met all of the following criteria: RNA feature count between 300 and 5000, RNA count between 500 and 8000, mitochondrial gene percentage below 20%, hemoglobin- and red blood cell–related gene percentages below 10%, and total RNA count not exceeding the 95th percentile. A total of 3,359 cells were sequenced, and 1,341 cells were retained after quality control. Clusters that did not belong to optic nerve tissue were excluded from further analysis. The detailed quality control workflow is presented in Supplementary file 1. Optic nerve samples were integrated using Seurat v5.1, followed by data normalization, scaling, dimensional reduction, and clustering. Visualization was performed with t-SNE using the first 30 principal components. Cell types were annotated based on canonical marker genes, including mature oligodendrocyte (tspan2a, mag, mbpa), myelin forming oligodendrocyte (egr2b, gldn, mbpb, s100b, si:ch211-234p6.5, ndrg1a), fibroblast (col1a1a, col1a1b, dcn), immune cell (nr4a1, srgn) and mural cell (rgs5a, rbpms2a, myh11a). We note that this immune cell cluster does not express any microglial or macrophage markers but instead shows high expression of more generalized immune function–related genes. Consequently, we can only designate it as an “immune cell” cluster and cannot further subdivide it. Gene Ontology (GO) enrichment analysis was conducted to identify functional pathways and biological processes associated with each group.
Proliferation scores were computed based on curated gene sets representing proliferation-related pathways
https://spdx.org/licenses/CC0-1.0.html
Skeletal muscle regeneration relies on the orchestrated interaction of myogenic and non-myogenic cells with spatial and temporal coordination. The regenerative capacity of skeletal muscle declines with aging due to alterations in myogenic stem/progenitor cell states and functions, non-myogenic cell contributions, and systemic changes, all of which accrue with age. A holistic network-level view of the cell-intrinsic and -extrinsic changes influencing muscle stem/progenitor cell contributions to muscle regeneration across the lifespan remains poorly resolved. To provide a comprehensive atlas of regenerative muscle cell states across mouse lifespan, we collected a compendium of 273,923 single-cell transcriptomes from hindlimb muscles of young, old, and geriatric (4-7, 20, and 26 months old, respectively) mice at six closely sampled time-points following myotoxin injury. We identified 29 muscle-resident cell types, eight of which exhibited accelerated or delayed dynamics in their abundances between age groups, including T and NK cells and multiple macrophage subtypes, suggesting that the age-related decline in muscle repair may arise from temporal miscoordination of the inflammatory response. We performed a pseudotime analysis of myogenic cells across the regeneration timespan and found age-specific myogenic stem/progenitor cell trajectories in old and geriatric muscles. Given the critical role that cellular senescence plays in limiting cell contributions in aged tissues, we built a series of tools to bioinformatically identify senescence in these single-cell data and assess their ability to identify senescence within key myogenic stages. 
By comparing single-cell senescence scores to co-expression of hallmark senescence genes Cdkn2a and Cdkn1a, we found that an experimentally derived gene list from a muscle foreign body response (FBR) fibrosis model accurately (receiver-operator curve AUC = 0.82-0.86) identified senescent-like myogenic cells across mouse ages, injury time-points, and cell-cycle states, in a manner comparable to curated gene-lists. Further, this scoring approach in both single-cell and spatial transcriptomic datasets pinpointed transitory senescent-like subsets within the myogenic stem/progenitor cell trajectory that are associated with stalled MuSC self-renewal states across all ages of mice. This new resource on mouse skeletal muscle aging provides a comprehensive portrait of the changing cellular states and interactions underlying skeletal muscle regeneration across the mouse lifespan.
Methods
Mouse muscle injury and single-cell isolation. The Cornell University Institutional Animal Care and Use Committee (IACUC) approved all animal protocols (approval # 2014-0085), and experiments were performed in compliance with its institutional guidelines. Mice were maintained at 70-73°F on a 14/10-h light/dark cycle with humidity mainly at 40%. Muscle injury was induced in young (4-7 months-old [mo]), old (20 mo), and geriatric (26 mo) C57BL/6J mice (Jackson Laboratory # 000664; NIA Aged Rodent Colonies) by injecting both tibialis anterior (TA) muscles with 10 µl of notexin (10 µg/ml; Latoxan, France). The mice were sacrificed, and TA muscles were collected at 0, 1, 2, 3.5, 5, and 7 days post-injury (dpi) with n = 3-4 biological replicates per sample. Each TA was processed independently to generate single-cell suspensions. At each time point, the young and old samples are biological replicates of TA muscles from distinct mice, and the geriatric samples are biological replicates of two TA muscles from each of the two mice. A mixture of male and female mice was used. 
See Supplemental Table 1 for additional details. Muscles were digested with 8 mg/ml Collagenase D (Roche, Basel, Switzerland) and 10 U/ml Dispase II (Roche, Basel, Switzerland) and then manually dissociated to generate cell suspensions. Myofiber debris was removed by filtering the cell suspensions through a 100 µm and then a 40 µm filter (Corning Cellgro # 431752 and # 431750). After filtration, erythrocytes were removed by incubating the cell suspension in an erythrocyte lysis buffer (IBI Scientific # 89135-030). Single-cell RNA-sequencing library preparation. After digestion, the single-cell suspensions were washed and resuspended in 0.04% BSA in PBS at a concentration of 10^6 cells/ml. A hemocytometer was used to manually count the cells to determine the concentration of the suspension. Single-cell RNA-sequencing libraries were prepared using the Chromium Single Cell 3’ reagent kit v3 (10x Genomics, Pleasanton, CA) following the manufacturer’s protocol (10x Genomics: Resolving Biology to Advance Human Health, 2020). Cells were diluted into the Chromium Single Cell A Chip to yield a recovery of 6,000 single-cell transcriptomes with <5% doublet rate. Libraries were sequenced on the NextSeq 500 (Illumina, San Diego, CA) (Illumina | Sequencing and array-based solutions for genetic research, 2020). The sequencing data were aligned to the mouse reference genome (mm10) using CellRanger v5.0.0 (10x Genomics) (10x Genomics: Resolving Biology to Advance Human Health, 2020). Preprocessing single-cell RNA-sequencing data. Starting from the gene expression matrix, the downstream analysis was carried out in R (v3.6.1). First, the ambient RNA signal was removed using the default SoupX (v1.4.5) workflow (autoEstCont and adjustCounts; github.com/constantAmateur/SoupX) (Young and Behjati, 2020). 
Samples were then preprocessed using the standard Seurat (v3.2.3) workflow (NormalizeData, ScaleData, FindVariableFeatures, RunPCA, FindNeighbors, FindClusters, and RunUMAP; github.com/satijalab/seurat) (Stuart et al., 2019). Cells with fewer than 200 genes, with fewer than 750 UMIs, and more than 25% of unique transcripts derived from mitochondrial genes were removed. After preprocessing, DoubletFinder (v2.0.3) was used to identify putative doublets in each dataset (McGinnis, Murrow, and Gartner, 2019). The estimated doublet rate was 5% according to the 10x Chromium handbook. The putative doublets were removed from each dataset. Next, the datasets were merged and then batch-corrected with Harmony (github.com/immunogenomics/harmony) (v1.0) (Korsunsky et al., 2019). Seurat was then used to process the integrated data. Dimensions accounting for 95% of the total variance were used to generate SNN graphs (FindNeighbors) and SNN clustering was performed (FindClusters). A clustering resolution of 0.8 was used, resulting in 24 initial clusters. Cell type annotation in single-cell RNA-sequencing data. Cell types were determined by expression of canonical genes. Each of the 24 initial clusters received a unique cell type annotation. The nine myeloid clusters were challenging to distinguish from one another, so these clusters were subset out (Subset) and re-clustered using a resolution of 0.5 (FindNeighbors, FindClusters), resulting in 15 initial clusters. More specific myeloid cell type annotations were assigned based on the expression of canonical myeloid genes. This did not help to clarify the monocyte and macrophage annotations, but it did help to identify more specific dendritic cell and T cell subtypes. These more specific annotations were transferred from the myeloid subset back to the complete integrated object based on the cell barcode. Analysis of cell type dynamics. 
We generated a table with the number of cells from each sample (n = 65) in each cell type annotation (n = 29). We removed the erythrocytes from this analysis because they are not a native cell type in skeletal muscle. Next, for each sample, we calculated the percent of cells in each cell type annotation. The mean and standard deviation were calculated from each age and time point for every cell type. The solid line is the mean percentage of the given cell type, the ribbon is the standard deviation around the mean, and the points are the values from individual replicates. We evaluated whether there was a significant difference in the cell type dynamics over all six time points using non-linear modeling. The dynamics for each cell type were fit to a non-linear equation (e.g., quadratic, cubic, quartic), both independent of and dependent on age. The type of equation used for each cell type was selected based on the confidence interval and significance (p < 0.05) of the leading coefficient. If the leading coefficient was significantly different from zero, it was concluded that the leading coefficient was needed. If the leading coefficient was not significantly different from zero, it was concluded that the leading coefficient was not needed, and the degree of the equation was reduced by one. No modeling equation went below the second degree. The null hypothesis predicted that the coefficients of the non-linear equation were the same across the age groups, while the alternative hypothesis predicted that the coefficients of the non-linear equation were different across the age groups. We conducted a One-Way ANOVA to see if the alternative hypothesis fit the data significantly better than the null hypothesis, and we used FDR as the multiple comparison test correction (using the ANOVA and p.adjust (method = "fdr") functions in R, respectively). T cell exhaustion scoring. 
We grouped the three T cell populations (this includes Cd3e+ cycling and non-cycling T cells and Cd4+ T cells) and z-scored all genes. The T cell exhaustion score was calculated using a transfer-learning method developed by Cherry et al 2023 and a T cell exhaustion gene list from Bengsch et al 2018 (Bengsch et al., 2018; Cherry et al., 2023). The Mann-Whitney U-test was performed on the T cell exhaustion score between ages. Senescence scoring. We tested two senescence-scoring methods along with fourteen senescence gene lists (Supplemental Table 2) to identify senescent-like cells within the scRNA-seq dataset. The Two-way Senescence Score (Sen Score) was calculated using a transfer-learning method developed by Cherry et al 2023 (Cherry et al., 2023). With this
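As an illustration of the cell type dynamics analysis described in the Methods above, the following is a minimal Python sketch (the original analysis was carried out in R) of the per-sample percentage calculation, the mean and standard deviation per age and time point, and a Benjamini-Hochberg adjustment, which is the procedure that R's p.adjust applies for method = "fdr". The function names, the input layout, and the choice to compute percentages after dropping erythrocytes are assumptions made for this example, not the authors' code.

```python
from collections import defaultdict
from statistics import mean, stdev


def cell_type_percentages(counts, exclude=("Erythrocytes",)):
    """counts: {sample: {cell_type: n_cells}} -> {sample: {cell_type: percent}}.

    Assumption: excluded cell types are dropped before the totals are taken.
    Cell types absent from a sample should be listed with a count of 0.
    """
    percentages = {}
    for sample, by_type in counts.items():
        kept = {ct: n for ct, n in by_type.items() if ct not in exclude}
        total = sum(kept.values())
        percentages[sample] = {ct: 100.0 * n / total for ct, n in kept.items()}
    return percentages


def summarize(percentages, groups):
    """groups: {sample: (age, dpi)} -> {(age, dpi, cell_type): (mean, sd)}."""
    values = defaultdict(list)
    for sample, by_type in percentages.items():
        for cell_type, pct in by_type.items():
            values[groups[sample] + (cell_type,)].append(pct)
    # The sample standard deviation is undefined for a single replicate.
    return {
        key: (mean(v), stdev(v) if len(v) > 1 else 0.0)
        for key, v in values.items()
    }


def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    # Walk from the largest p-value down, keeping a running minimum.
    order = sorted(range(m), key=lambda i: pvalues[i], reverse=True)
    adjusted = [0.0] * m
    running_min = 1.0
    for position, i in enumerate(order):
        rank = m - position  # 1-based rank in ascending order
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted
```

The polynomial model selection and the nested-model comparison are not reproduced here; the adjusted p-values would be computed from one p-value per cell type coming out of that comparison.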
https://spdx.org/licenses/CC0-1.0.html
Understanding the factors that influence the biological response to inflammation is crucial, due to its involvement in physiological and pathological processes, including tissue repair/healing, cancer, infections, and autoimmune diseases. We have previously demonstrated that in vivo stretching can reduce inflammation and increase local pro-resolving lipid mediators in rats, suggesting a direct mechanical effect on inflammation resolution. Here, we aimed to explore further the effects of stretching at the cellular/molecular level in a mouse subcutaneous carrageenan-inflammation model. Stretching for 10 minutes twice a day reduced inflammation, increased the production of pro-resolving mediator pathway intermediate 17-HDHA at 48h post carrageenan injection, and decreased both pro-resolving and pro-inflammatory mediators (e.g., PGE2 and PGD2) at 96h. ScRNAseq analysis of inflammatory lesions at 96h showed that stretching increased the expression of both pro-inflammatory (Nos2) and pro-resolution (Arg1) genes in M1 and M2 macrophages at 96 hours. An intercellular communication analysis predicted specific ligand-receptor interactions orchestrated by neutrophils and M2a macrophages, suggesting a continuous neutrophil presence recruiting immune cells such as activated macrophages to contain the antigen while promoting resolution and preserving tissue homeostasis. Methods All ultrasound data acquisition and measurements were performed by investigators blinded to intervention condition. Ultrasound images of the back were acquired under isoflurane anesthesia. A high-frequency ultrasound scanner (Vevo 2100, Fujifilm VisualSonics, Toronto, Canada) in B mode with a 21 MHz transducer (MS 250) was used for optimal spatial resolution. A conductive gel was centrifuged for 5 minutes to remove air bubbles and spread over the skin. 
The transducer was stabilized with a clamp and mounted into an articulated arm to control the distance and the angle between the transducer and the skin surface. The transducer was oriented transversal or sagittal, perpendicular to the skin of the back, and centered on the lesion area. Total lesion area was calculated by averaging the lesion area measured at transversal and sagittal positions. Flow cytometry Inflammatory lesions were excised and minced in 5% FBS-DMEM using a scalpel, then the suspension was filtered through a 70 µm filter. Isolated cells were counted using an automated cell counter TC20 (Bio-Rad, CA). For surface receptors, cells at 1 × 10^6/mL were stained with a mix of mouse monoclonal antibodies. To detect neutrophils (N=92) we used the antibodies APC anti-mouse Ly-6/Ly-6c(Gr1) and FITC anti-mouse CD45; for macrophage populations (N=32) we used the following combination of antibodies: APC cy7 anti-mouse CD45, APC anti-mouse F4/80, FITC anti-mouse Nos2 (iNOS), PE anti-mouse CD206. Stained cells were examined using a FACSCanto II Flow Cytometer (BD Biosciences, San Jose, CA) with FlowJo single cell analysis software. Single cell and bulk library preparation and sequencing Single cell library preparation was performed according to the manufacturer’s instructions for the 10× Chromium single cell kit (10x Genomics). The libraries were then pooled and sequenced on a NextSeq 2000 sequencer (Illumina). Single cell RNA seq data processing and quality control Read processing was performed using the 10x Genomics workflow (Zheng et al. 2017). Briefly, the Cell Ranger Single Cell Software Suite (v3.0.1) was used for demultiplexing, barcode assignment, and unique molecular identifier (UMI) quantification (http://software.10xgenomics.com/single cell/overview/welcome). The reads were aligned to a custom mm10 reference genome (Genome Reference Consortium Mouse Build 38) extended with additional annotation for several frequently used mouse transgenes. 
Both lanes per sample were merged using the ‘cellranger mkfastq’ function and processed using the ‘cellranger count’ function. In total, the Cell Ranger software detected 7,921 cells per sample, sequenced at 23,326 reads and identifying 1,611 genes derived from 4,375 UMIs per cell on average across all samples. The following metrics were used to flag poor quality cells: number of genes detected, total number of UMIs, and percentage of molecules mapped to mitochondrial genes. Within the Seurat workflow, low quality and artifact cells were excluded by removing any cells that expressed fewer than 200 genes, and genes expressed in fewer than 3 cells were also removed. Gene expression matrices were transformed for better interpretability using the Seurat function ‘NormalizeData’. A total of 51,943 cells were included in the subsequent clustering and pseudo time analyses. For cross-condition data integration and batch correction, ‘FindIntegrationAnchors’ and ‘IntegrateData’ were applied to data in Seurat, following the data integration vignette (https://satijalab.org/seurat/archive/v3.2/immune_alignment.html). Targeted LC-MS/MS Supernatants from the inflammatory lesions were placed in ice cold methanol containing deuterated internal standards (d8-5S-hydroxyeicosatetraenoic acid (5-HETE), d4-leukotriene B4 (LTB4), d4-prostaglandin E2 (PGE2) and d5-lipoxin A4 (LXA4); 500 pg each) and homogenized using a PTFE Dounce (Kimble Chase). Proteins were allowed to precipitate (4 °C), and lipid mediators were extracted using C18 solid-phase cartridges as described before (Dalli et al., 2018). Measurement of lipid mediators was carried out by liquid chromatography-tandem mass spectrometry (LC-MS/MS) using a QTrap 5500 (ABSciex, Framingham, MA) equipped with a Shimadzu LC-20AD HPLC and a Shimadzu SIL-20AC autoinjector (Shimadzu, Kyoto, Japan). 
An Agilent Eclipse Plus C18 column (100 mm × 4.6 mm × 1.8 µm) maintained at 50 °C was used with a gradient of methanol/water/acetic acid of 55:45:0.01 (v/v/v) to 100:0:0.01 at 0.4 ml/min flow rate. Multiple reaction monitoring (MRM) transitions were used to identify and quantify lipid mediators in samples, as compared with retention times of authentic standards run in parallel. Quantification was achieved using calibration curves constructed with synthetic standards for each mediator, after normalization to extraction recovery based on internal standards and followed by normalization to tissue weight.
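The quantification steps named above (calibration curve, correction for extraction recovery, normalization to tissue weight) can be sketched in Python as follows. This is a hedged illustration only: the linear least-squares calibration, the function names, and the units are assumptions made for the example, and the source does not specify the form of the calibration curve or the exact recovery calculation.

```python
def fit_calibration(amounts_pg, peak_areas):
    """Least-squares line peak_area = slope * amount + intercept.

    Assumption: the calibration built from synthetic standards is linear.
    """
    n = len(amounts_pg)
    mean_x = sum(amounts_pg) / n
    mean_y = sum(peak_areas) / n
    sxx = sum((x - mean_x) ** 2 for x in amounts_pg)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(amounts_pg, peak_areas))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def quantify(peak_area, calibration, standard_recovered_pg, standard_spiked_pg, tissue_mg):
    """Return the mediator amount in pg per mg of tissue.

    The amount read off the calibration line is divided by the fraction of
    deuterated internal standard that was recovered, then by tissue weight.
    """
    slope, intercept = calibration
    amount_pg = (peak_area - intercept) / slope
    recovery = standard_recovered_pg / standard_spiked_pg
    return amount_pg / recovery / tissue_mg
```

For example, with a calibration of 10 area units per pg, a peak area of 500 corresponds to 50 pg; if 400 pg of a 500 pg internal standard spike were recovered from a 25 mg sample, the sketch returns 2.5 pg/mg.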
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
Rapid advances in single-cell assays have outpaced methods for analysis of those data types. Different single-cell assays show extensive variation in sensitivity and signal to noise levels. In particular, scATAC-seq generates extremely sparse and noisy datasets. Existing methods developed to analyze this data require cells amenable to pseudo-time analysis or require datasets with drastically different cell-types. We describe a novel approach using self-organizing maps (SOM) to link scATAC-seq regions with scRNA-seq genes that overcomes these challenges and can generate draft regulatory networks. Our SOMatic package generates chromatin and gene expression SOMs separately and combines them using a linking function. We applied SOMatic on a mouse pre-B cell differentiation time-course using controlled Ikaros over-expression to recover gene ontology enrichments, identify motifs in genomic regions showing similar single-cell profiles, and generate a gene regulatory network that both recovers known interactions and predicts new Ikaros targets during the differentiation process. The ability of linked SOMs to detect emergent properties from multiple types of highly-dimensional genomic data with very different signal properties opens new avenues for integrative analysis of heterogeneous data.
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
Introduction: Bone metastasis (BoM) occurs when cancer cells spread from their primary sites to a bone. Currently, the mechanism underlying this metastasis process remains unclear. Methods: In this project, through an integrated analysis of bulk-sequencing and single-cell RNA transcriptomic data, we explored the BoM-related features in tumor microenvironments of different tumors. Results: We first identified 34 up-regulated genes during the BoM process in breast cancer, and further explored their expression status among different components in the tumor microenvironment (TME) of BoM samples. Enriched EMP1+ fibroblasts were found in BoM samples, and a COL3A1-ADGRG1 communication between these fibroblasts and cancer cells was identified which might facilitate the BoM process. Moreover, a significant correlation between EMP1 and COL3A1 was identified in these fibroblasts, confirming the potential connection of these genes during the BoM process. Furthermore, the existence of these EMP1+/COL3A1+ fibroblasts was also verified in prostate cancer and renal cancer BoM samples, suggesting the importance of these fibroblasts from a pan-cancer perspective. Discussion: This study is the first attempt to investigate the relationship between fibroblasts and BoM process across multi-tumor TMEs. Our findings contribute another perspective in the exploration of BoM mechanism while providing some potential targets for future treatments of tumor metastasis.
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
Single-cell RNA sequencing (scRNA-seq) allows the identification, characterization, and quantification of cell types in a tissue. When focused on B and T cells of the adaptive immune system, scRNA-seq carries the potential to track the clonal lineage of each analyzed cell through the unique rearranged sequence of its antigen receptor (BCR or TCR, respectively) and link it to the functional state inferred from transcriptome analysis. Here we introduce FB5P-seq, a FACS-based 5′-end scRNA-seq method for cost-effective, integrative analysis of transcriptome and paired BCR or TCR repertoire in phenotypically defined B and T cell subsets. We describe in detail the experimental workflow and provide a robust bioinformatics pipeline for computing gene count matrices and reconstructing repertoire sequences from FB5P-seq data. We further present two applications of FB5P-seq for the analysis of human tonsil B cell subsets and peripheral blood antigen-specific CD4 T cells. We believe that our novel integrative scRNA-seq method will be a valuable option to study rare adaptive immune cell subsets in immunology research.
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
Supplementary Material 1. Supplementary Fig. 1. Quality control and baseline data of each enrolled sample. (A). Principal component analysis before and after processing in Harmony. (B). t-SNE projections among different groups. (C). Quality control of the scRNA-seq data. (D). t-SNE projections from all enrolled samples. t-SNE in the control group (left), STS Pre group (middle), and STS Post group (right). (E). Boxplot comparing the proportion of plasmacytoid dendritic cells (pDCs) across the groups. The STS Pre vs. CT and STS Post vs. CT sample comparisons show exact P values determined by the Wilcoxon rank-sum test. Pre- vs. post-STS scores were compared via the paired two-sample Wilcoxon signed-rank test. (F). The baseline information of patients and healthy controls enrolled in the scRNA-seq cohort. Supplementary Fig. 2. Focused analysis of T cells and pDCs. (A). UMAP embedding of T lymphocytes from all profiled samples in different groups. UMAP in the control group (left), STS Pre group (middle), and STS Post group (right). (B). Boxplot comparing the proportions of CRIP + CD4 + T cells, NK T cells, and TRGC2 + CD8 + T cells across the groups. The exact P values determined by the Wilcoxon rank-sum test are shown for the STS Pre vs. CT and STS Post vs. CT comparisons. Differences between STS Pre and STS Post were evaluated by the paired two-sample Wilcoxon signed-rank test. (C). Enriched pathways from Gene Ontology Biological Process Enrichment Analysis for TRGC + CD8 + T cells. (D). Enriched pathways identified by Gene Ontology Biological Process enrichment analysis in NEAT + T cells. (E). Heatmap representing the enrichment of MSigDB Hallmark gene sets for each T lymphocyte subtype across groups. (F). Pseudotime trajectory analysis of CD4 + T lymphocyte subtypes. (G). Heatmap representing DEGs within pDCs across groups. (H). Heatmap representing the enrichment of MSigDB Hallmark gene sets for each group within pDCs. Supplementary Fig. 3. 
Focused analysis of B cells and myeloid cells. (A). Heatmap representing the enrichment of MSigDB Hallmark gene sets for each cell type within B lymphocytes across groups. (B). Boxplot comparing the proportions of myeloid cells across the groups. The exact P values determined by the Wilcoxon rank-sum test are shown for the STS Pre vs. CT and STS Post vs. CT comparisons. Differences between STS Pre and STS Post were evaluated by the paired two-sample Wilcoxon signed-rank test. (C). Enriched pathways from Gene Ontology Biological Process Enrichment Analysis for CD16 + monocytes. (D). Heatmap representing the enrichment of MSigDB Hallmark gene sets in each monocyte cell type across groups. Supplementary Fig. 4. The characteristics of IFN-related genes involved in pathogenesis. (A). Heatmap showing the differentially expressed genes (DEGs) in classical dendritic cells (cDCs) across groups. (B). Heatmap showing the genes differentially expressed in mast cells across groups. (C). Heatmap representing the enrichment of MSigDB Hallmark gene sets in the mast cells across groups. (D). Heatmap representing the enrichment of MSigDB Hallmark gene sets in the cDCs across groups. (E). Heatmap representing the enrichment of MSigDB Hallmark gene sets for each cell type within natural killer (NK) cells across groups. (F). Venn plot of the overlapping genes downregulated in the STS Pre group among B lymphocytes, T lymphocytes, monocytes, NK cells, cDCs and pDCs. (G). Receiver operating characteristic (ROC) curve and area under the curve (AUC) of overlapping genes expressed at lower levels before treatment in all cell types. (H). t-SNE analysis of CXCR4 expression in the three groups. (I). The relative expression of CXCR4 across groups was determined through qPCR. Statistical significance is denoted as P 0.05. Supplementary Fig. 7. Supplementary analysis from an extra INS cohort (GEO233277) also validates the activation of IFN. (A). 
UMAP dimensionality reduction embedding from GEO datasets. (B). Heatmap showing the expression levels of the markers across each cell type using scRNAseq from the GEO datasets. The color intensity indicates the marker of interest. (C). Violin plot of the ISGs across groups using scRNAseq from the GEO datasets. Significance was evaluated with the Wilcoxon rank-sum test. (D). ISG scores among cell subtypes across groups using scRNAseq from the GEO datasets. Significance was evaluated with the Wilcoxon rank-sum test. Statistical significance is denoted as P
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
Background: Cuproptosis is increasingly recognized as an essential factor in the pathological process of Alzheimer’s disease (AD). However, the specific role of cuproptosis-related genes in AD remains poorly understood. Methods: Our first step was to obtain gene expression data from the GEO database and identify differentially expressed cuproptosis-associated genes (DECAGs) in AD. GO, KEGG, and GSEA analyses were then conducted on these genes. Subsequently, we attempted to classify AD patients by unsupervised clustering. Then, four machine-learning models were used to screen hub-genes from the DECAGs. We also explored the immune features of these genes and predicted target drugs. Molecular docking analysis was then performed on the predicted drugs and their corresponding hub-gene related proteins. Candidate markers were then validated by single-cell analysis and intercellular communication was investigated in a GEO scRNA-seq dataset. Lastly, we examined the expression levels of the hub-genes in peripheral blood cells using real-time quantitative PCR. Results: Nineteen DECAGs were found in AD and the key biological processes and molecular functions associated with AD were further determined. Two subtypes of peripheral blood cells showed significant alterations in AD: Cluster1 and Cluster2. Five hub-genes including FDX1, GLS, PDK1, MAP2K1, and SOD1 were then screened out from the machine-learning study. All five hub-genes were significantly correlated with various immunocytes. We discovered compounds targeting hub-gene related proteins and predicted multiple strong hydrogen bonding interactions between the selected predicted drugs and the target proteins by molecular docking analysis. Subsequently, in the single-cell analysis of AD peripheral blood, all hub-genes except SOD1 were found to be up-regulated in B cells, NK cells, and CD4+ T cells, possibly acting on the MIF pathway. 
Finally, using qPCR, we found that PDK1 expression was markedly upregulated in AD patients, while FDX1 and GLS were significantly decreased. Conclusion: This study examined changes in intercellular communication between immune cells in the peripheral blood and identified five novel feature genes associated with cuproptosis in AD patients. These results facilitated a deeper understanding of the molecular mechanisms of AD and suggested novel therapeutic targets.
Attribution 4.0 (CC BY 4.0)https://creativecommons.org/licenses/by/4.0/
These are processed Seurat objects for the biological datasets in Localized Marker Detector (https://github.com/KlugerLab/LocalizedMarkerDetector). Tabula Muris bone marrow dataset (FACS-based and Droplet-based): We used publicly available scRNA-seq mouse bone marrow datasets (FACS and Droplet-based) from the Tabula Muris Consortium, which were already pre-processed and annotated according to their workflow. In addition, we applied ALRA imputation to generate a denoised assay alra and added several cell annotations: (1) Cell cycle annotation using CellCycleScoring with the updated 2019 cell cycle gene set; (2) Module Activity Scores for the gene modules listed in our paper. Mouse embryo skin dataset: We separated dermal cell populations from newly collected mouse embryo skin samples (aligned to the mouse genome mm10 using CellRanger (v.6.1.2)). Cells from the wildtype and SmoM2YFP mutant (SmoM2) for two consecutive days (embryonic day 13.5 and 14.5) were pooled for analysis. To avoid batch effects from pooling or integrating, we analyzed each condition separately: E13.5 SmoM2, E13.5 WT, E14.5 SmoM2, and E14.5 WT. For each condition, we performed standard normalization, selected the top 2,000 highly variable genes, and scaled the data using the Seurat v4 R package. We then applied PCA, retaining the number of PCs determined by the elbow plot: E13.5 SmoM2 (14 PCs), E13.5 WT (12 PCs), E14.5 SmoM2 (12 PCs), and E14.5 WT (11 PCs).
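To make the per-condition preprocessing above more concrete, here is a small Python sketch of two of the steps: log-normalization and selection of highly variable genes. It assumes that "standard normalization" refers to Seurat's default LogNormalize method (counts divided by the cell total, multiplied by 10,000, then log1p-transformed). The gene selection shown ranks genes by plain variance of the normalized values, which is a simplified stand-in and not Seurat's variance-stabilizing selection; all function names are hypothetical.

```python
import math
from statistics import pvariance


def log_normalize(counts, scale_factor=10_000):
    """counts: list of cells, each a list of raw gene counts (cells x genes).

    Assumes every cell has at least one count (empty cells removed by QC).
    """
    normalized = []
    for cell in counts:
        total = sum(cell)
        normalized.append([math.log1p(c / total * scale_factor) for c in cell])
    return normalized


def top_variable_genes(normalized, n_top):
    """Return the indices of the n_top genes with the largest variance.

    Simplification: ranks by raw variance of the normalized values.
    """
    n_genes = len(normalized[0])
    variances = [
        pvariance([cell[g] for cell in normalized]) for g in range(n_genes)
    ]
    ranked = sorted(range(n_genes), key=lambda g: variances[g], reverse=True)
    return sorted(ranked[:n_top])
```

In the dataset described here, n_top would be 2,000 and each of the four conditions would be processed on its own, consistent with the decision to avoid pooling or integration.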