Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Dual RNA-sequencing enables simultaneous profiling of protein-coding and non-coding transcripts from two interacting organisms, an essential capability when physical separation is difficult, such as in host-parasite or cross-kingdom interactions (e.g., plant-plant or host-pathogen systems). By allowing in silico separation of mixed reads, dual RNA-seq reveals the transcriptomic dynamics of both partners during interaction. However, existing analysis workflows often require programming expertise, limiting accessibility. We present inDAGO, a free, open-source, cross-platform graphical user interface designed for biologists without coding skills. inDAGO supports both bulk and dual RNA sequencing, with dual RNA sequencing further accommodating both sequential and combined approaches. The interface guides users through key analysis steps, including quality control, read alignment, read summarization, exploratory data analysis, and identification of differentially expressed genes, while generating intermediate outputs and publication-ready plots. Optimized for speed and efficiency, inDAGO performs complete analyses on a standard laptop (16 GB RAM) without requiring high-performance computing. We validated inDAGO using diverse real datasets to demonstrate its reliability and usability. inDAGO, available on CRAN (https://cran.r-project.org/web/packages/inDAGO/) and GitHub (https://github.com/inDAGOverse/inDAGO), lowers the technical barrier to dual RNA-seq by enabling robust, reproducible analyses, even for users without coding experience.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Data normalization is a crucial step in gene expression analysis because it ensures the validity of downstream analyses. Although many metrics have been designed to evaluate existing normalization methods, different metrics, or the same metric applied to different datasets, yield inconsistent results, particularly for single-cell RNA sequencing (scRNA-seq) data. In the worst cases, a method evaluated as the best by one metric is evaluated as the poorest by another, or a method evaluated as the best on one dataset is evaluated as the poorest on another. This raises an open question: what principles should guide the evaluation of normalization methods? In this study, we propose the principle that a normalization method evaluated as the best by one metric should also be evaluated as the best by another metric (the consistency of metrics), and that a method evaluated as the best using scRNA-seq data should also be evaluated as the best using bulk RNA-seq data or microarray data (the consistency of datasets). We then designed a new metric named Area Under normalized CV threshold Curve (AUCVC) and applied it with another metric, mSCC, to evaluate 14 commonly used normalization methods using both scRNA-seq data and bulk RNA-seq data, satisfying the consistency of metrics and the consistency of datasets. Our findings pave the way for future studies on the normalization of gene expression data and its evaluation. The raw gene expression data, normalization methods, and evaluation metrics used in this study are included in an R package named NormExpression. NormExpression provides a framework and a fast, simple way for researchers to select the best method for normalizing their gene expression data, based on the evaluation of different methods (particularly data-driven methods or their own methods) under the principles of the consistency of metrics and the consistency of datasets.
Immune checkpoint blockade has revolutionized cancer therapy. In particular, inhibition of programmed cell death protein 1 (PD-1) is effective for the treatment of metastatic melanoma and other cancers. Despite a dramatic increase in progression-free survival, a large proportion of patients do not show a durable response. Therefore, predictive biomarkers of clinical response are urgently needed. Here, we employed high-dimensional single-cell mass cytometry and a bioinformatics pipeline for the in-depth characterization of the immune cell subsets in the peripheral blood of metastatic melanoma patients before and after anti-PD-1 immunotherapy. During therapy, we observed a clear treatment response to immunotherapy in the T cell compartment. However, prior to commencing therapy, a strong predictor of progression-free and overall survival in response to anti-PD-1 immunotherapy was the frequency of CD14+CD16-HLA-DRhi monocytes. We confirmed this by conventional flow cytometry in an independent validation cohort and propose it as a novel predictive biomarker for therapy decisions in the clinic. To determine whether there are cell-intrinsic changes in the monocyte signature, we performed RNA sequencing on sorted CD14+CD16-HLA-DRhi cells from healthy donors (HD), non-responders (NR) and responders (R) at baseline. Representative samples (n=4 each) of responders, non-responders and healthy donors were selected from archival samples stored in the dermatology biobank according to the same clinical criteria used in the discovery and validation cohorts for CyTOF and FACS analysis. CD14+CD16-HLA-DRhiLin- (CD3, CD4, CD19, CD45RO) monocytes were sorted from frozen PBMC from blood samples from HD, R and NR at baseline.
https://spdx.org/licenses/CC0-1.0.html
Arsenic exposure via drinking water is a serious environmental health concern. Epidemiological studies suggest a strong association between prenatal arsenic exposure and subsequent childhood respiratory infections, as well as morbidity from respiratory diseases in adulthood, long after systemic clearance of arsenic. We investigated the impact of exclusive prenatal arsenic exposure on the inflammatory immune response and respiratory health after an adult influenza A (IAV) lung infection. C57BL/6J mice were exposed to 100 ppb sodium arsenite in utero, and subsequently infected with IAV (H1N1) after maturation to adulthood. Assessment of lung tissue and bronchoalveolar lavage fluid (BALF) at various time points post IAV infection reveals greater lung damage and inflammation in arsenic exposed mice versus control mice. Single-cell RNA sequencing analysis of immune cells harvested from IAV infected lungs suggests that the enhanced inflammatory response is mediated by dysregulation of innate immune function of monocyte derived macrophages, neutrophils, NK cells, and alveolar macrophages. Our results suggest that prenatal arsenic exposure results in lasting effects on the adult host innate immune response to IAV infection, long after exposure to arsenic, leading to greater immunopathology. This study provides the first direct evidence that exclusive prenatal exposure to arsenic in drinking water causes predisposition to a hyperinflammatory response to IAV infection in adult mice, which is associated with significant lung damage.
Methods
Whole lung homogenate preparation for single cell RNA sequencing (scRNA-seq)
Lungs were perfused with PBS via the right ventricle, harvested, and mechanically dissociated prior to straining through 70- and 30-µm filters to obtain a single-cell suspension. Dead cells were removed (annexin V EasySep kit, StemCell Technologies, Vancouver, Canada), and samples were enriched for cells of hematopoietic origin by magnetic separation using anti-CD45-conjugated microbeads (Miltenyi, Auburn, CA). Single-cell suspensions of 6 samples were loaded on a Chromium Single Cell system (10X Genomics) to generate barcoded single-cell gel beads in emulsion, and scRNA-seq libraries were prepared using Single Cell 3’ Version 2 chemistry. Libraries were multiplexed and sequenced on 4 lanes of a NextSeq 500 sequencer (Illumina) with 3 sequencing runs. Demultiplexing and barcode processing of raw sequencing data were conducted using Cell Ranger v. 3.0.1 (10X Genomics; Dartmouth Genomics Shared Resource Core). Reads were aligned to mouse (GRCm38) and influenza A virus (A/PR8/34, genome build GCF_000865725.1) genomes to generate unique molecular identifier (UMI) count matrices. Gene expression data have been deposited in the NCBI GEO database and are available at accession # GSE142047.
Preprocessing of single cell RNA sequencing (scRNA-seq) data
Count matrices produced using Cell Ranger were analyzed in the R statistical working environment (version 3.6.1). Preliminary visualization and quality analysis were conducted using scran (v. 1.14.3, Lun et al., 2016) and scater (v. 1.14.1, McCarthy et al., 2017) to identify thresholds for cell quality and feature filtering. Sample matrices were imported into Seurat (v. 3.1.1, Stuart et al., 2019) and the percentages of mitochondrial, hemoglobin, and influenza A viral transcripts were calculated per cell. Cells with < 1000 or > 20,000 unique molecular identifiers (UMIs; low quality and doublets, respectively), fewer than 300 features (low quality), greater than 10% of reads mapped to mitochondrial genes (dying), or greater than 1% of reads mapped to hemoglobin genes (red blood cells) were filtered from further analysis. Total cells per sample after filtering ranged from 1895 to 2482; no significant difference in the number of cells was observed in arsenic vs. control. Data were then normalized using SCTransform (Hafemeister et al., 2019) and variable features were identified for each sample. Integration anchors between samples were identified using canonical correlation analysis (CCA) and mutual nearest neighbors (MNNs), as implemented in Seurat v3 (Stuart et al., 2019), and used to integrate samples into a shared space for further comparison. This process enables identification of shared populations of cells between samples, even in the presence of technical or biological differences, while also allowing for non-overlapping populations that are unique to individual samples.
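The cell-level thresholds listed above can be sketched as a simple predicate. This is only an illustration in plain Python, not the authors' Seurat code: the `CellQC` record, the `passes_qc` helper and the example cells are hypothetical, and only the numeric cutoffs are taken from the text.

```python
from dataclasses import dataclass


@dataclass
class CellQC:
    """Per-cell quality metrics (hypothetical record for illustration)."""
    barcode: str
    n_umi: int             # total UMIs detected in the cell
    n_features: int        # number of genes detected in the cell
    pct_mito: float        # percent of reads mapped to mitochondrial genes
    pct_hemoglobin: float  # percent of reads mapped to hemoglobin genes


def passes_qc(cell: CellQC) -> bool:
    """Apply the thresholds quoted in the text; True means the cell is kept."""
    if cell.n_umi < 1000 or cell.n_umi > 20_000:  # low quality / doublets
        return False
    if cell.n_features < 300:                     # low quality
        return False
    if cell.pct_mito > 10.0:                      # dying cells
        return False
    if cell.pct_hemoglobin > 1.0:                 # red blood cells
        return False
    return True


# Made-up cells, one per failure mode plus one that passes.
cells = [
    CellQC("AAAC", 5200, 1800, 3.1, 0.0),    # kept
    CellQC("AAAG", 800, 400, 2.0, 0.0),      # too few UMIs
    CellQC("AACT", 25_000, 4000, 4.0, 0.0),  # too many UMIs
    CellQC("AAGT", 3000, 250, 2.0, 0.0),     # too few features
    CellQC("ACGT", 3000, 900, 14.5, 0.0),    # high mitochondrial fraction
    CellQC("AGGT", 3000, 900, 2.0, 6.0),     # high hemoglobin fraction
]
kept = [c.barcode for c in cells if passes_qc(c)]
print(kept)
```

In a real analysis the same conditions would typically be applied to the metadata columns of the Seurat object rather than to hand-built records.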
Clustering and reference-based cell identity labeling of single immune cells from IAV-infected lung with scRNA-seq
Principal components were identified from the integrated dataset and were used for Uniform Manifold Approximation and Projection (UMAP) visualization of the data in two-dimensional space. A shared-nearest-neighbor (SNN) graph was constructed using default parameters, and clusters were identified using the SLM algorithm in Seurat at a range of resolutions (0.2-2). The first 30 principal components were used to identify 22 cell clusters ranging in size from 25 to 2310 cells. Gene markers for clusters were identified with the findMarkers function in scran. To label individual cells with cell type identities, we used the SingleR package (v. 3.1.1) to compare gene expression profiles of individual cells with expression data from curated, FACS-sorted leukocyte samples in the ImmGen compendium (Aran et al., 2019; Heng et al., 2008). We manually updated the ImmGen reference annotation with 263 sample group labels for fine-grain analysis and 25 CD45+ cell type identities based on markers used to sort ImmGen samples (Guilliams et al., 2014). The reference annotation is provided in Table S2. Cells that were not labeled confidently after label pruning were assigned “Unknown”.
Differential gene expression by immune cells
Differential gene expression within individual cell types was assessed by pooling raw count data from cells of each cell type on a per-sample basis to create a pseudo-bulk count table for each cell type. Differential expression analysis was only performed on cell types that were sufficiently represented (>10 cells) in each sample. In droplet-based scRNA-seq, ambient RNA from lysed cells is incorporated into droplets and can result in spurious detection of genes in cell types where they are not actually expressed. We therefore used a method developed by Young and Behjati (Young et al., 2018) to estimate the contribution of ambient RNA for each gene, and identified genes in each cell type that were estimated to be > 25% ambient-derived. These genes were excluded from analysis in a cell-type-specific manner. Genes expressed in fewer than 5 percent of cells were also excluded from analysis. Differential expression analysis was then performed in limma (limma-voom with quality weights) following a standard protocol for bulk RNA-seq (Law et al., 2014). Significant genes were identified using MA/QC criteria of P < .05, log2FC > 1.
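The pseudo-bulk construction described above (summing raw counts per cell type within each sample, and keeping only cell types with more than 10 cells in every sample) can be sketched as follows. This is a plain-Python illustration under assumptions: the `pseudobulk` function, the tuple layout of the input and the toy gene counts are hypothetical and are not the authors' code.

```python
from collections import defaultdict


def pseudobulk(cells, min_cells=10):
    """Sum raw counts per (cell type, sample).

    cells: iterable of (sample, cell_type, {gene: raw_count}) tuples.
    Returns {cell_type: {sample: {gene: summed_count}}}, restricted to
    cell types with more than `min_cells` cells in every sample.
    """
    samples = set()
    n_cells = defaultdict(lambda: defaultdict(int))
    sums = defaultdict(lambda: defaultdict(lambda: defaultdict(int)))
    for sample, cell_type, counts in cells:
        samples.add(sample)
        n_cells[cell_type][sample] += 1
        for gene, count in counts.items():
            sums[cell_type][sample][gene] += count

    table = {}
    for cell_type, per_sample in n_cells.items():
        # A cell type must be sufficiently represented in each sample.
        if all(per_sample.get(s, 0) > min_cells for s in samples):
            table[cell_type] = {s: dict(g) for s, g in sums[cell_type].items()}
    return table


# Toy input: 12 neutrophils and 4 NK cells per sample (made-up counts).
cells = []
for sample in ("control_1", "arsenic_1"):
    cells += [(sample, "Neutrophil", {"S100a8": 3, "Il1b": 1})] * 12
    cells += [(sample, "NK", {"Gzma": 2})] * 4

table = pseudobulk(cells)
print(sorted(table))                     # NK is dropped: only 4 cells per sample
print(table["Neutrophil"]["control_1"])
```

The resulting per-cell-type tables have one column per sample, which is the shape a bulk RNA-seq workflow such as limma-voom expects.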
Analysis of arsenic effect on immune cell gene expression by scRNA-seq.
Sample-wide effects of arsenic on gene expression were identified by pooling raw count data from all cells per sample to create a count table for pseudo-bulk gene expression analysis. Genes with fewer than 20 counts in any sample, or fewer than 60 total counts, were excluded from analysis. Differential expression analysis was performed using limma-voom as described above.
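A minimal sketch of the gene-level count filter follows. It assumes a literal reading of the text, in which a gene is dropped when at least one sample has fewer than 20 counts or when its total across samples is below 60; the source does not spell out the exact rule, and the `keep_gene` helper and the example counts are hypothetical.

```python
def keep_gene(counts_per_sample, min_per_sample=20, min_total=60):
    """Return True if the gene survives both count thresholds.

    Assumption: "fewer than 20 counts in any sample" is read as
    "at least one sample falls below 20".
    """
    return (min(counts_per_sample) >= min_per_sample
            and sum(counts_per_sample) >= min_total)


# Made-up pseudo-bulk counts for six samples.
pseudo_bulk = {
    "GeneA": [120, 95, 210, 180, 99, 143],  # well expressed everywhere
    "GeneB": [35, 0, 41, 28, 30, 33],       # one sample below 20
    "GeneC": [3, 2, 5, 1, 0, 4],            # low in every sample
}
kept_genes = [g for g, counts in pseudo_bulk.items() if keep_gene(counts)]
print(kept_genes)
```

Under this reading the total-count threshold only matters when there are fewer than three samples, which is one reason the interpretation should be checked against the deposited analysis scripts.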
Attribution-ShareAlike 4.0 (CC BY-SA 4.0) https://creativecommons.org/licenses/by-sa/4.0/
License information was derived automatically
This dataset contains files reconstructing single-cell data presented in 'Reference transcriptomics of porcine peripheral immune cells created through bulk and single-cell RNA sequencing' by Herrera-Uribe & Wiarda et al. 2021. Samples of peripheral blood mononuclear cells (PBMCs) were collected from seven pigs and processed for single-cell RNA sequencing (scRNA-seq) in order to provide a reference annotation of porcine immune cell transcriptomics at enhanced, single-cell resolution. Analysis of single-cell data allowed identification of 36 cell clusters that were further classified into 13 cell types, including monocytes, dendritic cells, B cells, antibody-secreting cells, numerous populations of T cells, NK cells, and erythrocytes. Files may be used to reconstruct the data as presented in the manuscript, allowing for individual query by other users. Scripts for original data analysis are available at https://github.com/USDA-FSEPRU/PorcinePBMCs_bulkRNAseq_scRNAseq. Raw data are available at https://www.ebi.ac.uk/ena/browser/view/PRJEB43826. Funding for this dataset was also provided by NRSP8: National Animal Genome Research Program (https://www.nimss.org/projects/view/mrp/outline/18464).

Resources in this dataset:

Resource Title: Herrera-Uribe & Wiarda et al. PBMCs - All Cells 10X Format.
File Name: PBMC7_AllCells.zip
Resource Description: Zipped folder containing PBMC counts matrix, gene names, and cell IDs. Files are as follows:
matrix of gene counts* (matrix.mtx.gz)
gene names (features.tsv.gz)
cell IDs (barcodes.tsv.gz)
*The ‘raw’ count matrix is actually gene counts obtained following ambient RNA removal. During ambient RNA removal, we specified to calculate non-integer count estimations, so most gene counts are actually non-integer values in this matrix but should still be treated as raw/unnormalized data that requires further normalization/transformation. Data can be read into R using the function Read10X().

Resource Title: Herrera-Uribe & Wiarda et al. PBMCs - All Cells Metadata.
File Name: PBMC7_AllCells_meta.csv
Resource Description: .csv file containing metadata for cells included in the final dataset. Metadata columns include:
nCount_RNA = the number of transcripts detected in a cell
nFeature_RNA = the number of genes detected in a cell
Loupe = cell barcodes; correspond to the cell IDs found in the .h5Seurat and 10X formatted objects for all cells
prcntMito = percent mitochondrial reads in a cell
Scrublet = doublet probability score assigned to a cell
seurat_clusters = cluster ID assigned to a cell
PaperIDs = sample ID for a cell
celltypes = cell type ID assigned to a cell

Resource Title: Herrera-Uribe & Wiarda et al. PBMCs - All Cells PCA Coordinates.
File Name: PBMC7_AllCells_PCAcoord.csv
Resource Description: .csv file containing first 100 PCA coordinates for cells.

Resource Title: Herrera-Uribe & Wiarda et al. PBMCs - All Cells t-SNE Coordinates.
File Name: PBMC7_AllCells_tSNEcoord.csv
Resource Description: .csv file containing t-SNE coordinates for all cells.

Resource Title: Herrera-Uribe & Wiarda et al. PBMCs - All Cells UMAP Coordinates.
File Name: PBMC7_AllCells_UMAPcoord.csv
Resource Description: .csv file containing UMAP coordinates for all cells.

Resource Title: Herrera-Uribe & Wiarda et al. PBMCs - CD4 T Cells t-SNE Coordinates.
File Name: PBMC7_CD4only_tSNEcoord.csv
Resource Description: .csv file containing t-SNE coordinates for only CD4 T cells (clusters 0, 3, 4, 28). A dataset of only CD4 T cells can be re-created from the PBMC7_AllCells.h5Seurat, and t-SNE coordinates used in publication can be re-assigned using this .csv file.

Resource Title: Herrera-Uribe & Wiarda et al. PBMCs - CD4 T Cells UMAP Coordinates.
File Name: PBMC7_CD4only_UMAPcoord.csv
Resource Description: .csv file containing UMAP coordinates for only CD4 T cells (clusters 0, 3, 4, 28). A dataset of only CD4 T cells can be re-created from the PBMC7_AllCells.h5Seurat, and UMAP coordinates used in publication can be re-assigned using this .csv file.

Resource Title: Herrera-Uribe & Wiarda et al. PBMCs - Gamma Delta T Cells UMAP Coordinates.
File Name: PBMC7_GDonly_UMAPcoord.csv
Resource Description: .csv file containing UMAP coordinates for only gamma delta T cells (clusters 6, 21, 24, 31). A dataset of only gamma delta T cells can be re-created from the PBMC7_AllCells.h5Seurat, and UMAP coordinates used in publication can be re-assigned using this .csv file.

Resource Title: Herrera-Uribe & Wiarda et al. PBMCs - Gamma Delta T Cells t-SNE Coordinates.
File Name: PBMC7_GDonly_tSNEcoord.csv
Resource Description: .csv file containing t-SNE coordinates for only gamma delta T cells (clusters 6, 21, 24, 31). A dataset of only gamma delta T cells can be re-created from the PBMC7_AllCells.h5Seurat, and t-SNE coordinates used in publication can be re-assigned using this .csv file.

Resource Title: Herrera-Uribe & Wiarda et al. PBMCs - Gene Annotation Information.
File Name: UnfilteredGeneInfo.txt
Resource Description: .txt file containing gene nomenclature information used to assign gene names in the dataset. 'Name' column corresponds to the name assigned to a feature in the dataset.

Resource Title: Herrera-Uribe & Wiarda et al. PBMCs - All Cells H5Seurat.
File Name: PBMC7.tar
Resource Description: .h5Seurat object of all cells in PBMC dataset. File needs to be untarred, then read into R using function LoadH5Seurat().
https://ega-archive.org/dacs/EGAC50000000822
The dataset contains five samples from P1, one pre-HSCT and four post-HSCT, one sample from P2 and 5-6 HD. RNAseq reads were processed using the LUMC BIOWDL RNAseq pipeline v5.0.0 (https://github.com/biowdl/RNA-seq). Genes with log2CPM≥1 in at least 10% of all samples were retained for downstream analysis. Differential gene expression analysis was performed using the dgeAnalysis R-Shiny application (https://github.com/LUMC/dgeAnalysis/tree/v1.4.4). ATACseq reads were processed using the LUMC BIOWDL Chipseq pipeline 1.0.0-dev (https://github.com/biowdl/ChIP-seq).
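The expression filter above (log2CPM ≥ 1 in at least 10% of samples) can be sketched in plain Python. This is an illustration under assumptions, not the pipeline's code: CPM is computed without a prior count, so log2CPM ≥ 1 is treated as CPM ≥ 2, the 10% requirement is rounded up to a whole number of samples, and the function name and toy counts are hypothetical.

```python
import math


def filter_genes_by_log2cpm(counts, min_log2cpm=1.0, min_fraction=0.10):
    """Keep genes reaching `min_log2cpm` in at least `min_fraction` of samples.

    counts: {gene: [raw count per sample]}, all lists the same length.
    """
    n_samples = len(next(iter(counts.values())))
    # Library size of each sample = sum of counts over all genes.
    lib_sizes = [sum(row[j] for row in counts.values()) for j in range(n_samples)]
    min_cpm = 2.0 ** min_log2cpm                     # log2(CPM) >= 1  <=>  CPM >= 2
    needed = math.ceil(min_fraction * n_samples)     # samples that must pass

    kept = []
    for gene, row in counts.items():
        n_pass = sum(
            1
            for j, c in enumerate(row)
            if lib_sizes[j] > 0 and c / lib_sizes[j] * 1e6 >= min_cpm
        )
        if n_pass >= needed:
            kept.append(gene)
    return kept


# Two toy samples, each with a library size of exactly 1,000,000 reads.
counts = {
    "A": [999_996, 999_998],
    "B": [3, 1],   # CPM 3 in the first sample, 1 in the second
    "C": [0, 0],
    "D": [1, 1],   # CPM 1 in both samples
}
retained = filter_genes_by_log2cpm(counts)
print(retained)
```

Tools such as edgeR add a small prior count before taking logarithms, so the exact set of retained genes in the original analysis may differ slightly from this simplified rule.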
Analysis of bulk RNA sequencing (RNA-Seq) data is a valuable tool to understand transcription at the genome scale. Targeted sequencing of RNA has emerged as a practical means of assessing the majority of the transcriptomic space with less reliance on large resources for consumables and bioinformatics. TempO-Seq is a templated, multiplexed RNA-Seq platform that interrogates a panel of sentinel genes representative of genome-wide transcription. Nuances of the technology require proper preprocessing of the data. Various methods have been proposed and compared for normalizing bulk RNA-Seq data, but there has been little to no investigation of how the methods perform on TempO-Seq data. We simulated count data in two groups (treated vs. untreated) at seven fold-change (FC) levels (including no change) using control samples from human HepaRG cells run on TempO-Seq and normalized the data using seven normalization methods. Upper Quartile (UQ) performed the best with regard to maintaining FC levels as detected by a limma contrast between treated and untreated groups. For all FC levels, specificity of the UQ normalization was greater than 0.84 and sensitivity greater than 0.90, except for the no change and +1.5 levels. Furthermore, K-means clustering of the simulated genes normalized by UQ agreed the most with the FC assignments [adjusted Rand index (ARI) = 0.67]. Despite assuming that the majority of genes are unchanged, the DESeq2 scaling factors normalization method performed reasonably well, as did the simple normalization procedures counts per million (CPM) and total counts (TC). These results suggest that for two-class comparisons of TempO-Seq data, UQ, CPM, TC, or DESeq2 normalization should provide reasonably reliable results at absolute FC levels ≥2.0. These findings will help guide researchers to normalize TempO-Seq gene expression data for more reliable results.
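To make two of the normalization methods named above concrete, here is a minimal plain-Python sketch of CPM and upper-quartile (UQ) scaling. It is an illustration under assumptions rather than the study's implementation: the upper quartile is taken over each sample's nonzero counts, samples are rescaled to the mean upper quartile, and the helper names and toy counts are hypothetical.

```python
from statistics import mean, quantiles


def cpm(sample):
    """Counts per million: scale a sample so its counts sum to 1e6."""
    total = sum(sample)
    return [c / total * 1e6 for c in sample]


def upper_quartile(sample):
    """75th percentile of the nonzero counts in one sample."""
    nonzero = [c for c in sample if c > 0]
    return quantiles(nonzero, n=4, method="inclusive")[2]


def uq_normalize(samples):
    """Rescale each sample so that all samples share the same upper quartile."""
    uqs = [upper_quartile(s) for s in samples]
    target = mean(uqs)  # common scale (assumption: mean of the upper quartiles)
    return [[c * target / uq for c in s] for s, uq in zip(samples, uqs)]


# Two toy samples with identical composition but a twofold depth difference.
untreated = [10, 20, 30, 40, 0]
treated = [20, 40, 60, 80, 0]

normalized = uq_normalize([untreated, treated])
print(normalized[0])
print(normalized[1])
print(sum(cpm(untreated)))
```

Because the two toy samples differ only in sequencing depth, both UQ scaling and CPM bring them onto the same scale; the methods diverge when a few highly expressed genes dominate the total count.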
Background
Individuals with cystic fibrosis have an elevated lifetime risk of colonization, infection, and disease caused by nontuberculous mycobacteria. A prior study involving non-cystic fibrosis individuals reported a gene expression signature associated with susceptibility to nontuberculous mycobacteria pulmonary disease (NTM-PD). In this study, we determined whether people living with cystic fibrosis who progress to NTM-PD have a gene expression pattern similar to the one seen in the non-cystic fibrosis population.
Methods
We evaluated whole blood transcriptomics using bulk RNA-seq in a cohort of cystic fibrosis patients with samples collected closest in timing to the first isolation of nontuberculous mycobacteria. The study population included patients who did (n = 12) and did not (n = 30) develop NTM-PD following the first mycobacterial growth. Progression to NTM-PD was defined by a consensus of two expert clinicians based on reviewing clinical, microbiological, and radiologic...
Study population and clinical data
This study is a secondary data analysis using blood samples and data from the “CF Biomarker” cohort approved by the University of British Columbia-Providence Health Care ethics review board (H12-00835). The local ethics board also reviewed and approved the secondary analysis (H20-00117). Patients in the parent cohort were recruited following informed consent at the St. Paul’s Hospital Adult CF Clinic (Vancouver, Canada) between January 2012 and December 2019. In the current analysis, we included participants who consented to the future use of their samples and data, had at least one positive respiratory culture for NTM, and had a whole blood RNA sample available (PAXgene® stored at -70°C). Lung transplant recipients and subjects without a definite diagnosis of CF were excluded. We preferentially selected blood samples taken during clinically stable periods and closest to the first positive growth of NTM. We did not limit blood sampling to within a spec...
The data sets are provided as comma-separated values and can be opened with standard statistical software or explored with a spreadsheet program. In our analyses, we employed R and the GUI RStudio (v 4.1.1). Raw sequencing processing and transcript counting were done on a CentOS high-performance cluster; dependencies and the commands run are described inside markdown scripts.
Cell lines and compounds
PCa cell lines (LNCaP, 22Rv1, VCaP, PC3, DU145, NCI-H660, C4-2), other cell lines (HEK293T, DLD1) and a benign prostate line (RWPE-1) were purchased from ATCC and maintained according to ATCC protocols. Patient-derived CRPC organoids (WCM and MSK) were established and maintained as organoids in Matrigel drops according to the previously described protocol [70]. LNCaP-AR cells were a kind gift from Dr. Sawyers and Dr. Mu (Memorial Sloan Kettering Cancer Center) and were cultured as previously described [5]. All cell lines used and their phenotypes are listed in Supplementary Table 1. Cell cultures were regularly tested for Mycoplasma contamination and confirmed to be negative. Genentech Inc. synthesized A947, its epimer (A858), FHD-286 and AU-15330. Cobimetinib, Trametinib, VL285 and CHIR99021 were purchased from SelleckChem. BRM014 was purchased from MedChemExpress. All drugs used in this study are listed in Supplementary Table 2.

Single-cell RNA-sequencing by SORT-seq library generation and analysis
SORT-seq was performed using the Single Cell Discoveries (SCD) service. Organoids were treated for 72h with a control epimer (A858) or active compound (A947) at 1 µM, and 1 × 10^6 cells were harvested in PBS. Harvested cells were stained with 100 ng/ml DAPI to stain dead cells. Using a cell sorter (conducted by Flow Cytometry Core, DBMR, Bern) and the recommended settings (Single Cell Discoveries B.V.), DAPI-negative cells were sorted as single cells into 376 wells of each of four 384-well plates containing immersion oil per condition, resulting in a theoretical cell number of 1504 cells per condition. All post-harvesting steps were performed at 4°C. Plates were snap-frozen on dry ice for 15 minutes and sent out for sequencing at Single Cell Discoveries B.V. Data were analyzed using the Seurat package v.4.3.0 [80]. Cell QC filtering was done using the following thresholds: nCount > 4000, nFeature > 1000, percent.mito 0.85.
Differential gene expression analysis between clusters was done with Seurat::FindAllMarkers. Module scores were generated with Seurat::AddModuleScore. Gene set enrichment analysis was done with the package fgsea v.1.24.0 [81] and the human gene sets from the Molecular Signatures Database (https://www.gsea-msigdb.org). Gene regulatory network analysis was done with pySCENIC v.0.12.1 [82]. Overall analysis was done in R v.4.2.2.

RNA-seq library generation and processing
For bulk RNA-seq, organoids were treated with A858 or A947 (1µM) for 24h and 48h (3 biological replicates per condition). RNA was extracted using the RNeasy Kit (Qiagen); library generation and subsequent sequencing were performed by the clinical genomics lab (CGL) at the University of Bern. Sequencing reads were aligned against the human genome hg38 with STAR v.2.7.3a [83]. Gene counts were generated with RSEM v.1.3.2 [84], whose index was generated using the GENCODE v33 primary assembly annotation. Differential gene expression analysis was done with DESeq2 v.1.34.0 [85]. Gene set enrichment analysis was done with the package fgsea v.1.20.0 [81] and the human gene sets from the Molecular Signatures Database (https://www.gsea-msigdb.org). Analysis was done in R v.4.1.2.

TCF7L2 ChIP-seq library generation and processing
For the ChIP-Seq assay, chromatin was prepared from 2 biological replicates of WCM1078 treated with A858 or A947 (1µM) for 4h, and ChIP-Seq assays were then performed by Active Motif Inc. using an antibody against TCF7L2 (Cell Signaling, cat#2569). ChIP-seq sequence data were processed using an ENCODE-DCC/chip-seq-pipeline2-based workflow (https://github.com/ENCODE-DCC/chip-seq-pipeline2). Briefly, fastq files were aligned to the hg38 human genome reference using Bowtie2 (v2.2.6), followed by alignment sorting (samtools v1.7) of the resulting bam files, filtering out unmapped reads and keeping reads with mapping quality higher than 30.
Duplicates were removed with Picard’s MarkDuplicates (v1.126) function, followed by indexing of the resulting bam files with samtools. For each bam file, genome coverage was computed with bedtools (v2.26.0), followed by the generation of bigwig (wigToBigWig v377) files. Peaks were called with macs2 (v2.2.4) for each treatment sample using a pooled input alignment (.bam file) as control. Downstream analyses were performed with DiffBind v3.11.1 with default parameters, except for summits=250 in dba.count(). dba.contrast() and dba.analyze() were used to compute significant differential peaks with DESeq2.

ATAC-seq library generation and processing
ATAC-seq was performed on 50,000 cryo-preserved cells per condition (1µM A858 and 1µM A947, n = 3 biological replicates) treated for 4h and analyzed as described in a previous study [86]. Briefly, 50,000 cryo-preserved cells per condition were lysed for 5 minutes on ice and tagmented for 30 minutes at 37°C, followed by DNA isolation. DNA was barcoded and amplified before sequencing.

PRO-cap library generation and processing
For PRO-cap, approximately ...
https://ega-archive.org/dacs/EGAC00001002989
Manuscript Title: Co-targeting of BTK and MALT1 overcomes resistance to BTK inhibitors in mantle cell lymphoma
Journal: Journal of Clinical Investigation
Authors: Vivian Changying Jiang, Yang Liu, Junwei Lian, Shengjian Huang, Alexa Jordan, Qingsong Cai, Fangfang Yan, Joseph Mitchell McIntosh, Yijing Li, Yuxuan Che, Zhihong Chen, Jovanny Vargas, Maria Badillo, JohnNelson Bigcal, Heng-Huan Lee, Wei Wang, Yixin Yao, Lei Nie, Christopher Flowers, and Michael Wang
Abstract
Bruton’s tyrosine kinase (BTK) is a proven target in mantle cell lymphoma (MCL), an aggressive subtype of non-Hodgkin lymphoma. However, resistance to BTK inhibitors is a major clinical challenge. We here report that MALT1 is one of the top overexpressed genes in ibrutinib-resistant MCL cells, while expression of CARD11, which is upstream of MALT1, is decreased. MALT1 genetic knockout or inhibition produced dramatic defects in MCL cell growth regardless of ibrutinib sensitivity. Conversely, CARD11 knockout showed anti-tumor effects only in ibrutinib-sensitive cells, suggesting that MALT1 overexpression could drive ibrutinib resistance by bypassing BTK-CARD11 signaling. Additionally, BTK knockdown and MALT1 knockout markedly impaired MCL tumor migration and dissemination, and MALT1 pharmacological inhibition decreased MCL cell viability, adhesion, and migration by suppressing NF-κB, PI3K-AKT-mTOR, and integrin signaling. Importantly, co-targeting MALT1 with safimaltib and BTK with pirtobrutinib induced potent anti-MCL activity in ibrutinib-resistant MCL cell lines and patient-derived xenografts. Therefore, we conclude that MALT1 overexpression is associated with resistance to BTK inhibitors in MCL, that targeting abnormal MALT1 activity could be a promising therapeutic strategy to overcome BTK inhibitor resistance, and that co-targeting of MALT1 and BTK should improve MCL treatment efficacy and durability as well as patient outcomes.
Dataset description: The bulk RNA-seq dataset was generated for the cell lines below and used for two major purposes: (1) DEG analysis and GSEA analysis comparing IBN-R and IBN-S cells, and (2) DEG analysis and GSEA analysis comparing MCL cells with/without MI-2 treatment.
Sample | Cell | MI-2 | Ibrutinib (IBN) | Venetoclax (VEN) | Used for IBN-R vs IBN-S comparison | Used for MI-2 vs untreated (DMSO)
H9 | Granta519 | - | R | S | yes |
H21 | Granta519 | - | R | S | yes |
H33 | Granta519 | - | R | S | yes |
H10 | Granta519-VEN-R | - | R | R | yes |
H22 | Granta519-VEN-R | - | R | R | yes |
H34 | Granta519-VEN-R | - | R | R | yes |
H3 | JeKo BTK KD_1 | - | R | R | yes | yes
H15 | JeKo BTK KD_1 | - | R | R | yes | yes
H27 | JeKo BTK KD_1 | - | R | R | yes | yes
H5 | JeKo BTK KD_2 | - | R | R | yes | yes
H17 | JeKo BTK KD_2 | - | R | R | yes | yes
H29 | JeKo BTK KD_2 | - | R | R | yes | yes
H1 | JeKo-1 | - | S | R | yes | yes
H13 | JeKo-1 | - | S | R | yes | yes
H25 | JeKo-1 | - | S | R | yes | yes
H7 | Mino | - | S | S | yes |
H19 | Mino | - | S | S | yes |
H31 | Mino | - | S | S | yes |
H8 | Mino-VEN-R | - | S | R | yes |
H20 | Mino-VEN-R | - | S | R | yes |
H32 | Mino-VEN-R | - | S | R | yes |
H11 | Rec-1 | - | S | S | yes |
H23 | Rec-1 | - | S | S | yes |
H12 | Rec-VEN-R | - | S | S | yes |
H24 | Rec-VEN-R | - | S | R | yes |
H36 | Rec-VEN-R | - | S | R | yes |
H35 | Rec-1 | -- | S | R | yes |
H4 | JeKo BTK KD_1 + MI-2 | + | | | | yes
H16 | JeKo BTK KD_1 + MI-2 | + | | | | yes
H28 | JeKo BTK KD_1 + MI-2 | + | | | | yes
H6 | JeKo BTK KD_2 + MI-2 | + | | | | yes
H18 | JeKo BTK KD_2 + MI-2 | + | | | | yes
H30 | JeKo BTK KD_2 + MI-2 | + | | | | yes
H2 | JeKo-1 + MI-2 | + | | | | yes
H14 | JeKo-1 + MI-2 | + | | | | yes
H26 | JeKo-1 + MI-2 | + | | | | yes
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset includes information relevant to the following manuscript from the labs of Prof. Carlos Caldas (University of Cambridge), and Dr. Long V. Nguyen (Princess Margaret Cancer Centre, University Health Network):
Nguyen LV et al. Dynamics and plasticity of human breast cancer single cell-derived clones. Under consideration for publication.
Bulk RNA sequencing raw count matrices are provided (RawCounts.csv) along with the normalized count matrices (LogCPMNormCounts.csv).
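The relationship between a raw count matrix and a log-CPM matrix can be sketched as follows. This is only an illustrative sketch: the function name `log_cpm` and the pseudocount of 1 are assumptions, and the normalization actually used to produce LogCPMNormCounts.csv is not described here and may differ (for example, by applying library-size scaling factors).

```python
import math
from typing import Dict


def log_cpm(counts: Dict[str, int], pseudocount: float = 1.0) -> Dict[str, float]:
    """Convert raw counts for one sample to log2 counts-per-million.

    The pseudocount (an assumption of this sketch) avoids log2(0)
    for genes with no reads.
    """
    library_size = sum(counts.values())
    if library_size == 0:
        raise ValueError("sample has no reads")
    return {
        gene: math.log2(count / library_size * 1e6 + pseudocount)
        for gene, count in counts.items()
    }


# Hypothetical gene names and counts, for illustration only.
sample = {"GENE_A": 0, "GENE_B": 10}
normalized = log_cpm(sample)
```

A gene with zero reads maps to log2(pseudocount), which is 0 with the default used in this sketch.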
Single cell RNA sequencing count matrix processed from R package metacell is provided (mat.pdx_LN_v2_filt.Rda), along with the mc and mc2d files with information on metacell partitions (mc.pdx_LN_v2_filt.Rda and mc2d.pdx_LN_v2_filt.Rda).
Code and information on data analysis are provided for reviewers in our unpublished manuscript and on GitHub (https://github.com/cclab-brca/clone-dynamics).
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This repository contains R objects (RDS) for
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Single-cell RNA sequencing (scRNA-seq) has been widely applied to discover new cell types by detecting sub-populations in a heterogeneous group of cells. Since scRNA-seq experiments have lower read coverage/tag counts and introduce more technical biases compared to bulk RNA-seq experiments, the limited number of sampled cells combined with the experimental biases and other dataset specific variations presents a challenge to cross-dataset analysis and discovery of relevant biological variations across multiple cell populations. In this paper, we introduce a method of variance-driven multitask clustering of single-cell RNA-seq data (scVDMC) that utilizes multiple single-cell populations from biological replicates or different samples. scVDMC clusters single cells in multiple scRNA-seq experiments of similar cell types and markers but varying expression patterns such that the scRNA-seq data are better integrated than typical pooled analyses which only increase the sample size. By controlling the variance among the cell clusters within each dataset and across all the datasets, scVDMC detects cell sub-populations in each individual experiment with shared cell-type markers but varying cluster centers among all the experiments. Applied to two real scRNA-seq datasets with several replicates and one large-scale droplet-based dataset on three patient samples, scVDMC more accurately detected cell populations and known cell markers than pooled clustering and other recently proposed scRNA-seq clustering methods. In the case study applied to in-house Recessive Dystrophic Epidermolysis Bullosa (RDEB) scRNA-seq data, scVDMC revealed several new cell types and unknown markers validated by flow cytometry. MATLAB/Octave code available at https://github.com/kuanglab/scVDMC.
https://spdx.org/licenses/CC0-1.0.html
Hypertrophic cardiomyopathy (HCM) is characterized by thickening of the left ventricular wall, diastolic dysfunction, and fibrosis, and is associated with mutations in genes encoding sarcomere proteins. While in vitro studies have used human induced pluripotent stem cell-derived cardiomyocytes (hiPSC-CMs) to study HCM, these models have not examined the multicellular interactions involved in fibrosis. Using engineered cardiac microtissues (CMTs) composed of HCM-causing MYH7-variant hiPSC-CMs and wild-type fibroblasts, we observed cell-cell cross-talk leading to increased collagen deposition, tissue stiffening, and decreased contractility dependent on fibroblast proliferation. hiPSC-CM conditioned media and single-nucleus RNA sequencing data suggested that fibroblast proliferation is mediated by paracrine signals from MYH7-variant cardiomyocytes. Furthermore, inhibiting epidermal growth factor receptor tyrosine kinase with erlotinib hydrochloride attenuated stromal activation. Last, HCM-causing MYBPC3-variant CMTs also demonstrated increased stromal activation and reduced contractility, but with distinct characteristics. Together, these findings establish a paracrine-mediated cross-talk potentially responsible for fibrotic changes observed in HCM.
Methods
snRNA-sequencing data
The snRNA-seq dataset includes data on CMTs made with healthy wild-type (WT) or hypertrophic cardiomyopathy (HCM)-causing (R403Q+/- mutation in myosin heavy chain) human induced pluripotent stem cell-derived cardiomyocytes (hiPSC-CMs) (derived from the PGP1 hiPSC line) and ventricular cardiac fibroblasts (vCFs) (Lonza Cat. CC-2904) (1).
Methods: A total of 60,000 cells per tissue (ten tissues pooled per sample), consisting of 90% hiPSC-CMs and 10% vCFs, were mixed in an ECM solution consisting of 4 mg/mL of human fibrinogen (Sigma), 10% Matrigel (Corning), 0.4 unit of thrombin (Sigma) per mg of fibrinogen, 5 µM Y-27632 (Tocris), and 0.033 mg/mL aprotinin (Sigma). The cell-ECM mixture was pipetted into each tissue well, and after gel polymerization for ten minutes, tissue maintenance growth media was added, containing high glucose DMEM (Fisher) supplemented with 10% FBS (Sigma), 1% penicillin-streptomycin (P/S) (Fisher), 1% Non-essential Amino Acids (Fisher), 1% Glutamax (Fisher), 5 µM Y-27632, 0.033 mg/mL aprotinin, and 150 µg/mL L-ascorbic acid 2-phosphate sesquimagnesium salt hydrate. Y-27632 was removed two days following seeding, and the growth media was replaced every other day.
CMTs were flash frozen in liquid nitrogen on day 7 (10 tissues pooled per sample) and stored at -80°C. Individual nuclei were isolated from frozen tissue samples. Briefly, nuclei were isolated (2) and RNA was reverse-transcribed and converted into cDNA libraries using a 10x Chromium Controller and Chromium Single Cell 3' v3.1 reagent kit (10x Genomics). Bar-coded libraries were pooled and sequenced (Illumina NovaSeq 6000). Single-nucleus RNA-seq alignment and gene counting were performed using Cell Ranger 1.2 (10x Genomics), and downstream analysis used Seurat (3) in R 4.1.0, managed via RStudio. There were 15,129 total sequenced nuclei.
Principal component (PC) analysis was performed to determine the dimensionality of this dataset within Seurat and the number of PCs was selected using both permutation-based and heuristic methods. Cluster assignment was performed in an unsupervised manner using a shared nearest neighbor approach within Seurat. To validate the ideal number of PCs and clustering resolution in this dataset, repeat analysis was performed while varying both the dimensionality and resolution. Using the Clustree R package (4), the hierarchical evolution of cluster identity was determined. This analysis allows the user to investigate how clustering resolution impacts cell identity, in an effort to assign robust classification to cell clusters which remain consistent across resolution values. For each clustering resolution, marker genes for each cluster (such as TTN for cardiomyocytes and FN1 for fibroblasts) were identified in order to assign relevant biological function to various subclusters. Dimensionality and resolution were adjusted until each identified cluster contained a non-zero number of significantly expressed marker genes that could be assigned functional gene ontology terms; in this case, this occurred at cluster resolution 0.4, which was used for clustering analysis. At greater resolution values, cluster subsets returned either too few genes to assign functional classifications or non-coding genes of unknown function. Dotplot projections of single cells were generated with UMAP coordinates, using the number of dimensions determined from PCA as described above. For the purposes of this analysis, 30 dimensions were considered for both the initial PCA and subsequent UMAP visualization. To assign cell clusters specific identities, canonical marker gene expression was analyzed to assign broad cell classes.
To assign cell subclusters, such as CM1-CM6, unbiased genetic markers for each population were calculated using the Wilcoxon Rank Sum test within Seurat, at which point at most the top 100 genes were used as a Gene Ontology query to assign cellular functions to each cell subcluster.
Genes that were upregulated in R403Q+/- cardiomyocytes compared to WT cardiomyocytes with a p-value < 0.01 and fold-change greater than 1.3 were input into Enrichr (5,6). The top upregulated pathways associated with the top upregulated genes were obtained using WikiPathways (7).
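The gene-selection step described above (p-value < 0.01 and fold-change > 1.3) can be sketched as a simple filter. This is a minimal illustration rather than the authors' code: the function name `select_upregulated`, the input layout, and the gene names are hypothetical, and the fold-change is assumed to be a linear ratio (R403Q+/- over WT), not a log2 value.

```python
from typing import Dict, List, Tuple


def select_upregulated(
    stats: Dict[str, Tuple[float, float]],
    min_fold_change: float = 1.3,
    max_p_value: float = 0.01,
) -> List[str]:
    """Return genes that pass both the fold-change and p-value cut-offs.

    stats maps gene symbol -> (fold_change, p_value), where fold_change
    is assumed to be a linear ratio.
    """
    selected = [
        gene
        for gene, (fold_change, p_value) in stats.items()
        if fold_change > min_fold_change and p_value < max_p_value
    ]
    # Strongest fold-changes first, ready to paste into an enrichment tool.
    return sorted(selected, key=lambda gene: stats[gene][0], reverse=True)


# Hypothetical values, for illustration only.
example = {
    "GENE_A": (2.10, 0.0004),
    "GENE_B": (1.25, 0.0010),  # fails the fold-change cut-off
    "GENE_C": (1.80, 0.0300),  # fails the p-value cut-off
    "GENE_D": (1.45, 0.0090),
}
gene_list = select_upregulated(example)
```

The resulting list is the kind of input that would then be submitted to Enrichr.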
Bulk RNA-sequencing data
Bulk RNA-sequencing data include data on serum-starved vCFs, vCFs treated with 100 ng/mL recombinant human epidermal growth factor (rhEGF), and vCFs treated with conditioned media from hiPSC-CMs (1).
Methods: vCFs were trypsinized, centrifuged, and flash-frozen after two days in serum-starvation (low glucose DMEM with 0.1% fetal bovine serum) vs 100 ng/mL rhEGF or conditioned media from wild-type vs R403Q+/- hypertrophic cardiomyopathy hiPSC-CMs for three different batches. Cells were homogenized in Trizol Reagent (Life Technologies, Inc., Grand Island, NY) with TissueLyzer II (QIAGEN, Inc., Valencia, CA) and RNA was isolated by conventional methods. RNA went through two rounds of mRNA purification (polyA-selection) using Dynabeads mRNA DIRECT Kit (Invitrogen, Carlsbad, CA). Double-stranded cDNA was generated using the Superscript III First-Strand Synthesis System (Invitrogen, Carlsbad, CA). The cDNA products were used to construct libraries with the Nextera XT DNA Sample Preparation Kit (Illumina, Inc., San Diego, CA). Libraries were paired-end sequenced with a read length of 75 base pairs (75PE) on the Illumina NextSeq 500. Reads were aligned to the hg38 Human Genome using Spliced Transcripts Alignment to a Reference (STAR) (8). Data were analyzed as previously described (9) and normalized to the total number of reads per kilobase of exon per million (RPKM).
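The RPKM normalization mentioned above follows the standard definition: reads mapped to a gene, divided by the exon length in kilobases and by the total mapped reads in millions. A minimal sketch is given below; the function name `rpkm` and the example numbers are hypothetical, and the pipeline referenced in (9) may handle details such as multi-mapped reads differently.

```python
def rpkm(read_count: float, exon_length_bp: float, total_mapped_reads: float) -> float:
    """Reads per kilobase of exon per million mapped reads.

    RPKM = read_count * 1e9 / (exon_length_bp * total_mapped_reads)
    """
    if exon_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("exon length and total mapped reads must be positive")
    return read_count * 1e9 / (exon_length_bp * total_mapped_reads)


# Hypothetical gene: 500 reads, 2 kb of exon, 20 million mapped reads.
value = rpkm(500, 2000, 20_000_000)
```

Because the denominator includes gene length, RPKM values are comparable between genes within a sample, which raw counts are not.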
Genes that were upregulated in rhEGF-treated vCFs compared to serum-starved vCFs with a p-value < 0.01 and fold-change greater than 1.3 were input into Enrichr (5,6). For conditioned media experiments, gene expression values were averaged over three experimental repeats. Genes that had an average fold-change expression greater than 1.15 with a p-value < 0.01 for all three experiments were input into Enrichr. The top upregulated pathways and transcription factors associated with the top upregulated genes were obtained using WikiPathways (7) and ChEA (10) databases, respectively.
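The conditioned-media criterion can be read as: average the fold-change over the three repeats, require the mean to exceed 1.15, and require p < 0.01 in every repeat. The sketch below implements that reading only; it is an assumption about how the sentence should be interpreted, and the function name and example values are hypothetical.

```python
from statistics import mean
from typing import Sequence


def passes_conditioned_media_filter(
    fold_changes: Sequence[float],
    p_values: Sequence[float],
    min_mean_fold_change: float = 1.15,
    max_p_value: float = 0.01,
) -> bool:
    """True if the mean fold-change exceeds the cut-off and every
    experimental repeat is individually significant."""
    if len(fold_changes) != len(p_values) or not fold_changes:
        raise ValueError("need one fold-change and one p-value per repeat")
    return mean(fold_changes) > min_mean_fold_change and all(
        p < max_p_value for p in p_values
    )


# Hypothetical gene measured in three experimental repeats.
kept = passes_conditioned_media_filter([1.2, 1.1, 1.3], [0.001, 0.005, 0.009])
dropped = passes_conditioned_media_filter([1.2, 1.1, 1.3], [0.001, 0.020, 0.005])
```

Requiring significance in all repeats is stricter than testing the pooled data once, so this reading yields a shorter gene list.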
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Data repository for the scMappR manuscript:
Abstract from bioRxiv (https://www.biorxiv.org/content/10.1101/2020.08.24.265298v1.full).
RNA sequencing (RNA-seq) is widely used to identify differentially expressed genes (DEGs) and reveal biological mechanisms underlying complex biological processes. RNA-seq is often performed on heterogeneous samples and the resulting DEGs do not necessarily indicate the cell types where the differential expression occurred. While single-cell RNA-seq (scRNA-seq) methods solve this problem, technical and cost constraints currently limit its widespread use. Here we present single cell Mapper (scMappR), a method that assigns cell-type specificity scores to DEGs obtained from bulk RNA-seq by integrating cell-type expression data generated by scRNA-seq and existing deconvolution methods. After benchmarking scMappR using RNA-seq data obtained from sorted blood cells, we asked if scMappR could reveal known cell-type specific changes that occur during kidney regeneration. We found that scMappR appropriately assigned DEGs to cell-types involved in kidney regeneration, including a relatively small proportion of immune cells. While scMappR can work with any user supplied scRNA-seq data, we curated scRNA-seq expression matrices for ∼100 human and mouse tissues to facilitate its use with bulk RNA-seq data alone. Overall, scMappR is a user-friendly R package that complements traditional differential expression analysis available at CRAN.
Wound healing is an intensely studied topic involved in many relevant pathophysiological processes, including fibrosis. Despite the large interest in fibrosis, the network relating commensal microbiota to skin fibrosis remains poorly understood. Here, we focus on keloid, a classical yet intractable skin fibrotic disease, to establish the association between commensal microbiota and scarring tissue. Our histological data reveal the presence of microbiota in the keloids. 16S rRNA sequencing characterizes microbial composition and divergence between the pathological and normal skin tissue. Our research provides insights into the pathology of human fibrotic diseases, advocating commensal bacteria and IL-8 signaling as useful targets in future interventions of recurrent keloid disease.
16S rDNA sequencing data
The data files here are raw 16S rDNA sequencing reads accompanied by the R scripts that were used to analyze these data. The sources of the data were: (1) a swab test from the surface of normal and keloid skin; (2) the tissues of keloid patients from deeper parts of the skin. Surface microbiota samples were collected from the pathological location or the normal lateral location of patients using a swab (Catch-all Sample Collection Swab, Epicenter) moistened in Yeast Cell Lysis Buffer (from MasterPure Yeast DNA Purification Kit; Epicenter). Samples were snap-frozen on dry ice, and DNA was isolated from specimens using the PureLink Genomic DNA Mini Kit (Invitrogen). Amplification of the 16S-V3+V4 region was performed according to the manufacturer’s specifications. Sequencing of 16S rRNA amplicons was conducted by Apexbio Co., Shanghai, China using the Illumina NovaSeq platform. The data were analyzed with the attached R scripts.
Bulk RNA-Seq
For RNA sequencing, human dermal ...
Commensal microbiome dysbiosis in keloid disease
https://doi.org/10.5061/dryad.d51c5b0bt
These datasets contain original fastq.gz readouts from 16S rDNA sequencing experiments and scripts used to analyze these data. The experiments were performed on patients with keloids, a fibrotic skin disease. We did two similar experiments with the same way of analysis:
Additionally, we provide the files from bulk RNA sequencing of human dermal fibroblasts treated with IL-8 and TGF-beta here.
All the data, additional files, and R scripts ar...
Table S2. Control vs. N-cadherin shRNAs. We compared RNA transcriptomes from bulk cell populations of control or N-cadherin-depleted pediatric high-grade glioma cells. N-cad depletion decreased RNA expression but did not affect the expression of other cadherins and integrins, except for CDH3. Differential gene expression analysis was performed with the DESeq2 R package for paired samples.
Table S3. Leader vs. follower cells. We compared RNA transcriptomes from migrating glioma leader and follower cells that we isolated by photoconversion and flow cytometry. Of the 19,729 genes that were quantified, 44 gene transcripts increased and 36 decreased in leaders relative to followers (log2 fold-change >0.5, FDR <0.05). YAP-response genes and wound-healing genes were higher in leader than follower cells. Differential gene expression analysis was performed with the DESeq2 R package for paired samples. NA r...
Pediatric high-grade gliomas are highly invasive and essentially incurable. Glioma cells migrate between neurons and glia, along axon tracts, and through extracellular matrix surrounding blood vessels and underlying the pia. Mechanisms that allow adaptation to such complex environments are poorly understood. N-cadherin is highly expressed in pediatric gliomas and is associated with shorter survival. We found that inter-cellular homotypic N-cadherin interactions differentially regulate glioma migration according to the microenvironment, stimulating migration on cultured neurons or astrocytes but inhibiting invasion into reconstituted or astrocyte-deposited extracellular matrix. N-cadherin localizes to filamentous connections between migrating leader cells but to epithelial-like junctions between followers. Leader cells have more surface and recycling N-cadherin, increased YAP1/TAZ signaling, and increased proliferation relative to followers. YAP1/TAZ signaling is dynamically regulated as...
To compare RNA transcriptomes from bulk cell populations, control or N-cad shRNA cells were dissociated using Accutase at room temperature for 10 min and resuspended in HBSS. 500 cells were collected in the center of a 1.5 ml centrifuge tube containing 4.75 μl SMART-seq reaction buffer (Takara) using a BD FACSymphony S6 (BD Bioscience), avoiding cell loss on the tube walls.
To compare RNA transcriptomes from leader and follower cells, approximately 90 spheroids of PBT-05 cells expressing histone H2B-Dendra2 were allowed to migrate for 24 hrs on laminin and photoconverted as above. After photoconversion, cells were dissociated with Accutase at room temperature, resuspended in HBSS and transferred to ice, and approximately 200 leader and follower cells were sorted into SMART-seq reaction buffer as above. Each experiment was performed on four different occasions. RNA was prepared and cDNA was synthesized with the SMART-Seq v4 Ultra Low Input RNA Kit for Sequencing (Takara) and ran th...
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
In this Zenodo repository, we share the data required to reproduce all the analyses from our publication "satuRn: Scalable Analysis of differential Transcript Usage for bulk and single-cell RNA-sequencing applications".
This repository includes input transcript-level expression matrices and metadata for all datasets, as well as intermediate results and final outputs of the respective DTU analyses. For a more elaborate description of the data, we refer to the companion GitHub repository for our publication: https://github.com/statOmics/satuRnPaper. Note that this is version 1.0.1 of the data (uploaded on 2021-01-14). If any changes are made to the datasets in the future, this will also be communicated on our companion GitHub page.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This repository contains the data and the code used in Tanja Hyvarinen's project (tanja.hyvarinen@tuni.fi).
"analysis" contains the processing and analysis of the bulk RNA sequencing data; "integration" contains the analysis of the integration between our RNA-seq data and external datasets.
"fastq" contains the raw fastq sequences of the bulk RNA sequencing.
"counts" contains the results of processing the fastq sequences with the nf-core rnaseq workflow.
The analysis and integration folders can contain the starting raw data in "data", the R scripts in order of execution (op1, op2, ...), and the "output" folder that contains the final processed data of each operation.