42 datasets found
  1. f

    Table 2_inDAGO: a user-friendly interface for seamless dual and bulk RNA-Seq...

    • frontiersin.figshare.com
    xlsx
    Updated Nov 21, 2025
    Cite
    Gaetano Aufiero; Carmine Fruggiero; Nunzio D’Agostino (2025). Table 2_inDAGO: a user-friendly interface for seamless dual and bulk RNA-Seq analysis.xlsx [Dataset]. http://doi.org/10.3389/fbinf.2025.1696823.s002
    Dataset updated
    Nov 21, 2025
    Dataset provided by
    Frontiers
    Authors
    Gaetano Aufiero; Carmine Fruggiero; Nunzio D’Agostino
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Dual RNA-sequencing enables simultaneous profiling of protein-coding and non-coding transcripts from two interacting organisms, an essential capability when physical separation is difficult, such as in host-parasite or cross-kingdom interactions (e.g., plant-plant or host-pathogen systems). By allowing in silico separation of mixed reads, dual RNA-seq reveals the transcriptomic dynamics of both partners during interaction. However, existing analysis workflows often require programming expertise, limiting accessibility. We present inDAGO, a free, open-source, cross-platform graphical user interface designed for biologists without coding skills. inDAGO supports both bulk and dual RNA sequencing, with dual RNA sequencing further accommodating both sequential and combined approaches. The interface guides users through key analysis steps, including quality control, read alignment, read summarization, exploratory data analysis, and identification of differentially expressed genes, while generating intermediate outputs and publication-ready plots. Optimized for speed and efficiency, inDAGO performs complete analyses on a standard laptop (16 GB RAM) without requiring high-performance computing. We validated inDAGO using diverse real datasets to demonstrate its reliability and usability. inDAGO, available on CRAN (https://cran.r-project.org/web/packages/inDAGO/) and GitHub (https://github.com/inDAGOverse/inDAGO), lowers the technical barrier to dual RNA-seq by enabling robust, reproducible analyses, even for users without coding experience.

  2. f

    Data_Sheet_2_NormExpression: An R Package to Normalize Gene Expression Data...

    • frontiersin.figshare.com
    zip
    Updated Jun 1, 2023
    Cite
    Zhenfeng Wu; Weixiang Liu; Xiufeng Jin; Haishuo Ji; Hua Wang; Gustavo Glusman; Max Robinson; Lin Liu; Jishou Ruan; Shan Gao (2023). Data_Sheet_2_NormExpression: An R Package to Normalize Gene Expression Data Using Evaluated Methods.zip [Dataset]. http://doi.org/10.3389/fgene.2019.00400.s002
    Dataset updated
    Jun 1, 2023
    Dataset provided by
    Frontiers
    Authors
    Zhenfeng Wu; Weixiang Liu; Xiufeng Jin; Haishuo Ji; Hua Wang; Gustavo Glusman; Max Robinson; Lin Liu; Jishou Ruan; Shan Gao
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Data normalization is a crucial step in gene expression analysis, as it ensures the validity of downstream analyses. Although many metrics have been designed to evaluate existing normalization methods, different metrics, or different datasets evaluated with the same metric, yield inconsistent results, particularly for single-cell RNA sequencing (scRNA-seq) data. In the worst cases, a method evaluated as the best by one metric is evaluated as the poorest by another metric, or a method evaluated as the best using one dataset is evaluated as the poorest using another dataset. This raises an open question: what principles should guide the evaluation of normalization methods? In this study, we propose the principle that a normalization method evaluated as the best by one metric should also be evaluated as the best by another metric (the consistency of metrics), and a method evaluated as the best using scRNA-seq data should also be evaluated as the best using bulk RNA-seq data or microarray data (the consistency of datasets). We then designed a new metric named Area Under normalized CV threshold Curve (AUCVC) and applied it, together with another metric, mSCC, to evaluate 14 commonly used normalization methods using both scRNA-seq data and bulk RNA-seq data, satisfying the consistency of metrics and the consistency of datasets. Our findings pave the way to guide future studies in the normalization of gene expression data and its evaluation. The raw gene expression data, normalization methods, and evaluation metrics used in this study have been included in an R package named NormExpression. NormExpression provides a framework and a fast, simple way for researchers to select the best method for normalizing their gene expression data, based on the evaluation of different methods (particularly some data-driven methods or their own methods) under the principle of the consistency of metrics and the consistency of datasets.

  3. e

    Bulk RNA-seq validation experiments for "High dimensional single cell...

    • ebi.ac.uk
    Updated Dec 19, 2017
    Cite
    Mark Robinson (2017). Bulk RNA-seq validation experiments for "High dimensional single cell analysis predicts response to anti-PD-1 immunotherapy" [Dataset]. https://www.ebi.ac.uk/arrayexpress/experiments/E-MTAB-6214/
    Dataset updated
    Dec 19, 2017
    Authors
    Mark Robinson
    Description

    Immune checkpoint blockade has revolutionized cancer therapy. In particular, inhibition of programmed cell death protein 1 (PD-1) is effective for the treatment of metastatic melanoma and other cancers. Despite a dramatic increase in progression-free survival, a large proportion of patients do not show a durable response. Therefore, predictive biomarkers of clinical response are urgently needed. Here, we employed high-dimensional single-cell mass cytometry and a bioinformatics pipeline for the in-depth characterization of the immune cell subsets in the peripheral blood of metastatic melanoma patients before and after anti-PD-1 immunotherapy. During therapy, we observed a clear treatment response to immunotherapy in the T cell compartment. However, prior to commencing therapy, a strong predictor of progression-free and overall survival in response to anti-PD-1 immunotherapy was the frequency of CD14+CD16-HLA-DRhi monocytes. We could confirm this by conventional flow cytometry in an independent validation cohort and propose this as a novel predictive biomarker for therapy decisions in the clinic. In order to determine whether there are cell-intrinsic changes in the monocyte signature, we performed RNA sequencing on sorted CD14+CD16-HLA-DRhi cells from healthy donors (HD), non-responders (NR), and responders (R) at baseline. Representative samples (n=4 each) of responders, non-responders, and healthy donors were selected from archival samples stored in the dermatology biobank according to the same clinical criteria used in the discovery and validation cohorts for CyTOF and FACS analysis. CD14+CD16-HLA-DRhiLin- (CD3, CD4, CD19, CD45RO) monocytes were sorted from frozen PBMCs from blood samples from HD, R, and NR at baseline.

  4. n

    Data from: Single cell RNA-seq analysis reveals that prenatal arsenic...

    • data.niaid.nih.gov
    • datadryad.org
    • +1more
    zip
    Updated Jun 1, 2020
    Cite
    Britton Goodale; Kevin Hsu; Kenneth Ely; Thomas Hampton; Bruce Stanton; Richard Enelow (2020). Single cell RNA-seq analysis reveals that prenatal arsenic exposure results in long-term, adverse effects on immune gene expression in response to Influenza A infection [Dataset]. http://doi.org/10.5061/dryad.vt4b8gtp6
    Dataset updated
    Jun 1, 2020
    Dataset provided by
    Dartmouth College
    Dartmouth–Hitchcock Medical Center
    Authors
    Britton Goodale; Kevin Hsu; Kenneth Ely; Thomas Hampton; Bruce Stanton; Richard Enelow
    License

    https://spdx.org/licenses/CC0-1.0.html

    Description

    Arsenic exposure via drinking water is a serious environmental health concern. Epidemiological studies suggest a strong association between prenatal arsenic exposure and subsequent childhood respiratory infections, as well as morbidity from respiratory diseases in adulthood, long after systemic clearance of arsenic. We investigated the impact of exclusive prenatal arsenic exposure on the inflammatory immune response and respiratory health after an adult influenza A (IAV) lung infection. C57BL/6J mice were exposed to 100 ppb sodium arsenite in utero, and subsequently infected with IAV (H1N1) after maturation to adulthood. Assessment of lung tissue and bronchoalveolar lavage fluid (BALF) at various time points post IAV infection reveals greater lung damage and inflammation in arsenic exposed mice versus control mice. Single-cell RNA sequencing analysis of immune cells harvested from IAV infected lungs suggests that the enhanced inflammatory response is mediated by dysregulation of innate immune function of monocyte derived macrophages, neutrophils, NK cells, and alveolar macrophages. Our results suggest that prenatal arsenic exposure results in lasting effects on the adult host innate immune response to IAV infection, long after exposure to arsenic, leading to greater immunopathology. This study provides the first direct evidence that exclusive prenatal exposure to arsenic in drinking water causes predisposition to a hyperinflammatory response to IAV infection in adult mice, which is associated with significant lung damage.

    Methods

    Whole lung homogenate preparation for single cell RNA sequencing (scRNA-seq)

    Lungs were perfused with PBS via the right ventricle, harvested, and mechanically dissociated prior to straining through 70- and 30-µm filters to obtain a single-cell suspension. Dead cells were removed (annexin V EasySep kit, StemCell Technologies, Vancouver, Canada), and samples were enriched for cells of hematopoietic origin by magnetic separation using anti-CD45-conjugated microbeads (Miltenyi, Auburn, CA). Single-cell suspensions of 6 samples were loaded on a Chromium Single Cell system (10X Genomics) to generate barcoded single-cell gel beads in emulsion, and scRNA-seq libraries were prepared using Single Cell 3’ Version 2 chemistry. Libraries were multiplexed and sequenced on 4 lanes of a NextSeq 500 sequencer (Illumina) with 3 sequencing runs. Demultiplexing and barcode processing of raw sequencing data were conducted using Cell Ranger v. 3.0.1 (10X Genomics; Dartmouth Genomics Shared Resource Core). Reads were aligned to mouse (GRCm38) and influenza A virus (A/PR8/34, genome build GCF_000865725.1) genomes to generate unique molecular index (UMI) count matrices. Gene expression data have been deposited in the NCBI GEO database and are available at accession # GSE142047.

    Preprocessing of single cell RNA sequencing (scRNA-seq) data

    Count matrices produced using Cell Ranger were analyzed in the R statistical working environment (version 3.6.1). Preliminary visualization and quality analysis were conducted using scran (v. 1.14.3, Lun et al., 2016) and scater (v. 1.14.1, McCarthy et al., 2017) to identify thresholds for cell quality and feature filtering. Sample matrices were imported into Seurat (v. 3.1.1, Stuart et al., 2019) and the percentage of mitochondrial, hemoglobin, and influenza A viral transcripts was calculated per cell. Cells with < 1000 or > 20,000 unique molecular identifiers (UMIs: low quality and doublets), fewer than 300 features (low quality), greater than 10% of reads mapped to mitochondrial genes (dying), or greater than 1% of reads mapped to hemoglobin genes (red blood cells) were filtered from further analysis. Total cells per sample after filtering ranged from 1895 to 2482; no significant difference in the number of cells was observed in arsenic vs. control. Data were then normalized using SCTransform (Hafemeister et al., 2019) and variable features were identified for each sample. Integration anchors between samples were identified using canonical correlation analysis (CCA) and mutual nearest neighbors (MNNs), as implemented in Seurat v3 (Stuart et al., 2019), and used to integrate samples into a shared space for further comparison. This process enables identification of shared populations of cells between samples, even in the presence of technical or biological differences, while also allowing for non-overlapping populations that are unique to individual samples.
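
    The cell-level thresholds described above can be sketched as a simple filter. This is a minimal illustration in Python rather than the authors' Seurat code; the `CellQC` record, its field names, and the example barcodes and values are hypothetical, and only the cutoffs come from the description.

```python
from dataclasses import dataclass


@dataclass
class CellQC:
    """Per-cell QC summary (hypothetical record, not a Seurat structure)."""
    barcode: str
    n_umi: int             # total UMIs detected in the cell
    n_features: int        # number of genes detected in the cell
    pct_mito: float        # percent of reads mapped to mitochondrial genes
    pct_hemoglobin: float  # percent of reads mapped to hemoglobin genes


def passes_qc(cell: CellQC) -> bool:
    """Apply the cutoffs quoted in the description."""
    if cell.n_umi < 1000 or cell.n_umi > 20000:  # low quality / doublets
        return False
    if cell.n_features < 300:                    # low quality
        return False
    if cell.pct_mito > 10.0:                     # dying cells
        return False
    if cell.pct_hemoglobin > 1.0:                # red blood cells
        return False
    return True


# Made-up example cells, one per failure mode plus one that passes.
cells = [
    CellQC("AAAC", 5200, 1800, 3.1, 0.0),   # passes every cutoff
    CellQC("AAAG", 800, 350, 2.0, 0.0),     # too few UMIs
    CellQC("AACT", 25000, 4000, 4.0, 0.0),  # too many UMIs
    CellQC("AAGT", 3000, 900, 14.5, 0.0),   # high mitochondrial fraction
    CellQC("ACGT", 2500, 700, 1.0, 6.0),    # high hemoglobin fraction
    CellQC("ATTT", 1500, 250, 1.0, 0.0),    # too few features
]
kept = [c.barcode for c in cells if passes_qc(c)]
print(kept)
```

    Values that sit exactly on a cutoff (for example 1000 UMIs or 10% mitochondrial reads) are kept here, which matches the strict inequalities in the text.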

    Clustering and reference-based cell identity labeling of single immune cells from IAV-infected lung with scRNA-seq

    Principal components were identified from the integrated dataset and were used for Uniform Manifold Approximation and Projection (UMAP) visualization of the data in two-dimensional space. A shared-nearest-neighbor (SNN) graph was constructed using default parameters, and clusters were identified using the SLM algorithm in Seurat at a range of resolutions (0.2-2). The first 30 principal components were used to identify 22 cell clusters ranging in size from 25 to 2310 cells. Gene markers for clusters were identified with the findMarkers function in scran. To label individual cells with cell type identities, we used the SingleR package (v. 3.1.1) to compare gene expression profiles of individual cells with expression data from curated, FACS-sorted leukocyte samples in the ImmGen compendium (Aran et al., 2019; Heng et al., 2008). We manually updated the ImmGen reference annotation with 263 sample group labels for fine-grain analysis and 25 CD45+ cell type identities based on markers used to sort ImmGen samples (Guilliams et al., 2014). The reference annotation is provided in Table S2. Cells that were not labeled confidently after label pruning were assigned “Unknown”.

    Differential gene expression by immune cells

    Differential gene expression within individual cell types was assessed by pooling raw count data from cells of each cell type on a per-sample basis to create a pseudo-bulk count table for each cell type. Differential expression analysis was only performed on cell types that were sufficiently represented (>10 cells) in each sample. In droplet-based scRNA-seq, ambient RNA from lysed cells is incorporated into droplets, and can result in spurious identification of these genes in cell types where they aren’t actually expressed. We therefore used a method developed by Young and Behjati (Young et al., 2018) to estimate the contribution of ambient RNA for each gene, and identified genes in each cell type that were estimated to be > 25% ambient-derived. These genes were excluded from analysis in a cell-type specific manner. Genes expressed in less than 5 percent of cells were also excluded from analysis. Differential expression analysis was then performed in limma (limma-voom with quality weights) following a standard protocol for bulk RNA-seq (Law et al., 2014). Significant genes were identified using MA/QC criteria of P < .05 and log2FC > 1.
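
    The pseudo-bulk step described above can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the authors' R code: the `pseudo_bulk` function, the tuple layout of the input, and the sample and gene names in the example are hypothetical. Only the pooling by cell type and sample and the ">10 cells in each sample" rule come from the description.

```python
from collections import Counter, defaultdict


def pseudo_bulk(cells, min_cells=10):
    """Sum raw counts per (cell type, sample).

    cells: iterable of (sample, cell_type, {gene: raw_count}) tuples.
    Returns {cell_type: {sample: {gene: summed_count}}}, keeping only
    cell types with more than `min_cells` cells in every sample.
    """
    n_cells = Counter()            # number of cells per (cell type, sample)
    sums = defaultdict(Counter)    # summed counts per (cell type, sample)
    samples = set()
    for sample, cell_type, counts in cells:
        samples.add(sample)
        n_cells[(cell_type, sample)] += 1
        sums[(cell_type, sample)].update(counts)  # adds counts gene by gene

    cell_types = {ct for ct, _ in n_cells}
    tables = {}
    for ct in cell_types:
        # A missing (cell type, sample) pair counts as 0 cells.
        if all(n_cells[(ct, s)] > min_cells for s in samples):
            tables[ct] = {s: dict(sums[(ct, s)]) for s in samples}
    return tables


# Made-up data: neutrophils are well represented in both samples,
# NK cells are too rare in the second sample.
cells = []
for sample in ("ctrl_1", "As_1"):
    cells += [(sample, "Neutrophil", {"S100a8": 2, "Il1b": 1})] * 11
    cells += [(sample, "NK", {"Gzma": 3})] * (11 if sample == "ctrl_1" else 4)

tables = pseudo_bulk(cells)
print(sorted(tables))
```

    The resulting per-cell-type tables have one column of summed counts per sample, which is the shape a bulk RNA-seq tool expects as input.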

    Analysis of arsenic effect on immune cell gene expression by scRNA-seq.

    Sample-wide effects of arsenic on gene expression were identified by pooling raw count data from all cells per sample to create a count table for pseudo-bulk gene expression analysis. Genes with fewer than 20 counts in any sample, or fewer than 60 total counts, were excluded from analysis. Differential expression analysis was performed using limma-voom as described above.
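
    The gene-level count filter described above amounts to two conditions per gene. The following is a minimal Python sketch; the `keep_gene` function and the example genes and counts are hypothetical, and only the two thresholds (20 counts per sample, 60 counts in total) come from the description.

```python
def keep_gene(counts_per_sample, min_per_sample=20, min_total=60):
    """Return True if the gene survives both count thresholds.

    A gene is excluded if any sample has fewer than `min_per_sample`
    counts, or if the counts summed over samples are below `min_total`.
    """
    return (min(counts_per_sample) >= min_per_sample
            and sum(counts_per_sample) >= min_total)


# Made-up pseudo-bulk counts for six samples.
pseudo_bulk_counts = {
    "Il6":    [25, 30, 22, 40, 21, 33],   # passes both thresholds
    "Cxcl10": [0, 150, 90, 80, 70, 60],   # one sample below 20 counts
}
kept_genes = [g for g, row in pseudo_bulk_counts.items() if keep_gene(row)]
print(kept_genes)
```

    With three or more samples the per-sample threshold already implies at least 60 total counts, so in this sketch the second condition only matters for designs with fewer than three samples.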

  5. u

    Data from: Reference transcriptomics of porcine peripheral immune cells...

    • agdatacommons.nal.usda.gov
    • datasets.ai
    • +1more
    zip
    Updated Nov 21, 2025
    Cite
    Juber Herrera-Uribe; Jayne Wiarda; Sathesh K. Sivasankaran; Lance Daharsh; Haibo Liu; Kristen A. Byrne; Timothy P. L. Smith; Joan K. Lunney; Crystal L. Loving; Christopher K. Tuggle (2025). Data from: Reference transcriptomics of porcine peripheral immune cells created through bulk and single-cell RNA sequencing [Dataset]. http://doi.org/10.15482/USDA.ADC/1522411
    Dataset updated
    Nov 21, 2025
    Dataset provided by
    Ag Data Commons
    Authors
    Juber Herrera-Uribe; Jayne Wiarda; Sathesh K. Sivasankaran; Lance Daharsh; Haibo Liu; Kristen A. Byrne; Timothy P. L. Smith; Joan K. Lunney; Crystal L. Loving; Christopher K. Tuggle
    License

    Attribution-ShareAlike 4.0 (CC BY-SA 4.0): https://creativecommons.org/licenses/by-sa/4.0/
    License information was derived automatically

    Description

    This dataset contains files reconstructing single-cell data presented in 'Reference transcriptomics of porcine peripheral immune cells created through bulk and single-cell RNA sequencing' by Herrera-Uribe & Wiarda et al. 2021. Samples of peripheral blood mononuclear cells (PBMCs) were collected from seven pigs and processed for single-cell RNA sequencing (scRNA-seq) in order to provide a reference annotation of porcine immune cell transcriptomics at enhanced, single-cell resolution. Analysis of single-cell data allowed identification of 36 cell clusters that were further classified into 13 cell types, including monocytes, dendritic cells, B cells, antibody-secreting cells, numerous populations of T cells, NK cells, and erythrocytes. Files may be used to reconstruct the data as presented in the manuscript, allowing for individual query by other users. Scripts for original data analysis are available at https://github.com/USDA-FSEPRU/PorcinePBMCs_bulkRNAseq_scRNAseq. Raw data are available at https://www.ebi.ac.uk/ena/browser/view/PRJEB43826. Funding for this dataset was also provided by NRSP8: National Animal Genome Research Program (https://www.nimss.org/projects/view/mrp/outline/18464).

    Resources in this dataset:

    Resource Title: Herrera-Uribe & Wiarda et al. PBMCs - All Cells 10X Format.
    File Name: PBMC7_AllCells.zip
    Resource Description: Zipped folder containing PBMC counts matrix, gene names, and cell IDs. Files are as follows:

    • matrix of gene counts* (matrix.mtx.gz)
    • gene names (features.tsv.gz)
    • cell IDs (barcodes.tsv.gz)

    *The ‘raw’ count matrix is actually gene counts obtained following ambient RNA removal. During ambient RNA removal, we chose to calculate non-integer count estimates, so most gene counts in this matrix are non-integer values, but they should still be treated as raw/unnormalized data that requires further normalization/transformation. Data can be read into R using the function Read10X().

    Resource Title: Herrera-Uribe & Wiarda et al. PBMCs - All Cells Metadata.
    File Name: PBMC7_AllCells_meta.csv
    Resource Description: .csv file containing metadata for cells included in the final dataset. Metadata columns include:

    • nCount_RNA = the number of transcripts detected in a cell
    • nFeature_RNA = the number of genes detected in a cell
    • Loupe = cell barcodes; correspond to the cell IDs found in the .h5Seurat and 10X formatted objects for all cells
    • prcntMito = percent mitochondrial reads in a cell
    • Scrublet = doublet probability score assigned to a cell
    • seurat_clusters = cluster ID assigned to a cell
    • PaperIDs = sample ID for a cell
    • celltypes = cell type ID assigned to a cell

    Resource Title: Herrera-Uribe & Wiarda et al. PBMCs - All Cells PCA Coordinates.
    File Name: PBMC7_AllCells_PCAcoord.csv
    Resource Description: .csv file containing first 100 PCA coordinates for cells.

    Resource Title: Herrera-Uribe & Wiarda et al. PBMCs - All Cells t-SNE Coordinates.
    File Name: PBMC7_AllCells_tSNEcoord.csv
    Resource Description: .csv file containing t-SNE coordinates for all cells.

    Resource Title: Herrera-Uribe & Wiarda et al. PBMCs - All Cells UMAP Coordinates.
    File Name: PBMC7_AllCells_UMAPcoord.csv
    Resource Description: .csv file containing UMAP coordinates for all cells.

    Resource Title: Herrera-Uribe & Wiarda et al. PBMCs - CD4 T Cells t-SNE Coordinates.
    File Name: PBMC7_CD4only_tSNEcoord.csv
    Resource Description: .csv file containing t-SNE coordinates for only CD4 T cells (clusters 0, 3, 4, 28). A dataset of only CD4 T cells can be re-created from the PBMC7_AllCells.h5Seurat, and t-SNE coordinates used in publication can be re-assigned using this .csv file.

    Resource Title: Herrera-Uribe & Wiarda et al. PBMCs - CD4 T Cells UMAP Coordinates.
    File Name: PBMC7_CD4only_UMAPcoord.csv
    Resource Description: .csv file containing UMAP coordinates for only CD4 T cells (clusters 0, 3, 4, 28). A dataset of only CD4 T cells can be re-created from the PBMC7_AllCells.h5Seurat, and UMAP coordinates used in publication can be re-assigned using this .csv file.

    Resource Title: Herrera-Uribe & Wiarda et al. PBMCs - Gamma Delta T Cells UMAP Coordinates.
    File Name: PBMC7_GDonly_UMAPcoord.csv
    Resource Description: .csv file containing UMAP coordinates for only gamma delta T cells (clusters 6, 21, 24, 31). A dataset of only gamma delta T cells can be re-created from the PBMC7_AllCells.h5Seurat, and UMAP coordinates used in publication can be re-assigned using this .csv file.

    Resource Title: Herrera-Uribe & Wiarda et al. PBMCs - Gamma Delta T Cells t-SNE Coordinates.
    File Name: PBMC7_GDonly_tSNEcoord.csv
    Resource Description: .csv file containing t-SNE coordinates for only gamma delta T cells (clusters 6, 21, 24, 31). A dataset of only gamma delta T cells can be re-created from the PBMC7_AllCells.h5Seurat, and t-SNE coordinates used in publication can be re-assigned using this .csv file.

    Resource Title: Herrera-Uribe & Wiarda et al. PBMCs - Gene Annotation Information.
    File Name: UnfilteredGeneInfo.txt
    Resource Description: .txt file containing gene nomenclature information used to assign gene names in the dataset. 'Name' column corresponds to the name assigned to a feature in the dataset.

    Resource Title: Herrera-Uribe & Wiarda et al. PBMCs - All Cells H5Seurat.
    File Name: PBMC7.tar
    Resource Description: .h5Seurat object of all cells in PBMC dataset. File needs to be untarred, then read into R using function LoadH5Seurat().

  6. E

    Bulk RNA and ATACseq of 2 XLP patients and 5-6 HD

    • ega-archive.org
    Updated Nov 24, 2025
    Cite
    (2025). Bulk RNA and ATACseq of 2 XLP patients and 5-6 HD [Dataset]. https://ega-archive.org/datasets/EGAD50000002072
    Dataset updated
    Nov 24, 2025
    License

    https://ega-archive.org/dacs/EGAC50000000822

    Description

    The dataset contains five samples from P1, one pre-HSCT and four post-HSCT, one sample from P2 and 5-6 HD. RNAseq reads were processed using the LUMC BIOWDL RNAseq pipeline v5.0.0 (https://github.com/biowdl/RNA-seq). Genes with log2CPM≥1 in at least 10% of all samples were retained for downstream analysis. Differential gene expression analysis was performed using the dgeAnalysis R-Shiny application (https://github.com/LUMC/dgeAnalysis/tree/v1.4.4). ATACseq reads were processed using the LUMC BIOWDL Chipseq pipeline 1.0.0-dev (https://github.com/biowdl/ChIP-seq).
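
    The expression filter mentioned above (log2CPM≥1 in at least 10% of all samples) can be sketched as follows. This is a minimal Python illustration, not the pipeline's own code: the `filter_genes` function and the example counts are hypothetical, and it assumes a plain log2 of counts per million with no prior count added, which may differ from how the pipeline computes log2CPM.

```python
import math


def filter_genes(counts, min_log2cpm=1.0, min_fraction=0.10):
    """Keep genes with log2(CPM) >= min_log2cpm in enough samples.

    counts: {gene: [raw count per sample]}, all rows the same length.
    Assumption: CPM = count * 1e6 / library size, with no prior count.
    """
    rows = list(counts.values())
    n_samples = len(rows[0])
    # Library size = total counts in each sample (column sums).
    lib_sizes = [sum(row[j] for row in rows) for j in range(n_samples)]

    kept = []
    for gene, row in counts.items():
        n_pass = 0
        for c, lib in zip(row, lib_sizes):
            # Zero counts have log2(CPM) = -inf and can never pass.
            if c > 0 and math.log2(c * 1e6 / lib) >= min_log2cpm:
                n_pass += 1
        if n_pass / n_samples >= min_fraction:
            kept.append(gene)
    return kept


# Made-up counts for 10 samples; every library sums to 1,000,000 reads,
# so each raw count equals its CPM.
counts = {
    "high":  [999_995] + [999_999] * 9,
    "low":   [1] * 10,          # CPM 1 everywhere, log2(CPM) = 0
    "spiky": [4] + [0] * 9,     # CPM 4 in exactly one of ten samples
}
print(filter_genes(counts))
```

    In this example "spiky" is retained because one of ten samples (exactly 10%) clears the threshold, while "low" never reaches log2CPM of 1.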

  7. f

    Table_2_Comparison of Normalization Methods for Analysis of TempO-Seq...

    • datasetcatalog.nlm.nih.gov
    Updated Jun 23, 2020
    Cite
    Paules, Richard S.; Ramaiahgari, Sreenivasa C.; Ferguson, Stephen S.; Auerbach, Scott S.; Bushel, Pierre R. (2020). Table_2_Comparison of Normalization Methods for Analysis of TempO-Seq Targeted RNA Sequencing Data.xlsx [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0000579048
    Dataset updated
    Jun 23, 2020
    Authors
    Paules, Richard S.; Ramaiahgari, Sreenivasa C.; Ferguson, Stephen S.; Auerbach, Scott S.; Bushel, Pierre R.
    Description

    Analysis of bulk RNA sequencing (RNA-Seq) data is a valuable tool to understand transcription at the genome scale. Targeted sequencing of RNA has emerged as a practical means of assessing the majority of the transcriptomic space with less reliance on large resources for consumables and bioinformatics. TempO-Seq is a templated, multiplexed RNA-Seq platform that interrogates a panel of sentinel genes representative of genome-wide transcription. Nuances of the technology require proper preprocessing of the data. Various methods have been proposed and compared for normalizing bulk RNA-Seq data, but there has been little to no investigation of how the methods perform on TempO-Seq data. We simulated count data into two groups (treated vs. untreated) at seven fold-change (FC) levels (including no change) using control samples from human HepaRG cells run on TempO-Seq and normalized the data using seven normalization methods. Upper Quartile (UQ) performed the best with regard to maintaining FC levels as detected by a limma contrast between treated vs. untreated groups. For all FC levels, specificity of the UQ normalization was greater than 0.84 and sensitivity greater than 0.90 except for the no change and +1.5 levels. Furthermore, K-means clustering of the simulated genes normalized by UQ agreed the most with the FC assignments [adjusted Rand index (ARI) = 0.67]. Despite having an assumption of the majority of genes being unchanged, the DESeq2 scaling factors normalization method performed reasonably well, as did the simple normalization procedures counts per million (CPM) and total counts (TC). These results suggest that for two-class comparisons of TempO-Seq data, UQ, CPM, TC, or DESeq2 normalization should provide reasonably reliable results at absolute FC levels ≥2.0. These findings will help guide researchers to normalize TempO-Seq gene expression data for more reliable results.
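
    Two of the normalization methods named above, CPM and upper quartile, can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the study's implementation: the function names and example counts are hypothetical, and the upper quartile here is taken as the 75th percentile of each sample's non-zero counts, which is one common convention and may not match the one used in the study.

```python
import statistics


def cpm(sample):
    """Counts per million: scale a sample so its counts sum to 1e6."""
    total = sum(sample)
    return [c * 1e6 / total for c in sample]


def upper_quartile(sample):
    """Divide each count by the 75th percentile of the non-zero counts.

    Assumption: zeros are dropped before taking the quantile.
    """
    nonzero = [c for c in sample if c > 0]
    uq = statistics.quantiles(nonzero, n=4, method="inclusive")[2]
    return [c / uq for c in sample]


# Made-up probe counts: the second sample is the first one sequenced
# twice as deeply, so a good normalization should make them identical.
shallow = [0, 10, 20, 30, 40, 1000]
deep = [2 * c for c in shallow]

print(upper_quartile(shallow))
print(upper_quartile(deep))
```

    Both methods remove a pure sequencing-depth difference like the one in this example. They differ in what sets the scale: CPM uses the library total, which a single very highly expressed probe can dominate, whereas the upper quartile is less sensitive to such a probe.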

  8. f

    Data_Sheet_1_NormExpression: An R Package to Normalize Gene Expression Data...

    • frontiersin.figshare.com
    application/cdfv2
    Updated Jun 1, 2023
    Cite
    Zhenfeng Wu; Weixiang Liu; Xiufeng Jin; Haishuo Ji; Hua Wang; Gustavo Glusman; Max Robinson; Lin Liu; Jishou Ruan; Shan Gao (2023). Data_Sheet_1_NormExpression: An R Package to Normalize Gene Expression Data Using Evaluated Methods.doc [Dataset]. http://doi.org/10.3389/fgene.2019.00400.s001
    Dataset updated
    Jun 1, 2023
    Dataset provided by
    Frontiers
    Authors
    Zhenfeng Wu; Weixiang Liu; Xiufeng Jin; Haishuo Ji; Hua Wang; Gustavo Glusman; Max Robinson; Lin Liu; Jishou Ruan; Shan Gao
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Data normalization is a crucial step in gene expression analysis, as it ensures the validity of downstream analyses. Although many metrics have been designed to evaluate existing normalization methods, different metrics, or different datasets evaluated with the same metric, yield inconsistent results, particularly for single-cell RNA sequencing (scRNA-seq) data. In the worst cases, a method evaluated as the best by one metric is evaluated as the poorest by another metric, or a method evaluated as the best using one dataset is evaluated as the poorest using another dataset. This raises an open question: what principles should guide the evaluation of normalization methods? In this study, we propose the principle that a normalization method evaluated as the best by one metric should also be evaluated as the best by another metric (the consistency of metrics), and a method evaluated as the best using scRNA-seq data should also be evaluated as the best using bulk RNA-seq data or microarray data (the consistency of datasets). We then designed a new metric named Area Under normalized CV threshold Curve (AUCVC) and applied it, together with another metric, mSCC, to evaluate 14 commonly used normalization methods using both scRNA-seq data and bulk RNA-seq data, satisfying the consistency of metrics and the consistency of datasets. Our findings pave the way to guide future studies in the normalization of gene expression data and its evaluation. The raw gene expression data, normalization methods, and evaluation metrics used in this study have been included in an R package named NormExpression. NormExpression provides a framework and a fast, simple way for researchers to select the best method for normalizing their gene expression data, based on the evaluation of different methods (particularly some data-driven methods or their own methods) under the principle of the consistency of metrics and the consistency of datasets.

  9. d

    Whole blood RNA-seq demonstrates an increased host immune response in...

    • search.dataone.org
    • data.niaid.nih.gov
    • +2more
    Updated Nov 29, 2023
    Cite
    Miguel Prieto; Bradley Quon; Jiah Jang; Alessandro N. Franciosi; Yossef Av-Gay; Horacio Bach; Scott J. Tebbutt (2023). Whole blood RNA-seq demonstrates an increased host immune response in individuals with cystic fibrosis who develop nontuberculous mycobacterial pulmonary disease [Dataset]. http://doi.org/10.5061/dryad.np5hqbzx2
    Dataset updated
    Nov 29, 2023
    Dataset provided by
    Dryad Digital Repository
    Authors
    Miguel Prieto; Bradley Quon; Jiah Jang; Alessandro N. Franciosi; Yossef Av-Gay; Horacio Bach; Scott J. Tebbutt
    Time period covered
    Jan 1, 2022
    Description

    Background Individuals with cystic fibrosis have an elevated lifetime risk of colonization, infection, and disease caused by nontuberculous mycobacteria. A prior study involving non-cystic fibrosis individuals reported a gene expression signature associated with susceptibility to nontuberculous mycobacteria pulmonary disease (NTM-PD). In this study, we determined whether people living with cystic fibrosis who progress to NTM-PD have a gene expression pattern similar to the one seen in the non-cystic fibrosis population.
    Methods We evaluated whole blood transcriptomics using bulk RNA-seq in a cohort of cystic fibrosis patients with samples collected closest in timing to the first isolation of nontuberculous mycobacteria. The study population included patients who did (n = 12) and did not (n = 30) develop NTM-PD following the first mycobacterial growth. Progression to NTM-PD was defined by a consensus of two expert clinicians based on reviewing clinical, microbiological, and radiologic...
    Study population and clinical data This study is a secondary data analysis using blood samples and data from the “CF Biomarker” cohort approved by the University of British Columbia-Providence Health Care ethics review board (H12-00835). The local ethics board also reviewed and approved the secondary analysis (H20-00117). Patients in the parent cohort were recruited following informed consent at the St. Paul’s Hospital Adult CF Clinic (Vancouver, Canada) between January 2012 and December 2019. In the current analysis, we included participants who consented to the future use of their samples and data, had at least one positive respiratory culture for NTM, and had a whole blood RNA sample available (PAXgene® stored at -70°C). Lung transplant recipients and subjects without a definite diagnosis of CF were excluded. We preferentially selected blood samples taken during clinically stable periods and closest to the first positive growth of NTM. We did not limit blood sampling to within a spec...
    The data sets are provided as comma-separated values and can be opened with standard statistical software or explored with a spreadsheet program. In our analyses, we employed R and the GUI RStudio (v 4.1.1). Raw sequencing processing and transcript counting were done on a CentOS high-performance cluster; dependencies and the commands run are described inside markdown scripts.

  10. Data from: Androgen receptor-negative prostate cancer is vulnerable to...

    • explore.openaire.eu
    Updated Apr 26, 2024
    Cite
    Andrej Benjak; Phillip Thienger (2024). Androgen receptor-negative prostate cancer is vulnerable to SWI/SNF-targeting degrader molecules [Dataset]. http://doi.org/10.5281/zenodo.11074188
    Explore at:
    Dataset updated
    Apr 26, 2024
    Authors
    Andrej Benjak; Phillip Thienger
    Description

    Cell lines and compounds. PCa cell lines (LNCaP, 22Rv1, VCaP, PC3, DU145, NCI-H660, C4-2), other cell lines (HEK293T, DLD1) and a benign prostate line (RWPE-1) were purchased from ATCC and maintained according to ATCC protocols. Patient-derived CRPC organoids (WCM and MSK) were established and maintained as organoids in Matrigel drops according to the previously described protocol [70]. LNCaP-AR cells were a kind gift from Dr. Sawyers and Dr. Mu (Memorial Sloan Kettering Cancer Center) and were cultured as previously described [5]. All used cell lines and their phenotype are listed in Supplementary Table 1. Cell cultures were regularly tested for Mycoplasma contamination and confirmed to be negative. Genentech Inc. synthesized A947, its epimer (A858), FHD-286 and AU-15330. Cobimetinib, Trametinib, VL285 and CHIR99021 were purchased from SelleckChem. BRM014 was purchased from MedChemExpress. All drugs used in this study are listed in Supplementary Table 2.

    Single-cell RNA-sequencing by SORT-seq library generation and analysis. SORT-seq was performed using the Single Cell Discoveries (SCD) service. Organoids were treated for 72h with a control epimer (A858) or active compound (A947) at 1 µM, and 1 × 10^6 cells were harvested in PBS. Harvested cells were stained with 100 ng/ml DAPI to stain dead cells. Using a cell sorter (conducted by Flow Cytometry Core, DBMR, Bern) and the recommended settings (Single Cell Discoveries B.V.), DAPI-negative cells were sorted as single cells in 376 wells of four 384-well plates containing immersion oil per condition, resulting in a theoretical cell number of 1504 cells per condition. All post-harvesting steps were performed at 4°C. Plates were snap-frozen on dry ice for 15 minutes and sent out for sequencing at Single Cell Discoveries B.V. Data were analyzed using the Seurat package v.4.3.0 [80]. Cell QC filtering was done using the following thresholds: nCount > 4000, nFeature > 1000, percent.mito 0.85. Differential gene expression analysis between clusters was done with Seurat::FindAllMarkers. Module scores were generated with Seurat::AddModuleScore. Gene set enrichment analysis was done with the package fgsea v.1.24.0 [81] and the human gene sets from the Molecular Signatures Database (https://www.gsea-msigdb.org). Gene regulatory networks analysis was done with pySCENIC v.0.12.1 [82]. Overall analysis was done in R v.4.2.2.

    RNA-seq library generation and processing. For bulk RNA-seq, organoids were treated with A858 or A947 (1 µM) for 24h and 48h (3 biological replicates per condition). RNA was extracted using the RNeasy Kit (Qiagen); library generation and subsequent sequencing were performed by the clinical genomics lab (CGL) at the University of Bern. Sequencing reads were aligned against the human genome hg38 with STAR v.2.7.3a [83]. Gene counts were generated with RSEM v.1.3.2 [84], whose index was generated using the GENCODE v33 primary assembly annotation. Differential gene expression analysis was done with DESeq2 v.1.34.0 [85]. Gene set enrichment analysis was done with the package fgsea v.1.20.0 [81] and the human gene sets from the Molecular Signatures Database (https://www.gsea-msigdb.org). Analysis was done in R v.4.1.2.

    TCF7L2 ChIP-seq library generation and processing. For the ChIP-Seq assay, chromatin was prepared from 2 biological replicates of WCM1078 treated with A858 or A947 (1 µM) for 4h, and ChIP-Seq assays were then performed by Active Motif Inc. using an antibody against TCF7L2 (Cell Signalling, cat#2569). ChIP-seq sequence data were processed using an ENCODE-DCC/chip-seq-pipeline2-based workflow (https://github.com/ENCODE-DCC/chip-seq-pipeline2). Briefly, fastq files were aligned to the hg38 human genome reference using Bowtie2 (v2.2.6), followed by sorting of the resulting bam files (samtools v1.7), filtering out unmapped reads and keeping reads with mapping quality higher than 30. Duplicates were removed with Picard’s MarkDuplicates (v1.126) function, followed by indexing of the resulting bam files with samtools. For each bam file, genome coverage was computed with bedtools (v2.26.0), followed by the generation of bigwig (wigToBigWig v377) files. Peaks were called with macs2 (v2.2.4) for each treatment sample using a pooled input alignment (.bam file) as control. Downstream analyses were performed with DiffBind v3.11.1 with default parameters, except for summits=250 in dba.count(). dba.contrast() and dba.analyze() were used to compute significant differential peaks with DESeq2.

    ATAC-seq library generation and processing. ATAC-seq was performed from 50,000 cryo-preserved cells per condition (1 µM A858 and 1 µM A947, n = 3 biological replicates) treated for 4h and analyzed as described in a previous study [86]. Briefly, 50,000 cryo-preserved cells per condition were lysed for 5 minutes on ice and tagmented for 30 minutes at 37°C, followed by DNA isolation. DNA was barcoded and amplified before sequencing.

    PRO-cap library generation and processing. For PRO-cap, approximately ...
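
The count and feature thresholds quoted for the SORT-seq cell QC step, together with the stated plate arithmetic, can be illustrated with a minimal Python sketch. This is not the authors' Seurat code: the `CellQC` record, its field names, and the example barcodes are hypothetical, and the garbled `percent.mito` criterion is deliberately left out because its threshold cannot be recovered from the description.

```python
from dataclasses import dataclass


@dataclass
class CellQC:
    """Per-cell QC summary (hypothetical record, for illustration only)."""
    barcode: str
    n_count: int    # total counts detected in the cell (nCount)
    n_feature: int  # number of genes detected in the cell (nFeature)


# Thresholds quoted in the description: nCount > 4000, nFeature > 1000.
MIN_COUNT = 4000
MIN_FEATURE = 1000


def passes_qc(cell: CellQC) -> bool:
    """Keep a cell only if it exceeds both thresholds."""
    return cell.n_count > MIN_COUNT and cell.n_feature > MIN_FEATURE


def filter_cells(cells: list[CellQC]) -> list[CellQC]:
    """Return the cells that pass the count and feature thresholds."""
    return [cell for cell in cells if passes_qc(cell)]


# Plate arithmetic from the description: 376 sorted wells on each of 4 plates.
WELLS_USED_PER_PLATE = 376
PLATES_PER_CONDITION = 4
theoretical_cells = WELLS_USED_PER_PLATE * PLATES_PER_CONDITION

# Made-up example cells, not data from the study.
example_cells = [
    CellQC("cell_1", n_count=5200, n_feature=1800),
    CellQC("cell_2", n_count=3900, n_feature=1500),  # fails nCount
    CellQC("cell_3", n_count=8000, n_feature=900),   # fails nFeature
]
kept = filter_cells(example_cells)
```

Both comparisons are strict, matching the ">" signs in the description, so a cell with exactly 4000 counts would be dropped.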

  11. RNAseq dataset for MALT1 in MCL

    • ega-archive.org
    Cite
    RNAseq dataset for MALT1 in MCL [Dataset]. https://ega-archive.org/datasets/EGAD00001009771
    Explore at:
    License

    https://ega-archive.org/dacs/EGAC00001002989

    Description

    Manuscript Title: Co-targeting of BTK and MALT1 overcomes resistance to BTK inhibitors in mantle cell lymphoma

    Journal: Journal of Clinical Investigation

    Authors Vivian Changying Jiang1, Yang Liu1, Junwei Lian1, Shengjian Huang1, Alexa Jordan1, Qingsong Cai1, Fangfang Yan3, Joseph Mitchell McIntosh1, Yijing Li1, Yuxuan Che1, Zhihong Chen1, Jovanny Vargas1, Maria Badillo1, JohnNelson Bigcal1, Heng-Huan Lee1, Wei Wang1, Yixin Yao1, Lei Nie1, Christopher Flowers1, and Michael Wang1, 2*

    Abstract Bruton’s tyrosine kinase (BTK) is a proven target in mantle cell lymphoma (MCL), an aggressive subtype of non-Hodgkin lymphoma. However, resistance to BTK inhibitors is a major clinical challenge. We here report that MALT1 is one of the top overexpressed genes in ibrutinib-resistant MCL cells, while expression of CARD11, which is upstream of MALT1, is decreased. MALT1 genetic knockout or inhibition produced dramatic defects in MCL cell growth regardless of ibrutinib sensitivity. Conversely, CARD11 knockout cells showed anti-tumor effects only in ibrutinib-sensitive cells, suggesting that MALT1 overexpression could drive ibrutinib resistance via bypassing BTK-CARD11 signaling. Additionally, BTK knockdown and MALT1 knockout markedly impaired MCL tumor migration and dissemination, and MALT1 pharmacological inhibition decreased MCL cell viability, adhesion, and migration by suppressing NF-κB, PI3K-AKT-mTOR, and integrin signaling. Importantly, co-targeting MALT1 with safimaltib and BTK with pirtobrutinib induced potent anti-MCL activity in ibrutinib-resistant MCL cell lines and patient-derived xenografts. Therefore, we conclude that MALT1 overexpression associates with resistance to BTK inhibitors in MCL, targeting abnormal MALT1 activity could be a promising therapeutic strategy to overcome BTK inhibitor resistance, and co-targeting of MALT1 and BTK should improve MCL treatment efficacy and durability as well as patient outcomes.

    Dataset description: The bulk RNA-seq dataset was generated for the cell lines below and used for two major purposes: 1. DEG analysis and GSEA analysis comparing IBN-R and IBN-S cells 2. DEG analysis and GSEA analysis comparing MCL cells with/without MI-2 treatment.

    sample | Cell | MI-2 | Ibrutinib (IBN) | Venetoclax (VEN) | Used for IBN-R vs IBN-S comparison | Used for MI-2 vs untreated (DMSO)
    H9 | Granta519 | - | R | S | yes |
    H21 | Granta519 | - | R | S | yes |
    H33 | Granta519 | - | R | S | yes |
    H10 | Granta519-VEN-R | - | R | R | yes |
    H22 | Granta519-VEN-R | - | R | R | yes |
    H34 | Granta519-VEN-R | - | R | R | yes |
    H3 | JeKo BTK KD_1 | - | R | R | yes | yes
    H15 | JeKo BTK KD_1 | - | R | R | yes | yes
    H27 | JeKo BTK KD_1 | - | R | R | yes | yes
    H5 | JeKo BTK KD_2 | - | R | R | yes | yes
    H17 | JeKo BTK KD_2 | - | R | R | yes | yes
    H29 | JeKo BTK KD_2 | - | R | R | yes | yes
    H1 | JeKo-1 | - | S | R | yes | yes
    H13 | JeKo-1 | - | S | R | yes | yes
    H25 | JeKo-1 | - | S | R | yes | yes
    H7 | Mino | - | S | S | yes |
    H19 | Mino | - | S | S | yes |
    H31 | Mino | - | S | S | yes |
    H8 | Mino-VEN-R | - | S | R | yes |
    H20 | Mino-VEN-R | - | S | R | yes |
    H32 | Mino-VEN-R | - | S | R | yes |
    H11 | Rec-1 | - | S | S | yes |
    H23 | Rec-1 | - | S | S | yes |
    H12 | Rec-VEN-R | - | S | S | yes |
    H24 | Rec-VEN-R | - | S | R | yes |
    H36 | Rec-VEN-R | - | S | R | yes |
    H35 | Rec-1 | -- | S | R | yes |
    H4 | JeKo BTK KD_1 + MI-2 | + | | | | yes
    H16 | JeKo BTK KD_1 + MI-2 | + | | | | yes
    H28 | JeKo BTK KD_1 + MI-2 | + | | | | yes
    H6 | JeKo BTK KD_2 + MI-2 | + | | | | yes
    H18 | JeKo BTK KD_2 + MI-2 | + | | | | yes
    H30 | JeKo BTK KD_2 + MI-2 | + | | | | yes
    H2 | JeKo-1 + MI-2 | + | | | | yes
    H14 | JeKo-1 + MI-2 | + | | | | yes
    H26 | JeKo-1 + MI-2 | + | | | | yes

  12. Human breast cancer PDX models bulk and single cell RNA sequencing

    • zenodo.org
    bin, csv
    Updated Aug 7, 2024
    Cite
    Long V. Nguyen; Yaniv Eyal-Lubling; Daniel Guerrero-Romero; Raquel Manzano Garcia; Oscar M. Rueda; Carlos Caldas (2024). Human breast cancer PDX models bulk and single cell RNA sequencing [Dataset]. http://doi.org/10.5281/zenodo.10978990
    Explore at:
    Available download formats: bin, csv
    Dataset updated
    Aug 7, 2024
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Long V. Nguyen; Yaniv Eyal-Lubling; Daniel Guerrero-Romero; Raquel Manzano Garcia; Oscar M. Rueda; Carlos Caldas
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This dataset includes information relevant to the following manuscript from the labs of Prof. Carlos Caldas (University of Cambridge), and Dr. Long V. Nguyen (Princess Margaret Cancer Centre, University Health Network):

    Nguyen LV et al. Dynamics and plasticity of human breast cancer single cell-derived clones. Under consideration for publication.

    Bulk RNA sequencing raw count matrices are provided (RawCounts.csv) along with the normalized count matrices (LogCPMNormCounts.csv).
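
The description does not say how the values in LogCPMNormCounts.csv were derived from RawCounts.csv. As a rough illustration of what a log counts-per-million transform does, here is a minimal Python sketch. The formula used, log2(CPM + prior), is an assumption and may differ from the authors' normalization; the function name, the `prior` parameter, and the example matrix are all hypothetical.

```python
import math


def log_cpm(counts_by_sample: dict[str, dict[str, int]],
            prior: float = 1.0) -> dict[str, dict[str, float]]:
    """Convert raw counts to log2(CPM + prior), sample by sample.

    counts_by_sample maps a sample name to a {gene: raw_count} mapping.
    The prior avoids log2(0) for undetected genes (assumed value).
    """
    normalized: dict[str, dict[str, float]] = {}
    for sample, gene_counts in counts_by_sample.items():
        library_size = sum(gene_counts.values())
        if library_size == 0:
            raise ValueError(f"sample {sample!r} has no reads")
        normalized[sample] = {
            # counts per million reads, then log2 with the prior added
            gene: math.log2(count / library_size * 1e6 + prior)
            for gene, count in gene_counts.items()
        }
    return normalized


# Made-up counts: GENE1 holds 1 of 1,000,000 reads, so its CPM is 1.
example = {"sample_A": {"GENE1": 1, "GENE2": 999_999}}
result = log_cpm(example)
```

Scaling by library size makes samples with different sequencing depths comparable, and the log transform compresses the wide dynamic range of expression values.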

    Single cell RNA sequencing count matrix processed from R package metacell is provided (mat.pdx_LN_v2_filt.Rda), along with the mc and mc2d files with information on metacell partitions (mc.pdx_LN_v2_filt.Rda and mc2d.pdx_LN_v2_filt.Rda).

    Code and information on data analysis is provided for reviewers in our unpublished manuscript and on Github (https://github.com/cclab-brca/clone-dynamics).

  13. EBAII n1 scRNAseq : Common resources (gene lists, ...)

    • zenodo.org
    bin
    Updated Nov 12, 2024
    Cite
    Bastien Job (2024). EBAII n1 scRNAseq : Common resources (gene lists, ...) [Dataset]. http://doi.org/10.5281/zenodo.14101506
    Explore at:
    Available download formats: bin
    Dataset updated
    Nov 12, 2024
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Bastien Job
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This repository contains R objects (RDS) for

    1. Gene lists used for the single-cell RNAseq analysis pre-processing
      1. Considered gene-lists :
        1. mitochondrial genes
        2. ribosomal protein-coding genes
        3. mechanical stress response genes
      2. Considered species :
        1. Homo sapiens (human)
        2. Mus musculus (mouse)
        3. Rattus norvegicus (rat)
    2. Reference bulk RNAseq profiles from ImmGenData, for automatic cell type annotation through celldex
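
Gene lists like the ones above are typically used during pre-processing to compute, for each cell, the share of counts that falls on a given gene set (for example, mitochondrial genes). The following Python sketch shows that calculation under stated assumptions: the function name, the two-gene mitochondrial subset, and the example counts are hypothetical and are not taken from the repository's RDS objects.

```python
def percent_counts_in_list(cell_counts: dict[str, int],
                           gene_list: set[str]) -> float:
    """Percentage of a cell's total counts that falls on genes in gene_list."""
    total = sum(cell_counts.values())
    if total == 0:
        return 0.0  # empty cell: report 0 rather than dividing by zero
    in_list = sum(count for gene, count in cell_counts.items()
                  if gene in gene_list)
    return 100.0 * in_list / total


# Hypothetical subset of human mitochondrial gene symbols.
mito_genes = {"MT-CO1", "MT-ND1"}

# Made-up counts for a single cell.
cell = {"MT-CO1": 10, "ACTB": 60, "MT-ND1": 5, "GAPDH": 25}
pct_mito = percent_counts_in_list(cell, mito_genes)
```

The same function works for the ribosomal and stress-response lists by swapping in a different gene set.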
  14. A multitask clustering approach for single-cell RNA-seq analysis in...

    • plos.figshare.com
    pdf
    Updated Jun 1, 2023
    Cite
    Huanan Zhang; Catherine A. A. Lee; Zhuliu Li; John R. Garbe; Cindy R. Eide; Raphael Petegrosso; Rui Kuang; Jakub Tolar (2023). A multitask clustering approach for single-cell RNA-seq analysis in Recessive Dystrophic Epidermolysis Bullosa [Dataset]. http://doi.org/10.1371/journal.pcbi.1006053
    Explore at:
    Available download formats: pdf
    Dataset updated
    Jun 1, 2023
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Huanan Zhang; Catherine A. A. Lee; Zhuliu Li; John R. Garbe; Cindy R. Eide; Raphael Petegrosso; Rui Kuang; Jakub Tolar
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Single-cell RNA sequencing (scRNA-seq) has been widely applied to discover new cell types by detecting sub-populations in a heterogeneous group of cells. Since scRNA-seq experiments have lower read coverage/tag counts and introduce more technical biases compared to bulk RNA-seq experiments, the limited number of sampled cells combined with the experimental biases and other dataset specific variations presents a challenge to cross-dataset analysis and discovery of relevant biological variations across multiple cell populations. In this paper, we introduce a method of variance-driven multitask clustering of single-cell RNA-seq data (scVDMC) that utilizes multiple single-cell populations from biological replicates or different samples. scVDMC clusters single cells in multiple scRNA-seq experiments of similar cell types and markers but varying expression patterns such that the scRNA-seq data are better integrated than typical pooled analyses which only increase the sample size. By controlling the variance among the cell clusters within each dataset and across all the datasets, scVDMC detects cell sub-populations in each individual experiment with shared cell-type markers but varying cluster centers among all the experiments. Applied to two real scRNA-seq datasets with several replicates and one large-scale droplet-based dataset on three patient samples, scVDMC more accurately detected cell populations and known cell markers than pooled clustering and other recently proposed scRNA-seq clustering methods. In the case study applied to in-house Recessive Dystrophic Epidermolysis Bullosa (RDEB) scRNA-seq data, scVDMC revealed several new cell types and unknown markers validated by flow cytometry. MATLAB/Octave code available at https://github.com/kuanglab/scVDMC.

  15. Data from: Hypertrophic cardiomyopathy-associated mutations drive stromal...

    • nde-dev.biothings.io
    • datasetcatalog.nlm.nih.gov
    • +2more
    zip
    Updated Oct 18, 2024
    Cite
    Jourdan Ewoldt; Miranda Wang; Micheal McLellan; Paige Cloonan; Anant Chopra; Joshua Gorham; Linqing Li; Daniel DeLaughter; Xining Gao; Joshua Lee; Jon Willcox; Olivia Layton; Rebeccah Luu; Christopher Toepfer; Jeroen Eyckmans; Christine Seidman; Jonathan Seidman; Christopher Chen (2024). Hypertrophic cardiomyopathy-associated mutations drive stromal activation via EGFR-mediated paracrine signaling [Dataset]. http://doi.org/10.5061/dryad.3n5tb2rqw
    Explore at:
    Available download formats: zip
    Dataset updated
    Oct 18, 2024
    Dataset provided by
    Harvard Medical School
    Massachusetts Institute of Technology
    University of New Hampshire
    Boston University
    Authors
    Jourdan Ewoldt; Miranda Wang; Micheal McLellan; Paige Cloonan; Anant Chopra; Joshua Gorham; Linqing Li; Daniel DeLaughter; Xining Gao; Joshua Lee; Jon Willcox; Olivia Layton; Rebeccah Luu; Christopher Toepfer; Jeroen Eyckmans; Christine Seidman; Jonathan Seidman; Christopher Chen
    License

    https://spdx.org/licenses/CC0-1.0.html

    Description

    Hypertrophic cardiomyopathy (HCM) is characterized by thickening of the left ventricular wall, diastolic dysfunction, and fibrosis, and is associated with mutations in genes encoding sarcomere proteins. While in vitro studies have used human induced pluripotent stem cell-derived cardiomyocytes (hiPSC-CMs) to study HCM, these models have not examined the multicellular interactions involved in fibrosis. Using engineered cardiac microtissues (CMTs) composed of HCM-causing MYH7-variant hiPSC-CMs and wild-type fibroblasts, we observed cell-cell cross-talk leading to increased collagen deposition, tissue stiffening, and decreased contractility dependent on fibroblast proliferation. hiPSC-CM conditioned media and single-nucleus RNA sequencing data suggested that fibroblast proliferation is mediated by paracrine signals from MYH7-variant cardiomyocytes. Furthermore, inhibiting epidermal growth factor receptor tyrosine kinase with erlotinib hydrochloride attenuated stromal activation. Last, HCM-causing MYBPC3-variant CMTs also demonstrated increased stromal activation and reduced contractility, but with distinct characteristics. Together, these findings establish a paracrine-mediated cross-talk potentially responsible for fibrotic changes observed in HCM. Methods snRNA-sequencing data snRNA-seq dataset includes data on CMTs made with healthy wild type (WT) or hypertrophic cardiomyopathy (HCM)-causing (R403Q+/- mutation in myosin heavy chain) human induced pluripotent stem cell derived-cardiomyocytes (hiPSC-CMs) (derived from PGP1 hiPSC line) and ventricular cardiac fibroblasts (Lonza Cat. CC-2904) (1).
    Methods: A total of 60,000 cells per tissue (ten tissues pooled per sample), consisting of 90% hiPSC-CMs and 10% vCFs, were mixed in an ECM solution consisting of 4 mg/ml of human fibrinogen (Sigma), 10% Matrigel (Corning), 0.4 unit of thrombin (Sigma) per mg of fibrinogen, 5 µM Y-27632 (Tocris), and 0.033 mg/mL aprotinin (Sigma). The cell-ECM mixture was pipetted into each tissue well, and after gel polymerization for ten minutes, tissue maintenance growth media containing high glucose DMEM (Fisher) supplemented with 10% FBS (Sigma), 1% penicillin-streptomycin (P/S) (Fisher), 1% Non-essential Amino Acids (Fisher), 1% Glutamax (Fisher), 5 µM Y-27632, 0.033 mg/mL aprotinin, and 150 µg/mL L-ascorbic acid 2-phosphate sesquimagnesium salt hydrate was added. Y-27632 was removed two days following seeding, and the growth media was replaced every other day. CMTs were flash frozen in liquid nitrogen on day 7 (10 tissues pooled per sample) and stored at -80°C. Individual nuclei were isolated from frozen tissue samples. Briefly, nuclei were isolated (2) and RNA was reverse-transcribed and converted into cDNA libraries using a 10x Chromium Controller and Chromium Single Cell 3' v3.1 reagent kit (10x Genomics). Bar-coded libraries were pooled and sequenced (Illumina NovaSeq 6000). Single nucleus RNAseq alignment and gene counts were performed using Cell Ranger 1.2 (10x Genomics) and Seurat (3) and R 4.1.0 and managed via RStudio. There were 15,129 total sequenced nuclei. Principal component (PC) analysis was performed to determine the dimensionality of this dataset within Seurat and the number of PCs was selected using both permutation-based and heuristic methods. Cluster assignment was performed in an unsupervised manner using a shared nearest neighbor approach within Seurat. To validate the ideal number of PCs and clustering resolution in this dataset, repeat analysis was performed while varying both the dimensionality and resolution. 
Using the Clustree R package (4), the hierarchical evolution of cluster identity was determined. This analysis allows the user to investigate how clustering resolution impacts cell identity, in an effort to assign robust classification to cell clusters which remain consistent across resolution values. For each clustering resolution, marker genes for each cluster, TTN for cardiomyocytes and FN1 for fibroblasts, were identified in order to assign relevant biological function to various subclusters. Dimensionality and resolution were adjusted until each identified cluster contained a non-zero amount of significantly expressed marker genes that were able to assign functional gene ontological terms; in this case, at cluster resolution 0.4, which was used for clustering analysis. At greater resolution values, cluster subsets returned either too few genes to assign functional classifications or non-coding genes of unknown function. Dotplot projections of single cells were generated with UMAP coordinates, using the number of dimensions determined from PCA as described above. For the purposes of this analysis, 30 dimensions were considered for both the initial PCA and the subsequent UMAP visualization. To assign cell clusters specific identities, canonical marker gene expression was analyzed to assign broad cell classes. To assign cell subclusters, such as CM1-CM6, unbiased genetic markers for each population were calculated using the Wilcoxon Rank Sum test within Seurat, at which point at most the top 100 genes were used as a Gene Ontology query to assign cellular functions to each cell subcluster. Genes that were upregulated in R403Q+/- cardiomyocytes compared to WT cardiomyocytes with a p-value < 0.01 and fold-change greater than 1.3 were input into Enrichr (5,6). The top upregulated pathways associated with the top upregulated genes were obtained using WikiPathways (7). 
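
The gene-selection rule stated above (p-value < 0.01 and fold-change greater than 1.3) can be written as a simple filter. The Python sketch below is illustrative only: the `GeneResult` record, the function name, and the example values are hypothetical, and it does not reproduce the statistical test that produced the p-values or the Enrichr query that followed.

```python
from typing import NamedTuple


class GeneResult(NamedTuple):
    """One differential-expression result (hypothetical record layout)."""
    gene: str
    fold_change: float  # linear fold change, mutant over wild type
    p_value: float


def select_upregulated(results: list[GeneResult],
                       min_fold_change: float = 1.3,
                       max_p_value: float = 0.01) -> list[str]:
    """Return genes with fold change above and p-value below the cutoffs."""
    return [
        r.gene
        for r in results
        if r.fold_change > min_fold_change and r.p_value < max_p_value
    ]


# Made-up results, not values from the dataset.
example_results = [
    GeneResult("GENE_A", fold_change=1.8, p_value=0.001),   # passes both
    GeneResult("GENE_B", fold_change=1.2, p_value=0.0005),  # fold change too low
    GeneResult("GENE_C", fold_change=2.5, p_value=0.2),     # not significant
]
selected = select_upregulated(example_results)
```

Keeping the cutoffs as parameters lets the same function express other thresholds, such as a different fold-change cutoff for another comparison.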
Bulk RNA-sequencing data Bulk RNA-sequencing data include data on serum starved vCFs, vCFs treated with 100 ng/mL recombinant human epidermal growth factor (rhEGF), and conditioned media from hiPSC-CMs (1). Methods: vCFs were trypsinized, centrifuged, and flash-frozen after two days in serum-starvation (low glucose DMEM with 0.1% fetal bovine serum) vs 100 ng/mL rhEGF or conditioned media from wild-type vs R403Q+/- hypertrophic cardiomyopathy hiPSC-CMs for three different batches. Cells were homogenized in Trizol Reagent (Life Technologies, Inc., Grand Island, NY) with TissueLyzer II (QIAGEN, Inc., Valencia, CA) and RNA was isolated by conventional methods. RNA went through two rounds of mRNA purification (polyA-selection) using Dynabeads mRNA DIRECT Kit (Invitrogen, Carlsbad, CA). Double-stranded cDNA was generated using the Superscript III First-Strand Synthesis System (Invitrogen, Carlsbad, CA). The cDNA products were used to construct libraries with the Nextera XT DNA Sample Preparation Kit (Illumina, Inc., San Diego, CA). Libraries were paired-end sequenced with a read length of 75 base pairs (75PE) on the Illumina NextSeq 500. Reads were aligned to the hg38 Human Genome using Spliced Transcripts Alignment to a Reference (STAR) (8). Data were analyzed as previously described (9) and normalized to the total number of reads per kilobase of exon per million (RPKM). Genes that were upregulated in rhEGF-treated vCFs compared to serum-starved vCFs with a p-value < 0.01 and fold-change greater than 1.3 were input into Enrichr (5,6). For conditioned media experiments, gene expression values were averaged over three experimental repeats. Genes that had an average fold-change expression greater than 1.15 with a p-value < 0.01 for all three experiments were input into Enrichr. The top upregulated pathways and transcription factors associated with the top upregulated genes were obtained using WikiPathways (7) and ChEA (10) databases, respectively.
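
For readers unfamiliar with the RPKM unit mentioned above, the standard definition is the read count for a gene divided by the exon length in kilobases and by the total mapped reads in millions. The Python sketch below applies that definition; it is a generic illustration with made-up numbers and hypothetical names, not the authors' pipeline, which is described only by reference (9).

```python
def rpkm(read_count: int, exon_length_bp: int, total_mapped_reads: int) -> float:
    """Reads per kilobase of exon per million mapped reads.

    RPKM = read_count / (exon_length_bp / 1e3) / (total_mapped_reads / 1e6)
         = read_count * 1e9 / (exon_length_bp * total_mapped_reads)
    """
    if exon_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("exon length and total mapped reads must be positive")
    return read_count * 1e9 / (exon_length_bp * total_mapped_reads)


# Made-up example: 500 reads on a gene with 2,000 bp of exon,
# in a library of 10 million mapped reads.
example_value = rpkm(read_count=500, exon_length_bp=2_000,
                     total_mapped_reads=10_000_000)
```

Dividing by exon length corrects for longer genes collecting more reads, and dividing by library size corrects for sequencing depth.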

    1. J. K. Ewoldt et al. Hypertrophic cardiomyopathy–associated mutations drive stromal activation via EGFR-mediated paracrine signaling. Sci. Adv. 10 (2024).
    2. E. R. Nadelmann, J. M. Gorham, D. Reichart, D. M. Delaughter, H. Wakimoto, E. L. Lindberg, M. Litviňukova, H. Maatz, J. J. Curran, D. Ischiu Gutierrez, N. Hübner, C. E. Seidman, J. G. Seidman, Isolation of Nuclei from Mammalian Cells and Tissues for Single-Nucleus Molecular Profiling. Curr. Protoc. 1 (2021).
    3. Y. Hao, S. Hao, E. Andersen-Nissen, W. M. Mauck, S. Zheng, A. Butler, M. J. Lee, A. J. Wilk, C. Darby, M. Zager, P. Hoffman, M. Stoeckius, E. Papalexi, E. P. Mimitou, J. Jain, A. Srivastava, T. Stuart, L. M. Fleming, B. Yeung, A. J. Rogers, J. M. McElrath, C. A. Blish, R. Gottardo, P. Smibert, R. Satija, Integrated analysis of multimodal single-cell data. Cell 184, 3573-3587.e29 (2021).
    4. L. Zappia, A. Oshlack, Clustering trees: a visualization for evaluating clusterings at multiple resolutions. Gigascience 7 (2018).
    5. E. Y. Chen, C. M. Tan, Y. Kou, Q. Duan, Z. Wang, G. V. Meirelles, N. R. Clark, A. Ma’ayan, Enrichr: Interactive and collaborative HTML5 gene list enrichment analysis tool. BMC Bioinformatics 14, 128 (2013).
    6. Z. Xie, A. Bailey, M. V. Kuleshov, D. J. B. Clarke, J. E. Evangelista, S. L. Jenkins, A. Lachmann, M. L. Wojciechowicz, E. Kropiwnicki, K. M. Jagodnik, M. Jeon, A. Ma’ayan, Gene set knowledge discovery with Enrichr. Curr. Protoc. 1, e90 (2021).
    7. A. R. Pico, T. Kelder, M. P. Van Iersel, K. Hanspers, B. R. Conklin, C. Evelo, WikiPathways: Pathway editing for the people. PLOS Biol. 6, 1403–1407 (2008).
    8. A. Dobin, C. A. Davis, F. Schlesinger, J. Drenkow, C. Zaleski, S. Jha, P. Batut, M. Chaisson, T. R. Gingeras, STAR: Ultrafast universal RNA-seq aligner. Bioinformatics 29, 15–21 (2013).
    9. D. C. Christodoulou, H. Wakimoto, K. Onoue, S. Eminaga, J. M. Gorham, S. R. DePalma, D. S. Herman, P. Teekakirikul, D. A. Conner, D. M. McKean, A. A. Domenighetti, A. Aboukhalil, S. Chang, G. Srivastava, B. McDonough, P. L. de Jager, J. Chen, M. L. Bulyk, J. D. Muehlschlegel, C. E. Seidman, J. G. Seidman, 5’RNA-Seq identifies Fhl1 as a genetic modifier in cardiomyopathy. Journal of Clinical Investigation 124, 1364–1370 (2014).
    10. A. Lachmann, H. Xu, J. Krishnan, S. I. Berger, A. R. Mazloom, A. Ma’ayan, ChEA: Transcription factor regulation inferred from integrating genome-wide ChIP-X experiments. Bioinformatics 26, 2438–2444 (2010).
  16. Data Repository: Single-cell mapper (scMappR): using scRNA-seq to infer...

    • data.niaid.nih.gov
    Updated Feb 12, 2021
    Cite
    Dustin Sokolowski; Mariela Faykoo-Martinez; Lauren Erdman; Huayun Hou; Cadia Chan; Helen Zhu; Melissa M. Holmes; Anna Goldenberg; Michael D Wilson (2021). Data Repository: Single-cell mapper (scMappR): using scRNA-seq to infer cell-type specificities of differentially expressed genes [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_4278129
    Explore at:
    Dataset updated
    Feb 12, 2021
    Dataset provided by
    Genetics and Genome Biology, SickKids Research Institute, Toronto, ON, M5G 0A4, Canada; Department of Computer Science, University of Toronto, Toronto, ON, M5S 2E4, Canada
    Department of Medical Biophysics, University of Toronto, Toronto, ON, M5G 1L7, Canada; Princess Margaret Cancer Center, University Health Network, Toronto, ON, M5G 2C1, Canada
    Department of Cell and Systems Biology, University of Toronto, Toronto, ON, M5S 3G5, Canada; Department of Psychology, University of Toronto Mississauga, Mississauga, ON, L5L 1C6
    Department of Molecular Genetics, 2Genetics and Genome Biology, SickKids Research Institute, Toronto, ON, M5G 0A4, CanadaUniversity of Toronto, Toronto, ON, M5S 1A8, Canada,
    Genetics and Genome Biology, SickKids Research Institute, Toronto, ON, M5G 0A4, Canada; Department of Computer Science, University of Toronto, Toronto, ON, M5S 2E4, Canada; Vector Institute for Artificial Intelligence, MaRS Centre, Toronto, ON, M5G 1M1; CIFAR, MaRS Centre, Toronto, ON, M5G 1M1
    Genetics and Genome Biology, SickKids Research Institute, Toronto, ON, M5G 0A4, Canada; Department of Cell and Systems Biology, University of Toronto, Toronto
    Authors
    Dustin Sokolowski; Mariela Faykoo-Martinez; Lauren Erdman; Huayun Hou; Cadia Chan; Helen Zhu; Melissa M. Holmes; Anna Goldenberg; Michael D Wilson
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Data repository for the scMappR manuscript:

    Abstract from bioRxiv (https://www.biorxiv.org/content/10.1101/2020.08.24.265298v1.full).

    RNA sequencing (RNA-seq) is widely used to identify differentially expressed genes (DEGs) and reveal biological mechanisms underlying complex biological processes. RNA-seq is often performed on heterogeneous samples and the resulting DEGs do not necessarily indicate the cell types where the differential expression occurred. While single-cell RNA-seq (scRNA-seq) methods solve this problem, technical and cost constraints currently limit its widespread use. Here we present single cell Mapper (scMappR), a method that assigns cell-type specificity scores to DEGs obtained from bulk RNA-seq by integrating cell-type expression data generated by scRNA-seq and existing deconvolution methods. After benchmarking scMappR using RNA-seq data obtained from sorted blood cells, we asked if scMappR could reveal known cell-type specific changes that occur during kidney regeneration. We found that scMappR appropriately assigned DEGs to cell-types involved in kidney regeneration, including a relatively small proportion of immune cells. While scMappR can work with any user supplied scRNA-seq data, we curated scRNA-seq expression matrices for ∼100 human and mouse tissues to facilitate its use with bulk RNA-seq data alone. Overall, scMappR is a user-friendly R package, available at CRAN, that complements traditional differential expression analysis.

  17. Commensal microbiome dysbiosis in keloid disease

    • search.dataone.org
    • data.niaid.nih.gov
    • +2more
    Updated Jul 17, 2024
    Cite
    Tomasz Maj (2024). Commensal microbiome dysbiosis in keloid disease [Dataset]. http://doi.org/10.5061/dryad.d51c5b0bt
    Explore at:
    Dataset updated
    Jul 17, 2024
    Dataset provided by
    Dryad Digital Repository
    Authors
    Tomasz Maj
    Description

    Wound healing is an intensely studied topic involved in many relevant pathophysiological processes, including fibrosis. Despite the large interest in fibrosis, the network relating commensal microbiota to skin fibrosis remains mysterious. Here, we focus on keloid, a classical yet intractable skin fibrotic disease, to establish the association between commensal microbiota and scarring tissue. Our histological data reveal the presence of microbiota in the keloids. 16S rRNA sequencing characterizes microbial composition and divergence between the pathological and normal skin tissue. Our research provides insights into the pathology of human fibrotic diseases, advocating commensal bacteria and IL-8 signaling as useful targets in future interventions of recurrent keloid disease.

    16S rDNA sequencing data. The data files here are raw 16S rDNA sequencing data accompanied by R scripts that were used to analyze these data. The source of the data was: (1) a swab test from the surface of normal and keloid skin; (2) the tissues of keloid patients from deeper parts of the skin. Surface microbiota samples were collected from the pathological location or the normal lateral location of patients using a swab (Catch-all Sample Collection Swab, Epicenter) moistened in Yeast Cell Lysis Buffer (from MasterPure Yeast DNA Purification Kit; Epicenter). Samples were snap-frozen on dry ice, and DNA was isolated from specimens using the PureLink Genomic DNA Mini Kit (Invitrogen). Amplification of the 16S-V3+V4 region was performed according to the manufacturer’s specifications. Sequencing of 16S rRNA amplicons was conducted by Apexbio Co., Shanghai, China using the Illumina Novaseq platform. The data were analyzed with the attached R scripts.

    Bulk RNA-Seq. For RNA sequencing, human dermal ...

    Commensal microbiome dysbiosis in keloid disease

    https://doi.org/10.5061/dryad.d51c5b0bt

    These datasets contain the original fastq.gz files from 16S rDNA sequencing experiments and the scripts used to analyze them. The experiments were performed on patients with keloids, a fibrotic skin disease. We performed two similar experiments and analyzed them in the same way:

    • Swab test: we used sterile cotton swabs to acquire bacteria from the surface of undamaged skin covering keloid and adjacent normal skin. Therefore, we have 12 keloid and 12 normal swabs paired together.
    • Tissue test: the keloid tissue with a margin of normal tissue was acquired from samples after surgery (5 patients).

    Additionally, we provide the files from bulk RNA sequencing of human dermal fibroblasts treated with IL-8 and TGF-beta here.

    Description of the data and file structure

    16S rDNA data

    File naming convention

    All the data, additional files, and R scripts ar...

  18. Data from: N-cadherin dynamically regulates pediatric glioma cell migration...

    • search.dataone.org
    • data.niaid.nih.gov
    • +1 more
    Updated Jul 26, 2025
    Dayoung Kim; James Olson; Jonathan Cooper (2025). N-cadherin dynamically regulates pediatric glioma cell migration in complex environments [Dataset]. http://doi.org/10.5061/dryad.4tmpg4fj7
    Dataset updated
    Jul 26, 2025
    Dataset provided by
    Dryad Digital Repository
    Authors
    Dayoung Kim; James Olson; Jonathan Cooper
    Time period covered
    Jan 1, 2024
    Description

    N-cadherin dynamically regulates pediatric glioma cell migration in complex environments

    Table S2. Control vs. N-cadherin shRNAs. We compared RNA transcriptomes from bulk populations of control or N-cadherin-depleted pediatric high-grade glioma cells. N-cadherin depletion decreased its RNA expression but did not affect the expression of other cadherins and integrins, except for CDH3. Differential gene expression analysis was performed with the DESeq2 R package for paired samples.

    Table S3. Leader vs. follower cells. We compared RNA transcriptomes from migrating glioma leader and follower cells that we isolated by photoconversion and flow cytometry. Of the 19,729 genes quantified, 44 gene transcripts increased and 36 decreased in leader relative to follower cells (log2 fold-change >0.5, FDR <0.05). YAP-response genes and wound-healing genes were higher in leader than in follower cells. Differential gene expression analysis was performed with the DESeq2 R package for paired samples. NA r...

    Pediatric high-grade gliomas are highly invasive and essentially incurable. Glioma cells migrate between neurons and glia, along axon tracts, and through extracellular matrix surrounding blood vessels and underlying the pia. Mechanisms that allow adaptation to such complex environments are poorly understood. N-cadherin is highly expressed in pediatric gliomas and is associated with shorter survival. We found that inter-cellular homotypic N-cadherin interactions differentially regulate glioma migration according to the microenvironment, stimulating migration on cultured neurons or astrocytes but inhibiting invasion into reconstituted or astrocyte-deposited extracellular matrix. N-cadherin localizes to filamentous connections between migrating leader cells but to epithelial-like junctions between followers. Leader cells have more surface and recycling N-cadherin, increased YAP1/TAZ signaling, and increased proliferation relative to followers. YAP1/TAZ signaling is dynamically regulated as...

    To compare RNA transcriptomes from bulk cell populations, control or N-cad shRNA cells were dissociated using Accutase at room temperature for 10 min and resuspended in HBSS. 500 cells were collected in the center of a 1.5 ml centrifuge tube containing 4.75 μl SMART-seq reaction buffer (Takara) using a BD FACSymphony S6 (BD Bioscience), avoiding cell loss on the tube walls.

    To compare RNA transcriptomes from leader and follower cells, approximately 90 spheroids of PBT-05 cells expressing histone H2B-Dendra2 were allowed to migrate for 24 hrs on laminin and photoconverted as above. After photoconversion, cells were dissociated with Accutase at room temperature, resuspended in HBSS and transferred to ice, and approximately 200 leader and follower cells were sorted into SMART-seq reaction buffer as above. Each experiment was performed on four different occasions. RNA was prepared and cDNA was synthesized with the SMART-Seq v4 Ultra Low Input RNA Kit for Sequencing (Takara) and ran th...
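The Table S3 thresholds (log2 fold-change >0.5, FDR <0.05) can be illustrated with a minimal Python sketch. This is not the authors' code, which used DESeq2 in R. The `GeneResult` record, the `classify` helper, and the example rows are hypothetical, and the sketch assumes the fold-change cutoff is applied symmetrically, so that "decreased" means log2 fold-change below -0.5.

```python
from dataclasses import dataclass


@dataclass
class GeneResult:
    gene: str
    log2_fold_change: float  # leader relative to follower (assumed direction)
    fdr: float               # adjusted p-value


def classify(results, lfc_cutoff=0.5, fdr_cutoff=0.05):
    """Split genes into increased and decreased lists.

    Assumption: the cutoff is symmetric, so a gene counts as decreased
    when its log2 fold-change is below -lfc_cutoff.
    """
    significant = [r for r in results if r.fdr < fdr_cutoff]
    increased = [r.gene for r in significant if r.log2_fold_change > lfc_cutoff]
    decreased = [r.gene for r in significant if r.log2_fold_change < -lfc_cutoff]
    return increased, decreased


# Made-up rows for illustration only; these are not values from the dataset.
example = [
    GeneResult("GENE_A", 1.2, 0.01),    # passes both cutoffs, increased
    GeneResult("GENE_B", -0.8, 0.03),   # passes both cutoffs, decreased
    GeneResult("GENE_C", 0.9, 0.20),    # fails the FDR cutoff
    GeneResult("GENE_D", 0.3, 0.001),   # fails the fold-change cutoff
]

increased, decreased = classify(example)
print("increased:", increased)
print("decreased:", decreased)
```

A gene must pass both cutoffs to be counted, which is why GENE_C and GENE_D are excluded despite a large effect or a small FDR, respectively.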

  19. Datasets associated with the publication of the "satuRn" R package

    • zenodo.org
    • data.niaid.nih.gov
    zip
    Updated Jul 13, 2022
    Jeroen Gilis; Kristoffer Vitting-Seerup; Koen Van den Berge; Lieven Clement (2022). Datasets associated with the publication of the "satuRn" R package [Dataset]. http://doi.org/10.5281/zenodo.4438789
    Dataset updated
    Jul 13, 2022
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Jeroen Gilis; Kristoffer Vitting-Seerup; Koen Van den Berge; Lieven Clement
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    In this Zenodo record, we share the data required to reproduce all the analyses from our publication "satuRn: Scalable Analysis of differential Transcript Usage for bulk and single-cell RNA-sequencing applications".

    This repository includes input transcript-level expression matrices and metadata for all datasets, as well as intermediate results and final outputs of the respective DTU analyses. For a more detailed description of the data, we refer to the companion GitHub repository for our publication: https://github.com/statOmics/satuRnPaper. Note that this is version 1.0.1 of the data (uploaded on 2021-01-14). Any future changes to the datasets will also be communicated on our companion GitHub page.

  20. Data from: Microglia from patients with multiple sclerosis display...

    • zenodo.org
    Updated Nov 12, 2024
    Luca Giudice (2024). Microglia from patients with multiple sclerosis display cell-autonomous immune activation state [Dataset]. http://doi.org/10.5281/zenodo.14033312
    Dataset updated
    Nov 12, 2024
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Luca Giudice
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This repository contains the data and the code used in Tanja Hyvarinen's project (tanja.hyvarinen@tuni.fi).

    "analysis" contains the processing and analysis of the bulk RNA sequencing data.
    "integration" contains the analysis of the integration between our RNA-seq data and external datasets.
    "fastq" contains the raw fastq sequences of the bulk RNA sequencing.
    "counts" contains the results of processing the fastq sequences with the nf-core rnaseq workflow.

    The analysis and integration folders can contain the starting raw data in "data", the R scripts in order of execution (op1, op2, ...), and the "output" folder that contains the final processed data of each operation.
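Because the scripts are numbered by execution order (op1, op2, ...), a plain alphabetical listing would place op10 before op2. The following Python sketch shows one way to recover the intended order. The file names and the `execution_order` helper are hypothetical, and the sketch assumes each script name starts with "op" followed by its step number.

```python
import re


def execution_order(script_names):
    """Sort script names by the integer that follows the 'op' prefix.

    Names that do not match the assumed 'op<number>' pattern are placed last.
    """
    def step(name):
        match = re.match(r"op(\d+)", name)
        return int(match.group(1)) if match else float("inf")

    # Tie-break on the name itself so the result is deterministic.
    return sorted(script_names, key=lambda name: (step(name), name))


# Hypothetical file names for illustration; the real names may differ.
names = ["op10_plots.R", "op2_normalize.R", "op1_import.R", "README.txt"]
ordered = execution_order(names)
print(ordered)
```

Sorting on the parsed integer rather than the raw string is what keeps step 10 after step 2.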
