We hypothesized that exposure to fatty acids in a metabolic dysfunction-associated steatohepatitis-like environment would profoundly affect gene expression in hepatocytes. More precisely, we sought to investigate the expression of genes related to ferroptosis, an iron-catalyzed form of cell death driven by lethal lipid peroxidation. We aimed to study the effect of fatty acid supplementation in a metabolic dysfunction-associated steatohepatitis-like environment on gene expression of HepG2 cells profiled with bulk mRNA sequencing. HepG2 cells were exposed for 48 hours to oleic acid (100 µM) and palmitic acid (50 µM), as well as hyperglycemia (4.5 mg/mL), hyperinsulinemia (100 nM), tumor necrosis factor alpha (50 ng/mL), interleukin-1 beta (25 ng/mL) and transforming growth factor beta (8 ng/mL). Control samples were exposed to the solvents needed to dissolve oleic acid and palmitic acid in medium.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
We applied BSCET to the Fadista pancreatic islets bulk RNA-seq data [15] using the MuSiC [14] proportion estimates obtained from the Segerstolpe single-cell reference [16]. We detected 283 SNPs across 129 genes with significant cell-type-specific AEI (FDR adjusted p-value < 0.05). (XLSX)
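The text reports an "FDR adjusted p-value < 0.05" threshold but does not say which adjustment BSCET applies. As a hedged illustration only, the sketch below assumes the Benjamini-Hochberg procedure, a common choice; the function name and the toy p-values are hypothetical and are not taken from the dataset.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order.

    Assumption: BH is used here purely for illustration; the source does
    not state which FDR procedure was applied.
    """
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Toy p-values (hypothetical, one per SNP).
toy_p = [0.001, 0.008, 0.039, 0.041, 0.6]
toy_adjusted = benjamini_hochberg(toy_p)
# Indices of SNPs that would pass an adjusted threshold of 0.05.
significant = [i for i, q in enumerate(toy_adjusted) if q < 0.05]
print(significant)
```

In this toy input the third and fourth tests have raw p-values below 0.05 but no longer pass after adjustment, which is the behavior the FDR threshold is meant to enforce.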
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
Data repository for the scMappR manuscript:
Abstract from bioRxiv (https://www.biorxiv.org/content/10.1101/2020.08.24.265298v1.full).
RNA sequencing (RNA-seq) is widely used to identify differentially expressed genes (DEGs) and reveal biological mechanisms underlying complex biological processes. RNA-seq is often performed on heterogeneous samples and the resulting DEGs do not necessarily indicate the cell types where the differential expression occurred. While single-cell RNA-seq (scRNA-seq) methods solve this problem, technical and cost constraints currently limit its widespread use. Here we present single cell Mapper (scMappR), a method that assigns cell-type specificity scores to DEGs obtained from bulk RNA-seq by integrating cell-type expression data generated by scRNA-seq and existing deconvolution methods. After benchmarking scMappR using RNA-seq data obtained from sorted blood cells, we asked if scMappR could reveal known cell-type specific changes that occur during kidney regeneration. We found that scMappR appropriately assigned DEGs to cell-types involved in kidney regeneration, including a relatively small proportion of immune cells. While scMappR can work with any user supplied scRNA-seq data, we curated scRNA-seq expression matrices for ∼100 human and mouse tissues to facilitate its use with bulk RNA-seq data alone. Overall, scMappR is a user-friendly R package that complements traditional differential expression analysis available at CRAN.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
Data used to test the robustness and applicability of transcription factor and pathway analysis tools on single-cell RNA-seq data, described in Holland et al. 2020.
The folder data contains raw data and the folder output contains intermediate and final results of all analyses.
The associated analysis code and more information are available on GitHub.
Abstract
Background
Many functional analysis tools have been developed to extract functional and mechanistic insight from bulk transcriptome data. With the advent of single-cell RNA sequencing (scRNA-seq), it is in principle possible to do such an analysis for single cells. However, scRNA-seq data has characteristics such as drop-out events and low library sizes. It is thus not clear if functional TF and pathway analysis tools established for bulk sequencing can be applied to scRNA-seq in a meaningful way.
Results
To address this question, we perform benchmark studies on simulated and real scRNA-seq data. We include the bulk-RNA tools PROGENy, GO enrichment, and DoRothEA that estimate pathway and transcription factor (TF) activities, respectively, and compare them against the tools SCENIC/AUCell and metaVIPER, designed for scRNA-seq. For the in silico study, we simulate single cells from TF/pathway perturbation bulk RNA-seq experiments. We complement the simulated data with real scRNA-seq data upon CRISPR-mediated knock-out. Our benchmarks on simulated and real data reveal comparable performance to the original bulk data. Additionally, we show that the TF and pathway activities preserve cell type-specific variability by analyzing a mixture sample sequenced with 13 scRNA-seq protocols. We also provide the benchmark data for further use by the community.
Conclusions
Our analyses suggest that bulk-based functional analysis tools that use manually curated footprint gene sets can be applied to scRNA-seq data, partially outperforming dedicated single-cell tools. Furthermore, we find that the performance of functional analysis tools is more sensitive to the gene sets than to the statistic used.
For questions related to the data please write an email to christian.holland@bioquant.uni-heidelberg.de or use the GitHub issue system.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
We applied BSCET to the Fadista pancreatic islets bulk RNA-seq data [15] using the MuSiC [14] proportion estimates obtained from the Baron single-cell reference [12]. We detected 5 SNPs in 4 genes with cell-type-specific AEI significantly associated with HbA1c (FDR adjusted P-value < 0.05), with the direction of the association indicated in column ‘Correlation’. (XLSX)
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
We applied BSCET to the Fadista pancreatic islets bulk RNA-seq data [15] using the MuSiC [14] proportion estimates obtained from the Segerstolpe single-cell reference [16]. We detected 8 SNPs in 8 genes with cell-type-specific AEI significantly associated with HbA1c (FDR adjusted P-value < 0.05), with the direction of the association indicated in column ‘Correlation’. (XLSX)
To identify the transcriptional changes that correlate with kidney regeneration or fibrosis development, we performed time-course bulk RNA-seq on whole kidneys at 3, 7, 14, 28 and 42 days after the initial ischemic kidney injury in two distinct murine models.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
This archive contains datasets, exercises and slides used for the Introduction to bulk RNAseq analysis workshop taught at the University of Copenhagen by the Center for Health Data Science (HeaDS). The course material can be found on GitHub.
Data.zip contains FastQC and MultiQC examples of bulk RNAseq experiments, plus count matrices (both traditional counts and salmon pseudocounts), as well as sample metadata.
Slides.zip contains all the slides used in the workshop.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
This dataset supports the manuscript titled "Molecular and developmental deficits in Smith-Magenis syndrome patient hiPSC-derived cortical neural models". It includes processed single-cell RNA sequencing (scRNA-seq) and bulk RNA sequencing (bulk RNA-seq) data derived from human induced pluripotent stem cell (hiPSC)-derived cortical neural progenitor cells and neurons obtained from Smith-Magenis syndrome (SMS) patients and matched healthy controls.
The dataset comprises:
bulkRNA-seq.zip
: Normalized gene expression count tables and differential gene expression results from bulk RNA-seq analysis, along with patient-level metadata including demographic information (e.g., age, sex, diagnosis group).
The data capture transcriptomic changes across developmental stages and enable the study of disease-associated molecular and cellular alterations in SMS. These files are intended for secondary analysis and reproducibility; raw FASTQ files are not included in this deposit. A companion GitHub repository with code for data preprocessing and analysis will be provided.
https://spdx.org/licenses/CC0-1.0.html
Arsenic exposure via drinking water is a serious environmental health concern. Epidemiological studies suggest a strong association between prenatal arsenic exposure and subsequent childhood respiratory infections, as well as morbidity from respiratory diseases in adulthood, long after systemic clearance of arsenic. We investigated the impact of exclusive prenatal arsenic exposure on the inflammatory immune response and respiratory health after an adult influenza A (IAV) lung infection. C57BL/6J mice were exposed to 100 ppb sodium arsenite in utero, and subsequently infected with IAV (H1N1) after maturation to adulthood. Assessment of lung tissue and bronchoalveolar lavage fluid (BALF) at various time points post IAV infection reveals greater lung damage and inflammation in arsenic exposed mice versus control mice. Single-cell RNA sequencing analysis of immune cells harvested from IAV infected lungs suggests that the enhanced inflammatory response is mediated by dysregulation of innate immune function of monocyte derived macrophages, neutrophils, NK cells, and alveolar macrophages. Our results suggest that prenatal arsenic exposure results in lasting effects on the adult host innate immune response to IAV infection, long after exposure to arsenic, leading to greater immunopathology. This study provides the first direct evidence that exclusive prenatal exposure to arsenic in drinking water causes predisposition to a hyperinflammatory response to IAV infection in adult mice, which is associated with significant lung damage.
Methods
Whole lung homogenate preparation for single cell RNA sequencing (scRNA-seq)
Lungs were perfused with PBS via the right ventricle, harvested, and mechanically dissociated prior to straining through 70- and 30-µm filters to obtain a single-cell suspension. Dead cells were removed (annexin V EasySep kit, StemCell Technologies, Vancouver, Canada), and samples were enriched for cells of hematopoietic origin by magnetic separation using anti-CD45-conjugated microbeads (Miltenyi, Auburn, CA). Single-cell suspensions of 6 samples were loaded on a Chromium Single Cell system (10X Genomics) to generate barcoded single-cell gel beads in emulsion, and scRNA-seq libraries were prepared using Single Cell 3’ Version 2 chemistry. Libraries were multiplexed and sequenced on 4 lanes of a NextSeq 500 sequencer (Illumina) with 3 sequencing runs. Demultiplexing and barcode processing of raw sequencing data were conducted using Cell Ranger v. 3.0.1 (10X Genomics; Dartmouth Genomics Shared Resource Core). Reads were aligned to mouse (GRCm38) and influenza A virus (A/PR8/34, genome build GCF_000865725.1) genomes to generate unique molecular index (UMI) count matrices. Gene expression data have been deposited in the NCBI GEO database and are available under accession GSE142047.
Preprocessing of single cell RNA sequencing (scRNA-seq) data
Count matrices produced using Cell Ranger were analyzed in the R statistical working environment (version 3.6.1). Preliminary visualization and quality analysis were conducted using scran (v. 1.14.3, Lun et al., 2016) and scater (v. 1.14.1, McCarthy et al., 2017) to identify thresholds for cell quality and feature filtering. Sample matrices were imported into Seurat (v. 3.1.1, Stuart et al., 2019) and the percentages of mitochondrial, hemoglobin, and influenza A viral transcripts were calculated per cell. Cells with < 1000 or > 20,000 unique molecular identifiers (UMIs; low quality and doublets, respectively), fewer than 300 features (low quality), greater than 10% of reads mapped to mitochondrial genes (dying) or greater than 1% of reads mapped to hemoglobin genes (red blood cells) were excluded from further analysis. Total cells per sample after filtering ranged from 1895 to 2482, and no significant difference in the number of cells was observed between arsenic and control samples. Data were then normalized using SCTransform (Hafemeister et al., 2019) and variable features were identified for each sample. Integration anchors between samples were identified using canonical correlation analysis (CCA) and mutual nearest neighbors (MNNs), as implemented in Seurat v3 (Stuart et al., 2019), and used to integrate samples into a shared space for further comparison. This process enables identification of shared populations of cells between samples, even in the presence of technical or biological differences, while also allowing for non-overlapping populations that are unique to individual samples.
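The cell-level thresholds described above can be sketched as a simple predicate. This is a minimal illustration in Python rather than the authors' R/Seurat code; the `CellQC` record, its field names, and the toy cells are hypothetical, while the numeric cut-offs are those quoted in the text.

```python
from dataclasses import dataclass


@dataclass
class CellQC:
    """Per-cell quality metrics (field names are illustrative assumptions)."""
    barcode: str
    n_umis: int            # total UMIs detected in the cell
    n_features: int        # number of genes detected in the cell
    pct_mito: float        # percent of reads mapped to mitochondrial genes
    pct_hemoglobin: float  # percent of reads mapped to hemoglobin genes


def passes_qc(cell: CellQC) -> bool:
    """Return True if the cell survives all thresholds quoted in the text."""
    if cell.n_umis < 1000 or cell.n_umis > 20000:  # low quality / doublets
        return False
    if cell.n_features < 300:                      # low quality
        return False
    if cell.pct_mito > 10.0:                       # dying cells
        return False
    if cell.pct_hemoglobin > 1.0:                  # red blood cells
        return False
    return True


# Toy cells, one per failure mode plus one that passes.
toy_cells = [
    CellQC("AAAC", 5000, 1500, 3.2, 0.0),   # passes
    CellQC("AAAG", 800, 400, 2.0, 0.0),     # too few UMIs
    CellQC("AAAT", 25000, 4000, 4.0, 0.1),  # too many UMIs
    CellQC("AACA", 3000, 250, 1.0, 0.0),    # too few features
    CellQC("AACC", 3000, 900, 15.0, 0.0),   # high mitochondrial fraction
    CellQC("AACG", 3000, 900, 2.0, 5.0),    # high hemoglobin fraction
]
kept_barcodes = [c.barcode for c in toy_cells if passes_qc(c)]
print(kept_barcodes)
```

Values exactly at a threshold (for example 1000 UMIs or 10% mitochondrial reads) are retained here, matching the strict inequalities in the description.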
Clustering and reference-based cell identity labeling of single immune cells from IAV-infected lung with scRNA-seq
Principal components were identified from the integrated dataset and were used for Uniform Manifold Approximation and Projection (UMAP) visualization of the data in two-dimensional space. A shared-nearest-neighbor (SNN) graph was constructed using default parameters, and clusters were identified using the SLM algorithm in Seurat at a range of resolutions (0.2-2). The first 30 principal components were used to identify 22 cell clusters ranging in size from 25 to 2310 cells. Gene markers for clusters were identified with the findMarkers function in scran. To label individual cells with cell type identities, we used the SingleR package (v. 3.1.1) to compare gene expression profiles of individual cells with expression data from curated, FACS-sorted leukocyte samples in the Immgen compendium (Aran et al., 2019; Heng et al., 2008). We manually updated the Immgen reference annotation with 263 sample group labels for fine-grain analysis and 25 CD45+ cell type identities based on markers used to sort Immgen samples (Guilliams et al., 2014). The reference annotation is provided in Table S2; cells that were not labeled confidently after label pruning were assigned “Unknown”.
Differential gene expression by immune cells
Differential gene expression within individual cell types was assessed by pooling raw count data from cells of each cell type on a per-sample basis to create a pseudo-bulk count table for each cell type. Differential expression analysis was only performed on cell types that were sufficiently represented (>10 cells) in each sample. In droplet-based scRNA-seq, ambient RNA from lysed cells is incorporated into droplets, which can result in spurious detection of these genes in cell types where they are not actually expressed. We therefore used a method developed by Young and Behjati (Young et al., 2018) to estimate the contribution of ambient RNA for each gene, and identified genes in each cell type that were estimated to be > 25% ambient-derived. These genes were excluded from analysis in a cell-type-specific manner. Genes expressed in fewer than 5 percent of cells were also excluded from analysis. Differential expression analysis was then performed in limma (limma-voom with quality weights) following a standard protocol for bulk RNA-seq (Law et al., 2014). Significant genes were identified using MA/QC criteria of P < 0.05 and log2FC > 1.
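The pooling step described above can be sketched as follows. This is an illustrative Python outline, not the authors' R code: the function name, the input layout (one tuple per cell), and the toy genes are assumptions, and the ambient-RNA and 5-percent expression filters are not reproduced here.

```python
from collections import defaultdict


def pseudobulk(cells, min_cells=10):
    """Sum raw counts per (cell type, sample).

    cells: iterable of (sample, cell_type, {gene: raw_count}) tuples
           (a hypothetical layout chosen for this sketch).
    Returns {cell_type: {sample: {gene: summed_count}}}, keeping only cell
    types with more than `min_cells` cells in every sample.
    """
    counts = defaultdict(lambda: defaultdict(lambda: defaultdict(int)))
    n_cells = defaultdict(lambda: defaultdict(int))
    samples = set()
    for sample, cell_type, gene_counts in cells:
        samples.add(sample)
        n_cells[cell_type][sample] += 1
        pooled = counts[cell_type][sample]  # created even if the cell is empty
        for gene, count in gene_counts.items():
            pooled[gene] += count
    result = {}
    for cell_type, per_sample in counts.items():
        # Require sufficient representation in each sample, as in the text.
        if all(n_cells[cell_type].get(s, 0) > min_cells for s in samples):
            result[cell_type] = {s: dict(g) for s, g in per_sample.items()}
    return result


# Toy data; min_cells is lowered to 1 only so the example stays small.
toy_cells = [
    ("s1", "NK", {"Gzma": 2}),
    ("s1", "NK", {"Gzma": 3, "Nkg7": 1}),
    ("s2", "NK", {"Gzma": 1}),
    ("s2", "NK", {"Nkg7": 4}),
    ("s1", "B", {"Cd79a": 5}),
    ("s1", "B", {"Cd79a": 1}),
    ("s2", "B", {"Cd79a": 2}),
]
toy_pseudobulk = pseudobulk(toy_cells, min_cells=1)
print(toy_pseudobulk)
```

In the toy data the B cells are dropped because sample s2 contains only one such cell, while NK cells are pooled into one count vector per sample. The resulting per-sample tables are what a bulk-style tool such as limma-voom would then take as input.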
Analysis of arsenic effect on immune cell gene expression by scRNA-seq.
Sample-wide effects of arsenic on gene expression were identified by pooling raw count data from all cells per sample to create a count table for pseudo-bulk gene expression analysis. Genes with fewer than 20 counts in any sample, or fewer than 60 total counts, were excluded from analysis. Differential expression analysis was performed using limma-voom as described above.
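The gene-level count filter can be sketched as below. The wording "in any sample" is read literally here, so a gene is dropped if at least one sample falls under the per-sample minimum; that reading is an assumption, as are the function name, the input layout, and the toy genes.

```python
def filter_genes(counts, min_per_sample=20, min_total=60):
    """Apply the count thresholds quoted in the text.

    counts: {gene: [count in sample 1, count in sample 2, ...]}
            (a hypothetical layout chosen for this sketch).
    A gene is kept only if every sample has at least `min_per_sample`
    counts and the counts sum to at least `min_total`.
    """
    kept = {}
    for gene, per_sample in counts.items():
        if min(per_sample) < min_per_sample:
            continue  # fewer than 20 counts in at least one sample
        if sum(per_sample) < min_total:
            continue  # fewer than 60 counts in total
        kept[gene] = per_sample
    return kept


# Toy pseudo-bulk table with three samples.
toy_counts = {
    "Il1b": [120, 95, 210],  # kept
    "Rare1": [25, 3, 40],    # dropped: one sample below 20
    "Low1": [20, 20, 19],    # dropped: one sample below 20
    "Ok1": [20, 20, 20],     # kept: exactly at both thresholds
}
kept_genes = sorted(filter_genes(toy_counts))
print(kept_genes)
```

If the intended rule were instead "fewer than 20 counts in every sample", the first check would compare `max(per_sample)` rather than `min(per_sample)`.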
This dataset contains files reconstructing single-cell data presented in 'Reference transcriptomics of porcine peripheral immune cells created through bulk and single-cell RNA sequencing' by Herrera-Uribe & Wiarda et al. 2021. Samples of peripheral blood mononuclear cells (PBMCs) were collected from seven pigs and processed for single-cell RNA sequencing (scRNA-seq) in order to provide a reference annotation of porcine immune cell transcriptomics at enhanced, single-cell resolution. Analysis of single-cell data allowed identification of 36 cell clusters that were further classified into 13 cell types, including monocytes, dendritic cells, B cells, antibody-secreting cells, numerous populations of T cells, NK cells, and erythrocytes. Files may be used to reconstruct the data as presented in the manuscript, allowing for individual query by other users. Scripts for original data analysis are available at https://github.com/USDA-FSEPRU/PorcinePBMCs_bulkRNAseq_scRNAseq. Raw data are available at https://www.ebi.ac.uk/ena/browser/view/PRJEB43826. Funding for this dataset was also provided by NRSP8: National Animal Genome Research Program (https://www.nimss.org/projects/view/mrp/outline/18464).
Resources in this dataset:
Resource Title: Herrera-Uribe & Wiarda et al. PBMCs - All Cells 10X Format.
File Name: PBMC7_AllCells.zip
Resource Description: Zipped folder containing PBMC counts matrix, gene names, and cell IDs. Files are as follows:
matrix of gene counts* (matrix.mtx.gx)
gene names (features.tsv.gz)
cell IDs (barcodes.tsv.gz)
*The ‘raw’ count matrix is actually gene counts obtained following ambient RNA removal. During ambient RNA removal, we specified to calculate non-integer count estimations, so most gene counts are actually non-integer values in this matrix but should still be treated as raw/unnormalized data that requires further normalization/transformation. Data can be read into R using the function Read10X().
Resource Title: Herrera-Uribe & Wiarda et al. PBMCs - All Cells Metadata.
File Name: PBMC7_AllCells_meta.csv
Resource Description: .csv file containing metadata for cells included in the final dataset. Metadata columns include:
nCount_RNA = the number of transcripts detected in a cell
nFeature_RNA = the number of genes detected in a cell
Loupe = cell barcodes; correspond to the cell IDs found in the .h5Seurat and 10X formatted objects for all cells
prcntMito = percent mitochondrial reads in a cell
Scrublet = doublet probability score assigned to a cell
seurat_clusters = cluster ID assigned to a cell
PaperIDs = sample ID for a cell
celltypes = cell type ID assigned to a cell
Resource Title: Herrera-Uribe & Wiarda et al. PBMCs - All Cells PCA Coordinates.
File Name: PBMC7_AllCells_PCAcoord.csv
Resource Description: .csv file containing first 100 PCA coordinates for cells.
Resource Title: Herrera-Uribe & Wiarda et al. PBMCs - All Cells t-SNE Coordinates.
File Name: PBMC7_AllCells_tSNEcoord.csv
Resource Description: .csv file containing t-SNE coordinates for all cells.
Resource Title: Herrera-Uribe & Wiarda et al. PBMCs - All Cells UMAP Coordinates.
File Name: PBMC7_AllCells_UMAPcoord.csv
Resource Description: .csv file containing UMAP coordinates for all cells.
Resource Title: Herrera-Uribe & Wiarda et al. PBMCs - CD4 T Cells t-SNE Coordinates.
File Name: PBMC7_CD4only_tSNEcoord.csv
Resource Description: .csv file containing t-SNE coordinates for only CD4 T cells (clusters 0, 3, 4, 28). A dataset of only CD4 T cells can be re-created from the PBMC7_AllCells.h5Seurat, and t-SNE coordinates used in publication can be re-assigned using this .csv file.
Resource Title: Herrera-Uribe & Wiarda et al. PBMCs - CD4 T Cells UMAP Coordinates.
File Name: PBMC7_CD4only_UMAPcoord.csv
Resource Description: .csv file containing UMAP coordinates for only CD4 T cells (clusters 0, 3, 4, 28). A dataset of only CD4 T cells can be re-created from the PBMC7_AllCells.h5Seurat, and UMAP coordinates used in publication can be re-assigned using this .csv file.
Resource Title: Herrera-Uribe & Wiarda et al. PBMCs - Gamma Delta T Cells UMAP Coordinates.
File Name: PBMC7_GDonly_UMAPcoord.csv
Resource Description: .csv file containing UMAP coordinates for only gamma delta T cells (clusters 6, 21, 24, 31). A dataset of only gamma delta T cells can be re-created from the PBMC7_AllCells.h5Seurat, and UMAP coordinates used in publication can be re-assigned using this .csv file.
Resource Title: Herrera-Uribe & Wiarda et al. PBMCs - Gamma Delta T Cells t-SNE Coordinates.
File Name: PBMC7_GDonly_tSNEcoord.csv
Resource Description: .csv file containing t-SNE coordinates for only gamma delta T cells (clusters 6, 21, 24, 31). A dataset of only gamma delta T cells can be re-created from the PBMC7_AllCells.h5Seurat, and t-SNE coordinates used in publication can be re-assigned using this .csv file.
Resource Title: Herrera-Uribe & Wiarda et al. PBMCs - Gene Annotation Information.
File Name: UnfilteredGeneInfo.txt
Resource Description: .txt file containing gene nomenclature information used to assign gene names in the dataset. 'Name' column corresponds to the name assigned to a feature in the dataset.
Resource Title: Herrera-Uribe & Wiarda et al. PBMCs - All Cells H5Seurat.
File Name: PBMC7.tar
Resource Description: .h5Seurat object of all cells in PBMC dataset. File needs to be untarred, then read into R using function LoadH5Seurat().
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
June 2023 Version
This archive contains materials (datasets, exercises, slides, etc.) used for the Introduction to bulk RNAseq analysis workshop taught at the University of Copenhagen by the Center for Health Data Science (HeaDS). The course repo can be found on GitHub.
Assignments.zip contains exercises for the preprocessing part of the course, such as FastQC and MultiQC examples of bulk RNAseq experiments.
Data.zip contains count matrices (both traditional counts and salmon pseudocounts), as well as sample metadata (samplesheet.csv) and backup results from the preprocessing pipeline.
Notes.zip contains supplementary materials such as extra pdfs for more information on bulk RNAseq technology.
Slides.zip contains all the slides used in the workshop.
Raw_reads.zip contains the raw reads from the bulk RNAseq experiment (10.1016/j.celrep.2014.10.054) used in this course.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
We developed a single-cell transcriptomics pipeline for high-throughput pharmacotranscriptomic screening. We explored the transcriptional landscape of three HGSOC models (JHOS2, a representative cell line; PDC2 and PDC3, two patient-derived samples) after treating their cells for 24 hours with 45 drugs representing 13 distinct classes of mechanism of action. Our work establishes a new precision oncology framework for the study of molecular mechanisms activated by a broad array of drug responses in cancer.
.
├── 3D UMAPs/ → Interactive 3D UMAPs of cells treated with the 45 drugs used for multiplexed scRNA-seq. Related to Figure 4. Coordinates: x = UMAP 1; y = UMAP 2; z = UMAP 3. Legend: green = PDC1; blue = PDC2; red = JHOS2.
│   ├── DMSO_3D_UMAP_Dini.et.al.html → 3D UMAP of untreated cells.
│   └── drug_3D_UMAP_Dini.et.al.html → 3D UMAP of cells treated with (drug).
├── QC_plots/ → Diagnostic plots. Related to Figures 2–4.
│   ├── model_QC_violin_plot_2023.pdf → Violin plots of the QC metrics used to filter the data.
│   ├── model_col_HTO or model_row_HTO before and after filt → Heatmaps of the row or column HTO expression in each cell.
│   └── model_counts_histogram_2023.pdf → Histogram of the distribution of the total counts per cell after filtering for high-quality cells.
├── scRNAseq/ → scRNA-seq data. Related to Figures 2–4.
│   ├── AllData_subsampled_DGE_edgeR.csv.gz → Differential gene expression analyses results between treated and untreated cells via pseudobulk of aggregate subsamples, for each of the three models. Related to Figure 3.
│   └── All_vs_all_RNAclusters_DEG_signif.txt → Differential gene expression analysis results (p.adj < 0.05) of FindAllMarkers for the Leiden/RNA clusters.
├── PDCs.transcript.counts.tsv → Bulk RNA-seq count data for PDCs 1–3 processed by Kallisto. Related to Figure S6.
└── PDCs.transcript.TPM.tsv → Bulk RNA-seq TPM data for PDCs 1–3 processed by Kallisto. Related to Figure S6.
https://ega-archive.org/dacs/EGAC00001001864
Our study utilizes two novel techniques, cyclical immunofluorescence imaging and single-cell spatial transcriptomics, to spatially map the tumor microenvironment of a large sample set of pediatric high-grade gliomas (pHGG). Using these methods, we have identified an abundant immunosuppressive myeloid cell population that has not been described in the context of pediatric high-grade gliomas. We validate our findings using an in vitro assay, spatial analysis on additional pHGG biopsies, independent spatial transcriptomic data, bulk RNA sequencing data of an expanded cohort of in-house patients, and publicly available bulk RNA sequencing data of an even larger cohort of pHGG patients.
https://www.scilifelab.se/data/restricted-access/
Data Set Description
Single-cell RNA sequencing (Smart-Seq3) and whole exome sequencing from multiple regions of individual tumors from breast cancer patients, as well as single-cell RNA-seq for two ovarian cancer cell lines.
The dataset contains raw sequencing data for various high-throughput molecular tests performed on two sample types: tumor samples from two breast cancer patients and cell lines derived from high-grade serous carcinoma patients. The breast cancer data comes from two patients: patient 1 (BCSA1) has two tumor regions (A-B) and patient 2 (BCSA2) has five regions (A-E). For a normal sample and each region from each patient, whole exome sequencing was performed using the Twist Biosciences Human Exome Kit by the SNP&SEQ Technology platform, SciLifeLab, National Genomics Infrastructure Uppsala, Sweden. In addition, for each patient, EPCAM+ CD45- cells from all the regions were sorted into a 384-well plate, and Smart-Seq3 libraries were prepared at Karolinska Institutet and sequenced at National Genomics Infrastructure Uppsala, Sweden.
The HGSOC cell-line data comes from the OV2295R2 and TOV2295R cell lines described in Laks et al., Cell 2019 Nov 14; 179(5): 1207–1221.e22, doi: 10.1016/j.cell.2019.10.026. The cell line Smart-Seq3 libraries were prepared from two 384-well plates at Karolinska Institutet and sequenced at National Genomics Infrastructure Uppsala, Sweden.
Terms for access
This dataset is to be used for research on intratumor heterogeneity and subclonal evolution of tumors. To apply for conditional access to the dataset in this publication, please contact datacentre@scilifelab.se.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
Detailed quantitative analysis of GFP expression in SAHA and TCR-treated cells & Computational analysis of bulk and single-cell RNA-Seq data.
Detailed quantitative analysis of GFP expression in SAHA and TCR-treated cells.
Cells were prepared for single cell analysis at the Genome Technology Facility (GTF) of the University of Lausanne. Cells were loaded on Fluidigm C1 IFC plates (5-10 μm), with run ID smart33, smart34 and smart35, corresponding to untreated, SAHA- and TCR-treated conditions respectively. After single cell capture on the Fluidigm C1 IFC plate, each chamber was inspected visually by microscopy and pictures were captured with a Zeiss Axiovert 200 M fluorescence microscope equipped with a Roper Scientific CoolSnap HQ camera using a Plan-Neofluar 10X lens (smart34 run) or 20X lens (for smart35 run). For each capture chamber, pictures in bright field and FITC channel were taken with the MetaMorph 6.3 software. Picture analysis was then performed using ImageJ 1.50b software (open access software: website). Brightness and contrast were adjusted for qualitative assessment of the pictures.
Computational analysis of bulk and single-cell RNA-Seq data.
Upon bulk or single cell isolation, RNA extraction and library preparation were performed according to Illumina protocols. Bulk and single-cell RNA-Seq data analyses are detailed here.
Linked to the paper published in Cell Reports (doi:10.1016/j.celrep.2018.03.102):
Single-Cell RNA-Seq Reveals Transcriptional Heterogeneity in Latent and Reactivated HIV-infected Cells
Despite effective treatment, HIV can persist in latent reservoirs, which represent a major obstacle towards HIV eradication. Targeting and reactivating latent cells is challenging due to the heterogeneous nature of HIV infected cells. Here, we used a primary model of HIV latency and single-cell RNA sequencing to characterize transcriptional heterogeneity during HIV latency and reactivation. Our analysis identified transcriptional programs leading to successful reactivation of HIV expression.
https://ega-archive.org/dacs/EGAC00001000551
In this study, we performed a systematic comparative analysis of seven widely used SNV-calling methods, including SAMtools, the GATK Best Practices pipeline, CTAT, FreeBayes, MuTect2, Strelka2 and VarScan2, on both simulated and real single-cell RNA-seq datasets. We generated SMART-seq2 data for 70 CD45- single cells, which were derived from two colorectal cancer patients (P0411 and P0413). The average sequencing depth was 1.4 million reads per cell. We also generated tumor and adjacent normal bulk WES data, as well as tumor bulk RNA-seq data for these patients.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
This zipped folder contains two folders: data and results.
data
data contains two subfolders: bulk_data and single_cell_data.
bulk_data has the processed bulk data for the sex-dependent liver analysis (original data is from Schaum, N. et al. Ageing hallmarks exhibit organ-specific temporal signatures. Nature 583, 596–602 (2020)). Original data used for processing can be downloaded from Gene Expression Omnibus under accession ID GSE132040.
single_cell_data has the processed single-cell data and the generated pseudobulks used by BuDDI. Within this folder:
kang_rybakov contains the processed Kang et al. data used in the analysis (original data is from Kang, H. M. et al. Multiplexed droplet single-cell RNA-sequencing using natural genetic variation. Nat. Biotechnol. 36, 89–94 (2018)).
liver_sex contains the processed single-cell data used to generate pseudobulks (original data is from 1. Tabula Muris Consortium. A single-cell transcriptomic atlas characterizes ageing tissues in the mouse. Nature 583, 590–595 (2020), and 2. Schaum, N. et al. Ageing hallmarks exhibit organ-specific temporal signatures. Nature 583, 596–602 (2020)).
cibersort_* contains the data used to perform CIBERSORTx.
augmented_* contains the pseudobulks.
results
This folder contains several results folders. Each one is specific to an experiment and a model. The naming convention is "model_experiment", with the following model names:
buddiM2: the BuDDI model
CVAE: conditional variational autoencoder
PCA: principal components analysis
bp: BayesPrism
cibsersort: CIBERSORTx
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
This dataset includes information relevant to the following manuscript from the labs of Prof. Carlos Caldas (University of Cambridge), and Dr. Long V. Nguyen (Princess Margaret Cancer Centre, University Health Network):
Nguyen LV et al. Dynamics and plasticity of human breast cancer single cell-derived clones. Under consideration for publication.
Bulk RNA sequencing raw count matrices are provided (RawCounts.csv) along with the normalized count matrices (LogCPMNormCounts.csv).
The single-cell RNA sequencing count matrix processed with the R package metacell is provided (mat.pdx_LN_v2_filt.Rda), along with the mc and mc2d files with information on metacell partitions (mc.pdx_LN_v2_filt.Rda and mc2d.pdx_LN_v2_filt.Rda).
Code and information on data analysis are provided for reviewers in our unpublished manuscript and on GitHub (https://github.com/cclab-brca/clone-dynamics).
https://ega-archive.org/dacs/EGAC00001001443
This is a single-cell RNA sequencing study of PBMC samples from four Finnish children at risk of developing type 1 diabetes and their gender-, age- and HLA-matched control children. All four case children were positive for multiple islet-specific autoantibodies and two of them also progressed to clinical disease during follow-up, whereas the control children remained negative for all autoantibodies. Single-cell analysis confirmed some of the signatures obtained from the bulk data. It showed that the high IL32 expression observed in case samples in the bulk RNA-seq data was contributed mainly by activated T cells and NK cells. Trajectory analysis of the scRNA-seq data suggested that IL32 expression increased as the T cells moved towards an activated state.