Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Data for the downstream analysis. One zip file contains the results of the TF-Prioritizer run, and the other contains the input files for the ORA. These are cancer-related GO terms, tissue-related GO terms for the cell lines used, and mutated gene lists for the different tissues.
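Over-representation analysis (ORA) of a gene list against a GO term is commonly framed as a one-sided hypergeometric test. The sketch below illustrates that framing only. The function name `ora_pvalue` and the toy numbers are hypothetical, and nothing here is confirmed to be the procedure used to produce the files in this dataset.

```python
from math import comb


def ora_pvalue(population, term_size, list_size, overlap):
    """Return P(X >= overlap) for a hypergeometric draw.

    population: number of background genes
    term_size:  background genes annotated to the GO term
    list_size:  genes in the list being tested (e.g. a mutated gene list)
    overlap:    genes in the list that are annotated to the term
    """
    total = comb(population, list_size)
    upper = min(term_size, list_size)
    tail = sum(
        comb(term_size, k) * comb(population - term_size, list_size - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


# Toy numbers (assumed for illustration): 20 background genes, 5 in the term,
# a list of 4 genes, 2 of which fall in the term.
p = ora_pvalue(population=20, term_size=5, list_size=4, overlap=2)
print(f"{p:.4f}")
```

In practice the p-values from many terms would also need a multiple-testing correction, which this sketch leaves out.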

Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Objectives: Sequencing with PolyA selection is the most common approach for library preparation. For limited amounts of RNA or degraded RNA, alternative protocols such as the NuGEN have been developed. However, it is not yet clear how the different library preparations affect the downstream analyses of the broad applications of RNA sequencing.
Methods and Materials: Eight human mammary epithelial cell (HMEC) lines with high quality RNA were sequenced by Illumina’s mRNA-Seq PolyA selection and NuGEN ENCORE library preparation. The following analyses and comparisons were conducted: 1) the numbers of genes captured by each protocol; 2) the impact of protocols on differentially expressed gene detection between biological replicates; 3) expressed single nucleotide variant (SNV) detection; 4) non-coding RNAs, particularly lincRNA detection; and 5) intragenic gene expression.
Results: Sequences from the NuGEN protocol had a lower alignment rate (75%) than those from the PolyA protocol (over 90%). The NuGEN protocol detected fewer genes (12–20% fewer), with a significant portion of reads mapped to non-coding regions. A large number of genes were differentially detected between the two protocols. About 17–20% of the differentially expressed genes between biological replicates were commonly detected between the two protocols. Significantly higher numbers of SNVs (5–6 times) were detected in the NuGEN samples, which were largely from intragenic and intergenic regions. The NuGEN captured fewer exons (25% fewer) and had higher base level coverage variance. While 6.3% of reads were mapped to intragenic regions in the PolyA samples, the percentages were much higher (20–25%) for the NuGEN samples. The NuGEN protocol did not detect more known non-coding RNAs such as lincRNAs, but targeted small and “novel” lincRNAs.
Conclusion: Different library preparations can have significant impacts on downstream analysis and interpretation of RNA-seq data. The NuGEN provides an alternative for limited or degraded RNA but it has limitations for some RNA-seq applications.

MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
Datasets and metadata used for the full streamline analysis of plant data under different conditions of infection. The tutorial is an example of an analysis that can be useful in multiple scenarios where comparisons are needed (healthy and sick patients, for example). You can find the tutorial at our website: https://hds-sandbox.github.io/AdvancedSingleCell
Usage notes:
all files are ready to use, except for control1.tar.gz which is a folder that needs to be decompressed

MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
Model predictions and experiment results for downstream analysis in the updated scNODE paper (https://www.biorxiv.org/content/10.1101/2023.11.22.568346v2).

In the rapidly moving proteomics field, a diverse patchwork of data analysis pipelines and algorithms for data normalization and differential expression analysis is used by the community. We generated a mass spectrometry downstream analysis pipeline (MS-DAP) that integrates both popular and recently developed algorithms for normalization and statistical analyses. Additional algorithms can be easily added in the future as plugins. MS-DAP is open-source and facilitates transparent and reproducible proteome science by generating extensive data visualizations and quality reporting, provided as standardized PDF reports. Second, we performed a systematic evaluation of methods for normalization and statistical analysis on a large variety of data sets, including additional data generated in this study, which revealed key differences. Commonly used approaches for differential testing based on moderated t-statistics were consistently outperformed by more recent statistical models, all integrated in MS-DAP. Third, we introduced a novel normalization algorithm that rescues deficiencies observed in commonly used normalization methods. Finally, we used the MS-DAP platform to reanalyze a recently published large-scale proteomics data set of CSF from AD patients. This revealed increased sensitivity, resulting in additional significant target proteins, which improved overlap with results reported in related studies and includes a large set of new potential AD biomarkers in addition to those previously reported.

Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
Downstream analysis scripts and metadata for the paper titled "Habitat fragmentation shifts soil microbial composition but not richness". All analyses were done in R v4.4.2. See the "variable descriptions" tab in the metadata file for details.

https://www.datainsightsmarket.com/privacy-policy
The downstream bioprocessing market is projected to grow from USD 18.7 billion in 2023 to USD 32.7 billion by 2028, at a CAGR of 14.3% during the forecast period. The growth of the market is primarily attributed to the increasing demand for biopharmaceuticals, the rising prevalence of chronic diseases, and the growing adoption of advanced technologies in downstream bioprocessing. Additionally, supportive government initiatives, such as tax incentives and grants, are creating a conducive environment for the growth of the market. Key drivers of the downstream bioprocessing market include the rising demand for biopharmaceuticals, the increasing prevalence of chronic diseases, the growing adoption of advanced technologies, and supportive government initiatives. However, the market is also facing certain challenges, such as the high cost of downstream bioprocessing and the regulatory complexities associated with biopharmaceutical production. The largest segment of the downstream bioprocessing market is purification, which accounted for over 40% of the market in 2023. This segment is expected to continue to grow at a steady pace due to the increasing demand for high-purity biopharmaceuticals. The major players in the downstream bioprocessing market include Danaher, Eppendorf, GE Healthcare, Parker Hannifin, and Thermo Fisher Scientific. These companies offer a wide range of products and services for downstream bioprocessing, including equipment, consumables, and software.

Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Recent advances in single-cell sequencing techniques have enabled gene expression profiling of individual cells in tissue samples so that it can accelerate biomedical research to develop novel therapeutic methods and effective drugs for complex disease. The typical first step in the downstream analysis pipeline is classifying cell types through accurate single-cell clustering algorithms. Here, we describe a novel single-cell clustering algorithm, called GRACE (GRaph Autoencoder based single-cell Clustering through Ensemble similarity learning), that can yield highly consistent groups of cells. We construct the cell-to-cell similarity network through the ensemble similarity learning framework, and employ a low-dimensional vector representation for each cell through a graph autoencoder. Through performance assessments using real-world single-cell sequencing datasets, we show that the proposed method can yield accurate single-cell clustering results by achieving higher assessment metric scores.

Additional file 5. GWAS SNPs analysis. This file contains the results of GWAS SNPs analysis on the intron flanking BSJs.

Additional file 4. miRNA binding sites analysis. This file contains the results of the miRNA binding sites analysis in the ALPK2 circRNA sequence.

MIT License: https://opensource.org/licenses/MIT
License information was derived automatically
Scripts and data for the paper: Consequences and opportunities arising due to sparser single-cell RNA-seq datasets
With the number of cells measured in single-cell RNA sequencing (scRNA-seq) datasets increasing exponentially and concurrent increased sparsity due to more zero counts being measured for many genes, we demonstrate here that downstream analyses on binary-based gene expression give similar results as count-based analyses. Moreover, a binary representation scales up to ~ 50-fold more cells that can be analyzed using the same computational resources. We also highlight the possibilities provided by binarized scRNA-seq data. Development of specialized tools for bit-aware implementations of downstream analytical tasks will enable a more fine-grained resolution of biological heterogeneity.
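As a minimal sketch of what a binary representation can look like, the example below maps every nonzero count to 1, packs each cell's genes into the bits of a single integer, and compares cells with a Jaccard similarity computed through bitwise operations. The function names, the toy count matrix, and the choice of Jaccard similarity are assumptions made for illustration; they are not taken from the paper's scripts.

```python
def binarize(counts):
    """Map every nonzero count to 1 and every zero to 0."""
    return [[1 if c > 0 else 0 for c in row] for row in counts]


def pack_bits(bits):
    """Pack a list of 0/1 values into one integer (bit i stands for gene i)."""
    packed = 0
    for i, b in enumerate(bits):
        if b:
            packed |= 1 << i
    return packed


def jaccard(a, b):
    """Jaccard similarity between two bit-packed cells."""
    union = bin(a | b).count("1")
    if union == 0:
        return 1.0  # two cells with no detected genes are treated as identical
    return bin(a & b).count("1") / union


# Toy cells-by-genes count matrix (hypothetical values).
counts = [
    [0, 5, 0, 2],
    [0, 1, 0, 7],
    [3, 0, 0, 0],
]
binary = binarize(counts)
packed = [pack_bits(row) for row in binary]

print(binary)
print(jaccard(packed[0], packed[1]), jaccard(packed[0], packed[2]))
```

Cells 0 and 1 differ in their counts but detect the same genes, so they become identical once binarized, which is the kind of information loss a count-versus-binary comparison has to weigh.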

Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Comparison of the methods’ differentially expressed genes’ AUC score for the simulated datasets (all versions of Dataset 3 and Dataset 4).

Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset contains a set of text files from running the tool Nanopolish eventalign on several nanopore direct RNA sequencing data sets produced by Jay Hesselberth's lab at the University of Colorado (BioProject accession number PRJNA910992), as well as external sequencing data sets from PMID: 34893601 (synthetic oligonucleotides from Leger et al) and PMID: 35252946 (yeast rRNA data from Stephenson et al). These files can be used as inputs to the R markdown documents at https://github.com/hesselberthlab/RNARePore to reproduce the figures in the associated manuscript.

Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Publicly available RNA-sequencing (RNA-seq) data are a rich resource for elucidating the mechanisms of human disease; however, preprocessing these data requires considerable bioinformatic expertise and computational infrastructure. Analyzing multiple datasets with a consistent computational workflow increases the accuracy of downstream meta-analyses. This collection of datasets represents the human intracellular transcriptional response to disorders and diseases such as acute lymphoblastic leukemia (ALL), B-cell lymphomas, chronic obstructive pulmonary disease (COPD), colorectal cancer, lupus erythematosus; as well as infection with pathogens including Borrelia burgdorferi, hantavirus, influenza A virus, Middle East respiratory syndrome coronavirus (MERS-CoV), Streptococcus pneumoniae, respiratory syncytial virus (RSV), severe acute respiratory syndrome coronavirus (SARS-CoV), and severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). We calculated the statistically significant differentially expressed genes and Gene Ontology (GO) terms for all datasets. In addition, a subset of the datasets also include results from splice variant analyses, intracellular signaling pathway enrichments as well as read mapping and quantification. All analyses were performed using well-established algorithms and are provided to facilitate future data mining activities, wet lab studies, and to accelerate collaboration and discovery.

https://www.rootsanalysis.com/privacy.html
The single use downstream bioprocessing market is likely to grow from USD 1.54 bn in 2024 to USD 1.79 bn in 2025 and USD 6.75 bn by 2035, representing a CAGR of 14.2%
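The stated growth rate can be checked with the standard compound annual growth rate formula, CAGR = (end / start)^(1 / years) - 1. The sketch below assumes 2025 is the base year and that growth compounds over the ten years to 2035; the source does not state the base year, so that choice is an assumption.

```python
def cagr(start_value, end_value, years):
    """Compound annual growth rate between two values over a number of years."""
    return (end_value / start_value) ** (1 / years) - 1


# Figures from the text, in USD bn; base year 2025 is assumed.
rate = cagr(start_value=1.79, end_value=6.75, years=2035 - 2025)
print(f"{rate:.1%}")
```

Under that assumption the result rounds to 14.2%, which agrees with the figure quoted above.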

https://www.emergenresearch.com/privacy-policy
The global Downstream Processing Market size is expected to reach USD 121.51 Billion in 2032 registering a CAGR of 15.0%. Our report provides a comprehensive overview of the industry, including key players, market share, growth opportunities and more.

Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Comparison of the methods based on the clustering metrics computed on the full datasets.

ODC Public Domain Dedication and Licence (PDDL) v1.0: http://www.opendatacommons.org/licenses/pddl/1.0/
License information was derived automatically
RNA sequencing is widely used to measure gene expression across a vast range of animal and plant tissues and conditions. Most studies of computational methods for gene expression analysis use simulated data to evaluate the accuracy of these methods. In this work we present a dataset of 3 tissues, each containing 10 samples of simulated short RNA-seq reads across 4 types of transcription characterized from the GTEx dataset. For each type of transcription (1) known isoforms; 2) splicing noise; 3) intronic noise; 4) intergenic noise), we provide sets of reads in CRAM format along with corresponding expression matrices and annotations in the GTF format for downstream analysis. A copy of GRCh38 used in our analysis is also provided along with the simulated data. Further details on the structure of each file are provided in the accompanying README document.

Remark: for cell cycle analysis, see the paper https://arxiv.org/abs/2208.05229 "Computational challenges of cell cycle analysis using single cell transcriptomics" by Alexander Chervov and Andrei Zinovyev.
Data: results of single-cell RNA sequencing. Rows correspond to cells and columns to genes (the CSV file is the other way around: rows are genes and columns are cells). Each value of the matrix shows how strongly the corresponding gene is expressed in the corresponding cell. https://en.wikipedia.org/wiki/Single-cell_transcriptomics
Particular data from: https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE76381 The original TXT files are included, along with conversions to *.h5ad format, which is easier to work with. There are several subdatasets: human, mouse, and different cell types.
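Because the CSV orientation is the reverse of the cells-by-genes layout, a reader typically has to transpose it after loading. The sketch below does this with the standard library only. The inline CSV text, its header layout, the gene and cell names, and the function name `read_cells_by_genes` are all hypothetical; the actual files in the dataset may be laid out differently.

```python
import csv
import io

# Hypothetical CSV in genes-by-cells orientation: one row per gene,
# one column per cell, with identifiers in the first row and first column.
csv_text = """gene,cell_1,cell_2,cell_3
GeneA,0,4,1
GeneB,2,0,0
"""


def read_cells_by_genes(handle):
    """Read a genes-by-cells CSV and return it as a cells-by-genes matrix."""
    reader = csv.reader(handle)
    header = next(reader)
    cell_ids = header[1:]
    gene_ids = []
    gene_rows = []
    for row in reader:
        if not row:
            continue  # skip blank lines
        gene_ids.append(row[0])
        gene_rows.append([float(v) for v in row[1:]])
    # Transpose so that each row is a cell and each column is a gene.
    matrix = [list(col) for col in zip(*gene_rows)]
    return cell_ids, gene_ids, matrix


cell_ids, gene_ids, matrix = read_cells_by_genes(io.StringIO(csv_text))
print(cell_ids, gene_ids)
print(matrix)
```

For real files of this size, a dedicated reader for the *.h5ad format would usually be more practical than parsing CSV by hand.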
Paper: SCnorm: robust normalization of single-cell RNA-seq data https://pubmed.ncbi.nlm.nih.gov/28418000/ Bacher R, Chu LF, Leng N, Gasch AP et al. SCnorm: robust normalization of single-cell RNA-seq data. Nat Methods 2017 Jun;14(6):584-586
Abstract: The normalization of RNA-seq data is essential for accurate downstream inference, but the assumptions upon which most normalization methods are based are not applicable in the single-cell setting. Consequently, applying existing normalization methods to single-cell RNA-seq data introduces artifacts that bias downstream analyses. To address this, we introduce SCnorm for accurate and efficient normalization of single-cell RNA-seq data.
A total of 183 single cells (92 H1 cells, 91 H9 cells), sequenced twice, were used to evaluate SCnorm in normalizing single cell RNA-seq experiments. A total of 48 bulk H1 samples were used to compare bulk and single cell properties. For single-cell RNA-seq, the identical single-cell indexed and fragmented cDNA were pooled at 96 cells per lane or at 24 cells per lane to test the effects of sequencing depth, resulting in approximately 1 million and 4 million mapped reads per cell in the two pooling groups, respectively.
Single cell RNA sequencing is an important technology in modern biology; see e.g. "Eleven grand challenges in single-cell data science" https://genomebiology.biomedcentral.com/articles/10.1186/s13059-020-1926-6
Also see review : Nature. P. Kharchenko: "The triumphs and limitations of computational methods for scRNA-seq" https://www.nature.com/articles/s41592-021-01171-x

A major event in mammalian male sex determination is the induction of the testis determining factor Sry and its downstream gene Sox9. The current study provides one of the first genome wide analyses of the downstream gene binding targets for SRY and SOX9 to help elucidate the molecular control of Sertoli cell differentiation and testis development. A modified ChIP-Chip analysis using a comparative hybridization was used to identify 71 direct downstream binding targets for SRY and 109 binding targets for SOX9. Interestingly, only 5 gene targets overlapped between SRY and SOX9. In addition to the direct response element binding gene targets, a large number of atypical binding gene targets were identified for both SRY and SOX9. Bioinformatic analysis of the downstream binding targets identified gene networks and cellular pathways potentially involved in the induction of Sertoli cell differentiation and testis development. The specific DNA sequence binding site motifs for both SRY and SOX9 were identified. Observations provide insights into the molecular control of male gonadal sex determination.
Overall design: The current study provides one of the first genome wide analyses of the downstream gene binding targets for SRY and SOX9 to help elucidate the molecular control of Sertoli cell differentiation and testis development. At embryonic day 13 (E13) of pregnancy rats were euthanized and embryonic gonads were collected for chromatin. A modified ChIP-Chip analysis using a comparative hybridization was used to identify direct downstream binding targets for SRY and for SOX9. Then, bioinformatic analysis of the downstream binding targets was done to identify gene networks and cellular pathways that are potentially involved in the induction of Sertoli cell differentiation and testis development.