100+ datasets found
  1. Additional file 1 of NetSeekR: a network analysis pipeline for RNA-Seq time...

    • springernature.figshare.com
    txt
    Updated May 30, 2023
    Cite
    Himangi Srivastava; Drew Ferrell; George V. Popescu (2023). Additional file 1 of NetSeekR: a network analysis pipeline for RNA-Seq time series data [Dataset]. http://doi.org/10.6084/m9.figshare.19090649.v1
    Available download formats
    txt
    Dataset updated
    May 30, 2023
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Himangi Srivastava; Drew Ferrell; George V. Popescu
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Additional file 1. Brief description of functions implemented in NetSeekR.

  2. Informatics for RNA Sequencing: A Web Resource for Analysis on the Cloud

    • plos.figshare.com
    • datasetcatalog.nlm.nih.gov
    pdf
    Updated May 31, 2023
    Cite
    Malachi Griffith; Jason R. Walker; Nicholas C. Spies; Benjamin J. Ainscough; Obi L. Griffith (2023). Informatics for RNA Sequencing: A Web Resource for Analysis on the Cloud [Dataset]. http://doi.org/10.1371/journal.pcbi.1004393
    Available download formats
    pdf
    Dataset updated
    May 31, 2023
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Malachi Griffith; Jason R. Walker; Nicholas C. Spies; Benjamin J. Ainscough; Obi L. Griffith
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Massively parallel RNA sequencing (RNA-seq) has rapidly become the assay of choice for interrogating RNA transcript abundance and diversity. This article provides a detailed introduction to fundamental RNA-seq molecular biology and informatics concepts. We make available open-access RNA-seq tutorials that cover cloud computing, tool installation, relevant file formats, reference genomes, transcriptome annotations, quality-control strategies, expression, differential expression, and alternative splicing analysis methods. These tutorials and additional training resources are accompanied by complete analysis pipelines and test datasets made available without encumbrance at www.rnaseq.wiki.

  3. Data from: A Robust Analytical Pipeline for Genome-Wide Identification of...

    • datasetcatalog.nlm.nih.gov
    • plos.figshare.com
    Updated Sep 28, 2016
    Cite
    Kojima, Takaaki; Kobayashi, Tetsuo; Ihara, Kunio; Nakano, Hideo; Kunitake, Emi (2016). A Robust Analytical Pipeline for Genome-Wide Identification of the Genes Regulated by a Transcription Factor: Combinatorial Analysis Performed Using gSELEX-Seq and RNA-Seq [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0001503876
    Dataset updated
    Sep 28, 2016
    Authors
    Kojima, Takaaki; Kobayashi, Tetsuo; Ihara, Kunio; Nakano, Hideo; Kunitake, Emi
    Description

    For identifying the genes that are regulated by a transcription factor (TF), we have established an analytical pipeline that combines genomic systematic evolution of ligands by exponential enrichment (gSELEX)-Seq and RNA-Seq. Here, SELEX was used to select DNA fragments from an Aspergillus nidulans genomic library that bound specifically to AmyR, a TF from A. nidulans. High-throughput sequencing data were obtained for the DNAs enriched through the selection, following which various in silico analyses were performed. Mapping reads to the genome revealed the binding motifs including the canonical AmyR-binding motif, CGGN8CGG, as well as the candidate promoters controlled by AmyR. In parallel, differentially expressed genes related to AmyR were identified by using RNA-Seq analysis with samples from A. nidulans WT and amyR deletant. By obtaining the intersecting set of genes detected using both gSELEX-Seq and RNA-Seq, the genes directly regulated by AmyR in A. nidulans can be identified with high reliability. This analytical pipeline is a robust platform for comprehensive genome-wide identification of the genes that are regulated by a target TF.
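
    The motif scan mentioned above can be illustrated with a small sketch. It assumes that N in CGGN8CGG is the IUPAC code for any base, so the motif reads as CGG, eight arbitrary bases, then CGG. The function name and the example fragment are hypothetical, and only the given strand is scanned. This is not the authors' implementation.

```python
import re

# Assumption: "N" is the IUPAC wildcard, so CGGN8CGG = CGG + 8 arbitrary bases + CGG.
# The lookahead lets overlapping occurrences be reported as well.
AMYR_MOTIF = re.compile(r"(?=(CGG[ACGT]{8}CGG))")


def find_motif_sites(sequence: str) -> list[tuple[int, str]]:
    """Return (0-based start, matched 14-mer) for every motif hit on the given strand."""
    seq = sequence.upper()
    return [(m.start(), m.group(1)) for m in AMYR_MOTIF.finditer(seq)]


# Hypothetical fragment, invented for illustration only.
example = "ttaCGGAAATTTGCCGGataCGGACGTACGTCGGcc"
hits = find_motif_sites(example)
for start, site in hits:
    print(start, site)
```

    A fuller analysis would also scan the reverse complement and compare hit positions with annotated promoter regions.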

  4. Raw and processed (filtered and annotated) scRNAseq data

    • figshare.com
    zip
    Updated Jun 12, 2023
    Cite
    Gabrielle Leclercq-Cohen; Sabrina Danilin; Llucia Alberti-Servera; Stephan Schmeing; Hélène Haegel; Sina Nassiri; Marina Bacac (2023). Raw and processed (filtered and annotated) scRNAseq data [Dataset]. http://doi.org/10.6084/m9.figshare.23499192.v1
    Available download formats
    zip
    Dataset updated
    Jun 12, 2023
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Gabrielle Leclercq-Cohen; Sabrina Danilin; Llucia Alberti-Servera; Stephan Schmeing; Hélène Haegel; Sina Nassiri; Marina Bacac
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Single cell RNA-seq data generated and reported as part of the manuscript entitled "Dissecting the mechanisms underlying the Cytokine Release Syndrome (CRS) mediated by T Cell Bispecific Antibodies" by Leclercq-Cohen et al 2023. Raw and processed (filtered and annotated) data are provided as AnnData objects which can be directly ingested to reproduce the findings of the paper or for ab initio data reuse:

    • raw.zip provides concatenated raw/unfiltered counts for the 20 samples in the standard Market Exchange (MEX) format.
    • 230330_sw_besca2_LowFil_raw.h5ad contains filtered cells and raw counts in the HDF5 format.
    • 221124_sw_besca2_LowFil.annotated.h5ad contains filtered cells and log normalized counts, along with cell type annotation in the HDF5 format.

    scRNAseq data generation: Whole blood from 4 donors was treated with 0.2 μg/mL CD20-TCB, or incubated in the absence of CD20-TCB. At baseline (before addition of TCB) and assay endpoints (2, 4, 6, and 20 hrs), blood was collected for total leukocyte isolation using EasySep™ red blood cell depletion reagent (Stemcell). Briefly, cells were counted and processed for single cell RNA sequencing using the BD Rhapsody platform. To load several samples on a single BD Rhapsody cartridge, sample cells were labelled with sample tags (BD Human Single-Cell Multiplexing Kit) following the manufacturer’s protocol prior to pooling. Briefly, 1×10^6 cells from each sample were re-suspended in 180 μL FBS Stain Buffer (BD, PharMingen) and sample tags were added to the respective samples and incubated for 20 min at RT. After incubation, 2 successive washes were performed by addition of 2 mL stain buffer and centrifugation for 5 min at 300 g. Cells were then re-suspended in 620 μL cold BD Sample Buffer, stained with 3.1 μL of both 2 mM Calcein AM (Thermo Fisher Scientific) and 0.3 mM Draq7 (BD Biosciences) and finally counted on the BD Rhapsody scanner. Samples were then diluted and/or pooled equally in 650 μL cold BD Sample Buffer. The BD Rhapsody cartridges were then loaded with up to 40 000 – 50 000 cells. Single cells were isolated using Single-Cell Capture and cDNA Synthesis with the BD Rhapsody Express Single-Cell Analysis System according to the manufacturer’s recommendations (BD Biosciences). cDNA libraries were prepared using the Whole Transcriptome Analysis Amplification Kit following the BD Rhapsody System mRNA Whole Transcriptome Analysis (WTA) and Sample Tag Library Preparation Protocol (BD Biosciences). Indexed WTA and sample tags libraries were quantified and quality controlled on the Qubit Fluorometer using the Qubit dsDNA HS Assay, and on the Agilent 2100 Bioanalyzer system using the Agilent High Sensitivity DNA Kit. Sequencing was performed on a Novaseq 6000 (Illumina) in paired-end mode (64-8-58) with Novaseq6000 S2 v1 or Novaseq6000 SP v1.5 reagents kits (100 cycles).

    scRNAseq data analysis: Sequencing data was processed using the BD Rhapsody Analysis pipeline (v 1.0 https://www.bd.com/documents/guides/user-guides/GMX_BD-Rhapsody-genomics-informatics_UG_EN.pdf) on the Seven Bridges Genomics platform. Briefly, read pairs with low sequencing quality were first removed and the cell label and UMI identified for further quality check and filtering. Valid reads were then mapped to the human reference genome (GRCh38-PhiX-gencodev29) using the aligner Bowtie2 v2.2.9, and reads with the same cell label, same UMI sequence and same gene were collapsed into a single raw molecule while undergoing further error correction and quality checks. Cell labels were filtered with a multi-step algorithm to distinguish those associated with putative cells from those associated with noise. After determining the putative cells, each cell was assigned to the sample of origin through the sample tag (only for cartridges with multiplex loading). Finally, the single-cell gene expression matrices were generated and a metrics summary was provided.

    After pre-processing with BD’s pipeline, the count matrices and metadata of each sample were aggregated into a single adata object and loaded into the besca v2.3 pipeline for the single cell RNA sequencing analysis (43). First, we filtered low quality cells with less than 200 genes, less than 500 counts or more than 30% of mitochondrial reads. This permissive filtering was used in order to preserve the neutrophils. We further excluded potential multiplets (cells with more than 5,000 genes or 20,000 counts), and genes expressed in less than 30 cells. Normalization, log-transformed UMI counts per 10,000 reads [log(CP10K+1)], was applied before downstream analysis. After normalization, technical variance was removed by regressing out the effects of total UMI counts and percentage of mitochondrial reads, and gene expression was scaled. The 2,507 most variable genes (having a minimum mean expression of 0.0125, a maximum mean expression of 3 and a minimum dispersion of 0.5) were used for principal component analysis. Finally, the first 50 PCs were used as input for calculating the 10 nearest neighbours and the neighbourhood graph was then embedded into the two-dimensional space using the UMAP algorithm at a resolution of 2. Cell type annotation was performed using the Sig-annot semi-automated besca module, which is a signature-based hierarchical cell annotation method. The used signatures, configuration and nomenclature files can be found at https://github.com/bedapub/besca/tree/master/besca/datasets. For more details, please refer to the publication.
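
    The cell filtering thresholds and the log(CP10K+1) normalization described above can be sketched in plain Python. The thresholds are taken from the description. The function names are hypothetical, the use of the natural logarithm is an assumption, and this is not the besca implementation.

```python
import math

# Thresholds quoted in the dataset description.
MIN_GENES, MIN_COUNTS, MAX_MITO_FRACTION = 200, 500, 0.30
MAX_GENES, MAX_COUNTS = 5_000, 20_000


def passes_qc(n_genes: int, n_counts: int, mito_fraction: float) -> bool:
    """Keep a cell unless it looks low quality or like a potential multiplet."""
    low_quality = (
        n_genes < MIN_GENES
        or n_counts < MIN_COUNTS
        or mito_fraction > MAX_MITO_FRACTION
    )
    multiplet = n_genes > MAX_GENES or n_counts > MAX_COUNTS
    return not (low_quality or multiplet)


def log_cp10k(counts: dict[str, int]) -> dict[str, float]:
    """Scale one cell's UMI counts to counts per 10,000, then apply log(x + 1).

    Assumption: natural logarithm; the description does not state the base.
    """
    total = sum(counts.values())
    if total == 0:
        return {gene: 0.0 for gene in counts}
    return {gene: math.log1p(c / total * 10_000) for gene, c in counts.items()}


# Hypothetical cell, invented for illustration only.
cell = {"geneA": 1, "geneB": 3}
print(passes_qc(n_genes=1500, n_counts=4000, mito_fraction=0.05))
print(log_cp10k(cell))
```

    Filtering genes expressed in fewer than 30 cells needs the full count matrix and is left out of this per-cell sketch.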

  5. Example RNA-seq analysis of data from GSE119855

    • zenodo.org
    • data.niaid.nih.gov
    zip
    Updated Mar 9, 2023
    Cite
    Geert van Geest (2023). Example RNA-seq analysis of data from GSE119855 [Dataset]. http://doi.org/10.5281/zenodo.7691547
    Available download formats
    zip
    Dataset updated
    Mar 9, 2023
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Geert van Geest
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Analysis of four samples of GEO accession GSE119855 with the IBU RNA-seq pipeline

  6. Ngs-Based Rna-Seq Market Analysis North America, Europe, Asia, Rest of World...

    • technavio.com
    pdf
    Updated Aug 15, 2024
    Cite
    Technavio (2024). Ngs-Based Rna-Seq Market Analysis North America, Europe, Asia, Rest of World (ROW) - US, UK, Germany, Singapore, China - Size and Forecast 2024-2028 [Dataset]. https://www.technavio.com/report/ngs-based-rna-seq-market-analysis
    Available download formats
    pdf
    Dataset updated
    Aug 15, 2024
    Dataset provided by
    TechNavio
    Authors
    Technavio
    License

    https://www.technavio.com/content/privacy-notice

    Time period covered
    2024 - 2028
    Area covered
    United States, United Kingdom
    Description


    NGS-Based RNA-Seq Market Size 2024-2028

    The NGS-based RNA-seq market size is forecast to increase by USD 6.66 billion, at a CAGR of 20.52% between 2023 and 2028.

    The market is witnessing significant growth, driven by the increased adoption of next-generation sequencing (NGS) methods for RNA-Seq analysis. The advanced capabilities of NGS techniques, such as high-throughput, cost-effectiveness, and improved accuracy, have made them the preferred choice for researchers and clinicians in various fields, including genomics, transcriptomics, and personalized medicine. However, the market faces challenges, primarily from the lack of clinical validation on direct-to-consumer genetic tests. As the use of NGS technology in consumer applications expands, ensuring the accuracy and reliability of results becomes crucial.
    The absence of standardized protocols and regulatory oversight in this area poses a significant challenge to market growth and trust. Companies seeking to capitalize on market opportunities must focus on addressing these challenges through collaborations, partnerships, and investments in research and development to ensure the clinical validity and reliability of their NGS-based RNA-Seq offerings.
    

    What will be the Size of the NGS-based RNA-Seq market during the forecast period?

    Explore in-depth regional segment analysis with market size data - historical 2018-2022 and forecasts 2024-2028 - in the full report.

    The market continues to evolve, driven by advancements in NGS technology and its applications across various sectors. Spatial transcriptomics, a novel approach to studying gene expression in its spatial context, is gaining traction in disease research and precision medicine. Splice junction detection, a critical component of RNA-seq data analysis, enhances the accuracy of gene expression profiling and differential gene expression studies. Cloud computing plays a pivotal role in handling the massive amounts of data generated by NGS platforms, enabling real-time data analysis and storage. Enrichment analysis, gene ontology, and pathway analysis facilitate the interpretation of RNA-seq data, while data normalization and quality control ensure the reliability of results.

    Precision medicine and personalized therapy are key applications of RNA-seq, with single-cell RNA-seq offering unprecedented insights into the complexities of gene expression at the single-cell level. Read alignment and variant calling are essential steps in RNA-seq data analysis, while bioinformatics pipelines and RNA-seq software streamline the process. NGS technology is revolutionizing drug discovery by enabling the identification of biomarkers and gene fusion detection in various diseases, including cancer and neurological disorders. RNA-seq is also finding applications in infectious diseases, microbiome analysis, environmental monitoring, agricultural genomics, and forensic science. Sequencing costs are decreasing, making RNA-seq more accessible to researchers and clinicians.

    The ongoing development of sequencing platforms, library preparation, and sample preparation kits continues to drive innovation in the field. The dynamic nature of the market ensures that it remains a vibrant and evolving field, with ongoing research and development in areas such as data visualization, clinical trials, and sequencing depth.

    How is this NGS-based RNA-Seq industry segmented?

    The NGS-based RNA-seq industry research report provides comprehensive data (region-wise segment analysis), with forecasts and estimates in 'USD million' for the period 2024-2028, as well as historical data from 2018-2022 for the following segments.

    End-user
      Academic and research centers
      Clinical research
      Pharma companies
      Hospitals

    Technology
      Sequencing by synthesis
      Ion semiconductor sequencing
      Single-molecule real-time sequencing
      Others

    Geography
      North America
        US
      Europe
        Germany
        UK
      APAC
        China
        Singapore
      Rest of World (ROW)

    By End-user Insights

    The academic and research centers segment is estimated to witness significant growth during the forecast period.

    The global next-generation sequencing (NGS) market for RNA sequencing (RNA-Seq) is primarily driven by academic and research institutions, including those from universities, research institutes, government entities, biotechnology organizations, and pharmaceutical companies. These institutions utilize NGS technology for various research applications, such as whole-genome sequencing, epigenetics, and emerging fields like agrigenomics and animal research, to enhance crop yield and nutritional composition. NGS-based RNA-Seq plays a pivotal role in translational research, with significant investments from both private and public organizations fueling its growth. The technology is instrumental in disease research, enabling the identification of nov

  7. A Low-Cost Library Construction Protocol and Data Analysis Pipeline for...

    • plos.figshare.com
    tiff
    Updated Jun 1, 2023
    Cite
    Lin Wang; Yaqing Si; Lauren K. Dedow; Ying Shao; Peng Liu; Thomas P. Brutnell (2023). A Low-Cost Library Construction Protocol and Data Analysis Pipeline for Illumina-Based Strand-Specific Multiplex RNA-Seq [Dataset]. http://doi.org/10.1371/journal.pone.0026426
    Available download formats
    tiff
    Dataset updated
    Jun 1, 2023
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Lin Wang; Yaqing Si; Lauren K. Dedow; Ying Shao; Peng Liu; Thomas P. Brutnell
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The emergence of NextGen sequencing technology has generated much interest in the exploration of transcriptomes. Currently, Illumina Inc. (San Diego, CA) provides one of the most widely utilized sequencing platforms for gene expression analysis. While Illumina reagents and protocols perform adequately in RNA-sequencing (RNA-seq), alternative reagents and protocols promise a higher throughput at a much lower cost. We have developed a low-cost and robust protocol to produce Illumina-compatible (GAIIx and HiSeq2000 platforms) RNA-seq libraries by combining several recent improvements. First, we designed balanced adapter sequences for multiplexing of samples; second, dUTP incorporation in 2nd strand synthesis was used to enforce strand-specificity; third, we simplified RNA purification, fragmentation and library size-selection steps thus drastically reducing the time and increasing throughput of library construction; fourth, we included an RNA spike-in control for validation and normalization purposes. To streamline informatics analysis for the community, we established a pipeline within the iPlant Collaborative. These scripts are easily customized to meet specific research needs and improve on existing informatics and statistical treatments of RNA-seq data. In particular, we apply significance tests for determining differential gene expression and intron retention events. To demonstrate the potential of both the library-construction protocol and data-analysis pipeline, we characterized the transcriptome of the rice leaf. Our data supports novel gene models and can be used to improve current rice genome annotation. Additionally, using the rice transcriptome data, we compared different methods of calculating gene expression and discuss the advantages of a strand-specific approach to detect bona-fide anti-sense transcripts and to detect intron retention events. 
    Our results demonstrate the potential of this low cost and robust method for RNA-seq library construction and data analysis.

  8. Supporting data for "Software pipelines for RNA-Seq, ChIP-Seq and Germline...

    • zenodo.org
    • data.niaid.nih.gov
    zip
    Updated Sep 27, 2023
    Cite
    Konstantinos Kyritsis; Nikolaos Pechlivanis; Fotis Psomopoulos (2023). Supporting data for "Software pipelines for RNA-Seq, ChIP-Seq and Germline Variant calling analyses in Common Workflow Language (CWL)" [Dataset]. http://doi.org/10.5281/zenodo.8116556
    Available download formats
    zip
    Dataset updated
    Sep 27, 2023
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Konstantinos Kyritsis; Nikolaos Pechlivanis; Fotis Psomopoulos
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Datasets produced during the validation of CWL-based pipelines, designed for the analysis of data from RNA-Seq, ChIP-Seq and germline variant calling experiments. Specifically, the workflows were tested using publicly available High-throughput (HTS) data from published studies on Chronic Lymphocytic Leukemia (CLL) (accession numbers: E-MTAB-6962, GSE115772) and Genome in a Bottle (GIAB) project samples (accession numbers: SRR6794144, SRR22476789, SRR22476790, SRR22476791).

    The supporting data include:

    • Differential transcript and gene expression results produced during the analysis with the CWL-based RNA-Seq pipeline
    • Bigwig and narrowPeak files, differential binding results, table of consensus peaks and read counts of EZH2 and H3K27me3, produced during the analysis with the CWL-based ChIP-Seq pipeline
    • VCF files containing the detected and filtered variants, along with the respective hap.py results regarding comparisons against the GIAB golden standard truth sets for both CWL-based germline variant calling pipelines

  9. Efficient Identification of Multiple Pathways: RNA-Seq Analysis of Livers...

    • data.nasa.gov
    Updated Apr 23, 2025
    Cite
    nasa.gov (2025). Efficient Identification of Multiple Pathways: RNA-Seq Analysis of Livers from 56Fe Ion Irradiated Mice [Dataset]. https://data.nasa.gov/dataset/efficient-identification-of-multiple-pathways-rna-seq-analysis-of-livers-from-56fe-ion-irr
    Dataset updated
    Apr 23, 2025
    Dataset provided by
    NASA (http://nasa.gov/)
    License

    U.S. Government Works, https://www.usa.gov/government-works
    License information was derived automatically

    Description

    Background: mRNA interactions with each other and other signaling molecules define different biological pathways and functions. Researchers have been investigating various tools to analyze these types of interactions. In particular, gene co-expression network methods have proved useful in finding and analyzing these molecular interactions. Many different analytical pipelines to identify these interaction networks have been proposed with the aim of identifying an optimal partition of the network where the individual modules are neither too small to make any general inference nor too large to be biologically interpretable.

    Results: In this study we propose a new pipeline to perform gene co-expression network analysis. The proposed pipeline uses WGCNA, a widely used software to perform different aspects of gene co-expression network analysis, and a modularity maximization algorithm to analyze novel RNA-Seq data to understand the effects of low-dose 56Fe ion irradiation on the formation of hepatocellular carcinoma in mice. The network results along with experimental validation show that using WGCNA combined with Modularity provides a more biologically interpretable network in our dataset. Our pipeline showed better performance than the existing clustering algorithm in WGCNA in finding modules and identified a module with mitochondrial subunits that is supported by mitochondrial complex assay.

    Conclusions: We present a pipeline that can reduce the problem of parameter selection with the existing algorithm in WGCNA for comparable RNA-Seq datasets, which may assist in future research to discover novel mRNA interactions and their downstream molecular effects.

    C57BL16 males were placed into 2 treatment groups and received the following irradiation treatments at Brookhaven National Laboratories (Long Island, NY): 600 MeV/n 56Fe (0.2 Gy) and no irradiation. Left liver lobes were collected at 30, 60, 120, 270, and 360 days post-irradiation, flash frozen, and stored at -80 °C until they could be processed for RNA-Seq. Livers were sampled by taking two 40-micron thick slices using a cryotome at -20 °C. This allowed multiple sampling of the tissue without the tissue going through multiple freeze/thaw cycles. Total RNA was isolated from the liver slices using RNAqueous™ Total RNA Isolation Kit (ThermoFisher Scientific, Waltham, MA) and rRNA was removed via Ribo-Zero™ rRNA Removal Kit (Illumina, San Diego, CA) prior to library preparation with the Illumina TruSeq RNA Library kit. Samples were sequenced in a paired-end 50 base format on an Illumina HiSeq 1500. Reads were aligned to the mouse GRCm38 reference genome using the STAR alignment program version 2.5.3a with the recommended ENCODE options. The --quantMode GeneCounts option was used to obtain read counts per gene based on the Gencode release M14 annotation file. The total number of reads used in analysis varies between 23 and 35 million.
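
    For readers who want to work with per-gene counts of the kind produced by STAR's GeneCounts mode, the following is a minimal parsing sketch. It assumes the usual tab-separated ReadsPerGene.out.tab layout: four summary rows starting with N_, then one row per gene with unstranded, forward-strand and reverse-strand counts. The function name, gene IDs and numbers are hypothetical and are not taken from this dataset.

```python
import csv
import io

# Summary rows that STAR writes before the per-gene rows.
SUMMARY_ROWS = {"N_unmapped", "N_multimapping", "N_noFeature", "N_ambiguous"}


def parse_reads_per_gene(handle, column: int = 1) -> dict[str, int]:
    """Read per-gene counts from a ReadsPerGene.out.tab-style table.

    column 1 = unstranded counts, 2 = read strand matches the gene,
    3 = read strand is reversed relative to the gene.
    """
    counts = {}
    for row in csv.reader(handle, delimiter="\t"):
        if not row or row[0] in SUMMARY_ROWS:
            continue
        counts[row[0]] = int(row[column])
    return counts


# Hypothetical file content, invented for illustration only.
EXAMPLE = (
    "N_unmapped\t100\t100\t100\n"
    "N_multimapping\t50\t50\t50\n"
    "N_noFeature\t20\t300\t25\n"
    "N_ambiguous\t5\t1\t2\n"
    "geneA\t12\t1\t11\n"
    "geneB\t0\t0\t0\n"
)
unstranded = parse_reads_per_gene(io.StringIO(EXAMPLE))
print(unstranded)
```

    Which column is appropriate depends on the strandedness of the library preparation, which the description above does not specify.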

  10. Data from: A comparative study of RNA-Seq and microarray data analysis on...

    • datasetcatalog.nlm.nih.gov
    • plos.figshare.com
    Updated May 16, 2018
    Cite
    Bayerlová, Michaela; Wolff, Alexander; Beißbarth, Tim; Kube, Dieter; Gaedcke, Jochen (2018). A comparative study of RNA-Seq and microarray data analysis on the two examples of rectal-cancer patients and Burkitt Lymphoma cells [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0000710372
    Dataset updated
    May 16, 2018
    Authors
    Bayerlová, Michaela; Wolff, Alexander; Beißbarth, Tim; Kube, Dieter; Gaedcke, Jochen
    Description

    Background: Pipeline comparisons for gene expression data are highly valuable for applied real data analyses, as they enable the selection of suitable analysis strategies for the dataset at hand. Such pipelines for RNA-Seq data should include mapping of reads, counting and differential gene expression analysis or preprocessing, normalization and differential gene expression in case of microarray analysis, in order to give a global insight into pipeline performances.

    Methods: Four commonly used RNA-Seq pipelines (STAR/HTSeq-Count/edgeR, STAR/RSEM/edgeR, Sailfish/edgeR, TopHat2/Cufflinks/CuffDiff) were investigated on multiple levels (alignment and counting) and cross-compared with the microarray counterpart on the level of gene expression and gene ontology enrichment. For these comparisons we generated two matched microarray and RNA-Seq datasets: Burkitt Lymphoma cell line data and rectal cancer patient data.

    Results: The overall mapping rate of STAR was 98.98% for the cell line dataset and 98.49% for the patient dataset. Tophat’s overall mapping rate was 97.02% and 96.73%, respectively, while Sailfish had only an overall mapping rate of 84.81% and 54.44%. The correlation of gene expression in microarray and RNA-Seq data was moderately worse for the patient dataset (ρ = 0.67–0.69) than for the cell line dataset (ρ = 0.87–0.88). An exception were the correlation results of Cufflinks, which were substantially lower (ρ = 0.21–0.29 and 0.34–0.53). For both datasets we identified very low numbers of differentially expressed genes using the microarray platform. For RNA-Seq we checked the agreement of differentially expressed genes identified in the different pipelines and of GO-term enrichment results.

    Conclusion: In conclusion the combination of STAR aligner with HTSeq-Count followed by STAR aligner with RSEM and Sailfish generated differentially expressed genes best suited for the dataset at hand and in agreement with most of the other transcriptomics pipelines.

  11. Data from: RNA Sequencing-Based Single Sample Predictors of Molecular...

    • data.mendeley.com
    Updated Jun 1, 2022
    Cite
    Johan Vallon-Christersson (2022). RNA Sequencing-Based Single Sample Predictors of Molecular Subtype and Risk of Recurrence for Clinical Assessment of Early-Stage Breast Cancer [Dataset]. http://doi.org/10.17632/yzxtxn4nmd.1
    Dataset updated
    Jun 1, 2022
    Authors
    Johan Vallon-Christersson
    License

    Attribution 4.0 (CC BY 4.0), https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Gene expression data and associated supplementary files from RNAseq of breast cancer samples from Staaf et al. (source reference below). Library preparation for mRNA-sequencing was done by a stranded dUTP mRNA protocol or by Illumina stranded TruSeq mRNA protocol. Expression data (Fragments Per Kilobase per Million reads, FPKM) was generated by an analysis pipeline utilizing Hisat/StringTie with GRCh38 human genome primary assembly and GENCODE Release 27 transcripts/genes. Gene expression data is summarized on GENCODE gene identifier. Gene and transcript definitions and gene annotations are from GENCODE Release 27.
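
    As a reminder of what the FPKM unit means, the textbook definition can be sketched as follows. This is the generic formula, not the Hisat/StringTie computation used for this dataset (StringTie derives its estimates from assembled transcript coverage). The function name and the numbers are hypothetical.

```python
def fpkm(fragments: int, length_bp: int, total_fragments: int) -> float:
    """Fragments Per Kilobase of transcript per Million mapped fragments.

    FPKM = fragments / (length in kb) / (total mapped fragments in millions)
         = fragments * 1e9 / (length_bp * total_fragments)
    """
    if length_bp <= 0 or total_fragments <= 0:
        raise ValueError("length_bp and total_fragments must be positive")
    return fragments * 1e9 / (length_bp * total_fragments)


# Hypothetical gene: 500 fragments on a 2,000 bp gene in a library of
# 10 million mapped fragments.
value = fpkm(fragments=500, length_bp=2_000, total_fragments=10_000_000)
print(value)
```

    Because the value is scaled by both gene length and library size, it is meant for comparing expression levels of different genes within a sample rather than as input for count-based differential expression tools.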

    Detailed description including material and methods for RNAseq, Hisat/StringTie analysis pipeline, and the development of the Single Sample Predictor (SSP) models for Breast Cancer is available in Staaf et al. (source reference below).

    The developed SSP models are available as an R package available at GitHub (reference below).

  12. DESeq2 DGE Analysis Pasilla RNA-Seq Dataset

    • kaggle.com
    zip
    Updated Nov 29, 2025
    Cite
    Dr. Nagendra (2025). DESeq2 DGE Analysis Pasilla RNA-Seq Dataset [Dataset]. https://www.kaggle.com/datasets/mannekuntanagendra/deseq2-dge-analysis-pasilla-rna-seq-dataset
    Available download formats
    zip (43449 bytes)
    Dataset updated
    Nov 29, 2025
    Authors
    Dr. Nagendra
    License

    MIT License, https://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    This dataset contains RNA-Seq differential gene expression (DGE) analysis data.

    It is derived from the Pasilla fruit fly dataset.

    The data is processed using DESeq2, a widely-used tool for DGE analysis in R.

    It includes gene counts, normalized counts, and statistical test results.

    Users can explore differentially expressed genes between experimental conditions.

    The dataset is suitable for transcriptomics, bioinformatics, and genomics research.

    It can be used for benchmarking DGE analysis pipelines.

    The dataset provides reproducible examples for learning DESeq2 workflows.

    The source data is publicly available from the original Pasilla RNA-Seq study.

    The dataset can be used to visualize and interpret RNA-Seq results in R.

    It is ideal for researchers, students, and data scientists interested in genomics.

    The dataset helps understand gene expression changes under experimental conditions.

  13. [RNAseq] Aging, Dementia and TBI

    • kaggle.com
    zip
    Updated Feb 9, 2022
    Cite
    Alberto Zorzetto (2022). [RNAseq] Aging, Dementia and TBI [Dataset]. https://www.kaggle.com/datasets/albertozorzetto/rnaseq-aging-dementia-and-tbi
    Explore at:
    zip(129095902 bytes)Available download formats
    Dataset updated
    Feb 9, 2022
    Authors
    Alberto Zorzetto
    License

    https://creativecommons.org/publicdomain/zero/1.0/

    Description

    Dataset

    The data set includes 377 RNA-Seq samples collected from hippocampus, temporal cortex, and parietal cortex (both grey and white matter) in 55 aged donors with TBI and their matched controls (107 donors total after QC). Additional donor meta-data, neuropathology metrics, and IHC image quantifications for each sample are available.

    The sequencing results were aligned and aggregated at the gene level using the RSEM algorithm, and the resulting fpkm values were normalized across all samples within each brain region to account for processing batch and RNA quality (RIN).

    Description

    • fpkm_table_unnormalized.csv: Contains the (row, column) matrix of fpkm values obtained for each (gene, sample) from the RSEM analysis pipeline

      • The first row contains the unique identifiers of the RNA-seq profiles of the samples (rnaseq_profile_id)
      • The first column contains the gene unique identifiers (gene_id)
    • fpkm_table_normalized.csv: Contains the (row, column) matrix of fpkm values obtained for each (gene, sample) after correcting for RIN and batch effects. These are the data displayed in the RNA-Seq page heatmap

      • The first row contains the unique identifiers of the RNA-seq profiles of the samples (rnaseq_profile_id)
      • The first column contains the gene unique identifiers (gene_id)
    • columns-samples.csv: Contains information about the samples profiled with RNA sequencing

      • rnaseq_profile_id: Expression profile obtained from aligning the RNA-Seq data to the GRCh38.p2 reference genome.
      • donor_id and donor_name: Donor from which the sample was dissected.
      • specimen_id and specimen_name: Specimen from which the sample was dissected (i.e., a particular brain structure from a particular donor).
      • rna_well_id: Unique identifier of the sample.
      • polygon_id: Unique identifier of an avg_graphic_object that outlines where the sample was cut from.
      • structure_id, structure_abbreviation, structure_color, structure_name: Label that groups samples by brain region (hippocampus, temporal cortex, parietal cortex, and forebrain white matter).
      • hemisphere: Hemisphere from which the processed sample was collected.
    • rows-genes.csv: Contains information about the genes for which fpkm values were calculated.

      • gene_id: Unique identifier for the gene.
      • chromosome: Chromosome associated with the gene.
      • gene_entrez_id, gene_symbol, gene_name: entrez_id, NCBI symbol, and name of the gene.
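
The file layout described above (a gene-by-sample fpkm matrix plus a per-sample metadata table keyed by rnaseq_profile_id) can be explored along the following lines. This is only a minimal sketch: the helper names are hypothetical, the toy IDs and values are invented for illustration, and it assumes the CSV files are plain comma-separated tables whose first row and first column hold the identifiers as stated in the description.

```python
import csv
import io


def read_fpkm_matrix(handle):
    """Read a gene-by-sample table into {gene_id: {rnaseq_profile_id: fpkm}}."""
    reader = csv.reader(handle)
    header = next(reader)
    profile_ids = header[1:]  # the first cell only labels the gene_id column
    matrix = {}
    for row in reader:
        gene_id, values = row[0], row[1:]
        matrix[gene_id] = {p: float(v) for p, v in zip(profile_ids, values)}
    return matrix


def read_sample_structures(handle):
    """Map each rnaseq_profile_id to its structure_name (brain region label)."""
    reader = csv.DictReader(handle)
    return {row["rnaseq_profile_id"]: row["structure_name"] for row in reader}


def mean_fpkm_by_structure(matrix, structures, gene_id):
    """Average one gene's fpkm values within each brain region."""
    totals = {}
    for profile_id, value in matrix[gene_id].items():
        name = structures[profile_id]
        total, count = totals.get(name, (0.0, 0))
        totals[name] = (total + value, count + 1)
    return {name: total / count for name, (total, count) in totals.items()}


# Invented stand-ins for fpkm_table_normalized.csv and columns-samples.csv.
fpkm_csv = io.StringIO("gene_id,101,102,103\n1,2.0,4.0,9.0\n2,0.5,1.5,3.0\n")
samples_csv = io.StringIO(
    "rnaseq_profile_id,structure_name\n"
    "101,hippocampus\n"
    "102,hippocampus\n"
    "103,temporal cortex\n"
)

matrix = read_fpkm_matrix(fpkm_csv)
structures = read_sample_structures(samples_csv)
print(mean_fpkm_by_structure(matrix, structures, "1"))
```

With the real files, the two StringIO objects would be replaced by open file handles; the same join on rnaseq_profile_id applies.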

  14. RNA Sequencing Technologies Market Report | Global Forecast From 2025 To...

    • dataintelo.com
    csv, pdf, pptx
    Updated Jan 7, 2025
    Cite
    Dataintelo (2025). RNA Sequencing Technologies Market Report | Global Forecast From 2025 To 2033 [Dataset]. https://dataintelo.com/report/global-rna-sequencing-technologies-market
    Explore at:
    pdf, pptx, csvAvailable download formats
    Dataset updated
    Jan 7, 2025
    Dataset authored and provided by
    Dataintelo
    License

    https://dataintelo.com/privacy-and-policy

    Time period covered
    2024 - 2032
    Area covered
    Global
    Description

    RNA Sequencing Technologies Market Outlook



    The global RNA sequencing technologies market size was valued at $2.3 billion in 2023 and is poised to grow to $9.7 billion by 2032, exhibiting a robust CAGR of 16.9% during the forecast period. This impressive growth can be attributed to the increasing demand for personalized medicine and advancements in biotechnology, which have propelled the adoption of RNA sequencing technologies across various sectors.
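
As a rough check on how a compound annual growth rate relates to the two endpoint figures quoted above, the arithmetic can be sketched as follows. The function names are hypothetical, and treating 2023 to 2032 as the compounding window is an assumption; the listing does not say which base year the report uses, so the implied rate need not match the quoted 16.9% exactly.

```python
def implied_cagr(start_value, end_value, years):
    """Constant annual growth rate that turns start_value into end_value."""
    if start_value <= 0 or years <= 0:
        raise ValueError("start_value and years must be positive")
    return (end_value / start_value) ** (1 / years) - 1


def project(start_value, rate, years):
    """Value reached after compounding start_value at rate for the given years."""
    return start_value * (1 + rate) ** years


# Endpoints taken from the description; the 9-year window is an assumption.
rate = implied_cagr(2.3, 9.7, 2032 - 2023)
print(f"implied CAGR over 9 years: {rate:.1%}")
print(f"2.3 grown at 16.9% for 9 years: {project(2.3, 0.169, 9):.2f}")
```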



    The primary growth factor driving the RNA sequencing technologies market is the increasing focus on personalized medicine. As healthcare moves towards more targeted and individualized treatment plans, RNA sequencing enables a deeper understanding of the genetic and molecular underpinnings of diseases. This, in turn, facilitates the development of more effective treatments and therapies tailored to individual patients. Additionally, technological advancements in sequencing methods and bioinformatics tools have significantly lowered the costs and increased the accuracy and efficiency of RNA sequencing, further boosting its adoption.



    Another significant growth factor is the rising prevalence of chronic diseases and conditions such as cancer, cardiovascular diseases, and neurological disorders. These complex diseases require detailed molecular and genetic profiling for effective diagnosis and treatment. RNA sequencing provides a comprehensive view of the transcriptome, making it an invaluable tool in the detection and understanding of disease mechanisms. This has led to increased investments in RNA sequencing applications by pharmaceutical and biotechnology companies, as well as academic and research institutions.



    Furthermore, the expanding scope of RNA sequencing in drug discovery and development is a crucial driver of market growth. By offering insights into gene expression and regulation, RNA sequencing helps identify potential drug targets and biomarkers, accelerating the drug development process. This has led to a surge in collaborative research efforts and partnerships between sequencing technology providers and pharmaceutical companies. As the demand for novel therapeutics continues to rise, the role of RNA sequencing in the drug discovery pipeline is expected to become even more significant.



    mRNA Sequencing has emerged as a pivotal component within the broader RNA sequencing technologies landscape. This method focuses on capturing the messenger RNA molecules present in a sample, providing insights into the actively expressed genes at any given moment. The precision of mRNA Sequencing allows researchers to explore the dynamic nature of gene expression, making it invaluable for understanding cellular responses to environmental changes, disease states, and developmental processes. As the demand for personalized medicine grows, mRNA Sequencing offers the potential to tailor treatments based on an individual's unique gene expression profile, thus enhancing therapeutic efficacy and minimizing adverse effects.



    Regionally, North America holds a dominant position in the RNA sequencing technologies market, attributed to the presence of major biotechnology firms and advanced research infrastructures. Additionally, favorable regulatory environments and substantial government funding for genomics research further support market growth in this region. However, the Asia Pacific region is anticipated to exhibit the highest CAGR during the forecast period, driven by increasing healthcare investments, growing awareness of personalized medicine, and a burgeoning biotech sector.



    Technology Analysis



    Single-cell RNA Sequencing Analysis



    Single-cell RNA sequencing (scRNA-seq) is a powerful technology that enables the analysis of gene expression at the individual cell level, providing a high-resolution view of cellular heterogeneity. This technology has revolutionized our understanding of complex biological systems, including cancer, immune responses, and developmental biology. The ability to profile thousands of cells simultaneously has led to significant advancements in identifying rare cell populations and understanding cellular functions within tissues. As a result, scRNA-seq is increasingly being adopted by academic and research institutions for basic and translational research.



    The market for scRNA-seq is driven by the continuous innovations in sequencing platforms and data analysis tools, which have made the technology more

  15. Single-Nucleus RNA-Seq Market Research Report 2033

    • growthmarketreports.com
    csv, pdf, pptx
    Updated Oct 7, 2025
    Cite
    Growth Market Reports (2025). Single-Nucleus RNA-Seq Market Research Report 2033 [Dataset]. https://growthmarketreports.com/report/single-nucleus-rna-seq-market
    Explore at:
    pptx, csv, pdfAvailable download formats
    Dataset updated
    Oct 7, 2025
    Dataset authored and provided by
    Growth Market Reports
    Time period covered
    2024 - 2032
    Area covered
    Global
    Description

    Single-Nucleus RNA-Seq Market Outlook



    According to our latest research, the global Single-Nucleus RNA-Seq market size reached USD 368 million in 2024, reflecting a robust expansion driven by technological advancements and increasing research applications. The market is expected to grow at a CAGR of 17.2% from 2025 to 2033, with the market size projected to reach USD 1.57 billion by 2033. This impressive growth is fueled by the rising adoption of single-nucleus RNA sequencing (snRNA-seq) in various fields such as neuroscience, oncology, and developmental biology, as well as the increasing availability of high-throughput sequencing platforms and advanced bioinformatics tools.




    The growth of the Single-Nucleus RNA-Seq market is primarily propelled by the increasing need to understand cellular heterogeneity at a granular level, particularly in complex tissues such as the brain and tumors. The limitations of traditional bulk RNA sequencing, which averages gene expression across heterogeneous cell populations, have underscored the value of single-nucleus approaches. Researchers and clinicians are leveraging snRNA-seq to unravel disease mechanisms, identify novel therapeutic targets, and develop precision medicine strategies. The adoption of this technology is further accelerated by the emergence of automated instruments, improved sample preparation protocols, and the expanding availability of high-quality consumables, which collectively enhance throughput, reproducibility, and scalability of single-nucleus transcriptomic studies.




    Another significant growth driver for the Single-Nucleus RNA-Seq market is the increasing investment in genomics and transcriptomics research by both public and private sectors. Major funding agencies and governments across North America, Europe, and Asia Pacific are allocating substantial resources to support large-scale single-cell and single-nucleus sequencing projects. Pharmaceutical and biotechnology companies are integrating snRNA-seq into their drug discovery and development pipelines to better understand disease pathogenesis and patient stratification. The proliferation of collaborative initiatives between academic institutions, industry players, and clinical research organizations is also fostering innovation and expanding the application landscape of single-nucleus RNA sequencing across various domains, including immunology and developmental biology.




    The market is also benefiting from the rapid evolution of bioinformatics and data analysis tools tailored specifically for single-nucleus RNA-Seq data. The complexity and volume of data generated by snRNA-seq experiments necessitate sophisticated computational pipelines for quality control, normalization, clustering, and downstream analysis. The development of user-friendly software platforms and cloud-based solutions has democratized access to advanced analytics, enabling researchers with varying levels of computational expertise to derive meaningful insights from their data. This trend is expected to continue as more commercial and open-source solutions emerge, further driving the adoption of single-nucleus RNA sequencing technologies in both research and clinical settings.




    From a regional perspective, North America currently dominates the Single-Nucleus RNA-Seq market, accounting for the largest share in 2024, followed by Europe and Asia Pacific. This leadership is attributed to the presence of leading genomics research centers, robust funding infrastructure, and early adoption of cutting-edge sequencing technologies. However, Asia Pacific is anticipated to witness the fastest growth during the forecast period, supported by increasing investments in life sciences research, expanding biotechnology industry, and growing awareness of precision medicine. Europe is also expected to maintain a significant market share due to strong academic research output and collaborative initiatives in genomics and transcriptomics.





    Product Type Analysis



    The Single-Nucleus RNA-Seq market by product type is segmented into

  16. Transcription start site analysis for heterogenous CD4+ T cells using 5′...

    • data.niaid.nih.gov
    • data-staging.niaid.nih.gov
    • +1more
    zip
    Updated Apr 22, 2024
    Cite
    Akiko Oguchi; Yasuhiro Murakawa (2024). Transcription start site analysis for heterogenous CD4+ T cells using 5′ scRNA-seq [Dataset]. http://doi.org/10.5061/dryad.gtht76hv9
    Explore at:
    zipAvailable download formats
    Dataset updated
    Apr 22, 2024
    Dataset provided by
    RIKEN Center for Integrative Medical Sciences
    Authors
    Akiko Oguchi; Yasuhiro Murakawa
    License

    https://spdx.org/licenses/CC0-1.0.html

    Description

    These datasets were generated by ReapTEC (read-level pre-filtering and transcribed enhancer call) from 5′ single-cell RNA-seq data on human heterogenous CD4+ T cells. By taking advantage of a unique “cap signature” derived from the 5′-end of a transcript, ReapTEC simultaneously profiles gene expression and enhancer activity at nucleotide resolution using 5′-end single-cell RNA-sequencing (5′ scRNA-seq). The ReapTEC pipeline is described in detail at https://github.com/MurakawaLab/ReapTEC.

  17. Long-Read RNA Sequencing Services Market Research Report 2033

    • dataintelo.com
    csv, pdf, pptx
    Updated Sep 30, 2025
    Cite
    Dataintelo (2025). Long-Read RNA Sequencing Services Market Research Report 2033 [Dataset]. https://dataintelo.com/report/long-read-rna-sequencing-services-market
    Explore at:
    csv, pptx, pdfAvailable download formats
    Dataset updated
    Sep 30, 2025
    Dataset authored and provided by
    Dataintelo
    License

    https://dataintelo.com/privacy-and-policy

    Time period covered
    2024 - 2032
    Area covered
    Global
    Description

    Long-Read RNA Sequencing Services Market Outlook




    According to our latest research, the global Long-Read RNA Sequencing Services market size reached USD 1.02 billion in 2024, reflecting robust adoption across research and clinical domains. The market is anticipated to grow at a CAGR of 18.7% from 2025 to 2033, with the forecasted market size expected to reach USD 5.40 billion by 2033. This strong upward trajectory is driven by increasing demand for comprehensive transcriptomic profiling, advances in sequencing technologies, and expanding applications in personalized medicine and disease biomarker discovery.




    One of the primary growth factors fueling the Long-Read RNA Sequencing Services market is the increasing need for high-resolution transcriptome analysis. Traditional short-read sequencing technologies often fail to capture the full complexity of RNA molecules, particularly when it comes to isoform diversity, alternative splicing events, and structural variations. Long-read sequencing platforms, such as PacBio SMRT and Oxford Nanopore, provide the ability to sequence full-length RNA transcripts, enabling researchers to achieve more accurate gene expression profiling and uncover novel transcripts. This capability is particularly valuable in oncology, neurobiology, and rare disease research, where understanding transcriptomic heterogeneity is crucial for biomarker identification and therapeutic development. As research institutions and pharmaceutical companies strive to unlock the intricacies of gene regulation, demand for long-read RNA sequencing continues to surge, driving market expansion.




    Technological advancements and reductions in sequencing costs are further accelerating market growth. The evolution of long-read sequencing platforms has led to improved read accuracy, higher throughput, and reduced turnaround times, making these services more accessible to a broader range of end-users. Innovations such as real-time data analysis, cloud-based bioinformatics solutions, and automated library preparation protocols are streamlining workflows, allowing academic and clinical laboratories to efficiently process large sample volumes. Additionally, the integration of artificial intelligence and machine learning in data analysis pipelines is enhancing the interpretation of complex transcriptomic datasets, offering deeper insights into gene function and disease mechanisms. As service providers continue to invest in technology upgrades and expand their service portfolios, the adoption of long-read RNA sequencing is expected to rise significantly across both developed and emerging markets.




    The growing emphasis on precision medicine and translational research is also a major driver for the Long-Read RNA Sequencing Services market. Healthcare providers and pharmaceutical companies are increasingly leveraging transcriptomic data to stratify patient populations, predict disease progression, and tailor therapeutic interventions. Long-read RNA sequencing enables the detection of clinically relevant fusion genes, rare transcript variants, and alternative splicing events that may be missed by conventional methods. This capability supports the development of targeted therapies and companion diagnostics, particularly in oncology and genetic disorders. Furthermore, regulatory agencies are recognizing the value of comprehensive RNA sequencing data in clinical trials, fostering greater adoption of long-read technologies in the pharmaceutical and biotechnology sectors. As a result, the market is poised for sustained growth, with service providers playing a pivotal role in bridging the gap between research discoveries and clinical applications.




    From a regional perspective, North America currently dominates the Long-Read RNA Sequencing Services market, accounting for the largest revenue share in 2024. This leadership is attributed to the presence of leading sequencing technology developers, a strong network of academic and research institutions, and substantial investments in genomics infrastructure. The region is also characterized by a high level of adoption among pharmaceutical and biotechnology companies, driven by robust funding for precision medicine initiatives and clinical research programs. Meanwhile, Europe and Asia Pacific are emerging as significant growth engines, supported by expanding research collaborations, government initiatives, and rising awareness of the benefits of advanced transcriptomic analysis. As global demand for high-resolution RNA sequencing contin

  18. RNAseq RAW DATA of bacterial interactions with avocado roots

    • portaldelainvestigacion.uma.es
    • figshare.com
    Updated 2023
    Cite
    Cazorla, Francisco; Tienda, Sandra (2023). RNAseq RAW DATA of bacterial interactions with avocado roots [Dataset]. https://portaldelainvestigacion.uma.es/documentos/67a9c7cf19544708f8c732b5?lang=eu
    Explore at:
    Dataset updated
    2023
    Authors
    Cazorla, Francisco; Tienda, Sandra
    Description

    RNAseq comparing wt strain PcPCL1606 and the derivative mutant ΔdarB, defective in HPR production. RNA was extracted from the rhizosphere samples using a PowerSoil® RNA extraction kit (Qiagen Iberia S.L., Madrid, Spain) following the manufacturer's instructions, and its amount was quantified using a NanoDrop 2000 spectrophotometer (Thermo Fisher Scientific, Waltham, MA, USA). For the RNAseq experiment, the quantity and quality of RNA were verified by the Genomics and Ultrasequencing Service Unit (University of Malaga), and the RNA was subsequently sequenced using NextSeq550 equipment (Illumina). The raw reads were processed by the Centre for Supercomputing and Bioinnovation (University of Malaga). The bacterial RNAseq data analysis was based on a series of software packages adapted to the experimental model. The software components of the RNAseq analysis pipeline were SeqTrimNext (v.2.0.6) to remove low-quality reads, adapters, organellar DNA and contaminant sequences; BOWTIE (v.2.2.9) to align reads to the genomic reference; Samtools (v. 0.1.19), a package of programs for reading, writing, editing and viewing alignment files in SAM/BAM format (http://www.htslib.org/); and TUXEDO tools (http://cole-trapnell-lab.github.io/cufflinks/manual/), used to assign the aligned RNAseq reads to the different transcripts and estimate their abundance. The abundance of the transcripts was measured in fragments per kilobase of exon per million reads (fpkm). Once the transcripts had been assembled and their fpkm estimated, the transcripts were annotated with the known reference set of genes obtained from the annotated reference file.
    This pipeline is a tool developed by the Andalusian Platform for Bioinformatics (PAB; http://www.scbi.uma.es/site/omics/bioinformatics) for differential expression analysis of RNAseq data on a genomic reference. The subsequent analysis of differential expression, with a method analogous to that used for differentially expressed sequences, and the graphical representation of the expression results were done using the 'cummeRbund' R package (v. 2.42.0). The resulting array of fpkm values is used to obtain a list of differentially expressed genes with a p-value less than 0.05.
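
The fpkm unit and the p-value cut-off mentioned in the description can be illustrated with a small sketch. It uses the textbook definition of fpkm (fragment count divided by transcript length in kilobases and by library size in millions); the TUXEDO tools estimate abundance with a more elaborate statistical model, so this is not the pipeline's actual computation. Function names and all numbers are hypothetical.

```python
def fpkm(fragments, transcript_length_bp, total_mapped_fragments):
    """Fragments per kilobase of transcript per million mapped fragments."""
    kilobases = transcript_length_bp / 1_000
    millions = total_mapped_fragments / 1_000_000
    return fragments / (kilobases * millions)


def significant_genes(p_values, alpha=0.05):
    """Return the gene IDs whose p-value is below alpha, sorted by name."""
    return sorted(gene for gene, p in p_values.items() if p < alpha)


# Invented example: 500 fragments on a 2 kb transcript in a 10-million-fragment library.
print(fpkm(500, 2_000, 10_000_000))

# Invented p-values for three hypothetical genes.
print(significant_genes({"geneA": 0.01, "geneB": 0.20, "geneC": 0.049}))
```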

  19. Data from: Base editing strategies to convert CAG to CAA diminish the...

    • datadryad.org
    • data.niaid.nih.gov
    • +2more
    zip
    Updated May 31, 2024
    Cite
    Jong-Min Lee (2024). Base editing strategies to convert CAG to CAA diminish the disease-causing mutation in Huntington's disease [Dataset]. http://doi.org/10.5061/dryad.k3j9kd5cb
    Explore at:
    zipAvailable download formats
    Dataset updated
    May 31, 2024
    Dataset provided by
    Dryad
    Authors
    Jong-Min Lee
    Time period covered
    May 2, 2024
    Description

    Base editing strategies to convert CAG to CAA in Huntington's disease

    HD.BE.RNAseq.Meta.Data.230116.csv: Sample characteristics and meta-data

    Description of columns in the metadata file

    • Sample: Name of each RNAseq sample
    • Cell: HEK293 cells that were used for RNAseq analysis
    • Group: experimental group, including empty vector-treated controls (n=4), gRNA1-treated samples (n=4), and gRNA2-treated samples (n=4)
    • Replicate: replicate number
    • PC1: principal component 1 value
    • PC2: principal component 2 value
    • PC3: principal component 3 value
    • PC4: principal component 4 value
    • PC5: principal component 5 value
    • PC6: principal component 6 value
    • PC7: principal component 7 value
    • PC8: principal component 8 value
    • PC9: principal component 9 value
    • PC10: principal component 10 value
    • PC11: principal component 11 value
    • PC12: principal component 12 value

    HD.BE.RNAseq.12.Sample.230116.txt: RNAseq expression data

  20. Supporting Information S1 - A Comparative Study of Techniques for...

    • datasetcatalog.nlm.nih.gov
    • plos.figshare.com
    Updated Aug 13, 2014
    Cite
    Lundberg, Andreas E.; Edson, Janette; Bartlett, Perry F.; Narayanan, Ramesh K.; Marshall, Vikki M.; Wray, Naomi R.; Jhaveri, Dhanisha J.; Zhang, Zong Hong; Robinson, Gregory J.; Bauer, Denis C.; Zhao, Qiong-Yi (2014). Supporting Information S1 - A Comparative Study of Techniques for Differential Expression Analysis on RNA-Seq Data [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0001175228
    Explore at:
    Dataset updated
    Aug 13, 2014
    Authors
    Lundberg, Andreas E.; Edson, Janette; Bartlett, Perry F.; Narayanan, Ramesh K.; Marshall, Vikki M.; Wray, Naomi R.; Jhaveri, Dhanisha J.; Zhang, Zong Hong; Robinson, Gregory J.; Bauer, Denis C.; Zhao, Qiong-Yi
    Description

    • Figure S1: Venn diagram showing the number of differentially expressed genes identified by two versions of Cuffdiff2.
    • Figure S2: The effects of biological replicates on the differential expression analysis for Cuffdiff v2.0.2.
    • Figure S3: The detected fold changes of all the differentially expressed genes identified by three tools, compared pairwise: DESeq vs. edgeR (top panel), DESeq vs. Cuffdiff2 (middle panel) and edgeR vs. Cuffdiff2 (bottom panel).
    • File S1: Analysis pipelines, methods and examples of commands for differential expression analysis, subsampling fastq files and generating SAM/BAM files based on simulated count values.
    • File S2: The raw count values for genes with high fold changes that were picked up by edgeR but not by DESeq. Genes with high fold changes (absolute log2 fold change larger than 2) identified as DEGs by edgeR but not by DESeq are listed in the file. The columns show the gene ID, the log2 fold changes (logFC) and FDR from DESeq, the logFC and FDR from edgeR, and the raw count values for the four replicates of sample K (K1–K4) and sample N (N1–N4).
    • Table S1: Numbers of reads for the human hbr and uhr samples from the MAQC dataset.
    • Table S2: Numbers of reads for the mouse neurosphere samples for treatment groups K and N (the K_N dataset).
    • Table S3: The number of reads for each individual sample of the LCL3 dataset.
    • Table S4: The definitions of TP, FP, TN, FN, TPR and FPR.
    • Table S5: The false positive rate for Cuffdiff2, DESeq and edgeR based on the LCL1 dataset.

    (ZIP)
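
For readers unfamiliar with the quantities named in Table S4, the standard definitions of the true positive rate and false positive rate can be sketched as below. These are the conventional formulas, assumed here rather than copied from the supplementary table, and the counts are invented for illustration.

```python
def true_positive_rate(tp, fn):
    """TPR (sensitivity): share of truly differential genes that were called."""
    return tp / (tp + fn)


def false_positive_rate(fp, tn):
    """FPR: share of truly non-differential genes that were wrongly called."""
    return fp / (fp + tn)


# Invented confusion-matrix counts for one hypothetical tool.
tp, fp, tn, fn = 80, 5, 95, 20
print(f"TPR = {true_positive_rate(tp, fn):.2f}")
print(f"FPR = {false_positive_rate(fp, tn):.2f}")
```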
