100+ datasets found
  1. DataSheet_1_Read Mapping and Transcript Assembly: A Scalable and...

    • frontiersin.figshare.com
    pdf
    Updated Jun 1, 2023
    Cite
    Sateesh Peri; Sarah Roberts; Isabella R. Kreko; Lauren B. McHan; Alexandra Naron; Archana Ram; Rebecca L. Murphy; Eric Lyons; Brian D. Gregory; Upendra K. Devisetty; Andrew D. L. Nelson (2023). DataSheet_1_Read Mapping and Transcript Assembly: A Scalable and High-Throughput Workflow for the Processing and Analysis of Ribonucleic Acid Sequencing Data.pdf [Dataset]. http://doi.org/10.3389/fgene.2019.01361.s001
    Available download formats: pdf
    Dataset updated
    Jun 1, 2023
    Dataset provided by
    Frontiers
    Authors
    Sateesh Peri; Sarah Roberts; Isabella R. Kreko; Lauren B. McHan; Alexandra Naron; Archana Ram; Rebecca L. Murphy; Eric Lyons; Brian D. Gregory; Upendra K. Devisetty; Andrew D. L. Nelson
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Next-generation RNA-sequencing is an incredibly powerful means of generating a snapshot of the transcriptomic state within a cell, tissue, or whole organism. As the questions addressed by RNA-sequencing (RNA-seq) become both more complex and greater in number, there is a need to simplify RNA-seq processing workflows and to make them more efficient, interoperable, and capable of handling both large and small datasets. This is especially important for researchers who need to process hundreds to tens of thousands of RNA-seq datasets. To address these needs, we have developed a scalable, user-friendly, and easily deployable analysis suite called RMTA (Read Mapping, Transcript Assembly). RMTA can easily process thousands of RNA-seq datasets with features that include automated read quality analysis, filters for lowly expressed transcripts, and read counting for differential expression analysis. RMTA is containerized using Docker for easy deployment within any compute environment [cloud, local, or high-performance computing (HPC)] and is available as two apps in CyVerse's Discovery Environment, one for normal use and one specifically designed for introducing undergraduates and high school students to RNA-seq analysis. For extremely large datasets (tens of thousands of FASTQ files) we developed a high-throughput, scalable, and parallelized version of RMTA optimized for launching on the Open Science Grid (OSG) from within the Discovery Environment. OSG-RMTA allows users to utilize the Discovery Environment for data management, parallelization, and submitting jobs to OSG, and finally, to employ the OSG for distributed, high-throughput computing. Alternatively, OSG-RMTA can be run directly on the OSG through the command line. RMTA is designed to be useful for data scientists of any skill level who are interested in rapidly and reproducibly analyzing their large RNA-seq datasets.

  2. The output and the log files from RNA-Seq workflow benchmark for CWL-metrics...

    • zenodo.org
    • data.niaid.nih.gov
    • +1more
    application/gzip
    Updated Jan 24, 2020
    Cite
    Tazro Ohta (2020). The output and the log files from RNA-Seq workflow benchmark for CWL-metrics manuscript [Dataset]. http://doi.org/10.5281/zenodo.2586547
    Available download formats: application/gzip
    Dataset updated
    Jan 24, 2020
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Tazro Ohta
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The output files and log files generated by the workflow executions for RNA-Seq workflow benchmark by CWL-metrics, from the manuscript "Accumulating computational resource usage of genomic data analysis workflow to optimize cloud computing instance selection" (https://doi.org/10.1101/456756).

  3. CWL run of RNA-seq Analysis Workflow (CWLProv 0.5.0 Research Object)

    • data.mendeley.com
    • data.niaid.nih.gov
    • +3more
    Updated Dec 4, 2018
    Cite
    Farah Zaib Khan (2018). CWL run of RNA-seq Analysis Workflow (CWLProv 0.5.0 Research Object) [Dataset]. http://doi.org/10.17632/xnwncxpw42.1
    Dataset updated
    Dec 4, 2018
    Authors
    Farah Zaib Khan
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This workflow adapts the approach and parameter settings of Trans-Omics for Precision Medicine (TOPMed). The RNA-seq pipeline originated from the Broad Institute. The workflow has five steps:

    1. Read alignment using STAR which produces aligned BAM files including the Genome BAM and Transcriptome BAM.
    2. The Genome BAM file is processed using Picard MarkDuplicates, producing an updated BAM file containing information on duplicate reads (such reads can bias interpretation).
    3. SAMtools index is then employed to generate an index for the BAM file, in preparation for the next step.
    4. The indexed BAM file is processed further with RNA-SeQC which takes the BAM file, human genome reference sequence and Gene Transfer Format (GTF) file as inputs to generate transcriptome-level expression quantifications and standard quality control metrics.
    5. In parallel with transcript quantification, isoform expression levels are quantified by RSEM. This step depends only on the output of the STAR tool, and additional RSEM reference sequences.
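
    The dependency structure of the five steps above can be sketched as a small directed graph. This is only an illustration of the ordering described in the list, not the CWL definition itself; the step names are shorthand labels chosen here, not identifiers from the workflow.

```python
from graphlib import TopologicalSorter

# Map each step to the steps it depends on, as read off the list above.
# Step names are illustrative labels, not identifiers from the CWL workflow.
dependencies = {
    "star_align": set(),
    "mark_duplicates": {"star_align"},
    "samtools_index": {"mark_duplicates"},
    "rna_seqc": {"samtools_index"},
    "rsem": {"star_align"},  # depends only on the STAR output
}


def execution_order(deps):
    """Return one valid order in which the steps could run."""
    return list(TopologicalSorter(deps).static_order())


order = execution_order(dependencies)
print(order)
```

    Because RSEM depends only on the STAR output, a workflow engine is free to run it alongside the MarkDuplicates, indexing and RNA-SeQC branch.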

    For testing and analysis, the workflow authors provided example data created by down-sampling the read files of a TOPMed public-access dataset. Chromosome 12 was extracted from the Homo sapiens assembly 38 reference sequence and provided by the workflow authors. The required GTF and RSEM reference data files are also provided. The workflow is well documented, and a detailed set of instructions for the steps performed to down-sample the data is also provided for transparency. The availability of example input data, the use of containerization for the underlying software, and detailed documentation were important factors in choosing this specific CWL workflow for CWLProv evaluation.

    This dataset folder is a CWLProv Research Object that captures the Common Workflow Language execution provenance, see https://w3id.org/cwl/prov/0.5.0 or use https://pypi.org/project/cwl

  4. RNAseq data analysis using Galaxy

    • qubeshub.org
    Updated Jul 2, 2021
    Cite
    Matthew Escobar; Sam Donovan; Irina Makarevitch; William (Bill) Morgan; Sabrina Robertson (2021). RNAseq data analysis using Galaxy [Dataset]. http://doi.org/10.25334/XHW8-7189
    Dataset updated
    Jul 2, 2021
    Dataset provided by
    QUBES
    Authors
    Matthew Escobar; Sam Donovan; Irina Makarevitch; William (Bill) Morgan; Sabrina Robertson
    Description

    This is a bioinformatics exercise intended for use in a computer lab setting with life science majors.

  5. DUAL RNA SEQ Human and Bacteria DESeq2

    • kaggle.com
    zip
    Updated Nov 29, 2025
    Cite
    Dr. Nagendra (2025). DUAL RNA SEQ Human and Bacteria DESeq2 [Dataset]. https://www.kaggle.com/datasets/mannekuntanagendra/dual-rna-seq-human-and-bacteria-deseq2
    Available download formats: zip (258793 bytes)
    Dataset updated
    Nov 29, 2025
    Authors
    Dr. Nagendra
    License

    MIT License: https://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    This dataset provides a complete dual RNA-seq workflow integrating human host and bacterial pathogen transcriptomic analysis.

    It is designed to guide researchers through preprocessing, alignment, quantification, and differential expression using DESeq2.

    Dual RNA-seq captures gene expression from both the infected host and the invading pathogen in a single experiment.

    This dataset focuses on demonstrating a reproducible analysis pipeline using RNA-seq count matrices for both human and bacterial genomes.

    The included tutorial explains how to load and structure raw count data for downstream analysis.

    Step-by-step instructions walk through data normalization, exploratory data analysis, and quality checks in DESeq2.

    The workflow highlights differential expression detection for both human and bacterial transcripts.

    Visualizations such as PCA plots, heatmaps, MA plots, and volcano plots are included in the explained workflow.

    The dataset is suitable for beginners learning dual RNA-seq as well as experienced users requiring a reference workflow.

    It demonstrates how simultaneous host–pathogen expression profiling reveals immune responses and bacterial adaptation.

    The resources help users understand how to compare infected versus control samples systematically.

    The dataset is structured to provide clarity, reproducibility, and ease of use for all researchers.

    This dual RNA-seq tutorial shows best practices for handling multi-organism RNA-seq datasets.

    The included R code offers a complete reference for DESeq2-based differential expression analysis.

    Researchers can adapt the workflow to their own host–pathogen datasets and extend it further with custom analyses.

    The content supports education, training, reproducible research, and biological discovery through transcriptomics.
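
    The normalization step mentioned above can be illustrated with a simplified version of median-of-ratios size factor estimation, the approach DESeq2 uses to account for sequencing depth. This is a sketch in plain Python under stated assumptions, not the DESeq2 implementation and not the R code shipped with the dataset; the sample names and counts are hypothetical.

```python
import math
from statistics import median


def size_factors(counts):
    """counts: dict mapping sample name -> list of raw counts (same gene order).

    Returns a dict mapping sample name -> size factor (median of ratios).
    """
    samples = list(counts)
    n_genes = len(counts[samples[0]])

    # Log geometric mean per gene; genes with a zero in any sample are skipped.
    log_geo_means = []
    for g in range(n_genes):
        values = [counts[s][g] for s in samples]
        if all(v > 0 for v in values):
            log_geo_means.append(sum(math.log(v) for v in values) / len(values))
        else:
            log_geo_means.append(None)

    factors = {}
    for s in samples:
        log_ratios = [
            math.log(counts[s][g]) - log_geo_means[g]
            for g in range(n_genes)
            if log_geo_means[g] is not None
        ]
        factors[s] = math.exp(median(log_ratios))
    return factors


# Hypothetical counts: "infected" is sequenced twice as deep as "control".
raw = {"control": [10, 20, 30, 0], "infected": [20, 40, 60, 5]}
sf = size_factors(raw)
normalized = {s: [c / sf[s] for c in raw[s]] for s in raw}
print(sf)
```

    After dividing by the size factors, genes whose counts differ only because of sequencing depth end up with equal normalized values in both samples.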

  6. Supporting data for "Software pipelines for RNA-Seq, ChIP-Seq and Germline...

    • zenodo.org
    • data.niaid.nih.gov
    zip
    Updated Sep 27, 2023
    Cite
    Konstantinos Kyritsis; Nikolaos Pechlivanis; Fotis Psomopoulos (2023). Supporting data for "Software pipelines for RNA-Seq, ChIP-Seq and Germline Variant calling analyses in Common Workflow Language (CWL)" [Dataset]. http://doi.org/10.5281/zenodo.8116556
    Available download formats: zip
    Dataset updated
    Sep 27, 2023
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Konstantinos Kyritsis; Nikolaos Pechlivanis; Fotis Psomopoulos
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Datasets produced during the validation of CWL-based pipelines designed for the analysis of data from RNA-Seq, ChIP-Seq and germline variant calling experiments. Specifically, the workflows were tested using publicly available high-throughput sequencing (HTS) data from published studies on Chronic Lymphocytic Leukemia (CLL) (accession numbers: E-MTAB-6962, GSE115772) and Genome in a Bottle (GIAB) project samples (accession numbers: SRR6794144, SRR22476789, SRR22476790, SRR22476791).

    The supporting data include:

    • Differential transcript and gene expression results produced during the analysis with the CWL-based RNA-Seq pipeline
    • Bigwig and narrowPeak files, differential binding results, table of consensus peaks and read counts of EZH2 and H3K27me3, produced during the analysis with the CWL-based ChIP-Seq pipeline
    • VCF files containing the detected and filtered variants, along with the respective hap.py results regarding comparisons against the GIAB gold standard truth sets for both CWL-based germline variant calling pipelines

  7. Transcription profiling by high throughput sequencing of two subspecies of...

    • omicsdi.org
    xml
    Cite
    Miguel Ramos; Joao Coito; Margarida Rocheta; Helena Silva; Manuela Costa; Jorge Cunha. Transcription profiling by high throughput sequencing of two subspecies of grapevine at four flower developmental stages [Dataset]. https://www.omicsdi.org/dataset/arrayexpress-repository/E-GEOD-56844
    Available download formats: xml
    Authors
    Miguel Ramos; Joao Coito; Margarida Rocheta; Helena Silva; Manuela Costa; Jorge Cunha
    Variables measured
    Transcriptomics, Multiomics
    Description

    Purpose: Next-generation sequencing (NGS) has revolutionized systems-based analysis of cellular pathways. The goal of this study is to compare NGS-derived flower development transcriptome profiles (RNA-seq) of two grapevine subspecies.

    Methods: Flower mRNA profiles of four developmental stages of wild-type (WT) Vitis and of the same stages of Vitis vinifera subsp. vinifera were generated by deep sequencing using Illumina. Initial quality assessment was based on data passing the Illumina Chastity filtering. Subsequently, reads containing adapters and/or PhiX control signal were removed using an in-house filtering protocol. The second quality assessment was based on the remaining reads using the FASTQC quality control tool version 0.10.0. qRT–PCR validation was performed using EvaGreen assays.

    Results: Using an optimized data analysis workflow, we mapped about 13 to 19 million sequence reads per Vitis sample, 50 bp in length, equivalent to 1.5 Gb of total sequence data per sample. The exception was male stage G (M_G), where only 7 to 8 million sequence reads were obtained. Five genes (VvTFL1, VvLFY, VvAP1, VvAP3, VvPI), related to flowering development, were used to validate the RNA-Seq data and to test for data reproducibility through qRT–PCR. The correlation coefficient (r) between the log2 of RPKM (RNA-Seq) and the log2 of mRNA average number (RT-qPCR) varied from ≈ 0.97 (VvTFL1) to ≈ 0.73 (VvPI), indicating a good correlation between both techniques and thus validating our RNA-Seq results.

    Conclusions: Our study represents the first detailed transcriptome analysis of four Vitis flower developmental stages, with the same individual, in three genders, generated by RNA-seq technology. The optimized data analysis workflows reported here should provide a framework for comparative investigations of expression profiles. Our results show that NGS offers a comprehensive and accurate quantitative and qualitative evaluation of mRNA content per developmental stage. We conclude that RNA-seq based transcriptome characterization would expedite genetic network analyses and permit the dissection of complex biologic functions. Flowering mRNA profiles of four developmental stages of Vitis wild type (WT) and the domesticated Vitis were generated by deep sequencing using Illumina HiSeq 2500.
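
    The validation described above compares log2 RPKM values with log2 qPCR measurements through a correlation coefficient. The sketch below shows how such a comparison might be computed, assuming the standard RPKM definition and a Pearson correlation; the read counts, gene length, library size and qPCR values are hypothetical and are not taken from the study.

```python
import math


def rpkm(read_count, gene_length_bp, total_mapped_reads):
    """Reads per kilobase of transcript per million mapped reads."""
    return read_count * 1e9 / (gene_length_bp * total_mapped_reads)


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)


# Hypothetical data for one gene across four developmental stages.
read_counts = (300, 900, 2400, 600)
rpkm_values = [rpkm(c, 1500, 15_000_000) for c in read_counts]
qpcr_values = [2.1, 5.8, 17.0, 4.5]

r = pearson_r(
    [math.log2(v) for v in rpkm_values],
    [math.log2(v) for v in qpcr_values],
)
print(round(r, 3))
```

    Taking log2 of both measurements before correlating keeps a few highly expressed stages from dominating the coefficient.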

  8. Global RNA Sequencing Technologies Market Research Report: By Technology...

    • wiseguyreports.com
    Updated Aug 12, 2025
    Cite
    (2025). Global RNA Sequencing Technologies Market Research Report: By Technology (Microarray, Next-Generation Sequencing, Polymerase Chain Reaction), By Application (Transcription Regulation, Gene Expression Analysis, Clinical Diagnostics), By End Use (Academic Research, Pharmaceutical Companies, Clinical Laboratories), By Workflow (Sample Preparation, Sequencing, Data Analysis) and By Regional (North America, Europe, South America, Asia Pacific, Middle East and Africa) - Forecast to 2035 [Dataset]. https://www.wiseguyreports.com/reports/rna-sequencing-technologies-market
    Dataset updated
    Aug 12, 2025
    License

    https://www.wiseguyreports.com/pages/privacy-policy

    Time period covered
    Aug 1, 2025
    Area covered
    Global
    Description
    Base year: 2024
    Historical data: 2019 - 2023
    Regions covered: North America, Europe, APAC, South America, MEA
    Report coverage: Revenue Forecast, Competitive Landscape, Growth Factors, and Trends
    Market size 2024: 5.8 (USD Billion)
    Market size 2025: 6.2 (USD Billion)
    Market size 2035: 12.0 (USD Billion)
    Segments covered: Technology, Application, End Use, Workflow, Regional
    Countries covered: US, Canada, Germany, UK, France, Russia, Italy, Spain, Rest of Europe, China, India, Japan, South Korea, Malaysia, Thailand, Indonesia, Rest of APAC, Brazil, Mexico, Argentina, Rest of South America, GCC, South Africa, Rest of MEA
    Key market dynamics: Technological advancements, Increasing research funding, Growing applications in genomics, Rising prevalence of diseases, Demand for personalized medicine
    Market forecast units: USD Billion
    Key companies profiled: Qiagen, Geneious, Oxford Nanopore Technologies, Novogene, Roche, Macrogen, PerkinElmer, Thermo Fisher Scientific, Zymo Research, BGI Group, Illumina, Pacific Biosciences, Genomatix Software, Agilent Technologies, 10x Genomics
    Market forecast period: 2025 - 2035
    Key market opportunities: Increased demand for personalized medicine, Advances in therapeutic RNA applications, Growth in genomic research funding, Expanding use in oncology diagnostics, Integration with AI and data analytics
    Compound annual growth rate (CAGR): 6.8% (2025 - 2035)

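
    The reported growth rate can be checked against the listed market sizes with the usual compound annual growth rate formula. This is a small worked example using only the 2025 and 2035 figures from the record above; the function name is chosen here for illustration.

```python
def cagr(start_value, end_value, years):
    """Compound annual growth rate between two values."""
    return (end_value / start_value) ** (1 / years) - 1


# Market size of 6.2 (2025) growing to 12.0 (2035), in USD billion.
rate = cagr(6.2, 12.0, 2035 - 2025)
print(f"{rate:.1%}")  # prints 6.8%
```

    The result agrees with the 6.8% stated for the 2025 - 2035 forecast period.
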
  9. Additional file 5 of A comprehensive workflow for optimizing RNA-seq data...

    • springernature.figshare.com
    txt
    Updated Aug 18, 2024
    Cite
    Gao Jiang; Juan-Yu Zheng; Shu-Ning Ren; Weilun Yin; Xinli Xia; Yun Li; Hou-Ling Wang (2024). Additional file 5 of A comprehensive workflow for optimizing RNA-seq data analysis [Dataset]. http://doi.org/10.6084/m9.figshare.26738397.v1
    Available download formats: txt
    Dataset updated
    Aug 18, 2024
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Gao Jiang; Juan-Yu Zheng; Shu-Ning Ren; Weilun Yin; Xinli Xia; Yun Li; Hou-Ling Wang
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Supplementary Material 5.

  10. RNA Sequencing (KYSE450SOX2KO vs KYSE450WT)

    • scidb.cn
    Updated Jun 6, 2025
    Cite
    Li Xinxin; Liu Kuancan (2025). RNA Sequencing (KYSE450SOX2KO vs KYSE450WT) [Dataset]. http://doi.org/10.57760/sciencedb.26117
    Dataset updated
    Jun 6, 2025
    Dataset provided by
    Science Data Bank
    Authors
    Li Xinxin; Liu Kuancan
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    The data uploaded herein represents the transcriptomic sequencing results from the human esophageal squamous cell carcinoma cell line KYSE450, specifically comparing SOX2-knockout cells (KYSE450SOX2KO) with wild-type controls (KYSE450WT). The data generation process began with the separate cultivation of KYSE450SOX2KO and KYSE450WT cells until they reached a specific growth state. Total RNA was then extracted using commercially available RNA extraction kits. Following extraction, the quality of the isolated total RNA was assessed to ensure it met the requirements for library construction. Qualified RNA samples were used to construct sequencing libraries, involving steps such as RNA fragmentation, cDNA synthesis, end repair, addition of A-tails, and adapter ligation. Finally, high-throughput sequencing was performed using the Illumina platform, generating a large volume of short-read sequence data.

    The data processing and analysis workflow started with quality control of the raw sequencing data (in .fastq format). Tools such as Fastp were employed to remove low-quality bases, adapter sequences, and ambiguous reads, thereby obtaining high-quality clean reads. Subsequently, these clean reads were aligned to the human reference genome using alignment software. After alignment, tools like featureCounts or HTSeq-count were used to count the number of reads mapping to each gene or transcript, thereby quantifying gene expression levels and generating a gene expression matrix. This matrix records the read counts (Read Count) for each gene within each sample and was ultimately organized and saved in .xls format. The uploaded data files primarily consist of this gene expression matrix, encompassing expression data from two samples (KYSE450SOX2KO and KYSE450WT). The data covers changes in gene expression levels across the entire genome. Temporal and spatial resolution are not applicable, as this is an in vitro cell line study.

    This dataset provides a foundation for studying the regulatory role of the SOX2 gene in esophageal squamous cell carcinoma and facilitates subsequent analyses such as differential expression analysis and pathway enrichment.
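
    As a rough illustration of how a two-sample read count matrix like the one described above might be explored, the sketch below scales counts to counts per million (CPM) and computes a log2 fold change per gene. It is a minimal example under stated assumptions: the counts, the placeholder gene names GENE_A and GENE_B, and the pseudocount of 1 are hypothetical. With one sample per condition no statistical test is possible, so the values are descriptive only.

```python
import math


def cpm(counts):
    """Scale raw read counts to counts per million for one sample."""
    total = sum(counts.values())
    return {gene: c * 1e6 / total for gene, c in counts.items()}


def log2_fold_change(knockout, wild_type, pseudocount=1.0):
    """log2(KO / WT) on CPM values; the pseudocount avoids division by zero."""
    ko_cpm, wt_cpm = cpm(knockout), cpm(wild_type)
    return {
        gene: math.log2((ko_cpm[gene] + pseudocount) / (wt_cpm[gene] + pseudocount))
        for gene in wild_type
    }


# Hypothetical read counts for three genes in each sample.
wt_counts = {"SOX2": 500, "GENE_A": 1000, "GENE_B": 8500}
ko_counts = {"SOX2": 0, "GENE_A": 1500, "GENE_B": 8500}

lfc = log2_fold_change(ko_counts, wt_counts)
for gene, value in lfc.items():
    print(gene, round(value, 2))
```

    A strongly negative value for the knocked-out gene and values near zero for unaffected genes are what one would expect to see in such a comparison.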

  11. RNA Sequencing (KYSE450SLC8A1KO vs KYSE450WT)

    • scidb.cn
    Updated Jun 6, 2025
    Cite
    Li Xinxin; Liu Kuancan (2025). RNA Sequencing (KYSE450SLC8A1KO vs KYSE450WT) [Dataset]. http://doi.org/10.57760/sciencedb.26122
    Dataset updated
    Jun 6, 2025
    Dataset provided by
    Science Data Bank
    Authors
    Li Xinxin; Liu Kuancan
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    The data uploaded herein represents the transcriptomic sequencing results from the human esophageal squamous cell carcinoma cell line KYSE450, specifically comparing SLC8A1-knockout cells (KYSE450SLC8A1KO) with wild-type controls (KYSE450WT). The data generation process began with the separate cultivation of KYSE450SLC8A1KO and KYSE450WT cells until they reached a specific growth state. Qualified RNA samples were used to construct sequencing libraries, involving steps such as RNA fragmentation, cDNA synthesis, end repair, addition of A-tails, and adapter ligation. Finally, high-throughput sequencing was performed using the Illumina platform, generating a large volume of short-read sequence data.

    The data processing and analysis workflow started with quality control of the raw sequencing data (in .fastq format). Tools such as Fastp were employed to remove low-quality bases, adapter sequences, and ambiguous reads, thereby obtaining high-quality clean reads. Subsequently, these clean reads were aligned to the human reference genome using alignment software. After alignment, tools like featureCounts or HTSeq-count were used to count the number of reads mapping to each gene or transcript, thereby quantifying gene expression levels and generating a gene expression matrix. This matrix records the read counts (Read Count) for each gene within each sample and was ultimately organized and saved in .xls format. The uploaded data files primarily consist of this gene expression matrix, encompassing expression data from two samples (KYSE450SLC8A1KO and KYSE450WT). The data covers changes in gene expression levels across the entire genome. Temporal and spatial resolution are not applicable, as this is an in vitro cell line study.

    This dataset provides a foundation for studying the regulatory role of the SLC8A1 gene in esophageal squamous cell carcinoma and facilitates subsequent analyses such as differential expression analysis and pathway enrichment.

  12. Results of "Curare and GenExVis: A versatile toolkit for analyzing and...

    • data.niaid.nih.gov
    Updated Apr 12, 2024
    Cite
    Blumenkamp, Patrick; Pfister, Max; Diedrich, Sonja; Brinkrolf, Karina; Jaenicke, Sebastian; Goesmann, Alexander (2024). Results of "Curare and GenExVis: A versatile toolkit for analyzing and visualizing RNA-Seq data" [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_10362479
    Dataset updated
    Apr 12, 2024
    Dataset provided by
    University of Giessen
    Authors
    Blumenkamp, Patrick; Pfister, Max; Diedrich, Sonja; Brinkrolf, Karina; Jaenicke, Sebastian; Goesmann, Alexander
    Description

    Even though high-throughput transcriptome sequencing is routinely performed in many laboratories, computational analysis of such data remains a cumbersome process often executed manually, hence error-prone and lacking reproducibility. For corresponding data processing, we introduce Curare, an easy-to-use yet versatile workflow builder for analyzing high-throughput RNA-Seq data focusing on differential gene expression experiments. Data analysis with Curare is customizable and subdivided into preprocessing, quality control, mapping, and downstream analysis stages, providing multiple options for each step while ensuring the reproducibility of the workflow. For a fast and straightforward exploration and visualization of differential gene expression results, we provide the gene expression visualizer software GenExVis. GenExVis can create various charts and tables from simple gene expression tables and DESeq2 results without the requirement to upload data or install software packages.

  13. RNA_Seq_Data_Preprocessing_DGE analysis

    • kaggle.com
    zip
    Updated Nov 29, 2025
    Cite
    Dr. Nagendra (2025). RNA_Seq_Data_Preprocessing_DGE analysis [Dataset]. https://www.kaggle.com/datasets/mannekuntanagendra/rna-seq-data-preprocessing-dge-analysis
    Available download formats: zip (75256 bytes)
    Dataset updated
    Nov 29, 2025
    Authors
    Dr. Nagendra
    License

    MIT License: https://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    This dataset contains RNA-Seq data preprocessing and differential gene expression (DGE) analysis.

    It is designed for researchers, bioinformaticians, and students interested in transcriptomics.

    The dataset includes raw count data and step-by-step preprocessing instructions.

    It demonstrates quality control, normalization, and filtering of RNA-Seq data.

    Differential expression analysis using popular tools and methods is included.

    Results include differentially expressed genes with statistical significance.

    It provides visualizations like PCA plots, heatmaps, and volcano plots.

    The dataset is suitable for learning and reproducing RNA-Seq workflows.

    Both human-readable explanations and code snippets are included for guidance.

    It can serve as a reference for new RNA-Seq projects and research pipelines.

  14. Data from: RaScALL: Rapid (Ra) screening (Sc) of RNA-seq data for...

    • ega-archive.org
    Cite
    RaScALL: Rapid (Ra) screening (Sc) of RNA-seq data for prognostically significant genomic alterations in acute lymphoblastic leukaemia (ALL) [Dataset]. https://ega-archive.org/datasets/EGAD00001009087
    License

    https://ega-archive.org/dacs/EGAC00001002790

    Description

    RNA-sequencing (RNA-seq) efforts in acute lymphoblastic leukaemia (ALL) have identified numerous prognostically significant genomic alterations which can guide diagnostic risk stratification and treatment choices when detected early. However, a full RNA-seq bioinformatics workflow is time-consuming and costly in a clinical setting where rapid detection and accurate reporting of clinically relevant alterations are essential. To accelerate the identification of ALL-specific alterations (including gene fusions, single nucleotide variants and focal gene deletions), we developed the rapid screening tool RaScALL, capable of identifying more than 100 prognostically significant lesions directly from raw sequencing reads. RaScALL uses the k-mer based targeted detection tool km and known ALL variant information to achieve a high degree of accuracy for reporting subtype-defining genomic alterations compared to standard alignment-based pipelines. Gene fusions, including difficult-to-detect fusions involving EPOR and DUX4, were accurately identified in 98% (164 samples) of reported cases in a 180-patient Australian study cohort and 95% (n=63) of samples in a North American validation cohort. Pathogenic sequence variants were correctly identified in 75% of tested samples, including all cases involving subtype-defining variants PAX5 p.P80R (n=12) and IKZF1 p.N159Y (n=4). Intragenic IKZF1 deletions resulting in aberrant transcript isoforms were also detected with 98% accuracy. Importantly, the median analysis time for detection of all targeted alterations was 22 minutes per sample, significantly shorter than standard alignment-based approaches, ensuring accelerated risk stratification and therapeutic triage.
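
    The general idea of screening raw reads for a known target sequence through shared k-mers, without aligning to a genome, can be shown with a toy example. This is a minimal sketch and not the algorithm used by km or RaScALL; the sequences, the function names and the choice of k = 6 are hypothetical.

```python
def kmers(sequence, k):
    """Return the set of all substrings of length k in the sequence."""
    return {sequence[i:i + k] for i in range(len(sequence) - k + 1)}


def supporting_reads(reads, target, k):
    """Return the reads that share at least one k-mer with the target."""
    target_kmers = kmers(target, k)
    return [read for read in reads if kmers(read, k) & target_kmers]


# Hypothetical target (e.g. a short sequence spanning a fusion junction)
# and three hypothetical reads.
target = "ACGTTGCA"
reads = ["TTACGTTGCAAG", "GGGGCCCCAAAA", "CGTTGCATT"]

hits = supporting_reads(reads, target, k=6)
print(len(hits), "of", len(reads), "reads share a k-mer with the target")
```

    Because each read is only compared against a small set of target k-mers, a screen of this kind avoids a full alignment step, which is the kind of saving the description above attributes to k-mer based detection.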

  15. Ngs-Based Rna-Seq Market Analysis North America, Europe, Asia, Rest of World...

    • technavio.com
    pdf
    Updated Aug 15, 2024
    Cite
    Technavio (2024). Ngs-Based Rna-Seq Market Analysis North America, Europe, Asia, Rest of World (ROW) - US, UK, Germany, Singapore, China - Size and Forecast 2024-2028 [Dataset]. https://www.technavio.com/report/ngs-based-rna-seq-market-analysis
    Available download formats: pdf
    Dataset updated
    Aug 15, 2024
    Dataset provided by
    TechNavio
    Authors
    Technavio
    License

    https://www.technavio.com/content/privacy-notice

    Time period covered
    2024 - 2028
    Area covered
    United Kingdom, United States
    Description

    NGS-Based RNA-Seq Market Size 2024-2028

    The NGS-based RNA-seq market size is forecast to increase by USD 6.66 billion, at a CAGR of 20.52% between 2023 and 2028.

    The market is witnessing significant growth, driven by the increased adoption of next-generation sequencing (NGS) methods for RNA-Seq analysis. The advanced capabilities of NGS techniques, such as high-throughput, cost-effectiveness, and improved accuracy, have made them the preferred choice for researchers and clinicians in various fields, including genomics, transcriptomics, and personalized medicine. However, the market faces challenges, primarily from the lack of clinical validation on direct-to-consumer genetic tests. As the use of NGS technology in consumer applications expands, ensuring the accuracy and reliability of results becomes crucial.
    The absence of standardized protocols and regulatory oversight in this area poses a significant challenge to market growth and trust. Companies seeking to capitalize on market opportunities must focus on addressing these challenges through collaborations, partnerships, and investments in research and development to ensure the clinical validity and reliability of their NGS-based RNA-Seq offerings.
    

    What will be the Size of the NGS-based RNA-Seq market during the forecast period?

    Explore in-depth regional segment analysis with market size data - historical 2018-2022 and forecasts 2024-2028 - in the full report.

    The market continues to evolve, driven by advancements in NGS technology and its applications across various sectors. Spatial transcriptomics, a novel approach to studying gene expression in its spatial context, is gaining traction in disease research and precision medicine. Splice junction detection, a critical component of RNA-seq data analysis, enhances the accuracy of gene expression profiling and differential gene expression studies. Cloud computing plays a pivotal role in handling the massive amounts of data generated by NGS platforms, enabling real-time data analysis and storage. Enrichment analysis, gene ontology, and pathway analysis facilitate the interpretation of RNA-seq data, while data normalization and quality control ensure the reliability of results.

    Precision medicine and personalized therapy are key applications of RNA-seq, with single-cell RNA-seq offering unprecedented insights into the complexities of gene expression at the single-cell level. Read alignment and variant calling are essential steps in RNA-seq data analysis, while bioinformatics pipelines and RNA-seq software streamline the process. NGS technology is revolutionizing drug discovery by enabling the identification of biomarkers and gene fusion detection in various diseases, including cancer and neurological disorders. RNA-seq is also finding applications in infectious diseases, microbiome analysis, environmental monitoring, agricultural genomics, and forensic science. Sequencing costs are decreasing, making RNA-seq more accessible to researchers and clinicians.

    The ongoing development of sequencing platforms, library preparation, and sample preparation kits continues to drive innovation in the field. The dynamic nature of the market ensures that it remains a vibrant and evolving field, with ongoing research and development in areas such as data visualization, clinical trials, and sequencing depth.

    How is this NGS-based RNA-Seq industry segmented?

    The NGS-based RNA-seq industry research report provides comprehensive data (region-wise segment analysis), with forecasts and estimates in 'USD million' for the period 2024-2028, as well as historical data from 2018-2022 for the following segments.

    End-user
      Academic and research centers
      Clinical research
      Pharma companies
      Hospitals

    Technology
      Sequencing by synthesis
      Ion semiconductor sequencing
      Single-molecule real-time sequencing
      Others

    Geography
      North America
        US
      Europe
        Germany
        UK
      APAC
        China
        Singapore
      Rest of World (ROW)

    By End-user Insights

    The academic and research centers segment is estimated to witness significant growth during the forecast period.

    The global next-generation sequencing (NGS) market for RNA sequencing (RNA-Seq) is primarily driven by academic and research institutions, including those from universities, research institutes, government entities, biotechnology organizations, and pharmaceutical companies. These institutions utilize NGS technology for various research applications, such as whole-genome sequencing, epigenetics, and emerging fields like agrigenomics and animal research, to enhance crop yield and nutritional composition. NGS-based RNA-Seq plays a pivotal role in translational research, with significant investments from both private and public organizations fueling its growth. The technology is instrumental in disease research, enabling the identification of nov

  16. Single-Cell RNA Sequencing Market Research Report 2033

    • growthmarketreports.com
    csv, pdf, pptx
    Updated Sep 1, 2025
    Growth Market Reports (2025). Single-Cell RNA Sequencing Market Research Report 2033 [Dataset]. https://growthmarketreports.com/report/single-cell-rna-sequencing-market
    Explore at:
    pptx, csv, pdfAvailable download formats
    Dataset updated
    Sep 1, 2025
    Dataset authored and provided by
    Growth Market Reports
    Time period covered
    2024 - 2032
    Area covered
    Global
    Description

    Single-Cell RNA Sequencing Market Outlook



    According to our latest research, the global single-cell RNA sequencing market size reached USD 2.15 billion in 2024, reflecting robust expansion driven by technological advancements and increasing research investments. The market is forecasted to surge to USD 8.74 billion by 2033, registering a compelling CAGR of 16.9% over the forecast period. This rapid growth is primarily fueled by the escalating demand for high-resolution cellular analysis in biomedical research and clinical diagnostics, as well as the continuous evolution of sequencing technologies and bioinformatics tools.
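    The growth rate quoted above can be checked against the two market-size figures with the standard compound annual growth rate formula, CAGR = (end / start)^(1 / years) - 1. The sketch below is only an illustration: the function names are hypothetical, and it assumes nine compounding years (2024 to 2033), which the description does not state explicitly.

```python
def cagr(start_value: float, end_value: float, years: int) -> float:
    """Compound annual growth rate as a fraction (0.10 means 10% per year)."""
    if start_value <= 0 or years <= 0:
        raise ValueError("start_value and years must be positive")
    return (end_value / start_value) ** (1 / years) - 1


def project(start_value: float, rate: float, years: int) -> float:
    """Value reached after compounding `rate` for `years` years."""
    return start_value * (1 + rate) ** years


# Figures quoted in the description, in USD billions.
START_2024 = 2.15
END_2033 = 8.74
YEARS = 2033 - 2024  # assumption: nine compounding periods

rate = cagr(START_2024, END_2033, YEARS)
print(f"Implied CAGR: {rate:.1%}")
print(f"Projected 2033 size at 16.9%: USD {project(START_2024, 0.169, YEARS):.2f} billion")
```

    Under that nine-year assumption the implied rate comes out close to the quoted 16.9%, so the three numbers in the description are mutually consistent.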



    A major growth driver in the single-cell RNA sequencing market is the rising focus on precision medicine and personalized therapies. Single-cell RNA sequencing (scRNA-seq) enables researchers to dissect cellular heterogeneity at an unprecedented resolution, which is crucial for understanding complex biological systems and disease mechanisms. The ability to analyze gene expression patterns at the single-cell level is revolutionizing cancer research, immunology, and neurology, allowing for the identification of rare cell populations and novel therapeutic targets. As healthcare systems and pharmaceutical companies increasingly prioritize tailored treatment approaches, the demand for advanced single-cell analysis tools continues to surge, propelling market expansion.



    Another significant factor contributing to the growth of the single-cell RNA sequencing market is the ongoing technological innovation in sequencing platforms, microfluidics, and computational biology. The introduction of high-throughput, cost-effective, and user-friendly sequencing instruments has democratized access to single-cell analysis across academic, clinical, and industrial settings. Moreover, the integration of automation and artificial intelligence in data analysis workflows has substantially reduced turnaround times and minimized technical errors, making single-cell RNA sequencing more accessible and reliable. These advancements have not only broadened the application landscape but have also stimulated collaborations among research institutions, biotechnology firms, and healthcare providers.



    Government and private sector investments in life sciences research are also playing a pivotal role in accelerating the adoption of single-cell RNA sequencing technologies. Numerous national and international initiatives are funding large-scale single-cell projects aimed at mapping cellular diversity in various tissues and diseases. The proliferation of consortia such as the Human Cell Atlas and substantial grants for single-cell genomics research are fostering innovation and infrastructure development. Additionally, the growing awareness of the clinical utility of scRNA-seq in diagnostics and drug discovery is prompting hospitals, clinics, and pharmaceutical companies to incorporate these technologies into their research and development pipelines, further fueling market growth.



    From a regional perspective, North America continues to dominate the single-cell RNA sequencing market, accounting for the largest revenue share in 2024, followed by Europe and Asia Pacific. The region's leadership is attributed to the presence of advanced healthcare infrastructure, a strong base of academic and research institutions, and significant investments in genomics and personalized medicine. Meanwhile, Asia Pacific is emerging as a high-growth market, driven by increasing government funding, expanding biotechnology sectors, and rising awareness of advanced molecular diagnostics. As countries in this region enhance their research capabilities and invest in next-generation sequencing technologies, the global landscape of the single-cell RNA sequencing market is expected to witness further diversification and expansion.



    In recent years, the advent of Single-Cell Long-Read Sequencing has further revolutionized the field of genomics by providing a more comprehensive view of the genome at the single-cell level. This technology allows for the sequencing of longer DNA fragments, which significantly improves the accuracy of genome assemblies and the detection of structural variants. Unlike traditional short-read sequencing, Single-Cell Long-Read Sequencing can capture complex genomic regions and repetitive sequences, offering deeper insights into genetic diversity and cellular function. As res

  17. DESeq2 DGE Analysis Pasilla RNA-Seq Dataset

    • kaggle.com
    zip
    Updated Nov 29, 2025
    Dr. Nagendra (2025). DESeq2 DGE Analysis Pasilla RNA-Seq Dataset [Dataset]. https://www.kaggle.com/datasets/mannekuntanagendra/deseq2-dge-analysis-pasilla-rna-seq-dataset
    Explore at:
    zip(43449 bytes)Available download formats
    Dataset updated
    Nov 29, 2025
    Authors
    Dr. Nagendra
    License

    MIT License: https://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    This dataset contains RNA-Seq differential gene expression (DGE) analysis data.

    It is derived from the Pasilla fruit fly dataset.

    The data is processed using DESeq2, a widely-used tool for DGE analysis in R.

    It includes gene counts, normalized counts, and statistical test results.

    Users can explore differentially expressed genes between experimental conditions.

    The dataset is suitable for transcriptomics, bioinformatics, and genomics research.

    It can be used for benchmarking DGE analysis pipelines.

    The dataset provides reproducible examples for learning DESeq2 workflows.

    The source data is publicly available from the original Pasilla RNA-Seq study.

    The dataset can be used to visualize and interpret RNA-Seq results in R.

    It is ideal for researchers, students, and data scientists interested in genomics.

    The dataset helps understand gene expression changes under experimental conditions.
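    As a rough illustration of how the statistical test results mentioned above might be explored, the sketch below filters a DESeq2-style results table for differentially expressed genes. The column names `baseMean`, `log2FoldChange`, and `padj` follow DESeq2's standard results output; the `gene_id` column, the toy rows, and the cutoffs (adjusted p-value below 0.05, absolute log2 fold change of at least 1) are assumptions for demonstration and are not taken from the dataset's files.

```python
import csv
import io

# Hypothetical toy rows for illustration only; these are NOT values from the dataset.
TOY_RESULTS = """gene_id,baseMean,log2FoldChange,padj
geneA,1200.5,2.10,0.0004
geneB,85.2,-1.60,0.0300
geneC,430.0,0.20,0.8100
geneD,15.7,3.00,NA
"""


def significant_genes(handle, padj_cutoff=0.05, lfc_cutoff=1.0):
    """Return (gene_id, log2FC, padj) tuples passing both cutoffs, sorted by padj."""
    hits = []
    for row in csv.DictReader(handle):
        # DESeq2 can report NA for padj (e.g. after independent filtering); skip those rows.
        if row["padj"] in ("", "NA"):
            continue
        padj = float(row["padj"])
        lfc = float(row["log2FoldChange"])
        if padj < padj_cutoff and abs(lfc) >= lfc_cutoff:
            hits.append((row["gene_id"], lfc, padj))
    return sorted(hits, key=lambda hit: hit[2])


hits = significant_genes(io.StringIO(TOY_RESULTS))
for gene_id, lfc, padj in hits:
    direction = "up" if lfc > 0 else "down"
    print(f"{gene_id}\t{direction}\tlog2FC={lfc:+.2f}\tpadj={padj:.4f}")
```

    To use it on a real export, replace the `io.StringIO` handle with an opened CSV file and adjust the identifier column name to whatever the file actually uses.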

  18. RNA Sequencing Market Research Report 2033

    • growthmarketreports.com
    csv, pdf, pptx
    Updated Aug 23, 2025
    Growth Market Reports (2025). RNA Sequencing Market Research Report 2033 [Dataset]. https://growthmarketreports.com/report/rna-sequencing-market
    Explore at:
    pptx, csv, pdfAvailable download formats
    Dataset updated
    Aug 23, 2025
    Dataset authored and provided by
    Growth Market Reports
    Time period covered
    2024 - 2032
    Area covered
    Global
    Description

    RNA Sequencing Market Outlook



    According to our latest research and industry analysis, the global RNA Sequencing (RNA-Seq) market size in 2024 stands at USD 3.2 billion, driven by the surging demand for advanced genomics solutions in biomedical research and clinical diagnostics. The market is experiencing a robust growth trajectory with a CAGR of 17.6% from 2025 to 2033, projecting the market size to reach USD 11.1 billion by 2033. This rapid expansion is primarily fueled by the escalating adoption of next-generation sequencing (NGS) technologies, increased focus on precision medicine, and the growing prevalence of complex diseases that require comprehensive transcriptomic profiling.




    One of the primary growth factors propelling the RNA Sequencing market is the increasing application of RNA-Seq in the discovery and development of novel therapeutics, particularly in oncology, neurology, and rare genetic disorders. The ability of RNA-Seq to deliver high-throughput, unbiased, and quantitative analysis of transcriptomes has revolutionized the way researchers understand gene expression, alternative splicing, and transcript variants. This has facilitated more accurate biomarker identification, drug target validation, and patient stratification, leading to enhanced personalized medicine approaches. Moreover, the integration of artificial intelligence and machine learning with RNA-Seq data analytics is further accelerating the extraction of actionable insights, thereby amplifying the utility and value proposition of RNA sequencing in both research and clinical settings.




    Another significant growth driver is the continuous technological advancements in sequencing platforms and library preparation protocols, which have substantially improved the accuracy, speed, and cost-effectiveness of RNA-Seq workflows. Innovations such as single-cell RNA sequencing, long-read sequencing, and nanopore-based technologies are enabling researchers to unravel cellular heterogeneity and complex transcriptomic landscapes with unprecedented resolution. Additionally, the decreasing cost of sequencing and the proliferation of user-friendly bioinformatics tools have democratized access to RNA-Seq, empowering academic institutions, hospitals, and even smaller biotech firms to leverage these powerful tools for a wide array of applications, from basic research to translational and clinical studies.




    A third pivotal factor contributing to the market's expansion is the rising investment from both public and private sectors in genomics research and precision healthcare infrastructure. Governments across North America, Europe, and Asia Pacific are launching large-scale genomics initiatives, funding biobanks, and fostering collaborations between academia, industry, and healthcare providers. These efforts are not only expanding the installed base of sequencing instruments but are also driving demand for consumables and sequencing services. Furthermore, the COVID-19 pandemic underscored the critical role of RNA sequencing in pathogen surveillance and vaccine development, which has further entrenched RNA-Seq as an indispensable tool in modern life sciences.




    From a regional standpoint, North America currently dominates the RNA Sequencing market, accounting for the largest share in 2024, owing to its advanced healthcare infrastructure, high R&D expenditure, and presence of leading genomics companies. Europe follows closely, driven by strong government support and a vibrant biotech ecosystem. The Asia Pacific region is emerging as a high-growth market, fueled by increasing investments in genomics, rising healthcare awareness, and expanding research capabilities in countries like China, Japan, and India. Latin America and the Middle East & Africa are gradually catching up, supported by growing collaborations and capacity-building initiatives. The global landscape is thus characterized by dynamic regional trends, evolving regulatory frameworks, and a rapidly expanding user base for RNA sequencing technologies.





    Product & Service Analysis



  19. Optimized Analytical Workflow for Single-Nucleus Transcriptomics in Main...

    • zenodo.org
    Updated Nov 19, 2024
    Pengwei Dong; Shitong Ding; Guanlin Wang (2024). Optimized Analytical Workflow for Single-Nucleus Transcriptomics in Main Metabolic Tissues [Dataset]. http://doi.org/10.5281/zenodo.14172280
    Explore at:
    Dataset updated
    Nov 19, 2024
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Pengwei Dong; Shitong Ding; Guanlin Wang
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    Nov 16, 2024
    Description

    Single-nucleus RNA sequencing (snRNA-seq) has emerged as a powerful approach for studying cellular heterogeneity in metabolic tissues. However, snRNA-seq analysis remains challenging due to low gene expression and data complexity. Here, we introduce an optimized analytical workflow for snRNA-seq data from 67 samples across four main metabolic tissues: white adipose tissue, hypothalamus, muscle, and liver. We emphasize the importance of key steps, including ambient RNA removal, doublet identification, normalization, and data integration, to ensure accurate downstream analysis. This workflow offers a valuable resource for researchers in metabolism, facilitating deeper insights into cellular diversity and metabolic function through rigorous snRNA-seq analysis.

  20. Data from: RNA sequence reveals mouse retinal transcriptome changes early...

    • ebi.ac.uk
    Updated Feb 20, 2014
    Toru Nakazawa; Yuji Tanaka; Masayuki Yasuda (2014). RNA sequence reveals mouse retinal transcriptome changes early after axonal injury [Dataset]. https://www.ebi.ac.uk/biostudies/studies/E-GEOD-55228
    Explore at:
    Dataset updated
    Feb 20, 2014
    Authors
    Toru Nakazawa; Yuji Tanaka; Masayuki Yasuda
    Description

    Purpose: The purpose of this study was to use RNA-seq to investigate the molecular mechanisms of damage in the early stages of the response to axonal injury, before the onset of RGC death. Methods: 12-week-old wild-type (WT) mice were used in this study. The experimental group underwent an optic nerve crush (ONC) procedure to induce axonal injury in the right eye, and the control group underwent a sham procedure. Retinal mRNA profiles were generated by deep sequencing, in triplicate, using the Illumina HiSeq 2000. The sequence reads were analyzed with CLC Genomics Workbench and R software. qRT–PCR validation was performed using TaqMan assays. Results: Using an optimized data analysis workflow, we mapped about 66 million sequence reads per sample to the mouse genome (build mm9). Differential gene expression analysis showed that endoplasmic reticulum stress-related genes and antioxidative response-related genes were significantly upregulated 2 days after ONC. Conclusions: Our study represents the first detailed analysis of retinal transcriptomes in the early stages after axonal injury. Our results indicate that ER stress plays a key role under these conditions. Furthermore, the antioxidative defense and immune responses occurred concurrently in the early stages after axonal injury. We believe that our study will lead to a better understanding of and insight into the molecular mechanisms underlying RGC death after axonal injury. Retinal mRNA profiles of 12-week-old wild-type (WT) mice after ONC or sham were generated by deep sequencing, in triplicate, using the Illumina HiSeq 2000.

