100+ datasets found
  1. Test dataset from: GenErode: a bioinformatics pipeline to investigate genome...

    • figshare.scilifelab.se
    • datasetcatalog.nlm.nih.gov
    • +3 more
    application/x-gzip
    Updated Jan 15, 2025
    Cite
    Verena Kutschera; Marcin Kierczak; Tom van der Valk; Johanna von Seth; Nicolas Dussex; Edana Lord; Marianne Dehasque; David W. G. Stanton; Payam Emami Khoonsari; Björn Nystedt; Love Dalén; David Díez del molino (2025). Test dataset from: GenErode: a bioinformatics pipeline to investigate genome erosion in endangered and extinct species [Dataset]. http://doi.org/10.17044/scilifelab.19248172.v2
    Available download formats: application/x-gzip
    Dataset updated
    Jan 15, 2025
    Dataset provided by
    National Bioinformatics Infrastructure Sweden (Stockholm University & Science for Life Laboratory)
    Authors
    Verena Kutschera; Marcin Kierczak; Tom van der Valk; Johanna von Seth; Nicolas Dussex; Edana Lord; Marianne Dehasque; David W. G. Stanton; Payam Emami Khoonsari; Björn Nystedt; Love Dalén; David Díez del molino
    License

    https://www.gnu.org/licenses/gpl-3.0.html

    Description

    This item contains a test dataset based on Sumatran rhinoceros (Dicerorhinus sumatrensis) whole-genome re-sequencing data that we publish along with the GenErode pipeline (https://github.com/NBISweden/GenErode; Kutschera et al. 2022) and that we reduced in size so that users have the possibility to get familiar with the pipeline before analyzing their own genome-wide datasets. We extracted scaffold ‘Sc9M7eS_2_HRSCAF_41’ of size 40,842,778 bp from the Sumatran rhinoceros genome assembly (Dicerorhinus sumatrensis harrissoni; GenBank accession number GCA_014189135.1) to be used as reference genome in GenErode. Some GenErode steps require the reference genome of a closely related species, so we additionally provide three scaffolds from the White rhinoceros genome assembly (Ceratotherium simum simum; GenBank accession number GCF_000283155.1) with a combined length of 41,195,616 bp that are putatively orthologous to Sumatran rhinoceros scaffold ‘Sc9M7eS_2_HRSCAF_41’, along with gene predictions in GTF format. The repository also contains a Sumatran rhinoceros mitochondrial genome (GenBank accession number NC_012684.1) to be used as reference for the optional mitochondrial mapping step in GenErode. The test dataset contains whole-genome re-sequencing data from three historical and three modern Sumatran rhinoceros samples from the now-extinct Malay Peninsula population from von Seth et al. (2021) that was subsampled to paired-end reads that mapped to Sumatran rhinoceros scaffold ‘Sc9M7eS_2_HRSCAF_41’, along with a small proportion of randomly selected reads that mapped to the Sumatran rhinoceros mitochondrial genome or elsewhere in the genome. For GERP analyses, scaffolds from the genome assemblies of 30 mammalian outgroup species are provided that had reciprocal blast hits to gene predictions from Sumatran rhinoceros scaffold ‘Sc9M7eS_2_HRSCAF_41’. 
    Further, a phylogeny of the White rhinoceros and the 30 outgroup species including divergence time estimates (in billions of years) from timetree.org is available. Finally, the item contains configuration and metadata files that were used for three separate runs of GenErode to generate the results presented in Kutschera et al. (2022). Bash scripts and a workflow description for the test dataset generation are available in the GenErode GitHub repository (https://github.com/NBISweden/GenErode/docs/extras/test_dataset_generation).

    References:
    Kutschera VE, Kierczak M, van der Valk T, von Seth J, Dussex N, Lord E, et al. GenErode: a bioinformatics pipeline to investigate genome erosion in endangered and extinct species. BMC Bioinformatics 2022;23:228. https://doi.org/10.1186/s12859-022-04757-0
    von Seth J, Dussex N, Díez-Del-Molino D, van der Valk T, Kutschera VE, Kierczak M, et al. Genomic insights into the conservation status of the world’s last remaining Sumatran rhinoceros populations. Nature Communications 2021;12:2393.

  2. Genome Solver - A bioinformatics pipeline for community science

    • qubeshub.org
    Updated Feb 20, 2024
    Cite
    Vinayak Mathur; Gaurav Arora; Anne Rosenwald (2024). Genome Solver - A bioinformatics pipeline for community science [Dataset]. http://doi.org/10.25334/SEDX-YS29
    Dataset updated
    Feb 20, 2024
    Dataset provided by
    QUBES
    Authors
    Vinayak Mathur; Gaurav Arora; Anne Rosenwald
    Description

    The Genome Solver was an NSF-funded project developed as a way to train undergraduate life science faculty in basic web-based tools for bioinformatics. As part of the project we developed a one-day workshop consisting of bioinformatics modules on the theme of bacterial genomics, which we delivered to faculty at colleges and universities around the country. All of our workshop material can be accessed on the QUBESHub website: https://qubeshub.org/community/groups/genomesolver/

  3. Bioinformatics pipelines (Assembly & Annotation)

    • smithsonian.figshare.com
    txt
    Updated Aug 17, 2024
    Cite
    Adela Roa-Varon; Andrea Quattrini; Santiago Herrera (2024). Bioinformatics pipelines (Assembly & Annotation) [Dataset]. http://doi.org/10.25573/data.26198450.v1
    Available download formats: txt
    Dataset updated
    Aug 17, 2024
    Dataset provided by
    National Museum of Natural History
    Authors
    Adela Roa-Varon; Andrea Quattrini; Santiago Herrera
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Assembly and annotation scripts.

  4. Filtering steps in bioinformatics pipeline and remaining sequencing reads.

    • datasetcatalog.nlm.nih.gov
    • plos.figshare.com
    Updated Jul 11, 2014
    Cite
    Onderdonk, Andrew; Houseman, Andres; Roeselers, Guus; Gerber, Georg K.; Delaney, Mary; Bry, Lynn; Liu, Qing; DuBois, Andrea; Belavusava, Vera; Belzer, Clara; Cavanaugh, Colleen; Yeliseyev, Vladimir (2014). Filtering steps in bioinformatics pipeline and remaining sequencing reads. [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0001264218
    Dataset updated
    Jul 11, 2014
    Authors
    Onderdonk, Andrew; Houseman, Andres; Roeselers, Guus; Gerber, Georg K.; Delaney, Mary; Bry, Lynn; Liu, Qing; DuBois, Andrea; Belavusava, Vera; Belzer, Clara; Cavanaugh, Colleen; Yeliseyev, Vladimir
    Description

    Filtering steps in bioinformatics pipeline and remaining sequencing reads.

  5. Bioinformatics Pipelines as a Service Market Research Report 2033

    • growthmarketreports.com
    csv, pdf, pptx
    Updated Aug 29, 2025
    Cite
    Growth Market Reports (2025). Bioinformatics Pipelines as a Service Market Research Report 2033 [Dataset]. https://growthmarketreports.com/report/bioinformatics-pipelines-as-a-service-market
    Available download formats: pdf, csv, pptx
    Dataset updated
    Aug 29, 2025
    Dataset authored and provided by
    Growth Market Reports
    Time period covered
    2024 - 2032
    Area covered
    Global
    Description

    Bioinformatics Pipelines as a Service Market Outlook



    According to our latest research, the global Bioinformatics Pipelines as a Service market size was valued at USD 1.82 billion in 2024, and is anticipated to grow at a robust CAGR of 14.6% from 2025 to 2033. By the end of 2033, the market is forecasted to reach USD 5.73 billion. This growth is primarily driven by the increasing adoption of cloud computing in life sciences, the exponential rise in biological data generation, and the growing need for scalable, cost-effective, and automated bioinformatics solutions across healthcare, pharmaceutical, and research sectors.




    The surge in next-generation sequencing (NGS) and other high-throughput technologies has led to an unprecedented volume of biological data, creating a pressing demand for advanced computational tools. Bioinformatics Pipelines as a Service (BPaaS) addresses this need by offering scalable, automated, and user-friendly platforms that streamline complex data analysis workflows. Researchers and clinicians are increasingly leveraging these services to accelerate genomic, proteomic, and transcriptomic studies. The shift towards precision medicine and the growing importance of biomarker discovery are key growth factors, as BPaaS platforms enable rapid and reproducible analysis, reducing time-to-insight and enhancing research productivity. Furthermore, the integration of artificial intelligence (AI) and machine learning (ML) within these pipelines is further enhancing data interpretation, fostering innovation, and expanding market opportunities.




    Another significant growth driver is the rising demand for cost-effective and flexible bioinformatics solutions among small and medium-sized enterprises (SMEs) and academic institutions. Traditional bioinformatics infrastructure requires substantial investment in hardware, software, and skilled personnel, which can be prohibitive for smaller organizations. BPaaS eliminates these barriers by providing on-demand access to sophisticated analytical tools and computational resources, democratizing access to advanced bioinformatics. This trend is particularly evident in emerging economies, where cloud-based solutions are enabling research institutions and biotechnology startups to participate in cutting-edge life sciences research without heavy capital expenditure. Additionally, the growing collaborations between bioinformatics service providers and pharmaceutical companies are accelerating drug discovery and development pipelines, further propelling market growth.




    Regulatory compliance and data security have also become critical considerations, especially with the increasing use of patient-derived data in clinical and translational research. BPaaS providers are investing in robust security protocols, compliance certifications, and data governance frameworks to address these concerns. The adoption of cloud-based bioinformatics pipelines is being facilitated by advancements in data encryption, multi-factor authentication, and secure data storage solutions, ensuring the protection of sensitive genomic and clinical information. This has instilled greater confidence among healthcare providers and pharmaceutical companies, driving broader acceptance of BPaaS solutions in regulated environments. As a result, the market is witnessing strong demand from both developed and developing regions, with North America and Europe leading in adoption, while Asia Pacific and Latin America are rapidly emerging as high-growth markets.




    From a regional perspective, North America dominated the Bioinformatics Pipelines as a Service market in 2024, accounting for approximately 44% of global revenue, followed by Europe and Asia Pacific. The presence of leading bioinformatics companies, advanced healthcare infrastructure, and substantial investments in genomics research have positioned North America as a key driver of market expansion. Europe is also witnessing significant growth due to increased funding for life sciences research and supportive regulatory frameworks. Meanwhile, Asia Pacific is projected to exhibit the highest CAGR over the forecast period, driven by expanding biotechnology industries, growing government initiatives, and rising adoption of digital health technologies in countries such as China, India, and Japan.



    The emergence of Cloud-Based Multi-Omics D (https://growthmarketreports.com/report/cloud-based-multi-omics-data-warehouse-market)

  6. Bioinformatics Pipelines As A Service Market Research Report 2033

    • dataintelo.com
    csv, pdf, pptx
    Updated Sep 30, 2025
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Dataintelo (2025). Bioinformatics Pipelines As A Service Market Research Report 2033 [Dataset]. https://dataintelo.com/report/bioinformatics-pipelines-as-a-service-market
    Available download formats: pptx, pdf, csv
    Dataset updated
    Sep 30, 2025
    Dataset authored and provided by
    Dataintelo
    License

    https://dataintelo.com/privacy-and-policy

    Time period covered
    2024 - 2032
    Area covered
    Global
    Description

    Bioinformatics Pipelines as a Service Market Outlook



    According to our latest research, the Bioinformatics Pipelines as a Service market size reached USD 2.37 billion globally in 2024. The market is growing at a CAGR of 13.2% during the forecast period and is projected to reach USD 6.71 billion by 2033. This growth is primarily driven by the increasing adoption of next-generation sequencing, expanding applications in personalized medicine, and growing demand for scalable, cloud-based bioinformatics solutions. The market's expansion is underpinned by the convergence of advanced computational tools and the exponential rise in biological data generation across various sectors.




    A major growth factor fueling the Bioinformatics Pipelines as a Service market is the accelerating pace of genomic and multi-omics research worldwide. The proliferation of high-throughput sequencing technologies has resulted in an unprecedented surge in biological data. This deluge of information necessitates robust, scalable, and automated bioinformatics pipelines that can efficiently process, analyze, and interpret complex datasets. Organizations, ranging from pharmaceutical giants to academic research institutes, are increasingly turning to pipeline-as-a-service models to streamline their workflows, reduce operational overheads, and ensure data reproducibility. The ability to access cutting-edge analytical tools without heavy upfront investments in IT infrastructure is particularly attractive, fostering widespread adoption across both developed and emerging markets.




    Another significant driver is the growing emphasis on personalized medicine and precision healthcare. As clinicians and researchers strive to tailor treatments to individual genetic profiles, the need for sophisticated bioinformatics analysis has never been greater. Bioinformatics Pipelines as a Service platforms enable seamless integration of diverse omics data, supporting the identification of biomarkers, therapeutic targets, and patient-specific interventions. The flexibility of these solutions, combined with their ability to adapt to rapidly evolving scientific methodologies, positions them as indispensable assets in both clinical diagnostics and drug discovery pipelines. Moreover, regulatory agencies are increasingly recognizing the value of standardized, auditable bioinformatics workflows, further accelerating market adoption.




    The expanding application scope of bioinformatics pipelines in non-clinical domains, such as agriculture and crop science, is also contributing to market growth. Researchers in agrigenomics are leveraging these platforms to enhance crop yields, improve disease resistance, and accelerate breeding programs. The integration of metabolomics and proteomics data is enabling deeper insights into plant physiology and stress responses, driving innovation in sustainable agriculture. Additionally, the rise of collaborative research initiatives and public-private partnerships is fostering the development of interoperable, user-friendly pipeline solutions that cater to a broad spectrum of end-users. These trends collectively underscore the transformative potential of Bioinformatics Pipelines as a Service across diverse scientific disciplines.




    From a regional perspective, North America continues to dominate the Bioinformatics Pipelines as a Service market, supported by a robust biotechnology ecosystem, substantial R&D investments, and a favorable regulatory landscape. Europe follows closely, driven by strong academic research networks and government-backed genomics initiatives. The Asia Pacific region is emerging as a high-growth market, fueled by expanding healthcare infrastructure, rising awareness of precision medicine, and increasing participation in international genomics collaborations. Meanwhile, Latin America and the Middle East & Africa are witnessing gradual adoption, with market growth primarily concentrated in major urban centers and research hubs. Despite regional disparities, the global outlook remains overwhelmingly positive, with technological advancements and cross-sector collaborations expected to drive sustained market expansion through 2033.



    Offering Analysis



    The Offering segment of the Bioinformatics Pipelines as a Service market is bifurcated into Platform and S

  7. A Novel Bioinformatics Pipeline and a Machine Learning Approach for...

    • ieee-dataport.org
    Updated Oct 7, 2025
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Owen Visser (2025). A Novel Bioinformatics Pipeline and a Machine Learning Approach for Antimicrobial Resistance Phenotypic Prediction [Dataset]. https://ieee-dataport.org/documents/novel-bioinformatics-pipeline-and-machine-learning-approach-antimicrobial-resistance
    Dataset updated
    Oct 7, 2025
    Authors
    Owen Visser
    Description

    including 5

  8. Appendix N

    • figshare.com
    xlsx
    Updated Nov 16, 2021
    Cite
    haifei hu (2021). Appendix N [Dataset]. http://doi.org/10.6084/m9.figshare.17020004.v1
    Available download formats: xlsx
    Dataset updated
    Nov 16, 2021
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    haifei hu
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Appendix N: The website links of the bioinformatics tools and online resources used in this thesis are summarised.

  9. Virus reads reported by the bioinformatic pipeline.

    • datasetcatalog.nlm.nih.gov
    • figshare.com
    Updated May 23, 2017
    Cite
    Lewandowska, Dagmara W.; Huber, Michael; Schreiber, Peter W.; Bayard, Cornelia; Mueller, Nicolas J.; Schuurmans, Macé M.; Geissberger, Fabienne D.; Zagordi, Osvaldo; Capaul, Riccarda; Ruehe, Bettina; Benden, Christian; Greiner, Michael; Böni, Jürg; Trkola, Alexandra; Zbinden, Andrea (2017). Virus reads reported by the bioinformatic pipeline. [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0001808167
    Dataset updated
    May 23, 2017
    Authors
    Lewandowska, Dagmara W.; Huber, Michael; Schreiber, Peter W.; Bayard, Cornelia; Mueller, Nicolas J.; Schuurmans, Macé M.; Geissberger, Fabienne D.; Zagordi, Osvaldo; Capaul, Riccarda; Ruehe, Bettina; Benden, Christian; Greiner, Michael; Böni, Jürg; Trkola, Alexandra; Zbinden, Andrea
    Description

    Virus reads reported by the bioinformatic pipeline.

  10. Data from: Methy-Pipe: An Integrated Bioinformatics Pipeline for Whole...

    • datasetcatalog.nlm.nih.gov
    • plos.figshare.com
    • +1 more
    Updated Jun 19, 2014
    Cite
    Sun, Kun; Lo, Y. M. Dennis; Sun, Hao; Chiu, Rossa W. K.; Jiang, Peiyong; Lun, Fiona M. F.; Chan, K. C. Allen; Wang, Huating; Guo, Andy M. (2014). Methy-Pipe: An Integrated Bioinformatics Pipeline for Whole Genome Bisulfite Sequencing Data Analysis [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0001212226
    Dataset updated
    Jun 19, 2014
    Authors
    Sun, Kun; Lo, Y. M. Dennis; Sun, Hao; Chiu, Rossa W. K.; Jiang, Peiyong; Lun, Fiona M. F.; Chan, K. C. Allen; Wang, Huating; Guo, Andy M.
    Description

    DNA methylation, one of the most important epigenetic modifications, plays a crucial role in various biological processes. The level of DNA methylation can be measured using whole-genome bisulfite sequencing at single-base resolution. However, until now, there has been a paucity of publicly available software for carrying out integrated methylation data analysis. In this study, we implemented Methy-Pipe, which not only fulfills the core data analysis requirements (e.g. sequence alignment, differential methylation analysis, etc.) but also provides useful tools for methylation data annotation and visualization. Specifically, it uses the Burrows-Wheeler Transform (BWT) algorithm to directly align bisulfite sequencing reads to a reference genome and implements a novel sliding-window-based approach with statistical methods for the identification of differentially methylated regions (DMRs). The capability of processing data in parallel allows it to outperform a number of other bisulfite alignment software packages. To demonstrate its utility and performance, we applied it to both real and simulated bisulfite sequencing datasets. The results indicate that Methy-Pipe can accurately estimate methylation densities, identify DMRs and provide a variety of utility programs for downstream methylation data analysis. In summary, Methy-Pipe is a useful pipeline that can process whole genome bisulfite sequencing data in an efficient, accurate, and user-friendly manner. Software and test dataset are available at http://sunlab.lihs.cuhk.edu.hk/methy-pipe/.
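
    For readers unfamiliar with the Burrows-Wheeler Transform (BWT) named in the description above, the following is a minimal textbook sketch of the transform and its inverse. It is not Methy-Pipe's code: the function names `bwt` and `inverse_bwt` and the `$` sentinel are assumptions made for illustration, and the naive rotation sort used here is far slower than the indexed structures real aligners rely on.

```python
def bwt(text: str, sentinel: str = "$") -> str:
    """Naive Burrows-Wheeler Transform: sort all rotations, keep the last column."""
    if sentinel in text:
        raise ValueError("sentinel must not occur in the input text")
    s = text + sentinel  # the sentinel marks the end of the original string
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    return "".join(rotation[-1] for rotation in rotations)


def inverse_bwt(transformed: str, sentinel: str = "$") -> str:
    """Invert the transform by repeatedly prepending the last column and re-sorting."""
    table = [""] * len(transformed)
    for _ in range(len(transformed)):
        table = sorted(ch + row for ch, row in zip(transformed, table))
    # The row that ends with the sentinel is the original string.
    original = next(row for row in table if row.endswith(sentinel))
    return original[:-1]


if __name__ == "__main__":
    read = "ACGTTGCA"  # hypothetical short read, for illustration only
    encoded = bwt(read)
    print(encoded, inverse_bwt(encoded))
```

    The transform groups identical characters that share a following context, which is what makes compressed, searchable genome indexes practical.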

  11. Table_1_Comparison of Bioinformatics Pipelines and Operating Systems for the...

    • datasetcatalog.nlm.nih.gov
    • figshare.com
    • +1 more
    Updated Jun 17, 2020
    Cite
    Mirabelli, Peppino; Soricelli, Andrea; Mombelli, Elisa; Festari, Cristina; Greub, Gilbert; Gurry, Thomas; Mazzelli, Monica; Lopizzo, Nicola; Ribaldi, Federica; Cattaneo, Annamaria; Marizzoni, Moira; Frisoni, Giovanni B.; Provasi, Stefania; Salvatore, Marco; Franzese, Monica (2020). Table_1_Comparison of Bioinformatics Pipelines and Operating Systems for the Analyses of 16S rRNA Gene Amplicon Sequences in Human Fecal Samples.xlsx [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0000509442
    Dataset updated
    Jun 17, 2020
    Authors
    Mirabelli, Peppino; Soricelli, Andrea; Mombelli, Elisa; Festari, Cristina; Greub, Gilbert; Gurry, Thomas; Mazzelli, Monica; Lopizzo, Nicola; Ribaldi, Federica; Cattaneo, Annamaria; Marizzoni, Moira; Frisoni, Giovanni B.; Provasi, Stefania; Salvatore, Marco; Franzese, Monica
    Description

    Amplicon high-throughput sequencing of the 16S ribosomal RNA (rRNA) gene is currently the most widely used technique to investigate complex gut microbial communities. Microbial identification might be influenced by several factors, including the choice of bioinformatic pipelines, making comparisons across studies difficult. Here, we compared four commonly used pipelines (QIIME2, Bioconductor, UPARSE and mothur) run on two operating systems (OS) (Linux and Mac), to evaluate the impact of bioinformatic pipeline and OS on the taxonomic classification of 40 human stool samples. We applied the SILVA 132 reference database for all the pipelines. We compared phyla and genera identification and relative abundances across the four pipelines using the Friedman rank sum test. QIIME2 and Bioconductor provided identical outputs on Linux and Mac OS, while UPARSE and mothur reported only minimal differences between OS. Taxa assignments were consistent at both phylum and genus level across all the pipelines. However, a difference in terms of relative abundance was identified for all phyla (p < 0.013) and for the majority of the most abundant genera (p < 0.028), such as Bacteroides (QIIME2: 24.5%, Bioconductor: 24.6%, UPARSE-linux: 23.6%, UPARSE-mac: 20.6%, mothur-linux: 22.2%, mothur-mac: 21.6%, p < 0.001). The use of different bioinformatic pipelines affects the estimation of the relative abundance of gut microbial communities, indicating that studies using different pipelines cannot be directly compared. A harmonization procedure is needed to move the field forward.

  12. Supplementary tables

    • figshare.com
    pdf
    Updated Jan 10, 2022
    Cite
    Javier Montalvo-Arredondo (2022). Supplementary tables [Dataset]. http://doi.org/10.6084/m9.figshare.17704355.v1
    Available download formats: pdf
    Dataset updated
    Jan 10, 2022
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Javier Montalvo-Arredondo
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Supplementary table 1. USEQ pipeline test.

  13. Description of the parameters of sebnif and their default values.

    • figshare.com
    xls
    Updated Jun 1, 2023
    Cite
    Kun Sun; Yu Zhao; Huating Wang; Hao Sun (2023). Description of the parameters of sebnif and their default values. [Dataset]. http://doi.org/10.1371/journal.pone.0084500.t001
    Available download formats: xls
    Dataset updated
    Jun 1, 2023
    Dataset provided by
    PLOS ONE
    Authors
    Kun Sun; Yu Zhao; Huating Wang; Hao Sun
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Description of the parameters of sebnif and their default values.

  14. Bioinformatics Pipelines as a Service Market Research Report 2033

    • researchintelo.com
    csv, pdf, pptx
    Updated Oct 1, 2025
    Cite
    Research Intelo (2025). Bioinformatics Pipelines as a Service Market Research Report 2033 [Dataset]. https://researchintelo.com/report/bioinformatics-pipelines-as-a-service-market
    Available download formats: pptx, pdf, csv
    Dataset updated
    Oct 1, 2025
    Dataset authored and provided by
    Research Intelo
    License

    https://researchintelo.com/privacy-and-policy

    Time period covered
    2024 - 2033
    Area covered
    Global
    Description

    Bioinformatics Pipelines as a Service Market Outlook



    According to our latest research, the Global Bioinformatics Pipelines as a Service market size was valued at $1.98 billion in 2024 and is projected to reach $7.61 billion by 2033, expanding at a robust CAGR of 16.1% during the forecast period of 2025–2033. The primary driver fueling this remarkable growth is the surging demand for scalable, automated, and highly efficient bioinformatics solutions across genomics, proteomics, and other omics research domains. The proliferation of next-generation sequencing technologies, coupled with the exponential growth in biological data generation, has necessitated advanced, cloud-based bioinformatics pipelines that can streamline data analysis, reduce turnaround times, and enhance reproducibility for both research and clinical applications. As a result, Bioinformatics Pipelines as a Service (BPaaS) has emerged as a mission-critical enabler, accelerating scientific discovery and innovation in life sciences while democratizing access to high-performance computational tools.
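
    The growth figures quoted above can be cross-checked with the standard compound annual growth rate formula, CAGR = (end / start)^(1 / periods) - 1. The sketch below assumes nine annual compounding periods (2024 base year to 2033); the report does not state its compounding convention, so that count and the helper names `cagr` and `project` are assumptions for illustration only.

```python
def cagr(start_value: float, end_value: float, periods: int) -> float:
    """Compound annual growth rate between two values over a number of periods."""
    if start_value <= 0 or periods <= 0:
        raise ValueError("start_value and periods must be positive")
    return (end_value / start_value) ** (1 / periods) - 1


def project(start_value: float, rate: float, periods: int) -> float:
    """Value reached by compounding start_value at a fixed rate."""
    return start_value * (1 + rate) ** periods


if __name__ == "__main__":
    # Figures quoted in the description: USD 1.98 billion (2024), USD 7.61 billion (2033).
    implied = cagr(1.98, 7.61, periods=9)  # assumed: nine annual periods
    print(f"Implied CAGR: {implied:.1%}")
    print(f"Projection at 16.1%: USD {project(1.98, 0.161, 9):.2f} billion")
```

    Under that nine-period assumption the implied rate comes out close to the quoted 16.1%, so the three figures in this description are mutually consistent.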



    Regional Outlook



    North America currently holds the largest share of the Bioinformatics Pipelines as a Service market, accounting for over 38% of the global revenue in 2024. This dominance can be attributed to the region’s mature biotechnology and pharmaceutical ecosystem, extensive investments in genomics research, and the presence of leading bioinformatics service providers and cloud computing giants. The United States, in particular, has established a robust regulatory and funding framework that encourages the adoption of advanced digital health solutions, including BPaaS. Major academic research centers and healthcare institutions across North America are increasingly leveraging these platforms to support precision medicine initiatives, large-scale population genomics projects, and translational research, further solidifying the region’s leadership in this market.



    In contrast, the Asia Pacific region is projected to exhibit the fastest growth, with a remarkable CAGR of 19.3% between 2025 and 2033. This acceleration is underpinned by substantial investments in national genomics programs, expanding biotechnology hubs in countries such as China, India, and South Korea, and the rising adoption of cloud infrastructure. Governments and private players across Asia Pacific are actively fostering public-private partnerships, upgrading research capabilities, and incentivizing digital transformation in healthcare and life sciences. The growing pool of skilled bioinformaticians, coupled with the region’s large and genetically diverse populations, is creating significant opportunities for BPaaS providers to offer tailored solutions for disease research, drug discovery, and personalized medicine.



    Emerging economies in Latin America and Middle East & Africa are gradually embracing bioinformatics pipelines as a service, although market penetration remains constrained by challenges such as limited access to high-speed internet, lower R&D funding, and fragmented healthcare infrastructure. Nonetheless, localized demand for cost-effective and scalable bioinformatics solutions is rising, particularly as academic and clinical institutions seek to participate in global genomics consortia and leverage international expertise. Regulatory harmonization efforts, capacity-building initiatives, and targeted investments in digital health infrastructure are expected to gradually bridge adoption gaps, making these regions promising markets for future expansion.



    Report Scope





    Attributes          Details
    Report Title        Bioinformatics Pipelines as a Service Market Research Report 2033
    By Component        Software, Services
    By Deployment Mode  Cloud-based, On-Premises, Hybrid
    By Application      Genomics, Proteomics, Transcriptomics, Metabolomics, Other

  15. Additional file 3: of iMAP: an integrated bioinformatics and visualization...

    • springernature.figshare.com
    • datasetcatalog.nlm.nih.gov
    • +1 more
    html
    Updated May 31, 2023
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Teresia Buza; Triza Tonui; Francesca Stomeo; Christian Tiambo; Robab Katani; Megan Schilling; Beatus Lyimo; Paul Gwakisa; Isabella Cattadori; Joram Buza; Vivek Kapur (2023). Additional file 3: of iMAP: an integrated bioinformatics and visualization pipeline for microbiome data analysis [Dataset]. http://doi.org/10.6084/m9.figshare.8637557.v1
    Explore at:
    Available download formats: html
    Dataset updated
    May 31, 2023
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Teresia Buza; Triza Tonui; Francesca Stomeo; Christian Tiambo; Robab Katani; Megan Schilling; Beatus Lyimo; Paul Gwakisa; Isabella Cattadori; Joram Buza; Vivek Kapur
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Preprocessing report generated automatically by iMAP to summarize the quality control of the reads. The iMAP pipeline automatically saved the output in the “reports” folder as “report2_read_preprocessing.html”. (HTML 3463 kb)

  16. Data from: UnFATE: A comprehensive probe set and bioinformatics pipeline for...

    • datadryad.org
    • data.niaid.nih.gov
    • +1more
    zip
    Updated Jan 23, 2025
    Cite
    Claudio Gennaro Ametrano (2025). UnFATE: A comprehensive probe set and bioinformatics pipeline for phylogeny reconstruction and multilocus barcoding of filamentous ascomycetes (Ascomycota, Pezizomycotina) [Dataset]. http://doi.org/10.5061/dryad.tht76hf1x
    Explore at:
    Available download formats: zip
    Dataset updated
    Jan 23, 2025
    Dataset provided by
    Dryad
    Authors
    Claudio Gennaro Ametrano
    Time period covered
    May 12, 2022
    Description

    UnFATE: A Comprehensive Probe Set and Bioinformatics Pipeline for Phylogeny Reconstruction and Multilocus Barcoding of Filamentous Ascomycetes (Ascomycota, Pezizomycotina)

    The repository includes the representative sequences of the UnFATE 195 genes and the baits designed from them, as well as the single-locus trees, alignments, and final phylogenies for the proof-of-concept Pezizomycotina phylogeny inferred using the universal probe set and the pipeline we developed (files ending in "Pezizo_pilotTE"). It also includes the supermatrices and single-locus alignments generated by mining the 195 genes of our gene set from publicly available genomes used in published phylogenomic inferences.

    File description: the following tar.gz files contain the reference sequences (unfate_markers_reference_sequences_DNA.tar.gz) obtained from the clustering approach adopted to find the best representative sequences to build the universal bait set (baits.tar.gz).

    See Ametrano et al. 2025 (Systemati...

  17. Bioinformatics Services Market Grows from USD 2.9 Billion to 10.7 Billion by...

    • media.market.us
    Updated Oct 8, 2025
    Cite
    Market.us Media (2025). Bioinformatics Services Market Grows from USD 2.9 Billion to 10.7 Billion by 2033 [Dataset]. https://media.market.us/bioinformatics-services-market-news-2025/
    Explore at:
    Dataset updated
    Oct 8, 2025
    Dataset authored and provided by
    Market.us Media
    License

    https://media.market.us/privacy-policy

    Time period covered
    2022 - 2032
    Description

    Overview

    The Global Bioinformatics Services Market is projected to reach USD 10.7 billion by 2033, growing from USD 2.9 billion in 2023 at a CAGR of 13.9%. Growth is being driven by the rapid expansion of genomic and health data generation across research institutions, healthcare systems, and public-health agencies. The World Health Organization’s Global Genomic Surveillance Strategy has positioned bioinformatics as a core element in detecting and responding to health threats. This policy direction is reinforcing global demand for scalable analytical platforms, secure data sharing, and sustainable workflow solutions.
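    The growth figures above can be sanity-checked with the standard compound annual growth rate formula, (end / start)^(1 / years) - 1. The sketch below is illustrative only: the function name `cagr` is an assumption introduced here, and the inputs are simply the start value, end value, and years quoted in the overview, not data taken from the underlying report.

```python
def cagr(start_value: float, end_value: float, years: int) -> float:
    """Return the compound annual growth rate as a fraction (0.10 means 10% per year)."""
    if start_value <= 0 or years <= 0:
        raise ValueError("start_value and years must be positive")
    return (end_value / start_value) ** (1 / years) - 1


# Figures quoted in the overview: USD 2.9 billion (2023) to USD 10.7 billion (2033).
rate = cagr(2.9, 10.7, 2033 - 2023)
print(f"Implied CAGR: {rate:.1%}")
```

    Under these assumptions the implied rate comes out at roughly 13.9%, which agrees with the CAGR stated in the overview.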

    A fundamental growth catalyst is the declining cost of sequencing. According to the U.S. National Human Genome Research Institute, the cost per genome has decreased sharply since the late 2000s. As sequencing becomes more affordable, the number of samples increases, driving demand for downstream data storage, processing, and interpretation. Consequently, outsourcing bioinformatics tasks to specialized service providers has become more common and cost-effective.

    Another major factor supporting market expansion is the rise in publicly available genomic data. The NIH Sequence Read Archive (SRA) surpassed 50 petabases of data by early 2024, requiring large-scale indexing, quality control, and reanalysis. This massive data load necessitates professional expertise and infrastructure, which are primarily offered by bioinformatics service companies.

    The integration of genomics into healthcare systems is further strengthening market growth. The NHS Genomic Medicine Service in England is expanding clinical genomics applications in oncology and rare disease management. This transition creates sustained demand for validated bioinformatics pipelines, variant curation, and clinical reporting services. Healthcare institutions increasingly depend on external service providers for secure, clinical-grade analysis pipelines and data governance compliance, ensuring both accuracy and confidentiality in genomic interpretation.

    Emerging Opportunities and Regional Investments

    Public health initiatives and global investments are enhancing the bioinformatics services landscape. Programs like the U.S. CDC’s Advanced Molecular Detection and ECDC’s sequencing integration are driving large-scale genomic surveillance. These initiatives require ongoing analysis, pipeline standardization, and data-platform management, which are largely delivered through external service providers. As countries institutionalize sequencing, recurring demand for bioinformatics workflows and analytic services is expected to persist.

    In low- and middle-income countries, international investment is expanding market opportunities. The World Bank’s genomic capacity-building programs in Africa are fostering sequencing and analytics infrastructure. These efforts include bioinformatics training and workflow design, ensuring long-term sustainability. Such projects significantly widen the global serviceable market for bioinformatics expertise. Similarly, large-scale national genomic initiatives like the NIH All of Us program generate billions of variants that require harmonization, annotation, and interpretation, sustaining demand for cloud-based data management and analytic platforms.

    The growing focus on antimicrobial resistance (AMR) is also fueling bioinformatics adoption. Under WHO’s GLASS platform, countries are integrating whole-genome sequencing into AMR surveillance. This expansion is creating consistent demand for quality assurance, centralized analysis hubs, and workflow optimization. Furthermore, data governance reforms by the OECD and other regulatory bodies are facilitating secure secondary use of genomic data, promoting trust in data sharing and collaboration.

    Strategic public funding further strengthens the market outlook. Horizon Europe’s Health Work Programme (2025) and NHGRI’s technology initiatives continue to fund large-scale, data-driven research, ensuring a steady flow of contracts for bioinformatics firms. Workforce development is also improving, with national systems such as NHS England expanding bioinformatics training. This capacity building not only supports in-house analytics but also increases outsourcing to handle peak workloads and specialized computational tasks.

    In conclusion, the bioinformatics services market is benefiting from multiple converging factors—technological affordability, global health investments, regulatory clarity, and expanding data ecosystems. These structural developments are shaping a resilient, long-term demand environment for scalable, compliant, and high-quality bioinformatics services worldwide.

    [Figure: Bioinformatics Services Market Size Forecast]

  18. Bioinformatics Platforms Market Report

    • datainsightsmarket.com
    doc, pdf, ppt
    Updated Jun 17, 2025
    Cite
    Data Insights Market (2025). Bioinformatics Platforms Market Report [Dataset]. https://www.datainsightsmarket.com/reports/bioinformatics-platforms-market-7647
    Explore at:
    Available download formats: ppt, pdf, doc
    Dataset updated
    Jun 17, 2025
    Dataset authored and provided by
    Data Insights Market
    License

    https://www.datainsightsmarket.com/privacy-policy

    Time period covered
    2025 - 2033
    Area covered
    Global
    Variables measured
    Market Size
    Description

    The Bioinformatics Platforms Market was valued at USD 16.36 Million in 2023 and is projected to reach USD 27.93 Million by 2032, with an expected CAGR of 7.94% during the forecast period. Recent developments include: in June 2022, California's biotechnology research startup LatchBio launched an end-to-end bioinformatics platform for handling big biotech data to accelerate scientific discovery; in March 2022, ARUP launched Rio, a bioinformatics pipeline and analytics platform for better, faster next-generation sequencing test results. Key drivers for this market are: Increasing Demand for Nucleic Acid and Protein Sequencing; Increasing Initiatives from Governments and Private Organizations; Accelerating Growth of Proteomics and Genomics; Increasing Research on Molecular Biology and Drug Discovery. Potential restraints include: Lack of Well-defined Standards and Common Data Formats for Integration of Data; Data Complexity Concerns and Lack of User-friendly Tools. Notable trends are: Sequence Analysis Platform Segment is Expected to Hold a Significant Share Over the Forecast Period.

  19. Data from: Bioinformatics Pipelines for Targeted Resequencing and...

    • datasetcatalog.nlm.nih.gov
    • plos.figshare.com
    Updated Apr 21, 2014
    Cite
    Tothill, Richard W.; Saeed, Isaam; Doyle, Maria A.; Li, Jason; Mar, Victoria; Dobrovic, Alexander; Ryland, Georgina L.; Halgamuge, Saman K.; Thompson, Ella R.; Caramia, Franco; Campbell, Ian G.; Ellul, Jason; McArthur, Grant A.; Wong, Stephen Q.; Goode, David L.; Doig, Ken; Hunter, Sally M.; Papenfuss, Anthony T. (2014). Bioinformatics Pipelines for Targeted Resequencing and Whole-Exome Sequencing of Human and Mouse Genomes: A Virtual Appliance Approach for Instant Deployment [Dataset]. https://datasetcatalog.nlm.nih.gov/dataset?q=0001194308
    Explore at:
    Dataset updated
    Apr 21, 2014
    Authors
    Tothill, Richard W.; Saeed, Isaam; Doyle, Maria A.; Li, Jason; Mar, Victoria; Dobrovic, Alexander; Ryland, Georgina L.; Halgamuge, Saman K.; Thompson, Ella R.; Caramia, Franco; Campbell, Ian G.; Ellul, Jason; McArthur, Grant A.; Wong, Stephen Q.; Goode, David L.; Doig, Ken; Hunter, Sally M.; Papenfuss, Anthony T.
    Description

    Targeted resequencing by massively parallel sequencing has become an effective and affordable way to survey small to large portions of the genome for genetic variation. Despite the rapid development in open source software for analysis of such data, the practical implementation of these tools through construction of sequencing analysis pipelines still remains a challenging and laborious activity, and a major hurdle for many small research and clinical laboratories. We developed TREVA (Targeted REsequencing Virtual Appliance), making pre-built pipelines immediately available as a virtual appliance. Based on virtual machine technologies, TREVA is a solution for rapid and efficient deployment of complex bioinformatics pipelines to laboratories of all sizes, enabling reproducible results. The analyses that are supported in TREVA include: somatic and germline single-nucleotide and insertion/deletion variant calling, copy number analysis, and cohort-based analyses such as pathway and significantly mutated genes analyses. TREVA is flexible and easy to use, and can be customised by Linux-based extensions if required. TREVA can also be deployed on the cloud (cloud computing), enabling instant access without investment overheads for additional hardware. TREVA is available at http://bioinformatics.petermac.org/treva/.

  20. Bioinformatics pipeline for processing ADPr-Seq data

    • figshare.com
    zip
    Updated May 21, 2025
    Cite
    Alexessander Couto Alves; Emily Wasson (2025). Bioinformatics pipeline for processing ADPr-Seq data [Dataset]. http://doi.org/10.6084/m9.figshare.29066690.v1
    Explore at:
    Available download formats: zip
    Dataset updated
    May 21, 2025
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Alexessander Couto Alves; Emily Wasson
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Software to produce the figures in the ADPr-Seq paper, filter reads properly paired to restriction enzyme fragment termini, and quantify the amount of ADPr DNA modification in each restriction enzyme fragment.

Cite
Verena Kutschera; Marcin Kierczak; Tom van der Valk; Johanna von Seth; Nicolas Dussex; Edana Lord; Marianne Dehasque; David W. G. Stanton; Payam Emami Khoonsari; Björn Nystedt; Love Dalén; David Díez del molino (2025). Test dataset from: GenErode: a bioinformatics pipeline to investigate genome erosion in endangered and extinct species [Dataset]. http://doi.org/10.17044/scilifelab.19248172.v2

Test dataset from: GenErode: a bioinformatics pipeline to investigate genome erosion in endangered and extinct species

Related Article
Explore at:
Available download formats: application/x-gzip
Dataset updated
Jan 15, 2025
Dataset provided by
National Bioinformatics Infrastructure Sweden (Stockholm University & Science for Life Laboratory)
Authors
Verena Kutschera; Marcin Kierczak; Tom van der Valk; Johanna von Seth; Nicolas Dussex; Edana Lord; Marianne Dehasque; David W. G. Stanton; Payam Emami Khoonsari; Björn Nystedt; Love Dalén; David Díez del molino
License

https://www.gnu.org/licenses/gpl-3.0.html

Description

This item contains a test dataset based on Sumatran rhinoceros (Dicerorhinus sumatrensis) whole-genome re-sequencing data that we publish along with the GenErode pipeline (https://github.com/NBISweden/GenErode; Kutschera et al. 2022) and that we reduced in size so that users have the possibility to get familiar with the pipeline before analyzing their own genome-wide datasets. We extracted scaffold ‘Sc9M7eS_2_HRSCAF_41’ of size 40,842,778 bp from the Sumatran rhinoceros genome assembly (Dicerorhinus sumatrensis harrissoni; GenBank accession number GCA_014189135.1) to be used as reference genome in GenErode. Some GenErode steps require the reference genome of a closely related species, so we additionally provide three scaffolds from the White rhinoceros genome assembly (Ceratotherium simum simum; GenBank accession number GCF_000283155.1) with a combined length of 41,195,616 bp that are putatively orthologous to Sumatran rhinoceros scaffold ‘Sc9M7eS_2_HRSCAF_41’, along with gene predictions in GTF format. The repository also contains a Sumatran rhinoceros mitochondrial genome (GenBank accession number NC_012684.1) to be used as reference for the optional mitochondrial mapping step in GenErode. The test dataset contains whole-genome re-sequencing data from three historical and three modern Sumatran rhinoceros samples from the now-extinct Malay Peninsula population from von Seth et al. (2021) that was subsampled to paired-end reads that mapped to Sumatran rhinoceros scaffold ‘Sc9M7eS_2_HRSCAF_41’, along with a small proportion of randomly selected reads that mapped to the Sumatran rhinoceros mitochondrial genome or elsewhere in the genome. For GERP analyses, scaffolds from the genome assemblies of 30 mammalian outgroup species are provided that had reciprocal blast hits to gene predictions from Sumatran rhinoceros scaffold ‘Sc9M7eS_2_HRSCAF_41’. 
Further, a phylogeny of the White rhinoceros and the 30 outgroup species including divergence time estimates (in billions of years) from timetree.org is available. Finally, the item contains configuration and metadata files that were used for three separate runs of GenErode to generate the results presented in Kutschera et al. (2022). Bash scripts and a workflow description for the test dataset generation are available in the GenErode GitHub repository (https://github.com/NBISweden/GenErode/docs/extras/test_dataset_generation).

References:

Kutschera VE, Kierczak M, van der Valk T, von Seth J, Dussex N, Lord E, et al. GenErode: a bioinformatics pipeline to investigate genome erosion in endangered and extinct species. BMC Bioinformatics 2022;23:228. https://doi.org/10.1186/s12859-022-04757-0

von Seth J, Dussex N, Díez-Del-Molino D, van der Valk T, Kutschera VE, Kierczak M, et al. Genomic insights into the conservation status of the world’s last remaining Sumatran rhinoceros populations. Nature Communications 2021;12:2393.
