100+ datasets found
  1. Bioinformatics data for paper

    • catalog.data.gov
    • data.amerigeoss.org
    Updated Nov 12, 2020
    Cite
    U.S. EPA Office of Research and Development (ORD) (2020). Bioinformatics data for paper [Dataset]. https://catalog.data.gov/dataset/bioinformatics-data-for-paper
    Dataset updated
    Nov 12, 2020
    Dataset provided by
    United States Environmental Protection Agency (http://www.epa.gov/)
    Description

    Data for sequence comparison of comammox genomes and genes identified. This dataset is associated with the following publication: Camejo, P., J. Santodomingo, K. McMahon, and D. Noguera. Genome-enabled insights into the ecophysiology of the comammox bacterium Ca. Nitrospira nitrosa. ENVIRONMENTAL SCIENCE & TECHNOLOGY. American Chemical Society, Washington, DC, USA, 2(5): 1-16, (2017).

  2. DMP of GenoToul Bioinfo facility

    • entrepot.recherche.data.gouv.fr
    odt
    Updated Dec 14, 2023
    Cite
    Claire Hoede; Christine Gaspin (2023). DMP of GenoToul Bioinfo facility [Dataset]. http://doi.org/10.57745/ATSYED
    Available download formats: odt(32706)
    Dataset updated
    Dec 14, 2023
    Dataset provided by
    Recherche Data Gouv
    Authors
    Claire Hoede; Christine Gaspin
    License

    https://spdx.org/licenses/etalab-2.0.html

    Description

    The data management plan of the infrastructure concerns three types of data:
    • Data-U: user data stored on working spaces assigned to user and project accounts.
    • Data-P: data related to projects supported by GenoToul bioinfo. These data are stored in project or user accounts. They are processed within the framework of collaboration or service provision.
    • Data-O: data mutualized between users, including workflows, software (compiled code, source code) and databanks (flat files) and associated metadata. They are hosted either on our computing infrastructure and/or on virtual machines.

  3. DAVID

    • neuinfo.org
    • scicrunch.org
    Updated Aug 17, 2024
    Cite
    (2024). DAVID [Dataset]. http://identifiers.org/RRID:SCR_001881
    Dataset updated
    Aug 17, 2024
    Description

    Bioinformatics resource system including a web server and web service for functional annotation and enrichment analyses of gene lists. It consists of a comprehensive knowledgebase and a set of functional analysis tools, and includes a gene-centered database integrating heterogeneous gene annotation resources to facilitate high-throughput gene functional analysis.

  4. VectorBase (Bioinformatics Resource for Invertebrate Vectors of Human...

    • gbif.org
    Updated Mar 19, 2024
    Cite
    VectorBase.org (2024). VectorBase (Bioinformatics Resource for Invertebrate Vectors of Human Pathogens) [Dataset]. http://doi.org/10.15468/cqfb4o
    Dataset updated
    Mar 19, 2024
    Dataset provided by
    Global Biodiversity Information Facility (https://www.gbif.org/)
    VectorBase.org
    License

    Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
    License information was derived automatically

    Description

    VectorBase is a National Institute of Allergy and Infectious Diseases (NIAID) Bioinformatics Resource Center (BRC) providing genomic, phenotypic and population-centric data to the scientific community for invertebrate vectors of human pathogens. This includes warehousing standardized population survey data from mosquito abatement districts across the United States, and from international sources. Contact: info@vectorbase.org

  5. Dataset for practice session 1 in bioinformatics

    • figshare.com
    txt
    Updated Jul 17, 2016
    Cite
    Elena Sugis (2016). Dataset for practice session 1 in bioinformatics [Dataset]. http://doi.org/10.6084/m9.figshare.3490211.v3
    Available download formats: txt
    Dataset updated
    Jul 17, 2016
    Dataset provided by
    figshare
    Authors
    Elena Sugis
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Dataset for practice in data preprocessing and unsupervised learning in the Introduction to Bioinformatics course.

  6. Bioinformatics Market Analysis, Size, and Forecast 2025-2029: North America...

    • technavio.com
    Updated Jun 19, 2025
    Cite
    Technavio (2025). Bioinformatics Market Analysis, Size, and Forecast 2025-2029: North America (US, Canada, and Mexico), Europe (France, Germany, Italy, and UK), APAC (China, India, and Japan), and Rest of World (ROW) [Dataset]. https://www.technavio.com/report/bioinformatics-market-industry-analysis
    Dataset updated
    Jun 19, 2025
    Dataset provided by
    TechNavio
    Authors
    Technavio
    Time period covered
    2021 - 2025
    Area covered
    France, Europe, United Kingdom, Canada, Germany, United States, Global
    Description

    Bioinformatics Market Size 2025-2029

    The bioinformatics market size is forecast to increase by USD 15.98 billion at a CAGR of 17.4% between 2024 and 2029.

    The market is experiencing significant growth, driven by the reduction in the cost of genetic sequencing and the development of advanced bioinformatics tools for Next-Generation Sequencing (NGS) technologies. These advancements have led to an increase in the volume and complexity of genomic data, necessitating sophisticated bioinformatics solutions. However, the market faces challenges, primarily the shortage of trained laboratory professionals capable of handling and interpreting the vast amounts of data generated. This skills gap can hinder the effective implementation and utilization of bioinformatics tools, potentially limiting the market's growth potential.
    Companies seeking to capitalize on market opportunities must focus on addressing this challenge by investing in training programs and collaborating with academic institutions. Additionally, data security, data privacy, and regulatory compliance are crucial aspects of the market, ensuring the protection and ethical use of sensitive biological data. Partnerships with technology providers and service organizations can help bridge the gap in expertise and resources, enabling organizations to leverage the power of bioinformatics for research and development, diagnostics, and personalized medicine applications.
    

    What will be the Size of the Bioinformatics Market during the forecast period?

    Explore in-depth regional segment analysis with market size data - historical 2019-2023 and forecasts 2025-2029 - in the full report.

    The market is experiencing significant growth, driven by the increasing demand for precision medicine and the exploration of complex biological systems. Structural variation and gene regulation play crucial roles in gene networks and biological networks, necessitating advanced tools for SNP genotyping and statistical analysis. Precision medicine relies on the identification of mutations and biomarkers through mutation analysis and biomarker validation.
    Metabolic networks, protein microarrays, cDNA microarrays, and RNA microarrays contribute to the discovery of new insights in evolutionary biology and conservation biology. The integration of these technologies enables a comprehensive understanding of gene regulation, gene networks, and metabolic pathways, ultimately leading to the development of novel therapeutics. Protein-protein interactions and signal transduction pathways are essential in understanding protein networks and metabolic pathways. Ontology mapping and predictive modeling facilitate data warehousing and data analytics in this field.
    

    How is this Bioinformatics Industry segmented?

    The bioinformatics industry research report provides comprehensive data (region-wise segment analysis), with forecasts and estimates in 'USD million' for the period 2025-2029, as well as historical data from 2019-2023 for the following segments.

    Application
      Molecular phylogenetics
      Transcriptomic
      Proteomics
      Metabolomics

    Product
      Platforms
      Tools
      Services

    End-user
      Pharmaceutical and biotechnology companies
      CROs and research institutes
      Others

    Geography
      North America
        US
        Canada
        Mexico
      Europe
        France
        Germany
        Italy
        UK
      APAC
        China
        India
        Japan
      Rest of World (ROW)

    By Application Insights

    The molecular phylogenetics segment is estimated to witness significant growth during the forecast period. In the dynamic and innovative realm of bioinformatics, various technologies and techniques are shaping the future of research and development. Molecular phylogenetics, a significant branch of bioinformatics, employs molecular data to explore the evolutionary connections among species, offering enhanced insights into the intricacies of life. This technique has been instrumental in numerous research domains, such as drug discovery, disease diagnosis, and conservation biology. For instance, it plays a pivotal role in the study of viral evolution. By deciphering the molecular data of distinct virus strains, researchers can trace their evolutionary history and unravel their origins and transmission patterns.

    Furthermore, the integration of proteomic technologies, network analysis, data integration, and systems biology is expanding the scope of bioinformatics research and applications. Bioinformatics services, open-source bioinformatics, and commercial bioinformatics software are vital components of the market, catering to the diverse needs of researchers, industries, and institutions. Bioinformatics databases, including sequence databases and bioinformatics algorithms, are indispensable resources for storing, accessing, and analyzing biological data. In the realm of personalized medicine and drug di

  7. Bioinformatics Links Directory

    • dknet.org
    Updated Jan 29, 2022
    Cite
    (2024). Bioinformatics Links Directory [Dataset]. http://identifiers.org/RRID:SCR_008018
    Dataset updated
    Jan 29, 2022
    Description

    Database of curated links to molecular resources, tools and databases selected on the basis of recommendations from bioinformatics experts in the field. This resource relies on input from its community of bioinformatics users for suggestions. Since 2003, it has also listed all links contained in the NAR Webserver issue. The different types of information available in this portal:
    * Computer Related: This category contains links to resources relating to programming languages often used in bioinformatics. Other tools of the trade, such as web development and database resources, are also included here.
    * Sequence Comparison: Tools and resources for the comparison of sequences including sequence similarity searching, alignment tools, and general comparative genomics resources.
    * DNA: This category contains links to useful resources for DNA sequence analyses such as tools for comparative sequence analysis and sequence assembly. Links to programs for sequence manipulation, primer design, and sequence retrieval and submission are also listed here.
    * Education: Links to information about the techniques, materials, people, places, and events of the greater bioinformatics community. Included are current news headlines, literature sources, educational material and links to bioinformatics courses and workshops.
    * Expression: Links to tools for predicting the expression, alternative splicing, and regulation of a gene sequence are found here. This section also contains links to databases, methods, and analysis tools for protein expression, SAGE, EST, and microarray data.
    * Human Genome: This section contains links to draft annotations of the human genome in addition to resources for sequence polymorphisms and genomics. Also included are links related to ethical discussions surrounding the study of the human genome.
    * Literature: Links to resources related to published literature, including tools to search for articles and through literature abstracts. Additional text mining resources, open access resources, and literature goldmines are also listed.
    * Model Organisms: Included in this category are links to resources for various model organisms ranging from mammals to microbes. These include databases and tools for genome scale analyses.
    * Other Molecules: Bioinformatics tools related to molecules other than DNA, RNA, and protein. This category will include resources for the bioinformatics of small molecules as well as for other biopolymers including carbohydrates and metabolites.
    * Protein: This category contains links to useful resources for protein sequence and structure analyses. Resources for phylogenetic analyses, prediction of protein features, and analyses of interactions are also found here.
    * RNA: Resources include links to sequence retrieval programs, structure prediction and visualization tools, motif search programs, and information on various functional RNAs.

  8. Bioinformatics for Researchers in Life Sciences: Tools and Learning...

    • data.iadb.org
    csv, pdf
    Updated Apr 10, 2025
    Cite
    IDB Datasets (2025). Bioinformatics for Researchers in Life Sciences: Tools and Learning Resources [Dataset]. http://doi.org/10.60966/kwvb-wr19
    Available download formats: csv(276253), pdf(2989058), csv(355108)
    Dataset updated
    Apr 10, 2025
    Dataset provided by
    IDB Datasets
    License

    Attribution-NonCommercial-NoDerivs 3.0 (CC BY-NC-ND 3.0): https://creativecommons.org/licenses/by-nc-nd/3.0/
    License information was derived automatically

    Time period covered
    Jan 1, 2020 - Jan 1, 2021
    Description

    The COVID-19 pandemic has shown that bioinformatics--a multidisciplinary field that combines biological knowledge with computer programming concerned with the acquisition, storage, analysis, and dissemination of biological data--has a fundamental role in scientific research strategies in all disciplines involved in fighting the virus and its variants. It aids in sequencing and annotating genomes and their observed mutations; analyzing gene and protein expression; simulation and modeling of DNA, RNA, proteins and biomolecular interactions; and mining of biological literature, among many other critical areas of research. Studies suggest that bioinformatics skills in the Latin American and Caribbean region are relatively incipient, and thus its scientific systems cannot take full advantage of the increasing availability of bioinformatic tools and data. This dataset is a catalog of bioinformatics software for researchers and professionals working in life sciences. It includes more than 300 different tools for varied uses, such as data analysis, visualization, repositories and databases, data storage services, scientific communication, marketplace and collaboration, and lab resource management. Most tools are available as web-based or desktop applications, while others are programming libraries. It also includes 10 suggested entries for other third-party repositories that could be of use.

  9. Data from: Semi-artificial datasets as a resource for validation of...

    • search.dataone.org
    • explore.openaire.eu
    • +2more
    Updated May 21, 2025
    Cite
    Lucie Tamisier; Annelies Haegeman; Yoika Foucart; Nicolas Fouillien; Maher Al Rwahnih; Nihal Buzkan; Thierry Candresse; Michela Chiumenti; Kris De Jonghe; Marie Lefebvre; Paolo Margaria; Jean Sébastien Reynard; Kristian Stevens; Denis Kutnjak; Sébastien Massart (2025). Semi-artificial datasets as a resource for validation of bioinformatics pipelines for plant virus detection [Dataset]. http://doi.org/10.5061/dryad.0zpc866z8
    Dataset updated
    May 21, 2025
    Dataset provided by
    Dryad Digital Repository
    Authors
    Lucie Tamisier; Annelies Haegeman; Yoika Foucart; Nicolas Fouillien; Maher Al Rwahnih; Nihal Buzkan; Thierry Candresse; Michela Chiumenti; Kris De Jonghe; Marie Lefebvre; Paolo Margaria; Jean Sébastien Reynard; Kristian Stevens; Denis Kutnjak; Sébastien Massart
    Time period covered
    Jan 1, 2021
    Description

    In the last decade, High-Throughput Sequencing (HTS) has revolutionized biology and medicine. This technology allows the sequencing of huge amounts of DNA and RNA fragments at a very low price. In medicine, HTS tests for disease diagnostics are already brought into routine practice. However, the adoption in plant health diagnostics is still limited. One of the main bottlenecks is the lack of expertise and consensus on the standardization of the data analysis. The Plant Health Bioinformatic Network (PHBN) is a Euphresco project aiming to build a community network of bioinformaticians/computational biologists working in plant health. One of the main goals of the project is to develop reference datasets that can be used for validation of bioinformatics pipelines and for standardization purposes.

    Semi-artificial datasets have been created for this purpose (Datasets 1 to 10). They are composed of a “real” HTS dataset spiked with artificial viral reads. It will allow researchers to adjust ...

  10. Data from: A bioinformatics approach for integrated transcriptomic and...

    • data.niaid.nih.gov
    • ebi.ac.uk
    xml
    Updated Nov 2, 2012
    Cite
    Ceereena Ubaida Mohien (2012). A bioinformatics approach for integrated transcriptomic and proteomic comparative analyses of model and non-sequenced anopheline vectors of human malaria parasites [Dataset]. https://data.niaid.nih.gov/resources?id=pxd000062
    Available download formats: xml
    Dataset updated
    Nov 2, 2012
    Dataset provided by
    Johns Hopkins School of Medicine
    Authors
    Ceereena Ubaida Mohien
    Variables measured
    Proteomics
    Description

    Malaria morbidity and mortality caused by both Plasmodium falciparum and Plasmodium vivax extend well beyond the African continent, and, although P. vivax causes 80-300 million severe cases each year, vivax transmission remains poorly understood. Plasmodium parasites are transmitted by Anopheles mosquitoes, and the critical site of interaction between parasite and host is at the mosquito's luminal midgut brush border. While the genome of the "model" African P. falciparum vector, Anopheles gambiae, has been sequenced, evolutionary divergence limits its utility as a reference across anophelines, especially non-sequenced P. vivax vectors such as Anopheles albimanus. Clearly, enabling technologies and platforms that bridge this substantial scientific gap are required in order to provide public health scientists key transcriptomic and proteomic information that could spur the development of novel interventions to combat this disease. To our knowledge, no approaches have been published which address this issue. To bolster our understanding of P. vivax-An. albimanus midgut interactions, we developed an integrated bioinformatic-hybrid RNA-Seq-LC-MS/MS approach involving An. albimanus transcriptome (15,764 contigs) and luminal midgut subproteome (9,445 proteins) assembly, which, when used with our custom Diptera protein database (685,078 sequences), facilitated a comparative proteomic analysis of the midgut brush borders of two important malaria vectors, An. gambiae and An. albimanus. Summary from: http://www.mcponline.org/content/early/2012/10/17/mcp.M112.019596.long The An. albimanus transcriptome dataset is available at http://funcgen.vectorbase.org/RNAseq/Anopheles_albimanus/INSP/v2

  11. Metabarcoding datasets and bioinformatic scripts for the LEARN_BIOCONTROL...

    • entrepot.recherche.data.gouv.fr
    application/gzip, bin +2
    Updated Feb 8, 2022
    Cite
    Corinne Vacher (2022). Metabarcoding datasets and bioinformatic scripts for the LEARN_BIOCONTROL project [Dataset]. http://doi.org/10.15454/GCYURM
    Available download formats: bin(250646), bin(543283), tsv(2748), pdf(387366), bin(645743), bin(541587), bin(1238906), bin(1872607), bin(773994373), bin(787737), bin(1206349), bin(9256934), tsv(76411), bin(605657), bin(324084), bin(3082), application/gzip(349277), bin(4676703), bin(169948), bin(2157928), bin(421981), bin(347811), bin(2976302303)
    Dataset updated
    Feb 8, 2022
    Dataset provided by
    Recherche Data Gouv
    Authors
    Corinne Vacher
    License

    https://spdx.org/licenses/etalab-2.0.html

    Description

    Metabarcoding datasets describing the foliar mycobiome of grapevine (Vitis vinifera L.) leaves collected in the LEARN-BIOCONTROL project. Bioinformatic scripts used to analyze the sequence data. Raw and filtered ASV table in the QIIME2 and R phyloseq formats.

  12. Bioinformatic Tools for the Prediction of Key Regulatory Molecules in...

    • heidata.uni-heidelberg.de
    application/x-gzip +2
    Updated Jul 7, 2017
    Cite
    Pavlo Holenya; Florian Heigwer; Stefan Wölfl (2017). Bioinformatic Tools for the Prediction of Key Regulatory Molecules in Signaltransduction Networks [Dataset]. http://doi.org/10.11588/DATA/10056
    Available download formats: zip(987251), application/x-gzip(923281), zip(972050), application/x-gzip(923987), application/x-gzip(10231642), text/plain; charset=us-ascii(1718), zip(12217650), text/plain; charset=us-ascii(3878)
    Dataset updated
    Jul 7, 2017
    Dataset provided by
    heiDATA
    Authors
    Pavlo Holenya; Florian Heigwer; Stefan Wölfl
    License

    https://heidata.uni-heidelberg.de/api/datasets/:persistentId/versions/1.3/customlicense?persistentId=doi:10.11588/DATA/10056

    Description

    Kinetic Operating Microarray Analyzer (KOMA) enables calibration and high-throughput analysis of quantitative microarray data collected using a kinetic detection protocol. This tool can also be helpful for analyzing data from any other analytical assays employing enzymatic signal amplification, in which a broader range of quantification is reached by the time-resolved recording of readouts.

  13. Bioinformatic Tool for the Analysis of Time-Resolved data: the TReCCA...

    • heidata.uni-heidelberg.de
    pdf, zip
    Updated Nov 29, 2017
    Cite
    Julia Lochead; Julia Schessner; Tobias Werner; Stefan Wölfl (2017). Bioinformatic Tool for the Analysis of Time-Resolved data: the TReCCA Analyser [Dataset]. http://doi.org/10.11588/DATA/UYHOBR
    Available download formats: zip(2538057), pdf(1956739), zip(6648484), zip(96157)
    Dataset updated
    Nov 29, 2017
    Dataset provided by
    heiDATA
    Authors
    Julia Lochead; Julia Schessner; Tobias Werner; Stefan Wölfl
    License

    https://heidata.uni-heidelberg.de/api/datasets/:persistentId/versions/1.1/customlicense?persistentId=doi:10.11588/DATA/UYHOBR

    Description

    The TReCCA Analyser is conceived to facilitate, speed up and intensify the analysis and representation of your time-resolved data, more specifically in the case of cell culture assays. Without having to type any formula, it will perform the following calculations on request:
    • Control condition normalisation.
    • Technical replicate averaging and standard deviation calculation.
    • Smoothing and slope calculation of the data in order to obtain the rate of change.
    • IC50/EC50 determination of a substance in a time-resolved fashion.
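
    The first three calculations named in this description can be sketched in a few lines of Python. This is an illustrative sketch only, not the TReCCA Analyser's own code: the function names, the centred moving-average smoothing, and the finite-difference slope are assumptions chosen for the example, and IC50/EC50 fitting is left out.

```python
from statistics import mean, stdev


def replicate_stats(replicates):
    """Average technical replicates per time point.

    `replicates` is a list of equally long series, one per replicate.
    Returns (means, standard_deviations), one value per time point.
    """
    columns = list(zip(*replicates))
    means = [mean(col) for col in columns]
    # Sample standard deviation needs at least two replicates.
    sds = [stdev(col) if len(col) > 1 else 0.0 for col in columns]
    return means, sds


def normalise_to_control(treated, control):
    """Express each treated value as a fraction of the control value."""
    return [t / c for t, c in zip(treated, control)]


def moving_average(values, window=3):
    """Centred moving average; the window shrinks at the series edges."""
    half = window // 2
    smoothed = []
    for i in range(len(values)):
        lo, hi = max(0, i - half), min(len(values), i + half + 1)
        smoothed.append(mean(values[lo:hi]))
    return smoothed


def slopes(times, values):
    """Rate of change between consecutive time points (finite differences)."""
    return [
        (values[i + 1] - values[i]) / (times[i + 1] - times[i])
        for i in range(len(values) - 1)
    ]


# Hypothetical example data: two replicates per condition, four time points.
times = [0, 1, 2, 3]
control_mean, _ = replicate_stats([[10, 12, 14, 16], [10, 12, 14, 16]])
treated_mean, treated_sd = replicate_stats([[10, 9, 8, 7], [10, 11, 8, 9]])
relative = normalise_to_control(treated_mean, control_mean)
rate = slopes(times, moving_average(relative))
```

    Here `relative` holds the treated signal as a fraction of the control at each time point, and `rate` approximates its rate of change after smoothing.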

  14. Plant Bioinformatics Facility (PlantBioinfoPF) data management plan

    • entrepot.recherche.data.gouv.fr
    docx
    Updated Jul 18, 2024
    Cite
    Celia Michotey (2024). Plant Bioinformatics Facility (PlantBioinfoPF) data management plan [Dataset]. http://doi.org/10.15454/9HM5UI
    Available download formats: docx(380140), docx(391106)
    Dataset updated
    Jul 18, 2024
    Dataset provided by
    Recherche Data Gouv
    Authors
    Celia Michotey
    License

    https://entrepot.recherche.data.gouv.fr/api/datasets/:persistentId/versions/2.1/customlicense?persistentId=doi:10.15454/9HM5UI

    Description

    Data management plan of the Plant Bioinformatics Facility (PlantBioinfoPF), hosted at URGI. Copyrights: The creator(s) of this plan accept(s) that all or part of the text may be reused and personalized if necessary for another plan. You can cite this plan’s DOI as a source, but this does not imply that the creator(s) endorse(s) or have any connection with your project or submission.

  15. data-for-snakemake-novice-bioinformatics.tar.xz

    • figshare.com
    xz
    Updated Sep 26, 2023
    Cite
    Timothy Booth (2023). data-for-snakemake-novice-bioinformatics.tar.xz [Dataset]. http://doi.org/10.6084/m9.figshare.19733338.v2
    Available download formats: xz
    Dataset updated
    Sep 26, 2023
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Timothy Booth
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Sample reads originating from a subsample of RNA-seq data from E-MTAB-4044, including yeast cDNA fasta file.

    For use in the Carpentries Incubator course: https://carpentries-incubator.github.io/snakemake-novice-bioinformatics/setup.html

  16. Bioinformatics Links Directory

    • rrid.site
    • scicrunch.org
    • +3more
    Updated Jul 19, 2025
    Cite
    (2025). Bioinformatics Links Directory [Dataset]. http://identifiers.org/RRID:SCR_008018/resolver
    Dataset updated
    Jul 19, 2025
    Description

    Database of curated links to molecular resources, tools and databases selected on the basis of recommendations from bioinformatics experts in the field. This resource relies on input from its community of bioinformatics users for suggestions. Since 2003, it has also listed all links contained in the NAR Webserver issue. The different types of information available in this portal:
    * Computer Related: This category contains links to resources relating to programming languages often used in bioinformatics. Other tools of the trade, such as web development and database resources, are also included here.
    * Sequence Comparison: Tools and resources for the comparison of sequences including sequence similarity searching, alignment tools, and general comparative genomics resources.
    * DNA: This category contains links to useful resources for DNA sequence analyses such as tools for comparative sequence analysis and sequence assembly. Links to programs for sequence manipulation, primer design, and sequence retrieval and submission are also listed here.
    * Education: Links to information about the techniques, materials, people, places, and events of the greater bioinformatics community. Included are current news headlines, literature sources, educational material and links to bioinformatics courses and workshops.
    * Expression: Links to tools for predicting the expression, alternative splicing, and regulation of a gene sequence are found here. This section also contains links to databases, methods, and analysis tools for protein expression, SAGE, EST, and microarray data.
    * Human Genome: This section contains links to draft annotations of the human genome in addition to resources for sequence polymorphisms and genomics. Also included are links related to ethical discussions surrounding the study of the human genome.
    * Literature: Links to resources related to published literature, including tools to search for articles and through literature abstracts. Additional text mining resources, open access resources, and literature goldmines are also listed.
    * Model Organisms: Included in this category are links to resources for various model organisms ranging from mammals to microbes. These include databases and tools for genome scale analyses.
    * Other Molecules: Bioinformatics tools related to molecules other than DNA, RNA, and protein. This category will include resources for the bioinformatics of small molecules as well as for other biopolymers including carbohydrates and metabolites.
    * Protein: This category contains links to useful resources for protein sequence and structure analyses. Resources for phylogenetic analyses, prediction of protein features, and analyses of interactions are also found here.
    * RNA: Resources include links to sequence retrieval programs, structure prediction and visualization tools, motif search programs, and information on various functional RNAs.

  17. Supplementary Files for thesis titled "Visual-analytics-driven...

    • researchdata.edu.au
    Updated 2022
    Cite
    Kaur Sandeep (2022). Supplementary Files for thesis titled "Visual-analytics-driven bioinformatics methods for the analysis of biomolecular data" [Dataset]. https://researchdata.edu.au/supplementary-files-thesis-biomolecular-data/2089386
    Explore at:
    Dataset updated
    2022
    Dataset provided by
    UNSW, Sydney
    University of New South Wales
    Authors
    Kaur Sandeep
    License

    https://www.gnu.org/licenses/gpl-3.0.en.html

    Description

    This data set provides the supplementary files referenced in the thesis titled "Visual-analytics-driven bioinformatics methods for the analysis of biomolecular data".

    In particular, this data set consists of the following files (details are also provided in an included README.txt file):

    1. Supplementary File 4.1. Supplementary File 4.1 - URL and variants schema.pdf. Graphical Backus-Naur schema of the variant syntax recognized by Aquaria.
    2. Supplementary File 4.2. Supplementary File 4.2 - Schema.json. Aquaria feature set schema. This schema can be utilized in conjunction with user-specified JSON files for validation in online tools such as https://www.jsonschemavalidator.net/ (see Section 4.5.5).
    3. Supplementary File 6.1. Supplementary File 6.1 - Illumina and complete genome IDs.xlsx. NCBI SRA accession identifiers of 673 Illumina (short-read length) and 673 PacBio sequenced genomes (long-read length), corresponding to 673 isolates sequenced using two technologies.
    4. Supplementary File 6.2. Supplementary File 6.2 - Distribution of IS in complete genomes.xlsx. ISs in complete genomes.
    5. Supplementary File 6.3. Supplementary File 6.3 - QUAST analysis of assemblies.xlsx. Summary of SPAdes and SKESA assembly quality statistics, generated using QUAST.
    6. Supplementary File 6.4. Supplementary File 6.4 - WiIS performance metrics.xlsx. WiIS performance metrics for each genome.
    7. Supplementary File 6.5. Supplementary File 6.5 - Correlation of performance metrics and assembly statistics.xlsx. Correlation of WiIS performance metrics with SPAdes and SKESA assembly quality statistics.
    8. Supplementary File 6.6. Supplementary File 6.6 - IS insertions found by all tools.xlsx. IS insertions found by all tools for each of the 673 short-read sequenced genomes.
    9. Supplementary File 6.7. Supplementary File 6.7 - IS insertions found by all tools (20 base pair distance threshold).xlsx. IS insertions found by all tools, with a buffer length of 20 base pairs, for each of the 673 short-read sequenced genomes.
    10. Supplementary File 6.8. Supplementary File 6.8 - WiIS SPAdes IS insertions found with respect.xlsx. Summary of IS insertions found by WiIS (SPAdes) with respect to Tohama I (including the counts of insertions identified by WiIS, but not in Tohama I).
    11. Supplementary File 6.9. Wiis.zip. WiIS code.
  18. metabarcoding data for: Benchmark of bioinformatics tools for fast and...

    • search.dataone.org
    • datadryad.org
    Updated May 17, 2025
    Laetitia Mathon (2025). metabarcoding data for: Benchmark of bioinformatics tools for fast and accurate species identification from environmental DNA metabarcoding [Dataset]. http://doi.org/10.5061/dryad.15dv41nx6
    Dataset updated
    May 17, 2025
    Dataset provided by
    Dryad Digital Repository
    Authors
    Laetitia Mathon
    Time period covered
    Jan 1, 2021
    Description

    This dataset contains fish DNA sequence samples, simulated with Grinder to build a mock community, as well as real fish eDNA metabarcoding data from the Mediterranean Sea.

    These data have been used to compare the efficiency of different bioinformatic tools in retrieving the species composition of real and simulated samples.

  19. Data for for Detecting cell-of-origin and cancer-specific methylation...

    • explore.openaire.eu
    • data.niaid.nih.gov
    • +1 more
    Updated Apr 11, 2022
    Efrat Katsman; Shari Orlanski; Filippo Martignano; Silvestro G Conticello; Benjamin P Berman (2022). Data for for Detecting cell-of-origin and cancer-specific methylation features of cell-free DNA from Nanopore sequencing [Dataset]. http://doi.org/10.5281/zenodo.6448475
    Dataset updated
    Apr 11, 2022
    Authors
    Efrat Katsman; Shari Orlanski; Filippo Martignano; Silvestro G Conticello; Benjamin P Berman
    Description

    This is an updated version of the earlier dataset. We removed the directory "SegmentationResultsMartignano2021". These data were incorrect, and the correct version has been moved to the source code tree at 10.5281/zenodo.6641763. We also added a new set of files called doubleBarcodeIds, which contains the IDs of all reads with two barcodes (see Methods). These datasets accompany the paper https://doi.org/10.1101/2021.10.18.464684

  20. augmentation data for DAISM

    • data.mendeley.com
    • zenodo.org
    Updated Jun 22, 2022
    Yating Lin (2022). augmentation data for DAISM [Dataset]. http://doi.org/10.17632/ysjwjvpnh3.1
    Dataset updated
    Jun 22, 2022
    Authors
    Yating Lin
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The purified dataset for data augmentation for DAISM-DNNXMBD can be downloaded from this repository.

    The pbmc8k dataset downloaded from 10X Genomics was processed and used for data augmentation to create training datasets for DAISM-DNN models. pbmc8k.h5ad contains 5 cell types (B.cells, CD4.T.cells, CD8.T.cells, monocytic.lineage, NK.cells), and pbmc8k_fine.h5ad contains 7 cell types (naive.B.cells, memory.B.cells, naive.CD4.T.cells, memory.CD4.T.cells, naive.CD8.T.cells, memory.CD8.T.cells, regulatory.T.cells, monocytes, macrophages, myeloid.dendritic.cells, NK.cells).

    The RNA-seq dataset contains 5 cell types (B.cells, CD4.T.cells, CD8.T.cells, monocytic.lineage, NK.cells). Raw FASTQ reads were downloaded from the NCBI website, and transcript- and gene-level expression quantification was performed using Salmon (version 0.11.3) with Gencode v29 after quality control of the FASTQ reads using fastp. All tools were used with default parameters.
