Current versions of all published workflows can be accessed at https://cpt.tamu.edu/galaxy-pub/workflows/list_published. (XLSX)

Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The table summarizes the report generated by Metavisitor from a batch of 40 sequence datasets (S14 File). Metadata associated with each indicated sequence dataset as well as the ability of Metavisitor to detect HIV in datasets and patients are indicated.

Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This dataset is an RO-Crate representation of an execution of an example Galaxy workflow that makes use of some of Galaxy's platform-specific features. It follows the Workflow Run Crate profile. The workflow was run with Galaxy version 23.0 and exported using Galaxy's built-in feature for exporting a workflow invocation as an RO-Crate.
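As a rough illustration of how such a crate can be inspected, the sketch below assumes the exported archive has already been unpacked into a local directory. It relies only on the RO-Crate convention that metadata is stored in a file named `ro-crate-metadata.json` as JSON-LD with a top-level `@graph` array. The function name `summarize_crate` and the directory passed to it are hypothetical and not part of the dataset.

```python
import json
from pathlib import Path


def summarize_crate(crate_dir):
    """Count the entities in an unpacked RO-Crate, grouped by @type.

    `crate_dir` is a hypothetical local path to the unpacked crate.
    """
    metadata_path = Path(crate_dir) / "ro-crate-metadata.json"
    with metadata_path.open(encoding="utf-8") as handle:
        graph = json.load(handle)["@graph"]

    counts = {}
    for entity in graph:
        types = entity.get("@type", [])
        # @type may be a single string or a list of strings.
        if isinstance(types, str):
            types = [types]
        for type_name in types:
            counts[type_name] = counts.get(type_name, 0) + 1
    return counts
```

Calling `summarize_crate("path/to/unpacked-crate")` returns a dictionary mapping each entity type found in the crate to the number of entities of that type, which gives a quick overview before looking at individual files.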

Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
This replication package contains all materials used to evaluate how Large Language Models (LLMs) support scientific workflow development in Galaxy and Nextflow. It includes the full set of prompts, LLM responses, and generated workflows analyzed in the study. The package provides six PDF files. The first two cover (1) LLMs’ understanding of fundamental scientific workflow and workflow-system concepts, and (2) their domain knowledge of the Galaxy and Nextflow platforms, including architecture, key features, and reproducibility mechanisms. Two more contain workflow-specific background questions for both systems, covering domain tasks such as SNP-rich exon detection, peak-to-gene association, methylation analysis, and QC pipelines.
The package further provides the complete workflows generated by GPT-4o, Gemini 2.5 Flash, and DeepSeek-V3 for a set of benchmark tasks, detailing tool selections, execution steps, file transformations, and workflow structure. Together, these artifacts enable full transparency and reproducibility of our multi-dimensional assessment of LLMs’ conceptual reasoning, domain understanding, and workflow-generation capabilities across two major scientific workflow systems.
The first two files provide foundational insights. The first file, Table-2 Fundamental_Concepts_Of_Scientific_Workflow_and_SWS, includes LLM-generated responses to conceptual questions about scientific workflows and workflow systems, evaluating the understanding of GPT-4o, Gemini 2.5 Flash, and DeepSeek-V3. The second file, Table-3 LLMs Understanding of Galaxy and Nextflow, further explores LLMs’ domain-specific knowledge by addressing background questions about the Galaxy and Nextflow platforms, including their architecture, tools, reproducibility, and key features such as Galaxy’s ToolShed or Nextflow’s DSL concepts and nf-core integration.
The next two files, Table-4 and Table-5, contain workflow-specific background questions designed to assess LLM comprehension of domain-specific tasks within Galaxy and Nextflow, respectively. These include tasks such as identifying SNP-rich exons, associating peaks with genes, or understanding methylation data processing. The final two files, LLMs Generated workflows using Galaxy Workflow System and LLMs generated workflows using Nextflow Workflow System, contain the actual workflows generated by LLMs in response to structured prompts. Each file presents detailed, step-by-step workflows for different tasks, comparing how each LLM structures, sequences, and explains the analyses using real-world tools and formats (e.g., FastQC, BEDTools, MultiQC). Together, these documents form a multi-dimensional assessment of LLMs’ capability in generating, reasoning about, and structuring scientific workflows.

Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Input dataset for Galaxy Training Material for the Analyze unaligned ncRNAs workflow.
See https://github.com/galaxyproject/training-material for more information.

The way we process and analyze catalysis research data is undergoing a revolution. Galaxy, the open-source platform, transforms complex data processing and analysis into a seamless, user-friendly experience.
Ever wished for a time machine in your research? Galaxy's workflow tools allow you to recreate and share your analyses with ease, ensuring reproducibility and transparency in your catalysis studies.
How do you navigate Galaxy for catalysis-related research? Dr. Abraham Nieva de la Hidalga from the UK Catalysis Hub will answer some of your questions on this topic. This video is part of a series from the Flash Pitch Event that took place at the Annual Digital Catalysis & Catalysis-Related Sciences Conference (ADCR23) on 3 November 2023.
More information about the presentation at: https://zenodo.org/records/10172120
Stay tuned for more exciting content, and thank you for being a part of our growing community!
Check out our website: https://nfdi4cat.org/
Follow us:
https://in.linkedin.com/company/nfdi4cat

Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Implementation of genomic variant calling as installable Galaxy workflows using NGS data. The repository contains two separate sets of simulated Ebola test data: one for SNP and INDEL calling and another for structural variant calling.

Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Identifiers in the GTN correspond to training materials in various formats (markdown, slides, video). Users can apply the concepts they learn directly within the framework via Galaxy workflows.

Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The table compares SoS with several popular bioinformatics workflow systems, including Nextflow, Snakemake, Bpipe, CWL, and Galaxy, in three broad aspects: 1) basic features (syntax, file format, user interface, etc.), 2) workflow features (workflow specification, dependency handling, execution and monitoring, etc.), and 3) built-in support for external tools and services (container support, HPC systems, distributed systems and cloud support). It is a snapshot of an interactive table online at https://vatlab.github.io/blog/post/comparison where comments and potential contributions from the community can be continuously incorporated through GitHub issues or pull requests. (XLSX)

CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
The files included here are a set of Galaxy workflows, starting structure files (PDB, mol2, and frcmod), and specialized force field files (ZAFF) for the simulation of coronavirus helicases in the apo and drug-bound state. The inhibitor molecules include those from virtual screening (FCID1 and thioguanine), as well as experimentally validated candidates (Lumacaftor and SSYA10-001).

Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Test Data for Galaxy tutorial "Clustering 3k PBMCs with Seurat" - SCTransform workflow

Gene duplication is a major factor contributing to evolutionary novelty, and the contraction or expansion of gene families has often been associated with morphological, physiological, and environmental adaptations. The study of homologous genes helps us to understand the evolution of gene families. It plays a vital role in finding ancestral gene duplication events as well as identifying genes that have diverged from a common ancestor under positive selection. There are various tools available, such as MSOAR, OrthoMCL, and HomoloGene, to identify gene families and visualize syntenic information between species, providing an overview of the evolution of syntenic regions at the family level. Unfortunately, none of them provide information about structural changes within genes, such as the conservation of ancestral exon boundaries among multiple genomes. The Ensembl GeneTrees computational pipeline generates gene trees based on coding sequences, provides details about exon conservation, and is used in the Ensembl Compara project to discover gene families. A certain amount of expertise is required to configure and run the Ensembl Compara GeneTrees pipeline via the command line. Therefore, we converted this pipeline into a Galaxy workflow, called GeneSeqToFamily, and provided additional functionality. This workflow uses existing tools from the Galaxy ToolShed, as well as providing additional wrappers and tools that are required to run the workflow. GeneSeqToFamily represents the Ensembl GeneTrees pipeline as a set of interconnected Galaxy tools, so they can be run interactively within Galaxy's user-friendly workflow environment while still providing the flexibility to tailor the analysis by changing configurations and tools if necessary. Additional tools allow users to subsequently visualize the gene families produced by the workflow, using the Aequatus.js interactive tool, which has been developed as part of the Aequatus software project.

Metaproteomics characterizes proteins expressed by microorganism communities (microbiome) present in environmental samples or a host organism (e.g. human), revealing insights into the molecular functions conferred by these communities. Compared to conventional proteomics, metaproteomics presents unique data analysis challenges, including the use of large protein databases derived from hundreds of organisms, as well as numerous processing steps to ensure data quality. This data analysis complexity limits the use of metaproteomics for many researchers. In response, we have developed an accessible and flexible metaproteomics workflow within the Galaxy bioinformatics framework. Via analysis of human oral tissue exudate samples, we have established a modular Galaxy-based workflow that automates a reduction method for searching large sequence databases, enabling comprehensive identification of host proteins (human) as well as meta-proteins from the non-host organisms. Downstream, automated processing steps enable BLASTP analysis and evaluation/visualization of peptide sequence match quality, maximizing confidence in results. Output results are compatible with tools for taxonomic and functional characterization (e.g. Unipept, MEGAN5). Galaxy also allows for the sharing of complete workflows with others, promoting reproducibility and also providing a template for further modification and improvement. Our results provide a blueprint for establishing Galaxy as a solution for metaproteomic data analysis.

This record includes training materials associated with the Australian BioCommons webinar ‘Here’s one we prepared earlier: (re)creating bioinformatics methods and workflows with Galaxy Australia’. This webinar took place on 26 October 2022.
Event description
Have you discovered a brilliant bioinformatics workflow but you’re not quite sure how to use it? In this webinar we will introduce the power of Galaxy for construction and (re)use of reproducible workflows, whether building workflows from scratch, recreating them from published descriptions and/or extracting from Galaxy histories.
Using an established bioinformatics method, we’ll show you how to:
Use the workflows creator in Galaxy Australia
Build a workflow based on a published method
Annotate workflows so that you (and others) can understand them
Make workflows findable and citable (important and very easy to do!)
Materials are shared under a Creative Commons Attribution 4.0 International agreement unless otherwise specified and were current at the time of the event.
Files and materials included in this record:
Event metadata (PDF): Information about the event including description, event URL, learning objectives, prerequisites, technical requirements etc.
Index of training materials (PDF): List and description of all materials associated with this event including the name, format, location and a brief description of each file.
GalaxyWorkflows_Slides (PDF): A PDF copy of the slides presented during the webinar.
Materials shared elsewhere:
A recording of this webinar is available on the Australian BioCommons YouTube Channel: https://youtu.be/IMkl6p7hkho

The input data of this tutorial is from an RNA-seq experiment looking for differentially expressed genes in D. melanogaster (fruit fly) between two experimental conditions. Please use the ‘fastqsanger’ File Format.

Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
A scientific workflow describes a process for accomplishing a scientific objective, usually expressed in terms of tasks and their dependencies. We collected publicly available workflows from the Galaxy Main Server and tried to reuse them. This dataset contains the workflows we collected.

https://creativecommons.org/licenses/zero/1.0
This is a multiple regression analysis workflow designed to predict algal bloom risk in the Baltic Sea based on oceanographic and nutrient data. The workflow combines data preprocessing, statistical modeling, and spatial visualization to assess water quality at bathing sites.

https://www.gnu.org/licenses/gpl-3.0-standalone.html
This dataset is associated with the Galaxy workflow "Cloud-Aerosole MT-MG Pre-Processing"