Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Galaxy workflow from Galaxy 101 for everyone. This workflow is used in the training "How to reproduce published Galaxy analyses" to learn how to run a published Galaxy workflow.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Metavisitor is a software package that allows biologists and clinicians without specialized bioinformatics expertise to detect and assemble viral genomes from deep sequence datasets. The package is composed of a set of modular bioinformatic tools and workflows that are implemented in the Galaxy framework. Using the graphical Galaxy workflow editor, users with minimal computational skills can use existing Metavisitor workflows or adapt them to suit specific needs by adding or modifying analysis modules. Metavisitor works with DNA, RNA or small RNA sequencing data over a range of read lengths and can use a combination of de novo and guided approaches to assemble genomes from sequencing reads. We show that the software has the potential for quick diagnosis as well as discovery of viruses from a vast array of organisms. Importantly, we provide here executable Metavisitor use cases, which increase the accessibility and transparency of the software, ultimately enabling biologists or clinicians to focus on biological or medical questions.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Input dataset for Galaxy Training Material for the Analyze unaligned ncRNAs workflow.
See https://github.com/galaxyproject/training-material for more information.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
FASTA Feature Retriever and Tabular Feature Retriever are Galaxy workflows that retrieve features (such as genes) in FASTA or tabular format, using a .bed file and a genome .fasta file as input.
input data derived from:
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The table summarizes the Metavisitor report files available as S16 and S17 Files.
This record includes training materials associated with the Australian BioCommons workshop ‘Translating workflows into Nextflow with Janis’. This workshop took place online on 19 June 2023.
Event description
Bioinformatics workflows are critical for reproducibly transferring methodologies between research groups and for scaling between computational infrastructures. Research groups currently invest a lot of time and effort in creating and updating workflows; the ability to translate from one workflow language into another can make them easier to share and maintain with minimal effort. For example, research groups that would like to run an existing Galaxy workflow on HPC, or extend it for their use, might find translating the workflow to Nextflow more suitable for their ongoing use cases.
Janis is a framework that provides an abstraction layer for describing workflows, and a tool that can translate workflows between existing languages such as CWL, WDL, Galaxy and Nextflow. Janis aims to translate as much as it can, leaving the user to validate the workflow and make small manual adjustments where direct translations are not possible. Originating from the Portable Pipelines Project between Melbourne Bioinformatics, the Peter MacCallum Cancer Centre, and the Walter and Eliza Hall Institute of Medical Research, this tool is now available for everyone to use.
This workshop provides an introduction to Janis and how it can be used to translate Galaxy and CWL based tools and workflows into Nextflow. Using hands-on examples we’ll step you through the process and demonstrate how to optimise, troubleshoot and test the translated workflows.
This workshop event and accompanying materials were developed by Melbourne Bioinformatics and the Peter MacCallum Cancer Centre. The workshop was enabled through the Australian BioCommons - Bring Your Own Data Platforms project funded by the Australian Research Data Commons and NCRIS via Bioplatforms Australia.
Materials
Materials are shared under a Creative Commons Attribution 4.0 International agreement unless otherwise specified and were current at the time of the event.
Files and materials included in this record:
Event metadata (PDF): Information about the event including description, event URL, learning objectives, prerequisites, technical requirements, etc.
Index of training materials (PDF): List and description of all materials associated with this event including the name, format, location and a brief description of each file.
Intro to Galaxy (PDF): Slides presented during the workshop
Intro to CWL (PDF): Slides presented during the workshop
Intro to the session & Janis (PDF): Slides presented during the workshop
Janis_Schedule (PDF): Schedule for the workshop providing a breakdown of topics and timings
Materials shared elsewhere:
This workshop follows the accompanying training materials: https://www.melbournebioinformatics.org.au/tutorials/tutorials/janis_translate/janis_translate
A recording of the workshop is available on the Australian BioCommons YouTube channel: https://youtu.be/0IiY1GEx_BY
MIT License https://opensource.org/licenses/MIT
License information was derived automatically
This is the input dataset for the MFAssignR Galaxy training workflow. It corresponds to the MFAssignR model data (Raw_Neg_ML) and contains a raw mass list measured in negative ESI mode.
Attribution-NonCommercial 4.0 (CC BY-NC 4.0) https://creativecommons.org/licenses/by-nc/4.0/
License information was derived automatically
Proteogenomics combines large-scale genomic and transcriptomic data with mass-spectrometry-based proteomic data to discover novel protein sequence variants and improve genome annotation. In contrast with conventional proteomic applications, proteogenomic analysis requires a number of additional data processing steps. Ideally, these required steps would be integrated and automated via a single software platform offering accessibility for wet-bench researchers as well as flexibility for user-specific customization and integration of new software tools as they emerge. Toward this end, we have extended the Galaxy bioinformatics framework to facilitate proteogenomic analysis. Using analysis of whole human saliva as an example, we demonstrate Galaxy’s flexibility through the creation of a modular workflow incorporating both established and customized software tools that improve depth and quality of proteogenomic results. Our customized Galaxy-based software includes automated, batch-mode BLASTP searching and a Peptide Sequence Match Evaluator tool, both useful for evaluating the veracity of putative novel peptide identifications. Our complex workflow (approximately 140 steps) can be easily shared using built-in Galaxy functions, enabling its use and customization by others. Our results provide a blueprint for the establishment of the Galaxy framework as an ideal solution for the emerging field of proteogenomics.
CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
License information was derived automatically
Datasets for the Transformer-based tool recommender in Galaxy:
1. Tool popularity - Contains the monthly usage of all Galaxy tools over the past year (extracted from Galaxy Europe using the query https://github.com/galaxyproject/gxadmin/blob/main/docs/README.query.md#query-tool-popularity)
2. Workflow connections - Contains workflows as tabular files, each row being a pair of connected tools, IN and OUT (extracted from Galaxy Europe using the query https://github.com/galaxyproject/gxadmin/blob/main/docs/README.query.md#query-workflow-connections)
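As a rough illustration of how the workflow connections table might be consumed, the sketch below reads tool pairs from tab-separated text and maps each tool to the tools that follow it. The two-column layout (IN tool first, OUT tool second), the absence of a header row, and the sample tool names are all assumptions made for illustration; the actual files may be laid out differently.

```python
import csv
import io
from collections import defaultdict


def build_successors(tsv_text):
    """Map each IN tool to the sorted list of OUT tools connected to it.

    Assumes one connection per row: IN tool in column 1, OUT tool in column 2.
    """
    successors = defaultdict(set)
    for row in csv.reader(io.StringIO(tsv_text), delimiter="\t"):
        if len(row) < 2:
            continue  # skip blank or malformed lines
        in_tool, out_tool = row[0].strip(), row[1].strip()
        successors[in_tool].add(out_tool)
    return {tool: sorted(outs) for tool, outs in successors.items()}


# Hypothetical sample rows, not taken from the real dataset.
sample = "trimmomatic\tbowtie2\nbowtie2\tsamtools_sort\ntrimmomatic\tfastqc\n"
connections = build_successors(sample)
print(connections)
```

Collecting successors in a set removes repeated pairs, which is convenient if the same connection appears in many workflows; a counter would be the alternative if connection frequency matters.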
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
See Method section for a description of the columns.
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Workflows in WorkflowHub.
This record includes training materials associated with the Australian BioCommons webinar ‘Here’s one we prepared earlier: (re)creating bioinformatics methods and workflows with Galaxy Australia’. This webinar took place on 26 October 2022.
Event description
Have you discovered a brilliant bioinformatics workflow but you’re not quite sure how to use it? In this webinar we will introduce the power of Galaxy for construction and (re)use of reproducible workflows, whether building workflows from scratch, recreating them from published descriptions and/or extracting from Galaxy histories.
Using an established bioinformatics method, we’ll show you how to:
Use the workflow creator in Galaxy Australia
Build a workflow based on a published method
Annotate workflows so that you (and others) can understand them
Make workflows findable and citable (important and very easy to do!)
Materials are shared under a Creative Commons Attribution 4.0 International agreement unless otherwise specified and were current at the time of the event.
Files and materials included in this record:
Event metadata (PDF): Information about the event including description, event URL, learning objectives, prerequisites, technical requirements, etc.
Index of training materials (PDF): List and description of all materials associated with this event including the name, format, location and a brief description of each file.
GalaxyWorkflows_Slides (PDF): A PDF copy of the slides presented during the webinar.
Materials shared elsewhere:
A recording of this webinar is available on the Australian BioCommons YouTube Channel: https://youtu.be/IMkl6p7hkho
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The table summarizes the report generated by Metavisitor from a batch of 40 sequence datasets (S14 File). Metadata associated with each indicated sequence dataset as well as the ability of Metavisitor to detect HIV in datasets and patients are indicated.
Short read data from the exome of chromosome 22 of a single human individual. The dataset contains one million 76 bp reads, produced on an Illumina GAIIx from exome-enriched DNA. This data was generated as part of the 1000 Genomes project: http://www.1000genomes.org/. Please use the ‘fastqsanger’ file format.
Gene duplication is a major factor contributing to evolutionary novelty, and the contraction or expansion of gene families has often been associated with morphological, physiological, and environmental adaptations. The study of homologous genes helps us to understand the evolution of gene families. It plays a vital role in finding ancestral gene duplication events as well as identifying genes that have diverged from a common ancestor under positive selection. There are various tools available, such as MSOAR, OrthoMCL, and HomoloGene, to identify gene families and visualize syntenic information between species, providing an overview of the evolution of syntenic regions at the family level. Unfortunately, none of them provides information about structural changes within genes, such as the conservation of ancestral exon boundaries among multiple genomes. The Ensembl GeneTrees computational pipeline generates gene trees based on coding sequences, provides details about exon conservation, and is used in the Ensembl Compara project to discover gene families. A certain amount of expertise is required to configure and run the Ensembl Compara GeneTrees pipeline via the command line. Therefore, we converted this pipeline into a Galaxy workflow, called GeneSeqToFamily, and provided additional functionality. This workflow uses existing tools from the Galaxy ToolShed, as well as providing additional wrappers and tools that are required to run the workflow. GeneSeqToFamily represents the Ensembl GeneTrees pipeline as a set of interconnected Galaxy tools, so they can be run interactively within Galaxy's user-friendly workflow environment while still providing the flexibility to tailor the analysis by changing configurations and tools if necessary. Additional tools allow users to subsequently visualize the gene families produced by the workflow, using the Aequatus.js interactive tool, which has been developed as part of the Aequatus software project.
https://spdx.org/licenses/etalab-2.0.html
Main publication
Poll report and form on HAL

Authors
The raw data was generated by the poll respondents.
The authors of this dataset, excluding Vlad Visan, are such respondents. There are also other respondents who chose to remain anonymous.
The script was written by Vlad Visan.
The raw format was adapted to a numerical format by Vlad Visan.

Overall description
A poll took place in February 2024 to understand the administrative burden of using Galaxy, specifically for small-scale admins.

Context
Useful to anyone considering using Galaxy.
Done as part of the technology monitoring phase of the "Gestionnaire de workflows" (Workflow Management System) project of the OSUG LabEx.

File descriptions
raw_data_names_removed.tsv
Raw poll answers, with any personally identifiable information redacted.
SSA-Poll-19-Feb-2024-Filtered-Numerical.tab
This numerically filtered format is required by the script.
The transformation could be done automatically in the future, but there are some subtleties:
"-1" denotes "ignore/invalid".
Some empty answers have to be manually converted to "0".
I manually changed one answer that was "0" to "-1" after reading the associated comment, which made it clear that "invalid" was more appropriate.
numericalCsvImportAndGenerateCharts.R
The script parses the data and creates one distribution/histogram graph per column.
It expects a filtered version, with only the numerical fields.
Form-V2.pdf
Survey questions, with several errors corrected:
End-user assistance questions were worded wrongly.
Various spelling/wording mistakes.
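The "-1" convention described for the numerical file can be illustrated with a small sketch. This is not the R script shipped with the dataset; it is an independent Python illustration, and the sample answers and function name are hypothetical. It counts how often each valid answer occurs in one column, which is the information a per-column histogram would display.

```python
from collections import Counter

INVALID = -1  # per the file description, "-1" denotes "ignore/invalid"


def column_distribution(values):
    """Count occurrences of each answer, skipping entries marked invalid."""
    return Counter(v for v in values if v != INVALID)


# Hypothetical answers for a single poll question, not real poll data.
answers = [0, 2, -1, 2, 5, 0, 2]
dist = column_distribution(answers)
for answer, count in sorted(dist.items()):
    print(answer, count)
```

Filtering before counting keeps invalid responses out of the distribution while leaving genuine "0" answers in place, which is why the distinction between "0" and "-1" in the filtered file matters.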
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Test Data for Galaxy tutorial "Clustering 3k PBMCs with Seurat" - SCTransform workflow
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
These files are the inputs for the workflow "Workflow constructed from history 'ISOCOR_TRACEGROOMER_DIMET'", published on the W4M Galaxy instance: https://workflow4metabolomics.usegalaxy.fr/workflows/list_published
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
The data in this table were extracted from the Metavisitor report file available as S15 File. Values in the column “Coverage of complete viral genome (%)” are the fractions (in %) of the complete viral genomes that are covered by blast hits of viral contigs to these genomes, and values in the column “Mean blast bit score” are the mean bit scores observed for these blast hits. Note that blast alignments to incomplete viral genomes were not taken into account. For detection of false positives, reads were aligned to the bowtie2 vir1 index before de novo assembly, and counts of these reads are reported in the column “Read mapping to vir1 using bowtie2”.
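To make the two column definitions concrete, here is a minimal sketch of how such values could be computed from a list of blast hits. How Metavisitor actually derives them is not stated here, so the approach (merging overlapping hit intervals on the genome, then dividing the covered length by the genome length) is an assumption, as are the function names, the 1-based inclusive coordinates, and the sample numbers.

```python
def coverage_percent(hits, genome_length):
    """Percentage of genome positions covered by at least one hit.

    hits: (start, end) pairs on the genome, 1-based inclusive (assumed).
    Overlapping or adjacent hits are merged so positions are counted once.
    """
    covered = 0
    current_start = current_end = None
    # Normalise each pair so start <= end (hits may be on the reverse strand).
    for start, end in sorted((min(s, e), max(s, e)) for s, e in hits):
        if current_end is None or start > current_end + 1:
            if current_end is not None:
                covered += current_end - current_start + 1
            current_start, current_end = start, end
        else:
            current_end = max(current_end, end)
    if current_end is not None:
        covered += current_end - current_start + 1
    return 100.0 * covered / genome_length


def mean_bit_score(scores):
    """Arithmetic mean of the bit scores of the hits."""
    return sum(scores) / len(scores)


# Hypothetical hits on a hypothetical 1000-nt genome.
hits = [(1, 100), (51, 150), (301, 400)]
print(coverage_percent(hits, 1000))         # 250 covered positions -> 25.0
print(mean_bit_score([180.0, 95.5, 120.5]))
```

Merging intervals before summing matters because contigs often overlap on the reference; adding raw hit lengths would overstate coverage and could exceed 100%.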
BED file containing the list of SNPs and INDELs in the chr22 exome.