70 datasets found
  1. Sample datasets for Galaxy NGS tutorial

    • zenodo.org
    application/gzip
    Updated Jan 24, 2020
    Cite
    Anton Nekrutenko (2020). Sample datasets for Galaxy NGS tutorial [Dataset]. http://doi.org/10.5281/zenodo.583613
    Available download formats: application/gzip
    Dataset updated
    Jan 24, 2020
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Anton Nekrutenko
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Datasets in fastqsanger.gz format representing re-sequencing of human mitochondria

  2. Software and supporting data for Colib'read on Galaxy.

    • explore.openaire.eu
    Updated Jan 1, 2016
    Cite
    Yvan Le Bras; Olivier Collin; Cyril Monjeaud; Vincent Lacroix; Eric Rivals; Claire Lemaitre; Vincent Miele; Gustavo Sacomoto; Camille Marchet; Bastien Cazaux; Amal Zine El Aabidine; Leena Salmela; Susete Alves-Carvalho; Alexan Andrieux; Raluca Uricaru; Pierre Peterlongo (2016). Software and supporting data for Colib'read on Galaxy. [Dataset]. http://doi.org/10.5524/100170
    Dataset updated
    Jan 1, 2016
    Authors
    Yvan Le Bras; Olivier Collin; Cyril Monjeaud; Vincent Lacroix; Eric Rivals; Claire Lemaitre; Vincent Miele; Gustavo Sacomoto; Camille Marchet; Bastien Cazaux; Amal Zine El Aabidine; Leena Salmela; Susete Alves-Carvalho; Alexan Andrieux; Raluca Uricaru; Pierre Peterlongo
    Description

    With NGS technologies, the life sciences face a deluge of raw data. Classical analysis of such data often begins with an assembly step, which needs large amounts of computing resources and may remove or modify parts of the biological information contained in the data. Our approach focuses directly on biological questions by considering raw, unassembled NGS data through a suite of six command-line tools. Dedicated to assembly-free whole-genome analyses, the Colib'read tool suite uses optimized algorithms for various analyses of NGS datasets, such as variant calling or read set comparisons. Because they are based on de Bruijn graphs and Bloom filters, such analyses can be performed in a few hours using small amounts of memory. Applications to real data demonstrate the good accuracy of these tools compared to classical approaches. To facilitate data analysis and tool dissemination, we developed Galaxy tools and tool shed repositories. With the Colib'read Galaxy tool suite, we enable a broad range of life scientists to analyze raw NGS data. More importantly, our approach retains as much of the biological information in the data as possible while keeping a very low memory footprint.

  3. Training data for 'Genome annotation with Maker' tutorial (Galaxy Training...

    • zenodo.org
    • explore.openaire.eu
    bin
    Updated Dec 31, 2020
    Cite
    Anthony Bretaudeau (2020). Training data for 'Genome annotation with Maker' tutorial (Galaxy Training Material) [Dataset]. http://doi.org/10.5281/zenodo.1402567
    Available download formats: bin
    Dataset updated
    Dec 31, 2020
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Anthony Bretaudeau
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The data provided here are part of a Galaxy Training Network tutorial for genome annotation with Maker.

    It is based on data used in another Maker tutorial.

    The full genome was downloaded from NCBI, and the mitochondrial sequence was removed from it for simplicity.

  4. SARS-CoV-2 genomics resources for Galaxy

    • zenodo.org
    bin, tsv
    Updated Jan 22, 2022
    Cite
    Wolfgang Maier (2022). SARS-CoV-2 genomics resources for Galaxy [Dataset]. http://doi.org/10.5281/zenodo.5888324
    Available download formats: bin, tsv
    Dataset updated
    Jan 22, 2022
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Wolfgang Maier
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Reference and custom annotation data expected as input by Galaxy SARS-CoV-2 variation analysis workflows developed by covid19.galaxyproject.org

  5. Flexible and Accessible Workflows for Improved Proteogenomic Analysis Using...

    • acs.figshare.com
    xlsx
    Updated Jun 2, 2023
    Cite
    Pratik D. Jagtap; James E. Johnson; Getiria Onsongo; Fredrik W. Sadler; Kevin Murray; Yuanbo Wang; Gloria M. Sheynkman; Sricharan Bandhakavi; Lloyd M. Smith; Timothy J. Griffin (2023). Flexible and Accessible Workflows for Improved Proteogenomic Analysis Using the Galaxy Framework [Dataset]. http://doi.org/10.1021/pr500812t.s007
    Available download formats: xlsx
    Dataset updated
    Jun 2, 2023
    Dataset provided by
    ACS Publications
    Authors
    Pratik D. Jagtap; James E. Johnson; Getiria Onsongo; Fredrik W. Sadler; Kevin Murray; Yuanbo Wang; Gloria M. Sheynkman; Sricharan Bandhakavi; Lloyd M. Smith; Timothy J. Griffin
    License

    Attribution-NonCommercial 4.0 (CC BY-NC 4.0) https://creativecommons.org/licenses/by-nc/4.0/
    License information was derived automatically

    Description

    Proteogenomics combines large-scale genomic and transcriptomic data with mass-spectrometry-based proteomic data to discover novel protein sequence variants and improve genome annotation. In contrast with conventional proteomic applications, proteogenomic analysis requires a number of additional data processing steps. Ideally, these required steps would be integrated and automated via a single software platform offering accessibility for wet-bench researchers as well as flexibility for user-specific customization and integration of new software tools as they emerge. Toward this end, we have extended the Galaxy bioinformatics framework to facilitate proteogenomic analysis. Using analysis of whole human saliva as an example, we demonstrate Galaxy’s flexibility through the creation of a modular workflow incorporating both established and customized software tools that improve depth and quality of proteogenomic results. Our customized Galaxy-based software includes automated, batch-mode BLASTP searching and a Peptide Sequence Match Evaluator tool, both useful for evaluating the veracity of putative novel peptide identifications. Our complex workflow (approximately 140 steps) can be easily shared using built-in Galaxy functions, enabling its use and customization by others. Our results provide a blueprint for the establishment of the Galaxy framework as an ideal solution for the emerging field of proteogenomics.

  6. Training material for the course "Exome analysis with GALAXY"

    • zenodo.org
    bin, txt, vcf
    Updated Jan 24, 2020
    Cite
    Paolo Uva; Gianmauro Cuccuru (2020). Training material for the course "Exome analysis with GALAXY" [Dataset]. http://doi.org/10.5281/zenodo.61377
    Available download formats: bin, txt, vcf
    Dataset updated
    Jan 24, 2020
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Paolo Uva; Gianmauro Cuccuru
    License

    CC0 1.0 Universal Public Domain Dedication https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    Galaxy is an open-source, web-based platform for data-intensive biomedical research. It makes bioinformatics applications accessible to users who lack programming skills, enabling them to easily build analysis workflows for NGS data.

    The course "Exome analysis using Galaxy" is aimed at PhD students, biologists, clinicians and researchers who are analysing, or need to analyse in the near future, high-throughput exome sequencing data. The aim of the course is to familiarise participants with the Galaxy platform and prepare them to work independently, using state-of-the-art tools for the analysis of exome sequencing data.

    The course will be delivered using a mixture of lectures and computer-based hands-on practical sessions. Lectures will provide an up-to-date overview of the strategies for the analysis of next-generation exome sequencing experiments, starting from the raw sequence data. Analyses include sequence quality control, alignment to a reference genome, refinement of aligned sequences, variant calling, annotation and interpretation, and tools for visual inspection of results. Participants will apply the knowledge gained during the course to the analysis of real Illumina exome datasets, and implement workflows to reproduce the complete analysis. After the course, participants will be able to create pipelines for their individual analyses.

    These are the datasets needed for this course.

  7. HIV detection in ILC patient samples of Use Case 3–1.

    • figshare.com
    • plos.figshare.com
    xls
    Updated Jun 1, 2023
    Cite
    Guillaume Carissimo; Marius van den Beek; Kenneth D. Vernick; Christophe Antoniewski (2023). HIV detection in ILC patient samples of Use Case 3–1. [Dataset]. http://doi.org/10.1371/journal.pone.0168397.t002
    Available download formats: xls
    Dataset updated
    Jun 1, 2023
    Dataset provided by
    PLOS (http://plos.org/)
    Authors
    Guillaume Carissimo; Marius van den Beek; Kenneth D. Vernick; Christophe Antoniewski
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The table summarizes the report generated by Metavisitor from a batch of 40 sequence datasets (S14 File). Metadata associated with each indicated sequence dataset as well as the ability of Metavisitor to detect HIV in datasets and patients are indicated.

  8. Training data for 'Genome annotation with Funannotate' tutorial (Galaxy...

    • data.niaid.nih.gov
    Updated Apr 26, 2023
    Cite
    Anthony Bretaudeau; Alexandre Cormier; Stéphanie Robin; Erwan Corre; Laura Leroi (2023). Training data for 'Genome annotation with Funannotate' tutorial (Galaxy Training Material) [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_5653162
    Dataset updated
    Apr 26, 2023
    Dataset provided by
    CNRS
    Ifremer
    INRAE
    Authors
    Anthony Bretaudeau; Alexandre Cormier; Stéphanie Robin; Erwan Corre; Laura Leroi
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The data provided here are part of a Galaxy Training Network tutorial for genome annotation with funannotate.

    Genome was assembled following the GTN Flye assembly tutorial, then masked with RepeatMasker.

    RNASeq data: SRR8534859 reads were mapped to the genome using STAR (toolshed.g2.bx.psu.edu/repos/iuc/rgrnastar/rna_star/2.7.8a+galaxy0), then the bam was downsampled (10% with toolshed.g2.bx.psu.edu/repos/devteam/picard/picard_DownsampleSam/2.18.2.1) to reduce the size of the dataset. Fastq files were then extracted from the resulting bam file (toolshed.g2.bx.psu.edu/repos/devteam/picard/picard_SamToFastq/2.18.2.1).

    SwissProt_subset.fasta is a subset of SwissProt proteins that are known to have some similarity with the genome (found using Diamond against the genome, then extracting sequences matching with e-value < 0.0001).
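
    The e-value filter mentioned in this description can be sketched as follows. This is only an illustration under stated assumptions, not the dataset authors' actual procedure: it assumes the Diamond hits were saved in the 12-column BLAST-style tabular layout, where the 11th column is the e-value. The function name, the choice of which column holds the protein identifier, and the sample rows are all hypothetical.

```python
import csv
import io

# 0-based index of the e-value in 12-column BLAST-style tabular output
# (assumed layout; adjust if the hits were written with custom columns).
EVALUE_COLUMN = 10


def ids_below_evalue(tabular_text, id_column=1, max_evalue=1e-4):
    """Return the identifiers of hits whose e-value is strictly below max_evalue."""
    kept = set()
    for row in csv.reader(io.StringIO(tabular_text), delimiter="\t"):
        if len(row) <= EVALUE_COLUMN:
            continue  # skip blank or truncated lines
        if float(row[EVALUE_COLUMN]) < max_evalue:
            kept.add(row[id_column])
    return kept


# Hypothetical hits. Columns: query, subject, pident, length, mismatch,
# gapopen, qstart, qend, sstart, send, evalue, bitscore.
example_hits = "\n".join([
    "contig_1\tP12345\t85.0\t120\t18\t0\t1\t360\t1\t120\t1e-50\t200",
    "contig_1\tP67890\t30.0\t40\t28\t0\t400\t520\t5\t44\t0.5\t25",
    "contig_2\tQ11111\t45.0\t90\t49\t1\t10\t280\t3\t92\t9e-5\t48",
    "contig_2\tQ22222\t41.0\t80\t47\t1\t300\t540\t7\t86\t1e-4\t45",
])

selected = ids_below_evalue(example_hits)
print(sorted(selected))
```

    The comparison is strict, so a hit with an e-value of exactly 0.0001 is excluded, matching the "< 0.0001" wording. The resulting identifiers could then be used to pull the matching records out of a FASTA file.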

  9. Supporting materials for: "VirAmp: A Galaxy-based viral genome assembly...

    • explore.openaire.eu
    Updated Jan 1, 2015
    Cite
    Yinan Wan; Daniel W Renner; Istvan Albert; Moriah L Szpara (2015). Supporting materials for: "VirAmp: A Galaxy-based viral genome assembly pipeline". [Dataset]. http://doi.org/10.5524/100113
    Dataset updated
    Jan 1, 2015
    Authors
    Yinan Wan; Daniel W Renner; Istvan Albert; Moriah L Szpara
    Description

    We have developed a multi-step viral genome assembly pipeline named VirAmp, that combines existing tools and techniques and presents them to end users via a web-enabled Galaxy interface. Our pipeline allows users to assemble, analyze and interpret high coverage viral sequencing data with an ease and efficiency that was not possible previously. Our software makes a large number of genome assembly and related tools available to life scientists and automates the currently recommended best practices into a single, easy to use interface. We tested our pipeline with three different datasets from human herpes simplex virus (HSV). VirAmp provides a user-friendly interface and a complete pipeline for viral genome analysis. We make our software available via an Amazon Elastic Cloud disk image that can be easily launched by anyone with an Amazon web service account. A demonstration version of our system can be found at http://viramp.com. We also maintain detailed documentation on each tool and methodology at http://docs.viramp.com. Here in GigaDB you will find an archived version of the tools as they were published.

  10. Scientific Workflow Development Using Large Language Models

    • zenodo.org
    Updated Nov 25, 2025
    Cite
    Workflows Scientific (2025). Scientific Workflow Development Using Large Language Models [Dataset]. http://doi.org/10.5281/zenodo.16416384
    Dataset updated
    Nov 25, 2025
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Workflows Scientific
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    Jun 1, 2025
    Description

    This replication package contains all materials used to evaluate how Large Language Models (LLMs) support scientific workflow development in Galaxy and Nextflow. It includes the full set of prompts, LLM responses, and generated workflows analyzed in the study. The package provides six PDF files: (1) LLMs’ understanding of fundamental scientific workflow and workflow-system concepts, and (2) their domain knowledge of Galaxy and Nextflow platforms, including architecture, key features, and reproducibility mechanisms. It also includes workflow-specific background questions for both systems, covering domain tasks such as SNP-rich exon detection, peak-to-gene association, methylation analysis, and QC pipelines.

    The package further provides the complete workflows generated by GPT-4o, Gemini 2.5 Flash, and DeepSeek-V3 for a set of benchmark tasks, detailing tool selections, execution steps, file transformations, and workflow structure. Together, these artifacts enable full transparency and reproducibility of our multi-dimensional assessment of LLMs’ conceptual reasoning, domain understanding, and workflow-generation capabilities across two major scientific workflow systems.

    The first two files provide foundational insights. The first file, Table-2 Fundamental_Concepts_Of_Scientific_Workflow_and_SWS, includes LLM-generated responses to conceptual questions about scientific workflows and workflow systems, evaluating the understanding of GPT-4o, Gemini 2.5 Flash, and DeepSeek-V3. The second file, Table-3 LLMs Understanding of Galaxy and Nextflow, further explores LLMs’ domain-specific knowledge by addressing background questions about the Galaxy and Nextflow platforms, including their architecture, tools, reproducibility, and key features such as Galaxy’s ToolShed or Nextflow’s DSL concepts and nf-core integration.

    The next two files, Table-4 and Table-5, contain workflow-specific background questions designed to assess LLM comprehension of domain-level specific tasks within Galaxy and Nextflow, respectively. These include tasks such as identifying SNP-rich exons, associating peaks with genes, or understanding methylation data processing. The final two files, LLMs Generated workflows using Galaxy Workflow System and LLMs generated workflows using Nextflow Workflow System, showcase the actual workflows generated by LLMs in response to structured prompts. Each file presents detailed, step-by-step workflows for different tasks, comparing how each LLM structures, sequences, and explains the analyses using real-world tools and formats (e.g., FastQC, BEDTools, MultiQC). These documents together form a multi-dimensional assessment of LLMs’ capability in generating, reasoning about, and structuring scientific workflows.

  11. Training data for 'Genome annotation with Funannotate' tutorial (Galaxy...

    • zenodo.org
    application/gzip, bin
    Updated Apr 26, 2023
    Share
    FacebookFacebook
    TwitterTwitter
    Email
    Click to copy link
    Link copied
    Close
    Cite
    Anthony Bretaudeau; Alexandre Cormier; Stéphanie Robin; Erwan Corre; Laura Leroi (2023). Training data for 'Genome annotation with Funannotate' tutorial (Galaxy Training Material) [Dataset]. http://doi.org/10.5281/zenodo.5653163
    Available download formats: application/gzip, bin
    Dataset updated
    Apr 26, 2023
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Anthony Bretaudeau; Alexandre Cormier; Stéphanie Robin; Erwan Corre; Laura Leroi
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The data provided here are part of a Galaxy Training Network tutorial for genome annotation with funannotate.

    Genome was assembled following the GTN Flye assembly tutorial, then masked with RepeatMasker.

    RNASeq data: SRR12951075 and SRR8534859 reads were mapped to the genome using STAR (toolshed.g2.bx.psu.edu/repos/iuc/rgrnastar/rna_star/2.7.8a+galaxy0), then the BAM files were merged (toolshed.g2.bx.psu.edu/repos/devteam/picard/picard_MergeSamFiles/2.18.2.1) and downsampled (10% with toolshed.g2.bx.psu.edu/repos/devteam/picard/picard_DownsampleSam/2.18.2.1) to reduce the size of the dataset. FASTQ files were then extracted from the resulting BAM file (toolshed.g2.bx.psu.edu/repos/devteam/picard/picard_SamToFastq/2.18.2.1).

    SwissProt_subset.fasta is a subset of SwissProt proteins that are known to have some similarity with the genome (found using Diamond against the genome, then extracting sequences matching with e-value < 0.0001).

  12. Data for Galaxy training in assembly of Lumpy Skin Disease Virus Genome

    • data.niaid.nih.gov
    Updated Aug 4, 2022
    Cite
    Tomas Klingström (2022). Data for Galaxy training in assembly of Lumpy Skin Disease Virus Genome [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_4032450
    Dataset updated
    Aug 4, 2022
    Dataset provided by
    Sveriges lantbruksuniversitet
    Authors
    Tomas Klingström
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    This upload was to support a Galaxy tutorial on the Lumpy Skin Disease virus genome prepared by the Defend2020 project. To access the data deposited in Genbank and the Sequence Read Archive please refer to the following deposits.

    The LSDV isolate Kubash/KAZ/16 sequence has been deposited in GenBank under accession number MN642592, and raw data have been submitted to the SRA under BioProject number PRJNA587601.

  13. Training data for 'Long non-coding RNAs (lncRNAs) annotation with FEELnc'...

    • data-staging.niaid.nih.gov
    • data.niaid.nih.gov
    Updated May 28, 2024
    Cite
    Stéphanie Robin; Anthony Bretaudeau; Alexandre Cormier; Erwan Corre; Laura Leroi (2024). Training data for 'Long non-coding RNAs (lncRNAs) annotation with FEELnc' tutorial (Galaxy Training Material) [Dataset]. https://data-staging.niaid.nih.gov/resources?id=zenodo_7107049
    Dataset updated
    May 28, 2024
    Dataset provided by
    INRAE
    CNRS
    Ifremer
    Authors
    Stéphanie Robin; Anthony Bretaudeau; Alexandre Cormier; Erwan Corre; Laura Leroi
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Data needed for the 'Long non-coding RNAs (lncRNAs) annotation with FEELnc' tutorial (Galaxy Training Material). The assembly was generated following the 'Genome assembly using PacBio data' tutorial. The annotation was generated following the 'Genome annotation with Funannotate' tutorial.

    The BAM file contains the RNASeq reads from SRR8534859_1.fastq.gz and SRR8534859_2.fastq.gz mapped to the genome assembly.

  14. Data from: Metaproteomic analysis using the Galaxy framework

    • ebi.ac.uk
    Cite
    Pratik Jagtap, Metaproteomic analysis using the Galaxy framework [Dataset]. https://www.ebi.ac.uk/pride/archive/projects/PXD001655
    Authors
    Pratik Jagtap
    Variables measured
    Proteomics
    Description

    Metaproteomics characterizes proteins expressed by microorganism communities (microbiome) present in environmental samples or a host organism (e.g. human), revealing insights into the molecular functions conferred by these communities. Compared to conventional proteomics, metaproteomics presents unique data analysis challenges, including the use of large protein databases derived from hundreds of organisms, as well as numerous processing steps to ensure data quality. This data analysis complexity limits the use of metaproteomics for many researchers. In response, we have developed an accessible and flexible metaproteomics workflow within the Galaxy bioinformatics framework. Via analysis of human oral tissue exudate samples, we have established a modular Galaxy-based workflow that automates a reduction method for searching large sequence databases, enabling comprehensive identification of host proteins (human) as well as meta-proteins from the non-host organisms. Downstream, automated processing steps enable BLASTP analysis and evaluation/visualization of peptide sequence match quality, maximizing confidence in results. The output is compatible with tools for taxonomic and functional characterization (e.g. Unipept, MEGAN5). Galaxy also allows for the sharing of complete workflows with others, promoting reproducibility and also providing a template for further modification and improvement. Our results provide a blueprint for establishing Galaxy as a solution for metaproteomic data analysis.

  15. Tackling "Big Data" with Biology Undergrads: A Simple RNA-seq Data Analysis...

    • qubeshub.org
    Updated Aug 28, 2021
    Cite
    Matthew Escobar*; William Morgan; Irina Makarevitch; Sabrina Robertson (2021). Tackling "Big Data" with Biology Undergrads: A Simple RNA-seq Data Analysis Tutorial Using Galaxy [Dataset]. http://doi.org/10.24918/cs.2019.13
    Dataset updated
    Aug 28, 2021
    Dataset provided by
    QUBES
    Authors
    Matthew Escobar*; William Morgan; Irina Makarevitch; Sabrina Robertson
    Description

    Analyzing high-throughput DNA sequence data is a fundamental skill in modern biology. However, real and perceived barriers such as massive file sizes, substantial computational requirements, and lack of instructor background knowledge can discourage faculty from incorporating high-throughput sequence data into their courses. We developed a straightforward and detailed tutorial that guides students through the analysis of RNA sequencing (RNA-seq) data using Galaxy, a public web-based bioinformatics platform. The tutorial stretches over three laboratory periods (~8 hours) and is appropriate for undergraduate molecular biology and genetics courses. Sequence files are imported into a student's Galaxy user account directly from the National Center for Biotechnology Information Sequence Read Archive (NCBI SRA), eliminating the need for on-site file storage. Using Galaxy's graphical user interface and a defined set of analysis tools, students perform sequence quality assessment and trimming, map individual sequence reads to a genome, generate a counts table, and carry out differential gene expression analysis. All of these steps are performed "in the cloud," using offsite computational infrastructure. The provided tutorial utilizes RNA-seq data from a published study focused on nematode infection of Arabidopsis thaliana. Based on their analysis of the data, students are challenged to develop new hypotheses about how plants respond to nematode parasitism. However, the workflow is flexible and can accommodate alternative data sets from NCBI SRA or the instructor. Overall, this resource provides a simple introduction to the analysis of "big data" in the undergraduate classroom, with limited prior background and infrastructure required for successful implementation.

  16. Genomics Virtual Laboratory (GVL) Tutorials: RNA-Seq differential gene...

    • researchdata.edu.au
    Updated May 2, 2013
    Cite
    QFAB (2013). Genomics Virtual Laboratory (GVL) Tutorials: RNA-Seq differential gene expression (DGE) analysis using Galaxy workflow Input [Dataset]. https://researchdata.edu.au/genomics-virtual-laboratory-workflow-input/14067
    Dataset updated
    May 2, 2013
    Dataset provided by
    QFAB
    Description

    The input data of this tutorial is from an RNA-seq experiment looking for differentially expressed genes in D. melanogaster (fruit fly) between two experimental conditions. Please use the ‘fastqsanger’ File Format.

  17. Additional file 1: of sRNAPipe: a Galaxy-based pipeline for bioinformatic...

    • springernature.figshare.com
    txt
    Updated May 31, 2023
    Cite
    Romain Pogorelcnik; Chantal Vaury; Pierre Pouchin; Silke Jensen; Emilie Brasset (2023). Additional file 1: of sRNAPipe: a Galaxy-based pipeline for bioinformatic in-depth exploration of small RNAseq data [Dataset]. http://doi.org/10.6084/m9.figshare.6885320.v1
    Available download formats: txt
    Dataset updated
    May 31, 2023
    Dataset provided by
    Figshare (http://figshare.com/)
    Authors
    Romain Pogorelcnik; Chantal Vaury; Pierre Pouchin; Silke Jensen; Emilie Brasset
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    ReadMe. This file gives instructions concerning the prerequisites and the installation of sRNAPipe. (TXT 3 kb)

  18. NGSAP-VC : Genomic Variant Calling as an Installable GALAXY Workflow Using...

    • zenodo.org
    application/gzip, tar
    Updated Jan 24, 2020
    Cite
    Ambarish (2020). NGSAP-VC : Genomic Variant Calling as an Installable GALAXY Workflow Using NGS data. [Dataset]. http://doi.org/10.5281/zenodo.3548331
    Available download formats: application/gzip, tar
    Dataset updated
    Jan 24, 2020
    Dataset provided by
    Zenodo (http://zenodo.org/)
    Authors
    Ambarish
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Implementation of genomic variant calling as installable GALAXY workflows using NGS data. The repository contains two separate sets of simulated Ebola test data: one for SNP and INDEL calling and another for structural variant calling.

  19. WEBINAR: Here's one we prepared earlier: (re)creating bioinformatics methods...

    • explore.openaire.eu
    Updated Oct 26, 2022
    Cite
    Gareth Price; Johan Gustafsson (2022). WEBINAR: Here's one we prepared earlier: (re)creating bioinformatics methods and workflows with Galaxy Australia [Dataset]. http://doi.org/10.5281/zenodo.7251309
    Dataset updated
    Oct 26, 2022
    Authors
    Gareth Price; Johan Gustafsson
    Area covered
    Australia
    Description

    This record includes training materials associated with the Australian BioCommons webinar ‘Here’s one we prepared earlier: (re)creating bioinformatics methods and workflows with Galaxy Australia’. This webinar took place on 26 October 2022.

    Event description

    Have you discovered a brilliant bioinformatics workflow but you’re not quite sure how to use it? In this webinar we will introduce the power of Galaxy for construction and (re)use of reproducible workflows, whether building workflows from scratch, recreating them from published descriptions and/or extracting from Galaxy histories. Using an established bioinformatics method, we’ll show you how to:

    • Use the workflows creator in Galaxy Australia
    • Build a workflow based on a published method
    • Annotate workflows so that you (and others) can understand them
    • Make workflows findable and citable (important and very easy to do!)

    Materials are shared under a Creative Commons Attribution 4.0 International agreement unless otherwise specified and were current at the time of the event.

    Files and materials included in this record:

    • Event metadata (PDF): Information about the event including description, event URL, learning objectives, prerequisites, technical requirements etc.
    • Index of training materials (PDF): List and description of all materials associated with this event including the name, format, location and a brief description of each file.
    • GalaxyWorkflows_Slides (PDF): A PDF copy of the slides presented during the webinar.

    Materials shared elsewhere:

    A recording of this webinar is available on the Australian BioCommons YouTube Channel: https://youtu.be/IMkl6p7hkho

  20. Training data for 'Refining Manual Genome Annotations with Apollo...

    • data.niaid.nih.gov
    • zenodo.org
    Updated Jul 28, 2022
    Cite
    Anthony Bretaudeau (2022). Training data for 'Refining Manual Genome Annotations with Apollo (eukaryotes)' tutorial (Galaxy Training Material) [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_6902133
    Dataset updated
    Jul 28, 2022
    Dataset provided by
    INRAE
    Authors
    Anthony Bretaudeau
    License

    Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The data provided here are part of a Galaxy Training Network tutorial for manual curation of eukaryotic genome annotation using Apollo.
