Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
The data here is a copy of the corresponding SRR records in the NCBI SRA. The duplication serves a dual purpose:
CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
Galaxy is an open-source, web-based platform for data-intensive biomedical research. It makes bioinformatics applications accessible to users lacking programming skills, enabling them to easily build analysis workflows for NGS data.
The course "Exome analysis using Galaxy" is aimed at PhD students, biologists, clinicians and researchers who are analysing, or will need to analyse in the near future, high-throughput exome sequencing data. The aim of the course is to familiarise participants with the Galaxy platform and prepare them to work independently, using state-of-the-art tools for the analysis of exome sequencing data.
The course will be delivered using a mixture of lectures and computer-based hands-on practical sessions. Lectures will provide an up-to-date overview of the strategies for the analysis of next-generation exome sequencing experiments, starting from the raw sequence data. Analyses include sequence quality control, alignment to a reference genome, refinement of aligned sequences, variant calling, annotation and interpretation, and tools for visual inspection of results. Participants will apply the knowledge gained during the course to the analysis of real Illumina exome datasets, and implement workflows to reproduce the complete analysis. After the course, participants will be able to create pipelines for their individual analyses.
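As an illustration of the analysis stages listed above (quality control, alignment, variant calling), the sketch below chains one common open-source toolchain outside Galaxy; the tool choice (FastQC, bwa, samtools, bcftools), file names and reference path are placeholder assumptions, not the course's prescribed workflow.

```python
# Illustrative, non-Galaxy sketch of the stages named above: quality control,
# alignment to a reference genome, and variant calling. Tool choice, file
# names and the reference path are placeholders.
import subprocess

def run(cmd):
    """Run one pipeline step through the shell and stop on error."""
    print("+", cmd)
    subprocess.run(cmd, shell=True, check=True)

reference = "reference.fa"                              # placeholder genome
r1, r2 = "sample_R1.fastq.gz", "sample_R2.fastq.gz"     # placeholder exome reads

# 1. Sequence quality control
run(f"fastqc {r1} {r2}")

# 2. Alignment to the reference genome, followed by sorting and indexing
run(f"bwa mem -t 4 {reference} {r1} {r2} | samtools sort -o sample.sorted.bam -")
run("samtools index sample.sorted.bam")

# 3. Variant calling (refinement steps such as duplicate marking are omitted)
run(f"bcftools mpileup -f {reference} sample.sorted.bam | bcftools call -mv -Oz -o sample.vcf.gz")
```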
These are the datasets needed for this course.
U.S. Government Works: https://www.usa.gov/government-works
With NGS technologies, life sciences face a raw data deluge. Classical analysis of such data often begins with an assembly step that requires large amounts of computing resources and can remove or modify parts of the biological information contained in the data. Our approach instead focuses directly on biological questions by considering raw, unassembled NGS data, through a suite of six command-line tools. Dedicated to "whole-genome assembly-free" treatments, the Colib’read tool suite uses optimized algorithms for various analyses of NGS datasets, such as variant calling or read-set comparisons. Based on de Bruijn graphs and Bloom filters, such analyses can be performed in a few hours using small amounts of memory. Applications on real data demonstrate the good accuracy of these tools compared to classical approaches. To facilitate data analysis and tool dissemination, we developed Galaxy tools and Tool Shed repositories. With the Colib’read Galaxy tool suite, we enable a broad range of life scientists to analyze raw NGS data. More importantly, our approach retains as much biological information as possible from the data while using a very low memory footprint.
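As a toy illustration of why Bloom filters keep the memory footprint low, the sketch below stores k-mers from a read in a fixed-size bit array and answers probabilistic membership queries; it is a didactic example only, not the Colib’read implementation.

```python
# Toy Bloom filter over k-mers: constant-size bit array, probabilistic
# membership, no false negatives. A didactic sketch of the data structure
# mentioned above, not the Colib'read implementation.
import hashlib

class BloomFilter:
    def __init__(self, size_bits=1 << 20, n_hashes=3):
        self.size = size_bits
        self.n_hashes = n_hashes
        self.bits = bytearray(size_bits // 8)

    def _positions(self, item):
        for i in range(self.n_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, item):
        for p in self._positions(item):
            self.bits[p // 8] |= 1 << (p % 8)

    def __contains__(self, item):
        return all(self.bits[p // 8] & (1 << (p % 8)) for p in self._positions(item))

def kmers(read, k=31):
    """Yield all k-mers of a read."""
    return (read[i:i + k] for i in range(len(read) - k + 1))

bf = BloomFilter()
for kmer in kmers("ACGTACGTACGTACGTACGTACGTACGTACGTACGT"):
    bf.add(kmer)
print("ACGTACGTACGTACGTACGTACGTACGTACG" in bf)  # True: an added k-mer is always found
```

The trade-off is a small, tunable false-positive rate in exchange for memory that stays constant regardless of how many k-mers are inserted.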
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
Visualization for RNA transcript quality control and comparison of per-base quality score Q. The images are taken before (A) and after (B) the quality trimming procedure (which removes reads with Q ≤ 20) to estimate the effect of trimming. The quality score Q is plotted against the read position using the FastQC package in Galaxy (http://www.bioinformatics.babraham.ac.uk/projects/fastqc/). The background colour indicates the quality of the read: red for low quality, orange for medium quality, green for good quality. The red line marks the median of the measured values (yellow boxes show the inter-quartile range) and the blue line represents the mean quality. (ZIP 81 kb)
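For readers unfamiliar with where the plotted Q values come from, the snippet below decodes a FASTQ quality string with the standard Phred+33 offset and applies a mean-quality cutoff of Q > 20 in the spirit of the trimming step; it is a minimal illustration, not the FastQC or Galaxy trimming tool itself, and the quality string is invented.

```python
# Minimal illustration of the plotted Q values: decode a FASTQ quality string
# (Phred+33) and keep a read only if its mean quality exceeds 20, mirroring
# the idea of the trimming step described above.

def phred_scores(quality_string, offset=33):
    """Convert a FASTQ quality string into per-base Phred scores."""
    return [ord(c) - offset for c in quality_string]

def passes_quality(quality_string, min_mean_q=20):
    scores = phred_scores(quality_string)
    return sum(scores) / len(scores) > min_mean_q

qual = "IIIIIIIIHHHHHGGGG#####"          # invented example quality string
print(phred_scores(qual)[:5])            # [40, 40, 40, 40, 40]
print(passes_quality(qual))              # True; a read dominated by '#' (Q = 2) bases would fail
```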
The NCBI BLAST suite has become ubiquitous in modern molecular biology and is used for tasks ranging from checking capillary sequencing results of single PCR products to genome annotation and even larger-scale pan-genome analyses. For early adopters of the Galaxy web-based biomedical data analysis platform, integrating BLAST into Galaxy was a natural step for sequence comparison workflows. Here we provide the command-line NCBI BLAST+ tool suite wrapped for use within Galaxy. The integration of the BLAST+ tool suite into Galaxy has the goal of making common BLAST tasks easy and advanced tasks possible. This project is an informal international collaborative effort, and it is deployed and used on Galaxy servers worldwide.
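Outside Galaxy, the wrappers described here invoke the standard BLAST+ binaries directly; the sketch below runs blastn with tabular output. The query file and database name are placeholders for a real, pre-built BLAST database.

```python
# The Galaxy wrappers described above call the standard NCBI BLAST+ binaries;
# this sketch runs one of them (blastn) directly with tabular output.
import subprocess

subprocess.run(
    [
        "blastn",
        "-query", "contigs.fasta",   # placeholder query sequences
        "-db", "reference_db",       # placeholder BLAST database (built with makeblastdb)
        "-outfmt", "6",              # tabular output
        "-evalue", "1e-10",
        "-out", "hits.tsv",
    ],
    check=True,
)

# Each line of hits.tsv holds: query id, subject id, % identity, alignment length,
# mismatches, gap opens, query start/end, subject start/end, e-value, bit score.
```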
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
This repository contains datasets required for the online training "Data analysis and interpretation for clinical genomics" available at https://sigu-training.github.io/clinical_genomics/.
Tools used in the training are available at the European Galaxy instance running at https://usegalaxy.eu, which also includes a copy of this repository in the Shared Data Libraries. BAM files in this dataset are based on the hg38 reference genome.
This is part of a 4-dataset submission. Refer to this dataset for details.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
The data provided here are part of a Galaxy Training Network tutorial that analyzes RAD-seq data from a study published by Hohenlohe et al., 2010 (DOI:10.1371/journal.pgen.1000862) to identify and type single nucleotide polymorphisms (SNPs) in each of 100 individuals from two oceanic and three freshwater populations, and thus estimate genetic diversity and differentiation among populations.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
ReadMe. This file gives instructions concerning the prerequisites and the installation of sRNAPipe. (TXT 3 kb)
Bioinformaticians routinely use multiple software tools and data sources in their day-to-day work and have been guided in their choices by a number of cataloguing initiatives. The ELIXIR Tools and Data Services Registry (bio.tools) aims to provide a central information point, independent of any specific scientific scope within bioinformatics or technological implementation. Meanwhile, efforts to integrate bioinformatics software in workbench and workflow environments have accelerated to enable the design, automation, and reproducibility of bioinformatics experiments. One such popular environment is the Galaxy framework, with currently more than 80 publicly available Galaxy servers around the world. In the context of a generic registry for bioinformatics software, such as bio.tools, Galaxy instances constitute a major source of valuable content. Yet there has been, to date, no convenient mechanism to register such services en masse. Findings: We present ReGaTE (Registration of Galaxy Tools in Elixir), a software utility that automates the process of registering the services available in a Galaxy instance. This utility uses the BioBlend application programming interface to extract service metadata from a Galaxy server, enhance the metadata with the scientific information required by bio.tools, and push it to the registry. Conclusions: ReGaTE provides a fast and convenient way to publish Galaxy services in bio.tools. By doing so, service providers may increase the visibility of their services while enriching the software discovery function that bio.tools provides for its users. The source code of ReGaTE is freely available on GitHub at https://github.com/C3BI-pasteur-fr/ReGaTE.
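For a sense of the BioBlend-based extraction step, the sketch below lists tool metadata from a Galaxy server via BioBlend's GalaxyInstance; the server URL and API key are placeholders, and the enrichment and mapping onto the bio.tools schema that ReGaTE performs are left out.

```python
# Sketch of the kind of metadata extraction ReGaTE automates, using the
# BioBlend API mentioned above. Server URL and API key are placeholders.
from bioblend.galaxy import GalaxyInstance

gi = GalaxyInstance(url="https://usegalaxy.example.org", key="YOUR_API_KEY")

for tool in gi.tools.get_tools():
    # Each entry is a dictionary of tool metadata exposed by the Galaxy API.
    print(tool.get("id"), tool.get("name"), tool.get("version"))
```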
Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
To date, genome assemblies of non-model organisms are usually not at chromosomal level and are highly fragmented. This fragmentation is recognized to be, in part, the result of poor assembly of transposable element (TE) copies, increasing the difficulty of detecting and annotating them. In this context, we designed a new bioinformatics pipeline named PiRATE to detect, classify and annotate TEs of non-model organisms. PiRATE combines multiple analysis packages representing all the major approaches for TE detection. The goal is to promote the detection of complete TE sequences of every TE family. The detection of complete TE sequences, bearing recognizable conserved domains or specific motifs, facilitates the classification step. The classification step of PiRATE has been optimized for algal genomes. Each tool used by PiRATE is automated into a stand-alone Galaxy instance. This PiRATE-Galaxy can be used through a virtual machine, which can be downloaded below. PiRATE-Galaxy is a suitable and flexible platform to study TEs in the genome of any organism. You can find a tutorial below. Please contact us if you have any issues or comments: berthelier.j [at] laposte.net or gregory.carrier [at] ifremer.fr, or you can leave a message on GitHub: https://github.com/jberthelier/pirate/issues
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
The table summarizes the report generated by Metavisitor from a batch of 40 sequence datasets (S14 File). Metadata associated with each sequence dataset are indicated, as well as the ability of Metavisitor to detect HIV in the datasets and patients.
Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
Proteogenomics combines large-scale genomic and transcriptomic data with mass-spectrometry-based proteomic data to discover novel protein sequence variants and improve genome annotation. In contrast with conventional proteomic applications, proteogenomic analysis requires a number of additional data processing steps. Ideally, these required steps would be integrated and automated via a single software platform offering accessibility for wet-bench researchers as well as flexibility for user-specific customization and integration of new software tools as they emerge. Toward this end, we have extended the Galaxy bioinformatics framework to facilitate proteogenomic analysis. Using analysis of whole human saliva as an example, we demonstrate Galaxy’s flexibility through the creation of a modular workflow incorporating both established and customized software tools that improve depth and quality of proteogenomic results. Our customized Galaxy-based software includes automated, batch-mode BLASTP searching and a Peptide Sequence Match Evaluator tool, both useful for evaluating the veracity of putative novel peptide identifications. Our complex workflow (approximately 140 steps) can be easily shared using built-in Galaxy functions, enabling its use and customization by others. Our results provide a blueprint for the establishment of the Galaxy framework as an ideal solution for the emerging field of proteogenomics.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
In 2018-2019, we organized, on behalf of the Italian Society of Human Genetics (SIGU), an itinerant Galaxy-based “hands-on-computer” training activity entitled “Data analysis and interpretation for clinical genomics”. This one-day course was offered to participants including clinical doctors, biologists, laboratory technicians and bioinformaticians. Topics covered by the course were NGS data quality check, detection of variants, copy number alterations and runs of homozygosity, annotation and filtering, and clinical interpretation of sequencing results.
To meet the constant need for training on basic NGS analysis and interpretation of sequencing data in the clinical setting, we designed an online Galaxy-based training resource dedicated to this topic. It is articulated in presentations and practical assignments through which students learn how to approach NGS data processing at the level of FASTQ, BAM and VCF files, and how to carry out clinically oriented examination of variants emerging from sequencing experiments such as whole exomes.
This repository contains datasets required for the online training "Data analysis and interpretation for clinical genomics" available at https://sigu-training.github.io/clinical_genomics/.
Tools used in the training are available at the European Galaxy instance running at https://usegalaxy.eu, which also includes a copy of this repository in the Shared Data Libraries. Files named Fam_*.bam are based on the hg38 reference genome; all other files refer to hg19.
This is part of a 4-dataset submission.
This record includes training materials associated with the Australian BioCommons webinar ‘Here’s one we prepared earlier: (re)creating bioinformatics methods and workflows with Galaxy Australia’. This webinar took place on 26 October 2022.

Event description: Have you discovered a brilliant bioinformatics workflow but you’re not quite sure how to use it? In this webinar we will introduce the power of Galaxy for construction and (re)use of reproducible workflows, whether building workflows from scratch, recreating them from published descriptions and/or extracting them from Galaxy histories. Using an established bioinformatics method, we’ll show you how to:
- Use the workflow creator in Galaxy Australia
- Build a workflow based on a published method
- Annotate workflows so that you (and others) can understand them
- Make workflows findable and citable (important and very easy to do!)

Materials are shared under a Creative Commons Attribution 4.0 International agreement unless otherwise specified and were current at the time of the event.

Files and materials included in this record:
- Event metadata (PDF): Information about the event including description, event URL, learning objectives, prerequisites, technical requirements etc.
- Index of training materials (PDF): List and description of all materials associated with this event, including the name, format, location and a brief description of each file.
- GalaxyWorkflows_Slides (PDF): A PDF copy of the slides presented during the webinar.

Materials shared elsewhere: A recording of this webinar is available on the Australian BioCommons YouTube Channel: https://youtu.be/IMkl6p7hkho
Computational Genomics

Instructor: Dr. Rodolfo Aramayo, PhD
Email address: raramayo@tamu.edu
Location: Department of Biology, Room 412A, Biological Sciences Building West (BSBW), Texas A&M University, College Station, TX 77843-3258

Description: This repository contains materials used to teach Computational Genomics in Spring 2023. This course was heavily based on materials extracted from and/or adapted from:
- ENSEMBL, and ENSEMBL Tutorials and Examples
- Genomes, 2nd edition
- Current Topics in Genome Analysis
- Galaxy Training Materials

Course Topics:
- History of Bioinformatics
- History of Genomics
- Cloning Basics
- The Carbon Clarke Formula
- Introduction to Galaxy
- Genome Files: FASTA Format
- Uploading Data into Galaxy
- Introduction to Text Manipulations
- Introduction to Regular Expressions
- Introduction to Gene Models and Tables: GFF3 Files
- Introduction to Genome Annotation
- Cyverse User Portal
- Introduction to Genome Browsers (ENSEMBL)
- Introduction to Comparative Genomics
- Working with Genome Files
- Introduction to Sequence Analysis
- Computational Arithmetics

Author: Rodolfo Aramayo (raramayo@tamu.edu)
License: All content produced in this site is licensed under CC BY-NC-SA 4.0
Method overview

To achieve targeted locus- and allele-specific DNA demethylation, HEK293 cells were transfected with two plasmids. One plasmid contains dCas9 fused to a SunTag with five repeats of the GCN4 peptide, separated by 22-aa-long linkers, and scFv-fused TET1CD, as well as a GFP reporter protein. The other plasmid is a multiguide plasmid with 4 individual sgRNAs, each flanked by a U6 promoter and gRNA scaffold, and a DsRed fluorophore. Control experiments were conducted with a scrambled sgRNA that does not have a binding site in the human genome. Initial studies showed that cells positive for both plasmids exhibited detectable fluorescence of the corresponding reporter proteins on day 3 post-transfection; hence, FACS sorting was conducted at this time point. Part of the sorted cells was used immediately for downstream analysis, and the other part was re-seeded for harvesting at later time points. For DNA methylation analysis, genomic DNA was isolated from the cell samples and subjected to bisulfite treatment. Library preparation was performed using the bisulfite-converted samples, followed by NGS and data analysis. All methylation experiments were conducted in three independent biological replicates. For measurement of the genomic allele frequencies, genomic DNA of the untreated samples was used for the amplification of the region around the target SNP and of an exonic region with an additional SNP for each target, followed by library preparation, NGS and data analysis. To monitor the variation in the expression of the target genes, RNA was isolated from the treated cells on day 6. cDNA synthesized from the isolated RNA was used for the library preparation of the exonic region. The library was subjected to NGS followed by data analysis. All experiments were conducted in three independent biological replicates.

Method details

The gDNA of transfected HEK293 cells sorted by FACS was extracted using the QIAamp DNA Mini Kit (Qiagen). 500 ng of genomic DNA was subjected to overnight digestion with EcoRV, which does not cut within any of the target amplicons. The Zymo EZ DNA Methylation-Lightning Kit (D5030-E) was used for bisulfite conversion. The library for NGS was prepared by two consecutive PCR reactions (Leitao et al., 2018). First, bisulfite-converted genomic DNA of each sample was amplified with target-gene-specific primers. An amount of first-PCR product optimized for each gene was used as a template for the second PCR to add the Illumina TruSeq sequencing adapters. Final products were quantified, pooled in equimolar amounts and purified using SPRIselect beads (Beckman Coulter). Ready-to-use pools of libraries were sequenced on a NovaSeq 6000 using a PE250 flow cell (Novogene). For expression analysis, RNA was isolated from the sorted cells using the Qiagen RNeasy extraction kit (Cat. No. 74034). Residual genomic DNA was removed from the samples by an additional treatment with the TURBO DNA-free™ Kit (Ambion #AM1907). 500 ng of the DNase-free RNA was used for cDNA synthesis with the Applied Biosystems High-Capacity cDNA Reverse Transcription Kit (Cat. No. 4368814). An NRT reaction, conducted without addition of the reverse transcriptase enzyme, was used as a negative control for cDNA synthesis. In addition, NTC (no template control) reactions were included. The transcripts were subjected to library preparation in a two-step PCR process as mentioned above. For amplification of the genomic regions, 10 ng of the isolated genomic DNA was used.
Two-step library preparation was carried out for NGS of genomic regions. All NGS data were obtained in the form of FASTQ files.

Data analysis

NGS data in FASTQ format were analyzed as described (Rajaram et al., 2023) on the Galaxy platform (https://usegalaxy.org/) (The Galaxy platform for accessible, reproducible and collaborative biomedical analyses, 2022), where all the following tools are available. First, Illumina adapter sequences were removed using Trim Galore!. Afterwards, the paired-end reads were merged using Pear and reads with low quality were removed with Filter FASTQ. All NGS data files were subjected to this processing. For quantitative analysis of the methylation at individual CpG sites, the following steps were carried out. De-multiplexing of individual samples tagged with combinations of barcodes and Illumina indices was done by converting the FASTQ files using FASTQ to Tabular, followed by selection of lines with the tool Select and re-conversion of the files to FASTQ format with Tabular to FASTQ. For the alignment of reads to a reference sequence, bwameth was used, and the DNA methylation at each CpG site was analyzed by applying the tool MethylDackel. The output files were processed using Microsoft Excel. For the analysis of the allelic ratios of the transcript and genomic region, de-multiplexing of individual samples tagged with combinations of barcodes and Illumina indices was done by converting the FASTQ files using FASTQ to Tabular, followed by selection of lines with the tool Select. Input for the selection of lines was provided in accordance with the SNP of interest. The output of the tool Select provides the number of reads corresponding to each allele.
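As a hedged illustration of the final allele-counting step, the sketch below scans a merged FASTQ and tallies reads containing one of two allele-specific motifs around a SNP; the motif sequences and file name are invented placeholders, and this is not the Galaxy Select tool itself.

```python
# Hedged sketch of the allele-counting idea described above: reads in a merged
# FASTQ are matched against two short allele-specific sequences around the SNP
# of interest. Motifs and the file name are placeholders.

def count_alleles(fastq_path, allele_a, allele_b):
    counts = {"allele_A": 0, "allele_B": 0, "unassigned": 0}
    with open(fastq_path) as fq:
        for i, line in enumerate(fq):
            if i % 4 != 1:              # sequence lines are every 4th line, offset 1
                continue
            read = line.strip()
            if allele_a in read:
                counts["allele_A"] += 1
            elif allele_b in read:
                counts["allele_B"] += 1
            else:
                counts["unassigned"] += 1
    return counts

print(count_alleles("merged_reads.fastq", "ACCTGAT", "ACCTGGT"))  # placeholder motifs
```

The ratio of the two allele counts then gives the allelic ratio reported above.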
Methylation experiments:
For the competitive nucleosome methylation experiments, 0.6 pmol of each nucleosome variant were digested with MluI (NEB) for 60 min at 37°C in 10 µL NEB Cutsmart buffer (50 mM KOAc/20 mM Tris-acetate pH 7.9, 10 mM Magnesium Acetate, 100 µg/mL BSA) to remove residual unbound DNA. Afterwards, DNMT3A2 or DNMT3AC was added to the mixture to a final concentration ranging from 0.5 µM to 3 µM in 80 µL NEB Cutsmart buffer supplemented with 10 mM EDTA and 25 µM AdoMet (Perkin Elmer). The methylation reaction was allowed to proceed for 2 h at 37°C. To stop the reaction and remove all nucleosome-bound proteins, proteinase K was added to the reaction and the sample was incubated for a further 60 min at 37°C. The resulting unbound DNA was purified from the reaction mixture using the Nucleospin Gel and PCR cleanup kit (Macherey-Nagel). Bisulfite conversion of the methylated DNA was performed using the EZ DNA Methylation-Lightning kit (Zymo Research). Methylation of free DNA was conducted the same way using 15 µM DNA.
Library preparation and sequencing analysis:
Sample-specific barcodes and indices were added to the DNA by PCR amplification in a two-step PCR process. Briefly, in the first PCR, barcoded primers were used to amplify the bisulfite-converted nucleosome DNA using the HotStartTaq Polymerase (Qiagen) and the resulting 321 bp fragment was purified using the Nucleospin Gel and PCR cleanup kit (Macherey-Nagel). In the second PCR step, adaptors and indices required for sequencing were added by amplification with the respective primers and the Phusion polymerase (ThermoFisher). The final 390-bp product was purified and used for Illumina paired-end 2x250 bp sequencing. Datasets were analyzed using a local instance of the Galaxy bioinformatics server. Sequence reads were trimmed with the Trim Galore! tool (developed by Felix Krueger at the Babraham Institute) and subsequently paired using PEAR. The reads were filtered according to the expected DNA length using the Filter FASTQ tool and mapped to the corresponding reference sequence using bwameth to determine the percentage of methylated CpGs.
The naming of the files is described in the Supplemental Table 1 of the accompanying manuscript.
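To make the reported quantity concrete, the toy sketch below computes the percentage of methylated CpGs from bisulfite reads that are assumed to be already aligned to a short reference without indels; in the real analysis this is handled by bwameth mapping as described above, and the sequences here are invented placeholders.

```python
# Toy illustration of the percentage of methylated CpGs: in bisulfite data an
# unconverted 'C' at a CpG position is read as methylated and a converted 'T'
# as unmethylated. Reads are assumed pre-aligned to the reference, no indels.

reference = "TTCGATACGTT"                 # placeholder reference with two CpG sites
reads = [
    "TTCGATATGTT",                        # CpG 1 methylated, CpG 2 unmethylated
    "TTTGATACGTT",                        # CpG 1 unmethylated, CpG 2 methylated
    "TTCGATACGTT",                        # both CpGs methylated
]

cpg_positions = [i for i in range(len(reference) - 1) if reference[i:i + 2] == "CG"]

for pos in cpg_positions:
    calls = [r[pos] for r in reads if r[pos] in "CT"]
    methylated = calls.count("C")
    print(f"CpG at position {pos}: {100 * methylated / len(calls):.0f}% methylated")
```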
Methylation of substrate libraries
Single-stranded DNA oligonucleotides used for the generation of double-stranded substrates with different distances between CpG sites were obtained from IDT. Sixteen single-stranded oligonucleotides were pooled in equimolar amounts and the second-strand synthesis was conducted by a primer extension reaction using one universal primer. The obtained mix of double-stranded DNA oligonucleotides was methylated by the DNMT3A catalytic domain and DNMT3A/3L and incubated for 60 min at 37 °C in the presence of 0.8 mM S-adenosyl-L-methionine (Sigma) in reaction buffer (20 mM HEPES pH 7.5, 1 mM EDTA, 50 mM KCl, 0.05 mg/mL bovine serum albumin). For DNMT3A, concentrations of 0.25 µM, 0.5 µM, 1 µM and 2 µM were used; for DNMT3A/3L, 0.125 µM and 0.25 µM. In addition, a no-enzyme control was processed identically to all other samples. Reactions were stopped by shock freezing in liquid nitrogen, then treated with proteinase K for 2 hours at 42 °C. Afterwards, the DNA was digested with the BsaI-HFv2 enzyme and a hairpin (pGAGAAGGGATGTGGATACACATCCCT) was ligated using T4 DNA ligase (NEB). The DNA was bisulfite converted using the EZ DNA Methylation-Lightning kit (Zymo Research) according to the manufacturer's protocol, purified and eluted with 10 µL ddH2O.
NGS library generation
Libraries for Illumina Next Generation Sequencing (NGS) were produced with the two-step PCR approach. In the first PCR, 2 µL of bisulfite-converted DNA were amplified with the HotStartTaq DNA Polymerase (QIAGEN) and primers containing internal barcodes using the following conditions: 15 min at 95 °C; 10 cycles of 30 sec at 94 °C, 30 sec at 50 °C, and 1 min 30 sec at 72 °C; and a final 5 min at 72 °C, using a mixture containing 1x PCR Buffer, 1x Q-Solution, 0.2 mM dNTPs, 0.05 U/µL HotStartTaq DNA Polymerase, and 0.4 µM forward and 0.4 µM reverse primers in a total volume of 20 µL. In the second PCR, 1 µL of the obtained products was amplified by Phusion Polymerase (Thermo) with another set of primers to introduce the adapters and indices needed for NGS (30 sec at 98 °C; 10 cycles of 10 sec at 98 °C and 40 sec at 72 °C; and 5 min at 72 °C). The second PCR was carried out in 1x Phusion HF Buffer, 0.2 mM dNTPs, 0.02 U/µL Phusion HF DNA Polymerase, and 0.4 µM forward and 0.4 µM reverse primers in a total volume of 20 µL. The obtained libraries were pooled in equimolar amounts, purified and sequenced at the Max Planck Genome Centre Cologne.
Bioinformatic analysis
Bioinformatic analysis of the obtained NGS data was conducted with a local Galaxy server and with home-written scripts. Briefly, FASTQ files were analyzed with FastQC, 3’ ends of reads with a quality lower than 20 were trimmed, and reads containing both full-length sense and antisense strands were selected. Next, the samples were split using the internal barcodes corresponding to the different experimental conditions. Afterwards, the insert DNA sequence was extracted and used for further downstream analysis. The uploaded text files contain the bisulfite-converted sequences with pairs of CpG sites at variable distances, as described in the further documentation (info.pdf).
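As a hedged sketch of the barcode-splitting and insert-extraction step, the code below assigns reads to conditions by a short internal barcode at the read start and keeps the downstream insert; the barcode sequences, barcode length and input file name are placeholder assumptions, not the home-written scripts themselves.

```python
# Hedged sketch of the barcode splitting described above: reads are assigned
# to experimental conditions by an internal barcode at the read start, and
# the sequence after the barcode (the insert) is kept for downstream analysis.
from collections import defaultdict

barcodes = {"ACGT": "condition_1", "TGCA": "condition_2"}   # placeholder barcodes

def split_by_barcode(fastq_path, barcode_len=4):
    inserts = defaultdict(list)
    with open(fastq_path) as fq:
        for i, line in enumerate(fq):
            if i % 4 != 1:                                  # keep only sequence lines
                continue
            read = line.strip()
            sample = barcodes.get(read[:barcode_len])
            if sample is not None:
                inserts[sample].append(read[barcode_len:])  # insert after the barcode
    return inserts

for sample, seqs in split_by_barcode("merged_reads.fastq").items():
    print(sample, len(seqs), "reads")
```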
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
See Method section for a description of the columns.
Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
The data here is a copy of the corresponding SRR records in the NCBI SRA. The duplication serves a dual purpose: