25 datasets found

The Galaxy platform for accessible, reproducible and collaborative...
ckan.earlham.ac.uk
Updated Apr 2, 2019
Cite
ckan.earlham.ac.uk (2019). The Galaxy platform for accessible, reproducible and collaborative biomedical analyses: 2018 update [Dataset]. https://ckan.earlham.ac.uk/dataset/27a03fa3-12ad-40a6-9f80-cf348da2899d
Explore at:
Dataset updated
Apr 2, 2019
Dataset provided by
CKAN (https://ckan.org/)
Description
Galaxy (homepage: https://galaxyproject.org, main public server: https://usegalaxy.org) is a web-based scientific analysis platform used by tens of thousands of scientists across the world to analyze large biomedical datasets such as those found in genomics, proteomics, metabolomics and imaging. Started in 2005, Galaxy continues to focus on three key challenges of data-driven biomedical science: making analyses accessible to all researchers, ensuring analyses are completely reproducible, and making it simple to communicate analyses so that they can be reused and extended. During the last two years, the Galaxy team and the open-source community around Galaxy have made substantial improvements to Galaxy's core framework, user interface, tools, and training materials. Framework and user interface improvements now enable Galaxy to be used for analyzing tens of thousands of datasets, and >5500 tools are now available from the Galaxy ToolShed. The Galaxy community has led an effort to create numerous high-quality tutorials focused on common types of genomic analyses. The Galaxy developer and user communities continue to grow and be integral to Galaxy's development. The number of Galaxy public servers, developers contributing to the Galaxy framework and its tools, and users of the main Galaxy server have all increased substantially.
Data from: ReGaTE: Registration of Galaxy Tools in Elixir
ckan.earlham.ac.uk
Updated Dec 30, 2018
Cite
ckan.earlham.ac.uk (2018). ReGaTE: Registration of Galaxy Tools in Elixir [Dataset]. https://ckan.earlham.ac.uk/dataset/ba54c894-e4ea-449e-897d-ae8b511365fe
Explore at:
Dataset updated
Dec 30, 2018
Dataset provided by
CKAN (https://ckan.org/)
Description
Bioinformaticians routinely use multiple software tools and data sources in their day-to-day work and have been guided in their choices by a number of cataloguing initiatives. The ELIXIR Tools and Data Services Registry (bio.tools) aims to provide a central information point, independent of any specific scientific scope within bioinformatics or technological implementation. Meanwhile, efforts to integrate bioinformatics software in workbench and workflow environments have accelerated to enable the design, automation, and reproducibility of bioinformatics experiments. One such popular environment is the Galaxy framework, with currently more than 80 publicly available Galaxy servers around the world. In the context of a generic registry for bioinformatics software, such as bio.tools, Galaxy instances constitute a major source of valuable content. Yet there has been, to date, no convenient mechanism to register such services en masse. Findings: We present ReGaTE (Registration of Galaxy Tools in Elixir), a software utility that automates the process of registering the services available in a Galaxy instance. This utility uses the BioBlend application program interface to extract service metadata from a Galaxy server, enhance the metadata with the scientific information required by bio.tools, and push it to the registry. Conclusions: ReGaTE provides a fast and convenient way to publish Galaxy services in bio.tools. By doing so, service providers may increase the visibility of their services while enriching the software discovery function that bio.tools provides for its users. The source code of ReGaTE is freely available on Github at https://github.com/C3BI-pasteur-fr/ReGaTE.
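The metadata-extraction step described above can be sketched roughly as follows. This is not ReGaTE's actual code: it is a minimal illustration that assumes the third-party BioBlend package is installed, and the helper names `fetch_tools` and `summarize_tools`, as well as the choice of fields, are hypothetical. The mapping to the bio.tools schema and the push to the registry are not shown.

```python
def summarize_tools(tools):
    """Reduce Galaxy tool dictionaries to a few descriptive fields.

    `tools` is a list of dictionaries such as the one BioBlend returns
    when listing the tools of a Galaxy server. The selection of fields
    here is an illustrative assumption, not the bio.tools schema.
    """
    records = []
    for tool in tools:
        records.append({
            "id": tool.get("id"),
            "name": tool.get("name"),
            "version": tool.get("version"),
            "description": tool.get("description", ""),
        })
    return records


def fetch_tools(url, api_key):
    """List the tools of a Galaxy server through BioBlend (needs network access)."""
    # Imported here so that summarize_tools works without BioBlend installed.
    from bioblend.galaxy import GalaxyInstance

    gi = GalaxyInstance(url=url, key=api_key)
    return gi.tools.get_tools()


# Offline example with a hand-written tool dictionary (hypothetical values).
example = [{"id": "cat1", "name": "Concatenate datasets",
            "version": "1.0.0", "description": "tail-to-head"}]
print(summarize_tools(example))
```

Against a real server, `summarize_tools(fetch_tools(url, api_key))` would produce one record per installed tool; the URL and API key are supplied by the user.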
Data from: Data files for an RNA-Seq Tutorial
zenodo.org
application/gzip, bin
Updated Mar 15, 2023
Cite
Michael Thon (2023). Data files for an RNA-Seq Tutorial [Dataset]. http://doi.org/10.5281/zenodo.7735223
Explore at:
Available download formats: application/gzip, bin
Unique identifier
https://doi.org/10.5281/zenodo.7735223
Dataset updated
Mar 15, 2023
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Michael Thon
License
Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
License information was derived automatically
Description
These files accompany a short transcriptomics (RNA-Seq) tutorial that I am preparing for undergraduate students. The data analysis will be run on a Galaxy server. I'll update the description with a link to the tutorial text when it's ready.

These data are a subset of those published by O’Connell R, Thon M et al. 2012. Lifestyle transitions in plant pathogenic Colletotrichum fungi defined by genome and transcriptome analyses. Nature Genetics. 44:1060–1065.
Supporting data and materials for "NCBI BLAST+ integrated into Galaxy".
aspera.gigadb.org
explore.openaire.eu
Updated Aug 12, 2015
Cite
(2015). Supporting data and materials for "NCBI BLAST+ integrated into Galaxy". [Dataset]. http://doi.org/10.5524/100149
Explore at:
Unique identifier
https://doi.org/10.5524/100149
Dataset updated
Aug 12, 2015
License
U.S. Government Works (https://www.usa.gov/government-works)
License information was derived automatically
Description
The NCBI BLAST suite has become ubiquitous in modern molecular biology and is used for small tasks such as checking capillary sequencing results of single PCR products, genome annotation or even larger scale pan-genome analyses. For early adopters of the Galaxy web-based biomedical data analysis platform, integrating BLAST into Galaxy was a natural step for sequence comparison workflows. Here we provide the command line NCBI BLAST+ tool suite wrapped for use within Galaxy.
The integration of the BLAST+ tool suite into Galaxy has the goal of making common BLAST tasks easy and advanced tasks possible. This project is an informal international collaborative effort and is deployed and used on Galaxy servers worldwide.
Influenza genomics resources for Galaxy
zenodo.org
bin, text/x-python
Updated May 9, 2025
Cite
Wolfgang Maier; Aaron Kolbecher (2025). Influenza genomics resources for Galaxy [Dataset]. http://doi.org/10.5281/zenodo.15364148
Explore at:
Available download formats: bin, text/x-python
Unique identifier
https://doi.org/10.5281/zenodo.15364148
Dataset updated
May 9, 2025
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Wolfgang Maier; Aaron Kolbecher
License
https://www.gnu.org/licenses/agpl.txt
Time period covered
Nov 18, 2024
Description
Per-segment reference sequence data for Influenza A

This data can be used with https://github.com/connor-lab/vapor to pick a reference for each segment that is close enough to sequenced Influenza A reads to enable successful mapping.

https://iwc.galaxyproject.org/workflow/influenza-isolates-consensus-and-subtyping-main/ is a Galaxy workflow that uses this strategy and that can use this data as input if it's uploaded to a Galaxy server and turned into a collection there.
ClusTrack: Feature Extraction and Similarity Measures for Clustering of...
plos.figshare.com
ai
Updated May 31, 2023
Cite
Halfdan Rydbeck; Geir Kjetil Sandve; Egil Ferkingstad; Boris Simovski; Morten Rye; Eivind Hovig (2023). ClusTrack: Feature Extraction and Similarity Measures for Clustering of Genome-Wide Data Sets [Dataset]. http://doi.org/10.1371/journal.pone.0123261
Explore at:
Available download formats: ai
Unique identifier
https://doi.org/10.1371/journal.pone.0123261
Dataset updated
May 31, 2023
Dataset provided by
PLOS ONE
Authors
Halfdan Rydbeck; Geir Kjetil Sandve; Egil Ferkingstad; Boris Simovski; Morten Rye; Eivind Hovig
License
Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
License information was derived automatically
Description
Clustering is a popular technique for explorative analysis of data, as it can reveal subgroupings and similarities between data in an unsupervised manner. While clustering is routinely applied to gene expression data, there is a lack of appropriate general methodology for clustering of sequence-level genomic and epigenomic data, e.g. ChIP-based data. We here introduce a general methodology for clustering data sets of coordinates relative to a genome assembly, i.e. genomic tracks. By defining appropriate feature extraction approaches and similarity measures, we allow biologically meaningful clustering to be performed for genomic tracks using standard clustering algorithms. An implementation of the methodology is provided through a tool, ClusTrack, which allows fine-tuned clustering analyses to be specified through a web-based interface. We apply our methods to the clustering of occupancy of the H3K4me1 histone modification in samples from a range of different cell types. The majority of samples form meaningful subclusters, confirming that the definitions of features and similarity capture biological, rather than technical, variation between the genomic tracks. Input data and results are available, and can be reproduced, through a Galaxy Pages document at http://hyperbrowser.uio.no/hb/u/hb-superuser/p/clustrack. The clustering functionality is available as a Galaxy tool, under the menu option "Specialized analyzis of tracks", and the submenu option "Cluster tracks based on genome level similarity", at the Genomic HyperBrowser server: http://hyperbrowser.uio.no/hb/.
Training data for 'Unicycler assembly of SARS-CoV-2 genome with...
data.niaid.nih.gov
zenodo.org
Updated Aug 4, 2022
Cite
Maier, Wolfgang (2022). Training data for 'Unicycler assembly of SARS-CoV-2 genome with preprocessing to remove human genome reads' tutorial (Galaxy Training Material) [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_3732358
Explore at:
Dataset updated
Aug 4, 2022
Dataset authored and provided by
Maier, Wolfgang
License
Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
License information was derived automatically
Description
The data here is a copy of the corresponding SRR records in the NCBI SRA. The duplication serves a dual purpose:

as a backup should there be problems connecting to NCBI servers, e.g., during Galaxy user trainings.

to illustrate how to obtain raw sequencing data from alternative sources, and to organize the data into the same collection structure in a Galaxy history that is generated by specialized Galaxy SRA download tools.
Data for galaxy assymetry experiment
dmc.datacentral.org.au
Updated Jan 24, 2017
Cite
(2017). Data for galaxy assymetry experiment [Dataset]. https://dmc.datacentral.org.au/dataset/data-for-galaxy-assymetry-experiment
Explore at:
Dataset updated
Jan 24, 2017
License
Open Data Commons Attribution License (ODC-By) v1.0 (https://www.opendatacommons.org/licenses/by/1.0/)
License information was derived automatically
Description
This page provides links to the data files used in the experiment described in "Photometric asymmetry between clockwise and counterclockwise spiral galaxies in SDSS". The files can be uploaded to the SDSS Catalog Archive Server (CAS) and then used to replicate the results of the experiment. For instance, comparing the r magnitude of the classes can be done with the following CAS queries:

select avg(g) from PhotoObjAll, MyDB.cw where Objid=ID and g>0 and ra>90 and ra<270
select stdev(g) from PhotoObjAll, MyDB.cw where Objid=ID and g>0 and ra>90 and ra<270
select count(g) from PhotoObjAll, MyDB.cw where Objid=ID and g>0 and ra>90 and ra<270
select avg(g) from PhotoObjAll, MyDB.ccw where Objid=ID and g>0 and ra>90 and ra<270
select stdev(g) from PhotoObjAll, MyDB.ccw where Objid=ID and g>0 and ra>90 and ra<270
select count(g) from PhotoObjAll, MyDB.ccw where Objid=ID and g>0 and ra>90 and ra<270

The t-test can then be calculated using the mean, standard deviation, and number of samples of the two classes. The "g>0" condition is added to avoid possible flag values such as "-9999".

Paper reference: Shamir, L., Photometric asymmetry between clockwise and counterclockwise spiral galaxies in SDSS, PASA, In Press, 2017.
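The final step, computing a t-test from the three summary values per class, might look like the sketch below. The description does not say which variant of the t-test was used; this example assumes Welch's unequal-variance form and assumes the standard deviations are sample standard deviations. The function name and the numbers in the usage line are placeholders, not results from the paper.

```python
import math


def welch_t_from_summary(mean1, std1, n1, mean2, std2, n2):
    """Welch's t statistic and degrees of freedom from summary statistics.

    Assumes std1 and std2 are sample standard deviations and that
    both classes contain at least two objects.
    """
    var_of_mean1 = std1 ** 2 / n1  # squared standard error, class 1
    var_of_mean2 = std2 ** 2 / n2  # squared standard error, class 2
    t = (mean1 - mean2) / math.sqrt(var_of_mean1 + var_of_mean2)
    # Welch-Satterthwaite approximation of the degrees of freedom.
    df = (var_of_mean1 + var_of_mean2) ** 2 / (
        var_of_mean1 ** 2 / (n1 - 1) + var_of_mean2 ** 2 / (n2 - 1)
    )
    return t, df


# Placeholder inputs standing in for the avg, stdev and count query results.
t_stat, dof = welch_t_from_summary(17.80, 0.95, 6000, 17.76, 0.96, 6100)
print(t_stat, dof)
```

The p-value can then be read from a Student's t distribution with `dof` degrees of freedom, for example with a statistics package of the reader's choice.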
Transient Host Exchange
zenodo.org
explore.openaire.eu
application/gzip, txt
Updated Oct 22, 2021
Cite
THEx Team; Yu-Jing Qin (2021). Transient Host Exchange [Dataset]. http://doi.org/10.5281/zenodo.5568962
Explore at:
Available download formats: application/gzip, txt
Unique identifier
https://doi.org/10.5281/zenodo.5568962
Dataset updated
Oct 22, 2021
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
THEx Team; Yu-Jing Qin
License
MIT License (https://opensource.org/licenses/MIT)
License information was derived automatically
Description
The First Public Data Release (DR1) of Transient Host Exchange (THEx) Dataset

Paper describing the dataset: “Linking Extragalactic Transients and their Host Galaxy Properties: Transient Sample, Multi-Wavelength Host Identification, and Database Construction” (Qin et al. 2021)

The data release contains four compressed archives.

“BSON export” is a binary export of the “host_summary” collection, which is the “full version” of the dataset. The schema was presented in the Appendix section of the paper.

You need to set up a MongoDB server to use this version of the dataset. After setting up the server, you may import this BSON file into your local database as a collection using “mongorestore” command.

You may find some useful tutorials for setting up the server and importing BSON files into your local database at:

https://docs.mongodb.com/manual/installation/

https://www.mongodb.com/basics/bson

You may run common operations like query and aggregation once you import this BSON snapshot into your local database. An official tutorial can be found at:

https://docs.mongodb.com/manual/tutorial/query-documents/

There are other packages (e.g., pymongo for Python) and software to perform these database operations.

“JSON export” is a compressed archive of JSON files. Each file, named by the unique id and the preferred name of the event, contains complete host data of a single event. The data schema and contents are identical to the “BSON” version.

“NumPy export” contains a series of NumPy tables in “npy” format. There is a row-to-row correspondence across these files. Except for the “master table” (THEx-v8.0-release-assembled.npy), which contains all the columns, each file contains the host properties cross-matched in a single external catalog. The meta info and ancillary data are summarized in THEx-v8.0-release-assembled-index.npy.

There is also a THEx-v8.0-release-typerowmask.npy file, which has rows co-indexed with other files and columns named after each transient type. The “rowmask” file allows you to select a subset of events under a specific transient type.
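Selecting events of one transient type with the row mask could be sketched as follows. The example uses small synthetic arrays rather than the released files, and it assumes the tables are NumPy structured arrays with boolean mask columns; the column names `event_id`, `redshift`, `SN_Ia` and `SN_II` are made up for illustration.

```python
import numpy as np

# Synthetic stand-ins for the released tables. With the real data these
# would come from np.load() on the corresponding .npy files.
master = np.array(
    [(1, 0.01), (2, 0.05), (3, 0.02)],
    dtype=[("event_id", "i8"), ("redshift", "f8")],
)
rowmask = np.array(
    [(True, False), (False, True), (True, False)],
    dtype=[("SN_Ia", "?"), ("SN_II", "?")],
)


def select_by_type(table, mask_table, transient_type):
    """Return the rows of `table` flagged for `transient_type` in `mask_table`.

    Relies on the row-to-row correspondence between the two tables.
    """
    if len(table) != len(mask_table):
        raise ValueError("table and row mask must have the same number of rows")
    return table[mask_table[transient_type]]


selected = select_by_type(master, rowmask, "SN_Ia")
print(selected["event_id"])
```

Because every per-catalog table shares the same row order, the same boolean mask can be applied to any of them.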

Note that in this version, we only include cataloged properties of the confirmed hosts or primary candidates. If the confirmed host (or primary candidate) cross-matched multiple sources in a specific catalog, we only use the representative source for host properties. Properties of other cross-matched groups are not included. Finally, table THEx-v8.0-release-MWExt.npy contains the calculated foreground extinction (in magnitudes) at host positions. These extinction values have not been applied to magnitude columns in our dataset. You need to perform this correction by yourself if desired.
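The extinction correction mentioned above amounts to subtracting the foreground extinction from the observed magnitude. A minimal sketch, assuming both values refer to the same band and are expressed in magnitudes; the function name and the sample numbers are illustrative only.

```python
import numpy as np


def correct_for_extinction(observed_mag, extinction_mag):
    """Subtract foreground extinction (in magnitudes) from observed magnitudes.

    Extinction dims a source, so the corrected magnitude is numerically
    smaller (brighter) than the observed one. Both inputs must refer to
    the same photometric band.
    """
    observed = np.asarray(observed_mag, dtype=float)
    extinction = np.asarray(extinction_mag, dtype=float)
    return observed - extinction


# Illustrative values, not taken from the dataset.
print(correct_for_extinction([18.0, 20.5], [0.10, 0.25]))
```

Missing magnitudes stored as NaN stay NaN after the subtraction, so no separate masking step is needed for them.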

“FITS export” includes the same individual tables as in “NumPy export”. However, the FITS standard limits the number of columns in a table. Therefore, we do not include the “master table” in “FITS export.”

Finally, in BSON and JSON versions, cross-matched groups (under the “groups” key) are ordered by the default ranking function. Even if the first group in this list (namely, the confirmed host or primary host candidate) is a mismatched or misidentified one, we keep it in its original position. The result of visual inspection, including our manual reassignments, has been summarized under the “vis_insp” key.

For NumPy and FITS versions, if we have manually reassigned the host of an event, the data presented in these tables are also updated accordingly. You may use the “case_code” column in the “index” file to find the result of visual inspection and manual reassignment, where the flags for this “case_code” column are summarized in case-code.txt. Generally, codes “A1” and “F1” are known and new hosts that passed our visual inspection, while codes “B1” and “G1” are mismatched known hosts and possibly misidentified new hosts that have been manually reassigned.
Escaped vs. unescaped text import into excel.
figshare.com
xls
Updated Jun 6, 2023
Cite
Eric A. Welsh; Paul A. Stewart; Brent M. Kuenzi; James A. Eschrich (2023). Escaped vs. unescaped text import into excel. [Dataset]. http://doi.org/10.1371/journal.pone.0185207.t001
Explore at:
Available download formats: xls
Unique identifier
https://doi.org/10.1371/journal.pone.0185207.t001
Dataset updated
Jun 6, 2023
Dataset provided by
PLOS ONE
Authors
Eric A. Welsh; Paul A. Stewart; Brent M. Kuenzi; James A. Eschrich
License
Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
License information was derived automatically
Description
Escaped vs. unescaped text import into excel.
Gene Ontology terms enriched by genes with different H3K4me1 occupancy in...
plos.figshare.com
xls
Updated Jun 1, 2023
Cite
Halfdan Rydbeck; Geir Kjetil Sandve; Egil Ferkingstad; Boris Simovski; Morten Rye; Eivind Hovig (2023). Gene Ontology terms enriched by genes with different H3K4me1 occupancy in fetal and adult brain cell types. [Dataset]. http://doi.org/10.1371/journal.pone.0123261.t001
Explore at:
Available download formats: xls
Unique identifier
https://doi.org/10.1371/journal.pone.0123261.t001
Dataset updated
Jun 1, 2023
Dataset provided by
PLOS (http://plos.org/)
Authors
Halfdan Rydbeck; Geir Kjetil Sandve; Egil Ferkingstad; Boris Simovski; Morten Rye; Eivind Hovig
License
Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
License information was derived automatically
Description
Table of the three top terms of the three top annotation clusters from DAVID. The gene IDs submitted to DAVID were selected based on different H3K4me1 occupancy between fetal and adult brain clusters, by using the Genomic HyperBrowser. Benjamini = Benjamini-Hochberg.
Workflow Sample collected from Galaxy Main Server for reusability checking
figshare.com
zip
Updated Aug 19, 2022
Cite
Khairul Alam (2022). Workflow Sample collected from Galaxy Main Server for reusability checking [Dataset]. http://doi.org/10.6084/m9.figshare.20514381.v1
Explore at:
Available download formats: zip
Unique identifier
https://doi.org/10.6084/m9.figshare.20514381.v1
Dataset updated
Aug 19, 2022
Dataset provided by
figshare
Authors
Khairul Alam
License
Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
License information was derived automatically
Description
A scientific workflow describes a process for accomplishing a scientific objective, usually expressed in terms of tasks and their dependencies. We collected publicly available workflows from the Galaxy Main Server and tried to reuse them. This dataset contains the workflows we collected.
NA12878 WES Benchmark dataset
explore.openaire.eu
zenodo.org
Updated Jan 4, 2020
Cite
Pranckeviciene Erinija (2020). NA12878 WES Benchmark dataset [Dataset]. http://doi.org/10.5281/zenodo.3636192
Explore at:
Unique identifier
https://doi.org/10.5281/zenodo.3636192
Dataset updated
Jan 4, 2020
Authors
Pranckeviciene Erinija
Description
This dataset makes available the UCSC Genome Browser (genome.ucsc.edu) GRCh37 genome build public session NA12878 WES Benchmark files in a single dataset so that these files can be used in other applications or genome browsers such as IGV. The "Procedure and datasets to cross-reference OMIM genes with the genomic regions of interest" Galaxy page in the Shared Data Pages of the usegalaxy.org server describes a practical procedure and several possible use cases for this data set. The page can be accessed freely by users logged into their accounts on usegalaxy.org. Please register if you don't have an account on the usegalaxy.org Galaxy server. All genomic variant calls in all VCF files of this data set were decomposed and normalized with vt.

This dataset contains:

Genome in a Bottle (GIAB) version 3.3.2 high confidence (HC) variant calls and genomic regions for HapMap individual NA12878:
GIAB_v3.3.2_NA12878-decomposed-normalized.vcf.gz
GIAB_v3.3.2_NA12878-decomposed-normalized.vcf.gz.tbi
GIAB_v3.3.2_NA12878_HC_regions.bed

HapMap individual NA12878 WES variant calls (VCF) and capture regions (BED) from diagnostic laboratories:

ARUP whole exome sequencing data (HiSeq 2000), publicly available from NCBI GeT-RM Browser:
converted_ARUP_NA12878_Exome-decomposed-normalized.vcf.gz
converted_ARUP_NA12878_Exome-decomposed-normalized.vcf.gz.tbi
ARUP_SeqCap_EZ_Exome.bed

UCSF whole exome sequencing data (HiSeq 2500), publicly available from NCBI GeT-RM Browser:
converted_UCSF_NA12878_WES_Agilent_V4_Custom-decomposed-normalized.vcf.gz
converted_UCSF_NA12878_WES_Agilent_V4_Custom-decomposed-normalized.vcf.gz.tbi
UCSF_WES_Agilent_V4_Custom.bed

Whole exome data (NextSeq 500) sequenced in the CHEO diagnostic laboratory:
CHEO_NA12878_WES_S1dataset.vcf.gz
CHEO_NA12878_WES_S1dataset.vcf.gz.tbi
Agilent_CRE_v2.bed

Genomic coordinates (BED) of OMIM genes for which a molecular basis of the associated disease is known (as of September 2019):
Omim_Genes.bed

References: Pranckeviciene E, Potter R, Huang L, Jarinova O. Validation of bcbio-nextgen Pipeline Based on NextSeq500 Exome Sequencing. In 2019 IEEE EMBS International Conference on Biomedical & Health Informatics (BHI) 2019 May 19 (pp. 1-6). IEEE.
RISPTS Dataset
data.niaid.nih.gov
zenodo.org
Updated Jan 21, 2022
Cite
Anonymous (2022). RISPTS Dataset [Dataset]. https://data.niaid.nih.gov/resources?id=zenodo_5886175
Explore at:
Dataset updated
Jan 21, 2022
Dataset authored and provided by
Anonymous
License
Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
License information was derived automatically
Description
Downloaded workflows from Galaxy Servers
Erwinia pyrifoliae sequence reads
zenodo.org
application/gzip
Updated Feb 29, 2024
Cite
Michael Thon (2024). Erwinia pyrifoliae sequence reads [Dataset]. http://doi.org/10.5281/zenodo.10727691
Explore at:
Available download formats: application/gzip
Unique identifier
https://doi.org/10.5281/zenodo.10727691
Dataset updated
Feb 29, 2024
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Michael Thon
Description
E. pyrifoliae sequence reads originally from https://www.ncbi.nlm.nih.gov/search/all/?term=SRR1691104

Uploaded here to support a genome assembly tutorial using Galaxy Server.
Data from: Interactive Web-based Annotation of Plant MicroRNAs with...
zenodo.org
Updated Dec 16, 2020
Cite
Ting Zhang; Jingjing Zhai; Xiaorong Zhang; Lei Ling; Menghan Li; Shang Xie; Minggui Song; Chuang Ma (2020). Interactive Web-based Annotation of Plant MicroRNAs with iwa-miRNA [Dataset]. http://doi.org/10.5281/zenodo.4324338
Explore at:
Unique identifier
https://doi.org/10.5281/zenodo.4324338
Dataset updated
Dec 16, 2020
Dataset provided by
Zenodo (http://zenodo.org/)
Authors
Ting Zhang; Jingjing Zhai; Xiaorong Zhang; Lei Ling; Menghan Li; Shang Xie; Minggui Song; Chuang Ma
License
Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
License information was derived automatically
Description
MicroRNAs (miRNAs) are important regulators of gene expression. The large-scale detection and profiling of miRNAs has accelerated with the development of high-throughput small RNA sequencing (sRNA-Seq) techniques and bioinformatics tools. However, generating high-quality comprehensive miRNA annotations remains challenging, due to the intrinsic complexity of sRNA-Seq data and inherent limitations of existing miRNA predictions. Here, we present iwa-miRNA, a Galaxy-based framework that can facilitate miRNA annotation in plant species by combining computational analysis and manual curation. iwa-miRNA is specifically designed to generate a comprehensive list of miRNA candidates, bridging the gap between already annotated miRNAs provided by public miRNA databases and new predictions from sRNA-Seq datasets. It can also help users select promising miRNA candidates in an interactive mode through the automated and manual steps, contributing to the accessibility and reproducibility of genome-wide miRNA annotation. iwa-miRNA is user-friendly and can be easily deployed as a web application for researchers without programming experience. With flexible, interactive, and easy-to-use features, iwa-miRNA is a valuable tool for annotation of miRNAs in plant species with reference genomes. We illustrated the application of iwa-miRNA for miRNA annotation of plant species with varying complexity. The source code and web server of iwa-miRNA are freely accessible at: http://iwa-miRNA.omicstudio.cloud/.
Data from: NGS data related to Adam et al.: On the accuracy of the...
darus.uni-stuttgart.de
search.nfdi4chem.de
Updated Jun 16, 2023
Cite
Albert Jeltsch; Pavel Bashtrykov; Sabrina Adam (2023). NGS data related to Adam et al.: On the accuracy of the epigenetic copy machine - comprehensive specificity analysis of the DNMT1 DNA methyltransferase [Dataset]. http://doi.org/10.18419/DARUS-3334
Explore at:
Croissant (a format for machine-learning datasets; see mlcommons.org/croissant)
Unique identifier
https://doi.org/10.18419/DARUS-3334
Dataset updated
Jun 16, 2023
Dataset provided by
DaRUS
Authors
Albert Jeltsch; Pavel Bashtrykov; Sabrina Adam
License
Attribution 4.0 (CC BY 4.0) (https://creativecommons.org/licenses/by/4.0/)
License information was derived automatically
Dataset funded by
DFG
Description
Expression and purification of DNMT1 for biochemical work

Full length murine DNMT1 (UniProtKB P13864) was overexpressed and purified as described (Adam, et al. 2020) using the Bac-to-Bac baculovirus expression system (Invitrogen). The expression construct of the DNMT1 with mutated CXXC domain was taken from Bashtrykov, et al. (2012).

Synthesis of long DNA substrates and methylation reactions with them

The sequence of the 349 bp substrate with 44 CpG sites was taken from Adam et al. 2020. It was used in unmethylated and hemimethylated form. Generation of the substrates and the methylation reactions were conducted as described (Adam, et al. 2020). In brief, for the generation of hemimethylated substrates, the unmethylated DNA was methylated in vitro by M.SssI (purified as described in Adam, et al. 2020) to introduce methylation at all CpG sites, or by M.HhaI (NEB) together with M.MspI (NEB) to introduce methylation at GCGC and CCGG sites. For the synthesis of hemimethylated substrates, the upper strand of the methylated substrate was digested with lambda exonuclease, the ss-DNA was purified, and finally ds hemimethylated DNA was generated by primer extension using Phusion® HF DNA Polymerase (Thermo). Methylation reactions were conducted using mixtures of unmethylated (UM), fully hemimethylated and patterned substrate (total DNA concentration 200 ng in 20 µL) in methylation buffer (100 mM HEPES, 1 mM EDTA, 0.5 mM DTT, 0.1 mg/mL BSA, pH 7.2 with KOH) containing 1 mM AdoMet. DNMT1 concentrations and incubation times are indicated in the text. Methylation was followed by bisulfite conversion using the EZ DNA Methylation-Lightning™ Kit (ZYMO RESEARCH), library generation, and Illumina paired-end sequencing (Novogene).

Flanking sequence preference analysis with randomized single-site substrates

Methylation reactions of the randomized substrate with DNMT1 were performed similarly as described (Adam, et al. 2020; Gao, et al. 2020). Briefly, single-stranded oligonucleotides containing a methylated, hydroxymethylated or unmethylated CpG site embedded in a 10 nucleotide random context were obtained from IDT and used for generation of 67 bp long double-stranded DNA substrates by primer extension. Pools of these randomized substrates were then mixed in different combinations and methylated by DNMT1 in methylation buffer (100 mM HEPES, 1 mM EDTA, 0.5 mM DTT, 0.1 mg/mL BSA, pH 7.2 with KOH) containing 1 mM AdoMet. DNMT1 concentrations and incubation times are indicated in the text. Methylation was followed by bisulfite conversion using the EZ DNA Methylation-Lightning™ Kit (ZYMO RESEARCH), library generation, and Illumina paired-end sequencing (Novogene).

Bioinformatics analysis

NGS data sets were bioinformatically analyzed using a local instance of the Galaxy server as described (Adam, et al. 2020; Dukatz, et al. 2020; Dukatz, et al. 2022). In brief, for the long substrate, reads were trimmed, filtered by quality, mapped against the reference sequence and demultiplexed using substrate type and experiment specific barcodes. Afterwards, methylation information was assigned and retrieved by home-made scripts. For the randomized substrate, reads were trimmed and filtered according to the expected DNA size. The original DNA sequence was then reconstituted based on the bisulfite converted upper and lower strands to investigate the average methylation state of both CpG sites and the NNCGNN flanks using home-made scripts. Methylation rates of 256 NNCGNN sequence contexts in the competitive methylation experiments with the mixed single-site substrates were determined by fitting to monoexponential reaction progress curves with variable time points with MATLAB scripts as described (Adam, et al. 2022). Pearson correlation coefficients were calculated with Excel using the correl function.

Structure of the deposited data

Methylation data of long substrates are placed in the “long DNA substrates” folder. Methylation data of short single-site substrates with randomized flanks are placed in the “single sites substrates” folder. In both folders, an explanatory PDF file gives further information. Subfolders are arranged by enzyme (CXXC mutant or DNMT1 WT). Then, for each enzyme, the different substrates or substrate mixtures are provided in separate subfolders.

References

Adam S, Bräcker J, Klingel V, Osteresch B, Radde NE, Brockmeyer J, Bashtrykov P, Jeltsch A. Flanking sequences influence the activity of TET1 and TET2 methylcytosine dioxygenases and affect genomic 5hmC patterns. Communications Biology 5, 92 (2022)
Adam S, Anteneh H, Hornisch M, Wagner V, Lu J, Radde NE, Bashtrykov P, Song J, Jeltsch A. DNA sequence-dependent activity and base flipping mechanisms of DNMT1 regulate genome-wide DNA methylation. Nature Commun 11, 3723 (2020)
Bashtrykov P, et al. Specificity of Dnmt1 for methylation of hemimethylated CpG sites resides in its catalytic domain. Chem Biol 19, 572-578 (2012)
Dukatz M, Dittrich M, Stahl E, Adam S, de Mendoza A,...
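For readers who want to reproduce the correlation step outside Excel, the Pearson correlation that the correl function computes can be sketched in a few lines. This is an illustrative equivalent, not the authors' script; the function name and the sample rates are hypothetical.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length (at least 2 values)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sums of products of deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical methylation rates of the same flank contexts in two experiments.
rates_a = [0.10, 0.25, 0.40, 0.55]
rates_b = [0.12, 0.22, 0.43, 0.50]
print(pearson_r(rates_a, rates_b))
```

Applied to the 256 NNCGNN contexts, each input would hold one methylation rate per flank, in the same flank order for both experiments.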
Home Media Server Report
archivemarketresearch.com
doc, pdf, ppt
Updated Jun 8, 2025
Archive Market Research (2025). Home Media Server Report [Dataset]. https://www.archivemarketresearch.com/reports/home-media-server-361913
Available download formats: ppt, doc, pdf
Dataset updated
Jun 8, 2025
Dataset authored and provided by
Archive Market Research
License
https://www.archivemarketresearch.com/privacy-policy
Time period covered
2025 - 2033
Area covered
Global
Variables measured
Market Size
Description
The global home media server market is experiencing robust growth, driven by increasing demand for high-quality entertainment streaming, seamless media management across devices, and the expanding adoption of smart home technologies. While precise market size data for the base year (2025) is unavailable, considering the presence of major players like Samsung, Apple, and others, a reasonable estimate for the 2025 market size would be around USD 8 billion. Assuming a conservative Compound Annual Growth Rate (CAGR) of 12% based on historical trends and projected technological advancements, the market is poised to reach approximately USD 16 billion by 2033. This growth is fueled by the proliferation of high-resolution video content, the rise of 4K and 8K streaming, and consumers' increasing desire for personalized entertainment experiences. Factors such as enhanced data storage capabilities, improved network infrastructure, and the integration of AI-powered functionalities further contribute to this expansion. The market’s segmentation reveals significant opportunities for players focusing on specific niches. Growth will likely be strongest in segments offering cloud-based solutions and integrated smart home control features. Competitive intensity is high, with established tech giants competing against specialized providers. Challenges exist in managing data security concerns and maintaining seamless compatibility across different devices and operating systems. Nonetheless, the long-term outlook remains positive, driven by continuous innovation in streaming technologies, increasing internet penetration, and the ever-growing demand for efficient and convenient home entertainment solutions. Strategic partnerships, continuous product development, and robust data security measures will be crucial for companies to succeed in this dynamic marketplace.
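The growth figures quoted in this description can be checked with the standard compound-growth formula, end = start × (1 + rate)^years. The sketch below assumes eight compounding years (2025 to 2033) and uses the description's own numbers; the function names are hypothetical. Under that assumption, a 12% CAGR from 8 billion gives roughly 19.8 billion rather than 16 billion, while reaching 16 billion corresponds to a CAGR of roughly 9%, so the stated figures do not line up exactly.

```python
def project(start_value, rate, years):
    """Compound a starting value at a constant annual rate."""
    return start_value * (1.0 + rate) ** years


def implied_cagr(start_value, end_value, years):
    """Annual growth rate that turns start_value into end_value."""
    return (end_value / start_value) ** (1.0 / years) - 1.0


# Figures taken from the description; the year count is an assumption.
years = 2033 - 2025
projected = project(8.0, 0.12, years)
rate_for_16 = implied_cagr(8.0, 16.0, years)

print(f"8 billion at 12% for {years} years: {projected:.1f} billion")
print(f"CAGR implied by 8 -> 16 billion: {rate_for_16:.1%}")
```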
File S1 - A Model-Based Approach to Identify Binding Sites in CLIP-Seq Data
plos.figshare.com
pdf
Updated Jun 1, 2023
Tao Wang; Beibei Chen; MinSoo Kim; Yang Xie; Guanghua Xiao (2023). File S1 - A Model-Based Approach to Identify Binding Sites in CLIP-Seq Data [Dataset]. http://doi.org/10.1371/journal.pone.0093248.s001
Available download formats: pdf
Unique identifier
https://doi.org/10.1371/journal.pone.0093248.s001
Dataset updated
Jun 1, 2023
Dataset provided by
PLOS ONE
Authors
Tao Wang; Beibei Chen; MinSoo Kim; Yang Xie; Guanghua Xiao
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Supplementary data includes: Figures S1–S5, Table S1 and Text S1–S2. Table S1. The number of predicted binding sites per cluster for all CLIP clusters identified to have at least one reliable binding site in the AGO HITS-CLIP dataset. Figure S1. Distribution of tag counts and mutation ratios in each state. Figure S2. Tag pileup of a “flat” cluster from the AGO HITS-CLIP dataset. Figure S3. Target genes identified by MiClip, PARalyzer, wavClusteR and the ad hoc method in the EWSR1 experiment. Figure S4. Numbers of mutant genomic sites with the specified substitutions and in the two RSF intervals. Figure S5. The workflow of the MiClip Galaxy server. (PDF)
Additional file 2: of Dintor: functional annotation of genomic and proteomic...
springernature.figshare.com
zip
Updated May 30, 2023
Christian Weichenberger; Hagen Blankenburg; Antonia Palermo; Yuri D’Elia; Eva König; Erik Bernstein; Francisco Domingues (2023). Additional file 2: of Dintor: functional annotation of genomic and proteomic data [Dataset]. http://doi.org/10.6084/m9.figshare.c.3596729_D2.v1
Available download formats: zip
Unique identifier
https://doi.org/10.6084/m9.figshare.c.3596729_D2.v1
Dataset updated
May 30, 2023
Dataset provided by
Figshare (http://figshare.com/)
Authors
Christian Weichenberger; Hagen Blankenburg; Antonia Palermo; Yuri D’Elia; Eva König; Erik Bernstein; Francisco Domingues
License
Attribution 4.0 (CC BY 4.0) https://creativecommons.org/licenses/by/4.0/
License information was derived automatically
Description
Data files for Dintor use-cases one to three.
