100+ datasets found
  1. simulated_experiments_1

    • figshare.com
    zip
    Updated Jul 13, 2022
    Yanchi Su (2022). simulated_experiments_1 [Dataset]. http://doi.org/10.6084/m9.figshare.17802935.v1
    Available download formats: zip
    Dataset updated
    Jul 13, 2022
    Dataset provided by
    Figshare: http://figshare.com/
    Authors
    Yanchi Su
    License

    MIT License: https://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    simulated experiments 1

  2. simulated_experiments_2

    • figshare.com
    zip
    Updated Jul 13, 2022
    Yanchi Su (2022). simulated_experiments_2 [Dataset]. http://doi.org/10.6084/m9.figshare.17802788.v1
    Available download formats: zip
    Dataset updated
    Jul 13, 2022
    Dataset provided by
    Figshare: http://figshare.com/
    Authors
    Yanchi Su
    License

    MIT License: https://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    simulated experiments 2

  3. expam Benchmarking - Classifier Performance Statistics

    • researchdata.edu.au
    Updated Jun 28, 2022
    Sean Solari; Remy Young; Vanessa Marcelino; Sam Forster (2022). expam Benchmarking - Classifier Performance Statistics [Dataset]. http://doi.org/10.26180/19771072.v1
    Dataset updated
    Jun 28, 2022
    Dataset provided by
    Monash University
    Authors
    Sean Solari; Remy Young; Vanessa Marcelino; Sam Forster
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Excel document containing precision, recall and F1 scores for metagenomic classifiers used in the benchmarking of expam's performance. Classifiers were tested on 140 simulated metagenomic communities, at different taxonomic ranks.
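To illustrate the three metrics reported in the spreadsheet, here is a minimal sketch that computes precision, recall and F1 from raw counts. The function name and the example counts are hypothetical and are not taken from the expam benchmark.

```python
def precision_recall_f1(true_positives, false_positives, false_negatives):
    """Return (precision, recall, F1) from raw classification counts."""
    predicted = true_positives + false_positives   # everything the classifier reported
    actual = true_positives + false_negatives      # everything truly present
    precision = true_positives / predicted if predicted else 0.0
    recall = true_positives / actual if actual else 0.0
    denom = precision + recall
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical counts for one classifier at one taxonomic rank.
precision, recall, f1 = precision_recall_f1(
    true_positives=80, false_positives=20, false_negatives=20
)
```

In a benchmark like the one described, such a calculation would presumably be repeated per classifier, per simulated community and per taxonomic rank.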

  4. [Dataset] Data for the course "Population Genomics" at Aarhus University

    • zenodo.org
    application/gzip, bin
    Updated Jan 8, 2025
    Samuele Soraggi; Kasper Munch (2025). [Dataset] Data for the course "Population Genomics" at Aarhus University [Dataset]. http://doi.org/10.5281/zenodo.7670839
    Available download formats: application/gzip, bin
    Dataset updated
    Jan 8, 2025
    Dataset provided by
    Zenodo: http://zenodo.org/
    Authors
    Samuele Soraggi; Kasper Munch
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Datasets, conda environments and software for the course "Population Genomics" of Prof Kasper Munch. This course material is maintained by the health data science sandbox. This webpage shows the latest version of the course material.

    1. Data.tar.gz Contains the datasets and executable files for some of the software
      You can unpack by simply doing
      tar -zxf Data.tar.gz -C ./
      This will create a folder called Data with the uncompressed material inside
    2. Course_Env.packed.tar.gz Contains the conda environment used for the course. This needs to be unpacked to adjust all the prefixes (Note this environment is created on Ubuntu 22.10). You do this in the command line by
      1. creating the folder Course_Env: mkdir Course_Env
      2. untar the file: tar -zxf Course_Env.packed.tar.gz -C Course_Env
      3. Activate the environment: conda activate ./Course_Env
      4. Run the unpacking script (it can take quite some time to get it done): conda-unpack
    3. Course_Env.unpacked.tar.gz The same environment as above, but will work only if untarred into the folder /usr/Material - so use the version above if you are using it in another folder. This file is mostly to execute the course in our own cloud environment.
    4. environment_with_args.yml The file needed to generate the conda environment. Create and activate the environment with the following commands:
      1. conda env create -f environment_with_args.yml -p ./Course_Env
      2. conda activate ./Course_Env

    The data is connected to the following repository: https://github.com/hds-sandbox/Popgen_course_aarhus. The original course material from Prof Kasper Munch is at https://github.com/kaspermunch/PopulationGenomicsCourse.

    Description

    The participants will after the course have detailed knowledge of the methods and applications required to perform a typical population genomic study.

    The participants must at the end of the course be able to:

    • Identify an experimental platform relevant to a population genomic analysis.
    • Apply commonly used population genomic methods.
    • Explain the theory behind common population genomic methods.
    • Reflect on strengths and limitations of population genomic methods.
    • Interpret and analyze results of population genomic inference.
    • Formulate population genetics hypotheses based on data

    The course introduces key concepts in population genomics from generation of population genetic data sets to the most common population genetic analyses and association studies. The first part of the course focuses on generation of population genetic data sets. The second part introduces the most common population genetic analyses and their theoretical background. Here topics include analysis of demography, population structure, recombination and selection. The last part of the course focuses on applications of population genetic data sets for association studies in relation to human health.

    Curriculum

    The curriculum for each week is listed below. "Coop" refers to a set of lecture notes by Graham Coop that we will use throughout the course.

    Course plan

    1. Course intro and overview:
    2. Drift and the coalescent:
    3. Recombination:
    4. Population structure and incomplete lineage sorting:
    5. Hidden Markov models:
    6. Ancestral recombination graphs:
    7. Past population demography:
    8. Direct and linked selection:
    9. Admixture:
    10. Genome-wide association study (GWAS):
    11. Heritability:
      • Lecture: Coop Lecture notes Sec. 2.2 (p23-36) + Chap. 7 (p119-142)
      • Exercise: Association testing
    12. Evolution and disease:
      • Lecture: Coop Lecture notes Sec. 11.0.1 (p217-221)
      • Exercise: Estimating heritability
  5. Raw motif mapping bedfile data and model training set class probabilities

    • search.dataone.org
    • data.niaid.nih.gov
    • +2more
    Updated May 6, 2025
    Phillip Davis (2025). Raw motif mapping bedfile data and model training set class probabilities [Dataset]. http://doi.org/10.5061/dryad.tdz08kq3w
    Dataset updated
    May 6, 2025
    Dataset provided by
    Dryad Digital Repository
    Authors
    Phillip Davis
    Time period covered
    Jan 1, 2023
    Description

    Leveraging prior viral genome sequencing data to make predictions on whether an unknown, emergent virus harbors a ‘phenotype-of-concern’ has been a long-sought goal of genomic epidemiology. A predictive phenotype model built from nucleotide-level information alone is challenging with respect to RNA viruses due to the ultra-high intra-sequence variance of their genomes, even within closely related clades. We developed a degenerate k-mer method to accommodate this high intra-sequence variation of RNA virus genomes for modeling frameworks. By leveraging a taxonomy-guided ‘group-shuffle-split’ cross validation paradigm on complete coronavirus assemblies from prior to October 2018, we trained multiple regularized logistic regression classifiers at the nucleotide k-mer level. We demonstrate the feasibility of this method by finding models accurately predicting withheld SARS-CoV-2 genome sequences as human pathogens and accurately predicting withheld Swine Acute Diarrhea Syndrome coronavirus (...
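The 'group-shuffle-split' idea mentioned above can be sketched as follows: whole groups (here, taxonomy labels) are shuffled and held out together, so no taxon contributes genomes to both the training and the test set. This is a pure-Python illustration under assumed inputs, not the author's code. The function name, the labels and the 25% test fraction are all hypothetical.

```python
import random


def group_shuffle_split(groups, test_fraction=0.25, seed=0):
    """Split sample indices so that no group appears in both train and test."""
    unique_groups = sorted(set(groups))      # sort first so the shuffle is reproducible
    rng = random.Random(seed)
    rng.shuffle(unique_groups)
    n_test = max(1, round(len(unique_groups) * test_fraction))
    test_groups = set(unique_groups[:n_test])
    train_idx = [i for i, g in enumerate(groups) if g not in test_groups]
    test_idx = [i for i, g in enumerate(groups) if g in test_groups]
    return train_idx, test_idx


# Hypothetical taxonomy labels, one per genome assembly.
taxa = ["alpha", "alpha", "beta", "beta", "gamma", "delta", "delta", "delta"]
train_idx, test_idx = group_shuffle_split(taxa, test_fraction=0.25, seed=42)
```

Repeating the call with different seeds gives several such splits for cross-validation. scikit-learn offers a comparable splitter, `GroupShuffleSplit`, which may be more convenient in a real pipeline.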

  6. Sustained software development, not number of citations or journal choice,...

    • figshare.com
    xml
    Updated May 31, 2023
    Paul Gardner; James Paterson; S R McGimpsey; Fatemeh Ashari Ghomi; Aleksandra Pawlik; Alex Gavryushkin; Mik Black (2023). Sustained software development, not number of citations or journal choice, is indicative of accurate bioinformatic software -- PubMed XML files and scripts [Dataset]. http://doi.org/10.6084/m9.figshare.15121818.v2
    Available download formats: xml
    Dataset updated
    May 31, 2023
    Dataset provided by
    Figshare: http://figshare.com/
    Authors
    Paul Gardner; James Paterson; S R McGimpsey; Fatemeh Ashari Ghomi; Aleksandra Pawlik; Alex Gavryushkin; Mik Black
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    PubMed XML files for training and scoring likely benchmark papers.

  7. Characterizing the targets of transcription regulators by aggregating...

    • search.dataone.org
    • borealisdata.ca
    Updated Dec 28, 2023
    Morin, Alexander (2023). Characterizing the targets of transcription regulators by aggregating ChIP-seq and perturbation expression data sets [Dataset]. http://doi.org/10.5683/SP3/MAFGFL
    Dataset updated
    Dec 28, 2023
    Dataset provided by
    Borealis
    Authors
    Morin, Alexander
    Description

    There is a growing collection of genomics data sets generated for identifying the gene targets under control of transcription regulators (TRs). TR ChIP-seq and RNA expression experiments that perturb TR activity are the most common strategies for mapping TRs to genes at a genomic scale. However, the collection, preprocessing, summarization, and integration of these data sets requires a non-trivial degree of bioinformatics experience. In this study, we set out a framework to accomplish these tasks. We focus on eight TRs in both mouse and human, encompassing nearly 500 experiments, with two main objectives. The first is a detailed examination of the properties of the contributing experiments, to better learn of potential biases and pitfalls when aggregating diverse data sets. The second is to provide summarized, transparent, and convenient TR-target rankings based upon these genomic data sets for community use. Our work thus catalogues the state of the literature for a subset of important mammalian TRs, prioritizes gene targets based upon available empirical evidence, and provides a framework for ready expansion to more TR data sets.

  8. Data from: Microarray time-series data classification via multiple alignment...

    • researchdata.edu.au
    Updated May 5, 2022
    Ataul Bari; Luis Rueda; Alioune Ngom (2022). Microarray time-series data classification via multiple alignment of gene expression profiles [Dataset]. http://doi.org/10.4225/03/5a1371a04a06e
    Dataset updated
    May 5, 2022
    Dataset provided by
    Monash University
    Authors
    Ataul Bari; Luis Rueda; Alioune Ngom
    Description

    Pairwise alignment approaches for time-varying gene expression profiles have been recently developed for the detection of co-expressions in time-series microarray data sets. In this paper, we analyze multiple expression profile alignment (MEPA) methods for classifying microarray time-course data. We apply a nearest centroid classification technique, in which the centroid of each class is computed by means of a MEPA algorithm. MEPA aligns the expression profiles in such a way as to minimize the total area between all aligned profiles. We propose four MEPA approaches whose effectiveness is demonstrated on the well-known budding yeast, S. cerevisiae, data set. PRIB 2008 proceedings found at: http://dx.doi.org/10.1007/978-3-540-88436-1
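The nearest centroid step described above can be sketched as follows. Note the substitution: this sketch uses a plain pointwise mean as the class centroid and Euclidean distance, not the MEPA alignment the authors propose. All names and expression values are hypothetical.

```python
import math


def class_centroids(profiles, labels):
    """Average the expression profiles of each class, time point by time point."""
    sums, counts = {}, {}
    for profile, label in zip(profiles, labels):
        acc = sums.setdefault(label, [0.0] * len(profile))
        for t, value in enumerate(profile):
            acc[t] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [v / counts[label] for v in acc] for label, acc in sums.items()}


def nearest_centroid(profile, centroids):
    """Return the label whose centroid is closest to the profile (Euclidean distance)."""
    return min(centroids, key=lambda label: math.dist(profile, centroids[label]))


# Hypothetical time-course profiles (three time points each) with class labels.
training_profiles = [[0.0, 1.0, 2.0], [0.0, 1.2, 2.2], [2.0, 1.0, 0.0], [2.2, 1.0, 0.1]]
training_labels = ["up", "up", "down", "down"]
centroids = class_centroids(training_profiles, training_labels)
predicted = nearest_centroid([0.1, 1.0, 1.9], centroids)
```

Replacing `class_centroids` with an alignment-based centroid would bring the sketch closer to the approach in the paper.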

    Contributors: Monash University. Faculty of Information Technology. Gippsland School of Information Technology ; Chetty, Madhu ; Ahmad, Shandar ; Ngom, Alioune ; Teng, Shyh Wei ; Third IAPR International Conference on Pattern Recognition in Bioinformatics (PRIB) (3rd : 2008 : Melbourne, Australia) ; Coverage: Rights: Copyright by Third IAPR International Conference on Pattern Recognition in Bioinformatics. All rights reserved.

  9. DRCAT Resource Catalogue

    • rrid.site
    • dknet.org
    • +2more
    Updated Jul 14, 2025
    (2025). DRCAT Resource Catalogue [Dataset]. http://identifiers.org/RRID:SCR_005931
    Dataset updated
    Jul 14, 2025
    Description

    Data resource catalog that collates metadata on bioinformatics Web-based data resources including databases, ontologies, taxonomies and catalogues. An entry includes information such as resource identifier(s), name, description and URL. "Query" lines are defined for each resource that describe what type(s) of data are available, in what format, how (by what identifier) the data can be retrieved and from where (URL). DRCAT was developed to provide more extensive data integration for EMBOSS, but it has many applications beyond EMBOSS. DRCAT entries (including "Query" lines) are annotated with terms from the EDAM ontology of common bioinformatics concepts.

  10. The data set for testing cellCounts

    • opal.latrobe.edu.au
    bin
    Updated Dec 19, 2022
    Yang Liao; Dinesh Raghu; Bhupinder Pal; Lisa Mielke; Wei Shi (2022). The data set for testing cellCounts [Dataset]. http://doi.org/10.26181/21588276.v1
    Available download formats: bin
    Dataset updated
    Dec 19, 2022
    Dataset provided by
    La Trobe
    Authors
    Yang Liao; Dinesh Raghu; Bhupinder Pal; Lisa Mielke; Wei Shi
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    The 10x Chromium single-cell RNA sequencing technology is a powerful gene expression profiling platform, which is capable of profiling expression of thousands of genes in tens of thousands of cells simultaneously. This platform can produce hundreds of millions of reads in a single experiment, making it a very challenging task to quantify expression levels of genes in individual cells due to the massive data volume. Here we present cellCounts, a new tool for efficient and accurate quantification of 10x Chromium data. cellCounts employs the seed-and-vote strategy to align reads to a reference genome, collapses reads to UMIs (Unique Molecular Identifiers) and then assigns UMIs to genes based on the featureCounts program. Using multiple real datasets, we showed that cellCounts is ~3 times faster than cellRanger, a popular quantification program developed by 10x. Using simulation and real datasets with built-in ground truth, we demonstrated that cellCounts is markedly more accurate than cellRanger. cellCounts is implemented in R, making it easily integrated with other R programs for analysing Chromium data.
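The idea of collapsing reads to UMIs before counting can be illustrated with a deliberately simplified sketch. It assumes each read already carries a cell barcode, a UMI and a gene assignment, and it only merges exact UMI matches. It is not the cellCounts implementation (which is in R and aligns reads itself), and every name and value is hypothetical.

```python
from collections import defaultdict


def count_umis(assigned_reads):
    """Count distinct UMIs per (cell barcode, gene); duplicate reads collapse to one."""
    umis = defaultdict(set)
    for cell_barcode, umi, gene in assigned_reads:
        umis[(cell_barcode, gene)].add(umi)   # a set keeps each UMI only once
    return {key: len(values) for key, values in umis.items()}


# Hypothetical reads as (cell barcode, UMI, gene) tuples.
reads = [
    ("CELL1", "AACG", "GeneA"),
    ("CELL1", "AACG", "GeneA"),   # PCR duplicate of the read above
    ("CELL1", "TTGC", "GeneA"),
    ("CELL2", "AACG", "GeneB"),
]
counts = count_umis(reads)
```

Four reads yield three molecules here, because the duplicated read in CELL1 is counted once.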

  11. Data from: Supplemental data

    • data.mendeley.com
    Updated Aug 22, 2023
    Sudha Acharya (2023). Supplemental data [Dataset]. http://doi.org/10.17632/66r9pkckjz.1
    Dataset updated
    Aug 22, 2023
    Authors
    Sudha Acharya
    License

    CC0 1.0 Universal Public Domain Dedication: https://creativecommons.org/publicdomain/zero/1.0/
    License information was derived automatically

    Description

    There are 7 supplemental data sets.

  12. Data from: UnFATE: A comprehensive probe set and bioinformatics pipeline for...

    • data.niaid.nih.gov
    zip
    Updated Jan 23, 2025
    Claudio Gennaro Ametrano (2025). UnFATE: A comprehensive probe set and bioinformatics pipeline for phylogeny reconstruction and multilocus barcoding of filamentous ascomycetes (Ascomycota, Pezizomycotina) [Dataset]. http://doi.org/10.5061/dryad.tht76hf1x
    Available download formats: zip
    Dataset updated
    Jan 23, 2025
    Dataset provided by
    University of Trieste
    Authors
    Claudio Gennaro Ametrano
    License

    https://spdx.org/licenses/CC0-1.0.html

    Description

    The subphylum Pezizomycotina (filamentous ascomycetes) is the largest clade within Ascomycota. Despite the importance of this group of fungi, our understanding of their evolution is still limited due to insufficient taxon sampling. Although next-generation sequencing technology allows us to obtain complete genomes for phylogenetic analyses, generating complete genomes of fungal species can be challenging, especially when fungi occur in symbiotic relationships or when the DNA of rare herbarium specimens is degraded or contaminated. Additionally, assembly, annotation, and gene extraction of whole-genome sequencing data require bioinformatics skills and computational power, resulting in a substantial data burden. To overcome these obstacles, we designed a universal target enrichment probe set to reconstruct the phylogenetic relationships of filamentous ascomycetes at different phylogenetic levels. From a pool of single-copy orthologous genes extracted from available Pezizomycotina genomes, we identified the smallest subset of genetic markers that can reliably reconstruct a robust phylogeny. We used a clustering approach to identify a sequence set that could provide an optimal trade-off between potential missing data and probe set cost. We incorporated this probe set into a user-friendly wrapper script named UnFATE (https://github.com/claudioametrano/UnFATE) that allows phylogenomic inferences without requiring expert bioinformatics knowledge. In addition to phylogenetic results, the software provides a powerful multilocus alternative to ITS-based barcoding. Phylogeny and barcoding approaches can be complemented by an integrated, pre-processed, and periodically updated database of all publicly available Pezizomycotina genomes. The UnFATE pipeline, using the 195 selected marker genes, consistently performed well across various phylogenetic depths, generating trees consistent with the reference phylogenomic inferences. 
    The topological distance between the reference trees from literature and the best tree produced by UnFATE ranged between 0.10 and 0.14 (nRF) for phylogenies from family to subphylum level. We also tested the in vitro success of the universal baits set in a target capture approach on 25 herbarium specimens from ten representative classes in Pezizomycotina, which recovered a topology mostly congruent with recent phylogenomic inferences for this group of fungi. The discriminating power of our gene set was also assessed by the multilocus barcoding approach, which outperformed the barcoding approach based on ITS. With these tools, we aim to provide a framework for a collaborative approach to build robust, conclusive phylogenies of this important fungal clade.
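For readers unfamiliar with the nRF values quoted above, the sketch below shows one common way to compute a normalized Robinson-Foulds distance: the number of splits found in only one of the two trees, divided by the total number of non-trivial splits in both. The description does not say which normalization the authors used, so that choice is an assumption. The trees and names are hypothetical.

```python
def normalized_rf(splits_a, splits_b):
    """Normalized Robinson-Foulds distance between two sets of non-trivial splits."""
    total = len(splits_a) + len(splits_b)
    if total == 0:
        return 0.0   # two unresolved (star) trees have no splits to disagree on
    # Symmetric difference: splits present in exactly one of the two trees.
    return len(splits_a ^ splits_b) / total


# Hypothetical five-taxon trees. Each split is written as the side excluding taxon "E".
reference_tree = {frozenset("AB"), frozenset("ABC")}   # ((A,B),C,(D,E))
inferred_tree = {frozenset("AC"), frozenset("ABC")}    # ((A,C),B,(D,E))
distance = normalized_rf(reference_tree, inferred_tree)
```

A value of 0 means identical topologies and 1 means no shared splits, so values such as 0.10 to 0.14 indicate trees that agree on most splits.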

  13. Data from: Benchmarking tools for transcription factor prioritization

    • zenodo.org
    application/gzip
    Updated Apr 23, 2024
    Sebastian Steinhauser; Leonor Schubert Santana; Gaulis Swann (2024). Benchmarking tools for transcription factor prioritization [Dataset]. http://doi.org/10.5281/zenodo.10990183
    Available download formats: application/gzip
    Dataset updated
    Apr 23, 2024
    Dataset provided by
    Zenodo: http://zenodo.org/
    Authors
    Sebastian Steinhauser; Leonor Schubert Santana; Gaulis Swann
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Time period covered
    Apr 19, 2024
    Description

    Abstract:

    Spatiotemporal regulation of gene expression is controlled by transcription factor (TF) binding to regulatory elements, resulting in a plethora of cell types and cell states from the same genetic information. Due to the importance of regulatory elements, various sequencing methods have been developed to localise them in genomes, for example using ChIP-seq profiling of the histone mark H3K27ac that marks active regulatory regions. Moreover, multiple tools have been developed to predict TF binding to these regulatory elements based on DNA sequence. As altered gene expression is a hallmark of disease phenotypes, identifying TFs driving such gene expression programs is critical for the identification of novel drug targets. In this study, we curated 84 chromatin profiling experiments (H3K27ac ChIP-seq) where TFs were perturbed through e.g., genetic knockout or overexpression. We ran nine published tools to prioritize TFs using these real-world data sets and evaluated the performance of the methods in identifying the perturbed TFs. This allowed the nomination of three frontrunner tools, namely RcisTarget, MEIRLOP and monaLisa. Our analyses revealed opportunities and commonalities of tools that will help to guide further improvements and developments in the field.

    Dataset description:

    • tf_tool_benchmark_atacseq_diffPeaks.tar.gz - Archive containing differential peak statistics, tool diff peak input files (fore- and background) for all curated ATAC-seq datasets.
    • tf_tool_benchmark_h3K27ac_chipseq_diffPeaks.tar.gz - Archive containing differential peak statistics, tool diff peak input files (fore- and background) for all curated H3K27ac ChIP-seq datasets.
    • tf_tool_benchmark_atacseq_results.tar.gz - Archive containing the raw tool results for each ATAC-seq dataset.
    • tf_tool_benchmark_chipseq_results.tar.gz - Archive containing the raw tool results for each H3K27ac ChIP-seq dataset.
    • tf_tool_benchmark_results.tar.gz - Archive containing tool results summary for plotting (rds files).

    Contact: Sebastian Steinhauser - sebastian.steinhauser@novartis.com

  14. Data from: The new bioinformatics: integrating ecological data from the gene...

    • datadryad.org
    • data.niaid.nih.gov
    • +1more
    zip
    Updated Jul 16, 2012
    Matthew B. Jones; Mark P. Schildhauer; O. J. Reichman; Shawn Bowers (2012). The new bioinformatics: integrating ecological data from the gene to the biosphere [Dataset]. http://doi.org/10.5061/dryad.qb0d6
    Available download formats: zip
    Dataset updated
    Jul 16, 2012
    Dataset provided by
    Dryad
    Authors
    Matthew B. Jones; Mark P. Schildhauer; O. J. Reichman; Shawn Bowers
    Time period covered
    2012
    Description

    Cumulative number of data packages in the Knowledge Network for Biocomplexity until 2007-06-21. This data set records the cumulative number of data packages in the Knowledge Network for Biocomplexity (KNB) data repository through 2007-06-21. A data package represents a set of data files and metadata files that together make a coherent, citable unit for some particular scientific activity. Each data package in the KNB is described by a scientific metadata document and can be composed of one or more data files that contain various segments of the data in question. File: cumdatasets-20070622.csv

  15. Inter-residue distances surrounding the ligand data sets using MANORAA

    • data.mendeley.com
    • narcis.nl
    Updated Sep 22, 2021
    Duangrudee Tanramluk (2021). Inter-residue distances surrounding the ligand data sets using MANORAA [Dataset]. http://doi.org/10.17632/4z4mypck9b.3
    Dataset updated
    Sep 22, 2021
    Authors
    Duangrudee Tanramluk
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Distances measured between distinctive parts of amino acid residues surrounding the ligand.

  16. Simulation files and results without missing data

    • search.datacite.org
    • figshare.com
    Updated Jan 19, 2016
    April Wright (2016). Simulation files and results without missing data [Dataset]. http://doi.org/10.6084/m9.figshare.1160601.v1
    Dataset updated
    Jan 19, 2016
    Dataset provided by
    DataCite: https://www.datacite.org/
    Figshare: http://figshare.com/
    Authors
    April Wright
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Publication: Wright AM and Hillis DM (2014). Bayesian analysis using a simple likelihood model outperforms parsimony for estimation of phylogeny from discrete morphological data. PLOS ONE. Contents: Data sets without missing data, and the phylogenetic trees estimated from these sets. Details: These data sets were simulated along the tree in Fig. 1 of the paper. No missing data distribution was imposed on these data sets.

  17. Multi-Dimensional Data Viewer (MDV) user manual for data exploration:...

    • zenodo.org
    pdf, zip
    Updated Jul 12, 2024
    Maria Kiourlappou; Martin Sergeant; Joshua S. Titlow; Jeffrey Y. Lee; Darragh Ennis; Stephen Taylor; Ilan Davis (2024). Multi-Dimensional Data Viewer (MDV) user manual for data exploration: "Systematic analysis of YFP traps reveals common discordance between mRNA and protein across the nervous system" [Dataset]. http://doi.org/10.5281/zenodo.7875495
    Available download formats: zip, pdf
    Dataset updated
    Jul 12, 2024
    Dataset provided by
    Zenodo: http://zenodo.org/
    Authors
    Maria Kiourlappou; Martin Sergeant; Joshua S. Titlow; Jeffrey Y. Lee; Darragh Ennis; Stephen Taylor; Ilan Davis
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Please also see the latest version of the repository:
    https://doi.org/10.5281/zenodo.6374011 and
    our website: https://ilandavis.com/jcb2023-yfp

    The explosion in the volume of biological imaging data challenges the available technologies for data interrogation and its intersection with related published bioinformatics data sets. Moreover, intersection of highly rich and complex datasets from different sources provided as flat csv files requires advanced informatics skills, which is time consuming and not accessible to all. Here, we provide a “user manual” to our new paradigm for systematically filtering and analysing a dataset with more than 1300 microscopy data figures using Multi-Dimensional Viewer (MDV) -link, a solution for interactive multimodal data visualisation and exploration. The primary data we use are derived from our published systematic analysis of 200 YFP traps reveals common discordance between mRNA and protein across the nervous system (eprint link). This manual provides the raw image data together with the expert annotations of the mRNA and protein distribution as well as associated bioinformatics data. We provide an explanation, with specific examples, of how to use MDV to make the multiple data types interoperable and explore them together. We also provide the open-source python code (github link) used to annotate the figures, which could be adapted to any other kind of data annotation task.

  18. Minimal siRNA set cover heuristic for gene family knockdown

    • researchdata.edu.au
    Updated May 5, 2022
    Xiaoguang Li; Alioune Ngom; Luis Rueda (2022). Minimal siRNA set cover heuristic for gene family knockdown [Dataset]. http://doi.org/10.4225/03/5a1371b7e405f
    Dataset updated
    May 5, 2022
    Dataset provided by
    Monash University
    Authors
    Xiaoguang Li; Alioune Ngom; Luis Rueda
    Description

    PRIB 2008 proceedings found at: http://dx.doi.org/10.1007/978-3-540-88436-1

    Contributors: Monash University. Faculty of Information Technology. Gippsland School of Information Technology ; Chetty, Madhu ; Ahmad, Shandar ; Ngom, Alioune ; Teng, Shyh Wei ; Third IAPR International Conference on Pattern Recognition in Bioinformatics (PRIB) (3rd : 2008 : Melbourne, Australia) ; Coverage: Rights: Copyright by Third IAPR International Conference on Pattern Recognition in Bioinformatics. All rights reserved.

  19. Whole genome DNA sequences of Gulf of Mexico invertebrates

    • search.dataone.org
    • data.griidc.org
    Updated Feb 5, 2025
    Thomas, W. Kelley (2025). Whole genome DNA sequences of Gulf of Mexico invertebrates [Dataset]. http://doi.org/10.7266/n7-pchj-dh15
    Dataset updated
    Feb 5, 2025
    Dataset provided by
    GRIIDC
    Authors
    Thomas, W. Kelley
    Area covered
    Gulf of Mexico (Gulf of America)
    Description

    The dataset consists of whole genome DNA sequences, generated from invertebrate species from the Gulf of Mexico during the Benthic Invertebrate Taxonomy, Metagenomics, and Bioinformatics Workshop (BITMaB) in 2017 in Corpus Christi, Texas, USA. All genomic data sets were deposited in and distributed by GenBank (NCBI), the European Nucleotide Archive (ENA)- European Bioinformatics Institute (EMBL-EBI), DNA Data Bank of Japan, NemATOL, the Global Genome Initiative, and Ocean Genome Legacy.

  20. ATGC: Montpellier bioinformatics platform

    • scicrunch.org
    ATGC: Montpellier bioinformatics platform [Dataset]. http://identifiers.org/RRID:SCR_002917
    Area covered
    Montpellier
    Description

    A bioinformatics platform that is a joint project of several South of France laboratories with available services based on their expertise, issued from their research activities which involve phylogenetics, population genetics, molecular evolution, genome dynamics, comparative and functional genomics, and transcriptome analysis. Most of the software and databases on ATGC are (co)authored by researchers from South of France teams. Some are widely used and highly cited. South of France laboratories: * CRBM (transcriptomes and stem cells). * IBC (computational biology). * MiVEGEC (evolution and phylogeny). * LGDP (plant genomics). * LIRMM (computer science). * South Green (plant genomics).
